Insertion Mutagenesis Using dSpm for Function Search in Arabidopsis

Klimyuk, V., Tissier, A.*, Marillonnet, S., Patel, K. and Jones, J.

The Sainsbury Laboratory, John Innes Centre, Colney Lane, Norwich NR4 7UH, UK
*DEA, Cadarache, Aix-en-Provence, France

To evaluate the function of Arabidopsis genes, it would be desirable to saturate the Arabidopsis genome with insertions and then use PCR to detect insertions in any defined sequence. To this end we combined a scheme that selects for unlinked transpositions with the Spm transposon and with glasshouse selections, which avoid the need for axenic culture. We developed a binary vector with a T-DNA that carries (i) a non-autonomous Spm derivative (dSpm) containing the BAR selectable marker gene, (ii) a counter-selectable genetic marker based on a cytochrome P450 that activates a DuPont proherbicide, and (iii) an Spm transposase (TPase) gene under the control of the 35S, pSpm or meiosis-specific pAtDMC1 promoter. Many independent transformants were generated and their performance was compared. 35S:TPase fusions gave higher transposition rates, but a lower proportion of the resulting transposants carried independent events. Ten pSpm:TPase and four pAtDMC1:TPase lines are currently being used. To date we have generated ~36,000 transposants, of which at least 50% are independent. The collection has been pooled, and insertions in defined genes were detected either by PCR screens or by using dot-blots of inverse PCR (IPCR) products of dSpm flanking sequences. The success rate was ~50%. We plan to generate at least 60,000 transposants, which should give an ~80% chance of hitting any 5 kb target sequence.
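The ~80% figure can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only and rests on assumptions that the abstract does not state: that independent insertions land uniformly at random across the genome, that exactly 50% of the 60,000 planned transposants are independent, and that the genome is roughly 100 to 125 Mb. The function name `hit_probability` is hypothetical.

```python
def hit_probability(n_independent: int, target_kb: float, genome_kb: float) -> float:
    """Chance that at least one of n_independent insertions falls in the target.

    Assumes each insertion lands uniformly at random in the genome,
    independently of all others (an idealisation).
    """
    miss_one = 1.0 - target_kb / genome_kb   # one insertion misses the target
    return 1.0 - miss_one ** n_independent   # not every insertion misses


# 60,000 planned transposants, assuming 50% are independent (assumed exact here).
n_independent = int(60_000 * 0.5)

# Genome sizes in Mb are assumptions, not values given in the abstract.
for genome_mb in (100, 125):
    p = hit_probability(n_independent, target_kb=5, genome_kb=genome_mb * 1000)
    print(f"genome {genome_mb} Mb: P(hit 5 kb target) = {p:.2f}")
```

Under these assumptions the estimate comes out at roughly 0.7 to 0.8, in line with the stated ~80% for a genome size near 100 Mb. Any insertion-site preference of the transposon would change the result for individual targets.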

Sequences adjacent to 1060 insertions have been determined by adaptor ligation or inverse PCR; 853 of these insertions were independent. The sequences have been used to establish a database, which has been searched against existing databases. Insertions have been found on many of the sequence contigs deposited by the genome sequencing groups, and in a variety of interesting genes. We plan to determine 5000 adjacent sequences. As the genome sequence emerges, these will correspond to insertions at defined positions that will be within ~10 kb of any gene, providing a useful 'launching pad' for subsequent linked targeted tagging.
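The "within ~10 kb" estimate can be illustrated in the same spirit. This is a rough sketch under assumptions not stated in the abstract: all 5000 sequenced insertions are treated as independent and uniformly distributed (optimistic, given that 853 of 1060 were independent), and the genome size is taken as 100 Mb. Both function names are hypothetical.

```python
import math


def mean_distance_to_nearest_kb(n_insertions: int, genome_kb: float) -> float:
    """Mean distance (kb) from a random position to the nearest insertion.

    For insertions scattered uniformly at density n/G, the distance to the
    nearest one on either side is exponential with mean G / (2 n).
    """
    return genome_kb / (2.0 * n_insertions)


def fraction_within_kb(d_kb: float, n_insertions: int, genome_kb: float) -> float:
    """Fraction of positions with at least one insertion within d_kb on either side."""
    return 1.0 - math.exp(-2.0 * n_insertions * d_kb / genome_kb)


GENOME_KB = 100_000  # assumed genome size (100 Mb); not given in the abstract
N_SEQUENCED = 5000   # planned number of adjacent sequences, from the abstract

print(mean_distance_to_nearest_kb(N_SEQUENCED, GENOME_KB))          # mean distance in kb
print(round(fraction_within_kb(10, N_SEQUENCED, GENOME_KB), 2))     # share within 10 kb
```

With these inputs the mean distance is 10 kb, so "~10 kb" reads best as a typical distance rather than a guaranteed maximum: under the same model, about 63% of positions lie within 10 kb of an insertion.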