;ID ATMU5 DNA ; ATH ; 4895 BP
;XX
;DE ATMU5 is a molecular fossil of an autonomous DNA transposon.
;XX
;AC AP002060
;XX
;DT 01-FEB-2001 (Rel. 6.1, Created)
;DT 01-FEB-2001 (Rel. 6.1, Last updated, Version 1)
;XX
;KW autonomous DNA transposon; MUDR superfamily; TIR;
;KW transposase; DNA-binding protein; ATMU5.
;XX
;OS Arabidopsis thaliana
;XX
;OC Arabidopsis thaliana
;OC Eukaryota; Plantae; Embryobionta; Magnoliophyta; Magnoliopsida;
;OC Dilleniidae; Capparales; Brassicaceae.
;XX
;RN [1] (bases 1 to 4895)
;RA Kapitonov,V. and Jurka,J.
;RL Direct submission (January 2001)
;XX
;CC ATMU5 is a fossilized copy of an autonomous DNA transposon from
;CC the MUDR superfamily.
;CC It is flanked by a 10-bp target site duplication.
;CC ATMU5 has ~310-bp terminal inverted repeats (94% identity) and
;CC encodes two proteins, ATMU5p1 and ATMU5p2.
;CC Conservation of the proteins and TIRs indicates that
;CC the transposon was active recently. One copy of a non-autonomous
;CC derivative of ATMU5 is present in the genome
;CC (AC079674 25687-25111). The non-autonomous element is 95%
;CC identical to ATMU5 and is the result of a deletion of the ATMU5
;CC internal portion.
;CC ATMU5 is 74% identical to ATMU3 and ATMU4.
;CC ATMU5p1 is a 684-aa transposase encoded by 4 exons (359-1486,
;CC 1588-1839, 1926-2591, 2657-2665).
;CC ATMU5p1:
;CC MGGVISTIGSGESESVVDRGVEVAVELESVVGDGVEVPPVEAPVGIELVARGEDCVPSRH
;CC RKRRKIRVGDEDEAEIDPDPDYEDDCAVHGDEDCDAYTDAVGGDDDATGGEGNDDEAVGG
;CC GDDATGCGGNDDEALGYDDDATSDGGNDDDDGGVEEDANLNLAEEFPEFAGVEEDCSDDD
;CC PPDDVWEEDKIPDPFSSDDEDESRTREVHGHRDAANEEVLLELKKTYNTPDDFKLAVLRY
;CC SLKTRYDIKLFRSEARIVAAKCSYVDKDGVQCPWKVYCSYEKTMHKMQIRTYVNNHICVR
;CC SGHSKMLKRSSIAYLFEERLRVNPKLTKYEMAAEILREYNLEVTPDQCAKAKTKVVKARN
;CC ASHEAHFARVWDYQAEAQKDSWIKTCRPFIGLDGAFLKWDIKGHLLAAVGRDGDNKIVPI
;CC AWAVVEIENDDNWDWFLRHLSASLGLCQMANIAILSDKQSGLVKAIHTILPHAEHRQCSK
;CC HIMDNWKRDSHDLELQRLFWKIARSYTVEEFNNHMMELQQYNRQAYESLQLTSPVTWSRA
;CC FFRIGTCCNDNLNNLSESFNRTIRQARRKPILDMLEDIRRQCMVRSAKRFLIAEKLQTRF
;CC TKRAHEEIEKMIVGLRQCERYMARENLHEIYVNNVSYFVDMEQKTCDCRKCQMVGIPCIH
;CC ATCVIIGKKEKVEDYVSDYYTKDE
;CC ATMU5p2 is a putative 258-aa DNA-binding protein encoded by
;CC 3 exons on the complementary strand (4513-4361, 4294-4211,
;CC 3488-2949).
;CC ATMU5p2:
;CC MSLEISSVGEEGSGNRGFPSKCRCGRDVVIYTSSSKKNPGRPYFRCPTYQDDHLFKWVED
;CC CVYEEVVDAIPRISIIDSEVAVLVMVHLLVFVLHDHHLMLFVLVLHNHDLQPFVLVLRDH
;CC DLQPLVLVLRDHDLQHLVLVLRDHDLQHLVLVIQDHDLQHLVLVIQDHDLQHLVLVIQDH
;CC DLQHLVLVIQDHDLQHLVLVIQDQPVILDEIVVALVIQDKIVIALVIQDKIEIQTLLVLK
;CC LKKRVRNTCWIVKKESHK
;CC ATMU5p2 includes a conserved motif (aa positions 26-63)
;CC CHCGLEVVIYTSASKSNPGRPFFRCPTKQDDHLFKWVE present at the C-terminus
;CC of DNA topoisomerase III in human, mouse and Drosophila, and
;CC at the C-terminus of polyproteins encoded by ORF3 in banana streak
;CC and sugarcane bacilliform retroid viruses.
;XX
;DR Positions 28694 33589 Accession No AP002060 GenBank (rel.
119.0) ;XX ;SQ Sequence 4895 BP; 1539 A; 882 C; 1118 G; 1356 T; 0 other; ATMU5 ggaaaaaatgttaattaatacaccaattttcaaaaagtggtcatttaaaccataaactccatatatggcc aaataaaacttagtaaaagcgttgaccggccattttatacatctcaaaacgttgaccaagccaaaaaccc tgcgacgttagcagtcgttaacagagtcgataacaggcgttagcaggcgttaattaatccgtttaaacac aaaacgacgtcgttttgtgtttaatcgaaaccaacaaaattcccaattcgattctcaatccctaaaatcc ccaattccaattctcatgccctaaaattcccaaattcgaaaaatcttcaaatcgtcttcccaaatgacga aaattaggatgggaggcgtgatatctacgattggatcaggggaatcagagagtgttgtggatcgtggagt tgaagtggcagtcgaattagagagtgtcgtcggagacggagttgaagttccaccggttgaagctccagtc ggaatagagcttgttgccagaggagaagattgtgttccaagccggcatcgaaaacgaagaaagataagag tgggggacgaagacgaggctgaaattgatcccgaccccgactacgaagatgattgcgcagttcatggtga tgaagactgtgatgcttatactgatgcagtaggtggtgacgatgatgcaaccggcggtgaaggtaatgat gatgaagcagtaggtggtggtgatgatgcaaccggttgtggaggtaacgatgatgaagctttaggttatg atgatgatgcaacaagcgatggaggtaacgatgatgatgacggaggtgttgaagaagatgctaacctcaa ccttgcagaagaattcccagagtttgcgggagtagaagaagattgtagtgatgatgatccaccggatgat gtatgggaggaagataaaattcccgatccgttttcatctgatgacgaagatgagagcagaaccagagaag tacatggtcatagagatgctgcgaatgaagaggttttgctagaattgaagaaaacttacaacactcccga tgacttcaagcttgcagtcttgaggtactcattgaagacaaggtacgacattaaactgtttagatcagaa gctaggattgttgctgcaaagtgtagctatgttgataaggatggtgttcaatgtccgtggaaagtgtatt gttcatatgagaagactatgcataagatgcaaataagaacttatgtgaataaccatatatgtgtgaggtc tgggcattcgaagatgttgaagcggtcttccattgcttatttgtttgaggaaaggttgcgagtgaatcca aagcttacaaaatatgagatggctgctgagatattgagggaatacaacttggaagtgactccggaccaat gtgctaaagcaaagacgaaagtggtgaaagcaagaaatgctagccatgaagcccattttgcaagagtctg ggattaccaagcagaggtaataaagcagaatccatggactgagtttgagatagagacaactgcaggggct gtcattggagccaaacagaggttttttggttatacatttgttttaaggctcaaaaggattcatggataaa aacgtgtagacctttcataggacttgatggagcttttctgaaatgggatattaaaggccatcttctagct gcagttggaagagatggagataacaagattgttcctattgcttgggctgtcgttgagatagaaaatgatg acaactgggactggttcttaagacatctatctgcaagtttggggctttgtcaaatggctaatatagctat cctctctgataaacaatctgttagtttttttaactttgtgtctttgttaaattggtttgttgttttgttt 
actgatttttggtcatgttttatgtcatattgcaggggctagtcaaagctatccatacaatacttccaca tgctgagcatcgacaatgttcaaaacacatcatggataattggaaaagggacagccatgatttagagctc cagcgtctgttttggaagatagcacgaagctacacggttgaggagtttaataatcatatgatggagctac agcagtacaatcgtcaagcttatgaatctctgcagcttactagtccagtgacatggtcaagagcattttt cagaataggtacatgttgcaatgacaacctcaataatttgagtgagtcttttaataggactattaggcag gcaaggagaaagccaatattggatatgctggaggatatccggaggcaatgcatggtcagaagtgcaaaga gattcctaattgctgaaaaattgcagacaaggttcacaaagagagctcatgaagagattgaaaaaatgat tgttgggcttagacagtgcgagagatacatggccagagaaaacttgcatgagatatatgtgaataatgtt agctattttgttgatatggagcagaaaacttgtgattgcaggaagtgtcagatggttgggatcccttgta ttcatgcgacttgtgtgataatcggaaaaaaagaaaaggtggaagactacgtgagtgactattacacaaa ggtaaggtggcgagaaacttacttgaaaggtataaggcctgtccaagggatgcctttatggtgtaggacg aataggttgcctgtattaccaccaccatggagaagaggcaatgctgggaggccaagtaactatgctagga ggaaaggtagaaatgaagctgcagctccttcaaatccaaacaagatgtctcgggaaaagagaatcatgac atgctctaactgccatcaagaagggcacaacaagcaaaggtgcaacattccaactgtcttgaatccaaca aagagaccaagaggtcgtcctaggaaaaatcaggtttgtgactctcatttatgactctcctttttaacaa tccaacattcatttatgactctcctttttaacaatccaacatgtgtttctgactctcttttttaatttaa gaaccagaagtgtttggatctcaatcttatcttggatcacaagagcaatcacaatcttatcttggatcac aagagcaaccacaatctcatcaaggatcacaggctgatcttgaatcacaaggaccaagtgctgcaggtcg tggtcttggatcacaaggaccaagtgctgcaggtcgtggtcttggatcacaaggaccaagtgctgcaggt cgtggtcttggatcacaaggaccaagtgctgcaggtcgtggtcttggatcacaagaaccaagtgctgcag gtcgtggtcgcggagcacaaggaccaagtgctgcaggtcgtggtcgcgtagcacaaggaccaagggctgc aggtcgtggtcgcggagcacaaggacaaagggctgcaggtcgtggttgtggagcacaaggacaaacagca tcaggtggtggtcgtggagcacaaagacaagcaggtggaccatcacaaggaccgcaaccttatagaagac atggttcagcacaagcagaagcagaaccacaaggacttcgtcgatttgaatcatggtttgagtgttcaga caatcttagtcacaagtttctttttgttattgtctgcagcttagtcataagtttctttttggtggttgaa cttgtctttatgacttagcttagtatcgatttcattttgttgttttgacttgtcttggctttgtgacatt agtgttataatgtatgaatcgtttttaatgtatgaactttgtttaggattattaatcattgtgaaacaaa acaagaattgcattaccaatcatggtttcattacaaacttagataaaagccatctaaccaaacacaaaca 
ttagtaaccaagaacaaaacaagcattgcattaccatagtaccatttcattacaaacttagataaaagcc atctaaccaaacacaaacactagtaaccaaacacacatcactagtaaacacaaaccatctaaccaaacac aaacgctagtaacctctcttcaatttcctcatttttggtcttgtagaacattatgatcacaattgcaatg aaaattatacaaaggcacaccaagcacactttggtcatcgatttccatctcttcaattctcttttgctcc acatttcatcttctttcaattcttggatcaaaccctttaattcagcaatctcgagcaacctccgatttgg cattgtttacctcgctgtcaatgatagagattctcggtattgcatctacaacctcttcatacacacaatc ttcaacccatttaaacaaatggtcctgtaattccatcgacaattatttacttatgaccaaaaacaataaa acaaaatcaaatctacttacatcttgataggttggacatcgaaagtaaggtcttcccggattcttcttcg agcttgatgtatagatgacgacatctctaccacaacgacactttgaaggaaacccacggttcccacttcc ttcttccccaacactcgagatttctaaactcatatctgcagaaaatggctatgaaattggagttttcgag ttcaaacgaacaaaaatggagaggcttttgaagattttcgaatttgggaattttagggcatgagaattga aattggggattttagggattgagaatcgaattgggaattttgttggtttcgattaaacacaaaacgacgt cgttttgtgtttaaacggattaattaacgcctgttatcgactctgttaacgactgctaacgtcgcagggt ttttgggcggatcaacgtttttgaatgtataaaatggccggtcaacgcttttactaagttttatttgacc atatatggaattcatggtttaaatggccactttttgaaagttggtgtattaaacaacatttttcc
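
The exon coordinates and protein lengths quoted in the CC lines, and the base counts on the SQ line, can be cross-checked with simple arithmetic. The following is a minimal Python sketch of that check, not part of the database entry. The function and constant names are hypothetical, and it assumes that coordinates are 1-based and inclusive and that the last exon of each gene ends with a stop codon; the entry does not state either point explicitly.

```python
def cds_length(exons):
    """Total bases covered by inclusive (start, end) exon pairs.

    Exons on the complementary strand are listed high-to-low in the entry,
    so the absolute difference is used.
    """
    return sum(abs(end - start) + 1 for start, end in exons)


def expected_residues(exons):
    """Residue count implied by the exons.

    Assumption (not stated in the entry): the final codon is a stop codon,
    so one codon is subtracted from the total.
    """
    n_bases = cds_length(exons)
    if n_bases % 3 != 0:
        raise ValueError(f"CDS length {n_bases} is not a multiple of 3")
    return n_bases // 3 - 1


# Exon coordinates copied from the CC lines of the entry.
ATMU5P1_EXONS = [(359, 1486), (1588, 1839), (1926, 2591), (2657, 2665)]
ATMU5P2_EXONS = [(4513, 4361), (4294, 4211), (3488, 2949)]

# Base counts copied from the SQ line of the entry.
BASE_COUNTS = {"A": 1539, "C": 882, "G": 1118, "T": 1356, "other": 0}

if __name__ == "__main__":
    print("ATMU5p1 residues:", expected_residues(ATMU5P1_EXONS))
    print("ATMU5p2 residues:", expected_residues(ATMU5P2_EXONS))
    print("Total bases:", sum(BASE_COUNTS.values()))
```

Under these assumptions the exon spans give 684 and 258 residues, matching the lengths stated for ATMU5p1 and ATMU5p2, and the base counts sum to the 4895 bp given on the ID line.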