;ID ATMU3 DNA ; ATH ; 4853 BP ;XX ;DE ATMU3, an autonomous DNA transposon - a consensus sequence. ;XX ;AC . ;XX ;DT 18-SEP-2000 (Rel. 5.9, Created) ;DT 18-SEP-2000 (Rel. 5.9, Last updated, Version 1) ;XX ;KW autonomous DNA transposon; MUDR superfamily; TIR; ;KW transposase; DNA-binding protein; ATMU3. ;XX ;OS consensus ;XX ;OC Arabidopsis thaliana ;OC Eukaryota; Plantae; Embryobionta; Magnoliophyta; Magnoliopsida; ;OC Dilleniidae; Capparales; Brassicaceae. ;XX ;RN [1] (bases 1 to 4853) ;RA Kapitonov,V. and Jurka,J. ;RL Direct submission (September 2000) ;XX ;CC ATMU3 is an autonomous DNA transposon from the MUDR superfamily. ;CC Its copies are flanked by 10-bp target site duplications. ;CC There are 5-10 copies of ATMU3 in the genome; they are ~99% ;CC identical to the consensus sequence. The consensus sequence ;CC was reconstructed from 4 copies; therefore, it may ;CC adequately represent the features of the active ATMU3 transposon. ;CC ATMU3 has ~280-bp terminal inverted repeats and encodes ;CC two proteins, ATMU3p1 and ATMU3p2. 
;CC ATMU3p1 is a 949-aa transposase encoded by 5 exons (390-2198, ;CC 2293-2471, 2511-2696, 2912-3269 and 3338-3655): ;CC MRIGSPMLHESENLVGDGVEAPPRGERDEIGEEVGGETNVVSESCGGEDGAQDAAKATPE ;CC GEADRIGEEARGQANVVSNVDVADVAEVRSPLRRSKRRQIRLEEEAHDAVEPMPEGEANQ ;CC IGEEARGQSVDVADATAVCSPLRRSKRRQNRLEEEADVAVEESTRREIGQEEEEALGETS ;CC CGGEEEAHDEAQESGVAVADAVEVRAPLRRSKRRRIRDEEEEDLEAEVPALDEDDDCAVQ ;CC GDEDCDEVDDAVDTNREEDDTAGLGVEEDGNLDMERDFPEANGEEEASDNDSGDDIWDED ;CC KIPDPLSSDDEDDDRVEAARNDLGDPEILLALEKTYNSPEDFKLALLMSSLKTRYDIKLY ;CC NSEAMVVAAKCVYVSDEGVECPWRVRCSYEKRKHKMQIRTYYNEHTCVRSGHSKMLKVSS ;CC IGFLFEERLRVNPKLTKHEMVAEILREYKLEVTPDQCAKAKTKVLRARRASHDSHFARIW ;CC DYQAEVLLRNPGTEFNIETVAGAVIGSKQRFYRLYICFQAQRESWKQTCRPVIGIDGAFL ;CC KWDIKGHLLAAVGRDGDNRIVRIAWSVVEIENDDNWDWFLRQLSTSLGLCEMTDLAIISD ;CC KQSGLVKAIHTILPQAEHRQCSKHIMDNWKRDNHDIELQRLFWKIARSYTVEEFNNYQAD ;CC LKSPITWSRAFFRTGTCCNDNLNNLSESFNRTIRQARRKPLLDLLEDIRRQCMVRTAKRF ;CC IIADREKKKVESYVNDYYTRNRWRETYFRGIRPVQGMPLWGRLNRLPVLPPPWRRGNAGR ;CC PSNYARRKGRNEVASSSNPNKMSREKRIMTCSNCLQEGHNKKSCKNATVLSPPKRPRGRP ;CC RINEEPQGYVEGSDGHDNGSQGQGNVLQGQENVSQGQNNGSQGQNNGSQTQSQRGRGRGT ;CC QRQRGTTRGAQRQRGRGRGTSQVSEQPQGEAQPQGLAGLAPWFECSRGT ;CC ATMU3p2 is a 154-aa putative DNA-binding protein encoded by ;CC the second strand (exons 4530-4333, 4279-4046 and 4028-3996): ;CC MSCNSRNSSGESGGCNSGMISNAEAGGFKSRGFPVKCKCGLEVVMFTSSTAKNPGRPFFR ;CC CKSCEDLEMELQDHLFKWVEECMYEEVVDALPKISSIDNEIINAKAEVAVEIANLKELMI ;CC ELKEDGMWSKREIQRWKKMTKVCLCDCNCNINVL ;CC ATMU3p2 includes a conserved motif (aa positions 37-80) ;CC CKCGLEVVMFTSSTAKNPGRPFFRCKSCEDLEMELQDHLFKWVE present at the C-terminus ;CC of DNA topoisomerase III in human, mouse and Drosophila, and ;CC at the C-terminus of polyproteins encoded by ORF3 in banana streak ;CC and sugarcane bacilliform retroid viruses. ;CC There are ~100 highly diverged copies of ATMU3p2 encoded by ;CC different families of ATMU-like DNA transposons and inserted ;CC in the genome. 
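The exon coordinates and protein lengths quoted in the CC lines can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the original record. It assumes that coordinates are 1-based and inclusive, that second-strand exons are written with start greater than end, and that the final codon of each coding sequence is a stop codon (the record does not say this explicitly). The function names `cds_length`, `protein_length`, and `motif_span` are hypothetical helpers introduced for illustration.

```python
# Exon coordinates copied from the CC lines (assumed 1-based, inclusive).
ATMU3P1_EXONS = [(390, 2198), (2293, 2471), (2511, 2696), (2912, 3269), (3338, 3655)]
# ATMU3p2 lies on the second strand, so each exon is written start > end.
ATMU3P2_EXONS = [(4530, 4333), (4279, 4046), (4028, 3996)]

# ATMU3p2 protein and its motif, copied from the CC lines.
ATMU3P2 = (
    "MSCNSRNSSGESGGCNSGMISNAEAGGFKSRGFPVKCKCGLEVVMFTSSTAKNPGRPFFR"
    "CKSCEDLEMELQDHLFKWVEECMYEEVVDALPKISSIDNEIINAKAEVAVEIANLKELMI"
    "ELKEDGMWSKREIQRWKKMTKVCLCDCNCNINVL"
)
MOTIF = "CKCGLEVVMFTSSTAKNPGRPFFRCKSCEDLEMELQDHLFKWVE"


def cds_length(exons):
    """Total number of bases covered by the exons, on either strand."""
    return sum(abs(end - start) + 1 for start, end in exons)


def protein_length(exons):
    """Residue count, assuming the last codon is a stop (an assumption)."""
    length = cds_length(exons)
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3 - 1


def motif_span(protein, motif):
    """1-based inclusive (start, end) of the motif, or None if absent."""
    index = protein.find(motif)
    if index < 0:
        return None
    return index + 1, index + len(motif)


if __name__ == "__main__":
    print("ATMU3p1 residues:", protein_length(ATMU3P1_EXONS))
    print("ATMU3p2 residues:", protein_length(ATMU3P2_EXONS))
    print("ATMU3p2 motif span:", motif_span(ATMU3P2, MOTIF))
```

Under these assumptions the exon lengths are consistent with the stated 949-aa and 154-aa products, and the motif falls at the stated positions within ATMU3p2.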
;XX ;DR [1] (Consensus) ;XX ;SQ Sequence 4853 BP; 1557 A; 877 C; 1108 G; 1311 T; 0 other; ATMU3 gggaaaaatgttatttaatacctcaacttacaaaaaatggccaaattaaccgtgaactcgtgaaatggcc gttttaactctcaacaaaaagttgacttctgttttaactttcaagtttgcgttgactcggcctaattaac caccgttaaaaatccttctaacagcgtaattgacagccgttttagtccgttaagcatctgttactatagt cttacgacgtcgttttcgtgctaaagagaaatcaaaatcgagaatagaaattctcaaaacaaaatcaatt accctaaacccaaatcgaaacctaatcctgcccccaaaatcaaaatcgaaaccctaattgcttcaattcg ttttctgaaatgccattagttagaaagaatggcgtcgttatgaggattggatctccaatgcttcacgaat cagagaatttggtgggagatggagttgaagcaccaccgcgaggagaaagagatgaaattggagaagaagt tgggggtgaaacgaatgtggtatctgagtcttgtggcggagaagacggagcacaagatgcagctaaagca acgccggaaggagaagcagatcgaattggagaagaagctcgaggtcaagcgaatgtagtatcaaatgttg atgtggcagatgtggccgaagttcgttctccacttaggcgaagtaaacgaaggcaaatcagattggaaga agaagctcacgatgcagttgaaccaatgccggaaggagaagcaaatcaaattggagaagaagctcgaggt cagtccgttgatgtggcagatgcaaccgcagtttgttctccacttaggcgaagtaaacgaaggcaaaata gattggaagaggaagctgatgttgcagttgaagaatcaactcgtcgtgaaattggacaagaggaagagga agctttaggtgaaacgtcttgtggcggagaagaggaagctcacgatgaagctcaagaatcaggcgttgct gtggcagatgcagtcgaagttcgtgctccacttagacgaagtaaacgaaggagaatcagagatgaagaag aggaggatttagaggctgaagttcctgctcttgatgaagacgatgactgtgcagtccaaggagatgaaga ctgcgatgaggttgatgatgcagtagatactaatagagaagaagatgatactgctggattaggtgtcgaa gaagatggtaacttagacatggaaagagattttccagaggctaatggagaagaagaagctagtgacaatg acagcggagatgatatatgggatgaagacaagattccagatcctttgtcctctgacgatgaagatgatga tagagtagaggcagctcgaaatgatcttggtgatcctgagattttactagcattggagaagacttataac tctcctgaagatttcaagcttgctcttttgatgtcttccctaaagacaaggtatgacattaaactttata attctgaagctatggttgttgctgctaagtgtgtgtatgttagtgatgagggtgttgaatgtccgtggag agtccgttgctcttatgagaagagaaaacataagatgcaaatacgaacttattacaatgagcatacttgt gtgaggtcaggacattcgaagatgttaaaggtgtcatctattgggtttttgtttgaagaaaggttgagag tgaatccaaaactcactaaacatgagatggttgctgagatcttaagagaatacaagttggaagtgactcc agaccaatgtgctaaggcaaagacaaaagttttgagagctagacgtgctagtcatgattctcattttgct 
aggatatgggattatcaagcagaggtgttattgcggaatccggggacagagttcaacatagagacagttg caggagcagtgattggaagcaagcagagattttaccggttatatatttgttttcaagctcaaagggagtc atggaaacaaacttgcagacccgtaatagggatagatggagcttttctgaaatgggacataaaaggacat ctattagccgcagttggaagagatggtgacaatcggattgtccgtattgcttggtctgtagtcgagatag aaaatgatgacaattgggactggttcttgagacagctctctacaagcttggggctatgcgaaatgactga tctggcaatcatttcagataaacaatctgttagtctctattctataagattcccttcatatatctactgt aatttgagatagacaatcatactaaacttgtgttttttttgttgttttgcagggtttagtcaaggctatc cataccattcttccgcaagccgagcatcgacaatgttcaaaacacatcatggataattggaaaagggaca accacgacattgagctacaacgtctattttggaagatagcacgcagctacaccgtagaagagttcaataa ttaccaggcagacttaaaaaggtacaatatccaagcctacacgtctctccaacttactagtccgattaca tggtctagagcattctttagaaccggtacatgttgcaacgacaatctcaacaatctgagtgagtcattca atagaaccattagacaagctaggcggaaaccactgttagatcttctagaggatattaggaggcaatgcat ggttaggacagccaaaaggtttatcattgctgacaggtgcaaaacaaagtacacaccaagagctcatgct gagattgagaagatgattgctggggtccagaatacacagagatacatgtccagggataatttgcatgaaa tctatgtcaatggagttggctactttgttgatatggacttaaagacatgcggctgcaggaaatggcaaat ggttgggatcccatgtgttcatgcaacatgtgtgataatagggaaaaaaagaaggttgagagctatgtga acgactactacacaagaaataggtggcgagaaacatatttccgtggtattaggcctgtccaagggatgcc tttgtggggtcgattgaataggctgcctgtcttgccaccaccatggagaagaggcaatgccggaaggcca agcaattatgcaagaaggaaaggaagaaatgaagttgcctcttcctcaaatccgaacaaaatgtcaaggg aaaagaggatcatgacatgctctaactgcttgcaagaagggcacaacaagaaatcatgcaaaaatgctac tgttttaagtccaccaaagagaccaagaggtcgaccaaggataaatgaggtttgtatatctttcatttct attttcaaaaattctgtttcaaaaactgattgtaatgtttgtattaggaaccacaagggtatgtagaagg atcagatggacatgataatggctcacaagggcagggtaatgtgttacaagggcaggaaaatgtgtcacaa gggcagaacaatggctcacaagggcagaacaatgggtcacaaacacaaagccaaagaggaagaggtcgtg gaacacaaagacaaaggggaacaactcgtggagcacaaagacagaggggaagaggtcgtggaacatcaca agtgtctgaacaaccacaaggagaagcacaaccgcaaggacttgctggacttgcaccatggtttgaatgt tctcgtggaacatgatatgctagtctcatgtttgtttttgttgtttgaacttgtctttatgacatatgtt tattctcggtttgtttttgttgtttgaacttgtctttatgacataatctaagtcttggttacttattgtt 
gtttgcacttgtctctctatatgattagcttagtctcagttttaagaagttgaccttttctttcaaatga aattcattaccattaccaaagctacatgtcattacatagtaaaagcataaccaaacacaaactaaacaac aaagacatcacattcaaactaagctcccattcgattacacactaataaccaagaacaaatttctggtttt ttttcttatagaacattgatattacaattgcaatcacactaaaacaaaggcacaccaagcacactttcgt catctttttccatctctgaatttctcttttgctccacattccatcttctttcaactctatcattagctcc tttaagtttgcaatttcaacagcaacctcagcttttgcattgattatctcgttatcaatgctggagattt tgggtagagcatctacaacctcttcatacatacactcctccacccatttgaacaaatgatcctgcaattc catctcaagcttttaccgaccaaacacaaaaaacatcacattctcatatcaaatctacttacatcttcac agcttttgcaccgaaagaaaggccttccagggttcttagccgtgctcgatgtaaacatgacgacttcgag cccacatttacacttaacaggaaacccacggctcttgaaacctccagcttcagcattcgaaatcattcct gaattgcagcctccactttcaccactcgaattccttgagttgcagctcataatgcagatttgatcgaaaa atttgagagagaatgatttttagggtttgggttttgatttggggatttttcagaggtttcgtcgactttg aagtctgttgatagcacaaaacgtcgtcgtatcaccttaacggatgttctgtaacggctgtcaattacgc tgttagaaggatttttaacggtggttaattaggccgagtcaacgcaaacttaaaggttaaaacagaagtc aactttttgttgagagttaaaacggccatttcacgagttcacggttaatttggccattttttgtaagttg aggtattaaataacatttttccc
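
The base composition declared on the SQ line and the terminal inverted repeats mentioned in the CC lines can both be sanity-checked against the sequence text. The Python sketch below is illustrative only and is not part of the original record. The helper names `parse_sq_line`, `base_counts`, and `reverse_complement` are hypothetical, and the check covers only the outermost 22 bases at each end rather than the full ~280-bp repeats. Any character that is not A, C, G, or T is tallied as "other", so stray non-base characters in a copied sequence will show up there.

```python
import re
from collections import Counter

# SQ header text copied from the record.
SQ_LINE = "Sequence 4853 BP; 1557 A; 877 C; 1108 G; 1311 T; 0 other;"

# First and last 22 bases of the sequence, copied from the record.
HEAD = "gggaaaaatgttatttaatacc"
TAIL = "ggtattaaataacatttttccc"

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def parse_sq_line(line):
    """Return (declared length, declared per-base counts) from an SQ line."""
    total = int(re.search(r"(\d+)\s+BP", line).group(1))
    counts = {
        match.group(2): int(match.group(1))
        for match in re.finditer(r"(\d+)\s+(A|C|G|T|other)\b", line)
    }
    return total, counts


def base_counts(sequence_text):
    """Return (length, per-base counts) after removing whitespace."""
    bases = re.sub(r"\s+", "", sequence_text).upper()
    tally = Counter(bases)
    counts = {base: tally.get(base, 0) for base in "ACGT"}
    counts["other"] = len(bases) - sum(counts.values())
    return len(bases), counts


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only a, c, g, t."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    declared_total, declared = parse_sq_line(SQ_LINE)
    print("declared counts sum to length:", sum(declared.values()) == declared_total)
    print("termini are inverted repeats:", reverse_complement(HEAD) == TAIL)
```

To compare a full copy of the sequence with the header, pass the sequence text to `base_counts` and compare the result with `parse_sq_line(SQ_LINE)`.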