;ID   ATMU4       DNA   ; ATH   ; 4496 BP
;XX
;DE   ATMU4 is a molecular fossil of an autonomous DNA transposon.
;XX
;AC   AC025417
;XX
;DT   18-SEP-2000 (Rel. 5.9, Created)
;DT   18-SEP-2000 (Rel. 5.9, Last updated, Version 1)
;XX
;KW   autonomous DNA transposon; MUDR superfamily; TIR;
;KW   transposase; DNA-binding protein; ATMU4.
;XX
;OS   Arabidopsis thaliana
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Plantae; Embryobionta; Magnoliophyta; Magnoliopsida;
;OC   Dilleniidae; Capparales; Brassicaceae.
;XX
;RN   [1]  (bases 1 to 4496)
;RA   Kapitonov,V. and Jurka,J.
;RL   Direct submission (September, 2000)
;XX
;CC   ATMU4 is a fossilized copy of an autonomous DNA transposon from the
;CC   MUDR superfamily.
;CC   It is flanked by a 10-bp target site duplication.
;CC   ATMU4 has 280-bp terminal inverted repeats and encodes
;CC   two proteins, ATMU4p1 and ATMU4p2. 
;CC   ATMU4p1 is a transposase encoded by 4 exons (357-1733, 
;CC   1965-2194, 2266-2629, 3041-3325): 
;CC   MPLVRLNGIISTIGSPMEDDELESSVPVVAPPAEVISESGGGDGGVEAAAVDNRRVSGRR
;CC   RTGSRVRVVEDDEPEIDPELEIDPEPDLEDDCAVYGDDDCDVVDDAVAGEGNEDNVVEED
;CC   ANLNIADDFPEAYRADEEAGSDSDTGDDIWDDEKIPDPLSSDDEDEVVRVGEEAVCGDED
;CC   DPEVLLAIEKTFNSPDDFKRAVLMYSLKTRYNINFYRSESLMVAAKCCYVNELGVNCPWR
;CC   VLCSYEKKKHKMQIRIYFNEHICVRSGYTKMLKRSTIAALFEERLRVNPKMTKYEMVAEI
;CC   KREYKLEVTPDQCAKAKTKVLKARNASHDTHFSRIWDYQAEVLNRNPNSDFDIETTARTF
;CC   IGSKQRFFRLYICFNSQKVSWKQHCRPVIGIDGAFLKWDIKGHLLAAVGRDGDNRIVPLA
;CC   WAVVEIENDDNWDWFLKKLSESLGLCEMVNLALISDKQSGLVKAIHNVLPQAEHRQCSKH
;CC   IMDNWKRDSHDMELQRLFWKISRSYTIEEFNTHMANLKSYNPQAYASLQLTSPMTWTIRQ
;CC   ARRKPLLDMLEDIRRQCMVRTAKRFIIAERLKSRFTPRAHAEIEKMIAGSAGCERHLARN
;CC   NLHEIYVNDVGYFVDMDKKTCGCRKWEMVGIPCVHTPCVIIGRKEKVEDYVSDYYTKEHN
;CC   LIKNKQFQVMDHKNKQLRVMDHKDMVHKDMDHKNKQLLVMDHKDMGHKDQELMDRRDKGH
;CC   HKDKDQELMLDQKHKHNLNHKRKPKNKDLLG
;CC   ATMU4p2 is a putative DNA-binding protein encoded on the
;CC   complementary strand (exons 4138-3974 and 3910-3605):
;CC   MSCKSGNSYPSILDGGCWGRGLASKCHCGLEVVIYTSASKSNPGRPFFRCPTKQDDHLFK
;CC   WVEYGVYEEVVEALPKISSIDSEIMKAKCEVAIEIEQLKTMIKEVKEEAMCSEREIKNWK
;CC   RMIKCCLVCLGFIVIVIVVGMIMFGNTKEQKLVLGY
;CC   ATMU4p2 includes a conserved motif (aa positions 26-63)
;CC   CHCGLEVVIYTSASKSNPGRPFFRCPTKQDDHLFKWVE present at the C-terminus
;CC   of DNA topoisomerase III in human, mouse and Drosophila, and
;CC   at the C-terminus of polyproteins encoded by ORF3 in banana streak
;CC   and sugarcane bacilliform retroid viruses.
;CC   There are ~100 highly diverged copies of ATMU4p2 encoded by
;CC   different families of ATMU-like DNA transposons inserted
;CC   in the genome.
;XX
;DR   Positions 80238   84733   Accession No AC025417   GenBank (rel. 116.0)
;XX
;SQ   Sequence 4496 BP; 1417 A; 814 C; 972 G; 1293 T; 0 other;
ATMU4
ggaaaaaatgtagttaaatcccccaactttcaaaaaatggccaatcaatacgtcaacttagatggaagcc
atttaaaacatcaacttacagttgactaacaaataaaacatgaactttgcgttgacgaggccacattata
cgcgtcgttaagtcgattaacagacccaataaacgacgttttctcccgttagtatctctctgttagtcaa
gaaacggtgacgtttagaggttaatagaaacccaattagggctttgtcgagaagaaattcccaaatcaaa
tcaaattcattcttccccaaattgcttttgttcttccccaaatcaaaaaccctaaattatcgacgagtgt
tagtggatgccattagttagactcaacggcatcatatcgacgattgggtcaccaatggaagacgacgagt
tagagagttctgtacctgtagttgctccaccggcggaagtcatttcagagtcaggtggaggagatggagg
agtagaggcagctgctgttgataacagaagagtttctggacgtcgtcggacaggaagtagagtcagagtc
gtagaagatgatgagcctgaaattgatcccgagcttgaaattgatcccgaaccggatttggaagatgatt
gcgccgtgtatggtgatgacgattgtgatgttgttgatgatgcagtagctggtgaaggcaacgaagataa
cgttgttgaagaagatgctaacctcaacattgcagatgattttcccgaggcttatagagcagatgaagaa
gctggttctgattccgacaccggagatgatatctgggatgacgaaaagatcccagaccctttgtcatctg
acgacgaagatgaggttgttagagtcggagaagaggcagtttgtggagatgaagacgatccagaggtttt
gctagctatagagaaaactttcaactctccggatgacttcaagcgtgcagtattgatgtactctttgaag
acaaggtataacatcaatttctataggtctgaatcattaatggttgctgctaagtgttgctatgtaaatg
agctaggtgttaattgtccgtggagagtcttatgttcatatgagaagaagaagcataagatgcaaataag
aatttacttcaacgagcatatatgtgtgaggtccggttatacaaaaatgttgaagcggtctaccattgca
gctttgtttgaggaaaggctgagagttaatccaaagatgacaaaatatgagatggttgctgagataaaga
gagagtacaaattggaagtaactccagatcagtgtgctaaggcgaaaaccaaagttctaaaggcaagaaa
tgctagtcatgatacccatttttcaaggatatgggactatcaagcagaggtattaaatcggaacccaaac
agtgactttgacatcgagacaactgcgagaacgtttattggaagcaagcagaggttttttcgcttgtata
tatgttttaattctcaaaaggtttcatggaaacaacattgtagaccagtcataggcatagatggagcgtt
tttgaaatgggacataaaaggtcatcttcttgctgcagttggaagagatggtgataataggattgttcct
ctagcttgggcagtggtagaaatagaaaacgatgataactgggactggtttttgaagaaactgtctgaaa
gtttagggctttgtgaaatggtgaatctagctttaatttcagataagcaatcggtaagcaatcaagcttt
agtttaagctttattcgtgtttatgcttcagtttgtattctttgtttagtttcagcttttattcatgctt
tagtttcagctttattcatgctttagtttcagctttattcatgctttgtttagtttcagcttttattcat
ggatcactttcagcttttattcttggtttagtttcagcttttaatgatggattcatatttctgtgatttt
tcagggtcttgtcaaagcaatccataatgtactgccacaagcggagcatcgccaatgttctaagcacata
atggataattggaagagggacagccatgatatggagctacaacgtttgttttggaagatatcccgcagtt
acaccatagaagagttcaacactcatatggcgaatctaaagagctacaatcctcaggcttatgcatctct
gcagcttactagtcctatgacctggtcagagccttctttcgaataggtacctgctgtaatgacaacctaa
acaatttgagtgagtcttttaataggactattagacaagccagaagaaagccgttattggatatgctaga
agacatccggaggcaatgcatggtcagaactgcaaagaggtttatcatagctgagaggttaaagtcaaga
tttactccgagagctcatgctgagattgaaaaaatgattgctggttctgcaggttgcgaaagacatttgg
cgagaaacaatttgcatgagatatatgtgaatgatgttggatactttgttgatatggataaaaagacttg
tggttgcaggaaatgggagatggttggaatcccatgtgttcatacaccatgtgtgataataggcagaaaa
gaaaaggttgaagactatgttagtgactactacacaaaggtaagatggagagagacatacagagatggta
ttaggcctgtccaaggaatgccattgtggcctagaatgagtaggttgcctgtcttaccgccaccttggag
aagaggcaatcctggaagacaaagtaactatgctagaaagaaaggaagatatgaaacagcctcttcttca
aacaagaacaagatgtcacgggctaatcgaataatgacatgctctaattgtaagcaagaagggcacaaca
agagctcatgtaagaatgctactgtcctgttaccaccaccgagaccaagaggtcgaccaagtctaaatca
ggttacacttcttttacgtttcagattctcaacaagagcttcattgaattatgtacgtttcagattctca
tcaaggcttcattgaattaggaaccacaaggagcacaatcttatcaagaacaagcagtttcaggtcatgg
atcacaagaacaagcagcttcgggtcatggatcacaaggacatggttcacaaagacatggatcacaagaa
caagcagcttctggtcatggatcacaaggacatgggtcacaaggatcaagagctcatggaccgcagagac
aaagggcatcacaaagacaaagatcaagagctcatgttagatcagaagcacaagcacaatctcaaccaca
agcgcaagcccaagaacaaagacttgctgggttagaatcatggtttaattgttcatctcagctctaattg
tagtatttctgtgacatttgcgtttttttttgtcaaaaacttaagctttgcctcagggttttttggttgt
ttgaacttaagcacttctcttgttgtttgcttcttgttaaactttgaagaagataacctatgtttttgtt
gtttggttcttcttgtatgcaaacacatgatctagaaaaacatcacactcattttcattacgatagtaaa
atacaaacacaaagacaaagaacaaaacgttgcactagtaaccaagaacaagcttttgttctttggtatt
accgaacattatcataccaaccacaatcacaatgacaataaaaccaagacacaccaagcaacatttgatc
atccgtttccaattcttgatttccctttcgctacacatagcttcttcctttacttcttttatcatagtct
tcaactgctcaatctcgattgcaacctcacatttggccttcattatctcactgtcaatgctggagatttt
tggtaaagcctctacaacctcttcgtacacgccatattcaacccatttaaacaaatgatcctgttttcaa
tcggaaatcaattaccaaccaaaccaaaaagacacaaactcaattcaacttacatcttgtttcgttggac
accgaaagaatggccttccaggattcgactttgaagctgaagtatagatgacgacctccaaaccacagtg
acacttcgaagctaagccacgtccccaacaacctccgtctagaatactaggataagaatttccagatttg
cagctcatttcttcagcttttgatggagaatttggagaaaaagcagagaatttggagaagaagaagagaa
ttagggaattagggatttaaagtattggatttggggttttttgacaaacaatcgggtttctattaaccta
taaacgtcaccgtttcttgactaacagagagatactaacgggagaaaacgtcgtttattgggtctgttaa
tcgacttaacgacgcgtataatgtggcctcgtcaacgcaaagttcatgttttatttgttagtcaattgta
agttgatgttttaaatagcttccatctaaattgacatattgattggccattttttgaaagttgggggatt
taacgacatttttccc1
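
The exon coordinates, base counts and GenBank positions quoted in the annotation lines of the ATMU4 record can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original record. The function names (`coding_length`, `expected_residues`) are hypothetical, the coordinates are copied by hand from the CC, DR and SQ lines, and the code assumes that the exon coordinates are 1-based and inclusive and that the last exon of each gene ends with the stop codon.

```python
# Exon coordinates copied from the CC lines (assumed 1-based, inclusive).
ATMU4P1_EXONS = [(357, 1733), (1965, 2194), (2266, 2629), (3041, 3325)]
# ATMU4p2 lies on the opposite strand, so each start is larger than its end.
ATMU4P2_EXONS = [(4138, 3974), (3910, 3605)]

# Values copied from the SQ and DR lines.
BASE_COUNTS = {"A": 1417, "C": 814, "G": 972, "T": 1293, "other": 0}
GENBANK_SPAN = (80238, 84733)
RECORD_LENGTH = 4496


def coding_length(exons):
    """Total number of bases covered by the exons, on either strand."""
    return sum(abs(end - start) + 1 for start, end in exons)


def expected_residues(exons):
    """Residue count implied by the exons.

    Assumption: the last exon includes the stop codon, so one codon
    is subtracted from the total.
    """
    length = coding_length(exons)
    if length % 3 != 0:
        raise ValueError(f"coding length {length} is not a multiple of 3")
    return length // 3 - 1


if __name__ == "__main__":
    print("ATMU4p1 residues:", expected_residues(ATMU4P1_EXONS))
    print("ATMU4p2 residues:", expected_residues(ATMU4P2_EXONS))
    print("base counts sum:", sum(BASE_COUNTS.values()))
    print("GenBank span:", GENBANK_SPAN[1] - GENBANK_SPAN[0] + 1)
```

Under these assumptions the implied protein lengths should agree with the translations listed in the CC lines, and both the summed base counts and the GenBank span should equal the 4496 bp given in the ID line. A mismatch would point to a transcription error in the coordinates rather than a problem with the sequence itself.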