;ID   ATHILA6A_I  DNA   ; ATH   ; 7835 BP
;XX
;DE   ATHILA6A_I is an internal part of the ATHILA6A endogenous
;DE   retrovirus - a consensus sequence.
;XX
;AC   .
;XX
;DT   17-MAY-2001 (Rel. 6.1, Created)
;DT   17-MAY-2001 (Rel. 6.1, Last updated, Version 1)
;XX
;KW   Gypsy-like endogenous retrovirus; ATHILA superfamily; long terminal
;KW   repeat; reverse transcriptase; integrase; ATHILA6A_LTR; ATHILA6B_I; 
;KW   ATHILA6A_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC   Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC   Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1]  (bases 1 to 7835)
;RA   Kapitonov,V.V.
;RL   Direct submission (May 2001)
;XX
;CC   ATHILA6A_I is an internal part of the ATHILA6A endogenous 
;CC   retrovirus. There are several copies of ATHILA6A_I in the genome;
;CC   they are ~99% identical to the consensus sequence. The consensus
;CC   sequence was derived from 5 proviral copies. These copies are
;CC   flanked by 99% identical LTRs (ATHILA6A_LTR). They have generated
;CC   5-bp target site duplications upon their integration into the
;CC   genome.
;CC   ATHILA6A_I encodes two proteins: 911-aa ATHILA6A1p (positions
;CC   173-2905) and 648-aa ATHILA6A2p (positions 4237-6180).
;CC
;CC   ATHILA6A1p:
;CC   MQTRSRGNQNLLFNDNIDRIARQLRTQTETDTMAAVVDEQVQPNNIGAGDAPRNHNQRNGIVPPPVQNNN
;CC   FEIKSGLIAMVQSNKFHGLPMEDPLDHLDEFDRLCSLTKINGVSEDGFKLRLFPFSLGDKAHQWEKSLPQ
;CC   GSITSWNDCKKAFLAKFFSNSRTARLRNDISGFTQTNNETFCEAWERFKGYQTQCPHHGFSKASLLSTLY
;CC   RGVLPKIRMLLDTASNGNFLNKDVEDGWELVENLAQSDGNYNEDYDRSVRTSSDSDEKHRREMKAMNDKL
;CC   DKLLLVQQKHIHFLGDDETFQVQDGETMQSEEVSYVQNQGGYNKGFNNFKQNHPNLSYRSTNVANPQDQV
;CC   YPSQQQNQPKPFVPYNQGQGYVPKQQYQGNYQQQLPPPGFTQQQQQPASTTPDSDLKNMLQQILQGQAAG
;CC   AMDLAKKMAEIHNKVDCTFNDLNIKLEALTSKVRYMEGQTASTSAPKVTGLPGKSIQNPKEYATAHAITI
;CC   CHDRELPTRHVSTSITEDSDVQDGEVSTQIEISVVGLDHSAGSRFQTQSNLDEKAAIIERMVKRFKPAPL
;CC   PSRALPWKFRKAWIERYNSLAEKQLDEMEAVMPLIEVLNLIPDPHKDVRKSILERIKIHQDSEDECDAIP
;CC   SRTTVKRSVQEKLEDPGTFTLPCSIGQLVFSNCLCDLGASVSLMPLSVARKLEFTQYKPCDLTLILADGS
;CC   SRKPFGLLQDLPVMINGVEVPTDFVVLDMEAEPKDPLILGRPFLASVGAMIDVRDGRISLNLGKHIKLQF
;CC   DINETSQRSAVEERIRAQPQPSNSITRPSTASTPDLRDLKKKSDEQEETIEKLAQTVEELKSKLDQMQEI
;CC   AKSKCGNNTIPRKKITSRWSEEIDYPPEEKEAYFEERRIEYSATHLSREDAEYDDEIREDYADPLYHPFS
;CC   S
;CC
;CC   ATHILA6A2p:
;CC   MSNYSGESSMDADYNVDEAESWSTRPEREQQAYESFRAETQRSVARRNERRAEIARGKRAMTSRYELIDE
;CC   DIDVEYEPESWHRETKLLNKPDEVTVEEYIRLFELNDFWGTRYPCYETLAQLGLLEDVQHLFEKCHLETL
;CC   MSYPYVAYKKETIEFLSTLQVELYQGLTADELESEGLGFLTFSVNEQRYQLSIKSLEGLFGFPSGKGTKP
;CC   KFEREELKDLWLTIGNDLALNSARSKSNQIRSPVIRYYQRSVANVLYPRESTGTVSNTDMEMIDSALKGI
;CC   LRRTKGKKVLKGDLNDTPPVMLLLIHLCGYRKWAHTNGRKKVRGALCVGGVVTPILIACGVPLTSPGFDP
;CC   RMMDLDHLRRCEFLEYDMVGDFYRYKFEHSLTRTANILLPCIEATTILQGENIDFRPARDYLYFESAPPT
;CC   DDNVPTTEATEDDIAETDEDREEEYDTSMYHFSEHVPPARESKSLSEAHRNNSKLQRWCKKQDRLLIKCF
;CC   KAITFLTDKISCFSSTTAIPQGERPQDMPSRRYDAPGPSHHRPEPSHHRPEPSDRVVPPVPARHSSFEPR
;CC   ELGRKKKAALARSGSRSTRLLQSRSLRDRGAGRSRRREVEYHQSGAGRDEGAEVEYPQGEAETQQGDSSM
;CC   AWEQSQAAIDDQLRSFFH
;CC
;CC   ATHILA6A_I and ATHILA6B_I share identical ~1400-bp and ~1850-bp
;CC   3' and 5' ends, respectively. However, their central portions
;CC   (positions 2400-5970 in ATHILA6A_I) are not similar to each other.
;CC   Since the central portion of ATHILA6B_I encodes the reverse
;CC   transcriptase and integrase, ATHILA6A was a non-autonomous
;CC   retroelement whose transposition was mediated by ATHILA6B.
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 7835 BP; 2246 A; 1792 C; 1765 G; 2032 T; 0 other;
ATHILA6A_I
atttggcgccgttgccaattgggtgtttgtttgctatatttgagatttcagaatatttaagatcaagttc
tttttcattttttctgaaagttactaactttgtgttatttctgtttgtctgttttgattcaggtactacg
aatcactagcttcaccagttggtactcgttgtatgcaaacacggtcacgaggaaatcagaatctcctgtt
caacgataacatcgaccgtattgcacgccaactcagaacacagacagaaacagacacaatggctgcagtt
gttgatgagcaggtgcaaccaaacaacataggtgcaggcgatgcaccacgcaaccacaatcagcgtaatg
gcatcgtgcctccacctgtgcagaacaacaactttgagatcaagagcggtctcattgccatggttcagag
caacaagttccacggcctccctatggaggatccactcgatcacctggacgagtttgataggctctgcagc
ctaacaaagatcaatggagtcagtgaggatggcttcaagctcagattgtttcctttctctcttggagaca
aagcccatcagtgggaaaagtcgcttcctcaaggctctatcacctcctggaatgactgcaagaaagcctt
cttggctaagtttttctcaaactccaggactgcgagactaaggaatgatatctccggtttcactcagacg
aataatgagactttctgcgaagcatgggagcgcttcaaaggttatcagacgcaatgtcctcatcatggtt
tctccaaagcttcgcttctcagcactctctacagaggtgtcctcccaaagatcaggatgcttctagatac
cgcttctaacgggaactttctcaacaaagacgttgaagacggatgggagctggtagagaacttagctcag
tcggatggcaactataatgaagattatgatagaagcgtccgcaccagctctgattctgatgagaagcacc
gcagggaaatgaaagccatgaatgacaaactggacaagctactgcttgtgcaacagaagcacattcattt
tctgggtgatgatgagacgttccaagttcaggatggggagactatgcagtcagaagaggtcagttatgtg
cagaaccaaggaggttacaacaaaggtttcaacaacttcaagcagaaccatcccaatctgtcttacagaa
gtacaaacgttgcaaacccgcaggaccaagtttacccttcacagcagcagaatcaacccaagccctttgt
tccatacaaccaaggtcaagggtatgttcctaagcagcagtatcagggcaactatcagcagcaacttcca
ccacctgggttcacacagcagcaacaacaaccagcttcaacaactccagattcagacttgaagaacatgt
tacagcagattctccaagggcaagctgcaggagcaatggatctcgccaaaaagatggccgaaatccacaa
caaggttgattgtactttcaacgatctgaatatcaaacttgaggcacttacctctaaggtcagatacatg
gaaggacaaacagcgtcaacttctgctccaaaagtaacaggacttccaggaaagtccatacagaacccga
aggaatacgccaccgctcacgctatcaccatctgtcatgatcgagagctgcctactcgacatgtctccac
atcaatcaccgaggacagtgatgttcaagacggggaggtttctactcagattgaaatttcagtggttgga
ctcgaccattcagcaggatcccgttttcaaacacagtccaacctagacgagaaagcagccatcattgaga
ggatggtaaaacgattcaagccagcaccattaccttcacgtgctcttccatggaaattcaggaaagcatg
gatagaaagatacaattctcttgcagagaagcagcttgatgagatggaagcggtgatgcctctaatagaa
gtgctcaacctaatcccggatcctcacaaagatgtgagaaagtcaattctggaaaggatcaagattcatc
aagattcagaagacgaatgtgatgctattccgtctaggaccactgttaagaggagtgttcaagaaaaact
ggaagatccaggaactttcactctaccatgttccatcggccaattggttttcagcaattgtctttgtgat
ttgggagcttcagtaagcttaatgccactctcagtggcaaggaagctggaattcactcagtacaaacctt
gcgacctgactttaatccttgctgatgggtcttcaagaaaacccttcggccttctacaagatctgccagt
aatgattaatggagtggaagtgcctacggatttcgttgtgcttgatatggaagcagaacctaaggatcct
ctaatcctaggaagacctttcttagcctcagtgggagcgatgatagatgtcagagatgggagaataagtc
tcaaccttgggaagcacatcaagctgcagtttgacatcaacgaaacttcgcaaaggtcagctgtagaaga
aaggatcagggctcaacctcagccttcgaattcaatcactagaccaagcacagcctctacacccgacttg
cgagatctcaaaaagaaatctgatgagcaagaagagaccatagagaagctagctcagacagttgaggaac
ttaagagtaaactggatcagatgcaagagatagctaaatccaaatgcgggaataacactatcccgaggaa
aaagattacttcaagatggtctgaagagatagattatccaccagaagagaaagaggcctatttcgaggaa
agaagaattgagtattcagctactcatctctcaagagaggatgctgaatatgatgatgagatcagagagg
actatgcagaccctctctatcacccattttcttcttaatgagtgtgaggagtcaagctagagactttaaa
caagctcacttgggaggaattcccaagactgtttctgtaaataaaacttttattttctcgttatttttga
tttgtttttggttgtgtttgtgattctcaggaacagagaaacagcgtggaggtagagtaaaaatttaaaa
tttttactctacagagcaacaggagatcgagtatttcagaaattcaagaatttgaaaaaacctctgttgc
actcagaggccatgaggtcgagtaaactggtcgagtattagtgatgattctaaaaaccaaaattttgaaa
tcatatctatgctcgaccaacagaagctacagagacttacagggagttcaacaaatttacagaggattac
agaaggttctagtcaacagaaaacagtgcttcaggacaaatagacagaacgtggcccaccacctctcact
tttgttcccccacgcgttttaagagataacaaaatctcttccttccttctccaccactcgatctcaccat
atcctctcccaccgacatcatctctctccctctccaaacactcgaccacgaaaactcactccacctcacg
tccatcactcgaccaaacctcgcaatcccttttcctctccaatcactcgaccgcaccatcacactcgacc
tcaccgttcactcgatctcgcccctccggcttcgtctctctccttcactagacgacggtactcgatttcg
ccatttcttccttctcaccactcgacggttactcgaccgctccgcccttcacttcgccggaagcttcacc
gcctcaccgtcgcctccttaccactcgaccgcaactcgacctctccgtgcttcatcttcatcttcattta
ctcgacttcgccggagcttctcgccgttccagcttcgcatactctctcaccgtctctcgtttcactcgac
cacttcacacttcgcctcaacatcttcgccggagtttctcgccattgtccgtgcttccgtcatctccgtt
cactcgaccaccggaccggcttcaccatctctcaactatccaccattcactcgacctcgccattcactgc
gcctccattcgtctctttactcgactgctcctcaaaccgccaccgtcttctctaaattcgccgtttactc
gaccacactgttacgtctctcattcgtgtacagtcgaccgctatacccgaagccacaatatcactgtact
cgaccgtttcactcgaccgcgtacttgactggtttagtgtgtgtgtttatttgaactaacatattgatat
ttggttttgagttacattctttttcagggaatcaatatgagtaactacagtggcgaatcctccatggatg
cggattacaacgtcgatgaagctgaatcttggtcaactagaccagagagagagcaacaggcttatgagag
cttcagagccgagacccaacgctcagtagctcgacgcaatgaaaggagagctgagattgctagaggaaag
agagcgatgaccagcagatatgagttgatcgacgaagatattgacgtcgagtatgagcctgagtcatggc
acagagaaacaaaactgttgaacaagcctgatgaagttacagtggaggagtacatcagacttttcgagct
gaacgacttctggggaacgaggtacccctgttatgagactctagcccagcttgggctactggaggacgta
cagcacttattcgagaagtgccatcttgagacgctgatgtcttacccgtacgtcgcttacaagaaggaaa
caatagagtttctctccactctgcaagtggagttgtatcagggacttactgcagatgaactggagagtga
agggttggggttcttgactttttcagttaacgagcagcgttaccagctatctatcaagagcttggaagga
ttatttggttttcccagtggaaagggaactaaacccaagttcgagagggaagagttgaaggatttgtggt
taaccattgggaacgatttggcgctcaactctgcaaggtctaagagcaaccagattcgaagccctgtgat
ccgctactatcagcgctcagtagcgaatgttctgtaccccagggaatctacaggcaccgtgtctaacaca
gacatggagatgattgattctgcactcaagggcattctccggagaacaaaggggaagaaggtcctaaagg
gcgaccttaatgatacaccaccggtcatgcttctgttgatccacctgtgtggatacaggaagtgggcgca
caccaacgggaggaagaaggtgcgaggagccctttgtgtaggtggcgttgtgacaccgattctgattgca
tgtggtgtacctctcacgtctccagggtttgatccgaggatgatggatttagatcacttgcgtcgttgtg
agtttctggagtacgacatggttggcgatttctatcgctacaaattcgagcactccctgacccgaacagc
caacattttgcttccctgcatcgaggccacaaccatacttcagggtgagaacattgacttcagacctgcg
cgtgattacctctactttgagagcgctccaccgactgatgacaatgtccctacgacggaagctacagagg
atgatattgctgagacggatgaggatagggaggaggagtatgatacgagcatgtatcatttcagtgagca
cgtacctccagcgcgggagagcaagagcttgagtgaagctcacagaaacaacagtaagttgcagaggtgg
tgcaagaaacaagataggctacttatcaagtgcttcaaagccatcacgtttctgacggacaagataagct
gcttctcttctactacagctattccgcagggagagcgtcctcaggacatgccttcgaggagatatgacgc
gccagggccaagtcatcacaggcctgagccaagtcaccacaggcctgagcctagtgaccgagtagtccca
ccagtccctgcgaggcattcatcattcgagcctcgggagctcgggagaaagaagaaggctgcactcgctc
ggtctggcagcaggagtacacgacttctccagtcccgtagcttacgcgaccgcggtgctggccgcagcag
aagaagagaggtcgagtatcatcagagcggtgctggccgcgacgaaggagcagaggtcgagtacccccag
ggggaagctgagacacaacagggagattcttcgatggcctgggagcaatcacaggcagctattgacgacc
aactccgctccttcttccactgaggtatgcacctcactccaccattgtaatataccatctcttgttttta
ttttgtttttgtgatgtgtttttgtcctgagtactctcttccaaatttggtcacacagtggactgtgtga
tttaagtttgggggagggctcaggaagtgtgtgttgcattgtataatcttgagtctgcattcatctaagg
catagaaaaaccaaaaaaattgaaaaattccagaaaatgatttcacaaaaatagagtgttcatgtagttg
cattgcatttaggatcgagtctagagtgtttcgtttaggattgttgcatatgcataggggataatgatga
gatagccttgtaagcattttggttcaccagataagctcagtgccctcgttgttagttgtttgatgcgttg
tcattgaaattgaagtaagaactgcaccatgcctagattgctctactcgaccacactgttaggatctgat
atcattccctatcaatttgaacttgaatctgatttagaattatcatgtcttggcatcgaatttgaactca
tggataccctaaaatacttggattttcttactcattttaaccactcttgttgatccaagtagctgactct
ccttattagagcagttaacccatacccaaacctgaactttctttcaagccctatatcacttgtgagtgtt
tgtgaggtcttatttcgattgagcttggtagaaagtgttaggttcgtaacgacagagatagtgtctcatg
tagttctagttcgcgttcttcagactggataggactaggtgggcgcttatatcatgggttgggatgtgtt
taaaagaaaagagggaatccattgttgatgaggaaagggaaagaattctaggggaagtaagctaaagaag
ttagaaaaaaatctagtaaaggttttgggaatgttaaagaaaagaatgaggttcttgttagctaaagaag
aagggttaaaagcctttggttttaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaacaggaacct
tagttgttaaagaaatccaaacccgctagatgtatcagagcgttgagaaagcttctcctagagttaagag
aaaagaaaagaatgatatgaaaaagagtttgaaagattcatgagtgcaaagggtagagttaagttgggac
aggagttgggattaccattagagcttcattgttatactctgggtagatgggatcttatctctgtatgcat
aacttgggacttacctttagcattctactaaagctcaatcattcttgagggatcccctgttacttaagcc
tattctgtaggggaccatctttgtctcttgaccttcaccttagccaaatgagttcattgatgatgcattg
cttgattcacgttccagaactaatgaatgttaaagggattggtagatttgaaaacatgtgtaggtcgagt
ataagagacggatttattgataacaaggcatggctaacgtttttgagtggaattcaatcatatcgcatct
tagaactaccaacttggacattgattttatttgctctatcatatgctttggttctgagtccccgccttca
ctcctctccttcaactatgtcttcttatttgcttgagggcaagcaaagactaagtttgggggagt