;ID   ATHILA6B_I  DNA   ; ATH   ; 7963 BP
;XX
;DE   ATHILA6B_I is an internal part of the ATHILA6B endogenous
;DE   retrovirus - a consensus sequence.
;XX
;AC   .
;XX
;DT   17-MAY-2001 (Rel. 6.1, Created)
;DT   17-MAY-2001 (Rel. 6.1, Last updated, Version 1)
;XX
;KW   Gypsy-like endogenous retrovirus; ATHILA superfamily; long terminal
;KW   repeat; reverse transcriptase; integrase; ATHILA6A_LTR; ATHILA6A_I; 
;KW   ATHILA6B_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC   Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC   Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1]  (bases 1 to 7963)
;RA   Kapitonov,V.V.
;RL   Direct submission (May 2001)
;XX
;CC   ATHILA6B_I is an internal part of the ATHILA6B endogenous
;CC   retrovirus. There are several copies of ATHILA6B_I in the genome;
;CC   they are ~99% identical to the consensus sequence. The consensus
;CC   sequence was derived from 3 proviral copies. These copies are
;CC   flanked by 99% identical LTRs (ATHILA6A_LTR). They have generated
;CC   5-bp target site duplications upon their integration into the
;CC   genome.
;CC   ATHILA6B_I encodes the 1881-aa polyprotein ATHILA6Bp (positions
;CC   173-5815). The middle portion of ATHILA6Bp is composed of the
;CC   reverse transcriptase and integrase domains. These domains are
;CC   encoded by the DNA region that starts at position 3170 and ends
;CC   at position 5683.
;CC   ATHILA6Bp:
;CC   MQTRSRGNQNLLFNDNIDRIARQLRTQTETDTMAAVVDEQVQPNNIGAGDAPRNHNQRNGIVPPPVQNNN
;CC   FEIKSGLIAMVQSNKFHGLPMEDPLDHLDEFDRLCSLTKINGVSEDGFKLRLFPFSLGDKAHQWEKSLPQ
;CC   GSITSWNDCKKAFLAKFFSNSRTARLRNDISGFTQTNNETFCEAWERFKGYQTQCPHHGFSKASLLSTLY
;CC   RGVLPKIRMLLDTASNGNFLNKDVEDGWELVENLAQSDGNYNEDYDRSVRTSSDSDEKHRREMKAMNDKL
;CC   DKLLLVQQKHIHFLGDDETFQVQDGETMQSEEVSYVQNQGGYNKGFNNFKQNHPNLSYRSTNVANPQDQV
;CC   YPSQQQNQPKPFVPYNQGQGYVPKQQYQGNYQQQLPPPGFTQQQQQPASTTPDSDLKNMLQQILQGQATG
;CC   AMDLSKRMAEIHNKVDCSYNDINIKVEALTSKIRYIEGQTGSTAAPKFTGPSRKSMSNSEEYAHAITLRS
;CC   GKELPTKESPNQNTEDSVDQDGEDFCQNGNSAEKAIEEPILDQPTRLLAPAASPLVEKPAAAKTKDNVFV
;CC   PPPYKPPLPFPGRFKKVMIQKYKALLEKQLKNLEVTMPLVDCLALIPDSNKYVKDMITERIKEVQGMVVL
;CC   SHECSAIIQQKIIPKKLGDPGSFTLPCALGPLAFNKCLCDLGASVSLMPLSVAKKLGFNKYKPCNISLIL
;CC   ADRSVRIPHGLLEDLPVMIGMVEVPTDFVVLEMDEEPKDPLILGRPFLATAGAIIDVKKGKIDLNLGRDL
;CC   KMTFDITNTMKKPTIEGNVFWIEEMDMLADEMLEELGETDHLQSALTKDSKEGDLHLEILGYQKLLDEHK
;CC   AVENPGEYEDLGERAREEYILDLITRPTAHSVYSTELLDHNNPSEANLVSDDWSELKAPKVDLKPLPKGL
;CC   RYVFLGLNSTYPVIVNDGLTADQVNLLITELKKYRKAIGYSLDDIKGISPTLCTHRIHLENESYSSIEPQ
;CC   RRLNPNLKEVVKKEILKLLDAGVIYPISDSTWVSPVHCVPKKGGMTVVKNSKDELIPTRTITGHRMCIDY
;CC   RKLNAASRKDHFPLPFIDQMLERLANHPYYCFLDGYSGFFQIPIHPNDQEKTTFTCPYGTFAYKRMPFGL
;CC   CNAPATFQRCMTSIFSDLIEEMVEVFMDDFSVYGSSFSSCLLNLCRVLKRCEETNLVLNWEKCHFMVREG
;CC   IVLGHKISEEGIEVDKAKVDVMMQLQPPKTVKDIRSFLGHAGFYRRFIKDFSKLARPLTRLLCKETEFAF
;CC   DDECLTAFKLIKEALITAPIVQAPNWDFPFEIMCDASDYAVGAVLGQRIDKKLHVIYYASRTMDDAQVRY
;CC   ATTEKELLAVVFAFEKFRSYLVGSKVTVYTDHAALRHIYAKKDTKPRLLRWILLLQEFDMEIVDKKGIEN
;CC   GVADHLSRMRIEDEVPIDDSMPEEQLMAIQQLNESAQIRKSLDQVCTIEEKLPWYADHVNYLVSGEEPPN
;CC   LSSYEKKKFFKDINHFYWDEPYLYTLCKDKIYRRCVSEDEIEGILLHCHGSAYGGHFATFKTVSKILQAG
;CC   FWWPSMFKDAQEFISKCDSCQRRGNISRRNEMPQNPILEVEIFDVWGIDFMGPFPSSYGNKYILVAVDYV
;CC   SKWVEAIASPTNDARVVLKLFKTIIFPRFGVPRIVISDGGKHFINKVFENLLKKHGVKHKVATPYHPQTS
;CC   GQVEISNREIKAILEKIVGSTRKDWSAKLDDALWAYRTAFKTPIGTTPFNLLYGKSCHLPVELEYKAMWA
;CC   VKLLNFDIKTAEEKRLIQLNDLNEIRLEAYESSKIYKERTKSFHDKKIVSRDFKVGDQVLLFNSRLRLFP
;CC   GKLKSRWSGPFSVTAVRPYGAITLAGKNGDFTVNGQRLKKYMIDQFIPEGTSVPLEEPLNA
;CC   ATHILA6B_I and ATHILA6A_I share identical ~1400-bp 3' ends and
;CC   ~1850-bp 5' ends. However, their central portions (positions
;CC   2475-6118 in ATHILA6B_I) are not similar to each other. Since the
;CC   central portion of ATHILA6B_I encodes the reverse transcriptase
;CC   and integrase, ATHILA6A was a non-autonomous retroelement whose
;CC   transposition was mediated by ATHILA6B.
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 7963 BP; 2399 A; 1633 C; 1801 G; 2130 T; 0 other;
ATHILA6B_I
atttggcgccgttgccaattgggtgtttgtttgctatatttgagatttcagaatatttaagatcaagttc
tttttcattttttctgaaagttactaactttgtgttatttctgtttgtctgttttgattcaggtactacg
aatcactagcttcaccagttggtactcgttgtatgcaaacacggtcacgaggaaatcagaatctcctgtt
caacgataacatcgaccgtattgcacgccaactcagaacacagacagaaacagacacaatggctgcagtt
gttgatgagcaggtgcaaccaaacaacataggtgcaggcgatgcaccacgcaaccacaatcagcgtaatg
gcatcgtgcctccacctgtgcagaacaacaactttgagatcaagagcggtctcattgccatggttcagag
caacaagttccacggcctccctatggaggatccactcgatcacctggacgagtttgataggctctgcagc
ctaacaaagatcaatggagtcagtgaggatggcttcaagctcagattgtttcctttctctcttggagaca
aagcccatcagtgggaaaagtcgcttcctcaaggctctatcacctcctggaatgactgcaagaaagcctt
cttggctaagtttttctcaaactccaggactgcgagactaaggaatgatatctccggtttcactcagacg
aataatgagactttctgtgaagcatgggagcgcttcaaaggttatcagacgcaatgtcctcatcatggtt
tctccaaagcttcgcttctcagcactctctacagaggtgtcctcccaaagatcaggatgcttctagatac
cgcttctaacgggaactttctcaacaaagacgttgaagacggatgggagctggtagagaacttagctcag
tcggatggcaactataatgaagattatgatagaagcgtccgcaccagctctgattctgatgagaagcacc
gcagggaaatgaaagccatgaatgacaaactggacaagctactgcttgtgcaacagaagcacattcattt
tctgggtgatgatgagacgttccaagttcaggatggggagactatgcagtcagaagaggtcagttatgtg
cagaaccaaggaggttacaacaaaggtttcaacaacttcaagcagaaccatcccaatctgtcttacagaa
gtacaaacgttgcaaacccgcaggaccaagtttacccttcacagcagcagaatcaacccaagccctttgt
tccatacaaccaaggtcaagggtatgttcctaagcagcagtatcagggcaactatcagcagcaacttcca
ccacctgggttcacacagcagcaacaacaaccagcttcaacaactccagattcagacttgaagaacatgt
tacagcagattctacaagggcaagcaacaggagcaatggatttgtccaaaaggatggcagaaattcacaa
caaggttgactgcagttacaatgatatcaatatcaaggttgaagcactgacatctaagattagatacata
gaaggccaaactgggtcgaccgcagcacccaaatttacaggaccttccagaaaatcaatgtcaaattcag
aggagtacgctcatgctatcacacttagaagcggtaaagaattacccacaaaagagagcccgaaccagaa
cactgaggacagtgtggatcaagatggggaggatttctgtcaaaatggaaattccgctgaaaaagcaata
gaagagcctatactcgaccaacctactcgactactagctcctgcagcatctcctcttgttgaaaaaccag
ctgccgccaaaaccaaagacaacgtcttcgttcccccaccttacaagcccccactcccatttcctggtcg
attcaagaaggtcatgatacagaagtacaaagctctcctggagaaacaattgaaaaatcttgaagtcacg
atgcctcttgtcgactgtcttgcacttataccagactctaacaaatatgtgaaagacatgattacagaga
ggatcaaagaagtccaaggaatggtggtgctaagtcacgaatgcagtgcgatcattcagcagaagatcat
tccaaaaaaactaggagatcctggttccttcaccttaccttgcgctctaggccctttagccttcaacaag
tgcctctgtgatctaggtgcatctgtgagtctcatgcctctctctgtggcaaagaaactgggcttcaaca
agtacaagccctgcaatatatccttgatcctagctgacagatctgttaggattccccatggtctgctgga
ggacttaccagtcatgataggcatggtcgaagtaccgacagactttgtggttttagaaatggatgaagaa
cccaaagaccctttgatattgggaagaccatttttagctacagctggagctatcattgatgttaagaagg
gaaaaattgatcttaacctcgggagagatctcaagatgaccttcgatatcaccaacaccatgaagaagcc
tacgatagaaggaaatgttttctggattgaagagatggatatgttagctgacgaaatgctagaagagctg
ggagaaacagatcatctacagagtgctctaacaaaggatagcaaggaaggagatttacatttggagattt
tggggtaccaaaagctgttagatgaacacaaagcagttgagaatccaggagagtatgaagatttgggtga
aagagcaagggaggaatatatactcgacctcatcactcgaccaacagcccattctgtgtactcgaccgag
ctactcgaccacaacaacccctctgaagccaacttagtatccgatgactggtctgaactcaaggcaccca
aagtagatttgaaaccgctaccaaaagggctaaggtacgtttttcttggactaaactctacttatcctgt
catagtgaatgatgggctaactgctgatcaagtaaacctgctgataaccgagctcaagaagtataggaaa
gctataggatattcgttagatgatattaaggggatttcgcccaccttatgcacccatagaatccatcttg
aaaatgaatcctactctagcattgaacctcaaaggagattaaaccctaacttgaaagaggttgtcaaaaa
ggagatacttaaactattagatgctggggttatctaccctatctctgatagcacttgggtatctccagtc
cactgcgttcctaaaaaaggaggtatgacagttgttaaaaattctaaagatgaactgatacccactagga
ctataactggacataggatgtgtattgactataggaagttaaatgctgcctctagaaaagaccatttccc
attgcccttcattgatcagatgctagaaagattagcaaaccatccttactattgctttttggatggttat
agcggattttttcaaatccctattcacccaaatgaccaagaaaaaaccactttcacttgtccttatggga
cctttgcttacaagcgtatgcctttcggtctctgtaatgcaccagctacttttcagcggtgtatgacttc
tattttctcggatttgatagaggagatggtagaggtattcatggatgatttttctgtgtatggctcttcc
ttctcctcgtgtttgttgaacctgtgcagggtacttaaaagatgtgaagagacaaacctagtgctgaact
gggagaaatgccatttcatggttagagaaggcatcgttttgggccacaaaatttctgaagagggaataga
ggttgataaggctaaggttgacgtgatgatgcagttacagccacccaaaactgtcaaagacatcaggagt
tttcttggacatgcaggattttacagaagattcatcaaagatttctccaagttggcaagaccgctcacca
gactactgtgcaaggaaaccgagtttgcatttgatgacgaatgtttgacagcctttaagttaatcaagga
agctttgatcactgcaccaatagtccaagctccaaactgggacttcccattcgagatcatgtgcgatgct
tcagattacgcagtaggagcagtcttgggccagcggattgataaaaagctgcatgtgatctactacgcga
gcagaacgatggatgatgcacaagtaaggtatgccactacagagaaggaactgctagctgtggtatttgc
atttgaaaaatttagaagctacctagtgggttctaaagtgacagtctacactgatcatgctgctctgagg
catatatacgcgaagaaggataccaaacctaggctgttgagatggatcttattgctccaagagtttgata
tggagatagttgataagaagggaattgaaaacggtgtagctgatcatctgtctaggatgaggattgaaga
cgaggtccctattgacgactccatgccagaagaacagctaatggcaattcagcaacttaacgagagtgca
caaattcggaaatcactcgatcaagtatgtacaattgaagaaaagcttccgtggtatgctgatcacgtca
actacttggttagtggtgaggagcctccgaatctgtcgagctatgagaagaagaagttcttcaaggacat
taaccatttctactgggacgaaccttatctctacacactctgcaaagataagatctacaggagatgcgtc
tcagaagacgaaatcgaaggcatcctattgcattgccatgggtctgcctatggtggccacttcgcaacgt
tcaagacagtgtcgaaaatcctgcaagctggtttttggtggccatcaatgtttaaggatgctcaggaatt
tatctcgaaatgtgattcatgtcagagaagagggaacatcagcagaaggaatgagatgcctcagaatccg
atattagaagttgagatctttgacgtttggggaattgattttatgggtccgtttccttcctcttacggga
acaagtacatactggtcgcagtagactatgtatcaaaatgggtggaagccatagccagccccaccaatga
tgctagagttgtgctaaagctgttcaagacaattatctttcctagatttggagtcccgagaattgtgatc
agtgacggagggaaacattttatcaacaaggtttttgagaaccttttgaagaagcatggagtaaagcaca
aggtcgccactccttatcatccacagacgagcgggcaggtggaaatctccaacagggagataaaagcaat
tttagagaaaattgtgggaagtacaaggaaagattggtctgctaagctcgatgacgcactatgggcttac
agaacagccttcaagacccctattggcacgactcctttcaacctcctctatggaaaatcctgtcatttgc
ctgttgaactcgagtataaagccatgtgggcagttaaactcctgaacttcgacattaaaaccgccgagga
gaagcggttgatccaactgaacgatctcaacgagattcgcttagaagcctatgagagttccaaaatctac
aaggagcgaaccaagtctttccatgacaagaagatagtctcaagagattttaaggttggtgatcaagtgt
tgctgttcaactctcgcctgaggctttttccaggcaagctcaagtctagatggtctggtcctttctctgt
tactgcagtccgaccttatggtgctatcactctagctggaaagaatggagacttcacagtcaatggccag
cggctcaagaaatacatgatagatcagtttattccagaaggaacctctgttcctttggaggagcctttaa
acgcctaatgagtatgaagagtcaagctagagacctaaaacaagctcacttgggaggaagtcccaagcct
atctttgtacatatctttcattttccttgttgtttttgatgcatcttgttagtgttttcaggagataaat
atgaagagttgctggaaatagattctggctttgaaggaacagcaatacactcgaccacaaagcaatcaaa
ctcgacattgttctggcgtcttcacaccaggaaatcactcgaccacaccctataaagaccgaatgacgat
gtggtcgagtatagcgaatgatgcagtcacgacttctccactcccgtagcttacgcgaccgcggtggtgg
ccgcagcagaagaagagaggtcgagtatcatcacagcggtgctggccgcgacgaaggagcagaggtcgag
tacccccagggggaagctgggacacaacagggagattcttcgatggcctgggagcaatcacagacagcta
ttgacgaccaactccgctccttcttccactgaggtatgcacctcactccaccattgtaatataccatctc
ttgtttttattttgtttttgtgatgtgtttttgtcctgagtactctcttccaaatttggtcacacagtgg
actgtgtgatttaagtttgggggagggctcaggaagtgtgtgttgcattgtataatcttgagtctgcatt
catctaaggcatagaaaaaccaaaaaaattgaaaaattccagaaaatgatttcacaaaaatagagtgttc
atgtagttgcattgcatttaggatcgagtctagagtgtttcgtttaggattgttgcatatgcatagggga
taatgatgagatagccttgtaagcattttggttcaccagataagctcagtgccctcgttgttagttgttt
gatgcgttgtcattgaaattgaagtaagaactgcaccatgcctagattgctctactcgaccacactgtta
ggatctgatatcattccctatcaatttgaacttgaatctgatttagaattatcatgtcttggcatcgaat
ttgaactcatggataccctaaaatacttggattttcttactcattttaaccactcttgttgatccaagta
gctgactctccttattagagcagttaacccatacccaaacctgaactttctttcaagccctatatcactt
gtgagtgtttgtgaggtcttatttcgattgagcttggtagaaagtgttaggttcgtaacgacagagatag
tgtctcatgtagttctagttcgcgttcttcggactggataggactaggtgggcgcttatatcatgggttg
ggatgtgtttaaaagaaaagagggaattcattgttgatgaggaaagggaaagaattctaggggaagtaag
ctaaagaagttagaaaaaaaatctagtaaaggttttgggaatgttaaagaaaagaatgaggttcttgtta
gctaaagaagaagggttaaaagcctttggttttaaagaaaaaaaaaaacaggaaccttagttgttaaaga
agtccaaacccgctagatgtatcaagagcgttgagaaagcttctcctagagttaagagaaaagaaaagaa
tgatatgaaaaagagtttgaaagattcatgagtgcaaagggtagagttaagttgggacaggagttgggat
taccattagagcttcattgttatactctgggtagatgggatcttatctctgtatgcataacttgggactt
acctttagcattctactaaagctcaatcattcttgagggatcccctgttacttaagcctattctgtaggg
gaccatctttgtctcttgaccttcaccttagccaaatgagttcattgatgatgcattgcttgattcacgt
tccagaactaatgaatgttaaagggattggtagatttgaaaacatgtgtaggtcgagtataagagacgga
tttattgataacaaggcatgactaacgtttttgagtagaattcaatcatatcgcatcttagaactaccaa
cttggacattgattttatttgctctatcatatgctttggttctgagtccccgccttcactcctctccttc
aactatgtcttcttatttgcttgagggcaagcaaagactaagtttgggggagt1