;ID ATHILA6B_I DNA ; ATH ; 7963 BP
;XX
;DE ATHILA6B_I is an internal part of the ATHILA6B endogenous
;DE retrovirus - a consensus sequence.
;XX
;AC .
;XX
;DT 17-MAY-2001 (Rel. 6.1, Created)
;DT 17-MAY-2001 (Rel. 6.1, Last updated, Version 1)
;XX
;KW Gypsy-like endogenous retrovirus; ATHILA superfamily; long terminal
;KW repeat; reverse transcriptase; integrase; ATHILA6A_LTR; ATHILA6A_I;
;KW ATHILA6B_I.
;XX
;OS consensus
;XX
;OC Arabidopsis thaliana
;OC Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN [1] (bases 1 to 7963)
;RA Kapitonov,V.V.
;RL Direct submission (May 2001)
;XX
;CC ATHILA6B_I is an internal part of the ATHILA6A endogenous
;CC retrovirus. There are several copies of ATHILA6B_I in the genome;
;CC they are ~99% identical to the consensus sequence. The consensus
;CC sequence was derived from 3 proviral copies. These copies are
;CC flanked by 99% identical LTRs (ATHILA6A_LTR). They have generated
;CC 5-bp target site duplications upon their integration into the
;CC genome.
;CC ATHILA6B_I encodes an 1888-aa polyprotein, ATHILA6Bp (positions
;CC 173-5815). A middle portion of ATHILA6Bp is composed of the
;CC reverse transcriptase and integrase domains. These domains are
;CC encoded by a DNA region, which starts and ends at positions 3170 and
;CC 5683, respectively.
;CC ATHILA6Bp:
;CC MQTRSRGNQNLLFNDNIDRIARQLRTQTETDTMAAVVDEQVQPNNIGAGDAPRNHNQRNGIVPPPVQNNN
;CC FEIKSGLIAMVQSNKFHGLPMEDPLDHLDEFDRLCSLTKINGVSEDGFKLRLFPFSLGDKAHQWEKSLPQ
;CC GSITSWNDCKKAFLAKFFSNSRTARLRNDISGFTQTNNETFCEAWERFKGYQTQCPHHGFSKASLLSTLY
;CC RGVLPKIRMLLDTASNGNFLNKDVEDGWELVENLAQSDGNYNEDYDRSVRTSSDSDEKHRREMKAMNDKL
;CC DKLLLVQQKHIHFLGDDETFQVQDGETMQSEEVSYVQNQGGYNKGFNNFKQNHPNLSYRSTNVANPQDQV
;CC YPSQQQNQPKPFVPYNQGQGYVPKQQYQGNYQQQLPPPGFTQQQQQPASTTPDSDLKNMLQQILQGQATG
;CC AMDLSKRMAEIHNKVDCSYNDINIKVEALTSKIRYIEGQTGSTAAPKFTGPSRKSMSNSEEYAHAITLRS
;CC GKELPTKESPNQNTEDSVDQDGEDFCQNGNSAEKAIEEPILDQPTRLLAPAASPLVEKPAAAKTKDNVFV
;CC PPPYKPPLPFPGRFKKVMIQKYKALLEKQLKNLEVTMPLVDCLALIPDSNKYVKDMITERIKEVQGMVVL
;CC SHECSAIIQQKIIPKKLGDPGSFTLPCALGPLAFNKCLCDLGASVSLMPLSVAKKLGFNKYKPCNISLIL
;CC ADRSVRIPHGLLEDLPVMIGMVEVPTDFVVLEMDEEPKDPLILGRPFLATAGAIIDVKKGKIDLNLGRDL
;CC KMTFDITNTMKKPTIEGNVFWIEEMDMLADEMLEELGETDHLQSALTKDSKEGDLHLEILGYQKLLDEHK
;CC AVENPGEYEDLGERAREEYILDLITRPTAHSVYSTELLDHNNPSEANLVSDDWSELKAPKVDLKPLPKGL
;CC RYVFLGLNSTYPVIVNDGLTADQVNLLITELKKYRKAIGYSLDDIKGISPTLCTHRIHLENESYSSIEPQ
;CC RRLNPNLKEVVKKEILKLLDAGVIYPISDSTWVSPVHCVPKKGGMTVVKNSKDELIPTRTITGHRMCIDY
;CC RKLNAASRKDHFPLPFIDQMLERLANHPYYCFLDGYSGFFQIPIHPNDQEKTTFTCPYGTFAYKRMPFGL
;CC CNAPATFQRCMTSIFSDLIEEMVEVFMDDFSVYGSSFSSCLLNLCRVLKRCEETNLVLNWEKCHFMVREG
;CC IVLGHKISEEGIEVDKAKVDVMMQLQPPKTVKDIRSFLGHAGFYRRFIKDFSKLARPLTRLLCKETEFAF
;CC DDECLTAFKLIKEALITAPIVQAPNWDFPFEIMCDASDYAVGAVLGQRIDKKLHVIYYASRTMDDAQVRY
;CC ATTEKELLAVVFAFEKFRSYLVGSKVTVYTDHAALRHIYAKKDTKPRLLRWILLLQEFDMEIVDKKGIEN
;CC GVADHLSRMRIEDEVPIDDSMPEEQLMAIQQLNESAQIRKSLDQVCTIEEKLPWYADHVNYLVSGEEPPN
;CC LSSYEKKKFFKDINHFYWDEPYLYTLCKDKIYRRCVSEDEIEGILLHCHGSAYGGHFATFKTVSKILQAG
;CC FWWPSMFKDAQEFISKCDSCQRRGNISRRNEMPQNPILEVEIFDVWGIDFMGPFPSSYGNKYILVAVDYV
;CC SKWVEAIASPTNDARVVLKLFKTIIFPRFGVPRIVISDGGKHFINKVFENLLKKHGVKHKVATPYHPQTS
;CC GQVEISNREIKAILEKIVGSTRKDWSAKLDDALWAYRTAFKTPIGTTPFNLLYGKSCHLPVELEYKAMWA
;CC VKLLNFDIKTAEEKRLIQLNDLNEIRLEAYESSKIYKERTKSFHDKKIVSRDFKVGDQVLLFNSRLRLFP
;CC GKLKSRWSGPFSVTAVRPYGAITLAGKNGDFTVNGQRLKKYMIDQFIPEGTSVPLEEPLNA
;CC ATHILA6B_I and ATHILA6A_I share identical ~1400-bp and ~1850-bp
;CC 3' and 5' ends, respectively. However, their central portions
;CC (positions 2475-6118 in ATHILA6B_I) are not similar to each other.
;CC Since the central portion of ATHILA6B_I encodes the reverse
;CC transcriptase and integrase, ATHILA6A was a non-autonomous
;CC retroelement, whose transposition was mediated by ATHILA6B.
;XX
;DR [1] (Consensus)
;XX
;SQ Sequence 7963 BP; 2399 A; 1633 C; 1801 G; 2130 T; 0 other;
ATHILA6B_I
atttggcgccgttgccaattgggtgtttgtttgctatatttgagatttcagaatatttaagatcaagttc
tttttcattttttctgaaagttactaactttgtgttatttctgtttgtctgttttgattcaggtactacg
aatcactagcttcaccagttggtactcgttgtatgcaaacacggtcacgaggaaatcagaatctcctgtt
caacgataacatcgaccgtattgcacgccaactcagaacacagacagaaacagacacaatggctgcagtt
gttgatgagcaggtgcaaccaaacaacataggtgcaggcgatgcaccacgcaaccacaatcagcgtaatg
gcatcgtgcctccacctgtgcagaacaacaactttgagatcaagagcggtctcattgccatggttcagag
caacaagttccacggcctccctatggaggatccactcgatcacctggacgagtttgataggctctgcagc
ctaacaaagatcaatggagtcagtgaggatggcttcaagctcagattgtttcctttctctcttggagaca
aagcccatcagtgggaaaagtcgcttcctcaaggctctatcacctcctggaatgactgcaagaaagcctt
cttggctaagtttttctcaaactccaggactgcgagactaaggaatgatatctccggtttcactcagacg
aataatgagactttctgtgaagcatgggagcgcttcaaaggttatcagacgcaatgtcctcatcatggtt
tctccaaagcttcgcttctcagcactctctacagaggtgtcctcccaaagatcaggatgcttctagatac
cgcttctaacgggaactttctcaacaaagacgttgaagacggatgggagctggtagagaacttagctcag
tcggatggcaactataatgaagattatgatagaagcgtccgcaccagctctgattctgatgagaagcacc
gcagggaaatgaaagccatgaatgacaaactggacaagctactgcttgtgcaacagaagcacattcattt
tctgggtgatgatgagacgttccaagttcaggatggggagactatgcagtcagaagaggtcagttatgtg
cagaaccaaggaggttacaacaaaggtttcaacaacttcaagcagaaccatcccaatctgtcttacagaa
gtacaaacgttgcaaacccgcaggaccaagtttacccttcacagcagcagaatcaacccaagccctttgt
tccatacaaccaaggtcaagggtatgttcctaagcagcagtatcagggcaactatcagcagcaacttcca
ccacctgggttcacacagcagcaacaacaaccagcttcaacaactccagattcagacttgaagaacatgt
tacagcagattctacaagggcaagcaacaggagcaatggatttgtccaaaaggatggcagaaattcacaa caaggttgactgcagttacaatgatatcaatatcaaggttgaagcactgacatctaagattagatacata gaaggccaaactgggtcgaccgcagcacccaaatttacaggaccttccagaaaatcaatgtcaaattcag aggagtacgctcatgctatcacacttagaagcggtaaagaattacccacaaaagagagcccgaaccagaa cactgaggacagtgtggatcaagatggggaggatttctgtcaaaatggaaattccgctgaaaaagcaata gaagagcctatactcgaccaacctactcgactactagctcctgcagcatctcctcttgttgaaaaaccag ctgccgccaaaaccaaagacaacgtcttcgttcccccaccttacaagcccccactcccatttcctggtcg attcaagaaggtcatgatacagaagtacaaagctctcctggagaaacaattgaaaaatcttgaagtcacg atgcctcttgtcgactgtcttgcacttataccagactctaacaaatatgtgaaagacatgattacagaga ggatcaaagaagtccaaggaatggtggtgctaagtcacgaatgcagtgcgatcattcagcagaagatcat tccaaaaaaactaggagatcctggttccttcaccttaccttgcgctctaggccctttagccttcaacaag tgcctctgtgatctaggtgcatctgtgagtctcatgcctctctctgtggcaaagaaactgggcttcaaca agtacaagccctgcaatatatccttgatcctagctgacagatctgttaggattccccatggtctgctgga ggacttaccagtcatgataggcatggtcgaagtaccgacagactttgtggttttagaaatggatgaagaa cccaaagaccctttgatattgggaagaccatttttagctacagctggagctatcattgatgttaagaagg gaaaaattgatcttaacctcgggagagatctcaagatgaccttcgatatcaccaacaccatgaagaagcc tacgatagaaggaaatgttttctggattgaagagatggatatgttagctgacgaaatgctagaagagctg ggagaaacagatcatctacagagtgctctaacaaaggatagcaaggaaggagatttacatttggagattt tggggtaccaaaagctgttagatgaacacaaagcagttgagaatccaggagagtatgaagatttgggtga aagagcaagggaggaatatatactcgacctcatcactcgaccaacagcccattctgtgtactcgaccgag ctactcgaccacaacaacccctctgaagccaacttagtatccgatgactggtctgaactcaaggcaccca aagtagatttgaaaccgctaccaaaagggctaaggtacgtttttcttggactaaactctacttatcctgt catagtgaatgatgggctaactgctgatcaagtaaacctgctgataaccgagctcaagaagtataggaaa gctataggatattcgttagatgatattaaggggatttcgcccaccttatgcacccatagaatccatcttg aaaatgaatcctactctagcattgaacctcaaaggagattaaaccctaacttgaaagaggttgtcaaaaa ggagatacttaaactattagatgctggggttatctaccctatctctgatagcacttgggtatctccagtc cactgcgttcctaaaaaaggaggtatgacagttgttaaaaattctaaagatgaactgatacccactagga ctataactggacataggatgtgtattgactataggaagttaaatgctgcctctagaaaagaccatttccc 
attgcccttcattgatcagatgctagaaagattagcaaaccatccttactattgctttttggatggttat agcggattttttcaaatccctattcacccaaatgaccaagaaaaaaccactttcacttgtccttatggga cctttgcttacaagcgtatgcctttcggtctctgtaatgcaccagctacttttcagcggtgtatgacttc tattttctcggatttgatagaggagatggtagaggtattcatggatgatttttctgtgtatggctcttcc ttctcctcgtgtttgttgaacctgtgcagggtacttaaaagatgtgaagagacaaacctagtgctgaact gggagaaatgccatttcatggttagagaaggcatcgttttgggccacaaaatttctgaagagggaataga ggttgataaggctaaggttgacgtgatgatgcagttacagccacccaaaactgtcaaagacatcaggagt tttcttggacatgcaggattttacagaagattcatcaaagatttctccaagttggcaagaccgctcacca gactactgtgcaaggaaaccgagtttgcatttgatgacgaatgtttgacagcctttaagttaatcaagga agctttgatcactgcaccaatagtccaagctccaaactgggacttcccattcgagatcatgtgcgatgct tcagattacgcagtaggagcagtcttgggccagcggattgataaaaagctgcatgtgatctactacgcga gcagaacgatggatgatgcacaagtaaggtatgccactacagagaaggaactgctagctgtggtatttgc atttgaaaaatttagaagctacctagtgggttctaaagtgacagtctacactgatcatgctgctctgagg catatatacgcgaagaaggataccaaacctaggctgttgagatggatcttattgctccaagagtttgata tggagatagttgataagaagggaattgaaaacggtgtagctgatcatctgtctaggatgaggattgaaga cgaggtccctattgacgactccatgccagaagaacagctaatggcaattcagcaacttaacgagagtgca caaattcggaaatcactcgatcaagtatgtacaattgaagaaaagcttccgtggtatgctgatcacgtca actacttggttagtggtgaggagcctccgaatctgtcgagctatgagaagaagaagttcttcaaggacat taaccatttctactgggacgaaccttatctctacacactctgcaaagataagatctacaggagatgcgtc tcagaagacgaaatcgaaggcatcctattgcattgccatgggtctgcctatggtggccacttcgcaacgt tcaagacagtgtcgaaaatcctgcaagctggtttttggtggccatcaatgtttaaggatgctcaggaatt tatctcgaaatgtgattcatgtcagagaagagggaacatcagcagaaggaatgagatgcctcagaatccg atattagaagttgagatctttgacgtttggggaattgattttatgggtccgtttccttcctcttacggga acaagtacatactggtcgcagtagactatgtatcaaaatgggtggaagccatagccagccccaccaatga tgctagagttgtgctaaagctgttcaagacaattatctttcctagatttggagtcccgagaattgtgatc agtgacggagggaaacattttatcaacaaggtttttgagaaccttttgaagaagcatggagtaaagcaca aggtcgccactccttatcatccacagacgagcgggcaggtggaaatctccaacagggagataaaagcaat tttagagaaaattgtgggaagtacaaggaaagattggtctgctaagctcgatgacgcactatgggcttac 
agaacagccttcaagacccctattggcacgactcctttcaacctcctctatggaaaatcctgtcatttgc ctgttgaactcgagtataaagccatgtgggcagttaaactcctgaacttcgacattaaaaccgccgagga gaagcggttgatccaactgaacgatctcaacgagattcgcttagaagcctatgagagttccaaaatctac aaggagcgaaccaagtctttccatgacaagaagatagtctcaagagattttaaggttggtgatcaagtgt tgctgttcaactctcgcctgaggctttttccaggcaagctcaagtctagatggtctggtcctttctctgt tactgcagtccgaccttatggtgctatcactctagctggaaagaatggagacttcacagtcaatggccag cggctcaagaaatacatgatagatcagtttattccagaaggaacctctgttcctttggaggagcctttaa acgcctaatgagtatgaagagtcaagctagagacctaaaacaagctcacttgggaggaagtcccaagcct atctttgtacatatctttcattttccttgttgtttttgatgcatcttgttagtgttttcaggagataaat atgaagagttgctggaaatagattctggctttgaaggaacagcaatacactcgaccacaaagcaatcaaa ctcgacattgttctggcgtcttcacaccaggaaatcactcgaccacaccctataaagaccgaatgacgat gtggtcgagtatagcgaatgatgcagtcacgacttctccactcccgtagcttacgcgaccgcggtggtgg ccgcagcagaagaagagaggtcgagtatcatcacagcggtgctggccgcgacgaaggagcagaggtcgag tacccccagggggaagctgggacacaacagggagattcttcgatggcctgggagcaatcacagacagcta ttgacgaccaactccgctccttcttccactgaggtatgcacctcactccaccattgtaatataccatctc ttgtttttattttgtttttgtgatgtgtttttgtcctgagtactctcttccaaatttggtcacacagtgg actgtgtgatttaagtttgggggagggctcaggaagtgtgtgttgcattgtataatcttgagtctgcatt catctaaggcatagaaaaaccaaaaaaattgaaaaattccagaaaatgatttcacaaaaatagagtgttc atgtagttgcattgcatttaggatcgagtctagagtgtttcgtttaggattgttgcatatgcatagggga taatgatgagatagccttgtaagcattttggttcaccagataagctcagtgccctcgttgttagttgttt gatgcgttgtcattgaaattgaagtaagaactgcaccatgcctagattgctctactcgaccacactgtta ggatctgatatcattccctatcaatttgaacttgaatctgatttagaattatcatgtcttggcatcgaat ttgaactcatggataccctaaaatacttggattttcttactcattttaaccactcttgttgatccaagta gctgactctccttattagagcagttaacccatacccaaacctgaactttctttcaagccctatatcactt gtgagtgtttgtgaggtcttatttcgattgagcttggtagaaagtgttaggttcgtaacgacagagatag tgtctcatgtagttctagttcgcgttcttcggactggataggactaggtgggcgcttatatcatgggttg ggatgtgtttaaaagaaaagagggaattcattgttgatgaggaaagggaaagaattctaggggaagtaag ctaaagaagttagaaaaaaaatctagtaaaggttttgggaatgttaaagaaaagaatgaggttcttgtta 
gctaaagaagaagggttaaaagcctttggttttaaagaaaaaaaaaaacaggaaccttagttgttaaaga agtccaaacccgctagatgtatcaagagcgttgagaaagcttctcctagagttaagagaaaagaaaagaa tgatatgaaaaagagtttgaaagattcatgagtgcaaagggtagagttaagttgggacaggagttgggat taccattagagcttcattgttatactctgggtagatgggatcttatctctgtatgcataacttgggactt acctttagcattctactaaagctcaatcattcttgagggatcccctgttacttaagcctattctgtaggg gaccatctttgtctcttgaccttcaccttagccaaatgagttcattgatgatgcattgcttgattcacgt tccagaactaatgaatgttaaagggattggtagatttgaaaacatgtgtaggtcgagtataagagacgga tttattgataacaaggcatgactaacgtttttgagtagaattcaatcatatcgcatcttagaactaccaa cttggacattgattttatttgctctatcatatgctttggttctgagtccccgccttcactcctctccttc aactatgtcttcttatttgcttgagggcaagcaaagactaagtttgggggagt1