;ID ATHILA6A_I DNA ; ATH ; 7835 BP ;XX ;DE ATHILA6A_I is an internal part of the ATHILA6A endogenous ;DE retrovirus - a consensus sequence. ;XX ;AC . ;XX ;DT 17-MAY-2001 (Rel. 6.1, Created) ;DT 17-MAY-2001 (Rel. 6.1, Last updated, Version 1) ;XX ;KW Gypsy-like endogenous retrovirus; ATHILA superfamily; long terminal ;KW repeat; reverse transcriptase; integrase; ATHILA6A_LTR; ATHILA6B_I; ;KW ATHILA6A_I. ;XX ;OS consensus ;XX ;OC Arabidopsis thaliana ;OC Eukaryotae; mitochondrial eukaryotes; Viridiplantae; ;OC Charophyta/Embryophyta group; Embryophyta; Magnoliophyta; ;OC Magnoliopsida; Capparales; Brassicaceae; Arabidopsis. ;XX ;RN [1] (bases 1 to 7835) ;RA Kapitonov,V.V. ;RL Direct submission (May 2001) ;XX ;CC ATHILA6A_I is an internal part of the ATHILA6A endogenous ;CC retrovirus. There are several copies of ATHILA6A_I in the genome; ;CC they are ~99% identical to the consensus sequence. The consensus ;CC sequence was derived from 5 proviral copies. These copies are ;CC flanked by 99% identical LTRs (ATHILA6A_LTR). They have generated ;CC 5-bp target site duplications upon their integration into the ;CC genome. ;CC ATHILA6A_I encodes two proteins: 911-aa ATHILA6A1p (positions ;CC 173-2905) and 648-aa ATHILA6A2p (positions 4237-6180).
;CC ;CC ATHILA6A1p: ;CC MQTRSRGNQNLLFNDNIDRIARQLRTQTETDTMAAVVDEQVQPNNIGAGDAPRNHNQRNGIVPPPVQNNN ;CC FEIKSGLIAMVQSNKFHGLPMEDPLDHLDEFDRLCSLTKINGVSEDGFKLRLFPFSLGDKAHQWEKSLPQ ;CC GSITSWNDCKKAFLAKFFSNSRTARLRNDISGFTQTNNETFCEAWERFKGYQTQCPHHGFSKASLLSTLY ;CC RGVLPKIRMLLDTASNGNFLNKDVEDGWELVENLAQSDGNYNEDYDRSVRTSSDSDEKHRREMKAMNDKL ;CC DKLLLVQQKHIHFLGDDETFQVQDGETMQSEEVSYVQNQGGYNKGFNNFKQNHPNLSYRSTNVANPQDQV ;CC YPSQQQNQPKPFVPYNQGQGYVPKQQYQGNYQQQLPPPGFTQQQQQPASTTPDSDLKNMLQQILQGQAAG ;CC AMDLAKKMAEIHNKVDCTFNDLNIKLEALTSKVRYMEGQTASTSAPKVTGLPGKSIQNPKEYATAHAITI ;CC CHDRELPTRHVSTSITEDSDVQDGEVSTQIEISVVGLDHSAGSRFQTQSNLDEKAAIIERMVKRFKPAPL ;CC PSRALPWKFRKAWIERYNSLAEKQLDEMEAVMPLIEVLNLIPDPHKDVRKSILERIKIHQDSEDECDAIP ;CC SRTTVKRSVQEKLEDPGTFTLPCSIGQLVFSNCLCDLGASVSLMPLSVARKLEFTQYKPCDLTLILADGS ;CC SRKPFGLLQDLPVMINGVEVPTDFVVLDMEAEPKDPLILGRPFLASVGAMIDVRDGRISLNLGKHIKLQF ;CC DINETSQRSAVEERIRAQPQPSNSITRPSTASTPDLRDLKKKSDEQEETIEKLAQTVEELKSKLDQMQEI ;CC AKSKCGNNTIPRKKITSRWSEEIDYPPEEKEAYFEERRIEYSATHLSREDAEYDDEIREDYADPLYHPFS ;CC S ;CC ;CC ATHILA6A2p: ;CC MSNYSGESSMDADYNVDEAESWSTRPEREQQAYESFRAETQRSVARRNERRAEIARGKRAMTSRYELIDE ;CC DIDVEYEPESWHRETKLLNKPDEVTVEEYIRLFELNDFWGTRYPCYETLAQLGLLEDVQHLFEKCHLETL ;CC MSYPYVAYKKETIEFLSTLQVELYQGLTADELESEGLGFLTFSVNEQRYQLSIKSLEGLFGFPSGKGTKP ;CC KFEREELKDLWLTIGNDLALNSARSKSNQIRSPVIRYYQRSVANVLYPRESTGTVSNTDMEMIDSALKGI ;CC LRRTKGKKVLKGDLNDTPPVMLLLIHLCGYRKWAHTNGRKKVRGALCVGGVVTPILIACGVPLTSPGFDP ;CC RMMDLDHLRRCEFLEYDMVGDFYRYKFEHSLTRTANILLPCIEATTILQGENIDFRPARDYLYFESAPPT ;CC DDNVPTTEATEDDIAETDEDREEEYDTSMYHFSEHVPPARESKSLSEAHRNNSKLQRWCKKQDRLLIKCF ;CC KAITFLTDKISCFSSTTAIPQGERPQDMPSRRYDAPGPSHHRPEPSHHRPEPSDRVVPPVPARHSSFEPR ;CC ELGRKKKAALARSGSRSTRLLQSRSLRDRGAGRSRRREVEYHQSGAGRDEGAEVEYPQGEAETQQGDSSM ;CC AWEQSQAAIDDQLRSFFH ;CC ATHILA6A_I and ATHILA6B_I share identical ~1400-bp and ~1850-bp ;CC 3'- and 5' ends, respectively. However, their central portions ;CC (positions 2400-5970 in ATHILA6A_I) are not similar to each other. 
;CC Since the central portion of ATHILA6B_I encodes the reverse ;CC transcriptase and integrase, ATHILA6A was a non-autonomous ;CC retroelement, whose transposition was mediated by ATHILA6B. ;XX ;DR [1] (Consensus) ;XX ;SQ Sequence 7835 BP; 2246 A; 1792 C; 1765 G; 2032 T; 0 other; ATHILA6A_I atttggcgccgttgccaattgggtgtttgtttgctatatttgagatttcagaatatttaagatcaagttc tttttcattttttctgaaagttactaactttgtgttatttctgtttgtctgttttgattcaggtactacg aatcactagcttcaccagttggtactcgttgtatgcaaacacggtcacgaggaaatcagaatctcctgtt caacgataacatcgaccgtattgcacgccaactcagaacacagacagaaacagacacaatggctgcagtt gttgatgagcaggtgcaaccaaacaacataggtgcaggcgatgcaccacgcaaccacaatcagcgtaatg gcatcgtgcctccacctgtgcagaacaacaactttgagatcaagagcggtctcattgccatggttcagag caacaagttccacggcctccctatggaggatccactcgatcacctggacgagtttgataggctctgcagc ctaacaaagatcaatggagtcagtgaggatggcttcaagctcagattgtttcctttctctcttggagaca aagcccatcagtgggaaaagtcgcttcctcaaggctctatcacctcctggaatgactgcaagaaagcctt cttggctaagtttttctcaaactccaggactgcgagactaaggaatgatatctccggtttcactcagacg aataatgagactttctgcgaagcatgggagcgcttcaaaggttatcagacgcaatgtcctcatcatggtt tctccaaagcttcgcttctcagcactctctacagaggtgtcctcccaaagatcaggatgcttctagatac cgcttctaacgggaactttctcaacaaagacgttgaagacggatgggagctggtagagaacttagctcag tcggatggcaactataatgaagattatgatagaagcgtccgcaccagctctgattctgatgagaagcacc gcagggaaatgaaagccatgaatgacaaactggacaagctactgcttgtgcaacagaagcacattcattt tctgggtgatgatgagacgttccaagttcaggatggggagactatgcagtcagaagaggtcagttatgtg cagaaccaaggaggttacaacaaaggtttcaacaacttcaagcagaaccatcccaatctgtcttacagaa gtacaaacgttgcaaacccgcaggaccaagtttacccttcacagcagcagaatcaacccaagccctttgt tccatacaaccaaggtcaagggtatgttcctaagcagcagtatcagggcaactatcagcagcaacttcca ccacctgggttcacacagcagcaacaacaaccagcttcaacaactccagattcagacttgaagaacatgt tacagcagattctccaagggcaagctgcaggagcaatggatctcgccaaaaagatggccgaaatccacaa caaggttgattgtactttcaacgatctgaatatcaaacttgaggcacttacctctaaggtcagatacatg gaaggacaaacagcgtcaacttctgctccaaaagtaacaggacttccaggaaagtccatacagaacccga aggaatacgccaccgctcacgctatcaccatctgtcatgatcgagagctgcctactcgacatgtctccac
atcaatcaccgaggacagtgatgttcaagacggggaggtttctactcagattgaaatttcagtggttgga ctcgaccattcagcaggatcccgttttcaaacacagtccaacctagacgagaaagcagccatcattgaga ggatggtaaaacgattcaagccagcaccattaccttcacgtgctcttccatggaaattcaggaaagcatg gatagaaagatacaattctcttgcagagaagcagcttgatgagatggaagcggtgatgcctctaatagaa gtgctcaacctaatcccggatcctcacaaagatgtgagaaagtcaattctggaaaggatcaagattcatc aagattcagaagacgaatgtgatgctattccgtctaggaccactgttaagaggagtgttcaagaaaaact ggaagatccaggaactttcactctaccatgttccatcggccaattggttttcagcaattgtctttgtgat ttgggagcttcagtaagcttaatgccactctcagtggcaaggaagctggaattcactcagtacaaacctt gcgacctgactttaatccttgctgatgggtcttcaagaaaacccttcggccttctacaagatctgccagt aatgattaatggagtggaagtgcctacggatttcgttgtgcttgatatggaagcagaacctaaggatcct ctaatcctaggaagacctttcttagcctcagtgggagcgatgatagatgtcagagatgggagaataagtc tcaaccttgggaagcacatcaagctgcagtttgacatcaacgaaacttcgcaaaggtcagctgtagaaga aaggatcagggctcaacctcagccttcgaattcaatcactagaccaagcacagcctctacacccgacttg cgagatctcaaaaagaaatctgatgagcaagaagagaccatagagaagctagctcagacagttgaggaac ttaagagtaaactggatcagatgcaagagatagctaaatccaaatgcgggaataacactatcccgaggaa aaagattacttcaagatggtctgaagagatagattatccaccagaagagaaagaggcctatttcgaggaa agaagaattgagtattcagctactcatctctcaagagaggatgctgaatatgatgatgagatcagagagg actatgcagaccctctctatcacccattttcttcttaatgagtgtgaggagtcaagctagagactttaaa caagctcacttgggaggaattcccaagactgtttctgtaaataaaacttttattttctcgttatttttga tttgtttttggttgtgtttgtgattctcaggaacagagaaacagcgtggaggtagagtaaaaatttaaaa tttttactctacagagcaacaggagatcgagtatttcagaaattcaagaatttgaaaaaacctctgttgc actcagaggccatgaggtcgagtaaactggtcgagtattagtgatgattctaaaaaccaaaattttgaaa tcatatctatgctcgaccaacagaagctacagagacttacagggagttcaacaaatttacagaggattac agaaggttctagtcaacagaaaacagtgcttcaggacaaatagacagaacgtggcccaccacctctcact tttgttcccccacgcgttttaagagataacaaaatctcttccttccttctccaccactcgatctcaccat atcctctcccaccgacatcatctctctccctctccaaacactcgaccacgaaaactcactccacctcacg tccatcactcgaccaaacctcgcaatcccttttcctctccaatcactcgaccgcaccatcacactcgacc tcaccgttcactcgatctcgcccctccggcttcgtctctctccttcactagacgacggtactcgatttcg 
ccatttcttccttctcaccactcgacggttactcgaccgctccgcccttcacttcgccggaagcttcacc gcctcaccgtcgcctccttaccactcgaccgcaactcgacctctccgtgcttcatcttcatcttcattta ctcgacttcgccggagcttctcgccgttccagcttcgcatactctctcaccgtctctcgtttcactcgac cacttcacacttcgcctcaacatcttcgccggagtttctcgccattgtccgtgcttccgtcatctccgtt cactcgaccaccggaccggcttcaccatctctcaactatccaccattcactcgacctcgccattcactgc gcctccattcgtctctttactcgactgctcctcaaaccgccaccgtcttctctaaattcgccgtttactc gaccacactgttacgtctctcattcgtgtacagtcgaccgctatacccgaagccacaatatcactgtact cgaccgtttcactcgaccgcgtacttgactggtttagtgtgtgtgtttatttgaactaacatattgatat ttggttttgagttacattctttttcagggaatcaatatgagtaactacagtggcgaatcctccatggatg cggattacaacgtcgatgaagctgaatcttggtcaactagaccagagagagagcaacaggcttatgagag cttcagagccgagacccaacgctcagtagctcgacgcaatgaaaggagagctgagattgctagaggaaag agagcgatgaccagcagatatgagttgatcgacgaagatattgacgtcgagtatgagcctgagtcatggc acagagaaacaaaactgttgaacaagcctgatgaagttacagtggaggagtacatcagacttttcgagct gaacgacttctggggaacgaggtacccctgttatgagactctagcccagcttgggctactggaggacgta cagcacttattcgagaagtgccatcttgagacgctgatgtcttacccgtacgtcgcttacaagaaggaaa caatagagtttctctccactctgcaagtggagttgtatcagggacttactgcagatgaactggagagtga agggttggggttcttgactttttcagttaacgagcagcgttaccagctatctatcaagagcttggaagga ttatttggttttcccagtggaaagggaactaaacccaagttcgagagggaagagttgaaggatttgtggt taaccattgggaacgatttggcgctcaactctgcaaggtctaagagcaaccagattcgaagccctgtgat ccgctactatcagcgctcagtagcgaatgttctgtaccccagggaatctacaggcaccgtgtctaacaca gacatggagatgattgattctgcactcaagggcattctccggagaacaaaggggaagaaggtcctaaagg gcgaccttaatgatacaccaccggtcatgcttctgttgatccacctgtgtggatacaggaagtgggcgca caccaacgggaggaagaaggtgcgaggagccctttgtgtaggtggcgttgtgacaccgattctgattgca tgtggtgtacctctcacgtctccagggtttgatccgaggatgatggatttagatcacttgcgtcgttgtg agtttctggagtacgacatggttggcgatttctatcgctacaaattcgagcactccctgacccgaacagc caacattttgcttccctgcatcgaggccacaaccatacttcagggtgagaacattgacttcagacctgcg cgtgattacctctactttgagagcgctccaccgactgatgacaatgtccctacgacggaagctacagagg atgatattgctgagacggatgaggatagggaggaggagtatgatacgagcatgtatcatttcagtgagca 
cgtacctccagcgcgggagagcaagagcttgagtgaagctcacagaaacaacagtaagttgcagaggtgg tgcaagaaacaagataggctacttatcaagtgcttcaaagccatcacgtttctgacggacaagataagct gcttctcttctactacagctattccgcagggagagcgtcctcaggacatgccttcgaggagatatgacgc gccagggccaagtcatcacaggcctgagccaagtcaccacaggcctgagcctagtgaccgagtagtccca ccagtccctgcgaggcattcatcattcgagcctcgggagctcgggagaaagaagaaggctgcactcgctc ggtctggcagcaggagtacacgacttctccagtcccgtagcttacgcgaccgcggtgctggccgcagcag aagaagagaggtcgagtatcatcagagcggtgctggccgcgacgaaggagcagaggtcgagtacccccag ggggaagctgagacacaacagggagattcttcgatggcctgggagcaatcacaggcagctattgacgacc aactccgctccttcttccactgaggtatgcacctcactccaccattgtaatataccatctcttgttttta ttttgtttttgtgatgtgtttttgtcctgagtactctcttccaaatttggtcacacagtggactgtgtga tttaagtttgggggagggctcaggaagtgtgtgttgcattgtataatcttgagtctgcattcatctaagg catagaaaaaccaaaaaaattgaaaaattccagaaaatgatttcacaaaaatagagtgttcatgtagttg cattgcatttaggatcgagtctagagtgtttcgtttaggattgttgcatatgcataggggataatgatga gatagccttgtaagcattttggttcaccagataagctcagtgccctcgttgttagttgtttgatgcgttg tcattgaaattgaagtaagaactgcaccatgcctagattgctctactcgaccacactgttaggatctgat atcattccctatcaatttgaacttgaatctgatttagaattatcatgtcttggcatcgaatttgaactca tggataccctaaaatacttggattttcttactcattttaaccactcttgttgatccaagtagctgactct ccttattagagcagttaacccatacccaaacctgaactttctttcaagccctatatcacttgtgagtgtt tgtgaggtcttatttcgattgagcttggtagaaagtgttaggttcgtaacgacagagatagtgtctcatg tagttctagttcgcgttcttcagactggataggactaggtgggcgcttatatcatgggttgggatgtgtt taaaagaaaagagggaatccattgttgatgaggaaagggaaagaattctaggggaagtaagctaaagaag ttagaaaaaaatctagtaaaggttttgggaatgttaaagaaaagaatgaggttcttgttagctaaagaag aagggttaaaagcctttggttttaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaacaggaacct tagttgttaaagaaatccaaacccgctagatgtatcagagcgttgagaaagcttctcctagagttaagag aaaagaaaagaatgatatgaaaaagagtttgaaagattcatgagtgcaaagggtagagttaagttgggac aggagttgggattaccattagagcttcattgttatactctgggtagatgggatcttatctctgtatgcat aacttgggacttacctttagcattctactaaagctcaatcattcttgagggatcccctgttacttaagcc tattctgtaggggaccatctttgtctcttgaccttcaccttagccaaatgagttcattgatgatgcattg 
cttgattcacgttccagaactaatgaatgttaaagggattggtagatttgaaaacatgtgtaggtcgagt ataagagacggatttattgataacaaggcatggctaacgtttttgagtggaattcaatcatatcgcatct tagaactaccaacttggacattgattttatttgctctatcatatgctttggttctgagtccccgccttca ctcctctccttcaactatgtcttcttatttgcttgagggcaagcaaagactaagtttgggggagt