;ID ATHILA7A_I DNA ; ATH ; 4754 BP
;XX
;DE ATHILA7A_I is an internal portion of ATHILA7A endogenous retrovirus.
;XX
;AC AC005965
;XX
;DT 27-DEC-2001 (Rel. 6.3, Created)
;DT 27-DEC-2001 (Rel. 6.3, Last updated, Version 1)
;XX
;KW Gypsy-like endogenous retrovirus; Athila superfamily; internal
;KW portion; ORF1; pol; reverse transcriptase; ATHILA7LTR; ATHILA7A_I.
;XX
;OS thale cress.
;XX
;OC Arabidopsis thaliana
;OC Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN [1] (bases 1 to 4754)
;RA Kapitonov,V. and Jurka,J.
;RT The ATHILA7A subfamily.
;RL Repbase Reports 1:(4) p. 8 (2001)
;XX
;CC ATHILA7A_I is an internal portion of ATHILA7A endogenous retrovirus.
;CC It belongs to the gypsy superfamily and is related to the ATHILA-like
;CC endogenous retroviruses. ATHILA7A is a subfamily of the ATHILA7
;CC family. ATHILA7A_I and ATHILA7_I share ~95% identical 1.6 kb 5'- and
;CC ~2 kb 3'-terminal portions.
;CC There are two well preserved copies of ATHILA7A_I present
;CC in the A. thaliana genome; they are 96% identical to each other and
;CC are flanked by the 2% divergent ATHILA7LTR long terminal repeats.
;CC ATHILA7A_I encodes well preserved remnants of a protein similar to
;CC the ATHILA ORF1 protein (it is present also in other ATHILA-like
;CC retroviruses that populate the O. sativa, H. vulgare and V. faba
;CC genomes). The protein, ATHILA7Ap, is 930-aa. One false stop codon
;CC present in the copy deposited in Repbase Update is corrected based
;CC on the second ATHILA7A copy and other ATHILA ORF1-like proteins
;CC deposited in GenBank.
;CC ATHILA7Ap:
;CC MNEPEVGGHNGNGQANGGGHMPQHQPRAHQPIGAFDEPNIRGNRNGIQAPPVENNNFEIKSSLINMVQSS
;CC KFHCLSMEDPLDHLDQFDMLCSTVKINGISEDAFKLRLFPFSLGDRARIWEKNLPQGSITSWDQCKRAFL
;CC LKFFSTTRTARLRNEISSFTQRSNKSFCEAWERFKGYKMQCPHHGFSKESILSTLYRGVLPKFRMLLDTA
;CC SNANFLGQDIDDGLSLVENLAQSDGNYGEDYDRTPRESNEMSNLHRKEIKALNEKIDKMILANQKPIHFV
;CC SESDVYQGYQEHMEGCVERQEEVNYAIGQGYNKFNLNYRNHPNLSYRSNNVENPQDQSYPPLKPPGFTQQ
;CC PNYQPQPQRNFQPKPQAYHHNQHQGSSSNPPPQADTNALLRQILEGQGRGAIDLATQMKGMHTKVDDIYG
;CC ELNAKIERLNVHVYSPSSSTSKHPMGTLKGKSETNHKEFCNAIFINDFDMVENMSYTQSREDGRIDENEK
;CC AIEEISKLLYGSNVENLMVASDEKAKKSTNGNDMITKSVEKKEASRVEPLPYEPLLPFPGRVLTKAKKKV
;CC FSSFKANMSRVGAPLPCVENLSQIPLHFKFIQAILENREKVEEIMRAFDSPITPQTEPKSIIKLEDPGKF
;CC TIPCSLGDLQLDDALCDSGASVNVMSLEMVKSLGVKDMNHHTSSIMFGDASSTTPLGLIEDYPLKVGDCI
;CC VPTDFMIVEMKDTHKVPLILGTPFLNTVGANIDFPNKRVTLLHVNGNVSYPIKPFSTKFCGTITNEEVKT
;CC KKEPKSLKDQLEIKVVNAKNREDFTVDEKILDGECLHFLFDAQEESTKKKELGKAKMVLEKNKRMMKRTH
;CC PPTLDNSPNPTSSMTLTLLRYNDGILEYRVKCKGRSNPFSSVKAILTPEFKEKGSKSVEELMKEVLTLAF
;CC KGSTRSFTTPPITSLPKVHN
;XX
;DR Positions 72809 68056 Accession No AC005965 GenBank (rel. 124.0)
;XX
;SQ Sequence 4754 BP; 1555 A; 859 C; 965 G; 1375 T; 0 other;
ATHILA7A_I
atttggcgccgttgccggtttgcaatagttattagccattaggatttcatatttgcttgagactaagcct
tttactttagttactaatcctttatacttgatctttgctcttgtcttaaaggtgtatgaacctacgaagt
tacggccgagcaaacttagtagaagagattaacgacattcgccggtttgaaagagaaaacgctagagcaa
gaagaaaaagagaacgacttgaacaccttgctcggttaggcgtcatgaatgaacccgaagttgggggcca
taatggaaatggccaagctaatgggggtggtcatatgccacaacatcaaccaagagctcatcaaccaatt
ggagcctttgatgagccaaacatccgcggaaatagaaacggtattcaagctcctccggttgagaacaaca
actttgagattaaatcaagcctcatcaacatggttcaaagctctaagtttcattgtttatcaatggagga
ccctttggatcatctagaccaatttgatatgttgtgtagcacggtgaaaatcaatgggatctcggaagat
gcattcaaacttagactcttcccattttctcttggagaccgagctcgtatttgggagaagaatctacctc
aaggctccatcacctcttggaccaatgcaaaagagcattcctcttaaagtttttctccacaacaagaacg
gcaaggttgagaaatgagatatctagcttcactcaaaggagtaataaaagtttttgtgaagcttgggaaa
ggttcaaaggatacaaaatgcaatgtcctcatcatggtttctctaaagagtctatactaagtaccttgta
tagaggagttctccctaaatttcgcatgttgttggacacggcaagcaatgcaaacttcttgggccaagat
attgatgatggtttgtctttggtagagaatctagctcaaagtgatggtaattatggtgaagattatgacc
gcactccaagagaaagtaatgagatgagtaatctacaccgcaaagagatcaaagctttgaatgaaaagat
agataagatgattcttgctaatcaaaaaccgatccattttgtatccgagagtgatgtctaccaagggtat
caagaacacatggaaggttgtgtagagcgacaagaggaggtcaactatgctattggacaaggctacaaca
aattcaacctcaactataggaatcatcccaacctttcatataggagcaataatgtggagaacccacaaga
tcaaagctacccacccttaaagcctcccggtttcacacaacaacccaactatcaacctcaaccccaaagg
aacttccaaccaaaacctcaagcctaccaccacaatcaacaccaagggagttcatcgaatcctccacctc
aagcggatacaaacgctcttcttagacaaattctagaagggcaaggaagaggagcaatcgatcttgccac
acaaatgaaaggaatgcacaccaaggttgatgacatttatggtgagctcaatgccaaaatagagcgtttg
aatgtccatgtttatagcccatcctcatccacttccaagcatccaatgggaactttaaagggtaaatctg
aaactaatcataaggagttttgcaatgctatcttcataaacgattttgatatggttgaaaatatgagcta
tacacaatcaagagaggatggtagaattgatgagaatgaaaaggcaatagaagagatctctaaacttcta
tatggttctaatgttgagaacttaatggttgctagtgatgagaaggccaagaagagtacgaatgggaatg
atatgataaccaagagtgttgaaaagaaagaggcatcaagagttgagcctcttccttatgagcctctact
tccgtttcccgggcgagttcttaccaaagctaaaaagaaagttttctcaagttttaaagccaacatgagt
agagttggagcgcctttgccttgtgtggaaaacttgtctcaaattcctcttcactttaagttcattcaag
ccattctagaaaatcgagagaaagtggaagagatcatgagagcctttgattcaccaatcacaccacaaac
ggaaccaaaatctattatcaagcttgaagatccgggtaaattcaccataccttgctcacttggtgattta
caacttgatgatgccttatgtgattcgggtgcaagcgtgaatgtgatgtcacttgagatggtaaagagtc
taggggttaaagatatgaaccaccacacctcctccatcatgtttggagatgcttcttcaacaactcctct
tggtttgattgaagactatccactcaaagttggtgattgcatagtcccaacggatttcatgatagtggag
atgaaagacacccacaaggttcctctcattctaggaactccatttctcaacaccgtgggagccaacatcg
acttcccaaacaagagagttactcttctccacgtgaatggtaatgtctcatatccaatcaagcctttctc
tacaaagttttgtggaacaatcactaatgaagaagtgaaaacaaagaaagagcctaaaagtttgaaggat
caacttgagatcaaagttgtgaatgcgaagaatagagaggatttcacggttgatgagaaaatcttggatg
gtgaatgtctacactttttgtttgatgcgcaagaagaaagtactaaaaagaaggagttgggtaaagctaa
gatggttcttgagaagaacaagaggatgatgaaaagaacccaccctcccacccttgacaattcaccaaat
cctacctcttctatgactctaacacttttgaggtataatgatggaattttagagtatagagtcaagtgca
agggtcgatccaaccctttttcatcggttaaagcgattctcacaccggagttcaaagaaaaaggttcaaa
aagtgttgaagaattgatgaaagaggtcctcacacttgcattcaagggctctacaaggagtttcacaact
cctcctatcacttcactccctaaggttcacaactagggccaaggtatggtcctatctatcctttgtatat
actattttctttcattttctctttgtttttcgttagtcttttttccggaattatctttacaccgagacgg
tgtgaaataagtttgggggagagactaaccatctaatgttgtgttatgttttgattttctttctttgagt
cttgcattgtcaataactatgttttgagaaaaataaaaaaaaaatctgaaaattttgaaaaatccaaaaa
aaatcatgtaggtttgcatattaattttctcttttaggattgagtctagtagcatctagtgtacatttgc
atttgcataggggataatgatttaattaccttgtagaaattcaagcttaggattgagcttaaagcccttg
atcacagttaatcgttgcttgatcatcttggaaaaaatggaaaacccatgcttagacttttgcgtttctc
gaaagcaatttctacttgtgtgaagcttacatgacttttgaaatatctctatgaatttggacttaatttg
atattgatttctcgcttatggatcctaaatgtattctcatggtagtctaacagactcattcgggttatct
tattcgtttagccaatcttttgttaacccgaatggccattctcacctttaaaatggttttccctaccctt
aaccttaaaacctttctttcaagcctatgtactatttgtgagtgaggcctcttgtatgaaaatgctatag
attttgaaatcttcaaagtattgagaacgacaaggttttagctcttgtttttagctagcttttggaacat
ctttggactagctactaggttgggaatgaaagatttaagtggttagattgaatacttgggattgggtttg
gaaaaaagaaaagaaaaaagaatgaatgaaaagaaatgtgtgatcaatggttttataggagtctctaggg
aataagttataaataagaaagctctaaatctctaagtgtaaagagaaagtttaatatcaagctagttccc
caaagaaaagaaacttgcaaagctataagtatttgggaacaaagataaacttagaaaagaaatagaaaag
aacaaagtgtcaaaagaagaagtcttccctaggtgactagcaagaaaggattgatcttgaggattgtggt
tcaaaaagaccaaagtgtctattgattggggttgaaaagaagaagaaaagagtttgcaagaagggtagaa
ataggagtttacatctaagacgggtagaacaacaagagctagactttgtatgcataaattgttcttggtc
ttagatggatttcatgattgtctagcattactattttttttagagaaacccaccaaaatgtaaatctttt
gaaaccacctccaaaagccttcaccaagaccgattgagaatgatttatcttccatttgcgagaccgcgcc
aaacacttaaatgaatgttaaagagatgtttcaacttgattgtgagtctttgctaatagttggagatttg
aaattgagctatgcactaagaagagtatggagaagtacgaaagagtatgatgattgattcaagcaaattt
gaactatctttaaggactttgagttgaatccttcgtttgatattttgtggctatgaatccccaccttcaa
acctctttcttttctatctcttagttcttgcttgaggactagcaagaaataagtttgggggagt1