;ID ATHILA4C_I DNA ; ATH ; 8639 BP
;XX
;DE ATHILA4C_I is an internal part of the ATHILA4C endogenous
;DE retrovirus - a consensus sequence.
;XX
;AC .
;XX
;DT 31-MAY-2001 (Rel. 6.2, Created)
;DT 31-MAY-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW Gypsy-like endogenous retrovirus; ATHILA superfamily;
;KW protease; reverse transcriptase; integrase; ATHILA4C_LTR; ATHILA6B_I;
;KW ATHILA4C_I.
;XX
;OS consensus
;XX
;OC Arabidopsis thaliana
;OC Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN [1] (bases 1 to 8639)
;RA Kapitonov,V.V. and Jurka,J.
;RL Direct submission (May 2001)
;XX
;CC ATHILA4C_I is an internal part of the ATHILA4C endogenous
;CC retrovirus. There are ~10 copies of ATHILA4C_I in the genome;
;CC they are ~94% identical to the consensus sequence. The consensus
;CC sequence was derived from 10 proviral copies. These copies are
;CC flanked by ~96% identical LTRs (ATHILA4C_LTR). They have generated
;CC 5-bp target site duplications upon their integration into the
;CC genome. ATHILA4C_I sequences can be split further into
;CC minor subfamilies.
;CC ATHILA4C_I encodes two proteins: 1772-aa ATHILA4C1p (positions
;CC 155-5473) and 499-aa ATHILA4C2p (positions 5863-7362).
;CC ATHILA4C1p is composed of the retroviral gag-like, protease, reverse
;CC transcriptase and integrase domains.
;CC ATHILA4C1p:
;CC MADFHDCVDRHHQGVDRHPLQNQTATIGDFNRTDLFYTNRSAFKLPPFERDDFAIHPAYYDLVSRGKFRG
;CC APDESPLDHLEVFEDIVSSIKAEGVPADYLLCKLFPHSLASRGTSWLRQLEPGSLTNWTDTKNAFMNHFF
;CC DESVTEAIRLQISSFTQAPAESLRASWLRFRSYQRECPQHGFQESQLINLFYKGIDKPYQNQLDAASSCN
;CC FMTRTTSEALLLITNALTCLSTQEFDKERRISAEIATASKETPVSAISAPIQAPPPPSSETRMESMLAQL
;CC LAGLTKLDTKYESLSTKLDSKYDSLSTDLNSKIDNLRSQFSNLSPTSASINAVTLRSGKQLNPILQRERS
;CC AQPSSFPIAENESVSIDTPGCRSTPITLDDSVFPLSSGIDNFAEEEETIPDGVDRHPAPVDRHPARSDNV
;CC QIPAATKSANRRIPFPKSPKKSRQALDDVRCKAMIDKLIVEMPLVEAIHLSPTIRRYVKTMVTKNLTKEC
;CC SVMMISEQGSDIIQERIPRKLPDPGTFVLSVTINHDSFPRALCDLGSSVNLMPRSVAMRLGYSNLEPTFI
;CC TLVLADRSTRIPDGILIDVPVMIGKSMIPTDFVVLPYEKEPKDPLILGRSFLHTAGAIIDVRQGRIGLNV
;CC GDLTMQFDMNTLVKKPIIEGKTFLIDSFTSSASDSISEMELEDPLERVLVSSIEDSADLDSETSTYTKLL
;CC DETEHVMQLTVEEALPSVTSTPTTTSDWDPAKAPKIELKPLPAGLRYAFLGENSTYPVIVNASLNPAELT
;CC LLLSKLRNHRKALGYSLDDIAGISPDVCMHRIHLEDESKSSVEHQRRLNPNLKEVVKKEIMKLLEAGIIY
;CC PISDSSWVSPVHVVPKKGGVTVVKNEKDELIPTRTITGHRMCIDYRKLNAATRKDHFPLPFIDQMLERLA
;CC NHKYYCFLDGYSGFFQIPIHPDDQEKTTFTCPYGTFAYRRMPFGLCNAPATFQRGMMSIFTDMIEDIMEV
;CC FMDDFSVYGSSFEDCLENLYKVLARCEEKHLVLNWEKCHFMVQDGIVLGHRISEHGIEVDRAKIEVMTSL
;CC QALDNVKAVRSFLGHAGFYRRFIKDFSRIARPLTALLCKEVKFEFTQECHDAFQQIKQALISAPIVQPPD
;CC WDLPFEVMCDASDFAVGAVLGQRKDKKLHAIYYASRTLDDAQRNYATTEKELLAVVFAFEKFRSYLVGSK
;CC VIVHTDHAALKYLMQKKDAKPRLLRWILLLQEFDIEVRDKKGVENGVADHLSRIRIDDDVPINDFLPEEN
;CC IYMIDTAEEDDYKRDRLQNRASVSIDTPIMSIDTHISEEVDIRSCAMVSIDTIAPVDRHPSESTRNWSPT
;CC ENCAVTAVEKDYPWYADIVNYLAADVEPDNFTDYNKKRFLREIRRYYWDEPYLYKHCSDGVYRRCIAATE
;CC VPDILSHCHSSSYGGHFATFKTVSKVLQAGFWWPTMFRDAQKFISQCDPCQRRGKISKRNEMPQKFILEV
;CC EVFDCWGIDFMGPFPPSNKNLYILVAVDYVSKWVEAIASPKNDSAVVMKLFKSIIFPRFGVPRIVISDGG
;CC KHFINKILAKLLLQYGVQHRVATPYHPQTSGQVEVSNRQIKEILEKTVGKAKKEWSYKLDDALWAYRTAF
;CC KTPLGTTPFHLLYGKACHLPVELEHKAAWAVKMMNFDIKSAGERRLIQLNELDEIRIHAYDNSKLYKERT
;CC KAYHDKKILTRTFEPNDQVLLYDSRLTIFPGKLSSRWTGPYTVHSVRPYGTVILKNNNGKPFAVNGQRVK
;CC HYWAEAEIPVEKPLDLQDPPVD
;CC
;CC ATHILA4C2p:
;CC MTGQKKKARKSVGPSSSSAVPHHRRRFSTAAGLRPSQPESQAAAPQPPIDRFPWPKLPKERIPSQRVWEK
;CC DVNREFTKGDYINRPFSPDWDDYDTLFYNAWMSVEILPTRFADFGLMQRLQIEKSVLGLLDDIGLGTICC
;CC RQYDLYPELVKQFMASVRVSYVNDRKRNAQEGALIFFIRGVRYSLPLRDLCDIYGFDNDLTGVSLPGQFK
;CC DSQIFWSRFGNGIYDSKDAVHSEIRHPVLRYLVRLISSTLLCKMEPGKMRLSELLLLYHALHDFFPDSLG
;CC FEQVDRNVNFGAVFAHHLVSLKTKPFTGRGQKSERVGSLLTPIFEHFRISFEGEEVNTTRVTMDETYLKN
;CC SHWLKGNLLWCFRDDTGQHMIQLPRPALTEITGEHEEIGFHPDPSLLHAAPRTRRQRGSASGSAPTQTED
;CC EFIDPAGGPRVGSSSSALPYQLPSPPPIPMEPQVFQQYVVDSFKSVWNAIATLSRCGCVAPTRRRRRSPA
;CC PTSGSEHED
;XX
;DR [1] (Consensus)
;XX
;SQ Sequence 8639 BP; 2426 A; 1875 C; 1844 G; 2494 T; 0 other;
ATHILA4C_I
aatttggcgtcgttgccgaagttctcttttagttcaccattagactaggtgtcattttaagtctagatac ttcttttctatcacctattcacgtgtttctttgtcttttggtctcttgtgtttcaggtaccttcagattg agaatcacgtttccatggcagactttcacgattgtgtcgaccgacaccatcagggtgtcgatcgacaccc tctgcagaaccagacggctacaataggcgattttaacagaactgatctcttctacacaaaccgatcagca ttcaagctacctccatttgaaagagatgactttgctatacaccctgcctattatgatcttgtatctagag gaaagtttagaggagctcccgatgaatcaccattagatcatttggaagtcttcgaagacatcgtttcctc tattaaagcagaaggtgtaccagctgattaccttctttgcaaactcttcccccactctctcgcgagcaga ggaacatcatggttgcgtcaactggaaccaggttccttaaccaattggactgatacaaagaatgcattca tgaatcacttcttcgatgaatccgtgacagaggcaattcgtttgcaaatttcttcgttcactcaagctcc agccgagtcccttagagcatcttggctccgattcaggtcttatcagcgtgaatgtccccaacatggcttc caagagagccaactcatcaacctattctacaaaggaatagacaagccataccagaatcaacttgatgcag caagctcctgtaacttcatgacaagaaccactagtgaagcattactcctcatcaccaatgctttaacttg tctttcgacgcaggaatttgacaaagaacgaagaatttcagctgagatagccaccgcgagcaaagagact cctgtttcagcaatttctgctcctatccaagccccccctccaccctcatcagagaccagaatggagtcta tgcttgcgcagcttcttgcaggcctaacaaagcttgacaccaaatatgaatctctctccacaaagcttga tagcaaatatgattctctctccacagatctcaatagcaagattgacaatctacgctctcaattctccaat ctctcacctacatcagcttctatcaatgcagtcacactacgcagcggaaagcagctcaacccaatacttc agcgcgaacgatcagctcagccttcatcttttccaattgcagaaaacgaatcagtgtcgatcgatacacc agggtgtcgatcgacaccaattactcttgacgactctgttttccctttgtcgagtggaatcgacaatttt
gcggaggaggaggaaactattcccgatggtgtcgatcgacacccagctcctgtcgatcgacacccagctc gttcagacaatgtgcaaattccagctgcaaccaagtcagcaaatcggcgaattcccttccccaagagtcc taaaaagtcaagacaagccttagatgatgtcagatgcaaggctatgattgataagctgattgttgagatg cctctagttgaagctattcatctatctcccacaatcagaaggtatgttaagacaatggtcactaagaatt tgacgaaagaatgcagtgttatgatgatttcagagcaaggcagcgacatcattcaggaacgaatcccaag aaagctacctgatcctggaacttttgtacttagtgtcaccattaatcacgattccttcccaagagcttta tgtgatctcggttctagtgtgaatctgatgcctcgttcagttgctatgcgtcttgggtattccaatcttg agcctacctttatcactcttgtgttggcagatcgttctacccgaattccagatggaattcttattgatgt tcccgtgatgattggaaagagcatgattcctacagactttgtagtcttgccttatgagaaggaacctaag gatccattaatccttgggagatccttcttgcacacagctggagcaatcatagatgttcgacaagggagga tagggctgaatgttggcgatcttactatgcagttcgatatgaacactttggtcaagaaaccgatcataga agggaaaaccttcttgatagattctttcacttcatcagcttcagatagtatttcagagatggaattagaa gacccactagagcgagttttggtctcttccatagaggatagtgcagatttggacagtgaaacttctacat atactaagctgctagatgagaccgaacatgtgatgcaacttacagttgaagaagctcttccatctgtgac ttcaacgccgactactacttcagattgggatccagcaaaggctcctaaaatcgaattgaaaccacttcca gcagggctaaggtacgcttttcttggtgaaaattctacttatcctgttattgtgaatgcttctcttaatc ccgcagagctcaccttattgctaagcaagctgcgcaatcatcgcaaagctcttggctattctcttgatga cattgcaggtatatctccagatgtatgcatgcataggatccaccttgaggatgagtctaagtcttcagtt gaacatcagagaaggctgaatccgaatctgaaagaagtggttaagaaagagattatgaaactgttggaag cagggattatctatccaatttcagatagcagttgggttagtccagttcatgtggttcctaagaagggagg tgttacagtagtcaagaatgagaaagacgagctgattcctactcggacaatcacaggacatcggatgtgc atcgattacagaaagctgaatgctgctaccaggaaagaccatttccccttaccatttatcgatcagatgt tggagaggttagcaaatcataagtactattgcttccttgatggatactcaggattctttcagatcccgat tcatccagatgaccaggagaaaacgactttcacctgcccctatggtacatttgcttatcggagaatgccc ttcggtctttgtaatgctcctgcaacatttcagagaggtatgatgtctatcttcacagacatgattgagg atatcatggaggttttcatggatgatttttcagtttatggatcatcgtttgaggattgcttagagaatct ctacaaagtgttagcaagatgtgaggagaaacatctagttttgaattgggagaaatgtcacttcatggta caggatggaatagttctcggacacaggatttctgagcatggtatagaagttgatagagctaagattgagg 
tcatgacaagtcttcaagcgcttgataatgttaaagcagtgaggagtttccttggacatgctggtttcta caggagattcatcaaagacttcagcagaatcgcaagaccattgactgctttactctgtaaagaagtcaaa ttcgagtttacacaagagtgtcatgatgcctttcagcagataaaacaagccttgatcagcgcaccaattg ttcagccaccagattgggatttaccttttgaggtaatgtgtgatgcgagtgattttgcagttggagctgt tctaggacagaggaaggataagaaacttcatgccatctactatgcaagtagaactcttgatgatgctcaa aggaattatgcaactacagagaaggagttgttagctgtggtgtttgctttcgagaaattcagatcttatc tcgttggatccaaggttattgttcacacagatcatgctgccttgaagtatttgatgcaaaagaaggatgc taagccaagacttttgagatggattttgcttcttcaagaatttgacatagaggttagagataagaaagga gttgagaatggtgtagctgatcatttgtcacgcattaggatcgatgatgatgtccctataaacgatttct tgcctgaagagaacatatacatgattgatacagctgaagaagatgactacaaacgtgacaggttgcagaa tcgagcttcagtgtcgatcgacactcccattatgtcgatcgacactcacatttcagaggaagttgatatt cgtagttgcgcgatggtgtcgatcgacaccattgcacctgtcgatcgacacccttctgaatcgacaagaa attggtcaccaactgagaattgcgccgtcacagcggtcgagaaagattacccatggtatgctgacattgt taattacctagctgcagatgtggaacctgataatttcacagattacaacaagaagagattcctaagagag attaggcgatattattgggatgaaccttatctctataagcattgttctgatggagtttacaggagatgca ttgctgcaacagaggttcctgatatactatcgcattgtcacagctctagctatggtggtcattttgcaac cttcaagacagtatccaaagttcttcaagcaggcttttggtggcctacaatgtttcgggatgctcagaag ttcatatcgcaatgtgacccttgtcagagaaggggaaagatcagcaagcgtaatgagatgcctcagaaat ttatactcgaagtcgaagtcttcgattgttggggtatagatttcatgggaccattcccaccttccaacaa gaatctctacatcctagtagctgtggattatgtctccaaatgggtagaagctatagctagtccgaagaac gattcagcggttgtcatgaaactcttcaagtctatcatcttccctcgttttggagtgccacgcatagtca ttagtgacggaggtaagcatttcattaacaagattcttgcgaaattgcttttacagtatggagtccagca tcgggttgctactccctaccatccacaaacgagcggccaagttgaagtttccaacaggcaaatcaaagag attcttgagaaaacagtgggtaaagcgaaaaaggagtggtcctacaagttagatgatgcactgtgggcct acaggacagctttcaaaactccgcttggtaccacaccttttcatcttctatatggtaaagcttgtcatct tccagtggaattagagcacaaagcagcttgggcagtcaagatgatgaatttcgacatcaaatcagctgga gaaaggagacttattcagttgaatgagcttgatgaaatacgaattcacgcctatgacaactcgaagctct acaaggaacgcacaaaagcttaccacgacaagaagatcctcactcggacctttgagcctaatgaccaggt 
acttctttatgattctagattgacaatatttcctggaaaattgtcttctcgttggacaggtccctacaca gtccactcggttaggccatatggaacagtcattctcaaaaacaacaatggaaaaccatttgctgtgaatg gacagagagtcaaacactactgggctgaagccgagattccagtagaaaagcctttggatcttcaggaccc acctgttgattaagcaactgcaaagtcaagctattgactataaacaagcgcttagtgggaggcaacccac tggtaagtatttctcctttttatttatttcattcatttgattcttaacttaggatttttgattttatagg actattcagattaaaaaaaaaaaaaaaaaaaaaacgtgtcgaccaaaaacgcgatggtgtcgatcgacac cctcattttccaggtaagtcgttttaacctaaaacttcatcttcttcatttcatttctctctaaaaccgc cgaattcactaaaacctctctagatccttcgattcttcaccgtttcttgtgatttcaggctctaaactca ttagaatctcatcctctatcttccccaagtgtcaattccatcgatctgcaagatgacagggcagaaaaag aaggcgaggaagagcgtcggaccttcctcctcctccgccgtccctcaccaccgccggcgcttctccaccg ccgccggtctccgcccttcccaaccggaatcccaagctgcagcccctcaacctcctattgatcgatttcc atggccaaagcttccaaaagagagaattccttctcaaagggtttgggagaaagatgtcaacagggaattc actaaaggtgactatatcaatcgccctttttctccagattgggatgattatgatactctgttctataatg cctggatgagtgtagaaattttgcctactcggtttgcggatttcggtttgatgcagcgtttgcagataga gaaatctgttttaggactgctagatgacataggtttgggaaccatctgttgtaggcagtacgatttgtat cctgagctggttaaacaattcatggcttctgttagggtttcttatgtgaatgacaggaaacgcaacgctc aggaaggagctcttatcttcttcattcgcggtgtcagatacagtttaccactgcgagatttgtgtgacat ctatgggtttgacaatgatctcactggagtttccctgcctggtcagtttaaggattctcagattttctgg agcaggtttggtaatggaatctatgattccaaggacgcagtacactcagagatccgtcaccctgttctgc gatatttggttaggctgattagtagcacactgttgtgcaagatggagcctggcaagatgagactttctga gttgttattgctctatcacgctcttcatgacttctttccagacagccttggatttgagcaggttgaccgt aatgtcaactttggtgctgtgttcgctcatcacttggtttctctcaagaccaagccctttacaggtagag gacagaaatctgagagagttggcagtcttctcactcccatattcgagcatttccgcatcagctttgaggg cgaagaagtcaacaccacgcgtgtcactatggatgagacttacctgaagaactctcactggcttaaaggc aacttactgtggtgcttcagagatgatacaggtcagcacatgattcagttacctcgccctgcacttacag agattacaggagagcatgaggagataggctttcatcccgatccttctcttctgcatgccgcaccacgcac caggcgacagagaggttcagcttcaggatctgcgcctacacagaccgaggacgagttcattgaccctgct gggggtccgagagttggatcttcctcatccgctcttccatatcagctgccttctcctccacccatcccga 
tggagccacaggtctttcagcagtacgtcgtcgacagtttcaagagcgtctggaacgccatagctacact ttctcgctgcggttgtgttgctcccactcgtcgtcgtcgtcgctctcccgcacccacctccggctcagag catgaggactagccatcctccttatcttgctttgttttctgactttgtttttgctttatttcagactatt ggttatttgttttgaaacttgttgcacttattttcttatgctttgttggaactattgatgtgtgttttag tgatttaacagtctaactggaggatccatgtgaaaacactaaccaaggctctacaacaatgagcagaatt cacaaagctaagcgaaacgaccagagtgtcgatcgacaccaacatggtatcgatcgacaccactctcagg tggcaactcacgttttctaactttttaacttttgattatttctctttcgattcctctagcttaataactc tggggacagtgttatctaagtctgggggagtcagttactaacttcatcttttcttttcttatgagtcagt tgtgtcattttaacttagatagtcgttttattgagtcaaaacttttgtctgaaattatgatctatctctc gagtttattgcctggattgcttaagtactgcagattcaaatcaatccgactaatggaaaatccagaaagc caatcaacaaacttaaggacatccattcccaacataatccacaaaggctggagctaatctcactgctctt ccttttacctcatcaaaacggttagtgctcataggacttgactccttgttcatacctgataagtcgacta atgtgttaaaaagaaagtgttcctctccccaaataaaaaaaaaaaaataataataaataatgagagattg agaagactttcagatagtgtataggggtaggaatgtttcctatgaccccattattatacattatttggaa cgatcaattacaaaggttggaaaaatgccgagggtgtcgatccagtatgcgagttcccctttgtttactc tctaaaaagaataagtctgggggagagaaagaacaccaaagaaaaaaatatatatataaagaatggtcga tttatcatggtatagtacaggaatgagtctagaaggttcttaagcatttacttgattgatgagacccagc gatgatagcctagtggatatagttgagtgctttggtatggaatctaggatgttgtgtgggctagcggaga attacatggtggtcaggatttgattagaactgtttagtgcatggatttggtttgtatgatcaaggtaata aactagagagattgggagtttctatgtttttcaaacctctttcacagattcaaccttttgttttgcttga gggcaagcaaaagctaagtctgggggagt1