;ID ATCOPIA56_I DNA ; ATH ; 4210 BP
;XX
;DE Internal region of the ATCOPIA56 copia-like LTR-retrotransposon -
;DE a consensus sequence.
;XX
;AC .
;XX
;DT 01-OCT-2001 (Rel. 6.2, Created)
;DT 01-OCT-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW LTR-retrotransposon; COPIA superfamily; internal region;
;KW copia-like polyprotein; reverse transcriptase; ATCOPIA56LTR;
;KW ATCOPIA56_I.
;XX
;OS consensus
;XX
;OC Arabidopsis thaliana
;OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN [1] (bases 1 to 4210)
;RA Kapitonov,V.V. and Jurka,J.
;RL Repbase Reports 1:(1) p. 14 (2001)
;XX
;CC ATCOPIA56_I is a consensus sequence of an internal region of ATCOPIA56
;CC copia-like endogenous retrovirus. There are three copies of ATCOPIA56_I
;CC present in the genome; they are 99% identical to the consensus sequence,
;CC and are flanked by ~2% divergent ATCOPIA56LTR LTRs and 5-bp target-site
;CC duplications.
;CC ATCOPIA56_I encodes the 1358-aa ATCOPIA56p copia-like polyprotein.
;CC ATCOPIA56p:
;CC MSSGRAEVEKFDGDGDYILWKEKLLAHMEMLGLLEGLGEEEEAEVEDSTTEISDGGNQDPETATSKLEDK
;CC ILKEKRGKARSTIILSLGNNVLRKVIKQKTAAGMIKVLDQLFMAKSLPNRIYLKQRLYGYKMSENMTMEE
;CC NVNDFFKLISDLENVKVVVPDEDQAIVLLMSLPRQFDQLKETLKYCKTTLHLEEITSAIRSKILELGASG
;CC KLLKNNSDGLFVQDRGRSETRGKGPNKNKSRSKSKGAGKTCWICGKEGHFKKQCYVWKERNKQGSTSERG
;CC EASTVTAQVTDAAALVVSRALLGFAEVTPDTWILDTGCSFHMTCRKDWIIDFKETASGKVRMGNDTYSEV
;CC KGIGDVRIKNEDGSTILLTDVRYIPEMSKNLISLGTLEDKGCWFESKKGILTIFKNDLTVLTGKKESTLY
;CC FLQGTTLAGEANVIDKEKDETSLWHSRLGHIGAKGLQVLVSKGHLDKNMIKDLQFCEDCVYGKTRRVSFG
;CC AAKHVTKDKLDYVHSDLWGSPNVPFSIGKCQYFITFIDDFTRRTWIYFIRTKDEAFSKFVEWKTQIENQQ
;CC DKKLKILRTDNGLEFCNQEFDSFCRKEGVIRHRTCAYTPQQNGVAERMNRTIMNKVRCMLSESGLGKQFW
;CC AEAASTAVFLINKSPSSSIEFDIPEEKWTGHPPDYKILKKFGSVAYIHSDQGKLNPRAKKGIFLGYPDGV
;CC KGFKVWLLEDRKCVVSRDIVFQENQMYKELQKNDMSEEEKQLTEVERTLIELKNLSADDENQSEGGDKSN
;CC QEQASTTRSASKDKQVEETDSDDDCLENYLLARDRIRRQIRAPQRFVEEDDSLVGFALTMTEDGEVYEPE
;CC TYEEAMRSPECEKWKQATIEEMDSMKKNDTWDVIDKPEGKRVIGCKWIFKRKAGIPGVEPPRYKARLVAK
;CC GFSQREGIDYQEIFSPVVKHVSIRYLLSIVVQFDMELEQLDVKTAFLHGNLDEYILMSQPEGYEDEDSTE
;CC KVCLLKKSLYGLKQSPRQWNQRFDSFMINSGYQRSKYNPCVYTQQLNDGSYIYLLLYVDDMLIASQNKDQ
;CC IQKLKESLNREFEMKDLGPARKILGMEITRNREQGTLDLSQSEYVAGVLRAFGMDQSKVSQTPLGAHFKL
;CC RAANEKTLARDAEYMKSVPYPNAIGSIMYSMIGSRPDLAYHVGVVSRFMSKPSKEHWQAVKWVMRYMKGT
;CC QDTCLRFKKDDKFEIRGYCDSDYATDLDRRRSITGFVFTAGGNTISWKSGLQRVVALSTTEAEYMALAEA
;CC VKEAIWLRGLAAEMGFEQDAVEVMCDSHNAIALSKNSVHHERTKHIDVRYHFIREKIADGEIQVVKVSTT
;CC WNPADIFTKTVPVSKLQEALKLLRVSSN
;XX
;DR [1] (Consensus)
;XX
;SQ Sequence 4210 BP; 1431 A; 701 C; 1068 G; 1010 T; 0 other;
ATCOPIA56_I
gtttggtatcagagcttccaggttactacctaggagcgagaagatctgctcaagatgtcttcgggcagag
cagaggtggagaagttcgacggagatggggattacatcctgtggaaagaaaagttactggctcatatgga
gatgttgggacttctggagggtctcggggaagaagaggaagcagaggttgaagattctaccactgagatt
agtgatggaggaaaccaagacccagaaactgcaacttctaaactggaagacaagatcctcaaagaaaaaa
gaggaaaagccagatctaccatcatcttgagcctgggaaacaatgttctgagaaaggtcatcaaacagaa
gacagcagcaggtatgataaaggtcctggatcagttatttatggcaaaatctcttccaaatcgcatttac
ttgaagcagaggctgtatggctacaagatgagtgagaatatgacgatggaggagaatgttaatgatttct
tcaagttaatatcggacttggaaaacgtaaaggttgtagtcccagatgaagatcaagccatagtcttgct
catgtctttaccaagacagtttgatcaactgaaggagacactgaagtactgcaagactacacttcatctc
gaagaaatcacaagtgccataaggtctaagatcttggagttgggagctagtggtaagcttctcaagaata
actcagatgggttgtttgttcaagacagaggcagatcagaaaccagggggaaaggaccgaacaagaacaa
gagcagatctaagtcaaagggagcaggaaaaacgtgttggatctgtggcaaggagggtcatttcaagaag
caatgctatgtatggaaggagaggaacaagcaaggttccacatctgaaagaggagaggcttctactgtaa
ctgctcaagtcactgatgcagctgcactagtagtttcaagagctttacttggctttgctgaagtcacccc
agatacatggattctagacacagggtgttccttccatatgacctgcagaaaggattggatcatagacttc
aaggagactgcaagcgggaaagtaaggatgggcaatgatacttattctgaagtgaaaggaattggggatg
tcagaatcaagaatgaggatggatctactatcttgctcactgatgtcaggtacataccagaaatgtcaaa
gaacctcatctcacttggaactcttgaagataaaggctgctggttcgaatcgaagaaaggtattttgact
atttttaagaatgatcttactgtactaactggaaagaaagagagtactttgtattttctccagggaacga
cacttgcaggtgaagccaatgtcatagacaaagaaaaggatgaaacaagtttatggcacagcaggcttgg
tcacattggtgcaaaagggctgcaggttttggtcagtaaaggtcatctggataagaacatgattaaagat
ttgcagttttgtgaagattgtgtgtatggaaaaacacgcagggttagctttggagctgcaaagcatgtca
caaaagataaactcgactatgtgcattctgatctatggggatcaccgaatgtaccattctccattggtaa
gtgtcagtatttcatcactttcattgatgattttacgaggagaacttggatctatttcattagaaccaaa
gatgaagctttcagcaagtttgtagaatggaaaacacagattgaaaaccaacaggacaagaagctcaaga
ttctcagaacagataatgggctggagttctgtaaccaggagtttgattcattctgcagaaaagaaggagt
tataaggcacaggacatgtgcttacacaccacagcagaatggtgttgctgaaaggatgaacaggaccatc
atgaacaaggtcagatgcatgttaagtgaatcagggttggggaaacagttctgggcagaagcagcgtcta
ctgccgtgttcctcatcaacaaaagcccaagctcttcaatagagtttgatattcctgaagagaagtggac
tggtcatccaccagattacaagatactcaagaagtttggatcagtcgcttatattcattcagatcaagga
aagctgaatcctagagcaaagaaggggatttttctcggatatccagatggtgtaaagggattcaaagtgt
ggctgctagaagacaggaaatgtgtagtctctcgagacattgtttttcaagaaaatcagatgtacaagga
actgcagaagaatgatatgtctgaggaagaaaaacagctcactgaagtagaaaggactctcatagagcta
aagaatttgtctgcagatgatgaaaatcagagtgaaggaggagataagtcaaaccaagaacaagcttcaa
caacaagatctgcaagtaaagacaaacaagtagaggaaactgattctgatgatgattgtctagagaacta
tctactggccagggatagaattcgaagacagatcagagctccacagagattcgttgaggaagatgacagc
cttgttgggtttgcattaacaatgacagaagatggagaagtttatgaaccagaaacctatgaagaagcca
tgagaagtccagaatgtgagaaatggaagcaagctaccatagaagaaatggactccatgaaaaagaatga
cacatgggatgtcattgataagcctgaaggaaagagagttataggctgtaagtggatattcaagagaaaa
gcaggaattcccggagtagaaccaccaagatacaaagctaggcttgtcgccaaaggattttcacaaagag
aaggcatagactatcaggagattttctcacctgtagtcaagcacgtgtcaatcaggtatcttttatccat
tgtggttcaatttgacatggaattagaacagcttgatgttaagactgcgtttttacatgggaatctggat
gagtatatattgatgagtcagcctgaaggatatgaagatgaggacagcacagaaaaagtctgtttgttaa
agaaatctctgtatgggctgaagcagtctccaagacagtggaatcagagatttgactcattcatgatcaa
ctcaggttatcaaagaagcaagtataatccatgtgtctacacacaacaacttaatgatggatcgtacatc
tatctactgttgtatgtagatgatatgctcattgcatcacaaaacaaggaccaaatccagaagttaaaag
agtcactcaacagagaatttgagatgaaggatttagggcctgcaagaaagatactgggaatggaaatcac
aagaaacagagaacaaggcactttggacctgtctcagagtgagtatgtggctggagtgttgagagctttt
gggatggatcaaagtaaggtctctcagacgccacttggtgcacacttcaagttaagagccgcaaatgaga
aaactcttgcaagagatgctgagtatatgaagtcggttccctaccctaatgcaattggaagtatcatgta
ctctatgataggatcaaggccagacttggcatatcatgtgggggttgtaagccggtttatgagtaaaccc
tcaaaagaacactggcaagctgttaagtgggtcatgaggtacatgaagggaacacaagatacctgtctaa
ggttcaagaaagatgacaaatttgaaatcagaggctactgcgattcagattatgcaactgatttagacag
gaggagatcgattacaggatttgtattcacagctggtgggaatacaataagctggaagtcgggtttacag
agagtggtggctctgtcaacaacagaagctgaatatatggcccttgcagaggcagttaaagaagccattt
ggctaagagggttagctgcagagatggggtttgaacaagatgcagtagaagttatgtgtgattcacacaa
tgccattgctttgtccaagaactcagtccaccatgagaggacaaagcatatagacgtgaggtatcacttc
ataagggagaagatagcagacggagagattcaggttgttaaggtttcaacaacatggaatcctgcagaca
tcttcacaaaaacagttccagtgagtaagcttcaagaagcgctgaagctactcagggtctcaagtaacta
gggagaccacagatccgagattggaagtcaagtaacactaagaagatgagttcagtgaactttataccaa
ggaggagttt1