;ID   ATCOPIA56_I DNA   ; ATH   ; 4210 BP
;XX
;DE   Internal region of the ATCOPIA56 copia-like LTR-retrotransposon -
;DE   a consensus sequence.
;XX
;AC   .
;XX
;DT   01-OCT-2001 (Rel. 6.2, Created)
;DT   01-OCT-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region;
;KW   copia-like polyprotein; reverse transcriptase; ATCOPIA56LTR;
;KW   ATCOPIA56_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] (bases 1 to 4210)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Repbase Reports 1:(1) p. 14 (2001)
;XX
;CC   ATCOPIA56_I is a consensus sequence of the internal region of the
;CC   ATCOPIA56 copia-like endogenous retrovirus. Three copies of ATCOPIA56_I
;CC   are present in the genome; they are 99% identical to the consensus
;CC   sequence and are flanked by ~2% divergent ATCOPIA56LTR LTRs and 5-bp
;CC   target-site duplications.
;CC   ATCOPIA56_I encodes the 1358-aa ATCOPIA56p copia-like polyprotein.
;CC   ATCOPIA56p:
;CC   MSSGRAEVEKFDGDGDYILWKEKLLAHMEMLGLLEGLGEEEEAEVEDSTTEISDGGNQDPETATSKLEDK
;CC   ILKEKRGKARSTIILSLGNNVLRKVIKQKTAAGMIKVLDQLFMAKSLPNRIYLKQRLYGYKMSENMTMEE
;CC   NVNDFFKLISDLENVKVVVPDEDQAIVLLMSLPRQFDQLKETLKYCKTTLHLEEITSAIRSKILELGASG
;CC   KLLKNNSDGLFVQDRGRSETRGKGPNKNKSRSKSKGAGKTCWICGKEGHFKKQCYVWKERNKQGSTSERG
;CC   EASTVTAQVTDAAALVVSRALLGFAEVTPDTWILDTGCSFHMTCRKDWIIDFKETASGKVRMGNDTYSEV
;CC   KGIGDVRIKNEDGSTILLTDVRYIPEMSKNLISLGTLEDKGCWFESKKGILTIFKNDLTVLTGKKESTLY
;CC   FLQGTTLAGEANVIDKEKDETSLWHSRLGHIGAKGLQVLVSKGHLDKNMIKDLQFCEDCVYGKTRRVSFG
;CC   AAKHVTKDKLDYVHSDLWGSPNVPFSIGKCQYFITFIDDFTRRTWIYFIRTKDEAFSKFVEWKTQIENQQ
;CC   DKKLKILRTDNGLEFCNQEFDSFCRKEGVIRHRTCAYTPQQNGVAERMNRTIMNKVRCMLSESGLGKQFW
;CC   AEAASTAVFLINKSPSSSIEFDIPEEKWTGHPPDYKILKKFGSVAYIHSDQGKLNPRAKKGIFLGYPDGV
;CC   KGFKVWLLEDRKCVVSRDIVFQENQMYKELQKNDMSEEEKQLTEVERTLIELKNLSADDENQSEGGDKSN
;CC   QEQASTTRSASKDKQVEETDSDDDCLENYLLARDRIRRQIRAPQRFVEEDDSLVGFALTMTEDGEVYEPE
;CC   TYEEAMRSPECEKWKQATIEEMDSMKKNDTWDVIDKPEGKRVIGCKWIFKRKAGIPGVEPPRYKARLVAK
;CC   GFSQREGIDYQEIFSPVVKHVSIRYLLSIVVQFDMELEQLDVKTAFLHGNLDEYILMSQPEGYEDEDSTE
;CC   KVCLLKKSLYGLKQSPRQWNQRFDSFMINSGYQRSKYNPCVYTQQLNDGSYIYLLLYVDDMLIASQNKDQ
;CC   IQKLKESLNREFEMKDLGPARKILGMEITRNREQGTLDLSQSEYVAGVLRAFGMDQSKVSQTPLGAHFKL
;CC   RAANEKTLARDAEYMKSVPYPNAIGSIMYSMIGSRPDLAYHVGVVSRFMSKPSKEHWQAVKWVMRYMKGT
;CC   QDTCLRFKKDDKFEIRGYCDSDYATDLDRRRSITGFVFTAGGNTISWKSGLQRVVALSTTEAEYMALAEA
;CC   VKEAIWLRGLAAEMGFEQDAVEVMCDSHNAIALSKNSVHHERTKHIDVRYHFIREKIADGEIQVVKVSTT
;CC   WNPADIFTKTVPVSKLQEALKLLRVSSN
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 4210 BP; 1431 A; 701 C; 1068 G; 1010 T; 0 other;
ATCOPIA56_I
gtttggtatcagagcttccaggttactacctaggagcgagaagatctgctcaagatgtcttcgggcagag
cagaggtggagaagttcgacggagatggggattacatcctgtggaaagaaaagttactggctcatatgga
gatgttgggacttctggagggtctcggggaagaagaggaagcagaggttgaagattctaccactgagatt
agtgatggaggaaaccaagacccagaaactgcaacttctaaactggaagacaagatcctcaaagaaaaaa
gaggaaaagccagatctaccatcatcttgagcctgggaaacaatgttctgagaaaggtcatcaaacagaa
gacagcagcaggtatgataaaggtcctggatcagttatttatggcaaaatctcttccaaatcgcatttac
ttgaagcagaggctgtatggctacaagatgagtgagaatatgacgatggaggagaatgttaatgatttct
tcaagttaatatcggacttggaaaacgtaaaggttgtagtcccagatgaagatcaagccatagtcttgct
catgtctttaccaagacagtttgatcaactgaaggagacactgaagtactgcaagactacacttcatctc
gaagaaatcacaagtgccataaggtctaagatcttggagttgggagctagtggtaagcttctcaagaata
actcagatgggttgtttgttcaagacagaggcagatcagaaaccagggggaaaggaccgaacaagaacaa
gagcagatctaagtcaaagggagcaggaaaaacgtgttggatctgtggcaaggagggtcatttcaagaag
caatgctatgtatggaaggagaggaacaagcaaggttccacatctgaaagaggagaggcttctactgtaa
ctgctcaagtcactgatgcagctgcactagtagtttcaagagctttacttggctttgctgaagtcacccc
agatacatggattctagacacagggtgttccttccatatgacctgcagaaaggattggatcatagacttc
aaggagactgcaagcgggaaagtaaggatgggcaatgatacttattctgaagtgaaaggaattggggatg
tcagaatcaagaatgaggatggatctactatcttgctcactgatgtcaggtacataccagaaatgtcaaa
gaacctcatctcacttggaactcttgaagataaaggctgctggttcgaatcgaagaaaggtattttgact
atttttaagaatgatcttactgtactaactggaaagaaagagagtactttgtattttctccagggaacga
cacttgcaggtgaagccaatgtcatagacaaagaaaaggatgaaacaagtttatggcacagcaggcttgg
tcacattggtgcaaaagggctgcaggttttggtcagtaaaggtcatctggataagaacatgattaaagat
ttgcagttttgtgaagattgtgtgtatggaaaaacacgcagggttagctttggagctgcaaagcatgtca
caaaagataaactcgactatgtgcattctgatctatggggatcaccgaatgtaccattctccattggtaa
gtgtcagtatttcatcactttcattgatgattttacgaggagaacttggatctatttcattagaaccaaa
gatgaagctttcagcaagtttgtagaatggaaaacacagattgaaaaccaacaggacaagaagctcaaga
ttctcagaacagataatgggctggagttctgtaaccaggagtttgattcattctgcagaaaagaaggagt
tataaggcacaggacatgtgcttacacaccacagcagaatggtgttgctgaaaggatgaacaggaccatc
atgaacaaggtcagatgcatgttaagtgaatcagggttggggaaacagttctgggcagaagcagcgtcta
ctgccgtgttcctcatcaacaaaagcccaagctcttcaatagagtttgatattcctgaagagaagtggac
tggtcatccaccagattacaagatactcaagaagtttggatcagtcgcttatattcattcagatcaagga
aagctgaatcctagagcaaagaaggggatttttctcggatatccagatggtgtaaagggattcaaagtgt
ggctgctagaagacaggaaatgtgtagtctctcgagacattgtttttcaagaaaatcagatgtacaagga
actgcagaagaatgatatgtctgaggaagaaaaacagctcactgaagtagaaaggactctcatagagcta
aagaatttgtctgcagatgatgaaaatcagagtgaaggaggagataagtcaaaccaagaacaagcttcaa
caacaagatctgcaagtaaagacaaacaagtagaggaaactgattctgatgatgattgtctagagaacta
tctactggccagggatagaattcgaagacagatcagagctccacagagattcgttgaggaagatgacagc
cttgttgggtttgcattaacaatgacagaagatggagaagtttatgaaccagaaacctatgaagaagcca
tgagaagtccagaatgtgagaaatggaagcaagctaccatagaagaaatggactccatgaaaaagaatga
cacatgggatgtcattgataagcctgaaggaaagagagttataggctgtaagtggatattcaagagaaaa
gcaggaattcccggagtagaaccaccaagatacaaagctaggcttgtcgccaaaggattttcacaaagag
aaggcatagactatcaggagattttctcacctgtagtcaagcacgtgtcaatcaggtatcttttatccat
tgtggttcaatttgacatggaattagaacagcttgatgttaagactgcgtttttacatgggaatctggat
gagtatatattgatgagtcagcctgaaggatatgaagatgaggacagcacagaaaaagtctgtttgttaa
agaaatctctgtatgggctgaagcagtctccaagacagtggaatcagagatttgactcattcatgatcaa
ctcaggttatcaaagaagcaagtataatccatgtgtctacacacaacaacttaatgatggatcgtacatc
tatctactgttgtatgtagatgatatgctcattgcatcacaaaacaaggaccaaatccagaagttaaaag
agtcactcaacagagaatttgagatgaaggatttagggcctgcaagaaagatactgggaatggaaatcac
aagaaacagagaacaaggcactttggacctgtctcagagtgagtatgtggctggagtgttgagagctttt
gggatggatcaaagtaaggtctctcagacgccacttggtgcacacttcaagttaagagccgcaaatgaga
aaactcttgcaagagatgctgagtatatgaagtcggttccctaccctaatgcaattggaagtatcatgta
ctctatgataggatcaaggccagacttggcatatcatgtgggggttgtaagccggtttatgagtaaaccc
tcaaaagaacactggcaagctgttaagtgggtcatgaggtacatgaagggaacacaagatacctgtctaa
ggttcaagaaagatgacaaatttgaaatcagaggctactgcgattcagattatgcaactgatttagacag
gaggagatcgattacaggatttgtattcacagctggtgggaatacaataagctggaagtcgggtttacag
agagtggtggctctgtcaacaacagaagctgaatatatggcccttgcagaggcagttaaagaagccattt
ggctaagagggttagctgcagagatggggtttgaacaagatgcagtagaagttatgtgtgattcacacaa
tgccattgctttgtccaagaactcagtccaccatgagaggacaaagcatatagacgtgaggtatcacttc
ataagggagaagatagcagacggagagattcaggttgttaaggtttcaacaacatggaatcctgcagaca
tcttcacaaaaacagttccagtgagtaagcttcaagaagcgctgaagctactcagggtctcaagtaacta
gggagaccacagatccgagattggaagtcaagtaacactaagaagatgagttcagtgaactttataccaa
ggaggagttt1