;ID   ATCOPIA78_I DNA   ; ATH   ; 4077 BP
;XX
;DE   Internal portion of the ATCOPIA78 copia-like LTR-retrotransposon
;DE   - a consensus sequence.
;XX
;AC   .
;XX
;DT   30-NOV-2001 (Rel. 6.3, Created)
;DT   30-NOV-2001 (Rel. 6.3, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; ATCOPIA78 family; internal 
;KW   region; pol; reverse transcriptase; ATCOPIA78LTR; ATCOPIA78_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Charophyta/Embryophyta group;
;OC   Embryophyta; Tracheophyta; euphyllophytes; Spermatophyta;
;OC   Magnoliophyta; eudicotyledons; Rosidae; Capparales; Brassicaceae;
;OC   Arabidopsis.
;XX
;RN   [1] (bases 1 to 4077)
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Internal portion of the ATCOPIA78 copia-like LTR-retrotransposon.
;RL   Repbase Reports 1:(3) p. 4 (2001)
;XX
;CC   ATCOPIA78_I is a consensus sequence of the internal region of the
;CC   ATCOPIA78 copia-like retroelement, which is flanked by ~1% divergent
;CC   LTRs (ATCOPIA78LTR) and 5-bp target-site duplications.
;CC   ATCOPIA78 is one of ~100 families of copia-like LTR-retrotransposons
;CC   that were active in the A. thaliana genome during the last 20 Myr.
;CC   The consensus sequence has been reconstructed based on eight
;CC   copies present in the genome. 
;CC   These copies are ~99.4% identical to the consensus sequence.
;CC   ATCOPIA78_I encodes ATCOPIA78p, a 1352-aa copia-like polyprotein.
;CC   ATCOPIA78p:
;CC   MASNNVPFQVPVLTKSNYDNWSLRMKAILGAHDVWEIVEKGFIEPENEGSLSQTQKDGLRDSRKRDKKAL
;CC   CLIYQGLDEDTFEKVVEATSAKEAWEKLRTSYKGADQVKKVRLQTLRGEFEALQMKEGELVSDYFSRVLT
;CC   VTNNLKRNGEKLDDVRIMEKVLRSLDLKFEHIVTVIEETKDLEAMTIEQLLGSLQAYEEKKKKKEDIVEQ
;CC   VLNMQITKEENGQSYQRRGGGQVRGRGRGGYGNGRGWRPHEDNTNQRGENSSRGRGKGHPKSRYDKSSVK
;CC   CYNCGKFGHYASECKAPSNKKFEEKANYVEEKIQEEDMLLMASYKKDEQEENHKWYLDSGASNHMCGRKS
;CC   MFAELDESVRGNVALGDESKMEVKGKGNILIRLKNGDHQFISNVYYIPSMKTNILSLGQLLEKGYDIRLK
;CC   DNNLSIRDQESNLITKVPMSKNRMFVLNIRNDIAQCLKMCYKEESWLWHLRFGHLNFGGLELLSRKEMVR
;CC   GLPCINHPNQVCEGCLLGKQFKMSFPKESSSRAQKPLELIHTDVCGPIKPKSLGKSNYFLLFIDDFSRKT
;CC   WVYFLKEKSEVFEIFKKFKAHVEKESGLVIKTMRSDRGGEFTSKEFLKYCEDNGIRRQLTVPRSPQQNGV
;CC   AERKNRTILEMARSMLKSKRLPKELWAEAVACAVYLLNRSPTKSVSGKTPQEAWSGRKPGVSHLRVFGSI
;CC   AHAHVPDEKRSKLDDKSEKYIFIGYDNNSKGYKLYNPDTKKTIISRNIVFDEEGEWDWNSNEEDYNFFPH
;CC   FEEDEPEPTREEPPSEEPTTPPTSPTSSQIEESSSERTPRFRSIQELYEVTENQENLTLFCLFAECEPMD
;CC   FQEAIEKKTWRNAMDEEIKSIQKNDTWELTSLPNGHKAIGVKWVYKAKKNSKGEVERYKARLVAKGYSQR
;CC   AGIDYDEVFAPVARLETVRLIISLAAQNKWKIHQMDVKSAFLNGDLEEEVYIEQPQGYIVKGEEDKVLRL
;CC   KKALYGLKQAPRAWNTRIDKYFKEKDFIKCPYEHALYIKIQKEDILIACLYVDDLIFTGNNPSMFEEFKK
;CC   EMTKEFEMTDIGLMSYYLGIEVKQEDNGIFITQEGYAKEVLKKFKMDDSNPVCTPMECGIKLSKKEEGEG
;CC   VDPTTFKSLVGSLRYLTCTRPDILYAVGVVSRYMEHPTTTHFKAAKRILRYIKGTVNFGLHYSTTSDYKL
;CC   VGYSDSDWGGDVDDRKSTSGFVFYIGDTAFTWMSKKQPIVTLSTCEAEYVAATSCVCHAIWLRNLLKELS
;CC   LPQEEPTKIFVDNKSAIALAKNPVFHDRSKHIDTRYHYIRECVSKKDVQLEYVKTHDQVADIFTKPLKRE
;CC   DFIKMRSLLGVAKSSLRGGVES
;CC
;CC   Some copies of ATCOPIA78 may still be active retrotransposons.
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 4077 BP; 1464 A; 671 C; 965 G; 977 T; 0 other;
ATCOPIA78_I
aaagtggtatcagagcttgaagatcctaaagatggcgagtaacaatgttcccttccaagtcccggtgctc
acaaagagcaactatgataattggagtctacgaatgaaggctatcctaggagcacatgacgtgtgggaga
tagtcgagaaaggtttcattgaaccggagaatgaaggtagtctttctcaaactcaaaaagatggtttgag
agactcaagaaagagagacaagaaagctctctgtctaatctatcaaggattagatgaagatacgttcgag
aaggtcgttgaagctacgtcggcgaaagaagcatgggagaagcttcgaacctcttacaaaggtgccgatc
aagtcaagaaagtacgtcttcaaactctaagaggagaatttgaagcactacaaatgaaggaaggtgaact
cgtctccgattacttctcaagagtgttgacggttactaataaccttaaaagaaacggagagaagctagat
gatgtgagaatcatggagaaagttcttagatcattggatctaaaatttgagcatattgtcaccgtcattg
aagaaacaaaagatttagaagctatgacaatagagcaacttcttgggtcattacaagcttatgaagaaaa
gaagaagaagaaagaagatatcgtcgaacaagtcctcaatatgcaaattacaaaagaagaaaacggccaa
agttaccaaagaagaggtggtggtcaagtacgaggacgaggtcgtggtggatatggaaatggacgtggtt
ggaggccacatgaagacaacacaaaccaaagaggtgaaaactcatcaagaggtcgtgggaaaggacaccc
aaaatcaagatacgataaatcaagtgtcaaatgctacaattgtgggaagtttggacattatgcttctgaa
tgtaaagctcctagcaacaaaaaatttgaggagaaggccaactacgttgaagaaaaaattcaagaagaag
acatgttattaatggctagctacaagaaagatgaacaagaagagaatcataagtggtacctcgatagtgg
tgcaagtaatcacatgtgcgggagaaaaagtatgttcgcggagcttgatgaatcggtgagaggaaatgtg
gctttaggagatgaatcgaagatggaggtaaaaggtaaaggaaacattctcattcgattgaagaatggag
atcatcaatttatttccaacgtttactatattccgagcatgaagacaaacatcttgagccttggacaact
cttagagaaaggttatgatattagattaaaagataataacctttcaataagagaccaagaaagcaatctc
attaccaaggtgccaatgtcgaaaaatagaatgtttgtcctcaacattcgaaatgacattgcacaatgtc
ttaagatgtgttacaaagaggagtcttggctatggcatcttcgattcggacatctaaattttggaggatt
ggagttgctttcaaggaaggaaatggtgagagggctaccttgtataaatcatccaaatcaagtgtgtgaa
ggatgtctacttggaaagcaattcaaaatgagctttccaaaggagtcaagttcaagagcacaaaaaccgt
tggagctaatacacaccgatgtgtgtggtccgatcaagccgaaatcacttggtaaaagtaattacttcct
tctctttattgatgatttttcaagaaaaacatgggtatattttttgaaagaaaaatccgaggtgttcgaa
attttcaaaaagtttaaagcccatgttgagaaggagagtggtcttgtgatcaaaaccatgagatccgacc
gtggaggagaatttacatccaaggagtttcttaagtattgtgaagacaacggcattcgaagacaattaac
ggtgccaagatcccctcaacaaaatggtgtagcggaaagaaagaatagaacaattcttgagatggcaagg
agcatgctcaaaagtaagagactaccaaaagagttgtgggcggaagcggtcgcgtgtgcggtttatctat
taaatcgatctccaacaaaaagtgtctccggaaaaacaccacaagaagcttggagcggaagaaagcccgg
tgtttctcatttaagagtctttggaagtattgctcatgctcatgtaccggatgagaagcggagcaaacta
gatgacaaaagtgagaagtatatcttcattggttatgataacaactccaaaggctacaagctctataatc
ccgatacgaagaagacaattattagtcgaaatatagtgttcgatgaagaaggagaatgggattggaactc
aaatgaagaagattataacttctttccacattttgaagaagatgagccggagccaacaagagaggagcca
ccaagtgaagagcctactacaccaccaacttcaccaacaagttctcaaatagaagaaagttcgagtgaaa
ggactccgcgttttagaagtatacaagagctctatgaggtaaccgaaaatcaagaaaaccttaccttatt
ttgtttatttgcggagtgcgaacccatggatttccaagaagccattgaaaagaagacttggagaaatgcc
atggatgaagagatcaaatcaatacaaaagaatgacacatgggagttaacttcacttccaaatggacaca
aggcaattggcgtgaagtgggtgtataaagcaaagaaaaactctaaaggagaagtggaaagatacaaagc
aagattggttgcaaaaggttatagtcaaagagccggaattgactatgacgaggtatttgctcccgttgct
cgtctagaaacggttagactaatcatctcactagcggctcaaaacaagtggaagatacatcaaatggatg
tcaagtcggccttcttaaatggagatcttgaagaagaagtttacattgagcaaccacaaggctacatagt
caaaggtgaagaagacaaagtcttgaggctaaaaaaggcgctttatggattaaaacaagccccaagagct
tggaatactcgaattgacaagtatttcaaggagaaagatttcatcaagtgtccatatgagcatgcactct
atatcaaaattcaaaaagaagatatattgatcgcatgcttatatgtagatgacttgatattcacgggtaa
caatccaagcatgttcgaagaattcaagaaagagatgacgaaggagttcgagatgacggacattggattg
atgtcttactatctcggaattgaagtaaaacaagaagacaatggaatattcataactcaagaaggctatg
ctaaggaggtacttaagaagttcaagatggatgactcaaatcccgtttgtacaccaatggaatgcggaat
caaactatcaaagaaagaagaaggggaaggagtggatccaacaacctttaagagcttggttggaagcttg
agatacttaacatgcacaaggcccgatattttatatgcggtcggagttgttagtcgttacatggagcatc
caacaacaactcatttcaaagcggcaaaaaggattcttcgctatatcaaaggtaccgtaaactttggctt
acattattcaactactagtgattacaagcttgttggatatagcgatagcgattggggtggagacgtagat
gaccgaaagagtacaagtggttttgtgttttacattggagacacggctttcacatggatgtcgaagaaac
aaccaattgtcactctatccacttgtgaagcggagtatgtagcggctacgtcatgtgtatgccatgctat
ttggttaagaaacctcttgaaggagttaagcttaccacaagaggaaccaacgaagatctttgtggacaac
aagtcggcaatagctttggcgaagaacccggtcttccatgatcgaagtaaacacattgacacacgctatc
actacattagagagtgtgttagcaagaaggacgtgcaattggagtatgtgaagacacatgatcaagtagc
cgatatttttaccaagcctctcaagcgtgaagactttatcaagatgaggagtttgcttggagtagcaaaa
tcaagtttaagaggggg