;ID   ATCOPIA95_I DNA   ; ATH   ; 5139 BP
;XX
;DE   Internal portion of the ATCOPIA95 copia-like LTR-retrotransposon
;DE   - a consensus sequence.
;XX
;AC   .
;XX
;DT   27-DEC-2001 (Rel. 6.3, Created)
;DT   27-DEC-2001 (Rel. 6.3, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; ATCOPIA95 family; RIRE-1
;KW   group; internal region; pol; reverse transcriptase; ATCOPIA95LTR; 
;KW   ATCOPIA95_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Charophyta/Embryophyta group;
;OC   Embryophyta; Tracheophyta; euphyllophytes; Spermatophyta;
;OC   Magnoliophyta; eudicotyledons; Rosidae; Capparales; Brassicaceae;
;OC   Arabidopsis.
;XX
;RN   [1] (bases 1 to 5139)
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Internal portion of the ATCOPIA95 copia-like LTR-retrotransposon.
;RL   Repbase Reports 1:(4) p. 6 (2001)
;XX
;CC   ATCOPIA95_I is a consensus sequence of the internal region of the
;CC   ATCOPIA95 copia-like retroelement, which is flanked by ~8% divergent
;CC   LTRs (ATCOPIA95LTR) and 5-bp target site duplications (TSDs).
;CC   ATCOPIA95 is one of ~100 families of copia-like LTR-retrotransposons
;CC   that were active in the A. thaliana genome during the last 20 Myr.
;CC   The consensus sequence has been reconstructed based on five
;CC   copies present in the genome. 
;CC   These copies are ~95% identical to the consensus sequence.
;CC   ATCOPIA95_I encodes ATCOPIA95p, a 1368-aa copia-like polyprotein.
;CC   The ATCOPIA95p ORFs in all ATCOPIA95_I copies present in the genome
;CC   are disrupted by premature stop codons. The corresponding ORF in the
;CC   consensus sequence contains no internal stop codons.
;CC   ATCOPIA95p:
;CC   MTEINTNSKIVGNTTVVETPDEVALRLAKELEKTQITDSGMDNVRRNLFGNGSTNTTIPPSASQGMMPDL
;CC   FDGKSGFKTWQEKMRYYLVSINMERYLTEDPPIVPQGTTYVYTVGGMDTWAQGDYCCKGLILNRLVNDLF
;CC   DLYSKAKSSKTLWLTLENKYKTDESGMQRFSTAKFLNFKMVDSKPIMEQVEALQRISQEIELEGMSICNV
;CC   FKTNCLIEKLPPGWSDFKNYLNFKRKAMTFDDLVRRLMIEGNNRGAHAGAQNQGHDVNVAEHKAKLKGKG
;CC   KGFSIPQKNLKVSSTTNFKKSNPERKFKGKCHHCGKIGHKADVCKSKAKDVKSQANLTEEDMVAVVTECN
;CC   MVDDNQVEWYYDTGATTHICTDRTMFSTYVKNKSNEQLFMGNTAMSKIEGKGKVVLKLTSGRELTLQNVK
;CC   HVPDMRKNLISGTVLSNNGFAVNFESDKLVLKKHGVYLGKGFVKGGLVKMSVMTVFPKNVASASVVEMNE
;CC   NPIAYLVESFTTWHERLGHVNYKTMRKMQNMNLIPKFKTNQEKCEVCVQAKLTKTPSPRVERTTEPLGLI
;CC   HTDLCDLKYVQTRGGKKYFVTFIDDCTRYCYVYLLHSKDEALVKFKEFTLEVENQLQRTIKIVRSDRGGE
;CC   YNEPFNAFCREKGIIHQTTAPYSPESNGVAERKNRTLKEMMNALLQESGLAQNMWGEALLTTNYILNRIP
;CC   HKVTAKSPHELWKGTVPSYKYLKVWGCLAKVAVPPPKKVTIGPKTVDCIFIGYAHNSSAYRFLVYKSDIP
;CC   DIHENTVMESRNASFFEDIFPCRKTQKRTREQRDAATSEAEDNTLGTITVEETEQEPEEQPRRSKRARKE
;CC   KSFGDDFLMAFLAENVPRTYSEAMSTPDAPYWKEAVNTEIDSILQHHTYEIADLPQGSKPLGSKWIFTIK
;CC   RKTNGDIDRYKARLVVQGFRQKEGLDFFDTYSPVTRITSIRMLIGIAALRDLEIHQMDVKTAFLNGDLEE
;CC   EIYMKQPEGFVIPGQEHKVCKLVKSLYGLKQAPKQWHEKFDSVMMSNGFTINECDKCIYFKITPTGYILL
;CC   CLYVDDMLILGSNTDIINQTKNMLKRYFEMKDMGLADVILGIKIIRTDEGLTLSQTHYAEKILDRFKHYS
;CC   NGTAKTPVDPQLHLTKNSGEPVQQVEYARVIGSLMYLTNSTRPDLAHSVNVLSRYTSNPGHKHWKAITRV
;CC   LNYLRYTKDHGLHYGKEPAVLEGYSDANWIADSKNSKSTSGYIFTLGGAAVSWKSSKQTVAAKSTMESEF
;CC   IALDTTAAEAEWLRNFLEDIPMWGKPVPAIRVHCDSQSAIGMAQSTLYNGKSRHIRRRHKTIRQLISTGV
;CC   ITIDYIKSADNLADPFTKGLNRDQVARSSRGMGLKPTT
;CC   ATCOPIA95 is much closer to the rice RIRE1 copia-like retrotransposons
;CC   than to other Arabidopsis copia-like families.
;CC   Internal portions of retrotransposons from other copia families 
;CC   in the A. thaliana genome are less than 75% identical to ATCOPIA95_I.
;CC   At the same time, the B. napa genome harbors a young copia-like
;CC   retrotransposon (AL245480, positions 39393-35143) whose 4.2-kb
;CC   internal portion is 77% identical to ATCOPIA95_I. Presumably,
;CC   ATCOPIA95-like elements invaded B. napa and A. thaliana
;CC   independently (transmitted from other species).
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 5139 BP; 1674 A; 985 C; 1109 G; 1371 T; 0 other;
ATCOPIA95_I
atcttaattcaataaatccgcttttagcttatcgctctagacggattgagaatttttcgaattttttaaa
acgatttcaaaatcttctaaaaattgtttctgctatcatctttgggtgcttacgcggaaggtttatagaa
gaacgttattacttctttctttctcctctttctaaactgcttcgcaaacgctacgtagcgattgttgtgg
acagtttcttatataaacttctctagtttccacgaacgctcatacttgagtgagtttgtgaaacgcgaaa
caagtttctgcgaacgcttcttaagcgagtttgcaaaactaattcggtgtaatttagttacctgaatttt
ttggttaatatactaatcagatctgtttttctgtgaacagaatgactgagatcaacaccaacagcaagat
cgttggcaacaccacagttgttgagaccccggatgaggtagctttgcgccttgcgaaagagcttgaaaag
acacagatcaccgattctggtatggataatgtgcgccgaaatctgttcggtaatggttcaaccaacacta
ccataccaccctctgcttcccaagggatgatgccagacttgtttgatggcaaatccgggttcaaaacgtg
gcaagaaaagatgcgctactatttggtcagcataaacatggaaaggtacctcacagaggatccaccaata
gttccgcaaggtaccacatatgtgtatacggttggaggtatggatacctgggctcaaggtgactattgtt
gcaaaggtctgatcctgaaccgcttggtgaacgatctgtttgacctctacagcaaggccaagtcttccaa
aacactgtggctaactttagagaacaagtacaagactgatgagtctggaatgcaaagattctcaactgcg
aagtttctgaatttcaagatggtggactccaaaccaatcatggaacaggtggaggctcttcaacgtatct
ctcaagagatagagttggaagggatgtcgatctgcaacgtcttcaagacgaattgcttgatcgagaagct
accaccgggctggtcagatttcaagaattaccttaacttcaaacgtaaggcaatgacttttgatgatctc
gtccgaaggttgatgattgaaggcaacaatcgtggggctcacgcgggtgctcagaatcaagggcatgatg
ttaatgtagctgagcataaggccaagctgaaaggcaagggaaaagggttctctattcctcagaagaactt
gaaggtttcatcgacaacaaacttcaagaagagcaatcctgagcggaagttcaaaggaaagtgtcatcat
tgtggaaagattggacacaaggctgatgtttgcaagagcaaagccaaggatgtcaagagccaggcaaacc
taactgaagaggatatggttgcagtggtcactgaatgtaacatggtggacgacaaccaagtggagtggta
ctacgacactggtgcaaccacacacatctgcacagataggaccatgttctccacctatgtgaagaacaag
tcaaacgaacaactcttcatgggcaacacggcgatgtctaagattgaaggtaagggcaaagtggttctga
agctgacttcgggacgtgagcttactctgcaaaacgtgaagcatgtacctgacatgcggaagaatctcat
ctctggaacggtgctaagcaacaatggctttgccgttaactttgagtctgataagctagttcttaagaaa
catggggtgtatctgggaaagggttttgttaagggtggactggtcaaaatgtctgtaatgacagtctttc
cgaaaaatgtagcttctgcttctgttgttgaaatgaatgaaaatcctattgcttacttggttgagtcgtt
tactacttggcatgaacgtttaggacatgtcaattataaaacaatgcgaaaaatgcaaaacatgaattta
attccgaaattcaaaactaaccaagaaaaatgtgaagtatgcgtacaggcaaaactcactaaaactccct
caccacgagttgaaagaacaactgaacctctaggtttaattcacacagacttatgtgatttaaaatatgt
gcaaactagaggtggtaagaaatactttgttaccttcatagatgattgcacaagatattgttatgtatat
ctattgcatagcaaagatgaagctctggtaaaattcaaagaatttacactcgaagtagaaaatcaacttc
agagaactataaaaatagttcgaagtgatagaggaggagaatataatgagccattcaatgcattttgcag
agaaaaaggcattatacaccaaacaactgctccctattcaccagaatctaacggagttgcggaacgaaag
aatcgaactctgaaagaaatgatgaatgcactgttgcaggaatctgggttagcccagaacatgtgggggg
aagctttgcttaccactaactacatcctcaataggataccacataaggtgactgcaaagtcaccacatga
actttggaaaggtacagtaccctcgtacaaatacctaaaagtgtgggggtgtctagcaaaagtggctgta
ccacctcccaaaaaggtcacaattgggcctaagaccgtagactgtatcttcatcggatatgcgcacaaca
gcagtgcttatcgatttcttgtttataaatcggatattccagatatccatgaaaatacagttatggaatc
aagaaatgcatcgttttttgaagatatttttccttgtaggaaaactcaaaaacgaactcgtgaacaacga
gatgcagcaacctcggaagctgaggacaatactttgggtactattactgttgaagaaacagaacaggaac
ctgaagaacaacctaggcggagcaaaagagctcgcaaagagaaatctttcggtgatgatttcttgatggc
gtttttagctgaaaacgtaccaagaacttattcagaagccatgtctacccctgatgcaccttattggaag
gaagcagtcaatactgaaatagactctatcttgcaacatcatacttacgagatagcagatctaccccaag
gttctaaaccattggggagtaaatggattttcactattaaaaggaaaacgaatggtgatattgataggta
caaggctaggcttgtggtacaaggatttcgacaaaaagaaggtttagatttctttgatacctattctccg
gtaacgagaataacttcaatcaggatgctcataggcattgcagccttacgagatcttgaaatacatcaaa
tggatgtaaaaacagctttcttaaatggagatttggaagaggaaatctacatgaaacaacctgaagggtt
tgttatcccaggacaagaacacaaagtgtgtaaacttgtgaagtcactatacggactcaaacaggctcct
aagcagtggcatgagaagtttgacagtgttatgatgtctaatggttttaccattaatgaatgtgacaaat
gcatatatttcaaaattactccaacagggtacattttgttatgtttatatgtagacgatatgctcatctt
ggggagcaacacagatatcataaatcaaactaaaaacatgctcaaaagatattttgaaatgaaagacatg
ggtctagcagatgttattttgggtataaaaatcatcagaactgatgaaggccttaccttgtctcaaaccc
actatgctgagaagatacttgatcgtttcaagcattactctaatgggactgcaaaaactccagtagatcc
tcaacttcacttgaccaaaaactctggtgaacctgtgcaacaggtggaatatgcaagagtaattggcagt
ttaatgtacttgacaaacagtacaagaccggatttagcgcactctgtaaatgtacttagtcgctacacaa
gcaatccaggacataagcattggaaggctataactagagttttgaactacctacgttataccaaagatca
tggcttgcactatggtaaagaacccgcagttttggaaggttacagtgatgccaactggatcgcagattcc
aaaaactctaaatccacaagtggatacatctttacacttggaggtgcagcagtatcctggaaatctagca
aacaaactgtggcagccaaatcaactatggaatcggaattcatagcattggatacaactgcagcggaagc
cgaatggctccgtaatttcttggaagacattccaatgtgggggaaacctgtgcctgcaatacgtgtacac
tgtgatagccaatcagctataggcatggcacaaagtaccctatataatggtaaatctcgtcacatcagac
gacgacataaaaccattcgacaacttatctcaactggagtaatcacgatcgattacatcaagtcagctga
caacctagcggatccatttacaaaaggtttgaatcgagatcaagttgcaagatcatcaagaggaatgggt
ttaaagcctaccacctaagaggaagtttgatggtaaccccacctacaaagattggagatcccatgacata
ggttcaaagggacaactaaatcaaactaaatctgttataagcactgagagactgttgtctcttcctagtt
cctaaaatgatgtacagtgcttcctgtatgtgtaaaggttaagcatttgcttttaatgatttgaacctag
ctttgtaagatatgtgaagtacttacaggatacttttttaaatacaagcagctagtaaatcaccttgtga
gtgtgaaatggggccgtttctaggagaatatgtgaggctatactctccaagttcactcatgattaaccag
gcatgttcaaggccacaatgaacacaaacagagaacctcgttctacgagaaaatgaagttgtgttatacc
tgctgtctaagtttgcatcaaactccggatggttcaagacatcatgttcaccatctggccgagtaaactc
gacaggtactaacaatgagtaggttcaaagctgaaaaacgccaccaactcagatacagttagttttcgtt
tatactctcgaaggtctgtctagtctagctagttgggtcagtccttaggattagtaaagtcgtcatatgt
ttagacttatttctattcatgtgggggat