;ID   ATCOPIA66_I DNA   ; ATH   ; 4231 BP
;XX
;DE   Internal region of ATCOPIA66 copia-like LTR-retrotransposon - 
;DE   a consensus sequence.
;XX
;AC   .
;XX
;DT   05-NOV-2001 (Rel. 6.2, Created)
;DT   05-NOV-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region; 
;KW   copia-like polyprotein; ATCOPIA66LTR; ATCOPIA66_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] (bases 1 to 4231)
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Internal region of ATCOPIA66 copia-like LTR-retrotransposon - 
;RT   a consensus sequence.
;RL   Repbase Reports 1:(2) p. 20 (2001)
;XX
;CC   ATCOPIA66_I is an internal region of the ATCOPIA66 copia-like 
;CC   endogenous retrovirus flanked by the 1% divergent ATCOPIA66LTR 
;CC   long terminal repeats, and by a 5-bp target-site duplication.
;CC   Two copies of ATCOPIA66 are present in the genome; they are
;CC   99% and 98% identical to the consensus sequence.
;CC   ATCOPIA66's closest relative is 93% identical to the consensus
;CC   sequence (ATCOPIA66A: AC007188, positions 36655-41067). 
;CC   ATCOPIA66_I encodes the 1372-aa copia-like polyprotein ATCOPIA66p.
;CC   ATCOPIA66p:
;CC   MAMTSEANLNSGLKASFQVFNENSDFSLWKTRMKAHLGLAGLKGVIDDFTLTKVVPLTKSEGKKVEEGDD
;CC   DGSESSQTKEVPDLVKMEKSEQAMNVIIAHVGDVVLRKIDHCKSAAEMWETLNKLYMETSLPNRIYVQLK
;CC   FYSFKMNDTMSINENVNEFLKIVAELSSLEIVVGEEVRAILFLNGLSSRYSQLKHTLKYGNKALSLQDVI
;CC   SSAKSLERELNESLDLERSSSTVLYTTERGRPLVRNNQNNQNNQKGGQGRGRSRSNSKTRVTCWFCKKEG
;CC   HVKKDCFARKKKMETEGPGEAGVIIEKLVFSEALSVNDQMVKDLWVLDSGCTSHMTSRRDWFCDFQENGS
;CC   TTILLGDDHSVESQGQGSIRVNTHGGSIKILNNVKYVPNLRRNLISTGTLDKLGYQHEGGAGKVRYFKNQ
;CC   VTALCGSLVNGLYILDGETVMTESCAAVDSQSKTALWHSRLGHMSLNNLKVLAGKGLLNGKEIKDLDFCE
;CC   HCVMGMSKRLSFNVGKHDVVEALSYVHADLWGSSNLSPSLSGKQYFLSIIDDKTRKVWLYFLRTKDETFD
;CC   KFCEWKELVENQVDRKVKCLRTNNGLEFCNTKFDRYCKTHGIERHRTCVYTPQQNGVAERMNRTIMEKVR
;CC   CLLNESGLDESFWAEAAATAAYIINRSPASAIDHNVPEELWLNRKPGYKHLKRFGSIAYVHHDQGKLKPR
;CC   ALKGVFLGYPAGTKGYKIWLLDERKCVISRNVIFREDMVYKDLNKDVNDAVAEDAEASTSNSDVISELVK
;CC   KRVSSKQGGVTTELVEVSESESEEDSEEPAETAVTQSPEPSGLTNYQLARDRTRRQIVAPVKMKDYSQFA
;CC   FALMTYEILNVEEEPQCLHDAQKDENWELWNGAIGEEMDSLTKNGTWELVDRPKDRKVISCRWLFKIKVG
;CC   IPGVESKRYKARLVARGFSQKKGIDYQEIFALVVKHTSIRGLMFVVVNLDLELEHMDVKTAFLHGELEEE
;CC   LYMEQPEGVVSVGNEDKVFLLKKSLYGLKQAPRQWNKRFNKFMTDEKFQRSDHDQCVYVKTMNNGELVYL
;CC   LLYVDDMLIAAKNMSEVNKVKKRLSSEFEMKDMGPANKILGIEITRDRVNGVLCLSQAGYLKKVLKRFNM
;CC   SNCKSALTPIGTHFKLASVQDDSECIDTVKTPYSSAVGSVMYAMISTRPDLAYAIGLVSRFMSKPGSVHW
;CC   EAVKWLLRYIKGSHDLSLVYTKGKDLSVIGYCDSDHGGDLDRKRFTSGYVFTVGGNTISWKSCLQSVVAL
;CC   SSTEAEFIALTEAVKEAIWVKGLLEDLGFQQDKAQVWSDSQSAICLSRNSVFHERTKHMARKRSFLSDII
;CC   EEGNIEVVKIHTSINPADMLTKCIPVKSFDSALDTLKLIEWK
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 4231 BP; 1396 A; 639 C; 1049 G; 1147 T; 0 other;
ATCOPIA66_I
gattggtatcagagcctaaggttctttaatggcgatgacgaacgaagcaagcttgaattcagggttaaag
gcttcgtttcaagtctttaacgagaactcagacttctcgctatggaagacaaggatgaaagcacatcttg
gcttagctgggcttaaaggagtgatcgatgatttcacattgacgaaatttgttccgctaactaaaagtga
aggaaagaaggttgaggaaggtgatgatgatggatcagagtcatcacagactaaggaagtacctaatctt
gtaaagatggagaaatcagaacaagcaatgaatgtgattatcgctcacgttggtgatgttgttctgagga
aaattgatcactgcaagtctgctgcagagatgtgggaaaccttgaacaagctgtatatggaaacatcgtt
accaaatcgaatctatgtgcagcttaagttttactcattcaagatgaatgacacaatgtcaattaatgag
aatgtgaatgaattcttgaagataattgccgagttaagcagtcttgagattgtggtaggtgaagaagttc
gtgcaatattgttcttgaatggattgtcatcaagatactctcagttaaagcataccctgaagtatgggaa
caaagctttgtctctgcaggatgtaatctcctctgttaagtcgttggagagagagctaaatgaatctcta
gatcttgaaagaagctcctcaacggtcttgtatactactgaaagaggtagaccactagtgagaaacaatg
agaacaatcaaaacaatcagaagggtggtcaaggcagaggtagaagcaggtcaaactctaaaacaagagt
aacctgctggttttgcaagaaagaaggtcatgttaagaaagactgttttgctagaaagaagaaaatggaa
acagaaggtcctggtgaagctggtgtcatcatagagaaacttgttttctctgaagcactaagtgtgaatg
atcagatggttaaagatctatgggttttggactcggggtgtacttcacacatgacttatagaagagactg
gttctgtgattttcaggaaaatggatctacaaccatactactcggagatgatcactcagttgaatcgcaa
ggccaaggttccataagagttaacacacatggaggatctataaagattctaaacaatgtcaagtatgtgc
caaacctgagaaggaatctcatctccacaggaacacttgacaagttagggtatcaacatgaaggtggagc
aggaaaagtgagatactttaagaatcaggttactgctttgtgtggaagcttagtcaacggtctgtatatt
cttgatggtgagacggtgatgactgaaagttgtgcagctgtggactcacagagtaaaacagcattgtggc
atagcaggttagaccacatgagtttaaataacctgaaagttcttgctggtaagggtctattgaacggtaa
agaaatcaaagattttggatttctgcgaacattgtgtaatgggaatgtccaagagactgagtttcaatgt
gggaaaacatgatgttgtggaagcactgagctatgttcatgcagacctctggggatcatcaaatttatca
ccctccttatcaggtaaacaatactttctctctatcatagatgataagactagaaaagtatggttatatt
tccttaggactaaggatgaaacttttgataagttttgtgaatggaaagaacttgtagagaatcgggtgga
tagaaaggttaagtgtttgagaacaaataatgggttgtaattttgcaatactaagtttgacaggtactgc
aagacccatggtattgaaagacataggacgtgtgtgtacacaccacagcaaaatggtgttgcagagagaa
tgaacaggacgatcatggagaaagtgagatgtttactgaatgagtcaggtctggatgagagtttctaggc
tgaagcagctgcaactgctgcatacatcattaacagatttcctgcctcagccattgatcataatgttcct
gaagaactatggctaaacagaaagccgggatacaaacacttgaagagatttggatcaatcgcatatgttc
atcatgatcaaggaaagctgaagccaagagcattgaaaggagtgttcttaggttaccctgctggcacaaa
aggttacaagatatggctgcttgatgaaagaaaatgcgtgataagtagaaatgtgatatttcgagaagat
atggtctacaaagacctgaacaaagatgtgaatgatgcagtagcagaagacgctgaagcatcaacatcaa
attctgatgttatttcagaattggtcaagaagcgagtcagttctaagcaaggtggagtaattactgagct
ggtagaagtctgtgaaagtgaatctgaagaagattctgaagaacctgcagaaactgcagttactcagtca
cctgaaccaagtgggttgacaaattatcaacttgcaagagacagaactcgaagacagatcgtagctccgg
ttaagatgaaagactattctcaatttgcatttgcgttaatgacatatgagatactgaatgtggaagaaga
accacaatgtcttcatgatgctcaaaaggatgaaaactgggagctatggaatggagctatcggtgaagaa
atggattctttaacaaagaatggtacttgggaacttgttgacagaccaaaggacagaaaggttatcagtt
gcaggtggttattcaaaatcaaagttggtataccgggtgttgaatccaagagatacaaggcaagacttgt
tgcaagaggattctcacagaaaaaaggaattgactaccaggaaatatttgcccttgtggttaaacacacg
tctatcagaggcttaatgtttgtggtggttaatcttgacttagagcttgaacatatggatgttaaaacag
ccttcttacatggtgaattggaagaagagttgtatatggagcagccagagggtgtggtgtcagttggaaa
cgaagacaaagtttttcttctcaagaagtccttgtacggtttgaagcaggcgccaaggcaatggaacaaa
aggtttaacaagtttatgacagatgagaaatttcagagaagtgatcatgatcagtgtgtatatgtgaaga
caatgaacaatggagaacttgtctatcttctgttatacgttgatgacatgttgattgcagctaagaatat
gtcagaggttaacaaggttaagaagagactcagtagtgagtttgagatgaaagacatgggacctgcaaat
aagattcttggtatttaaatcacaagagatagagtgaatggagttttgtgtttatctcaagcaggatact
tgaaaaaggtgctaaagagattcaatatgagtaattgcaagtcagctctaactcctattggtacacattt
taaacttgcatctgtgcaggacgattcagagtgcatagatacagttaaaactccatactcaagtgctgtt
ggaagtgtaatgttcgcgatgatcagcactagaccagacttagcctatgcaataggactagttagtcgtt
tcatgagtaaaccaagatcagttcattgggaagcagtcaagtggctgctaaggtacataaaaggatcaca
ggatctgagtttggtgtatactaaagggaaagatctcagtgtcattggttactgtgattcagatcatggt
ggagacttggataggaaaaggtctactagtggatacgttttcacagtaggaggtaatacaatcagttgga
agtcttgtctacaatccgtagtggctttgtcatcaactgaagcagagtttatagctttgactgaagcagt
aaaagaagctatatgggttaaagggttgcttgaagatctcggttttcagcaggataaagcacaggtctgg
agtgattctcagtcagcgatctgtttatcgaggaatagtgtgttccatgagcgaaccaaacacatggcac
gcaagaggtcatttctgagtgagattattgaagaaggaaacattgaagttgtgaagattcacacttctat
taatcctgctgatatgttgaccaagtacattcccgtgaaaagttttgattcagctttagatactctgaag
ctgatcgaatggaagtaagctcttaagcttgcattgctggaggttaagtccaagcatggcagtctactgt
gttgaagaagatttgtatcaagatggagaat1