;ID   ATCOPIA49_I DNA   ; ATH   ; 4216 BP
;XX
;DE   Internal region of ATCOPIA49 copia-like LTR-retrotransposon - a 
;DE   consensus sequence.
;XX
;AC   .
;XX
;DT   01-OCT-2001 (Rel. 6.2, Created)
;DT   01-OCT-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region; 
;KW   copia-like polyprotein; reverse transcriptase; ATCOPIA49LTR; 
;KW   ATCOPIA49_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] (bases 1 to 4216)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Repbase Reports 1:(1) p. 8 (2001)
;XX
;CC   ATCOPIA49_I is a consensus sequence of an internal portion of the 
;CC   ATCOPIA49 copia-like endogenous retrovirus. Three copies of
;CC   ATCOPIA49_I are present in the genome; they are flanked by ~2% 
;CC   divergent ATCOPIA49LTR and by 5-bp target-site duplications.
;CC   These copies are 1% divergent from the consensus sequence.
;CC   ATCOPIA49_I encodes the 1361-aa ATCOPIA49p copia-like polyprotein.
;CC   ATCOPIA49p:
;CC   MAGTPIAIFDGSGDFSLWKTRIMAHLSVIGLKDVVIGTSSPPLTAEEEEDPEKKKKRDADDAARLERCDK
;CC   AKNVIFLNVADKVLRKIELCQTAAEAWGTLDRLFMIRSLPHRVFTQLSFYTFKMQENKKIDENIDDFLKI
;CC   VADLNHLQIEVTDEVQAILLLSSLPSRYDGLVETMKYSNSREKLRLDDVMVAARDKERELSQSNRSVSEG
;CC   NFARGRQEGSSNNNQRNKGKGRSRSKSRDGKRVCWICGKEGHFKKQCFKWLERNKDKGSGSSSDKGEASI
;CC   AKAEYDPAMVLMAEEENLFVSGNTADEWVLDTGCSFHMTPRRDWFSDFREVKSGYVKMGNDSLSQVKGIG
;CC   NIRIKNSDGTQITLTEVRYMPTMSRNLISLGTLEDKGCWFKSQDGILKVVKGCSTVLKGQKRETLYILLG
;CC   EAEIAESNVSEKSKDETVLWHSRLGHMSQKGMEILVKKGCLNRKVIHELKFCEDCIYGKNHRVSFPSAQH
;CC   VTKEKLAYIHSDLWGSPHNPASLGNCQYFISFIDDYSRKVWIYFLKKKDEAFEKFVEWKKMVENQSDKKV
;CC   KKLRTDNGLEYCNHYFEKFCKEEGIVRHKTVAYTPQQNGVAERLNRTIMDKVRSMLSESGMEKRFWAEAA
;CC   ATAVYLINRSPSTATNFELPEERWTGALPDMSSLRRFGCLAYVHADQGKLNPRAKKGIFTSYPEGVKGYK
;CC   VWLLEEKKCVISRNVIFREEMMYKDLKTDSQNSFYEEVMENIGEGSNQLISNITDQSITEQESVEQGGVT
;CC   EEQIVNEQNQVQTETHEEEGSSSDNSTEEVDLSNYLLVRDREKRTVKLNRRYNESNMVGFAYNTEDGGKS
;CC   EPKTYQEALSDQDWELWNGAMKEEISSMGKNHTWDLVDKPVNAKIIGCRWVFTRKAGIPGVEAPRFKARL
;CC   VAKGFTQKEGVDYNEIFSPVVKHVSIRFMLSMVAQFDMELHQMDVKTAFLHGFLDEEILMAQPEGFEDKK
;CC   YPEKVCLLKRSLYGLKQSPRQWNLRFDEFMKSIDFTRSAYDSCVYLKQQSDKSYVYLLLYVDDMLIAAKE
;CC   KSSIMELKQLLGKEFEMKDLGEAQKILGMEIARDRAAGVLTLSQEGYVKKVLRSSQMDQAKPVSTPLGIH
;CC   FKLRAATEKEYQEQFDRMKIVPYSNTVGSIMYSMIGTRPDLAYPVGVISRYMSRPLKDHWQAAKWVLRYM
;CC   KGTEKKKLCFRKNKDFLLRGYCDSDYGGDYDNRRSITGYVFTIGGNTISWKSRQQKVVAISTTEAEYMAL
;CC   TDAVKEALWLRGFSEELGFAQESVEVNCDSESVIALAKNSVHHERTKHIDIRLHFIRDIINAGLVKVVKI
;CC   ASECNPADIFTKVLPVEKFEGALHMLRVTEN
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 4216 BP; 1432 A; 607 C; 1060 G; 1117 T; 0 other;
ATCOPIA49_I
aattggtatcagagctcacggttgttgagcgagtgcgaatctaagatggcaggaactccgatcgcgatct
ttgatggatcaggagatttttctttgtggaaaacaaggattatggcacatctaagcgtcattggacttaa
ggatgttgtcatcggaacatcgtctccgccgctcactgcagaagaagaagaagatccggagaaaaagaag
aaacgagatgcagatgatgcagcaaggcttgagcgatgtgataaagcgaagaatgtgatcttccttaatg
ttgcagataaggtcttaagaaagatcgagctatgtcaaactgcagcagaagcttggggaactctagatcg
attgttcatgattcgatctctacctcatagagtctttactcagctaagtttctatacttttaagatgcaa
gaaaacaagaaaatcgatgagaatattgatgatttcttaaagattgtggctgatttgaatcatttgcaga
ttgaggtgactgatgaggttcaagcaattctgttgctaagttcgttgccttcaagatatgatggtttagt
tgaaaccatgaagtatagtaatagccgagaaaagttgaggttagatgatgtgatggttgcagctcgagac
aaagaaagagagttgtcacagagtaatcgatctgtatcagaaggaaattttgctagaggaagacaagaag
gatcttccaacaataaccaaagaaataagggtaaagggagatcaagatctaagtctcgagatggtaagcg
agtatgctggatatgtggtaaagagggacactttaagaaacagtgtttcaaatggcttgaaagaaacaaa
gataaaggatcaggatcaagttcagacaaaggagaagcgagtatagctaaagctgaatatgatccagcaa
tggttcttatggcagaagaagagaatctatttgtttcagggaatactgcagacgaatgggttctagatac
aggctgttcttttcacatgacaccaaggagagattggttttcagattttagagaagtgaaatctggttat
gttaaaatgggaaatgattctttgtctcaggtaaaaggaattggaaacataagaatcaagaactctgatg
ggacacagattactttaacagaagtgagatacatgccaacaatgtctaggaacctaatttctttaggaac
cttagaagacaaaggatgctggttcaagtctcaagatggtattctcaaggtggttaaaggatgctctact
gttcttaaaggacagaaaagagaaacgttgtatatacttcttggagaagctgagattgctgaatcaaatg
tctccgagaaatccaaagatgaaactgtgttatggcacagcagacttggtcatatgagtcagaagggaat
ggagatattagtgaagaagggttgtttaaacaggaaagtgattcatgagttgaagttttgtgaagactgc
atttatggaaaaaatcatagagtcagtttcccatctgctcagcatgttacaaaggagaagctagcttata
tccattcagatttatggggatctcctcataatccagcatctttgggaaactgtcaatacttcatctcctt
cattgatgactattccaggaaagtgtggatatattttctgaagaagaaagatgaagcttttgagaaattt
gttgaatggaagaaaatggtggaaaatcagtctgataagaaggttaagaaactcagaactgataatggct
tagagtactgtaatcattactttgagaagttttgtaaagaggaaggcattgtaaggcacaagactgttgc
ttacacacctcaacagaacggtgtcgctgagagactcaatcgaactataatggataaagttcgcagcatg
ttaagtgaaagtggtatggaaaagagattttgggcagaagctgctgctacagcagtgtatttgataaatc
gatctccctctactgcaacgaactttgagctgccagaagaaagatggacaggagcgttaccggatatgag
ttcattgagaaggtttggttgtttggcttatgtacacgcagatcaagggaagttgaatcctagagcaaag
aagggaatattcacgagctatcctgagggtgttaaaggatataaagtttggctacttgaagaaaagaagt
gtgtcattagtagaaacgtgatattcagagaggaaatgatgtacaaagaccttaaaactgattctcagaa
cagtttctatgaagaagtcatggagaacataggagaaggttctaatcagctgatatctaatatcacagat
caaagtattacagaacaagaaagcgttgagcaaggtggagttactgaagaacagattgttaatgaacaga
atcaagtgcagacagaaactcatgaagaagaaggaagcagtagtgataattctactgaagaagtggatct
cagtaactacctactagtaagagacagggagaaaaggactgttaagttgaacagaagatataatgagtct
aatatggttgggtttgcctacaatacagaagatgggggtaagtctgagcctaaaacataccaagaggcat
taagtgatcaagattgggagttatggaatggagctatgaaagaagagatatcatccatgggaaaaaacca
cacgtgggacttagtagataaaccagtaaatgcgaaaatcattggttgcagatgggtcttcacaaggaaa
gctggtattcccggagtggaagcacctagatttaaagctcggctggtagccaaaggttttacacagaaag
aaggtgtagattataatgaaatcttctcaccagttgttaaacatgtgtctatcaggtttatgctctcaat
ggtggctcagtttgacatggagctacatcagatggacgttaaaacagcgttcctacatggatttcttgat
gaagagatattgatggctcaacctgaaggttttgaagataagaaataccctgaaaaggtatgtttactga
aaagatccctatatggtctaaagcaatctcctagacagtggaatctgagatttgatgagtttatgaagag
tattgatttcactaggagcgcttatgacagctgtgtatacttgaagcagcaaagtgataagtcatatgtg
tatctactcctatacgtagatgacatgttgatagctgctaaagaaaagtcaagtatcatggagttaaaac
agttactgggtaaagagtttgagatgaaggatttaggtgaagctcagaaaattttaggcatggagattgc
tagagatagagctgcaggtgtgttgactctgtcacaggaaggttatgtgaagaaagttttgagatctagt
cagatggatcaggctaaacctgtgtcaacaccattaggaatccatttcaagctccgagcagctactgaaa
aagagtatcaggaacaatttgacagaatgaagattgtaccttactcaaacactgttggaagcatcatgta
ttcaatgataggcacgaggccagatcttgcttatccagttggagtgatcagtcgctacatgagtagacca
ttaaaggatcattggcaagcggctaagtgggtcttgaggtacatgaaggggacagagaaaaagaagcttt
gcttcagaaagaataaagacttcttattaagaggttactgtgactctgactatggtggtgattatgataa
ccggaggtcaatcacaggttatgttttcactattggtggaaacaccattagctggaaatcacggcaacag
aaggtggtagctatatctactacagaggcagaatatatggcattaactgatgctgttaaagaggctttgt
ggctgagaggtttttctgaagaacttggttttgcacaggaaagtgtagaggtgaattgtgattcagagag
tgttattgctttggcaaagaactctgtacatcacgaaagaacaaagcatattgatatcaggttacatttc
attcgggatatcataaatgcaggtctggttaaagtggtgaagattgcaagtgaatgcaatcctgcagata
tttttacaaaggtattgcccgtggagaagtttgaaggagcattgcatatgctccgagttactgagaactg
agaggtgaatctcaggtaccggaggggagatccaagaactagaacaagttgagacaaggaagacaatcag
agtcaaggtggagaat