;ID   ATCOPIA52_I DNA   ; ATH   ; 4214 BP
;XX
;DE   Internal region of ATCOPIA52 copia-like LTR-retrotransposon.
;XX
;AC   AL132979
;XX
;DT   02-SEP-2001 (Rel. 6.2, Created)
;DT   02-SEP-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region;
;KW   copia-like polyprotein; ATCOPIA52LTR; ATCOPIA52_I.
;XX
;OS   Arabidopsis thaliana
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1]
;RA   Bloecker,H., Mewes,H.W., Lemcke,K., Mayer,K.F.X., Quetier,F.
;RA   and Salanoubat,M.
;RL   Direct submission to GenBank (January 2000)
;XX
;RN   [2]
;RA   Terol,J., Castillo,M.C., Bargues,M., Perez-Alonso,M.
;RA   and de Frutos,R.
;RT   Structural and evolutionary analysis of the copia-like elements in
;RT   the Arabidopsis thaliana genome
;RL   Mol. Biol. Evol. 18, 882-892 (2001)
;XX
;CC   ATCOPIA52 was found by [1]; minor modifications of the sequence
;CC   coordinates were made by [2].
;CC   ATCOPIA52_I is the internal region of the ATCOPIA52 copia-like
;CC   LTR-retrotransposon, flanked by identical ATCOPIA52LTR LTRs.
;CC   ATCOPIA52_I encodes the 1397-aa ATCOPIA52p copia-like
;CC   polyprotein. ATCOPIA52_I is 71% identical to ATCOPIA51_I.
;CC   ATCOPIA52p:
;CC   MADPYPFPDNVHVSSSVTLKLNDSNYLLWKTQFESLLSCHKLIGFVNGGITPPPRTLNVVTGDTSVDVAN
;CC   PQYESWFCTDQLIRSWLFGTLSEEVLGYVHNLQTSRDIWISLAENFNKSSVAREFTLRRTLQLLSKKDKT
;CC   LSAYCREFIAVCDALSSIGKPVDESMKIFGFLNGLGREYDPITTVIQSSLSKISPPTFRDVISEVKGFDV
;CC   KLQSYEESVTANPHMAFNTQRSEYTDNYTSGNRGKGRGGYGQNRGRSGYSTRGRGFSQHQTNSNNTGERP
;CC   VCQICGRTGHTALKCYNRFDHNYQSVDTAQAFSSLRVSDSSGKEWVPDSAATAHVTSSTNNLQAASPYNG
;CC   SDTVLVGDGAYLPITHVGSTTISSDSGTLPLNEVLVCPDIQKSLLSVSKLCDDYPCGVYFDANKVCIIDI
;CC   NTQKVVSKGPRSNGLYVLENQEFVAFYSNRQCAASEEIWHHRLGHSNSRILQQLKSSKEISFNKSRMSPV
;CC   CEPCQMGKSSKLQFFSSNSRELDLLGRIHCDLWGPSPVVSKQGFKYYVVFVDDYSRYSWFYPLKAKSDFF
;CC   AVFVAFQNLVENQFNTKIKVFQSDGGGEFTSNLMKKHLTDCGIQHRISCPYTPQQNGIAERKHRHFVELG
;CC   LSMMFHSHTPLQFWVEAFFTASFLSNMLPSPSLGNVSPLEALLKQKPNYAMLRVFGTACYPCLRPLGEHK
;CC   FEPRSLQCVFLGYNSQYKGYRCLYPPTGRVYISRHVIFDEETFPFKQKYQFLVPQYESSLLSAWQSSIPQ
;CC   ADQSLIPQAEEGKIESLAKPPSIQKNTIQDTTTQPAILTEGVLNEEEEEDSFEETETESLNEETHTQNDE
;CC   AEVTVEEEVQQEPENTHPMTTRSKAGIHKSNTRYALLTSKFSVEEPKSIDEALNHPGWNNAVNDEMRTIH
;CC   MLHTWSLVQPTEDMNILGCRWVFKTKLKPDGSVDKLKARLVAKGFHQEEGLDYLETFSPVVRTATIRLVL
;CC   DVATAKGWNIKQLDVSNAFLHGELKEPVYMLQPPGFVDQEKPSYVCRLTKALYGLKQAPRAWFDTISNYL
;CC   LDFGFSCSKSDPSLFTYHKNGKTLVLLLYVDDILLTGSDHNLLQELLMSLNKRFSMKDLGAPSYFLGVEI
;CC   ESSPEGLFLHQTAYAKDILHQAAMSNCNSMPTPLPQHIENLNSDLFPEPTYFRSLAGKLQYLTITRPDIQ
;CC   FAVNFICQRMHSPTTADFGLLKRILRYVKGTIHLGLHIKKNQNLSLVAYSDSDWAGCKETRRSTTGFCTL
;CC   LGCNLISWSAKRQETVSKSSTEAEYRALTAVAQELTWLSFLLRDIGVTQTHPTLVKCDNLSAVYLSANPA
;CC   LHNRSKHFDTDYHYIREQVALGLVETKHISATLQLADIFTKPLPRRAFIDLRIKLGVAEPPTTSLRG
;XX
;DR   Positions  3336   7566  Accession No AL132979    GenBank (rel. 124.0)
;XX
;SQ   Sequence 4214 BP; 1202 A; 1030 C; 791 G; 1191 T; 0 other;
ATCOPIA52_I
tggtatcagagccattctaaactctcacaggtccaaaaatggcagatccatacccttttccggacaatgt
ccatgtctctagttctgtaactctcaagctcaatgactccaactatctcctttggaagacacaattcgag
tctttgctatcttgtcacaaactcatcggctttgtcaatggtggaatcacaccaccaccacgtactctca
acgtcgtcacaggagacacctccgtcgatgtcgcaaaccctcagtatgaaagttggttctgcactgatca
gctcatccgttcatggctttttggcacgctatcagaagaagttctcgggtatgtccacaatctccaaacc
tctagagatatttggatctccttagcagaaaacttcaataagagcagtgttgctcgtgagttcacgctcc
gccgtactctgcaactcttgtcaaaaaaagacaaaactttatcagcgtattgtcgtgagtttattgctgt
gtgtgatgctttaagttctataggcaagcctgtggatgaatcaatgaagatctttggttttcttaatggt
ctgggaagagagtatgatcctatcactactgttatacaaagctctctgagtaaaatttctccaccaacct
tcagagatgtgatttccgaggttaaagggtttgatgtgaagctccagtcctatgaagaatcagttactgc
caatcctcacatggctttcaacactcaacgtagtgaatacacagacaactacacttccggcaaccgtggt
aaaggtagaggaggctatggtcaaaatcgcggcagaagtggctactctacgcgtggaaggggtttctctc
agcatcagacaaactccaataacacaggagagcgtccagtgtgtcagatctgtgggcggactggacacac
agctctaaaatgttacaacagatttgatcacaactatcaaagtgttgatactgcccaggccttctcttct
ctgcgggtttcagacagttctggtaaagagtgggtacctgattctgcagctacagctcatgtgacttctt
ccacaaataatctacaagctgcatcaccctacaacggcagtgacacagttcttgttggcgatggagcata
cttacccatcacacatgttggatccaccaccatttcttctgattcaggtactcttccactaaatgaggtc
ttagtatgtcctgatatacaaaagtcccttctatcagtatccaaactatgtgatgactatccttgcggtg
tgtattttgatgctaataaagtatgcattattgatataaatactcagaaagtggtgtcaaagggtcctcg
aagtaacggtctatatgtgttggagaaccaagaatttgtagccttctattctaatcgacagtgtgcagca
tccgaagaaatatggcaccatcgcttaggacattcaaattctcggattcttcaacaactcaagtcaagca
aggaaattagtttcaataagagcagaatgtcccctgtttgtgagccttgccagatggggaaaagttctaa
gttacagtttttttcttcgaattctcgtgagttagatcttttaggtcgaattcattgtgacctttggggc
ccctcaccagttgtatctaaacaaggtttcaagtattatgtggtgtttgttgacgattactctcgctact
catggttttatccattaaaagcaaagtcagatttttttgcggtatttgttgctttccaaaacctggttga
aaaccaatttaatacaaagatcaaggtgtttcagagtgatggaggtggtgagtttacaagtaacttaatg
aagaagcacctaacagactgtggaattcaacatagaatctcttgcccctatactcctcaacaaaacggta
tagcagaacggaagcatcgtcactttgtcgagctcggtttgtcgatgatgtttcacagtcacacaccgct
acagttctgggtagaagcgttcttcactgcaagtttcctaagcaacatgcttccctctccgtcattaggc
aatgtaagtccccttgaagctttactaaaacagaaaccaaattatgcaatgcttagagtgtttggaacag
catgttatccctgcttaagacccttaggagagcataagtttgagcctagatcactacaatgtgtatttct
tggctacaattctcagtataaaggatatagatgtctatacccacctaccggaagagtgtatatctcaagg
catgttatcttcgatgaagaaacatttcctttcaaacaaaaatatcagttcttggttccacaatacgagt
cctctcttctcagtgcttggcagtcatctataccacaagctgatcagtcactcataccgcaagctgaaga
aggaaaaattgaaagcttagcaaaacctccatcgatccagaagaatacaattcaggatactacaactcag
cctgcaattttaactgagggagtattgaatgaagaagaggaagaagactcctttgaagaaacagaaacag
aatctctgaatgaagaaacacacactcaaaatgatgaagcagaggttacagtagaagaagaagtacaaca
agaaccagaaaacactcacccaatgacaacaaggtctaaagccgggattcataaatcaaacacacgatat
gcacttcttacctcaaaattttcagttgaggaaccaaaatcgattgatgaagccttaaatcaccctggtt
ggaacaatgcggtgaatgatgaaatgagaacaattcacatgttgcatacatggtcattggttcagcccac
agaagatatgaatattttgggatgcaggtgggtgttcaaaactaaactcaaaccagatgggtctgtggat
aagctaaaagctaggcttgttgccaaaggatttcaccaagaggaaggtctagactatcttgaaaccttca
gtccggtggtcagaacagccactatccgtcttgttctcgatgttgctactgctaaaggatggaacataaa
gcaacttgatgtgtctaatgcgtttcttcacggtgaattaaaggaacctgtctacatgcttcagcctcct
ggttttgtggatcaagaaaaaccttcatatgtgtgccgtctcaccaaagctttgtatggcttaaaacagg
ctcctagagcttggtttgacacgattagtaactatcttcttgactttggtttttcttgcagcaaatcaga
tccttctctattcacatatcacaagaatgggaagactttggtgttacttctatatgtagatgacattctt
ctcaccgggagtgatcacaatctacttcaagagcttctcatgtctctcaacaaacgtttttcaatgaagg
atctgggcgctccaagttatttccttggtgtggaaattgagtcatcaccagaaggtctcttcctccatca
aactgcctacgctaaagacattcttcaccaagccgcaatgtcaaactgcaactctatgcctactccacta
cctcaacacattgagaacctgaattcagacctcttccctgaacctacttacttcagaagtttagctggaa
agcttcaatatttaaccatcacccgacccgacatacagtttgctgttaacttcatttgccaaaggatgca
ttctcctactacagcagattttggtttgctcaaacggattctgagatatgtgaaaggaactattcacttg
ggcttacacatcaagaaaaaccagaacttgtccctcgtagcttacagtgatagcgactgggctgggtgta
aggaaacaagacgctcgacaaccgggttctgtacactacttggatgcaacctcatttcgtggtcagccaa
gagacaagaaacagtgtcaaaatctagcacagaagcagagtatcgagctcttacggcagtagctcaagag
cttacttggctgtcttttctgcttagggatattggagttacacaaacccatccaaccttggtgaaatgtg
acaatctatcagcagtttatctaagcgccaatcctgctcttcataacaggtctaagcactttgacacaga
ttatcattacatcagagaacaagttgctttgggtcttgtggagacaaaacacatatctgcaacgctgcaa
cttgcagacattttcacaaaaccgctaccaagacgagccttcattgatctcagaatcaaacttggtgtag
ctgaaccacccaccacaagtttgagggggaa