;ID   ATCOPIA47_I DNA   ; ATH   ; 4299 BP
;XX
;DE   Internal region of ATCOPIA47 copia-like LTR-retrotransposon.
;XX
;AC   AL161552
;XX
;DT   01-OCT-2001 (Rel. 6.2, Created)
;DT   01-OCT-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region;
;KW   copia-like polyprotein; reverse transcriptase; ATCOPIA47LTR;
;KW   ATCOPIA47_I.
;XX
;OS   Arabidopsis thaliana
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] (bases 1 to 4299)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Repbase Reports 1:(1) p. 6 (2001)
;XX
;CC   ATCOPIA47_I is the internal region of the ATCOPIA47 copia-like
;CC   endogenous retrovirus, flanked by 2% divergent ATCOPIA47LTR
;CC   LTRs and a 5-bp target-site duplication.
;CC   ATCOPIA47_I encodes the 1430-aa ATCOPIA47p copia-like polyprotein.
;CC   ATCOPIA47p:
;CC   MANQSMELYSYPVLNISNCVTVKLTERNYLLWKTQFESFLSGQNLLGFVNGATKSPDPVSAVTNIDGVVT
;CC   EIPNADYSARHRSDQVVKSWILGSLSEDILEEVITESTAQQVWEGLARYFNRVSTARLFELQRKQQTMCK
;CC   HDTPMIDYIKGIKNICEQLASAGSPVKEQMKFFAALNGLGREYEPIKTSIEGSMESTPAPTLDSITPRLT
;CC   GFADRLKSYEVVSVTPHLVFTADAPDAGNYYQTNYSGRGNFSGKGNRGRGAYNTKGRGFHQQTTTGSSMS
;CC   GENRPVCQICGKLGHPALKCWHRFNNSYQHEELPSALTAMHITEVTEHNGQEWFPDTGASAHVTNSHQHL
;CC   QQSRPYNGSDAVIVGNGEFLPITHTGSTNLSSTSGKLPLKDVLVCPDIAKPLLSVSKLTRDYPCSFEFDC
;CC   DGVRVHDKGTKRLLILGTSKDGLYVLKNTPIQAFYSTRQQATSDEVWHMRLGHPNPQILQYLSKINAIKI
;CC   NKSSKSMCEACQLGKSSRLPFSLSTSVTTKPLQRIHYDLWGPAPIVSGQGFKYYAIFIDNYSRFCWFYPL
;CC   KLKSDFFTVFINFQALVENQFSTKIQSFQCDGGGEFTSNQFVNHLQASGIKQLISCPHTPQQNGLAERKH
;CC   RHIIELGLSMMFQSRIPQKYWVKAFFTANFLSNLLPSSVLDAQKSPYEVLLGNAPDYTSLRTFGCACFPT
;CC   LRDYTQNKFDPRSLQCVFLGYNEKYKGYRCLLPSTGRVYISRHVLFDEQVFPFATLNKEVTSNLQTPLMQ
;CC   AWQKSFQVLPTTSPQPASPLFSETDFPPLPIRIPEQSTVRSEGRSGCTTGLDPASIGNSLSLIPQRMDSS
;CC   ESTSTSSEIPELAIVTQPTEESTENPAASTTTSTQTESDTPATAQHPMVTRSKSGITKPNPRYALLTHKV
;CC   SYPEPKTVTSALKDPGWNGAMTEEIGNCGEAETWSLTPRTPEMHVLGCKWVFRTKLNADVSLNKLRARLV
;CC   AKGFNQEEGIDYLKTYSPVVRSATVRGVLHVATIMEWEIKQMDVKNAFMHGDLTETVYMTQPAGFVDPDK
;CC   PNHVCHLHKSIYGLKQSPRAWFDKFSTFLLEFGFTCSYPDPSLFIYIKNKDVILLLLYVDDMVITGNNSK
;CC   ALSNLLAELNKQFRMKDLGELHYFLGIQVQNHSEGLFLSQQKYAEDLLVVAAMSDCSPMPTPLPLQIHKE
;CC   SDTDDAFPDPSYFRSLAGKLQYLTLTRPDIQFAVNFVCRKMHAPSQFDFSLLRRILRYIKGTITMGITFR
;CC   KDTDCTLRAYSDSDFGGCKSTVRSTGGFCTFLGSNLISWSSQKQDSVSKSSTEAEYRAMSEAASEITWLC
;CC   SFLKELGIPLHETPSLYCDNLSAVYLTANPAFHNRTKLFLRHYHYVRERVALGALIVKHIPSHHQIADIF
;CC   TKSLPHGPFSSLRFKLGIDSPPIPSLRGSH
;XX
;DR   Positions  190949  195248  Accession No AL161552    GenBank (rel. 124.0)
;XX
;SQ   Sequence 4299 BP; 1273 A; 991 C; 866 G; 1169 T; 0 other;
ATCOPIA47_I
tggtatcagagccatggcaaatcagtccatggagctctattcctatcctgttctaaacatctcaaactgt
gttactgttaagcttactgagagaaattacctgctctggaaaactcagtttgagtccttcttgtctggtc
agaaccttctcggcttcgtcaatggcgctaccaagagtccagatcctgtctcagctgtcacgaacattga
tggtgttgttacagagatcccaaatgctgactacagtgctaggcacagatctgaccaagtggtgaagtca
tggatcttaggctctctctctgaagacattctggaagaggtcatcactgaatccactgctcaacaagtct
gggaaggtctagccaggtatttcaatcgtgtctctactgctcgcctatttgaactgcaaagaaaacagca
gactatgtgtaagcatgatacacctatgatagattacattaagggaataaagaacatttgtgaacaactt
gcctctgctggtagtcctgttaaggaacaaatgaaattttttgctgctcttaatggtcttggtcgtgagt
atgaacctattaagacatccatagagggtagtatggaatctacacctgctcctactctagatagtatcac
tcctaggcttacaggctttgctgaccgtctcaagagctatgaagtggtgtctgtcactccccatctagtc
ttcactgcagatgctcctgatgctggcaactactatcagaccaactacagtggcagagggaatttttctg
gcaagggaaacagaggcagaggagcatacaacaccaagggacgtggctttcatcagcagaccacaactgg
ctcttccatgtcaggagaaaacagacctgtctgtcagatctgtggaaagctaggccatcctgctctcaaa
tgttggcacaggttcaacaatagctaccagcatgaagaactgcctagtgccttgacagctatgcacatca
cagaagttacagaacacaatggacaagagtggttccctgatacaggagcttcagctcatgtgaccaacag
tcatcagcatctgcaacaatcaagaccatacaatggttcagacgctgtgattgtaggaaatggtgaattc
cttccaatcacccatactggctccacaaacctgtcatcaacctcaggtaaacttccacttaaagatgttt
tagtctgtcctgacattgcaaaaccattactgtctgtgtccaaacttactagagattacccatgctcctt
tgaatttgattgtgacggtgtccgtgtacatgataagggaacaaagaggttgctaattctgggaacaagt
aaagatggtctttatgtgctgaagaacactcctatccaagccttctattccacaagacaacaagcgacat
cagatgaagtatggcatatgaggcttggccatcctaaccctcagattcttcagtacctgtcaaagatcaa
tgctatcaagatcaataagagctccaagtctatgtgtgaagcgtgtcaacttggaaaaagctcaagatta
ccattttccctttctacttctgtaaccactaaacctttgcaaaggatccattatgatttgtggggtcctg
cacctatagtgtcaggtcagggctttaaatactatgctatcttcattgacaactactcccgtttttgctg
gttctatccattgaagttaaaatctgatttcttcactgtgttcatcaactttcaagctctagttgagaat
cagttttctacaaaaatacaaagctttcagtgtgatggaggaggagagtttaccagtaatcagtttgtca
atcacctacaagcgtcaggaatcaaacaactcatctcttgtccccacacacctcagcaaaacggactagc
tgaaagaaaacacagacacatcatagagttaggcttatccatgatgtttcaaagcagaatacctcaaaag
tattgggttaaagctttcttcacggcaaattttctgagcaatctgcttccctcatctgtcttggatgctc
agaaaagtccctatgaagtgttacttggcaatgcacctgattacacttctcttcgcacgtttggttgtgc
ttgttttccaaccctgcgagactacacacaaaacaagtttgatcctagatctttacagtgtgtgttcctt
ggatacaatgaaaaatataaaggctacagatgcttacttccatcaacaggcagagtctacatcagcagac
atgttttgttcgatgaacaagtgtttccctttgcaacactcaacaaagaagtcacttccaacctacaaac
tccactgatgcaggcttggcagaaaagttttcaagtgttgcctactacctcacctcaaccagcttctcca
ctattttcagaaactgacttcccaccattacctatcagaatacctgaacagtctactgtgaggagtgaag
ggagatctgggtgtaccacaggcctagatcctgcttctataggcaacagtctctctctcattcctcagag
gatggatagttcagaaagtacctcaacatcgtcagaaatcccagaattagcaatcgtcactcagccaact
gaagaatcaacagagaatcctgcagcttcaacaacgacttctactcagactgaatcagacactcctgcaa
cagctcaacatccaatggtaaccaggtcaaagtcaggaataacaaagccaaatccgaggtatgcactact
aactcacaaagtgtcttacccggagccaaaaactgttacttctgcacttaaagatcctggctggaatgga
gcaatgacagaagaaataggcaactgtggtgaagcagaaacctggtcattgactcctagaactccagaaa
tgcatgttcttggctgcaaatgggtgttcaggactaagttaaacgctgatgtctccttgaataaactgag
agccagattggttgctaagggattcaatcaagaagaaggaatcgactatctgaaaacatatagcccagtg
gttagatctgcaacagtaagaggagtgttacatgtggcaacaataatggaatgggaaataaagcaaatgg
atgttaaaaatgctttcatgcatggtgatttgacagaaactgtctacatgactcaaccagcaggcttcgt
ggatccagacaaaccaaatcatgtctgtcatctacacaagtcaatctatggtcttaagcagtctcccaga
gcctggtttgataagttcagcacattcttactagagttcgggttcacttgcagttatcctgatccatcac
tattcatctacatcaagaacaaagatgtcattcttctattactctatgtggatgatatggtgataacagg
taacaattcaaaggccttatcgaatctattggcagagttaaataagcagtttcgaatgaaggatttgggt
gagctacattacttcttgggaattcaggttcagaatcattcagaagggctattcctatcacagcaaaagt
atgcagaagacttacttgttgtggctgccatgtctgactgcagtcctatgccaacacctctccctctgca
aattcataaagagtctgatactgatgacgccttccctgatccttcttacttcagaagtcttgcgggtaaa
ctccaatacttgactctaacaaggccagacatacaatttgcagtgaattttgtttgcaggaagatgcatg
ctccttctcagttcgatttcagtctgttgagaaggatattgcggtatatcaaaggaacaattacaatggg
aataacattcagaaaagacactgattgcacattgagagcttacagtgatagtgatttcggaggctgcaag
tcaacagtcagatctacaggtggtttctgtaccttccttggcagtaatttaatctcctggtcatcgcaga
agcaggactcagtctctaagagctcaactgaagccgagtacagagcaatgtcagaagcagcttcagaaat
aacctggttgtgctcattcctgaaggaacttggaattcctcttcatgagaccccaagtctgtactgcgac
aacttatcagcagtctatctcacggcaaatccggcgtttcacaatcgaaccaaactcttcctgcgacatt
atcactatgttcgtgaaagagtagctcttggagcgctgattgtgaagcacatcccgtctcatcatcaaat
agctgacatcttcaccaagtctcttcctcacggaccgttcagctcattaaggttcaaactcggtatcgat
tcaccaccgatcccaagtttgcgggggta