;ID   ATCOPIA83_I DNA   ; ATH   ; 4643 BP
;XX
;DE   Internal region of the ATCOPIA83 copia-like LTR-retrotransposon.
;XX
;AC   AB010694
;XX
;DT   30-NOV-2001 (Rel. 6.3, Created)
;DT   30-NOV-2001 (Rel. 6.3, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region; 
;KW   copia-like polyprotein; reverse transcriptase; the ATCOPIA83 
;KW   family; ATCOPIA83LTR; ATCOPIA83_I.
;XX
;OS   Arabidopsis thaliana
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] (bases 1 to 4643)
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Internal portion of the ATCOPIA83 copia-like LTR-retrotransposon.
;RL   Repbase Reports 1:(3) p. 10 (2001)
;XX
;CC   ATCOPIA83_I is the internal region of the ATCOPIA83 copia-like
;CC   endogenous retrovirus, flanked by the 99% identical ATCOPIA83LTR
;CC   long terminal repeats and a 5-bp target-site duplication (AGATA).
;CC   ATCOPIA83 forms a separate family of copia-like retroviruses
;CC   present in the A. thaliana genome. Two nearly identical copies of
;CC   ATCOPIA83 are found on chromosome 5; however, one of them is the
;CC   result of a ~14-kb duplication.
;CC   ATCOPIA83 is most closely related to ATCOPIA26: the LTRs and
;CC   internal portions of these two retroelements are 74% identical.
;CC   ATCOPIA83_I encodes well-preserved remnants of the ~1500-aa
;CC   ATCOPIA83p copia-like polyprotein.
;CC   ATCOPIA83p (nucleotides at positions 982 and 983 form a spurious
;CC   frameshift; they were deleted to produce this polyprotein):
;CC   MSTELALTTSSSTTTQPRRTISPYDLTSGDNPGTLISKPLLRGPNYDEWATNLRLALKARKKFGFADGTI
;CC   PQPDETNPDFDDWIANNALVVSWMKLTIHESLATSMSHLDDSHDMWTHIQKRFGVKNGQRIQRLKTELAT
;CC   CRQKGTPIETYYGKLSQLWRSLADYQQAKTMEEVRKEREEDKLHQFLMGLDESMYGAVKSALLSRVPLPS
;CC   LEEAYNTLTQDEESKSLSRLHDERNDGVSFAVQTTPRTRSLTKNKDSAIVCSHCGRLGHLAENCFKLVGY
;CC   PPWQEKMRRNATSSSRNQSASMGRGSSVVPATSFKGKQSFGRGASANHVANIGESTTAATSSMSGSQLTE
;CC   ADRVGISGLNDEQWKQLRQMLKERNFNSTNTKSSKFFLESWIIDSGATNHMTGTLEFLRDVCDMPPIMIK
;CC   LPDGRLTTSTKHGRVYLGSSLDLQEVFFVDGLHCHLISVSQLTRAKSCVFQITDKVCIIQDRITLTLIGA
;CC   GKQQNGLYFFRGTETVASMTRMDSSSQLWHCRLGHPSSKVLKLLSFSDSTGHAFDSKTCEICIKAKQTRD
;CC   PFPLSNNKTSSPFEMVHCDLWGPYRTTSICGSNYFLTLVDNYTRAVWLYLLPSKQTAPMHLKNFISLVER
;CC   QFSTKIKTIRSDNGTEFVCLSSFFVDHGIIHETSCVGTPQQNGRVERKHRHILNVARALRFQARLPIEFW
;CC   SYCALTAAYLINRTPTPLLQGKTPFELLYNRPPPVNHIRVFGCICYVHNQKHGGDKFESRSNKSIFLGYP
;CC   FAKKGWRVYNFETGVISVSRDVVFRETEFPFPASVFDSTPDSQLSPSNADQSFFLPSELQAPTPVSITTT
;CC   LELTQSSSSTNLNDDNFHIPSDESSSVNEMSDNEDLNSPTTNESSPFLSPASPSLPLSPASLSLPLSPAA
;CC   PSPSLPKIAEPEPEPELLGKGKRKKTQPVRLADYATTLLHQPHPSVTPYPLDNYVSSSQFSAAYQAYVFA
;CC   ISLGIEPKSYKEAILDENWRCAVSDEIVSLENLGTWTVEDLPPGKKALGCKWVFRLKYKSDGTLERHKAR
;CC   LVVLGNKQTEGIDYSETFAPVAKMVTVRAFLQQVASLDWEVHQMDVHNAFLHGDLDEEVYIKFPPGFGSD
;CC   DNRKVCRLRKALYGLKQAPRCWFAKLTTALNDYGFIQDISDYSLFTMERNGIRLHILVYVDDLIITGSSL
;CC   DVITKFKGYLSSCFYMKDLGILRYFLGIEVARSPAGIYLCQRKYAIDIITETGLLGVWPASHPLEQNHKL
;CC   ALAFGDTISDPSRYRRLVGRLIYLGTTRPELSYAIHMLSQFMSDPKADHMEAALRVVRYLKSSPGQGILL
;CC   RSNTPLVLIGWCDSGFDSCPITQRSLTGWFIQLGGSPISWKTKKHDVVSRSSAEAEYRAMADTVSELLWL
;CC   RALLPALGISCNEPIMLYSDSLSAISLAANPVYHARTKHVGRDVHFVRDEIIRGTIATKHVSTTSQLADI
;CC   MTKALGRREFDAFLLKLGICNLHTPACGGGGGG
;XX
;DR   Positions  14350 9708  Accession No AB010694    GenBank (rel. 124.0)
;XX
;SQ   Sequence 4643 BP; 1250 A; 1052 C; 956 G; 1385 T; 0 other;
ATCOPIA83_I
tggtatcagagcaaaacctaaactactaaagggtttttgtttcttgattctgtgatctgtcaattctgct
gcaagttatagttgtttctaaaattcgtttgtttccacttctttgacaaacttttctcacgatgagtacg
gagttggctcttaccacctcttcctcgactaccacacaacctcgccgtaccatctcaccatacgatctta
cctccggagacaatccaggcaccctaatttcaaaacctctcttacgtggacctaactacgatgaatgggc
cacaaatttacgtttggctttgaaagcaaggaaaaagtttggatttgcagatgggacaattcctcagccc
gacgaaacaaatcctgacttcgatgattggattgctaacaatgctcttgtggtttcatggatgaaactca
ccattcatgaaagcttagctacctcaatgtcacacctcgacgattcccatgatatgtggactcacattca
aaaacggtttggagttaaaaatggacagcgcatccaaagactcaaaaccgagttagctacatgtcgtcaa
aaaggaactcccattgagacatactacgggaaactctctcaattgtggagaagcttggcagattatcaac
aagctaaaacaatggaggaagtaagaaaagagcgtgaggaagacaagctgcatcaatttctcatgggtct
agacgaatccatgtatggagccgtcaagtccgctcttctctctcgggtaccactgccttcactcgaggaa
gcttacaacaccttgactcaagacgaagaatcaaagtctttaagtcgtttacatgatgaaagaaacgatg
gtgtgagttttgcggttcaaactactccaaggacccgcagcctcaccaaaaacaaagattctgcaattgt
ttgttctcactgtgggcgtcttggtcatctcgccgaaaactgcttcaaactagtaggatatcctccttgg
cttaagagaaaatgcgtcggaatgctacttcttcctcacgaaatcagtctgccagcatgggtcgtggctc
ctctgttgtacccgccacatctttcaaaggaaaacagagttttggtcgtggtgcatcagccaatcatgtc
gcaaatatcggagaaagtacgactgctgcaacaagctcgatgtctggttctcagttaacagaagctgacc
gtgtcggaataagtggcctgaacgatgaacaatggaagcaattgagacagatgcttaaagaacgaaattt
caattccaccaataccaaatccagtaagttctttcttgaatcttggatcattgattccggtgcgacaaat
cacatgaccggtactcttgaatttttgcgtgatgtttgtgacatgcctccgattatgattaaacttcccg
acggaaggcttacaacatctaccaaacacggacgtgtttatttaggctcatctttggatcttcaggaagt
gttttttgttgacggtctacattgtcatctaatctcagtttcacagttgactagagcaaagagttgtgtt
tttcagataactgacaaagtgtgtattattcaggaccgcatcactctaacgctgattggagctggtaagc
agcaaaatggattgtatttctttcgaggaacggaaacagtggcatcaatgacacgcatggactcgtcttc
tcagttgtggcattgtcgtctcggccatccttcctccaaagttttaaagttgttatcgttttcagattct
actggtcatgcttttgattcaaagacttgtgagatttgtattaaagctaaacaaacaagggatccttttc
ctttgagcaataataaaacaagttctccttttgaaatggtgcattgtgatctttggggtccatataggac
tacatcaatctgcggttctaactattttctcacacttgttgacaactatacacgggctgtctggttatat
ctcctaccttcgaaacaaacagctccgatgcatcttaaaaacttcatatctctggttgagagacagtttt
ctactaagataaagacaattcgcagtgacaacggcacggagtttgtttgtctctcttcattttttgttga
ccacggtattattcatgagacttcttgtgtggggacgccacaacaaaatggtagagtagaacgcaagcat
agacacattctcaatgtcgcacgtgctcttcgttttcaagctagattacctattgaattttggagttatt
gcgcactcactgcggcttacctcatcaatagaactccaactccattacttcaaggcaagacaccatttga
gcttctatataatcgtcctcctccggttaatcatattcgggttttcggttgtatttgctatgttcataat
caaaagcatggaggtgacaaatttgagagtagaagtaacaaatccatttttcttggatatccatttgcga
agaaaggctggagagtatacaattttgaaacgggtgtgatttctgtctctcgtgacgttgttttccggga
aacagaattcccttttccagcctctgtttttgactccacgccagactcgcagctctcaccctccaacgca
gatcagtcattttttttaccatcagaattacaagctcctacaccggtcagtattactacaactctggaat
taacacagtcatcgtcctctaccaatctcaacgatgataattttcatattccctccgacgaatcaagctc
tgttaatgagatgtctgacaatgaggatttgaattctccaactaccaatgaatcatctccgttcttgtca
ccggcatcaccgtctctgcctttatcaccggcatcactgtctctacctctatcaccggcagcaccctctc
cgtctctgccaaaaatagcagaaccagaacctgaacctgaactacttggaaaaggaaaacgcaagaaaac
acaacctgtgaggcttgcggattatgccactacattacttcaccaaccacacccgtcagtgactccatac
ccactggacaactatgtatcaagctctcaattctctgctgcttatcaagcttatgtttttgccatatctt
tgggtattgaaccaaagagctacaaggaagcgattcttgatgagaattggcgttgtgcagtgtctgatga
gattgtctctcttgaaaatcttggtacatggacagtagaagacctacctcctggtaagaaagcattgggc
tgcaaatgggtgtttagactaaaatacaaatcggatggtacgcttgaaagacataaagcacgtttggtgg
ttctcggtaataaacaaaccgaaggcattgactactccgaaacgttcgctcctgtcgcaaaaatggtcac
cgttcgtgcttttcttcaacaagtagcatcattagactgggaagttcaccaaatggacgttcacaacgca
tttctacacggtgatcttgatgaagaagtttacatcaaatttcctccgggttttggtagtgatgataatc
gtaaagtgtgtcgcctgcgcaaagctttgtatggattaaaacaagctcctcgatgttggttcgccaaact
tactactgctttgaatgattacgggttcatacaagacatctctgattattcgttatttaccatggagaga
aacggcatccgtttacatattcttgtctatgttgatgacttgatcatcacaggctcttctctcgatgtca
tcactaagtttaaagggtatttaagctcatgtttctacatgaaggatcttggtattttgcgatacttctt
gggaatcgaggttgctaggagtcccgccgggatttacttgtgtcagcgtaagtacgccattgatattatc
acagaaacgggtttgcttggagtctggcctgcatcacatcctttggaacagaaccataagcttgctttag
cttttggtgacacgatctcggatccttctcggtatcgtcgtcttgttggcagacttatatacctcggcac
cacacgtcctgagttgtcttatgcaattcacatgctttcccaattcatgagtgatcctaaagccgatcat
atggaagctgctctcagggtcgttcgttatctaaaatcaagtccaggtcaaggtattcttttacgctcaa
atacacctttggtactcattggttggtgtgactcaggttttgattcttgtccgattacacaaaggtctct
aacgggttggtttattcaactgggaggctctccgatctcttggaaaaccaagaaacatgatgtggtttct
agatcctcagctgaagcagagtatcgtgctatggccgacaccgttagcgagcttctttggctgcgggctc
ttcttccagctttaggtatctcgtgtaatgaacccatcatgttgtattctgatagtctttcggctattag
tctagctgccaatcctgtttatcatgcacgtacaaagcatgttggtcgcgacgtacacttcgttcgtgat
gaaattatacgaggtaccattgccaccaaacacgtctccacgacatctcaactagcagacattatgacta
aagcattgggtcgtcgtgagtttgacgcttttcttctcaagctgggtatttgtaatctccatactccagc
ttgcggggggggggggggggggg1