;ID   ATCOPIA94_I DNA   ; ATH   ; 4638 BP
;XX
;DE   Internal region of the ATCOPIA94 copia-like LTR-retrotransposon.
;XX
;AC   AC022354
;XX
;DT   27-DEC-2001 (Rel. 6.3, Created)
;DT   27-DEC-2001 (Rel. 6.3, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region; 
;KW   copia-like polyprotein; reverse transcriptase; the ATCOPIA94 
;KW   family; ATCOPIA94LTR; ATCOPIA94_I.
;XX
;OS   Arabidopsis thaliana
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] (bases 1 to 4638)
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Internal portion of the ATCOPIA94 copia-like LTR-retrotransposon.
;RL   Repbase Reports 1:(4) p. 4 (2001)
;XX
;CC   ATCOPIA94_I is the internal region of the ATCOPIA94 copia-like
;CC   endogenous retrovirus.
;CC   ATCOPIA94 forms a separate family of copia-like retroviruses
;CC   in the A. thaliana genome: members of other families
;CC   are less than 75% identical to ATCOPIA94_I.
;CC   ATCOPIA94_I encodes remnants of the ATCOPIA94p copia-like
;CC   polyprotein. The ORF that encodes ATCOPIA94p is disrupted by
;CC   a single frameshift caused by a 5-bp deletion at position 3264
;CC   (masked by XXXX in the 1436-aa ATCOPIA94p sequence).
;CC   Such limited damage suggests that ATCOPIA94 retrotransposed
;CC   recently.
;CC   ATCOPIA94 is inserted exactly between two hypothetical genes
;CC   that are 86% identical to each other.
;CC   ATCOPIA94p:
;CC   IPSPLNAHYSLEDIVSYYYSKASDHPGHVISHPLLRGDNYEEWAINLETALASRKKFGFLDGRISKPEED
;CC   SSDFDDWKSINALLVSWIKMTIEPNLRSNISHKPVARDLWEHIKKHFCVSNGPRVQQLRKELSNCRQDGL
;CC   SIETYYGKLTKLWDNMDACRPRIVCTCGKCICDCLAVLETLREHDKVHDFLMGLDESAYGTVRSSLLIQE
;CC   PLPSLEYVYLKVTQDEDSRSHKQVSDSRPDGMVFAAQHASRVRPGERSSSTTVCTNCGRTGHMAESCFQI
;CC   LGFPEWWGDRPRNQNGRGCGTKSNGSGRGGGVVARDNVAQGASSSEVVHANVAITDADRAAVLNLTDEQW
;CC   ISVKRAINATKGNSDEQLSGKSSSTSWILDTGATNHLTCRRDILENVRKSTSTPIILADNRIVMGDMVGS
;CC   VTLNKHLKLHNVFYIEDLGFDLISVTQLMEENDCVMQLSVPFCVLQDRTTRTLIGVGKPSGGLVYFRSTE
;CC   VAVAVKGVPKQSLELWHQRLGHPSLKVVESLPDFVFSSHKDFSNKACDVCIRAKQTSSSFPISNNKTTEI
;CC   FEMIHCDIWGPYCEPSSSGARYFLTIVDDYSRGTWLYLMKNKSDTQTKLRDFIALVDRQFGKKVRVLRSD
;CC   NGGEFLSLTSYFLASGIVHETSCVGTPQQNGRVERKHRHLLNVARAIMFQGSLPIQFWGECALNAAYLIN
;CC   RTPSMVLNGKTPYKVLFGKRPSYDHLRTIGCLCYIHNQDHHGDKFASRSRKCVFVGYPYAKKGWRVYDLE
;CC   KEIFCVSRDVVFDEGIFPFHNDLNPLPLTLDTTSSLTPFYSDDDDDDNGPTTCSVSVPPVPVIPLRVAPI
;CC   SLADSGSKVDLPNDDLVLPDSASTTPLPTDPKPTRERRQPAYLADYETTFLCGSSSTPYPISDVLTDARF
;CC   SSAHKAFLTAITLAHEPATYAEAFADPKWRDATKEEIDALVLNHTWDIEDLPLGKVAIGSKWVFKVKHXX
;CC   XXTVERYKTRLVALGNRQKEGIDFDETFAPVAKMTSVRIFLSIVAARNWEVHQMDVHNAFLHGDLEEEVY
;CC   MKLPPGYFAQHKGKVCRLRKSLYGLKQAPRCWFAKFATALKNYGFQQSYSDYSLFSFNRDGVSLYVLVYV
;CC   DDLIVTGNSTTAISEFKQYLDSCFHMKDLGLLKYFLGIEVARSPDGIYLCQRKYVLDILQDTFLLGAQPS
;CC   GFPIEQNHTLASATGSFLTDLASYRRLVGRLVYLNVTRQDLAYSVRVLTQFMQKPREEHWTAALRVVRYL
;CC   KGTPGQGVLLRANSDLHIYGWCDSDFSGCPRSRRSLTGWFVQLGQSPVLWRTKKQKVVSMSSAEAEYRAM
;CC   SLTARELVWLQALLEDLAVFLSRPMTLYCDSTAAIHIAANPVFYERTKHIERDCHFVREKVQNGTIATVH
;CC   VSTTTQLADIFTKALGSREFDVFRDKLGIEDLHAPT
;XX
;DR   Positions  32279  36916  Accession No AC022354    GenBank (rel. 124.0)
;XX
;SQ   Sequence 4638 BP; 1184 A; 921 C; 1041 G; 1492 T; 0 other;
ATCOPIA94_I
tggtgtttaattgttgaatttctgttttgttgtttcattctttattgatctctgaattactgtatgaccc
agtgtttttaattcgtagtttaaatcgtttatttctttgcatttttatatacaattatgaaaacaaaaca
aatatcatttatgctttacataaaaacgagtctcacaaatctttagagcatcagacgtggttcttatgaa
tcgaccgcatatttattgtcttacgattaattgtatcatgttactactccagcaattaattagtatcata
tctggtttcagaacaagaaaacataattttaaaactttaatctgaattccatccccattaaacgcacatt
actcacttgaagatattgtgtcatattattattcaaaagcaagtgatcatccgggacatgtcatctccca
tccacttctccgaggtgataactatgaagaatgggctatcaacttggagactgctcttgcctcgagaaag
aagtttggttttctcgatggtcgaatttcgaagccagaagaagactcttccgactttgatgattggaaat
caatcaacgccctcttggtctcatggatcaagatgacaatcgaaccgaatctgcgctccaacatctcaca
caaacctgttgcgcgtgacttgtgggaacatatcaagaaacatttttgcgtttcgaacggtcctcgtgta
caacaactacgcaaggagttatcgaattgtcgtcaggatggtctctcgattgagacatactacggcaaac
taacaaagttgtgggacaacatggatgcttgtcgaccacgtattgtgtgtacttgtggaaagtgcatatg
tgattgtcttgccgtgttggagacgctaagggaacacgacaaggttcatgattttctcatgggacttgat
gaatcagcctatggcactgtccgctcgtctcttttgattcaagaacctttaccaagtcttgagtatgtat
accttaaggtgactcaagatgaggattctcgctctcacaagcaagtcagtgactctcgacccgacgggat
ggtgtttgcagcacaacacgcgtctcgtgtacgacccggggaaagatcttcttcgacaaccgtgtgtacc
aactgtgggagaactggtcacatggcagaaagctgttttcagattcttgggtttccagaatggtggggag
ataggcctcgtaatcaaaacggacgtggttgtggtactaaatccaatggatctggacgtggcggtggagt
tgttgcacgagataatgttgctcagggagcatcgtcatcagaagttgttcatgccaatgttgctatcaca
gatgccgatcgtgcggctgtgctcaatctcaccgacgaacagtggataagtgtcaaacgggccataaatg
caactaaaggaaactccgatgagcaactctctggtaagtcttcctctacttcgtggattttggatacagg
tgcgacaaatcatttgacatgccgccgtgatatacttgaaaatgttagaaaatcgacttcgactccaatt
attcttgctgataatcggattgtgatgggtgacatggttggttctgtgacattgaataaacacttgaagt
tacacaatgtgttttatattgaagatttgggttttgatttgatctctgttactcagttgatggaggagaa
tgattgtgtaatgcagttgagtgttccattttgtgttcttcaggaccgcactacgaggacgctgattgga
gtaggtaagccctctggagggttggtctattttcgaagcacggaggttgctgttgcggtgaagggtgttc
caaaacaatctttggagttgtggcatcaacgcttagggcatccgtctttgaaagtagtcgagtctttacc
ggattttgttttttctagtcataaggatttttcgaataaggcgtgtgacgtttgtatccgcgctaagcaa
actagttcttcttttccgataagcaataataagactactgagatttttgagatgattcattgtgatattt
ggggtccttactgtgaaccttcttcttctggagcacgatatttcttgacaattgttgacgattactctcg
tggtacttggctttatcttatgaagaataaaagtgacactcaaacaaagcttcgtgattttattgctttg
gttgataggcaatttggcaagaaagttcgtgttctaagaagtgataatggtggtgaattcttgtcactca
catcttattttcttgcatctggtattgtccatgaaacgagttgtgtgggtactcctcagcaaaacggtcg
cgttgagcgtaaacaccgtcatctcttgaatgttgcacgagcaattatgtttcaaggatctcttcctatt
cagttttggggagagtgtgcgttgaatgcggcttaccttattaatcggactccttcgatggttcttaatg
gaaagactccatacaaggttctctttgggaaacgtccttcttatgatcatcttcgaaccattggttgttt
atgctatatacataatcaagatcatcatggagataagtttgctagtcgcagccgcaaatgtgtttttgtt
ggatatccatatgctaagaagggttggcgtgtgtatgatctcgagaaagaaattttttgtgtgtcacgag
acgtggtctttgatgaagggatttttccatttcataatgatttgaatccgcttccactcactttggatac
cacgtcttcactcacacctttctattccgacgatgatgatgatgacaatggtccaacgacatgctccgtc
tcggttccaccggttcctgttatcccacttcgtgttgcaccaatatctttagcagattccggttcgaaag
ttgatttgcctaatgatgatttggttctaccagattctgcttctactactccactaccaacggatccaaa
acctactcgtgaacgacgtcaacctgcttaccttgccgattacgaaacgacgtttctatgtggttcatca
tcgactccgtatccgattagtgatgtcctcactgatgctcgtttttcatcagctcacaaagcttttctta
cagctattacattagctcatgaaccggcaacttatgcagaagcttttgcagacccaaaatggcgtgatgc
cacgaaagaagagattgatgctcttgtcttgaatcacacatgggatattgaagatttaccactaggaaaa
gttgcgattggttccaagtgggtcttcaaggttaagcatcgctaggacagtggaacggtacaagactcgg
ttagttgctttaggaaaccgacagaaagagggcattgattttgatgagacatttgctccggttgcgaaaa
tgacttcggttcgtatatttttatcgattgtggctgctcggaattgggaggtgcaccaaatggatgttca
taatgcgttcttacatggtgatctcgaagaagaggtctacatgaagttaccgccaggatactttgcccaa
cacaaaggtaaagtgtgtcgtcttcgaaagtcattgtatggcttgaaacaagctccacgatgttggtttg
caaaatttgcaacagcacttaaaaattatggttttcaacaatcatattccgattactctttgttttcatt
caaccgtgatggtgtgtctttatacgttttggtctacgttgatgacttgattgttactggaaactccact
acggcgataagtgagtttaagcagtatcttgattcatgctttcacatgaaggatcttggtcttttgaagt
atttcttgggtatcgaggttgctagaagccccgatggtatttacttatgtcaacggaagtatgttcttga
tattcttcaagacacgtttcttcttggagcacaaccatcaggttttcccattgagcaaaaccatactctt
gcttctgctaccggttcgtttctaacagatctagcatcttatcgtcgtcttgttggtcggctcgtttatc
ttaatgtcactagacaggaccttgcatactcggttcgtgttctcactcaattcatgcagaaaccacgtga
agaacattggacagcggcactacgtgttgttcgttatctcaaaggaactccgggacaaggtgttctctta
cgagctaattctgatcttcacatatacgggtggtgtgactctgatttctctggctgccctcgttcgcgtc
gctctttaacagggtggtttgttcagctcggtcaatctccagtattatggcgaactaagaaacaaaaggt
tgttagtatgtcttcagctgaggccgaatatcgtgctatgtcacttactgcacgagagttggtttggttg
caagctttgcttgaagatcttgctgtgtttctctctcgcccaatgacgttgtattgtgatagtacagcag
ctatccatattgcggccaatccagttttctacgaacgtacgaagcatatcgaacgagactgtcattttgt
tcgcgaaaaagttcagaatggtactattgccacagttcatgtttcaacgactactcagcttgcggacatt
ttcactaaggcattgggaagtcgggagtttgatgtttttcgggacaagttgggcattgaggatctccatg
ctccaacttgagggaggg