;ID   ATCOPIA75_I DNA   ; ATH   ; 4609 BP
;XX
;DE   Internal region of ATCOPIA75 copia-like LTR-retrotransposon.
;XX
;AC   AC002391
;XX
;DT   26-OCT-2001 (Rel. 6.2, Created)
;DT   26-OCT-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   LTR-retrotransposon; COPIA superfamily; internal region; 
;KW   copia-like polyprotein; ATCOPIA75LTR; ATCOPIA75_I.
;XX
;OS   Arabidopsis thaliana
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
;OC   euphyllophytes; Spermatophyta; Magnoliophyta; eudicotyledons;
;OC   Rosidae; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1] 
;RA   Lin,X.
;RL   Direct submission to GenBank (March 2000)
;XX
;RN   [2] (bases 1 to 4609)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Direct submission (August 14, 2001)
;XX
;CC   ATCOPIA75 was originally identified by [1]. Its termini and
;CC   target-site duplication were determined by [2].
;CC   ATCOPIA75_I is the internal region of the ATCOPIA75 copia-like
;CC   LTR-retrotransposon. It is flanked by two identical ATCOPIA75LTR
;CC   long terminal repeats; the complete element is flanked by a 5-bp
;CC   target-site duplication (GTTAG).
;CC   ATCOPIA75_I encodes the 1496-aa ATCOPIA75p copia-like polyprotein.
;CC   ATCOPIA75p:
;CC   MSSASALVTSTSSSNTSSTTAYLINASDNPGALISSVVLKENNYAEWSEELQNFLRAKQKLGFIDGSIPK
;CC   PAADPELSLWIAINSMIVGWIRTSIDPTIRSTVGFVSEASQLWENLRRRFSVGNGVRKTLLKDEIAACTQ
;CC   DGQPVLAYYGRLIKLWEELQNYKSGRECKCEAASDIEKEREDDRVHKFLLGLDSRFSSIRSSITDIEPLP
;CC   DLYQVYSRVVREEQNLNASRTKDVVKTEAIGFSVQSSTTPRFRDKSTLFCTHCNRKGHEVTQCFLVHGYP
;CC   DWWLEQNPQENQPSTRGRGSNGRGSSSGRGGNRSSAPTTRGRGRANNAQAAAPTVSGDGNDQIAQLISLL
;CC   QAQRPSSSSERLSGNTCLTDGVIDTGASHHMTGDCSILVDVFDITPSPVTKPDGKASQATKCGTLLLHDS
;CC   YKLHDVLFVPDFDCTLISVSKLLKQTSSIAIFTDTFCFLQDRFLRTLIGAGEEREGVYYFTGVLAPRVHK
;CC   ASSDFAISGDLWHRRLGHPSTSVLLSLPECNRSSQGFDKIDSCDTCFRSKQTREVFPISNNKTMECFSLI
;CC   HGDVWGPYRTPSTTGAVYFLTLVDDYSRSVWTYLMSSKTEVSQLIKNFCAMSERQFGKQVKAFRTDNGTE
;CC   FMCLTPYFQTHGILHQTSCVDTPQQNGRVERKHRHILNVARACLFQGNLPVKFWGESILTATHLINRTPS
;CC   AVLKGKTPYELLFGERPSYDMLRSFGCLCYAHIRPRNKDKFTSRSRKCVFIGYPHGKKAWRVYDLETGKI
;CC   FASRDVRFHEDIYPYATATQSNVPLPPPTPPMVNDDWFLPISTQVDSTNVDSSSSSSPAQSGSIDQPPRS
;CC   IDQSPSTSTNPVPEEIGSIVPSSSPSRSIDRSTSDLSASDTTELLSTGESSTPSSPGLPELLGKGCREKK
;CC   KSVLLKDFVTNTTSKKKTASHNIHSPSQVLPSGLPTSLSADSVSGKTLYPLSDFLTNSGYSANHIAFMAA
;CC   ILDSNEPKHFKDAILIKEWCEAMSKEIDALEANHTWDITDLPHGKKAISSKWVYKLKYNSDGTLERHKAR
;CC   LVVMGNHQKEGVDFKETFAPVAKLTTVRTILAVAAAKDWEVHQMDVHNAFLHGDLEEEVYMRLPPGFKCS
;CC   DPSKVCRLRKSLYGLKQAPRCWFSKLSTALRNIGFTQSYEDYSLFSLKNGDTIIHVLVYVDDLIVAGNNL
;CC   DAIDRFKSQLHKCFHMKDLGKLKYFLGLEVSRGPDGFCLSQRKYALDIVKETGLLGCKPSAVPIALNHKL
;CC   ASITGPVFTNPEQYRRLVGRFIYLTITRPDLSYAVHILSQFMQAPLVAHWEAALRLVRYLKGSPAQGIFL
;CC   RSDSSLIINAYCDSDYNACPLTRRSLSAYVVYLGDSPISWKTKKQDTVSYSSAEAEYRAMAYTLKELKWL
;CC   KALLKDLGVHHSSPMKLHCDSEAAIHIAANPVFHERTKHIESDCHKVRDAVLDKLITTEHIYTEDQVADL
;CC   LTKSLPRPTFERLLSTLGVTDYVPST
;XX
;DR   Positions  91085  86477  Accession No AC002391   GenBank (rel. 124.0)
;XX
;SQ   Sequence 4609 BP; 1181 A; 1072 C; 927 G; 1428 T; 1 other;
ATCOPIA75_I
ttgtatcagagcaaaagctcttttgacctaaatttttttccaccgcagccttcgttgttctgtttcttat
tctacttctattttctcctattactatacctctactatacgatgtcttcagcatctgccttggtaacgag
tacgtcttcatctaacacgtcctccaccactgcatatcttatcaacgcatcagacaacccgggtgctttg
atctcttctgttgttttaaaagaaaacaactatgctgaatggtctgaagaactacaaaactttcttagag
ccaaacaaaaacttggcttcatcgatggatccatccccaaaccggcagctgatcctgaattaagtttgtg
gatcgccataaattctatgattgttggatggatccgcacatctatcgatccaacaattcgttctacagtt
ggttttgtttcagaggcatcacaactgtgggaaaatcttcgtcgtcgtttttcggttgggaacggtgttc
gtaagacactgttaaaagatgaaattgctgcttgtactcaagatggacaaccagttcttgcatactatgg
gcgtttgattaaactatgggaagaattacaaaactacaagtctggacgcgagtgtaagtgtgaagctgcc
agtgatatcgagaaagaacgtgaagacgatcgagttcacaagtttcttctcggtttagacagtcgtttca
gctccatccgatcttctattactgatatagaacctctacccgatctctatcaagtgtactctcgcgtggt
tcgtgaagaacagaatctcaacgcttctcgtactaaagacgttgtcaaaacagaggcgattgggttctca
gttcaatctagtactacacctcgttttcgtgataagtctactctattttgtacacattgcaatcgcaaag
gtcatgaggtcactcaatgctttctggttcatggatatcctgactggtggttagagcaaaatccccaaga
aaaccagccttctactcgtggtcgcggctccaatggtcgtggaagcagctccggtcgtggaggtaatcgt
tcttctgctcccactactcgaggtcgtggtcgtgctaacaacgctcaagcagccgctcccaccgtctccg
gcgatggcaatgatcaaatagctcagctcatctctctccttcaagctcaacgtcccagcagctcctctga
acgtttgtcgggtaacacttgtcttactgatggggttattgatactggtgcttcccatcatatgacaggg
gattgttcgattttggttgatgtttttgatatcactccttctccggttaccaaacccgatggcaaagcct
cgcaggctacgaaatgtggcacacttctcttgcatgactcttataaacttcacgatgtgttgtttgttcc
cgattttgattgcactttgatctctgtctctaaattacttaaacagacaagctcaattgcaatctttact
gacacattttgtttcttacaggaccgttttttgaggactttgattggggcgggggaagaacgtgagggag
tgtattattttaccggtgtattggcacctcgtgtacacaaagcttcgtcggattttgcgatctctggaga
tttgtggcatcgccgcctaggacatccttctactagtgttttgctttcgttaccggaatgtaatcgttct
tcacaaggttttgacaagattgacagttgtgatacttgttttcgttcaaaacaaactcgtgaggtttttc
ctattagcaataataaaacaatggaatgtttttctctaattcatggtgatgtatggggtccatatcgaac
tccatctacaacgggtgctgtttactttctcacattagttgatgactattctcgttccgtttggacgtat
ctcatgtcttctaaaactgaagtctcgcagctcattaagaatttttgtgctatgtctgaacgtcaattcg
gaaaacaagtcaaagcatttcgtactgacaatggaaccgaattcatgtgtttaacaccctactttcaaac
acacggcatacttcatcaaacatcatgtgttgacactccgcaacaaaatggtcgtgtcgaaaggaaacat
cgacatatcctgaacgttgctcgggcgtgtctatttcaaggtaaccttccggtcaaattttggggtgaaa
gtattcttactgctactcatcttatcaatcgcacaccatccgctgtgttaaaaggaaaaactccatatga
actcctctttggcgaaagaccctcatatgatatgcttcgctctttcggttgtttatgctatgctcacatt
cgcccgcgaaacaaggataaatttacttctagaagtcgcaagtgtgtctttataggctatccccatggca
agaaggcatggcgtgtttacgatctggaaaccggaaaaatatttgcaagtagggatgttcgatttcatga
ggatatctatccatatgcgactgctactcaatccaatgttcctctaccccctcctactcctccaatggtg
aatgatgactggttcttacctatctccacccaagttgattccactaatgttgactcttcatcctcatctt
ctcctgcgcaatctggatcgatcgatcagccacctagatcgatcgatcaatcaccttctacgtcgacgaa
tccagtcccagaggaaattggatcgatcgttccctcatcttctccttctagatcgatcgatcgatccaca
tctgacttgtcagcttcagatacgactgaattactaagtacaggcgaatcttctactccttcatctccgg
gtcttcctgagttattgggcaaaggttgtagagaaaagaagaagtctgttcttcttaaagattttgtcac
aaacactacatcgaagaagaaaacagcatctcataatatacactctccctcacaagttctaccctctggt
ctccccacttctctgtccgccgattcggtctctggtaagactctttatcctctctcagattttctgacta
actctggttattctgcaaatcatattgcttttatggcagcaatccttgatagcaatgaacctaagcattt
taaagatgctattttaattaaagagtggtgtgaagcaatgtctaaggagatagacgcactcgaagctaat
cacacatgggatattacagatttacctcatgggaaaaaggctatcagtagtaagtgggtttacaagttga
aatacaattcagatggaacacttgaacgtcacaaagctcgccttgttgttatgggtaatcatcaaaaaga
aggagtagacttcaaagaaacgtttgctcctgtggctaaattgactacagttagaaccattttggctgtc
gctgctgcaaaagattgggaggtccaccagatggatgttcataatgcatttttacatggcgatcttgagg
aagaagtttacatgcgactacctccaggtttcaaatgttccgacccttctaaagtgtgtcgccttcgcaa
gtccttgtatggtctcaaacaggctccccgttgttggttttctaagttgtcgaccgcacttcgtaacatt
ggtttcactcagagttatgaagactactccctattttctctgaaaaatggcgacacgattattcatgtcc
tagtctatgttgacgacctcattgttgctggtaataatcttgatgccattgatcgattcaaatcacagct
tcacaaatgttttcacatgaaagatctcggtaagcttaagtactttctcggtcttgaagtgtctcgtggt
ccggatggattttgtctctctcaacgcaaatacgcattggatatcgtcaaagaaactggtctgctaggtt
gtaagccttccgctgttcctattgctcttaaccacaaacttgcttccataaccggaccggtgtttactaa
tcccgaacaatatcgtcgcttagttggtcggtttatctaccttacyatcacgagacctgatcttagctat
gcggttcacattctttctcagtttatgcaagctccgttggttgctcattgggaggctgcgctccgtctcg
ttcgttatctcaaaggctcaccggcgcaaggcatttttctccgatcagatagttctcttatcatcaatgc
ttattgtgactctgattataacgcttgccctctcacacgaagatctctctcggcatatgtcgtttacttg
ggggactctcctatttcttggaagacaaagaaacaagacacagtctcctactcttcggctgaggccgagt
atagggcaatggcatatacgcttaaagaactgaaatggttgaaagctctactcaaggatttgggtgttca
tcacagctctcctatgaagttgcactgtgatagcgaggctgccattcatattgcggcaaatcctgtgttt
catgagcgcactaaacacatcgaatcggattgtcataaggttcgtgacgctgttctcgacaaactcatca
ctactgaacacatttatactgaagaccaggtcgccgatcttcttaccaagtcgctaccaagaccgacctt
tgaaagactcttgtccacgttaggtgttacggattacgtaccatcaacgtgaggggggg1