;ID   ATCOPIA43I  DNA   ; ATH   ; 7963 BP
;XX
;DE   Internal portion of the ATCOPIA43 copia-like endogenous retrovirus
;DE   - a consensus sequence.
;XX
;AC   .
;XX
;DT   31-AUG-2000 (Rel. 5.8, Created)
;DT   05-SEP-2000 (Rel. 5.8, Last updated, Version 2)
;XX
;KW   LTR-retrotransposon; endogenous retrovirus; COPIA superfamily; 
;KW   gag-pol; envelope; ATCOPIA43LTR; ATCOPIA43I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Viridiplantae; Charophyta/Embryophyta group;
;OC   Embryophyta; Tracheophyta; euphyllophytes; Spermatophyta;
;OC   Magnoliophyta; eudicotyledons; Rosidae; Capparales; Brassicaceae;
;OC   Arabidopsis.
;XX
;RN   [1]
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Molecular paleontology of transposable elements from
;RT   Arabidopsis thaliana.
;RL   Genetica 107 (1-3), 27-37 (1999)
;XX
;RN   [2] (bases 1 to 7963)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Direct submission (August 2000)
;XX
;CC   ATCOPIA43I is a consensus sequence of an internal portion
;CC   of the ATCOPIA43 copia-like endogenous retrovirus. Its long
;CC   terminal repeat is deposited as ATCOPIA43LTR. The consensus
;CC   sequence was derived from five copies that represent two
;CC   subfamilies. ATCOPIA43I copies are ~98% identical to the
;CC   consensus sequence.
;CC   This retrovirus is closer to SIRE1 from Glycine max than to
;CC   other copia-like endogenous retroviruses present in the
;CC   A.thaliana genome. ATCOPIA43I encodes a 1659-aa gag-pol
;CC   polyprotein (positions 144-5123) and a 475-aa
;CC   envelope-like protein (positions 6074-7501).
;CC   ENV:
;CC   MRSCLLNLAALTEGSFSFSATQTSKFKGFLSLKTQVTMNQEATSLTGVSGAGQPSAATPSDPPPPPVAPM
;CC   FIPKTEPGICMSKKTKSKVSAPTQRKRQCTAASAKSRSKAPAPALRRLSSRKSNRSASAAGQTEELVRDS
;CC   SDLQDGDVTEIAPPLFLSRYRENRRLLAVRDRKYPELKFPSETPYSSCFLTEEGLGRYKVIGNRCFNDMR
;CC   FLPLDGNNTASTQQLLFNAGLLPTVTEIDSYVHEVVMEFYANLPDGEEGDQLAYSVFVRGNMYEFSLAII
;CC   NQMFQLPNPSYPLDGMSEIQVPESMDEVAIALSNGKANSWKMLTSRLLSPELALLNKICCHNWSPTVNRS
;CC   VLKPERMTLLYMVAKALPFNFGKLIFDEIWACSNAVANPSSTLRLVLPNLIDQMLRFQRIVRSDEGDTTS
;CC   SAPLKFTMEVKPVPVLQSDSPTLDAALETLIASLTTMRVRLAGKPLSLNLHSVIF
;CC   GAG-POL:
;CC   MTQIIRGQGEDAWTAVEEGWEPPFDLTEDGFKITKPKANWTAEEKLQSKFNARAMNAIFNGVDEDEFKLI
;CC   QGCKSAKQAWDTLQKSHEGTSSVKRTRLDHIATQFEYLKMEPDETIVKFSSKISALANEAEVMGKTYKDQ
;CC   KLVKKLLRCLPPKFAAHKAVMRVAGNTDKISFVDLVGMLKSEEMEADQDKVKPSKNIAFNADQGSEQFQE
;CC   IKDGMALLARNFGKALKRVERGQNRDSTSWSNKDGERSRGRFSRSENDDSGKKKEIQCYECGGFGHIKPE
;CC   CPVTKRKEMKCLECKGVGHTKFECPNKSKLKEKSLISFSDSESDDEGEELLNFVAFMASSDSSKVMSDTD
;CC   SDCDEELNPKDEYRVLYDSWVQLSKDKLQLVKEKLTLEAKLANVSTEDKQKLSGITVDGNSQDYYQKKLD
;CC   CLQEECHRERDRAKLLERELNDKHKQIRMLNKGSESLDKILAMGRTDSQPRGLGYQGYTGKINKEEGRVI
;CC   NFVSGGSTSETVVRQSYTEPKKQVKSHVETKGESVVRTRMVGVICCDHCGKRFHMREQCYKFKEKVRTLW
;CC   NLAKCYIEPSRFCTVWIKKKDLYGDMEAERYNHKLERMYEEEEMHYASCCNQVNKEMQDEANLVCNWNLS
;CC   ELDDCTPQAKVAYTSAVSQDNRAWYFDSGCSRHMTGVQSVLNDFSVITNGKVTFGDGGKGSIKGKGKIEI
;CC   DDQPHLSNVYFVEGLTANLISISQLCDDDLTVTFTKTGCVALDIAGNNVLSGVCSGNNCYMWKDSEVCLS
;CC   AITSKLDLWHQRLGHMNTQSLVKIVNADVVRGIPKLEGSTSVVCKACSQGKQVKVQHKRATHIGTTSVLE
;CC   LVHMDLMGPVQTESLSGKKYILVLVDDYSRFTWVRFLREKSEAAESFKILALQLQTEKGNLVQIRSDHGG
;CC   EFQNEEFEKFCRIQGIRHQFSAPRTPQQNGVVERKNRTLQEMARAMIHGNNVSPRFWAEAVNTACYIVNR
;CC   VYVRPGTSTTPYEIWKGRSPNLCYFHTFGCVCYVLNDKDHLGKFDARSDEGIFLGYATNSMAYRVYNKRL
;CC   KRVEESVNVVFDDKHPTRLFTVEQDDDEQVEETRHSTSLVESGTSKDVEQTTEPRTSSSRLTVPKSHSET
;CC   DVIGELDGDRVTRGIKMNYRDMILFTCFVSSIEPNNIEIALEDEFWYQACHEELNQFSRHEVWDLVPRPV
;CC   HVNVVGTKWIFKNKTDEEGNVTRNRARLVAQGYSQVEGIDFDETFAPVARLESIRLLLGISCLLKIKLFQ
;CC   MDVKSAFLNGVIQEEVFVSQPKGFEDSNFPDHVYKLKKALYGLKQAPRAWYERLTLFLIEKGFKRGSVDK
;CC   TLFILVDEKDILIVQIYVDDIVFGSTKQKLVSDFVESMTKEFEMSMVGEMNYFLGLQIKQTDEGVHISQS
;CC   TYAKGLIQRFGMQTAKTSKTPMSATAKLSADEAGLSVDEKLYRGMIGSLLYLTASRPDLCFSVGVCARYQ
;CC   ANPKQSHLNAVKRILKYVKGTTDVGLFYSKQTNQNLVGFCDADWAGNLDDRRSTTGGCFFLGNNLVSWHS
;CC   KKQSCVSLSTAEAEYIALGSCCTQLLWMKQMLLDYGMTSNTLLVYCDNMSAINISKNPVQHSRTKHIDIR
;CC   HHFIRELVENKIVEISHVSSEKQLADIFTKPLDLNSFLNLKKSIGLSEY
;XX
;DR   [2] (Consensus)
;XX
;SQ   Sequence 7963 BP; 2317 A; 1465 C; 1896 G; 2285 T; 0 other;
ATCOPIA43I
attggtatcagagcgggtaaccaatctgagattagtatatctgaacaggttagatcctagttatggaacc
aacggatgttagtactggtacaggcaaggtattactgctagatactaagcggtatggatactggaaagtt
cgtatgacacaaatcattagaggccaaggtgaagatgcttggactgcagtagaggaaggatgggaacctc
cgttcgatctaacagaagatgggttcaaaatcactaaaccaaaggcgaattggactgcagaagagaagct
tcaatcaaagtttaatgcaagggctatgaatgctatcttcaatggtgttgatgaagacgagtttaagctt
attcaagggtgcaagtcagcgaaacaagcatgggatacgttacaaaaatctcatgagggaacttcgagtg
taaagagaacaagactggatcacattgctactcagtttgagtatctcaagatggaaccagatgaaacaat
tgtgaagttcagttcaaagataagtgctcttgcgaatgaggctgaggtcatgggaaagacctacaaagat
caaaaattggttaagaagctgttacgttgtctgccaccgaagtttgctgcccacaaagcagttatgaggg
ttgcagggaatactgataaaatatcatttgttgatcttgtgggaatgctcaagtcagaagaaatggaagc
tgatcaagacaaagtcaaaccgtcgaagaatattgcatttaatgcagatcaaggctctgagcagtttcag
gagattaaggatggaatggcactgctagcaaggaattttgggaaagctctgaaacgtgtggaaagaggtc
agaatcgtgacagtacatcctggagcaataaagatggagagagatcacgtggaagattctccagatctga
gaatgatgactcaggaaagaagaaggaaattcaatgctatgagtgtggtggttttggtcacattaaaccg
gagtgtccagtcaccaagagaaaagaaatgaaatgtcttgaatgcaaaggtgtgggtcacactaagtttg
aatgtcccaacaagagcaaacttaaagagaaatctcttataagtttcagtgattcagaatctgatgatga
aggtgaagagctgttaaacttcgtggcgtttatggcaagttctgactcaagcaaggttatgagtgacacg
gattctgactgcgatgaggaactgaatcccaaggatgaatacagagtgctatatgacagctgggtgcaac
taagtaaggacaagctgcagttggtaaaggaaaagctaactctagaagcaaaacttgctaatgtgagcac
agaggataagcagaaactgagtggaatcactgttgatggaaattcacaggactactatcaaaagaaactt
gactgtcttcaggaagagtgtcacagggaaagagatagagctaaacttctggaaagagagttgaatgaca
aacacaaacagatcaggatgctcaacaaagggtcagaaagtctagacaagatcttggcaatgggcagaac
tgattctcaaccaaggggtttgggttatcaaggttatacaggaaagatcaacaaggaagaaggaagagtc
atcaactttgtcagtggtggttcaacaagtgaaaccgtggtgagacaaagctatactgaaccgaagaaac
aagtgaagtcacatgttgaaacaaaaggagaatctgttgtgagaacaaggatggtgggtgtgatttgctg
cgatcattgtggtaagagatttcatatgagagaacagtgttacaaattcaaagagaaagtcagaacgctg
tggaatttagcaaagtgctacattgagccatcaagattctgcactgtgtggattaagaagaaagatctgt
atggtgatatggaagctgaaaggtacaatcacaagttagaaagaatgtacgaagaagaggaaatgcatta
cgccagttgctgcaatcaggtcaacaaagaaatgcaagacgaggcaaatcttgtttgcaactggaatctt
tctgagttggatgactgcactccacaagccaaagtggcttatacttcagctgtttctcaagacaacagag
cgtggtatttcgacagtgggtgttctcgtcatatgaccggtgtacaatctgttctaaatgacttttctgt
catcaccaatgggaaggtcacttttggagatggtggaaaaggaagcattaaaggaaaagggaagatcgag
atagatgatcaaccgcatctgtcaaatgtgtactttgttgagggactcactgctaatctaataagcatca
gtcaactgtgtgatgatgacttaactgttacgtttacaaagactggatgtgttgcactcgatattgctgg
caacaatgttctatcaggagtttgttcaggcaacaactgctacatgtggaaagactctgaagtctgtttg
tctgcaatcacatccaagcttgatctgtggcatcaacgacttggacacatgaatactcaaagcttggtca
agattgtgaatgctgatgttgtcagaggaattccaaaacttgaaggcagtacaagtgttgtctgcaaagc
ttgcagccaaggcaaacaagttaaagtgcaacataagagggccacgcatattggaactactagtgttctt
gaattggtacacatggatcttatggggccagttcagacagaaagcttgagtggtaagaagtatattttgg
ttctggttgatgactactctcggttcacatgggtgagatttctaagagaaaaatctgaggcagctgaaag
tttcaagattctggctttacaacttcagactgagaaagggaatcttgttcaaatcagaagtgatcatggt
ggagagtttcaaaatgaagaatttgagaaattctgcagaattcaaggaatcagacatcaattctctgcac
ctagaacaccacaacagaatggtgttgttgagagaaagaacagaactttacaagaaatggctagagctat
gattcatggaaacaatgtgtcaccaagattctgggctgaagctgtcaacactgcttgctacatcgtgaat
agggtgtacgtaagacctgggacaagcacaactccttatgagatatggaagggaaggtcaccaaatctgt
gctacttccatacttttggttgtgtgtgctatgttctgaacgataaggatcatcttggaaagtttgatgc
aagaagtgatgaagggatatttctgggttatgccacaaacagtatggcgtacagagtctacaataagaga
ttgaagagagttgaagaatcagttaatgttgtgtttgatgacaagcatcctacaagactcttcacggtgg
aacaagatgatgatgagcaagtagaggaaacaagacattcaacatctcttgttgaatctggaacttcaaa
agatgttgaacagactactgaaccacgtacatcatcatctcgcctcacagttcccaaaagtcactcagaa
actgacgttattggagagttggatggagacagagttaccagggggataaagatgaactacagagatatga
ttctgtttacctgttttgtgtcaagcattgagcctaacaacattgaaattgccttagaagatgaattctg
gtaccaagcttgccatgaagaactaaatcagttcagtcgtcatgaagtatgggatttagttccaagaccg
gttcatgtcaatgtggttggtactaaatggatttttaagaataaaactgatgaagaaggcaatgtgacac
gtaacagagctcgactggttgctcagggatactctcaggttgaaggaattgattttgatgagacttttgc
tccagtagctcggctggaatctattcgtcttcttcttggaatatcgtgcttgctgaaaatcaaactgttt
cagatggatgtcaagagtgcttttctgaatggagttattcaagaagaagtgtttgtgtctcaacctaaag
gctttgaagattcaaactttccagatcatgtgtacaagcttaaaaaggctctttatgggttgaagcaagc
acccagggcttggtatgagcgtctcactctgttcctaatagaaaaaggtttcaaacgtggaagtgtggac
aagactttgttcatccttgttgatgaaaaggacatcctcattgtgcaaatttatgtggatgatattgtgt
ttggaagcacaaaacagaaacttgtctctgactttgtggaatccatgaccaaagagtttgagatgagcat
ggttggagagatgaactacttcctgggtctacagatcaaacagactgatgaaggagttcatatctctcag
tctacgtatgccaaagggttgattcagagatttgggatgcagacagctaagacctcaaaaactcccatga
gtgctactgctaaactgtctgcagatgaagctggtctgagtgttgatgagaagctgtatcgtggaatgat
tgggagtcttctgtacttaacagcaagtagaccagatctgtgtttcagtgtgggtgtgtgtgcccgttat
caggctaatccaaagcagtctcatctgaatgctgtgaaacggattctcaaatatgttaaaggtacaaccg
atgttggtctgttttactctaaacaaactaatcaaaatcttgttggattctgtgatgctgattgggcagg
aaacctggatgatcgaagaagcacaacaggagggtgtttctttttgggaaacaacttagtttcatggcac
agcaagaaacaaagctgtgtatccttgtccactgcagaagctgagtacattgcactagggagctgctgta
ctcagttgctatggatgaaacagatgcttctggattacggtatgacatctaacaccctgcttgtgtactg
tgataatatgagtgcaattaacatatcaaagaatcctgtgcaacactctcgaacgaagcatattgacatc
agacaccatttcattcgtgagcttgttgaaaacaagattgttgagatctctcatgtttcctctgaaaaac
agttagctgatatatttacaaaaccattggatttgaacagtttcttgaatctgaaaaaatctattggtct
gagtgaatactaatgtgtcttggtgttaaagctattgcaataggttcctttgtacagagttgcattagct
ttgtgttttcatcaatatgtatatatatatatctattgtgtctgtctttctggcacttcatttcaggaaa
atagctgctcttcatgctcttcctcaatgaagtttccaaaaaggatgtgtgtgtggtgtgtaataataat
aaaaaaaaaggcaaaaagaaaaaaaaaaaggcctgaaggaatgaacaaatgaaagttcaaaaaaaaaaaa
aaacaaaggttgtgtgttggcaggaaagggattggtattctttcttctcttggggtatatatgtatctat
gagtaccccctcaaaacattgttcttcattacaggctctaccaatccccctcgccgtgttacccacacac
atcatgaggttgcttctttggagagtatgacgagtgtagactatcctgtcttgatcagtgtgtatggttc
atctcactgggccaggaagctgctattttctccatactaaaagtaagctatgcacctctgtactttggaa
tctgtccttctcgcttggatgcatttcgcatgtgggttccttgcctagtagaagtttctttaagctatat
agtatatatgtctatatcttaagattatgctatgttttatctttcagaggatgacttgatgagatctcac
attcagtttgaatcacagttcttgaatgacagtggctgttaatttcaagtgttgatttggttgaatttgt
ggttctatatggtttagtcctactttatgttgcaatcatattatgtacatatgttggctgtctactacgt
ctctgggtttctgttgtgatttgtttttgggcctggtataaggccctgtcgattgtttaacggatctggt
ttagtgttttgcagcctgtcttggttttcgtgctgtttttagggtttcccttgatgcgctcgtgcctttt
aaacctagcagctctcacagaaggctcattctccttctctgctactcaaacctcaaaattcaaagggttt
ctttctctaaaaacacaagtcaccatgaatcaagaagcgacctcgttgactggagtctctggtgctggtc
agccttccgctgcgactccttctgaccctccaccgcctccggttgctccaatgttcattcccaagactga
accaggcatctgcatgtctaagaaaaccaagtccaaggtctctgctccgactcagcgcaagagacagtgt
acggctgcatctgccaagtccagatctaaagcgcctgctcctgctctccgacgtctgtcaagcaggaaat
ccaaccgatctgcctctgctgctggacaaactgaagaacttgtccgtgactcttcagatctgcaagatgg
tgatgtcaccgagattgctccgcctctctttctgtctcgttaccgagagaaccgtcggcttcttgctgtt
cgtgatcgtaagtatccggagctcaaattcccatcggagactccctacagtagctgttttctgactgagg
aaggtcttgggcgctacaaggtcatcgggaatcgctgcttcaacgacatgcgtttccttcctctggatgg
aaacaacactgcgagtactcagcaacttctcttcaatgctggtctgcttcctactgtcaccgaaattgac
tcttatgtgcatgaggttgtcatggagttctatgcaaaccttcccgatggtgaagaaggtgatcaacttg
cctactccgtgttcgtcagaggcaacatgtatgaattctccctggctatcatcaatcaaatgtttcagct
tcctaatccttcctatccgctggatggtatgtctgagattcaagttcctgagtccatggatgaagttgcg
attgctctgtcaaacggcaaagccaactcctggaagatgctcacttctcggctgttatctcctgagctag
ctctgctcaacaagatctgctgtcacaattggagtcccactgtgaaccgctctgttctgaagccggagag
gatgactctgctttacatggtggccaaagctctaccgttcaacttcggcaaactcatctttgatgaaatc
tgggcgtgttcgaatgcagttgcgaacccctcctctactctccgacttgtgcttccgaacctgattgatc
aaatgctgcgctttcagcgcattgtccgctctgatgaaggagataccacttcgtctgcaccgttgaagtt
caccatggaagtgaaacctgttcctgtgctacagtctgacagtccaaccttggatgcagctctggagacc
ctcattgcatctctcacaacgatgcgtgttcgtctggcaggtaagcctctctcactcaacttacactctg
ttatcttttgagaatggttttttatatgagtgtgcactcaagcagggggagagtattctgatgtgtcgta
ctctgttccggatgatgtgggagatgagaatgtggaggaagaggacgatgatgatgagttagatgacaac
acttaattgcttagtcctggtcgttttttagttgtgcattgtgttagggggagtttgtccctgattgttt
ttgtgtgtttgtgaaaactccttgatcttgttctttatcggcttcttatcggatcttctgtgtgtttaat
gattggtcttatgtatggaaccaagaactcatttctgtttcatgtttaaatggttatgtcttttggcagt
agttgcttaaacttgagcctttggtttctatgtttggtatggaatattgcttggcttgtatgtgtggtct
ctactcttttgcagattacttgaggtgtcacactcagataaaaagggggagat1