;ID   ATHILA4C_I  DNA   ; ATH   ; 8639 BP
;XX
;DE   ATHILA4C_I is an internal part of the ATHILA4C endogenous
;DE   retrovirus - a consensus sequence.
;XX
;AC   .
;XX
;DT   31-MAY-2001 (Rel. 6.2, Created)
;DT   31-MAY-2001 (Rel. 6.2, Last updated, Version 1)
;XX
;KW   Gypsy-like endogenous retrovirus; ATHILA superfamily;
;KW   protease; reverse transcriptase; integrase; ATHILA4C_LTR; ATHILA6B_I; 
;KW   ATHILA4C_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC   Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC   Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1]  (bases 1 to 8639)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Direct submission (May 2001)
;XX
;CC   ATHILA4C_I is an internal part of the ATHILA4C endogenous 
;CC   retrovirus. There are ~10 copies of ATHILA4C_I in the genome;
;CC   they are ~94% identical to the consensus sequence. The consensus
;CC   sequence was derived from 10 proviral copies. These copies are
;CC   flanked by ~96% identical LTRs (ATHILA4C_LTR). They have generated
;CC   5-bp target site duplications upon their integration into the
;CC   genome. ATHILA4C_I sequences can be split further into 
;CC   minor subfamilies.
;CC   ATHILA4C_I encodes two proteins: 1772-aa ATHILA4C1p (positions
;CC   155-5473) and 499-aa ATHILA4C2p (positions 5863-7362).
;CC   ATHILA4C1p is composed of the retroviral gag-like, protease, reverse 
;CC   transcriptase and integrase domains. 
;CC   ATHILA4C1p:
;CC   MADFHDCVDRHHQGVDRHPLQNQTATIGDFNRTDLFYTNRSAFKLPPFERDDFAIHPAYYDLVSRGKFRG
;CC   APDESPLDHLEVFEDIVSSIKAEGVPADYLLCKLFPHSLASRGTSWLRQLEPGSLTNWTDTKNAFMNHFF
;CC   DESVTEAIRLQISSFTQAPAESLRASWLRFRSYQRECPQHGFQESQLINLFYKGIDKPYQNQLDAASSCN
;CC   FMTRTTSEALLLITNALTCLSTQEFDKERRISAEIATASKETPVSAISAPIQAPPPPSSETRMESMLAQL
;CC   LAGLTKLDTKYESLSTKLDSKYDSLSTDLNSKIDNLRSQFSNLSPTSASINAVTLRSGKQLNPILQRERS
;CC   AQPSSFPIAENESVSIDTPGCRSTPITLDDSVFPLSSGIDNFAEEEETIPDGVDRHPAPVDRHPARSDNV
;CC   QIPAATKSANRRIPFPKSPKKSRQALDDVRCKAMIDKLIVEMPLVEAIHLSPTIRRYVKTMVTKNLTKEC
;CC   SVMMISEQGSDIIQERIPRKLPDPGTFVLSVTINHDSFPRALCDLGSSVNLMPRSVAMRLGYSNLEPTFI
;CC   TLVLADRSTRIPDGILIDVPVMIGKSMIPTDFVVLPYEKEPKDPLILGRSFLHTAGAIIDVRQGRIGLNV
;CC   GDLTMQFDMNTLVKKPIIEGKTFLIDSFTSSASDSISEMELEDPLERVLVSSIEDSADLDSETSTYTKLL
;CC   DETEHVMQLTVEEALPSVTSTPTTTSDWDPAKAPKIELKPLPAGLRYAFLGENSTYPVIVNASLNPAELT
;CC   LLLSKLRNHRKALGYSLDDIAGISPDVCMHRIHLEDESKSSVEHQRRLNPNLKEVVKKEIMKLLEAGIIY
;CC   PISDSSWVSPVHVVPKKGGVTVVKNEKDELIPTRTITGHRMCIDYRKLNAATRKDHFPLPFIDQMLERLA
;CC   NHKYYCFLDGYSGFFQIPIHPDDQEKTTFTCPYGTFAYRRMPFGLCNAPATFQRGMMSIFTDMIEDIMEV
;CC   FMDDFSVYGSSFEDCLENLYKVLARCEEKHLVLNWEKCHFMVQDGIVLGHRISEHGIEVDRAKIEVMTSL
;CC   QALDNVKAVRSFLGHAGFYRRFIKDFSRIARPLTALLCKEVKFEFTQECHDAFQQIKQALISAPIVQPPD
;CC   WDLPFEVMCDASDFAVGAVLGQRKDKKLHAIYYASRTLDDAQRNYATTEKELLAVVFAFEKFRSYLVGSK
;CC   VIVHTDHAALKYLMQKKDAKPRLLRWILLLQEFDIEVRDKKGVENGVADHLSRIRIDDDVPINDFLPEEN
;CC   IYMIDTAEEDDYKRDRLQNRASVSIDTPIMSIDTHISEEVDIRSCAMVSIDTIAPVDRHPSESTRNWSPT
;CC   ENCAVTAVEKDYPWYADIVNYLAADVEPDNFTDYNKKRFLREIRRYYWDEPYLYKHCSDGVYRRCIAATE
;CC   VPDILSHCHSSSYGGHFATFKTVSKVLQAGFWWPTMFRDAQKFISQCDPCQRRGKISKRNEMPQKFILEV
;CC   EVFDCWGIDFMGPFPPSNKNLYILVAVDYVSKWVEAIASPKNDSAVVMKLFKSIIFPRFGVPRIVISDGG
;CC   KHFINKILAKLLLQYGVQHRVATPYHPQTSGQVEVSNRQIKEILEKTVGKAKKEWSYKLDDALWAYRTAF
;CC   KTPLGTTPFHLLYGKACHLPVELEHKAAWAVKMMNFDIKSAGERRLIQLNELDEIRIHAYDNSKLYKERT
;CC   KAYHDKKILTRTFEPNDQVLLYDSRLTIFPGKLSSRWTGPYTVHSVRPYGTVILKNNNGKPFAVNGQRVK
;CC   HYWAEAEIPVEKPLDLQDPPVD
;CC
;CC   ATHILA4C2p:
;CC   MTGQKKKARKSVGPSSSSAVPHHRRRFSTAAGLRPSQPESQAAAPQPPIDRFPWPKLPKERIPSQRVWEK
;CC   DVNREFTKGDYINRPFSPDWDDYDTLFYNAWMSVEILPTRFADFGLMQRLQIEKSVLGLLDDIGLGTICC
;CC   RQYDLYPELVKQFMASVRVSYVNDRKRNAQEGALIFFIRGVRYSLPLRDLCDIYGFDNDLTGVSLPGQFK
;CC   DSQIFWSRFGNGIYDSKDAVHSEIRHPVLRYLVRLISSTLLCKMEPGKMRLSELLLLYHALHDFFPDSLG
;CC   FEQVDRNVNFGAVFAHHLVSLKTKPFTGRGQKSERVGSLLTPIFEHFRISFEGEEVNTTRVTMDETYLKN
;CC   SHWLKGNLLWCFRDDTGQHMIQLPRPALTEITGEHEEIGFHPDPSLLHAAPRTRRQRGSASGSAPTQTED
;CC   EFIDPAGGPRVGSSSSALPYQLPSPPPIPMEPQVFQQYVVDSFKSVWNAIATLSRCGCVAPTRRRRRSPA
;CC   PTSGSEHED
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 8639 BP; 2426 A; 1875 C; 1844 G; 2494 T; 0 other;
ATHILA4C_I
aatttggcgtcgttgccgaagttctcttttagttcaccattagactaggtgtcattttaagtctagatac
ttcttttctatcacctattcacgtgtttctttgtcttttggtctcttgtgtttcaggtaccttcagattg
agaatcacgtttccatggcagactttcacgattgtgtcgaccgacaccatcagggtgtcgatcgacaccc
tctgcagaaccagacggctacaataggcgattttaacagaactgatctcttctacacaaaccgatcagca
ttcaagctacctccatttgaaagagatgactttgctatacaccctgcctattatgatcttgtatctagag
gaaagtttagaggagctcccgatgaatcaccattagatcatttggaagtcttcgaagacatcgtttcctc
tattaaagcagaaggtgtaccagctgattaccttctttgcaaactcttcccccactctctcgcgagcaga
ggaacatcatggttgcgtcaactggaaccaggttccttaaccaattggactgatacaaagaatgcattca
tgaatcacttcttcgatgaatccgtgacagaggcaattcgtttgcaaatttcttcgttcactcaagctcc
agccgagtcccttagagcatcttggctccgattcaggtcttatcagcgtgaatgtccccaacatggcttc
caagagagccaactcatcaacctattctacaaaggaatagacaagccataccagaatcaacttgatgcag
caagctcctgtaacttcatgacaagaaccactagtgaagcattactcctcatcaccaatgctttaacttg
tctttcgacgcaggaatttgacaaagaacgaagaatttcagctgagatagccaccgcgagcaaagagact
cctgtttcagcaatttctgctcctatccaagccccccctccaccctcatcagagaccagaatggagtcta
tgcttgcgcagcttcttgcaggcctaacaaagcttgacaccaaatatgaatctctctccacaaagcttga
tagcaaatatgattctctctccacagatctcaatagcaagattgacaatctacgctctcaattctccaat
ctctcacctacatcagcttctatcaatgcagtcacactacgcagcggaaagcagctcaacccaatacttc
agcgcgaacgatcagctcagccttcatcttttccaattgcagaaaacgaatcagtgtcgatcgatacacc
agggtgtcgatcgacaccaattactcttgacgactctgttttccctttgtcgagtggaatcgacaatttt
gcggaggaggaggaaactattcccgatggtgtcgatcgacacccagctcctgtcgatcgacacccagctc
gttcagacaatgtgcaaattccagctgcaaccaagtcagcaaatcggcgaattcccttccccaagagtcc
taaaaagtcaagacaagccttagatgatgtcagatgcaaggctatgattgataagctgattgttgagatg
cctctagttgaagctattcatctatctcccacaatcagaaggtatgttaagacaatggtcactaagaatt
tgacgaaagaatgcagtgttatgatgatttcagagcaaggcagcgacatcattcaggaacgaatcccaag
aaagctacctgatcctggaacttttgtacttagtgtcaccattaatcacgattccttcccaagagcttta
tgtgatctcggttctagtgtgaatctgatgcctcgttcagttgctatgcgtcttgggtattccaatcttg
agcctacctttatcactcttgtgttggcagatcgttctacccgaattccagatggaattcttattgatgt
tcccgtgatgattggaaagagcatgattcctacagactttgtagtcttgccttatgagaaggaacctaag
gatccattaatccttgggagatccttcttgcacacagctggagcaatcatagatgttcgacaagggagga
tagggctgaatgttggcgatcttactatgcagttcgatatgaacactttggtcaagaaaccgatcataga
agggaaaaccttcttgatagattctttcacttcatcagcttcagatagtatttcagagatggaattagaa
gacccactagagcgagttttggtctcttccatagaggatagtgcagatttggacagtgaaacttctacat
atactaagctgctagatgagaccgaacatgtgatgcaacttacagttgaagaagctcttccatctgtgac
ttcaacgccgactactacttcagattgggatccagcaaaggctcctaaaatcgaattgaaaccacttcca
gcagggctaaggtacgcttttcttggtgaaaattctacttatcctgttattgtgaatgcttctcttaatc
ccgcagagctcaccttattgctaagcaagctgcgcaatcatcgcaaagctcttggctattctcttgatga
cattgcaggtatatctccagatgtatgcatgcataggatccaccttgaggatgagtctaagtcttcagtt
gaacatcagagaaggctgaatccgaatctgaaagaagtggttaagaaagagattatgaaactgttggaag
cagggattatctatccaatttcagatagcagttgggttagtccagttcatgtggttcctaagaagggagg
tgttacagtagtcaagaatgagaaagacgagctgattcctactcggacaatcacaggacatcggatgtgc
atcgattacagaaagctgaatgctgctaccaggaaagaccatttccccttaccatttatcgatcagatgt
tggagaggttagcaaatcataagtactattgcttccttgatggatactcaggattctttcagatcccgat
tcatccagatgaccaggagaaaacgactttcacctgcccctatggtacatttgcttatcggagaatgccc
ttcggtctttgtaatgctcctgcaacatttcagagaggtatgatgtctatcttcacagacatgattgagg
atatcatggaggttttcatggatgatttttcagtttatggatcatcgtttgaggattgcttagagaatct
ctacaaagtgttagcaagatgtgaggagaaacatctagttttgaattgggagaaatgtcacttcatggta
caggatggaatagttctcggacacaggatttctgagcatggtatagaagttgatagagctaagattgagg
tcatgacaagtcttcaagcgcttgataatgttaaagcagtgaggagtttccttggacatgctggtttcta
caggagattcatcaaagacttcagcagaatcgcaagaccattgactgctttactctgtaaagaagtcaaa
ttcgagtttacacaagagtgtcatgatgcctttcagcagataaaacaagccttgatcagcgcaccaattg
ttcagccaccagattgggatttaccttttgaggtaatgtgtgatgcgagtgattttgcagttggagctgt
tctaggacagaggaaggataagaaacttcatgccatctactatgcaagtagaactcttgatgatgctcaa
aggaattatgcaactacagagaaggagttgttagctgtggtgtttgctttcgagaaattcagatcttatc
tcgttggatccaaggttattgttcacacagatcatgctgccttgaagtatttgatgcaaaagaaggatgc
taagccaagacttttgagatggattttgcttcttcaagaatttgacatagaggttagagataagaaagga
gttgagaatggtgtagctgatcatttgtcacgcattaggatcgatgatgatgtccctataaacgatttct
tgcctgaagagaacatatacatgattgatacagctgaagaagatgactacaaacgtgacaggttgcagaa
tcgagcttcagtgtcgatcgacactcccattatgtcgatcgacactcacatttcagaggaagttgatatt
cgtagttgcgcgatggtgtcgatcgacaccattgcacctgtcgatcgacacccttctgaatcgacaagaa
attggtcaccaactgagaattgcgccgtcacagcggtcgagaaagattacccatggtatgctgacattgt
taattacctagctgcagatgtggaacctgataatttcacagattacaacaagaagagattcctaagagag
attaggcgatattattgggatgaaccttatctctataagcattgttctgatggagtttacaggagatgca
ttgctgcaacagaggttcctgatatactatcgcattgtcacagctctagctatggtggtcattttgcaac
cttcaagacagtatccaaagttcttcaagcaggcttttggtggcctacaatgtttcgggatgctcagaag
ttcatatcgcaatgtgacccttgtcagagaaggggaaagatcagcaagcgtaatgagatgcctcagaaat
ttatactcgaagtcgaagtcttcgattgttggggtatagatttcatgggaccattcccaccttccaacaa
gaatctctacatcctagtagctgtggattatgtctccaaatgggtagaagctatagctagtccgaagaac
gattcagcggttgtcatgaaactcttcaagtctatcatcttccctcgttttggagtgccacgcatagtca
ttagtgacggaggtaagcatttcattaacaagattcttgcgaaattgcttttacagtatggagtccagca
tcgggttgctactccctaccatccacaaacgagcggccaagttgaagtttccaacaggcaaatcaaagag
attcttgagaaaacagtgggtaaagcgaaaaaggagtggtcctacaagttagatgatgcactgtgggcct
acaggacagctttcaaaactccgcttggtaccacaccttttcatcttctatatggtaaagcttgtcatct
tccagtggaattagagcacaaagcagcttgggcagtcaagatgatgaatttcgacatcaaatcagctgga
gaaaggagacttattcagttgaatgagcttgatgaaatacgaattcacgcctatgacaactcgaagctct
acaaggaacgcacaaaagcttaccacgacaagaagatcctcactcggacctttgagcctaatgaccaggt
acttctttatgattctagattgacaatatttcctggaaaattgtcttctcgttggacaggtccctacaca
gtccactcggttaggccatatggaacagtcattctcaaaaacaacaatggaaaaccatttgctgtgaatg
gacagagagtcaaacactactgggctgaagccgagattccagtagaaaagcctttggatcttcaggaccc
acctgttgattaagcaactgcaaagtcaagctattgactataaacaagcgcttagtgggaggcaacccac
tggtaagtatttctcctttttatttatttcattcatttgattcttaacttaggatttttgattttatagg
actattcagattaaaaaaaaaaaaaaaaaaaaaacgtgtcgaccaaaaacgcgatggtgtcgatcgacac
cctcattttccaggtaagtcgttttaacctaaaacttcatcttcttcatttcatttctctctaaaaccgc
cgaattcactaaaacctctctagatccttcgattcttcaccgtttcttgtgatttcaggctctaaactca
ttagaatctcatcctctatcttccccaagtgtcaattccatcgatctgcaagatgacagggcagaaaaag
aaggcgaggaagagcgtcggaccttcctcctcctccgccgtccctcaccaccgccggcgcttctccaccg
ccgccggtctccgcccttcccaaccggaatcccaagctgcagcccctcaacctcctattgatcgatttcc
atggccaaagcttccaaaagagagaattccttctcaaagggtttgggagaaagatgtcaacagggaattc
actaaaggtgactatatcaatcgccctttttctccagattgggatgattatgatactctgttctataatg
cctggatgagtgtagaaattttgcctactcggtttgcggatttcggtttgatgcagcgtttgcagataga
gaaatctgttttaggactgctagatgacataggtttgggaaccatctgttgtaggcagtacgatttgtat
cctgagctggttaaacaattcatggcttctgttagggtttcttatgtgaatgacaggaaacgcaacgctc
aggaaggagctcttatcttcttcattcgcggtgtcagatacagtttaccactgcgagatttgtgtgacat
ctatgggtttgacaatgatctcactggagtttccctgcctggtcagtttaaggattctcagattttctgg
agcaggtttggtaatggaatctatgattccaaggacgcagtacactcagagatccgtcaccctgttctgc
gatatttggttaggctgattagtagcacactgttgtgcaagatggagcctggcaagatgagactttctga
gttgttattgctctatcacgctcttcatgacttctttccagacagccttggatttgagcaggttgaccgt
aatgtcaactttggtgctgtgttcgctcatcacttggtttctctcaagaccaagccctttacaggtagag
gacagaaatctgagagagttggcagtcttctcactcccatattcgagcatttccgcatcagctttgaggg
cgaagaagtcaacaccacgcgtgtcactatggatgagacttacctgaagaactctcactggcttaaaggc
aacttactgtggtgcttcagagatgatacaggtcagcacatgattcagttacctcgccctgcacttacag
agattacaggagagcatgaggagataggctttcatcccgatccttctcttctgcatgccgcaccacgcac
caggcgacagagaggttcagcttcaggatctgcgcctacacagaccgaggacgagttcattgaccctgct
gggggtccgagagttggatcttcctcatccgctcttccatatcagctgccttctcctccacccatcccga
tggagccacaggtctttcagcagtacgtcgtcgacagtttcaagagcgtctggaacgccatagctacact
ttctcgctgcggttgtgttgctcccactcgtcgtcgtcgtcgctctcccgcacccacctccggctcagag
catgaggactagccatcctccttatcttgctttgttttctgactttgtttttgctttatttcagactatt
ggttatttgttttgaaacttgttgcacttattttcttatgctttgttggaactattgatgtgtgttttag
tgatttaacagtctaactggaggatccatgtgaaaacactaaccaaggctctacaacaatgagcagaatt
cacaaagctaagcgaaacgaccagagtgtcgatcgacaccaacatggtatcgatcgacaccactctcagg
tggcaactcacgttttctaactttttaacttttgattatttctctttcgattcctctagcttaataactc
tggggacagtgttatctaagtctgggggagtcagttactaacttcatcttttcttttcttatgagtcagt
tgtgtcattttaacttagatagtcgttttattgagtcaaaacttttgtctgaaattatgatctatctctc
gagtttattgcctggattgcttaagtactgcagattcaaatcaatccgactaatggaaaatccagaaagc
caatcaacaaacttaaggacatccattcccaacataatccacaaaggctggagctaatctcactgctctt
ccttttacctcatcaaaacggttagtgctcataggacttgactccttgttcatacctgataagtcgacta
atgtgttaaaaagaaagtgttcctctccccaaataaaaaaaaaaaaataataataaataatgagagattg
agaagactttcagatagtgtataggggtaggaatgtttcctatgaccccattattatacattatttggaa
cgatcaattacaaaggttggaaaaatgccgagggtgtcgatccagtatgcgagttcccctttgtttactc
tctaaaaagaataagtctgggggagagaaagaacaccaaagaaaaaaatatatatataaagaatggtcga
tttatcatggtatagtacaggaatgagtctagaaggttcttaagcatttacttgattgatgagacccagc
gatgatagcctagtggatatagttgagtgctttggtatggaatctaggatgttgtgtgggctagcggaga
attacatggtggtcaggatttgattagaactgtttagtgcatggatttggtttgtatgatcaaggtaata
aactagagagattgggagtttctatgtttttcaaacctctttcacagattcaaccttttgttttgcttga
gggcaagcaaaagctaagtctgggggagt