;ID   ATLANTYS2_I DNA   ; ATH   ; 9311 BP
;XX
;DE   ATLANTYS2_I is an internal portion of the ATLANTYS2 endogenous
;DE   retrovirus - a consensus sequence.
;XX
;AC   .
;XX
;DT   26-FEB-2001 (Rel. 6.1, Created)
;DT   26-FEB-2001 (Rel. 6.1, Last updated, Version 1)
;XX
;KW   Gypsy-like endogenous retrovirus; ATLANTYS superfamily; gag; RT; 
;KW   RNase H; integrase; ATLANTYS2_LTR; ATLANTYS2_I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
;OC   Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
;OC   Magnoliopsida; Capparales; Brassicaceae; Arabidopsis.
;XX
;RN   [1]  (bases 1 to 9311)
;RA   Kapitonov,V.V. and Jurka,J.
;RL   Direct submission (February 2001)
;XX
;CC   ATLANTYS2_I is an internal portion of the ATLANTYS2 endogenous 
;CC   retrovirus. There are several copies of ATLANTYS2_I in the genome;
;CC   they are ~97% identical to the consensus sequence. Long terminal
;CC   repeats from ATLANTYS2 are deposited in Repbase Update as
;CC   ATLANTYS2_LTR. ATLANTYS2 has generated 5-bp target site duplications.
;CC   ATLANTYS2_I encodes a 1867-aa polyprotein, ATLANTYS2p1, composed
;CC   of gag, protease, reverse transcriptase, RNase H and integrase 
;CC   domains.
;CC   ATLANTYS2p1 (CDS 73-2388 and 2476-5763, predicted by FGENESH):
;CC   MVNNQIPGNTVDAEGNPIPPIQTDVPEAAAPATLAELRSMMAQLQQKVNDQEQANRSLAQ
;CC   QLEAATSQGQIRTTRFGARHLQDRRAAADLNPTRLVFHTPGNTTRPVRRTAPEIGRDRTE
;CC   PAILGNRETNRTERNEPQLPPPRAEVAEADQIGVSDDEDSEENIRWAEEYAREQEISAIK
;CC   LSLAKAENEMKLVRSQMHNAVSSAPNIDRILEESHNTPFTHRISNAIISDPGKLRIEYFN
;CC   GSSDPKGHLKSFIISVARAKFRPEERDAGLCHLFVEHLKGPALDWFSRLEGNSVDSFQEL
;CC   STLFLKQYSVLIDPGTSDADLWSLSQQPNEPLRDFLAKFRSTLAKVEGINDVAALSALKK
;CC   ALWYKSEFRKELNLSKPLTIRDALHRASDYVSHEEEMELLAKRHEPSKQTPRIDKSQPSA
;CC   PNHKKGAQGGTFVHHEGRNFSGAHNYQADTPRGEAARGRGRGRGRGRGRESYTWTKDQPA
;CC   GNEQEYCELHKSYGHHTSRCRSLGAKLAAKFLAGEIGGGLTIEDLEAEKGKTEQVNAVAN
;CC   PEQAAPAANPEGPKRGRGNREADDDEPEAARGRIFTILGDSAFCQDTAASIKAYQRKADA
;CC   NRNWARPFNGPNDEVTFHESDTNGLDRPHNDPLVITLTIGDFNVERVLVDTGSTLDIIFL
;CC   TTLREMKIDMTQIVPTPRPVLGFSGETTMTLGTIKLPVRAKGVTKIVDFSVTDQPTVYNA
;CC   IIGTPWLNQFRAVASTYHLCLKFPTSDGVKTIWGNQKNARICFMAAHKLRNPLKIEEARE
;CC   STTPTPDPVILICLDDEKPERCVEIGGDLGEELTAELTAFLKENVNTFAWSPEDLPGVSV
;CC   DIVSHELNIDPTFKPIKQKRRKLGRERAEAVKAEVEKLLRIGSITEAKYPDWIANPVVVK
;CC   KKNGKWRVCVDFTDLNKACPKDSFPLPHIDRLVESTSGNKLLSFMDAFAGYNQIMMNPED
;CC   QEKTAFYTEQGIFCYRVMPFGLKNAGATYQRFVNKIFALQIGKTMEVYIDDMLVKSMAEK
;CC   DHISHLRECFKQLNLYNVKLNPAKCRFGVRSGEFLGYLVTHRGIEANPKQIEALLGMASP
;CC   QNKREVQRLTGRVAALNRFISRSTDKCLAFYDVLRGNKKFEWTTRCEEAFQELKKYLATP
;CC   PILAKPVIGEPLYLYVAVSDTAVSGVLVREDRGEQKPIFYVSQTFTGAESRYPQMEKLAL
;CC   AVVMSARKLRPYFQSHSIIVMGSMPLRAILHSPSQSGRLAKWAIELSEYDIEYRNKTCAK
;CC   SQVLADFIVELPTKEARENPLDTTWLLHVDGSSSKQGSGVGIRLTSPTGEVLEQSFRLNF
;CC   EATNNVAEYEALVAGLNLARGLKIGKIRAFCDSQLVANQFNGEYTARDEKMEAYLIHVQN
;CC   LAKNFDEFELTRIPRGENTSADALAALASTSDPSLRRVIPVEFIEKPSIELGEEEHVLPI
;CC   QISADQDDPDDCSSEWMEPIISYISEGKLPSDKWKARKLKAQAARFVLVDEKLYKWRLSG
;CC   PLMTCVEGEAICKIMKEIHGGSCGNHSGGRALAIKIKRHGFFWPTMIKDCENFSKRCKKC
;CC   QRHAPTIHQPAELLSSIASPYPFMRWSMDIIGPMHPSKQKKLVLVLTDYFSKWIEAESYA
;CC   SIKDAQVENFVWKHILCRHGIPYEIVTDNGSQFISTRFQGFCDKWGIRLSKSTPRYPQGN
;CC   GQAEAANKTILDGLKKRLDAKKGSWSDELEGVLWSHRTTPRRATGETPFALVYGTECIIP
;CC   AEMIVPSLRRSLSPENTPDNTQRLLDELDLIDERRDSALVRIQNYQNETARHYNSNVRQR
;CC   RFHEGDRVLRKVFQNTAEPNAGKLGTNWEGPYLISKVIRPGVYELADLSGKAVPRSWNAM
;CC   HLRKYYN
;CC   ATLANTYS2_I also encodes a second protein, ATLANTYS2p2, located
;CC   in the opposite orientation at the position usually occupied by
;CC   the env proteins in typical retroviruses.
;CC   ATLANTYS2p2 (781 aa, CDS 8837-7874, 7797-7130, 7061-6348):
;CC   MSSSQSPSTPSASLVDSSDSNHPDDLPPIYKRRSVWTSSEEDAVSSSNAPEQTTPFTARE
;CC   DTNADIARELDLPDDPEPPLVRRSFAPMADEAGTSNWQDVPEPFMPTVKIEDFLYFGPNE
;CC   TEDILRLNEQKAFEKAEKKKRKKNKKVIMPDPPGSTLCTERSLSDLRARFGLGAVTLRVP
;CC   SPDERADNPPAGFYTLYEGFFYGCFLWLPIPRLVLEYVTSYQIALSQITMRSLRHLLGIL
;CC   IRSYESETEITLAHLRNFLEIRRVPKSEVDRYYISPAKGKKIIDGFPSKDEPYTDHFFFV
;CC   AIEDAVHEDLLGTVLTRWGILERTLKFLEPIPDDFLSAFHALSARKCDWLKHFSRERVER
;CC   ALRLLHGVSCPTSSESSDHRTQFFVDMQSTKLTLREVYAKKKEDKERRLAEEKRLVDAGL
;CC   ISPRAAPEATQDGNVIPDAAAPVDAAPAEAQEAEPSAAAPEAVVALPASDKAAGKRVRVD
;CC   DESSKKKKKKKKTSGSEAEKVLPIFEDRIASANLLGGCVGPLLPPPDTLLESRKYAETAS
;CC   HFLRAVASMNRMVHSYDSAMRSNMEVAGKLAEAESRIQAAEREKNEALSEAAAAKLEREE
;CC   VERMAFVNKENAIKMAEQNLKANSEIVRLKRMLSEARGLRDSEVARAIQTTRREVSETFI
;CC   AKIKTAEHKVSLLDEVNDRFMYLSQARANAQLIEALEGGGVLEREKEQVDEWLKDFADAE
;CC   VNLNRFIAELKDELKAPAPEPAPLSPGGHRSVESLADEAGVTDQSGSLLPAEDNRPSEDL
;CC   D
;CC   ATLANTYS2p1 and ATLANTYS1p1 are 48% identical.
;CC   ATLANTYS2p2 and ATLANTYS2p1 are only 19% identical to each
;CC   other.
;XX
;DR   [1] (Consensus)
;XX
;SQ   Sequence 9311 BP; 2721 A; 2474 C; 2203 G; 1909 T; 4 other;
ATLANTYS2_I
atttggcgctagaaggaggggacttgagatttctcttactcccggaacacagaaccaaccacccaattca
caatggtcaacaatcaaatccccggtaacacagttgatgcagaggggaacccaatccctccaatccagac
agacgttcctgaagccgctgctcccgcgaccctagcggaactaagaagtatgatggctcaacttcagcag
aaggtgaacgatcaagaacaggcaaatcgatccttggcgcaacaactcgaagcagctacctcccaaggac
agatcaggactactcgtttcggcgcgaggcatcttcaggatcgacgagcagcagcagatctcaaccccac
acggctcgtgttccacacgcctggcaatactacaaggcccgtccgccgaaccgcaccggaaatcggaaga
gaccgaaccgagccagcgattttgggaaatcgggaaacgaatcgaacagaaagaaacgaaccgcagctcc
ctcctccccgagcagaagttgccgaggccgatcagatcggggtctcggacgatgaagattcagaagagaa
cattaggtgggctgaagaatacgccagagaacaggaaataagcgccatcaagctctccctagccaaggca
gaaaacgagatgaagctcgtgagatcccaaatgcataacgcagtctcctcggccccgaacatcgaccgca
ttctggaagagtcccacaacacaccgttcacacacaggatctccaacgcgataatctcagatccaggaaa
actaagaatcgagtacttcaacggatcttccgacccgaaaggacacttgaagtcattcatcatctccgtg
gcccgagccaaattcagaccagaagaaagagacgccggtctctgtcacctgttcgtcgagcacttgaaag
ggccagctctggattggttctcgagactcgaaggaaattctgtggacagttttcaggagctatcgacact
cttcctgaagcaatattcggtgctaatcgatcccggcacatcagacgccgacctgtggtcactatctcag
cagcctaatgagccacttcgagacttcctcgcaaaattccgatctaccctagccaaagtcgaaggaatca
acgacgtagcggctctctctgctctgaagaaagcactgtggtacaaatccgaatttcgaaaggaattaaa
tttgtccaaaccactgacaatccgagacgccttgcaccgagcctcggattacgtatcccatgaagaagaa
atggaactactagccaaaagacacgaaccgtccaagcaaacgcctcgcatcgataaatcccaacccagtg
ctccgaatcacaaaaagggtgctcaaggcgggacattcgttcaccatgaaggacgaaatttctccggagc
ccataattaccaggctgatacaccccgaggcgaagccgcccgaggccgaggacgaggccgcggtcgagga
cgcggtcgagaatcctacacttggacaaaggatcaacccgcaggaaacgagcaggaatattgcgagttgc
ataagagttacggccatcatacttccagatgtcgtagcctcggagcaaagttggcagcaaaattcctagc
cggagaaatcggtggaggtttgaccatcgaagacttagaagcggaaaaaggtaaaaccgagcaggtcaac
gctgtggccaatcccgagcaggcagcccccgcggcgaaccccgaaggacccaaaagaggccgaggtaatc
gcgaagcagacgacgatgagccagaagctgctcggggaaggatcttcacaattttaggggattcggcttt
ctgtcaagacacggcggcatcaatcaaggcttatcaaaggaaggccgacgcgaatcgtaactgggcgcgg
ccatttaatgggccaaatgacgaagtaacctttcacgaaagcgataccaacggtttagaccgtccgcaca
acgatcctttagtcattacactgaccatcggtgatttcaacgtcgaacgagtcctagtcgacacgggaag
cacactggacatcatttttcttacaactctgcgagaaatgaagatcgacatgacgcaaatcgtaccaact
ccacgacctgtgctcggattctctggggaaaccactatgactctcgggaccatcaaattaccagtccgag
ccaaaggggtaacaaaaatcgtcgatttctctgttaccgaccagccgaccgtgtacaacgcgattatcgg
cacaccatggttaaatcaattccgagctgtcgcctcgacgtatcatctctgcctgaaatttcccacaagc
gacggcgtgaaaaccatctggggaaatcagaaaaatgctcgcatctgcttcatggcagcacacaagctca
ggaaccccgtcactgaatcgacggccgacgcgaatcataagaaggccaagcttggccgagctgaagagaa
atcaatttccgagcagttatagcagctaaagatcgaggaggctcgggaatctacaacaccaactcccgat
ccggtaatcttaatctgccttgacgacgaaaagcccgagcgatgcgtagaaatcggcggagatctgggag
aagaactaacagctgaactcaccgccttcctcaaagaaaacgtcaatacattcgcctggtccccagaaga
tttgcccggagtaagtgttgacatcgtatcgcacgagctcaacatcgacccgactttcaaacccatcaag
cagaagaggagaaaattgggtcgggagcgagcagaagccgtgaaagccgaggtagagaaattattgagga
tcggatccatcaccgaggcgaaatatcccgattggatcgcgaacccggtcgtagtaaaaaagaaaaacgg
caaatggagagtctgcgtagatttcacagaccttaacaaagcctgcccgaaagacagcttcccattacca
cacatcgatcgcctcgtagaatcaacttctggaaacaagctactgtcattcatggacgctttcgctggtt
acaaccagatcatgatgaaccccgaagatcaagaaaaaaccgctttctacacagaacaaggcatcttttg
ttaccgagtgatgcccttcggactcaagaacgccggggcaacctatcaacgcttcgtcaacaaaatcttc
gcattacagatcgggaagacaatggaagtttacatcgacgacatgttggtgaaatccatggcagagaaag
atcacatatcccatttacgcgaatgtttcaagcagcttaacctctacaacgtcaaactcaatcctgcaaa
gtgccgcttcggagtaagatccggcgagttcctcgggtacctagtcacgcaccgcggcatcgaggcaaat
ccgaagcaaatcgaggcattgttgggaatggcgtcacctcagaacaagcgagaagtgcagcgcctaaccg
gaagagttgcggcccttaaccgtttcatctctcgctcaaccgacaaatgcttggccttttacgatgtgct
tcggggaaacaaaaagttcgaatggacgacccgatgcgaagaagcttttcaggaactcaagaagtacctg
gcaactccacccatcctcgcaaaacccgtaatcggagaaccactatacttgtatgttgccgtatcggata
ctgcagtcagcggagtgttagtccgagaagacagaggcgagcagaaaccgattttttacgtctcgcagac
tttcaccggcgcggaatctcgctatccgcaaatggaaaaacttgctttagcagtcgtaatgtcggctcgg
aagctgcgaccctactttcaatcccattccatcatagtaatgggatccatgccactccgcgccatcttac
acagtccaagccaatcaggacgtctggctaaatgggcaatcgagctcagcgaatacgacatcgagtatcg
gaacaaaacatgtgcaaaatcgcaggtcctagccgattttatcgtcgaactgcccaccaaggaggcccgg
gaaaacccactcgacacaacttggcttctacacgtagacggctcgtcatcaaagcaaggctcgggtgtag
gcatccgcctcacctcgccaacaggagaggtcctcgagcagtcattcagattaaacttcgaagctaccaa
caatgtggccgagtacgaagcgctcgttgccggacttaatctagctcggggactaaagataggaaaaatc
cgagctttttgcgattctcagctcgtcgcgaatcaattcaacggagaatacacagctcgggacgaaaaga
tggaagcctacctgattcatgttcaaaatctagcgaagaatttcgacgaattcgagttgacaaggattcc
acgaggagaaaatacatcggctgacgccctagctgctctagcctcgacatctgacccgagcctgagaaga
gtcatcccagtggaattcattgagaagccaagtattgagctcggcgaagaagaacacgtcctcccaatac
aaatcagcgcggatcaagacgacccagatgactgcagctcagaatggatggaacccatcataagctatat
atccgaagggaaattgccctcggacaaatggaaagctcggaaactcaaagctcaggctgcacgtttcgtt
ctagtagatgaaaaactttacaagtggcgattatccggacccttgatgacatgcgtggaaggagaagcga
tttgcaagatcatgaaggaaattcacggtggctcgtgcggaaatcattccgggggaagggctttagccat
taaaataaaacgccacggattcttctggccgacaatgatcaaagactgcgaaaatttttcaaaacgatgc
aaaaaatgtcaaaggcacgcgccaacaatccatcagccagccgagctcttgtcatcaatcgcctcgccat
atccattcatgcgatggtcaatggatataattggacctatgcatccctcgaagcaaaaaaagttagtcct
cgtcctgaccgactatttctctaagtggatagaagccgaatcttacgccagcataaaggacgctcaagtc
gagaacttcgtgtggaaacatatcctatgtcgccacgggataccttatgagattgtcacggataacggct
cgcagtttatatcaacccgcttccaaggcttctgtgataaatggggaattcgacttagcaagtcaacacc
acgatatccccaaggaaacggccaagccgaagccgctaacaaaacaatcctcgacggattgaagaaacgg
ctcgatgctaaaaagggctcgtggtccgacgaactcgaaggtgtactttggtcgcatcggacaactcctc
gccgagccacaggagaaacccctttcgccttagtctacggaacggaatgcataattccagccgagatgat
agtgccgagcctacgacggagtctatcccccgagaacacccctgataacactcaaaggctcctcgacgaa
ctcgatctgatcgatgaacgaagagattcagccctggttcgcatacaaaattatcagaatgaaacggctc
gtcattacaactcaaatgttcggcaacgaagattccacgaaggagatcgggtcctccgaaaagttttcca
gaacactgccgaaccgaacgctggaaagctcgggacgaactgggaaggaccatacttaatttctaaagtc
atccgacccggagtgtatgagctcgctgacttaagcggcaaagccgttccaagatcatggaacgcaatgc
acctaaggaaatactacaactaaatccgaggtgactaaacttgaactacgaggtggcttgatccctgaaa
agggtacgtaggcagctcgtcttcggacgagttcagctacccccccattaaaaaaggggggagtgggtcc
gtatattcatactcccatttttattatcttgtagattttcgaaccgaaacatgaataaaaattctttgca
actttttattcggctaatacgatgagcgacggctcggagtatccattacgcctattcggctacagcgcgc
tatacaaatacgaggtgaaatctatcagattatttctgataagaaattttcatcttcagaaacccggtta
tacttttaacaccggtatcccagctcgttcctaaacgctggtcgggaagtgaatacgatcgggtcgcaac
cgaatcatattagamaataaaacggttcggatttttagaatcccaccaaawattttggatttcaaaaaak
attggataaagccatccaggcacaggttctacaaatcattggataaagccatccaggcataagtactata
accaaacaaaaagaaaaaaaaaaaaaaacaaatcccgcctcgggtccctagtcgagatcttcagatgggc
gattatcctcggcaggaaggagagatccggattgatcagtaactcccgcctcgtcggcaagagactcgac
cgatctgtgaccacccggactcagaggagcgggctcgggagctggagccttgagttcatctttcagctcg
gcgatgaagcgattaagattaacctcagcatcggcgaaatctttcaaccactcatcgacctgctccttct
ctctctccagaactccgccaccttcgagcgcctcgatcagctgcgcattggctcgcgcctgagacaagta
catgaatcggtcattgacctcatcaaggagcgacactttgtgctcggcggtctttattttggcaatgaag
gtctcggaaacctcccgcctcgtcgtctgaatggcccgagccacctcgctatcacgaagccctctcgcct
cggataacatccgcttgagacggacgatctcagaattcgccttgagattctgctcggccatcttaatggc
attctctttgttcacaaaggccatcctctcgacctcctccctttccagtttcgctgcggcagcttcggag
agtgcctcatttttctctcgctcggcggcctgaatccgagactccgcctcggctaacttaccagccacct
ccatgttgctccgcatagccgaatcatacgaatgtaccatccggttcatcgaagcgacagcctggaaaag
agaaaagttagatttacaatgacaacaccaagacaagttataagtaacaaagcacatacccgcagaaagt
gagatgccgtctcggcatacttccgagattccagaagagtatctggaggaggaagcaagggaccaacgca
tcccccgagcagattggccgaggcaatacgatcttcaaagatcgggagtactttctccgcctcggagccg
gatgtcttctttttcttcttcttcttctttgacgattcatcgtcaacccgaacgcgcttacccgccgctt
tgtcactcgcaggtaacgccacaaccgcctcgggcgcagcagcagacggctcggcttcctgagcttcagc
gggcgccgcgtcaaccggsgcggcagcatcgggaataacattcccatcttgggtcgcctcaggagcggcc
cgcggcgagatcaatcccgcatcaacaaggcgtttctcctccgccaatcgcctctccttatcctctttct
tcttggcgtagacctcccgcaaagtaagcttcgtcgattgcatatctacgaaaaattgagtacggtggtc
ggaagattccgaactagtaggacaagaaacgccgtgaagaagacgaagcgcacgttcaactcgctcccga
gagaaatgcttcagccaatcgcacttccgagccgacaacgcgtgaaaagccgaaagaaaatcgtctggaa
tcggctcgagaaacttgagagtgcgttctgcaaaaggaaaactgcaattagaaaacgaaactaacatcgc
atcgggtacaaataaatttctaactacaactacccaaaattccccatctcgtcagaaccgtcccgaggag
atcttcatgaacagcatcttcgatggctacgaagaagaagtgatcggtatacggctcgtccttgctcggg
aacccatcaataattttctttcccttagcgggggaaatgtaataccgatccacttcggattttggcaccc
gccggatctcgagaaaatttctcagatgagcaagcgttatctccgtttcagactcgtaactccgaatcaa
aatcccgagcaaatgtctcaaagatcgcatcgtgatctgggaaagggcaatctgatatgatgtcacatac
tccaggaccagcctcgggatcggcagccacaagaagcaaccataaaagaacccctcatacaaggtataga
aacccgcagggggattgtcggctcgctcgtcagggctcggcacacgcaaggttacagcgccaagaccaaa
tcgagccctaaggtccgagagagaccgctcggtgcacaatgttgagccgggaggatctggcattatcacc
tttttgttcttcttcctcttcttcttctcagccttttcgaaagccttctgctcgttaagacgcaagatat
cttctgtctcgttcgggccgaagtaaagaaaatcctcgatcttaaccgtcggcatgaagggctcgggcac
atcttgccaattggatgtcccggcttcgtcagccatcggggcaaaagacctccgaacgagaggcggctcg
ggatcatcgggcaaatccaattcccgagcgatatcggcattggtatcttcccgagccgtgaagggagtcg
tctgctcgggggcattcgaagaagatacagcatcctcttcagaagacgtccaaactgatctccgtttgta
gatcgggggaagatcatctgggtgatttgagtcgctagaatcgacaagagaagcgctcggggtcgacgga
gactgtgaagaactcattctcggctcgggacgatagacgagaagggaaacaaccagaaaacaagaatttc
tatcccggaaaaatcaaaaataacaaaaagaacgacggagcttcaccggaaaagaaaaaacgatagaaat
atcaagatcagagagaaggaagcaaaccttgtttattttcttcgaaaacttcaaatgaagtcttgaagga
gtggctcgccgtatatatagagttttcaaaggcgcgtcgggtccgcagcaaacattcaaaacgcgttcaa
gcgcgtgggctcggcagaaggaaaagatccgcgtgatcctcgggaataagcgagaaacgtttccactcct
caagaatagtcggcgacacgcgacagaagtttgaccgtccgagaagatgaagcgacagaactagaagacg
aacagtcggctcgggaatccattcatgaagctcatcgtctctctgccaagtcaatgagctggggggcaaa
c1