;ID   ATGP1I      DNA   ; ATH   ; 5977 BP
;XX
;DE   ATGP1I, an internal part of the ATGP1 LTR-retrotransposon -
;DE   a consensus.
;XX
;AC   .
;XX
;DT   21-JAN-1999 (Rel. 3.1, Created)
;DT   27-DEC-2001 (Rel. 6.3, Last updated, Version 2)
;XX
;KW   LTR retrotransposon; gag; pol; Gypsy superfamily; ATGP1 family;
;KW   ATGAGPOL1_LTR; ATGP1LTR; ATGAGPOL1_I; ATGP1I.
;XX
;OS   consensus
;XX
;OC   Arabidopsis thaliana
;OC   Eukaryota; Plantae; Embryobionta; Magnoliophyta; Magnoliopsida;
;OC   Dilleniidae; Capparales; Brassicaceae.
;XX
;RN   [1]
;RA   de la Bastide,M.R., Parnell,L.D., Kaplan,N., Gnoj,L., Hameed,A.,
;RA   Schutz,K., Hasegawa,A., Gottesman,T., Shohdy,N., Granat,S.,
;RA   Jensen,K., Johnson,A.F., Lodhi,M., Dedhia,N., Martienssen,R.
;RA   and McCombie,W.R.
;RT   A. thaliana BAC T32N15 from chromosome V
;RL   Direct submission to GenBank (September, 1997; AC002534) 
;XX
;RN   [2]  (bases 1 to 5977)
;RA   Kapitonov,V. and Jurka,J.
;RL   Direct submission (January, 1999)
;XX
;RN   [3]
;RA   Kapitonov,V.V. and Jurka,J.
;RT   Molecular paleontology of transposable elements from
;RT   Arabidopsis thaliana.
;RL   Genetica 107 (1-3), 27-37 (1999)
;XX
;CC   ATGP1I is an internal part of the ATGP1 LTR-retrotransposon;
;CC   its LTR is deposited as ATGP1LTR. Presumably, the ATGP1
;CC   retrotransposon was an active element several million years ago.
;CC   ATGP1 proviruses and solo LTRs are bordered by 5-bp target
;CC   site duplications.
;CC   The consensus sequence was reconstructed from 13 individual
;CC   copies present in GenBank release 109.0. There are two major
;CC   subfamilies of ATGP1. Retroelements from the first subfamily
;CC   are ~97% identical to each other; the second subfamily is slightly
;CC   older, since its copies are 95% identical to each other. There is ~92%
;CC   identity between sequences from different subfamilies. Examples of
;CC   the first subfamily include copies found in AB01821 and AC002534
;CC   GenBank loci; copies found in AF0588825 and AC006268 represent the
;CC   second subfamily. There is 97% identity between consensus sequences
;CC   reconstructed for each of these subfamilies.
;CC   The gag-like and pol-like polyproteins are encoded by separate ORFs:
;CC   ORF1 (positions 253-1629) and ORF2 (positions 1669-5067). ORF1 encodes
;CC   two CCHC-type Zn-finger domains (aa positions 362-375 and 384-400),
;CC   analogous to those of gag-like proteins in many retroviruses, including
;CC   HIV2 and caulimovirus.
;XX
;DR   [2] (Consensus)
;XX
;SQ   Sequence 5977 BP; 1328 A; 1022 C; 2042 G; 1585 T; 0 other;
ATGP1I
atttggtatcagagcgattacggttctaggatgtgtagaaaaattaattgatgaattattttcctgttga
ttgatgtgggatgagctagggtgttagcatattgaggattgattatggagtgtttagtcaattgtgtgtt
gttggagtcctagcatcatcttctccgagccttaaatgagattccacggtaagttgtttgtgatatattc
gtttgttttagtttcttagccatttgttcttgagtagcgcagatggttagaggagctggagttcgtggtc
gtggtcgcggtcgtggtcgcggtaggggacgagtcctcgagggtaccggcgagagtgatggccacagtgc
cacagttgagcagagtgtgggttcgcagcctgagtttgtggagcccggggttaggaacggtcttggtgcc
gatatagctggtgcagccggagtgggggctggtggagctggtgtcggtaccggtgtgcatgccgttggtg
ctgagggcccaggagtgatgggtgccgcagccggaggagcccaggttccagaggttggtttggcgggcct
gttgaggcagttgttggagcggttaccaggtgtggtaccagtggaggctccagttgcgccacgagtggcg
gaggtgcagcagcgggctgcggttgctgaggaggttccatcttatttgaggatgatggagcagttgcaga
ggattggcaccggttatttttctggtggtactagtcctgaggaggccgacagttggaggtcgcgggttga
gcggaacttcggttcgagcagatgcccggcggagtatcgggtagatctagcagtgcattttctagagggg
gatgcacatttgtggtggcggagtgtgactgccaggaggaggcaggcggatatgtcttgggcagatttcg
tggctgagtttaatgccaagtactttccgcaggaggcgctcgaccgtatggaggcgcgttttcttgagct
gacacagggtgagcggtcggttcgggagtatgaccgggagtttaaccgactcttggtgtatgcgggtcgg
ggcatggaggatgaccaggcccagatgaggaggttcttgaggggacttcgaccggatttgagggtgcggt
gtagagtgtcgcagtacgccacgaaggcggcactggtggagacggctgctgaggtagaggaggaccttca
gaggcaggtggtgggagtgagtccagcggtacaaccgaagaagactcagcagcaggtggcacctagcaag
ggcggcaagcctgcgcaggggcagaagaggaagtgggatcatccttccagagctggacagggtggtcgtg
caggatgtttttcttgtgggagccttgaccacaaggtggcggattgcacgcagcgagctgagacgaggga
gtgctaccactgcagggagaggggacatctcaggccgaattgtcctaagctgcagcggatggcagtggca
gtggtacagccggcggtgcagcacggagcgcaggtgcagcagggagtgcagcagttggcacatattgcag
cggcaccgcagggttacactacgcgtgagataggcggtaccagcaacagagcgattactggtatgatttc
tgagactcagactttgtgaattatttctgttgttttggatatcattgttttgtgatgaatgttagggttc
ttagcacatgaggtttgtgtagggaccttattagtgggcggtgtagaggcccatgtgttgtttgactctg
gagcatcgcattgcttcattaccccggagagtgcctcgcgcggcaacatccgtggggatcctggtgagca
gcttggagccgtcaaggttgctggagggcagtttctagcggttctggggagagcgaagggtgtggatatt
cagattgcaggggagtcaatgccagcagatttgattatcagccctgtagagttgtatgatgttattctgg
gaatggattggttggatcattacagagtacatcttgactgccaccgtgggagagtttcttttgagcggcc
agagggaaggttggtttatcagggagtgagacctacctcagggagtctcgtcatctcagcagtgcaggca
gagaaaatgatcgagaagggctgcgaggcttacctggtgacgatatctatgccggagtctgtggggcagg
tagcagtgagcgacattcgggttgttcaggagttcgaggatgtgtttcaatcgttgcagggattaccacc
atcacggtctgatcctttcacgattgagttggaaccagggacagcgccgttgtctaaggcaccctacagg
atggctccagcagagatggcagagctaaagaagcagctagaggatttgttgggtaagggattcattcgtc
ctagtacttcaccttggggagcgccggtgttatttgtgaagaagaaggacgggagtttccgcttatgtat
tgactaccgggggttgaaccgggtcactgtgaagaacaagtaccctttgccaaggatcgatgagttgttg
gatcagttgaggggtgctacttgtttctccaaaatcgatttgacgtcgggttatcaccagatcccgatag
cggaggcagatgttcgcaagactgctttcaggacgagatatggacattttgagttcgtggtgatgccttt
cgggttgactaatgcacctgcagcgttcatgaggttgatgaacagcgtgttccaggagtttttggatgag
tttgtgatcatctttattgatgacattctggtgtattctaagagtcctgaggagcatgaggtacatctga
ggagagtgatggagaagctgcgggagcagaagttgttcgctaagttgagtaagtgcagtttctggcagag
agagatgggttttctgggtcacattgtttctgcagagggagtttcagtggatccagagaagattgaggct
atcagagattggcctagaccgacgaatgccactgagatcaggagttttcttgggttggcagggtattaca
ggaggttcgtcaaggggtttgcgagtatggcacagccgatgactaagctgacagggaaggatgttccttt
tgtgtggtcaccggagtgcgaggagggtttcgtgagcctgaaggagatgttgacgagtacaccagtgtta
gcattgccagagcatggagagccttacatggtgtatactgatgcctcgagagttggtttgggttgtgttc
taatgcagcgtgggaaggttattgcctatgcatcacggcagttgaggaagcatgagggcaactatcctac
tcacgatttggagatggctgcagtaatctttgccttgaaaatttggaggtcttatctttatggcgggaaa
gtgcaggtgttcacggatcataagagtttgaagtatatcttcactcagccagagctgaacctgaggcaga
ggcgatggatggagttggtggctgattatgatctggagatagcttaccatcccggtaaggctaatgtggt
tgcagacgcgttgagccgtaagcgcgtgggcgcggctccagggcagagtgttgaggccttggtgagtgag
attggtgctttgcggttgtgtgctgtggcacgagagccattgggattggaagctgtggatcgagcagatc
ttctgactagagttcggttggctcaggagaaggatgaggggttgattgcggcttctaaggcagagggctc
tgagtatcagtttgcagctaatgggactatccttgtgcatgggcgagtttgtgtgcctaaggatgaggag
ttgcgacgggagatcttgagtgaggctcatgcgagcatgttttccattcatccaggagcgactaagatgt
accgagacctcaagcggtactatcagtgggtcgggatgaagagggatgtagctaattgggttgcagagtg
cgatgtttgccagttagttaaggctgagcatcaggtgccaggtggcatgttgcagagtttacccattcct
gagtggaagtgggattttatcacgatggattttgtggttgggttgcctgtgtcgcgaaccaaagatgcta
tttgggtgatcgtggaccgtctgactaagtcagcacattttctggccatcagaaaaaccgatggagcggc
ggtgttggctaagaaatatgtgagcgagatcgtgaagttgcatggagtacctgtgagtattgtgtcagac
agggattccaagtttacttctgcattttggagagcctttcaggcagagatgggtactaaggtgcagatga
gcacggcttatcatccacagacagatgggcaatctgagaggactattcagacgcttgaggatatgttgcg
gatgtgtgtcttggattggggaggtcattgggcggatcacttgagtctggtagagtttgcttataacaat
agctatcaggcgagtattgggatggctccttttgaagcgttgtacgggaggccatgcaggacaccgttat
gctggacccaagtgggggagaggagcatatatggtgcagattatgttcaggagaccacggagaggatccg
ggttcttaagctgaacatgaaggaagctcaggatcgacagcggagttatgctgacaagaggaggagagag
cttgagtttgaggttggggacagagtgtatcttaagatggccatgttgcgaggtccgaacaggtctatct
cagagactaagttgagtccgaggtacatgggtccgttcaggattgttgagagggtgggaccagtggcata
caggttagagttgcctgatgttatgcgtgcgtttcacaaggtctttcatgtgtcgatgctgaggaagtgt
cttcacaaggatgatgaggtgttggctaagattccagaggatcttcagcccaacatgactttggaggcga
gaccagtgagggttctcgagaggaggatcaaggaacttcggcggaagaagattcctttgatcaaagtcct
gtgggactgcgatggtgtgacagaggagacttgggagccagaggcgaggatgaaggcaaggttcaagaag
tggttcgagaagcaggttgctgcatgagtttgcctttccttggtttccagtttcagtgttgttgagtttt
tgagcttttgagttgttgttcttatttctctttcatttgtattttcgtttggagttgtaacgatcatgac
agttgatgattaataagatgagtattttttttcttggatttgacttgagttgtcttatttaagaatgtgt
tgtgcggtttccaaaaatggaaagtttcctagaaaggacttgtctatttttggaaagtttgagtaaacaa
cctaagagaaaggtgtgtttacgagatggccgggagttggtcataaggatgttctcgaggggatatgatg
gatggtgaggcatagaggtgtcttacgcgaatccatgagagagtctggagatatgctcggaagcgtttag
tattgttgctcgagggacatgagggtcccaacgactcgtgaggagcggggtatgctcctgaggtaccggt
aactcgagggacatggtgtctttagagtacgaggggtacggtggcttcgggtgacgatccgagagcacgc
tttagggtgacgacccgagagcgagttttgtggtattattttctccaagtgtcgtaactcgatgactttg
tggctgagagtgagccgacacaactcagggtttgactttgagtggggaagataatatgtgtgatacccga
gataccgtaggccgaggcctgagagtgacggtttatgctcgaagattgcggtctgagagtggagagtgag
ttgtgagcaacaatgactagtaaatcgtgagtatatcgactggatagagagacctagagggggtctggtt
ggagggagtgcggactgtagtgatggttgcacggagttcctcggagtggaagggtgaagctagattcgag
gacgaatctatgttagtgggggagaat1