|
Guidelines for Choosing Restriction Endonucleases
Because base pairs are not arranged randomly in a genome, certain restriction
enzyme recognition sequences may be substantially over- or underrepresented.
The base composition and/or sequenced DNA from a genome can be used
to predict which recognition sequences will be rare. Summarized in
this table are predicted average fragment
sizes generated by commonly used restriction enzymes in various genomes.
Basic guidelines for predicting the frequency of cleavage in other
genomes of interest are presented below.
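As a starting point, the effect of base composition alone can be sketched as follows. This is a minimal illustration, not the method used to build the table: it assumes bases occur independently, that G and C are equally frequent (likewise A and T), and that N matches any base. The function name `expected_fragment_size` is hypothetical. The recognition sequences are those of the enzymes shown in the gel figure.

```python
def expected_fragment_size(site: str, gc_content: float) -> float:
    """Average spacing in bp between occurrences of `site`.

    Assumption: every base is drawn independently, with
    P(G) = P(C) = gc_content / 2 and P(A) = P(T) = (1 - gc_content) / 2.
    """
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    base_freq = {"G": p_gc, "C": p_gc, "A": p_at, "T": p_at, "N": 1.0}

    p_site = 1.0
    for base in site.upper():
        p_site *= base_freq[base]
    return 1.0 / p_site


# Recognition sequences (5' to 3') of the enzymes used in the gel figure.
SITES = {
    "NotI": "GCGGCCGC",
    "SfiI": "GGCCNNNNNGGCC",
    "PmeI": "GTTTAAAC",
    "PacI": "TTAATTAA",
    "AscI": "GGCGCGCC",
}

if __name__ == "__main__":
    for gc in (0.38, 0.50, 0.65):
        print(f"G + C content = {gc:.0%}")
        for name, site in SITES.items():
            size = expected_fragment_size(site, gc)
            print(f"  {name:5s} {site:14s} {size:>15,.0f} bp")
```

Under these assumptions, G + C rich sites become rarer as G + C content falls and A + T rich sites become rarer as it rises. Real genomes deviate from this simple model, which is why the guidelines for specific genome types are needed.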
Bacterial Genomes: CCG and CGG are the rarest trinucleotides in most A + T rich bacterial
genomes. Endonuclease recognition sequences that contain these trinucleotides
will be correspondingly rare. Similarly, CTAG is the rarest tetranucleotide
in most G + C rich bacterial genomes. Endonuclease recognition sequences
that contain CTAG will be correspondingly rare. Suitable endonucleases
for three categories of bacterial genomic G + C content are presented
below.
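The motif rule above can be sketched as a simple filter over recognition sequences. This is only an illustration of the rule, not a substitute for the enzyme recommendations: the enzyme panel, the genome class labels, and the function name `candidate_rare_cutters` are choices made for this example.

```python
# Rare motifs named in the text, keyed by the class of bacterial genome.
RARE_MOTIFS = {
    "A+T rich": ("CCG", "CGG"),
    "G+C rich": ("CTAG",),
}

# A small illustrative panel of recognition sequences (5' to 3').
SITES = {
    "NotI": "GCGGCCGC",
    "AscI": "GGCGCGCC",
    "SmaI": "CCCGGG",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PacI": "TTAATTAA",
}


def candidate_rare_cutters(genome_class: str) -> list[str]:
    """Return enzymes whose site contains a motif that is rare in this class."""
    motifs = RARE_MOTIFS[genome_class]
    return sorted(
        name
        for name, site in SITES.items()
        if any(motif in site for motif in motifs)
    )


if __name__ == "__main__":
    for genome_class in RARE_MOTIFS:
        print(genome_class, "->", ", ".join(candidate_rare_cutters(genome_class)))
```

The filter only flags sites that contain a rare motif. It does not rank them, and base composition still matters: a G + C rich site such as AscI can be rare in an A + T rich genome without containing CCG or CGG.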
Yeast Genomes: The Saccharomyces cerevisiae genome is very A + T rich (38% G + C) (3),
so G + C rich restriction endonuclease recognition sequences
are rare. Among these G + C rich recognition sequences, those that
do not occur in dispersed repeats, such as the Ty elements or tRNAs,
are particularly rare (2,4).
Mammalian Genomes: The nuclear genomes of mammals are all approximately 41% G + C, and the
dinucleotide CG is five-fold rarer than expected from G + C content
(3). Restriction endonuclease recognition sequences that contain CG
are very rare in mammalian genomes (2,5,6,7). However, most CG sequences
are methylated in mammals and almost all the enzymes with CG in their
recognition sequence cannot cleave if CG is methylated (8). Nevertheless,
certain CG sequences in the genome of a particular cell type are either
completely methylated or completely unmethylated. This differential
methylation results in 'complete' digests at sites that are unmethylated
in the cell type. Given these facts, one can select endonucleases
that give discrete cleavage patterns despite the fact that they are
mCG sensitive. The average fragment sizes that result are quite large.
The genome is divided into large A + T rich regions with very few
CG dinucleotides and "islands" of a few hundred or thousand
base pairs that are about 50% G + C with CG occurring at almost expected
frequencies (2,5,6,7). The islands are often located 5´ to genes
(5,6,7). There is reason to believe that the unmethylated CG sequences
are most often found in the G + C rich islands.
Other Genomes: Using the G + C content, nearest neighbor data (dinucleotide frequencies)
(3) and a few thousand base pairs of nucleotide sequence data, it
is often possible to predict which restriction endonuclease recognition
sequences will occur least frequently in the genome of interest. For
instance, both Drosophila and Caenorhabditis are A +
T rich (~40% G + C), and the rarest dinucleotide in both species
is CG. However, CG is not as rare in these species as it is in mammals,
so recognition sequences that contain CG are not as rare. Furthermore,
these genomes are not methylated at CG, so all recognition sequences
can be expected to be cleaved to completion. Thus, endonucleases very similar
to those used with mammalian DNA are suitable for these species (4),
but the fragment sizes produced are somewhat less than half the size
of those produced from mammalian genomic DNA.
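One way to combine dinucleotide frequencies with a sequence sample is a first-order Markov estimate, sketched below. This is an assumed approach for illustration, not necessarily the procedure used in the cited work. It handles only unambiguous sites (no N), counts a single strand, and ignores methylation and regional variation such as CG islands. The function names and the randomly generated toy sample are hypothetical; substitute real sequence data from the genome of interest.

```python
import math
import random
from collections import Counter


def markov_site_probability(sample: str, site: str) -> float:
    """Per-position probability of `site` under a first-order Markov model.

    P(site) = P(s1) * product over i of P(s_i | s_(i-1)), with all
    probabilities estimated from mono- and dinucleotide counts in `sample`.
    """
    sample = sample.upper()
    site = site.upper()
    if not site or len(sample) < 2:
        raise ValueError("need a non-empty site and at least 2 bp of sample")

    mono = Counter(sample)
    di = Counter(sample[i:i + 2] for i in range(len(sample) - 1))

    # Number of dinucleotides that start with each base.
    starts = Counter()
    for pair, count in di.items():
        starts[pair[0]] += count

    p = mono[site[0]] / len(sample)
    for prev, base in zip(site, site[1:]):
        if starts[prev] == 0:
            return 0.0
        p *= di[prev + base] / starts[prev]  # estimate of P(base | prev)
    return p


def average_fragment_size(sample: str, site: str) -> float:
    """Predicted mean fragment size in bp; infinite if the site is never expected."""
    p = markov_site_probability(sample, site)
    return math.inf if p == 0 else 1.0 / p


if __name__ == "__main__":
    # Toy stand-in for a few thousand bp of real sequence (about 40% G + C).
    rng = random.Random(0)
    sample = "".join(rng.choices("ACGT", weights=[30, 20, 20, 30], k=5000))
    for name, site in {"NotI": "GCGGCCGC", "PacI": "TTAATTAA"}.items():
        print(f"{name}: about {average_fragment_size(sample, site):,.0f} bp")
```

Because the estimate multiplies conditional probabilities, a depleted dinucleotide such as CG lowers the predicted frequency of every site that contains it, once for each occurrence. With only a few thousand base pairs of sample, the counts for rare dinucleotides are small, so the predictions are rough guides.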
References:
1. McClelland, M. et al. (1987) Nucl. Acids Res. 15, 5985-6005.
2. McClelland, M. and Nelson, M. (1987) Gene Amplification and Analysis 5, 257-282.
3. Normore, W. M., Shapiro, H. S., and Setlow, P. (1976) CRC Handbook of Biochemistry and Molecular Biology. (Ed. G.D. Fasman), CRC Press.
4. McClelland, M. (unpublished results)
5. McClelland, M. and Ivarie, R. (1982) Nucl. Acids Res. 10, 7865-7877.
6. Brown, W.R.A. and Bird, A.P. (1986) Nature 324, 477-481.
7. Lindsay, S. and Bird, A.P. (1987) Nature 327, 336-338.
8. McClelland, M. et al. (1994) Nucl. Acids Res. 22, 3640-3659.
9. Suwanto, A. and Kaplan, S. (1989) J. Bacteriol. 171, 5850-5859.
| Restriction
Endonuclease Cleavage of Chromosomal DNA
|
 |
| Agarose-embedded
E. coli chromosomal DNA digested by NotI (d), SfiI (e), PmeI (f), PacI (g), and AscI (h). Lanes (a) and
(i) are Low Range PFG Markers. Lanes (b) and (c) are Mid Range PFG Markers I and
II, respectively. Electrophoresis was carried out in a 1% agarose gel at 170
V and 15°C for 20 hours, with switch times ramped from 5 to 20 seconds. |
|