BCH5425 Molecular Biology and Biotechnology
Spring 1998
Dr. Michael Blaber
blaber@sb.fsu.edu
Lecture 21
DNA Sequence Analysis
DNA sequencing is most often accomplished using a procedure referred
to by one of the following names:
- Sanger sequencing
- Di-deoxy sequencing
- Chain termination sequencing
Each of these refers to the same method: the incorporation of a
di-deoxy base in a polymerization reaction leads to termination
of primer extension (a method pioneered by Fred Sanger, now retired
and happily puttering about in his garden).
The basic method involves the following:
- A primer is annealed 5' to
a region of DNA we would like to sequence.
- The primer is extended in the traditional manner (i.e. with
DNA polymerase and the four dNTP's).
- However, a small concentration of di-deoxy
bases (ddNTP's) is included in the reaction mix.
- Usually this is accomplished by having four separate reactions,
each of which receives one of the ddNTP's.
- Thus, one tube would contain primer, template, DNA polymerase,
the four dNTP's and ddATP, another tube would have the same thing
but with ddTTP instead of ddATP, and so on for ddCTP and ddGTP.
- During the reaction, the normal dNTP's are incorporated into
the growing chain.
- However, occasionally the DNA pol will incorporate
a dd base into the growing primer.
- When this happens, the primer cannot be extended
any further (because the 3' dd base does not have an available
3' hydroxyl group).
- The resulting DNA fragment begins at
the 5' end of the sequencing primer and ends at the site of dd
base incorporation.
Thus, in the reaction mixture containing
ddATP, there will be an ensemble of fragments
of varying lengths, each ending with the ddA base (i.e. at
all positions in the template where there was a complementary 'T'
base).
The mixture containing ddCTP will have a different mix of fragments:
they will contain ddC at the 3' ends (at positions in the template
where 'G' bases were located).
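
The four reactions described above can be sketched as a toy model in
Python. This is only an illustration: the function names, the example
template and the 20-base primer length are assumptions made here, and
the model simply lists every possible termination point rather than
simulating how often each one occurs.

```python
# Toy model of the four chain-termination reactions (hypothetical helper names).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extended_strand(template_3to5):
    """Return the newly synthesized strand (5'->3') for a template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)

def termination_fragments(template_3to5, primer_len=0):
    """Map each ddNTP reaction to the fragment lengths it can produce.

    Lengths are counted from the 5' end of the primer, so a fragment
    terminated at the first added base has length primer_len + 1.
    """
    new_strand = extended_strand(template_3to5)
    reactions = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # The dd base that terminates here is the one complementary to the template.
        reactions[base].append(primer_len + position)
    return reactions

# Assumed example: template written 3'->5', primer assumed to be 20 bases long.
fragments = termination_fragments("GATCTTAG", primer_len=20)
for dd_base, lengths in fragments.items():
    print(f"dd{dd_base}TP reaction: {lengths}")
```

Each reaction ends up with fragments whose lengths mark the positions of
one base only, which is what allows the four lanes to be compared.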

- If the fragments from the 'A' reaction mix are run on a urea/acrylamide
gel (typically 6%) the fragments will separate according to size.
- Likewise for the 'C', 'G' and 'T' reaction mix fragments.
- If the four different reaction mixtures are run next to each
other the fragment sizes can be directly compared to one
another.
- Note that the shortest expected fragment is the primer itself
and the longer the fragment, the further from the primer the extension
reaction went before termination.
- Consider a case where the template has a stretch of six 'A'
bases in a row.
- In the 'T' reaction mix we will therefore get six fragments,
each ending in 'T' and differing in length by one base.
- None of the other reaction mixes will contain fragments between
these lengths (they will either be longer or shorter) because
none of the other reaction mixes will terminate within this region.
- Thus, if we run the four reaction mixes side by side and look
at the fragment patterns we would see the following:
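
A rough text rendering of such a four-lane gel can be sketched in
Python. The template used here (six 'A' bases flanked by two bases on
each side) and the function names are assumptions for illustration;
a "-" marks a band, and the number at the left is the number of bases
added to the primer.

```python
# Text sketch of a four-lane sequencing gel (hypothetical helper names).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
LANES = "ACGT"

def gel_rows(template_3to5):
    """Return one text row per fragment, longest first (top of the gel)."""
    new_strand = [COMPLEMENT[base] for base in template_3to5]
    rows = []
    for length, dd_base in enumerate(new_strand, start=1):
        # A band appears only in the lane of the dd base that terminated the chain.
        lanes = " ".join("-" if lane == dd_base else "." for lane in LANES)
        rows.append(f"{length:>3}  {lanes}")
    return rows[::-1]  # short fragments migrate furthest, so they are printed last

def draw_gel(template_3to5):
    """Return the whole gel as a string with a lane header."""
    return "\n".join(["     " + " ".join(LANES)] + gel_rows(template_3to5))

# Assumed template, written 3'->5', containing a run of six 'A' bases.
print(draw_gel("GCAAAAAACG"))
```

In the printed pattern the six consecutive bands in the 'T' lane have no
bands beside them in the other three lanes.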

- Now consider a template that contains the sequence 3' GATC
5' (note the orientation).
- When the primer is extended in the different reaction mixes
it can truncate first at the G (incorporating a dd 'C'), then
at the A (incorporating a dd 'T') and so on.
- The fragments run on a gel would thus look like:

- Thus, following the ladder of fragments on the sequencing
gel allows you to "read" the sequence of the template.
- Note, however, that with respect to the
template, reading from the bottom of the gel to the top means
reading in the 3' to 5' direction, and the bases read are the
complements of the actual template sequence.
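
The bookkeeping for reading a ladder can be sketched in Python. The
function name and the example ladder are assumptions for illustration;
the input is the lane in which each band appears, read from the bottom
of the gel to the top.

```python
# Converting a gel ladder into strand sequences (hypothetical helper name).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_ladder(lanes_bottom_to_top):
    """Return (new strand 5'->3', template 3'->5', template 5'->3')."""
    # Bottom-to-top band order is the extended strand in its 5'->3' direction.
    new_strand_5to3 = "".join(lanes_bottom_to_top)
    # Complementing base by base gives the template in its 3'->5' direction.
    template_3to5 = "".join(COMPLEMENT[base] for base in new_strand_5to3)
    # Reversing that gives the template in the conventional 5'->3' direction.
    template_5to3 = template_3to5[::-1]
    return new_strand_5to3, template_3to5, template_5to3

# Assumed ladder: bands in lanes C, T, A, G, A, A from bottom to top.
new_strand, template_3to5, template_5to3 = read_ladder(["C", "T", "A", "G", "A", "A"])
print("read from gel (5'->3'):", new_strand)
print("template      (3'->5'):", template_3to5)
print("template      (5'->3'):", template_5to3)
```

The first four bands of this ladder correspond to the 3' GATC 5'
template discussed above.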
Visualization of fragments
- If radiolabeled dATP is spiked into the mixtures it will be
incorporated like a "normal" dATP base.
- However, the resulting DNA fragment will be radiolabeled.
- Thus the acrylamide gel can be exposed to x-ray film and the
location of the fragments determined.
- Recently, automated sequencers have made use of specific dyes
which are tagged to the dideoxy bases.
- These dyes can be "read" by a laser and thus the
specific terminating dd base for a particular DNA fragment can
be identified.
- Thus, the fragments can be read as they elute, rather than stopping
the gel and exposing it.
- Furthermore, since each dd base can be uniquely identified,
all four reactions (ddA, ddC, ddG and ddT chain termination) can
be done in a single tube and run in a single lane on a sequencing
gel.
- Automated sequencers can thus read further than
is possible with manual methods.
- Since a single lane is used per sample (as compared to four
lanes with the radiolabeled method) many more samples can be analyzed
and the throughput is greater.
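
The single-lane idea can be sketched in Python as follows. The dye
colors, the dye-to-base assignment and the detection events below are
made up for illustration and do not describe any particular instrument
or chemistry.

```python
# Single-lane base calling from dye-labeled terminators (illustrative only).
# Hypothetical dye-to-base assignment; real chemistries define their own.
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}

def call_bases(events):
    """Return the extended-strand sequence (5'->3') from one lane.

    events: iterable of (fragment_length, dye) pairs, in any order.
    """
    # The shortest fragment elutes first, so sort by fragment length.
    ordered = sorted(events)
    # The dye on each fragment identifies the dd base that terminated it.
    return "".join(DYE_TO_BASE[dye] for _, dye in ordered)

# Assumed detection events for four fragments from a single tube.
events = [(23, "green"), (21, "blue"), (24, "yellow"), (22, "red")]
print(call_bases(events))  # prints CTAG
```

Because the dye, not the lane, identifies the terminating base, one lane
carries the information that the radiolabeled method spreads over four.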
- The acrylamide gels used for sequence analysis are typically
50 cm to 100 cm long.
- In manual sequencing the four reaction mixes are loaded and
the gel is run for approximately 2 hours; the samples are then
loaded again on another part of the gel and the run is continued.
A third set of samples may be loaded after another 2 hours.
- The gel is stopped after the dye front of the last sample loaded
has just reached the bottom of the gel.
- Thus, the short fragments can be visualized in the last load,
medium fragments in the second load and the long fragments will
be visualized in the first set of reaction mixtures loaded.
- Manual sequencing can resolve on the order of 400 bases of
continuous sequence. Automated sequencers can routinely provide
twice this amount of information.
- Automated sequencers use the same types of glass plates.
- The continuous running of the gel (and dye identification)
means that typically 400-700 bases or more can be read.
- Automatic software will interpret the dye signals into a sequence.
- Nuances of the sequencing chemistry and expert knowledge can
be programmed into the sequence analysis software (e.g. the software
can compensate for the "smile" of the gel).
1998 Dr. Michael Blaber