Applications of DNA sequencing
Used for the identification of genes or mutations responsible for hereditary disorders.
Used for parentage verification, criminal investigation and identification of individuals from available samples such as hair, nail, blood or tissue.
Used for the identification of GMO species and of minor variations in the plant genome.
Used to construct whole chromosomal maps, restriction digestion maps and genome maps.
Used to identify open reading frames, non-coding regions and protein-coding DNA sequences.
Used to identify and detect exons, introns, repeat sequences and tandem repeats.
Used in gene manipulation and gene editing.
Used to detect new variations occurring in nature.
Used in metagenomic studies.
Used for microbial identification and the study of new bacterial species.
Used for evolutionary studies and for generating evolutionary maps.
Used for screening asymptomatic high-risk populations before disease occurs.
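As an illustration of how open reading frames can be identified once a sequence is available, the sketch below scans the three forward reading frames for a start codon (ATG) followed in-frame by a stop codon. The function name `find_orfs` and the `min_codons` parameter are hypothetical and chosen for this example. It is a simplified sketch: real ORF finders also scan the reverse-complement strand and handle alternative start codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, orf) tuples for ATG...stop ORFs on the forward strand.

    start is 0-based and end is exclusive; the stop codon is included in the ORF.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)

print(find_orfs("CCATGAAATAGGG"))
```

Here the only ORF is in the third frame: it starts at position 2 and ends at the TAG stop codon.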
Limitations of DNA sequencing
DNA sequencing relies on computer algorithms to process the data, so substantial computing power is required.
It is difficult to sequence regions such as tandem repeats, repetitive DNA, fragmented genes and other duplicated regions.
Errors can occur during sample preparation before sequencing, and these result in economic losses.
Automated sequencing
Reading the sequence from the electrophoretic pattern in the manual Sanger method was tedious. Recent advances have enabled semi-automated Sanger sequencing, which is Sanger’s method with some minor variations. Here, instead of four different reaction tubes, a single tube is used, so during electrophoresis the DNA runs in a single lane. Fluorescently labelled ddNTPs are used. Capillary electrophoresis separates the DNA molecules on the basis of size, and it is powerful enough to resolve fragments that differ in length by a single nucleotide. The chromatogram generated after capillary electrophoresis shows the output as fluorescent peaks, with each colour representing a particular ddNTP.
In dye-terminator sequencing, each of the four dideoxynucleotide chain terminators is labelled with a different fluorescent dye, and each dye emits light at a different wavelength.
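The idea of reading a chromatogram can be sketched in a few lines: at each peak position, the base whose dye channel gives the strongest signal is called. The data layout (one dictionary of intensities per peak) and the name `call_bases` are assumptions made for this illustration, and the intensity values are invented. Real base callers also correct for dye mobility and assign quality scores.

```python
def call_bases(peaks):
    """Call one base per peak by picking the channel with the highest intensity.

    peaks: list of dicts mapping a base letter to the fluorescence
    intensity measured in that dye channel at one peak position.
    """
    return "".join(max(peak, key=peak.get) for peak in peaks)

# Invented example intensities for four consecutive peaks.
trace = [
    {"A": 910, "C": 35, "G": 28, "T": 40},
    {"A": 22, "C": 30, "G": 875, "T": 41},
    {"A": 18, "C": 29, "G": 33, "T": 902},
    {"A": 25, "C": 860, "G": 31, "T": 27},
]
print(call_bases(trace))
```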
Pyrosequencing:
The principle of pyrosequencing was described in 1993 by Bertil Pettersson, Mathias Uhlen and Pål Nyren. The method detects the pyrophosphate (PPi) released each time DNA polymerase adds a nucleotide to the growing chain. The order of the nucleotides is determined from the PPi released as each nucleotide is joined to the one before it.
Three enzymes are required in the pyrosequencing method, and they work sequentially to detect the PPi. The three enzymes are:
· DNA polymerase (without exonuclease activity)
· Sulfurylase
· Luciferase
Polymerase activity is monitored in real time by detecting the released pyrophosphate.
DNA polymerase adds dNTPs to the single-stranded DNA template. If the correct complementary base is supplied, it is incorporated and pyrophosphate is released.
Sulfurylase converts the PPi into ATP with the help of APS (adenosine 5´ phosphosulfate).
In the presence of ATP and oxygen, luciferase converts luciferin into oxyluciferin, and light is released.
So, each time the correct nucleotide is added, the enzymatic reaction releases light, which is detected by a photodiode or a photomultiplier tube.
Based on the substrate used, there are two types of pyrosequencing: solid-phase pyrosequencing and liquid-phase pyrosequencing.
The pyrosequencing method is faster than gel-based Sanger sequencing and is accurate over short reads.
However, it involves more enzymatic steps and is therefore more complex.
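The detection logic described above can be sketched as a small simulation. Nucleotides are dispensed one type at a time in a fixed cyclic order. Each dispensation gives a light signal proportional to the number of nucleotides incorporated, so a run of identical bases gives a proportionally stronger signal and a non-matching nucleotide gives none. The function names, the `flow_order` default and the idealised noise-free signal are assumptions for this illustration.

```python
def simulate_pyrogram(new_strand, flow_order="ACGT", max_flows=100):
    """Return (nucleotide, light_units) for each dispensation.

    new_strand is the strand being synthesised (complementary to the template).
    light_units counts how many nucleotides were incorporated in that flow.
    """
    flows = []
    pos = 0
    n = 0
    while pos < len(new_strand) and n < max_flows:
        nt = flow_order[n % len(flow_order)]
        run = 0
        # A homopolymer run is incorporated in a single dispensation.
        while pos < len(new_strand) and new_strand[pos] == nt:
            run += 1
            pos += 1
        flows.append((nt, run))
        n += 1
    return flows

def decode_pyrogram(flows):
    """Rebuild the sequence from the dispensation order and light signals."""
    return "".join(nt * units for nt, units in flows)

flows = simulate_pyrogram("GGATC")
print(flows)
print(decode_pyrogram(flows))
```

In this example the third dispensation (G) produces two units of light because two G nucleotides are incorporated in a row.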
Whole-genome shotgun sequencing:
This technique is also a modification of Sanger’s chain termination method. The shotgun sequencing concept was originally developed by Frederick Sanger and his colleagues, and it can be used to sequence the entire genome of an organism.
The principle is the same as Sanger’s method, with an additional DNA fragmentation step to generate multiple fragments. The entire genome of an organism is fragmented, either with endonuclease enzymes or mechanically, and the smaller fragments are sequenced individually.
Computer software then analyses the overlapping fragments and reassembles them to generate the complete sequence of the entire genome.
Steps involved:
1. Fragmentation of DNA into pieces of about 2-20 kb.
2. Formation of libraries of subfragments: the fragments are ligated into vectors and an entire library is generated.
3. Sequencing of the subfragments.
4. Generation and reading of overlapping fragments (contigs) by computer.
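The assembly step can be illustrated with a minimal greedy overlap sketch: repeatedly find the two reads with the longest suffix-prefix overlap and merge them until no overlaps remain. The function names and the `min_len` threshold are hypothetical, and the reads are toy data. Production assemblers use more sophisticated graph-based methods and must cope with sequencing errors and repeats, which this sketch ignores.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by largest overlap; returns the remaining contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: contigs cannot be joined further
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

print(greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTTA"]))
```

The three toy reads overlap by four bases each and collapse into a single contig.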
The technique is faster and cheaper than clone-by-clone sequencing and can be used to sequence the whole genome of an organism. However, it depends heavily on computational analysis, and substantial computing power is required for assembly.
In 1981, the shotgun sequencing method was used to sequence the cauliflower mosaic virus genome.
Clone by clone sequencing:
Whole genomes can also be sequenced clone by clone. During the 1980s and 1990s, the genomes of C. elegans and S. cerevisiae were sequenced using the clone-by-clone method, and the technique was also used during the Human Genome Project.
This method is similar to the shotgun sequencing method but has additional steps.
1. In the first step, instead of small fragments, large DNA fragments are generated and the location of each fragment is recorded through gene mapping. Multiple copies of each fragment are produced using bacterial artificial chromosomes (BACs).
2. In the next step, these cloned fragments are broken into smaller pieces, which are inserted into vectors.
3. The short fragments are then sequenced as in the shotgun technique, and the overlapping fragments are assembled by computer.
4. In the last step, the data obtained during gene mapping is used to assemble the complete sequence. So the sequences can be arranged on each chromosome based on their location.
With this method, whole chromosomes can be sequenced with few or no gaps.
However, it is more tedious, time-consuming and costly, since additional procedures such as mapping, cloning and restriction digestion are involved.
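The role of the map in the final step can be sketched as follows: because the position of each clone on the chromosome is known, the clone sequences can be placed in order and joined using their map coordinates rather than by searching for overlaps. The input layout (a 0-based start coordinate paired with each clone sequence) and the name `assemble_by_map` are assumptions for this illustration. The sketch assumes error-free sequences and exact coordinates.

```python
def assemble_by_map(clones):
    """Join clone sequences using their mapped start positions.

    clones: list of (map_start, sequence) pairs, where map_start is the
    0-based position of the clone on the chromosome.
    """
    chromosome = ""
    for start, seq in sorted(clones):
        if start > len(chromosome):
            # The map shows a region not covered by any sequenced clone.
            raise ValueError(f"gap in map before position {start}")
        # Append only the part of the clone that extends past what we have.
        chromosome += seq[len(chromosome) - start:]
    return chromosome

clones = [(5, "TACGTTA"), (10, "TAGCCA"), (0, "ATGCGTA")]
print(assemble_by_map(clones))
```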
Next-generation sequencing (NGS) or High-throughput sequencing
The most recent set of DNA sequencing technologies is collectively referred to as next-generation sequencing. Next-generation sequencing involves amplifying millions of copies of each fragment and analysing the sequences computationally. A variety of next-generation sequencing techniques use different technologies. Examples are Polony sequencing, Massively parallel signature sequencing (MPSS), 454 pyrosequencing, Illumina (Solexa) sequencing, Combinatorial probe anchor synthesis (cPAS), SOLiD sequencing, Ion Torrent semiconductor sequencing, DNA nanoball sequencing, Nanopore DNA sequencing, etc.
Although these next-generation sequencing techniques use different technologies, most share a common set of features:
· Highly parallel: many sequencing reactions take place at the same time
· Micro scale: reactions are tiny and many can be done at once on a chip
· Fast: because reactions are done in parallel, results are ready much faster
· Low-cost: sequencing a genome is cheaper than with Sanger sequencing
Next-generation sequencing is somewhat like running a very large number of tiny Sanger sequencing reactions in parallel, which allows large quantities of DNA to be sequenced much more quickly and cheaply.
NGS is currently among the fastest and most cost-effective approaches to DNA sequencing, although no sequencing platform is free of errors.
Techniques grouped under next-generation sequencing include:
· Single-molecule real-time (RNAP) sequencing
· Illumina (Solexa) sequencing
· Polony sequencing
· DNA nanoball sequencing
· SOLiD sequencing
· Single-molecule SMRT(TM) sequencing
· Massively parallel signature sequencing (MPSS)
· High-throughput sequencing
· HeliScope(TM) single-molecule sequencing
The following four steps are common to most next-generation DNA sequencing techniques:
1. Library preparation: DNA is randomly fragmented, and the fragments are ligated to specific linkers/adapters to generate a library.
2. Cluster generation or amplification: PCR and clonal amplification techniques are used to amplify the library.
3. Sequencing: DNA is sequenced using one of several methods.
4. Data analysis: Bioinformatics techniques are used to process the generated data in order to align the reads, find deviations, and assemble the entire genome.
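The data analysis step, aligning a read and finding deviations from a reference, can be sketched with a deliberately simple ungapped alignment. The read is slid along the reference, the placement with the fewest mismatches is kept, and the mismatching positions are reported as candidate variants. The function name `align_read` and the toy sequences are hypothetical. Real pipelines use indexed aligners that also handle insertions, deletions and base quality scores.

```python
def align_read(reference, read):
    """Return (offset, mismatches) for the ungapped placement with the fewest mismatches.

    mismatches is a list of (reference_position, reference_base, read_base).
    Returns None if the read is longer than the reference.
    """
    best = None
    for off in range(len(reference) - len(read) + 1):
        mismatches = [
            (off + i, reference[off + i], base)
            for i, base in enumerate(read)
            if reference[off + i] != base
        ]
        # Keep the first placement with the lowest mismatch count.
        if best is None or len(mismatches) < len(best[1]):
            best = (off, mismatches)
    return best

reference = "ATGCGTACGTTAGCCA"
read = "GTACCTTA"  # toy read carrying one substitution
print(align_read(reference, read))
```

In this toy case the read aligns at offset 4 with a single G to C difference at reference position 8.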
Library preparation
First, DNA is fragmented either enzymatically or by sonication to create smaller strands. Short, double-stranded pieces of synthetic DNA called adaptors are then ligated to these fragments using the enzyme DNA ligase. The adaptors allow each fragment to bind to a complementary oligonucleotide.
Cluster generation/Amplification
The DNA library is amplified so that, in the next stage, the sequencer receives a signal strong enough to be detected accurately and reliably. PCR-based amplification methods produce a very large number of DNA clusters, each containing many copies of one fragment.
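To give a sense of scale, the expected number of copies after PCR can be estimated as the starting copy number multiplied by (1 + efficiency) raised to the number of cycles, where an efficiency of 1.0 means perfect doubling in every cycle. The function name `pcr_copies` and the example cycle count and efficiency are assumptions for this illustration and are not taken from any particular protocol.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Estimate copies after PCR: initial_copies * (1 + efficiency) ** cycles.

    efficiency is the fraction of templates copied per cycle (1.0 = perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles

print(pcr_copies(1, 30))       # perfect doubling for 30 cycles
print(pcr_copies(1, 30, 0.9))  # hypothetical 90% efficiency
```

With perfect doubling, a single molecule yields 2^30 (about one billion) copies after 30 cycles, which is why even small efficiency losses change the final yield noticeably.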
Sequencing & Data Analysis
Various companies have developed competing next-generation sequencing techniques, such as pyrosequencing, sequencing by ligation (SOLiD) and reversible terminator sequencing (Illumina).
Applications of NGS:
Next-generation sequencing has allowed researchers to gather enormous amounts of genomic sequencing data. The technology has a wide range of applications, including the diagnosis and understanding of complex diseases, whole-genome sequencing, epigenetic analysis, mitochondrial sequencing, transcriptome sequencing, understanding how altered expression of genetic variants affects an organism, and exome sequencing. The exome is the set of all exons in a genome, that is, its protein-coding portion. In humans, the exome is about 1.5% of the genome, yet it is thought to contain up to 90% of the mutations that result in disease.
Gene therapy in cancer treatment focuses on methods like antisense RNA, which prevents the synthesis of targeted proteins, and suicide gene therapy, which introduces genes to selectively kill cancer cells. The challenge lies in delivering these genes precisely to avoid harming healthy cells. Sequencing tumor genomes enables tailored chemotherapy and personalized medicine, revolutionizing diagnostics and treatment planning.
The decreasing cost of DNA sequencing leads to its wider adoption, but it also presents challenges. Processing and storing the vast amounts of sequencing data pose computational hurdles. Ethical concerns arise regarding ownership and security of individuals’ DNA data, as it can be misused by insurance companies, mortgage brokers, or employers.
Additionally, while sequencing can identify disease risks, issues remain regarding patient awareness and the availability of effective treatments.