Sunday, January 14, 2024

DNA sequencing - applications, limitations & other methods

 

Applications of DNA sequencing

Used for the identification of genes or mutations responsible for hereditary disorders.

Used for parental verification, criminal investigation and identification of individuals using available samples such as hair, nail, blood or tissue.

Identification of GMO species and any minor variations in the plant genome.

Used to construct whole chromosomal maps, restriction digestion maps, and genome maps.

Open reading frames (ORFs), non-coding regions and protein-coding DNA sequences can be identified.

Used to identify exons, introns, repeat sequences and tandem repeats.

Used in gene manipulation and gene editing.

New variations in nature can be determined through sequencing.

Used in metagenomic studies.

For microbial identification and the study of new bacterial species.

For evolutionary studies and for generating evolutionary maps.

For studying asymptomatic high-risk populations before the onset of disease.

Limitations of DNA sequencing

DNA sequencing relies heavily on computational analysis, so processing the data requires high-performance computing resources.

Regions such as tandem repeats, repetitive DNA, fragmented genes and other duplicated regions are difficult to sequence.

Errors can occur during sample preparation, and these result in economic losses.

Automated sequencing

Identifying the sequence from the electrophoretic pattern in the manual Sanger method was tedious. Recent advances have enabled semi-automated Sanger sequencing, which is Sanger’s method with some minor variations.

Here, instead of four separate reaction tubes, a single tube is used, so during electrophoresis the DNA runs in a single lane. Fluorescently labelled ddNTPs are used. Capillary electrophoresis separates the DNA molecules by size, with enough resolution to distinguish fragments that differ in length by a single nucleotide. The chromatogram generated after capillary electrophoresis shows a series of fluorescent peaks, each colour representing a particular ddNTP.

In dye-terminator sequencing, each of the four dideoxynucleotide chain terminators is labelled with a fluorescent dye, and each dye emits light at a different wavelength.
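As a rough illustration of how a chromatogram becomes a sequence, the sketch below sorts fluorescent peaks by migration time and maps each dye colour to a base. The colour-to-base table, the peak values and the `call_bases` helper are assumptions made for this example and do not describe any particular instrument.

```python
# Hypothetical dye-to-base assignment; real instruments use their own dye sets.
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def call_bases(peaks):
    """Turn (migration_time, dye_colour) peaks into a base sequence.

    Shorter fragments migrate faster, so sorting by migration time
    orders the terminating bases along the newly synthesised strand.
    """
    ordered = sorted(peaks, key=lambda peak: peak[0])
    return "".join(DYE_TO_BASE[colour] for _, colour in ordered)


# Made-up peaks, listed out of order on purpose.
peaks = [(12.4, "red"), (10.1, "green"), (11.2, "blue"), (13.0, "yellow")]
print(call_bases(peaks))  # ACTG
```

Real base callers also weigh peak height, spacing and overlap before assigning a base; this sketch ignores signal quality entirely.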


Pyrosequencing:

This method was described in 1993 by Bertil Pettersson, Mathias Uhlén and Pål Nyrén. Its principle is the detection of the pyrophosphate (PPi) released each time a nucleotide is added to the growing chain. The order of the nucleotides is determined from the PPi released when a nucleotide is joined to the preceding one.

Three enzymes are required in the pyrosequencing method which work in a sequential manner for the detection of the PPi. The three enzymes are:

·                  DNA polymerase (without exonuclease activity)

·                  Luciferase

·                  Sulfurylase

Polymerase activity is monitored in real time to detect the released pyrophosphate.

 
DNA polymerase adds dNTPs to the single-stranded DNA template. If the correct complementary base is added, pyrophosphate (PPi) is released.

The enzyme sulfurylase converts the PPi into ATP (energy) with the help of APS (adenosine 5´ phosphosulfate).

In the presence of ATP and oxygen, luciferase converts luciferin into oxyluciferin, and a photon of light is released.

So, once the correct nucleotide is added, light will be released by the enzymatic reaction which is detected by a photodiode or a photomultiplier tube.
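The reaction cascade above can be sketched as a small simulation. Nucleotides are dispensed one at a time in a fixed order, and the light signal is taken to be proportional to the number of bases incorporated in that step. The function names, the dispensation order and the idealised noise-free signal are assumptions for illustration only.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pyrogram(template, dispensation_order="ACGT", max_cycles=50):
    """Simulate idealised light signals for a single-stranded template.

    `template` lists the bases in the order the polymerase meets them.
    Returns a list of (dispensed_nucleotide, signal) pairs, where the
    signal equals the number of bases incorporated in that step.
    """
    signals = []
    pos = 0
    for _ in range(max_cycles):
        for nucleotide in dispensation_order:
            if pos >= len(template):
                return signals
            incorporated = 0
            # A run of identical template bases is extended in one step,
            # releasing proportionally more PPi and hence more light.
            while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
                incorporated += 1
                pos += 1
            signals.append((nucleotide, incorporated))
    return signals


def read_sequence(signals):
    """Rebuild the synthesised strand from the light signals."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = pyrogram("TTGCA")
print(signals)                 # [('A', 2), ('C', 1), ('G', 1), ('T', 1)]
print(read_sequence(signals))  # AACGT
```

The double-height signal for the first dispensation shows how a run of identical bases is read as one stronger flash rather than two separate ones.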

Depending on the substrate used, there are two types of pyrosequencing: solid-phase and liquid-phase.

Pyrosequencing is faster than Sanger sequencing because bases are read in real time without electrophoresis, although runs of identical bases are harder to call accurately.

However, the method involves more chemical steps and is therefore more complex.

Whole-genome shotgun sequencing:

This technique is also a modification of Sanger’s chain termination method. The shotgun sequencing concept was originally introduced by F. Sanger and his colleagues, and it can be used to sequence the entire genome of an organism.

The principle is the same as Sanger’s method, with an additional DNA fragmentation step that generates multiple fragments. The entire genome of an organism is fragmented with endonuclease enzymes or mechanically, and the smaller fragments are sequenced individually.

Computer software analyses the overlapping fragments and reassembles them to generate the complete sequence of the entire genome.

Steps involved:

1. Fragmentation of the DNA into pieces of about 2-20 kb.

2. Formation of libraries of subfragments: the fragments are ligated into vectors and an entire library is generated.

3. Sequencing of the subfragments.

4. Generation and reading of overlapping fragments (contigs) by computer.
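The computational step can be illustrated with a minimal greedy overlap assembler: it repeatedly merges the two reads with the longest suffix-to-prefix overlap. This is a sketch under simplifying assumptions (error-free reads, a single strand, no repeats); the function names and the `min_len` parameter are hypothetical, and production assemblers use far more elaborate methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by best overlap; returns the resulting contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left: the remaining reads stay separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))  # ['ATGGCGTACCTGA']
```

Repetitive regions break this approach, because identical overlaps can join reads from different parts of the genome.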

The technique is faster and cheaper than clone by clone sequencing and can be used to sequence the whole genome of an organism. It depends on computational analysis, however, and requires powerful computing resources.

In 1981, the shotgun sequencing method was used to sequence the cauliflower mosaic virus genome.

 

Clone by clone sequencing:

The clone by clone method can also be used to sequence a whole genome. In the 1990s, the genomes of S. cerevisiae and C. elegans were sequenced clone by clone, and the technique was also used during the Human Genome Project.

This method is similar to the shotgun sequencing method but has additional steps.

1. In the first step, instead of small fragments, the genome is cut into large DNA fragments, and the location of each fragment is recorded through gene mapping. Multiple copies of each fragment are generated using bacterial artificial chromosomes.

2. In the next step, these copied fragments are broken into smaller pieces and inserted into vectors.

3. The short fragments are then sequenced as in the shotgun technique, and the overlapping fragments are assembled by computer.

4. In the last step, the data obtained during gene mapping is used to assemble the complete sequence, so that the sequences can be arranged on each chromosome according to their location.
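The last step can be sketched as follows: each assembled clone carries the chromosome coordinate recorded during mapping, and the clones are laid down in map order. The `(map_start, sequence)` representation, the use of "N" for unsequenced gaps and the function name are assumptions for this example.

```python
def assemble_chromosome(clones):
    """Place clone sequences on a chromosome using their mapped positions.

    `clones` is a list of (map_start, sequence) pairs, where map_start is
    the 0-based chromosome coordinate of the clone's first base. Clones
    are assumed to agree wherever they overlap.
    """
    chromosome = ""
    for start, seq in sorted(clones):
        if start > len(chromosome):
            # No clone covers this stretch: mark it as an unsequenced gap.
            chromosome += "N" * (start - len(chromosome))
        # Append only the part that extends beyond what is already placed.
        chromosome += seq[len(chromosome) - start:]
    return chromosome


clones = [(5, "GTACCTGA"), (0, "ATGGCGTA")]
print(assemble_chromosome(clones))  # ATGGCGTACCTGA
```

Because positions come from the map rather than from sequence overlap alone, repeated regions are less likely to be joined in the wrong place.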


With this method, whole chromosomes can be sequenced with very few gaps.

However, it is more tedious, time-consuming and costly, since it involves additional procedures such as mapping, cloning and restriction digestion.

Next-generation sequencing (NGS) or High-throughput sequencing

The most recent set of DNA sequencing technologies is collectively referred to as next-generation sequencing. Next-generation sequencing typically involves amplifying DNA fragments into millions of copies and analysing the resulting sequences computationally. There are a variety of next-generation sequencing techniques that use different technologies. Examples are Polony sequencing, Massively parallel signature sequencing (MPSS), 454 pyrosequencing, Illumina (Solexa) sequencing, Combinatorial probe anchor synthesis (cPAS), SOLiD sequencing, Ion Torrent semiconductor sequencing, DNA nanoball sequencing, Nanopore DNA sequencing, etc.

Although these techniques use different technologies, most share a common set of features:

·                  Highly parallel: many sequencing reactions take place at the same time

·                  Micro scale: reactions are tiny and many can be done at once on a chip

·                  Fast: because reactions are done in parallel, results are ready much faster

·                  Low-cost: sequencing a genome is cheaper than with Sanger sequencing

Next-generation sequencing is somewhat like running a very large number of tiny Sanger sequencing reactions in parallel, which allows large quantities of DNA to be sequenced much more quickly and cheaply. NGS is currently among the most advanced, fast and accurate approaches to DNA sequencing.

There are several platforms for NGS:

·                  Single-molecule real-time (RNAP) sequencing

·                  Illumina (Solexa) sequencing

·                  Polony sequencing

·                  DNA nano ball sequencing

·                  SOLiD sequencing

·                  Single-molecule SMRT(TM) sequencing

·                  Massively parallel signature sequencing (MPSS)

·                  High throughput sequencing

·                  HeliScope (TM) single-molecule sequencing

The NGS process can be divided into four steps, which are common to most next-generation DNA sequencing techniques:

1. Library preparation: DNA is randomly fragmented, and the fragments are ligated to specific linkers/adapters to generate a library.

2. Cluster generation or amplification: PCR and clonal amplification techniques are used to amplify the library.

3. Sequencing: DNA is sequenced using one of several methods.

4. Data analysis: Bioinformatics techniques are used to process the generated data in order to align the reads, find deviations, and assemble the entire genome.

Library preparation

First, DNA is fragmented either enzymatically or by sonication to create smaller strands. Short, double-stranded pieces of synthetic DNA called adaptors are then ligated to these fragments using the enzyme DNA ligase. The adaptors allow each fragment to bind to a complementary counterpart.
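A toy version of library preparation is shown below: a sequence is cut into pieces of random length, and an adaptor is attached to each end. The adaptor sequences, size range and helper names are made up for illustration and do not correspond to any commercial kit.

```python
import random

# Hypothetical adaptor sequences, for illustration only.
ADAPTOR_LEFT = "ACACACAC"
ADAPTOR_RIGHT = "GTGTGTGT"


def fragment(dna, min_len, max_len, rng):
    """Cut `dna` into consecutive pieces of random length.

    The final piece may be shorter than `min_len`.
    """
    fragments = []
    pos = 0
    while pos < len(dna):
        size = rng.randint(min_len, max_len)
        fragments.append(dna[pos:pos + size])
        pos += size
    return fragments


def ligate_adaptors(fragments):
    """Attach one adaptor to each end of every fragment."""
    return [ADAPTOR_LEFT + f + ADAPTOR_RIGHT for f in fragments]


rng = random.Random(0)  # fixed seed so the example is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(200))
fragments = fragment(genome, 20, 40, rng)
library = ligate_adaptors(fragments)
print(len(library), "adaptor-ligated fragments")
```

Because every library molecule now starts and ends with a known sequence, the same primers or surface-bound oligos can be used for all fragments in later steps.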

Cluster generation/Amplification 

The DNA library is amplified so that, in the next stage, the sequencer receives a signal strong enough to be detected accurately and reliably. PCR amplification methods produce a very large number of DNA clusters.
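The reason amplification yields enough signal can be seen from the standard PCR relation N = N0 × (1 + E)^n, where N0 is the starting copy number, n the number of cycles and E the per-cycle efficiency (E = 1 means perfect doubling). The function name and the example values below are assumptions for illustration.

```python
def copies_after_pcr(initial_copies, cycles, efficiency=1.0):
    """Expected copy number after PCR: N = N0 * (1 + E) ** n.

    efficiency = 1.0 means every molecule is duplicated in each cycle.
    """
    return initial_copies * (1 + efficiency) ** cycles


print(copies_after_pcr(1, 10))       # 1024.0 with perfect doubling
print(copies_after_pcr(1, 10, 0.9))  # fewer copies at 90% efficiency
```

Even a modest drop in per-cycle efficiency compounds over many cycles, which is one reason amplification can distort the relative abundance of fragments.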

 Sequencing & Data Analysis 

Various companies have developed competing next-generation sequencing techniques, including pyrosequencing, sequencing by ligation (SOLiD) and reversible terminator sequencing (Illumina).
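The data analysis step, aligning reads and finding deviations, can be sketched with a naive ungapped alignment: the read is slid along a reference, the position with the fewest mismatches is kept, and the differing bases are reported. The function names and the example sequences are assumptions, and the read is assumed to be no longer than the reference; real pipelines use indexed aligners and statistical variant callers.

```python
def best_alignment(read, reference):
    """Return (offset, mismatches) of the best ungapped placement."""
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(1 for r, w in zip(read, window) if r != w)
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


def find_deviations(read, reference):
    """List (position, reference_base, read_base) where the read differs."""
    offset, _ = best_alignment(read, reference)
    return [
        (offset + i, reference[offset + i], base)
        for i, base in enumerate(read)
        if base != reference[offset + i]
    ]


reference = "ATGGCGTACCTGA"
read = "GCGTTCC"  # matches reference[3:10] except for one base
print(best_alignment(read, reference))   # (3, 1)
print(find_deviations(read, reference))  # [(7, 'A', 'T')]
```

A single read cannot distinguish a true variant from a sequencing error, which is why many overlapping reads are normally compared at each position.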


 Applications of NGS: 

Next-generation sequencing has allowed researchers to gather enormous amounts of genomic sequencing data. The technology has a wide range of applications, including the diagnosis and understanding of complex diseases, whole-genome sequencing, epigenetic analysis, mitochondrial sequencing, transcriptome sequencing, understanding how altered expression of genetic variants affects an organism, and exome sequencing. The exome is the set of all exons in a genome, that is, its protein-coding portion. In humans, the exome is about 1.5% of the genome, yet it is thought to contain up to 90% of the mutations that result in disease.

 Gene therapy in cancer treatment focuses on methods like antisense RNA, which prevents the synthesis of targeted proteins, and suicide gene therapy, which introduces genes to selectively kill cancer cells. The challenge lies in delivering these genes precisely to avoid harming healthy cells. Sequencing tumor genomes enables tailored chemotherapy and personalized medicine, revolutionizing diagnostics and treatment planning.

 The decreasing cost of DNA sequencing leads to its wider adoption, but it also presents challenges. Processing and storing the vast amounts of sequencing data pose computational hurdles. Ethical concerns arise regarding ownership and security of individuals’ DNA data, as it can be misused by insurance companies, mortgage brokers, or employers.

 Additionally, while sequencing can identify disease risks, issues remain regarding patient awareness and the availability of effective treatments.

 


 

 
