
Overview of Next Generation Sequencing (NGS) Technologies

Vivien G. Dugan, Office of Genomics and Advanced Technologies, NIAID/NIH
Timothy Stockwell, J. Craig Venter Institute
August 26, 2013

NIAID Genomic Sequencing Centers for Infectious Diseases
[Diagram of center activities: high-throughput sequencing, sample processing, genomics, metagenomics, transcriptomics, method development, bioinformatics tools and pipelines, data analysis, training]

What is 'NextGen' sequencing?
• Different chemistry from Sanger
• Sequences everything in a sample
• Host, pathogen, cells, etc.
• Sequences clonally amplified molecules
• Sequencing occurs in parallel
• Millions of sequences produced concurrently
• Gigabytes of sequences

What is 'NextGen' sequencing?
• Less time than Sanger
• Large capacity
• Multiplexing, variation detection, gene expression, metagenomics
• Addresses various biological questions

Sanger vs. next-generation sequencing
100 of these (ABI 3730xl) = 1 of these (Roche/454 GS FLX)

"Single Molecule Sequencing"
[Workflow diagram: shear DNA; add adapters; select for fragments with A and B adapters; attach to a solid surface complementary to the adapters; sequence; assemble data]

Mapping sequence reads to reference
[Figure]

Why use NextGen?
• High rates of accuracy
• Many reads per sequencing run
• Faster time per sequencing run
• Multiplexing capabilities
• Decreased cost
• Useful for many different applications

Why use NextGen?
• 2004: 100 influenza genomes in NCBI
• 2013: 14,000+ influenza genomes in NCBI

Genomics Analysis at the Population Level
[Figure labels: Diversity, Molecular Epidemiology, Deep sequencing, Consensus sequencing, CLADE 2, CLADE 7]
Elodie Ghedin, Center for Vaccine Research, Dept.
Computational Systems Biology

NGS: Things to Consider
• Each platform has advantages and disadvantages
  – Read length, accuracy, reads per run, time, sequencing error rates
• Biology of the pathogen of interest
• What is your goal in sequencing?
  – Complete genome
  – Specific region or gene

True Diversity or Error?
• RNA polymerase error: 0.001
• 454 substitution error: 0.03
• Consensus of clusters to smooth out errors

NGS: Things to Consider
• Sample preparation is important
  – Sequencing everything in the sample
[Chart: read composition of a sample by category (Virus, Mammalian RNA, Mammalian mtDNA, Mycoplasma, Other); extracted values: 0, 3, 12, 2, 83]

Summary
• Next-generation sequencing provides increasingly vital information not previously available
• NGS technologies are becoming more commonly used in the field of infectious disease research
• Sequencing technologies and assembly and analysis tools are rapidly improving

NGS Criteria to Consider
• Ultimate goal
• Sequencing platform(s)
  – Coverage level/depth
  – Read length
  – Error rates
• Sample preparation
• Confirmatory sequencing

Overview of Next Generation Sequencing (NGS) Technologies
Timothy Stockwell (JCVI), Vivien Dugan (NIAID/NIH)

Outline
• Some history of DNA sequencing
• Overview of NextGen Sequencing Technologies at JCVI
• Roche/454 Pyrosequencing
• LifeTechnologies/IonTorrent Semiconductor Sequencing
• Illumina/Solexa Sequencing By Synthesis (SBS)
• Other technologies

Review Sanger Sequencing
• Randomly shear DNA, put it in a vector, and amplify with E. coli, or PCR-amplify a region of a genome
• The Sanger sequencing reaction is like PCR, except that there is only one primer and, in addition to regular nucleotides, there is a small amount of dye-labelled dideoxynucleotides, with a distinct dye for each base
• As polymerase makes new ssDNA fragments, whenever a dye-labelled dideoxynucleotide is added, extension stops, and the fragment is labelled with the dye corresponding to the last base added.
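The chain-termination idea in the bullets above can be sketched as a small simulation. This is an illustrative toy, not a model of real reaction kinetics: the function names, the 5% dideoxynucleotide fraction, and the number of template copies are all assumptions chosen for the example. The template is written in the direction the polymerase reads it, so the new strand is its base-by-base complement.

```python
import random

# Watson-Crick pairing used to build the newly synthesized strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sanger_fragments(template, dd_fraction=0.05, n_molecules=20000, seed=0):
    """Simulate chain termination on many copies of one template.

    Returns {fragment_length: dye}, where dye is the base of the
    dideoxynucleotide that stopped extension at that length.
    dd_fraction and n_molecules are illustrative assumptions.
    """
    rng = random.Random(seed)
    dye_by_length = {}
    for _ in range(n_molecules):
        for length, template_base in enumerate(template, start=1):
            incoming = COMPLEMENT[template_base]
            # With small probability the incoming nucleotide is a
            # dye-labelled dideoxynucleotide, which ends this fragment.
            if rng.random() < dd_fraction:
                dye_by_length[length] = incoming
                break
    return dye_by_length


def read_by_length(dye_by_length):
    """Order fragments from shortest to longest and read off the dyes."""
    return "".join(dye_by_length[n] for n in sorted(dye_by_length))


template = "ATGCCGTA"  # hypothetical template, in polymerase reading direction
fragments = sanger_fragments(template)
print(read_by_length(fragments))
```

With enough template copies, every fragment length is represented, so reading the terminal dyes in order of fragment length recovers the complement of the template.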
Review Sanger Sequencing
• Over many cycles, fragments of all the different lengths are formed, with each fragment ending with the dye corresponding to the base at that position
• Capillary electrophoresis in polyacrylamide gel is used to separate the fragments by length and pass them by a laser and reader to interrogate the base at each position
• The result is a chromatogram, which is then "base called" using algorithms to output the most likely base at each position, usually with an indication of the accuracy of the base call.

A chromatogram
[Figure]

Sanger Sequencing
• Think about the issues of scaling Sanger sequencing to obtain 1 million reads
• The E. coli clones or PCR reactions need separate wells
  – 2,600 384-well plates
• To read the DNA from both ends, you need double the number of wells, and you have to keep track of mate pairs
  – 5,200 384-well plates
• Also think about storage, pipette tips, labor required, etc.
• So then came along Next Generation Sequencing (NGS) technologies

NextGen Sequencing Technologies
[Photos: 454 GS FLX, Illumina HiSeq 2000, Illumina MiSeq, Ion PGM]

Sequencing Technologies in Use at JCVI

Platform         Read length (bp)   Throughput/machine run   Run time          Throughput/day   Accuracy
ABI 3730xl       600-800            75,000 bp                30-60 min         1-2 Mb           QV 30
454              400-600            400 Mb                   7 hr              800 Mb           QV 20
Illumina HiSeq   up to 100          up to 600 Gb             up to 12 days     50 Gb            80% bases QV30
Illumina MiSeq   up to 250          up to 8.5 Gb             up to 39 hours    5.2 Gb           75% bases QV30
Ion Torrent      150                900 Mb                   up to 4.5 hours   (not listed)     80% bases QV20

Roche 454 Sequencing
• Library Construction
• Sequencing Process Overview

454 Library construction
[Figure, including Covaris shearing; adapted from the 454 Users Guide]

454 Massively Parallel Pyrosequencing Process Overview
1) ssDNA library preparation
2) emPCR amplification
3) Load beads and enzymes in the PicoTiterPlate™
4) Perform sequencing by synthesis on the 454 instrument

454 Instrument and Data Output
[Figure]

454 Sequencing Workflow: Sequencing by Synthesis
• Bases (TACG) are flowed sequentially and always in the same order
(100 times for a large GS FLX run) across the PicoTiterPlate device during a sequencing run.
• A nucleotide complementary to the template strand generates a light signal.
• The light signal is recorded by the CCD camera.
• The signal strength is proportional to the number of nucleotides incorporated.

454 GS FLX Data: Image Processing Overview
1. Raw data is a series of images, one per nucleotide flow
2. Each well's data is extracted, quantified and normalized
3. Read data is converted into "flowgrams"

454 GS FLX Data: Flowgram Generation
[Figure: flowgram with signal levels labelled 1-mer to 4-mer; flow order T, A, C, G; example bases T T C T G C G A A]
Key sequence = TCAG for signal calibration

454 GS FLX Plate
[Photo]

454 GS FLX Sequencer
[Photo]

Ion Torrent Sequencing
• Similar to 454, but rather than generating a light signal and measuring it, Ion Torrent measures pH changes due to protons released during base incorporation
• The Ion Torrent chips are a massively parallel array of the world's smallest pH meters
• As a semiconductor device, Ion Torrent has been able to make their chips denser and denser (more and more wells), following the trend of the electronics industry

Ion Torrent Sequencing
[Figure]

Ion Torrent Chips
[Photo]

Ion Torrent PGM Sequencer
[Photo]

Illumina Sequencing
• Technology Overview
• Mate Pair Library Construction

Illumina Technology Overview (1) to (4)
[Four figures, each marked "Adapted from the 454 Users Guide"]

Illumina Mate Pair Library Construction
[Figure, marked "Adapted from the 454 Users Guide"]

Illumina Flow Cells
[Photo]

Illumina MiSeq Sequencer
[Photo]

Illumina HiSeq Sequencer
[Photo]

Other Technologies
• Pacific Biosciences – single-molecule sequencing; measures the incorporation of a single dye-labelled base at a time, by laser excitation of an extremely small volume that contains the polymerase and the DNA
• Oxford Nanopore – single-molecule sequencing; measures the electrical changes in a pore
that arise when bases enter and exit the pore.

Readings
• Zagordi et al. (2012) Read length versus depth of coverage for viral quasispecies reconstruction. PLoS One 7(10):e47046
• Quail et al. (2012) A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics 13:341
• Liu et al. (2012) Comparison of Next-Generation Sequencing Systems. J. Biomed. Biotechnol., July 5
• Metzker (2010) Sequencing technologies – the next generation. Nature Reviews Genetics 11:31
• Harismendy et al. (2009) Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biology 10:R32
• Nagarajan and Pop (2010) Sequencing and genome assembly using next-generation technologies. Computational Biology, Methods in Molecular Biology Vol. 673
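Worked example: 454 flowgram base calling

The 454 slides state that bases are flowed in the fixed order TACG, that signal strength is proportional to the number of nucleotides incorporated, and that the key sequence TCAG is used for signal calibration. A minimal sketch of how a flowgram could be turned back into bases under those three statements follows. The function names, the noise-free signal model, and the simple rounding rule are assumptions for illustration; this is not the vendor's base-calling algorithm.

```python
FLOW_ORDER = "TACG"  # nucleotides are flowed in this fixed, repeating order
KEY = "TCAG"         # key sequence at the start of every read


def ideal_flowgram(seq, flow_order=FLOW_ORDER):
    """Return the homopolymer length incorporated at each successive flow."""
    if set(seq) - set(flow_order):
        raise ValueError("sequence contains bases outside the flow order")
    flows, pos, flow_index = [], 0, 0
    while pos < len(seq):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
        flow_index += 1
    return flows


def call_bases(signals, flow_order=FLOW_ORDER, key=KEY):
    """Calibrate on the key, round each signal to a run length, emit bases.

    Assumes the read does not begin with the key's last base; otherwise the
    final key flow would carry more than one base and skew the calibration.
    """
    key_flows = ideal_flowgram(key, flow_order)
    one_mer = [signals[i] for i, n in enumerate(key_flows) if n == 1]
    unit = sum(one_mer) / len(one_mer)  # signal produced by a single base
    called = []
    for i, signal in enumerate(signals):
        run = int(signal / unit + 0.5)  # nearest whole number of bases
        called.append(flow_order[i % len(flow_order)] * run)
    read = "".join(called)
    if not read.startswith(key):
        raise ValueError("key sequence not found at start of read")
    return read[len(key):]


# Hypothetical raw signals: 980 units per base plus a background of 30.
raw = [980 * n + 30 for n in ideal_flowgram(KEY + "TTCTGCGAA")]
print(call_bases(raw))
```

Because each key base is incorporated in its own flow, the key gives four 1-mer signals whose mean sets the scale for every later flow. Longer homopolymers are harder to resolve in practice, since the gap between an n-mer and an (n+1)-mer signal shrinks relative to the noise.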