1
Next Generation Sequencing
An Introduction to High-Throughput (Next-Generation) Sequencing
Common workflow: sample fragmentation, library preparation, sequencing reaction, data analysis
Roche 454: pyrosequencing
Illumina Solexa: sequencing by synthesis
ABI SOLiD: sequencing by ligation
2
Roche 454 Pyrosequencing: Basic Principles
3
454 sequencing: Emulsion PCR (emPCR)
Mix DNA library & capture beads (limiting dilution)
Create “water-in-oil” emulsion (PCR reagents + emulsion oil), forming micro-reactors
Perform emulsion PCR
“Break micro-reactors” and isolate DNA-containing beads
Enrich
Anneal sequencing primer
Result: millions of clonally amplified templates on each bead; no cloning and colony picking
Figure labels: adapter-carrying library DNA; adapter complement; micro-reactors
4
454 sequencing: Deposition of DNA beads into the PicoTiter™Plate
Figure labels: load beads into PicoTiter™Plate; centrifuge step; load enzyme beads
5
Illumina Solexa Sequencing by Synthesis: Basic Principles
6
Clonal Single-Molecule Arrays (single-molecule clonal amplification)
Prepare DNA fragments
Ligate adapters
Attach single molecules to surface
Amplify to form clusters
Sequence
~1000 molecules per ~1 µm cluster
~1000 clusters per 100 µm square
~40 million clusters per experiment
Scale bar: 20 microns
7
Reversible Terminator Chemistry
All 4 labelled nucleotides in 1 reaction
[Figure: labelled nucleotide showing the fluor, its cleavage site, and the 3’ block]
Cycle: incorporation; detection; deblock and fluor removal (leaving a free 3’ end); next cycle
8
Sequencing-by-Synthesis (SBS)
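Cycle 1: add sequencing reagents; first base incorporated; remove unincorporated bases; detect signal
Cycle 2-n: add sequencing reagents and repeat
1. In each sequencing cycle, all four fluorescently labelled dNTPs are added; each carries a removable blocking group at its 3’ end.
2. Only one nucleotide can be incorporated per cycle, and the instrument reads the corresponding fluorescent signal.
3. Once the signal has been read, the blocking group is removed chemically and the next sequencing cycle begins.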
9
Base calling from the raw data
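The identity of each base of a cluster is read off from sequential images: the series of fluorescent signals recorded at each spot, cycle after cycle, is converted into the corresponding DNA sequence.
[Figure: images from cycles 1-9; example reads T G C T A C G A T … and T T T T T T T G T …]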
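The idea on this slide can be sketched as follows: for one cluster, pick the brightest of the four channels in each cycle. This is only a minimal illustration. The intensity values, the channel order `ACGT` and the function name `call_bases` are invented for the example, and real base callers also correct for effects such as channel cross-talk, which this sketch ignores.

```python
from typing import Sequence

CHANNELS = "ACGT"  # assumed channel order, for illustration only


def call_bases(intensities: Sequence[Sequence[float]]) -> str:
    """Call one base per cycle by taking the brightest channel.

    intensities[i] holds the four channel intensities (A, C, G, T)
    measured for a single cluster in cycle i.
    """
    read = []
    for cycle in intensities:
        if len(cycle) != len(CHANNELS):
            raise ValueError("each cycle needs exactly four intensities")
        brightest = max(range(len(CHANNELS)), key=lambda k: cycle[k])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Hypothetical intensities for one cluster over three cycles.
example = [
    (0.1, 0.0, 0.2, 0.9),  # T channel brightest
    (0.1, 0.1, 0.8, 0.2),  # G channel brightest
    (0.0, 0.7, 0.1, 0.1),  # C channel brightest
]
print(call_bases(example))  # prints TGC
```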
10
Solexa Sequencing Workflow
11
ABI SOLiD Sequencing by Ligation: Basic Principles
12
Library preparation: single-molecule clonal amplification on beads
13
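SOLiD reads the template DNA sequence through probe ligation reactions
1024 different 8-base probes
4 fluorescent colours; each colour encodes 4 dinucleotides and labels 256 probes (4^4)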
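The probe counts on this slide can be checked by enumeration. The sketch below assumes that only five positions of each 8-base probe are sequence-specific (the two interrogated bases plus the three matching bases mentioned on the next slide) and that the remaining three positions are universal, so they do not multiply the count. It also assumes the commonly described colour-space scheme in which a dinucleotide's colour is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3); the slide does not spell out the mapping.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit base codes
SPECIFIED_POSITIONS = 5  # 2 interrogated + 3 matching bases (assumption)


def colour_of(dinucleotide: str) -> int:
    """Colour index (0-3) of a dinucleotide under the XOR scheme."""
    first, second = dinucleotide
    return CODE[first] ^ CODE[second]


# Every combination of the five sequence-specific positions is one probe.
probes = ["".join(p) for p in product(BASES, repeat=SPECIFIED_POSITIONS)]

# The colour of a probe is set by its first two (interrogated) bases.
probes_per_colour = Counter(colour_of(p[:2]) for p in probes)
dinucleotides_per_colour = Counter(
    colour_of(a + b) for a in BASES for b in BASES
)

print(len(probes))                     # 1024 = 4**5
print(dict(probes_per_colour))         # 256 = 4**4 probes per colour
print(dict(dinucleotides_per_colour))  # 4 dinucleotides per colour
```

Under these assumptions the totals agree with the slide: 16 dinucleotides times 64 combinations of matching bases gives 1024 probes, and each colour covers 4 dinucleotides, hence 256 probes.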
14
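Sequencing by Ligation (1)
Anneal the sequencing primer to the adapter
Ligate a probe and detect fluorescence
Cleave off the fluorescent group
Second round of probe ligation; detect fluorescence
Cleave off the fluorescent group
In each probe, the two interrogated bases are followed by three matching bases, so the sequence read from a single sequencing primer is incomplete.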
15
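Sequencing by Ligation (2)
The sequencing primer is reset at five successive offsets along the adapter, ensuring that every position is interrogated.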
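Why five primer offsets are enough can be sketched with a small coverage count. The indexing here is an assumption made for illustration: position 0 is the last adapter base, position 1 the first template base, primer offsets run from 0 to -4, each ligation advances five positions, and dinucleotides that would start upstream of position 0 are ignored because they lie inside the adapter.

```python
from collections import Counter
from typing import List, Tuple

PRIMER_OFFSETS = (0, -1, -2, -3, -4)  # primers n, n-1, ..., n-4 (assumed naming)
STEP = 5  # each ligation advances five positions


def interrogated_pairs(offset: int, ligations: int) -> List[Tuple[int, int]]:
    """Dinucleotide positions read by one primer over several ligations."""
    pairs = []
    for k in range(ligations):
        first = 1 + offset + STEP * k
        if first >= 0:  # skip pairs that start upstream of position 0
            pairs.append((first, first + 1))
    return pairs


def coverage(ligations: int) -> Counter:
    """How many times each position is interrogated across all primers."""
    counts: Counter = Counter()
    for offset in PRIMER_OFFSETS:
        for pair in interrogated_pairs(offset, ligations):
            counts.update(pair)
    return counts


counts = coverage(ligations=5)
for position in range(0, 8):
    print(position, counts[position])
```

In this model, position 0 is interrogated once and every template position before the end of the read is interrogated twice, which is consistent with the "each base read twice" property claimed for SOLiD.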
16
Sequencing by Ligation (3)
Position 0 is the last base of the adapter, so it is interrogated only once; this known base is required for decoding.
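The role of the known base at position 0 can be illustrated with a toy decoder. The sketch assumes the commonly described colour-space encoding, in which each colour is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3); the function names and the example sequence are invented for illustration.

```python
from typing import List

BASES = "ACGT"
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit base codes


def encode(sequence: str) -> List[int]:
    """Colour of every adjacent base pair in the sequence."""
    return [CODE[a] ^ CODE[b] for a, b in zip(sequence, sequence[1:])]


def decode(first_base: str, colours: List[int]) -> str:
    """Rebuild the bases from a known first base and the colour calls."""
    bases = [first_base]
    for colour in colours:
        # Each colour, combined with the previous base, fixes the next base.
        bases.append(BASES[CODE[bases[-1]] ^ colour])
    return "".join(bases)


# Hypothetical example: 'T' plays the role of the adapter's last base.
sequence = "TACGGT"
colours = encode(sequence)
print(colours)
print(decode("T", colours))  # correct anchor base
print(decode("A", colours))  # wrong anchor base gives a different sequence
```

The same colour string decodes to four different sequences depending on the first base, which is why the known adapter base at position 0 is needed.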
17
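Advantages & disadvantages
454 sequencing
Long reads (400 bp)
Suitable for de novo sequencing of unknown genomes
In homopolymer runs such as AAAAAA, signal intensity is not linear in the number of bases, so the length of the run is hard to call
Solexa sequencing
Highly automated system
Produces a large number of reads; well suited to sequencing many short fragments, e.g. microRNA profiling
Because it relies on a reversible reaction, efficiency drops and the signal decays as the cycle count rises, so reads are short, which makes de novo assembly difficult
SOLiD sequencing
Each base is read twice, giving very high accuracy, especially for SNP detection
Flexible system with a well-developed bead barcoding scheme that allows sample pooling and partitioning of the sequencing area
Read length is limited by the number of ligation cycles, which makes de novo assembly difficult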
18
Applications of High-Throughput Sequencing
De novo sequencing
Genome re-sequencing (deep sequencing of genomes)
Transcriptome re-sequencing (deep sequencing of transcriptomes)
Digital expression profiling
ChIP-seq
Methyl-seq
19
Transcriptome resequencing:
MPM: malignant pleural mesothelioma
ADCA: pulmonary adenocarcinoma
20
Transcriptome characteristics
Expression differences between MPM and ADCA samples compared with a lung tissue control
Solid line: at least one read
Dashed line: at least 20 reads
Analysis of the percentage of reads containing known coding-region SNVs in the six tissue samples
SNV: single nucleotide substitution variant
21
Digital expression profiling (1): expression differences between human brain tissue and UHR (Universal Human Reference)
22
Digital expression profiling & microRNA re-sequencing:
hESC: human embryonic stem cells
EB: embryoid bodies
23
ChIP-seq (1): DNA-protein interactions on human chromosome 1
24
ChIP-seq(2): Sequenced short reads (typically 25–50 bp) from ChIP-Seq experiments are first mapped onto the reference genome. The mapped reads are then used to estimate statistical parameters, which include the estimation of the average length F of sequenced DNA fragments.
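The slide does not say how the average fragment length F is estimated. One common idea, shown here in heavily simplified form, is strand cross-correlation: reads on the forward strand mark fragment starts, reads on the reverse strand mark fragment ends, and the shift that best aligns the two sets approximates F. The function name, parameters and toy data below are assumptions made for illustration, and the toy reads are constructed so that the shift is exact, so the example shows only the mechanics.

```python
from collections import Counter
from typing import Iterable


def estimate_fragment_length(
    forward_5p: Iterable[int],
    reverse_5p: Iterable[int],
    max_shift: int = 500,
) -> int:
    """Return the shift that best aligns forward reads with reverse reads.

    forward_5p / reverse_5p are the mapped 5' positions of reads on the
    forward and reverse strands. F is taken here to be the distance
    between the two 5' ends of a fragment (a simplifying assumption).
    """
    forward = Counter(forward_5p)
    reverse = Counter(reverse_5p)

    def score(shift: int) -> int:
        # Number of forward/reverse read pairs exactly `shift` apart.
        return sum(n * reverse[pos + shift] for pos, n in forward.items())

    return max(range(1, max_shift + 1), key=score)


# Toy data: three hypothetical binding sites, fragments of length 150.
TRUE_F = 150
SITES = (1000, 5000, 9000)
OFFSETS = range(-140, 0, 20)  # fragment starts relative to each site
forward_reads = [site + off for site in SITES for off in OFFSETS]
reverse_reads = [pos + TRUE_F for pos in forward_reads]

print(estimate_fragment_length(forward_reads, reverse_reads))  # prints 150
```

With real data the positions are noisy, so practical tools typically correlate smoothed or binned coverage profiles rather than exact positions.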
25
Methyl-seq (1): methylation differences in the BRCA1 promoter region between tumors and the MCF7 cell line
26
Methyl-seq (2): Some highlights:
Correlation between ChIP-Seq and his prior SAGE-like method (called GMAT) has r=
‘However the resolution with ChIP-Seq was dramatically higher. Furthermore, ChIP-Seq was more sensitive and generated less false-negative regions’
12,726 genes whose transcription levels are known in CD4+ T-cells were correlated with the histone modifications, and 35,961 Pol II binding site ‘islands’ were identified
‘This cost-effective method produces digital-quality data and should find broad applications in our efforts to understand the contribution of the human epigenomes in gene expression and epigenetic inheritance’
27
Selected references for further reading
Genome re-sequencing
van Orsouw N J, Hogers R C, Janssen A, et al. Complexity reduction of polymorphic sequences (CRoPS): a novel approach for large-scale polymorphism discovery in complex genomes. PLoS ONE, 2007, 2(11): e1172
Hillier L W, Marth G T, Quinlan A R, et al. Whole-genome sequencing and variant discovery in C. elegans. Nat Methods, 2008, 5(2): 183-188
Transcriptome re-sequencing
Mortazavi A, Williams B A, McCue K, et al. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods, 2008, 5(7): 621-628
Sugarbaker D J, Richards W G, Gordon G J, et al. Transcriptome sequencing of malignant pleural mesothelioma tumors. Proc Natl Acad Sci USA, 2008, 105(9): 3521-3526
Digital expression profiling
Ruby J G, Jan C, Player C, et al. Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell, 2006, 127(6): 1193-1207
Morin R D, O'Connor M D, Griffith M, et al. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res, 2008, 18(4): 610-621
ChIP-seq
Johnson D S, Mortazavi A, Myers R M, et al. Genome-wide mapping of in vivo protein-DNA interactions. Science, 2007, 316(5830): 1497-1502
Robertson G, Hirst M, Bainbridge M, et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods, 2007, 4(8): 651-657
28
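To some extent, the two technologies have complementary strengths; used together, they will accelerate and deepen our research.
Teng X, Xiao H*. Prospects of DNA microarrays and high-throughput DNA sequencing technologies (in Chinese). Science in China Series C: Life Sciences, 2008, 38(10): 1-9
Teng X, Xiao H. Perspectives of DNA microarray and next-generation DNA sequencing technologies. Sci China C Life Sci; 52(1): 7-16
Although high-throughput sequencing has not been around for long, it has shown remarkable appeal in every area of genome research, and it increasingly looks set to replace the DNA microarray.
So where do microarrays go from here?
After nearly 15 years of development, microarray technology has matured into a systematic platform. Deep sequencing will likewise need several years of refinement to build a comparable system.
Microarray hybridization results are intuitive and quick to analyse, which suits the testing of large numbers of biological samples for known information. Microarray data analysis also rests on mature, complete theory, which gives strong support to downstream analysis.
The weakness of the microarray is that it is a "closed system": it can detect only features of sequences that are already known (or a limited range of variants).
The strength of deep sequencing is that it is an "open system": its capacity for discovery and for finding new information is inherently greater than that of microarrays.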
29
Thank you