Improving peptide identification for tandem mass spectrometry by incorporating translatomics information. Chuan-Le Xiao (肖传乐). E-mail: xiaochuanle@126.com

Presentation transcript:

Improving peptide identification for tandem mass spectrometry by incorporating translatomics information. Chuan-Le Xiao (肖传乐), State Key Laboratory of Ophthalmology, Sun Yat-sen University. E-mail: xiaochuanle@126.com

Background 1 Three steps in protein identification:

Background 1 J. Proteome Res. 2014, 13, 4113−4119

Background 1 Translatomics (Ribosome profiling, Ribo-seq)

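Background 1 Mapping. Sequencing reads are 50-150 bp. Current sequencing produces millions of short reads, and each read must be located accurately on the genome. Difficulties: the genome sequence is very long, and the number of reads to align is large. Requirements for the computational method: • speed and accuracy • acceptable memory usage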

Background 1 High sensitivity! High speed! High accuracy! FANSe: an accurate algorithm for quantitative mapping of large scale sequencing reads. Nucleic Acids Res. 2012. FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications. PLoS ONE

Background 1 Relationship between the amount of mRNA being transcribed and protein abundance in A549 cells

Background 1 m/z, peak intensity? Problem: protein identification efficiency is low (about 10-30%)

Background 1 ProVerB: high identification power and high precision, broad applicability, and high reliability

Background 1 1) Statistical analysis of paired amino acids and peak intensity (b and y ion intensity matrices), i = A, C, ...; j = A, C, ...

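Background 1 2) Generating the theoretical spectrum. Rules for generating theoretical peaks:
1. b and y fragment ions are always generated.
2. Fragment ions containing S, T, E, or D generate b-H2O and y-H2O.
3. Fragment ions containing R, K, Q, or N generate b-NH3 and y-NH3.
4. If the precursor charge state is greater than 1 and the fragment contains S, H, or K, doubly charged ions are generated.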
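The four rules above can be sketched in a few lines of Python. This is a minimal illustration, not ProVerB's actual code: the function name `theoretical_peaks`, the peak labels, and the reading of rule 4 as "the fragment itself contains S, H, or K" are assumptions, and the monoisotopic masses are standard textbook values rather than numbers taken from the slides.

```python
# Standard monoisotopic residue masses in Da (assumed, not from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
AMMONIA = 17.026549
PROTON = 1.007276


def theoretical_peaks(peptide, precursor_charge):
    """Return {label: m/z} for the theoretical peaks of one peptide.

    Hypothetical helper that follows the four rules listed on the slide.
    """
    peaks = {}
    n = len(peptide)
    for i in range(1, n):
        # b_i is the first i residues; y_i is the last i residues plus water.
        for series, fragment, offset in (
            ("b", peptide[:i], 0.0),
            ("y", peptide[n - i:], WATER),
        ):
            neutral = sum(RESIDUE_MASS[aa] for aa in fragment) + offset
            label = f"{series}{i}"
            residues = set(fragment)
            # Rule 1: singly charged b and y ions are always generated.
            peaks[label] = neutral + PROTON
            # Rule 2: water loss if the fragment contains S, T, E, or D.
            if residues & set("STED"):
                peaks[label + "-H2O"] = neutral - WATER + PROTON
            # Rule 3: ammonia loss if the fragment contains R, K, Q, or N.
            if residues & set("RKQN"):
                peaks[label + "-NH3"] = neutral - AMMONIA + PROTON
            # Rule 4 (assumed reading): doubly charged ion when the precursor
            # charge is above 1 and the fragment contains S, H, or K.
            if precursor_charge > 1 and residues & set("SHK"):
                peaks[label + "++"] = (neutral + 2 * PROTON) / 2
    return peaks
```

Calling `theoretical_peaks("AK", 2)` yields the b1 and y1 peaks, the y1-NH3 neutral loss (the y1 fragment is K), and a doubly charged y1 ion.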

Background 1 3) Scoring model example. b, y ion match scoring model: P0 = 0.06. Consecutive match scoring model (b3 and b4, b4 and b5): r = 0.09083. Total score and background subtraction.

Background 1 IPomics: we propose a novel strategy and develop a software system, IPomics, for peptide identification that incorporates prior abundance information from translatomics.

Materials and methods 2 1. Data resources: five paired Ribo-seq and MS/MS datasets

Materials and methods 2 2. Analysis pipeline (ProVerB, FANSe2). The analysis pipeline of IPomics consists of five key steps.

RESULTS 3
1. The prior information of FPKM for protein identification
2. The incorporation of translatomic FPKM in the scoring model
3. Comparison of IPomics with Mascot, OMSSA, X!Tandem and pFind
4. Computational validation with SILAC and tyrosine phosphorylation datasets

RESULTS 3 1. The prior information of FPKM for protein identification

RESULTS 3 We established a quantification model that transforms translatomic FPKM into the corresponding probability of protein identification.
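For readers unfamiliar with the unit, the sketch below shows only the standard definition of FPKM (fragments per kilobase of transcript per million mapped fragments). It does not reproduce the quantification model from the slide, whose form is not given here; the function name `fpkm` and its parameter names are illustrative assumptions.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase per million mapped fragments.

    Hypothetical helper; the slide's FPKM-to-probability model is not shown.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
```

For example, 100 fragments on a 1,000 bp transcript in a library of one million mapped fragments corresponds to `fpkm(100, 1000, 1_000_000)`.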

RESULTS 3 2. The incorporation of translatomic FPKM in the scoring model. There were two ways of incorporating the PF of the prior FPKM information into the binomial scoring model: simple fragment match and consecutive ion match. We evaluated the distributions of peptide scores under the two scoring methods, -10·lg(P) and -10·lg(Psimple).
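A generic binomial match score of the form -10·lg(P) can be sketched as follows. This is an assumption-laden illustration, not the IPomics or ProVerB scoring function: it takes P to be the upper-tail probability of matching at least k of n theoretical peaks by chance, uses the random-match probability P0 = 0.06 from the scoring example as a default, and leaves out the FPKM prior and the consecutive-ion term because the slides do not give their form. The names `binomial_tail` and `match_score` are hypothetical.

```python
import math


def binomial_tail(n, k, p):
    """Probability of at least k random matches among n theoretical peaks."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def match_score(n, k, p=0.06):
    """Score -10*lg(P), where P is the chance probability of >= k matches.

    The default p = 0.06 is the P0 quoted in the scoring example; the
    upper-tail form of P is an assumption made for this sketch.
    """
    return -10 * math.log10(binomial_tail(n, k, p))
```

With this form, more matched peaks give a smaller chance probability and therefore a higher score, so `match_score(20, 10)` exceeds `match_score(20, 5)`.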

RESULTS 3 3. Comparison of IPomics with Mascot, OMSSA, X!Tandem and pFind

Comparison_peptides 3

Comparison_high-confidence peptides 3
Table 2. Fractions of high-confidence peptides of the five algorithms (Type: Peptide)

Algorithm   Human           Youngbrain      Oldbrain        Youngliver      Oldliver
Total       37300           43357           42154           34667           34675
Mascot      29042 (77.9%)   36691 (84.6%)   35917 (85.2%)   29635 (85.5%)   29763 (85.8%)
OMSSA       24780 (66.4%)   36803 (84.9%)   35880 (85.1%)   29682 (85.6%)   29838 (86.1%)
X!Tandem    30862 (82.7%)   39161 (90.3%)   38202 (90.6%)   31636 (91.3%)   31647 (91.3%)
pFind       33879 (90.8%)   36006 (83.1%)   35240 (83.6%)   30124 (86.9%)   30379 (87.6%)
IPomics     36444 (97.7%)   42748 (98.6%)   41575 (98.6%)   34103 (98.4%)   34084 (98.3%)

RESULTS 3 4. Computational validation with SILAC and tyrosine phosphorylation datasets

RESULTS 3 4. Computational validation with tyrosine phosphorylation datasets. Table S8. The identified spectra and peptides in the tyrosine dataset. Of the 304 tyrosine sites identified by IPomics, 175 were also found by both Mascot and OMSSA, and the high-confidence tyrosine peptides identified by at least two engines accounted for 85.5% in IPomics (Fig. 7). The remaining 14.5% (44) of tyrosine phosphorylation peptides were identified uniquely by IPomics, with no overlap. However, all of those peptides' tyrosine phosphorylation sites have been experimentally verified in PhosphoSitePlus.

Thanks！