E-mail: xiaochuanle@126.com Improving peptide identification for tandem mass spectrometry by incorporating translatomics information Chuan-Le Xiao (肖传乐)

Presentation transcript:

Improving peptide identification for tandem mass spectrometry by incorporating translatomics information Chuan-Le Xiao (肖传乐) State Key Laboratory of Ophthalmology, Sun Yat-sen University (中山大学眼科学国家重点实验室) E-mail: xiaochuanle@126.com

Background 1 3 steps in protein identification:

Background 1 J. Proteome Res. 2014, 13, 4113−4119

Background 1 Translatomics (Ribosome profiling, Ribo-seq)

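Background 1 Mapping. Sequencing reads are 50-150 bp long. Current sequencing runs produce millions of short reads, and each read must be located accurately on the genome. Difficulty: the genome sequence is very long and the number of reads to align is large. Requirements for the computational method: • speed and accuracy • acceptable memory usage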

Background 1 High sensitivity, high speed, high accuracy. FANSe: an accurate algorithm for quantitative mapping of large scale sequencing reads. Nucleic Acids Res. 2012. FANSe2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications. PLoS ONE.

Background 1 Relationship between the amount of mRNA being transcribed and the amount of protein in A549 cells

Background 1 Spectrum axes: m/z versus peak intensity. Problem: protein identification efficiency is low (about 10-30%)

Background 1 ProVerB: high identification power and high accuracy, broadly applicable and highly reliable

Background 1 1) Statistical analysis of amino acid pairs and peak intensity (b- and y-ion intensity matrices), with i = A, C, … and j = A, C, …

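Background 1 2) Generating the theoretical spectrum. Rules for generating theoretical peaks: 1. b and y fragment ions are always generated. 2. Fragment ions containing S, T, E, or D also generate b-H2O and y-H2O. 3. Fragment ions containing R, K, Q, or N also generate b-NH3 and y-NH3. 4. When the precursor ion charge is greater than 1 and S, H, or K is present, doubly charged ions are generated.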
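
The four peak-generation rules on this slide (b/y ions always; water loss for S, T, E, D; ammonia loss for R, K, Q, N; doubly charged ions when the precursor charge exceeds 1 and S, H, or K is present) can be sketched roughly as follows. This is an illustrative sketch, not ProVerB's actual code: the function name `theoretical_peaks`, the peak labels, and the reading of rule 4 as "the fragment contains S, H, or K" are assumptions. Masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
AMMONIA = 17.026549
PROTON = 1.007276


def theoretical_peaks(peptide, precursor_charge):
    """Return {label: m/z} for the theoretical fragment peaks of a peptide.

    Hypothetical helper; rule 4 is interpreted per fragment (assumption).
    """
    peaks = {}
    n = len(peptide)
    for i in range(1, n):
        fragments = (
            ("b", peptide[:i], 0.0),        # N-terminal fragment
            ("y", peptide[n - i:], WATER),  # C-terminal fragment keeps H2O
        )
        for series, frag, offset in fragments:
            neutral = sum(MONO[aa] for aa in frag) + offset
            label = f"{series}{i}"
            # Rule 1: singly charged b and y ions are always generated.
            peaks[label] = neutral + PROTON
            # Rule 2: water loss if the fragment contains S, T, E, or D.
            if set(frag) & set("STED"):
                peaks[label + "-H2O"] = neutral - WATER + PROTON
            # Rule 3: ammonia loss if the fragment contains R, K, Q, or N.
            if set(frag) & set("RKQN"):
                peaks[label + "-NH3"] = neutral - AMMONIA + PROTON
            # Rule 4: doubly charged ion (assumed per-fragment check).
            if precursor_charge > 1 and set(frag) & set("SHK"):
                peaks[label + "++"] = (neutral + 2 * PROTON) / 2
    return peaks
```

For example, `theoretical_peaks("AK", 2)` yields `b1`, `y1`, `y1-NH3`, and `y1++`, because only the y1 fragment (K) triggers rules 3 and 4.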

Background 1 3) Scoring model (worked example). b, y ion match scoring model: P0 = 0.06. Consecutive match scoring model (b3 with b4, b4 with b5): r = 0.09083. Total score and background subtraction.
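
To make the match-scoring idea concrete, here is a minimal sketch of a generic binomial match score reported as -10·lg(P). It assumes each theoretical peak matches by chance with probability P0 = 0.06 (the value shown on the slide) and that P is the upper-tail probability of seeing at least the observed number of matches. The slide does not give ProVerB's exact formula, so the function names and the tail definition are assumptions, and the consecutive-match term (r = 0.09083) is not modelled.

```python
import math


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def match_score(n_theoretical, n_matched, p_random=0.06):
    """Hypothetical -10*lg(P) score for n_matched of n_theoretical peaks."""
    tail = binomial_tail(n_theoretical, n_matched, p_random)
    # Guard against floating-point underflow before taking the logarithm.
    return -10.0 * math.log10(max(tail, 1e-300))
```

With this definition, zero matched peaks score 0 and the score grows as more theoretical peaks are matched, which is the behaviour a chance-match model should have.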

Background 1 IPomics: we propose a novel strategy and have developed a software system, IPomics, that identifies peptides by incorporating prior abundance information from translatomics.

Materials and method 2 1. Data resources: five paired Ribo-seq and MS/MS datasets

Materials and method 2 2. Analysis pipeline. The analysis pipeline of IPomics is made up of five key steps (figure labels: FANSe2, ProVerB).

RESULTS 3 1. The prior information of FPKM for protein identification 2. The incorporation of translatomic FPKM in the scoring model 3. Comparison of IPomics with Mascot, OMSSA, X!Tandem and pFind 4. Computational validation with SILAC and tyrosine phosphorylation datasets

RESULTS 3 1. The prior information of FPKM for protein identification

RESULTS 3 We established a quantification model that transforms the translatomic FPKM into the corresponding probability of protein identification.
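
The slide does not state the form of the quantification model. As one hypothetical way to relate FPKM to identification probability, the sketch below bins proteins by log10(FPKM) and takes the fraction identified in each bin as an empirical estimate. The function name, the binning scheme, and the toy data in the usage note are all assumptions for illustration, not the IPomics model or its results.

```python
import math
from collections import defaultdict


def identification_rate_by_fpkm(fpkm_by_protein, identified, bin_width=1.0):
    """Empirical identification rate per log10(FPKM) bin (assumed approach).

    fpkm_by_protein: dict mapping protein ID -> FPKM
    identified: set of protein IDs identified by MS/MS
    Returns {bin index: fraction of proteins in that bin that were identified}.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for protein, fpkm in fpkm_by_protein.items():
        if fpkm <= 0:
            continue  # log10 is undefined; proteins without signal are skipped
        bin_index = math.floor(math.log10(fpkm) / bin_width)
        totals[bin_index] += 1
        if protein in identified:
            hits[bin_index] += 1
    return {b: hits[b] / totals[b] for b in sorted(totals)}
```

With made-up input such as `{"P1": 0.5, "P2": 5.0, "P3": 8.0, "P4": 50.0}` and identified set `{"P3", "P4"}`, the bins -1, 0, and 1 receive rates 0.0, 0.5, and 1.0, illustrating how a higher FPKM can map to a higher prior probability.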

RESULTS 3 2. The incorporation of translatomic FPKM in the scoring model. The PF derived from the prior FPKM information can be incorporated into the binomial scoring model in two ways: simple fragment match and consecutive ion match. We evaluated the distributions of peptide scores under the two scoring methods, -10·lg(P) and -10·lg(Psimple).

RESULTS 3 3. Comparison of IPomics with Mascot, OMSSA, X!Tandem and pFind

Comparison_peptides 3

Comparison_high-confidence peptides 3
Table 2. Fractions of high-confidence peptides for the five algorithms (peptide level)

Algorithm   Human           Youngbrain      Oldbrain        Youngliver      Oldliver
Total       37300           43357           42154           34667           34675
Mascot      29042 (77.9%)   36691 (84.6%)   35917 (85.2%)   29635 (85.5%)   29763 (85.8%)
OMSSA       24780 (66.4%)   36803 (84.9%)   35880 (85.1%)   29682 (85.6%)   29838 (86.1%)
X!Tandem    30862 (82.7%)   39161 (90.3%)   38202 (90.6%)   31636 (91.3%)   31647 (91.3%)
pFind       33879 (90.8%)   36006 (83.1%)   35240 (83.6%)   30124 (86.9%)   30379 (87.6%)
IPomics     36444 (97.7%)   42748 (98.6%)   41575 (98.6%)   34103 (98.4%)   34084 (98.3%)

RESULTS 3 4. Computational validation with SILAC and tyrosine phosphorylation datasets

RESULTS 3 4. Computational validation with tyrosine phosphorylation datasets. Table S8. Identified spectra and peptides in the tyrosine dataset. Of the 304 tyrosine sites identified by IPomics, 175 were also found by both Mascot and OMSSA, and the high-confidence tyrosine peptides identified by at least two engines made up 85.5% of the IPomics results (Fig. 7). The remaining 14.5% (44) of the tyrosine phosphorylation peptides were identified only by IPomics. However, all of these peptides carry tyrosine phosphorylation sites that have been experimentally verified in PhosphoSitePlus.

Thanks!