What Is the Relationship Between Drugs and Diseases? Li Zhiheng.


1 What Is the Relationship Between Drugs and Diseases? Li Zhiheng

2 Task: BioCreative V chemical-induced disease (CID) relation extraction
1. UTH-CCB@BioCreative V CDR Task: Identifying Chemical-induced Disease Relations in Biomedical Text
2. RELigator: Chemical-disease relation extraction using prior knowledge and textual information

3 Task introduction: BioCreative V chemical-induced disease (CID) relation extraction

4 UTH-CCB@BioCreative V CDR Task
Sentence-level classifier (Cs): CID pairs located in the same sentence.
Abstract-level classifier (CD): all candidate CID pairs.
Cs classifier features: context words with position; knowledgebase features; others.

5 Cs features 1: Context words with position
Example: C_D induced D_D in a D_D child.
Target entities: C_D, D_D009422
Unigram and bigram words before, between, and after the target entities.
Other entities between the target entities are replaced by their entity type: C_D induced D_disease in a D_D child.
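The position-aware context features described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the placeholder tokens `CHEMICAL` and `DISEASE`, the function name `context_features`, and the `window` size are all assumptions.

```python
from typing import List


def context_features(tokens: List[str], chem_idx: int, dis_idx: int,
                     window: int = 3) -> List[str]:
    """Unigram and bigram features before, between, and after two target entities.

    `window` (an assumed parameter) limits how many tokens are taken
    before the first and after the second target entity.
    """
    left, right = sorted((chem_idx, dis_idx))
    regions = {
        "before": tokens[max(0, left - window):left],
        "between": tokens[left + 1:right],
        "after": tokens[right + 1:right + 1 + window],
    }
    feats: List[str] = []
    for name, words in regions.items():
        # Each feature carries its region name, so position is encoded.
        feats += [f"{name}_uni={w}" for w in words]
        feats += [f"{name}_bi={a}_{b}" for a, b in zip(words, words[1:])]
    return feats


# Entities are assumed to be pre-replaced by placeholder tokens.
tokens = ["CHEMICAL", "induced", "DISEASE", "in", "a", "DISEASE", "child", "."]
print(context_features(tokens, chem_idx=0, dis_idx=2))
```

Prefixing each n-gram with its region is one simple way to keep "induced" between the entities distinct from "induced" after them.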

6 Cs features 2: Knowledgebase features
All relations of the chemical and disease pair in CTD, MEDI, and SIDER.
MeSH® tree structures of the entities.
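As a rough sketch of how such knowledgebase features might be produced, the snippet below looks up a chemical-disease pair in three in-memory dictionaries. The identifiers `C1` and `D1`, the relation labels for MEDI and SIDER, and the function name `kb_features` are hypothetical; real systems would load these relations from the resources' own download files.

```python
from typing import Dict, List, Set, Tuple

# One knowledge base: (chemical_id, disease_id) -> set of relation labels.
KB = Dict[Tuple[str, str], Set[str]]


def kb_features(chem_id: str, dis_id: str, kbs: Dict[str, KB]) -> List[str]:
    """Emit one feature per relation found for the pair in each knowledge base."""
    feats: List[str] = []
    for kb_name, relations in sorted(kbs.items()):
        rel_types = relations.get((chem_id, dis_id), set())
        if not rel_types:
            # Absence of any known relation is itself informative.
            feats.append(f"{kb_name}=none")
        for rel in sorted(rel_types):
            feats.append(f"{kb_name}={rel}")
    return feats


# Toy data only; IDs and the MEDI/SIDER labels are made up for illustration.
toy_kbs: Dict[str, KB] = {
    "CTD": {("C1", "D1"): {"marker/mechanism"}},
    "MEDI": {},
    "SIDER": {("C1", "D1"): {"side_effect"}},
}
print(kb_features("C1", "D1", toy_kbs))
```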

7 CTD: Comparative Toxicogenomics Database (http://ctdbase.org/)
Studies the effects of environmental chemicals on human health.

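8 CTD entities studied: chemicals/drugs; genes/proteins; diseases; taxa (taxonomic groups); phenotypes (the observable traits of an organism that result from the interaction of its genotype with the environment)
Manually curated: chemical-gene/protein interactions; chemical-disease relationships; gene-disease relationships; chemical-phenotype relationships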

9 CTD data categories: Chemicals, Diseases, Genes
Chemical-Gene/Protein Interactions; Gene-Disease Associations; Chemical-Disease Associations; Gene-Gene Interactions; References; Organisms; Gene Ontology; Pathways; Exposures

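10 CTD Chemical-Disease Associations. Download file: CTD chemical disease.xml.gz
Evidence type for each association: therapeutic (the chemical is a therapeutic agent for the disease), marker/mechanism (the chemical is a marker of, or plays a mechanistic role in, the disease), or left empty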

11 Cs features 2: Knowledgebase features
All relations of the chemical and disease pair in CTD, MEDI, and SIDER.
MeSH® tree structures of the entities.

12 MEDI: an Ensemble MEDication Indication Resource
A medication-indication resource extracted from electronic medical records.

13 Cs features 2: Knowledgebase features
All relations of the chemical and disease pair in CTD, MEDI, and SIDER.
MeSH® tree structures of the entities.

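14 SIDER: Side Effect Resource (http://sideeffects.embl.de/)
Adverse drug reactions recorded for marketed medicines and in other records.
Information extracted from public documents and package inserts.
Available information: side-effect frequency, drug and side-effect classifications, and links to further information (e.g., drug-target relations).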

15 SIDER

16 Cs features 2: Knowledgebase features
All relations of the chemical and disease pair in CTD, MEDI, and SIDER.
MeSH® tree structures of the entities.

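17 MeSH® tree structures
From a given heading, the tree can be followed to find headings that are narrower or broader than it. Example:
Extremities
  Amputation Stumps
  Lower Extremity
    Buttocks
    Foot
      Ankle
      Forefoot, Human
        Metatarsus
        Toes
          Hallux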
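MeSH encodes this hierarchy in dot-separated tree numbers, so a broader heading's tree number is a prefix of each narrower heading's. The sketch below shows how broader/narrower checks could be derived from that property; the specific tree numbers used are illustrative values and should be checked against the current MeSH release before use.

```python
from typing import List


def is_broader(broader: str, narrower: str) -> bool:
    """True if `broader` is a proper ancestor of `narrower` in the tree."""
    # The trailing dot prevents "A01.37" from matching "A01.378".
    return narrower.startswith(broader + ".")


def ancestors(tree_number: str) -> List[str]:
    """All broader tree numbers, from the top category downwards."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


# Illustrative tree numbers (assumed, not verified against a MeSH release).
extremities = "A01.378"
foot = "A01.378.610.250"
print(is_broader(extremities, foot))  # ancestor check by prefix
print(ancestors(foot))
```

Features built this way let a classifier generalize from a specific disease to its broader category.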

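18 Cs features 3: Others
Mentions and normalized values of entities.
Core chemicals: chemicals with the highest frequency or that occur in the title.
CID-SA (automatically labeled): +1 for all sentences that contain a CID relation pair; -1 for sentences that do not contain a CID relation pair.
CID-SM (manually annotated): +1 for sentences annotated as actually expressing the relation; -1 for sentences annotated as not expressing the relation although they contain a CID pair.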
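The core-chemical rule (highest frequency, or occurring in the title) could be sketched as below. The function name, the input representation as lists of normalized chemical IDs, and the handling of ties are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Set


def core_chemicals(all_mentions: List[str], title_mentions: List[str]) -> Set[str]:
    """Chemicals that occur in the title or have the highest mention count.

    `all_mentions` holds one normalized chemical ID per mention in the whole
    document; `title_mentions` holds the IDs mentioned in the title.
    """
    core = set(title_mentions)
    counts = Counter(all_mentions)
    if counts:
        top = max(counts.values())
        # Ties are all kept (an assumption; the slide does not say).
        core |= {chem for chem, n in counts.items() if n == top}
    return core


print(core_chemicals(["C1", "C2", "C2", "C3"], title_mentions=["C1"]))
```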

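19 CD classifier
Feature groups (2) and (3) of Cs (knowledgebase features and core chemicals).
Number of sentences between the entities.
Trigger words.
For all CID pairs: Cs + CD -> predictions.
If the extraction result is empty, the CID pairs linked to the core chemical are added to the final result set.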
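One way to read the combination step and the empty-result fallback is sketched here. Treating "Cs + CD" as a set union is an assumption, as are the function and argument names; the slide only states that both classifiers contribute and that core-chemical pairs are used when nothing is extracted.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]  # (chemical_id, disease_id)


def merge_predictions(cs_pairs: Iterable[Pair], cd_pairs: Iterable[Pair],
                      candidate_pairs: Iterable[Pair],
                      core: Set[str]) -> Set[Pair]:
    """Combine both classifiers' positive pairs, with a core-chemical fallback."""
    final = set(cs_pairs) | set(cd_pairs)  # assumed: union of both outputs
    if not final:
        # Fallback: keep every candidate pair whose chemical is a core chemical.
        final = {(c, d) for c, d in candidate_pairs if c in core}
    return final


candidates = [("C1", "D1"), ("C2", "D1")]
print(merge_predictions([], [], candidates, core={"C1"}))
```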

20 Results
Training set + development set -> final models.
Automatic labeling gave better results than manual annotation.

21 References
CTD: The Comparative Toxicogenomics Database's 10th year anniversary: update 2015.
MEDI: Development and evaluation of an ensemble resource linking medications to their indications. (2013)
SIDER: A side effect resource to capture phenotypic effects of drugs. (2010)

22 RELigator
RELigator: Chemical-disease relation extraction using prior knowledge and textual information
Relation extraction: all co-occurrence pairs, including pairs that cross the title-abstract border.
Features: knowledge-based features; statistical features; NLP features.

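23 Knowledge-based features
BRAIN: a graph database holding the relations among almost all entities in UMLS (drawn from structured databases and Medline articles).
Entity1 - connection - Entity2. Each connection is labeled with its source, and different sources carry different weights. Each connection is associated with a set of relations or predicates.
BRAIN provides a programming interface that can be used to query the relation paths between two given entities.
Relation paths can be direct or indirect. Each path has a confidence score that measures how closely the two entities are connected.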

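24 Statistical features
Frequency of the chemical, the disease, and the chemical-disease pair in the document.
Between the chemical and the disease: 1. the minimum sentence distance; 2. the minimum word distance.
Whether the chemical or the disease occurs in the title, or whether both occur in the title.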
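These statistical features can be computed with simple counting, as in the sketch below. Several choices are assumptions: the document is a list of tokenized sentences with the title first, each entity mention is a single token equal to its ID, and pair frequency is counted as the number of sentences containing both entities.

```python
from typing import Dict, List, Tuple


def statistical_features(sentences: List[List[str]], chem: str,
                         dis: str) -> Dict[str, int]:
    """Counts, minimum distances, and title flags for one chemical-disease pair."""
    chem_pos: List[Tuple[int, int]] = []  # (sentence index, global token index)
    dis_pos: List[Tuple[int, int]] = []
    offset = 0
    for s_idx, sent in enumerate(sentences):
        for t_idx, tok in enumerate(sent):
            if tok == chem:
                chem_pos.append((s_idx, offset + t_idx))
            elif tok == dis:
                dis_pos.append((s_idx, offset + t_idx))
        offset += len(sent)
    pairs = [(c, d) for c in chem_pos for d in dis_pos]
    title = sentences[0] if sentences else []  # assumed: title is sentence 0
    return {
        "chem_count": len(chem_pos),
        "dis_count": len(dis_pos),
        "pair_sentence_count": sum(1 for s in sentences if chem in s and dis in s),
        # -1 marks "undefined" when either entity is missing.
        "min_sentence_gap": min((abs(c[0] - d[0]) for c, d in pairs), default=-1),
        "min_word_gap": min((abs(c[1] - d[1]) for c, d in pairs), default=-1),
        "chem_in_title": int(chem in title),
        "dis_in_title": int(dis in title),
        "both_in_title": int(chem in title and dis in title),
    }


doc = [["CHEM", "toxicity"],
       ["Patients", "developed", "DIS", "after", "CHEM", "."],
       ["DIS", "resolved", "."]]
print(statistical_features(doc, "CHEM", "DIS"))
```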

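25 NLP features
The Stanford CoreNLP parser produces a dependency tree for each sentence.
Governing verb: the first verb encountered when moving from a node in the parse tree up to the root.
Semantic role: an entity's semantic role is reflected by its governing verb in the parse tree.
For the closest chemical and disease:
Relating word (e.g., "serve as").
Governing verb of the relating word (e.g., "announce").
Whether the chemical precedes the disease.
Whether a chemical-disease pair occurs in a lower-level subtree of the parse tree.
All governing verbs and their frequencies.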
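The governing-verb definition above amounts to walking up the head pointers of a dependency tree until a verb is reached. A minimal sketch follows, assuming the parse is given as a list of head indices (-1 for the root) and Penn Treebank POS tags; the example parse is hand-written for illustration and is not actual CoreNLP output.

```python
from typing import List, Optional


def governing_verb(index: int, heads: List[int], pos_tags: List[str],
                   words: List[str]) -> Optional[str]:
    """First verb met while climbing from token `index` towards the root.

    Assumes `heads` describes a well-formed tree (no cycles).
    """
    node = heads[index]  # start at the parent, not the token itself
    while node != -1:
        if pos_tags[node].startswith("VB"):  # VB, VBD, VBZ, ...
            return words[node]
        node = heads[node]
    return None  # reached the root without meeting a verb


# Hand-written toy parse of "Lithium caused tremor in patients".
words = ["Lithium", "caused", "tremor", "in", "patients"]
heads = [1, -1, 1, 2, 3]
pos_tags = ["NN", "VBD", "NN", "IN", "NNS"]
print(governing_verb(0, heads, pos_tags, words))
print(governing_verb(4, heads, pos_tags, words))
```

When a chemical and a disease share the same governing verb, that verb is a natural candidate for describing their relation.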

26 Machine learning
SVM classification with a radial basis function (RBF) kernel.
Grid search.
Ten-fold cross-validation.
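The tuning procedure named on this slide can be sketched without any particular SVM library: enumerate a grid of (C, gamma) values and score each combination by ten-fold cross-validation. In the sketch below, `evaluate` is a hypothetical callback that would train an RBF-kernel SVM on the training indices and return a score on the held-out indices; the grid values and the dummy scorer are assumptions for illustration only.

```python
import math
from itertools import product
from typing import Callable, List, Sequence, Tuple

Fold = Tuple[List[int], List[int]]  # (train indices, test indices)


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def k_fold_indices(n: int, k: int = 10) -> List[Fold]:
    """Split sample indices 0..n-1 into k train/test folds (round-robin)."""
    return [([j for j in range(n) if j % k != i],
             [j for j in range(n) if j % k == i]) for i in range(k)]


def grid_search(n_samples: int,
                evaluate: Callable[[List[int], List[int], float, float], float],
                c_values: Sequence[float], gamma_values: Sequence[float],
                k: int = 10) -> Tuple[Tuple[float, float], float]:
    """Return the (C, gamma) pair with the best mean cross-validation score."""
    best_params, best_score = None, float("-inf")
    for c, gamma in product(c_values, gamma_values):
        scores = [evaluate(train, test, c, gamma)
                  for train, test in k_fold_indices(n_samples, k)]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_params, best_score = (c, gamma), mean
    return best_params, best_score


# Dummy scorer standing in for "train an SVM and measure held-out F-score".
def dummy_evaluate(train: List[int], test: List[int], c: float, gamma: float) -> float:
    return -abs(c - 10) - abs(gamma - 0.1)


print(grid_search(100, dummy_evaluate, [1, 10, 100], [0.01, 0.1, 1.0]))
```

In practice a library routine for cross-validated grid search would replace these helper functions; the sketch only makes the loop structure explicit.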

27 References
BRAIN: Bio-IT World. Big BRAIN: Finding Connections in the Literature Flood with Euretos BRAIN [Internet]. Available from:
Euretos [Internet]. Available from:

28 Sequence Modeling: Recurrent and Recursive Nets
Zhang Jianhai


