Abdalghani Abujabal, Rishiraj Saha Roy, Mohamed Yahya, Gerhard Weikum


Never-Ending Learning for Open-Domain Question Answering over Knowledge Bases
Abdalghani Abujabal, Rishiraj Saha Roy, Mohamed Yahya, Gerhard Weikum
WWW '18

Motivation
Existing methods rely on a clear separation between an offline training phase, where a model is learned, and an online phase, where this model is deployed.
Shortcomings:
1. They require access to a large annotated training set, which is not always readily available.
2. They fail on questions from previously unseen domains.
3. They are limited to the language learned at training time.

Contributions
1. A KB-QA system that can be seeded with a small number of training examples and supports continuous learning to improve its answering performance over time.
2. A similarity function-based answering mechanism that enables NEQA to answer questions with previously unseen syntactic structures, thereby extending its coverage.
3. A user feedback component that judiciously asks non-expert users to select satisfactory answers, closing the loop between users and the system and enabling continuous learning.
4. Extensive experimental results on two benchmarks demonstrating the viability of the continuous learning approach and the ability to answer questions from previously unseen domains.

Abstract
KB-QA translates natural language questions into a semantic representation (such as SPARQL).
Offline, NEQA automatically learns templates from a small number of training question-answer pairs.
Once deployed, continuous learning is triggered on cases where the templates are insufficient.
NEQA periodically re-trains its underlying models.

Banks
NEQA keeps two banks: a template bank and a question-query bank.
Bank question: u1 = “which film awards was bill carraro nominated for?”
New question: unew = “which president was lincoln succeeded by?”
New question: unew = “what are the film award nominations that bill carraro received?”

Offline training (template bank)
Training is weakly supervised. Following "Automated Template Generation for Question Answering over Knowledge Graphs":
1. A training instance is a pair (u, Au): a question u and its answer set Au.
2. For each answer a in Au, one at a time, find the smallest connected subgraph of the KG that contains the entities found in the question as well as a. This subgraph is treated as the query.
3. Given the question and the query, generating a template is an alignment process.
Example question: “which film awards was bill carraro nominated for?”
Example query: BillCarraro nominatedFor ?x . ?x type movieAward
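Step 2 above can be illustrated with a small sketch. It assumes a single question entity and a single answer, in which case the smallest connected subgraph reduces to a shortest path; the general case with several entities is harder. The function names, the toy triples, and the node names `SomeAward` and `SomePlace` are hypothetical. The `?x type movieAward` constraint from the slide is not derived here.

```python
from collections import deque

def shortest_path_triples(triples, source, target):
    """Return the KG triples on a shortest undirected path from source to target."""
    adj = {}
    for s, p, o in triples:
        adj.setdefault(s, []).append((o, (s, p, o)))
        adj.setdefault(o, []).append((s, (s, p, o)))
    prev = {source: None}          # node -> (parent, triple used to reach it)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt, triple in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = (node, triple)
                queue.append(nxt)
    if target not in prev:
        return []
    path = []
    node = target
    while prev[node] is not None:
        parent, triple = prev[node]
        path.append(triple)
        node = parent
    return path[::-1]

def to_query(path, answer, var="?x"):
    """Replace the answer node with a variable to turn the subgraph into a query."""
    return [(var if s == answer else s, p, var if o == answer else o)
            for s, p, o in path]

# Toy KG (hypothetical triples, for illustration only).
kg = [
    ("BillCarraro", "nominatedFor", "SomeAward"),
    ("SomeAward", "type", "movieAward"),
    ("BillCarraro", "bornIn", "SomePlace"),
]
path = shortest_path_triples(kg, "BillCarraro", "SomeAward")
query = to_query(path, "SomeAward")
print(query)
```

Breadth-first search is used only because the toy graph is unweighted; this is not a claim about how the original system searches the KG.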

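Alignment
Example question: “which film awards was bill carraro nominated for?”
1. Use the Stanford dependency parser to build a dependency parse tree of the question.
2. Build predicate and class lexicons (from web pages and Freebase) as weighted bipartite graphs that map text phrases to predicates and classes:
   Predicate lexicon: given a KG triple (e1 p e2) and the text “e1 r e2”, add r -> p.
   Class lexicon: given the text pattern “e and other np” and the triple (e type c), add np -> c.
   Weight: corpus frequency.
3. Use named entity recognition to map text phrases to entities.
4. Phrases and KG items do not map one to one, so an ILP chooses the alignment that maximizes the total weight of the mapped phrases.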
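The objective of the alignment step, choosing the mapping with the largest total weight, can be sketched on a toy example. This is not the ILP from the talk: it enumerates all assignments by brute force, and the two constraints (each phrase maps to at most one KG item, each KG item is used at most once) are assumptions made for illustration. The candidate items and weights, including `wonAward`, are invented.

```python
from itertools import product

def best_alignment(candidates):
    """Pick at most one KG item per phrase, no item reused, maximizing total weight.

    candidates: dict mapping phrase -> list of (kb_item, weight).
    Returns (mapping, total_weight).
    """
    phrases = list(candidates)
    # Every phrase may also stay unmapped (None, weight 0).
    options = [candidates[p] + [(None, 0.0)] for p in phrases]
    best, best_weight = {}, -1.0
    for choice in product(*options):
        items = [item for item, _ in choice if item is not None]
        if len(items) != len(set(items)):
            continue  # a KG item was used twice
        weight = sum(w for _, w in choice)
        if weight > best_weight:
            best_weight = weight
            best = {p: item for p, (item, _) in zip(phrases, choice)
                    if item is not None}
    return best, best_weight

# Hypothetical lexicon lookups for the example question.
candidates = {
    "nominated for": [("nominatedFor", 0.8), ("wonAward", 0.3)],
    "film awards": [("movieAward", 0.9)],
    "bill carraro": [("BillCarraro", 1.0)],
}
mapping, total = best_alignment(candidates)
print(mapping, round(total, 2))
```

Enumeration is exponential in the number of phrases, which is acceptable for a short question but is one reason a real system would hand the same objective to an ILP solver.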

Question-query bank
Question-query pair:
u1 = “which film awards was bill carraro nominated for?”
q1 = “BillCarraro nominatedFor ?x . ?x type movieAward”
The same pair with the entity replaced by a placeholder:
u1 = “which film awards was ENTITY nominated for?”
q1 = “ENTITY nominatedFor ?x . ?x type movieAward”
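The step from the first pair to the second can be sketched with plain string substitution. This is a simplification for illustration: it assumes the mention and its KB identifier are already known, and it ignores the dependency parse. The helper names `mask_entity` and `make_template` are hypothetical.

```python
import re

def mask_entity(text, surface, placeholder="ENTITY"):
    """Replace every case-insensitive occurrence of an entity mention."""
    return re.sub(re.escape(surface), placeholder, text, flags=re.IGNORECASE)

def make_template(question, query, mention, kb_entity):
    """Mask the mention in the question and the KB identifier in the query."""
    return mask_entity(question, mention), query.replace(kb_entity, "ENTITY")

u1 = "which film awards was bill carraro nominated for?"
q1 = "BillCarraro nominatedFor ?x . ?x type movieAward"
ut, qt = make_template(u1, q1, "bill carraro", "BillCarraro")
print(ut)
print(qt)
```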

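Answer with templates
Example: unew = “which president was lincoln succeeded by?”
1. Match unew against the template bank.
2. Generate the top-k queries (learning to rank).
3. Fetch the answer sets.
4. User feedback: the user chooses an answer from the answer sets.
   If the user picks an answer, the corresponding query q* is added to the question-query bank.
   If the user picks none, answer via the similarity function.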
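The control flow on this slide can be sketched as follows. All components are passed in as callables because their internals are not specified here; the parameter names and the stub functions in the demo are hypothetical.

```python
def answer(question, match_templates, answer_by_similarity, ask_user, bank):
    """Try the template branch first, fall back to the similarity branch.

    match_templates(question) -> ranked list of (query, answer_set), may be empty
    ask_user(question, candidates) -> the chosen (query, answer_set) or None
    bank -> question-query bank, a list of (question, query) pairs
    """
    candidates = match_templates(question)
    if candidates:
        chosen = ask_user(question, candidates)
        if chosen is not None:
            query, answers = chosen
            bank.append((question, query))  # q* joins the question-query bank
            return answers
    # No template matched, or the user rejected every answer set.
    return answer_by_similarity(question)

# Demo with stub components (illustrative values only).
bank = []
stub_templates = lambda q: [("q1", {"A"}), ("q2", {"B"})]
stub_fallback = lambda q: {"S"}
accept_first = lambda q, cands: cands[0]
reject_all = lambda q, cands: None

first = answer("u_new", stub_templates, stub_fallback, accept_first, bank)
second = answer("u_other", stub_templates, stub_fallback, reject_all, bank)
print(first, second, bank)
```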

Answer via similarity function
Bank question: u1 = “which film awards was ENTITY nominated for?”
New question: unew = “what are the film award nominations that bill carraro received?”
1. The template-based branch fails on unew.
2. A semantic similarity function retrieves the k questions most semantically similar to unew from the question-query bank; their queries are instantiated with the entities of unew.
3. Fetch the answer sets.
4. User feedback: the user chooses an answer from the answer sets. The chosen query q* is added to the question-query bank, and a new template (ut, qt) is obtained and added to the template bank.

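Similarity function
The similarity function combines two signals: a question likelihood based on a language model, and a word embedding-based similarity obtained through word2vec. A term w can be a unigram, bigram, or trigram.
Language model: for each term w of the new question, take the maximum-likelihood probability of w under the bank question ui, smoothed with the maximum-likelihood probability of w in the corpus.
Word embedding: the sum of the pairwise cosine similarities.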
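A minimal sketch of the two signals follows, under stated assumptions: unigram terms only, linear interpolation for the smoothing with weight `gamma`, and a linear mix of the two scores with weight `alpha`. The slides name γ and α as tuned parameters but do not show how they enter the formula, so both roles are guesses. The toy embedding vectors are invented; a real system would load word2vec vectors.

```python
import math
from collections import Counter

def question_likelihood(u_new, u_i, corpus, gamma):
    """Likelihood of u_new under a unigram model of u_i, smoothed by the corpus."""
    ui_counts, corpus_counts = Counter(u_i), Counter(corpus)
    prob = 1.0
    for w in u_new:
        p_ui = ui_counts[w] / len(u_i)            # maximum likelihood in u_i
        p_corpus = corpus_counts[w] / len(corpus)  # maximum likelihood in corpus
        prob *= (1 - gamma) * p_ui + gamma * p_corpus
    return prob

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def embedding_similarity(u_new, u_i, vectors):
    """Sum of pairwise cosine similarities over words that have a vector."""
    return sum(cosine(vectors[a], vectors[b])
               for a in u_new for b in u_i
               if a in vectors and b in vectors)

def similarity(u_new, u_i, corpus, vectors, gamma, alpha):
    """Assumed combination: alpha * likelihood + (1 - alpha) * embedding score."""
    return (alpha * question_likelihood(u_new, u_i, corpus, gamma)
            + (1 - alpha) * embedding_similarity(u_new, u_i, vectors))

# Toy data (illustrative only).
corpus = ["a", "b", "c", "d"]
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
lm = question_likelihood(["a", "b"], ["a", "b"], corpus, gamma=0.5)
emb = embedding_similarity(["a"], ["a", "b"], vectors)
print(lm, emb)
```

Multiplying raw probabilities underflows for long questions; summing log-probabilities is the usual remedy and would not change the ranking produced by the language model alone.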

Experiment
1. Training set: used for the banks, the learning-to-rank (LTR) model, and the language model.
2. Development set: used to tune γ and α.

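System performance in two modes
Performance with and without user feedback on the two datasets.
No user feedback:
1. Return the top-ranking answer.
2. Use the similarity function only if the list obtained using templates is empty.
s: in the similarity function branch, when the top-1 answer is wrong, the alignment fails, so no template is generated.
e: for complex questions, users often judge that no correct answer is present.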

Comparison with the state of the art
a: online learning, templates, similarity function
b: similarity function

Static-learning with continuous learning disabled

Open-domain question answering
Training data for three domains was removed from the training set.

Method                        F1
NEQA                          50.3
NEQA without user feedback    41.5
AQQU                          20.3

Analysis
Impact of templates and the similarity function: the two branches cannot be completely decoupled.
On WQ, with user feedback, 1184 questions were answered with templates, while 848 were answered via the similarity function.
In the no-feedback configuration, 1788 out of 2032 questions were handled by the learned templates, and the similarity function answered the remaining 244.
Similarity function ablation study.