Abdalghani Abujabal, Rishiraj Saha Roy, Mohamed Yahya, Gerhard Weikum

Presentation transcript:

Never-Ending Learning for Open-Domain Question Answering over Knowledge Bases
Abdalghani Abujabal, Rishiraj Saha Roy, Mohamed Yahya, Gerhard Weikum
WWW '18

Existing methods rely on a clear separation between an offline training phase, where a model is learned, and an online phase, where this model is deployed.
Shortcomings:
1. They require access to a large annotated training set, which is not always readily available.
2. They fail on questions from previously unseen domains.
3. They are limited to the language learned at training time.

Contributions
1. A KB-QA system that can be seeded with a small number of training examples and supports continuous learning to improve its answering performance over time.
2. A similarity function-based answering mechanism that enables NEQA to answer questions with previously unseen syntactic structures, thereby extending its coverage.
3. A user feedback component that judiciously asks non-expert users to select satisfactory answers, thus closing the loop between users and the system and enabling continuous learning.
4. Extensive experimental results on two benchmarks demonstrating the viability of the continuous learning approach and the ability to answer questions from previously unseen domains.

Abstract
KB-QA: translate natural language questions into a semantic representation (such as SPARQL).
Offline, NEQA automatically learns templates from a small number of training question-answer pairs.
Once deployed, continuous learning is triggered on cases where the templates are insufficient.
NEQA periodically re-trains its underlying models.

There are two banks: the template bank and the question-query bank.
bank: u1 = “which film awards was bill carraro nominated for?”
unew = “which president was lincoln succeeded by?”
unew = “what are the film award nominations that bill carraro received?”

Offline training (template bank)
Weakly supervised. Training instance: (u, Au), a question u and its answer set Au.
Example question: “which film awards was bill carraro nominated for?”
Example query: BillCarraro nominatedFor ?x . ?x type movieAward
1. The training instances are (question, answer set) pairs.
2. For the entities found in the question and each answer a in the answer set (one at a time), find the smallest connected subgraph of the KG that contains those entities as well as a. This subgraph is taken as the query.
3. With the question and the query in hand, generating a template is an alignment process; see "Automated Template Generation for Question Answering over Knowledge Graphs".
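Step 2 can be illustrated with a minimal sketch. It assumes a single question entity and a single answer, in which case the smallest connected subgraph is a shortest path, found here with breadth-first search over an undirected view of the KG. The toy triples (AwardA, CityB), the function names, and the decision to also keep the answer's type triples are assumptions made for illustration, not details taken from the slides.

```python
from collections import deque

def shortest_path_triples(triples, source, target):
    """BFS over an undirected view of the KG; returns the triples on one shortest path."""
    neighbors = {}
    for s, p, o in triples:
        neighbors.setdefault(s, []).append((o, (s, p, o)))
        neighbors.setdefault(o, []).append((s, (s, p, o)))
    parent = {source: None}  # node -> (previous node, triple used to reach it)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt, triple in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, triple)
                queue.append(nxt)
    if target not in parent:
        return []
    path, node = [], target
    while parent[node] is not None:
        node, triple = parent[node]
        path.append(triple)
    path.reverse()
    return path

def build_query(triples, question_entity, answer, var="?x"):
    """Turn the connecting subgraph into a query by replacing the answer with a variable."""
    path = shortest_path_triples(triples, question_entity, answer)
    # Assumption: keep the answer's type triples so the result mirrors the slide's query.
    types = [t for t in triples if t[0] == answer and t[1] == "type"]
    return [tuple(var if term == answer else term for term in t) for t in path + types]

# Hypothetical toy KG; AwardA and CityB are made-up identifiers.
toy_kg = [
    ("BillCarraro", "nominatedFor", "AwardA"),
    ("AwardA", "type", "movieAward"),
    ("BillCarraro", "bornIn", "CityB"),
]
query = build_query(toy_kg, "BillCarraro", "AwardA")
print(query)
```

With several question entities the problem is no longer a single shortest path, so this sketch would need to be replaced by a proper subgraph search.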

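Alignment
Example question: “which film awards was bill carraro nominated for?”
1. Use the Stanford dependency parser to build a dependency parse tree of the question.
2. Build predicate and class lexicons (from web pages and Freebase). Each lexicon is a weighted bipartite graph that maps text phrases to predicates or classes.
   Predicate lexicon: given a KB triple (e1 p e2) and a text occurrence “e1 r e2”, map r -> p.
   Class lexicon: given a KB triple (e type c) and a text occurrence “e and other np”, map np -> c.
   Weight: corpus frequency.
3. Use named entity recognition to map text to entities.
4. Phrases and KB items do not correspond one-to-one, so an ILP chooses the alignment that maximizes the total weight of the mapped phrases.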
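The objective of the alignment step (maximize the total weight of the mapped phrases) can be illustrated without an ILP solver. The sketch below swaps in an exhaustive search over one-to-one mappings, which is only feasible for a handful of phrases. The phrase list, the lexicon weights, and the one-to-one constraint are assumptions for illustration; the slides do not spell out the ILP constraints.

```python
from itertools import permutations

def best_alignment(phrases, items, weight):
    """Exhaustively search one-to-one phrase -> KB item mappings with maximal total weight."""
    best, best_score = {}, 0
    # Pad with None so that any phrase may stay unmapped.
    padded = list(items) + [None] * len(phrases)
    for perm in set(permutations(padded, len(phrases))):
        mapping = {
            p: i for p, i in zip(phrases, perm)
            if i is not None and weight.get((p, i), 0) > 0
        }
        score = sum(weight[(p, i)] for p, i in mapping.items())
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score

phrases = ["film awards", "bill carraro", "nominated for"]
items = ["movieAward", "BillCarraro", "nominatedFor"]
# Hypothetical lexicon weights (standing in for corpus frequencies).
weight = {
    ("film awards", "movieAward"): 80,
    ("film awards", "nominatedFor"): 30,
    ("bill carraro", "BillCarraro"): 100,
    ("nominated for", "nominatedFor"): 90,
}
alignment, total = best_alignment(phrases, items, weight)
print(alignment, total)
```

An ILP formulation reaches the same optimum while scaling to more phrases and richer constraints.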

Question-query bank
Original pair:
u1 = “which film awards was bill carraro nominated for?”
q1 = “BillCarraro nominatedFor ?x . ?x type movieAward” (query)
With the entity replaced by a placeholder:
u1 = “which film awards was ENTITY nominated for?”
q1 = “ENTITY nominatedFor ?x . ?x type movieAward” (query)
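The entity replacement shown above amounts to a pair of string substitutions. A minimal sketch, assuming the entity mention and its KB identifier are already known (for example from named entity recognition); the function name make_template is hypothetical:

```python
def make_template(question, query, entity_mention, entity_id, placeholder="ENTITY"):
    """Mask the entity mention in the question and its KB identifier in the query."""
    return (
        question.replace(entity_mention, placeholder),
        query.replace(entity_id, placeholder),
    )

u1 = "which film awards was bill carraro nominated for?"
q1 = "BillCarraro nominatedFor ?x . ?x type movieAward"
masked_u1, masked_q1 = make_template(u1, q1, "bill carraro", "BillCarraro")
print(masked_u1)
print(masked_q1)
```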

Answer with templates
unew = “which president was lincoln succeeded by?”
1. Match templates against unew.
2. Generate the top-k queries (learning to rank).
3. Fetch the answer sets.
4. User feedback: the user chooses an answer from the answer sets. If an answer is chosen, its query q* is added to the question-query bank. If none is chosen, answer via the similarity function.

Answer via similarity function
bank: u1 = “which film awards was ENTITY nominated for?”
unew = “what are the film award nominations that bill carraro received?”
1. Template-based answering failed.
2. Use a semantic similarity function to retrieve the k questions most semantically similar to unew from the question-query bank, and instantiate their queries with the entities of unew.
3. Fetch the answer sets.
4. User feedback: the user chooses an answer from the answer sets. The chosen query q* is added to the question-query bank, and a new template (ut, qt) is obtained and added to the template bank.
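The two answering branches can be sketched as one control flow. Every callable passed in (template_queries, similar_queries, execute, ask_user, make_template) is a hypothetical stand-in for a component the slides only name, so this is an outline of the loop rather than NEQA's implementation.

```python
def answer(question, template_queries, similar_queries, execute, ask_user,
           qq_bank, template_bank, make_template):
    """Try the template branch first, then fall back to the similarity branch."""
    # Branch 1: ranked queries from matching templates.
    candidates = template_queries(question)
    chosen = ask_user(question, [(q, execute(q)) for q in candidates]) if candidates else None
    if chosen is not None:
        qq_bank.append((question, chosen))
        return chosen, "templates"
    # Branch 2: queries of the most similar questions in the question-query bank.
    candidates = similar_queries(question)
    chosen = ask_user(question, [(q, execute(q)) for q in candidates]) if candidates else None
    if chosen is not None:
        qq_bank.append((question, chosen))
        template_bank.append(make_template(question, chosen))  # new template (ut, qt)
        return chosen, "similarity"
    return None, "unanswered"

# Toy stand-ins: no template matches, the similarity branch proposes one query,
# and the simulated user accepts the first answer set offered.
qq_bank, template_bank = [], []
result = answer(
    "what are the film award nominations that bill carraro received?",
    template_queries=lambda u: [],
    similar_queries=lambda u: ["BillCarraro nominatedFor ?x . ?x type movieAward"],
    execute=lambda q: {"AwardA"},  # AwardA is a made-up answer
    ask_user=lambda u, pairs: pairs[0][0],
    qq_bank=qq_bank,
    template_bank=template_bank,
    make_template=lambda u, q: (u.replace("bill carraro", "ENTITY"),
                                q.replace("BillCarraro", "ENTITY")),
)
print(result)
```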

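Similarity function
Two components: question likelihood based on a language model, and word embedding-based similarity obtained through word2vec.
w can be a unigram, bigram, or trigram.
Language model: for each w in the new question, the maximum-likelihood probability of w as predicted by ui. The second term is a smoothing factor: the maximum-likelihood probability of w in the corpus.
Word embedding: the sum of the pairwise cosine similarities.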
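A minimal sketch of such a similarity function follows. The exact formula is not given in the transcript, so several things here are assumptions: the smoothing is written as a linear interpolation weighted by gamma, the two components are mixed with alpha, only unigrams are used, and the two-dimensional vectors are made-up stand-ins for word2vec embeddings.

```python
import math
from collections import Counter

def lm_likelihood(new_tokens, bank_tokens, corpus_counts, gamma=0.5):
    """Likelihood of the new question under the bank question's smoothed unigram model."""
    bank_counts = Counter(bank_tokens)
    corpus_total = sum(corpus_counts.values())
    prob = 1.0
    for w in new_tokens:
        p_question = bank_counts[w] / len(bank_tokens)  # ML estimate from ui
        p_corpus = corpus_counts[w] / corpus_total       # ML estimate from the corpus
        prob *= (1 - gamma) * p_question + gamma * p_corpus
    return prob

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def embedding_similarity(new_tokens, bank_tokens, vectors):
    """Sum of pairwise cosine similarities over words that have a vector."""
    return sum(cosine(vectors[a], vectors[b])
               for a in new_tokens if a in vectors
               for b in bank_tokens if b in vectors)

def similarity(new_tokens, bank_tokens, corpus_counts, vectors, alpha=0.5, gamma=0.5):
    return (alpha * lm_likelihood(new_tokens, bank_tokens, corpus_counts, gamma)
            + (1 - alpha) * embedding_similarity(new_tokens, bank_tokens, vectors))

bank = ["which film awards was ENTITY nominated for",
        "which president was ENTITY succeeded by"]
new_question = "what are the film award nominations that ENTITY received"
corpus_counts = Counter(w for q in bank + [new_question] for w in q.split())
# Made-up 2-d vectors standing in for word2vec embeddings.
vectors = {"film": (1.0, 0.0), "award": (0.9, 0.1), "awards": (0.9, 0.1),
           "nominations": (0.8, 0.2), "nominated": (0.8, 0.2),
           "president": (0.0, 1.0), "succeeded": (0.1, 0.9)}

k = 1
top_k = sorted(bank,
               key=lambda q: similarity(new_question.split(), q.split(),
                                        corpus_counts, vectors),
               reverse=True)[:k]
print(top_k)
```

In this toy setup the likelihood term is several orders of magnitude smaller than the summed cosines, so a real system would need to bring the two components onto a comparable scale before mixing them.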

Experiment
1. Training set: banks, LTR, language model.
2. Development set: tuning γ and α.

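System performance in two modes
Performance on the two datasets with and without user feedback.
No user feedback:
1. Take the top-ranking answer.
2. Use the similarity function only if the list obtained using templates is empty.
s: with the similarity function, the top-1 result is wrong; alignment runs into problems, so no template is generated.
e: complex questions, for which users often judge that no correct answer is present.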

Comparison with state-of-the-art
a: online learning, templates, similarity function
b: similarity function

Static-learning with continuous learning disabled

Open-domain question answering
The training data for three domains was removed from the training set.

Method                        F1
NEQA                          50.3
NEQA without user feedback    41.5
AQQU                          20.3

Analysis
Impact of templates and the similarity function (the two branches cannot be completely decoupled):
On WQ, with user feedback, 1184 questions were answered with templates, while 848 were answered via the similarity function. For the no-feedback configuration, 1788 out of 2032 were handled by the learned templates, and the similarity function answered 244 questions.
Similarity function ablation study