Institute of Information Science, Academia Sinica

Institute of Information Science, Academia Sinica
A Study of Machine Reading: Solving Math Word Problems in a Human Comprehensible Way Keh-Yih Su Institute of Information Science, Academia Sinica CCL-15, November 14, 2015

Academia Sinica Founded on June 9, 1928
The highest Research Organization in Taiwan 24 research institutes & 7 research centers Taiwan International Graduate Program (TIGP) with domestic Universities since 2002; 12 Cross-Domain PhD programs; 253(56%) students from 41 countries Researcher Research technicians Postdoc Admin RA Graduate 924 95 1,063 273 3,475 1,921 Ref: 2014/11

Institute of Information Science
Established in September 1982 PhD program established in 「Social Networks & Human-Centered Computing」& Bioinformatics under the auspices of TIGP since 2002, 34 students currently Research Groups Bioinformatics Group Computer System Group Data Management and Information Discovery Group Multimedia Technologies Group Natural Language and Knowledge Processing Group Network System and Service Group Programming Languages and Formal Methods Group Computation Theory and Algorithms Group Ref: Researcher Postdoc RA & Admin TIGP Student 40 43 270 34 Ref:

Natural Language & Knowledge Processing Group
Research on Knowledge Acquisition Knowledge Representation Knowledge Utilization Laboratories, 5 Researchers Project – Machine Reading (3 PIs) Chinese Document Processing Lab Chinese Knowledge & Information Processing Machine Learning and Classification Lab Intelligent Agent Systems Lab Speech, Language, and Music Processing Lab Natural Language Understanding Lab

A Group Project Project： Designing and Building a Domain Dependent Machine Reading System Principal Investigator： Keh-Yih Su (蘇克毅) Members： Lun-Wei Ku (古倫維)， Wei-Yun Ma (馬偉雲) Goal：Design and build a Natural Language Understanding (NLU) platform, and then construct a domain-dependent Machine Reading system on it. Period：3 Years ( ) Keh-Yih Su, Nov. 4, 2014

What is Machine Reading (MR)?
Read the given text/document, convert the content into logical form, and then perform inference on it. Evaluated by Reading Comprehension Test An inter-discipline research (NLP+KRR+ML) Natural Language Processing (NLP) Knowledge Representation and Reasoning (KRR) Machine Learning (ML) Keh-Yih Su, Nov. 4, 2014

Why Machine Reading? (1) Big Mechanism Big Data alone is not enough
DARPA, July 2014 Seek “why” behind big data Big Data alone is not enough Aims to explore the correlation between surface features, not their causality relationship Only provide surface information, not true knowledge Event_A -> Event_B: O(mxn) The causality relationship can greatly reduce the complexity that we need to handle, and can interpret the observed phenomena concisely and easily Keh-Yih Su, Nov. 4, 2014

Why Machine Reading? (2) Why Big Mechanism?
Evolution of our brain capability cannot compete with the knowledge growing speed Each researcher is forced to focus on a very narrow topic, unaware of progress in other disciplines Many complicated systems are studied piecewisely, and their literatures and data are fragmented Need a mechanism to handle complicated systems in which interactions have important causal effects. Emphasize signaling pathways in cancer biology Keh-Yih Su, Nov. 4, 2014

Why Machine Reading? (3) How is Machine Reading related to Big Mechanism? The pre-requisite for Big Mechanism that is that we can read each document and learn its associated knowledge Machine Reading is able to give the answer that is implied in the text Open-domain Q&A only aims to find out the text passage that contains the answer Keh-Yih Su, Nov. 4, 2014

Related Work (1) Related preliminary researches (IE, Q&A)
Information Extraction Fill in specified templates for a given domain Switching domains implies to re-train the system Open Domain Question and Answering Extract desired answer passage from the text Cannot answer even the simple problem if the desired answer is not included in the text Keh-Yih Su, Nov. 4, 2014

Related Work (2) MR related research activities
A MR program was initiated at DARPA (2009/6 – 2012/1) Learn to read, and read to learn what they need in order to answer questions or perform some reasoning tasks Under another project called Deep Exploration and Filtering of Text (DEFT) since June 2012 Conducted at Europe under the CLEF Initiative (Conference and Labs of the Evaluation Forum At NTCIR in Japan, and under the Forum for Information Retrieval Evaluation (FIRE) in India Keh-Yih Su, Nov. 4, 2014

Knowledge Discovery through Structural Document Understanding Yuji Matsumoto (Nara Institute of Science and Technology) Big Data Application （ )　 October 27, 2015 Academia Sinica Background Rapid increase of scientific / technical documents Web of Science and Scopus cover 23M and 53M technical papers Unable for researchers to read all of related papers Aim Developing and deepening Text/Document analysis technologies Implementing content-aware document retrieval and recommendation Enabling novel knowledge acquisition and surveillance All of those will be done through various levels of structure analysis of documents

Overview of the Project
Text/Document Analysis (T1, T5) Knowledge Base Construction (T2, T3, T4) Structural Similarity Analysis (T6, T7) User Interface / Survey Generation (T8, T9) Inner-Document Analysis Inter-Document Analysis Citation Analysis Document Relation Document Similarity Multi-Document Summarization Survey Generation Text/Document Analysis Concept/Relation 　　Analysis Event Chain Analysis Document Structure Analysis Document Summarization User Interface / Document Visualization

Robot for College Entrance Examination
2018/9/17 Robot for College Entrance Examination Fu Hongguang University of Electronic Science and Technology of China 2018/9/17

2018/9/17 Mainland 863 Project “Key technologies of simulating human intelligence based on big data” How to test these technologies in AI? Robot for College Entrance Examination: Chinese – HIT(Harbin Inst. of Technology) History – ISCAS(Inst. Of Software CAS) Geography – Nanjing University Mathematics – UESTC 类人答题过程-即像中学生一样的解题过程。

Reasoning Engine & ML Knowledge Base NLP Rules & Functions

MWP: A First Case for MR The math word problem (MWP) is chosen as the first test case to study MR. The answer cannot be extracted by simply performing keyword matching (beyond Q&A). Adopts less complicated syntax; Requires less amount of domain knowledge. Body part of MWP usually consists of only a few sentences. Standalone applications. (e.g. computer tutor)

What are Math Word Problems ?
Not Math Word Problems: 文具店進貨2361枝紅筆和1587枝藍筆, 文具店共進貨幾枝筆? 原子筆每枝20元，鉛筆每枝5元，姐姐買了24枝筆，共付了270元，兩種筆各買幾枝？一艘輪船20分鐘可以行駛25公里，2.5小時可以行駛多少公里？小天的跳遠紀錄是1.3公尺，小美的紀錄是1.03公尺，小綜的紀錄是1.33公尺。三人中誰跳得最遠？ 2 x 3 +5 = ?

Previous Approaches (1)
Rule-based approaches with logic inference (Bobrow, 1964): STUDENT Use “Format Matching” to convert a sentence into logic statements Input: “The price of a radio is dollars. If this price is 15 percent less than the marked price, find the marked price.” Equations: (EQUAL X00001 (Marked Price) ) (EQUAL (Price of Radio) (TIMES 0.85 (Marked Price) ) ) (EQUAL (Price of Radio) (TIMES (Dollars) ) ) Answer: “The marked price is 82 dollars.”

Rule-based approaches without logic inference (Charniak, 1968; Dellarosa, 1986; …., Hosseini, 2014): Arithmetic word problems(additions, subtractions) Describe a partial world state Update or elaborations on the state (Verb classify) End with linear algebraic equations

Hosseini, 2014 (1/2) Category Example Observation Positive Negative
There were 28 bales of hay in the barn. Positive Joan went to 4 football games this year. Negative Joan lost 3 of the violet balloons. Positive Transfer Mike’s dad borrowed 7 nickels from Mike. Negative Transfer Jason placed 131 erasers in the drawer. Construct Karen added 1/4 of a cup of walnut to a batch of trail mix. Destroy The rabbits ate 4 of Dan’s potatoes. One container Two container

Hosseini, 2014 (2/2) Grounding into Entities and Containers
Liz had 9 black kittens. She gave some of her kittens to John. Grounding into Entities and Containers [ give, 9 black kittens, Liz, John ] Classify Verb Categories (Liz, give): Negative (John, give): Positive Forming State Transitions and Equations

Purely statistical approaches (Kushman etc., 2014; Roy et al., 2015): Linear algebraic questions Mapping problem to a set of equations Generate numerical answers

Kushman et al., 2014 (1/4) Dataset: Algebra.com
Linear algebra question (#514) Manually label equation Generalize equation to a template with two rules

Kushman et al., 2014 (2/4) Try each template and variable alignment with a score u1 + u2 = n1 n3*u1 + n4*u2 = n5 u1 + n1*u2 = n2 n3*u1 + n4*u2 = n5 Word problem 2X + 3Y = 252 4X + Y = 244 u1 + u2 = n1 u1 + u2 = n2

Kushman et al., 2014 (3/4) 𝑂 ∗ = 𝑖 𝑦𝜖𝒴 𝑠.𝑡. 𝒱 𝑖 𝑦 =1 p 𝑦 𝑥 𝑖 ;𝜃
Objective function 𝑂 ∗ = 𝑖 𝑦𝜖𝒴 𝑠.𝑡. 𝒱 𝑖 𝑦 =1 p 𝑦 𝑥 𝑖 ;𝜃 𝑂= 𝑖 𝑙𝑜𝑔 𝑦𝜖𝒴 𝑠.𝑡. 𝒱 𝑖 𝑦 =1 p 𝑦 𝑥 𝑖 ;𝜃

Kushman et al., 2014 (4/4) Parameter estimation Fully supervised
Only expanding variable alignment Semi-supervised Expanding both template matching and variable alignment V: validation function

Experiment and Result (3/7)
5-fold cross-validation (Train #400, Test #100) Semi-supervised 5EQ: 5 seed questions annotated with equation 5EQ+ANS: 5 seed questions annotated with equation + answers of remaining problems Fully supervised ALLEQ: full equation systems

MA1: Simple Problems IXL: More complicated sentences MA2: Irrelevant info 395 problems, 13,632 words, 188 verbs, and 1,483 sentences.

Joan found 70 seashells on the beach . she gave Sam some of her seashells . She has 27 seashell . How many seashells did she give to Sam ? MA1: Simple Problems IXL: More complicated sentences MA2: Irrelevant info 395 problems, 13,632 words, 188 verbs, and 1,483 sentences.

Joan found 70 seashells on the beach . she gave Sam some of her seashells . She has 27 seashell . How many seashells did she give to Sam ? MA1: Simple Problems IXL: More complicated sentences MA2: Irrelevant info 395 problems, 13,632 words, 188 verbs, and 1,483 sentences. A ship is filled with 5973 tons of cargo . It stops in the Bahamas , where sailors load 8723 tons of cargo onboard . How many tons of cargo does the ship hold now ?

Joan found 70 seashells on the beach . she gave Sam some of her seashells . She has 27 seashell . How many seashells did she give to Sam ? MA1: Simple Problems IXL: More complicated sentences MA2: Irrelevant info 395 problems, 13,632 words, 188 verbs, and 1,483 sentences. A ship is filled with 5973 tons of cargo . It stops in the Bahamas , where sailors load 8723 tons of cargo onboard . How many tons of cargo does the ship hold now ? Jason found 49 seashells and 48 starfish on the beach . He gave 13 of the seashells to Tim . How many seashells does Jason now have ?

KAZB: Learning to align problems to templates ARIS2: No verb is repeated in train and test set Majority: All verbs are assumed to be positive Gold ARIS: Uses correct verb categories

Drawback of Previous Works
Weakness of rule-based approaches: Difficult to raise the coverage rate and deal with ambiguities. Expensive to construct. Weakness of not adopting logic inference: Need to implement a new handling procedure for each new type of problems (as the general logic inference mechanism is not adopted). No inference engine to generate the reasoning chain, additional effort would be required for generating the explanation. Weakness of purely statistical approaches: No understanding. System is sensitive to irrelevant information.

Design Principles (1/2) Problems should be solved via understanding and inference versus the purely statistical approach which does not perform language analysis MWP is only our first task for MR Able to handle various kinds of questions 文具店進貨2361枝紅筆和1587枝藍筆，賣了100枝藍筆，還剩幾枝筆?還剩幾枝藍筆?共進貨幾枝紅筆? 共進貨幾枝筆? 共賣了幾枝筆? 小明共進貨幾枝筆? Chao-Chun Liang, May. 12, 2015 Ref: "Seminar ProposedApproach.ppt" by Dr. Lin

Design Principles (2/2) Not favor any specific kind of problem/question Inference Engine (IE) should not adopt a pre-specified hierarchical structure IE should only deal with domain dependent generic concepts (instead of complicated problem dependent concepts) Example: “100顆糖裝成5盒糖, 1盒糖裝幾顆糖?” Not “裝成(100,顆,糖,5,盒)” Prefer “quan(q1,顆,糖) = 100” AND “quan(q2,盒,糖) = 5” AND “qmap(m1,q1,q2)” AND “verb(m1,裝成)” Chao-Chun Liang, May. 12, 2015 Ref: "Seminar ProposedApproach.ppt" by Dr. Lin

Figure out How to Solve It Elaborate Detailed Steps
How Human Solve a MWP ? Math Word Problem Understand Meaning Figure out How to Solve It Math lesson learnt from a math class Elaborate Detailed Steps Turning MWP into math equations Calculation Solving the equation

Framework Language Analysis Solution Type Classifier
Math Word Problem Paragraph (body text and questions) Language Analysis Semantic Sequences Solution Type Classifier Logic Form Converter Inference Engine Question-Answer Pairs Explanation Texts Explanation Generation

Task Math word problem (MWP): 文具店進貨2361枝紅筆和1587枝藍筆, 文具店共進貨幾枝筆?
Language Analysis: S(agent:NP(Head:Ncb:文具店)|Head:VA4:進貨|theme…. {進貨.buy|買:agent={[x1]文具店.{商店|shop:predication={or({buy|買},…… Logic Form (LF): LF1: verb(v1,進貨)&agent(v1,文具店)&theme(v1,n1)& … LF2: quan(q1,枝,n1)=2361&verb(q1,進貨)&agent(q1,文具店)&role(q1,theme) … ASK Sum(quan(?q,枝,n3), verb(?q,進貨)&agent(?q,文具店)) Logic Inference Engine (IE): (quan(q1,枝,n1p)=2361) + (quan(q2,枝,n2p)=1587) = 3948 Explanation generation: 共進貨 2361枝紅筆枝藍筆 = 3948枝筆

Syntactic/Semantic analysis (1/4)
CKIP Chinese parser Generate syntactic structure Semantic Role Labeling 文具店進貨2361枝紅筆和1587枝藍筆，文具店共進貨幾枝筆？

Syntactic/Semantic Analysis (2/4)
E-Hownet Semantic Composition Anaphora: Finding the antecedent for a pronoun. 木工拿出一塊長40cm、寬20cm的長方形木板，他想從中裁出幾個半徑2cm的圓，他最多可以裁出幾個圓形？ {拿出.take|取: agent={[x1]木工.工人|worker}, theme={[x2]木板.wood|木: 一.quantity={1}, 長.length={cm.{公分}: quantity={40} }, 寬.width={cm.{公分}: quantity={20} 長方形.shape={square|方:length={LengthLong|長}} source={[x]} {想.{expect|期望}: experiencer={[x1]他.3rdPerson|他人}, content={裁出.{cut|切削}: 從中.source={[x2]}, patient={[x3]圓.image|圖像: 幾.quantity={Ques|疑問}, 半徑.length({[x3]})={cm.{公分}: quantity={2} } }, agent={[x1]} {裁出.{cut|切削}: agent={[x1]他.3rdPerson|他人}, 最多.range={extensive|泛:degree={extreme|極}}, 可以.possibility={ish|稍}, patient={[x3]圓形.ShapeValue|形狀值: 幾.quantity={Ques|疑問} }

Semantic Composition Zero-Anaphora: 妹妹1天存150元，365天共可存幾元？ {存.store|保存: agent={[x1]妹妹.妹妹|YoungerSister}, duration={天.day|日: quantity={1} }, theme={元.money|貨幣: quantity={150} location={[x]} } {存.store|保存: duration={天.day|日: quantity={365} }, 共.quantity={all|全}, 可.possibility={ish|稍}, theme={元.money|貨幣: 幾.quantity={Ques|疑問} agent={[x1]}, location={[x]} }

Semantic Composition Co-reference: Finding referring expressions that refer to the same entity. 文具店進貨2361枝紅筆和1587枝藍筆，該店共進貨幾枝筆？ {進貨.buy|買: agent={[x1]文具店}, theme={和.and( {筆.PenInk|筆墨: quantity={2361}, color={紅.red|紅} }, quantity={1587}, color={藍.blue|藍} } )}, {進貨.buy|買: agent={[x1]文具店}, 共.quantity={all|全}, theme={筆.PenInk|筆墨: 幾.quantity={Ques|疑問} }, }

到stc

Solution Type Identification (1/3)
For identifying a specific stereotype-solution To solve a MWP, it’s not enough to know what MWP means. We need to collect various types of math operations, aggregative operations and specific problem types. 16 different kinds solution types are specified Cover a wide variety of questions found in our elementary math corpus.

Solution type (ST) with frequency in seed corpus (manually annotated 75 samples). Multiplication (24%) Utility (6%) Surplus (4%) L.C.M (2%) Sum (14%) Algebra (5%) Difference G.C.D Subtraction (12%) Comparison Ceil-Division (3%) Addition (1%) Floor-Division (7%) Ratio Common-Division Set-Operation 在我們的seed corpus裡面 , multiplication和sum佔了較大的比例

Currently, a SVM classifier with linear kernel function is used and four different kinds of feature-sets are adopted Lexicon. Ques part with lexicon “共” Lexicon pair. lexicon pair like “?比?便宜” Lexicon and POS pair. Pair [Verb, asking unit in Ques part] e.g. [ VC(修築), 天] Pattern-matching indicators. Relation between object-unit in question sentences and object-unit in body sentences (currently manually created). 目前stc 是採用svm with linear kernel, 還有四種不同的feature set下去train, 像是unig…… 動作及物動詞

Examples: Lexicon: Example:哥哥買了一本329元的故事書和一枝465元的鋼筆，共花了幾元？Type: Sum Lexicon pair: Example: 姐姐上個月電話費是562元，這個月比上個月便宜127元，這個月的電話費是幾元？ Type: Subtraction <Verb: Lexicon, POS> and <Ques: Unit-Lexicon>pair: Example: 一座長8360公尺的高架橋，工人每天修築88公尺，共要花幾天才可以全部修築完成？ [ VC(修築), 天] Type: CeilDiv Pattern-matching indicators: Example: 妹妹1天存150元，365天共可存幾元？ Type: Multiplication “Is the problem contains “two same unit with quantity =1 && quantity > 1”? 目前stc 是採用svm with linear kernel, 還有四種不同的feature set下去train, 像是unig……

Math Concepts Math Concepts 300朵玫瑰 quan(q1,朵,玫瑰)=300
quan(id, unit, obj)=value q_map(id, quan1, quan2) u_map(id, unit1, obj1, unit2, obj2)=value has(container, unit,obj, time-index)=value circle(id, obj) regular_triangle(id, obj) trapezoid(id, obj) radius(id, unit)=value diameter(id, unit)=value top_base(id, unit)=value bottom_base(id, unit)=value height(id, unit)=value circumference(id, unit)=value side_length(id, unit)=value area(id, unit)=value set(id) member(id, obj) 在lfc 有很多的math concepts 利入quantity, 舉個例子, 300朵玫瑰….

Input/Output of Logical Form Converter
SR: {進貨: agent={文具店}, theme={和.and( {筆: quantity={2361}, unit={枝}, color={紅}}, quantity={1587}, unit={枝}, color={藍}})}, … Semantic Sequences (Input) Logic Form Converter LF1: verb(v1,進貨)&agent(v1,文具店)&theme(v1,j1) cj_op(j1,and)&cj_arg(j1,n1)&cj_arg(j1,n2) head(n1,筆)&quantity(n1,2361)&unit(n1,枝)&color(n1,紅) head(n2,筆)&quantity(n2,1587)&unit(n2,枝)&color(n2,藍) LF2: quan(q1,枝,n1)=2361&verb(q1,進貨)&agent(q1,文具店)&role(q1,theme) quan(q2,枝,n2)=1587&verb(q2,進貨)&agent(q2,文具店)&role(q2,theme) ... ANS=Sum(quan(?q,枝,n3),verb(?q,進貨)&agent(?q,文具店)) First order logic (FOL)-like statements (Output)

Logical Form Converter (1/4)
Deterministically convert a tree-like semantic representation into a list of FOL-like statements. (Moldovan, 2011) 文具店進貨 theme n1 n2 SR: {進貨: agent={文具店}, theme={和.and( {筆: quantity={2361}, unit={枝}, color={紅}}, quantity={1587}, unit={枝}, color={藍}})}, … 2361 枝紅筆 1587 枝藍筆 LF1: verb(v1,進貨)&agent(v1,文具店)&theme(v1,j1) cj_op(j1,and)&cj_arg(j1,n1)&cj_arg(j1,n2) head(n1,筆)&quantity(n1,2361)&unit(n1,枝)&color(n1,紅) head(n2,筆)&quantity(n2,1587)&unit(n2,枝)&color(n2,藍)

Non-deterministically detect/extract math concepts and generate associated FOL-like statements. verb(v1,進貨)&agent(v1,文具店)&theme(v1,j1) cj_op(j1,and)&cj_arg(j1,n1)&cj_arg(j1,n2) head(n1,筆)&quantity(n1,2361)&unit(n1,枝)&color(n1,紅) head(n2,筆)&quantity(n2,1587)&unit(n2,枝)&color(n2,藍) LF1 & quan(q1,枝,n1)=2361&verb(q1,進貨)&agent(q1,文具店)&role(q1,theme) quan(q2,枝,n2)=1587&verb(q2,進貨)&agent(q2,文具店)&role(q2,theme) LF2

Non-deterministically find the arguments of Math operation utilities according to the given “Solution Type”. 文具店共進貨幾枝筆？ SR: {進貨.buy|買: agent={文具店…., 共.quantity={all|全}, theme={筆.PenInk|筆墨: 幾.quantity={Ques|疑問} }, … LF1: verb(v2,進貨)&agent(v2,文具店)&mod(v2,共)& theme(v2,n3)&head(n3,筆)&mod(n3,幾)&unit(n3,枝) LF2: ANS=Sum(quan(?q,枝,n3),verb(?q,進貨)&agent(?q,文具店)) Solution Type: Sum

Different Questions with the Same Body Text. Q1: 共進貨幾枝筆? LF1: verb(v2,進貨)&agent(v2,文具店)&mod(v2,共)&theme(v2,n3)& head(n3,筆)&mod(n3,幾)&unit(n3,枝) LF2: ANS=Sum(quan(?q,枝, n3),verb(?q,進貨)&agent(?q,文具店)) I.E. binds quan(?q,枝,n3) to both quan(q1,枝,n1) and quan(q2,枝,n2) Q2: 共進貨幾枝紅筆? LF1: verb(v2,進貨)&agent(v2,文具店)&mod(v2,共)&theme(v2,n3)& head(n3,筆)&mod(n3,幾)&unit(n3,枝)&color(n3,紅) LF2: ANS=Sum(quan(?q,枝,n3),verb(?q,進貨)&agent(?q,文具店)) I.E. binds quan(?q,枝,n3) to quan(q2,枝,n1) only

Operation Utilities Operation Utilities Sum(function, condition)=value
Addition(value1, value2)=value Subtraction(value1, value2)=value Difference(value1, value2)=value Multiplication(value1, value2)=value CommonDiv(value1, value2)=value FloorDiv(value1, value2)=value CeilDiv(value1, value2)=value Surplus(value1, value2)=value ArgMin(arg,function, condition)=value ArgMax(arg,function, condition)=value Get(function, condition)=value

Logic Inference (1/8) In our design, an Inference Engine (IE) is used to find out the solution for an MWP. Capabilities : Facts and inference rules are represented in first-order logic (FOL). Binding variables. Using inference rules to derive new facts from the old ones provided by LFC. Providing utilities to perform math operations on related facts. 在ie 用來solve mwp 他有幾個goals 像是提供operation utilities, 根據相關的fact找到ans 產生新的fact 做logic binding

Logic Inference (2/8) The problem which can be calculated from the facts directly derived from MWP: 花店進貨300朵玫瑰和600朵百合, 上午賣出186朵百合,下午賣出234朵，問花店共賣出幾朵百合? quan(q1,朵,玫瑰)=300&verb(q1,進貨)&agent(q1,花店)&… quan(q2,朵,百合)=600&verb(q2,進貨)&agent(q2,花店)&… quan(q3,朵,百合)=186&verb(q3,賣出)&agent(q3,花店)&… quan(q4,朵,百合)=234&verb(q4,賣出)&agent(q4,花店)&… ASK Sum(quan(?q,朵,百合),verb(?q,賣出)&agent(?q,花店)) 這樣的一個mwp , ie 根據lfc 產生的fact做binding 可以看到最後問的是 “賣出, 花店” Ie 在座binding 就不會去找到不相關的 quantity (進貨) 可以filter掉 irrelevant information 相較於以前的work, 不知道這些attribute, 就可能會把irrelevant quantity 來做計算. quan3 + quan4 = = 420

Logic Inference (3/8) The problem which needs to derive new facts according to background knowledge:. 爸爸買了3本329元的故事書和2枝465元的鋼筆，爸爸共要付幾元? quan(q1,本,n1p)=3&verb(q1,買)&agent(q1,爸爸)& head(n1p,故事書)&price(n1p,329) quan(q2,枝,n2p)=2&verb(q2,買)&agent(q2,爸爸)& head(n2p,鋼筆)&price(n1p,465) ASK Sum(quan(?q,元,#),verb(?q,付)&agent(?q,爸爸)) Ie 還會根據lfc 給的fact 產生新的fact, 像是”3本329元” 在lfc 會得到的fact 就是 q1…….q2…… quan(q1,本,n1)&verb(q1,買)&agent(q1,爸爸)&price(n1,329) →quan(q3,元,#)=quan(q1,本,n1)×329&verb(q3,付)&agent(q3,爸爸) quan(q1,本,n1)×329=3×329=987 quan(q3,元,#)=987&verb(q3,付)&agent(q3,爸爸)

Verb entailment (1/2) The MWP might adopt different verbs among body text and question text. 爸爸買了3本329元的故事書和2枝465元的鋼筆，爸爸共要付幾元? 有些問題, 在q的verb 和 body 的verb 不一樣, 在binding 的時候可能會miss match 例如, 因為”買” , 所以要”付” ie 必須要了解 verb之間存在有意義的關係, 用v e quan(?q,?u,?o)&verb(?q,買)&agent(?q,?a)&price(?o,?p) →quan($q,元,#)=quan(?q,?u,?o)×?p&verb($q,付)&agent($q,?a) Verb entailment required

Verb entailment (2/2) Given an order verb pair (買, 付) as input, we want to detect whether the entailment relation ‘買付’ holds for this pair. E-HowNet is adopted as the knowledge base for solving this problems.

E.g.奇幻森林裡有4256棵樹，每38棵劃分成1區，共可劃分成幾區？共可劃分成幾區？ SR: {劃分成(3): quantity={共(1)}, epistemics={可(2)}, range={區(5): quantifier={幾(4} }, theme={[x1]} } 共(1,Daa): {all|全} 可(2,Dbaa): possibility={ish|稍} 劃分成(3,VG2): {become|成為:means={classify|分類}} 幾(4,Neqa): quantity={Ques|疑問} 區(5,Ncb): {district|區} LF1: verb(v3,劃分成)&quantity(v3,共)&epistemics(v3,可)& range(v3,n4)&head(n4,區)&quantity(n4,幾) LF2: ANS=CeilDiv(quan(q1,棵,樹),u_map(?m,#,區,棵,樹)& verb(?m,劃分成)) 另外lfc 還會根據stc 所給的type 找到operation utilities 的argument. quan(q1,棵,樹)=4256 & quan(q2,棵,樹)=38 & qmap(m1,q1,q2) & verb(m1,劃分成) => u_map(m1, #,區,棵,樹): 1區樹?棵 Solution Type: CeilDiv

E.g. 1個平年有365天，3個平年共有幾天？ 3個平年共有幾天？ SR: {有(4): theme={平年(2): quantifier={３個(1)} }, quantity={共(3)}, range={幾天(5)} } ３個(1,DM): quantifier={個.null|無義:quantity={3}} 平年(2,Nac): {year|年:predication={exist|存在:theme={day|日:quantity={365}},topic={~}}} 共(3,Daa): {all|全} 有(4,V_2): {exist|存在} 幾天(5,DM): 天.time={day|日:幾.quantity={Ques|疑問}} LF1: verb(v2,有)&topic(v2,n3)&quantity(v2,共)&theme(v2,n4)&head(n3,平年)&quantity(n3,3)&unit(n3,個)&head(n4,天)&quantity(n4,幾) LF2: ANS=Multiplication(quan(q3,個,平年),u_map(?m,個,平年,天,#)&verb(?m,有)) 另外lfc 還會根據stc 所給的type 找到operation utilities 的argument. quan(q1,個,平年)=1 & quan(q2,天,#)=365 & qmap(m1,q1,q2) & verb(m1,有) => u_map(m1, 個,平年,天,#): 1個平年?天 Solution Type: Multiplication

E.g. 2.32公升的水和2.294公升的果汁，哪一種比較少？哪一種比較少？ SR: {少(3): theme={飲料(1.1): quantifier={[x1,x2]哪一種(1)} }, degree={比較(2)} } 哪一種(1,DM): quantifier={種.kind({entity|事物}):一.quantity={1}},哪.quantifier={Ques|疑問} 飲料(1.1,Naa): {drinks|飲品} 比較(2,Dfa): degree={more|較} 少(3,VH13): {few|少} LF1: verb(v1,少)&theme(v1,n3)&degree(v1,比較)&head(n3,j1)&quantifier(n3,哪)&qualification(n3,u3)&head(u3,種)&quantity(u3,1) LF2: set(s1)&member(s1,水)&member(s1,果汁) Ans=ArgMin(quan(?q,?u,?o)&member(s1,?o)) 另外lfc 還會根據stc 所給的type 找到operation utilities 的argument. Solution Type: Comparison

E.g.長３公尺、寬２公尺的長方形，面積是多少平方公尺？面積是多少平方公尺？ SR: {是(2): theme={[x2]面積(1)}, range={多少平方公尺(3)} } 面積(1,Nad): {area([x1])} 是(2,V_11): {be|是} 多少平方公尺(3,DM): 平方公尺.size={平方公尺:多少.quantity={Ques|疑問}} 長3公尺、寬2公尺的長方型， SR: {[x1]長方形(7): property={的(6): head={、(3): DUMMY1={長(1): range={３公尺(2)}, theme={[x1]}…, DUMMY2={寬(4): range={２公尺(5)}, theme={[x1]} }… 長(1,VH12): length={} ３公尺(2,DM): 公尺.length… 、(3,Caa): {and()} 寬(4,VH12): width={} ２公尺(5,DM): 公尺.length… 的(6,DE): relation({entity|事物}) 長方形(7,Nac): shape={square|方:length={LengthLong|長}} LF1: head(n1,長方形)&property(n1,的)&head(n1,、)&verb(n1,長)&range(n1,公尺)&quantity(n1,3)&theme(n1,長方形)&verb(n1,寬)&range(n1,公尺)&quantity(n1,2)&theme(n1,長方形) LF2: quan(q1,公尺,#)=3&quan(q2,公尺,#)=2 rectangle(o1)&length(o1,公尺)=3&width(o1,公尺)=2 另外lfc 還會根據stc 所給的type 找到operation utilities 的argument. LF1: verb(v1,是)&theme(v1,n2)&range(v1,n2)& head(n2,面積)&unit(n2,平方公尺)&quantity(n2,多少) LF2: ANS=Area(o1,平方公尺) Solution Type: Utility

E.g.每邊長13公分的正三角形和每邊長10公分的正三角形，周長相差幾公分？周長相差幾公分？ SR: {相差(2): theme={周長(1)}, range={幾公分(3)} } 周長(1,Nad): length({edge([x1,x2])}) 相差(2,VJ3): {different|異:range={}} 幾公分(3,DM): 公分.length={公分:幾.quantity={Ques|疑問}} LF1: verb(v1,相差)&theme(v1,周長)&range(v1,n3)&unit(n3,公分)&quantity(n3,幾) LF2: regular_triangle(o1) & length(o1,公分)=13 regular_triangle(o2) & length(o2,公分)=10 ANS=Difference(circumference(o1,公分), circumference(o2,公分)) 另外lfc 還會根據stc 所給的type 找到operation utilities 的argument. Solution Type: Difference

Explanation Generation (1/4)
EG Architecture of MWP Figure 3 shows the block diagram of our proposed EG. First, the Inference Engine generates the answer and its associated reasoning chain for the given math problem. To ease the operation of the EG, we first convert the given reasoning chain into its corresponding Explanation Tree (shown at Figure 5) to center on each operator appearing in the MWP (which is convenient to perform sentence segmentation later). Next, the Explanation Tree will be fed as input of Discourse Planner. The last stage is the Function Word Insertion & Ordering Module, which inserts the necessary functional words to the segmented sentences (resulted from Discourse Planner) and generates the Explanation Texts. MWP Explanation Generator (Text Generator)

1. 阿志買一臺冰箱和一臺電視機，付2疊一萬元鈔票、6張千元鈔票和13張百元鈔票，阿志共付了幾元？ SUM(quan(q?,…)) Figure shows that three sets of facts are originated from the 2nd body sentence (indicated by three S2 nodes). Each set contains a corresponding quantity-fact (e.g., q1(疊), q2(元), and q3(張)) and its associated object (e.g., n1, n2, and n3). For example, the first set (the left most one) contains q1(疊) (for “2 疊”) and n1 (for “一萬元鈔票”). This figure also shows that the outputs of three IE-Multiplication operators (i.e., “20,000 元”, “6,000 元”, and “1,300 元”) will be fed into the last LFC-Sum to get the final desired result “27,300 元” (denoted by the “Ans(SUM)” node in the figure).

OP Oriented Template -- OP_SUM 所以，共Verb … = 1 2 3 4 27,300元 SUM 20 MUL 2疊 10,000元鈔票 6, 6張 1,000元鈔票 1,3 13張 100元鈔票 4 1 2 3

[Sample1] 阿志買一臺冰箱和一臺電視機，付2疊一萬元鈔票、6張千元鈔票和13張百元鈔票，阿志共付了幾元？ Benchmark: 2疊 10,000元鈔票就是 2 x 10,000 = 20,000元 6張 1,000元鈔票就是 6 x 1, = 6,000元 13張 100元鈔票就是 13 x = 1,300元所以，共付 20, , , = 27,300元付怎麼來的要記得講 27,300元 SUM 20,000元 MUL 2疊 10,000元鈔票 6,000元 6張 1,000元鈔票 1,300元 13張 100元鈔票

Current Status and On-going Work (1/5)
MWP corpus statistics Average length per problem Corpus Num. of problems Training Set 20,093 Develop Set 1,700 Test Set Total 23,493 Corpus Avg. Chinese Chars. Avg. Chinese Words Body 27 18.2 Question 9.4 6.8

Completed all associated modules. Syntactic Parser Semantic Role Labeling Semantic Composer Solution Type Classifier, Logic Form Converter, Inference Engine and Explanation Generator Completed integration flow test. Manually annotated 75 samples (seed corpus) Manually annotated 200 development set.

Perform weakly supervised learning (with Beam-Search) and fine tune the system. Statistical framework. : The obtained answer Ans: A specific possible answer Body: The given body text of the problem Qus: The question text of the problem. IR : Inference Rules. LFB : Logic Form of Body text. LFQ : Logic Form of Question text. SMB : Semantic Representation of Body text. SMQ : Semantic Representation of Question text. ST : Solution Type.

On-going work (Cont.): Learn patterns and parameters from the training-set. Bootstrapping Procedure: (0) Get initial features/parameters, and start from Level-1 (1) Start with a small training-set in the specified level (2) Execute the program to generate various answers (with multiple output candidates in each phase) (3) Select the most likely path which matches the given answer (4) Extract new features (e.g., lexicon patterns) via pre-specified generic patterns (5) Extract associated outcomes (including new features) to re-estimate the parameters (6) Repeat steps (2) to (5) on a larger corpus of the same level (7) Repeat steps (2) to (6) from Level-1 to Level-6

Conclusion A tag-based statistical framework is proposed to perform understanding and reasoning for solving MWP. Main contributions Proposing a tag-based approach to: Provide the answer more precisely, Make the system less sensitive to the irrelevant information Provide the flexibility for handling various kinds of possible questions. Proposing a statistical framework to perform reasoning from the given text, and automatically learning patterns/parameters from the training-set.

Thanks for Listening !

Error Analysis

Without Co-reference Problem(1/4)
S20: 花園裡有35朵玫瑰花和2朵百合花，花園裡一共有幾朵花？ Sent1: 玫瑰花(5): {FlowerGrass|花草} 百合花(8): {FlowerGrass|花草} Sent2: 花(6): {FlowerGrass|花草} 無法指回Sent-1中的”玫瑰花”與”百合花” S63: 操場上有40個小朋友：每20人分成一組，可以分成多少組？ Sent-2 ”20人”中的”人”指的是Sent-1中的”小朋友” Subset 無法co-refer (by SuChu) 人(3): {human|人} 小朋友(5): {human|人:age={child|少兒}} Chao-Chun Liang, Oct. 19, 2015

原子筆(1): {PenInk|筆墨:telic={write|寫:instrument={~}}} S199: 原子筆每枝20元，鉛筆每枝5元，姐姐買了24枝筆，共付了270元，兩種筆各買幾枝？ Sent-3 ”姐姐買了24枝筆”中的”筆”, 指的是Sent-1中的”原子筆”, 與Sent-2中的” 鉛筆” Sent-3 的筆指的是原子筆和鉛筆，但不是Sent-1, Sent-2中一般性論述的原子筆和鉛筆，無法co-refer。 Sent-4”共付了270元”, 指的是Sent-3 “買了24枝筆”所付的錢 "買"的event frame只有agent, theme，錢(270元)不屬於其中，但可由適當的entailment得知。 Sent-5 “兩種筆” , 指的是Sent-1中的”原子筆”, 與Sent-2中的” 鉛筆” 標記已涵蓋。鉛筆(1): {PenInk|筆墨} 筆(5): {PenInk|筆墨}

輪胎(2): {輪胎|tire} 輪(6): {輪子|wheel} S186: 有100個輪胎，正好可組裝在六輪小貨車和四輪汽車共22輛上，組裝好的六輪小貨車和四輪汽車各有幾輛？ Sent-2 ”正好可組裝在六輪小貨車和四輪汽車” 中 “六輪小貨車”所需的”輪胎” 與“四輪汽車”所需的” 輪胎”指的是Sent-1中的” 100個輪胎”中的”輪胎” 因 "四輪" 與 "六輪" 汽車所指涉的輪胎是一般指稱，並不為Sent-1所提的100個輪胎，不為co-reference。 Sent-3 “六輪小貨車和四輪汽車” , 指的是Sent-2中的”六輪小貨車” 和” 四輪汽車” 目前的標記已涵蓋。

X001: 雞兔共35隻，已知腳數共104隻，問雞兔各有幾隻？ Sent-2 “腳數共104隻”中的”腳數”, 指的是Sent-1中的” 雞”與”兔” 的腳 "腳"的 whole/host 的確可以refer回 Sent-1中"雞"和"兔"的腳，但目前要填回 host 並 co-refer 的只有"長"、"寬"、"高"等那二十幾個詞彙，並不包含"腳"。所以依照現在的標記原則不需要在語意上標記共同指稱。 Append [x1] ("雞""兔"這個複合體) in the next example Sent-3 “雞兔各有幾隻”, 指的是Sent-1中的” 雞”與”兔” 目前標記涵蓋該資訊。其中[x1]指的就是"雞""兔"這個複合體。

Extract and construct quantitative relations
2018/9/17 Extract and construct quantitative relations There are some chickens and rabbits in a cage, we can see 35 heads and 94 legs. How many chickens and rabbits? (若干只鸡兔同在一个笼子里，总共有35个头，有94只脚。问笼中各有多少只鸡和兔？) Solving: Let chicken x, rabbit y. x+y=35 2x+4y=94 So, x=23, y=12 Answer: Rabbits 12, Chickens 23. Slegs Chicken 2 Slegs Rabbit 4 Chicken heads + Rabbit heads = Heads Chicken legs + Rabbit legs = legs

Institute of Information Science, Academia Sinica

Similar presentations

Presentation on theme: "Institute of Information Science, Academia Sinica"— Presentation transcript:

Similar presentations

About project

反馈

请登录

Auth with social network:

Institute of Information Science, Academia Sinica

Similar presentations

Presentation on theme: "Institute of Information Science, Academia Sinica"— Presentation transcript:

Similar presentations

About project

反馈