PRIMT: A Pick-Revise Framework for Interactive Machine Translation

Presentation transcript:

PRIMT: A Pick-Revise Framework for Interactive Machine Translation Shanbo Cheng, Shujian Huang, Huadong Chen, Xinyu Dai and Jiajun Chen Nanjing University By Jiawei Ling

Outline: Introduction (IMT; Traditional IMT and the Pick-Revise Framework). The Pick-Revise IMT Framework (Pick; Revise; Decoder and Model Adaption). Automatic Suggestion Models (PSM; RSM). Experiments. Example Analysis. Conclusion.

IMT: Human translators usually have to modify the output of a machine translation (MT) system, which often needs many modifications and is time-consuming. To speed up the process, interactive machine translation (IMT) has been proposed, which instantly updates the translation result after every human action. Because translation quality can improve after every update, IMT is expected to generate high-quality translations with fewer human actions. Speaker notes: 1. Early machine translation workflows often relied on post-editing, the process of "improving a machine-generated translation with a small amount of manual modification".

Traditional IMT: Typical IMT systems use a left-to-right sentence-completion framework in which the user processes the translation from the beginning of the sentence and interacts with the system at the left-most error. This makes it difficult to fix critical translation errors near the end of a sentence. Critical translation errors are errors that have a large impact on the translation of other words or phrases; they are often caused by the inherent difficulty of translating certain source phrases. Speaker notes: 1. The part of the sentence from its beginning up to the modified position, called the "prefix", is assumed to be correct, and the system generates a new translation after the given prefix. Working strictly from left to right also delays the correction of ambiguous points, which lowers the efficiency of the interaction.

Introduction to the Pick-Revise Framework: Pick: a wrongly translated phrase is selected from the whole sentence. Revise: the correct translation is selected from the translation table (or manually added) to replace the original one. The system then re-translates the sentence and searches for the best translation, using the previous modifications as constraints. We also propose two automatic suggestion models that predict the wrongly translated phrases and select the revised translation.

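Difference between PR and L2R: In the left-to-right system, the left-most error, "to discuss", is selected and revised to "discuss". This does not make the result much better, so more human-computer interactions are needed to improve the translation quality. In the pick-revise system, suppose we pick "反恐" as the most critical translation error and revise its translation from "the" to "anti-terrorism". The sentence is then re-translated, which not only produces the correct translation for that phrase but also improves the quality of the whole translation.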

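[Figure: flowchart of the PRIMT framework. Nodes: Start, Constrained Decoder, Acceptable? (Yes/No), Picking, Revising, Model Adaption, Stop. Edge labels: source sentence S1,…,Sn; translation E1,…,En; picked pair (s[i..j], t); revised pair (s[i..j], t′).] Speaker notes: The framework uses a constrained decoder to generate the translation; the constraints come from the previous pick and revise steps. The results of picking and revising are also fed into model adaption. The whole process loops until the user accepts the translation.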
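The loop described in the notes can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names (`pick_revise_loop`, `toy_decode`, `toy_pick`, `toy_revise`), the word-by-word toy decoder, and the two-word translation table are all assumptions made for the example, and model adaption is left out.

```python
from typing import Callable, Dict, List, Optional, Tuple

Span = Tuple[int, int]            # inclusive source positions (i, j)
Constraints = Dict[Span, str]     # picked span -> revised translation t'


def pick_revise_loop(
    source: List[str],
    decode: Callable[[List[str], Constraints], str],
    acceptable: Callable[[str], bool],
    pick: Callable[[List[str], str], Optional[Span]],
    revise: Callable[[List[str], Span], str],
    max_cycles: int = 10,
) -> Tuple[str, Constraints]:
    """Repeat pick -> revise -> constrained re-decoding until accepted."""
    constraints: Constraints = {}
    translation = decode(source, constraints)
    for _ in range(max_cycles):
        if acceptable(translation):
            break
        span = pick(source, translation)
        if span is None:          # nothing left to pick
            break
        constraints[span] = revise(source, span)
        translation = decode(source, constraints)
    return translation, constraints


# Toy setup (hypothetical data): a word-by-word "decoder" and a simulated user.
TABLE = {"反恐": ["the", "anti-terrorism"], "会议": ["meeting"]}
REFERENCE = ["anti-terrorism", "meeting"]


def toy_decode(source: List[str], constraints: Constraints) -> str:
    # Use the revised translation where a constraint exists, else the top option.
    return " ".join(
        constraints.get((k, k), TABLE[w][0]) for k, w in enumerate(source)
    )


def toy_pick(source: List[str], translation: str) -> Optional[Span]:
    # The simulated user picks the first word that differs from the reference.
    for k, word in enumerate(translation.split()):
        if word != REFERENCE[k]:
            return (k, k)
    return None


def toy_revise(source: List[str], span: Span) -> str:
    # The simulated user selects the table option that matches the reference.
    k = span[0]
    return next(o for o in TABLE[source[k]] if o == REFERENCE[k])


result, used = pick_revise_loop(
    ["反恐", "会议"],
    toy_decode,
    lambda t: t.split() == REFERENCE,
    toy_pick,
    toy_revise,
)
print(result, used)
```

In a real system `decode` would be the constrained decoder and `pick`/`revise` would come from the user interface or from the suggestion models.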

Pick: In the picking step, the user picks a wrongly translated phrase (s[i..j], t), where s[i..j] is the source phrase covering positions i to j and t is its current translation. The aim is to find critical errors in the translation, which are caused by errors in the translation table or by inherent translation ambiguities. To make the picking step easier to integrate into an MT system, we limit the selectable errors to phrases in the output of the previous PR cycle. For more convenient interaction, critical errors in our PRIMT system can be picked from either the source or the target side with a single mouse click. Speaker notes: The more critical the error, the more the translation quality improves when it is corrected, because critical errors strongly affect the translation of the surrounding text.

Revise: The user revises the translation of s[i..j] by selecting the correct translation t′ from the translation table, or by manually adding one if the translation table contains no correct translation. Whether to select or to add depends on the quality of the translation table. When the translation system is trained on enough parallel data, the translation table is usually good enough to offer the correct translation. Speaker notes: 2. For the picked phrase, the translation options in the phrase table are shown to the user as a list; the user completes the revision by clicking the correct translation, or by typing a new translation into an input field.

Decoder and Model Adaption: We use a constrained decoder to search for the best translation with the previous pick-revise pairs (PRPs) as constraints. Compared with a standard decoder, it makes one extra comparison between each translation option and the previous PRPs and ignores every phrase that overlaps with the source side of a PRP. This makes the search space much smaller than in standard decoding. Speaker notes: In each pick-revise cycle, the pick-revise pair (s[i..j], t′) is fed into the decoder.
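The overlap check can be illustrated with a small filter over translation options. This is a sketch under stated assumptions: the `Option` type and the `constrain_options` helper are hypothetical, spans are inclusive source positions, and adding each PRP back as the only option for its span is an assumption about how the constraint is enforced, not something the slides spell out.

```python
from typing import List, NamedTuple


class Option(NamedTuple):
    start: int    # first source position covered (inclusive)
    end: int      # last source position covered (inclusive)
    target: str   # target-side translation


def overlaps(a: Option, b: Option) -> bool:
    """True if the two source spans share at least one position."""
    return a.start <= b.end and b.start <= a.end


def constrain_options(options: List[Option], prps: List[Option]) -> List[Option]:
    """Drop options overlapping any PRP source span, then add the PRPs themselves."""
    kept = [o for o in options if not any(overlaps(o, p) for p in prps)]
    return kept + list(prps)


options = [
    Option(0, 0, "the"),
    Option(0, 1, "the meeting"),
    Option(1, 1, "meeting"),
    Option(2, 2, "was held"),
]
prps = [Option(0, 0, "anti-terrorism")]
constrained = constrain_options(options, prps)
print(constrained)
```

Both options that touch source position 0 are removed, which is why the constrained search space is smaller than the unconstrained one.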

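The Picking Suggestion Model (PSM): The goal of the PSM is to automatically recognize phrases that might be wrongly translated and to suggest that users pick them. Among all the phrases of a source sentence, we need to separate the wrongly translated phrases from the correctly translated ones. We use the translation quality gain after the revising action as the measure. Speaker notes: To further reduce human actions, we use automatic suggestion models that offer the user suggestions for both picking and revising. Because both steps choose among a large number of candidates, we model them with classification-based methods. Next we describe how picking and revising are defined as classification tasks and which features are used. 3. Because translation errors lower translation quality, we use the quality gain after revising as the criterion: phrases whose revision improves the translation quality are treated as wrongly translated, and phrases whose revision degrades it are treated as correctly translated.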
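The labeling rule in the notes can be sketched as below. The function name `label_pick_instances`, the `margin` parameter, and the example gains are hypothetical; skipping phrases whose revision leaves the quality unchanged is also an assumption, since the notes only mention improvement and degradation.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]  # inclusive source positions (i, j)


def label_pick_instances(gains: Dict[Span, float], margin: float = 0.0) -> Dict[Span, int]:
    """Turn per-phrase quality gains (after revising minus before) into labels.

    1 = wrongly translated (revising helped), 0 = correctly translated
    (revising hurt). Gains within +/- margin are left unlabeled.
    """
    labels: Dict[Span, int] = {}
    for span, gain in gains.items():
        if gain > margin:
            labels[span] = 1
        elif gain < -margin:
            labels[span] = 0
    return labels


# Hypothetical quality gains (e.g. sentence-level BLEU differences).
gains = {(0, 0): 4.2, (1, 2): -1.5, (3, 3): 0.0}
labels = label_pick_instances(gains)
print(labels)
```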

The Picking Suggestion Model (PSM): Modeling the picking step requires two kinds of information: whether the phrase is difficult to translate, and whether the current translation option is correct.

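Speaker notes: As features we use the translation model, the language model, the lexical reordering model, count features, part-of-speech tags, and lexical (word) features.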

The Revising Suggestion Model (RSM): The goal of the RSM is to predict the correct translation and to suggest that users replace the wrong translation with the predicted one. We use two criteria to distinguish correct translation options from wrong ones: the correct translation option should be a substring of the reference, and it should be consistent with the pretrained word alignment on the translated sentence pair. With these criteria, we select all correct translation options as positive instances for the revising step and randomly sample the same number of wrong translation options as negative instances. Speaker notes: 1. A phrase has many translation options in the phrase table, and we need to divide them into correct and wrong options (users are not asked to label them). 2. The first criterion ensures that the option itself is valid; the second ensures that the option does not translate anything outside the source phrase. 3. In particular, the translation option used by the baseline system is treated as a negative instance.
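A rough sketch of how such training instances might be assembled is given below. The names `is_token_substring` and `build_rsm_instances` are hypothetical; the substring test is done on whitespace tokens (an assumption), and the word-alignment check is passed in as a caller-supplied predicate because the slides do not describe how it is computed.

```python
import random
from typing import Callable, List, Tuple


def is_token_substring(option: str, reference: str) -> bool:
    """True if the option's tokens appear contiguously in the reference."""
    opt, ref = option.split(), reference.split()
    n = len(opt)
    return n > 0 and any(ref[k:k + n] == opt for k in range(len(ref) - n + 1))


def build_rsm_instances(
    options: List[str],
    reference: str,
    alignment_ok: Callable[[str], bool],
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Return (positives, negatives) with at most as many negatives as positives."""
    positives = [
        o for o in options
        if is_token_substring(o, reference) and alignment_ok(o)
    ]
    wrong = [o for o in options if o not in positives]
    rng = random.Random(seed)  # fixed seed so the sampling is repeatable
    negatives = rng.sample(wrong, min(len(positives), len(wrong)))
    return positives, negatives


options = ["the", "anti-terrorism", "counter terror", "terror"]
reference = "an anti-terrorism meeting was held"
# Placeholder alignment check that accepts everything (assumption).
positives, negatives = build_rsm_instances(options, reference, lambda o: True)
print(positives, negatives)
```

Sampling the same number of negatives keeps the two classes balanced, which matches the description on the slide.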

The Revising Suggestion Model (RSM): For the translations of a given source phrase, there is no need to compare source-side information, because these translation options share the same source phrase and context. The features therefore mainly focus on estimating the translation quality of a given translation option.

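Experiments in ideal environment: We experiment on the sentences whose references can be produced by our current MT system with forced decoding. Forced decoding forces the decoder to generate a translation almost identical to the reference, which means that no new words have to be typed in to produce the correct translation. The human revising action is simulated only as selecting the best translation from the phrase table, and this setup guarantees that the phrase table contains the correct translation of every phrase. The first PR cycle, which corrects the most critical error, already yields a large gain in translation quality as measured by BLEU (an automatic MT evaluation metric). KSMR (Keystroke and Mouse-action Ratio) is the number of keystrokes plus mouse actions needed to reach the correct translation, divided by the length of the correct translation. Compared with post-editing, the PRIMT framework lets users obtain better translations with fewer interactions, that is, more efficiently.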
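As a small worked example of the KSMR definition, the sketch below divides the interaction count by the reference length. The function name `ksmr` is hypothetical, and measuring the reference length in characters is an assumption; the notes only say "length of the correct translation".

```python
def ksmr(keystrokes: int, mouse_actions: int, reference: str) -> float:
    """Keystroke and mouse-action ratio for one sentence.

    The reference length is counted in characters (assumption).
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return (keystrokes + mouse_actions) / len(reference)


# Two mouse clicks (one pick, one revise) and no typing on a 22-character reference.
score = ksmr(keystrokes=0, mouse_actions=2, reference="anti-terrorism meeting")
print(round(score, 4))
```

A lower value means less human effort per unit of final translation.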

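Experiments in general environment: Translation quality also improves significantly in the general environment. Because of the limitations of the MT system itself, for some sentences the translation table may not contain the correct translation of a source phrase. Although the BLEU gain in the general setting is smaller than in the ideal environment, it is still substantial.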

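Using Automatic Suggestion Models: We show the effectiveness of the automatic suggestion models through both classification performance and translation performance. Because only the options predicted to be correct translations are used in the IMT system, the precision (correct items extracted / items extracted), recall (correct items extracted / correct items in the sample) and F-score (the harmonic mean of precision and recall) in the left table are computed on the correct translation options. Because it is hard to identify correct translations automatically, the translation is kept unchanged when the RSM classifies all options as wrong. The feed-forward neural network brings some improvement, and the F-scores of both the PSM and the RSM are around 0.60 to 0.70, which is a reasonably good level.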
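The three measures can be computed as in the following sketch. The function name `precision_recall_f1` and the example item IDs are hypothetical; returning 0.0 when a denominator is zero is a convention chosen for the example.

```python
from typing import Iterable, Tuple


def precision_recall_f1(predicted: Iterable, gold: Iterable) -> Tuple[float, float, float]:
    """Precision, recall and F-score of a predicted set against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Four options predicted as correct, three options actually correct, two in common.
p, r, f = precision_recall_f1({1, 2, 3, 4}, {2, 3, 5})
print(p, r, f)
```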

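Using Automatic Suggestion Models: We also measure the gain in translation quality when the models are used inside the PR framework. Random picking brings almost no improvement, whereas picking with the PSM gives a clear BLEU gain, which shows that the BLEU gain obtained in the revising step does not come merely from matching longer translations. Random revising does not bring a significant BLEU gain either, and revising with the RSM brings a gain of less than 2 BLEU points. In general, using either the PSM or the RSM still improves translation quality. Compared with the fully simulated results, however, the gain is small, which shows that human involvement remains important for improving translation quality. Better models or more training data could greatly improve the quality of the automatic suggestion models.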

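Example Analysis: 1. In the first cycle, "第六" is picked and its translation is revised from "the" to "the 6th", which also causes "confirmed" to change to "confirms". In the second cycle, the translation of "病例" is revised from "cases" to "case", and "禽流感死亡病例" is then translated as "death case from the bird flu". 2. In the first cycle, "需要 一定" is picked, which causes the translation of "通常" to change from "is" to "usually". In the second cycle, the translation of "过程" is revised from "process" to "course", which changes "," to ", and" and at the same time moves "course" to a different position. In the last cycle, the translation of "很难" is revised from "it" to "it cannot be"; "一蹴而就", however, has no suitable translation option, so the human translator has to add one, which produces the reference translation. 3. In the first cycle, "无法" is picked as the critical error, and as a result "充分扫除" is translated as "fully clear". In the second cycle, "以色列" is picked, which changes the translation of "回答" from "response" to "reply". The translation still differs from the reference, because the language model and the lexical reordering model prefer the wrong phrase order and place "the us" at the end of the sentence. This is a problem of the MT system itself and cannot be solved within the framework.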

Conclusion: By correcting the critical error instead of the left-most one, our framework can improve translation quality more quickly and efficiently. By using automatic suggestion models, we can reduce human interaction to a single type, either picking or revising. The performance of the current framework still depends on the underlying MT system. Further improvement could be achieved by supporting other types of interaction, such as reordering operations, or by building the system on stronger statistical models.

Q&A Thank you~