Visualizing and Understanding Neural Machine Translation
ACL 2017. Yanzhuo Ding, Yang Liu, Huanbo Luan, Maosong Sun
Machine Translation MT: using computers to translate natural languages
布什 与 沙龙 举行 了 会谈 → Bush held a talk with Sharon
Neural Machine Translation
Black Box As the technology has developed, NMT has attracted growing attention in recent years. Most current neural machine translation systems are attention-based encoder-decoder models: during encoding, an LSTM turns the input into hidden states; an attention mechanism then feeds the encoder hidden states into the decoder as context, and another LSTM generates the target-language words. NMT now far outperforms traditional statistical machine translation and has become the mainstream approach in both academia and industry. However, what flows through the model are floating-point numbers with no linguistic interpretation, so NMT is hard to understand and debug.
Previous Work Attention: relevance between input and output
Works well, but only captures alignment between the encoder and decoder hidden states. (Bahdanau et al., 2015)
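As a quick illustration of how attention serves as a relevance signal, the sketch below computes one weight per source position with a toy dot-product attention (`enc_states` and `dec_state` are made-up values, not the paper's model):

```python
import numpy as np

def attention_weights(enc_states, dec_state):
    """Dot-product attention: one weight per source position."""
    scores = enc_states @ dec_state            # (src_len,)
    e = np.exp(scores - scores.max())          # numerically stable softmax
    return e / e.sum()

# Toy example: 3 source positions, hidden size 2.
enc_states = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dec_state = np.array([0.5, -0.5])
w = attention_weights(enc_states, dec_state)
assert np.isclose(w.sum(), 1.0)  # weights form a distribution over source words
```

Reading each weight as "relevance of source word i to the current target word" gives exactly the alignment heatmaps used in this line of work, but only between the two hidden layers that attention connects.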
Previous Work First-Derivative Saliency: using gradients to measure relevance. (Li et al., 2016)
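The idea can be sketched numerically: first-derivative saliency scores each input dimension by |∂y/∂x_i|. A minimal sketch with central differences standing in for backpropagation (the toy linear score `f` is my own, not from Li et al.):

```python
import numpy as np

def saliency(f, x, eps=1e-5):
    """First-derivative saliency: |df/dx_i| via central differences."""
    s = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        s[i] = abs(f(x + d) - f(x - d)) / (2 * eps)
    return s

# Toy score function standing in for a target-word logit.
w = np.array([2.0, -1.0, 0.5])
f = lambda x: float(w @ x)
print(saliency(f, np.array([1.0, 1.0, 1.0])))  # for a linear f this is |w|
```

For a linear score the saliency recovers the weight magnitudes; in a real NMT model the gradient would be taken with respect to the input word embeddings.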
Previous Work Layer-wise relevance propagation: decomposing outputs into a sum of relevance scores. (Bach et al., 2015)
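For intuition, here is a minimal sketch of the core LRP redistribution rule for a single dense layer (an epsilon-stabilized z-rule; biases and nonlinearities are ignored for brevity, and the toy values are mine):

```python
import numpy as np

def lrp_dense(a, W, R_out, eps=1e-9):
    """Redistribute output relevance R_out back to the inputs a through
    weights W (epsilon-stabilized z-rule)."""
    z = W @ a                                  # pre-activations, shape (out,)
    s = R_out / (z + eps * np.sign(z))         # stabilized share per output
    return a * (W.T @ s)                       # input relevance, shape (in,)

a = np.array([1.0, 2.0])
W = np.array([[1.0, 0.5], [0.0, 1.0]])
R_out = W @ a                                  # seed relevance at the outputs
R_in = lrp_dense(a, W, R_out)
assert np.isclose(R_in.sum(), R_out.sum())     # conservation: totals match
```

The conservation property (total relevance is preserved layer to layer) is what makes the decomposition "a sum of relevance scores" over the inputs.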
Our Work Visualizing and interpreting NMT using the LRP method
Helping to analyze translation errors
An Example
Neuron-level relevance
The relevance between two neurons.
Vector-level relevance
The relevance between two vectors.
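One way to make this precise (my notation; the paper's exact definition may differ in normalization): writing r_{a←b} for the neuron-level relevance between neuron a of vector u and neuron b of vector v, the vector-level relevance aggregates over all neuron pairs:

```latex
% Hedged formalization (notation mine): vector-level relevance as the
% sum of neuron-level relevances over all neuron pairs of u and v.
R(\mathbf{u} \leftarrow \mathbf{v}) \;=\; \sum_{a \in \mathbf{u}} \sum_{b \in \mathbf{v}} r_{a \leftarrow b}
```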
Relevance vectors A sequence of vector-level relevances between a hidden state and each of its contextual words
Weight ratio, defined for three operator types:
Matrix multiplication
Element-wise multiplication
Maximization
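A hedged sketch of what weight ratios for these three operator types could look like (my reading of the general scheme; the paper's exact stabilizers and normalizations may differ):

```python
import numpy as np

def ratio_matmul(W, u, eps=1e-9):
    """v = W @ u: input u[i]'s share of output v[j] is W[j, i] * u[i] / v[j]."""
    v = W @ u
    return (W * u) / (v[:, None] + eps)        # shape (out, in)

def ratio_elementwise(u, z, eps=1e-9):
    """v = u * z: split v[i]'s relevance between u[i] and z[i] by magnitude
    (one common choice; the paper's rule may differ)."""
    denom = np.abs(u) + np.abs(z) + eps
    return np.abs(u) / denom, np.abs(z) / denom

def ratio_max(u):
    """v = max(u): the maximizing entry receives all the relevance."""
    r = np.zeros_like(u)
    r[np.argmax(u)] = 1.0
    return r

rows_sum = ratio_matmul(np.array([[1.0, 3.0]]), np.array([2.0, 1.0])).sum(axis=1)
assert np.allclose(rows_sum, 1.0)              # each output's shares sum to 1
```

In each case the ratios for one output sum to one, so relevance is conserved when it is propagated backward through the operator.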
LRP Algorithm in NMT Algorithm: Layer-wise relevance propagation for NMT
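The overall procedure can be sketched as a backward sweep over the computation graph: seed the neuron (or hidden state) of interest with relevance, then redistribute it toward the input words via the per-operator weight ratios. A schematic (the graph encoding and `weight_ratio` callback are hypothetical simplifications, not the paper's pseudocode):

```python
def lrp_backward(graph, target, weight_ratio):
    """graph: dict node -> list of parent nodes, listed so that outputs are
    visited before their inputs; relevance flows from `target` to the leaves."""
    relevance = {target: 1.0}                  # seed the neuron of interest
    for node in graph:                         # outputs before inputs
        r = relevance.get(node, 0.0)
        for parent in graph[node]:
            share = weight_ratio(parent, node) # operator-specific ratio
            relevance[parent] = relevance.get(parent, 0.0) + share * r
    return relevance

# Tiny chain y <- h <- x with all ratios 1: relevance passes through intact.
g = {"y": ["h"], "h": ["x"], "x": []}
rel = lrp_backward(g, "y", lambda p, n: 1.0)
assert rel["x"] == 1.0
```

After the sweep, the relevance accumulated at the source and target word embeddings yields the heatmaps shown on the following slides.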
Visualization of NMT model
Source Side
[Relevance heatmap over the source words 近 两 年 来 , 美国 (jin liang nian lai , meiguo, "in the past two years, the United States")]
Visualization of NMT model
Target Side
[Relevance heatmap for the target words "my visit is to pray", aligned to the source words 我 参拜 是 为了 祈求 (wo canbai shi weile qiqiu)]
Translation error analysis
Word Omission
[Relevance heatmaps for the source 参 众 两 院 信任 投票 &lt;/s&gt; (can zhong liang yuan xinren toupiao) and the translation "vote of confidence in the senate &lt;/s&gt;", where a source word is omitted]
Translation error analysis
Word Repetition
Translation error analysis
Unrelated Words
Translation error analysis
Negation Reversion
Conclusion
We propose to use layer-wise relevance propagation to visualize and interpret NMT.
Our approach can calculate the relevance between arbitrary hidden states and contextual words.
It helps us analyze translation errors and debug the model.
Thanks