1
Temporal Expression Recognition Based on Rule Extraction - English III 高冠吉
2
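Thoughts on the TE3 dataset
Relaxed Match on the TE3 test set has already reached 98.55, so applying some post-processing after automatic recognition is feasible.
Use the parse tree: a temporal expression should be a complete NP, ADVP, or CD.
Further organize the token-level rules.
(Rough framework: extraction rules -> extract candidate temporal expressions -> correct boundaries with the parse tree -> merge)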
3
Using the parse tree to fix over-long recognition
This comes just over a week before the start of British Summer Time.
This flu season started in early December, a month earlier than usual.
4
Using the parse tree to fix partial recognition
Leon worked in Texas, a position he had held for almost seven years.
China's current economic policies would result in an enormous surge in coal consumption and automobile sales over the next decade.
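The slides do not spell out how the parse tree corrects a candidate's boundaries, so the following is only a minimal sketch of one possible reading. It assumes a Penn-style bracketed parse is available, expands a partially recognized span to the smallest NP/ADVP/CD constituent that covers it, and splits an over-long span into the largest NP/ADVP/CD constituents that fit inside it. The function names (`parse_tree`, `spans`, `expand`, `split_inside`) and the hand-written tree fragments are hypothetical illustrations, not parser output and not the author's implementation.

```python
import re

# Constituent labels the slides allow for a temporal expression.
TIME_LABELS = {"NP", "ADVP", "CD"}

def parse_tree(text):
    """Parse a bracketed tree into (label, children); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()

def spans(tree):
    """List (label, start, end) for every constituent; end is exclusive."""
    out = []

    def walk(node, start):
        label, children = node
        end = start
        for child in children:
            end = end + 1 if isinstance(child, str) else walk(child, end)
        out.append((label, start, end))
        return end

    walk(tree, 0)
    return out

def expand(tree, start, end):
    """Smallest NP/ADVP/CD constituent covering tokens [start, end), or None."""
    covering = [(e - s, s, e) for label, s, e in spans(tree)
                if label in TIME_LABELS and s <= start and end <= e]
    if not covering:
        return None
    _, s, e = min(covering)
    return (s, e)

def split_inside(tree, start, end):
    """Largest NP/ADVP/CD constituents lying fully inside [start, end)."""
    inside = [(s, e) for label, s, e in spans(tree)
              if label in TIME_LABELS and start <= s and e <= end]
    maximal = [a for a in inside
               if not any(b != a and b[0] <= a[0] and a[1] <= b[1] for b in inside)]
    return sorted(set(maximal))

# Hand-written fragment for "for almost seven years" (an assumption).
partial = parse_tree("(PP (IN for) (NP (QP (RB almost) (CD seven)) (NNS years)))")
print(expand(partial, 2, 4))          # candidate "seven years" -> (1, 4)

# Hand-written fragment for "started in early December, a month earlier than usual".
too_long = parse_tree(
    "(VP (VBD started) (PP (IN in) (NP (JJ early) (NNP December))) (, ,) "
    "(ADVP (NP (DT a) (NN month)) (RBR earlier) (PP (IN than) (ADJP (JJ usual)))))")
print(split_inside(too_long, 2, 7))   # "early December , a month" -> [(2, 4), (5, 7)]
```

Token indices are zero-based and end-exclusive. Whether a mismatched candidate should be expanded or split is a policy decision the slides leave open, which is why the two operations are kept as separate functions here.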
5
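Manually constructing token rules (generalization, classification)
SynTime's token rules can be reused, then corrected and extended.
Organize the categories into a hierarchy to handle some of the generalizable temporal expressions.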
Late last July
[MONTH_REGEX] Januarys?('s)?|Februarys?('s)?|Marchs?('s)?|Aprils?('s)?|Mays?('s)?|Junes?('s)?|Julys?('s)?|Augusts?('s)?|Septembers?('s)?|Octobers?('s)?|Novembers?('s)?|Decembers?('s)?
[PREFIX_REGEX_1] the|this|that|these|those|next|following|consecutive|previous|latter|last|late(st)?|initial|universal|mid(dle)?|final|coming|upcoming|past|future|current|recent|ides|early|each|every|other|alternate|alternating|another|about|around|almost|some|whole|few|several|of|more|less|than|near(ly)?|right
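To make the token rules concrete, here is a small sketch of how the two regular expressions above could assign types to the tokens of "Late last July". The regex strings are copied from the slide; everything else (the type names `MONTH` and `PREFIX`, the functions `tag` and `month_expressions`, case-insensitive whole-token matching, and the simple "prefixes directly before a month" grouping rule) is an assumption made for illustration and is not claimed to be SynTime's actual heuristic.

```python
import re

# Regexes copied verbatim from the slide.
MONTH_REGEX = (r"Januarys?('s)?|Februarys?('s)?|Marchs?('s)?|Aprils?('s)?|Mays?('s)?"
               r"|Junes?('s)?|Julys?('s)?|Augusts?('s)?|Septembers?('s)?"
               r"|Octobers?('s)?|Novembers?('s)?|Decembers?('s)?")
PREFIX_REGEX_1 = (r"the|this|that|these|those|next|following|consecutive|previous"
                  r"|latter|last|late(st)?|initial|universal|mid(dle)?|final|coming"
                  r"|upcoming|past|future|current|recent|ides|early|each|every|other"
                  r"|alternate|alternating|another|about|around|almost|some|whole"
                  r"|few|several|of|more|less|than|near(ly)?|right")

# Hypothetical type names; a token gets the first type whose regex matches it fully.
TOKEN_TYPES = [
    ("MONTH", re.compile(MONTH_REGEX, re.IGNORECASE)),
    ("PREFIX", re.compile(PREFIX_REGEX_1, re.IGNORECASE)),
]

def tag(tokens):
    """Pair each token with its type, or "O" when no regex matches."""
    tagged = []
    for tok in tokens:
        label = next((name for name, rx in TOKEN_TYPES if rx.fullmatch(tok)), "O")
        tagged.append((tok, label))
    return tagged

def month_expressions(tokens):
    """Join each MONTH token with the PREFIX tokens directly before it."""
    tagged = tag(tokens)
    found = []
    for i, (_, label) in enumerate(tagged):
        if label != "MONTH":
            continue
        start = i
        while start > 0 and tagged[start - 1][1] == "PREFIX":
            start -= 1
        found.append(" ".join(tokens[start:i + 1]))
    return found

print(tag("Late last July".split()))
# [('Late', 'PREFIX'), ('last', 'PREFIX'), ('July', 'MONTH')]
print(month_expressions("Sales rose late last July".split()))
# ['late last July']
```

Because tokens are generalized to types before any grouping, the same rule would also cover phrases such as "early December" without listing them individually, which is the kind of generalization the slide points at.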