1
Classification of Web Query Intent Using Encyclopedia (Chinese title: "Query Intent Acquisition Based on Encyclopedia Knowledge")
Bingquan Liu, Ming Liu, Gang Hu
Harbin Institute of Technology
2
Outline
Motivation
Seed term extraction
Intent classification
Experimental results
3
Motivation
Improve the performance of retrieval systems by identifying the user's search intent.
Classical classification methods need an adequate training corpus, which is unavailable in the retrieval setting.
Classical classification methods mostly target long texts, whereas queries are very short.
4
Seed term extraction
Semantic similarity between words is computed from HowNet.
Lexical construction to indicate the text's topic.
Markov random walk to extend the seed term set.
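The seed-set extension step can be illustrated with a small sketch. The slides name only a Markov random walk, so everything else here is an assumption: the restart variant of the walk, the function name `expand_seed_terms`, the parameters `restart` and `iterations`, and the toy `SIMILARITY` graph, whose English terms and weights are made-up stand-ins for HowNet-based similarities.

```python
# Hypothetical symmetric similarity graph; weights are illustrative only.
# Every neighbour must also appear as a key, and seeds must be keys too.
SIMILARITY = {
    "portal": {"forum": 0.8, "blog": 0.6},
    "forum": {"portal": 0.8, "blog": 0.7},
    "blog": {"portal": 0.6, "forum": 0.7, "movie": 0.1},
    "movie": {"blog": 0.1, "song": 0.9},
    "song": {"movie": 0.9},
}


def expand_seed_terms(similarity, seeds, top_k=2, restart=0.15, iterations=50):
    """Rank non-seed terms by random-walk-with-restart probability."""
    # Row-normalise similarity weights into transition probabilities.
    transition = {}
    for term, neighbours in similarity.items():
        total = sum(neighbours.values())
        transition[term] = (
            {n: w / total for n, w in neighbours.items()} if total > 0 else {}
        )

    # The walk starts (and restarts) uniformly on the seed terms.
    start = {t: (1.0 / len(seeds) if t in seeds else 0.0) for t in similarity}
    prob = dict(start)

    for _ in range(iterations):
        nxt = {t: restart * start[t] for t in similarity}
        for term, mass in prob.items():
            for neighbour, p in transition[term].items():
                nxt[neighbour] += (1.0 - restart) * mass * p
        prob = nxt

    # Terms the walk visits most often are proposed as new seed terms.
    candidates = [t for t in similarity if t not in seeds]
    candidates.sort(key=lambda t: prob[t], reverse=True)
    return candidates[:top_k]


new_terms = expand_seed_terms(SIMILARITY, {"portal"}, top_k=2)
```

Starting from the single seed "portal", the walk concentrates on the closely connected terms, so weakly linked terms such as "song" are not proposed.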
5
Intent classification
Training corpus built from Baidu Zhidao daily logs.
Intent classification based on an SVM classifier.
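As a rough illustration of SVM-based intent classification, the sketch below trains one-vs-rest linear SVMs by stochastic subgradient descent on the L2-regularised hinge loss. The slides do not state the features, kernel, or training procedure, so all of the following are assumptions: the seed-term hit-count features, the English stand-in terms in `SEED_TERMS`, the toy training queries in `TRAIN`, and the hyperparameters `epochs`, `lr`, and `lam`.

```python
# Hypothetical English stand-ins for seed terms of the three intent classes.
SEED_TERMS = {
    "navigation": {"portal", "blog", "forum"},
    "person": {"celebrity", "athlete", "expert"},
    "download": {"movie", "song", "software"},
}
CATEGORIES = list(SEED_TERMS)


def featurize(query):
    """One feature per intent class: number of query tokens that are seed terms."""
    tokens = query.lower().split()
    return [float(sum(tok in SEED_TERMS[c] for tok in tokens)) for c in CATEGORIES]


def train_binary_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via subgradient descent on hinge loss with L2 penalty."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Gradient step for the L2 penalty (the bias is not regularised).
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move towards classifying xi correctly.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def train_one_vs_rest(queries, labels):
    X = [featurize(q) for q in queries]
    return {
        c: train_binary_svm(X, [1.0 if lab == c else -1.0 for lab in labels])
        for c in CATEGORIES
    }


def predict(models, query):
    x = featurize(query)

    def score(c):
        w, b = models[c]
        return sum(wj * xj for wj, xj in zip(w, x)) + b

    # Pick the class whose binary SVM gives the highest decision value.
    return max(CATEGORIES, key=score)


# Toy labelled queries standing in for a log-derived training corpus.
TRAIN = [
    ("sports forum", "navigation"),
    ("tech blog portal", "navigation"),
    ("famous athlete", "person"),
    ("celebrity expert interview", "person"),
    ("free software", "download"),
    ("movie song pack", "download"),
]
MODELS = train_one_vs_rest([q for q, _ in TRAIN], [lab for _, lab in TRAIN])
```

Calling `predict(MODELS, "new movie")` then returns the class whose seed terms the query matches. A real system would likely use a library SVM and richer features; the hand-written trainer is used here only to keep the sketch dependency-free.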
6
Experimental results
Test corpus crawled from Sogou.
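Table 1 Seed term extraction
Intent category | Manually selected open categories | Seed entries
Navigational | portal sites, blogs, microblogs, online malls, Tieba (post bars), forums, online ... | 17958
Person name | celebrities, experts, athletes, great figures, modern figures, ancient figures ... | 366411
Download | movies, songs, novels, software, feature films, war films, computer software, antivirus software, system tools ... | 96700

Table 2 Classification results
Intent category | Baidu Baike: P | R | F | Manual annotation: P | R | F
Navigational | 87.62 | 76.53 | 83.58 | 88.31 | 75.66 | 83.65
Person name | 89.43 | 74.69 | 83.91 | 91.28 | 76.25 | 85.65
Download | 83.37 | 79.31 | 81.97 | 82.94 | 77.90 | 80.99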