Survey of the State of the Art in Data Summarization: Preliminary Thoughts on Context-Aware Summarization. Xu Danyun (徐丹云)


1 Survey of the State of the Art in Data Summarization: Preliminary Thoughts on Context-Aware Summarization. Xu Danyun (徐丹云)

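2 Taxonomy of data summarization. Entity summarization: query-oriented, context-aware, general-purpose. Ontology summarization: term extraction, sentence extraction. RDF graph summarization: pattern extraction, vertex clustering.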

3 Entity summarization, query-oriented: relevance to the query. Bai X, Delbru R, Tummarello G. RDF snippets for Semantic Web search engines. Topic-related nodes + query-related nodes + ranking by a heuristic algorithm. Cheng G, Qu Y. Searching linked objects with Falcons: Approach, implementation and evaluation. Word vectors + cosine similarity. Zhang L, Zhang Y, Chen Y. Summarizing highly structured documents for effective search interaction. Machine learning methods that compute the relevance between facet-values and the query.
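
As a minimal sketch of the word-vector plus cosine-similarity idea noted for Falcons above, the following ranks candidate snippets against a keyword query. The whitespace tokenizer, the function names and the bag-of-words weighting are illustrative assumptions, not details taken from the cited paper.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_snippets(query, snippets):
    """Return the snippets sorted by decreasing similarity to the query."""
    q = term_vector(query)
    return sorted(snippets,
                  key=lambda s: cosine_similarity(q, term_vector(s)),
                  reverse=True)
```

A real system would likely weight terms (for example with TF-IDF) rather than use raw counts; raw counts keep the sketch short.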

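4 Entity summarization, context-aware. Tonon A, Catasta M, Demartini G, et al. TRank: Ranking Entity Types Using the Web of Data. Signals used: how frequently each type occurs; the entities mentioned in the context; the type hierarchy.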

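5 Entity summarization, general-purpose: single-step summarization. Cheng G, Tran T, Qu Y. RELIN: relatedness and informativeness-based centrality for entity summarization. Random surfer model; relatedness of two features (how often they co-occur in search engine results); informativeness of a feature (how often it occurs in the dataset). Thalhammer A, Toma I, Roa-Valverde A, et al. Leveraging usage data for linked data movie entity summarization. k-nearest neighbors; feature score computation (features shared with the k nearest neighbors); select the top n features.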
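
The k-nearest-neighbor steps listed above (find the neighbors, score each feature by how many neighbors share it, keep the top n) can be sketched as follows. The function name, the precomputed `similarity` dictionary and the tie-breaking rule are assumptions made for illustration; how similarity is actually derived from usage data is not reproduced here.

```python
def summarize_entity(target, features, similarity, k=2, n=2):
    """Pick the top-n features of `target` shared with its k nearest neighbors.

    features:   dict entity -> set of feature strings (e.g. "property:value")
    similarity: dict (target, other) -> similarity score, assumed precomputed
    """
    others = [e for e in features if e != target]
    # k nearest neighbors = the k entities most similar to the target
    neighbors = sorted(others,
                       key=lambda e: similarity.get((target, e), 0.0),
                       reverse=True)[:k]
    # Score of a feature = number of neighbors that also have it
    scores = {f: sum(f in features[nb] for nb in neighbors)
              for f in features[target]}
    # Highest score first; ties are broken alphabetically for determinism
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return ranked[:n]
```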

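6 Entity summarization, general-purpose: multi-step summarization. Fakas G J, Cai Z, Mamoulis N. Size-l object summaries for relational keyword. Formulated as an optimization problem (dynamic programming, greedy algorithm): each tuple has an importance, and the summary maximizes the importance of the tuples it includes. Sydow M, Pikuła M, Schenkel R. The notion of diversity in graphical entity summarisation on semantic knowledge graphs. relevance + importance + popularity + diversity.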
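
A greedy variant of the optimization above can be sketched as follows: starting from the root tuple, grow a connected summary of at most l tuples, always adding the most important tuple adjacent to the current summary. The tree, the importance scores and all names are hypothetical, and this is a heuristic sketch rather than the algorithm of the cited paper.

```python
import heapq


def greedy_size_l_summary(children, importance, root, l):
    """Greedily select up to l connected nodes, starting at `root`.

    children:   dict node -> list of child nodes (a tree)
    importance: dict node -> numeric importance score
    """
    selected = []
    # Max-heap via negated importance; the node name breaks ties
    frontier = [(-importance[root], root)]
    while frontier and len(selected) < l:
        _, node = heapq.heappop(frontier)
        selected.append(node)
        # Children become eligible only once their parent is selected,
        # which keeps the summary connected
        for child in children.get(node, []):
            heapq.heappush(frontier, (-importance[child], child))
    return selected


tree = {"author": ["paper1", "paper2"], "paper1": ["conf1"], "paper2": ["conf2"]}
scores = {"author": 1.0, "paper1": 0.2, "paper2": 0.5, "conf1": 0.9, "conf2": 0.4}
print(greedy_size_l_summary(tree, scores, "author", 3))  # ['author', 'paper2', 'conf2']
```

On this toy tree the greedy choice misses the higher-scoring summary author, paper1, conf1, because conf1 only becomes reachable after the low-scoring paper1 is taken. That gap is the reason an exact method such as dynamic programming is listed alongside the greedy one.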

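7 Entity summarization: summary. Statistical information: informativeness, popularity, etc. Application-specific information: relevance. Importance in the graph structure: graph algorithms.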

8 Ontology summarization, term extraction. Zhang X, Li H, Qu Y. Finding important vocabulary within ontology. Vocabulary Dependency Graph + similarity between the vocabulary and the ontology + Double Focused PageRank algorithm (concepts + relations). Wu G, Li J, Feng L, et al. Identifying potentially important concepts and relations in an ontology. The four principles of the CARRank algorithm, applied iteratively (concepts + relations): (1) a concept is more important if more relations start from it; (2) a concept is more important if there is a relation starting from it to a more important concept; (3) a concept is more important if it has a higher relation weight to any other concept; (4) a relation weight is higher if the relation starts from a more important concept.
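
The first two principles can be illustrated with a simplified PageRank-style iteration in which importance flows backwards along relations, from the target concept to the source concept. This is not the CARRank algorithm itself: relation weights are treated as uniform (so principles 3 and 4 are not modeled), and the damping factor, iteration count and function name are assumptions made for the sketch.

```python
def reverse_rank(relations, damping=0.85, iterations=50):
    """Score concepts so that sources of relations to important concepts rank high.

    relations: list of (source_concept, target_concept) pairs
    """
    concepts = sorted({c for pair in relations for c in pair})
    in_degree = {c: 0 for c in concepts}
    for _, target in relations:
        in_degree[target] += 1
    score = {c: 1.0 / len(concepts) for c in concepts}
    for _ in range(iterations):
        new_score = {c: (1.0 - damping) / len(concepts) for c in concepts}
        for source, target in relations:
            # Each target splits its score evenly among the concepts pointing to it
            new_score[source] += damping * score[target] / in_degree[target]
        score = new_score
    return score
```

For the relations A→B, A→C and B→C, this ranks A above B above C: A has the most outgoing relations, and one of them leads to B, which itself points to another concept.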

9 Ontology summarization, sentence extraction: concepts and the relations between them
Zhang X, Cheng G, Qu Y. Ontology summarization based on RDF sentence graph. RDF Sentence Graph + "centrality" of RDF sentences: degree centrality, betweenness centrality, eigenvector centrality (PageRank, HITS). Zhang X, Cheng G, Ge W Y, et al. Summarizing vocabularies in the global semantic web. Expanded Bipartite Graph + Weighted HITS algorithm (structural importance); average importance of the terms a sentence contains (pragmatic importance). Cheng G, Ji F, Luo S, et al. BipRank: ranking and summarizing RDF vocabulary descriptions. A bipartite graph (sentence-item graph); random walk.
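
To make the bipartite-graph idea concrete, here is a minimal, unweighted HITS-style iteration over a sentence-term graph: sentences and terms reinforce each other's scores. The weighting scheme and the expanded bipartite graph of the cited work are not reproduced; the input format, names and iteration count are assumptions.

```python
import math


def bipartite_hits(sentence_terms, iterations=30):
    """HITS-style scores on a sentence-term bipartite graph.

    sentence_terms: dict sentence_id -> set of terms the sentence mentions
    Returns (sentence_scores, term_scores).
    """
    terms = {t for ts in sentence_terms.values() for t in ts}
    s_score = {s: 1.0 for s in sentence_terms}
    t_score = {t: 1.0 for t in terms}

    def normalize(scores):
        # Scale to unit Euclidean length so scores stay bounded
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {key: v / norm for key, v in scores.items()}

    for _ in range(iterations):
        # A term is important if important sentences mention it
        t_score = normalize({
            t: sum(s_score[s] for s, ts in sentence_terms.items() if t in ts)
            for t in terms
        })
        # A sentence is important if it mentions important terms
        s_score = normalize({
            s: sum(t_score[t] for t in ts)
            for s, ts in sentence_terms.items()
        })
    return s_score, t_score
```

The top-scoring sentences would then form the summary; sentences that only mention isolated terms fade towards zero.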

10 Ontology summarization: summary. Construct a graph; apply graph algorithms.

11 RDF graph summarization, pattern extraction. Basse A, Gandon F, Mirbel I, et al. DFS-based frequent graph pattern extraction to characterize the content of RDF Triple Stores. DFS codes represent RDF graph patterns + recursion. Presutti V, Aroyo L, Adamou A, et al. Extracting core knowledge from Linked Data. Dataset knowledge architecture + statistical analysis + betweenness centrality.

12 RDF graph summarization, vertex clustering

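13 RDF graph summarization, vertex clustering: equivalence relations and partitioning. Tian Y, Hankins R A, Patel J M. Efficient aggregation for graph summarization. Grouping by user-selected attributes and relationships. Campinas S, Perry T E, Ceccarelli D, et al. Introducing RDF graph summary with application to assisted SPARQL formulation. Same type + similar attributes. Campinas S, Delbru R, Tummarello G. Efficiency and precision trade-offs in graph summary algorithms. Grouping criteria: at least one type in common; all types identical; same attributes.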
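
Partition-based clustering of this kind can be sketched as follows for two of the criteria above: "all types identical" and "same attributes". Subjects with an equal signature fall into the same block. The triple representation, the `rdf:type` string and the function name are assumptions for illustration; the "at least one type in common" criterion is not covered, since it needs different handling than a simple signature lookup.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed prefixed name for the type predicate


def summarize_by(triples, key):
    """Partition subjects into blocks; subjects with an equal signature share a block.

    triples: iterable of (subject, predicate, object) tuples
    key:     "types"      -> signature is the exact set of rdf:type values
             "attributes" -> signature is the exact set of predicates used
    """
    if key not in ("types", "attributes"):
        raise ValueError(f"unknown key: {key}")
    signature = defaultdict(set)
    for s, p, o in triples:
        sig = signature[s]  # every subject gets a (possibly empty) signature
        if key == "types":
            if p == RDF_TYPE:
                sig.add(o)
        else:
            sig.add(p)
    blocks = defaultdict(set)
    for s, sig in signature.items():
        blocks[frozenset(sig)].add(s)
    return dict(blocks)
```

Each block then becomes one node of the summary graph; an edge between two blocks would be added when any of their members are connected in the original graph.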

14 Next steps: context-aware summarization

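15 Context-aware summarization. Existing work: TRank; Leveraging usage... Observations: only the ranking of types is considered; only the entities in the context are used; neighboring entities. Techniques: graph algorithms; natural language processing.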

16 Thanks

