Faceted Search Linlin Jia 2008.7.5
Outline What is faceted search? Why use faceted search? How to create facet? Faceted Search in Dataspace
What is faceted search? What is a facet? One cut face of a ruby? A single eye in an insect's compound eye? In information retrieval, a facet is one of several attributes that items have and that can be used for navigation (Wikipedia). Faceted search: a search that enables users to navigate a multi-dimensional information space by combining text search with a progressive narrowing of choices in each dimension. The facets are usually orthogonal. In practice, faceted search is a mainstream interaction mechanism on e-commerce sites, and it is being extended to semi-structured data (XML, web).
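The definition above can be illustrated with a minimal sketch. It assumes a small in-memory catalogue; the records, the field names `year` and `venue`, and the helper names `search` and `facet_counts` are all invented for illustration and do not come from any particular system.

```python
from collections import Counter

# Hypothetical toy catalogue; titles, fields, and values are invented for illustration.
ITEMS = [
    {"title": "Faceted search for digital libraries", "year": 2007, "venue": "JCDL"},
    {"title": "Keyword search over relational databases", "year": 2006, "venue": "SIGIR"},
    {"title": "Faceted browsing of XML corpora", "year": 2007, "venue": "CIKM"},
    {"title": "Faceted metadata for image search", "year": 2003, "venue": "SIGIR"},
]

def search(items, text="", selected=None):
    """Return items whose title contains `text` and that match every selected facet value."""
    selected = selected or {}
    return [
        item for item in items
        if text.lower() in item["title"].lower()
        and all(item[facet] == value for facet, value in selected.items())
    ]

def facet_counts(items, facets=("year", "venue")):
    """Count how many of the given items carry each value of each facet."""
    return {facet: Counter(item[facet] for item in items) for facet in facets}

# Step 1: free-text search.
hits = search(ITEMS, text="faceted")
# Step 2: show the user the remaining choices in each dimension.
counts = facet_counts(hits)
# Step 3: the user picks one value, which narrows the result set.
narrowed = search(ITEMS, text="faceted", selected={"year": 2007})
```

Each facet is filtered independently of the others, which is one way to read the "orthogonal" remark on the slide.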
An example: FacetedDBLP
Differences: keyword search, advanced search, navigational search
Keyword Search vs. Faceted Search Keyword search is simple: users do not need to learn a complicated query language. Keyword search over relational databases, for example, allows users to find pieces of information without having to write complicated SQL queries. Faceted search is a successful complement to keyword search: it supports browsing the very large result sets of a keyword search and displays the relationships among the results. Beyond retrieval, it shows the characteristics of the result set and so gives the user more information.
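Advanced Search vs. Faceted Search Advanced search demands more of the user: they must know exactly what they are looking for, including its value on each attribute, and they must also master the advanced query interface. A user who specifies the query in too much detail, filling in values for many attributes, may find that the result set is empty.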
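Navigational Search vs. Faceted Search Diagram labels: Tango, Carlos Gardel, Itzhak Perlman vs. Violin, Argentina; target: Por Una Cabeza. Similarity: both use a hierarchical structure. Difference: navigational search narrows the user's result set step by step in a predefined order within a single hierarchy (a taxonomy). Faceted search accommodates different users' needs and ways of thinking by allowing the result set to be narrowed in different orders; that is, the classification hierarchy contains different paths to the same result. For example, when looking for a song, the traditional approach might classify songs first by style, then by year, country, singer, format, and so on. In faceted search, a user may instead decide to start with format, then singer, and so on. Users' information needs and ways of thinking differ, and navigational search can hardly accommodate that.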
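The point that different paths lead to the same result can be sketched in a few lines. The song records below are assumptions: the facet values attached to "Por Una Cabeza" follow the slide's labels, the other two songs are placeholders, and `narrow` is a hypothetical helper.

```python
# Hypothetical song records; "Song B" and "Song C" are placeholders, not real data.
SONGS = [
    {"title": "Por Una Cabeza", "style": "Tango", "instrument": "Violin", "country": "Argentina"},
    {"title": "Song B", "style": "Tango", "instrument": "Piano", "country": "Argentina"},
    {"title": "Song C", "style": "Waltz", "instrument": "Violin", "country": "Austria"},
]

def narrow(items, facet, value):
    """Keep only the items whose `facet` equals `value`."""
    return [item for item in items if item[facet] == value]

# One user starts from the style, another from the instrument.
path_a = narrow(narrow(SONGS, "style", "Tango"), "instrument", "Violin")
path_b = narrow(narrow(SONGS, "instrument", "Violin"), "style", "Tango")
```

Because each step is a plain filter, the order of the steps does not change the final result set; a fixed navigational hierarchy would expose only one of these two orders.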
Why use faceted search? Users need to filter content using multiple taxonomy terms at the same time. Users want to combine text searches, taxonomy term filtering, and other search criteria. Users don't know what to search for, or what they can find on your site. You want to hint users at related content they might not have thought of looking for, but that could be of interest to them. You want to show users clearly which subject areas are the most comprehensive on your site. You want users to discover relationships or trends between contents. There is too much content for it to be displayed through fixed navigational structures. A single taxonomic order or a single folksonomy is not suitable or sufficient for your content (faceted classification). Users often get empty result sets when searching your site. "Advanced" search forms are not fun to use.
The Faceted Search module provides a search API and a search interface that let users browse content in such a way that they can rapidly get acquainted with the scope and nature of the content, and never feel lost in the data. More than a search interface, it is an information navigation and discovery tool. The interface exposes metadata so that users can build their queries as they go, refining or expanding the current query, with results automatically reflecting the current query. The interface also combines free-text search, fully leveraging Drupal's search engine. It avoids complex search forms, and never offers facets that would lead to empty result sets. The most obvious metadata for faceted searches is provided by Drupal's taxonomy module. However, Faceted Search's API allows developers to expose other metadata, thereby providing more facets to users for browsing content.
Information navigation and knowledge discovery. Overall, faceted search is a way of obtaining information that sits between keyword search and advanced search, and between browsing and retrieval.
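The claim that a faceted interface never offers choices leading to empty result sets can be sketched as follows. This is not the Drupal module's API; the content nodes, the fields `topic` and `type`, and the helpers `matches` and `refinements` are all hypothetical.

```python
from collections import Counter

# Hypothetical content nodes; the fields "topic" and "type" are invented for illustration.
NODES = [
    {"title": "Intro to taxonomies", "topic": "taxonomy", "type": "article"},
    {"title": "Tagging at scale", "topic": "folksonomy", "type": "article"},
    {"title": "Facet design notes", "topic": "taxonomy", "type": "page"},
]

def matches(node, selected):
    """True if the node carries every selected facet value."""
    return all(node[facet] == value for facet, value in selected.items())

def refinements(nodes, selected, facets=("topic", "type")):
    """Offer only facet values that occur in the current result set, with their counts."""
    current = [node for node in nodes if matches(node, selected)]
    return {
        facet: dict(Counter(node[facet] for node in current))
        for facet in facets
        if facet not in selected
    }

# An advanced-search form accepts any combination, including one that matches nothing.
form_query = {"topic": "folksonomy", "type": "page"}
form_hits = [node for node in NODES if matches(node, form_query)]

# The faceted interface derives its choices from the current results instead.
offered = refinements(NODES, {"topic": "folksonomy"})
```

Since the offered values are counted from the current results, a value with zero matching items cannot appear among the choices, whereas the form-style query above returns nothing.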
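How to create facets? FacetedDBLP uses the keywords provided in the metadata annotations of digital object collections to automatically create lightweight topic categorization systems. An open question is how the facets themselves can be organized and presented, especially if a facet is highly dynamic and too large to be presented at once. The main research question on DBLP is the automatic generation, organization, and maintenance of topic facets, and how to show different organizations to different users. A single facet may have many values and needs automatic methods to organize them, for example splitting the year facet into ranges. Document topics are extracted from text at high cost, and they may differ greatly between users or communities, and may even change over time.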
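Splitting the year facet into ranges can be sketched as below. The fixed bucket width and the function name `bucket_years` are assumptions made for illustration; the slides do not say how FacetedDBLP chooses its ranges.

```python
from collections import Counter

def bucket_years(years, width=5):
    """Group years into fixed-width ranges and count the items in each range."""
    # Map each year to the first year of its bucket, e.g. 2003 -> 2000 for width 5.
    counts = Counter((year // width) * width for year in years)
    return {
        f"{start}-{start + width - 1}": count
        for start, count in sorted(counts.items())
    }

# Hypothetical publication years.
buckets = bucket_years([1998, 2001, 2003, 2004, 2007, 2008])
```

A fixed width is the simplest choice; ranges that adapt to how many items fall in each period would be an alternative when the data is unevenly distributed.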
Relations: (DBLP_id, tag_id), (DBLP_id, year). Two lists, ranked according to TF-IDF and PageRank.
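One possible reading of the TF-IDF ranking is sketched below; the slide does not state how the score is computed, so the formula is an assumption. Here a tag's term frequency is its count within the current result set and its document frequency is its count over the whole (DBLP_id, tag_id) relation. The pairs and the function `rank_tags` are hypothetical, and the PageRank list is not covered.

```python
import math
from collections import Counter

# Hypothetical (DBLP_id, tag_id) pairs.
PAIRS = [
    (1, "search"), (1, "facet"),
    (2, "search"),
    (3, "search"), (3, "xml"),
    (4, "facet"),
]

def rank_tags(pairs, result_ids):
    """Rank the tags of the result set by tf * log(N / df); assumed scoring, not confirmed by the source."""
    unique_pairs = set(pairs)
    total = len({pub for pub, _ in unique_pairs})        # N: all publications
    df = Counter(tag for _, tag in unique_pairs)         # publications per tag
    tf = Counter(tag for pub, tag in unique_pairs if pub in result_ids)
    scores = {tag: tf[tag] * math.log(total / df[tag]) for tag in tf}
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

ranked = rank_tags(PAIRS, result_ids={1, 4})
```

With this scoring, a tag that is frequent in the result set but rare in the whole collection rises to the top of the facet list.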
Research on faceted search Venues: SIGIR / CIKM / JCDL… Topics: facet creation from documents; facet creation from XML corpora and database records; facet creation by social tagging and folksonomies; faceted search in the mobile and pervasive domain; faceted search and data mining; indexing; …
Faceted Search in Dataspace Search in a personal information space. Facets: time, type, task, size.
References http://www.l3s.de/growbag/index.php http://www.searchtools.com/info/faceted-metadata.html http://en.wikipedia.org/wiki/Faceted_classification