SView0.2-排序 (续) Rank aggregation Two rank aggregation method

Slides:



Advertisements
Similar presentations
期末考试作文讲解 % 的同学赞成住校 30% 的学生反对住校 1. 有利于培养我们良好的学 习和生活习惯; 1. 学生住校不利于了解外 界信息; 2 可与老师及同学充分交流有 利于共同进步。 2. 和家人交流少。 在寄宿制高中,大部分学生住校,但仍有一部分学生选 择走读。你校就就此开展了一次问卷调查,主题为.
Advertisements

2012年 IEET工程及科技教育認證說明會 落實成果導向認證機制 與國際接軌.
Chapter 3 The Fundamentals : Algorithms, the Integers, and Matrices
-Artificial Neural Network- Hopfield Neural Network(HNN) 朝陽科技大學 資訊管理系 李麗華 教授.
Mode Selection and Resource Allocation for Deviceto- Device Communications in 5G Cellular Networks 林柏毅 羅傑文.
XI. Hilbert Huang Transform (HHT)
Minimum Spanning Trees
3-3 Modeling with Systems of DEs
-Artificial Neural Network- Adaline & Madaline
Introduction To Mean Shift
IV. Implementation IV-A Method 1: Direct Implementation 以 STFT 為例
THE BI-CRITERIA DOUBLY WEIGHTED CENTER-MEDIAN PATH PROBLEM ON A TREE
Population proportion and sample proportion
模式识别 Pattern Recognition
Differential Equations (DE)
3.2.5 Surjective functions from N to X, up to a permutation of N
Chapter 4 歸納(Induction)與遞迴(Recursion)
SAT and max-sat Qi-Zhi Cai.
Chapter 6 Graph Chang Chi-Chung
MICROECONOMICS Chapter16 Price Control 價格管制.
The Greedy Method.
演算法方式總覽 The Divide-and-Conquer Strategy (個各擊破)(binary Searching、Quick Sort…. ) The Greedy Method(貪婪演算法) (Prim MST、Kruskal MST、Djikstra's algorithm) Dynamic.
圖論 (Graph Theory) B 電機四 大鳥 B 電機四 酋長 B 電機四 炫大
Sampling Theory and Some Important Sampling Distributions
第二單元 面積與黎曼和.
Course 9 NP Theory序論 An Introduction to the Theory of NP
创建型设计模式.
第五組 : 廖震昌 / 謝坤吉 / 黃麗珍 陳曉伶 / 陳思因 / 林慧佳
圖表製作 集中指標 0628 統計學.
主讲人: 吕敏 } Spring 2012 ,USTC 算法基础 主讲人: 吕敏 } Spring 2012 ,USTC.
Lexicographical order VS canonical order
Inventory System Changes and Limitations
Interval Estimation區間估計
子博弈完美Nash均衡 我们知道,一个博弈可以有多于一个的Nash均衡。在某些情况下,我们可以按照“子博弈完美”的要求,把不符合这个要求的均衡去掉。 扩展型博弈G的一部分g叫做一个子博弈,如果g包含某个节点和它所有的后继点,并且一个G的信息集或者和g不相交,或者整个含于g。 一个Nash均衡称为子博弈完美的,如果它在每.
消費者偏好與效用概念.
PubMed整合显示图书馆电子资源 医科院图书馆电子资源培训讲座.
Chapter12 Graphs Definitions Applications Properties
Yuzhong Qu Nanjing University
Ch07 圖形與網路 淡江大學 周清江
句子成分的省略(1).
Bayesian Method 陈子豪 ACM Honored Class July 17th,2014.
Version Control System Based DSNs
高性能计算与天文技术联合实验室 智能与计算学部 天津大学
Mechanics Exercise Class Ⅰ
每周三交作业,作业成绩占总成绩的15%; 平时不定期的进行小测验,占总成绩的 15%;
虚 拟 仪 器 virtual instrument
计算机问题求解 – 论题 算法方法 2016年11月28日.
Tournament (graph theory)
Course 10 削減與搜尋 Prune and Search
貪婪演算法 /5/6 演算法 _ 第三章.
计算机问题求解 – 论题1-5 - 数据与数据结构 2018年10月16日.
鄧姚文 資料結構 第八章:圖(Graph) 鄧姚文
Efficient Query Relaxation for Complex Relationship Search on Graph Data 李舒馨
最短通路问题.
Create and Use the Authorization Objects in ABAP
赵才荣 同济大学,电子与信息工程学院,智信馆410室
计算机问题求解 – 论题 图的连通度 2018年11月13日.
动词不定式(6).
5. Combinational Logic Analysis
2012 程式設計比賽 Openfind 天使帝國 v2.0 (蓋亞的紋章).
Konig 定理及其证明 杨欣然
複習 2013/12/24 Jehn-Ruey Jiang 江振瑞.
Maximum Flow.
簡單迴歸分析與相關分析 莊文忠 副教授 世新大學行政管理學系 計量分析一(莊文忠副教授) 2019/8/3.
Graph 1 Michael Tsai 2012/4/24 連載: 學生上課睡覺姿勢大全
Principle and application of optical information technology
Gaussian Process Ruohua Shi Meeting
计算机问题求解---论题3-12 图中的匹配与因子分解
Presentation transcript:

SView0.2-排序 (续) Rank aggregation Two rank aggregation method Kemeny method Spearman's Footrule method Improve Folksorting How to do experiments and evalution

1. Ranking aggregation This is a classical problem from social choice and voting theory, in which each voter gives a preference on a set of alternatives, and the system outputs a single preference order on the set of alternatives based on the voters' preferences

Election Winner Two such schemes. One was invented by the mathematician Charles Dodgson (better known as Lewis Carroll), and the other was suggested by J. Kemeny. We show that under a either a Dodgson election or a Kemeny election, it is NP-hard (that is, at least as hard as an NP-complete problem) to determine whether any particular candidate has won!

a. Dodgson Election Dodgson score is NP-complete. Now we contrive an election for which determining the winner entails solving Exact Cover by 3-Sets (X3C), which is known to be NP-complete.

b. Kemeny Election Kemeny score is NP-complete, and Kemeny ranking and Kemeny winner are NP-hard. Proof. Kemeny score is in NP since the score of any candidate can be computed in polynomial time. Kemeny ranking and Kemeny winner is finding a minimum feedback arc set

Feedback Arc Set 给定一个有向图G=( V, A) , V={v1 , v2 , ⋯, vn }, A={a1 , a2 , ⋯, am}, 其中V 为顶点集合, A 为弧集合, n 为顶点个数,m为弧个数, 要求找到一个最小的子集A*A, 使得G*=( V, A-A*) 无环。

2. Rank aggregation method Kemeny method Spearman's Footrule method

Definition U: a finite set of elements. {x1,x2,…,xn}. By assignig a unique identifier to each element in U, assume U={1,2,…,|U|} Τ: ordered list of U. Τ(i): denote position of i in T.( i U) πk: the kth user preference orders on U. σ: is closest to a given set of input lists {πk }, called consensus order.

Kemeny method 距离度量:2个序列π₁, π₂的Kendall-tau距离为: K(π₁, π₂) = |{(i, j)|i<j , π₁(i) < π₁( j) ,but π₂(i) > π₂(j)}| counts the number of pairwise disagreements between two lists Kemeny 排序方法对由k 个排序π₁, π₂,…, πk计算最优的合成排序σ ,使得Kendall-tau距离总和 K(σ; π₁, π₂,…, πk) = 最小。 The Kendall tau distance counts the number of pairwise disagreements between two lists; 成对分歧 that is, the distance between two full lists σ and τ is K(σ; τ) =|{(i; j)| i < j; σ(i) <σ(j); but τ(i) >τ(j)}|. Dividing this number by the maximum possible value 从|S|选2 ; we obtain a normalized version of the Kendall distance. The Kendall distance for full lists is the `bubble sort' distance, i.e., the number of pairwise adjacent transpositions needed to transform from one list to the other. The Kendall distance between two lists of length n can be computed in nlogn time using simple data structures. K(σ, πl)

A weighted complete digraph G=(V ,E, W) Kemeny optimal aggregation is equivalent to find a minimum feedback arc sets(FAS) in digraph. A weighted complete digraph G=(V ,E, W) V: the set of elements. Here, V=U. E: the set of arcs. <xi, xj>: prefer xi to xj W: the total K-distance (from the πi‘s) of a ranking that among elements. weight W(xi, xj)= |{<xi, xj>|xi< xj , π (xi) < π (xj)| x1 x2 x3 x4

This problem can be cast as a special case of weighted minimum feedback arc sets in tournaments. x1 x2 x4 x3 The minimum feedback arc set problem can be approximated to within O(logn loglog n) in general graphs and has (at least) the same approximation hardness as the vertex cover problem, which is 1.36. Even 等将带权有向图FAS 问题规约为有向循环网络( circular networks) 中最小容量多路分割问题( minimum capacity multicut) , 提出了一个近似度为O( log2|X| ) 的算法, 其中X 是给定的顶点集合, 从图中删除FAS 中边将使得图中没有圈与X 中点相交

computational complexity Total = Distance computed+ kemeny optimal aggregation Distance computed: Two lists :K(π₁ , π₂ ) is O(nlogn) W(xi, xj): O(knlogn); W: O((n*(n-1)/2) *knlogn)) kemeny optimal aggregation: minimum feedback arc sets

Spearman's Footrule method 距离度量:2个序列π₁, π₂ 的Spearman's Footrule distance F(π₁ , π₂ ) = the sum of the absolute difference between the rank of i according to the two lists. Spearman排序方法对由k 个排序π₁, π₂,…, πk计算最优的合成排序σ,使得Spearman's Footrule距离总和 最小。 | π₁(i)-π₂(i) | The Spearman footrule distance is the sum, over all elements i ∈S, of the absolute difference between the rank of i according to the two lists. 元素i所在不同序列中的位置绝对差 Formally, given two full lists σ and τ , the distance is given by F(σ; τ) After dividing this number by the maximum value |S|2/2, one can obtain a normalized value of the footrule distance,which is always between 0 and 1. The footrule distance between two lists can be computed in linear time. F(σ, πl)

footrule optimal aggregation is equivalent to find a minimum cost perfect matching in a bipartite graph. A weighted complete bipartite graph G=(C ,P, W) C: the set of elements. Here, C=U. P: the set of available positions. PN+ W: the total F-distance (from the πi's) of a ranking that places element c at position p. W(c,p)= |πi(c)-p | … 1 2 n P: C: x1 x2 xn W:

computational complexity  

summary Above two method according to full list Two method for partial lists is more problematic. footrule optimal aggregation to be equivalent to the NP-hard problem of computing the minimum number of edges to delete to convert a directed graph into a directed cyclic graph(DAG). Distance measure: scaled footrule distance; induced footrule distance.

3. Improve Folksorting Definition U: a finite set of elements. {x1,x2,…,xn}. By assigning a unique identifier to each element in U, assume U={1,2,…,|U|} Τ: ordered list of U. Τ(i): denote position of i in T.( i U) Τ∣S : supposed SU, a partial list of T that contains only elements from S. Wi: Given i=1,2,…,k users, Wi U, represent the scope of the ith user’s focus on U, all of the elements involved in the user’s movement

πi: the ith user rearrangement the list Τ∣Wi, finally obtain a preference order. πi denote a permutation of Τ∣Wi σ: called consensus permutation, is closest to a given set U of input permutations{πi}, finally obtain a consensus order of U. d(π₁ , π₂ )= D(σ; π₁, π₂,…, πk)= d(σ∣πi , πi)= costF(σ∣πi(x) , πl (x)) = | π₁(i)π₂(i) | d(σ∣πi , πi) costF(σ∣πi(x) , πi (x))

① Use for reference from Kemeny optimal aggregation equivalent to find a maximum weight directed cycle in a weighted digraph. A weighted directed graph G=(V ,E, W) V: the set of elements. Here, V= Wi. E: the set of edges, with respect to edges of permutation graph of {πi}, and delete duplicate edge. W(xi, xj): the number of duplicate edge in edges of permutation graph of {πi}

permutation graph Majority graph x1 x2 x5 x3 x4 2 x1 x2 Target: 原始序列 (x1 x2 x3 x4) ;用户移动后序列(x4 x1 x2 x3) ;置换为πi =(x1 x4 x3 x2) x1 x2 x5 x3 x4 2 Majority graph 原始序列 (x1 x2 x3 x4 x5) ; 3个用户移动后序列分别(x4 x1 x2 x3 x5) , (x1 x3 x4 x5x2), (x1 x2 x4 x5 x3 ), 对应π1=(x1 x4 x3 x2) π2=(x2 x3 x4 x5) π3=(x3 x4 x5) x1 x2 Target: maximum weight directed cycle 2 x3 x5 2 2 x4

② Use for reference from Footrule optimal aggregation equivalent to find a maximum weight perfect matching in a bipartite graph. x1 x2 x5 x3 x4 2 最终得到:每个元素对应全序列中的位置; 可能存在不同的结果集 T(x1) T(x2) T(x3) T(x4) T(x5) x1 x2 x5 x3 x4 2 2 x1 x2 x3 x4 x5

summary 借鉴前面的两种方法,利用图论中问题来解决。 不同点 输入:kemeny和footrule是lists,folksorting是置换; kemeny和footrule的图中边的权重是距离,folksorting是边出现重复次数 计算复杂度,folksorting边权重计算:W(xi, xj): is O(k),W: O(nk); kemeny和footrule是O((n*(n-1)/2) *knlogn)) 和O(n2k)。 优化算法的复杂度一样O(n2.5). [但nfolksorting  nfootrule ] 现提出改进方法,边的权重计算没有完全体现出hamming距离,总目标可以体现出 现没有对 涉及用户的非关注点的元素的边权重进行类似induced,scaled的计算 现选择边时,除权重外,没加入启发式规则

谢谢 请提出宝贵意见