1
DeepPath 周天烁
2
Outline
Introduction
Reinforcement Learning recap
Methodology
  Modeling
  Training
Experiment
Conclusion
3
Introduction (h,?,t) (h,r,?)
4
Reinforcement Learning recap
Markov Decision Processes (MDP)
M = <S, A, T, R>
S : state space
A : action space
T : transition
R : reward
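The four-tuple can be written down directly as a small data structure. The sketch below is only an illustration of the definition: the `MDP` class, the `rollout` helper, and the toy chain environment are hypothetical names invented here, and the transition is assumed to be deterministic for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


@dataclass(frozen=True)
class MDP:
    """An MDP as the tuple <S, A, T, R> (deterministic T assumed here)."""
    states: Sequence[State]                            # S: state space
    actions: Sequence[Action]                          # A: action space
    transition: Callable[[State, Action], State]       # T: (s, a) -> s'
    reward: Callable[[State, Action, State], float]    # R: (s, a, s') -> reward


def rollout(mdp: MDP, start: State, policy: Callable[[State], Action],
            max_steps: int) -> Tuple[State, float]:
    """Follow a policy for max_steps steps and return (final state, total reward)."""
    state, total = start, 0.0
    for _ in range(max_steps):
        action = policy(state)
        next_state = mdp.transition(state, action)
        total += mdp.reward(state, action, next_state)
        state = next_state
    return state, total


# Toy example (hypothetical): a 4-state chain; reward 1 on first arriving at state 3.
chain = MDP(
    states=[0, 1, 2, 3],
    actions=["left", "right"],
    transition=lambda s, a: min(s + 1, 3) if a == "right" else max(s - 1, 0),
    reward=lambda s, a, n: 1.0 if (n == 3 and s != 3) else 0.0,
)

final_state, total_reward = rollout(chain, start=0, policy=lambda s: "right", max_steps=5)
```

Keeping T and R as plain callables makes it easy to swap in a different environment without touching the rollout loop.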
5
Methodology —— Modeling
M = < ?, ?, ?, ? >
6
Methodology —— Modeling
M = <S, A, T, R>
States
Actions
Transition
Reward
  Global accuracy
  Path efficiency
  Path diversity
?
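The three reward terms named on this slide can be sketched as small functions. The slide gives no formulas, so the definitions below are assumptions: +1/-1 for reaching the target, the inverse of the path length for efficiency, and the negative mean cosine similarity to already-found paths for diversity. All function names and the vector representation of a path are hypothetical.

```python
import math
from typing import List, Sequence


def global_accuracy_reward(reached_target: bool) -> float:
    """Assumed form: +1 if the path reaches the target entity, -1 otherwise."""
    return 1.0 if reached_target else -1.0


def path_efficiency_reward(path: Sequence[str]) -> float:
    """Assumed form: shorter relation paths score higher (1 / length)."""
    return 1.0 / len(path)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def path_diversity_reward(path_vec: Sequence[float],
                          found_vecs: List[Sequence[float]]) -> float:
    """Assumed form: negative mean cosine similarity to previously found paths."""
    if not found_vecs:
        return 0.0  # nothing to compare against yet
    return -sum(cosine(path_vec, f) for f in found_vecs) / len(found_vecs)


# Hypothetical usage: a 2-hop path whose embedding matches one found earlier.
r_global = global_accuracy_reward(True)
r_efficiency = path_efficiency_reward(["bornIn", "locatedIn"])
r_diversity = path_diversity_reward([1.0, 0.0], [[1.0, 0.0]])
```

Under these assumptions a new path identical to an earlier one receives the lowest diversity reward, which pushes the agent toward different paths.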
7
Methodology —— Training
Target Function: expected total rewards
Supervised Policy Learning
Retraining with Rewards
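A minimal sketch of how the two training stages could fit together, assuming a softmax policy over a fixed action set and a REINFORCE-style update (the slide does not specify the update rule). In the supervised stage, each demonstrated action is treated as if it earned reward +1; in the retraining stage, the same update is applied with the reward actually observed. The function names and the three-action setup are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def policy_gradient_step(logits: List[float], action: int,
                         reward: float, lr: float = 0.5) -> List[float]:
    """One update: logits += lr * reward * grad(log pi(action)).

    For a softmax policy, the gradient of log pi(action) with respect to
    the logits is one_hot(action) - probs.
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


logits = [0.0, 0.0, 0.0]  # uniform policy over three hypothetical actions

# Stage 1, supervised policy learning: imitate a demonstrated action (reward +1).
for _ in range(10):
    logits = policy_gradient_step(logits, action=2, reward=1.0)
p_demo_after_supervised = softmax(logits)[2]

# Stage 2, retraining with rewards: an explored action that failed gets reward -1.
p_bad_before = softmax(logits)[0]
logits = policy_gradient_step(logits, action=0, reward=-1.0)
p_bad_after = softmax(logits)[0]
```

Both stages share one update function; only the source of the action and the reward differ, which is why supervised pre-training can serve as a warm start for reward-based retraining.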
8
Experiment —— setup Dataset Tasks Metric FB15K-237 NELL-995
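Tasks: link prediction (h, r, ?), fact prediction (h, ?, t)
Metric: MAP (Mean Average Precision)
Example: suppose there are two topics; topic 1 has 4 relevant pages and topic 2 has 5. For topic 1 the system retrieves all 4 relevant pages, at ranks 1, 2, 4, and 7; for topic 2 it retrieves 3 relevant pages, at ranks 1, 3, and 5. The average precision for topic 1 is (1/1 + 2/2 + 3/4 + 4/7)/4 = 0.83. The average precision for topic 2 is (1/1 + 2/3 + 3/5 + 0 + 0)/5 = 0.45. So MAP = (0.83 + 0.45)/2 = 0.64.
Range: [0, 1]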
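The MAP example above can be reproduced in a few lines. This is a small sketch, not an evaluation script from the paper; the function names are illustrative, and relevant items that were never retrieved are assumed to contribute 0 to the sum, as in the worked example.

```python
from typing import List, Sequence, Tuple


def average_precision(relevant_ranks: Sequence[int], num_relevant: int) -> float:
    """Average precision for one query.

    relevant_ranks: 1-based ranks at which relevant items were retrieved.
    num_relevant:   total number of relevant items for the query; relevant
                    items that were not retrieved contribute 0.
    """
    ranks = sorted(relevant_ranks)
    # Precision at the i-th relevant hit is (i hits so far) / (rank of that hit).
    return sum((i + 1) / rank for i, rank in enumerate(ranks)) / num_relevant


def mean_average_precision(queries: List[Tuple[Sequence[int], int]]) -> float:
    """Mean of the per-query average precisions."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)


ap_topic1 = average_precision([1, 2, 4, 7], num_relevant=4)
ap_topic2 = average_precision([1, 3, 5], num_relevant=5)
map_score = mean_average_precision([([1, 2, 4, 7], 4), ([1, 3, 5], 5)])

print(round(ap_topic1, 2), round(ap_topic2, 2), round(map_score, 2))
# prints: 0.83 0.45 0.64
```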
9
Experiment —— result
10
Experiment —— example reasoning paths
11
Conclusion
Pros: Novel; Code public
Cons: Selective experiment (didn’t cover the dataset); Baseline too old; Time consuming
12
Reference
Wenhan Xiong, Thien Hoang, and William Yang Wang. DeepPath: A reinforcement learning method for knowledge graph reasoning.
Yang, Fan, Zhilin Yang, and William W. Cohen.
13
Q & A