Paper Reading 2017/04/18 Yuan Xin.


Papers:
- Unitary Evolution Recurrent Neural Networks (ICML 2016, Bengio et al.)
- Learning Convolutional Neural Networks for Graphs (ICML 2016, NEC Labs)

Unitary Evolution Recurrent Neural Networks
Recurrent neural networks (RNNs) are notoriously difficult to train: gradients vanish or explode, especially across long-term dependencies, mainly because repeated multiplication by a recurrent matrix whose eigenvalues deviate from 1 causes exponential growth or decay. The paper therefore learns a unitary recurrent weight matrix. Since directly parametrizing unitary matrices requires expensive computations, the authors construct an expressive unitary weight matrix by composing several structured matrices.

Orthogonal Weights and Bounding the Long-Term Gradient
For an RNN $h_k = \sigma(W h_{k-1} + V x_k)$ with cost $C$, the chain rule gives

$$\frac{\partial C}{\partial h_t} = \frac{\partial C}{\partial h_T} \prod_{k=t}^{T-1} \frac{\partial h_{k+1}}{\partial h_k} = \frac{\partial C}{\partial h_T} \prod_{k=t}^{T-1} D_{k+1} W,$$

and, since the norm of a matrix product is bounded by the product of the norms,

$$\left\| \frac{\partial C}{\partial h_t} \right\| \le \left\| \frac{\partial C}{\partial h_T} \right\| \prod_{k=t}^{T-1} \| D_{k+1} \| \, \| W \|,$$

where $D_{k+1} = \mathrm{diag}\big(\sigma'(W h_k + V x_{k+1})\big)$ is the Jacobian of the pointwise nonlinearity.

Orthogonal Weights and Bounding the Long-Term Gradient
If W is orthogonal (or unitary), ||W|| = 1, so the bound above is governed entirely by the ||D_k|| terms, where ||D_k|| is the largest derivative of the activation at step k. Whenever ||D_k|| differs from 1, gradients still vanish or explode. This makes the rectified linear unit (ReLU) an attractive choice of nonlinearity: its derivative is either 0 or 1, so as long as not all units are clamped to zero, the largest derivative is exactly 1.
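As a minimal numerical sketch of this argument (my own, not from the paper), the snippet below compares the norm of the accumulated Jacobian product for a random versus an orthogonal recurrent matrix, assuming all ReLU units stay active so that every D_k = I:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 64, 100

# Random recurrent matrix: eigenvalues stray from modulus 1, so the
# Jacobian product grows (scale > 1) or decays (scale < 1) exponentially.
W_rand = rng.normal(0.0, 1.5 / np.sqrt(n), (n, n))
# Orthogonal recurrent matrix: every singular value is exactly 1.
W_orth, _ = np.linalg.qr(rng.normal(size=(n, n)))

for name, W in [("random", W_rand), ("orthogonal", W_orth)]:
    J = np.eye(n)
    for _ in range(T):
        J = W @ J  # D_k = I when all ReLU units are active
    print(f"{name:>10}: ||prod dh/dh||_2 = {np.linalg.norm(J, 2):.3e}")
```

The random matrix's product norm blows up by many orders of magnitude over 100 steps, while the orthogonal one stays at exactly 1.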

Unitary Evolution RNNs
In a gradient-based method like this, directly parametrizing a unitary matrix is impractical: a gradient step will typically yield a matrix that is no longer unitary, and projecting back costs O(n^3) computation. What the paper exploits instead is the property that the eigenvalues of orthogonal and unitary matrices have absolute value 1.
Lemma: a complex square matrix W is unitary if and only if it has an eigendecomposition of the form W = V D V*, with V unitary and D diagonal with |D(j,j)| = 1; W is a real orthogonal matrix if and only if for every eigenvalue D(j,j) = λ_j with eigenvector v_j there is also the complex conjugate eigenvalue with the conjugate eigenvector.
Parametrizing the full eigendecomposition requires O(n^2) memory; setting V to the identity reduces memory, but then W must remain diagonal.
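A quick check of the lemma's forward direction (my own sketch, not from the paper): any matrix of the form V D V* with V unitary and D diagonal with unit-modulus entries is itself unitary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Random unitary V via QR of a complex Gaussian matrix.
V, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
# Diagonal D with eigenvalues e^{i*theta} on the unit circle.
D = np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n)))

W = V @ D @ V.conj().T
print(np.allclose(W.conj().T @ W, np.eye(n)))  # True: W is unitary
```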

Unitary Evolution RNNs
An alternative strategy for parametrizing unitary matrices: compose several simple, parametric, unitary matrices to construct a single, expressive unitary matrix. The uRNN uses

$$W = D_3 R_2 \mathcal{F}^{-1} D_2 \Pi R_1 \mathcal{F} D_1,$$

where the $D_j$ are diagonal matrices with unit-modulus entries $e^{i\omega_k}$, the $R_j$ are complex reflection matrices, $\Pi$ is a fixed permutation, and $\mathcal{F}$, $\mathcal{F}^{-1}$ are the Fourier transform and its inverse. D, R, and Π each take O(n) time and space; F and F^{-1} take O(n log n) time and need no parameters. The intermediate quantities are complex, but they are represented with real numbers; as long as the top layer is real-valued and differentiable, the whole composition can be learned by gradient descent.
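A minimal NumPy sketch (my own, following the paper's factorization) that composes these structured factors and checks that the result is unitary; `reflection` builds a complex Householder matrix I - 2vv*/||v||^2:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 16

def diag_unitary(theta):
    """D: diagonal matrix with unit-modulus entries e^{i*theta_j}."""
    return np.diag(np.exp(1j * theta))

def reflection(v):
    """R: complex Householder reflection I - 2 v v* / ||v||^2 (unitary)."""
    v = v.reshape(-1, 1)
    return np.eye(len(v)) - 2.0 * (v @ v.conj().T) / (v.conj().T @ v).real

# Unitary DFT matrix F and its inverse F^{-1} = F*.
F = np.fft.fft(np.eye(n), norm="ortho")
Finv = F.conj().T
Pi = np.eye(n)[rng.permutation(n)]  # fixed permutation matrix
D1, D2, D3 = (diag_unitary(rng.uniform(0, 2 * np.pi, n)) for _ in range(3))
R1, R2 = (reflection(rng.normal(size=n) + 1j * rng.normal(size=n))
          for _ in range(2))

# uRNN recurrence matrix: W = D3 R2 F^{-1} D2 Pi R1 F D1
W = D3 @ R2 @ Finv @ D2 @ Pi @ R1 @ F @ D1
print(np.allclose(W.conj().T @ W, np.eye(n)))  # True: W is unitary
```

Each factor is unitary, so the product is unitary by construction, and only the O(n) parameters of D and R need to be learned.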

Complex hidden units
In the implementation, complex numbers are represented with real values, written as separate real and imaginary parts. This allows everything to be implemented using real-valued operations, compatible with any deep learning framework with automatic differentiation, such as Theano.
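A small sketch (mine, not from the paper) of this real/imaginary-part representation: a complex matrix-vector product written with only real-valued operations, checked against native complex arithmetic.

```python
import numpy as np

def complex_matvec(W_re, W_im, h_re, h_im):
    """(W_re + i W_im)(h_re + i h_im), using only real arithmetic."""
    out_re = W_re @ h_re - W_im @ h_im
    out_im = W_re @ h_im + W_im @ h_re
    return out_re, out_im

# Sanity check against NumPy's native complex arithmetic.
rng = np.random.default_rng(3)
W = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
h = rng.normal(size=4) + 1j * rng.normal(size=4)
re, im = complex_matvec(W.real, W.imag, h.real, h.imag)
print(np.allclose(re + 1j * im, W @ h))  # True
```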

Input to Hidden, Nonlinearity, Hidden to Output
Input to hidden: follows the same form as the hidden-to-hidden mapping above.
Nonlinearity: a variation of the ReLU that the authors name modReLU, which rescales the modulus but preserves the phase:

$$\sigma_{\text{modReLU}}(z) = \begin{cases} (|z| + b)\,\dfrac{z}{|z|} & \text{if } |z| + b \ge 0, \\ 0 & \text{otherwise,} \end{cases}$$

where b is a trainable bias. Applying an ordinary ReLU works notably badly here, probably because forcing separate ReLUs on the real and imaginary parts destroys the phase.
Hidden to output: the output matrix U is defined to be real, giving a real output that can then be fed to a softmax or similar layer.
Because of the orthogonality constraint imposed by the unitary matrix, training is very stable and not very sensitive to the initial values, though the paper still provides an initialization scheme.
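A hedged NumPy sketch of modReLU as defined above (the small `eps` guard against division by zero is my own addition):

```python
import numpy as np

def modrelu(z, b, eps=1e-8):
    """modReLU: apply a ReLU to the modulus |z| + b, keep the phase z/|z|.
    sigma(z) = (|z| + b) * z/|z| when |z| + b >= 0, else 0."""
    mag = np.abs(z)
    scale = np.maximum(mag + b, 0.0) / (mag + eps)  # 0 when |z| + b < 0
    return scale * z

# b is a trainable per-unit bias; negative b creates a dead zone around 0.
z = np.array([1.0 + 1.0j, 0.1 - 0.2j, 0.0 - 2.0j])
b = np.array([-0.5, -0.5, 1.0])
print(modrelu(z, b))
```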

Experiments: see the paper.

Learning Convolutional Neural Networks for Graphs
Motivation:
- Given a collection of graphs, learn a function that can be used for classification and regression problems on unseen graphs (e.g. chemical compounds).
- Given a large graph, learn graph representations that can be used to infer unseen graph properties such as node types and missing edges (e.g. social networks, unlabeled graphs).
This ICML 2016 paper offers a new way of applying CNNs to graphs. Overall, the input graph is represented by w subgraphs of fixed size k + 1; these w subgraphs are normalized and flattened into a w(k+1)-dimensional vector that serves as input to a conventional CNN. In effect, it is a preprocessing step that maps a graph to a vector. The paper first draws an analogy between graphs and images; the neighborhood plays the role of the receptive field of a convolution kernel.

Learning Convolutional Neural Networks for Graphs
An image can be represented as a square grid graph whose nodes represent pixels. A CNN can then be seen as traversing a node sequence (nodes 1-4 in Figure 1(a)) and generating a fixed-size neighborhood graph (the 3x3 grids in Figure 1(b)) for each of those nodes. The neighborhood graphs serve as the receptive fields to read feature values from the pixel nodes.

PATCHY-SAN
The PATCHY-SAN architecture has several advantages over existing approaches. First, it is highly efficient, naively parallelizable, and applicable to large graphs. Second, for a number of applications, ranging from computational biology to social network analysis, it is important to visualize learned network motifs (Milo et al., 2002).
The pipeline: (1) select a fixed-length sequence of nodes from the graph; (2) assemble a fixed-size neighborhood for each node in the selected sequence; (3) normalize the extracted neighborhood graph; (4) learn neighborhood representations with convolutional neural networks from the resulting sequence of patches.
Input: an arbitrary graph. Output: w receptive fields per channel, on which ordinary convolutions can then be applied. The next slides describe each step in detail.

Node Sequence Selection
Step 1: graph labeling (label the graph's nodes, e.g. by node degree; this partitioning is also called color refinement or vertex classification). The paper uses the Weisfeiler-Lehman algorithm for graph labeling, which yields a rank for each node, so that different graphs get a canonical organization.
Step 2: sort the labeled nodes and take the top w as the node sequence to process (this maps graphs of different sizes to a common size). If there are fewer than w nodes, add all-zero receptive fields to the output, which amounts to padding.
Step 3: traverse these w nodes with stride s; the paper uses s = 1 (if s > 1, pad as in step 2 so the output still has w receptive fields).
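A hedged networkx sketch of this selection step (my own; node degree stands in for the paper's Weisfeiler-Lehman labeling):

```python
import networkx as nx

def select_node_sequence(G, w):
    """Rank nodes by a graph labeling and keep the top w.  Degree is a
    stand-in here; the paper uses Weisfeiler-Lehman labeling.  Missing
    entries (graph smaller than w) become None -> all-zero fields."""
    ranked = sorted(G.nodes, key=lambda v: (-G.degree(v), v))
    seq = ranked[:w]
    return seq + [None] * (w - len(seq))

G = nx.karate_club_graph()
print(select_node_sequence(G, 6))
```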

Neighborhood Assembly
For each visited node v (called the root), use breadth-first search to collect k nodes from its 1-neighborhood; if there are fewer than k, traverse the 1-neighborhoods of those neighbors, level by level, until k nodes have been gathered or all reachable neighbors have been visited. The root together with its collected neighbors forms the neighborhood graph.
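A sketch of the breadth-first assembly (mine; as in the paper, it gathers at least k nodes and leaves the trimming to normalization):

```python
import networkx as nx

def assemble_neighborhood(G, root, k):
    """Collect at least k nodes around root, breadth first: the root's
    1-neighborhood, then its neighbors' neighborhoods, level by level,
    until k nodes are gathered or the component is exhausted."""
    neigh, frontier = [root], [root]
    while len(neigh) < k and frontier:
        nxt = []
        for u in frontier:
            for v in G.neighbors(u):
                if v not in neigh:
                    neigh.append(v)
                    nxt.append(v)
        frontier = nxt
    return G.subgraph(neigh)  # trimmed to exactly k during normalization

print(assemble_neighborhood(nx.karate_club_graph(), root=0, k=5).nodes)
```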

Graph Normalization
Optimal graph normalization is NP-hard, so the paper only compares candidate labelings rather than solving Problem 1 exactly. The labeling algorithm (Algorithm 1) is really searching for a good structural description: similar structures should remain similar after labeling.
The previous step produced w (with s = 1) neighborhood graphs. Each is labeled by ranking its nodes by distance to the root node v; by Algorithm 4, the closer to v, the higher the rank. If a neighborhood graph has fewer than k nodes, it is padded with zero (dummy) nodes.
Normalization then takes this labeled graph and makes the labeling injective, so that every node's label is unique, using the nauty algorithm.
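A simplified sketch of the normalization (mine; a deterministic sort stands in for the paper's canonicalization with nauty):

```python
import networkx as nx

def normalize_neighborhood(G_neigh, root, k):
    """Order neighborhood nodes by distance to the root (closer = higher
    rank), break ties by degree, keep the top k, and pad with dummy
    nodes.  The paper makes the final labeling injective with nauty;
    this sketch just uses a deterministic sort."""
    dist = nx.single_source_shortest_path_length(G_neigh, root)
    ranked = sorted(G_neigh.nodes,
                    key=lambda v: (dist[v], -G_neigh.degree(v), str(v)))
    kept = ranked[:k]
    return kept + ["pad%d" % i for i in range(k - len(kept))]
```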

Convolutional Architecture
For each input graph G, the algorithm above yields a blob of shape (batch, height, width, channel): vertices as (w, k, 1, a_v) and edges as (w, k, k, a_e), where a_v and a_e are the numbers of vertex and edge attributes. This is how the output plugs into a standard CNN framework (this presentation does not use the tensor formulation from the paper).
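Reusing the three sketches above, a hedged illustration of packing the patches into the (w, k, 1, a_v) vertex blob; `deg_onehot` is my own toy featurizer, not the paper's attribute scheme:

```python
import numpy as np
import networkx as nx

def graph_to_vertex_blob(G, w, k, a_v, attr):
    """Pack w normalized receptive fields into a (w, k, 1, a_v) tensor;
    attr(G, v) returns a length-a_v feature vector, zeros for padding."""
    blob = np.zeros((w, k, 1, a_v))
    for i, root in enumerate(select_node_sequence(G, w)):
        if root is None:
            continue  # all-zero receptive field (padding)
        field = assemble_neighborhood(G, root, k)
        for j, v in enumerate(normalize_neighborhood(field, root, k)):
            if v in G:  # dummy pad nodes stay all-zero
                blob[i, j, 0, :] = attr(G, v)
    return blob

G = nx.karate_club_graph()
deg_onehot = lambda G, v: np.eye(20)[min(G.degree(v), 19)]
print(graph_to_vertex_blob(G, w=6, k=5, a_v=20, attr=deg_onehot).shape)
```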

Experiments: see the paper.

Thank You