CCF Computer Vision Technical Committee "Visiting Universities" Series, Lecture 29
Deep Learning for Sequential Data and Some Reflections
Fei Wu, Institute of Artificial Intelligence, College of Computer Science, Zhejiang University
http://mypage.zju.edu.cn/wufei/  http://www.dcd.zju.edu.cn/
March 20, 2017

Outline
1. The concept of sequence learning: sequence-to-sequence (Seq2Seq) learning
2. Several methods for sequence learning
3. A knowledge computing engine: from big data to knowledge

The architecture of Seq2Seq learning: an encoder reads the input sequence (w1, w2, w3, w4, w5) and a decoder generates the output sequence (v1, v2, v3, v4).
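To make the pattern concrete, here is a minimal encoder-decoder sketch in PyTorch; the GRU cells, layer sizes, and vocabulary sizes are illustrative assumptions, not details from the talk.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """A minimal encoder-decoder sketch (all sizes are illustrative)."""
    def __init__(self, src_vocab=1000, tgt_vocab=1000, emb=64, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # The encoder compresses the input tokens w1..wn into its final state.
        _, state = self.encoder(self.src_emb(src))
        # The decoder unrolls from that state to score the output tokens v1..vm.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)

model = Seq2Seq()
logits = model(torch.randint(0, 1000, (2, 5)),   # two source sequences, length 5
               torch.randint(0, 1000, (2, 4)))   # two target prefixes, length 4
```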

Seq2seq learning: machine translation. The classical pipeline analyzes "Jordan likes playing basketball" stage by stage: part-of-speech tagging (Jordan/NNP likes/VBZ playing/VBG basketball/NN), syntactic parsing into an S/VP/NP tree, and semantic analysis of the predicate-argument structure, before finally producing the Chinese translation 乔丹 喜欢 打篮球.

Seq2seq learning: machine translation. In the neural approach, a learned translator encodes "Jordan likes playing basketball" word by word and decodes 乔丹 喜欢 打篮球 directly; the mapping is learned in a data-driven way from large bilingual corpora (aligned source-target sentence pairs).

Seq2seq learning: visual Q&A. A convolutional neural network encodes the image; given the question "What is the man doing?", the decoder generates the answer "Riding a bike".

Seq2seq learning: image captioning. The encoder embeds the image, and the decoder generates the caption token by token from a <start> symbol: "A man in a white helmet is riding a bike".

Seq2seq learning: video action classification. The encoder reads the frame sequence and the decoder labels each time step, e.g., NO ACTION, pitching, pitching, pitching, NO ACTION.

Seq2seq learning: putting it together. One input to one output: image classification. One input to many outputs: image captioning. Many inputs to one output: sentiment analysis.

Seq2seq learning: putting it together. Many inputs to many outputs: video storyline generation and machine translation.

Outline
1. The concept of sequence learning: sequence-to-sequence (Seq2Seq) learning
2. Several methods for sequence learning
3. Some research and reflections

Basic models: from the multilayer perceptron (MLP) to the recurrent neural network to LSTM/GRU. A multilayer perceptron (MLP) is by nature a feedforward, directed acyclic network (a feed-forward neural network). An MLP consists of multiple layers and maps input data to output data through a set of nonlinear activation functions; it is trained with a supervised learning technique called backpropagation. Key properties: nonlinear, end-to-end differentiable, sequential mapping from input to output.
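As a concrete illustration (a minimal sketch; the layer sizes and the ReLU activation are assumptions, not taken from the slide), an MLP is simply a stack of linear maps interleaved with nonlinearities:

```python
import torch.nn as nn

mlp = nn.Sequential(
    nn.Linear(784, 256),  # input layer -> hidden layer
    nn.ReLU(),            # nonlinear activation
    nn.Linear(256, 10),   # hidden layer -> output layer
)
```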

Basic models: the role of feedforward networks in characterizing data distributions — the universal approximation theorem. A feed-forward network with a single hidden layer containing a finite number of neurons (i.e., a multilayer perceptron) can approximate continuous functions on compact subsets of R^n, under mild assumptions on the activation function. The theorem thus states that simple neural networks can represent a wide variety of interesting functions when given appropriate parameters; however, it says nothing about the algorithmic learnability of those parameters. One of the first versions of the theorem was proved by George Cybenko in 1989 for sigmoid activation functions.
Balázs Csanád Csáji, Approximation with Artificial Neural Networks, Faculty of Sciences, Eötvös Loránd University, Hungary.
G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems, 2(4), 303-314, 1989.
Kurt Hornik, Approximation Capabilities of Multilayer Feedforward Networks, Neural Networks, 4(2), 251-257, 1991.
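Formally, in the sigmoidal case proved by Cybenko: for any continuous f on a compact set K ⊂ R^n and any ε > 0, there exist a width N and parameters v_i, b_i ∈ R, w_i ∈ R^n such that the single-hidden-layer network

```latex
F(x) = \sum_{i=1}^{N} v_i \,\sigma\!\left(w_i^{\top} x + b_i\right)
\quad \text{satisfies} \quad
\sup_{x \in K} \bigl| F(x) - f(x) \bigr| < \varepsilon .
```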

Basic models: backpropagation of errors. Paul J. Werbos (born 1947) is a scientist best known for his 1974 Harvard University Ph.D. thesis, which first described the process of training artificial neural networks through backpropagation of errors. The thesis, and some supplementary information, can be found in his book The Roots of Backpropagation (ISBN 0-471-59897-6). He was also a pioneer of recurrent neural networks. Werbos was one of the original three two-year presidents of the International Neural Network Society (INNS) and was awarded the IEEE Neural Network Pioneer Award for the discovery of backpropagation and other basic neural network learning frameworks such as adaptive dynamic programming.
Paul J. Werbos, Backpropagation Through Time: What It Does and How to Do It, Proceedings of the IEEE, 78(10): 1550-1560, 1990.

Basic models: backpropagation of errors. Backpropagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector. As a result of the weight adjustments, internal "hidden" units that are not part of the input or output come to represent important features of the task domain, and the regularities in the task are captured by the interactions of these units.
David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams, Learning representations by back-propagating errors, Nature, 323(6088): 533-536, 1986.
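The procedure Rumelhart et al. describe can be written down directly; the following is a minimal numpy sketch for one hidden layer (the sizes, random data, and learning rate are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))          # 4 samples, 3 input features
t = rng.normal(size=(4, 2))          # desired output vectors
W1 = 0.1 * rng.normal(size=(3, 5))   # input -> hidden weights
W2 = 0.1 * rng.normal(size=(5, 2))   # hidden -> output weights
lr = 0.1

for step in range(100):
    # Forward pass.
    h = np.tanh(x @ W1)              # internal "hidden" units
    y = h @ W2                       # actual output vector
    # Error gradient: difference between actual and desired outputs.
    g_y = 2 * (y - t) / len(x)
    # Backward pass: propagate the error and adjust the weights.
    g_W2 = h.T @ g_y
    g_h = g_y @ W2.T
    g_W1 = x.T @ (g_h * (1 - h ** 2))   # tanh'(a) = 1 - tanh(a)^2
    W2 -= lr * g_W2
    W1 -= lr * g_W1
```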

Basic models: from the MLP to the recurrent neural network to LSTM/GRU. A recurrent neural network (RNN) has recurrent connections (connections to previous time steps of the same layer); it maps a sequence input (x_0 … x_t) to an embedding vector (h_t). RNNs are powerful but can get extremely complicated: computations derived from earlier inputs are fed back into the network, which gives the RNN a kind of memory, yet standard RNNs suffer from both exploding and vanishing gradients due to their iterative nature.
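The recurrence underlying a vanilla RNN, in its common textbook form (the notation is standard usage, not from the slide):

```latex
h_t = \tanh\!\left(W_x x_t + W_h h_{t-1} + b\right)
```

Unrolling this over many time steps multiplies gradients by W_h again and again, which is precisely why standard RNNs are prone to exploding and vanishing gradients.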

Basic models: the Long Short-Term Memory (LSTM) model. The LSTM is an RNN devised to deal with the exploding and vanishing gradient problems of standard RNNs. An LSTM hidden layer consists of a set of recurrently connected blocks known as memory cells. Each memory cell is controlled by three multiplicative units: the input, output, and forget gates. The input to the cell is multiplied by the activation of the input gate, the output to the net is multiplied by the output gate, and the previous cell value is multiplied by the forget gate.
Sepp Hochreiter & Jürgen Schmidhuber, Long short-term memory, Neural Computation, 9(8): 1735-1780, MIT Press, 1997.
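In the standard formulation (the notation is the common one rather than the slide's), the three gates and the cell update read:

```latex
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \qquad
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \qquad
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o),
c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c), \qquad
h_t = o_t \odot \tanh(c_t).
```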

Basic models: the Gated Recurrent Unit (GRU). Gated recurrent units are a gating mechanism for recurrent neural networks. A GRU has fewer parameters than an LSTM, as it lacks an output gate:
$z_t = \sigma(W_z x_t + U_z h_{t-1})$
$r_t = \sigma(W_r x_t + U_r h_{t-1})$
$\tilde{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, Yoshua Bengio, Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, arXiv:1412.3555, 2014.
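A direct numpy transcription of these four equations (a sketch; the dimensions and random parameters are illustrative assumptions, and biases are omitted as on the slide):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x_t, h_prev, Wz, Uz, Wr, Ur, W, U):
    """One GRU step implementing the update/reset-gate equations above."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev)             # update gate z_t
    r = sigmoid(Wr @ x_t + Ur @ h_prev)             # reset gate r_t
    h_tilde = np.tanh(W @ x_t + U @ (r * h_prev))   # candidate state
    return (1 - z) * h_prev + z * h_tilde           # interpolate old and new

rng = np.random.default_rng(0)
d_in, d_h = 4, 3                                    # illustrative sizes
params = [rng.normal(size=(d_h, d_in if i % 2 == 0 else d_h)) for i in range(6)]
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):                # run over a length-5 sequence
    h = gru_cell(x, h, *params)
```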

Learning with attention / internal memory. "The behavior of the computer at any moment is determined by the symbols which he is observing and his 'state of mind' at that moment." – Alan Turing. In the output sequence, the output at each time step depends on an encoding of all the input data specific to that moment (figure: input sequence below, output sequence above).

Learning with attention / internal memory. Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, Neural Machine Translation by Jointly Learning to Align and Translate, ICLR 2015.
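The alignment mechanism of that paper, in the paper's own notation: each encoder annotation h_j is scored against the previous decoder state s_{i-1}, the scores are softmax-normalized, and the context vector is their weighted sum:

```latex
e_{ij} = a(s_{i-1}, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij}\, h_j .
```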

Learning with attention / internal memory. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation; Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation.

Learning with attention / internal memory. Deterministic "soft" attention, trained end-to-end, computes a context vector z_t as an attention-weighted combination of annotation vectors.
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML 2015.
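A numpy sketch of this deterministic soft attention; the one-layer tanh scoring network f_att and all dimensions are assumptions made for illustration:

```python
import numpy as np

def soft_attention(a, h_prev, Wa, Wh, v):
    """Compute the context vector z_t from annotation vectors a (L x D)
    and the previous decoder state h_prev (H,). Wa: (D,K), Wh: (H,K), v: (K,)."""
    scores = np.tanh(a @ Wa + h_prev @ Wh) @ v    # one score per location, (L,)
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                          # soft attention weights
    return alpha @ a                              # z_t: expectation over locations

rng = np.random.default_rng(0)
L, D, H, K = 196, 512, 256, 128                   # illustrative sizes
z_t = soft_attention(rng.normal(size=(L, D)), rng.normal(size=H),
                     rng.normal(size=(D, K)), rng.normal(size=(H, K)),
                     rng.normal(size=K))
```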

Learning with attention / internal memory. Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML 2015.

Learning with external memory. The hippocampus in the human brain: a repository of priors and knowledge.

Learning with external memory: Neural Turing Machines, with reading and writing heads over an external memory matrix.
A. Graves, G. Wayne, I. Danihelka, Neural Turing Machines, arXiv:1410.5401, 2014 (DeepMind).
J. Weston, S. Chopra, A. Bordes, Memory Networks, ICLR 2015 (arXiv:1410.3916, Facebook AI).
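The read and write operations from the NTM paper, where w_t is a normalized attention weighting over the N rows of the memory matrix M_t, e_t is an erase vector, and a_t is an add vector:

```latex
\text{read:}\quad r_t = \sum_{i=1}^{N} w_t(i)\, M_t(i), \qquad
\text{write:}\quad \tilde{M}_t(i) = M_{t-1}(i)\,\bigl(\mathbf{1} - w_t(i)\, e_t\bigr), \quad
M_t(i) = \tilde{M}_t(i) + w_t(i)\, a_t .
```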

Learning with external memory: Neural Turing Machines.

Learning with external memory: the differentiable neural computer (DNC), an achievement with potential implications for the neural-symbolic integration problem (unifying neural networks and symbolic computation); it supports deep neural reasoning and one-shot learning.
Alex Graves, et al., Hybrid computing using a neural network with dynamic external memory, Nature, 538(7626): 471-476, 2016.

Learning with external memory: deep neural reasoning (search and decision-making in which continuous-space and discrete-space models work in concert).

Learning with external memory: learning of basic algorithms using Reasoning, Attention, Memory (RAM). Methods include adding stacks and addressable memory to RNNs:
- "Neural Net Architectures for Temporal Sequence Processing," M. Mozer.
- "Neural Turing Machines," A. Graves, G. Wayne, I. Danihelka.
- "Inferring Algorithmic Patterns with Stack Augmented Recurrent Nets," A. Joulin, T. Mikolov.
- "Learning to Transduce with Unbounded Memory," E. Grefenstette et al.
- "Neural Programmer-Interpreters," S. Reed, N. de Freitas.
- "Reinforcement Learning Neural Turing Machines," W. Zaremba, I. Sutskever.
- "Learning Simple Algorithms from Examples," W. Zaremba, T. Mikolov, A. Joulin, R. Fergus.
- "The Neural GPU and the Neural RAM machine," I. Sutskever.

Giving memory to AI. DeepMind crafted an algorithm that lets a neural network "remember" past knowledge and learn new tasks more effectively. The approach is similar to how your own mind works, and might even provide insights into the functioning of human minds: much like real synapses, which tend to preserve connections between neurons that have been useful in the past, the algorithm (known as elastic weight consolidation, EWC) decides how important a given connection is to its associated task.
James Kirkpatrick, Razvan Pascanu, et al., Overcoming catastrophic forgetting in neural networks, PNAS, http://www.pnas.org/cgi/doi/10.1073/pnas.1611835114
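At the core of elastic weight consolidation is a quadratic penalty: when learning task B after task A, each weight θ_i is anchored to its old value θ*_{A,i} in proportion to its importance F_i (the diagonal of the Fisher information), so that

```latex
\mathcal{L}(\theta) = \mathcal{L}_B(\theta) + \sum_i \frac{\lambda}{2}\, F_i \left(\theta_i - \theta_{A,i}^{*}\right)^{2},
```

where λ trades off how strongly old knowledge is preserved against performance on the new task.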

Outline
1. The concept of sequence learning: sequence-to-sequence (Seq2Seq) learning
2. Several methods for sequence learning
3. A knowledge computing engine: from big data to knowledge

Knowledge computing engine: KS-Studio. http://www.ksstudio.org/

Knowledge computing engine: KS-Studio. The KS-Studio technical framework.

Knowledge computing engine: KS-Studio. By introducing "crowdsourced data" and "knowledge rules" into data-driven machine learning, we extend purely data-driven concept recognition and build AI methods with strong interpretability. (Figure: data-driven machine learning; an individual tacit-knowledge model (crowd model); entities, attributes, and relations in a knowledge graph; weakly labeled information in crowdsourced data; intuition and experience.)

TAC Knowledge Base Population (KBP) 2016

TAC Knowledge Base Population (KBP) 2016, Task 1: Cold Start KBP. The Cold Start KBP track builds a knowledge base from scratch using a given document collection and a predefined schema for the entities and relations that will comprise the KB. In addition to an end-to-end KB construction task, Cold Start KBP includes a Slot Filling (SF) task to fill in values for predefined slots (attributes) of a given entity (of type Person or Organization).

TAC Knowledge Base Population (KBP) 2016, Task 2: Entity Discovery and Linking (EDL). The EDL track aims to extract entity mentions from a source collection of textual documents in multiple languages (English, Chinese, and Spanish) and link them to an existing knowledge base (KB); an EDL system is also required to cluster mentions of entities that have no corresponding KB entries.

TAC Knowledge Base Population (KBP) 2016, Task 3: Event Track. The goal of the Event track is to extract information about events such that it would be suitable as input to a knowledge base. The track includes Event Nugget (EN) tasks to detect and link events, and Event Argument (EA) tasks to extract event arguments and link arguments that belong to the same event.

KS-Studio in the KBP knowledge-graph international evaluation. The participating teams came from 15 well-known universities and research institutes at home and abroad, including CMU, UIUC, IBM, UCL, iFLYTEK, Zhejiang University, and BUPT. (Table: participating-team information.)

KS-Studio in the KBP knowledge-graph international evaluation. KBP 2016 Mention Detection task results: Zhejiang University ranked first overall, taking first place on two of the three metrics and tying for second on the remaining one.

KS-Studio in the KBP knowledge-graph international evaluation. KBP 2016 Entity Linking task results: the Zhejiang University key-technology group ranked first overall, taking first place on four of the five metrics and second on the remaining one.

KS-Studio: relation discovery. Training: from labeled examples, train a convolutional network and a classifier over vectorized sentence representations. Prediction: extract named-entity pairs from a sentence and predict the relation between them.
Positive examples: "Carbonate may be a factor in the increasing incidence of heart disease." "Lithium also causes cyanosis during early pregnancy."
Negative examples: "Flumazenil was well tolerated, with no movement disorders reported." "Thromboembolism is a recognized complication of heparin therapy." "Dexmedetomidine is useful as the sole sedative for pediatric MRI."
Prediction example: "A 2-year-old child with known neurologic impairment developed a dyskinesia soon after starting phenobarbital therapy for seizures. Known causes of movement disorders were eliminated after evaluation." → Drug-induced-Disease: (Phenobarbital, movement disorders).
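The slide specifies only "a convolutional network and a classifier", so the following PyTorch sketch is one plausible instantiation; the embedding size, filter count, window width, and max-pooling scheme are all assumptions:

```python
import torch
import torch.nn as nn

class RelationCNN(nn.Module):
    """A minimal sentence-level CNN relation classifier (sizes are illustrative)."""
    def __init__(self, vocab_size=10000, emb_dim=100, n_filters=64,
                 window=3, n_relations=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=window)
        self.fc = nn.Linear(n_filters, n_relations)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        e = self.emb(tokens).transpose(1, 2)   # (batch, emb_dim, seq_len)
        h = torch.relu(self.conv(e))           # convolve over word windows
        h = h.max(dim=2).values                # max-pool over time
        return self.fc(h)                      # relation scores

model = RelationCNN()
scores = model(torch.randint(0, 10000, (8, 20)))  # 8 sentences, 20 tokens each
```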

KS-Studio: relation discovery. Taking Drug-induced-Disease relation extraction as an example: the extracted pair (Phenobarbital, dyskinesia), i.e., phenobarbital causes a movement disorder.

KS-Studio: Q&A. Social networks, expert models, and the temporal information of answers serve as auxiliary signals for question-answer matching, improving deep Q&A matching. (Figure: a large collection of Q&A pairs, the user social network, answer temporal information, and other signals are extracted to train a deep neural network model; the trained network is then used for expert recommendation, short-text ranking on test data, and user-satisfaction prediction.)

KS-Studio: Q&A. Modeling causal dependencies and temporal-influence mechanisms in Q&A. "He who can employ the strength of the many is invincible under heaven; he who can employ the wisdom of the many fears not even the sages" (from the Records of the Three Kingdoms, Book of Wu, Biography of Sun Quan). In Q&A, opinions are gradually distilled into consensus.

KS-Studio: Q&A. Fei Wu, Xinyu Duan, Jun Xiao, Zhou Zhao, Siliang Tang, Yin Zhang, Yueting Zhuang, Temporal Interaction and Causation Influence in Community-based Question Answering, IEEE Transactions on Knowledge and Data Engineering (major revision).

KS-Studio: image captioning. Compositional semantics are generated for coupled image-sentence data, realizing image captioning within a hierarchical deep learning framework.
Yueting Zhuang, Jun Song, Fei Wu, Xi Li, Zhongfei Zhang, Yong Rui, Multi-modal Deep Embedding via Hierarchical Grounded Compositional Semantics, IEEE Transactions on Circuits and Systems for Video Technology, doi:10.1109/TCSVT.2016.2606648.

KS-Studio: action recognition. Video event detection and video captioning. Third place in the THUMOS Challenge 2015 international video action recognition competition (THUMOS covers 101 video action classes); sixth place in the 2016 ActivityNet action recognition competition.
Pingbo Pan, Zhongwen Xu, Yi Yang, Fei Wu, Yueting Zhuang, Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning, CVPR 2016, 1029-1038.

KS-Studio: action recognition. (Demo: a recognition system running on a real-time video stream and displaying results.)

Summary: heading toward Artificial Intelligence 2.0.
Yunhe Pan, Heading toward artificial intelligence 2.0, Engineering, 2016, 409-413.

Summary: heading toward Artificial Intelligence 2.0.
Yueting Zhuang, Fei Wu, Chun Chen, Yunhe Pan, Challenges and Opportunities: From Big Data to Knowledge in AI 2.0, Frontiers of Information Technology & Electronic Engineering, 2017, 18(1): 3-14.

Thank you! Email: wufei@cs.zju.edu.cn