More About Auto-encoder

Presentation transcript:

More About Auto-encoder Hung-yi Lee 李宏毅

Auto-encoder: an NN Encoder maps the input to a vector (the embedding, latent representation, or latent code), and an NN Decoder reconstructs the input from it, trained so that input and reconstruction are as close as possible. This lecture covers two topics: going beyond minimizing reconstruction error, and making the embedding more interpretable.

What is a good embedding? An embedding should represent the object. (是一對 "a pair" vs. 不是一對 "not a pair": an image with its own embedding is a matching pair; an image with another image's embedding is not.)

Beyond Reconstruction: how do we evaluate an encoder? Train a discriminator (a binary classifier with parameters 𝜙) that looks at an image together with an embedding and says "yes" if they are a matching pair produced by the NN Encoder and "no" otherwise. The loss of this classification task is 𝐿_𝐷; training 𝜙 to minimize it gives 𝐿_𝐷* = min_𝜙 𝐿_𝐷. A small 𝐿_𝐷* means the embeddings are representative; a large 𝐿_𝐷* means they are not. (In a real implementation, should we split into train and test sets?)

We can then train the encoder parameters 𝜃 to minimize 𝐿_𝐷*: 𝜃* = arg min_𝜃 𝐿_𝐷* = arg min_𝜃 min_𝜙 𝐿_𝐷, i.e. train the encoder 𝜃 and the discriminator 𝜙 to minimize 𝐿_𝐷. (Is the training iterative?) This is the idea behind Deep InfoMax (DIM); compare it with training an encoder and decoder to minimize reconstruction error.
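The evaluation above can be sketched numerically. The bilinear scoring function, the identity discriminator weights, and the toy features below are illustrative assumptions, not DIM's actual architecture; the point is only that representative embeddings make the discriminator's loss 𝐿_𝐷 easy to drive down, while unrelated embeddings leave it guessing.

```python
import numpy as np

rng = np.random.default_rng(0)

def pair_score(img_feats, embs, W):
    # Bilinear discriminator score for (image feature, embedding) pairs
    # (an assumed form; any binary classifier would do).
    return np.sum((img_feats @ W) * embs, axis=-1)

def discriminator_loss(img_feats, embs, W):
    """Binary cross-entropy L_D: say "yes" to matched pairs,
    "no" to mismatched (shuffled) pairs."""
    sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))
    pos = sigmoid(pair_score(img_feats, embs, W))                      # matched
    neg = sigmoid(pair_score(img_feats, np.roll(embs, 1, axis=0), W))  # mismatched
    eps = 1e-9
    return -np.mean(np.log(pos + eps)) - np.mean(np.log(1.0 - neg + eps))

# Toy check: embeddings that simply copy the image features are perfectly
# representative, so a simple discriminator (W = identity) separates matched
# from mismatched pairs; unrelated embeddings give a larger loss.
feats = rng.normal(size=(64, 8))
good_embs = feats.copy()              # representative embeddings
bad_embs = rng.normal(size=(64, 8))   # embeddings unrelated to the images
W = np.eye(8)
loss_good = discriminator_loss(feats, good_embs, W)
loss_bad = discriminator_loss(feats, bad_embs, W)
```

A small loss plays the role of a small 𝐿_𝐷* here; in DIM the encoder itself is then trained to make this minimum small.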

A typical auto-encoder is a special case of this framework: the NN Decoder plus the "as close as possible" criterion act as the discriminator, whose score is the negative reconstruction error.

Sequential Data. A document is a sequence of sentences. Skip thought (https://papers.nips.cc/paper/5950-skip-thought-vectors.pdf): from the current sentence, predict the previous and the next sentence. Quick thought (https://arxiv.org/pdf/1803.02893.pdf): given the current sentence, classify which of several candidates is the real next sentence versus random ones; in the paper's experiments the score c is simply defined to be an inner product, c(u, v) = uᵀv.
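The quick-thought classifier can be sketched as follows. The sentence embeddings below are made-up toy vectors, not the output of a trained encoder; the sketch only shows the inner-product scoring and softmax over candidates.

```python
import numpy as np

def quick_thought_probs(current, candidates):
    """Quick-thought classifier: score each candidate next sentence with an
    inner product c(u, v) = u . v, then softmax over the candidates."""
    scores = candidates @ current      # c(u, v) for every candidate v
    scores = scores - scores.max()     # numerical stability
    e = np.exp(scores)
    return e / e.sum()

# Toy sentence embeddings (assumed values):
current = np.array([1.0, 0.0, 1.0])
true_next = np.array([0.9, 0.1, 0.8])          # similar to the current sentence
random_sents = np.array([[-1.0, 0.5, -0.7],
                         [0.1, -0.9, 0.2]])    # random distractors
candidates = np.vstack([true_next, random_sents])
probs = quick_thought_probs(current, candidates)
print(probs.argmax())  # 0: the true next sentence gets the highest probability
```

Training pushes the true next sentence's probability up, which shapes the embeddings without any decoder.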

Sequential Data Contrastive Predictive Coding (CPC) https://arxiv.org/pdf/1807.03748.pdf
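CPC's training signal is an InfoNCE-style contrastive loss; a minimal sketch is below. The toy vectors are assumptions, and real CPC predicts future representations with an autoregressive context network, which is omitted here.

```python
import numpy as np

def info_nce_loss(pred, future, negatives):
    """InfoNCE loss (sketch): the predicted context vector should score the
    true future representation higher than the negative samples."""
    scores = np.concatenate([[pred @ future], negatives @ pred])
    scores = scores - scores.max()                      # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())   # log-softmax
    return -log_probs[0]                                # true future at index 0

# Toy check (assumed vectors): a prediction aligned with the true future
# representation yields a lower loss than an unrelated prediction.
future = np.array([1.0, 0.0, 1.0])
negs = np.array([[-1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
loss_good = info_nce_loss(np.array([1.0, 0.0, 1.0]), future, negs)
loss_bad = info_nce_loss(np.array([0.0, 1.0, 0.0]), future, negs)
```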

Auto-encoder (recap): NN Encoder → vector (embedding, latent representation, latent code) → NN Decoder, trained so that input and reconstruction are as close as possible. Next topic: more interpretable embedding.

Feature Disentangle. An object contains information about multiple aspects. (Image source: https://www.dreamstime.com/illustration/disentangle.html) When an encoder/decoder pair reconstructs input audio, the latent code includes phonetic information, speaker information, etc.; when it reconstructs an input sentence, the latent code includes syntactic information, semantic information, etc.

Feature Disentangle: ideally, part of the latent code holds the phonetic information and the rest holds the speaker information. Alternatively, use two encoders, Encoder 1 producing the phonetic information and Encoder 2 the speaker information, with a single decoder reconstructing from both.

Feature Disentangle - Voice Conversion. During training, the phonetic embedding and speaker embedding of the same utterance (e.g. "How are you?" or "Hello") are recombined by the decoder to reconstruct it. At conversion time, swap the parts: combine the phonetic embedding of one speaker's "How are you?" with the speaker embedding taken from another speaker's "Hello"; the decoder then outputs "How are you?" in the second speaker's voice.

Feature Disentangle - Voice Conversion. The same sentence has a different impact when it is said by different people: "Do you want to study a PhD?" asked by a fellow student gets "Go away!", but asked in the voice of 新垣結衣 (Aragaki Yui) it lands rather differently.

Feature Disentangle - Adversarial Training. A speaker classifier (the discriminator) tries to tell which speaker an utterance came from using only the phonetic part of the embedding; the encoder learns to fool the speaker classifier, so speaker information is squeezed out of that part. The speaker classifier and the encoder are learned iteratively.
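The two objectives in this adversarial game can be sketched as plain loss functions. The linear classifier, the λ weight, and the toy embeddings below are assumptions for illustration; real systems train both networks with gradients, often via a gradient-reversal layer or alternating updates.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def speaker_clf_loss(content_emb, W_clf, speaker_id):
    """Cross-entropy of the speaker classifier on the *content* embedding.
    The classifier minimizes this; the encoder is trained to maximize it."""
    probs = softmax(W_clf @ content_emb)
    return -np.log(probs[speaker_id] + 1e-9)

def encoder_loss(x, x_recon, content_emb, W_clf, speaker_id, lam=0.1):
    """Encoder objective (sketch): reconstruct well while fooling the speaker
    classifier, pushing speaker info out of the content embedding."""
    recon = np.mean((x - x_recon) ** 2)
    return recon - lam * speaker_clf_loss(content_emb, W_clf, speaker_id)

# Toy check: with identical reconstruction, a content embedding that leaks
# speaker identity (easy for the classifier) costs the encoder more than a
# speaker-neutral one.
x = np.ones(4)
x_recon = np.ones(4)           # perfect reconstruction in both cases
W_clf = np.eye(2)              # 2 speakers, toy linear classifier
leak_loss = encoder_loss(x, x_recon, np.array([5.0, 0.0]), W_clf, speaker_id=0)
clean_loss = encoder_loss(x, x_recon, np.array([0.0, 0.0]), W_clf, speaker_id=0)
```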

Feature Disentangle - Designed Network Architecture. Encoder 1 (the content branch) is followed by instance normalization (IN), which removes global information such as speaker characteristics. Encoder 2 (the speaker branch) feeds the decoder through adaptive instance normalization (AdaIN), which only influences global information. The decoder combines both to reconstruct "How are you?".

Feature Disentangle - Adversarial Training results. Demos: https://jjery2243542.github.io/voice_conversion_demo/ and https://b04901014.github.io/ISGAN/. Conversions go from a source speaker to a target speaker never seen during training. Thanks to Ju-chieh Chou (周儒杰) for providing the results.

Discrete Representation. A discrete latent code is easier to interpret or to cluster. One-hot: the encoder output (0.9, 0.1, 0.3, 0.7) is snapped to (1, 0, 0, 0) by keeping only the largest entry. Binary: each entry is thresholded, giving (1, 0, 0, 1). The quantization step is non-differentiable, so training needs techniques such as Gumbel-softmax (https://arxiv.org/pdf/1611.01144.pdf).
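The forward pass of both quantizers is tiny; a sketch using the slide's example vector (the backward pass, e.g. Gumbel-softmax or a straight-through estimator that copies gradients past the quantization, is what the cited paper addresses and is only noted in comments here):

```python
import numpy as np

def to_one_hot(z):
    """One-hot latent: keep only the largest entry. argmax is
    non-differentiable; training uses Gumbel-softmax or a straight-through
    estimator (gradient copied past the quantization step)."""
    out = np.zeros_like(z)
    out[np.argmax(z)] = 1.0
    return out

def to_binary(z, threshold=0.5):
    """Binary latent: threshold each entry (also non-differentiable)."""
    return (z > threshold).astype(z.dtype)

z = np.array([0.9, 0.1, 0.3, 0.7])  # encoder output from the slide
print(to_one_hot(z))  # [1. 0. 0. 0.]
print(to_binary(z))   # [1. 0. 0. 1.]
```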

Discrete Representation: Vector Quantized Variational Auto-encoder (VQVAE) (https://arxiv.org/abs/1711.00937). The encoder output vector is compared for similarity against a codebook (a set of vectors, learned from data); the most similar codebook entry (e.g. vector 3) becomes the input of the decoder. For speech, the codebook ends up representing phonetic information (https://arxiv.org/pdf/1901.08810.pdf).
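The quantization step is a nearest-neighbour lookup; a sketch with an assumed toy codebook (real VQ-VAE also needs commitment and codebook losses plus a straight-through gradient, omitted here):

```python
import numpy as np

def vector_quantize(z, codebook):
    """VQ-VAE quantization (sketch): replace the encoder output z with the
    most similar codebook vector (nearest in L2 distance); that codebook
    vector is what the decoder receives."""
    dists = np.sum((codebook - z) ** 2, axis=1)  # squared L2 to each entry
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

codebook = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.5, 0.5]])   # the learned "set of vectors" (toy values)
z = np.array([0.9, 0.1])            # toy encoder output
idx, z_q = vector_quantize(z, codebook)
print(idx)  # 0: z is closest to codebook vector [1, 0]
```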

Sequence as Embedding (https://arxiv.org/abs/1810.02851). A seq2seq2seq auto-encoder: a seq2seq generator G compresses a document into a word sequence, and a seq2seq reconstructor R rebuilds the document from it, so the latent representation is itself a sequence of words. Only a large collection of documents is needed to train the model. Could the word sequence serve as a summary? Not yet: it is not readable.

Sequence as Embedding. Add a discriminator D trained on human-written summaries to judge real or not; G learns to make the discriminator consider its output real. The latent word sequence then becomes a readable summary.

Thanks to 王耀賢 for providing the experimental results. Sequence as Embedding, examples:
Document: 澳大利亞今天與13個國家簽署了反興奮劑雙邊協議,旨在加強體育競賽之外的藥品檢查並共享研究成果…… (Australia today signed bilateral anti-doping agreements with 13 countries, aiming to strengthen out-of-competition drug testing and share research results …)
Human: 澳大利亞與13國簽署反興奮劑協議 (Australia signs anti-doping agreements with 13 countries)
Unsupervised: 澳大利亞加強體育競賽之外的藥品檢查 (Australia strengthens out-of-competition drug testing)
Document: 中華民國奧林匹克委員會今天接到一九九二年冬季奧運會邀請函,由於主席張豐緒目前正在中南美洲進行友好訪問,因此尚未決定是否派隊赴賽…… (The ROC Olympic Committee today received an invitation to the 1992 Winter Olympics; as chairman 張豐緒 is on a goodwill visit in Central and South America, it has not yet decided whether to send a team …)
Human: 一九九二年冬季奧運會函邀我參加 (We are invited by letter to the 1992 Winter Olympics)
Unsupervised: 奧委會接獲冬季奧運會邀請函 (Olympic Committee receives Winter Olympics invitation)

Sequence as Embedding, examples (including failure cases):
Document: 據此間媒體27日報道,印度尼西亞蘇門答臘島的兩個省近日來連降暴雨,洪水泛濫導致塌方,到26日為止至少已有60人喪生,100多人失蹤…… (Local media reported on the 27th that days of heavy rain in two provinces of Indonesia's Sumatra island caused flooding and landslides; by the 26th at least 60 people had died and more than 100 were missing …)
Human: 印尼水災造成60人死亡 (Indonesian floods kill 60)
Unsupervised: 印尼門洪水泛濫導致塌雨 (ungrammatical: roughly "Indonesia flooding causes collapse-rain")
Document: 安徽省合肥市最近為領導幹部下基層做了新規定:一律輕車簡從,不準搞迎來送往、不準搞層層陪同…… (Hefei, Anhui Province, recently set new rules for officials visiting the grassroots: travel light, no welcome-and-send-off ceremonies, no layers of accompaniment …)
Human: 合肥規定領導幹部下基層活動從簡 (Hefei rules that officials' grassroots visits be kept simple)
Unsupervised: 合肥領導幹部下基層做搞迎來送往規定:一律簡 (an ungrammatical rearrangement of the source)

Tree as Embedding: https://arxiv.org/abs/1806.07832 (talk: https://vimeo.com/285800885) and https://arxiv.org/abs/1904.03746

Concluding Remarks. An auto-encoder (NN Encoder → code → NN Decoder, trained to make input and reconstruction as close as possible) can be extended in two directions: more than minimizing reconstruction error (using a discriminator; sequential data) and more interpretable embeddings (feature disentangle; discrete and structured representations).