More About Auto-encoder

Presentation transcript:

More About Auto-encoder Hung-yi Lee 李宏毅

Auto-encoder: an NN Encoder maps the input to a vector (the embedding, latent representation, or latent code), and an NN Decoder reconstructs the input from it, trained so that input and reconstruction are as close as possible. This lecture covers two topics: going beyond minimizing reconstruction error, and making the embedding more interpretable.

What is a good embedding? An embedding should represent the object. (是一對 "a pair" vs. 不是一對 "not a pair": an image with its own embedding is a matching pair; an image with another image's embedding is not.)

Beyond Reconstruction: how do we evaluate an encoder? Train a discriminator (a binary classifier with parameters 𝜙) that looks at an image together with an embedding and says "yes" if they are a matching pair produced by the NN Encoder and "no" otherwise. The loss of this classification task is 𝐿_𝐷; training 𝜙 to minimize it gives 𝐿_𝐷* = min_𝜙 𝐿_𝐷. A small 𝐿_𝐷* means the embeddings are representative; a large 𝐿_𝐷* means they are not. (In a real implementation, should we split into train and test sets?)

We can then train the encoder parameters 𝜃 to minimize 𝐿_𝐷*: 𝜃* = arg min_𝜃 𝐿_𝐷* = arg min_𝜃 min_𝜙 𝐿_𝐷, i.e. train the encoder 𝜃 and the discriminator 𝜙 to minimize 𝐿_𝐷. (Is the training iterative?) This is the idea behind Deep InfoMax (DIM); compare it with training an encoder and decoder to minimize reconstruction error.
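The evaluation above can be sketched numerically. The bilinear scoring function, the identity discriminator weights, and the toy features below are illustrative assumptions, not DIM's actual architecture; the point is only that representative embeddings make the discriminator's loss 𝐿_𝐷 easy to drive down, while unrelated embeddings leave it guessing.

```python
import numpy as np

rng = np.random.default_rng(0)

def pair_score(img_feats, embs, W):
    # Bilinear discriminator score for (image feature, embedding) pairs
    # (an assumed form; any binary classifier would do).
    return np.sum((img_feats @ W) * embs, axis=-1)

def discriminator_loss(img_feats, embs, W):
    """Binary cross-entropy L_D: say "yes" to matched pairs,
    "no" to mismatched (shuffled) pairs."""
    sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))
    pos = sigmoid(pair_score(img_feats, embs, W))                      # matched
    neg = sigmoid(pair_score(img_feats, np.roll(embs, 1, axis=0), W))  # mismatched
    eps = 1e-9
    return -np.mean(np.log(pos + eps)) - np.mean(np.log(1.0 - neg + eps))

# Toy check: embeddings that simply copy the image features are perfectly
# representative, so a simple discriminator (W = identity) separates matched
# from mismatched pairs; unrelated embeddings give a larger loss.
feats = rng.normal(size=(64, 8))
good_embs = feats.copy()              # representative embeddings
bad_embs = rng.normal(size=(64, 8))   # embeddings unrelated to the images
W = np.eye(8)
loss_good = discriminator_loss(feats, good_embs, W)
loss_bad = discriminator_loss(feats, bad_embs, W)
```

A small loss plays the role of a small 𝐿_𝐷* here; in DIM the encoder itself is then trained to make this minimum small.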

A typical auto-encoder is a special case of this framework: the NN Decoder plus the "as close as possible" criterion act as the discriminator, whose score is the negative reconstruction error.

Sequential Data. A document is a sequence of sentences. Skip thought (https://papers.nips.cc/paper/5950-skip-thought-vectors.pdf): from the current sentence, predict the previous and the next sentence. Quick thought (https://arxiv.org/pdf/1803.02893.pdf): given the current sentence, classify which of several candidates is the real next sentence versus random ones; in the paper's experiments the score c is simply defined to be an inner product, c(u, v) = uᵀv.
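The quick-thought classifier can be sketched as follows. The sentence embeddings below are made-up toy vectors, not the output of a trained encoder; the sketch only shows the inner-product scoring and softmax over candidates.

```python
import numpy as np

def quick_thought_probs(current, candidates):
    """Quick-thought classifier: score each candidate next sentence with an
    inner product c(u, v) = u . v, then softmax over the candidates."""
    scores = candidates @ current      # c(u, v) for every candidate v
    scores = scores - scores.max()     # numerical stability
    e = np.exp(scores)
    return e / e.sum()

# Toy sentence embeddings (assumed values):
current = np.array([1.0, 0.0, 1.0])
true_next = np.array([0.9, 0.1, 0.8])          # similar to the current sentence
random_sents = np.array([[-1.0, 0.5, -0.7],
                         [0.1, -0.9, 0.2]])    # random distractors
candidates = np.vstack([true_next, random_sents])
probs = quick_thought_probs(current, candidates)
print(probs.argmax())  # 0: the true next sentence gets the highest probability
```

Training pushes the true next sentence's probability up, which shapes the embeddings without any decoder.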

Sequential Data Contrastive Predictive Coding (CPC) https://arxiv.org/pdf/1807.03748.pdf
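CPC's training signal is an InfoNCE-style contrastive loss; a minimal sketch is below. The toy vectors are assumptions, and real CPC predicts future representations with an autoregressive context network, which is omitted here.

```python
import numpy as np

def info_nce_loss(pred, future, negatives):
    """InfoNCE loss (sketch): the predicted context vector should score the
    true future representation higher than the negative samples."""
    scores = np.concatenate([[pred @ future], negatives @ pred])
    scores = scores - scores.max()                      # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())   # log-softmax
    return -log_probs[0]                                # true future at index 0

# Toy check (assumed vectors): a prediction aligned with the true future
# representation yields a lower loss than an unrelated prediction.
future = np.array([1.0, 0.0, 1.0])
negs = np.array([[-1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
loss_good = info_nce_loss(np.array([1.0, 0.0, 1.0]), future, negs)
loss_bad = info_nce_loss(np.array([0.0, 1.0, 0.0]), future, negs)
```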

Auto-encoder (recap): NN Encoder → vector (embedding, latent representation, latent code) → NN Decoder, trained so that input and reconstruction are as close as possible. Next topic: more interpretable embedding.

Feature Disentangle. An object contains information about multiple aspects. (Image source: https://www.dreamstime.com/illustration/disentangle.html) When an encoder/decoder pair reconstructs input audio, the latent code includes phonetic information, speaker information, etc.; when it reconstructs an input sentence, the latent code includes syntactic information, semantic information, etc.

Feature Disentangle: ideally, part of the latent code holds the phonetic information and the rest holds the speaker information. Alternatively, use two encoders, Encoder 1 producing the phonetic information and Encoder 2 the speaker information, with a single decoder reconstructing from both.

Feature Disentangle - Voice Conversion. During training, the phonetic embedding and speaker embedding of the same utterance (e.g. "How are you?" or "Hello") are recombined by the decoder to reconstruct it. At conversion time, swap the parts: combine the phonetic embedding of one speaker's "How are you?" with the speaker embedding taken from another speaker's "Hello"; the decoder then outputs "How are you?" in the second speaker's voice.

Feature Disentangle - Voice Conversion. The same sentence has a different impact when it is said by different people: "Do you want to study a PhD?" asked by a fellow student gets "Go away!", but asked in the voice of 新垣結衣 (Aragaki Yui) it lands rather differently.

Feature Disentangle - Adversarial Training. A speaker classifier (the discriminator) tries to tell which speaker an utterance came from using only the phonetic part of the embedding; the encoder learns to fool the speaker classifier, so speaker information is squeezed out of that part. The speaker classifier and the encoder are learned iteratively.
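The two objectives in this adversarial game can be sketched as plain loss functions. The linear classifier, the λ weight, and the toy embeddings below are assumptions for illustration; real systems train both networks with gradients, often via a gradient-reversal layer or alternating updates.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def speaker_clf_loss(content_emb, W_clf, speaker_id):
    """Cross-entropy of the speaker classifier on the *content* embedding.
    The classifier minimizes this; the encoder is trained to maximize it."""
    probs = softmax(W_clf @ content_emb)
    return -np.log(probs[speaker_id] + 1e-9)

def encoder_loss(x, x_recon, content_emb, W_clf, speaker_id, lam=0.1):
    """Encoder objective (sketch): reconstruct well while fooling the speaker
    classifier, pushing speaker info out of the content embedding."""
    recon = np.mean((x - x_recon) ** 2)
    return recon - lam * speaker_clf_loss(content_emb, W_clf, speaker_id)

# Toy check: with identical reconstruction, a content embedding that leaks
# speaker identity (easy for the classifier) costs the encoder more than a
# speaker-neutral one.
x = np.ones(4)
x_recon = np.ones(4)           # perfect reconstruction in both cases
W_clf = np.eye(2)              # 2 speakers, toy linear classifier
leak_loss = encoder_loss(x, x_recon, np.array([5.0, 0.0]), W_clf, speaker_id=0)
clean_loss = encoder_loss(x, x_recon, np.array([0.0, 0.0]), W_clf, speaker_id=0)
```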

Feature Disentangle - Designed Network Architecture. Encoder 1 (the content branch) is followed by instance normalization (IN), which removes global information such as speaker characteristics. Encoder 2 (the speaker branch) feeds the decoder through adaptive instance normalization (AdaIN), which only influences global information. The decoder combines both to reconstruct "How are you?".

Feature Disentangle - Adversarial Training results. Demos: https://jjery2243542.github.io/voice_conversion_demo/ and https://b04901014.github.io/ISGAN/. Conversions go from a source speaker to a target speaker never seen during training. Thanks to Ju-chieh Chou (周儒杰) for providing the results.

Discrete Representation. A discrete latent code is easier to interpret or to cluster. One-hot: the encoder output (0.9, 0.1, 0.3, 0.7) is snapped to (1, 0, 0, 0) by keeping only the largest entry. Binary: each entry is thresholded, giving (1, 0, 0, 1). The quantization step is non-differentiable, so training needs techniques such as Gumbel-softmax (https://arxiv.org/pdf/1611.01144.pdf).
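The forward pass of both quantizers is tiny; a sketch using the slide's example vector (the backward pass, e.g. Gumbel-softmax or a straight-through estimator that copies gradients past the quantization, is what the cited paper addresses and is only noted in comments here):

```python
import numpy as np

def to_one_hot(z):
    """One-hot latent: keep only the largest entry. argmax is
    non-differentiable; training uses Gumbel-softmax or a straight-through
    estimator (gradient copied past the quantization step)."""
    out = np.zeros_like(z)
    out[np.argmax(z)] = 1.0
    return out

def to_binary(z, threshold=0.5):
    """Binary latent: threshold each entry (also non-differentiable)."""
    return (z > threshold).astype(z.dtype)

z = np.array([0.9, 0.1, 0.3, 0.7])  # encoder output from the slide
print(to_one_hot(z))  # [1. 0. 0. 0.]
print(to_binary(z))   # [1. 0. 0. 1.]
```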

Discrete Representation: Vector Quantized Variational Auto-encoder (VQVAE) (https://arxiv.org/abs/1711.00937). The encoder output vector is compared for similarity against a codebook (a set of vectors, learned from data); the most similar codebook entry (e.g. vector 3) becomes the input of the decoder. For speech, the codebook ends up representing phonetic information (https://arxiv.org/pdf/1901.08810.pdf).
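The quantization step is a nearest-neighbour lookup; a sketch with an assumed toy codebook (real VQ-VAE also needs commitment and codebook losses plus a straight-through gradient, omitted here):

```python
import numpy as np

def vector_quantize(z, codebook):
    """VQ-VAE quantization (sketch): replace the encoder output z with the
    most similar codebook vector (nearest in L2 distance); that codebook
    vector is what the decoder receives."""
    dists = np.sum((codebook - z) ** 2, axis=1)  # squared L2 to each entry
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

codebook = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.5, 0.5]])   # the learned "set of vectors" (toy values)
z = np.array([0.9, 0.1])            # toy encoder output
idx, z_q = vector_quantize(z, codebook)
print(idx)  # 0: z is closest to codebook vector [1, 0]
```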

Sequence as Embedding (https://arxiv.org/abs/1810.02851). A seq2seq2seq auto-encoder: a seq2seq generator G compresses a document into a word sequence, and a seq2seq reconstructor R rebuilds the document from it, so the latent representation is itself a sequence of words. Only a large collection of documents is needed to train the model. Could the word sequence serve as a summary? Not yet: it is not readable.

Sequence as Embedding. Add a discriminator D trained on human-written summaries to judge real or not; G learns to make the discriminator consider its output real. The latent word sequence then becomes a readable summary.

Thanks to 王耀賢 for providing the experimental results. Sequence as Embedding, examples:
Document: 澳大利亞今天與13個國家簽署了反興奮劑雙邊協議,旨在加強體育競賽之外的藥品檢查並共享研究成果…… (Australia today signed bilateral anti-doping agreements with 13 countries, aiming to strengthen out-of-competition drug testing and share research results …)
Human: 澳大利亞與13國簽署反興奮劑協議 (Australia signs anti-doping agreements with 13 countries)
Unsupervised: 澳大利亞加強體育競賽之外的藥品檢查 (Australia strengthens out-of-competition drug testing)
Document: 中華民國奧林匹克委員會今天接到一九九二年冬季奧運會邀請函,由於主席張豐緒目前正在中南美洲進行友好訪問,因此尚未決定是否派隊赴賽…… (The ROC Olympic Committee today received an invitation to the 1992 Winter Olympics; as chairman 張豐緒 is on a goodwill visit in Central and South America, it has not yet decided whether to send a team …)
Human: 一九九二年冬季奧運會函邀我參加 (We are invited by letter to the 1992 Winter Olympics)
Unsupervised: 奧委會接獲冬季奧運會邀請函 (Olympic Committee receives Winter Olympics invitation)

Sequence as Embedding, examples (including failure cases):
Document: 據此間媒體27日報道,印度尼西亞蘇門答臘島的兩個省近日來連降暴雨,洪水泛濫導致塌方,到26日為止至少已有60人喪生,100多人失蹤…… (Local media reported on the 27th that days of heavy rain in two provinces of Indonesia's Sumatra island caused flooding and landslides; by the 26th at least 60 people had died and more than 100 were missing …)
Human: 印尼水災造成60人死亡 (Indonesian floods kill 60)
Unsupervised: 印尼門洪水泛濫導致塌雨 (ungrammatical: roughly "Indonesia flooding causes collapse-rain")
Document: 安徽省合肥市最近為領導幹部下基層做了新規定:一律輕車簡從,不準搞迎來送往、不準搞層層陪同…… (Hefei, Anhui Province, recently set new rules for officials visiting the grassroots: travel light, no welcome-and-send-off ceremonies, no layers of accompaniment …)
Human: 合肥規定領導幹部下基層活動從簡 (Hefei rules that officials' grassroots visits be kept simple)
Unsupervised: 合肥領導幹部下基層做搞迎來送往規定:一律簡 (an ungrammatical rearrangement of the source)

Tree as Embedding: https://arxiv.org/abs/1806.07832 (talk: https://vimeo.com/285800885) and https://arxiv.org/abs/1904.03746

Concluding Remarks. An auto-encoder (NN Encoder → code → NN Decoder, trained to make input and reconstruction as close as possible) can be extended in two directions: more than minimizing reconstruction error (using a discriminator; sequential data) and more interpretable embeddings (feature disentangle; discrete and structured representations).