More About Auto-encoder


1 More About Auto-encoder
Hung-yi Lee 李宏毅

2 Embedding, Latent Representation, Latent Code
Auto-encoder: NN Encoder → vector (the embedding, latent representation, or latent code) → NN Decoder, with the output as close as possible to the input. This lecture goes beyond that in two directions: more than minimizing reconstruction error, and more interpretable embeddings.
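As a concrete baseline for everything that follows, the vanilla "as close as possible" objective can be sketched as a tiny linear auto-encoder trained on reconstruction error alone. The data, layer sizes, and learning rate below are arbitrary toy choices, not anything from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))          # 100 toy inputs, 8 dimensions each

d = 2                                  # size of the embedding / latent code
W_enc = rng.normal(scale=0.1, size=(8, d))   # "NN Encoder" (one linear layer)
W_dec = rng.normal(scale=0.1, size=(d, 8))   # "NN Decoder" (one linear layer)

lr = 0.05
for _ in range(500):
    Z = X @ W_enc                      # embedding (latent representation)
    X_hat = Z @ W_dec                  # reconstruction
    err = X_hat - X                    # want this as close to zero as possible
    # gradient steps on the mean squared reconstruction error
    W_dec -= lr * Z.T @ err / len(X)
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)

loss = np.mean((X @ W_enc @ W_dec - X) ** 2)
```

The reconstruction error drops well below the raw variance of the data, but nothing in this objective forces Z to be representative or interpretable, which is exactly the gap the rest of the lecture addresses.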

3 What is a good embedding?
An embedding should represent the object: an object together with its own embedding is a matched pair; an object together with another object's embedding is not a pair.

4 Beyond Reconstruction
How to evaluate an encoder? Train a discriminator: a binary classifier with parameters 𝜙 that takes an image together with an embedding and answers yes/no. It should say "yes" for an image paired with its own NN Encoder output, and "no" for a mismatched image/embedding pair. The loss of this classification task is 𝐿_𝐷; train 𝜙 to minimize it: 𝐿_𝐷* = min_𝜙 𝐿_𝐷. If the embeddings are representative, 𝐿_𝐷* is small. (In a real implementation, do we split the pairs into training and test sets?)

5 Beyond Reconstruction
How to evaluate an encoder? (cont.) Small 𝐿_𝐷* means the embeddings are representative. Large 𝐿_𝐷* means even the best discriminator cannot tell matched pairs from mismatched ones, so the embeddings are not representative.

6 Beyond Reconstruction
Now also train the encoder parameters 𝜃 to minimize 𝐿_𝐷*: 𝜃* = arg min_𝜃 𝐿_𝐷* = arg min_𝜃 min_𝜙 𝐿_𝐷. In other words, train the encoder 𝜃 and the discriminator 𝜙 together to minimize 𝐿_𝐷 (does it have to be trained iteratively?). This is the idea behind Deep InfoMax (DIM) (c.f. training an encoder and decoder to minimize reconstruction error).
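The joint objective arg min_𝜃 min_𝜙 𝐿_𝐷 can be sketched on toy data. This is not DIM's actual objective or architecture; it is just the slide's idea with an assumed linear encoder and an assumed bilinear discriminator, trained jointly rather than iteratively:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))                  # toy "images"

W_theta = rng.normal(scale=0.5, size=(6, 3))   # encoder parameters (theta)
A_phi = rng.normal(scale=0.1, size=(6, 3))     # discriminator parameters (phi)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-np.clip(a, -30, 30)))

lr = 0.1
for _ in range(400):
    Z = X @ W_theta                              # embeddings
    Z_neg = Z[rng.permutation(len(X))]           # mismatched image/embedding pairs
    s_pos = np.sum((X @ A_phi) * Z, axis=1)      # score for matched pairs ("yes")
    s_neg = np.sum((X @ A_phi) * Z_neg, axis=1)  # score for mismatches ("no")
    p_pos, p_neg = sigmoid(s_pos), sigmoid(s_neg)
    L_D = -np.mean(np.log(p_pos)) - np.mean(np.log(1.0 - p_neg))
    # one gradient step on BOTH phi and theta, both minimizing L_D
    g_pos = (p_pos - 1.0) / len(X)
    g_neg = p_neg / len(X)
    A_phi -= lr * (X.T @ (g_pos[:, None] * Z) + X.T @ (g_neg[:, None] * Z_neg))
    # encoder gradient (through the matched pairs only, for brevity)
    W_theta -= lr * X.T @ (g_pos[:, None] * (X @ A_phi))
```

After training, 𝐿_𝐷 sits below the chance-level value of 2 log 2 ≈ 1.386: the discriminator can tell matched pairs from mismatched ones, i.e. the embeddings carry information about their inputs.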

7 Typical auto-encoder is a special case
Build the discriminator as follows: take the vector, run it through an NN Decoder, compare the result with the image, and output the score −(reconstruction error). Training this encoder/discriminator pair to make input and reconstruction as close as possible is then exactly the typical auto-encoder.

8 Sequential Data: Skip thought and Quick thought
A document is a sequence of sentences. Skip thought: encode the current sentence so that decoders can predict the previous and the next sentence. Quick thought: encode the current sentence and a set of candidates (the real next sentence plus random sentences), then classify which candidate is the real next one; "In our experiments, c is simply defined to be an inner product c(u, v) = uᵀv".
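The quick-thought scorer quoted above, c(u, v) = uᵀv, amounts to an inner product followed by a softmax over the candidate next sentences. A minimal sketch with made-up embedding vectors (in the real model, u and the rows of V would come from the sentence encoder):

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.normal(size=(4,))        # embedding of the current sentence
V = rng.normal(size=(5, 4))      # embeddings of 5 candidate next sentences

scores = V @ u                   # c(u, v) = u^T v for every candidate
probs = np.exp(scores - scores.max())
probs = probs / probs.sum()      # softmax over the candidates
predicted = int(probs.argmax()) # index of the predicted next sentence
```

Training pushes the true next sentence's probability up, so the encoder must place consecutive sentences close in embedding space; unlike skip thought, no word-by-word decoder is needed.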

9 Sequential Data Contrastive Predictive Coding (CPC)

10 Embedding, Latent Representation, Latent Code
Auto-encoder recap: NN Encoder → vector (embedding, latent representation, latent code) → NN Decoder, output as close as possible to the input. Having gone beyond minimizing reconstruction error, the next goal is more interpretable embeddings.

11 Feature Disentangle
An object contains information about multiple aspects. Audio: an Encoder/Decoder pair reconstructs the input audio, and the latent code includes phonetic information, speaker information, etc. Text: reconstructing an input sentence gives a code that includes syntactic information, semantic information, etc.

12 Feature Disentangle
Goal: separate those aspects in the code. One design: a single encoder whose output vector is split into a phonetic-information part and a speaker-information part, both fed to the decoder to reconstruct the input audio. Another: two encoders, Encoder 1 extracting phonetic information and Encoder 2 extracting speaker information, with the decoder reconstructing the input from both.

13 Feature Disentangle - Voice Conversion
Training: for each utterance ("How are you?" from one speaker, "Hello" from another), the encoder splits the code into content and speaker parts, and the decoder reconstructs the original utterance from them.

14 Feature Disentangle - Voice Conversion
Conversion: keep the phonetic part of "How are you?" from the first speaker, but swap in the speaker part extracted from the second speaker's "Hello". The decoder then outputs "How are you?" in the second speaker's voice.

15 Feature Disentangle - Voice Conversion
Why is this useful? The same sentence has a different impact when said by different people. "Do you want to study a PhD?" asked by a fellow student: "Go away!" The same question asked in the voice of 新垣結衣 (Aragaki Yui): quite a different reaction.

16 Feature Disentangle - Adversarial Training
Add a speaker classifier (a discriminator) on top of the content embedding. The encoder learns to fool the speaker classifier, so the content embedding of "How are you?" ends up carrying no speaker information while the decoder still reconstructs the utterance. The speaker classifier and the encoder are learned iteratively.
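A toy sketch of this adversarial idea, assuming a linear encoder and a logistic-regression speaker classifier on synthetic features in which one dimension leaks speaker identity. All of these choices are illustrative, not the lecture's actual models, and the reconstruction loss is omitted:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
speaker = rng.integers(0, 2, size=n).astype(float)  # two toy speakers
X = rng.normal(size=(n, 5))                 # toy "audio" features
X[:, 4] = speaker * 2.0 - 1.0               # dimension 4 leaks speaker identity

W_enc = rng.normal(scale=0.1, size=(5, 3))  # content encoder
w_cls = np.zeros(3)                         # speaker classifier (discriminator)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-np.clip(a, -30, 30)))

lr = 0.1
for _ in range(300):
    Z = X @ W_enc                           # content embedding
    p = sigmoid(Z @ w_cls)                  # P(speaker = 1 | embedding)
    g = (p - speaker) / n
    # classifier step: learn to PREDICT the speaker from the embedding
    w_cls -= lr * Z.T @ g
    # encoder step: learn to FOOL it (ascent on the same classification loss)
    W_enc += lr * X.T @ np.outer(g, w_cls)
```

At equilibrium the classifier hovers near chance, meaning the embedding no longer carries speaker information; in the lecture's full setting, the reconstruction loss simultaneously keeps the phonetic content in the embedding.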

17 Feature Disentangle - Designed Network Architecture
Alternatively, design the architecture itself to disentangle. Put instance normalization (IN) layers inside Encoder 1: IN removes global information, stripping speaker characteristics from the content embedding of "How are you?". Encoder 2 supplies the speaker information, and the decoder combines the two to reconstruct the utterance.

18 Feature Disentangle - Designed Network Architecture
In addition, inject Encoder 2's output into the decoder through AdaIN layers. IN = instance normalization (removes global information); AdaIN = adaptive instance normalization (only influences global information). Local content thus comes from Encoder 1's IN-normalized output, while global speaker characteristics enter only through AdaIN.
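IN and AdaIN are easy to state precisely. A sketch assuming features shaped (time, channels), where "global information" means per-utterance channel statistics, i.e. the mean and standard deviation over time:

```python
import numpy as np

def instance_norm(h, eps=1e-5):
    # remove global information: per-channel mean/std over time are
    # normalized away, only the local variation (content) survives
    mu = h.mean(axis=0, keepdims=True)
    sigma = h.std(axis=0, keepdims=True)
    return (h - mu) / (sigma + eps)

def ada_in(h, gamma, beta):
    # only influence global information: re-inject channel-wise scale and
    # shift (supplied, in the lecture's design, by Encoder 2's output)
    return gamma * instance_norm(h) + beta

rng = np.random.default_rng(4)
content = rng.normal(loc=3.0, scale=2.0, size=(50, 8))   # (time, channels)
normed = instance_norm(content)                # global statistics removed
styled = ada_in(content, gamma=np.full(8, 0.5), beta=np.full(8, -1.0))
```

After IN every channel has zero mean and unit variance regardless of who spoke; AdaIN then sets those statistics to whatever the speaker encoder dictates, without touching the temporal (content) structure.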

19 Feature Disentangle - Adversarial Training
Demo: https://jjery2243542.github.io/voice_conversion_demo/
Target speaker, source speaker, and source-to-target conversions (the source/target pair was never seen during training!). Thanks to Ju-chieh Chou for providing the results.

20 Discrete Representation
A discrete latent code is easier to interpret or to cluster. One-hot: an NN Encoder output such as [0.9, 0.1, 0.3, 0.7] is snapped to [1, 0, 0, 0] before the NN Decoder. Binary: the same output is thresholded to [1, 0, 0, 1]. The snapping step is non-differentiable, so the encoder cannot be trained by plain backpropagation.

21 Discrete Representation
Vector Quantized Variational Auto-encoder (VQVAE). Keep a codebook, a set of vectors (vector 1, vector 2, ..., vector 5) learned from data. The NN Encoder outputs a vector; compute its similarity to every codebook entry, and the most similar one (say, vector 3) becomes the input of the NN Decoder. For speech, the codebook tends to capture phonetic information.
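The codebook lookup can be sketched directly. Here "similarity" is taken to be negative squared Euclidean distance (a common choice, though the slide does not pin it down), and the codebook values are random stand-ins for learned ones:

```python
import numpy as np

rng = np.random.default_rng(5)
codebook = rng.normal(size=(5, 4))    # vector 1 ... vector 5 (learned in VQVAE)
z_e = rng.normal(size=(10, 4))        # NN Encoder outputs for 10 frames

# squared distance from every encoder output to every codebook vector
d2 = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
indices = d2.argmin(axis=1)           # the discrete representation
z_q = codebook[indices]               # the most similar vector feeds the decoder
```

The argmin is the non-differentiable step; VQVAE trains through it with a straight-through estimator (copying the decoder's gradient from z_q back onto the encoder output) plus extra loss terms that pull the codebook vectors toward the encoder outputs.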

22 Sequence as Embedding
This is a seq2seq2seq auto-encoder: a Seq2seq generator G compresses a document into a word sequence, the latent representation, and a Seq2seq reconstructor R rebuilds the document from that word sequence. Only a lot of documents are needed to train the model. Could the word sequence serve as a summary? Not yet: it is not readable.

23 Sequence as Embedding
Add a discriminator D, trained on human-written summaries, that judges whether a word sequence is real or not; G must make the discriminator consider its output real. The word sequence between G and R then becomes a readable summary.

24 Sequence as Embedding (examples; thanks to Yau-Shian Wang [王耀賢] for providing the results)
Document: 澳大利亞今天與13個國家簽署了反興奮劑雙邊協議,旨在加強體育競賽之外的藥品檢查並共享研究成果 …… (Australia today signed bilateral anti-doping agreements with 13 countries, aiming to strengthen out-of-competition drug testing and to share research results ...)
Summary:
Human: 澳大利亞與13國簽署反興奮劑協議 (Australia signs anti-doping agreements with 13 countries)
Unsupervised: 澳大利亞加強體育競賽之外的藥品檢查 (Australia strengthens out-of-competition drug testing)
Document: 中華民國奧林匹克委員會今天接到一九九二年冬季奧運會邀請函,由於主席張豐緒目前正在中南美洲進行友好訪問,因此尚未決定是否派隊赴賽 …… (The ROC Olympic Committee today received an invitation to the 1992 Winter Olympics; as chairman 張豐緒 is currently on a goodwill visit to Central and South America, it has not yet decided whether to send a team ...)
Human: 一九九二年冬季奧運會函邀我參加 (We are invited by letter to the 1992 Winter Olympics)
Unsupervised: 奧委會接獲冬季奧運會邀請函 (Olympic committee receives Winter Olympics invitation letter)

25 Sequence as Embedding (failure cases; thanks to Yau-Shian Wang [王耀賢] for providing the results)
Document: 據此間媒體27日報道,印度尼西亞蘇門答臘島的兩個省近日來連降暴雨,洪水泛濫導致塌方,到26日為止至少已有60人喪生,100多人失蹤 …… (Local media reported on the 27th that two provinces on Indonesia's Sumatra island have had days of heavy rain; flooding caused landslides, and by the 26th at least 60 people had died and more than 100 were missing ...)
Summary:
Human: 印尼水災造成60人死亡 (Indonesian floods kill 60 people)
Unsupervised: 印尼門洪水泛濫導致塌雨 (ungrammatical output)
Document: 安徽省合肥市最近為領導幹部下基層做了新規定:一律輕車簡從,不準搞迎來送往、不準搞層層陪同 …… (Hefei, Anhui province, recently set new rules for officials visiting the grassroots: travel light, no welcome-and-send-off receptions, no layers of accompanying staff ...)
Human: 合肥規定領導幹部下基層活動從簡 (Hefei rules that officials' grassroots visits be kept simple)
Unsupervised: 合肥領導幹部下基層做搞迎來送往規定: 一律簡 (ungrammatical output)

26 Tree as Embedding https://arxiv.org/abs/1806.07832

27 Concluding Remarks
More than minimizing reconstruction error: using a discriminator (Deep InfoMax); handling sequential data (skip thought, quick thought, CPC). More interpretable embedding: feature disentangle; discrete and structured latent representations.

