1 Introduction Prof. Lin-Shan Lee TA: Chun-Hsuan Wang.

Presentation transcript:

Outline
Project Introduction
Linux and Bash Introduction
Feature Extraction
Homework

Project Introduction

Stage 1 Project
Goal: by building a basic large-vocabulary speech recognition system, students gain a concrete understanding of speech recognition and a foundation for studying more advanced techniques.
Input Speech → Speech Recognition System → Output Sentence (example: 今天, "today")
We are going to learn what is inside the black box of the speech recognition system.

Speech Recognition System
Conventional ASR (Automatic Speech Recognition) system:
Input Speech → Front-end Signal Processing → Feature Vectors → Linguistic Decoding and Search Algorithm → Output Sentence (example: 今天, "today")
Speech Corpora → Acoustic Model Training → Acoustic Model
Text Corpora → Language Model Construction → Language Model
Lexicon
The Acoustic Model, Language Model, and Lexicon are the knowledge sources used by the decoder.
Deep learning based ASR system

Speech Recognition System
Conventional ASR system: widely used in commercial systems.
Deep learning based ASR system: still an active research topic.
Both will be implemented in this project with the Kaldi toolkit, one of the most widely used ASR toolkits.

Schedule
Week | Progress | Report Group
1 | Introduction + Linux intro + Feature extraction |
2 | Acoustic model training: monophone & triphone |
3 | Language model training + Decoding | A
4 | Live demo | B
5 | Deep Neural Network |
6 | Progress Report |
7 | ... | ...
Stage 1 …. Stage 2

Speech Recognition System
[Figure: the conventional ASR block diagram (Input Speech, Front-end Signal Processing, Feature Vectors, Linguistic Decoding and Search Algorithm, Output Sentence, Acoustic Model Training, Language Model Construction, Lexicon) with each part labeled by the week that covers it, Week 1 through Week 4; the deep learning based ASR system is labeled Week 5.]

How to do recognition?
How to map speech O to a word sequence W?
P(O|W): acoustic model
P(W): language model
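The two models on this slide are usually combined through a maximum a posteriori decision rule. The statement below is the standard textbook formulation, not a formula recovered from the slide; the symbol $W^{*}$ for the recognized word sequence is introduced here for illustration.

```latex
\begin{aligned}
W^{*} &= \arg\max_{W} P(W \mid O) \\
      &= \arg\max_{W} \frac{P(O \mid W)\, P(W)}{P(O)} \\
      &= \arg\max_{W} P(O \mid W)\, P(W)
\end{aligned}
```

The last step drops $P(O)$ because it does not depend on $W$, which is why only the acoustic model $P(O \mid W)$ and the language model $P(W)$ are needed during decoding.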

Language model P(W)
W = w_1, w_2, w_3, …, w_n
$$P(W) = P(w_1)\, P(w_2 \mid w_1) \prod_{i=3}^{n} P(w_i \mid w_{i-2}, w_{i-1})$$

Language model examples
log Prob: probability in log scale
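Storing probabilities in log scale turns the product in a trigram language model into a sum. The sketch below assumes base-10 logarithms and uses made-up log probabilities for a three-word sentence purely for illustration; none of the numbers come from the slides.

```latex
\begin{aligned}
\log_{10} P(W) &= \log_{10} P(w_1) + \log_{10} P(w_2 \mid w_1)
                 + \sum_{i=3}^{n} \log_{10} P(w_i \mid w_{i-2}, w_{i-1}) \\[4pt]
\text{e.g. } n = 3:\quad
\log_{10} P(W) &= (-1.0) + (-0.5) + (-0.3) = -1.8
\quad\Rightarrow\quad P(W) = 10^{-1.8} \approx 0.0158
\end{aligned}
```

Working with sums of log probabilities also avoids numerical underflow when many small probabilities are multiplied together.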

Acoustic Model P(O|W)
Model of a phone
Markov Model
Gaussian Mixture Model

Lexicon

Linux and Bash Introduction

Vim
To create a file: vim hello.txt
Once inside, type "i" to enter insert mode, then type whatever text you want.
Press ESC to return to normal mode, where you can:
type "/word" to search for a word
type ":w" to save
type ":wq" to save and quit

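Screen
In brief: to keep a program from failing halfway because your connection dropped, you can use screen. Basic usage:
1. After logging in, type "screen" to enter a screen session. Everything works the same as in a normal shell.
2. To close the screen session, type "exit".
3. If a program is still running and you want to leave without stopping it, press "Ctrl + a" and then "d" to detach (the program keeps running even if you log out and shut down your own computer).
4. The next time you log in, type "screen -r" to return to the screen session you left open.
5. "screen -r" may list several detached sessions; enter the screen id you want (larger ids are usually newer).
This way your jobs keep running even after you turn off your computer. You can also use tmux, which is like screen with more features.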

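Linux Shell Script Basics
echo "Hello" (prints Hello on the screen)
a=ABC (assigns ABC to a; no spaces around =)
echo $a (prints ABC on the screen)
b=$a.log (assigns ABC.log to b)
echo $b > testfile (writes "ABC.log" to testfile; note that cat $b > testfile would instead copy the contents of the file ABC.log into testfile)
command -h (prints the help information for most commands)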
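The basics above can be tried end to end with a short script. This is a minimal sketch; the scratch directory, the file name copyfile, and the sample text are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello"              # prints Hello
a=ABC                     # assignment: no spaces around '='
echo "$a"                 # prints ABC
b=$a.log                  # b is now the string ABC.log
echo "$b" > testfile      # testfile now holds the text ABC.log
cat testfile              # prints ABC.log

# cat reads a file's contents, so create the file ABC.log to see the difference.
echo "contents of the log" > "$b"
cat "$b" > copyfile       # copyfile holds the contents of ABC.log, not its name
cat copyfile              # prints: contents of the log
```

Quoting variables as "$a" is not required for these values, but it keeps the script safe when a value contains spaces.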

Bash Example

Bash script
[ condition ] uses 'test' to check. Ex. test -e ~/tmp; echo $?
File: [ -e filename ]
-e does the file name exist?
-f does the file name exist and is it a regular file?
-d does the file name exist and is it a directory?
Number: [ n1 -eq n2 ]
-eq equal (n1==n2)
-ne not equal (n1!=n2)
-gt greater than (n1>n2)
-lt less than (n1<n2)
-ge greater than or equal (n1>=n2)
-le less than or equal (n1<=n2)
SPACES COUNT!!!!
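The file and number tests can be checked with a small script like the one below. It is a sketch under assumed names: the scratch file notes.txt and the values of n1 and n2 are made up for illustration. An exit status of 0 means the test was true.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
touch "$workdir/notes.txt"          # an empty regular file to test against

test -e "$workdir/notes.txt"; exists_status=$?   # 0: the name exists
[ -f "$workdir/notes.txt" ];  file_status=$?     # 0: it is a regular file
[ -d "$workdir/notes.txt" ];  dir_status=$?      # 1: it is not a directory
echo "exists=$exists_status file=$file_status dir=$dir_status"

n1=3
n2=5
if [ "$n1" -lt "$n2" ]; then
    echo "$n1 is less than $n2"
fi
if [ "$n1" -ge "$n2" ]; then
    echo "$n1 is greater than or equal to $n2"
else
    echo "$n1 is not greater than or equal to $n2"
fi
```

Writing [ -f file] or [-f file ] without the surrounding spaces fails, because [ is itself a command and ] is its last argument.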

Bash script
Logic
-a and
-o or
! negation
[ "$yn" == "Y" -o "$yn" == "y" ]
[ "$yn" == "Y" ] || [ "$yn" == "y" ]
Don't forget the spaces and the double quotes!!!!
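Both ways of writing "or" can be compared side by side. In this sketch the function names is_yes and is_yes_single are hypothetical helpers introduced for illustration. The two-bracket form with || is generally considered the more robust choice, since POSIX marks -a and -o as obsolescent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints yes when the answer is Y or y, otherwise no.
is_yes() {
    local yn=$1
    if [ "$yn" == "Y" ] || [ "$yn" == "y" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# Same check written with -o inside a single pair of brackets.
is_yes_single() {
    local yn=$1
    if [ "$yn" == "Y" -o "$yn" == "y" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

is_yes Y            # yes
is_yes n            # no
is_yes ""           # no: the double quotes keep an empty value from breaking the test
is_yes_single y     # yes

# Negation with !
if [ ! -e /nonexistent/path ]; then
    echo "path does not exist"
fi
```

Without the double quotes, an empty $yn would turn the test into [ == "Y" ], which is a syntax error rather than a false result.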

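Bash script
` operation (command substitution: the command between backticks is run and replaced by its output)
echo `ls`
my_date=`date`
echo $my_date
&& || ; operation
echo hello || echo no~ (the second command runs only if the first fails)
echo hello && echo no~ (the second command runs only if the first succeeds)
[ -f tmp ] && cat tmp || echo "file not found"
[ -f tmp ] ; cat tmp ; echo "file not found" (all three commands run regardless of the results)
Some useful commands: grep, sed, touch, awk, ln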
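The behaviour of backticks, &&, and || can be observed by capturing the output of each line. This is a minimal sketch; the scratch directory and the variable names out_or, out_and, msg_missing, and msg_present are assumptions for illustration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

my_date=`date`                 # backticks: run date and keep its output
same_idea=$(date)              # $(...) does the same and is easier to nest
echo "$my_date"

out_or=$(echo hello || echo no~)     # first command succeeds, second is skipped
out_and=$(echo hello && echo no~)    # first command succeeds, second also runs
echo "$out_or"
echo "$out_and"

# The file tmp does not exist yet, so the || branch runs.
msg_missing=$([ -f tmp ] && cat tmp || echo "file not found")
echo "$msg_missing"

# Once tmp exists, the && branch runs and || is skipped.
echo "data" > tmp
msg_present=$([ -f tmp ] && cat tmp || echo "file not found")
echo "$msg_present"
```

Note that A && B || C is not a full if/else: C also runs when A succeeds but B fails.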

Bash script
Pipeline: program1 | program2 | program3
echo "hello" | tee log
More information about pipelines: http://www.gnu.org/software/bash/manual/html_node/Pipelines.html
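A short script shows both the tee example and a three-stage pipeline. The sample words and the variable name first_unique are assumptions for illustration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# tee copies its input to the screen and to the file log at the same time.
echo "hello" | tee log

# Three programs chained together: sort the lines, drop duplicates,
# then keep only the first remaining line.
first_unique=$(printf 'b\na\nb\n' | sort | uniq | head -n 1)
echo "$first_unique"
```

This is the same pattern as running a script with "| tee logfile": the output stays visible on the screen while a copy is saved for later inspection.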

Bash script
Input / output for bash:
cmd > logfile # send stdout to logfile; stderr is printed on the screen
cmd > logfile 2>&1 # send both stdout and stderr to logfile
cmd <inputfile 2>errorfile | grep stdoutfile # read stdin from inputfile, send stderr to errorfile, and pipe stdout to grep (here "stdoutfile" is the pattern grep searches for)
More information about bash input/output: http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_08_02.html
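The three redirection forms can be tested with stand-in commands. In this sketch the functions cmd and reader, the file name both, and the sample text are hypothetical; they only exist to produce output on stdout and stderr.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for "cmd": one line on stdout, one line on stderr.
cmd() {
    echo "to stdout"
    echo "to stderr" >&2
}

cmd > logfile            # logfile gets stdout; stderr still appears on the screen
cmd > both 2>&1          # both gets stdout and stderr

# Hypothetical command that copies stdin to stdout and warns on stderr.
reader() {
    cat
    echo "warning" >&2
}

printf 'alpha\nbeta\n' > inputfile
# stdin comes from inputfile, stderr goes to errorfile, stdout is piped to grep.
matched=$(reader < inputfile 2> errorfile | grep beta)
echo "$matched"
```

The order matters in the second form: 2>&1 must come after > both, because it means "send stderr to wherever stdout points right now".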

Feature Extraction: 02.extract.feat.sh
[Figure: the conventional ASR block diagram again; this section covers the front-end path, Input Speech → Front-end Signal Processing → Feature Vectors.]

Feature Extraction - MFCC

MFCC (Mel-frequency cepstral coefficients)
13-dimensional vector
Reference: Chapter 2 of the Digital Speech Processing course (數位語音)

Extract Feature (02.extract.feat.sh)
Input: Training Set, Development Set, Testing Set
Output: Archive directory

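Kaldi rspecifier & wspecifier format
ark:<ark file>
An archive that bundles many small files; it may be a collection of wav files, MFCC files, or statistics.
scp:<scp file>
A table of file locations; entries may point to individual files (like our material/train.wav.scp) or to positions inside an ark file.
ark,t:<ark file>
Writes the ark in text form. On input, ,t has no effect; without ,t, output defaults to binary format.
ark,scp:<ark file>,<scp file>
Writes an ark file and an scp file at the same time.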
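An scp file is plain text with one entry per line: a key (usually an utterance id) followed by a location. The sketch below builds a toy table and reads it back without using any Kaldi tool; the utterance ids, the paths, and the file name toy.wav.scp are made up for illustration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# Toy scp table: each line is "<key> <location>".
# An entry that points inside an archive would instead look like
#   utt001 /some/dir/feats.ark:12
# where the number after the colon is a byte offset into the ark file.
cat > "$workdir/toy.wav.scp" <<'EOF'
utt001 /data/wav/utt001.wav
utt002 /data/wav/utt002.wav
EOF

num_utts=0
while read -r key location; do
    echo "key=$key location=$location"
    num_utts=$((num_utts + 1))
done < "$workdir/toy.wav.scp"
echo "$num_utts utterances listed"
```

Because the table is ordinary text, tools such as grep, head, and wc are handy for inspecting which utterances a Kaldi command will process.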

Extract Feature (02.extract.feat.sh)
compute-mfcc-feats
add-deltas
compute-cmvn-stats
apply-cmvn

MFCC - Add delta
add-deltas: Deltas and Delta-Deltas
Appends the Δ and ΔΔ of the MFCCs (roughly their first and second derivatives) to the features, bringing the total dimension to 39.
Usage: add-deltas [input] [add_deltas]

MFCC - CMVN
CMVN: Cepstral Mean and Variance Normalization

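MFCC - CMVN
compute-cmvn-stats
Usage: compute-cmvn-stats [add_deltas] [compute_result]
apply-cmvn
Usage: apply-cmvn [compute_result] [add_deltas] [output]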

Hint (Important!!)
The output of compute-mfcc-feats is ark:$path/$target.13.ark
add-deltas [input] [add_deltas]
[input] = ark:$path/$target.13.ark
compute-cmvn-stats [add_deltas] [compute_result]
apply-cmvn [compute_result] [add_deltas] [output]
[output] MUST BE ark,t,scp:$path/$target.39.cmvn.ark,$path/$target.39.cmvn.scp
rm -f [add_deltas] [compute_result]
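The hint can be read as a four-command chain followed by a cleanup. The sketch below only prints the commands unless RUN=1 is set and the Kaldi binaries are on the PATH. The values of path and target, the intermediate file names $target.39.ark and $target.cmvn, and the run helper are assumptions for illustration, not the course's reference solution.

```shell
#!/usr/bin/env bash
# Assumed values; in the course script $path and $target are set elsewhere.
path=feat
target=train
wav_scp=material/$target.wav.scp

steps=0
# Hypothetical helper: print each command, and execute it only when RUN=1
# and the program is actually installed.
run() {
    echo "+ $*"
    steps=$((steps + 1))
    if [ "${RUN:-0}" = "1" ] && command -v "$1" > /dev/null 2>&1; then
        "$@"
    fi
}

# 13-dimensional MFCCs from the wav list.
run compute-mfcc-feats "scp:$wav_scp" "ark:$path/$target.13.ark"
# Append deltas and delta-deltas: 13 -> 39 dimensions.
run add-deltas "ark:$path/$target.13.ark" "ark:$path/$target.39.ark"
# Collect CMVN statistics from the 39-dimensional features.
run compute-cmvn-stats "ark:$path/$target.39.ark" "ark:$path/$target.cmvn"
# Apply the statistics; the output specifier is the one required by the hint.
# (apply-cmvn normalizes the mean by default; --norm-vars=true adds variance.)
run apply-cmvn "ark:$path/$target.cmvn" "ark:$path/$target.39.ark" \
    "ark,t,scp:$path/$target.39.cmvn.ark,$path/$target.39.cmvn.scp"
# Remove the intermediate files (plain paths here, without the ark: prefix).
run rm -f "$path/$target.39.ark" "$path/$target.cmvn"
```

Running it without RUN=1 is a safe way to check that the specifiers line up before spending time on real feature extraction.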

Homework
Linux, background knowledge
01.format.sh, 02.extract.feat.sh

Homework
If you have no experience with Linux, please study the basic Linux commands in advance.
鳥哥的Linux 私房菜 (VBird's Linux guide)
Chapter 7, Linux file and directory management: http://linux.vbird.org/linux_basic/0220filemanager.php
Chapter 10, the vim editor: http://linux.vbird.org/linux_basic/0310vi.php

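Homework (optional)
Reading: Chapter 3 of the thesis "使用加權有限狀態轉換器的基於混合詞與次詞以文字及語音指令偵測口語詞彙" (roughly: spoken term detection with text and spoken queries, based on hybrid words and subwords, using weighted finite-state transducers)
https://www.dropbox.com/s/dsaqh6xa9dp3dzw/wfst_thesis.pdf
Kaldi documentation: http://kaldi-asr.org/doc/tools.html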

Login Workstation
With pietty/putty/Xshell: ssh to 140.112.21.80, port 22
From a terminal: ssh -p 22 username@140.112.21.80

Data
Copy the archive into your home directory:
cp /share/proj1.ASTMIC.subset.tar.gz ~/.
Extract it:
tar -zxvf proj1.ASTMIC.subset.tar.gz

To Do
Step 1: Execute the following commands:
script/01.format.sh | tee log/01.format.log
script/02.extract.feat.sh | tee log/02.extract.feat.sh.log
Step 2: Add-delta, CMVN
Observe the output and report

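Workstation notes
Please avoid repeatedly and aggressively querying external sites or scraping data from your programs. If the Computer Center detects this kind of behavior, it will ban the IP and nobody will be able to connect to the workstation.
If the corpus you need for training takes more than 50 GB, please email me so that disk usage on the project workstation can be kept under control.
Computing resources on the workstation are limited, so please do not use it to train models for personal coursework; leave the resources for project research.
In this project, the Week 3 and Week 5 experiments need a lot of computing resources and time. Please start early. If everyone leaves the work to the last day or two, the limited resources will cause all jobs to stall and nobody will be able to run experiments.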

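Workstation notes
Several versions of the CUDA library are installed. If you need a particular version, add it to your path yourself; if the version you need is not installed, email me and I will install it.
For the stage 2 project, we recommend using a virtual environment such as virtualenv or conda. You can also use pip install --user to install the packages you need locally.
Please do not run interactive programs such as ipython on the workstation; they eat up everyone's resources. Run Python files directly instead.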

Other notes
Problems about the project:
Facebook Group: 數位語音專題
DSP Website: http://speech.ee.ntu.edu.tw/courses.html
Week 1 TA: 王君璇 (Chun-Hsuan Wang) r07942076@ntu.edu.tw
Problems about the workstation:
Workstation TA: 王君璇 (Chun-Hsuan Wang) r07942076@ntu.edu.tw

Other notes
Please be sure to fill in your personal information at: https://goo.gl/Qm81M2
All announcements will be posted in the Facebook group and also emailed to everyone.