Study on Speaker Recognition Based on HHT

Presentation transcript:

Study on Speaker Recognition Based on HHT. Advisors: Prof. 謝傳璋 and Prof. 王昭男. Student: 吳明弦. Date: 98/12/10

Outline: 1. Abstract 2. Instantaneous frequency 3. EMD and IMF 4. Speech signal preprocessing 5. Vector quantization 6. Conclusion 7. References

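Abstract: Speech signals are nonlinear and non-stationary, whereas traditional Fourier analysis is linear. The Hilbert transform, which applies to both linear and nonlinear signals, is therefore needed to see how the frequency content changes over time. Speaker recognition is a broad discipline closely tied to psychology, signal processing, computer science, and phonetics; it is used to enable communication between machines and people and to improve the accuracy of identity verification. The concept of empirical mode decomposition is also covered: because real-world signals are made up of multiple frequency components, the original signal is represented as a finite number of intrinsic mode functions plus a trend signal. The Hilbert transform has already been applied successfully to speaker recognition, for example in speech endpoint detection and feature extraction, which support the design of a speaker recognition system that reaches the desired recognition accuracy. It is also applied today in seismology, railway tracks, and financial management.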

Speaker recognition process: speech signal → preprocessing → feature extraction → features → comparison with speaker database → decision (yes or no)

HHT process: input data → empirical mode decomposition (EMD) by the sifting process → intrinsic mode functions (IMFs); if the residue is not yet a trend or constant, sift again → Hilbert transform → Hilbert spectrum → marginal spectrum

Fourier analysis x=0.5*sin(2*pi*15*t)+2*sin(2*pi*40*t)
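As a minimal sketch of this Fourier analysis example, the snippet below samples the test signal from the slide and reads off its single-sided amplitude spectrum with NumPy. The sampling rate (400 Hz) and duration (1 s) are assumptions, and `amplitude_spectrum` is a hypothetical helper name. Under these assumptions the two components fall exactly on FFT bins, so the peaks sit at 15 Hz and 40 Hz.

```python
import numpy as np

FS = 400.0          # assumed sampling rate in Hz
N_SAMPLES = 400     # assumed length: 1 second of data

def amplitude_spectrum(x, fs):
    """Return (frequencies, single-sided amplitudes) of a real signal."""
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = 2.0 * np.abs(np.fft.rfft(x)) / n
    amps[0] /= 2.0              # the DC bin is not doubled
    if n % 2 == 0:
        amps[-1] /= 2.0         # neither is the Nyquist bin when n is even
    return freqs, amps

t = np.arange(N_SAMPLES) / FS
x = 0.5 * np.sin(2 * np.pi * 15 * t) + 2 * np.sin(2 * np.pi * 40 * t)

freqs, amps = amplitude_spectrum(x, FS)
top_two = np.sort(np.argsort(amps)[-2:])    # indices of the two largest peaks
for i in top_two:
    print(f"{freqs[i]:.0f} Hz, amplitude {amps[i]:.2f}")
```

The spectrum shows which frequencies are present and how strong they are, but not when they occur, which is the limitation the Hilbert-based analysis addresses.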

Analytic signal

Hilbert transform
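The standard definitions behind the analytic signal and the Hilbert transform can be summarized as follows. The symbols y(t), z(t), a(t), and θ(t) are notation assumed here and may differ from the notation on the original slides.

```latex
\begin{align}
y(t) &= \mathcal{H}[x](t) = \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau \\
z(t) &= x(t) + j\,y(t) = a(t)\,e^{j\theta(t)} \\
a(t) &= \sqrt{x^{2}(t) + y^{2}(t)}, \qquad \theta(t) = \arctan\frac{y(t)}{x(t)} \\
f(t) &= \frac{1}{2\pi}\,\frac{d\theta(t)}{dt}
\end{align}
```

Here z(t) is the analytic signal, a(t) its instantaneous amplitude, θ(t) its instantaneous phase, and f(t) the instantaneous frequency in Hz.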

Instantaneous frequency, case 1: mean value = 0 (dt = 1/400)

Instantaneous frequency, case 2: mean value < 1

Instantaneous frequency, case 3: mean value > 1
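The three cases can be reproduced with a short sketch, assuming the slides follow the common illustration x(t) = a + sin(2πf0t) with a unit-amplitude sine and mean value a. The tone frequency f0 = 10 Hz, the offsets 0.5 and 1.5, and the helper names are assumptions; only dt = 1/400 comes from the slide. The analytic signal is built with an FFT, which is one standard way to compute a discrete Hilbert transform.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal x + j*H[x], obtained by zeroing negative FFT frequencies."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0                       # keep the DC term unchanged
    if n % 2 == 0:
        gain[1:n // 2] = 2.0            # double the positive frequencies
        gain[n // 2] = 1.0              # keep the Nyquist term unchanged
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def instantaneous_frequency(x, dt):
    """Instantaneous frequency in Hz from the unwrapped analytic phase."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) / (2.0 * np.pi * dt)

DT = 1.0 / 400.0        # time step from the slide
F0 = 10.0               # assumed test-tone frequency in Hz
t = np.arange(400) * DT

results = {}
for mean_value in (0.0, 0.5, 1.5):      # mean = 0, mean < 1, mean > 1
    x = mean_value + np.sin(2 * np.pi * F0 * t)
    results[mean_value] = instantaneous_frequency(x, DT)
    print(f"mean={mean_value}: "
          f"min={results[mean_value].min():.2f} Hz, "
          f"max={results[mean_value].max():.2f} Hz")
```

With zero mean the instantaneous frequency is constant at f0. With a mean below the amplitude it stays positive but fluctuates, and with a mean above the amplitude it becomes negative for part of each cycle. This is why each component must be made locally zero-mean before the Hilbert transform is applied.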

EMD: sifting process for x(t). EMD defines the oscillation modes through the characteristic time scales of the signal, using the time intervals between successive maxima and minima to analyze local properties. Sifting process for x(t): 1. Find all local maxima and minima of x(t); connect the maxima with a cubic spline to form the upper envelope and the minima with a cubic spline to form the lower envelope. 2. Take the mean of the upper and lower envelopes to obtain the mean envelope m1(t). 3. Compute h1(t) = x(t) - m1(t) as the first component; this completes the first sifting pass. If h1(t) does not satisfy the IMF conditions, keep sifting until it does.
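A single sifting pass following steps 1 to 3 can be sketched as below, using `scipy.interpolate.CubicSpline` for the envelopes. The function names, the test signal (a 40 Hz sine riding on a linear trend), and the boundary treatment (adding the first and last samples to both envelopes) are assumptions made for illustration; practical EMD implementations handle the ends more carefully and repeat the pass until the IMF conditions hold.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def sift_once(t, x):
    """One sifting pass: returns (h1, m1) with h1 = x - m1."""
    maxima, minima = local_extrema(x)
    last = len(x) - 1
    # Assumed boundary treatment: both envelopes pass through the end samples.
    upper_idx = np.concatenate(([0], maxima, [last]))
    lower_idx = np.concatenate(([0], minima, [last]))
    upper = CubicSpline(t[upper_idx], x[upper_idx])(t)   # upper envelope
    lower = CubicSpline(t[lower_idx], x[lower_idx])(t)   # lower envelope
    m1 = 0.5 * (upper + lower)                           # mean envelope
    return x - m1, m1

t = np.arange(400) / 400.0
trend = 0.5 * t
x = np.sin(2 * np.pi * 40 * t) + trend    # fast oscillation on a slow trend

h1, m1 = sift_once(t, x)
middle = slice(100, 300)                  # stay away from boundary effects
print("max |m1 - trend| in the middle:", np.max(np.abs(m1[middle] - trend[middle])))
```

Away from the ends, the mean envelope tracks the slow trend, so subtracting it leaves the fast oscillation as the first component.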

Sifting process 1. x(t)

Sifting process 2. m1(t), h1(t)

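IMF. Purpose of the sifting process: 1. Remove riding waves, so that each component has a single oscillation mode. 2. Make the waveform more symmetric, smoothing uneven amplitudes. IMF properties (each component obtained by sifting must satisfy both): 1. The number of local extrema (maxima and minima) equals the number of zero crossings, or differs from it by at most one. 2. The mean of the envelope defined by the local maxima and the envelope defined by the local minima is zero.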
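The first IMF property is easy to check numerically. The sketch below counts extrema and zero crossings on sampled data and tests whether they differ by at most one. The function names and the two test signals are assumptions, and the second property (zero-mean envelopes) is not checked here.

```python
import numpy as np

def count_extrema(x):
    """Number of interior local extrema (sign changes of the slope)."""
    d = np.diff(x)
    return int(np.sum(d[:-1] * d[1:] < 0))

def count_zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return int(np.sum(x[:-1] * x[1:] < 0))

def satisfies_extrema_condition(x):
    """IMF property 1: extrema and zero-crossing counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1

t = np.arange(400) / 400.0
sine = np.sin(2 * np.pi * 5 * t + 0.3)    # zero-mean oscillation
riding = sine + 2.0                       # same wave riding on an offset

print(satisfies_extrema_condition(sine))    # extrema match zero crossings
print(satisfies_extrema_condition(riding))  # extrema but no zero crossings
```

The offset signal never crosses zero although it still has extrema, which is the situation the sifting process is meant to remove.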

Hilbert Spectrum

Speech signal production: voiced excitation (periodic impulses) or unvoiced excitation (aperiodic) → vocal tract → speech signal

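End-point detection theory 1. Frame energy e(i): voiced speech has more energy than unvoiced speech, but an unvoiced segment with strong background noise may also show very large energy.

End-point detection theory 2. Zero-crossing rate ZCR(i): voiced speech → low zero-crossing rate; unvoiced speech → high zero-crossing rate. Decision rule: a frame whose energy exceeds a threshold is marked with index 1. If the A frames after it also exceed the threshold, speech may start at that frame. Looking back over the preceding B frames, the frame where the value falls below the threshold confirms the start of speech; frames before it are marked with index 0.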
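The two measures and the forward part of the decision rule can be sketched as follows. The definitions used (energy as the sum of squared samples per frame, ZCR as the number of sign changes per frame), the threshold of 10% of the maximum frame energy, the frame length, and all function names are assumptions, since the slides do not preserve the formulas or threshold values. The look-back over B frames is omitted.

```python
import numpy as np

def frame_signal(x, frame_len):
    """Split x into non-overlapping frames, dropping any incomplete tail."""
    n_frames = len(x) // frame_len
    return np.reshape(x[:n_frames * frame_len], (n_frames, frame_len))

def frame_energy(frames):
    """Assumed definition: sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Assumed definition: number of sign changes in each frame."""
    signs = np.sign(frames)
    return np.sum(signs[:, :-1] * signs[:, 1:] < 0, axis=1)

def detect_start(energy, threshold, confirm_frames):
    """First frame that, with the next confirm_frames frames, exceeds threshold."""
    above = energy > threshold
    for i in range(len(above) - confirm_frames):
        if above[i:i + confirm_frames + 1].all():
            return i
    return None

FS = 8000           # assumed sampling rate in Hz
FRAME_LEN = 80      # assumed frame length (10 ms)
rng = np.random.default_rng(0)
x = 0.01 * rng.standard_normal(1600)                        # background noise
x[800:] += np.sin(2 * np.pi * 200 * np.arange(800) / FS)    # tone standing in for speech

frames = frame_signal(x, FRAME_LEN)
energy = frame_energy(frames)
zcr = zero_crossing_rate(frames)
start = detect_start(energy, threshold=0.1 * energy.max(), confirm_frames=3)
print("speech starts at frame", start)
```

In this synthetic example the noise-only frames have low energy and a high zero-crossing count, while the tone frames have high energy and a low count, matching the behavior described on the slides.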

End-point detection method 1: frequency change (dt = 0.1)

End-point detection method

End-point detection method

End-point detection method 2: phase change (dt = 0.1)

End-point detection method

End-point detection method

Pre-emphasis and silence removal: signal amplitude below 1/10 of the maximum amplitude → silence
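Both preprocessing steps can be sketched in a few lines. The first-order pre-emphasis filter y[n] = x[n] - αx[n-1] with α = 0.95 is a commonly used form and is an assumption here, as are the function names. The 1/10 threshold for silence comes from the slide and is applied sample by sample.

```python
import numpy as np

def pre_emphasis(x, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1]; alpha is an assumed, commonly used value."""
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - alpha * x[:-1])

def remove_silence(x, ratio=0.1):
    """Drop samples whose magnitude is below ratio times the maximum magnitude."""
    x = np.asarray(x, dtype=float)
    return x[np.abs(x) >= ratio * np.max(np.abs(x))]

signal = np.array([0.0, 0.05, 1.0, -0.5, 0.02, 0.2])
print(remove_silence(signal))                   # keeps 1.0, -0.5, 0.2
print(pre_emphasis(np.array([1.0, 1.0, 1.0])))  # flat input is strongly attenuated
```

A sample-wise threshold also removes low-amplitude samples inside speech, so a frame-based threshold may be preferable in practice.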

Before pre-emphasis and after pre-emphasis

Feature extraction: Speaker 1, speech signal "hello"

Instantaneous frequency

Instantaneous frequency

Hilbert Spectrum

Speaker 2 Speech Signal

Instantaneous frequency

Instantaneous frequency

Hilbert Spectrum

Speaker 1 Speech Signal

Instantaneous frequency

Instantaneous frequency

Hilbert Spectrum

Pulse code modulation 1. Uniform quantization (source: 王小川, 語音訊號處理)

Scalar quantization 2. Non-uniform quantization (source: 王小川, 語音訊號處理)

Vector quantization: conditions for the smallest mean quantization error: (1) nearest-neighbor selection rule; (2) choice of the quantization value
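The nearest-neighbor rule and the resulting mean quantization error can be sketched as below. Squared Euclidean distance is assumed as the distortion measure, and the function names and toy data are hypothetical.

```python
import numpy as np

def nearest_codeword(vectors, codebook):
    """Index of the closest codeword (squared Euclidean distance) per vector."""
    diffs = vectors[:, None, :] - codebook[None, :, :]
    return np.argmin(np.sum(diffs ** 2, axis=2), axis=1)

def mean_quantization_error(vectors, codebook):
    """Mean squared distance between each vector and its nearest codeword."""
    idx = nearest_codeword(vectors, codebook)
    return float(np.mean(np.sum((vectors - codebook[idx]) ** 2, axis=1)))

vectors = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
codebook = np.array([[0.0, 0.5], [10.0, 10.5]])

print(nearest_codeword(vectors, codebook))
print(mean_quantization_error(vectors, codebook))
```

For a fixed partition, the mean squared error is smallest when each codeword is the centroid of the vectors assigned to it, as in this toy codebook.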

Building the vector quantization codebook: centroid splitting algorithm. 1. Initialization: compute one centroid from all the training data; this is the initial codebook. 2. Splitting: after n splitting stages there are 2^n centroids. Each input vector is compared with all centroids and assigned to the region of the nearest one; the centroids are then recomputed. Splitting continues until the codebook reaches the target size.
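The two steps above can be sketched as a small binary-splitting trainer. The additive perturbation `epsilon`, the fixed number of refinement iterations, the function name, and the toy data are assumptions; the target codebook size is assumed to be a power of two.

```python
import numpy as np

def train_codebook(data, codebook_size, epsilon=0.01, n_iter=20):
    """Centroid-splitting codebook training (codebook_size: a power of two)."""
    data = np.asarray(data, dtype=float)
    codebook = data.mean(axis=0, keepdims=True)     # step 1: one global centroid
    while len(codebook) < codebook_size:
        # Step 2: split each centroid into a slightly perturbed pair.
        codebook = np.vstack((codebook + epsilon, codebook - epsilon))
        for _ in range(n_iter):
            # Assign every vector to its nearest centroid.
            dists = np.sum((data[:, None, :] - codebook[None, :, :]) ** 2, axis=2)
            nearest = np.argmin(dists, axis=1)
            # Recompute each centroid from the vectors assigned to it.
            for k in range(len(codebook)):
                members = data[nearest == k]
                if len(members) > 0:                # leave empty cells unchanged
                    codebook[k] = members.mean(axis=0)
    return codebook

data = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                 [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
cb2 = train_codebook(data, 2)
cb2 = cb2[np.argsort(cb2[:, 0])]    # sort for a stable display order
print(cb2)
```

For speaker recognition, one codebook would be trained per speaker, and an unknown utterance would be attributed to the speaker whose codebook gives the smallest average distance.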

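Conclusion: This talk briefly introduced empirical mode decomposition, intrinsic mode functions, the Hilbert spectrum, the concept of speech recognition, and speech preprocessing. The current feature extraction method for speaker recognition is based on the Hilbert transform, which suits nonlinear, non-stationary speech signals. The extracted features show when a speaker is talking. In addition, distance comparison against the codebooks of a speech database built with vector quantization shows which speaker is talking. This demonstrates the importance of instantaneous frequency.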

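References
1. Norden E. Huang, Zheng Shen, Steven R. Long, Manli C. Wu, Hsing H. Shih, Quanan Zheng, Nai-Chyuan Yen, Chi Chao Tung, and Henry H. Liu, "The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis."
2. 方建, "基於HHT語音識別技術研究" (Research on HHT-based speech recognition), master's thesis, Institute of Communication and Information Systems, 哈爾濱工程大學, 2006.
3. 許豔紅, "HHT變換在說話人識別中的應用" (Application of the HHT in speaker recognition), master's thesis, Institute of Electronic Information and Technology, 浙江大學, 2005.
4. 王小川, 語音訊號處理 (Speech Signal Processing), 2007.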

Next steps: 1. Design the speaker recognition system. 2. Find a speaker database.

Thank you