Chapter 6 Basics of Digital Audio

Presentation transcript:

Chapter 6 Basics of Digital Audio. Topics: the sampling theorem; frequency-domain transforms for proving the theorem and for filtering; filter usage tests; file formats and transmission/storage methods. 6.1 Digitization of Sound; 6.2 MIDI: Musical Instrument Digital Interface; 6.3 Quantization and Transmission of Audio; 6.4 Further Exploration

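Issues (modified outline): Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Detection; Processing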

What is Sound: A wave phenomenon, like light. Molecules of air are compressed and expanded under the action of some physical device, producing a pressure wave. It takes continuous values (before being digitized). It exhibits reflection, refraction, and diffraction.

Interesting Titbits: Typical sampling rates = 8 kHz / 48 kHz. Human voice: up to 4 kHz. The human ear can hear roughly 20 Hz to 20 kHz. Nyquist sampling rate (later). Musicology, octaves, harmonics: the note A (La) above middle C is 440 Hz. An octave above is another A at double the frequency, i.e., 880 Hz. Harmonics: any series of musical tones whose frequencies are integral multiples of the frequency of a fundamental tone.

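Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Detection; Processing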

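Orthogonality: Two components are orthogonal when their inner product (multiply element by element, then sum) is zero; neither can be decomposed further into a projection coefficient on the other. Example (figure with forces W1 to W5, F, G, and W at angle θ): $x = v_0\cos(\theta)\,t - \tfrac{1}{2}(W/m)\,t^2$ and $y = v_0\sin(\theta)\,t - \tfrac{1}{2}\,g\,t^2$. Orthogonal components can be used to project, to observe, or to sum several objects and find equilibrium.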

Signal Decomposition Signals can be decomposed into a sum of sinusoids

Orthogonality of Trigonometric Functions
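
The slide's equations did not survive extraction. As a reminder, the standard orthogonality relations on $[-\pi, \pi]$ for integers $m, n \ge 0$ can be written as follows (the notation here is an assumption, not necessarily the slide's):

```latex
\int_{-\pi}^{\pi} \cos(mx)\cos(nx)\,dx =
\begin{cases}
0,    & m \neq n \\
\pi,  & m = n \neq 0 \\
2\pi, & m = n = 0
\end{cases}
\qquad
\int_{-\pi}^{\pi} \sin(mx)\sin(nx)\,dx =
\begin{cases}
0,    & m \neq n \\
\pi,  & m = n \neq 0
\end{cases}
\qquad
\int_{-\pi}^{\pi} \sin(mx)\cos(nx)\,dx = 0
```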

Euler-Fourier Formula [Proof for a_k]: Multiply both sides by cos(kx), then integrate term by term over [−π, π]. Meaning: the signal strength at each frequency.
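
A sketch of the proof step named on this slide, assuming the common series convention $f(x) = a_0/2 + \sum_n (a_n\cos nx + b_n\sin nx)$ (the slide's own convention is not recoverable from the transcript). Multiplying by $\cos(kx)$ and integrating term by term leaves only the $n = k$ cosine term, because every other product integrates to zero:

```latex
\int_{-\pi}^{\pi} f(x)\cos(kx)\,dx
  = \frac{a_0}{2}\int_{-\pi}^{\pi}\cos(kx)\,dx
  + \sum_{n=1}^{\infty}\left[
      a_n\int_{-\pi}^{\pi}\cos(nx)\cos(kx)\,dx
    + b_n\int_{-\pi}^{\pi}\sin(nx)\cos(kx)\,dx
    \right]
  = a_k\,\pi
\quad\Longrightarrow\quad
a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(kx)\,dx ,
\qquad
b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(kx)\,dx
```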

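Fourier Series (complex-form series): Expand the sum over k and drop the terms whose product integrates to 0 over the time domain. Here a_k is a complex coefficient (it has a real and an imaginary part).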

a_k, b_k (not including the minus sign and the imaginary unit j). Conclusion: the two sets of transform formulas are the same.

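Fourier Transform (Rad): Pull out the coefficients. There is no need to dwell on expanding the equation, as long as the forward and inverse transforms can be carried out.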

Fourier Transform (Hz): ω: how many radians does the phase angle turn per second? u: how many oscillations (cycles) per second?
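
The transform pairs themselves are missing from the transcript. Under one common convention (an assumption; the slides may place the $1/2\pi$ factor differently), the radian and Hz forms are related by $\omega = 2\pi u$:

```latex
% Radian form
F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-j\omega t}\,dt ,
\qquad
f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{j\omega t}\,d\omega

% Hz form, with \omega = 2\pi u
F(u) = \int_{-\infty}^{\infty} f(t)\,e^{-j 2\pi u t}\,dt ,
\qquad
f(t) = \int_{-\infty}^{\infty} F(u)\,e^{j 2\pi u t}\,du
```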

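Basic Properties (time domain <-> frequency domain): f(t) <-> F(u); g(t) + h(t) <-> G(u) + H(u); g(t) * h(t) <-> G(u) × H(u) and g(t) × h(t) <-> G(u) * H(u), where * denotes convolution; impulse train δ(t − T) with spacing T <-> impulse train δ(u − 1/T) with spacing 1/T. This shows how hard band-pass filtering is to handle in the time domain: a multiplication in the frequency domain becomes a convolution in the time domain. Demo.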

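Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Detection; Processing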

Issues for Digital Audio Data What is the sampling rate? How finely is the data to be quantized, and is quantization uniform? How is audio data formatted? (file format)

Digitization Quantization Sampling

Nyquist Theorem (1924), Harry Nyquist (1889-1976): If a signal is band-limited, i.e., its frequency components lie between a lower limit f1 and an upper limit f2, the sampling rate should be at least 2(f2 − f1). Usually f1 is 0, so the sampling rate should be at least 2 f2.

Time Domain Observation

Alias Frequency: Sampling at only 1.5 samples per cycle produces an alias, a false perceived frequency lower than the true one.
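
A minimal sketch of the aliasing arithmetic, assuming a single pure tone and ideal sampling. The function name `alias_frequency` is hypothetical; it folds the tone's frequency into the baseband from 0 to half the sampling rate.

```python
def alias_frequency(f_signal: float, f_sampling: float) -> float:
    """Apparent frequency (Hz) of a pure tone sampled at f_sampling (Hz)."""
    # Sampling replicates the spectrum at every multiple of f_sampling, so the
    # tone is perceived at its distance from the nearest such multiple.
    nearest_multiple = round(f_signal / f_sampling) * f_sampling
    return abs(f_signal - nearest_multiple)


# A 1000 Hz tone sampled 1.5 times per cycle (1500 Hz) is heard at 500 Hz.
print(alias_frequency(1000.0, 1500.0))
# Sampled above the Nyquist rate (2500 Hz > 2 x 1000 Hz), it is unchanged.
print(alias_frequency(1000.0, 2500.0))
```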

Nyquist Rate

Fourier Transform (example)

Fourier Transform (Hz) (recall): ω: how many radians does the phase angle turn per second? u: how many oscillations (cycles) per second?

Basic Properties (recall; time domain <-> frequency domain): f(t) <-> F(u); g(t) + h(t) <-> G(u) + H(u); g(t) * h(t) <-> G(u) × H(u) and g(t) × h(t) <-> G(u) * H(u), where * denotes convolution; impulse train δ(t − T) with spacing T <-> impulse train δ(u − 1/T) with spacing 1/T. This shows how hard band-pass filtering is to handle in the time domain. Demo.

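Basic Properties (Cont.; time domain <-> frequency domain): Convolution in one domain corresponds to multiplication in the other: g(t) * h(t) <-> G(u) × H(u) and g(t) × h(t) <-> G(u) * H(u), where * denotes convolution. Impulse train δ(t − T) <-> impulse train δ(u − 1/T). Terminology: "convolution" is rendered in Chinese as 疊代 or 旋積; "impulse function" as 沖激函數.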

Sampling Rate (time domain <-> frequency domain): g(t) + h(t) <-> G(u) + H(u); an impulse train δ(t − T) with spacing T <-> an impulse train δ(u − 1/T) with spacing 1/T.

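Fourier Spectrum: The signal f(t) has spectrum |F(u)|, band-limited to u_max. The sampled signal is f_s(t) = f(t) · s(t), where s(t) is the sampling impulse train with spacing T; its spectrum is F_s(u) = F(u) * S(u) (convolution), so |F_s(u)| repeats |F(u)| at every multiple of u_sampling = 1/T. Question: what about T0?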

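Nyquist Theorem (freq. domain): The spectral copies sit at multiples of u_sampling = 1/T (…, −1/T, 0, 1/T, 2/T, …). If the sampling frequency is less than twice u_max, the spacing between copies is not wide enough.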

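Nyquist Theorem (freq. domain): Originally there are two red peaks, but sampling replicates them as green peaks, which are then misread as real components. If the spectrum is a whole band, as the triangle shows, the replicated copies interfere with it and produce aliasing.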

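Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Identification; Processing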

Issues for Digital Audio Data What is the sampling rate? How finely is the data to be quantized, and is quantization uniform? How is audio data formatted? (file format)

Signal to Noise Ratio (SNR): A measure of the quality of the signal, in units of dB (decibel); 10 dB = 1 bel. It is 10 times the base-10 logarithm of the ratio of the power of the correct signal to the power of the noise: SNR = 10 log10(P_signal / P_noise) = 20 log10(V_signal / V_noise). Note: P = V^2 / R. The higher the better.
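
A small sketch of the SNR definition, assuming both powers (or both voltages) are measured across the same resistance R. The function names are hypothetical.

```python
import math


def snr_db_from_power(p_signal: float, p_noise: float) -> float:
    """SNR in dB from signal and noise power."""
    return 10.0 * math.log10(p_signal / p_noise)


def snr_db_from_voltage(v_signal: float, v_noise: float) -> float:
    """SNR in dB from voltages; since P = V^2 / R, the factor becomes 20."""
    return 20.0 * math.log10(v_signal / v_noise)


# A signal voltage 10 times the noise voltage is a power ratio of 100.
print(snr_db_from_voltage(10.0, 1.0))
print(snr_db_from_power(100.0, 1.0))
```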

dB Applied to Common Sounds: Levels are expressed as a ratio to the quietest sound we are capable of hearing, i.e., the just-audible sound at a frequency of 1 kHz, defined as about 2 × 10^-5 N/m^2. For noise, the lower the better.

Environmental Protection Administration (Taiwan) Noise Control Standards (amendment No. 1020065143)

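Microsoft's anechoic chamber, at −20.3 dB, is the quietest place in the world (World Journal, 2015-10-18). Located in Building 87 at Microsoft headquarters in Redmond, Washington, USA, it was recognized as a 2015 Guinness World Record at −20.3 dB. That is close to −23 dB, the quietest level achievable on Earth, which is the noise produced by air molecules colliding with one another. Such rooms are used to train astronauts to adapt to the "quiet environment" of space. The silence can cause hallucinations and loss of orientation, and can even make it hard to stand. It is so quiet that people cannot bear it; the person who lasted longest stayed only 45 minutes. You hear your own heartbeat, even the sound of your lungs and of things moving in your stomach; you become the noise source.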

Signal to Quantization Noise Ratio (SQNR): Quantization noise = round-off error. Let the quantization accuracy be N bits per sample. In the worst case, SQNR = 6.02 N (dB). If the input signal is sinusoidal and the quantization error is statistically independent, SQNR = 6.02 N + 1.76 (dB). An SNR (SQNR) above 70 dB is generally acceptable, i.e., we need N ≥ 12 (6.02 × 12 ≈ 72.2 dB).
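
The two SQNR formulas and the bit-depth threshold can be checked numerically. This is a sketch with hypothetical function names; 6.02 is 20 log10(2) and 1.76 is 10 log10(1.5).

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)        # about 6.02 dB per bit
SINUSOID_OFFSET = 10.0 * math.log10(1.5)   # about 1.76 dB


def sqnr_worst_case_db(n_bits: int) -> float:
    """Worst-case SQNR for N-bit uniform quantization."""
    return DB_PER_BIT * n_bits


def sqnr_sinusoid_db(n_bits: int) -> float:
    """SQNR for a full-scale sinusoid with independent quantization error."""
    return DB_PER_BIT * n_bits + SINUSOID_OFFSET


def min_bits_for(target_db: float) -> int:
    """Smallest N whose worst-case SQNR reaches target_db."""
    return math.ceil(target_db / DB_PER_BIT)


print(min_bits_for(70.0))               # bits needed for the 70 dB target
print(round(sqnr_sinusoid_db(16), 2))   # 16-bit audio, sinusoidal input
```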

Linear and Non-linear Quantization: Linear format: samples are typically stored as uniformly quantized values. Non-uniform quantization: set up more finely spaced levels where humans hear with the most acuity. Weber's Law, stated formally, says that equally perceived differences have values proportional to absolute levels: ΔResponse ∝ ΔStimulus / Stimulus (6.5)

Nonlinear Quantization: Transform the analog signal from the raw s space into the theoretical r space, then uniformly quantize the resulting values. Uniform quantization of r gives finer resolution in s at the quiet end. This is called μ-law encoding (also written u-law). A very similar rule, called A-law, is used in telephony in Europe.

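Equations of μ-law and A-law, in their standard form with peak signal value $s_p$: μ-law (6.9): $r = \dfrac{\operatorname{sgn}(s)}{\ln(1+\mu)}\,\ln\!\left(1 + \mu\left|\dfrac{s}{s_p}\right|\right)$ for $\left|\dfrac{s}{s_p}\right| \le 1$. A-law (6.10): $r = \dfrac{A}{1+\ln A}\,\dfrac{s}{s_p}$ for $\left|\dfrac{s}{s_p}\right| \le \dfrac{1}{A}$, and $r = \dfrac{\operatorname{sgn}(s)}{1+\ln A}\left[1 + \ln\!\left(A\left|\dfrac{s}{s_p}\right|\right)\right]$ for $\dfrac{1}{A} \le \left|\dfrac{s}{s_p}\right| \le 1$.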

Nonlinear Transform for audio signals (Fig 6.6): Low-volume signals are "magnified" for closer inspection during quantization.
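
A minimal sketch of μ-law companding, assuming the standard form r = sgn(s) ln(1 + μ|s/s_p|) / ln(1 + μ) with μ = 255 (a common telephony value, not stated on the slide). The function names and the `s_peak` parameter are hypothetical.

```python
import math

MU = 255.0  # assumed value; commonly used in telephony


def mu_law_compress(s: float, s_peak: float = 1.0, mu: float = MU) -> float:
    """Map a sample s in [-s_peak, s_peak] to r in [-1, 1]."""
    x = max(-1.0, min(1.0, s / s_peak))  # clamp to the valid range
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(r: float, s_peak: float = 1.0, mu: float = MU) -> float:
    """Inverse transform: recover s from r."""
    return math.copysign(s_peak * math.expm1(abs(r) * math.log1p(mu)) / mu, r)


# A quiet sample at 1% of full scale is pushed up to roughly 23% of the
# r range, so uniform steps in r are much finer for quiet sounds.
quiet = mu_law_compress(0.01)
print(round(quiet, 3))
print(round(mu_law_expand(quiet), 6))
```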

Data rate and bandwidth in sample audio applications (Table 6.2). Slide annotations: Bytes × 1/8; [1, 2, 6]; 1/2; ">="

AM vs FM

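Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Identification; Processing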

Synthetic Sounds: 1. FM (Frequency Modulation): one approach to generating sound, x(t) = A(t) cos[M(t)], where A(t) is the amplitude envelope and M(t) is the modulated phase. 2. Wave table, or wave sound: a more accurate way of generating sounds from digital signals.
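
A sketch of FM synthesis under simplifying assumptions: a constant amplitude A and a phase M(t) = 2π f_c t + I sin(2π f_m t), i.e., a carrier whose phase is modulated by a single sinusoid. The function name and all parameter values are illustrative, not taken from the slides.

```python
import math


def fm_tone(duration_s: float, sample_rate: int, f_carrier: float,
            f_mod: float, index: float, amplitude: float = 1.0) -> list[float]:
    """Generate samples of x(t) = A cos[M(t)] for a simple FM tone."""
    n_samples = int(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        # M(t): carrier phase plus a sinusoidal phase deviation of depth `index`.
        phase = 2.0 * math.pi * f_carrier * t + index * math.sin(2.0 * math.pi * f_mod * t)
        samples.append(amplitude * math.cos(phase))
    return samples


# One second of a 440 Hz carrier modulated at 110 Hz, sampled at 8 kHz.
tone = fm_tone(1.0, 8000, f_carrier=440.0, f_mod=110.0, index=2.0)
print(len(tone), tone[0])
```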

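Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Identification; Processing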

Digital Filter DEMO Homework? DFT/DCT (see DFTDCT.ppt)

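Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Identification; Processing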

WAV File Format (fields in file order; sizes are those of the canonical 44-byte PCM header)
'RIFF' (4 bytes): RIFF file identification (Resource Interchange File Format)
<length> (4 bytes): length of everything after this field
'WAVE' (4 bytes): WAVE chunk identification
'fmt ' (4 bytes): format sub-chunk identification
flength (4 bytes): length of the format sub-chunk after this field
format (2 bytes): format specifier (linear-quantization PCM = 1)
chans (2 bytes): number of channels
sampsRate (4 bytes): sampling rate in Hz
Bpsec (4 bytes): bytes per second = sampsRate × Bpsample
Bpsample (2 bytes): bytes per sample = chans × bpchan / 8
bpchan (2 bytes): bits per channel
'data' (4 bytes): data sub-chunk identification
dlength (4 bytes): length of the data sub-chunk after this field
Values: the digital audio data
…: other possible chunks at the tail of the file

Binary Code (Sec1.wav), file size 1730904 bytes:
<length> = (001A6950)h = 1730896 = 1730904 − 8
flength = (00 00 00 10)h = 16
format = (00 01)h = 1 (PCM)
chans = (00 01)h = 1
sampsRate = (00 00 AC 44)h = 44100
Bpsec = (00 00 AC 44)h = 44100
Bpsample = (00 01)h = 1
bpchan = (00 08)h = 8
dlength = (001A6904)h = 1730820 = 1730904 − 44 − 40
The header, through the end of the dlength field, is 44 bytes in total; the tail of the file is 40 bytes.

Binary Code (Sec2.wav), file size 5893040 bytes:
<length> = (0059EBA8)h = 5893032 = 5893040 − 8
flength = (00 00 00 10)h = 16
format = (00 01)h = 1 (PCM)
chans = (00 02)h = 2
sampsRate = (00 00 AC 44)h = 44100
Bpsec = (00 02 B1 10)h = 176400 = 44100 × 4
Bpsample = (00 04)h = 4
bpchan = (00 10)h = 16
dlength = (0059EB5C)h = 5892956 = 5893040 − 44 − 40
The header, through the end of the dlength field, is 44 bytes in total; the tail of the file is 40 bytes.
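
The header layout above can be read with Python's `struct` module. This is a sketch that assumes the canonical 44-byte PCM header with no extra chunks between the format and data sub-chunks (real files may contain some) and little-endian integer fields. The function name `parse_wav_header` and the demo values are hypothetical; the demo header is built in memory rather than read from Sec2.wav.

```python
import struct

# 'RIFF', <length>, 'WAVE', 'fmt ', flength, format, chans,
# sampsRate, Bpsec, Bpsample, bpchan, 'data', dlength
WAV_HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")  # 44 bytes, little-endian


def parse_wav_header(header: bytes) -> dict:
    """Parse a canonical 44-byte PCM WAV header into named fields."""
    (riff, length, wave, fmt_id, flength, fmt, chans, samps_rate,
     bpsec, bpsample, bpchan, data_id, dlength) = WAV_HEADER.unpack(header[:44])
    if riff != b"RIFF" or wave != b"WAVE" or fmt_id != b"fmt " or data_id != b"data":
        raise ValueError("not a canonical PCM WAV header")
    return {"length": length, "flength": flength, "format": fmt,
            "chans": chans, "sampsRate": samps_rate, "Bpsec": bpsec,
            "Bpsample": bpsample, "bpchan": bpchan, "dlength": dlength}


# Build a stereo, 16-bit, 44.1 kHz header in memory (illustrative values).
demo = WAV_HEADER.pack(b"RIFF", 36 + 1000, b"WAVE", b"fmt ", 16, 1, 2,
                       44100, 176400, 4, 16, b"data", 1000)
fields = parse_wav_header(demo)
print(fields["Bpsec"], fields["sampsRate"] * fields["Bpsample"])
```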

(break)

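Issues: Digitization; Format. 1. Sampling; 2. Quantization; 3. Synthesis; 4. Filtering; 5. Recording; 6. Transmission. F.T.; Recognition; Identification; Processing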

Coding of Audio: Pulse Code Modulation (PCM): the basic coding method, producing quantized sampled output for audio. The differences version: DPCM (differential PCM). A crude but efficient variant: DM (delta modulation). The adaptive version: ADPCM. Examples: WAV is a form of PCM coding; Skype uses ADPCM at 32 kbps.

Pulse Code Modulation (PCM), Fig 6.13: (a) Original analog signal and its corresponding PCM signals. (b) Decoded staircase signal. (c) Reconstructed signal after low-pass filtering.

PCM in Telephony System: Any so-called compression here actually refers to nonlinear quantization. 8 bits × 8 kHz gives 64 kbps.
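
The data-rate arithmetic for uncompressed PCM is sampling rate × bits per sample × channels. A tiny sketch (the function name is hypothetical):

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


print(pcm_bit_rate(8000, 8))        # telephony: 8-bit samples at 8 kHz, mono
print(pcm_bit_rate(44100, 16, 2))   # 16-bit stereo at 44.1 kHz
```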

Coding of Audio: Pulse Code Modulation (PCM): the basic coding method, producing quantized sampled output for audio. The differences version: DPCM (differential PCM). A crude but efficient variant: DM (delta modulation). The adaptive version: ADPCM. Examples: WAV is a form of PCM coding; Skype uses ADPCM at 32 kbps.

Three-Stage Compression: Every compression scheme has three stages. (A) The input data is transformed to a new representation that is easier or more efficient to compress. (B) We may introduce loss of information. Quantization is the main lossy step: we use a limited number of reconstruction levels, fewer than in the original signal. (C) Coding: assign a codeword (thus forming a binary bitstream) to each output level or symbol. This could be a fixed-length code or a variable-length code such as Huffman coding (Chap. 7). Example: DPCM (next page).

Example: DPCM codec module B C A

Huffman Code (Lossless Compression)
Symbol: @, #, $, &
Frequency: 1/8, 1/4, 1/2, 1/8
Original encoding: 00, 01, 10, 11 (2 bits each)
Huffman encoding: 110, 10, 0, 111 (3 bits, 2 bits, 1 bit, 3 bits)
Expected length, original: 1/8 × 2 + 1/4 × 2 + 1/2 × 2 + 1/8 × 2 = 2 bits / symbol
Expected length, Huffman: 1/8 × 3 + 1/4 × 2 + 1/2 × 1 + 1/8 × 3 = 1.75 bits / symbol

Huffman Tree Construction 1: Start with one leaf per symbol, weighted by its count: A = 3, B = 2, C = 5, D = 8, E = 7.

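Huffman Tree Construction 2: Merge the two lightest nodes, A (3) and B (2), into a node of weight 5. Remaining nodes: D = 8, E = 7, C = 5, (A, B) = 5.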

Huffman Tree Construction 3: Merge C (5) and (A, B) (5) into a node of weight 10. Remaining nodes: D = 8, E = 7, (C, A, B) = 10.

Huffman Tree Construction 4: Merge D (8) and E (7) into a node of weight 15. Remaining nodes: (D, E) = 15, (C, A, B) = 10.

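Huffman Tree Construction 5: Merge the nodes of weight 15 and 10 into the root, of weight 25, and label the branches 0 and 1. Resulting codes: E = 00, D = 01, C = 10, B = 110, A = 111. Example: 010001110101110001 = DEDBCAED. Average length: 3 × 3/25 + 3 × 2/25 + 2 × 5/25 + 2 × 8/25 + 2 × 7/25 = 2.2 bits.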
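
The construction steps can be sketched with a priority queue. This is one possible implementation, not necessarily the one behind the slides: the function name `huffman_codes` is hypothetical, and because the 0/1 branch labels and tie-breaking are arbitrary, the bit patterns may differ from the slide while the code lengths and the average length match.

```python
import heapq
from itertools import count


def huffman_codes(freqs: dict[str, int]) -> dict[str, str]:
    """Build a Huffman code by repeatedly merging the two lightest nodes."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        # Prefix 0 to every code in one subtree and 1 to the other.
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


freqs = {"A": 3, "B": 2, "C": 5, "D": 8, "E": 7}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print({s: len(c) for s, c in codes.items()})
print(total_bits / sum(freqs.values()))  # average bits per symbol
```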

Differential Coding of Audio: Audio is often stored not as simple PCM but in a form that exploits differences, which are generally smaller numbers and so offer the possibility of using fewer bits to store. The simplest prediction formula (6.12): $\hat{f}_n = f_{n-1}$, $e_n = f_n - \hat{f}_n$.

Histogram of digital speech signal Signal Values v.s. Signal Differences Fig 6.15

Predictive Coding f0=f1, e0=0

Problem in Predictive Coding f0=f1, e0=0 ?!

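DPCM codec module: Once quantization is introduced, the scheme is no longer lossless. Prediction must use the reconstructed signal, not the true signal. Figure labels: true, predicted, reconstructed (signal).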

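DPCM Formulae (6.16), where a hat (^) marks a predicted value and a tilde (~) marks a reconstructed value: $\hat{f}_n = \text{function of}\,(\tilde{f}_{n-1}, \tilde{f}_{n-2}, \ldots)$; $e_n = f_n - \hat{f}_n$; $\tilde{e}_n = Q[e_n]$; transmit the codeword for $\tilde{e}_n$; reconstruct $\tilde{f}_n = \hat{f}_n + \tilde{e}_n$.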

Example (DPCM, formulae) Let Quantization Steps Be { … -24, -8, 8, 24, 40, 56, …}

Example (DPCM, results) (2) (3) (1) 130 Encoder: (1) (2) (3) Decoder: (1) (3)

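DM (Delta Modulation) Formulae (6.21): $\hat{f}_n = \tilde{f}_{n-1}$; $e_n = f_n - \hat{f}_n = f_n - \tilde{f}_{n-1}$; $\tilde{e}_n = +k$ if $e_n > 0$, otherwise $-k$, where $k$ is a constant; $\tilde{f}_n = \hat{f}_n + \tilde{e}_n$.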

Example (DM, results): $k = 4$, $\tilde{f}_1 = f_1 = 10$
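
A sketch of delta modulation with k = 4 and a first value of 10, as on the slide. The rest of the input sequence is illustrative, since the slide's numbers are not in the transcript, and the function name is hypothetical. Each step sends one bit: the reconstruction moves up by k if the signal is above the previous reconstructed value, otherwise down by k.

```python
def delta_modulate(samples: list[int], k: int) -> tuple[list[int], list[int]]:
    """Return (bits, reconstructed signal); the first sample is sent as-is."""
    recon = [samples[0]]
    bits = []
    for f in samples[1:]:
        pred = recon[-1]                     # predict the previous reconstruction
        step = k if f - pred > 0 else -k     # 1-bit quantizer
        bits.append(1 if step > 0 else 0)
        recon.append(pred + step)
    return bits, recon


bits, recon = delta_modulate([10, 11, 13, 15], k=4)
print(bits)
print(recon)
```

With a slowly rising input and a step as large as k = 4, the reconstruction oscillates around the signal, which is why the method is described as crude.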

ADPCM codec module

End of Chap #6