Chapter 7 Dimensionality reduction Prof. Dehan Luo


Section One: The curse of dimensionality
Section Two: Feature extraction vs. feature selection
Section Three: Principal Components Analysis
Section Four: Linear Discriminant Analysis

Section One: The curse of dimensionality

The "curse of dimensionality" refers to the problems associated with multivariate data analysis as the dimensionality increases.

Consider a 3-class pattern recognition problem, in which three types of objects have to be classified based on the value of a single feature.

A 3-class pattern recognition problem (continued)

A simple procedure would be to:
(1) Divide the feature space into uniform bins.
(2) Compute the ratio of examples of each class in each bin.
(3) For a new example, find its bin and choose the predominant class in that bin.

We decide to start with one feature and divide the real line into 3 bins. Notice that there is a lot of overlap between the classes, so to improve discrimination we decide to incorporate a second feature. A minimal sketch of the bin-voting procedure appears below.
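The following sketch implements the three steps above on synthetic one-dimensional data; the data values, labels, and bin edges are illustrative assumptions rather than the slides' example:

    import numpy as np

    # Synthetic 1D training data for a 3-class problem (hypothetical values).
    rng = np.random.default_rng(0)
    x_train = rng.uniform(0.0, 1.0, size=90)      # one feature per example
    y_train = rng.integers(0, 3, size=90)         # labels for the 3 classes

    n_bins, n_classes = 3, 3
    edges = np.linspace(0.0, 1.0, n_bins + 1)     # (1) uniform bins

    # (2) count how often each class lands in each bin.
    bin_idx = np.clip(np.digitize(x_train, edges) - 1, 0, n_bins - 1)
    counts = np.zeros((n_bins, n_classes), dtype=int)
    for b, c in zip(bin_idx, y_train):
        counts[b, c] += 1

    # (3) a new example is assigned the predominant class of its bin.
    def predict(x_new):
        b = np.clip(np.digitize(x_new, edges) - 1, 0, n_bins - 1)
        return counts[b].argmax()

    print(predict(0.42))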

A 3-class pattern recognition problem (continued)

Moving to two dimensions increases the number of bins from 3 to 3² = 9.

QUESTION: which should we keep constant?
- The density of examples per bin? This increases the number of examples from 9 to 27.
- The total number of examples? This results in a 2D scatter plot that is very sparse.

A 3-class pattern recognition problem (continued)

Moving to three dimensions increases the number of bins from 3 to 3³ = 27.
- To maintain the initial density of examples, the number of required examples grows to 81.
- For the same total number of examples, the 3D scatter plot is almost empty.

A quick check of this growth appears below.
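Both counts follow a simple pattern, assuming the slides' 3 bins per axis and 3 examples per bin:

    # Bins grow as 3**d; the examples needed to keep 3 examples per bin
    # grow just as fast (9, 27, 81 for d = 1, 2, 3, matching the slides).
    for d in (1, 2, 3):
        bins = 3 ** d
        examples = 3 * bins
        print(f"{d}D: {bins} bins, {examples} examples at constant density")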

A 3-class pattern recognition problem (continued)

Implications of the curse of dimensionality: the number of examples required to accurately estimate a function grows exponentially with the dimensionality.

In practice, the curse of dimensionality means that, for a given sample size, there is a maximum number of features above which the performance of our classifier will degrade rather than improve.

A 3-class pattern recognition problem (continued)

In most cases, the information that is lost by discarding some features is compensated for by a more accurate mapping in the lower-dimensional space.

Section Two: Feature extraction vs. feature selection

How do we beat the curse of dimensionality?
- By incorporating prior knowledge.
- By increasing the smoothness of the target function.
- By reducing the dimensionality.

Feature extraction vs. feature selection (continued)

Two approaches perform the dimensionality reduction R^N → R^M (M < N):
- Feature selection: choosing a subset of the existing features.
- Feature extraction: creating new features by combining the existing ones.

In either case, the goal is to find a low-dimensional representation of the data that preserves (most of) the information or structure in the data. A sketch contrasting the two approaches follows.
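In the sketch below, the retained column indices and the projection matrix W are arbitrary placeholders, not quantities from the slides:

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 4))        # 100 samples, N = 4 features

    # Feature selection: keep a subset of the original features.
    X_selected = X[:, [0, 2]]            # M = 2 of the original columns

    # Feature extraction: build M = 2 new features combining all N.
    W = rng.normal(size=(2, 4))          # an M x N projection matrix
    X_extracted = X @ W.T                # each row is y = W x

    print(X_selected.shape, X_extracted.shape)   # (100, 2) (100, 2)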

Feature extraction vs. feature selection (continued)

Linear feature extraction: the "optimal" mapping y = f(x) is, in general, a non-linear function whose form is problem-dependent. Hence, feature extraction is commonly limited to linear projections y = Wx.

Feature extraction vs. feature selection (continued)

Two criteria can be used to find the "optimal" feature extraction mapping y = f(x):
- Signal representation: represent the samples accurately in a lower-dimensional space.
- Classification: enhance the class-discriminatory information in the lower-dimensional space.

Feature extraction vs. feature selection (continued)

Within the realm of linear feature extraction, two techniques are commonly used:
(1) Principal Components Analysis (PCA), based on signal representation.
(2) Fisher's Linear Discriminant Analysis (LDA), based on classification.

Section Three: Principal Components Analysis

Let us illustrate PCA with a two-dimensional problem. The data x follows a Gaussian density, as depicted in the figure, and each vector can be represented by its 2D coordinates.

Principal Components Analysis (continued)

We seek to find a 1D representation x' that is "close" to x, where "closeness" is measured by the mean squared error over all points in the distribution.

Principal Components Analysis (continued)

RESULT: it can be shown that the "optimal" 1D representation consists of projecting the vector x onto the direction of maximum variance in the data (e.g., the longest axis of the ellipse). This result generalizes to more than two dimensions.

Principal Components Analysis (continued)

Summary: the kth principal component is y_k = v_k^T x, where v_k is the eigenvector corresponding to the kth largest eigenvalue of the covariance matrix of the data.
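A minimal numpy sketch of this summary on synthetic 2D Gaussian data (the covariance values are made up for illustration); the 1D representation keeps the eigenvector with the largest eigenvalue:

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.multivariate_normal([0, 0], [[3.0, 1.5], [1.5, 1.0]], size=500)

    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)   # covariance matrix of the data
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

    order = np.argsort(eigvals)[::-1]        # sort eigenvalues descending
    v1 = eigvecs[:, order[0]]                # direction of maximum variance
    y = X_centered @ v1                      # "optimal" 1D representation
    print(eigvals[order], y.shape)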

Section Four: Linear Discriminant Analysis

The objective of LDA is to perform dimensionality reduction while preserving as much of the class-discriminatory information as possible.

Assume we have a set of N-dimensional samples, P1 of which belong to class ω1 and P2 to class ω2. We seek to obtain a scalar y by projecting the samples x onto a line. Of all the possible lines, we would like to select the one that maximizes the separability of the classes.

Linear Discriminant Analysis (continued)

In a nutshell, we want:
- Maximum separation between the means of the projections.
- Minimum variance within each projected class.

A sketch of the resulting projection follows.
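A two-class sketch on synthetic Gaussian data. The closed form w ∝ S_W⁻¹(m1 − m2) used here is the standard solution to Fisher's criterion; the class means and sample sizes are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(3)
    X1 = rng.multivariate_normal([0, 0], np.eye(2), size=50)   # class w1
    X2 = rng.multivariate_normal([3, 2], np.eye(2), size=50)   # class w2

    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_W = (np.cov(X1, rowvar=False) * (len(X1) - 1)
           + np.cov(X2, rowvar=False) * (len(X2) - 1))   # within-class scatter
    w = np.linalg.solve(S_W, m1 - m2)                    # Fisher direction

    y1, y2 = X1 @ w, X2 @ w            # scalar projections of each class
    print(abs(y1.mean() - y2.mean()) / np.sqrt(y1.var() + y2.var()))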

Linear Discriminant Analysis (continued)

PCA versus LDA: PCA seeks the directions of maximum variance (signal representation), whereas LDA seeks the directions that maximize class separability (classification).

Linear Discriminant Analysis (continued)

Limitations of LDA:
(1) LDA assumes unimodal Gaussian likelihoods. If the densities are significantly non-Gaussian, LDA may not preserve any complex structure of the data needed for classification.

Linear Discriminant Analysis (continued)

Limitations of LDA (continued):
(2) LDA will fail when the discriminatory information lies not in the means but in the variances of the data, as illustrated below.
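A quick illustration of this failure mode, assuming scikit-learn is available; the two classes below share a mean and differ only in variance, so LDA stays near chance accuracy even on its own training data:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(4)
    X1 = rng.multivariate_normal([0, 0], 0.1 * np.eye(2), size=200)  # tight class
    X2 = rng.multivariate_normal([0, 0], 4.0 * np.eye(2), size=200)  # spread class
    X = np.vstack([X1, X2])
    y = np.repeat([0, 1], 200)

    lda = LinearDiscriminantAnalysis().fit(X, y)
    print(lda.score(X, y))   # near 0.5: no mean difference for LDA to exploit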

Linear Discriminant Analysis (continued)

Limitations of LDA (continued):
(3) LDA has a tendency to overfit the training data. To illustrate this problem, we generate an artificial dataset: three classes, 50 examples per class, all with exactly the same likelihood, a multivariate Gaussian with zero mean and identity covariance.

Linear Discriminant Analysis (continued)

Limitations of LDA (continued):
(3, continued) As we arbitrarily increase the number of dimensions, the classes appear to separate better, even though they all come from the same distribution. A sketch of this experiment follows.
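A sketch of this experiment, assuming scikit-learn's LinearDiscriminantAnalysis; training accuracy climbs with the dimensionality even though the three classes are identically distributed:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(5)
    labels = np.repeat([0, 1, 2], 50)        # three classes, 50 examples each
    for d in (2, 10, 50, 100):
        X = rng.normal(size=(150, d))        # zero mean, identity covariance
        acc = LinearDiscriminantAnalysis().fit(X, labels).score(X, labels)
        print(f"d = {d:3d}: training accuracy {acc:.2f}")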