Chapter 7 Dimensionality reduction Prof. Dehan Luo


Section One: The curse of dimensionality
Section Two: Feature extraction vs. feature selection
Section Three: Principal Components Analysis
Section Four: Linear Discriminant Analysis

Section One: The curse of dimensionality

The "curse of dimensionality" refers to the problems associated with multivariate data analysis as the dimensionality increases.

Consider a 3-class pattern recognition problem, in which three types of objects have to be classified based on the value of a single feature.

A 3-class pattern recognition problem (continued)

A simple procedure would be to:
(1) Divide the feature space into uniform bins.
(2) Compute the ratio of examples of each class in each bin.
(3) For a new example, find its bin and choose the predominant class in that bin.

We decide to start with one feature and divide the real line into 3 bins. Notice that there is a lot of overlap between the classes, so to improve discrimination we decide to incorporate a second feature. A minimal sketch of the bin-voting procedure appears below.
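The following sketch implements the three steps above on synthetic one-dimensional data; the data values, labels, and bin edges are illustrative assumptions rather than the slides' example:

    import numpy as np

    # Synthetic 1D training data for a 3-class problem (hypothetical values).
    rng = np.random.default_rng(0)
    x_train = rng.uniform(0.0, 1.0, size=90)      # one feature per example
    y_train = rng.integers(0, 3, size=90)         # labels for the 3 classes

    n_bins, n_classes = 3, 3
    edges = np.linspace(0.0, 1.0, n_bins + 1)     # (1) uniform bins

    # (2) count how often each class lands in each bin.
    bin_idx = np.clip(np.digitize(x_train, edges) - 1, 0, n_bins - 1)
    counts = np.zeros((n_bins, n_classes), dtype=int)
    for b, c in zip(bin_idx, y_train):
        counts[b, c] += 1

    # (3) a new example is assigned the predominant class of its bin.
    def predict(x_new):
        b = np.clip(np.digitize(x_new, edges) - 1, 0, n_bins - 1)
        return counts[b].argmax()

    print(predict(0.42))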

A 3-class pattern recognition problem (continued)

Moving to two dimensions increases the number of bins from 3 to 3² = 9.

QUESTION: which should we keep constant?
- The density of examples per bin? This increases the number of examples from 9 to 27.
- The total number of examples? This results in a 2D scatter plot that is very sparse.

A 3-class pattern recognition problem (continued)

Moving to three dimensions increases the number of bins from 3 to 3³ = 27.
- To maintain the initial density of examples, the number of required examples grows to 81.
- For the same total number of examples, the 3D scatter plot is almost empty.

A quick check of this growth appears below.
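Both counts follow a simple pattern, assuming the slides' 3 bins per axis and 3 examples per bin:

    # Bins grow as 3**d; the examples needed to keep 3 examples per bin
    # grow just as fast (9, 27, 81 for d = 1, 2, 3, matching the slides).
    for d in (1, 2, 3):
        bins = 3 ** d
        examples = 3 * bins
        print(f"{d}D: {bins} bins, {examples} examples at constant density")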

A 3-class pattern recognition problem (continued)

Implications of the curse of dimensionality: the number of examples required to accurately estimate a function grows exponentially with the dimensionality.

In practice, the curse of dimensionality means that, for a given sample size, there is a maximum number of features above which the performance of our classifier will degrade rather than improve.

A 3-class pattern recognition problem (continued)

In most cases, the information that is lost by discarding some features is compensated for by a more accurate mapping in the lower-dimensional space.

Section Two: Feature extraction vs. feature selection

How do we beat the curse of dimensionality?
- By incorporating prior knowledge.
- By increasing the smoothness of the target function.
- By reducing the dimensionality.

Feature extraction vs. feature selection (continued)

Two approaches perform the dimensionality reduction R^N → R^M (M < N):
- Feature selection: choosing a subset of the existing features.
- Feature extraction: creating new features by combining the existing ones.

In either case, the goal is to find a low-dimensional representation of the data that preserves (most of) the information or structure in the data. A sketch contrasting the two approaches follows.
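In the sketch below, the retained column indices and the projection matrix W are arbitrary placeholders, not quantities from the slides:

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 4))        # 100 samples, N = 4 features

    # Feature selection: keep a subset of the original features.
    X_selected = X[:, [0, 2]]            # M = 2 of the original columns

    # Feature extraction: build M = 2 new features combining all N.
    W = rng.normal(size=(2, 4))          # an M x N projection matrix
    X_extracted = X @ W.T                # each row is y = W x

    print(X_selected.shape, X_extracted.shape)   # (100, 2) (100, 2)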

Feature extraction vs. feature selection (continued)

Linear feature extraction: the "optimal" mapping y = f(x) is, in general, a non-linear function whose form is problem-dependent. Hence, feature extraction is commonly limited to linear projections y = Wx.

Feature extraction vs. feature selection (continued)

Two criteria can be used to find the "optimal" feature extraction mapping y = f(x):
- Signal representation: represent the samples accurately in a lower-dimensional space.
- Classification: enhance the class-discriminatory information in the lower-dimensional space.

Feature extraction vs. feature selection (continued)

Within the realm of linear feature extraction, two techniques are commonly used:
(1) Principal Components Analysis (PCA), based on signal representation.
(2) Fisher's Linear Discriminant Analysis (LDA), based on classification.

Section Three: Principal Components Analysis

Let us illustrate PCA with a two-dimensional problem. The data x follows a Gaussian density, as depicted in the figure, and each vector can be represented by its 2D coordinates.

Principal Components Analysis (continued)

We seek to find a 1D representation x' that is "close" to x, where "closeness" is measured by the mean squared error over all points in the distribution.

Principal Components Analysis (continued)

RESULT: it can be shown that the "optimal" 1D representation consists of projecting the vector x onto the direction of maximum variance in the data (e.g., the longest axis of the ellipse). This result generalizes to more than two dimensions.

Principal Components Analysis (continued)

Summary: the kth principal component is y_k = v_k^T x, where v_k is the eigenvector corresponding to the kth largest eigenvalue of the covariance matrix of the data.
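A minimal numpy sketch of this summary on synthetic 2D Gaussian data (the covariance values are made up for illustration); the 1D representation keeps the eigenvector with the largest eigenvalue:

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.multivariate_normal([0, 0], [[3.0, 1.5], [1.5, 1.0]], size=500)

    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)   # covariance matrix of the data
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

    order = np.argsort(eigvals)[::-1]        # sort eigenvalues descending
    v1 = eigvecs[:, order[0]]                # direction of maximum variance
    y = X_centered @ v1                      # "optimal" 1D representation
    print(eigvals[order], y.shape)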

Section Four: Linear Discriminant Analysis

The objective of LDA is to perform dimensionality reduction while preserving as much of the class-discriminatory information as possible.

Assume we have a set of N-dimensional samples, P1 of which belong to class ω1 and P2 to class ω2. We seek to obtain a scalar y by projecting the samples x onto a line. Of all the possible lines, we would like to select the one that maximizes the separability of the classes.

Linear Discriminant Analysis (continued)

In a nutshell, we want:
- Maximum separation between the means of the projections.
- Minimum variance within each projected class.

A sketch of the resulting projection follows.
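A two-class sketch on synthetic Gaussian data. The closed form w ∝ S_W⁻¹(m1 − m2) used here is the standard solution to Fisher's criterion; the class means and sample sizes are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(3)
    X1 = rng.multivariate_normal([0, 0], np.eye(2), size=50)   # class w1
    X2 = rng.multivariate_normal([3, 2], np.eye(2), size=50)   # class w2

    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_W = (np.cov(X1, rowvar=False) * (len(X1) - 1)
           + np.cov(X2, rowvar=False) * (len(X2) - 1))   # within-class scatter
    w = np.linalg.solve(S_W, m1 - m2)                    # Fisher direction

    y1, y2 = X1 @ w, X2 @ w            # scalar projections of each class
    print(abs(y1.mean() - y2.mean()) / np.sqrt(y1.var() + y2.var()))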

Linear Discriminant Analysis (continued)

PCA versus LDA: PCA seeks the directions of maximum variance (signal representation), whereas LDA seeks the directions that maximize class separability (classification).

Linear Discriminant Analysis (continued)

Limitations of LDA:
(1) LDA assumes unimodal Gaussian likelihoods. If the densities are significantly non-Gaussian, LDA may not preserve any complex structure of the data needed for classification.

Linear Discriminant Analysis (continued)

Limitations of LDA (continued):
(2) LDA will fail when the discriminatory information lies not in the means but in the variances of the data, as illustrated below.
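A quick illustration of this failure mode, assuming scikit-learn is available; the two classes below share a mean and differ only in variance, so LDA stays near chance accuracy even on its own training data:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(4)
    X1 = rng.multivariate_normal([0, 0], 0.1 * np.eye(2), size=200)  # tight class
    X2 = rng.multivariate_normal([0, 0], 4.0 * np.eye(2), size=200)  # spread class
    X = np.vstack([X1, X2])
    y = np.repeat([0, 1], 200)

    lda = LinearDiscriminantAnalysis().fit(X, y)
    print(lda.score(X, y))   # near 0.5: no mean difference for LDA to exploit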

Linear Discriminant Analysis (continued)

Limitations of LDA (continued):
(3) LDA has a tendency to overfit the training data. To illustrate this problem, we generate an artificial dataset: three classes, 50 examples per class, all with exactly the same likelihood, a multivariate Gaussian with zero mean and identity covariance.

Linear Discriminant Analysis (continued)

Limitations of LDA (continued):
(3, continued) As we arbitrarily increase the number of dimensions, the classes appear to separate better, even though they all come from the same distribution. A sketch of this experiment follows.
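A sketch of this experiment, assuming scikit-learn's LinearDiscriminantAnalysis; training accuracy climbs with the dimensionality even though the three classes are identically distributed:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(5)
    labels = np.repeat([0, 1, 2], 50)        # three classes, 50 examples each
    for d in (2, 10, 50, 100):
        X = rng.normal(size=(150, d))        # zero mean, identity covariance
        acc = LinearDiscriminantAnalysis().fit(X, labels).score(X, labels)
        print(f"d = {d:3d}: training accuracy {acc:.2f}")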