Chapter 8 Pattern Classification Prof. Dehan Luo

Outline:
Section One: Discriminant functions
Section Two: Quadratic classifiers
Section Three: Euclidean and Mahalanobis metrics
Section Four: Quadratic classifier examples
Section Five: K Nearest Neighbor (kNN) classifiers

Section One: Discriminant functions

A convenient way to represent a pattern classifier is as a family of discriminant functions g_i(x) combined with a simple MAX gate as the classification rule.

How do we choose the discriminant functions g_i(x)? The choice depends on the objective function to be minimized:
(1) Probability of error
(2) Bayes risk
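Stated explicitly, the MAX-gate rule is the standard one (this expression is not on the original slide):

    \text{assign } x \text{ to } \omega_i \quad \text{if} \quad g_i(x) > g_j(x) \;\; \forall j \neq i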

Discriminant functions (Cont.)

Probability of error: P[error|x] is "the probability of assigning x to the wrong class". For a two-class problem, P[error|x] is simply the posterior probability of the class we did not choose. It makes sense for the classification rule to be designed to minimize the average probability of error P[error] across all possible values of x.
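A plausible reconstruction of the expressions missing from the slide (standard forms):

    P(\text{error} \mid x) = \min\bigl[ P(\omega_1 \mid x),\; P(\omega_2 \mid x) \bigr]

    P(\text{error}) = \int P(\text{error} \mid x)\, p(x)\, dx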

Discriminant functions (Cont.)

To ensure P(error) is minimum, we minimize P(error|x) by choosing, at each x, the class with the maximum posterior P(ωi|x). This is called the MAXIMUM A POSTERIORI (MAP) rule, and the associated discriminant functions follow.
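A plausible reconstruction of the MAP discriminant functions (standard form, inferred from context):

    g_i(x) = P(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, P(\omega_i)}{p(x)}

Since p(x) is common to all classes, it can be dropped without changing the MAX decision.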

Discriminant functions (Cont.)

We "prove" the optimality of the MAP rule graphically. [Figures not reproduced: the right plot shows the posterior for each of the two classes; the bottom plots show P(error) for the MAP rule and for an alternative rule.] Which rule has the lower P(error) (the color-filled area)? The MAP rule: since it picks the larger posterior at every x, its error region is pointwise no larger than that of any other rule.

Section Two: Quadratic classifiers

Let us assume that the likelihood densities are Gaussian. Using Bayes rule, the MAP discriminant functions follow.
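A plausible reconstruction of the expressions missing from the slide (multivariate Gaussian likelihood and the resulting MAP discriminant, up to the class-independent factor p(x)):

    p(x \mid \omega_i) = \frac{1}{(2\pi)^{D/2} \lvert \Sigma_i \rvert^{1/2}}
        \exp\!\Bigl( -\tfrac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) \Bigr)

    g_i(x) = p(x \mid \omega_i)\, P(\omega_i)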

Quadratic classifiers (Cont.)

Eliminating constant terms and taking natural logs (the logarithm is monotonically increasing, so the MAX decision is unchanged) yields what is known as a Quadratic Discriminant Function. Its quadratic term is the squared Mahalanobis distance.
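A plausible reconstruction of the resulting discriminant (the standard quadratic discriminant function):

    g_i(x) = -\tfrac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)
             - \tfrac{1}{2} \ln \lvert \Sigma_i \rvert + \ln P(\omega_i)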

Section Three: Euclidean and Mahalanobis metrics

The Mahalanobis distance can be thought of as a vector distance that uses a Σ⁻¹ norm; Σ⁻¹ can be thought of as a stretching factor on the space. Note that for an identity covariance matrix (Σi = I), the Mahalanobis distance reduces to the familiar Euclidean distance.
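Written out for reference (standard definitions):

    d_{\text{Mahalanobis}}(x, \mu) = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)}, \qquad
    d_{\text{Euclidean}}(x, \mu) = \lVert x - \mu \rVert = \sqrt{(x - \mu)^T (x - \mu)}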

Section Four: Quadratic classifier examples

In the following slides we look at special cases of the quadratic classifier. For convenience we will assume equiprobable priors, so we can drop the term log P(ωi).

Special case 1: Σi = σ²I
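The slide's expression is not reproduced; under this assumption the quadratic discriminant reduces to a minimum Euclidean distance classifier (a standard result):

    g_i(x) = -\frac{\lVert x - \mu_i \rVert^2}{2\sigma^2}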

Quadratic classifier examples (Cont.)

Special case 2: Σi = Σ (Σ diagonal)
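The slide's expression is not reproduced; with a shared diagonal covariance, the discriminant weights each feature by its own variance (a standard result):

    g_i(x) = -\frac{1}{2} \sum_{d=1}^{D} \frac{(x_d - \mu_{i,d})^2}{\sigma_d^2}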

Quadratic classifier examples (Cont.)

Special case 3: Σi = Σ (Σ non-diagonal)
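The slide's expression is not reproduced; with a shared full covariance, the classifier becomes a minimum Mahalanobis distance classifier, and the decision boundaries are linear because the x^T Σ⁻¹ x term is class-independent:

    g_i(x) = -\tfrac{1}{2} (x - \mu_i)^T \Sigma^{-1} (x - \mu_i)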

Quadratic classifier examples (Cont.)

Special case 4: Σi = σi²I
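The slide's expression is not reproduced; here each class keeps its own isotropic variance, so the log-determinant term no longer cancels and the decision boundaries become quadratic (a standard result):

    g_i(x) = -\frac{\lVert x - \mu_i \rVert^2}{2\sigma_i^2} - \frac{D}{2} \ln \sigma_i^2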

Quadratic classifier examples (Cont.)

Case 5: Σi ≠ Σj (general case)
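In the general case nothing cancels, and (still assuming equiprobable priors, as above) the full quadratic discriminant applies:

    g_i(x) = -\tfrac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) - \tfrac{1}{2} \ln \lvert \Sigma_i \rvert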

Quadratic classifier examples (Cont.)

Limitations of quadratic classifiers:
(1) The fundamental limitation is the unimodal Gaussian assumption. For non-Gaussian or multimodal Gaussian densities, the results may be significantly sub-optimal.
(2) A practical limitation is the minimum required size of the dataset. If the number of examples per class is less than the number of dimensions, the covariance matrix becomes singular and, therefore, its inverse cannot be computed. In this case it is common to assume the same covariance structure for all classes and compute the covariance matrix using all the examples, regardless of class (a sketch of this workaround follows).
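A minimal numpy sketch of that pooled-covariance workaround (function and variable names are illustrative assumptions, not from the slides):

    import numpy as np

    def pooled_covariance(X, y):
        """Estimate one shared covariance matrix from all classes.

        X: (N, D) array of examples; y: (N,) array of class labels.
        Each class is centered on its own mean before pooling, so the
        estimate reflects within-class scatter only.
        """
        classes = np.unique(y)
        centered = np.vstack([X[y == c] - X[y == c].mean(axis=0) for c in classes])
        # Unbiased pooled estimate: divide by N minus the number of classes
        return centered.T @ centered / (len(X) - len(classes))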

Quadratic classifier examples (Cont.)

Conclusions. We can extract the following conclusions:
1. The Bayes classifier for normally distributed classes is quadratic.
(1) The Bayes classifier for normally distributed classes with equal covariance matrices is a linear classifier.
(2) The minimum Mahalanobis distance classifier is optimum for normally distributed classes with equal covariance matrices and equal priors.
(3) The minimum Euclidean distance classifier is optimum for normally distributed classes with equal covariance matrices proportional to the identity matrix and equal priors.
(4) Both Euclidean and Mahalanobis distance classifiers are linear.

Quadratic classifier examples (Cont.)

Conclusions (cont.):
2. The goal of this discussion was to show that some of the most popular classifiers can be derived from decision-theoretic principles and some simplifying assumptions.
(1) It is important to realize that using a specific (Euclidean or Mahalanobis) minimum distance classifier implicitly corresponds to certain statistical assumptions.
(2) Whether or not these assumptions hold can rarely be answered in practice; in most cases we are limited to posing and answering the question "does this classifier solve our problem or not?"

Section Five: K Nearest Neighbor (kNN) classifiers
1. K Nearest Neighbor classifier
2. kNN in action: examples
3. kNN versus 1NN
4. Characteristics of the kNN classifier

1. K Nearest Neighbor classifier

The kNN classifier is based on non-parametric density estimation techniques.
(1) Let us assume we seek to estimate the density function p(x) from a dataset of examples.
(2) p(x) can be approximated by the expression below.
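A plausible reconstruction of the missing expression (the standard k-nearest-neighbor density estimate):

    p(x) \approx \frac{k}{N\, V}

where N is the total number of examples and k is the number of examples falling inside a volume V centered at x.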

1. K Nearest Neighbor classifier (Cont.)

(3) The volume V is determined by the distance R_k(x) between x and its k-th nearest neighbor in D dimensions, where C_D is the volume of the unit sphere in D dimensions.
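A plausible reconstruction of the volume expression:

    V = C_D \, R_k^D(x), \qquad C_D = \frac{\pi^{D/2}}{\Gamma(D/2 + 1)}

so that C_1 = 2, C_2 = π, C_3 = 4π/3, and so on.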

1. K Nearest Neighbor classifier (Cont.)

We use the previous result to estimate the posterior probability. The priors can be estimated from the class counts, and substituting both into Bayes rule gives the posterior, yielding the discriminant functions. This is known as the k Nearest Neighbor classifier.
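A plausible reconstruction of the chain of missing expressions (the standard derivation): let k_i of the k neighbors belong to class ω_i, and let N_i be the number of examples of class ω_i in the dataset. Then

    p(x \mid \omega_i) \approx \frac{k_i}{N_i V}, \qquad
    P(\omega_i) \approx \frac{N_i}{N}

    P(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, P(\omega_i)}{p(x)}
        \approx \frac{\dfrac{k_i}{N_i V} \cdot \dfrac{N_i}{N}}{\dfrac{k}{N V}} = \frac{k_i}{k}

    g_i(x) = k_i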

1. K Nearest Neighbor classifier (Cont.)

The kNN classifier is a very intuitive method: examples are classified based on their similarity to the training data. For a given unlabeled example x_u ∈ R^D, find the k "closest" labeled examples in the training data set and assign x_u to the class that appears most frequently within the k-subset (a minimal sketch follows).
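A minimal numpy sketch of that rule, assuming Euclidean distance as the measure of "closeness" (names are illustrative, not from the slides):

    import numpy as np

    def knn_classify(X_train, y_train, x_u, k=5):
        """Assign x_u to the majority class among its k nearest neighbors."""
        # Euclidean distance from x_u to every training example
        dists = np.linalg.norm(X_train - x_u, axis=1)
        # Labels of the k closest training examples
        nearest = y_train[np.argsort(dists)[:k]]
        # Majority vote within the k-subset
        labels, counts = np.unique(nearest, return_counts=True)
        return labels[np.argmax(counts)]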

1. K Nearest Neighbor classifier (Cont.)

The kNN only requires:
(1) an integer k;
(2) a set of labeled examples;
(3) a measure of "closeness".

2. kNN in action: examples

We generate data for a 2-dimensional, 3-class problem, where the class-conditional densities are multimodal and non-linearly separable. We use kNN with k = 5 and Euclidean distance as the metric. [Figure not reproduced: the data and the resulting kNN decision regions.]
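A minimal sketch reproducing that kind of experiment with scikit-learn (the data-generation parameters are assumptions; the slides' actual dataset is not available):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    # Two Gaussian blobs per class -> multimodal class-conditional densities
    means = {0: [(0, 0), (4, 4)], 1: [(0, 4), (4, 0)], 2: [(2, 2), (6, 2)]}
    X = np.vstack([rng.normal(m, 0.6, size=(50, 2))
                   for c in means for m in means[c]])
    y = np.repeat([c for c in means for _ in means[c]], 50)

    clf = KNeighborsClassifier(n_neighbors=5, metric="euclidean").fit(X, y)
    print(clf.predict([[1.0, 1.0]]))  # predicted class label for a test point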

3. kNN versus 1NN

[Figure not reproduced: comparison of 1NN and kNN decision regions on the same data.]

4. Characteristics of the kNN classifier

(1) Advantages
(a) Analytically tractable, simple implementation; nearly optimal in the large-sample limit (N → ∞), where the asymptotic 1NN error is bounded by

    P[error]_Bayes ≤ P[error]_1NN ≤ 2 P[error]_Bayes

(b) Uses local information, which can yield highly adaptive behavior.
(c) Lends itself very easily to parallel implementations.

(2) Disadvantages
(a) Large storage requirements.
(b) Computationally intensive recall.

4. Characteristics of the kNN classifier (Cont.)

(3) 1NN versus kNN
The use of large values of k has two main advantages:
(a) it yields smoother decision regions;
(b) it provides probabilistic information: the ratio of examples for each class gives information about the ambiguity of the decision.
However, too large a value of k is detrimental:
(a) it destroys the locality of the estimation;
(b) in addition, it increases the computational burden.