Gaussian Process Ruohua Shi Meeting 2018.08.17.



Presentation goals
1. Some Reviews
2. What is a Gaussian Distribution?
3. What is a Gaussian Process?
4. How to use a Gaussian Process?
   • Gaussian Process Regression
   • A Sample
   • Gaussian Process Classification
5. Reference

Some Reviews
Random variable: a function that assigns a real number to every possible outcome of a random experiment, $X: \Omega \to \mathbb{R}$. There are two types of random variables, discrete and continuous.
Sample space: the set of all possible outcomes of a random experiment, $\Omega$.
Probability: $P(X = A) = p$
Mean (discrete case): $\mu = E[X] = \sum_{i=1}^{n} x_i p_i$
Variance: $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 p_i$
Standard deviation: $\sigma = \left( \sum_{i=1}^{n} (x_i - \mu)^2 p_i \right)^{1/2}$
Distribution function: $F_X(x) = P\{\omega \in \Omega;\ X(\omega) < x\} = P(X < x)$
Density function (continuous case): $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
Joint distribution: for discrete random variables $X_1$ and $X_2$, $f_{X_1, X_2}(x_1, x_2) = P\{X_1 = x_1,\ X_2 = x_2\}$
Independent random variables
Conditional distributions
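As a small worked example of the discrete formulas above, the following sketch computes the mean, variance, standard deviation, and distribution function for a fair six-sided die. The die and the helper name `cdf` are illustrative assumptions, not part of the slides.

```python
import numpy as np

# Hypothetical discrete random variable: a fair six-sided die.
x = np.arange(1, 7, dtype=float)   # outcomes x_i
p = np.full(6, 1.0 / 6.0)          # probabilities p_i, summing to 1

mu = np.sum(x * p)                 # mean: sum_i x_i p_i
var = np.sum((x - mu) ** 2 * p)    # variance: sum_i (x_i - mu)^2 p_i
sigma = np.sqrt(var)               # standard deviation


def cdf(t):
    """Distribution function F_X(t) = P(X < t), strict inequality as on the slide."""
    return float(np.sum(p[x < t]))


print(mu, var, sigma, cdf(4))
```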

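Gaussian Distribution
$\mathbf{x} \sim N_p(\mu, \Sigma)$
$\mu = E(\mathbf{x})$
$\Sigma = E[(\mathbf{x} - \mu)(\mathbf{x} - \mu)^T] = E[\mathbf{x}\mathbf{x}^T] - \mu\mu^T$
$\Sigma_{ij} = \mathrm{Cov}[X_i, X_j]$
[The rest of the slide's derivation ("Therefore we conclude that ...", "Pf. For any vector ...") was an image and is not preserved in this transcript.]
Speaker note: the multivariate normal distribution has a very important property. If N variables follow a multivariate normal distribution, then the joint distribution of any n < N of them, collected into a vector, is again multivariate normal. Few other families seem to be this convenient (the note tentatively names the Gamma, with a question mark), which is why the Gaussian is so popular for high-dimensional joint distributions.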
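The marginalization property in the note can be checked numerically. In the sketch below, the marginal of a sub-vector is read off by selecting the matching entries of the mean and the matching block of the covariance, then compared with sample estimates. The particular mean, covariance, and index set are made-up illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-dimensional Gaussian (values chosen only for illustration).
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 0.5]])

# Marginal over components 0 and 2: pick the matching entries and block.
idx = [0, 2]
mu_marg = mu[idx]
Sigma_marg = Sigma[np.ix_(idx, idx)]

# Empirical check: sample the full vector, keep only the chosen components.
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
sub = samples[:, idx]
emp_mean = sub.mean(axis=0)
emp_cov = np.cov(sub, rowvar=False)

print(mu_marg, emp_mean)
print(Sigma_marg, emp_cov)
```

The empirical mean and covariance of the sub-vector should agree with `mu_marg` and `Sigma_marg` up to sampling error.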

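What is a stochastic process?
Parameter set: $X = \{x_1, \dots, x_n\}$
State set: $Z = \{Z_1, Z_2, \dots, Z_n\}$, with $Z_i = f(x_i)$, $i = 1, \dots, n$

Four types, by parameter (time) and state:
(1) Discrete time, discrete state: Bernoulli process
(2) Discrete time, continuous state: white noise process
(3) Continuous time, discrete state: Poisson process
(4) Continuous time, continuous state: Gaussian process

Speaker notes:
Discrete parameter, discrete state, e.g. the Bernoulli process. The simplest and earliest studied stochastic process is the random walk, a mathematical model of a particle moving randomly on the lattice $\mathbb{Z} := \{0, \pm 1, \pm 2, \dots\}$. Let $\{Z_n : n \ge 1\}$ be a sequence of independent, identically distributed random variables on some probability space, each with $P(Z_i = 1) = p$ and $P(Z_i = -1) = 1 - p$.
Discrete parameter, continuous state, e.g. the Gaussian white noise process. Across the time points $x_i$, the variables $Z_1, \dots, Z_n$ are independent and identically distributed as $N(0, \sigma^2)$.
Continuous parameter, discrete state, e.g. the counting process. If $N_t$ denotes the total number of occurrences of some random event up to time $t$, the real-valued process $\{N_t, t \ge 0\}$ is called a counting process. An example is the number of people who have entered a shop up to time $t$. A counting process usually satisfies: $N_t$ is a non-negative integer and $N_0 = 0$.
Continuous parameter, continuous state, e.g. the Gaussian process. Let $Z = \{Z_x, x \in X\}$ be a real-valued stochastic process. If for every $n \ge 1$ and every $x_1, x_2, \dots, x_n \in X$ the n-dimensional random vector $(Z_1, Z_2, \dots, Z_n)$ follows an n-dimensional normal distribution, then $Z$ is called a Gaussian process. If $X$ is a continuous set, the Gaussian process is a continuous-parameter, continuous-state stochastic process.
Source: 随机过程及其应用（第2版） (Stochastic Processes and Their Applications, 2nd ed.), 西安电子科技大学出版社 (Xidian University Press); 6 edition (May 1 2012)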
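To make the first three process types concrete, here is a minimal simulation sketch: a random walk driven by ±1 Bernoulli steps, Gaussian white noise, and a Poisson counting process built from exponential inter-arrival times. The parameter values (`p`, `sigma`, `lam`, `T`) and the helper `N` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# (1) Discrete time, discrete state: steps Z_i = +1 with probability p,
#     -1 otherwise; the random walk is their running sum.
p = 0.5
steps = np.where(rng.random(n) < p, 1, -1)
walk = np.cumsum(steps)

# (2) Discrete time, continuous state: i.i.d. N(0, sigma^2) white noise.
sigma = 2.0
noise = rng.normal(0.0, sigma, size=n)

# (3) Continuous time, discrete state: Poisson counting process on [0, T],
#     simulated through exponential gaps between events (rate lam).
lam, T = 3.0, 10.0
gaps = rng.exponential(1.0 / lam, size=200)
arrivals = np.cumsum(gaps)
arrivals = arrivals[arrivals <= T]


def N(t):
    """Counting process N_t: number of events that occurred up to time t."""
    return int(np.searchsorted(arrivals, t, side="right"))


print(walk[-1], noise.std(), N(T))
```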

What is a Gaussian process?
Let $Z = \{Z_1, Z_2, \dots, Z_n\}$ be an n-dimensional vector of function values evaluated at n points $x_i \in X$, $i = 1, \dots, n$, with $Z_i = f(x_i)$.
• Note that Z is a random vector.
• Definition: Z is a Gaussian process if, for any finite subset of points $\{x_1, \dots, x_n\}$, the marginal distribution of Z over that finite subset is a multivariate Gaussian distribution.
A Gaussian process is parameterized by a mean function $\mu(x)$ and a covariance function (kernel) $K = (k_{ij})_{n \times n}$, where $k_{ij} = k(x_i, x_j)$.
Speaker note: the definition does not require $Z_1$ and $Z_2$ to be identically distributed. It only requires that each is Gaussian and that their joint distribution is Gaussian as well.
Source: Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Chris Williams, the MIT Press, 2006
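The definition suggests a direct way to draw sample functions: pick a finite grid of inputs, build the mean vector and kernel matrix, and sample one multivariate Gaussian. The sketch below assumes a zero mean function and a squared-exponential kernel; neither choice is stated on the slide, and the small diagonal jitter is only a numerical safeguard.

```python
import numpy as np


def sq_exp_kernel(xa, xb, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice of k)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


rng = np.random.default_rng(2)

x = np.linspace(-3.0, 3.0, 50)        # finite set of input points x_1..x_n
mu = np.zeros_like(x)                 # assumed zero mean function
K = sq_exp_kernel(x, x)               # K = (k_ij), k_ij = k(x_i, x_j)

# Jitter keeps the Cholesky factorization numerically stable.
jitter = 1e-6 * np.eye(len(x))
L = np.linalg.cholesky(K + jitter)

# Each column is one draw of (Z_1, ..., Z_n) ~ N(mu, K).
Z = mu[:, None] + L @ rng.standard_normal((len(x), 3))
print(Z.shape)
```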

Kernels
$k(x, x') = \varphi(x)^T \varphi(x')$
$\varphi(x)$: a nonlinear feature space mapping.
$k$ is a symmetric function of its arguments, so that $k(x, x') = k(x', x)$.
Source: Pattern Recognition and Machine Learning, Christopher Bishop, 2006
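A standard textbook instance of this identity is the quadratic kernel in two dimensions, where $(x^T x')^2$ equals an inner product of explicit features $\varphi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. The sketch below checks the identity and the symmetry on two arbitrary example points; the specific kernel and points are illustrative choices.

```python
import numpy as np


def phi(x):
    """Explicit feature map for the 2-D quadratic kernel."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])


def k(x, xp):
    """Quadratic kernel k(x, x') = (x^T x')^2, computed without the feature map."""
    return float(np.dot(x, xp)) ** 2


x = np.array([1.0, 2.0])
xp = np.array([3.0, -1.0])

via_features = float(phi(x) @ phi(xp))   # phi(x)^T phi(x')
via_kernel = k(x, xp)                    # same value, no feature map needed
print(via_features, via_kernel, k(xp, x))
```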

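Gaussian Process Regression
Model and inference: the slide separates the observed values from the unobserved ones. [The slide's equations were images and are not preserved in this transcript.]
Here m is the mean of the predicted values and D is their variance. A confidence interval for the prediction follows from the mean and the variance.
Source: Pattern Recognition and Machine Learning, Christopher Bishop, 2006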
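Since the slide's own equations did not survive, the following sketch uses the standard GP predictive formulas from the cited Bishop text, $m = k^T C^{-1} t$ and $D = c - k^T C^{-1} k$, as a plausible stand-in rather than a reconstruction of the slide. The squared-exponential kernel, the noise variance, the toy data, and the function name `gp_predict` are all assumptions.

```python
import numpy as np


def sq_exp_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel with unit variance (an assumed choice)."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_predict(x_obs, t_obs, x_new, noise_var=1e-2, length_scale=1.0):
    """Predictive mean m and variance D at x_new given observed (x_obs, t_obs)."""
    C = sq_exp_kernel(x_obs, x_obs, length_scale) + noise_var * np.eye(len(x_obs))
    k = sq_exp_kernel(x_obs, x_new, length_scale)      # shape (n_obs, n_new)
    c = 1.0 + noise_var                                # k(x*, x*) plus noise
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t_obs))   # C^{-1} t
    m = k.T @ alpha                                    # m = k^T C^{-1} t
    v = np.linalg.solve(L, k)
    D = c - np.sum(v ** 2, axis=0)                     # D = c - k^T C^{-1} k
    return m, D


# Toy data: noiseless samples of sin(x), purely for illustration.
x_obs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
t_obs = np.sin(x_obs)
x_new = np.array([0.5, 10.0])

m, D = gp_predict(x_obs, t_obs, x_new)
lower = m - 1.96 * np.sqrt(D)    # approximate 95% interval from mean and variance
upper = m + 1.96 * np.sqrt(D)
print(m, D, lower, upper)
```

Near the observations the variance D is small and the interval is tight; far from them (here at x = 10) the mean falls back to the prior mean of zero and D returns to the prior variance plus noise.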

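Relationship to Polynomial Regression
[The slide's equations were images and are not preserved in this transcript. The surviving labels are "prior", "Linear (Part 1)" and "Non-Linear (Part 2)".]
GP regression maximizes a function: the likelihood function, which gives a point estimate of θ.
Polynomial regression minimizes a function.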
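The contrast between the two objectives can be sketched as follows. For the GP side, the code evaluates the log marginal likelihood $\ln p(t \mid \theta) = -\tfrac{1}{2}\ln|C| - \tfrac{1}{2} t^T C^{-1} t - \tfrac{n}{2}\ln 2\pi$ on a small grid of length scales and keeps the maximizer as the point estimate of θ. For the polynomial side, it minimizes the sum of squared residuals with `np.polyfit`. These specific objectives, the kernel, the grid, and the toy data are assumptions, since the slide's formulas are missing.

```python
import numpy as np


def sq_exp_kernel(xa, xb, length_scale):
    """Squared-exponential kernel (assumed), with theta = length_scale."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def log_marginal_likelihood(x, t, length_scale, noise_var=1e-2):
    """ln p(t | theta) for a zero-mean GP with Gaussian noise."""
    n = len(x)
    C = sq_exp_kernel(x, x, length_scale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t))
    # -0.5 * ln|C| equals -sum(log(diag(L))) for the Cholesky factor L.
    return -0.5 * t @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)


rng = np.random.default_rng(3)
x = np.linspace(-3.0, 3.0, 20)
t = np.sin(x) + 0.1 * rng.standard_normal(20)   # toy data for illustration

# GP regression: maximize the likelihood over theta (grid search for simplicity).
grid = np.array([0.1, 0.3, 1.0, 3.0, 10.0])
scores = np.array([log_marginal_likelihood(x, t, ls) for ls in grid])
best_length_scale = grid[np.argmax(scores)]

# Polynomial regression: minimize the sum of squared residuals.
coeffs = np.polyfit(x, t, deg=3)
residual = np.sum((np.polyval(coeffs, x) - t) ** 2)

print(best_length_scale, residual)
```

A grid search is used only to keep the sketch short; gradient-based optimization of the likelihood is the more common choice in practice.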

Example
Source: Gaussian Processes for Regression: A Quick Introduction, M. Ebden, August 2008.

Gaussian Process Classification (GPC)
Consider a two-class problem with a target variable $y \in \{-1, 1\}$.
Input: $\mathbf{x} = (x_1, \dots, x_n)^T$
Observed values: $\mathbf{y} = (y_1, \dots, y_n)^T$
Test data: $x_*$
Target value: $y_*$
Goal: $P(y_* \mid \mathbf{y})$
Place a Gaussian process prior over the latent function f, with a covariance function $k(x, x' \mid \theta)$ that may depend on hyperparameters $\theta$.
Source: Pattern Recognition and Machine Learning, Christopher Bishop, 2006
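The generative side of this model can be sketched in a few lines: draw a latent function f from the GP prior, then squash it through a logistic sigmoid so that $P(y = 1 \mid f) = \sigma(f)$ and $P(y = -1 \mid f) = \sigma(-f)$. The squared-exponential kernel and the logistic link are assumptions here. The sketch covers only the prior and the likelihood, not the predictive $P(y_* \mid \mathbf{y})$, which has no closed form and needs an approximation such as the Laplace method.

```python
import numpy as np


def sq_exp_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel (assumed form of k(x, x' | theta))."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def sigmoid(a):
    """Logistic sigmoid, mapping latent values to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))


rng = np.random.default_rng(4)
x = np.linspace(-3.0, 3.0, 40)

# Latent function drawn from the GP prior (jitter added for stability).
K = sq_exp_kernel(x, x) + 1e-6 * np.eye(len(x))
f = np.linalg.cholesky(K) @ rng.standard_normal(len(x))

p_pos = sigmoid(f)     # P(y = +1 | f)
p_neg = sigmoid(-f)    # P(y = -1 | f)

# Draw labels in {-1, +1} according to those probabilities.
y = np.where(rng.random(len(x)) < p_pos, 1, -1)
print(y[:10], p_pos[:3])
```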

Advantages and Disadvantages
Advantages
1) The GP model is regarded as a basic framework for statistical machine learning (interpolation, fitting).
2) The GP model combines kernel-based machine learning with Bayesian inference and inherits the advantages of both. A GP produces probabilistic output, and different kernels can be specified.
Disadvantages
1) The hyperparameters of the GP model, such as those of the covariance function and the undetermined parameters of the prior distribution, have a large impact on the learning and prediction results, yet there is no clear guidance on how to choose suitable initial values.
2) The GP model does not work well on sparse samples.
Sources: http://www.gaussianprocess.org/ ; 周亚同, 陈子一, 马尽文. 从高斯过程到高斯过程混合模型：研究与展望. Journal of Signal Processing. Vol.32 No.8. Aug 2016.

Reference
[1] 随机过程及其应用（第2版）. 西安电子科技大学出版社; 6 edition (May 1 2012).
[2] Carl Edward Rasmussen and Chris Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006.
[3] Christopher Bishop. Pattern Recognition and Machine Learning. 2006.
[4] M. Ebden. Gaussian Processes for Regression: A Quick Introduction. August 2008.
[5] 周亚同, 陈子一, 马尽文. 从高斯过程到高斯过程混合模型：研究与展望. Journal of Signal Processing. Vol.32 No.8. Aug 2016.
[6] http://www.gaussianprocess.org/