Gaussian Process Ruohua Shi Meeting 2018.08.17.

Presentation transcript:

Presentation goals
1. Some reviews
2. What is a Gaussian distribution?
3. What is a Gaussian process?
4. How to use a Gaussian process?
   • Gaussian process regression
   • A sample
   • Gaussian process classification
5. References

Some Reviews
Random variable: a function that assigns a real number to every possible outcome of a random experiment, $X: \Omega \to \mathbb{R}$. There are two types of random variables, discrete and continuous.
Sample space: the set of possible outcomes of a random experiment, $\Omega$.
Probability: $P(X = A) = p$
Mean: $\mu = E_X(X) = \sum_{i=1}^{n} x_i p_i$
Variance: $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 p_i$
Standard deviation: $\sigma = \left( \sum_{i=1}^{n} (x_i - \mu)^2 p_i \right)^{1/2}$
Distribution function: $F_X(x) = P\{\omega \in \Omega;\ X(\omega) < x\} = P(X < x)$
Density function (continuous case): $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
Joint distribution: for discrete random variables $X_1$ and $X_2$, $f_{X_1, X_2}(x_1, x_2) = P\{X_1 = x_1, X_2 = x_2\}$
Independent random variables; conditional distributions
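The discrete formulas above can be checked on a small example. The sketch below assumes a fair six-sided die (an illustration that is not in the slides) and uses the strict inequality $P(X < x)$ from the slide for the distribution function.

```python
import math

# Hypothetical example: a fair six-sided die (not from the slides).
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

# Mean: mu = sum_i x_i * p_i
mean = sum(x * p for x, p in zip(values, probs))

# Variance: sigma^2 = sum_i (x_i - mu)^2 * p_i
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)


def cdf(x):
    """Distribution function F_X(x) = P(X < x), strict as on the slide."""
    return sum(p for v, p in zip(values, probs) if v < x)


print(mean, variance, std_dev, cdf(4))
```

Many textbooks define the distribution function with $\le$ instead of $<$; for a discrete variable the two differ only at the jump points.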

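Gaussian Distribution
$\mathbf{x} \sim N_p(\mu, \Sigma)$, where
$\mu = E(\mathbf{x})$
$\Sigma = E[(\mathbf{x} - \mu)(\mathbf{x} - \mu)^T] = E[\mathbf{x}\mathbf{x}^T] - \mu\mu^T$
$\sigma_{ij} = \mathrm{Cov}[X_i, X_j]$ is the $(i, j)$ entry of $\Sigma$.
[The remaining statement on this slide and its proof ("Therefore we conclude that ...", "Pf. For any vector ...") were equation images and are not preserved in the transcript.]
Speaker notes: The multivariate normal distribution has a very important property. If N variables jointly follow a multivariate normal distribution, then the joint distribution of the vector formed by any n < N of them is again multivariate normal. Apart from the normal, and perhaps the Gamma (is Gamma one?), there seem to be hardly any other families with this property. That is why the Gaussian distribution is so popular for handling high-dimensional joint distributions.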
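The marginalization property in the notes can be illustrated numerically. The sketch below is a sanity check rather than a proof: the mean vector, covariance matrix, and the choice of which components to keep are all made-up values. It draws samples from a 4-dimensional normal, keeps two components, and compares their empirical mean and covariance with the matching sub-vector of $\mu$ and sub-matrix of $\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-dimensional Gaussian (values chosen for illustration).
mu = np.array([0.0, 1.0, -1.0, 2.0])
Sigma = np.array([
    [2.0, 0.6, 0.3, 0.1],
    [0.6, 1.5, 0.2, 0.4],
    [0.3, 0.2, 1.5, 0.5],
    [0.1, 0.4, 0.5, 1.2],
])  # symmetric and strictly diagonally dominant, hence positive definite

samples = rng.multivariate_normal(mu, Sigma, size=200_000)

# Keep n = 2 of the N = 4 variables.
keep = [0, 2]
sub = samples[:, keep]

# Empirical moments of the kept components.
emp_mean = sub.mean(axis=0)
emp_cov = np.cov(sub, rowvar=False)

# Theoretical marginal: pick the matching entries of mu and Sigma.
mu_marg = mu[keep]
Sigma_marg = Sigma[np.ix_(keep, keep)]

print(emp_mean, mu_marg)
print(emp_cov, Sigma_marg)
```

Matching first and second moments does not by itself show that the marginal is Gaussian; the example only shows that the marginal parameters are read off directly from $\mu$ and $\Sigma$.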

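What is a stochastic process?
[Figure: points $x_1, x_2, \dots, x_n$ in the parameter set X mapped by f(x) to states Z.]
Parameter set: $X = \{x_1, \dots, x_n\}$
State set: $Z = \{Z_1, Z_2, \dots, Z_n\}$, with $Z_i = f(x_i)$, $i = 1, \dots, n$
(1) Discrete time, discrete state: Bernoulli process
(2) Discrete time, continuous state: white noise process
(3) Continuous time, discrete state: Poisson process
(4) Continuous time, continuous state: Gaussian process
Speaker notes:
Discrete parameter, discrete state, e.g. the Bernoulli process. The simplest and earliest-studied stochastic process is the random walk, a mathematical model of a particle moving randomly on the lattice points $\mathbb{Z} := \{0, \pm 1, \pm 2, \dots\}$. Let $\{Z_n : n \ge 1\}$ be a sequence of independent, identically distributed random variables on some probability space, each following a Bernoulli distribution, that is, $P(Z_i = 1) = p$ and $P(Z_i = -1) = 1 - p$.
Discrete parameter, continuous state, e.g. the Gaussian white noise process. Over the time points $x_i$, the variables $Z_1, \dots, Z_n$ are independent and identically distributed white noise following the normal distribution $N(0, \sigma^2)$.
Continuous parameter, discrete state, e.g. a counting process. If $N_t$ denotes the total number of occurrences of some random event up to time t, the real-valued stochastic process $\{N_t, t \ge 0\}$ is called a counting process. An example is the number of people $N_t$ who have entered a shop up to time t. A counting process usually satisfies: $N_t$ is a non-negative integer and $N_0 = 0$.
Continuous parameter, continuous state, e.g. the Gaussian process. Let $Z = \{Z_x, x \in X\}$ be a real-valued stochastic process. If for every $n \ge 1$ and every $x_1, x_2, \dots, x_n \in X$ the n-dimensional random vector $(Z_1, Z_2, \dots, Z_n)$ follows an n-dimensional normal distribution, then Z is called a Gaussian process. If X is a continuous set, this Gaussian process is a continuous-parameter, continuous-state stochastic process.
随机过程及其应用 (Stochastic Processes and Their Applications), 2nd edition; 西安电子科技大学出版社 (Xidian University Press); 6 edition (May 1 2012)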
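The first three process types can be simulated in a few lines. This is a rough sketch under assumed parameters (p, sigma, the rate lam, and the horizon T are illustrative values, not from the slides). The counting process is simulated as a Poisson process by accumulating exponential inter-arrival times, which is one standard construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# (1) Discrete time, discrete state: steps in {+1, -1} and their random walk.
p = 0.5  # assumed P(Z_i = +1)
steps = rng.choice([1, -1], size=n, p=[p, 1 - p])
walk = np.concatenate(([0], np.cumsum(steps)))  # position after each step

# (2) Discrete time, continuous state: Gaussian white noise N(0, sigma^2).
sigma = 1.0  # assumed standard deviation
white_noise = rng.normal(0.0, sigma, size=n)

# (3) Continuous time, discrete state: Poisson counting process on [0, T].
lam, T = 2.0, 10.0  # assumed event rate and time horizon
gaps = rng.exponential(1.0 / lam, size=int(5 * lam * T))  # inter-arrival times
arrival_times = np.cumsum(gaps)
arrival_times = arrival_times[arrival_times <= T]


def N(t):
    """Number of events that have occurred up to and including time t."""
    return int(np.searchsorted(arrival_times, t, side="right"))


print(walk[-1], white_noise.std(), N(T))
```

By construction `N(0)` is 0 and `N(t)` is a non-decreasing non-negative integer, matching the two properties of a counting process listed in the notes.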

What is a Gaussian process?
Let $Z = \{Z_1, Z_2, \dots, Z_n\}$ be an n-dimensional vector of function values evaluated at n points $x_i \in X$, $i = 1, \dots, n$, with $Z_i = f(x_i)$.
• Note that Z is a random vector.
• Definition: Z is a Gaussian process if, for any finite subset of $\{x_1, \dots, x_n\}$, the marginal distribution of Z over that finite subset is a multivariate Gaussian distribution.
A Gaussian process is parameterized by a mean function $\mu(x)$ and a covariance function (kernel) with matrix $K = (k_{ij})_{n \times n}$, where $k_{ij} = k(x_i, x_j)$.
Speaker notes: The definition does not require $Z_1, Z_2, \dots$ to be identically distributed. It only requires that each of them is Gaussian and that their joint distribution is Gaussian as well.
Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Chris Williams, the MIT Press, 2006
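To make the definition concrete, the sketch below draws sample vectors Z from a Gaussian process prior on a finite grid. It assumes a zero mean function and a squared-exponential (RBF) kernel; the slides do not fix either choice, and the helper name `rbf_kernel` is hypothetical. A small jitter term is added to the diagonal so that the Cholesky factorization is numerically stable.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2))."""
    xa = np.asarray(xa, dtype=float).reshape(-1, 1)
    xb = np.asarray(xb, dtype=float).reshape(-1, 1)
    return np.exp(-0.5 * (xa - xb.T) ** 2 / length_scale**2)


rng = np.random.default_rng(2)

x = np.linspace(-3.0, 3.0, 50)  # the finite set of inputs x_1, ..., x_n
mu = np.zeros_like(x)           # assumed mean function mu(x) = 0
K = rbf_kernel(x, x)            # covariance matrix with k_ij = k(x_i, x_j)

# Z ~ N(mu, K): multiply standard normal draws by a Cholesky factor of K.
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))
Z = mu + (L @ rng.standard_normal((len(x), 3))).T  # 3 draws, shape (3, 50)

print(Z.shape)
```

Each row of `Z` is one function sampled at the 50 grid points. Changing `length_scale` changes how quickly the sampled functions vary.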

Kernels
$k(x, x') = \varphi(x)^T \varphi(x')$
$\varphi(x)$: a nonlinear feature space mapping.
k is a symmetric function of its arguments, so that $k(x, x') = k(x', x)$.
Pattern Recognition and Machine Learning. Bishop, Christopher, 2006
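A small worked case shows the identity $k(x, x') = \varphi(x)^T \varphi(x')$. The sketch assumes the quadratic kernel $k(x, z) = (x^T z)^2$ on two-dimensional inputs, whose explicit feature map is $\varphi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. The input vectors are made-up values.

```python
import numpy as np


def phi(x):
    """Explicit feature map for the quadratic kernel on 2-D inputs."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2.0) * x1 * x2, x2**2])


def k(x, z):
    """Quadratic kernel k(x, z) = (x^T z)^2, computed without phi."""
    return float(np.dot(x, z)) ** 2


x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

via_kernel = k(x, z)              # (1*3 + 2*(-1))^2
via_features = phi(x) @ phi(z)    # inner product in feature space

print(via_kernel, via_features)
```

Both quantities agree, and swapping the arguments leaves the value unchanged, which is the symmetry property stated above.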

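Gaussian Process Regression
MODEL and INFERENCE: [The model and inference equations on this slide, including the split into observed and unobserved variables, were images and are not preserved in the transcript.]
Here m is the mean of the predicted values and D is their variance. Together, the mean and the variance give a confidence interval for each prediction.
Pattern Recognition and Machine Learning. Bishop, Christopher, 2006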
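Since the slide's equations are lost, the sketch below uses the standard textbook form of GP regression as found in the cited references, which may differ in notation from the slide. With training covariance $K$ (including noise), cross-covariance $K_*$, and test covariance $K_{**}$, the predictive mean is $m = K_*^T K^{-1} y$ and the predictive variance is $D = \mathrm{diag}(K_{**}) - \mathrm{diag}(K_*^T K^{-1} K_*)$. The RBF kernel, the noise level, the data, and the function names are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice of covariance)."""
    xa = np.asarray(xa, dtype=float).reshape(-1, 1)
    xb = np.asarray(xb, dtype=float).reshape(-1, 1)
    return np.exp(-0.5 * (xa - xb.T) ** 2 / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    """Return predictive mean m and variance D of the latent function."""
    K = rbf_kernel(x_train, x_train, length_scale)
    K = K + noise_var * np.eye(len(x_train))        # observed part
    K_s = rbf_kernel(x_train, x_test, length_scale)  # observed vs unobserved
    K_ss_diag = np.ones(len(x_test))                 # k(x, x) = 1 for this kernel

    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    m = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    D = K_ss_diag - np.sum(v**2, axis=0)
    return m, D


# Hypothetical data: noise-free samples of sin(x).
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.sin(x_train)
x_test = np.linspace(-3.0, 3.0, 25)

m, D = gp_predict(x_train, y_train, x_test)

# Approximate 95% interval from the mean and variance.
half_width = 1.96 * np.sqrt(np.maximum(D, 0.0))
lower, upper = m - half_width, m + half_width
print(m[:3], D[:3])
```

The variance `D` here describes the latent function; to get an interval for a new noisy observation, `noise_var` would be added to `D`. The interval narrows near training points and widens away from them.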

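Relationship to Polynomial Regression
Prior. Part 1: linear. Part 2: non-linear.
GP regression maximizes the likelihood function, which gives a point estimate of the hyperparameters $\theta$. [The function is not preserved in the transcript.]
Polynomial regression minimizes an error function. [The function is not preserved in the transcript.]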
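The two objectives can be placed side by side in code. The slide's own formulas are lost, so this sketch assumes the usual choices: for GP regression the log marginal likelihood $\log p(y \mid \theta) = -\tfrac{1}{2} y^T K_\theta^{-1} y - \tfrac{1}{2} \log |K_\theta| - \tfrac{n}{2} \log 2\pi$, maximized here by a coarse grid search over a single hyperparameter (the RBF length scale), and for polynomial regression the sum of squared errors, minimized by `numpy.polyfit`. The data, grid, noise level, and polynomial degree are illustrative.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice of covariance)."""
    xa = np.asarray(xa, dtype=float).reshape(-1, 1)
    xb = np.asarray(xb, dtype=float).reshape(-1, 1)
    return np.exp(-0.5 * (xa - xb.T) ** 2 / length_scale**2)


def log_marginal_likelihood(x, y, length_scale, noise_var=0.01):
    """log p(y | theta) for a zero-mean GP with Gaussian noise."""
    n = len(x)
    K = rbf_kernel(x, x, length_scale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    # log|K| = 2 * sum(log(diag(L))), so -0.5 * log|K| = -sum(log(diag(L)))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))


rng = np.random.default_rng(3)
x = np.linspace(0.0, 5.0, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(30)  # hypothetical noisy data

# GP regression: point estimate of theta by maximizing the likelihood.
grid = np.array([0.1, 0.3, 1.0, 3.0, 10.0])
scores = np.array([log_marginal_likelihood(x, y, ell) for ell in grid])
best_length_scale = grid[np.argmax(scores)]

# Polynomial regression: least-squares fit of an assumed degree-3 polynomial.
degree = 3
coeffs = np.polyfit(x, y, degree)
sse = np.sum((np.polyval(coeffs, x) - y) ** 2)

print(best_length_scale, sse)
```

In practice the likelihood is usually maximized with a gradient-based optimizer over all hyperparameters at once; the grid search only keeps the example short.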

Example
[Figure not preserved in the transcript.]
Gaussian Processes for Regression: A Quick Introduction, M. Ebden, August 2008.

Gaussian Process Classification (GPC)
Consider a two-class problem with a target variable $y \in \{-1, 1\}$.
Input: $\mathbf{x} = (x_1, \dots, x_n)^T$
Observed values: $\mathbf{y} = (y_1, \dots, y_n)^T$
Test data: $x_*$
Target value: $y_*$
Goal: $P(y_* \mid \mathbf{y})$
Place a Gaussian process prior over the latent function f with a covariance function $k(x, x' \mid \theta)$, which may depend on hyperparameters $\theta$.
Pattern Recognition and Machine Learning. Bishop, Christopher, 2006
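The generative side of this model can be sketched as follows: draw a latent function f from the GP prior, squash it through a link function to get $P(y = +1 \mid x)$, and sample labels in $\{-1, 1\}$. The logistic sigmoid link and the RBF kernel are assumptions here, since the slide names neither. The sketch does not compute the predictive distribution $P(y_* \mid \mathbf{y})$, which is not available in closed form and requires an approximation such as the Laplace method.

```python
import numpy as np


def rbf_kernel(xa, xb, length_scale=1.0):
    """Squared-exponential kernel; length_scale plays the role of theta."""
    xa = np.asarray(xa, dtype=float).reshape(-1, 1)
    xb = np.asarray(xb, dtype=float).reshape(-1, 1)
    return np.exp(-0.5 * (xa - xb.T) ** 2 / length_scale**2)


def sigmoid(a):
    """Logistic sigmoid, mapping the latent value to a probability."""
    return 1.0 / (1.0 + np.exp(-a))


rng = np.random.default_rng(4)
x = np.linspace(-3.0, 3.0, 40)

# Latent function f ~ GP(0, k), evaluated on the grid.
K = rbf_kernel(x, x, length_scale=1.0) + 1e-6 * np.eye(len(x))
f = np.linalg.cholesky(K) @ rng.standard_normal(len(x))

# Class probabilities and sampled labels y in {-1, +1}.
prob_pos = sigmoid(f)  # P(y = +1 | f(x))
y = np.where(rng.uniform(size=len(x)) < prob_pos, 1, -1)

print(prob_pos[:5], y[:5])
```

With labels in $\{-1, 1\}$ the likelihood can be written compactly as $P(y \mid f) = \sigma(y f)$, because $\sigma(-f) = 1 - \sigma(f)$.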

Advantages and Disadvantages
Advantages
1) The GP model is considered a basic framework for statistical machine learning (interpolation, fitting).
2) The GP model combines kernel machine learning with Bayesian inference and has the advantages of both. A GP produces probabilistic output, and different kernels can be specified.
Disadvantages
1) The hyperparameters of the GP model, such as those of the covariance function and the undetermined parameters of the prior distribution, have a large impact on the learning and prediction results, but there is no clear guidance on how to choose appropriate initial values.
2) The GP model does not work well on sparse samples.
http://www.gaussianprocess.org/
周亚同, 陈子一, 马尽文. 从高斯过程到高斯过程混合模型:研究与展望 (From Gaussian Processes to Gaussian Process Mixture Models: Research and Prospects). Journal of Signal Processing. Vol.32 No.8. Aug 2016.

References
[1] 随机过程及其应用 (Stochastic Processes and Their Applications), 2nd edition; 西安电子科技大学出版社 (Xidian University Press); 6 edition (May 1 2012)
[2] Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Chris Williams, the MIT Press, 2006
[3] Pattern Recognition and Machine Learning. Bishop, Christopher, 2006
[4] Gaussian Processes for Regression: A Quick Introduction, M. Ebden, August 2008.
[5] 周亚同, 陈子一, 马尽文. 从高斯过程到高斯过程混合模型:研究与展望 (From Gaussian Processes to Gaussian Process Mixture Models: Research and Prospects). Journal of Signal Processing. Vol.32 No.8. Aug 2016.
[6] http://www.gaussianprocess.org/