Simple Regression (簡單迴歸分析)


Social Research Methods 2109 & 6507
Spring 2006 (March 8, 9, 13, 2006)

From Correlation to Regression
Correlation (相關分析、相關係數): measures the strength of the linear association between two quantitative variables.
Regression (迴歸分析):
  Description (描述): summarizes the relationship between the two variables with a straight line. What does the line look like?
  Prediction (預測): how do we predict one variable from another?

Example: summarize the relationship with a straight line

Draw a straight line, but how? (怎麼畫那條直線?)

Notice that some predictions are not completely accurate

How to draw the line?
Purpose: draw the regression line that gives the most accurate predictions of y given x.
Criterion for “accurate”: the sum of squared prediction errors, $\sum_i (y_i - \hat{y}_i)^2$, that is, the sum of (observed y − predicted y)$^2$.
This quantity is called the sum of squared errors, or sum of squared residuals (SSE).

Ordinary Least Squares (OLS) Regression (普通最小平方法)
The regression line is drawn so as to minimize the sum of the squared vertical distances from the points to the line (that is, to minimize SSE).
This line minimizes the squared prediction error.
This line passes through the middle of the point cloud, which makes it a natural choice for describing the relationship.
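A minimal Python sketch of this criterion, on made-up data. The helper name `sse` and the five data points are assumptions for illustration, not part of the slides. It computes the SSE of any candidate line and compares a line drawn by eye with the least-squares line for these points (intercept 2.2, slope 0.6).

```python
def sse(x, y, intercept, slope):
    """Sum of squared errors of the line y_hat = intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical toy data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sse_guess = sse(x, y, intercept=1.0, slope=1.0)  # a line drawn by eye
sse_ols = sse(x, y, intercept=2.2, slope=0.6)    # least-squares line for these data

print(sse_guess, sse_ols)  # the least-squares line has the smaller SSE
```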

To describe a regression line (equation)
Algebraically, a line is described by its intercept (截距) and slope (斜率).
Notation:
  y = the dependent variable
  x = the independent variable
  $\hat{y}$ (written y_hat) = predicted y based on the regression line
  β = slope of the regression line
  α = intercept of the regression line

The meaning of slope and intercept
slope = change in $\hat{y}$ for a one-unit change in x
intercept = value of $\hat{y}$ when x is 0

General equation of a regression line
$\hat{y} = \alpha + \beta x$
where α and β are chosen to minimize $\sum_i (y_i - \hat{y}_i)^2$, the sum of (observed y − predicted y)$^2$.
Formulas for the α and β that minimize this sum are built into statistical programs and calculators.
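The slides do not spell out the formula. One standard closed form is $\beta = \sum_i (x_i - \bar{x})(y_i - \bar{y}) / \sum_i (x_i - \bar{x})^2$ and $\alpha = \bar{y} - \beta \bar{x}$. The sketch below applies it to made-up data; the function name `ols_fit` and the data are assumptions for illustration.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line for paired data."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical toy data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

alpha, beta = ols_fit(x, y)
print(round(alpha, 3), round(beta, 3))
```

In practice a statistical package would be used instead; the point here is only that α and β follow directly from the means and the sums of squares.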

An example of a regression line

Residuals (殘差)
Residual = observed y minus predicted y for an observation:
$\text{residual}_i = y_i - \hat{y}_i$
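As a small illustration, the residuals can be computed one observation at a time. The data and the fitted values α = 2.2, β = 0.6 are assumptions: a made-up example and its least-squares line.

```python
# Hypothetical toy data and its least-squares line (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
alpha, beta = 2.2, 0.6

y_hat = [alpha + beta * xi for xi in x]            # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]  # observed minus predicted

print([round(r, 2) for r in residuals])
print(round(sum(residuals), 10))  # residuals of an OLS line with an intercept sum to about 0
```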

Interpreting regression coefficients
Slope = predicted change in y for a one-unit change in x.
Slope = 0: no linear relationship between x and y (r = 0).
Intercept = predicted value of y when x is 0. Often, we are not interested in the intercept.
Note: interpreting the slope and intercept requires thinking in the units of x and y.

Regression and Correlation
Distinct but related measures.
Correlation: measures the strength of the relationship, a major aspect of which is how closely the points follow a line.
Regression slope: how steep is the line?

To get slope and intercept for a regression:

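How slope and correlation are mathematically related
$\beta = r \, \dfrac{s_y}{s_x}$
$\alpha = \bar{y} - \beta \bar{x}$
where $s_x$ and $s_y$ are the sample standard deviations of x and y, and $\bar{x}$ and $\bar{y}$ are their means.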
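A quick numerical check of these two relations, sketched on made-up data (the data are an assumption for illustration). It computes r, $s_x$ and $s_y$ separately and recovers the same slope and intercept as a least-squares fit of these points (0.6 and 2.2).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations

# Pearson correlation coefficient r
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
r = sxy / sqrt(sxx * syy)

beta = r * s_y / s_x           # slope from correlation and standard deviations
alpha = y_bar - beta * x_bar   # intercept from the means

print(round(r, 4), round(beta, 4), round(alpha, 4))
```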

Fit: how much can regression explain? (How much of the variation in y can the regression explain?)
Look at the regression equation again:
$\hat{y} = \alpha + \beta x$
$y = \alpha + \beta x + \varepsilon$
Data = what we explain + what we don’t explain
Data = predicted + residual

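In regression, we can think of “fit” in this way:
Total variation = sum of squared deviations of y from its mean, $\sum_i (y_i - \bar{y})^2$
Explained variation = the part of the total variation explained by our predictions
Unexplained variation = sum of squared residuals, $\sum_i (y_i - \hat{y}_i)^2$
$R^2$ = (explained variation) / (total variation). This is the coefficient of determination (判定係數): the share of the total variation in y that the regression explains.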

$R^2 = r^2$
NOTE: this is a special feature of simple (OLS) regression. It does not hold for multiple regression or other regression methods.
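The decomposition and the identity $R^2 = r^2$ can be checked numerically. The sketch below uses made-up data and its least-squares line (α = 2.2, β = 0.6); both are assumptions for illustration.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data and its least-squares line (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
alpha, beta = 2.2, 0.6

x_bar, y_bar = mean(x), mean(y)
y_hat = [alpha + beta * xi for xi in x]

total = sum((yi - y_bar) ** 2 for yi in y)                   # total variation
explained = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # sum of squared residuals

r_squared = explained / total

# Pearson correlation r, for comparison with R^2
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
r = sxy / sqrt(sxx * total)

print(round(r_squared, 4), round(r ** 2, 4))
```

Here total = explained + unexplained, which holds for a least-squares line with an intercept.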

Some cautions about regression and $R^2$
It is dangerous to use $R^2$ to judge how “good” a regression is.
The “appropriateness” of regression as a technique is not a function of $R^2$.
When to use regression?
  Not suitable for non-linear shapes [non-linear shapes can be modified first, for example by transforming a variable].
  Regression is appropriate when r (correlation) is appropriate as a measure.

Residuals and residual plots
$\text{residual}_i = y_i - \hat{y}_i$
We can use residual plots to help us assess the fit of a regression line.
A residual plot: a scatterplot of the regression residuals against the explanatory variable (residuals on the y-axis, explanatory variable on the x-axis).

Example of a residual plot

Look at a residual plot
Are the residuals (殘差) spread evenly above and below 0?
Is the vertical spread of the residuals roughly the same across the whole range of the independent variable?
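These two questions are normally answered by eye. As a rough numerical companion, and not a formal test, the sketch below counts residuals above and below 0 and compares their spread in the lower and upper halves of x. The function name `residual_summary` and the residuals are made up for illustration. They are chosen to fan out as x grows.

```python
from statistics import pstdev

def residual_summary(x, residuals):
    """Count residuals above/below 0 and compare spread for low vs high x."""
    pairs = sorted(zip(x, residuals))        # order observations by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]       # residuals for the smaller x values
    high = [r for _, r in pairs[half:]]      # residuals for the larger x values
    return {
        "n_above": sum(r > 0 for r in residuals),
        "n_below": sum(r < 0 for r in residuals),
        "spread_low_x": pstdev(low),
        "spread_high_x": pstdev(high),
    }

# Hypothetical residuals whose vertical spread grows with x
x = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.2, -0.2, 1.0, -1.1, 1.5, -1.4]

summary = residual_summary(x, residuals)
print(summary)
```

Balanced counts with a much larger spread at high x would correspond to a fan-shaped residual plot.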

Types of residual plots

Outliers and influences
Outlier (極端值): a point that falls outside the overall pattern of the graph.
Influential observation (深具影響的觀察值): a point which, if removed, would markedly change the position of the regression line.
NOTE: Outliers are not necessarily influential.

The differences between outliers and influential outliers

Outliers and influential observations
Outliers at the extremes of x are more likely to be influential than those at the extremes of y.
It is often a good idea to eliminate any influential outliers and recompute the regression without them.
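A small sketch of this contrast on made-up data. The helper `ols_fit` and all data points are assumptions for illustration. It refits the line after adding one outlier at an extreme of x, and again after adding one outlier in y at the mean of x, and compares the slopes with the original fit.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical base data (not from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

_, slope_base = ols_fit(x, y)

# One extra point far out in x: the slope changes sign
_, slope_x_outlier = ols_fit(x + [15], y + [2])

# One extra point extreme in y but at the mean of x: the slope is unchanged,
# only the intercept shifts
_, slope_y_outlier = ols_fit(x + [3], y + [10])

print(round(slope_base, 3), round(slope_x_outlier, 3), round(slope_y_outlier, 3))
```

Comparing the fit with and without a suspect point, as done here, is one simple way to see whether that point is influential.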

Cautions about correlation and regression
Extrapolation (predicting beyond the observed range of x) is not appropriate.
Regression: pay attention to lurking or omitted variables.
  Lurking (omitted) variables: variables that influence the relationship between two variables but are not included among the variables studied.
  They are a problem in establishing causation.
Association does not imply causation.
  Association alone is weak evidence about causation.
  Experiments with random assignment are the best way to establish causation.