Logistic Regression: Applied Linear Statistical Models, by Neter et al.


1 Logistic Regression. Applied Linear Statistical Models, by Neter et al.
Categorical Data Analysis, by Agresti

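2 Logistic regression: a nonlinear model for a qualitative response variable. Two possible outcomes: success or failure, diseased or not diseased, present or absent.
Examples: CAD (cardiovascular disease) as a function of age, weight, sex, smoking history, and blood pressure. Smoker or nonsmoker as a function of family history, peer-group behavior, income, and age. Purchasing a car this year as a function of income, age of the current car, and age.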

3 Response function for a binary outcome
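The formula on this slide is not in the transcript. As a sketch of the usual setup (the notation here is assumed, not taken from the slide), a linear model for a 0/1 response has a mean response equal to the probability of success:

```latex
Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad Y_i \in \{0, 1\}

E\{Y_i\} = \beta_0 + \beta_1 X_i = \pi_i = P(Y_i = 1)
```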

4 Special problems when the response is binary. Constraints on the response function; non-normal error terms; nonconstant error variance.
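The three problems can be written out as follows. This is a sketch in assumed notation, with \pi_i = P(Y_i = 1): the mean must stay in the unit interval, the error takes only two values (so it cannot be normal), and its variance depends on \pi_i.

```latex
0 \le E\{Y_i\} = \pi_i \le 1

\varepsilon_i = Y_i - \pi_i =
\begin{cases}
1 - \pi_i & \text{if } Y_i = 1 \\
-\pi_i    & \text{if } Y_i = 0
\end{cases}

\sigma^2\{\varepsilon_i\} = \pi_i (1 - \pi_i)
```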

5 Logistic response function
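A minimal R sketch of the logistic response function, since the slide's formula is not in the transcript. The function name `logistic` and the coefficients `beta0`, `beta1` are made up for illustration; `plogis()` is the built-in equivalent in base R.

```r
# Logistic response function: maps a linear predictor to a probability in (0, 1).
logistic <- function(eta) 1 / (1 + exp(-eta))

# Hypothetical coefficients, chosen only for illustration.
beta0 <- -5
beta1 <- 0.1
age <- c(20, 50, 80)

# Probability of the outcome at each age under the assumed coefficients.
p <- logistic(beta0 + beta1 * age)
print(p)

# Base R provides the same function as plogis().
print(all.equal(p, plogis(beta0 + beta1 * age)))
```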

6 Example of a logistic response function. Horizontal axis: age; vertical axis: probability of CAD.

7 Properties of the logistic response function
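Two standard properties, written in assumed notation since the slide body is missing: the response is bounded between 0 and 1 and monotone in X, and the logit transformation makes it linear in the predictors.

```latex
\pi(X) = \frac{\exp(\beta_0 + \beta_1 X)}{1 + \exp(\beta_0 + \beta_1 X)}, \qquad 0 < \pi(X) < 1

\operatorname{logit}(\pi) = \log\frac{\pi}{1 - \pi} = \beta_0 + \beta_1 X
```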

8 Likelihood function
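For independent Bernoulli responses with a logistic mean, the likelihood and log-likelihood take the following form. This is a sketch in assumed notation; the slide's own formula is not in the transcript.

```latex
L(\beta_0, \beta_1) = \prod_{i=1}^{n} \pi_i^{Y_i} (1 - \pi_i)^{1 - Y_i},
\qquad \pi_i = \frac{\exp(\beta_0 + \beta_1 X_i)}{1 + \exp(\beta_0 + \beta_1 X_i)}

\log L(\beta_0, \beta_1) = \sum_{i=1}^{n} Y_i (\beta_0 + \beta_1 X_i)
  - \sum_{i=1}^{n} \log\bigl(1 + \exp(\beta_0 + \beta_1 X_i)\bigr)
```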

9

10 Likelihood for multiple logistic regression
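With several predictors the log-likelihood has the same form, with the linear predictor written as a matrix product. The sketch below evaluates it on the kyphosis data used later in the deck and compares it with `logLik()` from a `glm` fit. The helper name `loglik` is hypothetical, and the code assumes the `rpart` package is installed.

```r
data(kyphosis, package = "rpart")

# Design matrix and 0/1 response (1 = kyphosis present).
X <- model.matrix(~ Age + Number + Start, data = kyphosis)
y <- as.numeric(kyphosis$Kyphosis == "present")

# Bernoulli log-likelihood with a logit link:
# sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ], where eta = X %*% beta.
loglik <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  sum(y * eta - log(1 + exp(eta)))
}

fit <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)

ll_manual <- loglik(coef(fit), X, y)
ll_glm <- as.numeric(logLik(fit))
print(c(manual = ll_manual, glm = ll_glm))
```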

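11 Solving the likelihood equations. There is no closed-form solution; use the Newton-Raphson algorithm, implemented as iteratively reweighted least squares (IRLS).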
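A minimal sketch of IRLS for the logit link, assuming a zero starting vector and a simple convergence check on the coefficient change. The function `irls_logistic` is a hypothetical helper written for illustration; it is not how `glm` is implemented internally, although both should reach the same estimates on well-behaved data.

```r
data(kyphosis, package = "rpart")
X <- model.matrix(~ Age + Number + Start, data = kyphosis)
y <- as.numeric(kyphosis$Kyphosis == "present")

irls_logistic <- function(X, y, tol = 1e-10, max_iter = 25) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)        # linear predictor
    p <- plogis(eta)               # fitted probabilities
    w <- p * (1 - p)               # IRLS weights
    z <- eta + (y - p) / w         # working response
    # Weighted least squares step: solve (X'WX) beta = X'Wz.
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  names(beta) <- colnames(X)
  beta
}

beta_irls <- irls_logistic(X, y)
fit <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)
print(cbind(irls = beta_irls, glm = coef(fit)))
```

For the logit link, Newton-Raphson and Fisher scoring coincide, which is why a weighted least squares update of this form performs the Newton step.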

12 Interpretation of logistic regression coefficients
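Each coefficient is the change in the log-odds for a one-unit increase in its predictor with the others held fixed, so its exponential is an odds ratio. The sketch below illustrates this on the kyphosis model fitted later in the deck. The comparison rows in `nd` are hypothetical values chosen for illustration, and `rpart` is assumed to be installed.

```r
data(kyphosis, package = "rpart")
kyph.glm <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)

# Odds ratios: multiplicative change in the odds per one-unit increase.
odds_ratios <- exp(coef(kyph.glm))
print(round(odds_ratios, 3))

# Two hypothetical patients who differ only in Start (by one vertebra).
nd <- data.frame(Age = 60, Number = 4, Start = c(10, 11))
log_odds <- predict(kyph.glm, newdata = nd, type = "link")

# The difference in log-odds equals the Start coefficient.
print(unname(diff(log_odds)))
print(unname(coef(kyph.glm)["Start"]))
```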

13 kyphosis {rpart} (kyphosis data): 81 rows and 4 columns
Kyphosis: a factor with levels absent and present, indicating whether a kyphosis (a type of deformation) was present after the operation. Age: in months. Number: the number of vertebrae involved. Start: the number of the first (topmost) vertebra operated on.

14 some(kyphosis)
Sampled rows (the Age, Number, and Start values are missing from the transcript):
   Kyphosis
12 absent
18 absent
32 absent
40 present
50 absent
51 absent
52 absent
70 absent
79 absent
81 absent

15 summary(kyphosis)
Kyphosis: absent 64, present 17
Start: Min. 1.00, 1st Qu. 9.00, Median 13.00, Mean 11.49, 3rd Qu. 16.00, Max. 18.00
Age, Number: summary values missing from the transcript

16 plot(kyphosis)

17 Box plots of the predictors vs. kyphosis. Horizontal axis: kyphosis absent or present; vertical axes: Age, Number, Start.
boxplot(Age~Kyphosis,data=kyphosis)

18 Generalized linear model (GLM) fit
kyph.glm<-glm(Kyphosis~Age+Number+Start,family=binomial,data=kyphosis)
summary(kyph.glm)
Output (numeric values missing from the transcript):
Deviance Residuals: Min 1Q Median 3Q Max
Coefficients: Estimate, Std. Error, z value, Pr(>|z|) for (Intercept), Age, Number, Start; Start is flagged **
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: on 80 degrees of freedom
Residual deviance: on 77 degrees of freedom
AIC:
Number of Fisher Scoring iterations: 5

19 Residuals
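The residual definitions on this slide are missing from the transcript. The three types computed in the R code later in the deck (rr, rp, and the deviance residuals) correspond to the following formulas, where \hat{\pi}_i denotes the fitted probability (notation assumed):

```latex
\text{response:} \quad r_i = Y_i - \hat{\pi}_i

\text{Pearson:} \quad r_{P,i} = \frac{Y_i - \hat{\pi}_i}{\sqrt{\hat{\pi}_i (1 - \hat{\pi}_i)}}

\text{deviance:} \quad r_{D,i} = \operatorname{sign}(Y_i - \hat{\pi}_i)
  \sqrt{-2 \left[ Y_i \log \hat{\pi}_i + (1 - Y_i) \log(1 - \hat{\pi}_i) \right]}
```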

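20 Model deviance. The deviance of a fitted model is -2 times the log of the likelihood ratio of the fitted model to the saturated model, that is, -2 times the difference of their log-likelihoods. For binary (ungrouped) responses, the log-likelihood of the saturated model = 0.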
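As a check on the definition, the sketch below computes the deviance by hand as -2 times the difference between the fitted and saturated log-likelihoods and compares it with `deviance()`. It assumes the kyphosis model used elsewhere in the deck and that `rpart` is installed.

```r
data(kyphosis, package = "rpart")
kyph.glm <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)

y <- as.numeric(kyphosis$Kyphosis == "present")
fi <- fitted(kyph.glm)

# Log-likelihood of the fitted model for 0/1 data.
loglik_fitted <- sum(y * log(fi) + (1 - y) * log(1 - fi))

# The saturated model fits each 0/1 observation exactly, so its log-likelihood is 0.
loglik_saturated <- 0

dev_manual <- -2 * (loglik_fitted - loglik_saturated)
print(c(manual = dev_manual, glm = deviance(kyph.glm)))

# The same formula with the overall proportion gives the null deviance.
ybar <- mean(y)
null_manual <- -2 * sum(y * log(ybar) + (1 - y) * log(1 - ybar))
print(c(manual = null_manual, glm = kyph.glm$null.deviance))
```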

21 Covariance matrix
x<-model.matrix(kyph.glm)
fi<-fitted(kyph.glm)
xvx<-t(x)%*%diag(fi*(1-fi))%*%x
xvx
Output: a 4 x 4 matrix with rows and columns (Intercept), Age, Number, Start (values missing from the transcript)

22 xvxi<-solve(xvx)
xvxi
Output: a 4 x 4 matrix with rows and columns (Intercept), Age, Number, Start (values missing from the transcript)

23 sqrt(diag(xvxi))
(Intercept) Age Number Start
1.449621939 (remaining values missing from the transcript)
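The hand-computed standard errors can be checked against `vcov()`, which returns the estimated covariance matrix of the coefficients. This sketch repeats the computation from the preceding slides in one self-contained block. Small differences are expected because `glm` uses the weights from its final iteration rather than weights recomputed at the final fitted values.

```r
data(kyphosis, package = "rpart")
kyph.glm <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)

x <- model.matrix(kyph.glm)
fi <- fitted(kyph.glm)

# Inverse of X'WX with W = diag(fi * (1 - fi)).
xvxi <- solve(t(x) %*% diag(fi * (1 - fi)) %*% x)
se_manual <- sqrt(diag(xvxi))

# Standard errors as reported by glm.
se_glm <- sqrt(diag(vcov(kyph.glm)))

print(cbind(manual = se_manual, glm = se_glm))
```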

24 Change in deviance from adding terms to the model
anova(kyph.glm)
Analysis of Deviance Table
Model: binomial, link: logit
Response: Kyphosis
Terms added sequentially (first to last)
Columns: Df, Deviance, Resid. Df, Resid. Dev
Rows: NULL, Age, Number, Start (values missing from the transcript)

25 Kyphosis model with an additional Age^2 term
kyph.glm2<-glm(Kyphosis~poly(Age,2)+Number+Start,family=binomial,data=kyphosis)
summary(kyph.glm2)

26 Analysis of deviance
anova(kyph.glm2)
Analysis of Deviance Table
Model: binomial, link: logit
Response: Kyphosis
Terms added sequentially (first to last)
Columns: Df, Deviance, Resid. Df, Resid. Dev
Rows: NULL, poly(Age, 2), Number, Start (values missing from the transcript)
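One way to judge whether the quadratic age term is worth keeping is to compare the two nested models directly. The sketch below is an addition, not part of the slides: the drop in deviance is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters (one here).

```r
data(kyphosis, package = "rpart")
kyph.glm <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)
kyph.glm2 <- glm(Kyphosis ~ poly(Age, 2) + Number + Start, family = binomial,
                 data = kyphosis)

# Likelihood-ratio (deviance) comparison of the nested models.
cmp <- anova(kyph.glm, kyph.glm2, test = "Chisq")
print(cmp)

# The same test by hand: deviance drop and its chi-square tail probability.
dev_drop <- deviance(kyph.glm) - deviance(kyph.glm2)
p_value <- pchisq(dev_drop, df = 1, lower.tail = FALSE)
print(c(deviance_drop = dev_drop, p_value = p_value))
```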

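27 Kyphosis data, 16 subjects, with fitted values and residuals
kyphosis$fi<-fi
y<-as.numeric(kyphosis$Kyphosis)-1   # 0 = absent, 1 = present
kyphosis$rr<-y-fi   # response residuals
kyphosis$rp<-(y-fi)/sqrt(fi*(1-fi))   # Pearson residuals
kyphosis$rd<-sign(y-fi)*sqrt(-2*log(abs(1-y-fi)))   # deviance residuals, with sign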

28 Plot of response residuals vs. fitted values. Horizontal axis: fitted values; vertical axis: response residuals.
plot(rr~fi,kyphosis)

29 Plot of deviance residuals vs. index. Horizontal axis: index; vertical axis: deviance residuals.
plot(resid(kyph.glm))
yy<-sign(y-fi)*(-2*(y*log(fi)+(1-y)*log(1-fi)))^(1/2)
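The hand-computed residuals can be verified against `resid()`, which returns deviance residuals by default and Pearson residuals with `type = "pearson"`. A self-contained sketch, assuming `rpart` is installed:

```r
data(kyphosis, package = "rpart")
kyph.glm <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)

y <- as.numeric(kyphosis$Kyphosis) - 1   # 0 = absent, 1 = present
fi <- fitted(kyph.glm)

# Deviance residuals computed by hand.
yy <- sign(y - fi) * sqrt(-2 * (y * log(fi) + (1 - y) * log(1 - fi)))

# Pearson residuals computed by hand.
rp <- (y - fi) / sqrt(fi * (1 - fi))

# Compare with the values returned by resid().
print(all.equal(yy, resid(kyph.glm, type = "deviance"), check.attributes = FALSE))
print(all.equal(rp, resid(kyph.glm, type = "pearson"), check.attributes = FALSE))
```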

30 Plot of deviance residuals vs. fitted values

