Canonical Correlation Analysis

Presentation transcript:

Chapter 10: Canonical Correlation Analysis. School of Information Technology, Jiangxi University of Finance & Economics. Zhu Yongjun

Canonical correlation analysis. Main purpose: to identify and quantify the associations between two sets of variables. Examples of its use: relating arithmetic speed and arithmetic power to reading speed and reading power; relating government policy variables to economic goal variables; relating college "performance" variables to precollege "achievement" variables.

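Canonical correlation analysis focuses on the correlation between linear combinations of the variables in the two sets. The first step is to determine the pair of linear combinations with the largest correlation. The next step is to determine the pair with the largest correlation among all pairs uncorrelated with the first pair, and so on. How can we realize this idea? The approach is similar to PCA.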

Main elements of CCA. Canonical variables: the pairs of linear combinations used in canonical correlation analysis. Canonical correlations: the correlations between the canonical variables, which measure the strength of association between the two sets of variables. Maximization aspect: an attempt to concentrate a high-dimensional relationship between two sets of variables into a few pairs of canonical variables.

Example 10.5: Job satisfaction and job characteristics. The answer may have implications for job design.

Example 10.5: Job satisfaction, n = 784

Assumptions of CCA. To measure the association between two groups of variables, we make some assumptions. Stack the two groups into a single random vector and partition its covariance matrix accordingly.

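Points to note about CCA. The covariances between pairs of variables from different sets are contained in S12 or, equivalently, in S21. When p and q are relatively large, it is difficult to interpret the association between the sets from the elements of S12 alone. Canonical correlation analysis summarizes the association between the two sets with a few covariances rather than with all of S12.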

Main task of CCA. It is often linear combinations of variables that are interesting and useful for predictive or comparative purposes. The main task of CCA is to summarize the associations between the X(1) and X(2) sets in terms of a few carefully chosen covariances (or correlations) rather than the pq covariances in S12.

Linear transformations of the original variables

Definition of canonical variables. First pair of canonical variables: the pair of linear combinations U1, V1 with unit variances that maximizes the correlation. kth pair of canonical variables: the pair of linear combinations Uk, Vk, both with unit variances, that maximizes the correlation among all choices uncorrelated with the previous k-1 canonical variable pairs.

Definition of canonical correlations. The correlation between the kth pair of canonical variates is called the kth canonical correlation. For example, a canonical correlation equal to 1 indicates a perfect linear relationship between the two canonical variates. The following result gives the necessary details for obtaining the canonical variables and their correlations.

Result 10.1. Suppose X(1) and X(2) are as above, with p ≤ q, and let U = a'X(1), V = b'X(2).

Result 10.1 ?

Result 10.1
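The formulas on the Result 10.1 slides did not survive in this transcript. As a minimal sketch, assuming the standard eigenvalue form of the result (the squared canonical correlations are the eigenvalues of S11^(-1/2) S12 S22^(-1) S21 S11^(-1/2), with a_k = S11^(-1/2) e_k), the computation could look as follows. The function names and the 2 x 2 matrices are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(S)
    return v @ np.diag(w ** -0.5) @ v.T

def population_cca(S11, S12, S22):
    """Canonical correlations and coefficient matrices from a partitioned covariance.

    Rows of A are a_k' = e_k' S11^{-1/2}; rows of B are b_k' = f_k' S22^{-1/2}.
    Assumes p <= q and that every canonical correlation is nonzero.
    """
    S11_is, S22_is = inv_sqrt(S11), inv_sqrt(S22)
    # Eigenvalues of M are the squared canonical correlations.
    M = S11_is @ S12 @ np.linalg.inv(S22) @ S12.T @ S11_is
    eigvals, E = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, E = eigvals[order], E[:, order]
    rho = np.sqrt(np.clip(eigvals, 0.0, None))
    A = E.T @ S11_is
    # f_k is proportional to S22^{-1/2} S21 S11^{-1/2} e_k; rescale to unit length.
    F = S22_is @ S12.T @ S11_is @ E
    F = F / np.linalg.norm(F, axis=0)
    B = F.T @ S22_is
    return rho, A, B

# Illustrative (hypothetical) partitioned correlation matrix with p = q = 2.
S11 = np.array([[1.0, 0.4], [0.4, 1.0]])
S12 = np.array([[0.5, 0.6], [0.3, 0.4]])
S22 = np.array([[1.0, 0.2], [0.2, 1.0]])

rho, A, B = population_cca(S11, S12, S22)
print("canonical correlations:", rho)
print("Cov(U, V) = A S12 B':")
print(A @ S12 @ B.T)
```

In this sketch the canonical variables have unit variance (A S11 A' = I and B S22 B' = I), and A S12 B' is diagonal with the canonical correlations on the diagonal, which is what the definition of canonical variables requires.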

Proof of Result 10.1. By the spectral decomposition of a matrix (see p. 66, (2-22)), the correlation can be expressed as a ratio of a numerator and a denominator; see p. 78, (2-48), treating c’*sigma*… as b. Professor Zhang Yaoting gives another method of proof, and Anderson (1984) uses Lagrange multipliers.

Proof of Result 10.1. For the first factor in the brackets on the right-hand side of the inequality, see p. 80, (2-51), as in PCA. Denote it as f1.

Proof of Result 10.1. AB and BA have the same nonzero eigenvalues.
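The eigenvalue fact used in the proof can be checked numerically. The sketch below uses the special case that matters for CCA, A = K and B = K', so that both products are symmetric. The matrix K is random and purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
K = rng.normal(size=(2, 4))        # hypothetical 2 x 4 matrix

# K K' is 2 x 2 and K' K is 4 x 4; both are symmetric, so eigvalsh applies.
eig_small = np.sort(np.linalg.eigvalsh(K @ K.T))[::-1]
eig_big = np.sort(np.linalg.eigvalsh(K.T @ K))[::-1]

print("eigenvalues of K K':", eig_small)
print("eigenvalues of K' K:", eig_big)
# The two largest eigenvalues of K' K match those of K K';
# the remaining two are zero up to rounding error.
```

This is why the canonical correlations obtained from the X(1) side and from the X(2) side coincide, even when p and q differ.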

Proof of Result 10.1 orthogonal to

Proof of Result 10.1

Canonical variables. In application software such as SPSS, standardized variables are used.

Comment Decomposition

Comment. Note: if there are repeated roots, the coefficient vectors a and b are not unique.

Example 10.1

Example 10.1 Choose b by this formula

Example 10.1. Change of scale: the canonical correlations are unchanged by standardization.

Other solution methods. Why are the correlations the same? Because AB and BA have the same nonzero eigenvalues. Multiplying both sides by the inverse square root of Σ11 gives the third result. To obtain the correlations, see Exercise 10.4.

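10.3 Interpreting the population canonical variables. Canonical variables are, in general, artificial; that is, they have no obvious physical meaning. If the original variables X(1) and X(2) are used, the canonical coefficients a and b have units proportional to those of the X(1) and X(2) sets.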

Identifying the canonical variables

Identifying Canonical Variables by Correlation

Example 10.2. Here Az and Bz are the coefficient matrices.

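Relationship between canonical correlations and other correlation coefficients. The first canonical correlation is at least as large as the absolute value of any entry in ρ12. The canonical correlations are also multiple correlation coefficients: the kth canonical correlation is the multiple correlation coefficient of Uk with X(2).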
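The bound on the entries of ρ12 can be illustrated numerically. The sketch below assumes a hypothetical partitioned correlation matrix (not taken from the slides) and obtains the canonical correlations as the singular values of the whitened cross-correlation block, which is an equivalent route to the eigenvalue formulation.

```python
import numpy as np

# Hypothetical partitioned correlation matrix (illustrative values only).
R11 = np.array([[1.0, 0.4], [0.4, 1.0]])
R12 = np.array([[0.5, 0.6], [0.3, 0.4]])
R22 = np.array([[1.0, 0.2], [0.2, 1.0]])

# Whiten each set with its Cholesky factor: K = L1^{-1} R12 L2^{-T}.
L1 = np.linalg.cholesky(R11)
L2 = np.linalg.cholesky(R22)
K = np.linalg.solve(L1, R12) @ np.linalg.inv(L2).T

# The singular values of K are the canonical correlations, largest first.
rho = np.linalg.svd(K, compute_uv=False)
largest_entry = np.abs(R12).max()

print("first canonical correlation:", rho[0])
print("largest |entry| of R12:     ", largest_entry)
```

The bound holds because each single variable is itself a linear combination, so the maximum over all pairs of linear combinations cannot fall below any individual correlation.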

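The first r canonical correlations summarize the degree of association. The change of coordinates from X(1) to U = AX(1) and from X(2) to V = BX(2) is chosen to maximize corr(U1, V1) and, successively, corr(Ui, Vi), where (Ui, Vi) have zero correlation with the previous pairs. The correlation between the sets X(1) and X(2) is thereby isolated in the canonical correlations.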

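Sample canonical variables and sample canonical correlations

Result 10.2

Matrix representation

Sample canonical correlation analysis for standardized data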
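The sample analysis for standardized data replaces the population matrices with the sample correlation matrix R. A minimal sketch, assuming simulated data (the data-generating model, sample size, and function names are all hypothetical):

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(S)
    return v @ np.diag(w ** -0.5) @ v.T

def sample_cca(X1, X2):
    """Sample CCA on standardized data; assumes X1 has no more columns than X2."""
    p = X1.shape[1]
    Z1 = (X1 - X1.mean(axis=0)) / X1.std(axis=0, ddof=1)
    Z2 = (X2 - X2.mean(axis=0)) / X2.std(axis=0, ddof=1)
    R = np.corrcoef(np.hstack([Z1, Z2]), rowvar=False)
    R11, R12, R22 = R[:p, :p], R[:p, p:], R[p:, p:]
    # SVD of the whitened cross-correlation block.
    K = inv_sqrt(R11) @ R12 @ inv_sqrt(R22)
    E, rho, Ft = np.linalg.svd(K, full_matrices=False)
    A = E.T @ inv_sqrt(R11)      # rows are the coefficient vectors for set 1
    B = Ft @ inv_sqrt(R22)       # rows are the coefficient vectors for set 2
    U = Z1 @ A.T                 # canonical variate scores, set 1
    V = Z2 @ B.T                 # canonical variate scores, set 2
    return rho, A, B, U, V

# Simulated data: two sets that share one latent factor (hypothetical model).
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 1))
X1 = latent @ np.array([[1.0, 0.5]]) + rng.normal(size=(n, 2))
X2 = latent @ np.array([[0.8, 0.3, -0.4]]) + rng.normal(size=(n, 3))

rho, A, B, U, V = sample_cca(X1, X2)
print("sample canonical correlations:", rho)
print("corr(U1, V1):", np.corrcoef(U[:, 0], V[:, 0])[0, 1])
```

The sample correlation between the first pair of score columns reproduces the first sample canonical correlation, and the score columns within each set are uncorrelated with unit sample variance.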

Example 10.5 Job Satisfaction

Example 10.5 Job Satisfaction

Example 10.5: Sample Correlation Matrix Based on 784 Responses

Example 10.5: Canonical Variate Coefficients

Example 10.5: Sample Correlations between Original and Canonical Variables

10.5 Matrices of errors of approximation

Matrices of Errors of Approximations

The matrices of errors of approximation are

Example 10.6

Example 10.6

Example 10.6

Sample correlation matrices between the canonical variables and the original variables

Proportion of sample variance explained

Proportion of Sample Variances Explained by the Canonical Variables
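The formula for this slide was lost in the transcript. For standardized variables, the usual textbook definition is the following, stated here as an assumption about what the slide showed. Here r denotes a sample correlation between a canonical variate and an original standardized variable, and the first r canonical variates of each set are used.

```latex
R^2_{z^{(1)} \mid \hat U_1,\dots,\hat U_r}
  = \frac{1}{p}\sum_{i=1}^{r}\sum_{k=1}^{p} r^2_{\hat U_i,\, z^{(1)}_k},
\qquad
R^2_{z^{(2)} \mid \hat V_1,\dots,\hat V_r}
  = \frac{1}{q}\sum_{i=1}^{r}\sum_{k=1}^{q} r^2_{\hat V_i,\, z^{(2)}_k}
```

Each quantity is the average, over the variables in a set, of the squared correlations with that set's first r canonical variates, so it lies between 0 and 1.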

Example 10.7

Large-sample inference: Result 10.3

Bartlett's correction

Significance tests for the canonical correlations
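The test formulas are missing from the transcript. A minimal sketch, assuming the usual likelihood ratio statistic with Bartlett's multiplier n - 1 - (p + q + 1)/2 in place of n, is given below. The function name and the input values are hypothetical. The statistic is compared with an upper chi-square percentile using the returned degrees of freedom.

```python
import math

def bartlett_cca_statistic(rhos, n, p, q, k=0):
    """Bartlett-corrected statistic for H0: canonical correlations k+1, ..., p are zero.

    rhos : sample canonical correlations, largest first (length p, with p <= q)
    n    : sample size
    k    : number of leading canonical correlations assumed to be nonzero
    Returns (statistic, degrees_of_freedom); large values of the statistic
    are evidence against H0.
    """
    multiplier = n - 1 - 0.5 * (p + q + 1)
    log_lambda = sum(math.log(1.0 - r ** 2) for r in rhos[k:])
    statistic = -multiplier * log_lambda
    df = (p - k) * (q - k)
    return statistic, df

# Hypothetical inputs: n = 100 observations, p = 2 and q = 3 variables.
stat_all, df_all = bartlett_cca_statistic([0.5, 0.1], n=100, p=2, q=3, k=0)
stat_rest, df_rest = bartlett_cca_statistic([0.5, 0.1], n=100, p=2, q=3, k=1)
print(stat_all, df_all)
print(stat_rest, df_rest)
```

Calling the function with k = 0 tests whether the two sets are associated at all; k = 1 tests whether anything remains beyond the first canonical pair.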

Example 10.8

Example 10.8 http://listserv.uga.edu/cgi-bin/wa?A2=ind0710&L=spssx-l&H=1&P=15613

SPSS program:

MANOVA VAR1 VAR2 VAR3 WITH VAR4 VAR5 VAR6
  /DISCRIM RAW STAN ESTIM CORR ALPHA(1)
  /PRINT SIGNIF(MULT UNIV EIGN DIMNER) SIGNIF(EFSIZE) CELLINFO(CORR)
  /NOPRINT PARAM(ESTIM)
  /POWER T(.05) F(.05)
  /METHOD=UNIQUE
  /ERROR WITHIN+RESIDUAL
  /DESIGN.

Open a new syntax window and copy the program into it. Then change VAR1, VAR2, … to your own variable names.

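The table above lists the standardized and unstandardized coefficients relating each canonical variable to the variables in variable set 1. From it we can write the (standardized) transformation formula for the canonical variable: L1 = 0.05759*var00001 + 0.22244*var02 - 0.12806*var03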
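As a small worked example, the transformation formula can be applied to one observation. The coefficients are those quoted above. The function name and the input values are hypothetical, and the inputs are assumed to be already standardized.

```python
# Standardized canonical coefficients for L1, as quoted in the formula above.
COEFS_L1 = {"var00001": 0.05759, "var02": 0.22244, "var03": -0.12806}

def canonical_score_l1(z):
    """Score on canonical variable L1 for one observation.

    z maps each variable name to its standardized value (assumed input).
    """
    return sum(coef * z[name] for name, coef in COEFS_L1.items())

# Hypothetical observation: each variable one standard deviation above its mean.
observation = {"var00001": 1.0, "var02": 1.0, "var03": 1.0}
score = canonical_score_l1(observation)
print(round(score, 5))
```

With all three standardized values equal to 1, the score is simply the sum of the three coefficients.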

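The table above gives the correlations between each variable in the first set and the canonical variables of its own set and of the opposite set. The variables are related mainly to the first pair of canonical variables. Reference: http://bbs.miforum.net/thread-2786-1-4.html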

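Relationship between the canonical variables and the original first set of variables

Relationship for the second set of variables. The variables in the second set are called covariates here.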

SAS program:

options ls=78;
title "Canonical Correlation Analysis - Sales Data";
data sales;
  infile "D:\Statistics\STAT 505\data\sales.txt";
  input growth profit new create mech abs math;
run;
proc cancorr out=canout vprefix=sales vname="Sales Variables"
             wprefix=scores wname="Test Scores";
  var growth profit new;
  with create mech abs math;
run;
proc gplot;
  axis1 length=3 in;
  axis2 length=4.5 in;
  plot sales1*scores1 / vaxis=axis1 haxis=axis2;
  symbol v=J f=special h=2 i=r color=black;
run;