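3. Probability (Chapter 4)
Prof. Jen-pei Liu (劉仁沛)
Division of Biometry, Graduate Institute of Agronomy, National Taiwan University
Institute of Epidemiology and Preventive Medicine, National Taiwan University
Division of Biostatistics and Bioinformatics, National Health Research Institutes
jpliu@ntu.edu.tw
[Unless otherwise noted, this work and the content of its website are released under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Taiwan license.]
2018/9/19 Jen-pei Liu, PhD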

Concept of Probability
Sample Space and Events
Elementary Probability Rules
Conditional Probability and Independence
Applications
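
Experiment
An experiment is a process of collecting observations whose outcome is uncertain; each run of the experiment yields exactly one outcome.
Example: toss a coin once.
There are two possible outcomes: heads (H) or tails (T).
Only one outcome occurs on each toss, and before the toss we do not know which one will be observed.
We can, however, compute the probability of each outcome.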
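
Probability
1. A probability lies between 0 and 1.
2. The probabilities of all outcomes sum to 1.
Example: toss a fair coin once.
P(heads) = 0.5 and P(tails) = 0.5; both lie between 0 and 1.
Heads and tails are the only two outcomes, so 0.5 + 0.5 = 1.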

Sample Space
The sample space is the set of all possible outcomes of an experiment.
Examples:
Toss a coin once: {H, T}
Roll a die once: {1, 2, 3, 4, 5, 6}
Sexes of a couple's two children: {boy-boy, girl-girl, boy-girl, girl-boy} = {BB, GG, BG, GB}

Event
An event is a subset of the sample space.
Examples:
At least one of a couple's two children is a girl: {GG, BG, GB}
A single roll of a die gives a result greater than 3: {4, 5, 6}
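
Probability of an Event
The probability of an event is the sum of the probabilities of all outcomes in the event.
Let E denote an event and P(E) its probability.
If every single outcome of the experiment is equally likely, then
P(E) = (number of outcomes in E) / (total number of possible outcomes)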

Probability of an Event
Example: sexes of a couple's two children
Sample space = {BB, GG, BG, GB}; total number of possible outcomes = 4
At least one child is a girl: E = {GG, BG, GB}; number of outcomes in the event = 3
P(E) = 3/4 = 0.75
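
Elementary Probability Rules
1. Complement rule: the complement E^c of an event E has probability
P(E^c) = 1 − P(E), that is, P(E) + P(E^c) = 1
Example: sexes of a couple's two children
E: at least one child is a girl = {GG, BG, GB}
E^c: both children are boys = {BB}
P(E^c) = 1/4 = 1 − P(E) = 1 − 3/4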
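
The counting definition and the complement rule can be checked by enumerating the sample space. The following is a minimal sketch, assuming the four outcomes are equally likely as in the two-children example; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space for the sexes of two children: BB, BG, GB, GG.
sample_space = ["".join(pair) for pair in product("BG", repeat=2)]

# Event E: at least one child is a girl.
event = [outcome for outcome in sample_space if "G" in outcome]

# With equally likely outcomes, P(E) = |E| / |sample space|.
p_event = Fraction(len(event), len(sample_space))

# Complement rule: P(E^c) = 1 - P(E), the probability that both are boys.
p_complement = 1 - p_event

print(sample_space, event, p_event, p_complement)
```

Using `Fraction` keeps the results as exact ratios (3/4 and 1/4) instead of rounded decimals.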
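
Elementary Probability Rules
2. Addition rule
The intersection of two events A and B, written A∩B, consists of the outcomes that belong to both A and B.
The union of two events A and B, written A∪B, consists of the outcomes that belong to A or to B (or to both).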
P(A∪B) = P(A) + P(B) − P(A∩B)

Example:
A: aged 20 or younger; B: female
P(A) = P(≤20 years) = 35/50 = 0.70
P(B) = P(female) = 30/50 = 0.60
P(A∩B) = P(≤20 years and female) = 21/50 = 0.42
P(A∪B) = P(≤20 years or female) = 0.70 + 0.60 − 0.42 = 0.88

Sex       ≤20 years   >20 years   Total
Male         14           6         20
Female       21           9         30
Total        35          15         50
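
The addition rule can be verified directly from the counts in the age-by-sex table. This is a small sketch, assuming a student is drawn at random from the 50 in the table; the `prob` helper is a name introduced here for illustration.

```python
from fractions import Fraction

# Counts from the age-by-sex table (50 students in total).
counts = {
    ("male", "<=20"): 14, ("male", ">20"): 6,
    ("female", "<=20"): 21, ("female", ">20"): 9,
}
total = sum(counts.values())

def prob(predicate):
    """Probability that a randomly chosen student satisfies predicate(sex, age)."""
    favourable = sum(n for (sex, age), n in counts.items() if predicate(sex, age))
    return Fraction(favourable, total)

p_a = prob(lambda sex, age: age == "<=20")                           # P(A)
p_b = prob(lambda sex, age: sex == "female")                         # P(B)
p_a_and_b = prob(lambda sex, age: age == "<=20" and sex == "female")  # P(A∩B)
p_a_or_b = prob(lambda sex, age: age == "<=20" or sex == "female")    # P(A∪B)

# Addition rule: counting the union directly agrees with the formula.
assert p_a_or_b == p_a + p_b - p_a_and_b
print(float(p_a), float(p_b), float(p_a_and_b), float(p_a_or_b))
```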

Mutually Exclusive Events
If events A and B have no outcomes in common, they are mutually exclusive:
A∩B = ∅
P(A∩B) = 0
P(A∪B) = P(A) + P(B)
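
Mutually Exclusive Events
Example: draw one playing card at random
A: the card is a jack (J)
B: the card is a queen (Q)
C: the card is red (a heart or a diamond)
A∩B = ∅ → P(A∩B) = 0
P(A∪B) = P(A) + P(B) = 4/52 + 4/52 = 8/52
P(A∪C) = P(A) + P(C) − P(A∩C) = 4/52 + 26/52 − 2/52 = 28/52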
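
The card example can be reproduced by building the deck and counting. This sketch assumes a standard 52-card deck (13 ranks in each of 4 suits), which the slide does not state explicitly; set names such as `jacks` and `reds` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed standard deck: 13 ranks x 4 suits = 52 cards.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))

jacks = {card for card in deck if card[0] == "J"}                    # event A
queens = {card for card in deck if card[0] == "Q"}                   # event B
reds = {card for card in deck if card[1] in ("hearts", "diamonds")}  # event C

def prob(event):
    """Probability of an event when every card is equally likely."""
    return Fraction(len(event), len(deck))

# A and B are mutually exclusive, so the probabilities simply add.
assert not (jacks & queens)
p_j_or_q = prob(jacks | queens)
assert p_j_or_q == prob(jacks) + prob(queens)

# A and C overlap (the two red jacks), so the intersection is subtracted.
p_j_or_red = prob(jacks | reds)
assert p_j_or_red == prob(jacks) + prob(reds) - prob(jacks & reds)

print(p_j_or_q, p_j_or_red)
```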

Conditional Probability
Probability that a student is female, given that the student is aged 20 or younger:
P(B|A) = 21/35 = 0.6

Sex       ≤20 years   >20 years   Total
Male         14           6         20
Female       21           9         30
Total        35          15         50

Conditional Probability
P(B|A) = P(A∩B) / P(A), for P(A) > 0

Conditional Probability
Multiplication rule:
P(A∩B) = P(B|A)·P(A) = P(A|B)·P(B)
Independent events: two events that do not affect each other, so that P(B|A) = P(B). In that case
P(A∩B) = P(B|A)·P(A) = P(B)·P(A) = P(A)·P(B)
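
As a worked check of the multiplication rule, the sketch below reuses the counts from the age-by-sex table (A: aged 20 or younger, B: female). It also tests the independence condition P(B|A) = P(B), which happens to hold for these particular counts; variable names are illustrative.

```python
from fractions import Fraction

# Counts from the age-by-sex table.
n_total = 50      # all students
n_a = 35          # aged 20 or younger (event A)
n_b = 30          # female (event B)
n_a_and_b = 21    # aged 20 or younger and female (A∩B)

p_a = Fraction(n_a, n_total)
p_b = Fraction(n_b, n_total)
p_b_given_a = Fraction(n_a_and_b, n_a)   # P(B|A) = 21/35
p_a_given_b = Fraction(n_a_and_b, n_b)   # P(A|B) = 21/30

# Multiplication rule: both factorizations give P(A∩B) = 21/50.
p_a_and_b = p_b_given_a * p_a
assert p_a_and_b == p_a_given_b * p_b == Fraction(n_a_and_b, n_total)

# Independence check: P(B|A) equals P(B) for these counts.
independent = p_b_given_a == p_b
print(p_b_given_a, p_b, independent)
```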

Conditional Probability
Example: toss a coin twice (number of outcomes in each cell)

First toss   Second toss: H   Second toss: T   Total
H                  1                1             2
T                  1                1             2
Total              2                2             4
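A: the first toss is heads; B: the second toss is heads
P(A) = P(first toss is heads) = 2/4 = 1/2
P(B) = P(second toss is heads) = 2/4 = 1/2
P(B|A) = P(second toss is heads | first toss is heads) = 1/2 = P(B)
Whether the second toss is heads or tails has nothing to do with the first toss.
P(A∩B) = 1/4 = (1/2)(1/2) = P(A)·P(B)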
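
Personal Probability: Two Philosophies for Defining Probability
Measure of belief: quantifies each person's subjective view of how likely a particular event is to occur
(e.g., P(this is an easy course), P(I can catch that bus)).
Critics: it is subjective! But must probability be non-subjective?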
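
Personal Probability
Do different people's subjective beliefs give different probabilities?
Yes, which is why we need to collect data and find evidence!
People in Eastern cultures are often good at hiding their personal beliefs, so either everyone shares a single opinion, or no one voices an opinion and everything is left to those at the top to decide.
Free and open? One voice only? Afraid of being challenged? Unable to accept one's own mistakes?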

Personal Probability: Bayes' Theorem
Based on Bayesian statistics
Subjective belief + data → objective inference
First proposed in the 1700s

Personal Probability: Bayes' Theorem
P(B) = P(A^c∩B) + P(A∩B)
By conditional probability,
P(A∩B) = P(B|A)P(A) and P(A^c∩B) = P(B|A^c)P(A^c)
so
P(B) = P(A∩B) + P(A^c∩B) = P(B|A)P(A) + P(B|A^c)P(A^c)

Personal Probability: Bayes' Theorem
Conditional probability; prior probability: Pr(A); posterior probability: Pr(A|B)
The prior probability is a personal probability.
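
For reference, the posterior probability can be written out by combining the definition of conditional probability with the total-probability decomposition of P(B). The display below is a standard statement of Bayes' theorem for an event A and its complement, written here in LaTeX as a sketch rather than copied from the slide.

```latex
\[
P(A \mid B)
  = \frac{P(A \cap B)}{P(B)}
  = \frac{P(B \mid A)\,P(A)}
         {P(B \mid A)\,P(A) + P(B \mid A^{c})\,P(A^{c})}
\]
```

Here P(A) is the prior probability and P(A|B) is the posterior probability after observing B.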

Applications
Diagnosis of Diseases
Estimation of Survival Function
Classification
Pattern Recognition

Diagnosis of Diseases: Contingency Table

                      True condition status
Test result       Present (S2)   Absent (S1)   Total
Positive (R2)          a              b         a+b
Negative (R1)          c              d         c+d
Total                 a+c            b+d

Indices of Diagnostic Accuracy
Sensitivity (true positive rate): the capacity for making a correct diagnosis in subjects with the disease.
Estimated sensitivity: P(R2|S2) = 100% × a/(a+c)
Specificity (true negative rate): the capacity for making a correct diagnosis in subjects without the disease.
Estimated specificity: P(R1|S1) = 100% × d/(b+d)

Indices of Diagnostic Accuracy
Positive predictive value (positive predictive accuracy): the proportion of subjects with the disease among those with a positive result.
P(S2|R2) = 100% × a/(a+b)
Negative predictive value (negative predictive accuracy): the proportion of subjects without the disease among those with a negative result.
P(S1|R1) = 100% × d/(c+d)
False positive rate: the proportion of subjects without the disease among those with a positive result.
P(S1|R2) = 1 − positive predictive value = 100% × b/(a+b)
False negative rate: the proportion of subjects with the disease among those with a negative result.
P(S2|R1) = 1 − negative predictive value = 100% × c/(c+d)

Personal Probability: Application of Bayes' Theorem

Example 2 (Feinstein, 2002)

New marker test result   Diseased cases   Non-diseased controls   Total
Positive                       46                   2               48
Negative                        4                  48               52
Total                          50                  50              100

Indices of Diagnostic Accuracy: Data from Example 2 (Feinstein, 2002)
Sensitivity = 100% × 46/50 = 92.0%
Specificity = 100% × 48/50 = 96.0%
Prevalence = 100% × 50/100 = 50.0%
Positive predictive value = 100% × 46/48 = 95.8% = (0.92 × 0.5) / [0.92 × 0.5 + (1 − 0.96) × (1 − 0.5)]
Negative predictive value = 100% × 48/52 = 92.3%
False positive rate = 100% × 2/48 = 4.2%
False negative rate = 100% × 4/52 = 7.7%
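
The indices above follow mechanically from the four cells of the contingency table. The sketch below computes them for the Example 2 counts, using the a, b, c, d layout of the contingency-table slide; the function name `diagnostic_indices` is introduced here for illustration, and the false positive and false negative rates follow the definitions used in these slides (1 − PPV and 1 − NPV).

```python
def diagnostic_indices(a, b, c, d):
    """Diagnostic accuracy indices from a 2x2 table.

    a: test positive, disease present    b: test positive, disease absent
    c: test negative, disease present    d: test negative, disease absent
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "prevalence": (a + c) / (a + b + c + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        # Defined as in the slides: conditional on the test result.
        "false_positive_rate": b / (a + b),
        "false_negative_rate": c / (c + d),
    }

# Counts from Example 2.
example2 = diagnostic_indices(a=46, b=2, c=4, d=48)
for name, value in example2.items():
    print(f"{name}: {100 * value:.1f}%")
```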

Example 3 (Feinstein, 2002)

New marker test result   Diseased cases   Non-diseased controls   Total
Positive                       46                  38               84
Negative                        4                 912              916
Total                          50                 950             1000

Indices of Diagnostic Accuracy: Data from Example 3 (Feinstein, 2002)
Sensitivity = 100% × 46/50 = 92.0%
Specificity = 100% × 912/950 = 96.0%
Prevalence = 100% × 50/1000 = 5.0%
Positive predictive value = 100% × 46/84 = 54.8% = (0.92 × 0.05) / [0.92 × 0.05 + (1 − 0.96) × (1 − 0.05)]
Negative predictive value = 100% × 912/916 = 99.6%
False positive rate = 100% × 38/84 = 45.2%
False negative rate = 100% × 4/916 = 0.4%
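
Examples 2 and 3 share the same sensitivity and specificity and differ only in prevalence, so the change in positive predictive value is exactly what Bayes' theorem predicts. A minimal sketch, with prevalence playing the role of the prior probability; `ppv_from_bayes` is a hypothetical helper name.

```python
def ppv_from_bayes(sensitivity, specificity, prevalence):
    """Positive predictive value P(disease | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prevalence                # P(positive and diseased)
    false_positive = (1 - specificity) * (1 - prevalence)   # P(positive and not diseased)
    return true_positive / (true_positive + false_positive)

# Same test (sensitivity 92%, specificity 96%) at two prevalences.
ppv_example2 = ppv_from_bayes(0.92, 0.96, prevalence=0.50)
ppv_example3 = ppv_from_bayes(0.92, 0.96, prevalence=0.05)

print(f"prevalence 50%: PPV = {100 * ppv_example2:.1f}%")
print(f"prevalence  5%: PPV = {100 * ppv_example3:.1f}%")
```

The results agree with the counts in the two tables (46/48 and 46/84): the lower the prevalence, the larger the share of positive results that come from non-diseased subjects.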

Error rates associated with screening test (Fleiss, 1981)

Prevalence    False positive rate   False negative rate
1/million           .9999                 0
1/100,000           .9991                 0
1/10,000            .9906                 .00001
1/1000              .913                  .00005
1/500               .840                  .00010
1/200               .677                  .00025
1/100               .510                  .00051

Indices of Diagnostic Accuracy: Types of Diagnostic Tests (Feinstein, 1977)
Screening or discovery tests (e.g., mammogram, fasting blood sugar): require high sensitivity, which leads to a high false positive rate.
Exclusion tests: rule out the presence of the disease (e.g., colonoscopic examination); require extremely high sensitivity.
Confirmation tests: verify a suspicion that the disease is present (e.g., biopsy for lung cancer); require extremely high specificity, with very few false positives.

Odds and Odds Ratio
Odds: p/(1−p)
Among people who get lung cancer, some smoke and some do not.
P(smoker gets lung cancer) / P(non-smoker gets lung cancer) = 5

Odds and Odds Ratio
The odds of passing the exam for people who study, relative to people who never study, are 999.
P(studies and passes) / P(does not study and passes) = 999

Odds and Odds Ratio: Blackfoot Disease and Drinking Arsenic-Contaminated Well Water
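
Odds and Odds Ratio: Blackfoot Disease and Drinking Arsenic-Contaminated Well Water
Put simply, the odds ratio is a ratio with the A-D diagonal in the numerator and the B-C diagonal in the denominator, that is, AD/BC.
If this odds ratio is greater than 1, the odds of blackfoot disease among people who drink arsenic-contaminated well water are higher than the odds among people who do not. In other words, drinking arsenic-contaminated well water may carry a higher risk of blackfoot disease.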
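
The cross-product form AD/BC is the ratio of the odds p/(1−p) in the two exposure groups. The sketch below illustrates this with hypothetical counts (not the blackfoot disease data, which are not reproduced in this text) and an assumed cell layout: A = exposed and diseased, B = exposed and not diseased, C = unexposed and diseased, D = unexposed and not diseased.

```python
from fractions import Fraction

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)

def odds_ratio(a, b, c, d):
    """Cross-product ratio AD/BC for a 2x2 table (assumed layout, see lead-in)."""
    return Fraction(a * d, b * c)

# Hypothetical counts, for illustration only.
a, b, c, d = 30, 70, 10, 90

odds_exposed = odds(Fraction(a, a + b))      # odds of disease among the exposed
odds_unexposed = odds(Fraction(c, c + d))    # odds of disease among the unexposed

# The ratio of the two odds equals the cross-product ratio AD/BC.
assert odds_exposed / odds_unexposed == odds_ratio(a, b, c, d)
print(odds_exposed, odds_unexposed, odds_ratio(a, b, c, d))
```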

Odds and Odds Ratio: Breast Cancer in Women and Oral Contraceptives

Computation of Kaplan-Meier Estimate of Survival (Actuarial Estimate)
Time points t1, t2, and t3
E1: event of surviving from 0 to t1
E2: event of surviving from t1 to t2
E3: event of surviving from t2 to t3
E1∩E2∩E3: event of surviving from 0 to t3
By conditional probability,
P(E1∩E2∩E3) = P(E3|E1∩E2)P(E1∩E2) = P(E3|E1∩E2)P(E2|E1)P(E1)

Computation of Kaplan-Meier Estimate of Survival (Actuarial Estimate)
1. Divide the time axis into intervals at the time points where the pre-defined event (death) occurred.
2. For each interval, count the number of patients who were alive at the beginning of the interval and the number who were still alive at the end of the interval.
3. Compute the survival rate for each interval as the number of patients still alive at the end of the interval divided by the number alive at the beginning of the interval.
4. At each time point where the pre-defined event occurred, the Kaplan-Meier estimate is the product of the survival rates of the preceding intervals and the present one.

Computation of Kaplan-Meier Survival
Ŝ[y(k)] = P(alive at y(k))
= P(survived through y(1), y(2), ..., y(k−1), y(k))
= P(alive at y(k) | survived through y(1), y(2), ..., y(k−1)) × P(survived through y(1), y(2), ..., y(k−1))
= P(alive at y(k) | survived through y(1), y(2), ..., y(k−1)) × P(alive at y(k−1) | survived through y(1), y(2), ..., y(k−2)) × ... × P(alive at y(2) | alive at y(1)) × P(alive at y(1))

Time in Months to Progression of the Patients with Stage II or IIIA Ovarian Carcinoma by Low-grade or Well-differentiated Cancer

Patient number   Time in months   Death (non-censored)   Cell grade
 1                   0.92              Yes               Low grade
 2                   2.93
 3                   5.76
 4                   6.41
 5                  10.16
 6                  12.40              No
 7                  12.93
 8                  13.85
 9                  14.70
10                  15.20
11                  23.32
12                  24.47
13                  25.33
14                  36.38
15                  39.67
16                   1.12                                High grade
17                   2.89
18                   4.51

Patient number   Time in months   Death (non-censored)   Cell grade
19                   6.55              Yes               High grade
20                   9.21
21                   9.57
22                   9.84              No
23                   9.87
24                  10.16
25                  11.55
26                  11.78
27                  12.14
28
29                  12.17
30                  12.34
31                  12.57
32                  12.89
33                  14.11
34                  14.84
35                  36.81
Source: Fleming, et al. (1980)

Data Layout for Computation of Kaplan-Meier Estimates of Survival Function

Ordered distinct   Number of   Number censored in   Number in   S(y)
event time         events      [y(k), y(k+1))       risk set
y(0) = 0           d0 = 0      m0                   n0          1
y(1)               d1          m1                   n1          1 − d1/n1
y(2)               d2          m2                   n2          (1 − d1/n1)(1 − d2/n2)
...
y(k)               dk          mk                   nk          (1 − d1/n1)(1 − d2/n2)...(1 − dk/nk)

Computation of Kaplan-Meier Estimates of Survival Function for Patients with Low-grade Cancer

Ordered distinct    Number of   Number censored in   Number in   S(y)
progression time    events      [y(k), y(k+1))       risk set
 0                    0            0                   15        1
 0.92                 1            0                   15        0.9333
 2.93                 1            0                   14        0.8667
 5.76                 1            0                   13        0.8000
 6.41                 1            0                   12        0.7333
10.16                 1            4                   11        0.6667
15.20                 1                                 6        0.5556

Example
K-M estimate of survival at 15.20 months or longer
= (1 − 1/15)(1 − 1/14)(1 − 1/13)(1 − 1/12)(1 − 1/11)(1 − 1/6)
= 0.5556
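
The product-limit computation for the low-grade group can be sketched in a few lines. The times are those of patients 1 to 15 in the data table; the event indicator is an assumption inferred from the event times listed in the low-grade Kaplan-Meier table (0.92, 2.93, 5.76, 6.41, 10.16 and 15.20 months), with all other times treated as censored. The function name `kaplan_meier` is illustrative, not a library call.

```python
from fractions import Fraction

# (time in months, event observed?) for the 15 low-grade patients.
# Event flags are inferred from the K-M table's event times (assumption).
low_grade = [
    (0.92, True), (2.93, True), (5.76, True), (6.41, True), (10.16, True),
    (12.40, False), (12.93, False), (13.85, False), (14.70, False),
    (15.20, True), (23.32, False), (24.47, False), (25.33, False),
    (36.38, False), (39.67, False),
]

def kaplan_meier(data):
    """Return [(event time, number at risk, S(t))] for each distinct event time."""
    survival = Fraction(1)
    rows = []
    for t in sorted({time for time, event in data if event}):
        # Risk set: everyone whose observed time is at least t.
        at_risk = sum(1 for time, _ in data if time >= t)
        # Number of events occurring exactly at t.
        deaths = sum(1 for time, event in data if event and time == t)
        # Multiply by the conditional survival rate (1 - d/n) for this time.
        survival *= Fraction(at_risk - deaths, at_risk)
        rows.append((t, at_risk, survival))
    return rows

km = kaplan_meier(low_grade)
for t, at_risk, s in km:
    print(f"{t:6.2f}  at risk = {at_risk:2d}  S = {float(s):.4f}")
```

Each factor is a conditional probability of surviving past an event time given survival up to it, so the running product mirrors the chain of conditional probabilities in the derivation.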

Summary
Probability
Concept of probability
Sample space and events
0 ≤ P(E) ≤ 1
Addition rule: P(A∪B) = P(A) + P(B) − P(A∩B)
Mutually exclusive events: P(A∩B) = 0

Summary
Probability
Conditional probability: P(A|B) = P(A∩B)/P(B)
Independent events: P(A|B) = P(A); P(B|A) = P(B); P(A∩B) = P(A)·P(B)
Bayes' theorem
Applications: diagnosis, odds ratio, survival probability

Copyright Notice
Page   Work   License terms   Author/Source
1-51                          Reproduced from the Microsoft Office 2003 Media Gallery,