Chapter 10: One-Sample Tests of Hypothesis. Copyright © 2015 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Learning Objectives
LO10-1 Define a hypothesis.
LO10-2 Explain the process of testing a hypothesis.
LO10-3 Apply the six-step procedure for testing a hypothesis.
LO10-4 Distinguish between a one-tailed and a two-tailed test of hypothesis.
LO10-5 Conduct a test of a hypothesis about a population mean.
LO10-6 Compute and interpret a p-value.
LO10-7 Use a t statistic to test a hypothesis.
LO10-8 Compute the probability of a Type II error.
LO10-1 Define a hypothesis. Examples: Pay is related to performance: people who are paid more perform better. Consumers prefer Coke over all other cola drinks. Billboard advertising is more effective than advertising in paper-based media. Consumer confidence in the economy is increasing.
Hypothesis Testing. LO10-2 Explain the process of testing a hypothesis. HYPOTHESIS TESTING: A procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement.
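The meaning of hypothesis testing. Statistical hypothesis: a statement (hypothesis) about a characteristic of a population, whose truth or falsity can be judged from information obtained by sampling that population. Hypothesis testing: establishing a set of criteria for deciding whether to accept or reject a statistical hypothesis; this inferential process or method is called hypothesis testing.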
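Step 1: Setting up the hypotheses. The basic spirit of statistical hypothesis testing: unless there is sufficient evidence to reject H0, we have to accept H0; but accepting H0 does not mean that H0 is true, only that there is not enough evidence to reject it. Rejecting H0 means we have sufficient factual evidence against it; the test is then said to be significant, so statistical hypothesis testing is also called significance testing. A significant test is one whose conclusion is to reject H0; that is, significance means there is sufficient evidence to reject H0.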
Step 1: State the Null and the Alternate Hypothesis. LO10-2 Null Hypothesis: A statement about the value of a population parameter developed for the purpose of testing numerical evidence. It is represented by H0. Alternate Hypothesis: A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false. It is represented by H1.
Step 1: Basic concepts of hypothesis testing
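Step 1: Court verdicts compared with hypothesis tests (the null hypothesis and the idea of testing)
Item | Court verdict | Statistical hypothesis test
1. Strong evidence is required | Evidence of the crime | The inference (claim)
2. Null hypothesis H0 | The defendant is not guilty | The claim is false
3. Alternate hypothesis H1 | The defendant is guilty | The claim is true
4. Spirit of the test | Unless there is sufficient evidence of the crime, the defendant is found not guilty. Wrongly convicting an innocent person is the more serious mistake, so the emphasis is on not convicting the innocent rather than on not letting the guilty go free. | Unless the sample data are unfavorable to H0, H0 is retained. Wrongly rejecting H0 is more serious than wrongly accepting H0 (that is, failing to reject H0 when H1 is true).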
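Step 1: Criteria for setting up the hypotheses (1/2).
Criterion 1: When the goal is to use the sample observations to support our own claim, the opposite claim is taken as the null hypothesis H0 and our claim becomes the alternate hypothesis H1. We must then work to show that our claim (H1) is right. Because the goal of the test is to reject H0, the hypothesis we want to reject goes in H0. Example: the R&D department reports that a new production method gives higher output (600 units); to verify the new method, set H0: μ = 600, H1: μ > 600.
Criterion 2: If wrongly rejecting H0 has more serious consequences than wrongly accepting it, then the H0 we have set up is appropriate. We must work hard to show that H0 is wrong (the cost of proof is higher) before rejecting it. Example: an environmental agency checks whether motorcycle exhaust emissions meet the standard. Set H0: not compliant, because accepting a non-compliant vehicle as compliant is the more serious outcome. So H0 cannot be "compliant".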
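Step 1: Criteria for setting up the hypotheses (2/2).
Criterion 3: Take the claim made by a person (or unit, institution, company, and so on) as the null hypothesis H0; that is, assume the claim is true. The task of the test is then to show that the claim H0 is questionable. Example: a company's sales department claims annual sales of 5 million; a shareholder believes the figure should exceed 5 million, so set H1: μ > 500 (in units of ten thousand). Another example: the company states that its product defect rate is below 0.01, but a wholesaler suspects it is above 0.01, so H0: π ≤ 0.01, H1: π > 0.01.
Criterion 4: When the question asks whether something is "significantly" smaller, heavier, better, worse, more, fewer, and so on, take the opposite statement as the null hypothesis. What we want to confirm, distinguish, or have doubts about goes in H1. Example: to ask whether newborn babies are heavier now than in the past, set H0: μ = 3 kg, H1: μ > 3 kg.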
Step 2: State a Level of Significance: Errors in Hypothesis Testing. The significance level of a test is defined as the probability of rejecting the null hypothesis when it is actually true: α = P(reject H0 | H0 is true). It is denoted by the Greek letter α. Rejecting a true null hypothesis is a Type I error, so α is the probability of a Type I error. We select this probability prior to collecting data and testing the hypothesis. A typical value of α is 0.05.
Figure: the significance level α (illustrated with a left-tailed test)
Step 2: State a Level of Significance: Errors in Hypothesis Testing. Another possible error is failing to reject the null hypothesis when it is actually false. Its probability is β = P(accept H0 | H0 is false), denoted by the Greek letter β. This mistake is a Type II error. We cannot select this probability directly. It is related to the choice of α, the sample size, and the data collected.
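Step 2: The two types of error in hypothesis testing (1/2). Type I and Type II errors in hypothesis testing
Conclusion of the test | State of the population (unknown truth): H0 true | H0 false
Accept H0 | Correct decision | Wrong decision (Type II error), β
Reject H0 | Wrong decision (Type I error), α | Correct decision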
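Step 2: The two types of error in hypothesis testing (2/2). In a hypothesis test, the maximum probability of committing a Type I error is called the significance level, written α: α = max P(Type I error) = max P(reject H0 | H0 is true). For a given rejection region, the probability of committing a Type II error is written β: β = P(Type II error) = P(accept H0 | H0 is false).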
Step 2: Probabilities of Type I and Type II errors
Step 3: Identify the Test Statistic. LO10-2 TEST STATISTIC: A value, determined from sample information, used to determine whether to reject or fail to reject the null hypothesis. To test hypotheses about population means we use the z or t statistic. To compare two population variances we use the F statistic; a test about a single population variance uses the chi-square statistic.
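Step 3: Identify the Test Statistic. Testing a population mean: large sample (or normal population with σ known): z test; small sample (σ unknown): t test. Testing population variances: F test (for comparing two variances; a single variance is tested with chi-square). Critical-value methods: critical value of the sample mean; critical values of z and t.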
Step 4: Formulate a Decision Rule: One-Tail vs. Two-Tail Tests. LO10-2 CRITICAL VALUE: Based on the selected level of significance, the critical value is the dividing point between the region where the null hypothesis is rejected and the region where it is not rejected. If the test statistic falls beyond the critical value, in the region of rejection, reject the null hypothesis.
Step 4: One-tailed and two-tailed tests. Note: μ0 denotes any specified value, that is, the hypothesized value of the population mean being tested.
Notation for hypothesis tests
Step 4: Z-test or t-test (1/3). Link to Figure 9.4: illustration comparing the critical-value test and the Z-test
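Step 4: Critical-value test (2/3). Comparison of the critical-value test and the Z-test (or t-test): (a) the Z-test case
Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Two-tailed test, rejection region: z ≥ z_{α/2} or z ≤ -z_{α/2}
Left-tailed test, rejection region: z ≤ -z_α
Right-tailed test, rejection region: z ≥ z_α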
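Step 4: Critical-value test (3/3). Comparison of the critical-value test and the Z-test (or t-test), continued: (b) the t-test case
Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Two-tailed test, rejection region: t ≥ t_{α/2,(n-1)} or t ≤ -t_{α/2,(n-1)}
Left-tailed test, rejection region: t ≤ -t_{α,(n-1)}
Right-tailed test, rejection region: t ≥ t_{α,(n-1)}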
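The z rejection regions for the three kinds of test can be sketched in a few lines. This is only an illustration: the helper name `z_critical` is made up for this note, and the critical values come from Python's standard-library `statistics.NormalDist` rather than from the printed table.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the critical z value(s) for a 'left', 'right', or 'two' tailed test.

    Hypothetical helper for illustration; not part of the textbook.
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)        # reject H0 if z <= -z_alpha
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)    # reject H0 if z >= z_alpha
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # reject H0 if |z| >= z_{alpha/2}
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


left_05 = z_critical(0.05, "left")[0]
right_01 = z_critical(0.01, "right")[0]
two_05 = z_critical(0.05, "two")
print(f"left, alpha=.05: {left_05:.3f}")
print(f"right, alpha=.01: {right_01:.3f}")
print(f"two-tailed, alpha=.05: {two_05[0]:.3f}, {two_05[1]:.3f}")
```

The t case has the same shape, but the critical values depend on the degrees of freedom and are read from the t table.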
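Hypothesis test for a population mean, large sample. Method 1: the critical-value method for the sample mean
Figure: left-tailed test
Figure: right-tailed test
Figure: two-tailed test
Example: rejection and acceptance regions for a two-tailed test. The sampling distribution of $\bar{X}$ when H0 holds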
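Two-tailed test of the population mean μ (1/2). H0: μ = μ0, H1: μ ≠ μ0. Decision rule (when the sampling distribution of $\bar{X}$ is normal): if $\bar{X} \ge c_1$ or $\bar{X} \le c_2$, reject H0; if $c_2 < \bar{X} < c_1$, accept H0. Here $c_1 = \mu_0 + z_{\alpha/2}\,\sigma_{\bar{X}}$ and $c_2 = \mu_0 - z_{\alpha/2}\,\sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ is the standard error of $\bar{X}$ and α is the significance level.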
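Two-tailed test of the population mean μ (2/2). For a normal population, a small sample, and an unknown population standard deviation σ, the critical values c1 and c2 for the two-tailed test of μ are $c_1 = \mu_0 + t_{\alpha/2,\,n-1}\,s_{\bar{X}}$ and $c_2 = \mu_0 - t_{\alpha/2,\,n-1}\,s_{\bar{X}}$ (9-4), where $s_{\bar{X}} = s/\sqrt{n}$ is the standard error of $\bar{X}$.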
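As a quick check of the critical-value method for the z case, the sketch below computes c1 and c2 for the steel-bar rule used in the Type II error example later in this chapter (μ0 = 10,000 psi, σ = 400 psi, n = 100, α = 0.05). The function name `mean_critical_values` is an assumption made for this note.

```python
from math import sqrt
from statistics import NormalDist


def mean_critical_values(mu0, sigma, n, alpha):
    """Two-tailed critical values (c1 upper, c2 lower) for the sample mean, sigma known."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    standard_error = sigma / sqrt(n)              # sigma of the sample mean
    c1 = mu0 + z_half * standard_error
    c2 = mu0 - z_half * standard_error
    return c1, c2


c1, c2 = mean_critical_values(mu0=10_000, sigma=400, n=100, alpha=0.05)
print(f"accept H0 if {c2:.1f} < sample mean < {c1:.1f}")
```

Rounded to whole psi, these agree with the 9,922 and 10,078 limits quoted in that example.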
Step 5: Take a Sample, Arrive at a Decision. LO10-2 Identify an unbiased sample. Collect the data on the relevant variables. Calculate the test statistic. Compare the test statistic to the critical value. Make a decision, i.e., reject or fail to reject the null hypothesis.
Step 6: Interpret the Result. LO10-2 What does the decision to reject or fail to reject the null hypothesis mean in the context of the study? Examples: “Based on the data, there is no evidence to support the hypothesis that pay is related to performance.” “Based on the data, there is evidence that billboard advertising is more effective than paper-based media advertising.”
LO10-5 Conduct a test of a hypothesis about a population mean. Hypothesis Test of a Population Mean, Known Population Standard Deviation: Example (two-tailed test). Jamestown Steel Company manufactures and assembles desks and other office equipment. The weekly production of the Model A325 desk at the Fredonia Plant follows the normal probability distribution with a mean of 200 and a standard deviation of 16. Recently, new production methods have been introduced and new employees hired. The VP of manufacturing would like to investigate whether there has been a change in the weekly production of the Model A325 desk. Data given in the problem: weekly production of the A325 ~ N(200, 16²); mean weekly production over the last 50 weeks (the plant was shut down for 2 weeks of vacation): 203.5 (p. 326).
How to Set Up a Hypothesis Test. LO10-3 LO10-5 In actual practice, the status quo is set up as H0. If the claim is “boastful,” the claim is set up as H1 (we apply the Missouri rule: “show me”). Remember, H1 has the burden of proof. In problem solving, look for key words and convert them into symbols. Some key words include “improved,” “better than,” “as effective as,” “different from,” and “has changed.”
Keywords | Inequality symbol | Part of
Larger (or more) than | > | H1
Smaller (or less) than | < | H1
No more than | ≤ | H0
At least | ≥ | H0
Has increased | > | H1
Is there difference? | ≠ | H1
Has not changed | = | H0
Has “improved”, “is better than”, “is more effective” | See keywords | H1
Hypothesis Setups for Testing a Mean (μ). LO10-5 LO10-3
Important Things to Remember about H0 and H1. LO10-5 H0 is the null hypothesis; H1 is the alternate hypothesis. H0 and H1 are mutually exclusive and collectively exhaustive. H0 is always presumed to be true. H1 has the burden of proof. A random sample (n) is used to “reject H0.” If we conclude “do not reject H0,” this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence to reject H0. Rejecting the null hypothesis suggests that the alternate hypothesis may be true, subject to the probability of a Type I error. Equality is always part of H0 (e.g. “=”, “≥”, “≤”). Strict inequality is always part of H1 (e.g. “≠”, “<”, “>”).
LO10-3 Hypothesis Test of a Population Mean, Known Population Standard Deviation: Example (two-tailed test). Because the problem asks whether production of the A325 has changed, this is a two-tailed test. Step 1: State the null and alternate hypotheses. H0: μ = 200, H1: μ ≠ 200. (Note: the keyword in the problem is “has changed.”) Step 2: Select the level of significance. α = 0.01 as stated in the problem.
LO10-3 Hypothesis Test of a Population Mean, Known Population Standard Deviation: Example (using the Z-value method). Step 3: Select the test statistic. Use the z-distribution since σ is known.
Hypothesis test for a population mean, large sample. Method 2: the Z-value method
LO10-3 Hypothesis Test of a Population Mean, Known Population Standard Deviation: Example (two-tailed test). Step 4: Formulate the decision rule. Reject H0 if |z| > z_{α/2}; with α = 0.01, z_{α/2} = z_{0.005} = 2.576. First compute the test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{203.5 - 200}{16/\sqrt{50}} = 1.55$. Step 5: Make a decision. H0 is not rejected because 1.55 does not fall in the rejection region. Step 6: Interpret the result. We cannot conclude that the population mean differs from 200. So we would report to the vice president of manufacturing that the sample evidence does not show that the production rate at the plant has changed from 200 per week.
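Steps 4 and 5 of the Jamestown example can be reproduced with a short sketch. The figures (μ0 = 200, σ = 16, n = 50, sample mean 203.5, α = 0.01) are from the problem; the critical value is computed with the standard library instead of being read from the table, and the variable names are choices made for this note.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the Jamestown Steel example
mu0, sigma, n, x_bar, alpha = 200, 16, 50, 203.5, 0.01

z = (x_bar - mu0) / (sigma / sqrt(n))         # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value z_{alpha/2}
reject_h0 = abs(z) > z_crit                   # decision rule: reject if |z| > z_{alpha/2}

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```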
LO10-4 Distinguish between a one-tailed and a two-tailed test of hypothesis. Hypothesis Test of a Population Mean, Known Population Standard Deviation: Example. Suppose in the previous problem the vice president wants to know whether there has been an increase in the number of units assembled. To put it another way, can we conclude, because of the improved production methods, that the mean number of desks assembled in the last 50 weeks was more than 200? Recall: μ0 = 200, σ = 16, n = 50, α = .01.
One-Tailed Test versus Two-Tailed Test. LO10-4
LO10-4 Testing for a Population Mean, Known Population Standard Deviation: One-Tail Example. The question is now whether production has increased, so this is a right-tailed test. Step 1: State the null hypothesis and the alternate hypothesis. H0: μ ≤ 200, H1: μ > 200. (Note: the keyword in the problem is “an increase.”) Step 2: Select the level of significance. α = 0.01 as stated in the problem. Step 3: Select the test statistic. Use the z-distribution since σ is known.
LO10-4 Testing for a Population Mean, Known Population Standard Deviation: One-Tail Example. Step 4: Formulate the decision rule. Reject H0 if z > z_α. Note: the test statistic is z = 1.55 and the critical value is z_α = 2.33 (α = 0.01). Step 5: Make a decision. Because 1.55 does not fall in the rejection region, H0 is not rejected. Step 6: Interpret the result. Based on the evidence, we cannot conclude that the average number of desks assembled increased in the last 50 weeks.
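The only thing that changes between the two-tailed and the right-tailed versions of the Jamestown test is where α is placed. A small sketch, with variable names chosen for this note and critical values computed from the standard normal rather than the table:

```python
from math import sqrt
from statistics import NormalDist

z = (203.5 - 200) / (16 / sqrt(50))  # same test statistic in both versions
alpha = 0.01

right_tail_crit = NormalDist().inv_cdf(1 - alpha)    # all of alpha in the upper tail
two_tail_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha split between both tails

reject_right_tailed = z > right_tail_crit
reject_two_tailed = abs(z) > two_tail_crit

print(f"right-tailed: z_alpha = {right_tail_crit:.3f}, reject H0: {reject_right_tailed}")
print(f"two-tailed:   z_alpha/2 = {two_tail_crit:.3f}, reject H0: {reject_two_tailed}")
```

The one-tailed critical value is smaller because the whole rejection area sits in one tail, but here the test statistic falls short of both.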
LO10-6 Compute and interpret a p-value. p-Value in Hypothesis Testing (p. 328). The p-value method treats the computed test statistic as if it were the critical value, finds the probability of the resulting rejection region (the p-value), and then compares the p-value with α (the significance level). A p-value is the probability of observing a sample value as extreme as, or more extreme than, the value observed (the test statistic), given that the null hypothesis is true. In testing a hypothesis, we can also compare the p-value to the significance level (α). Decision rule using the p-value: Reject H0 if p-value < significance level.
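Hypothesis test for a population mean, large sample. Method 3: the p-value method (p. 328)
Figure: the p-value method applied to a test of machine performance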
LO10-6 p-Value in Hypothesis Testing: Example (p. 329), testing μ with σ known. Recall the last problem, where the hypotheses and decision rule were set up as: H0: μ ≤ 200, H1: μ > 200. Reject H0 if z > z_α, where z = 1.55 and z_α = 2.33. Equivalently, reject H0 if p-value < α. The p-value is P(z > 1.55) = 0.0606, which is not < 0.01. Conclude: fail to reject H0, because p > α.
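The p-value for the one-tailed Jamestown test is the area to the right of the test statistic under the standard normal curve. A minimal sketch using the standard library; `p_table` mimics the table lookup by rounding z to two decimals first, which is an assumption about how the 0.0606 figure was obtained.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.01

z = (203.5 - 200) / (16 / sqrt(50))        # unrounded test statistic
p_exact = 1 - std_normal.cdf(z)            # upper-tail area beyond the unrounded z
p_table = 1 - std_normal.cdf(round(z, 2))  # upper-tail area beyond z = 1.55

reject_h0 = p_exact < alpha                # decision rule: reject if p-value < alpha

print(f"p (unrounded z) = {p_exact:.4f}, p (z = 1.55) = {p_table:.4f}, reject H0: {reject_h0}")
```

The small gap between the two p-values comes only from rounding z; the decision is the same either way.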
LO10-6 What does it mean when the p-value < α? (p. 329) If the p-value is less than .10, we have some evidence that H0 is not true. If it is less than .05, we have strong evidence that H0 is not true. If it is less than .01, we have very strong evidence that H0 is not true. If it is less than .001, we have extremely strong evidence that H0 is not true.
ex. 6 (p.330) The waiting time for customers at MacBurger Restaurants follows a normal distribution with a population standard deviation of 1 minute. At the Warren Road MacBurger, the quality-assurance department sampled 50 customers and found that the mean waiting time was 2.75 minutes. At the .05 significance level, can we conclude that the mean waiting time is less than 3 minutes? (a) State the null hypothesis and the alternate hypothesis. (b) State the decision rule. (c) Compute the value of the test statistic. (d) What is your decision regarding H0? (e) What is the p-value? Interpret it.
ex. 6 (p.330) a. H0: μ ≥ 3, H1: μ < 3. b. Reject H0 if z < -1.645. c. z = -1.77, found by $z = \dfrac{2.75 - 3}{1/\sqrt{50}}$. d. Reject H0. e. p = 0.0384, found by (0.5000 - 0.4616), where P(-1.77 < z < 0) = 0.4616. We conclude that the mean waiting time is less than three minutes. When H0 is true, the probability of obtaining a value smaller than -1.77 is 0.0384.
ex. 8 (p.331) At the time she was hired as a server at the Grumney Family Restaurant, Beth Brigden was told, “You can average $80 a day in tips.” Assume the population of daily tips is normally distributed with a standard deviation of $9.95. Over the first 35 days she was employed at the restaurant, the mean daily amount of her tips was $84.85. At the .01 significance level, can Ms. Brigden conclude that her daily tips average more than $80? (a) State the null hypothesis and the alternate hypothesis. (b) State the decision rule. (c) Compute the value of the test statistic. (d) What is your decision regarding H0? (e) What is the p-value? Interpret it.
ex. 8 (p.331) a. H0: μ ≤ 80, H1: μ > 80. b. Reject H0 if z > 2.326. c. z = 2.88, found by $z = \dfrac{84.85 - 80}{9.95/\sqrt{35}}$. d. Reject H0. e. p = 0.0020, found by (0.5000 - 0.4980), where P(0 < z < 2.88) = 0.4980. The mean amount of tips per day is larger than $80.00. If H0 is true, you will obtain a sample mean this far above 80 about one time out of 500 tests.
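Exercises 6 and 8 follow the same pattern, so one small function covers both. This is a sketch under the exercises' own assumptions (normal population, σ known); the function name `one_tailed_z_test` and its arguments are invented for this note.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_test(x_bar, mu0, sigma, n, alpha, tail):
    """Return (z, p_value, reject_h0) for a one-tailed z test with sigma known."""
    std_normal = NormalDist()
    z = (x_bar - mu0) / (sigma / sqrt(n))
    if tail == "left":       # H1: mu < mu0, p-value is the area below z
        p_value = std_normal.cdf(z)
    elif tail == "right":    # H1: mu > mu0, p-value is the area above z
        p_value = 1 - std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'left' or 'right'")
    return z, p_value, p_value < alpha


# ex. 6: waiting time, H1: mu < 3, alpha = .05
ex6_z, ex6_p, ex6_reject = one_tailed_z_test(2.75, 3, 1, 50, 0.05, "left")
# ex. 8: daily tips, H1: mu > 80, alpha = .01
ex8_z, ex8_p, ex8_reject = one_tailed_z_test(84.85, 80, 9.95, 35, 0.01, "right")

print(f"ex. 6: z = {ex6_z:.2f}, p = {ex6_p:.4f}, reject H0: {ex6_reject}")
print(f"ex. 8: z = {ex8_z:.2f}, p = {ex8_p:.4f}, reject H0: {ex8_reject}")
```

The p-values differ slightly from the table-based answers because z is not rounded to two decimals before the lookup.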
LO10-7 Use a t statistic to test a hypothesis. Testing for the Population Mean: Population Standard Deviation Unknown (p. 331). When the population standard deviation (σ) is unknown, the sample standard deviation (s) is used in its place. The test statistic then follows the t-distribution with n - 1 degrees of freedom and is computed using the formula $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$.
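Testing for the Population Mean: Population Standard Deviation Unknown (p. 331). The figure on page 292 of the textbook explains that when σ is unknown, s is used in place of σ and the sampling distribution used for the test is the t distribution. The t distribution is usually applied to small samples; for a large sample, by the central limit theorem, the Z distribution can be used instead, so a Z-test should also be acceptable. Properties of the t distribution: it is a continuous distribution; it is bell-shaped; for a sample of size n it has n - 1 degrees of freedom, and the distribution differs for different degrees of freedom; as the degrees of freedom increase, it approaches the Z distribution; it is flatter and more spread out than the Z distribution.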
LO10-7 Testing μ with σ unknown (pp. 331-332): Example 1. The McFarland Insurance Company Claims Department reports the mean cost to process a claim is $60. An industry comparison showed this amount to be larger than most other insurance companies, so the company instituted cost-cutting measures. To evaluate the effect of the cost-cutting measures, the Supervisor of the Claims Department selected a random sample of 26 claims processed last month. The sample information is reported below. At the .01 significance level, is it reasonable to conclude that the mean cost to process a claim is now less than $60?
LO10-7 Testing μ with σ unknown (pp. 331-333): Example 1. Step 1: State the null hypothesis and the alternate hypothesis (left-tailed test). H0: μ ≥ $60, H1: μ < $60. (Note: the keyword in the problem is “now less than.”) Step 2: Select the level of significance. α = 0.01 as stated in the problem. Step 3: Select the test statistic. Since σ is unknown, use a t-distribution with n - 1 (26 - 1 = 25) degrees of freedom (small sample).
LO10-7 t-Distribution Table (Portion): Example 1. Find the critical t value in the 98% confidence column (one-tailed significance level 1%) with 25 degrees of freedom.
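Testing μ with σ unknown (pp. 331-334): Example 1. Steps 4-5: The 26 sampled claims give a sample mean cost of $56.42 and a sample standard deviation of $10.04. Compute the t test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{56.42 - 60}{10.04/\sqrt{26}} = -1.818$. Find the critical value with 26 - 1 = 25 degrees of freedom: $-t_{\alpha,\,n-1} = -t_{0.01,\,25} = -2.485$. Since t = -1.818 > -2.485, H0 cannot be rejected.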
LO10-7 Testing μ with σ unknown (pp. 331-334): Example 1. Step 4: Formulate the decision rule. Reject H0 if t < -t_{α,n-1} = -2.485. Step 5: Make a decision. Because t = -1.818 is greater than the critical value -2.485, it does not fall in the rejection region, and H0 is not rejected at the .01 significance level. Step 6: Interpret the result. We have not demonstrated that the cost-cutting measures reduced the mean cost per claim to less than $60. The difference of $3.58 ($56.42 - $60) between the sample mean and the population mean could be due to sampling error.
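The arithmetic of the McFarland example can be checked as follows. Python's standard library has no t distribution, so this sketch takes the critical value -2.485 from the t table as quoted in the example instead of computing it; the variable names are choices made for this note.

```python
from math import sqrt

# Figures from the McFarland Insurance example
n, x_bar, s, mu0 = 26, 56.42, 10.04, 60
t_crit = -2.485  # -t_{0.01, 25}, read from the t table (not computed here)

t = (x_bar - mu0) / (s / sqrt(n))  # t statistic with n - 1 = 25 degrees of freedom
reject_h0 = t < t_crit             # left-tailed rule: reject if t < -t_{alpha, n-1}

print(f"t = {t:.3f}, critical value = {t_crit}, reject H0: {reject_h0}")
```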
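Testing μ with σ unknown (p. 334): Example 2. The short-term parking lot at Myrtle Beach International Airport is close to the terminal entrance, so it is only a short walk to baggage claim, which is convenient for people meeting arriving passengers. The airport manager wants to know whether the lot has enough parking spaces, and needs to determine whether the mean parking time exceeds 40 minutes. A sample of 12 recent customers was taken; their parking times are as follows: Question: at the 0.05 significance level, can we say that the mean parking time in this lot exceeds 40 minutes?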
Testing μ with σ unknown (p334-345): Example 2. Step 1: State the null and the alternate hypothesis. H0: µ ≤ 40, H1: µ > 40 (right-tailed test). Step 2: Select the level of significance. It is .05. Step 3: Select the test statistic. Use the t test with 12 - 1 = 11 degrees of freedom. Critical value: t_{α,n-1} = t_{0.05,11} = 1.796.
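Testing μ with σ unknown (p334-345): Example 2. Compute the t test statistic. Sample: n = 12, sample mean = 48, sample standard deviation = 9.835, so $t = \dfrac{48 - 40}{9.835/\sqrt{12}} = 2.818$.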
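Testing μ with σ unknown (p334-345): Example 2. Test result: because t = 2.818 > t_{0.05,11} = 1.796, reject H0, that is, accept H1. Conclusion: the mean parking time exceeds 40 minutes, so there may not be enough parking spaces and more may be needed.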
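Testing μ with σ unknown (p334-345): Example 2. What if the significance level were 0.01 or 0.005? Would the conclusion change? The sample test statistic is t = 2.818, while t_{0.01,11} = 2.718 and t_{0.005,11} = 3.106. So at α = 0.01 the conclusion is unchanged, but at α = 0.005 it changes to failing to reject H0. Can the p-value be found from the table? From the values above, the p-value lies between 0.005 and 0.01, closer to 0.01. This p-value shows that rejecting the null hypothesis H0 is appropriate, because p < 0.01.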
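Bracketing the p-value from the t table amounts to comparing the test statistic with the critical value at each α. A sketch with the parking-lot figures; the three critical values are the table values quoted in the example, and the dictionary names are made up for this note.

```python
from math import sqrt

# Figures from the parking-lot example
n, x_bar, s, mu0 = 12, 48, 9.835, 40
t = (x_bar - mu0) / (s / sqrt(n))  # t statistic with 11 degrees of freedom

# One-tailed critical values t_{alpha, 11} as quoted from the t table
table_critical = {0.05: 1.796, 0.01: 2.718, 0.005: 3.106}

# Right-tailed rule: reject H0 when t exceeds the critical value
decisions = {alpha: t > crit for alpha, crit in table_critical.items()}

for alpha, rejected in decisions.items():
    print(f"alpha = {alpha}: reject H0 = {rejected}")

# The p-value lies between the largest alpha that fails to reject
# and the smallest alpha that rejects.
p_lower = max(a for a, rejected in decisions.items() if not rejected)
p_upper = min(a for a, rejected in decisions.items() if rejected)
print(f"{p_lower} < p-value < {p_upper}")
```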
LO10-7 Testing μ with σ unknown: Example 3. The current rate for producing 5 amp fuses at Neary Electric Co. is 250 per hour. A new machine has been purchased and installed that, according to the supplier, will increase the production rate. A sample of 10 randomly selected hours from last month revealed the mean hourly production on the new machine was 256 units, with a sample standard deviation of 6 per hour. At the .05 significance level, can Neary conclude that the new machine is faster?
LO10-7 Testing μ with σ unknown: Example 3. Step 1: State the null and the alternate hypothesis (right-tailed test). H0: µ ≤ 250, H1: µ > 250. Step 2: Select the level of significance. It is .05. Step 3: Select the test statistic. The population standard deviation is not known. Use the t-distribution with n - 1 (10 - 1 = 9) degrees of freedom. From the table (p. 734), the critical value is t_{0.05,9} = 1.833 (the 90% confidence column). Sample mean = 256, sample standard deviation = 6.
LO10-7 Testing μ with σ unknown: Example 3. Step 4: State the decision rule. There are 10 - 1 = 9 degrees of freedom. The null hypothesis is rejected if t > 1.833. Step 5: Make a decision. The test statistic is $t = \dfrac{256 - 250}{6/\sqrt{10}} = 3.162$, which exceeds 1.833, so the null hypothesis is rejected. Step 6: Interpret the results. The mean number produced is more than 250 per hour.
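Testing μ with σ unknown: Example 3. As in the previous example, the p-value can be bracketed from the table: here it lies between 0.01 and 0.005, and is closer to 0.005. This gives even stronger grounds for rejecting the null hypothesis; that is, the new machine makes production faster, and the population mean output should exceed 250 per hour.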
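A short check of the Neary Electric numbers. The critical value 1.833 is the table value quoted in the example; the two bracketing values t_{0.01,9} = 2.821 and t_{0.005,9} = 3.250 are standard t-table entries that are not printed on the slides, so treat them as an assumption of this sketch.

```python
from math import sqrt

# Figures from the Neary Electric example
n, x_bar, s, mu0 = 10, 256, 6, 250
t = (x_bar - mu0) / (s / sqrt(n))  # t statistic with 9 degrees of freedom

t_crit_05 = 1.833                  # t_{0.05, 9}, quoted in the example
reject_h0 = t > t_crit_05          # right-tailed rule

# Standard t-table entries for 9 degrees of freedom (not shown on the slides)
t_crit_01, t_crit_005 = 2.821, 3.250
p_between_005_and_01 = t_crit_01 < t < t_crit_005

print(f"t = {t:.3f}, reject H0 at .05: {reject_h0}")
print(f"0.005 < p-value < 0.01: {p_between_005_and_01}")
```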
ex. 14 (p. 336) Most air travelers now use e-tickets. Electronic ticketing allows passengers to not worry about a paper ticket, and it costs the airline companies less to handle than paper ticketing. However, in recent times the airlines have received complaints from passengers regarding their e-tickets, particularly when connecting flights and a change of airlines were involved. To investigate the problem, an independent watchdog agency contacted a random sample of 20 airports and collected information on the number of complaints the airport had with e-tickets for the month of March. The information is reported below. At the .05 significance level, can the watchdog agency conclude the mean number of complaints per airport is less than 15 per month? a. What assumption is necessary before conducting a test of hypothesis? b. Plot the number of complaints per airport in a frequency distribution or a dot plot. Is it reasonable to conclude that the population follows a normal distribution? c. Conduct a test of hypothesis and interpret the results.
ex. 14 (p. 336) a. The population of complaints follows a normal probability distribution. b. The assumption of normality appears reasonable.
ex. 14 (p. 336) c. H0: μ ≥ 15, H1: μ < 15. Reject H0 if t < -1.729. Reject the null hypothesis. The mean number of complaints is less than 15.
ex. 20 (p. 339) Hugger Polls contends that an agent conducts a mean of 53 in-depth home surveys every week. A streamlined survey form has been introduced, and Hugger wants to evaluate its effectiveness. The number of in-depth surveys conducted during a week by a random sample of 15 agents are: At the .05 level of significance, can we conclude that the mean number of interviews conducted by the agents is more than 53 per week? Estimate the p-value.
ex. 20 (p. 339) H0: μ ≤ 53, H1: μ > 53. Reject H0 if t > 1.761. Reject H0. The mean number of surveys conducted is greater than 53. The p-value is less than 0.005.
LO10-8 Compute the probability of a Type II error. Recall that the probability of a Type I error, or the level of significance, is defined as the probability of rejecting the null hypothesis when it is actually true. It is denoted by the Greek letter alpha, α. That is, α = max P(reject H0 | H0 is true). The probability of a Type II error is defined as the probability of “accepting” the null hypothesis when it is actually false. It is denoted by the Greek letter beta, β. That is, β = P(accept H0 | H0 is false).
LO10-8 Type II error: example (pp. 340-342). A manufacturer purchases steel bars to make cotter pins. Past experience indicates that the mean tensile strength of all incoming shipments is 10,000 psi (pounds per square inch) and that the standard deviation, σ, is 400 psi. In order to make a decision about incoming shipments of steel bars, the manufacturer set up this rule for the quality-control inspector to follow: “Take a sample of 100 steel bars. At the .05 significance level, if the sample mean strength falls between 9,922 psi and 10,078 psi, accept the lot. Otherwise the lot is to be rejected.” If the population mean of a newly arrived lot is actually 9,900 psi rather than 10,000 psi, what is the probability that quality control does not reject the lot?
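Type II error: example (pp. 340-341). μ0 = 10,000 psi, σ = 400 psi, α = 0.05. If the population mean of the incoming lot is actually 9,900 psi rather than 10,000 psi, the probability that quality control does not reject the lot is the probability of a Type II error: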
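Type II error: example (pp. 340-341). The true mean is μ = μ1 = 9,900 < 9,922, so under the quality-control rule the lot should be rejected, yet it may still be accepted. Why? Because part of the true distribution overlaps the “acceptance region” of the hypothesized distribution. From the figure, the lot is accepted if 9,922 < $\bar{X}$ < 10,078. Under the true population distribution this corresponds to 0.55 < z < 4.45, since (9,922 - 9,900)/(400/√100) = 0.55 and (10,078 - 9,900)/(400/√100) = 4.45. The probability of committing a Type II error by accepting the lot is β = P(z > 0.55) = 0.2912.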
Type I and Type II Errors Illustrated. LO10-8 If we take a sample of 100 bars and the sample mean is 9,900 psi, this value is our best estimate of the true population mean. And, based on the Region A graph, we should reject the null hypothesis with a 0.05 probability of a Type I error. However, there is always sampling error. For a distribution with a population mean of 9,900, it is possible that a sample would have a sample mean greater than 9,922. See Region B. So we could commit a Type II error: fail to reject a false null hypothesis. Region B and the computed z-statistic show that the probability of a Type II error is 0.2912 when our estimate of the population mean is 9,900.
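Type II error: example (pp. 341-342). μ0 = 10,000 psi, σ = 400 psi, α = 0.05. If the population mean of the incoming lot is actually 10,120 psi rather than 10,000 psi, the probability that quality control does not reject the lot is the probability of a Type II error: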
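Type II error: example (pp. 341-342). The true mean is μ = μ1 = 10,120 > 10,078, so under the quality-control rule the lot should be rejected, yet it may still be accepted. Why? Because part of the true distribution overlaps the “acceptance region” of the hypothesized distribution. From the figure, the lot is accepted if 9,922 < $\bar{X}$ < 10,078. Under the true population distribution this corresponds to -4.95 < z < -1.05, since (9,922 - 10,120)/(400/√100) = -4.95 and (10,078 - 10,120)/(400/√100) = -1.05. The probability of committing a Type II error by accepting the lot is β = P(z < -1.05) = 0.1469. Then 1 - β = 0.8531 is the probability of not committing a Type II error (also called the power of the test).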
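Both Type II error probabilities in the steel-bar example are the probability that the sample mean lands inside the acceptance region when the true mean is μ1. A sketch with the standard library; the function name `type_ii_error` is invented for this note, and the limits 9,922 and 10,078 are the rounded values from the quality-control rule.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu1, lower, upper, sigma, n):
    """P(lower < sample mean < upper) when the true population mean is mu1."""
    sampling_dist = NormalDist(mu=mu1, sigma=sigma / sqrt(n))  # distribution of the sample mean
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


beta_9900 = type_ii_error(9_900, 9_922, 10_078, sigma=400, n=100)
beta_10120 = type_ii_error(10_120, 9_922, 10_078, sigma=400, n=100)
power_10120 = 1 - beta_10120  # probability of correctly rejecting the lot

print(f"mu1 = 9,900:  beta = {beta_9900:.4f}")
print(f"mu1 = 10,120: beta = {beta_10120:.4f}, power = {power_10120:.4f}")
```

The far tail of each interval contributes almost nothing, which is why the worked example uses a single tail area for β.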
Type II error: example (pp. 341-342). μ0 = 10,000 psi, σ = 400 psi, α = 0.05. The quantity 1 - β is also called the power of a test.
ex. 21 (p. 342) Refer to Table 10–4 and the example just completed. With μ1 = 9,880, verify that the probability of a Type II error is .1469.
ex. 21 (p. 342) z = 1.05, found by $z = \dfrac{9{,}922 - 9{,}880}{400/\sqrt{100}}$. Then 0.5000 - 0.3531 = 0.1469, which is the probability of a Type II error.
ex. 22 (p. 342) Refer to Table 10–4 and the example just completed. With μ1 = 10,100, verify that the probability of a Type II error is .2912.
ex. 22 (p. 342) z = -0.55, found by $z = \dfrac{10{,}078 - 10{,}100}{400/\sqrt{100}}$. Then the area beyond -0.55 is 0.5000 - 0.2088 = 0.2912, which is the probability of a Type II error.
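Example 1: A manufacturer claims that its cans of coffee weigh at least 3 pounds on average. A random sample of 36 cans has a mean weight of 2.97 pounds. Assuming a population standard deviation of 0.18 pounds, test at the 0.01 significance level whether the manufacturer is telling the truth. a. Use the z-value method. b. Use the p-value method. c. If the true population mean is 2.875, what is the probability that we fail to detect that the manufacturer's claim is false?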
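Example 1: a. Left-tailed test, α = 0.01, critical value z* = -2.33. H0: μ ≥ 3, H1: μ < 3. The test statistic is $z = \dfrac{2.97 - 3}{0.18/\sqrt{36}} = -1$. Since z = -1 > z*, H0 cannot be rejected. b. p-value = P($\bar{X}$ ≤ 2.97 | H0 is true) = P(z < -1) = 0.5 - 0.3413 = 0.1587. Because the p-value > α, H0 cannot be rejected. c. To compute the probability of a Type II error, first find the critical value of the sample mean, $\bar{X}_c$.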
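Example 1: c. Find the critical value: $\bar{X}_c = \mu_0 - z_{0.01}\,\sigma/\sqrt{n} = 3 - 2.33(0.03) = 2.93$, so if $\bar{X}$ > 2.93, H0 is accepted. Find the z value of 2.93 under the new population (μ1 = 2.875): $z = \dfrac{2.93 - 2.875}{0.03} = 1.83$. A Type II error means accepting H0 when it should be rejected, so compute the probability of the “acceptance region” under the true population: β = P(z > 1.83) = 0.5 - 0.4664 = 0.0336.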
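Example 1: What if the true mean is μ1 = 2.9375? c. The critical value is still $\bar{X}_c$ = 2.93, so if $\bar{X}$ > 2.93, H0 is accepted. Find the z value of 2.93 under the new population: $z = \dfrac{2.93 - 2.9375}{0.03} = -0.25$. A Type II error means accepting H0 when it should be rejected, so compute the probability of the “acceptance region” under the true population: β = P(z > -0.25) = 0.5 + 0.0987 = 0.5987. So the closer μ1 is to μ0, the larger β becomes; and the larger α is (the smaller the acceptance region), the smaller β becomes.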
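Part c of the coffee example can be sketched as a function of the true mean μ1. The critical value is rounded to 2.93 to match the worked solution; the function name `beta` and the rounding step are choices made for this note.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the coffee example
mu0, sigma, n, alpha = 3.0, 0.18, 36, 0.01
standard_error = sigma / sqrt(n)  # 0.18 / 6 = 0.03

# Left-tailed critical value of the sample mean, rounded as in the worked solution
x_crit = round(mu0 + NormalDist().inv_cdf(alpha) * standard_error, 2)


def beta(mu1):
    """P(sample mean > x_crit | true mean mu1): H0 is accepted although it is false."""
    return 1 - NormalDist(mu=mu1, sigma=standard_error).cdf(x_crit)


beta_far = beta(2.875)    # true mean far below 3
beta_near = beta(2.9375)  # true mean closer to 3

print(f"critical value = {x_crit}")
print(f"beta(2.875) = {beta_far:.4f}, beta(2.9375) = {beta_near:.4f}")
```

The first value differs slightly from the table-based 0.0336 because z is not rounded to 1.83 before the lookup.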
P-value test. Link to Figure 9.6: the relationship among the critical-value test, the Z-test, and the P-value test
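The value of β. When c changes from 2.93 to 2.94, α increases and β decreases. Figure: the relationship between α and β within the same test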
The following is supplementary material.
Test of a population proportion (1/4). Figure 9-9: computing the critical values c1 and c2
Test of a population proportion (2/4). Critical-value test of the population proportion P: critical values, decision rules, and notes
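Test of a population proportion (3/4). Z-test of the population proportion P: test statistic and decision rule for each case. (1) Two-tailed test: H0: P = P0, H1: P ≠ P0. (2) Left-tailed test: H0: P ≥ P0, H1: P < P0. (3) Right-tailed test: H0: P ≤ P0, H1: P > P0. The same test statistic is used in all three cases.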
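The test statistic for this slide did not survive extraction. For reference, the usual large-sample form is sketched below; the symbols $\hat{p}$ (sample proportion) and n (sample size) are assumptions of this note and may differ from the notation in the original table.

```latex
% Assumed standard large-sample statistic; \hat{p} and n are not named on the slide.
z = \frac{\hat{p} - P_0}{\sqrt{P_0 (1 - P_0) / n}}

% Rejection regions at significance level \alpha:
%   (1) two-tailed:   z \ge z_{\alpha/2} \ \text{or}\ z \le -z_{\alpha/2}
%   (2) left-tailed:  z \le -z_{\alpha}
%   (3) right-tailed: z \ge z_{\alpha}
```

The rejection regions have the same form as in the Z-test of a mean.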
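Test of a population proportion (4/4). Confidence interval for the population proportion and decision rule. (1) Two-tailed test: H0: P = P0, H1: P ≠ P0. If the interval contains P0, accept H0; otherwise reject H0. (2) Left-tailed test: H0: P ≥ P0, H1: P < P0. Same rule. (3) Right-tailed test: H0: P ≤ P0, H1: P > P0. Same rule.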
Test of a population variance (1/4). Figure: the critical value c for a right-tailed test at significance level α
Test of a population variance (2/4). Critical values and rejection regions for each case of the critical-value test of the population variance σ². Link to Table 9.10
Test of a population variance (3/4). The cases of the χ²-test of the population variance σ². Link to Table 9.11
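Table 9.11 itself is not reproduced here. As a reminder, the usual χ² statistic for a single variance from a normal population is sketched below; the hypothesized value $\sigma_0^2$ and the upper-tail subscript convention are assumptions of this note and may not match the table's notation.

```latex
% Assumed standard statistic; \sigma_0^2 is the hypothesized variance under H0.
\chi^2 = \frac{(n - 1)\, s^2}{\sigma_0^2}, \qquad \text{with } n - 1 \text{ degrees of freedom}

% Rejection regions at significance level \alpha (upper-tail subscripts):
%   two-tailed:   \chi^2 \ge \chi^2_{\alpha/2,\,n-1} \ \text{or}\ \chi^2 \le \chi^2_{1-\alpha/2,\,n-1}
%   left-tailed:  \chi^2 \le \chi^2_{1-\alpha,\,n-1}
%   right-tailed: \chi^2 \ge \chi^2_{\alpha,\,n-1}
```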
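Test of a population variance (4/4). P-value test of the population variance σ²: computing the P-value for (1) the two-tailed test, (2) the left-tailed test, and (3) the right-tailed test. Note: s² denotes the observed value of the sample variance S².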
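Confidence-interval test (1/4). In the two-tailed test H0: μ = μ0, H1: μ ≠ μ0, if the (1 - α)100% confidence interval for the population mean μ, equation (9-8), contains μ0, then the observed value of the sample mean $\bar{X}$ falls in the acceptance region, equation (9-9), and the conclusion is to accept H0.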
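Confidence-interval test (2/4). Critical-value test: compute the critical values and find the acceptance (or rejection) region; if the observed value of the test statistic falls in the acceptance region, accept H0; otherwise reject H0. Confidence-interval test: compute the confidence interval for the population parameter; if the interval contains the value hypothesized under H0 (such as μ0), accept H0; otherwise reject H0.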
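Confidence-interval test (3/4). The relationship between the acceptance region of the two-tailed test of μ and the confidence interval for μ. Link to Figure 9.5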
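Confidence-interval test (4/4). Confidence intervals for μ in each case (the confidence-interval test method), for a sampling distribution that is normal (if σ is unknown, s may be used in its place) and for a sampling distribution that follows the t distribution. (1) Two-tailed test: H0: μ = μ0, H1: μ ≠ μ0. (2) Left-tailed test: H0: μ ≥ μ0, H1: μ < μ0. (3) Right-tailed test: H0: μ ≤ μ0, H1: μ > μ0.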
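The equivalence between the two-tailed critical-value test and the confidence-interval test can be checked numerically. The sketch below reuses the Jamestown figures (μ0 = 200, σ = 16, n = 50, sample mean 203.5, α = 0.01); the variable names are choices made for this note.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the Jamestown Steel example
mu0, sigma, n, x_bar, alpha = 200, 16, 50, 203.5, 0.01

# Half-width shared by the confidence interval and the acceptance region
margin = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)

# Confidence-interval test: does the (1 - alpha) interval around x_bar contain mu0?
ci_low, ci_high = x_bar - margin, x_bar + margin
accept_by_interval = ci_low <= mu0 <= ci_high

# Critical-value test: does x_bar fall between c2 and c1 around mu0?
c2, c1 = mu0 - margin, mu0 + margin
accept_by_critical_values = c2 < x_bar < c1

print(f"99% CI: ({ci_low:.2f}, {ci_high:.2f}), contains mu0: {accept_by_interval}")
print(f"acceptance region: ({c2:.2f}, {c1:.2f}), contains x_bar: {accept_by_critical_values}")
```

Both checks ask whether the sample mean and μ0 are within the same margin of each other, so they agree except exactly at the boundary, where the strict and non-strict inequalities differ.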
Power function
Figure: test of a population mean, small sample
Hypothesis test for a population mean, small sample