Estimation and Confidence Intervals Chapter 9 . Copyright © 2015 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Learning Objectives
LO9-1 Compute and interpret a point estimate of a population mean.
LO9-2 Compute and interpret a confidence interval for a population mean.
LO9-3 Compute and interpret a confidence interval for a population proportion.
LO9-4 Calculate the required sample size to estimate a population proportion or population mean.
LO9-5 Adjust a confidence interval for finite populations.
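Basic Concepts of Estimation
Point estimation: compute a single estimate from sample data to infer an unknown population parameter.
Interval estimation: starting from the point estimate computed from the sample, use the sampling distribution of the point estimator to obtain two values that bound an interval, called the interval estimate, and use that interval to infer the range in which the unknown population parameter lies.
Point Estimates (LO9-1: Compute and interpret a point estimate of a population mean.) A point estimate is a single value (point) derived from a sample and used to estimate a population value.
1-1 Point Estimation
Statistical estimation comes in two kinds (p. 280). A point estimate is a value computed from one sample with a sample statistic and used to infer the value of a population parameter. E.g., a professor wants to know the mean salary that university graduates earn in their first job. Using simple random sampling, the professor draws one random sample of 50 from the population of 250,000 university graduates in Taiwan and finds a sample mean salary of NT$21,000. Suppose the professor then concludes: "The mean first-job salary of university graduates in Taiwan is NT$21,000!"
Point Estimates: is the professor's conclusion reliable? Put differently, is the sample statistic (the mean first-job salary of 50 graduates) close to, and representative of, the population parameter (the mean first-job salary of all university graduates in Taiwan)? A point estimate does not necessarily match the population value, and inferring a population parameter from the statistic of a single sample is unreliable: a second sample of 50 would give a different point estimate, and nothing tells us which of the two to use. Estimating a population parameter from the point estimate of a single sample has two drawbacks:
(1) We cannot tell whether the estimate equals the true value of the population parameter.
(2) If it does not, we cannot tell how large the error is.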
Confidence Interval Estimates (LO9-2: Compute and interpret a confidence interval for a population mean.) A confidence interval estimate is a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the level of confidence. C.I. = point estimate ± margin of error
Confidence interval estimation: steps
(1) Obtain the point estimate (the mean) from the sample, e.g. the mean first-year salary of university graduates is NT$21,000.
(2) Around this estimate, derive a lower and an upper limit for the chosen level of confidence, e.g. an interval from NT$18,000 to NT$25,000.
(3) Commonly used confidence levels are 95% and 99%. A 95% level means we are 95% confident that the population parameter lies inside the interval, e.g. we are 95% confident that the interval from NT$18,000 to NT$25,000 contains the population mean (the mean salary of all university graduates in Taiwan).
How is the range of a confidence interval determined? When the population mean is inferred from a sample mean, there are two cases:
(1) The population is normal and its standard deviation σ is known.
(2) The population standard deviation σ is unknown: the sample standard deviation s is used in place of σ.
Confidence Intervals for a Mean, σ Known (LO9-2). The width of the interval is determined by the level of confidence and the size of the standard error of the mean. The standard error is affected by two values: the standard deviation σ and the number of observations in the sample, n.
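Figure: \(P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha\). The sampling distribution of \(\bar{X}\) is normal (8-6). Note: \(z_{\alpha}\) is the critical value of the standard normal distribution.
The sampling distribution of \(\bar{X}\) is normal (8-9). Confidence interval for a population parameter = point estimator ± sampling error.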
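Normal population with known standard deviation σ
Q: How can the unknown population mean be inferred from the sample mean?
Recall from the previous chapter: when the population is normal (or its distribution is unknown but the sample is large enough) and the population mean and standard deviation are known, probabilities for the sample mean are easy to find by standardizing, because
\[ \bar{X} \sim N\!\left(\mu_x,\ \sigma_x^2/n\right), \qquad Z = \frac{\bar{X}-\mu_x}{\sigma_x/\sqrt{n}}. \]
If \(P(-A < Z < A) = 95\%\), what is A? From the standard normal table, A = 1.96:
\[ P(-1.96 < Z < 1.96) = 95\% \]
\[ P\!\left(-1.96 < \frac{\bar{X}-\mu_x}{\sigma_x/\sqrt{n}} < 1.96\right) = 95\% \]
\[ P\!\left(\mu_x - 1.96\,\frac{\sigma_x}{\sqrt{n}} < \bar{X} < \mu_x + 1.96\,\frac{\sigma_x}{\sqrt{n}}\right) = 95\% \]
In this chapter, however, \(\bar{X}\) is known and \(\mu_x\) is not, so the inequality is rearranged around \(\mu_x\):
\[ P\!\left(\bar{X} - 1.96\,\frac{\sigma_x}{\sqrt{n}} < \mu_x < \bar{X} + 1.96\,\frac{\sigma_x}{\sqrt{n}}\right) = 95\% \]
This is the 95% confidence interval: the probability that it contains the population mean is 95%.
Level of confidence
\[ P\!\left(\bar{X} - 1.96\,\frac{\sigma_x}{\sqrt{n}} < \mu_x < \bar{X} + 1.96\,\frac{\sigma_x}{\sqrt{n}}\right) = 95\% \]
At the 95% level of confidence, the probability that this interval contains the population mean is 95%.
What if the level of confidence is 99%? \(P(-b < Z < b) = 99\%\) gives \(b = 2.58\), so
\[ P\!\left(\bar{X} - 2.58\,\frac{\sigma_x}{\sqrt{n}} < \mu_x < \bar{X} + 2.58\,\frac{\sigma_x}{\sqrt{n}}\right) = 99\% \]
What if the level of confidence is 90%? The multiplier becomes 1.645:
\[ P\!\left(\bar{X} - 1.645\,\frac{\sigma_x}{\sqrt{n}} < \mu_x < \bar{X} + 1.645\,\frac{\sigma_x}{\sqrt{n}}\right) = 90\% \]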
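The z multipliers for the 90%, 95% and 99% levels can be recomputed instead of read from a table. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the helper name `z_critical` is illustrative and does not come from the textbook.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for a given level of confidence."""
    alpha = 1.0 - confidence
    # The middle `confidence` share of the standard normal leaves alpha/2 in each tail.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

The 99% value comes out near 2.576, which the slides round to 2.58.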
Factors Affecting Confidence Interval Estimates (LO9-2). The width of a confidence interval is determined by:
1. The sample size, n.
2. The variability in the population, usually σ estimated by s.
3. The desired level of confidence.
The meaning of the level of confidence (p. 282). At the 95% level, the confidence interval for the population mean is \(\bar{X} \pm 1.96\,\sigma_x/\sqrt{n}\), and its width changes with the sample size. If an interval were computed from every possible sample mean at the 95% level, 95% of those intervals would contain the population parameter and 5% would not. E.g., select 100 samples (each holding the salaries of 50 university graduates), compute each sample mean, and derive the corresponding confidence interval from each, giving 100 intervals; about 95% of them will cover the population mean. Note: every sample here is drawn by random sampling. In this sense a 95% confidence interval expresses 95% confidence that the population parameter lies inside it.
The sampling distribution of \(\bar{X}\) is normal. The confidence interval \(\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}\) is a random interval that may contain the true value of μ. The probability \(1-\alpha\) can be read as a relative frequency under repeated sampling: about \((1-\alpha)100\%\) of the intervals contain μ. Once \(\bar{x}\) has been computed from the observed sample, the resulting interval is taken as the \((1-\alpha)100\%\) confidence interval for μ.
Figure 8.6: 95% confidence intervals for the population mean μ; 94 of the 100 intervals contain μ.
Interval Estimates: Interpretation (LO9-2). For a 95% confidence interval, about 95% of similarly constructed intervals will contain the parameter being estimated. Equivalently, for a specified sample size, 95% of the sample means lie within 1.96 standard errors of the population mean, whose value is unknown.
Confidence Interval for a Mean, σ Known: Example (p. 285) (LO9-2). The American Management Association surveys middle managers in the retail industry and wants to estimate their mean annual income. A random sample of 49 managers reveals a sample mean of $45,420. The standard deviation of this population is $2,050. What is the best point estimate of the population mean? What is a reasonable range of values for the population mean? What do these results mean?
What is the best point estimate of the population mean? We do not know the population mean and cannot observe it, so our best estimate is the corresponding sample statistic. The sample mean of $45,420 is the point estimate of the unknown population mean.
What is a reasonable range of values for the population mean? Suppose the association decides to use the 95 percent level of confidence. The interval estimate is then computed from
\[ \bar{X} \pm z\,\frac{\sigma}{\sqrt{n}} \]
How to Obtain a z-value for a Given Confidence Level (p. 283). The 95 percent confidence refers to the middle 95 percent of the observations. Therefore, the remaining 5 percent are equally divided between the two tails. In the table of Appendix B.3, the area 0.4750 (half of 0.95, since 0.475 × 2 = 0.95) corresponds to z = 1.96.
The 95 percent confidence interval estimate is:
\[ \bar{X} \pm z\,\frac{\sigma}{\sqrt{n}} = 45{,}420 \pm 1.96\,\frac{2{,}050}{\sqrt{49}} = 45{,}420 \pm 574 \]
so the confidence limits are $44,846 and $45,994.
Confidence Interval for a Mean: Interpretation. What is the interpretation of the confidence limits $44,846 and $45,994? If we select many samples of 49 managers, and for each sample we compute the mean and then construct a 95 percent confidence interval, we could expect about 95 percent of these confidence intervals to contain the population mean, even though we do not know its value. Conversely, about 5 percent of the intervals would not contain the population mean annual income, µ.
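The arithmetic for the manager-salary interval can be checked with a few lines of code. This is a sketch under the slide's assumptions (σ known, z-based interval); the function name `z_interval` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # z times the standard error of the mean
    return mean - margin, mean + margin


# Sample of 49 managers, sample mean $45,420, population sigma $2,050.
low, high = z_interval(45_420, 2_050, 49)
print(round(low), round(high))  # 44846 45994
```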
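Computer simulation example (pp. 286-288). Town Bank has leased cars for many years and knows that over a four-year lease the mean distance driven is 50,000 miles with a standard deviation of 5,000 miles; these are population parameters. As an experiment in estimating the population mean by sampling, the bank draws samples of 30 observations and builds a confidence interval from each one. If 95% of such intervals contain the population mean, then out of 60 samples about 57 should contain 50,000 miles. (For convenience, the unit is thousands of miles.)
Solution: statistical software draws 60 random samples (n = 30), with σ/√n = 5/√30 = 0.913. The 95% confidence intervals are listed in the table on p. 287. Four of them do not contain 50 thousand miles, so only 56 (93.33%) contain the population mean. This makes sampling error visible: any single random sample may misrepresent the population, and that remains true even when interval estimation is used.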
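The Town Bank experiment can be rerun in code. The sketch below assumes a normally distributed population (the slide gives only the mean and standard deviation, so the shape is an assumption) and uses Python's standard `random` module with a fixed seed; the function names are illustrative. It will not reproduce the 56-of-60 count from the textbook table, because the random draws differ.

```python
import random
from math import sqrt

MU, SIGMA, N, Z = 50.0, 5.0, 30, 1.96  # thousands of miles; Z is the 95% multiplier


def covers(rng: random.Random) -> bool:
    """Draw one sample of N leases and report whether its 95% CI contains MU."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    margin = Z * SIGMA / sqrt(N)  # sigma is treated as known, as in the slide
    return xbar - margin <= MU <= xbar + margin


def coverage(num_samples: int, seed: int = 0) -> float:
    """Share of simulated intervals that contain the population mean."""
    rng = random.Random(seed)
    hits = sum(covers(rng) for _ in range(num_samples))
    return hits / num_samples


print(coverage(60))      # small experiment, like the slide's 60 samples
print(coverage(10_000))  # a long run should settle near 0.95
```

With only 60 intervals the observed share can easily land a few points away from 95%, which is the point of the slide; the long run shows the share converging.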
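Example. A rice mill in Ji'an Township, Hualien, knows from experience that the weight of its large bags of rice has a standard deviation of 9 kg. A random sample of 100 bags has a mean weight of 105 kg. At the 95% level of confidence, what is the confidence interval for the mean weight \(\mu_x\)?
\[ P\!\left(\bar{X} - z\,\frac{\sigma}{\sqrt{n}} < \mu_x < \bar{X} + z\,\frac{\sigma}{\sqrt{n}}\right) = 95\% \]
\[ P\!\left(105 - 1.96\,\frac{9}{\sqrt{100}} < \mu_x < 105 + 1.96\,\frac{9}{\sqrt{100}}\right) = 95\% \]
\[ P(103.236 < \mu_x < 106.764) = 95\% \]
At the 95% level of confidence, the mean weight per bag is between 103.236 kg and 106.764 kg.
p. 289, Q.7. \(X \sim N(\mu, 2.3^2)\), n = 60, \(\bar{x} = 8.6\).
a. Point estimate? 8.6.
b. At the 99% level of confidence, \(P(\_\_ < \mu_x < \_\_) = 0.99\), with limits \(\bar{x} \pm z\,(2.3/\sqrt{60})\). What is z? It depends on the level of confidence; for 99%, z = 2.58.
Lower limit: 8.6 - 2.58 × (2.3/√60) = 8.6 - 2.58 × 0.297 = 7.834
Upper limit: 8.6 + 2.58 × (2.3/√60) = 8.6 + 2.58 × 0.297 = 9.366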
p. 289, ex. 9-8. Dr. Patton is a professor of English. Recently she counted the number of misspelled words in a group of student essays. She noted the distribution of misspelled words per essay followed the normal distribution with a population standard deviation of 2.44 words per essay. For her 10 a.m. section of 40 students, the mean number of misspelled words was 6.05. Construct a 95% confidence interval for the mean number of misspelled words in the population of student essays.
p. 289, ex. 9-8, answer. For a 95% confidence interval, z = 1.96. The limits are 5.294 and 6.806 errors, found by
\[ 6.05 \pm 1.96\,\frac{2.44}{\sqrt{40}} \]
Confidence Intervals for a Mean, σ Unknown (p. 289) (LO9-2). In most sampling situations the population standard deviation (σ) is not known. Below are some examples where it is unlikely the population standard deviations would be known.
1. The Dean of the Business College wants to estimate the mean number of hours full-time students work at paying jobs each week. He selects a sample of 30 students, contacts each student, and asks them how many hours they worked last week.
2. The Dean of Students wants to estimate the distance the typical commuter student travels to class. She selects a sample of 40 commuter students, contacts each, and determines the one-way distance from each student's home to the center of campus.
3. The Director of Student Loans wants to know the mean amount owed on student loans at the time of graduation. The director selects a sample of 20 graduating students and contacts each to find the information.
Population Standard Deviation (σ) Unknown. The earlier examples all assumed a known population standard deviation: if \(X \sim N(\mu_x, \sigma_x^2)\), then \(\bar{X} \sim N(\mu_x, \sigma_x^2/n)\), and a range for \(\bar{X}\) can be converted to a Z value,
\[ Z = \frac{\bar{X}-\mu_x}{\sigma_x/\sqrt{n}}. \]
Often, however, σ is unknown: \(X \sim N(\mu_x, ?)\). How is the standard error of the mean computed then? The sample standard deviation s replaces the population standard deviation σ. What distribution results? Because σ is unknown, the z distribution cannot be used; with s in its place, the standardized sample mean follows a t distribution with n - 1 degrees of freedom, and a range for \(\bar{X}\) is converted to a t value,
\[ t = \frac{\bar{X}-\mu_x}{s/\sqrt{n}}. \]
Sampling distribution of \(\bar{X}\) and the t distribution (8-12)
Characteristics of the t Distribution (Using the t Distribution: Confidence Intervals for a Mean, σ Unknown, LO9-2)
1. It is, like the z distribution, a continuous distribution.
2. It is, like the z distribution, bell-shaped and symmetrical.
3. There is not one t distribution, but rather a family of t distributions. All t distributions have a mean of 0, but their standard deviations differ according to the sample size, n: the smaller the sample, the larger the standard deviation.
4. The t distribution is more spread out and flatter at the center than the standard normal distribution. As the sample size increases, however, the t distribution approaches the standard normal distribution.
Figure: Comparing the z and t Distributions When n is Small, 95% Confidence Level
Using the t Distribution: Confidence Intervals for a Mean, σ Unknown (p. 292) (LO9-2). A tire manufacturer wishes to investigate the tread life of its tires. A sample of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inch of tread remaining with a standard deviation of 0.09 inch. Construct a 95 percent confidence interval for the population mean. Would it be reasonable for the manufacturer to conclude that after 50,000 miles the population mean amount of tread remaining is 0.30 inches?
Using the Student's t Distribution Table (p. 292)
t distribution example (pp. 292-293). With n - 1 = 9 degrees of freedom, the t value for 95% confidence is 2.262, so the interval is
\[ 0.32 \pm 2.262\,\frac{0.09}{\sqrt{10}} = 0.32 \pm 0.064, \]
that is, (0.256, 0.384). The manufacturer can be 95% confident that the mean tread remaining after 50,000 miles is between 0.256 and 0.384 inch. The value 0.30 inch lies inside this range, so it may be the population mean, although we cannot be certain.
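The tire interval can be reproduced as follows. This sketch takes the t value from a printed table rather than computing it, because the Python standard library has no t distribution; 2.262 is the usual table entry for 9 degrees of freedom at 95% confidence, and the function name `t_interval` is illustrative.

```python
from math import sqrt


def t_interval(mean, s, n, t_crit):
    """Confidence interval for a mean when sigma is unknown (s estimates it)."""
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin


T_95_DF9 = 2.262  # assumed table value: two-sided 95%, n - 1 = 9 degrees of freedom

low, high = t_interval(0.32, 0.09, 10, T_95_DF9)
print(f"{low:.3f} {high:.3f}")  # 0.256 0.384
print(low <= 0.30 <= high)      # True: 0.30 inch is a plausible population mean
```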
Confidence Interval Estimates for the Mean: Example (pp. 293-294) (LO9-2). The manager of the Inlet Square Mall, near Ft. Myers, Florida, wants to estimate the mean amount spent per shopping visit by customers. A sample of 20 customers reveals the following amounts spent. Based on a 95% confidence interval, do customers spend $50 on average? Do they spend $60 on average?
Conclusion (p. 294): $50 lies inside the interval, so it is plausible that the population mean is $50. $60 lies outside the interval, so it is unlikely to be the population mean.
Confidence Interval Estimates for the Mean: Using Minitab (p. 294)
Confidence Interval Estimates for the Mean: Using Excel (p. 295)
p. 297, ex. 9-14. The Greater Pittsburgh Area Chamber of Commerce wants to estimate the mean time workers who are employed in the downtown area spend getting to work. A sample of 15 workers reveals the following number of minutes spent traveling. Develop a 98% confidence interval for the population mean. Interpret the result.
p. 297, ex. 9-14, answer. Between 30.99 and 39.15, found by \(\bar{X} \pm t\,s/\sqrt{n}\) with n - 1 = 14 degrees of freedom. About 98 percent of similarly constructed intervals will include the population mean.
When to Use the z or t Distribution for Confidence Interval Computation (LO9-2)
The t distribution is typically met with relatively small samples (the textbook does not mention this point). If the sample is large enough, however, the central limit theorem applies.
1-1 Interval Estimation of the Population Mean: Small Samples
Use the z distribution if the population standard deviation is known. Use the t distribution if the population standard deviation is unknown.
1-1 Interval Estimation of the Population Mean: Large Samples
A Confidence Interval for a Population Proportion (LO9-3: Compute and interpret a confidence interval for a population proportion.) The examples below report proportions. Note that each variable is measured with the nominal scale of measurement.
1. The career services director at Southern Technical Institute reports that 80 percent of its graduates enter the job market in a position related to their field of study.
2. A company representative claims that 45 percent of Burger King sales are made at the drive-through window.
3. A survey of homes in the Chicago area indicated that 85 percent of the new construction had central air conditioning.
4. A recent survey of married men between the ages of 35 and 50 found that 63 percent felt that both partners should earn a living.
To develop a confidence interval for a population proportion, we need to meet the following assumptions.
1. The binomial conditions, discussed in Chapter 6, must be met. Briefly, these conditions are:
a. The sample data is the result of counts.
b. There are only two possible outcomes.
c. The probability of a success remains the same from one trial to the next.
d. The trials are independent. This means the outcome on one trial does not affect the outcome on another.
2. The values \(n\pi\) and \(n(1-\pi)\) should both be greater than or equal to 5. This condition allows us to invoke the central limit theorem and employ the standard normal distribution, that is, z, to compute a confidence interval for a population proportion.
Confidence interval estimate for the population proportion π: the binomial conditions and the large-sample condition (via the central limit theorem) must hold before the z distribution can be used. That is:
1. X is the number of successes (each trial has only two outcomes: success or failure).
2. The probability of success stays constant (e.g. sampling with replacement).
3. The trials are independent of one another.
4. \(n\pi \ge 5\) and \(n(1-\pi) \ge 5\).
Note the binomial parameters: \(\mu = n\pi\), \(\sigma^2 = n\pi(1-\pi)\).
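1-1 Sampling Distribution of the Sample Proportion
Sampling distribution of the sample proportion (the point binomial distribution is the Bernoulli distribution)
A Confidence Interval for a Population Proportion (LO9-3): interval estimate of the population proportion
\[ p \pm z\,\sqrt{\frac{p(1-p)}{n}} \]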
A Confidence Interval for a Population Proportion: Example (p. 299) (LO9-3). The union representing the Bottle Blowers of America (BBA) is considering a proposal to merge with the Teamsters Union. According to BBA union bylaws, at least three-fourths of the union membership must approve any merger. A random sample of 2,000 current BBA members reveals 1,600 plan to vote for the merger proposal. What is the estimate of the population proportion? Develop a 95 percent confidence interval for the population proportion. Basing your decision on this sample information, can you conclude that the necessary proportion of BBA members favor the merger? Why?
Application of the sample proportion: election polls (pp. 299-300). A candidate for the national legislature must win more than half of the votes to be elected. In a sample of 500 voters, 275 say they will vote for the candidate. Will the candidate be elected?
Analysis: the sample proportion is p = 275/500 = 0.55. The point estimate exceeds 0.5, but is that enough? The 95% interval estimate is
\[ p \pm z\,\sqrt{\frac{p(1-p)}{n}} = 0.55 \pm 1.96\,\sqrt{\frac{0.55(1-0.55)}{500}} = 0.55 \pm 0.044 = (0.506,\ 0.594). \]
Because the lower limit of the interval is above 0.5, the candidate is very likely to be elected.
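Both proportion examples follow the same recipe, p ± z√(p(1 − p)/n), so one small helper covers them. This is a sketch with an illustrative function name, using the table value z = 1.96 for 95% confidence.

```python
from math import sqrt


def proportion_interval(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Election poll: 275 of 500 voters support the candidate.
poll_low, poll_high = proportion_interval(275, 500)
print(f"{poll_low:.3f} {poll_high:.3f}")  # 0.506 0.594

# BBA merger: 1,600 of 2,000 members in favor; the bylaws require at least 0.75.
bba_low, bba_high = proportion_interval(1_600, 2_000)
print(f"{bba_low:.3f} {bba_high:.3f}")    # 0.782 0.818
print(bba_low > 0.75)                     # True
```

In the BBA case the whole interval sits above 0.75, which supports the conclusion that the merger is likely to pass.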
Ex. 9-16 (p. 300). Ms. Maria Wilson is considering running for mayor of the town of Bear Gulch, Montana. Before completing the petitions, she decides to conduct a survey of voters in Bear Gulch. A sample of 400 voters reveals that 300 would support her in the November election. a. Estimate the value of the population proportion. b. Develop a 99% confidence interval for the population proportion. c. Interpret your findings.
Ex. 9-16 (p. 300), answer.
a. 0.75, found by 300/400 (the point estimate).
b. Between 0.694 and 0.806, found by \(0.75 \pm 2.58\,\sqrt{0.75(1-0.75)/400}\) (for 99%, z = 2.576, rounded to 2.58).
c. We are reasonably sure the population proportion is between 69 and 81 percent. We expect about 99% of similarly constructed intervals to contain the true population proportion.
Selecting an Appropriate Sample Size (LO9-4: Calculate the required sample size to estimate a population proportion or population mean.) There are 3 factors that determine the size of a sample, none of which has any direct relationship to the size of the population:
1. The level of confidence desired: the higher the level of confidence, the larger the sample.
2. The margin of error the researcher will tolerate: the smaller the allowable error, the larger the sample.
3. The variation in the population being studied: the more dispersed the population, the larger the sample.
Selecting an Appropriate Sample Size: What if the Population Standard Deviation is not Known? (LO9-4) There are three ways to estimate σ:
1. Conduct a pilot study: draw a small sample and use its sample standard deviation in place of σ.
2. Use a comparable study: if a similar study exists, use the standard deviation it estimated as σ.
3. Use a range-based approach: divide the range by 6 and use the result in place of σ.
Margin of error for the mean, E:
\[ E = z\,\frac{\sigma}{\sqrt{n}} \]
Sample Size for Estimating the Population Mean (LO9-4):
\[ n = \left(\frac{z\,\sigma}{E}\right)^2 \]
Sample Size for Estimating Population Mean: Example 1 (p. 302) (LO9-4). A student in public administration wants to determine the mean amount members of city councils in large cities earn per month. She would like to estimate the mean with a 95% confidence interval and a margin of error of less than $100. The student found a report by the Department of Labor that estimated the standard deviation to be $1,000. What is the required sample size?
Given in the problem: E, the maximum allowable error, is $100; the value of z for a 95 percent level of confidence is 1.96; the estimate of the standard deviation is $1,000.
\[ n = \left(\frac{z\,\sigma}{E}\right)^2 = \left(\frac{1.96 \times 1{,}000}{100}\right)^2 = 384.16 \]
Rounding up, she should sample 385 council members.
Sample Size for Estimating Population Mean: Example 1 (p. 302), variations.
What if she raises the level of confidence to 99%? n = 664 (using z = 2.576), an increase of 279 observations (up 72%).
What if she lowers the allowable error to $50? n = 1,537, an increase of 1,152 observations (up 299%).
What if the population standard deviation rises to $1,200? n = 554, an increase of 169 observations (up 44%).
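The base case and the three variations of Example 1 can be checked with the formula n = (zσ/E)², always rounding up. The sketch below uses an illustrative function name; z = 2.576 is assumed for the 99% case because that value reproduces the slide's 664 (z = 2.58 would give a slightly larger n).

```python
from math import ceil


def sample_size_mean(z, sigma, error):
    """Smallest n with margin of error z * sigma / sqrt(n) no larger than `error`."""
    return ceil((z * sigma / error) ** 2)


base = sample_size_mean(1.96, 1_000, 100)
print(base)                                  # 385
print(sample_size_mean(2.576, 1_000, 100))   # 664: level of confidence raised to 99%
print(sample_size_mean(1.96, 1_000, 50))     # 1537: allowable error halved
print(sample_size_mean(1.96, 1_200, 100))    # 554: larger population standard deviation
```

Halving the allowable error roughly quadruples the sample, since E enters the formula squared.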
Sample Size for Estimating Population Mean: Example 2 (LO9-4). A consumer group would like to estimate the mean monthly electricity charge for a single family house in July within $5 using a 99 percent level of confidence. Based on similar studies, the standard deviation is estimated to be $20.00. How large a sample is required?
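The slide's worked answer for Example 2 did not survive extraction. A sketch of the calculation, assuming the formula n = (zσ/E)² and the rounded table value z = 2.58 for 99% confidence:

```python
from math import ceil

z, sigma, error = 2.58, 20.0, 5.0  # 99% confidence, sigma = $20, within $5

n_exact = (z * sigma / error) ** 2
n_required = ceil(n_exact)  # always round up to the next whole observation

print(round(n_exact, 2), n_required)  # 106.5 107
```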
How large a sample is needed to estimate the population proportion π? Margin of error for π, E:
\[ E = z\,\sqrt{\frac{\pi(1-\pi)}{n}} \]
Sample Size for Estimating a Population Proportion (LO9-4):
\[ n = \pi(1-\pi)\left(\frac{z}{E}\right)^2 \]
where: n is the size of the sample, z is the standard normal value corresponding to the desired level of confidence, and E is the maximum allowable error.
If π is unknown, use a sample proportion in its place. If no sample proportion can be estimated, use 0.5 in place of π.
Sample Size for Estimating Population Proportion: Example 1 (LO9-4). The American Kennel Club wants to estimate the proportion of children that have a dog as a pet. If the club wants the estimate to be within 3% of the population proportion, how many children would they need to contact? Assume a 95% level of confidence and that the club estimated that 30% of the children have a dog as a pet.
Sample Size for Estimating Population Proportion: Example 2 (p. 303) (LO9-4). A study needs to estimate the proportion of cities that have private refuse collectors. The investigator wants the margin of error to be within .10 of the population proportion, the desired level of confidence is 90 percent, and no estimate is available for the population proportion. What is the required sample size?
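The two proportion examples can be worked with n = π(1 − π)(z/E)², using π = 0.5 when no estimate is available. This is a sketch with an illustrative function name; the z values are table values, and the second example shows that the answer depends on how the 90% value is rounded (1.645 versus 1.65).

```python
from math import ceil


def sample_size_proportion(z, error, pi=0.5):
    """Required n to estimate a proportion within `error`; pi defaults to 0.5."""
    return ceil(pi * (1 - pi) * (z / error) ** 2)


# Example 1: 95% confidence, within 0.03, prior estimate pi = 0.30.
print(sample_size_proportion(1.96, 0.03, pi=0.30))  # 897

# Example 2: 90% confidence, within 0.10, no estimate of pi, so pi = 0.5.
print(sample_size_proportion(1.645, 0.10))  # 68
print(sample_size_proportion(1.65, 0.10))   # 69 with the coarser table value
```

Using π = 0.5 is the conservative choice: π(1 − π) is largest there, so the resulting n is large enough whatever the true proportion is.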
p. 304, ex. 26. Past surveys reveal that 30% of tourists going to Las Vegas to gamble spend more than $1,000. The Visitor's Bureau of Las Vegas wants to update this percentage. a. The new study is to use the 90% confidence level. The estimate is to be within 1% of the population proportion. What is the necessary sample size? b. The Bureau feels the sample size determined above is too large. What can be done to reduce the sample? Based on your suggestion, recalculate the sample size.
p. 304, ex. 26, answer.
a. 5,683, found by \(n = 0.30(1-0.30)\,(1.645/0.01)^2 = 5{,}682.65\), rounded up.
b. Increase the allowable error from 0.01 to 0.05. The sample size would then be reduced to 228, found by \(n = 0.30(1-0.30)\,(1.645/0.05)^2 = 227.31\), rounded up.
Finite-Population Correction Factor (FPC) (LO9-5: Adjust a confidence interval for finite populations.) A population that has a known size is said to be finite. For a finite population, where the total number of objects is N and the size of the sample is n, the following adjustment is made to the standard errors of the sample means and the proportion:
Standard error of the mean: \[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\,\sqrt{\frac{N-n}{N-1}} \]
Standard error of the proportion: \[ \sigma_{p} = \sqrt{\frac{p(1-p)}{n}}\,\sqrt{\frac{N-n}{N-1}} \]
The factor \(\sqrt{(N-n)/(N-1)}\) is the finite-population correction. However, if n/N < .05, the finite-population correction factor may be ignored.
Effects on FPC when n/N Changes (LO9-5). Observe that the FPC approaches 1 as n/N becomes smaller (table computed for N = 1,000).
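The slide's table for N = 1,000 was lost in extraction. The sketch below regenerates a table of the same kind, assuming the usual correction factor √((N − n)/(N − 1)); the sample sizes listed are illustrative choices, not necessarily those in the textbook table.

```python
from math import sqrt


def fpc(population_size, sample_size):
    """Finite-population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((population_size - sample_size) / (population_size - 1))


N = 1_000
print("n     n/N    FPC")
for n in (10, 25, 50, 100, 200, 500):
    print(f"{n:<5} {n / N:<6.3f} {fpc(N, n):.4f}")
```

At n/N = 0.05 the factor is still about 0.975, which is why the correction is commonly ignored below that ratio.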
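Confidence Interval Formulas for Estimating Means and Proportions with Finite Population Correction (FPC) (LO9-5)
C.I. for the mean (μ), σ known: \[ \bar{X} \pm z\,\frac{\sigma}{\sqrt{n}}\,\sqrt{\frac{N-n}{N-1}} \]
C.I. for the mean (μ), σ unknown: \[ \bar{X} \pm t\,\frac{s}{\sqrt{n}}\,\sqrt{\frac{N-n}{N-1}} \]
C.I. for the proportion (π): \[ p \pm z\,\sqrt{\frac{p(1-p)}{n}}\,\sqrt{\frac{N-n}{N-1}} \]
Note on choosing between z and t. According to this textbook: if the population standard deviation σ is known, use the z distribution; if σ is unknown, use the t distribution. Another view found elsewhere: for large samples, use the z distribution.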
CI for Mean with FPC: Example (p. 305) (LO9-5). There are 250 families in Scandia, Pennsylvania. A random sample of 40 of these families revealed the mean annual church contribution was $450 and the standard deviation of this was $75.
1. What is the population mean? It is unknown; the best estimate of the population mean is the sample mean, $450.
2. Discuss why the finite-population correction factor should be used.
3. What is the best interval estimate of the population mean with 90% confidence? (431.65, 468.35)
4. Could the population mean be $445 or $425? $445 is plausible because it lies inside the interval; $425 is unlikely because it lies outside.
Given in the problem: N = 250, n = 40, s = $75. Since n/N = 40/250 = 0.16, the finite population correction factor must be used. The population standard deviation is not known, therefore use the t distribution.
CI for Mean with FPC: Example (p. 305), calculation. With n - 1 = 39 degrees of freedom, the t value for 90% confidence is 1.685:
\[ \bar{X} \pm t\,\frac{s}{\sqrt{n}}\,\sqrt{\frac{N-n}{N-1}} = 450 \pm 1.685\,\frac{75}{\sqrt{40}}\,\sqrt{\frac{250-40}{250-1}} = 450 \pm 18.35 = (431.65,\ 468.35) \]
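The Scandia interval can be verified numerically. This is a sketch under stated assumptions: the t value 1.685 is taken from a printed table (39 degrees of freedom, 90% confidence) because the standard library has no t distribution, and the function name is illustrative.

```python
from math import sqrt


def t_interval_fpc(mean, s, n, population_size, t_crit):
    """t-based interval for a mean with the finite-population correction applied."""
    standard_error = s / sqrt(n)
    correction = sqrt((population_size - n) / (population_size - 1))
    margin = t_crit * standard_error * correction
    return mean - margin, mean + margin


T_90_DF39 = 1.685  # assumed table value: two-sided 90%, 39 degrees of freedom

low, high = t_interval_fpc(450, 75, 40, 250, T_90_DF39)
print(f"{low:.2f} {high:.2f}")                 # 431.65 468.35
print(low <= 445 <= high, low <= 425 <= high)  # True False
```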
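p. 306, ex. 30. There are 300 welders employed at Maine Shipyards Corporation. A sample of 30 welders revealed that 18 graduated from a registered welding course. Construct the 95% confidence interval for the proportion of all welders who graduated from a registered welding course.
p. 306, ex. 30, answer. Sample proportion: p = 18/30 = 0.60; for 95% confidence, z = 1.96. Because n/N = 30/300 = 0.10 is not below 0.05, the correction factor applies. The limits are 0.433 and 0.767, found by
\[ 0.60 \pm 1.96\,\sqrt{\frac{0.60(1-0.60)}{30}}\,\sqrt{\frac{300-30}{300-1}} \]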
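The welder exercise is the proportion version of the same correction. A minimal sketch with an illustrative function name, using z = 1.96 for 95% confidence:

```python
from math import sqrt


def proportion_interval_fpc(successes, n, population_size, z=1.96):
    """Interval for a proportion with the finite-population correction applied."""
    p = successes / n
    correction = sqrt((population_size - n) / (population_size - 1))
    margin = z * sqrt(p * (1 - p) / n) * correction
    return p - margin, p + margin


# 18 of 30 sampled welders graduated; 300 welders in the population.
low, high = proportion_interval_fpc(18, 30, 300)
print(f"{low:.3f} {high:.3f}")  # 0.433 0.767
```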
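From p. 307 onward: the following is supplementary material outside the course and is optional reading.
Properties of point estimators: unbiasedness, efficiency, mean squared error, sufficiency, consistency, asymptotic unbiasedness.
Unbiasedness. If the expected value of the sampling distribution of a point estimator \(\hat{\theta}\) equals the population parameter θ, that is \(E(\hat{\theta}) = \theta\), then \(\hat{\theta}\) is called an unbiased estimator of θ. Figure: an unbiased estimator and a biased estimator of θ.
Efficiency 1/2. Among all possible unbiased estimators of the population parameter θ, the one with the smallest variance is called the most efficient estimator. (8-2)
Efficiency 2/2. Figure: sampling distributions of two estimators of the population parameter θ; both are unbiased, but one is relatively more efficient.
Mean squared error 1/2 (8-3) (8-4)
Mean squared error 2/2. Figure: sampling distributions of estimators of the population parameter θ; one estimator is unbiased, one has the smallest variance, but the one with the smallest MSE is the best estimator.
Sufficiency. An estimator \(\hat{\theta}\) of θ is sufficient if it makes full use of the information in the sample, leaving nothing out about the true state of the population. Formally, let \(X_1, X_2, \ldots, X_n\) come from a population with probability function \(f_X(x_1, \ldots, x_n; \theta)\). If, for a statistic \(\hat{\theta}\), the conditional probability \(\Pr(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n \mid \hat{\theta} = t)\) does not depend on θ, and this holds for every value t, then \(\hat{\theta}\) is called a sufficient statistic for θ.
Consistency. A point estimator \(\hat{\theta}\) of θ is consistent if it satisfies one of the following conditions: (1) (2) (3)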
1-1 Criteria for Evaluating Estimators
Unbiased estimators
Consistent estimators
Unbiasedness of some commonly used estimators