Chapter 7 Sampling and Sampling Distributions

Presentation on theme: "Chapter 7 Sampling and Sampling Distributions"— Presentation transcript:

1 Chapter 7 Sampling and Sampling Distributions
Business Statistics: A First Course 5th Edition Chapter 7 Sampling and Sampling Distributions Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.

2 Learning Objectives In this chapter, you learn:
To distinguish between different sampling methods. The concept of the sampling distribution. To compute probabilities related to the sample mean and the sample proportion. The importance of the Central Limit Theorem.

3 Why Sample?
Selecting a sample is less time-consuming than selecting every item in the population (census). Selecting a sample is less costly than selecting every item in the population. An analysis of a sample is less cumbersome and more practical than an analysis of the entire population.

4 A Sampling Process Begins With A Sampling Frame
The sampling frame is a listing of the items that make up the population. Frames are data sources such as population lists, directories, or maps. Inaccurate or biased results can occur if a frame excludes certain portions of the population. Using different frames to generate data can lead to dissimilar conclusions.

5 Types of Samples
Samples are either probability samples or non-probability samples. Probability samples: simple random, systematic, stratified, cluster. Non-probability samples: judgment, convenience.

6 Types of Samples: Nonprobability Sample
In a nonprobability sample, items are chosen without regard to their probability of occurrence: they are picked subjectively by the sampler or volunteer themselves, rather than being selected with a known probability. In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample. In a judgment sample, you get the opinions of pre-selected experts in the subject matter.

7 Types of Samples: Probability Sample
In a probability sample, items in the sample are chosen on the basis of known probabilities. The four types of probability sample are simple random, systematic, stratified, and cluster.

8 Probability Sample: Simple Random Sample
Every individual or item from the frame has an equal chance of being selected. Selection may be with replacement (the selected individual is returned to the frame for possible reselection) or without replacement (the selected individual is not returned to the frame). Samples are obtained from a table of random numbers or from computer random number generators.
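The difference between sampling with and without replacement can be sketched with Python's standard `random` module. The frame of 850 numbered items and the fixed seed are assumptions made for illustration, not part of the original slides.

```python
import random

# Hypothetical frame: items numbered 1..850 (illustrative only).
frame = list(range(1, 851))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Without replacement: a selected item is not returned to the frame,
# so no item can appear twice in the sample.
without_replacement = rng.sample(frame, k=5)

# With replacement: a selected item is returned to the frame,
# so the same item may be drawn more than once.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```

In both cases every item in the frame has the same chance of being picked on each draw, which is the defining property of a simple random sample.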

9 Selecting a Simple Random Sample Using A Random Number Table
Sampling frame for a population with 850 items, listing each item name with its item number (names shown include Bev R., Ulan X., Joann P., and Paul F.). Using a portion of a random number table, the first 5 items in a simple random sample are Item # 492, Item # 808, Item # 435, Item # 779, and Item # 002. One number drawn from the table matches no item in the frame, so it is ignored.

10 Probability Sample: Systematic Sample
Decide on the sample size n. Divide the frame of N individuals into n groups of k individuals each, where k = N/n. Randomly select one individual from the first group. Select every kth individual thereafter, until n individuals have been selected. Example: N = 40, n = 4, k = 10.
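The systematic procedure can be sketched as follows for the slide's N = 40, n = 4 example. The function name `systematic_sample` and the numbered frame are hypothetical, and the sketch assumes N is an exact multiple of n.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick one random item from the first group of k, then every k-th item after it."""
    k = len(frame) // n        # group size k = N / n (assumes N divisible by n)
    start = rng.randrange(k)   # random position within the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 41))     # hypothetical frame: individuals numbered 1..40
sample = systematic_sample(frame, n=4, rng=random.Random(0))
print(sample)                  # four individuals spaced k = 10 apart
```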

11 Probability Sample: Stratified Sample
Divide the population into two or more mutually exclusive, exhaustive subgroups (called strata) according to some common characteristic. A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes. Samples from the subgroups are combined into one. This is a common technique when sampling a population of voters, stratifying across racial or socio-economic lines. Figure: population divided into 4 strata.
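A minimal sketch of proportional stratified sampling is shown below. The stratum names, stratum sizes, and the function `stratified_sample` are invented for illustration; the sizes are chosen so that the proportional allocations are whole numbers, since plain rounding would not always add up to n.

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    total = sum(len(items) for items in strata.values())
    sample = []
    for name, items in strata.items():
        n_h = round(n * len(items) / total)   # proportional allocation for this stratum
        sample.extend(rng.sample(items, n_h)) # simple random sample without replacement
    return sample

# Hypothetical population of 1,000 individuals split into 4 strata.
sizes = {"A": 400, "B": 300, "C": 200, "D": 100}
strata = {name: [(name, i) for i in range(size)] for name, size in sizes.items()}

sample = stratified_sample(strata, n=50, rng=random.Random(3))
counts = {name: sum(1 for s, _ in sample if s == name) for name in sizes}
print(counts)
```

Every stratum is guaranteed to appear in the combined sample, which is what distinguishes this design from a simple random sample of the whole frame.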

12 Probability Sample: Cluster Sample
The population is divided into several "clusters," each representative of the population. A simple random sample of clusters is selected. All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique. A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled. Figure: population divided into 16 clusters, with the randomly selected clusters marked for the sample.
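Cluster sampling can be sketched in a few lines. The 16 clusters match the slide's figure, but the cluster contents, the choice of 4 clusters, and the seed are assumptions made for illustration.

```python
import random

# Hypothetical population: 16 clusters of 10 items each.
clusters = {c: [f"cluster{c}-item{i}" for i in range(10)] for c in range(16)}

rng = random.Random(1)

# Step 1: simple random sample of clusters (not of individual items).
chosen = rng.sample(sorted(clusters), k=4)

# Step 2: here every item in each chosen cluster is used; another
# probability sampling method could be applied within clusters instead.
sample = [item for c in chosen for item in clusters[c]]

print(chosen)
print(len(sample))
```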

13 Probability Sample: Comparing Sampling Methods
Simple random sample and systematic sample: simple to use, but may not be a good representation of the population's underlying characteristics. Stratified sample: ensures representation of individuals across the entire population. Cluster sample: more cost effective, but less efficient (a larger sample is needed to reach the same level of precision).

14 Evaluating Survey Worthiness
What is the purpose of the survey? Is the survey based on a probability sample? Coverage error: is the frame appropriate? An inappropriate frame introduces selection bias. Nonresponse error: follow up on nonresponses. Measurement error: good questions elicit good responses, so the questionnaire needs careful design. Sampling error: always exists, but can be controlled.

15 Types of Survey Errors
Coverage error or selection bias: exists if some groups are excluded from the frame and have no chance of being selected. Nonresponse error or bias: people who do not respond may be different from those who do respond. Sampling error: variation from sample to sample will always exist. Measurement error: due to weaknesses in question design, respondent error, and the interviewer's effects on the respondent ("Hawthorne effect").

16 Types of Survey Errors
(continued) Coverage error: excluded from frame. Nonresponse error: follow up on nonresponses. Sampling error: random differences from sample to sample. Measurement error: bad or leading question.

17 Sampling Distributions
A sampling distribution is a distribution of all of the possible values of a sample statistic for a given sample size selected from a population. For example, suppose you sample 50 students from your college and compute their mean GPA. If you obtained many different samples of 50, you would compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might calculate over every possible sample of 50 students.

18 Developing a Sampling Distribution
Assume there is a population of size N = 4. The random variable X is the age of the individuals. Values of X: 18, 20, 22, 24 (years), for individuals A, B, C, and D.

19 Developing a Sampling Distribution
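(continued) Summary measures for the population distribution: $\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$ and $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$. Figure: P(x) against x for A, B, C, and D; each value is equally likely (uniform distribution).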

20 Developing a Sampling Distribution
(continued) Now consider all possible samples of size n = 2, drawn by simple random sampling with replacement. There are 4 × 4 = 16 possible samples, and each one yields a sample mean (16 sample means):
1st Obs    2nd Observation
           18       20       22       24
18         18,18    18,20    18,22    18,24
20         20,18    20,20    20,22    20,24
22         22,18    22,20    22,22    22,24
24         24,18    24,20    24,22    24,24

21 Developing a Sampling Distribution
(continued) Sampling distribution of all sample means. Figure: $P(\bar{X})$ against $\bar{X}$ for the 16 sample means; the distribution is no longer uniform.

22 Developing a Sampling Distribution
(continued) Summary measures of this sampling distribution, taken over the 16 sample means: $\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{16} = 21$ and $\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{16}} = 1.58$.
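The four-person example is small enough to enumerate in full. The sketch below, using only the Python standard library, lists all 16 samples of size 2 drawn with replacement and computes the summary measures of the population and of the sample means; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

population = [18, 20, 22, 24]            # ages of individuals A, B, C, D

mu = mean(population)                    # population mean
sigma = pstdev(population)               # population standard deviation (divides by N)

# All 4 * 4 = 16 ordered samples of size 2, sampling with replacement.
samples = list(product(population, repeat=2))
sample_means = [mean(s) for s in samples]

mu_xbar = mean(sample_means)             # mean of the sampling distribution
sigma_xbar = pstdev(sample_means)        # standard error of the mean

# Frequency of each sample mean: peaked in the middle, so no longer uniform.
distribution = Counter(sample_means)

print(mu, round(sigma, 3))
print(mu_xbar, round(sigma_xbar, 2))
print(sorted(distribution.items()))
```

The enumeration also shows that the spread of the sample means equals the population standard deviation divided by the square root of the sample size, here 2.236 / √2 ≈ 1.58.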

23 Comparing the Population Distribution to the Sample Means Distribution
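Population (N = 4): $\mu = 21$, $\sigma = 2.236$. Sample means distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$. Figure: P(X) against X for A, B, C, and D, shown beside $P(\bar{X})$ against $\bar{X}$.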

24 Sample Mean Sampling Distribution: Standard Error of the Mean
Different samples of the same size from the same population will yield different sample means. A measure of the variability in the mean from sample to sample is given by the standard error of the mean: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. (This assumes that sampling is with replacement, or that sampling is without replacement from an infinite or very large population.) Note that the standard error of the mean decreases as the sample size increases.
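A short sketch makes the effect of sample size on the standard error concrete. The value σ = 15 and the list of sample sizes are illustrative choices, and `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 15.0                       # illustrative population standard deviation
sizes = (1, 4, 25, 100)
errors = [standard_error(sigma, n) for n in sizes]

for n, se in zip(sizes, errors):
    print(f"n = {n:>3}: standard error = {se:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.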

25 Sample Mean Sampling Distribution: If the Population is Normal
If a population is normally distributed with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normally distributed, with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.

26 Z-value for Sampling Distribution of the Mean
Z-value for the sampling distribution of $\bar{X}$: $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, where $\bar{X}$ = sample mean, $\mu$ = population mean, $\sigma$ = population standard deviation, and n = sample size.

27 Sampling Distribution Properties
The expected value of $\bar{X}$ equals the population mean: $\mu_{\bar{X}} = \mu$ (i.e., $\bar{X}$ is an unbiased estimator of $\mu$). Figure: a normal population distribution above a normal sampling distribution, which has the same mean.

28 Sampling Distribution Properties
(continued) As n increases, $\sigma_{\bar{X}}$ decreases. Figure: sampling distributions for a larger sample size and for a smaller sample size.

29 Determining An Interval Including A Fixed Proportion of the Sample Means
Find a symmetrically distributed interval around µ that will include 95% of the sample means when µ = 368, σ = 15, and n = 25. Since the interval contains 95% of the sample means, 5% of the sample means will be outside the interval. Since the interval is symmetric, 2.5% will be above the upper limit and 2.5% will be below the lower limit. From the standardized normal table, the Z score with 2.5% (0.0250) below it is -1.96 and the Z score with 2.5% (0.0250) above it is 1.96.

30 Determining An Interval Including A Fixed Proportion of the Sample Means
(continued) Calculating the lower limit of the interval: $\bar{X}_L = \mu + Z \frac{\sigma}{\sqrt{n}} = 368 + (-1.96)\frac{15}{\sqrt{25}} = 362.12$. Calculating the upper limit of the interval: $\bar{X}_U = \mu + Z \frac{\sigma}{\sqrt{n}} = 368 + (1.96)\frac{15}{\sqrt{25}} = 373.88$. So 95% of all sample means of sample size 25 are between 362.12 and 373.88.
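The interval can be checked numerically with `statistics.NormalDist` from the Python standard library, which replaces the table lookup. This is only a sketch of the calculation above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 368, 15, 25

se = sigma / math.sqrt(n)            # standard error of the mean = 3.0
z = NormalDist().inv_cdf(0.975)      # Z with 2.5% above it, about 1.96

lower = mu - z * se                  # lower limit of the symmetric 95% interval
upper = mu + z * se                  # upper limit

print(round(z, 2), round(lower, 2), round(upper, 2))
```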

31 Sample Mean Sampling Distribution: If the Population is not Normal
We can apply the Central Limit Theorem: even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough. The properties of the sampling distribution are unchanged: $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.

32 Central Limit Theorem
As the sample size gets large enough (n↑), the sampling distribution becomes almost normal regardless of the shape of the population.

33 Sample Mean Sampling Distribution: If the Population is not Normal
(continued) Sampling distribution properties. Central tendency: $\mu_{\bar{X}} = \mu$. Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. Figure: a non-normal population distribution above its sampling distribution, which becomes normal as n increases, drawn for a larger and a smaller sample size.
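A small simulation illustrates the theorem. It assumes a strongly skewed population (exponential with mean 1 and standard deviation 1), a sample size of 36, and 5,000 repeated samples; all three choices, along with the seed, are illustrative rather than taken from the slides.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(7)
n = 36                 # sample size
num_samples = 5000     # number of repeated samples

# Exponential population with rate 1: mean 1, standard deviation 1, right-skewed.
pop_mean, pop_sd = 1.0, 1.0

sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

se = pop_sd / math.sqrt(n)                    # theoretical standard error
center = mean(sample_means)                   # should be close to pop_mean
spread = stdev(sample_means)                  # should be close to se

# Share of sample means within 1.96 standard errors of the population mean;
# close to 0.95 if the sampling distribution is roughly normal.
within = sum(abs(m - pop_mean) <= 1.96 * se for m in sample_means) / num_samples

print(round(center, 3), round(spread, 3), round(within, 3))
```

Even though individual observations are far from normal, the simulated sample means centre on the population mean with a spread near σ/√n.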

34 How Large is Large Enough?
For most distributions, n > 30 will give a sampling distribution that is nearly normal. For fairly symmetric distributions, n > 15 will usually give a sampling distribution that is almost normal. For normal population distributions, the sampling distribution of the mean is always normally distributed.

35 Example
Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected. What is the probability that the sample mean is between 7.8 and 8.2?

36 Example
(continued) Solution: Even if the population is not normally distributed, the central limit theorem can be used (n > 30), so the sampling distribution of $\bar{X}$ is approximately normal, with mean $\mu_{\bar{X}} = 8$ and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$.

37 Example
(continued) Solution (continued): $P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{0.5} < Z < \frac{8.2 - 8}{0.5}\right) = P(-0.4 < Z < 0.4) = 0.3108$. Figure: the population distribution of X is sampled to give the sampling distribution of $\bar{X}$, which is standardized to the standard normal distribution of Z.
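The probability asked for in this example can be checked with `statistics.NormalDist`. The sketch relies on the central limit theorem approximation described above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

se = sigma / math.sqrt(n)                 # standard error = 0.5
xbar_dist = NormalDist(mu=mu, sigma=se)   # approximate sampling distribution of the mean

# P(7.8 < sample mean < 8.2)
prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)

# Equivalent standardized bounds.
z_low = (7.8 - mu) / se
z_high = (8.2 - mu) / se

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```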

38 Population Proportions
π = the proportion of the population having some characteristic. The sample proportion p provides an estimate of π: $p = \frac{X}{n}$, the number of items in the sample having the characteristic of interest divided by the sample size. 0 ≤ p ≤ 1. p is approximately normally distributed when n is large (assuming sampling with replacement from a finite population or without replacement from an infinite population).

39 Sampling Distribution of p
Approximated by a normal distribution if $n\pi \geq 5$ and $n(1 - \pi) \geq 5$, where $\mu_p = \pi$ and $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$ (π = population proportion). Figure: sampling distribution P(p) against p.

40 Z-Value for Proportions
Standardize p to a Z value with the formula: $Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$.

41 Example
If the true proportion of voters who support Proposition A is π = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45? That is, if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?

42 Example
(continued) If π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)? Find $\sigma_p$: $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$. Convert to standardized normal: $P(0.40 \leq p \leq 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \leq Z \leq \frac{0.45 - 0.40}{0.03464}\right) = P(0 \leq Z \leq 1.44)$.

43 Example
(continued) If π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)? Use the standardized normal table: P(0 ≤ Z ≤ 1.44) = 0.4251. Figure: the sampling distribution of p between 0.40 and 0.45 is standardized to the standardized normal distribution of Z between 0 and 1.44; the shaded area is 0.4251.
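The proportion example can be reproduced with `statistics.NormalDist`. The sketch computes the probability twice: once with Z rounded to two decimals, as a printed table requires, and once without rounding, which gives a slightly different fourth decimal. Variable names are illustrative.

```python
import math
from statistics import NormalDist

pi_, n = 0.4, 200                     # population proportion and sample size

# Conditions for the normal approximation.
approximation_ok = n * pi_ >= 5 and n * (1 - pi_) >= 5

sigma_p = math.sqrt(pi_ * (1 - pi_) / n)   # standard error of the proportion
z_exact = (0.45 - pi_) / sigma_p           # upper bound in Z units; lower bound is 0

std_normal = NormalDist()

# Table-style answer: round Z to 1.44 before looking it up.
prob_table = std_normal.cdf(round(z_exact, 2)) - std_normal.cdf(0.0)

# Answer without rounding Z.
prob_exact = std_normal.cdf(z_exact) - std_normal.cdf(0.0)

print(round(sigma_p, 5), round(z_exact, 2))
print(round(prob_table, 4), round(prob_exact, 4))
```

The small gap between the two results comes only from rounding Z for the table lookup, not from a different method.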

44 Chapter Summary
Discussed probability and nonprobability samples. Described four common probability samples. Examined survey worthiness and types of survey errors. Introduced sampling distributions. Described the sampling distribution of the mean, both for normal populations and using the Central Limit Theorem. Described the sampling distribution of a proportion. Calculated probabilities using sampling distributions.

