# Population proportion and sample proportion

## Presentation on theme: "Population proportion and sample proportion"— Presentation transcript:

Sampling Distribution of the Sample Proportion
Let $p$ denote the proportion of items in a population that possess a certain characteristic (for example, being unemployed or having an income below the poverty level). To estimate $p$, we take a random sample of $n$ observations from the population and count the number $X$ of items in the sample that possess the characteristic. The sample proportion $\hat{p} = X/n$ is used to estimate the population proportion $p$. Social Statistics (I) ©蘇國賢 2007

The Bernoulli Distribution

Sampling Distribution of the Sample Proportion
The Normal Approximation Rule for Proportions: Let $p$ denote the proportion of a population possessing some characteristic of interest. Take a random sample of $n$ observations from the population, and let $X$ denote the number of items in the sample possessing the characteristic. We estimate the population proportion $p$ by the sample proportion $\hat{p} = X/n$. If $np \ge 5$ and $nq \ge 5$, where $q = 1 - p$, the random variable $\hat{p}$ has approximately a normal distribution with mean $\mu_{\hat{p}} = p$ and variance $\sigma^2_{\hat{p}} = pq/n$.
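
The rule can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `normal_approx_for_proportion` and the `threshold` parameter are illustrative, and the values p = .6, n = 100 are borrowed from the worked test later in the deck. It verifies the conditions np ≥ 5 and nq ≥ 5 and returns the mean p and standard deviation sqrt(pq/n) of the approximating normal distribution.

```python
import math


def normal_approx_for_proportion(p, n, threshold=5):
    """Return (mean, std_dev) of the approximate normal distribution
    of the sample proportion, or raise if the rule of thumb fails."""
    q = 1 - p
    if n * p < threshold or n * q < threshold:
        raise ValueError("np and nq must both be at least 5")
    # Mean is p; variance is pq/n, so the standard deviation is its root.
    return p, math.sqrt(p * q / n)


mean, sd = normal_approx_for_proportion(p=0.6, n=100)
print(mean, round(sd, 4))
```

With p = .01 and n = 100 the function raises an error, because np = 1 is too small for the approximation to be trusted.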

Sampling Distribution of the Sample Proportion
If the distribution of $\hat{p}$ is approximately normal, and

Confidence intervals for proportions (large samples)
We know that $\hat{p} \sim N(p, pq/n)$ approximately, where $q = 1 - p$, provided that $np \ge 5$ and $nq \ge 5$.

Value of $z_{\alpha/2}$
$P(Z \ge z_{\alpha/2}) = \alpha/2$ and $P(Z \le -z_{\alpha/2}) = \alpha/2$, so $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha/2 - \alpha/2 = 1 - \alpha$.
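
As an illustration, the critical value can be computed rather than read from a table. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` and the choice α = .05 are assumptions made for the example only.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Return z_{alpha/2}: the point with upper-tail area alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


z = z_critical(0.05)
# The central area between -z and z should equal 1 - alpha.
central = NormalDist().cdf(z) - NormalDist().cdf(-z)
print(round(z, 3), round(central, 3))
```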

Confidence interval for the population proportion p

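Wilson estimate
Using the sample proportion in place of the population proportion to estimate the standard error is not always appropriate. For example, if a coin is tossed three times and lands heads all three times, then $\hat{p} = 3/3 = 1$, so the estimated standard error $\sqrt{\hat{p}(1-\hat{p})/n}$ is 0.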
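
The coin example can be reproduced in a few lines. The sketch below assumes the usual large-sample interval, p^ ± z·sqrt(p^(1 - p^)/n), since the slide showing the interval formula did not survive in this transcript; `standard_interval` is an illustrative name.

```python
import math
from statistics import NormalDist


def standard_interval(x, n, conf=0.95):
    """Large-sample interval that plugs p_hat into the standard error."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # 0 when p_hat is 0 or 1
    return p_hat - z * se, p_hat + z * se


# Three heads in three tosses: the interval collapses to a single point.
low, high = standard_interval(3, 3)
print(low, high)
```

Both endpoints equal 1, which would claim certainty about p after only three tosses.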

Wilson estimate
We must know the s.d. of the population to get a CI for $p$. Unfortunately, modern computer studies reveal that confidence intervals based on this approach can be quite inaccurate, even for large samples. -- When the sample is not an SRS. -- When the sample size is small.

Wilson estimate
The Wilson estimate: add 2 successes and 2 failures to the sample, so that the sample proportion is moved slightly away from 0 and 1, giving $\tilde{p} = (X + 2)/(n + 4)$. Because this estimate was first suggested by Edwin Bidwell Wilson in 1927, we call it the Wilson estimate.

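Wilson estimate
An approximate level $C$ confidence interval for $p$ is $\tilde{p} \pm z^* \sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$, where $\tilde{p} = (X+2)/(n+4)$ is the Wilson estimate and $z^*$ is the value for which the area under the standard normal curve between $-z^*$ and $z^*$ is $C$.
The margin of error is $m = z^* \sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$.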
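
A minimal sketch of the Wilson ("add 2 successes and 2 failures") interval follows. It assumes the standard error takes the form sqrt(p~(1 - p~)/(n + 4)) with p~ = (X + 2)/(n + 4); the function name `plus_four_interval` is illustrative, and the count 16 out of 100 is the defective-items example used elsewhere in the deck.

```python
import math
from statistics import NormalDist


def plus_four_interval(x, n, conf=0.95):
    """Return (p_tilde, margin) for the Wilson plus-four interval."""
    p_tilde = (x + 2) / (n + 4)  # add 2 successes and 2 failures
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_star * math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde, margin


# Three heads in three tosses: the estimate is 5/7 rather than 1.
coin_estimate, _ = plus_four_interval(3, 3)
print(round(coin_estimate, 4))

# 16 defective items in a sample of 100.
p_tilde, margin = plus_four_interval(16, 100)
print(round(p_tilde - margin, 4), round(p_tilde + margin, 4))
```

For very small samples such as the coin example, the interval endpoints can fall outside [0, 1], so the first call is shown only for the point estimate.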

One-sided confidence intervals for the population proportion
Suppose that we take a random sample of $n$ observations from some population having unknown proportion $p$, and that we wish to find the lower confidence limit LCL such that the probability is $(1-\alpha)$ that $p$ exceeds LCL. The one-sided interval (LCL, 1.00) is a left-sided confidence interval. The LCL is given by $\mathrm{LCL} = \hat{p} - z_{\alpha}\sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.

One-sided confidence intervals for the population proportion
Construct a right-sided 95% CI for the proportion of defective items produced by a machine if 16 items are found to be defective in a random sample of 100 items. Here $\hat{p} = 16/100 = .16$ and $z_{.05} = 1.645$, so the upper confidence limit is $.16 + 1.645\sqrt{(.16)(.84)/100} \approx .2203$. The 95% right-sided CI for $p$ is (0, .2203). This means that we can be 95% confident that the population proportion is less than .2203.
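
The one-sided limits can be sketched in code. This example assumes the bounds take the form p^ ∓ z_α·sqrt(p^q^/n), with the whole of α placed in one tail; `one_sided_bounds` is an illustrative name, and the inputs are the 16 defective items out of 100 from the example.

```python
import math
from statistics import NormalDist


def one_sided_bounds(x, n, conf=0.95):
    """Return (LCL, UCL) for left- and right-sided intervals."""
    p_hat = x / n
    z_alpha = NormalDist().inv_cdf(conf)  # all of alpha in one tail
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_alpha * se, p_hat + z_alpha * se


lcl, ucl = one_sided_bounds(16, 100)
print(f"left-sided: ({lcl:.4f}, 1)   right-sided: (0, {ucl:.4f})")
```

Only one of the two limits is used at a time: the left-sided interval is (LCL, 1) and the right-sided interval is (0, UCL).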

Determining the sample size
Margin of error: Suppose that we take a random sample from some population. Then a $100(1-\alpha)\%$ confidence interval for the population proportion extends at most a distance $m$ on each side of the sample proportion if the number of observations is $n = z_{\alpha/2}^2\, p(1-p)/m^2$.

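Determining the sample size
(1) We can use a pilot study to obtain an estimate of $p$. (2) When the sample proportion is unknown, we can use the most conservative estimate, that is, the largest possible variance, $.5 \times .5 = .25$, to determine $n$.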
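
Both options can be compared in a short sketch. It assumes the sample-size formula n = z²·p(1 - p)/m², rounded up to the next whole observation; the function name `sample_size`, the margin m = .03, and the pilot estimate p = .2 are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size(m, conf=0.95, p=0.5):
    """Smallest n whose margin of error is at most m.

    p defaults to 0.5, the most conservative choice (variance 0.25).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / m ** 2)


print(sample_size(0.03))         # conservative: p(1 - p) = 0.25
print(sample_size(0.03, p=0.2))  # using a hypothetical pilot estimate
```

The conservative choice never gives a smaller n than a pilot estimate does, because p(1 - p) is largest at p = .5.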

Sample size and confidence interval for the proportion

Tests of the population proportion

Sampling Distribution of the Sample Proportion
If the distribution of $\hat{p}$ is approximately normal, then the random variable $Z = (\hat{p} - p)/\sqrt{pq/n}$ has approximately a standard normal distribution.

Tests of the population proportion

Page 614, Procedure 12.2B (cont.)

Solution: If $H_0$ is true, then $\hat{p}$ has approximately a normal distribution with mean $p = .6$ and variance $pq/n = (.6)(.4)/100 = .0024$. If we use a one-tailed test at the 5% level of significance, the critical region consists of all values of $Z$ less than $-z_{\alpha} = -z_{.05} = -1.645$. From the sample, $\hat{p} = x/n = 55/100 = .55$, so $z = (.55 - .6)/\sqrt{.0024} \approx -1.02$.

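We do not reject $H_0$: the observed $z = -1.02$ is not less than $-z_{.05} = -1.645$. Equivalently, the observed sample proportion is $.55 > .519$, where $.519 \approx .6 - 1.645\sqrt{.0024}$ is the critical value on the scale of $\hat{p}$, so the null hypothesis cannot be rejected.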
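
The calculation in this solution can be retraced in code. This is a sketch of a lower-tailed z test under the normal approximation, using the example's numbers (x = 55, n = 100, hypothesized p = .6, α = .05); the function name `lower_tail_proportion_test` is illustrative.

```python
import math
from statistics import NormalDist


def lower_tail_proportion_test(x, n, p0, alpha=0.05):
    """One-tailed z test that rejects H0 for small sample proportions."""
    p_hat = x / n
    se0 = math.sqrt(p0 * (1 - p0) / n)         # standard error under H0
    z = (p_hat - p0) / se0
    z_crit = -NormalDist().inv_cdf(1 - alpha)  # -z_alpha
    p_hat_crit = p0 + z_crit * se0             # same cutoff on the p_hat scale
    return z, z_crit, p_hat_crit, z < z_crit


z, z_crit, p_hat_crit, reject = lower_tail_proportion_test(55, 100, 0.6)
print(round(z, 2), round(z_crit, 3), round(p_hat_crit, 3), reject)
# prints: -1.02 -1.645 0.519 False
```

Comparing z with -1.645 and comparing the sample proportion with .519 are the same decision rule expressed on two scales.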

Sampling distribution of the difference between sample proportions
Suppose we take independent samples of sizes $n_1$ and $n_2$ from two populations. Let $p_1$ and $p_2$ be the proportions of items in each population that possess a certain characteristic, and let $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$. If $n_1p_1 > 5$, $n_1q_1 > 5$, $n_2p_2 > 5$, and $n_2q_2 > 5$, then the random variable $\hat{p}_1 - \hat{p}_2$ is approximately normally distributed with mean $p_1 - p_2$ and variance $p_1q_1/n_1 + p_2q_2/n_2$.

Confidence intervals for the difference of two population proportions
Let $\hat{p}_1$ denote the observed proportion of successes in a random sample of $n_1$ observations from a population with proportion $p_1$ of successes, and let $\hat{p}_2$ denote the observed proportion of successes in an independent random sample of $n_2$ observations from a population with proportion $p_2$ of successes. A $100(1-\alpha)\%$ confidence interval for $(p_1 - p_2)$ is given by $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2}$, where $\hat{q}_i = 1 - \hat{p}_i$. This result holds provided $n_1p_1 \ge 5$, $n_1q_1 \ge 5$, $n_2p_2 \ge 5$, and $n_2q_2 \ge 5$.
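
A minimal sketch of this two-sample interval is given below. It assumes the standard error sqrt(p^1·q^1/n1 + p^2·q^2/n2) and checks the size conditions with the sample proportions, since the population values are unknown in practice. The function name `diff_interval` and the counts (60 of 100 versus 45 of 100) are hypothetical and do not come from the slides.

```python
import math
from statistics import NormalDist


def diff_interval(x1, n1, x2, n2, conf=0.95):
    """Large-sample confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Rule-of-thumb check, using sample proportions in place of p1 and p2.
    for n, p in ((n1, p1), (n2, p2)):
        if n * p < 5 or n * (1 - p) < 5:
            raise ValueError("each sample needs np >= 5 and nq >= 5")
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


low, high = diff_interval(60, 100, 45, 100)  # hypothetical counts
print(round(low, 4), round(high, 4))
```

If the resulting interval excludes 0, the data are consistent with a real difference between the two population proportions at the chosen confidence level.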

Tests concerning differences of proportions
