AP Stats ~ Lesson 8A: Confidence Intervals

OBJECTIVES:
• DETERMINE the point estimate and margin of error from a confidence interval.
• INTERPRET a confidence interval in context.
• INTERPRET a confidence level in context.
• DESCRIBE how the sample size and confidence level affect the length of a confidence interval.
• EXPLAIN how practical issues like nonresponse, undercoverage, and response bias can affect the interpretation of a confidence interval.

If you had to give one number to estimate an unknown population parameter, what would it be? If you were estimating a population mean 𝜇, you would probably use the sample mean x̄. If you were estimating a population proportion 𝑝, you would probably choose the sample proportion p̂. These are natural choices because each statistic is an unbiased estimator of the parameter it estimates. In both cases, you are providing a POINT ESTIMATE of the parameter of interest.

A POINT ESTIMATOR is a statistic that provides an estimate of a population parameter. The value of that statistic from a sample is called a POINT ESTIMATE. An ideal point estimator will have no bias and low variability.

Since variability is almost always present when calculating statistics from different samples, we must extend our thinking about estimating parameters to acknowledge that repeated sampling could yield different results.

Example: In each of the following settings, determine the point estimator you would use and calculate the value of the point estimate.
(a) The makers of a new golf ball want to estimate the median distance the new balls will travel when hit by a mechanical driver. They select a random sample of 10 balls and measure the distance each ball travels after being hit by the mechanical driver. Here are the distances (in yards):
285 286 284 285 282 284 287 290 288 285
(b) The golf ball manufacturer would also like to investigate the variability of the distance travelled by the golf balls by estimating the interquartile range.
(c) The math department wants to know what proportion of its students own a graphing calculator, so they take a random sample of 100 students and find that 28 own a graphing calculator.

So, how close is our point estimate going to be to the actual parameter? How would our sample means or proportions vary if we took many, many SRSs? Think about what we know.
• We know that the sampling distribution of x̄ describes how the values of x̄ vary in repeated sampling.
• We know that the SHAPE of the sampling distribution depends on the shape of the population distribution: if the population is Normal, the sampling distribution of x̄ will be, too.
• We know that the mean of the sampling distribution is the same as the unknown population mean (x̄ is an unbiased estimator of 𝜇).
• We know that the standard deviation of the sampling distribution gets smaller as the sample size gets larger.

It stands to reason, then, that even if we don’t know the true mean or standard deviation of the population, the sample means from repeated samples will cluster around the unknown population mean, with less spread than the population itself, and with even less spread as the sample size grows.

This leads us to our “Big Idea”. The sampling distribution of x̄ tells us how close to 𝜇 the sample mean x̄ is likely to be. We can use this information to construct a CONFIDENCE INTERVAL (sometimes called an interval estimate). All confidence intervals we construct will have a form similar to this:

point estimate ± margin of error

The point estimate can be x̄ or p̂. It is our best guess for the unknown population parameter (𝜇 or 𝑝). The margin of error shows how close we believe our guess is, and is based on the variability of our estimate.

A C% confidence interval gives an interval of plausible* values for a parameter. The interval is calculated from the data and has the form
point estimate ± margin of error
The difference between the point estimate and the true parameter value will be less than the margin of error in C% of all samples.
The confidence level C gives the overall success rate of the method for calculating the confidence interval. That is, in C% of all possible samples, the method would yield an interval that captures the true parameter value.

* Note: Plausible does not mean the same thing as possible! Just about any value of a parameter is possible, but based on our data, the values in our interval are reasonable or believable values of our parameter.

The confidence level is the overall capture rate if the method is used many times. The sample mean will vary from sample to sample, but when we use the method estimate ± margin of error to get an interval based on each sample, C% of these intervals capture the unknown population mean 𝜇.

Interpreting Confidence Intervals and Confidence Levels
THIS IS A PHRASE TO MEMORIZE!!!!

Example: A large company is concerned that many of its employees are in poor physical condition, which can result in decreased productivity. To estimate the average number of steps its employees take per day, the company provides a pedometer to 50 randomly selected employees to use for one 24-hour period. After collecting the data, the company statistician reports a 95% confidence interval of 4547 steps to 8473 steps.
(a) Interpret the confidence interval.
(b) What is the point estimate that was used to create the interval? What is the margin of error?
(c) Recent guidelines suggest that people aim for 10,000 steps per day. Is there convincing evidence that the employees of this company are not meeting the guideline, on average? Explain.

The confidence level tells us how likely it is that the method we are using will produce an interval that captures the population parameter if we use it many times. The confidence level does not tell us the chance that a particular confidence interval captures the population parameter. Instead, the confidence interval gives us a set of plausible values for the parameter.
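
The idea of a capture rate "if the method is used many times" can be checked with a small simulation. The sketch below is only an illustration, not part of the lesson's required work. It assumes a hypothetical Normal population (mean 6500, standard deviation 2000), assumes the population standard deviation is known, and uses the multiplier 1.96 for 95% confidence together with the standard formula sigma / sqrt(n) for the standard deviation of x̄. The function names are made up for this example.

```python
import random
import statistics


def z_interval(sample, sigma, z_star=1.96):
    """Return (low, high): sample mean ± z* · sigma / sqrt(n).

    Assumes the population standard deviation sigma is known.
    """
    n = len(sample)
    x_bar = statistics.fmean(sample)      # point estimate
    margin = z_star * sigma / n ** 0.5    # margin of error
    return x_bar - margin, x_bar + margin


def capture_rate(mu=6500, sigma=2000, n=50, trials=2000, seed=1):
    """Take many random samples and report the share of intervals containing mu.

    mu and sigma describe a hypothetical Normal population chosen for illustration.
    """
    rng = random.Random(seed)             # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        if low <= mu <= high:
            hits += 1
    return hits / trials


rate = capture_rate()
print(f"Share of intervals that captured mu: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either captures 𝜇 or it does not; the 95% describes how often the method succeeds across the whole collection of samples.
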
We interpret confidence levels and confidence intervals in much the same way whether we are estimating a population mean, proportion, or some other parameter.

Let’s be sure we’ve got it. There are only 2 possibilities when discussing confidence levels: our sample may be one of the 95% of samples whose intervals capture the population mean, or else it’s (unhappily) one of the 5% whose intervals don’t. We cannot know whether our sample is one of the 95%, so by saying we’re 95% confident, what we’re saying is that the method we are using gives correct results 95% of the time.

The probability that our particular confidence interval captures the true parameter is NOT 95%. Instead, before we take the sample, we have a 95% chance of getting a sample mean that’s within about 2 standard deviations (of the sampling distribution of x̄) of the mystery parameter. After we actually construct the confidence interval, the probability that it captures the population parameter is either 1 or 0.

Example: According to the American Community Survey, a 95% confidence interval for the median household income in Texas during the years 2009–2011 is $58,929 ± $218. <http://www.census.gov/hhes/www/income/data/statemedian/>
Interpret the confidence interval and the confidence level.

Constructing Confidence Intervals:
What if we want to have a confidence level greater than 95% (like 99%)? Or a confidence level less than 95% (like 90%)? What will happen to our confidence interval? We’ve already determined that our confidence interval is found by the point estimate ± margin of error. This leads to a more general formula for confidence intervals:

statistic ± (critical value) • (standard deviation of statistic)

The CRITICAL VALUE (sometimes referred to as z*) is a multiplier that makes the interval wide enough to capture the desired percentage. The critical value depends both on the confidence level C and the sampling distribution of the statistic. (These critical values are based on the number of standard deviations away from the mean.
For example, our 68-95-99.7 rule states that about 95% of the values in a Normal distribution are within 2 standard deviations of the mean. We will be a bit more precise and use z* = 1.96 when we want a 95% confidence level.)

Properties of Confidence Intervals:
• The “margin of error” is the (critical value) • (standard deviation of statistic).
• The user chooses the confidence level, and the margin of error follows from this choice.
• The critical value depends on the confidence level and the sampling distribution of the statistic.
• Greater confidence requires a larger critical value.
• The standard deviation of the statistic depends on the sample size n.

Here are two important cautions to keep in mind when constructing and interpreting confidence intervals.
• Our method of calculation assumes that the data come from an SRS of size n from the population of interest. Other types of sampling may be preferable in practice, but the method in this lesson does not apply to data collected that way.
• The margin of error in a confidence interval covers only chance variation due to random sampling or random assignment! Remember that the way a survey or experiment is conducted may influence our results, usually not in a good way. Problems such as nonresponse, undercoverage, and response bias are not accounted for by the margin of error.

Homework: page 488: #1-19 odds, 20-25 all
Read pp. 492-504
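
As a supplement to the Properties of Confidence Intervals list, here is a minimal sketch of how the critical value and the margin of error respond to the confidence level and the sample size. It uses Python's standard `statistics.NormalDist` to look up z*. The population standard deviation of 10 is a hypothetical value, and the formula sigma / sqrt(n) for the standard deviation of a sample mean is a standard result that this lesson has not yet stated, so treat both as assumptions of the example.

```python
from statistics import NormalDist


def critical_value(confidence):
    """z* such that the central `confidence` area of the standard Normal curve lies within ±z*."""
    tail = (1 - confidence) / 2           # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)


def margin_of_error(confidence, sd_of_statistic):
    """(critical value) · (standard deviation of statistic)."""
    return critical_value(confidence) * sd_of_statistic


SIGMA = 10  # hypothetical population standard deviation

# Greater confidence requires a larger critical value.
for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%} confidence: z* = {critical_value(confidence):.3f}")

# A larger sample size gives a smaller standard deviation of the statistic,
# and therefore a smaller margin of error at the same confidence level.
for n in (25, 100, 400):
    sd_of_mean = SIGMA / n ** 0.5
    print(f"n = {n}: margin of error = {margin_of_error(0.95, sd_of_mean):.2f}")
```

The critical values come out near 1.645, 1.960, and 2.576 for 90%, 95%, and 99% confidence. With the confidence level held at 95%, quadrupling the sample size cuts the margin of error in half (3.92, then 1.96, then 0.98).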

© Copyright 2024