Statistics for Dummies (26 page)

Authors: Deborah Jean Rumsey


Going back to the example, you know that the pass/fail cutoff as a standard score is –0.3, so Z = –0.3. You also know that the mean of all the scores is μ = 250, and the standard deviation is σ = 15. Converting the standard score to the original score, you get x = –0.3 × 15 + 250 = –4.5 + 250 = 245.5 (or 246). So the cutoff score for pass/fail is 246. Anyone scoring below 246 fails (Rhodie, for example) and anyone scoring at or above 246 passes (Clint, for example).
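The conversion from a standard score back to an original score is just "multiply by the standard deviation, then add the mean." Here is a minimal sketch in Python; the function name is made up for illustration and is not from the book.

```python
def standard_to_original(z, mean, std_dev):
    """Convert a standard score (Z) back to the original scale: x = Z * sigma + mu."""
    return z * std_dev + mean

# Values from the example: Z = -0.3, mean = 250, standard deviation = 15.
cutoff = standard_to_original(-0.3, mean=250, std_dev=15)
print(round(cutoff, 1))  # 245.5
```

Rounding 245.5 up gives the pass/fail cutoff of 246 used in the example.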

What if the scores don't have a normal distribution? You can still calculate percentiles, but you will have to do it manually, or use some computer software to do it (such as Microsoft Excel).

To find the kth percentile when the data do not have a normal distribution:

  1. Put the values in order from smallest to largest.

  2. Let n be the size of the data set. Multiply k percent times n, and round to the nearest whole number.

  3. Count your way through the data until you reach the point identified in Step 2. This is the kth percentile in your data set.

For example, suppose you have the following data set: 1, 6, 2, 5, 3, 9, 3, 5, 4, 5, and you want the 90th percentile.

  • Step 1: Order the data to get 1, 2, 3, 3, 4, 5, 5, 5, 6, 9.

  • Step 2: n = 10, k = 90%, and k percent times n is 0.90 times 10 = 9.

  • Step 3: That means the 9th number (from smallest to largest), which is 6, is the 90th percentile. About 90% of the values are below 6, and 10% of the values are at or above 6. (See Chapter 5 for more discussion and examples of percentiles.)
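If you'd rather let a computer do the counting, the three steps can be sketched in Python roughly as follows. The function name is hypothetical, the data set is assumed to be non-empty, and clamping the position to the range 1 through n is an added assumption so that very small values of k still point at a real data value.

```python
import math

def kth_percentile(data, k):
    """Return the k-th percentile using the three-step counting method."""
    ordered = sorted(data)                       # Step 1: order smallest to largest
    n = len(ordered)
    position = math.floor(k * n / 100 + 0.5)     # Step 2: k percent of n, rounded to the nearest whole number
    position = min(max(position, 1), n)          # assumption: keep the position between 1 and n
    return ordered[position - 1]                 # Step 3: count to that position (1-based)

scores = [1, 6, 2, 5, 3, 9, 3, 5, 4, 5]
print(kth_percentile(scores, 90))  # 6
```

Statistical software often interpolates between neighboring values, so a built-in percentile function may give a slightly different answer than this counting method.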

 

Chapter 9:
Caution—Sample Results Vary!

Statistics are often presented as one-shot deals. For example, "One out of every two marriages ends in divorce", "Four out of five dentists surveyed recommend Trident gum", or "The average lifespan of a female born in the year 2000 is 80 years." People hear statistics like this and assume that those results apply to them. The ordinary person may assume, for example, that his chance of getting divorced is 50%, that his dentist probably recommends Trident gum, and that if he and his wife (if they haven't divorced yet!) had a baby girl in the year 2000, they can expect her to live to be 80 years old.

But shouldn't these statistics come with a "plus or minus" indicating that the results vary? You bet! Does it happen? Not often enough. The truth is that unless the researchers are able to conduct a census to get their results (collecting data on every single member of the population), those results are going to vary from sample to sample, and that variability can be much more than you think! The question is, by how much should you expect a statistical result to vary? You hope (perhaps even automatically assume) that it shouldn't vary by much, and that you can accurately apply the reported result to almost anyone. But is this always the case? Absolutely not; the variability in any statistical result depends on a number of factors, all of which are discussed in this chapter.

Expecting Sample Results to Vary

I was watching a commercial on TV the other day for a weight-loss meal-replacement drink. It gave an inspiring story about a woman who had lost 50 pounds in 6 months (and who had kept off the weight for over a year). During her testimony, a message flashed for a couple of seconds at the bottom of the screen that said "results not typical."

That leads to the question, "What's typical?" How much weight can you expect to lose in 6 months on this product, or, if you wanted to lose 50 pounds, how long should you expect it to take? You know that no matter what the results are for any individual, results are expected to vary from person to person. But this commercial is trying to lead you to believe that you should expect to lose about 50 pounds in 6 months (even though the tiny message says you shouldn't). What would be nice is if this manufacturer told you by how much you should expect those results to vary. What would also be nice is if the commercial presented the results from a sample of people, not just from one person.

HEADS UP 

Anecdotes (individual stories and testimonies) are eye-catching, but they're not statistically meaningful!

Suppose you're trying to estimate the proportion of people in the United States who approve of the president. If you ask a random sample of 1,000 people from the United States whether they approve of how the president is doing his job, you'll get one sample result (for example, 55% approve). You shouldn't report that 55% of the entire population of the United States approves of the president, because your result is based on one sample of only 1,000 people.

If you take a different random sample of 1,000 people from the same population and ask the same question, you'll likely get a different result. In fact, given that the population of the United States is so large and not everyone shares the same opinion of the president, 100 different random samples of 1,000 people — each taken from the same population and each asked the same question — would yield 100 different results. So how do you report your sample results? Some measure of how much the results are expected to vary has to be part of the package.

HEADS UP 

Expect sample results to vary from sample to sample. Don't take a statistic at face value and try to apply it without having some indication of how much that result is expected to vary.
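To see sample-to-sample variation for yourself, you can simulate it. The sketch below assumes, purely for illustration, that the true approval rate in the population is 55% (in real life you wouldn't know this number), and then takes 100 random samples of 1,000 people each. The function name and the fixed random seed are also illustrative choices.

```python
import random

def simulate_polls(true_proportion, sample_size, num_samples, seed=0):
    """Simulate repeated random samples; return the approval proportion found in each."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for _ in range(num_samples):
        # Each simulated person approves with probability true_proportion.
        approvals = sum(rng.random() < true_proportion for _ in range(sample_size))
        results.append(approvals / sample_size)
    return results

polls = simulate_polls(true_proportion=0.55, sample_size=1000, num_samples=100)
print(min(polls), max(polls))  # the sample results spread out around 0.55
```

The gap between the smallest and largest sample result gives a feel for how much a single reported percentage could have differed had another sample been drawn.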

 

Measuring Variability in Sample Results

You may be wondering how you can assess the amount by which a sample statistic is going to vary without having to select every possible sample and look at its statistic. You may as well do a census at that point, right? Fortunately, thanks to some major statistical results (mainly the central limit theorem), you can find out how much you expect sample means or proportions to vary without having to take all possible samples (what a relief!). The central limit theorem, in a nutshell, says that the distribution of all sample means (or proportions) is approximately normal, as long as the sample sizes are large enough. And even more impressive, the central limit theorem doesn't care what the distribution of the original population looks like. How can this be? You need to take a large enough sample, and you need to know some of the characteristics of your original population (like the mean and the standard deviation). And then the magic of statistical theory takes over from there.

Standard errors

Variability in sample means (or proportions) is measured in terms of standard errors. Standard error is the same basic concept as standard deviation; both represent a typical distance from the mean. But here is how standard error differs from standard deviation. The original population values deviate from each other due to natural phenomena (people have different heights, weights, and so on), hence the name standard deviation to measure their variability. Sample means vary because of the error that occurs in not doing a census and being able to only take samples, hence the name standard error to measure the variability of the sample means. (See Chapter 5 for more on the standard deviation. I talk more about how to interpret standard errors in this section. For specifics on standard error formulas, see Chapter 10.)

Here's an example: The U.S. Bureau of Labor Statistics tries to track what people spend their money on each year with a Consumer Expenditure Survey (CES). The bureau takes a sample of households and asks each household in the sample to give their spending information. (Bias in reporting could be an issue here.) Their typical sample size is 7,500 households. Table 9-1 shows a few of the results from the 2001 CES. This table not only includes the average amount of money spent by households in the sample on various items (the sample means for each item), it also includes the standard error for each of those sample means.

Table 9-1: Average Yearly Household Expenses for American Households in 2001

Expense                       Mean         Standard Error
Food (eating at home)         $3,085.52    $42.30
Food (eating out)             $2,235.37    $38.35
Phone                           $914.41     $9.69
Gas and oil (for vehicles)    $1,279.37    $12.88
Reading materials               $141.00     $2.99

You can interpret the results of Table 9-1 by making relative comparisons. For example, notice that about 42% of all average household food expenses are for eating out, given that $2,235.37 ÷ ($3,085.52 + $2,235.37) = $2,235.37 ÷ $5,320.89 = 0.42 or 42%. The standard errors for average food expenses are larger than for the other expenses on the list, because food expenses vary a lot more from household to household. However, you may wonder why the standard errors for food expenses aren't larger than what's shown in Table 9-1. Remember, standard error tells you how much variability you can expect in the average if you were to take another sample. If the sample size is large, the average shouldn't change by much. And you know that the government never uses small sample sizes!

HEADS UP 

A listing of the standard errors for sample means is not something you would typically see in a media report. However, you can (and should, when the results are important to you) dig deeper and find the standard errors, as well. The best thing to do is to look up the research articles and look for the standard errors in those articles.

Sampling distributions

A listing of all the values that a sample mean can take on and how often those values can occur is called the sampling distribution of the sample mean. A sampling distribution, like any other distribution, has a shape, a center, and a measure of variability (in this case, the standard error). (See Chapter 4 for information on shape, center, and variability; see Chapter 3 for more information on distributions.)

TECHNICAL STUFF 

According to the central limit theorem, if the samples are large enough, the distribution of all possible sample means will have a bell-shaped, or normal, distribution with the same mean as that of the original population. (See Chapter 3 for more on the normal distribution.) This is because the sample means are clustered near the overall average value, which is the population mean. High values in a sample are offset by low values that also appear in the sample, in an averaging out effect. The variability in the sampling distribution is measured in terms of standard errors. An added benefit to using an average to get an estimate (rather than a total or a single value) is that the variability in the sample means decreases as sample sizes get larger. (Similar characteristics also apply to the sampling distribution of the sample proportion, in the case of categorical [yes/no] data from surveys and polls.)

Using the empirical rule to interpret standard errors

Because the sampling distribution of sample means (or sample proportions) is normal (mound-shaped), you can use the empirical rule to get an idea of how much a given sample result is expected to vary, provided that the sample size is large enough. (See Chapter 8 for full coverage of the empirical rule.)

Applied to sample means and proportions, the empirical rule says that you can expect:

  • About 68% of the sample means to lie within 1 standard error of the population mean

  • About 95% of the sample means to lie within 2 standard errors of the population mean

  • About 99.7% of the sample means to lie within 3 standard errors of the population mean

  • Similar values for categorical (yes/no) data: 68%, 95%, or 99.7% of the sample proportions will lie within 1, 2, or 3 standard errors, respectively, of the population proportion

HEADS UP 

What does the empirical rule tell you about how much you can expect a given sample mean to vary? Keep in mind that about 95% of the sample means should lie within 2 standard errors of the population mean, and your job is to estimate the population mean. So, if your estimate is actually a range including your sample mean plus or minus 2 standard errors, your estimate would be correct about 95% of the time. (The amount added or subtracted, here 2 standard errors, is called the margin of error. For more on the margin of error, see Chapter 10.)

Consider this example: According to the U.S. Bureau of Labor Statistics, the average household in 2001 contained 2.5 people (0.7 of which were children under 18, and 0.3 of which were people over 65) and 1.9 vehicles. (Sorry, no standard errors were available for these data.) Referring to Table 9-1, the average phone expenses for the year for this sample of 7,500 households was $914.41 per household. How much are these results expected to vary from sample to sample (that is, if different samples of 7,500 households each had been selected from the same population)? The standard error for phone expenses for this sample is $9.69. This means that 95% of the sample average phone expenses should lie within 2 × $9.69 (or $19.38) on either side of the actual population average. This shows how much the mean phone expenses are expected to vary when the sample size is 7,500.

HEADS UP 

In the preceding example, you're not saying that 95% of all the households in the population have phone expenses in that range. Instead, you're giving an estimate for what the average phone expense is, over all households in the population. The average phone expense would actually be a single number. But because you can't get the actual number, you estimate it using this range of values.

You can also use the empirical rule to give a rough estimate of the average telephone expenses for all U.S. households (not just the sample of 7,500 households). Again, using the second property of the empirical rule, you would expect that the average telephone expense for all households in the U.S. is about $914.41, plus or minus 2 × $9.69 = $19.38. This type of estimate will be correct for 95% of the samples selected (and you hope that the sample collected in the Consumer Expenditure Survey is one of them). Using proper statistical jargon, this means that you can estimate with about 95% confidence that the average phone expenses per year for all U.S. households lies somewhere between $914.41 – $19.38 and $914.41 + $19.38, or between $895.03 and $933.79. If you want to be about 99.7% confident in your estimate (instead of 95% confident), you need to add and subtract 3 standard errors.
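The "sample mean plus or minus some number of standard errors" arithmetic is easy to script. This is only a sketch of the rough empirical-rule estimate described here (the function name is invented for illustration), using the phone expense figures from Table 9-1.

```python
def interval_estimate(sample_mean, standard_error, num_standard_errors=2):
    """Return (low, high): the sample mean plus or minus a number of standard errors."""
    margin = num_standard_errors * standard_error
    return sample_mean - margin, sample_mean + margin

# Phone expenses: sample mean $914.41, standard error $9.69.
low, high = interval_estimate(914.41, 9.69)
print(f"{low:.2f} to {high:.2f}")   # 895.03 to 933.79

# About 99.7% confidence uses 3 standard errors instead of 2.
low3, high3 = interval_estimate(914.41, 9.69, num_standard_errors=3)
print(f"{low3:.2f} to {high3:.2f}")  # 885.34 to 943.48
```

Notice that asking for more confidence widens the range: the price of being right more often is a less precise estimate.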

This type of result, involving a statistic plus or minus a certain number of standard errors, is called a confidence interval. (For more on this, see Chapter 11.) The amount added or subtracted is called the margin of error.

HEADS UP 

Reporting the margin of error for the sample mean is something that you don't see very often in media reports involving quantitative data (such as household income, house prices, or stock market values). Yet the margin of error should always be there in order for the public to assess the accuracy of the results! With survey and polling data (which are categorical data and are reported as proportions) you're often given the margin of error, which is directly related to the standard error (see Chapter 10). Why the double standard? I can't say for sure.

Specifics of the central limit theorem

Notice that the results that apply the empirical rule in the preceding section give a rough way of interpreting 1, 2, or 3 standard errors, and they tell you what to expect in terms of the sample mean (or sample proportion). These results are actually due to the central limit theorem (CLT). Statisticians love the CLT; without it, they wouldn't have jobs. The CLT allows them to actually say something about where they expect sample results to lie, without having to look at all possible samples drawn from a particular population. It also provides a formula for calculating standard errors, as well as more specific information regarding what percentage of the sample means (or proportions) will lie between any number of standard errors (not just 1, 2, or 3).

The central limit theorem says that for any population with mean μ and standard deviation σ:

  • The distribution of all possible sample means, x̄, is approximately normal for sufficiently large sample sizes. That means you can use the normal distribution to answer questions or to draw conclusions about your sample mean. (See Chapter 8 for the normal distribution.)

  • The larger the sample size (n) is, the closer the distribution of the sample means will be to a normal distribution. (Most statisticians agree that if n is at least 30, it will do a reasonable job in most cases.)

  • The mean of the distribution of sample means is also μ.

  • The standard error of the sample means is σ ÷ √n. It decreases as n increases.

  • If the original data have a normal distribution, the sample means will always have an exact normal distribution, no matter what the sample size is.
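You don't have to take these properties on faith; a quick simulation can check them. The sketch below assumes a population that is definitely not normal (an exponential distribution with mean 1 and standard deviation 1, chosen only for illustration), draws 5,000 samples of size 30, and looks at the mean and spread of the resulting sample means. All names, the seed, and the choice of population are assumptions, not part of the original example.

```python
import random
import statistics

def sample_means(draw_value, sample_size, num_samples):
    """Draw many samples from a population and return the mean of each sample."""
    return [
        statistics.fmean([draw_value() for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
# Assumed population: exponential with mean 1 and standard deviation 1 (strongly skewed).
means = sample_means(lambda: rng.expovariate(1.0), sample_size=30, num_samples=5000)

print(round(statistics.fmean(means), 2))  # close to the population mean, 1
print(round(statistics.stdev(means), 2))  # close to 1 divided by the square root of 30, about 0.18
```

A histogram of `means` would look roughly bell-shaped even though the individual values come from a lopsided distribution, which is the averaging effect at work.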

HEADS UP 

If the population standard deviation, σ, is unknown (which will be the case most of the time), you can estimate it using s, the sample standard deviation, for standard error in the preceding formula. More on this in Chapter 12.

Tip 

Note that the CLT says that even if the original data are not normally distributed, the distribution of sample means will be approximately normal, as long as the sample sizes are large enough. This, again, is due to the averaging effect.

Checking out the ACT math scores

Consider the 2002 ACT math scores for male and female high school students. (This is a situation in which the population mean and standard deviation are known, because all of the tests taken in 2002 were graded and recorded.) The average ACT math score for male students was 21.2 with a standard deviation of 5.3. Female students averaged 20.1 (with a standard deviation of 4.8) on the same ACT math test. Prior research has shown that ACT scores have an approximate normal distribution. Figure 9-1 shows the distributions of the scores for males and females, respectively, given the preceding information. The entire test-taking population (boys and girls combined) is about one million students.

Figure 9-1: ACT math scores for male and female high school students in 2002.

Using the empirical rule (see Chapter 8), about 95% of the male students scored between 10.6 and 31.8 on the ACT math test, and about 95% of the female students scored between 10.5 and 29.7 on the ACT math test. The scores of male and female students are quite comparable.

Suppose that you're interested in the average scores of samples of size 100 from the total population of 500,000 male students who took the test in 2002. Why? Maybe you have 100 students in a class and want to see how they did, compared to all other possible classes of this size. What will the distribution of all possible sample means look like? According to the CLT, it will have a normal distribution with the same mean (21.2), and the standard error will be 5.3 divided by the square root of 100 (because in this hypothetical sample, n = 100). Therefore the standard error for this sample is 5.3 ÷ √100 = 5.3 ÷ 10 = 0.53.

Figure 9-2 shows what the sampling distribution of the sample means looks like for samples of 100 male students.

Figure 9-2: Average 2002 ACT math scores for male students for samples of size 100.

HEADS UP

Notice in Figure 9-2 how much smaller the standard error of the sample means is, compared to the standard deviation of the original scores shown in Figure 9-1. That's because each sample mean in Figure 9-2 contains information from 100 students, compared to each individual score in Figure 9-1, which contains information from only a single student. Sample means won't vary as much as individual scores will. That's why using a sample mean is a much better idea for estimating the population mean than just using an individual score (or an anecdote).

Figure 9-3 shows what happens to the sampling distribution of the sample mean when the sample sizes increase to 1,000 male students. The standard error reduces to 5.3 divided by the square root of 1,000, or 5.3 ÷ 31.62 = 0.17. The standard error for Figure 9-3 is smaller than the standard error in Figure 9-2, because the sample means in Figure 9-3 are each based on 1,000 students and contain even more information than the sample means shown in Figure 9-2 (which are based on 100 students each).

Figure 9-3: Average 2002 ACT math scores for male students for samples of size 1,000.
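The standard error calculation (the standard deviation divided by the square root of the sample size) can be sketched in a couple of lines. The function name below is an illustrative choice; a "sample" of size 1 is included to show that it just reproduces the standard deviation of individual scores.

```python
import math

def standard_error_of_mean(std_dev, sample_size):
    """Standard error of the sample mean: sigma divided by the square root of n."""
    return std_dev / math.sqrt(sample_size)

# Male ACT math scores have a standard deviation of 5.3.
for n in (1, 100, 1000):
    print(n, round(standard_error_of_mean(5.3, n), 2))
# 1 5.3
# 100 0.53
# 1000 0.17
```

Because of the square root, a sample ten times larger (1,000 instead of 100) cuts the standard error only by a factor of about 3.2, not 10.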

Figure 9-4 shows all three distributions for male students (individual scores, sample means of size 100, and sample means of size 1,000) overlapping, to compare their variability.

Figure 9-4: Sampling distributions of 2002 ACT math scores for male students showing original scores, sample means for samples of size 100, and sample means for samples of size 1,000.

TECHNICAL STUFF

You can use the central limit theorem to answer questions about sample results in situations like the ACT math scores. For example, suppose you want to know the chance that a sample of 100 male students will have an average ACT math score of 22 or less. Using the technique shown in Chapter 8, you change the score of 22 to a standard score by subtracting the population mean (21.2) and dividing the difference by the standard error (instead of the standard deviation). The formula for this conversion is Z = (x̄ – μ) ÷ (σ ÷ √n), where x̄ is the average score of the sample (in this case, 22) and σ ÷ √n is the standard error. Note that σ is the population standard deviation (5.3). In this example, where the standard error was calculated to be 5.3 ÷ √100 = 0.53 (see Figure 9-2), the class average of 22 converts to a standard score of (22 – 21.2) ÷ 0.53 = 0.8 ÷ 0.53 = 1.51. You want to know the percentage of scores that lie to the left of this value (in other words, the percentile corresponding to a standard score of 1.51). Referring to Table 8-1 in Chapter 8, that percentage is about 93.32%.
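If you have Python handy, you can check this kind of calculation without a table. The sketch below uses `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later); the variable names are illustrative. Computed directly, the chance comes out to about 93.4%, in line with the rounded table value quoted here.

```python
import math
from statistics import NormalDist

population_mean = 21.2   # male ACT math average from the text
population_sd = 5.3      # population standard deviation from the text
n = 100                  # sample size

standard_error = population_sd / math.sqrt(n)   # 0.53
z = (22 - population_mean) / standard_error     # subtract first, then divide
chance = NormalDist().cdf(z)                    # area to the left of z under the standard normal curve

print(round(z, 2), round(chance, 3))  # 1.51 0.934
```

The parentheses around `22 - population_mean` matter: they make sure the subtraction happens before the division.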

Tip 

Don't forget to find 22 – 21.2 first, before dividing by 0.53 in the preceding example, or you'll get –18, which is wrong.

REMEMBER 

A group of averages always has less variability than a group of individual scores. And averages that are based on larger sample sizes have even less variability than averages based on smaller sample sizes.

The central limit theorem doesn't apply only to sample means. You can also use the central limit theorem to answer questions or make conclusions about population proportions, based on your sample proportion. The same conclusions about the shape, center, and variability in the sample means apply to sample proportions. Of course, the formulas will be a little different, but the concepts are all the same. First, the sample proportion is denoted p̂ and is equal to the number of individuals in the sample in the category of interest, divided by the total sample size (n).

The central limit theorem says that for any population of data with p as the overall population percentage:

  • The distribution of all possible sample proportions is approximately normal, provided that the sample size is large enough. (See Chapter 8 for more on the normal distribution.)

  • The larger the sample size (n) is, the closer the distribution of sample proportions will be to a normal distribution. (That means you can use the normal distribution to answer questions or to draw conclusions about your sample proportion.)

  • The mean of the distribution of sample proportions is also p.

  • The standard error of the sample proportions is √(p(1 – p) ÷ n).
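For sample proportions, the usual standard error formula is the square root of p × (1 – p) ÷ n. Here is a minimal sketch of that formula; the function name and the numbers plugged in (a population proportion of 0.55 and samples of 1,000 people) are hypothetical values chosen for illustration only.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical values: population proportion 0.55, samples of 1,000 people.
print(round(standard_error_of_proportion(0.55, 1000), 4))  # 0.0157
```

Under these assumed values, sample proportions would typically land within a point or two of 55%, since 1 standard error is about 1.6 percentage points.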
