Suppose your sample size is 10, your test statistic (referred to as the t-value) is 2.5, and your alternative hypothesis, Hₐ, is the greater-than alternative. Because the sample size is 10, you use the t-distribution with 10 − 1 = 9 degrees of freedom to calculate your p-value. This means you'll be looking at the row in the t-table (Table 14-2) that has a 9 in the Degrees of Freedom column. Your test statistic (2.5) falls between two values: 2.262 (the 97.5th percentile) and 2.821 (the 99th percentile).
What's the p-value? Somewhere between 100% − 99% = 1% = 0.01 and 100% − 97.5% = 2.5% = 0.025. (Keep in mind that with a greater-than alternative, you need 100% minus the percentile.) You don't know exactly what the p-value is, but because 1% and 2.5% are both less than the typical cutoff of 5%, you reject Hₒ.
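If you have software handy, you don't have to settle for a range. The following is a minimal sketch, not the method behind Table 14-2: it approximates the area to the right of a t-value by numerically integrating the t-distribution's density with Simpson's rule. The function names `t_pdf` and `t_right_tail` and the step count are assumptions made for illustration.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    steps must be even. The distribution is symmetric about 0, so the
    area to the right of t is 0.5 minus the area between 0 and t.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t


# Greater-than alternative, test statistic 2.5, sample size 10 (9 degrees of freedom).
p_value = t_right_tail(2.5, df=9)
print(f"p-value is about {p_value:.4f}")
```

You can compare the printed value with the range you read from the t-table; the decision at the 5% cutoff is the same either way.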
Note that for a less-than alternative hypothesis, your test statistic would be a negative number (to the left of 0 on the t-distribution). In this case, you want to find the percentage below, or to the left of, your test statistic to get your p-value. Yet negative test statistics don't appear on Table 14-2. Not to worry! The percentage to the left of (below) a negative t-value is the same as the percentage to the right of (above) the positive t-value, due to symmetry. So, to find the p-value for your negative test statistic, look up the positive version of your test statistic on Table 14-2, find the corresponding percentile, and take 100% minus that.
For example, if your test statistic is −2.5 with 9 degrees of freedom, look up +2.5 on Table 14-2, and you find that it falls between the 97.5th and 99th percentiles. Taking 100% minus these amounts, your p-value is somewhere between 1% and 2.5%. (Note that this approach for negative numbers is different from how these situations are handled in Table 8-1 in Chapter 8, because that table was set up differently.)
If your alternative hypothesis (Hₐ) is the not-equal-to alternative, double the percentage that you get, because results in either tail of the distribution count as evidence against Hₒ.
For all types of hypotheses (greater-than, less-than, and not-equal-to), change the percentage to a probability by dividing by 100 or moving the decimal point two places to the left.
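The rules above for turning a table percentile into a p-value can be sketched in a few lines. This is an illustration only: the function name and the labels "greater", "less", and "not-equal" are made up for this example, and the sketch assumes you pass in the percentile for the positive version of your test statistic.

```python
def p_value_from_percentile(percentile: float, alternative: str) -> float:
    """Convert a table percentile (0 to 100) into a p-value (a probability).

    percentile is the one found for the positive version of the test statistic.
    alternative is "greater", "less", or "not-equal" (hypothetical labels).
    """
    tail_percent = 100.0 - percentile      # 100% minus the percentile
    if alternative in ("greater", "less"):
        percent = tail_percent             # one tail only
    elif alternative == "not-equal":
        percent = 2.0 * tail_percent       # double for the two-sided case
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return percent / 100.0                 # percentage to probability


for alt in ("greater", "less", "not-equal"):
    print(alt, p_value_from_percentile(97.5, alt))
```

With a percentile of 97.5, the one-sided alternatives give 0.025 and the not-equal-to alternative gives 0.05.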
Whether in product advertisements or media blitzes on recent medical breakthroughs, you often run across claims made about one or more populations. For example, "We promise to deliver our packages in two days or less" or "Two recent studies show that a high-fiber diet may reduce your risk of colon cancer by 20%." Whenever someone makes a claim (also called a null hypothesis) about a population (for example, that the average amount of time people spend commuting to and from work is 6 hours per week, or that the percentage of people in the United States who like reality TV is 30%), you can test the claim by doing what statisticians call a hypothesis test.
You can also use a hypothesis test to compare two populations (for example, the mean commuting time for people working first shift compared to people working second shift, or the proportion of women compared to men who have cellphones). See Chapter 14 for background information on the general ideas behind hypothesis tests.
A hypothesis test involves setting up your hypotheses (a claim and its alternative), selecting a sample (or samples), collecting data, calculating the relevant statistics, and using those statistics to decide whether the claim is true. What you're really doing is comparing your sample statistic to the claimed population parameter and seeing how close they are to each other. For example, if the average commuting time for a sample of 1,000 workers is 5.2 hours, 5.2 is the sample statistic. If the claim is that the average commuting time for the population of all workers is 6 hours per week, 6 is the claimed population parameter (in this case, the population mean). The closer the sample statistic is to the claimed value of the population parameter, the more you can believe that the claim is valid. Yet the big question is, "How close is close enough?"
In this chapter, I outline the formulas used for some of the most common hypothesis tests, explain the necessary calculations, and walk you through some examples.
The test for one population mean is used when the variable is numerical (for example, age, income, time, and so on) and only one population or group is being studied (for example, all U.S. households or all college students). For example, Dr. Ruth says that working mothers spend an average of 11 minutes per day talking to their children. (For dads, the claim is 8 minutes.) The variable, time, is numerical, and the population is all working mothers.
The null hypothesis is that the population mean, μ, is equal to a certain claimed value, μₒ. The notation for the null hypothesis is Hₒ: μ = μₒ. So, the null hypothesis in the Dr. Ruth example is Hₒ: μ = 11 minutes, and μₒ is 11 here. Note that μ represents the average number of minutes per day that all working mothers spend talking to their children. The alternative hypothesis, Hₐ, is one of the following: μ > μₒ, μ < μₒ, or μ ≠ μₒ. In this example, the three possibilities for Hₐ would be μ > 11, μ < 11, or μ ≠ 11. (See Chapter 14 for more on alternative hypotheses.) If you suspect that the average time working mothers spend talking with their kids is more than 11 minutes, your alternative hypothesis would be Hₐ: μ > 11.
The formula for the test statistic for one population mean is $\frac{\bar{x} - \mu_o}{s / \sqrt{n}}$. To calculate it, do the following:
1. Calculate the sample mean, x̄, and the sample standard deviation, s. Let n represent the sample size. See Chapter 4 for calculations of the mean and standard deviation.
2. Find x̄ minus μₒ.
3. Calculate the standard error: $s / \sqrt{n}$. Save your answer.
4. Divide your result from Step 2 by the standard error found in Step 3.
For the Dr. Ruth example, suppose a random sample of 100 working mothers spend an average of 11.5 minutes per day talking with their children, with a standard deviation of 2.3 minutes. That means x̄ is 11.5, n = 100, and s = 2.3.
Take 11.5 − 11 = +0.5.
Take 2.3 divided by the square root of 100 (which is 10) to get 0.23 for the standard error.
Divide +0.5 by 0.23, to get 2.17 (rounded to 2.2). That's your test statistic.
This means your sample mean is 2.2 standard errors above the claimed population mean. Would these sample results be unusual if the claim (Hₒ: μ = 11 minutes) were true? To decide whether your test statistic supports Hₒ, calculate the p-value. To do so, look up your test statistic (in this case 2.2) on the standard normal distribution (Z-distribution) in Table 8-1 in Chapter 8, and take 100% minus the percentile shown, because your Hₐ is a greater-than hypothesis. In this case, the percentage would be 100% − 98.61% = 1.39%. So, the p-value (dividing by 100) would be 0.0139. (See Chapter 14 for more on p-value calculations.) This p-value of 0.0139 (1.39%) is quite a bit less than 0.05 (5%). That means your sample results would be unusual if the claim (of 11 minutes) were true. So, reject the claim (μ = 11 minutes) by rejecting Hₒ, and then accept Hₐ (μ > 11 minutes).
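The table lookup for a greater-than alternative can also be done in software. This sketch uses `NormalDist` from Python's standard `statistics` module in place of the Z-table; the function name `right_tail_p_value` is an assumption made for illustration.

```python
from statistics import NormalDist


def right_tail_p_value(z: float) -> float:
    """Area to the right of z under the standard normal (Z-) distribution."""
    return 1.0 - NormalDist().cdf(z)   # 100% minus the percentile, as a probability


# Test statistic rounded to 2.2, as in the Dr. Ruth example.
p_value = right_tail_p_value(2.2)
print(f"{p_value:.4f}")        # 0.0139
print(p_value < 0.05)          # True, so reject the null hypothesis at the 5% cutoff
```

If you plug in the unrounded test statistic instead of 2.2, the p-value shifts slightly, but it stays well under 0.05.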
Your conclusion: According to this (hypothetical) sample, Dr. Ruth's claim of 11 minutes is a bit low; the actual average is greater than 11 minutes per day. See Chapter 14 for more on hypothesis test calculations and conclusions.
The test for one population proportion is used when the variable is categorical (for example, gender, political party, support/oppose, and so on) and only one population or group is being studied (for example, all U.S. citizens or all registered voters). The test is looking at the proportion (p) of individuals in the population who have a certain characteristic, for example, the proportion of people who carry cellphones. The null hypothesis is Hₒ: p = p₀, where p₀ is a certain claimed value. For example, if the claim is that 20% of people carry cellphones, p₀ is 0.20. The alternative hypothesis is one of the following: p > p₀, p < p₀, or p ≠ p₀. (See Chapter 14 for more on alternative hypotheses.)
The formula for the test statistic for a single proportion is $\frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$. To calculate it, do the following:
1. Calculate the sample proportion, p̂, by taking the number of people in the sample who have the characteristic of interest (for example, the number of people in the sample carrying cellphones) and dividing that by n, the sample size.
2. Take p̂ minus p₀.
3. Calculate the standard error: $\sqrt{p_0 (1 - p_0) / n}$. Save your answer.
4. Divide your result from Step 2 by your result from Step 3.
To interpret the test statistic, look up your test statistic on the standard normal distribution (see Table 8-1 in Chapter 8) and calculate the p-value (see Chapter 14 for more on p-value calculations).
For example, suppose Cavifree toothpaste claims that four out of five dentists recommend Cavifree toothpaste to their patients. In this case, the population is all dentists, and p is the proportion of all dentists who recommend Cavifree to their patients. The claim is that p is equal to "four out of five," which means that p₀ is 4 ÷ 5 = 0.80. You suspect that the proportion is actually less than 0.80. Your hypotheses are Hₒ: p = 0.80 versus Hₐ: p < 0.80. Suppose that 150 out of 200 dental patients sampled received a recommendation for Cavifree.
To find the test statistic, start with the sample proportion: p̂ = 150 ÷ 200 = 0.75. Also, p₀ = 0.80 and n = 200.
Take 0.75 − 0.80 = −0.05.
Next, the standard error is the square root of [(0.80 × [1 − 0.80]) ÷ 200] = the square root of (0.16 ÷ 200) = the square root of 0.0008 = 0.028.
The test statistic is −0.05 divided by 0.028, which is −1.79, rounded to −1.8.
This means that your sample results are 1.8 standard errors below the claimed value for the population.
How often would you expect to get results like this if Hₒ were true? The percentage chance of being at or beyond (in this case, to the left of) −1.8 is 3.59%. (Look up −1.8 in Table 8-1 in Chapter 8 and use the corresponding percentile, because Hₐ is a less-than hypothesis. See Chapter 14 for more on this.) Now divide by 100 to get your p-value, which is 0.0359. Because the p-value is less than 0.05, you have enough evidence to reject Hₒ.
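For a less-than alternative, the p-value is simply the area to the left of the test statistic, so no subtraction from 100% is needed. Here is a minimal sketch using `NormalDist` from Python's standard `statistics` module in place of the Z-table; the function name `left_tail_p_value` is an assumption made for illustration.

```python
from statistics import NormalDist


def left_tail_p_value(z: float) -> float:
    """Area to the left of z under the standard normal (Z-) distribution."""
    return NormalDist().cdf(z)   # the percentile itself, as a probability


# Test statistic rounded to -1.8, as in the Cavifree example.
p_value = left_tail_p_value(-1.8)
print(f"{p_value:.4f}")        # 0.0359
print(p_value < 0.05)          # True, so reject the null hypothesis at the 5% cutoff
```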
According to your sample, the claim of four out of five (80% of) dentists recommending Cavifree toothpaste is not true; the actual percentage of recommendations is less than that.
HEADS UP | Most hypothesis tests involving proportions are done using samples that are quite large, given that they're most often based on surveys, so you rarely encounter a situation in which a very small sample is used. For information on how to calculate the |