Hypothesis tests about the variance

by Marco Taboga , PhD

This page explains how to perform hypothesis tests about the variance of a normal distribution, called Chi-square tests.

We analyze two different situations:

when the mean of the distribution is known;

when it is unknown.

Depending on the situation, the Chi-square statistic used in the test has a different distribution.

At the end of the page, we propose some solved exercises.

Table of contents

  • Normal distribution with known mean
  • The null hypothesis
  • The test statistic
  • The critical region
  • The decision
  • The power function
  • The size of the test
  • How to choose the critical value
  • Normal distribution with unknown mean
  • Solved exercises

The assumptions are the same as those previously made in the lecture on confidence intervals for the variance .

The sample is drawn from a normal distribution whose mean is known. To test the null hypothesis that the variance equals a given value σ₀², we use the statistic obtained by dividing the sum of the squared deviations of the observations from the known mean by σ₀²; under the null hypothesis, this statistic has a Chi-square distribution with n degrees of freedom.

A test of hypothesis based on it is called a Chi-square test .

The null is rejected when the statistic falls in the critical region; otherwise the null is not rejected. The critical region is determined by a critical value, chosen so that the test has the desired size. We explain how to do this in the page on critical values .

We now relax the assumption that the mean of the distribution is known. In this case, the test statistic is (n − 1)s²/σ₀², where s² is the adjusted sample variance; under the null hypothesis, it has a Chi-square distribution with n − 1 degrees of freedom.

See the comments on the choice of the critical value made for the case of known mean.

Below you can find some exercises with explained solutions.

Exercise 1

Suppose that we observe 40 independent realizations of a normal random variable, and we run a Chi-square test of the null hypothesis that the variance is equal to 1.

Exercise 2

Make the same assumptions as in Exercise 1 above. If the unadjusted sample variance is equal to 0.9, is the null hypothesis rejected?
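For readers who want to check Exercise 2 numerically, here is a minimal R sketch (our own, assuming the mean is unknown, so the statistic has n − 1 degrees of freedom; with the unadjusted sample variance, the statistic is n times it divided by the hypothesized variance):

```r
# Hedged sketch for Exercise 2: n = 40, unadjusted sample variance 0.9,
# null hypothesis sigma^2 = 1 (two-sided alternative assumed).
n <- 40
chisq_obs <- n * 0.9 / 1   # = 36, with n - 1 = 39 degrees of freedom
p_two <- 2 * min(pchisq(chisq_obs, df = n - 1),
                 pchisq(chisq_obs, df = n - 1, lower.tail = FALSE))
p_two   # well above common significance levels, so the null is not rejected
```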

How to cite

Please cite as:

Taboga, Marco (2021). "Hypothesis tests about the variance", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/hypothesis-testing-variance.


Hypothesis Testing - Analysis of Variance (ANOVA)

Lisa Sullivan, PhD

Professor of Biostatistics

Boston University School of Public Health


Introduction

This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific test considered here is called analysis of variance (ANOVA) and is a test of hypothesis that is appropriate to compare means of a continuous variable in two or more independent comparison groups. For example, in some clinical trials there are more than two comparison groups. In a clinical trial to evaluate a new medication for asthma, investigators might compare an experimental medication to a placebo and to a standard treatment (i.e., a medication currently being used). In an observational study such as the Framingham Heart Study, it might be of interest to compare mean blood pressure or mean cholesterol levels in persons who are underweight, normal weight, overweight and obese.  

The technique to test for a difference in more than two independent means is an extension of the two independent samples procedure discussed previously which applies when there are exactly two independent comparison groups. The ANOVA technique applies when there are two or more than two independent groups. The ANOVA procedure is used to compare the means of the comparison groups and is conducted using the same five step approach used in the scenarios discussed in previous sections. Because there are more than two groups, however, the computation of the test statistic is more involved. The test statistic must take into account the sample sizes, sample means and sample standard deviations in each of the comparison groups.

If one is examining the means observed among, say, three groups, it might be tempting to perform three separate group-to-group comparisons, but this approach is incorrect because each of these comparisons fails to take into account the total data, and it increases the likelihood of incorrectly concluding that there are statistically significant differences, since each comparison adds to the probability of a type I error. Analysis of variance avoids these problems by asking a more global question, i.e., whether there are significant differences among the groups, without addressing differences between any two groups in particular (although there are additional tests that can do this if the analysis of variance indicates that there are differences among the groups).

The fundamental strategy of ANOVA is to systematically examine variability within groups being compared and also examine variability among the groups being compared.

Learning Objectives

After completing this module, the student will be able to:

  • Perform analysis of variance by hand
  • Appropriately interpret results of analysis of variance tests
  • Distinguish between one and two factor analysis of variance tests
  • Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples

The ANOVA Approach

Consider an example with four independent groups and a continuous outcome measure. The independent groups might be defined by a particular characteristic of the participants such as BMI (e.g., underweight, normal weight, overweight, obese) or by the investigator (e.g., randomizing participants to one of four competing treatments, call them A, B, C and D). Suppose that the outcome is systolic blood pressure, and we wish to test whether there is a statistically significant difference in mean systolic blood pressures among the four groups. The sample data are organized as follows:

 

  | Group 1 | Group 2 | Group 3 | Group 4
Sample size | n₁ | n₂ | n₃ | n₄
Sample mean | X̄₁ | X̄₂ | X̄₃ | X̄₄
Sample standard deviation | s₁ | s₂ | s₃ | s₄

The hypotheses of interest in an ANOVA are as follows:

  • H₀: μ₁ = μ₂ = μ₃ = ... = μₖ
  • H₁: Means are not all equal.

where k = the number of independent comparison groups.

In this example, the hypotheses are:

  • H₀: μ₁ = μ₂ = μ₃ = μ₄
  • H₁: The means are not all equal.

The null hypothesis in ANOVA is always that there is no difference in means. The research or alternative hypothesis is always that the means are not all equal and is usually written in words rather than in mathematical symbols. The research hypothesis captures any difference in means and includes, for example, the situation where all four means are unequal, where one is different from the other three, where two are different, and so on. The alternative hypothesis, as shown above, captures all possible situations other than equality of all means specified in the null hypothesis.

Test Statistic for ANOVA

The test statistic for testing H₀: μ₁ = μ₂ = ... = μₖ is:

F = MSB/MSE

and the critical value is found in a table of probability values for the F distribution with degrees of freedom df₁ = k-1, df₂ = N-k. The table can be found in "Other Resources" on the left side of the pages.

NOTE: The test statistic F assumes equal variability in the k populations (i.e., the population variances are equal, or σ₁² = σ₂² = ... = σₖ²). This means that the outcome is equally variable in each of the comparison populations. This assumption is the same as that assumed for appropriate use of the test statistic to test equality of two independent means. It is possible to assess the likelihood that the assumption of equal variances is true and the test can be conducted in most statistical computing packages. If the variability in the k comparison groups is not similar, then alternative techniques must be used.

The F statistic is computed by taking the ratio of what is called the "between treatment" variability to the "residual or error" variability. This is where the name of the procedure originates. In analysis of variance we are testing for a difference in means (H 0 : means are all equal versus H 1 : means are not all equal) by evaluating variability in the data. The numerator captures between treatment variability (i.e., differences among the sample means) and the denominator contains an estimate of the variability in the outcome. The test statistic is a measure that allows us to assess whether the differences among the sample means (numerator) are more than would be expected by chance if the null hypothesis is true. Recall in the two independent sample test, the test statistic was computed by taking the ratio of the difference in sample means (numerator) to the variability in the outcome (estimated by Sp).  

The decision rule for the F test in ANOVA is set up in a similar way to decision rules we established for t tests. The decision rule again depends on the level of significance and the degrees of freedom. The F statistic has two degrees of freedom. These are denoted df 1 and df 2 , and called the numerator and denominator degrees of freedom, respectively. The degrees of freedom are defined as follows:

df 1 = k-1 and df 2 =N-k,

where k is the number of comparison groups and N is the total number of observations in the analysis. If the null hypothesis is true, the between treatment variation (numerator) will not exceed the residual or error variation (denominator) and the F statistic will be small. If the null hypothesis is false, then the F statistic will be large. The rejection region for the F test is always in the upper (right-hand) tail of the distribution as shown below.

Rejection Region for F Test with α = 0.05, df₁ = 3 and df₂ = 36 (k=4, N=40)

Graph of rejection region for the F statistic with alpha=0.05

For the scenario depicted here, the decision rule is: Reject H 0 if F > 2.87.
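These critical values can be reproduced with R's quantile function for the F distribution; a minimal sketch (our own):

```r
# Upper-tail 5% critical values of the F distribution
qf(0.95, df1 = 3, df2 = 36)   # ~2.87, the scenario depicted above
qf(0.95, df1 = 3, df2 = 16)   # ~3.24, used in the worked example below
```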

The ANOVA Procedure

We will next illustrate the ANOVA procedure using the five step approach. Because the computation of the test statistic is involved, the computations are often organized in an ANOVA table. The ANOVA table breaks down the components of variation in the data into variation between treatments and error or residual variation. Statistical computing packages also produce ANOVA tables as part of their standard output for ANOVA, and the ANOVA table is set up as follows: 

Source of Variation | Sums of Squares (SS) | Degrees of Freedom (df) | Mean Squares (MS) | F
Between Treatments | SSB | k-1 | MSB = SSB/(k-1) | F = MSB/MSE
Error (or Residual) | SSE | N-k | MSE = SSE/(N-k) |
Total | SST | N-1 | |

where  

  • X = individual observation,
  • k = the number of treatments or independent comparison groups, and
  • N = total number of observations or total sample size.

The ANOVA table above is organized as follows.

  • The first column is entitled "Source of Variation" and delineates the between treatment and error or residual variation. The total variation is the sum of the between treatment and error variation.
  • The second column is entitled "Sums of Squares (SS)" . The between treatment sums of squares is

SSB = Σⱼ nⱼ(X̄ⱼ − X̄)²

and is computed by summing the squared differences between each treatment (or group) mean and the overall mean. The squared differences are weighted by the sample sizes per group (nⱼ). The error sums of squares is:

SSE = Σⱼ Σᵢ (Xᵢⱼ − X̄ⱼ)²

and is computed by summing the squared differences between each observation and its group mean (i.e., the squared differences between each observation in group 1 and the group 1 mean, the squared differences between each observation in group 2 and the group 2 mean, and so on). The double summation (ΣΣ) indicates summation of the squared differences within each treatment and then summation of these totals across treatments to produce a single value. (This will be illustrated in the following examples). The total sums of squares is:

SST = Σⱼ Σᵢ (Xᵢⱼ − X̄)²

and is computed by summing the squared differences between each observation and the overall sample mean. In an ANOVA, data are organized by comparison or treatment groups. If all of the data were pooled into a single sample, SST would reflect the numerator of the sample variance computed on the pooled or total sample. SST does not figure into the F statistic directly. However, SST = SSB + SSE, thus if two sums of squares are known, the third can be computed from the other two.

  • The third column contains degrees of freedom . The between treatment degrees of freedom is df 1 = k-1. The error degrees of freedom is df 2 = N - k. The total degrees of freedom is N-1 (and it is also true that (k-1) + (N-k) = N-1).
  • The fourth column contains "Mean Squares (MS)" which are computed by dividing sums of squares (SS) by degrees of freedom (df), row by row. Specifically, MSB=SSB/(k-1) and MSE=SSE/(N-k). Dividing SST by (N-1) produces the variance of the total sample. The F statistic is in the rightmost column of the ANOVA table and is computed by taking the ratio of MSB/MSE.

A clinical trial is run to compare weight loss programs and participants are randomly assigned to one of the comparison programs and are counseled on the details of the assigned program. Participants follow the assigned program for 8 weeks. The outcome of interest is weight loss, defined as the difference in weight measured at the start of the study (baseline) and weight measured at the end of the study (8 weeks), measured in pounds.  

Three popular weight loss programs are considered. The first is a low calorie diet. The second is a low fat diet and the third is a low carbohydrate diet. For comparison purposes, a fourth group is considered as a control group. Participants in the fourth group are told that they are participating in a study of healthy behaviors with weight loss only one component of interest. The control group is included here to assess the placebo effect (i.e., weight loss due to simply participating in the study). A total of twenty patients agree to participate in the study and are randomly assigned to one of the four diet groups. Weights are measured at baseline and patients are counseled on the proper implementation of the assigned diet (with the exception of the control group). After 8 weeks, each patient's weight is again measured and the difference in weights is computed by subtracting the 8 week weight from the baseline weight. Positive differences indicate weight losses and negative differences indicate weight gains. For interpretation purposes, we refer to the differences in weights as weight losses and the observed weight losses are shown below.

Low Calorie | Low Fat | Low Carbohydrate | Control
8 | 2 | 3 | 2
9 | 4 | 5 | 2
6 | 3 | 4 | -1
7 | 5 | 2 | 0
3 | 1 | 3 | 3

Is there a statistically significant difference in the mean weight loss among the four diets?  We will run the ANOVA using the five-step approach.

  • Step 1. Set up hypotheses and determine level of significance

H₀: μ₁ = μ₂ = μ₃ = μ₄;  H₁: Means are not all equal;  α = 0.05

  • Step 2. Select the appropriate test statistic.  

The test statistic is the F statistic for ANOVA, F=MSB/MSE.

  • Step 3. Set up decision rule.  

The appropriate critical value can be found in a table of probabilities for the F distribution (see "Other Resources"). In order to determine the critical value of F we need degrees of freedom, df₁ = k-1 and df₂ = N-k. In this example, df₁ = k-1 = 4-1 = 3 and df₂ = N-k = 20-4 = 16. The critical value is 3.24 and the decision rule is as follows: Reject H₀ if F > 3.24.

  • Step 4. Compute the test statistic.  

To organize our computations we complete the ANOVA table. In order to compute the sums of squares we must first compute the sample means for each group and the overall mean based on the total sample.  

 

  | Low Calorie | Low Fat | Low Carbohydrate | Control
n | 5 | 5 | 5 | 5
Group mean | 6.6 | 3.0 | 3.4 | 1.2

We can now compute

SSB = Σⱼ nⱼ(X̄ⱼ − X̄)²

where the overall mean is X̄ = 71/20 = 3.55. So, in this case:

SSB = 5(6.6 − 3.55)² + 5(3.0 − 3.55)² + 5(3.4 − 3.55)² + 5(1.2 − 3.55)² = 75.8

Next we compute SSE.

SSE requires computing the squared differences between each observation and its group mean. We will compute SSE in parts. For the participants in the low calorie diet:  

X | X − 6.6 | (X − 6.6)²
8 | 1.4 | 2.0
9 | 2.4 | 5.8
6 | -0.6 | 0.4
7 | 0.4 | 0.2
3 | -3.6 | 13.0
Totals | 0 | 21.4

For the participants in the low fat diet:  

X | X − 3.0 | (X − 3.0)²
2 | -1.0 | 1.0
4 | 1.0 | 1.0
3 | 0.0 | 0.0
5 | 2.0 | 4.0
1 | -2.0 | 4.0
Totals | 0 | 10.0

For the participants in the low carbohydrate diet:  

X | X − 3.4 | (X − 3.4)²
3 | -0.4 | 0.2
5 | 1.6 | 2.6
4 | 0.6 | 0.4
2 | -1.4 | 2.0
3 | -0.4 | 0.2
Totals | 0 | 5.4

For the participants in the control group:

X | X − 1.2 | (X − 1.2)²
2 | 0.8 | 0.6
2 | 0.8 | 0.6
-1 | -2.2 | 4.8
0 | -1.2 | 1.4
3 | 1.8 | 3.2
Totals | 0 | 10.6

Therefore, SSE = 21.4 + 10.0 + 5.4 + 10.6 = 47.4.

We can now construct the ANOVA table .

Source of Variation | Sums of Squares (SS) | Degrees of Freedom (df) | Mean Squares (MS) | F
Between Treatments | 75.8 | 4-1=3 | 75.8/3=25.3 | 25.3/3.0=8.43
Error (or Residual) | 47.4 | 20-4=16 | 47.4/16=3.0 |
Total | 123.2 | 20-1=19 | |

  • Step 5. Conclusion.  

We reject H 0 because 8.43 > 3.24. We have statistically significant evidence at α=0.05 to show that there is a difference in mean weight loss among the four diets.    

ANOVA is a test that provides a global assessment of a statistical difference in more than two independent means. In this example, we find that there is a statistically significant difference in mean weight loss among the four diets considered. In addition to reporting the results of the statistical test of hypothesis (i.e., that there is a statistically significant difference in mean weight losses at α=0.05), investigators should also report the observed sample means to facilitate interpretation of the results. In this example, participants in the low calorie diet lost an average of 6.6 pounds over 8 weeks, as compared to 3.0 and 3.4 pounds in the low fat and low carbohydrate groups, respectively. Participants in the control group lost an average of 1.2 pounds which could be called the placebo effect because these participants were not participating in an active arm of the trial specifically targeted for weight loss. Are the observed weight losses clinically meaningful?

Another ANOVA Example

Calcium is an essential mineral that regulates the heart, is important for blood clotting and for building healthy bones. The National Osteoporosis Foundation recommends a daily calcium intake of 1000-1200 mg/day for adult men and women. While calcium is contained in some foods, most adults do not get enough calcium in their diets and take supplements. Unfortunately some of the supplements have side effects such as gastric distress, making them difficult for some patients to take on a regular basis.  

 A study is designed to test whether there is a difference in mean daily calcium intake in adults with normal bone density, adults with osteopenia (a low bone density which may lead to osteoporosis) and adults with osteoporosis. Adults 60 years of age with normal bone density, osteopenia and osteoporosis are selected at random from hospital records and invited to participate in the study. Each participant's daily calcium intake is measured based on reported food intake and supplements. The data are shown below.   

Normal Bone Density | Osteopenia | Osteoporosis
1200 | 1000 | 890
1000 | 1100 | 650
980 | 700 | 1100
900 | 800 | 900
750 | 500 | 400
800 | 700 | 350

Is there a statistically significant difference in mean calcium intake in patients with normal bone density as compared to patients with osteopenia and osteoporosis? We will run the ANOVA using the five-step approach.

H₀: μ₁ = μ₂ = μ₃;  H₁: Means are not all equal;  α = 0.05

In order to determine the critical value of F we need degrees of freedom, df 1 =k-1 and df 2 =N-k.   In this example, df 1 =k-1=3-1=2 and df 2 =N-k=18-3=15. The critical value is 3.68 and the decision rule is as follows: Reject H 0 if F > 3.68.

To organize our computations we will complete the ANOVA table. In order to compute the sums of squares we must first compute the sample means for each group and the overall mean.  

  | Normal Bone Density | Osteopenia | Osteoporosis
n | 6 | 6 | 6
Group mean | 938.33 | 800.00 | 715.00

If we pool all N=18 observations, the overall mean is 817.8.

We can now compute:

SSB = Σⱼ nⱼ(X̄ⱼ − X̄)²

Substituting:

SSB = 6(938.33 − 817.78)² + 6(800.00 − 817.78)² + 6(715.00 − 817.78)² ≈ 152,477.7

SSE requires computing the squared differences between each observation and its group mean. We will compute SSE in parts. For the participants with normal bone density:

X | X − 938.33 | (X − 938.33)²
1200 | 261.67 | 68,486.9
1000 | 61.67 | 3,806.9
980 | 41.67 | 1,738.9
900 | -38.33 | 1,466.9
750 | -188.33 | 35,456.9
800 | -138.33 | 19,126.9
Total | 0 | 130,083.3

For participants with osteopenia:

X | X − 800 | (X − 800)²
1000 | 200 | 40,000
1100 | 300 | 90,000
700 | -100 | 10,000
800 | 0 | 0
500 | -300 | 90,000
700 | -100 | 10,000
Total | 0 | 240,000

For participants with osteoporosis:

X | X − 715 | (X − 715)²
890 | 175 | 30,625
650 | -65 | 4,225
1100 | 385 | 148,225
900 | 185 | 34,225
400 | -315 | 99,225
350 | -365 | 133,225
Total | 0 | 449,750

Therefore, SSE = 130,083.3 + 240,000 + 449,750 = 819,833.3.

We can now construct the ANOVA table:

Source of Variation | Sums of Squares (SS) | Degrees of Freedom (df) | Mean Squares (MS) | F
Between Treatments | 152,477.7 | 2 | 76,238.6 | 1.395
Error (or Residual) | 819,833.3 | 15 | 54,655.5 |
Total | 972,311.0 | 17 | |

We do not reject H₀ because 1.395 < 3.68. We do not have statistically significant evidence at α = 0.05 to show that there is a difference in mean calcium intake in patients with normal bone density as compared to osteopenia and osteoporosis. Are the differences in mean calcium intake clinically meaningful? If so, what might account for the lack of statistical significance?

One-Way ANOVA in R

The video below by Mike Marin demonstrates how to perform analysis of variance in R. It also covers some other statistical issues, but the initial part of the video will be useful to you.
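As a complement to the video, here is a minimal R sketch (our own; variable names are our choices) that runs the weight-loss example from above with aov():

```r
# One-way ANOVA on the weight-loss data from the first example
weight_loss <- c(8, 9, 6, 7, 3,     # low calorie
                 2, 4, 3, 5, 1,     # low fat
                 3, 5, 4, 2, 3,     # low carbohydrate
                 2, 2, -1, 0, 3)    # control
diet <- factor(rep(c("LowCal", "LowFat", "LowCarb", "Control"), each = 5))
summary(aov(weight_loss ~ diet))
# df = 3 and 16 as in the hand computation; R keeps full precision
# (SSB = 75.75, SSE = 47.2, F ~ 8.56), so the F statistic differs
# slightly from the rounded hand result of 8.43. The conclusion is the same.
```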

Two-Factor ANOVA

The ANOVA tests described above are called one-factor ANOVAs. There is one treatment or grouping factor with k > 2 levels and we wish to compare the means across the different categories of this factor. The factor might represent different diets, different classifications of risk for disease (e.g., osteoporosis), different medical treatments, different age groups, or different racial/ethnic groups. There are situations where it may be of interest to compare means of a continuous outcome across two or more factors. For example, suppose a clinical trial is designed to compare five different treatments for joint pain in patients with osteoarthritis. Investigators might also hypothesize that there are differences in the outcome by sex. This is an example of a two-factor ANOVA where the factors are treatment (with 5 levels) and sex (with 2 levels). In the two-factor ANOVA, investigators can assess whether there are differences in means due to the treatment, by sex or whether there is a difference in outcomes by the combination or interaction of treatment and sex. Higher order ANOVAs are conducted in the same way as one-factor ANOVAs presented here and the computations are again organized in ANOVA tables with more rows to distinguish the different sources of variation (e.g., between treatments, between men and women). The following example illustrates the approach.

Consider the clinical trial outlined above in which three competing treatments for joint pain are compared in terms of their mean time to pain relief in patients with osteoarthritis. Because investigators hypothesize that there may be a difference in time to pain relief in men versus women, they randomly assign 15 participating men to one of the three competing treatments and randomly assign 15 participating women to one of the three competing treatments (i.e., stratified randomization). Participating men and women do not know to which treatment they are assigned. They are instructed to take the assigned medication when they experience joint pain and to record the time, in minutes, until the pain subsides. The data (times to pain relief) are shown below and are organized by the assigned treatment and sex of the participant.

Table of Time to Pain Relief by Treatment and Sex

Treatment | Male | Female
A | 12 | 21
A | 15 | 19
A | 16 | 18
A | 17 | 24
A | 14 | 25
B | 14 | 21
B | 17 | 20
B | 19 | 23
B | 20 | 27
B | 17 | 25
C | 25 | 37
C | 27 | 34
C | 29 | 36
C | 24 | 26
C | 22 | 29

The analysis in two-factor ANOVA is similar to that illustrated above for one-factor ANOVA. The computations are again organized in an ANOVA table, but the total variation is partitioned into that due to the main effect of treatment, the main effect of sex and the interaction effect. The results of the analysis are shown below (and were generated with a statistical computing package - here we focus on interpretation). 
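For readers following along in R, a minimal sketch (our own; names are our choices) that reproduces this two-factor analysis from the raw data in the table above:

```r
# Two-factor ANOVA with interaction on the site-1 pain-relief data
time <- c(12, 15, 16, 17, 14,  21, 19, 18, 24, 25,   # treatment A: men, then women
          14, 17, 19, 20, 17,  21, 20, 23, 27, 25,   # treatment B
          25, 27, 29, 24, 22,  37, 34, 36, 26, 29)   # treatment C
treatment <- factor(rep(c("A", "B", "C"), each = 10))
sex <- factor(rep(rep(c("Male", "Female"), each = 5), times = 3))
summary(aov(time ~ treatment * sex))
# Matches the table below: treatment F ~ 34.8, sex F ~ 33.5,
# interaction F ~ 0.1 with p ~ 0.91
```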

 ANOVA Table for Two-Factor ANOVA

Source of Variation | Sums of Squares (SS) | Degrees of Freedom (df) | Mean Squares (MS) | F | P-Value
Model | 967.0 | 5 | 193.4 | 20.7 | 0.0001
Treatment | 651.5 | 2 | 325.7 | 34.8 | 0.0001
Sex | 313.6 | 1 | 313.6 | 33.5 | 0.0001
Treatment * Sex | 1.9 | 2 | 0.9 | 0.1 | 0.9054
Error (or Residual) | 224.4 | 24 | 9.4 | |
Total | 1191.4 | 29 | | |

There are 4 statistical tests in the ANOVA table above. The first test is an overall test to assess whether there is a difference among the 6 cell means (cells are defined by treatment and sex). The F statistic is 20.7 and is highly statistically significant with p=0.0001. When the overall test is significant, focus then turns to the factors that may be driving the significance (in this example, treatment, sex or the interaction between the two). The next three statistical tests assess the significance of the main effect of treatment, the main effect of sex and the interaction effect. In this example, there is a highly significant main effect of treatment (p=0.0001) and a highly significant main effect of sex (p=0.0001). The interaction between the two does not reach statistical significance (p=0.91). The table below contains the mean times to pain relief in each of the treatments for men and women (Note that each sample mean is computed on the 5 observations measured under that experimental condition).  

Mean Time to Pain Relief by Treatment and Gender

Treatment | Men | Women
A | 14.8 | 21.4
B | 17.4 | 23.2
C | 25.4 | 32.4

Treatment A appears to be the most efficacious treatment for both men and women. The mean times to relief are lower in Treatment A for both men and women and highest in Treatment C for both men and women. Across all treatments, women report longer times to pain relief (See below).  

Graph of two-factor ANOVA

Notice that there is the same pattern of time to pain relief across treatments in both men and women (treatment effect). There is also a sex effect - specifically, time to pain relief is longer in women in every treatment.  

Suppose that the same clinical trial is replicated in a second clinical site and the following data are observed.

Table - Time to Pain Relief by Treatment and Sex - Clinical Site 2

Treatment | Male | Female
A | 22 | 21
A | 25 | 19
A | 26 | 18
A | 27 | 24
A | 24 | 25
B | 14 | 21
B | 17 | 20
B | 19 | 23
B | 20 | 27
B | 17 | 25
C | 15 | 37
C | 17 | 34
C | 19 | 36
C | 14 | 26
C | 12 | 29

The ANOVA table for the data measured in clinical site 2 is shown below.

Table - Summary of Two-Factor ANOVA - Clinical Site 2

Source of Variation | Sums of Squares (SS) | Degrees of Freedom (df) | Mean Squares (MS) | F | P-Value
Model | 907.0 | 5 | 181.4 | 19.4 | 0.0001
Treatment | 71.5 | 2 | 35.7 | 3.8 | 0.0362
Sex | 313.6 | 1 | 313.6 | 33.5 | 0.0001
Treatment * Sex | 521.9 | 2 | 260.9 | 27.9 | 0.0001
Error (or Residual) | 224.4 | 24 | 9.4 | |
Total | 1131.4 | 29 | | |

Notice that the overall test is significant (F=19.4, p=0.0001), there is a significant treatment effect, sex effect and a highly significant interaction effect. The table below contains the mean times to relief in each of the treatments for men and women.  

Table - Mean Time to Pain Relief by Treatment and Gender - Clinical Site 2

Treatment | Men | Women
A | 24.8 | 21.4
B | 17.4 | 23.2
C | 15.4 | 32.4

Notice that now the differences in mean time to pain relief among the treatments depend on sex. Among men, the mean time to pain relief is highest in Treatment A and lowest in Treatment C. Among women, the reverse is true. This is an interaction effect (see below).  

Graphic display of the results in the preceding table

Notice above that the treatment effect varies depending on sex. Thus, we cannot summarize an overall treatment effect (in men, treatment C is best, in women, treatment A is best).    

When interaction effects are present, some investigators do not examine main effects (i.e., do not test for treatment effect because the effect of treatment depends on sex). This issue is complex and is discussed in more detail in a later module. 

Lecture 14: Hypothesis Test for One Variance

STAT 205: Introduction to Mathematical Statistics

University of British Columbia Okanagan

March 17, 2024

Introduction

We have covered three hypothesis tests for a single sample:

  • Hypothesis test for the mean \(\mu\) with \(\sigma\) known ( \(Z\) - test)
  • Hypothesis tests for the proportion \(p\) ( \(Z\) - test)
  • Hypothesis test for the mean \(\mu\) with \(\sigma\) unknown ( \(t\) -test)

Today we consider hypothesis tests involving the population variance \(\sigma^2\) .

Assumptions: \(X_1, X_2, \dots, X_n\) are i.i.d., together with the normality assumption discussed below.

In Lecture 7 we saw how to construct a confidence interval for \(\sigma^2\) based on the sampling distribution derived in Lecture 8 .

For random samples from normal populations , we know:

\[ \dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \]

where \(S^2 = \frac{\sum_{i = 1}^n (X_i - \bar{X})^2}{n-1}\) is the sample variance and \(\chi^2_{n-1}\) is the Chi-squared distribution with \(n-1\) degrees of freedom.

We may wish to test whether there is evidence to suggest that the population variance differs from some hypothesized value \(\sigma_0^2\) .

As before, we start with a null hypothesis ( \(H_0\) ) that the population variance equals a specified value ( \(\sigma^2 = \sigma_0^2\) )

We test this against the alternative hypothesis \(H_A\) which can either be one-sided ( \(\sigma^2 < \sigma_0^2\) or \(\sigma^2 > \sigma_0^2\) ) or two-sided ( \(\sigma^2 \neq \sigma_0^2\) ).

Test Statistic

Recall that our test statistic is calculated assuming the null hypothesis is true . Hence, if we are testing \(H_0: \sigma^2 = \sigma_0^2\) , the test statistic we use is : \[ \chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2} \] where \(\chi^2 \sim \chi^2_{n-1}\) .

Chi-square distribution

Assumptions

For the following inference procedures to be valid we require:

  • A simple random sample from the population
  • A normally distributed population (very important, even for large sample sizes)

It is important to note that if the population is not approximately normally distributed, the chi-squared distribution may not accurately represent the sampling distribution of the test statistic.
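The sensitivity to non-normality can be seen in a quick simulation (our own illustration, not from the lecture): sampling from a heavy-tailed population whose variance actually equals the hypothesized value still produces far too many rejections.

```r
# Type I error of the two-sided chi-square test under non-normal data.
# A t distribution with 3 df has variance df/(df - 2) = 3, so H0 is true.
set.seed(1)
n <- 50; sigma0_sq <- 3
reject <- replicate(10000, {
  x <- rt(n, df = 3)                          # heavy-tailed, variance 3
  chisq_obs <- (n - 1) * var(x) / sigma0_sq
  chisq_obs > qchisq(0.975, n - 1) || chisq_obs < qchisq(0.025, n - 1)
})
mean(reject)   # substantially above the nominal 0.05
```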

Rejection Regions and \(p\)-values for the chi-square test concerning one variance
Alternative | Reject \(H_0\) if… | \(p\)-value
\(H_A: \sigma^2 < \sigma_0^2\) | \(\chi^2_{\text{obs}} \leq \chi^2_{1-\alpha}\) | Area to the left of \(\chi^2_{\text{obs}}\)
\(H_A: \sigma^2 > \sigma_0^2\) | \(\chi^2_{\text{obs}} \geq \chi^2_{\alpha}\) | Area to the right of \(\chi^2_{\text{obs}}\)
\(H_A: \sigma^2 \neq \sigma_0^2\) | \(\chi^2_{\text{obs}} \geq \chi^2_{\alpha/2}\) or \(\chi^2_{\text{obs}} \leq \chi^2_{1-\alpha/2}\) | Double the smaller of the areas to the left and right of \(\chi^2_{\text{obs}}\)

Critical Region (upper-tailed)


The rejection region associated with an upper-tailed test for the population variance. Note that the critical value will depend on the chosen significance level ( \(\alpha\) ) and the d.f.

Critical Region (lower-tailed)


Critical Region (two-tailed)


Similarly we can find \(p\) -values from Chi-squared tables or R


\(p\) -value for lower-tailed: \[\Pr(\chi^2 < \chi^2_{\text{obs}})\] \(p\) -value for upper-tailed: \[\Pr(\chi^2 > \chi^2_{\text{obs}})\] \(p\) -value for two-tailed:

\[2\cdot \min \{ \Pr(\chi^2 < \chi^2_{\text{obs}}), \Pr(\chi^2 > \chi^2_{\text{obs}})\}\]
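In R, these three p-values come from pchisq(); a minimal sketch (our own, with illustrative values that anticipate Exercise 1 below):

```r
n <- 6
chisq_obs <- 14.45
p_lower <- pchisq(chisq_obs, df = n - 1)                      # lower-tailed
p_upper <- pchisq(chisq_obs, df = n - 1, lower.tail = FALSE)  # upper-tailed
p_two   <- 2 * min(p_lower, p_upper)                          # two-tailed
```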


Exercise 1: Beyond Burger Fat

Beyond Burgers claim to have 18 grams of fat. A random sample of 6 burgers had a mean of 19.45 grams and a standard deviation of 0.85 grams. Suppose that the quality assurance team at the company will only accept a \(\sigma\) of at most 0.5. Use the 0.05 level of significance to test the null hypothesis \(\sigma = 0.5\) against the appropriate alternative.

Distribution of Test Statistic


Under the null hypothesis, the test statistic \(\chi^2 = (n-1)S^2/0.5^2\) follows a chi-square distribution with df = 5.

Critical value


The critical value can be found by determining what value on the chi-square curve with 5 df yields a 5 percent probability in the upper tail (since we are doing an upper-tailed test). In R: qchisq(alpha, df=n-1, lower.tail = FALSE) , which here gives qchisq(0.05, df = 5, lower.tail = FALSE) = 11.07. Verify using the \(\chi^2\) table.

Observed Test Statistic

Compute the observed test statistic, which we denote by \(\chi^2_{\text{obs}}\): \(\chi^2_{\text{obs}} = (6-1)(0.85)^2/(0.5)^2 = 14.45\) .


Since the observed test statistic falls in the rejection region, i.e.  \(\chi^2_{\text{obs}} > \chi^2_{\alpha}\) , we reject the null hypothesis in favour of the alternative.

P-value in R


Alternatively, we could compute the p-value, which in this case is 0.013. Since this is smaller than the alpha-level of 0.05, we reject the null hypothesis in favour of the alternative. Verify using the \(\chi^2\) table.

P-value from tables


Using the chi-square distribution table we can see that our observed test statistic falls between two values. We can use the neighbouring values to approximate our p-value.

Approximate P-value


It is clear from the visualization that \[\begin{align} \Pr(\chi^2_{5} > \chi^2_{0.025}) &> \Pr(\chi^2_{5} > \chi^2_{\text{obs}})\\ \Pr(\chi^2_{5} > \chi^2_{\text{obs}}) &> \Pr(\chi^2_{5} > \chi^2_{0.01}) \end{align}\]

The \(p\) -value, \(\Pr(\chi^2_{5} > 14.45)\) can then be expressed as: \[\begin{align} 0.01 < p\text{-value } < 0.025 \end{align}\]

Since either:

  • the \(p\) -value (0.013) is less than \(\alpha\) = 0.05, OR
  • the observed test statistic ( \(\chi^2_{\text{obs}}\) = 14.45) is larger than the critical value \(\chi^2_{\alpha}\) ,

we reject the null hypothesis in favour of the alternative. More specifically, there is very strong evidence to suggest that the population variance \(\sigma^2\) is greater than \(0.5^2\) .
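The whole exercise can be reproduced in R; a minimal sketch (our own) using the numbers from the problem:

```r
# Exercise 1: H0: sigma = 0.5 vs HA: sigma > 0.5 at alpha = 0.05
n <- 6; s <- 0.85; sigma0 <- 0.5; alpha <- 0.05
chisq_obs <- (n - 1) * s^2 / sigma0^2                       # 14.45
crit <- qchisq(alpha, df = n - 1, lower.tail = FALSE)       # 11.07
p_val <- pchisq(chisq_obs, df = n - 1, lower.tail = FALSE)  # ~0.013
c(statistic = chisq_obs, critical = crit, p.value = p_val)
chisq_obs > crit   # TRUE, so reject H0
```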


11.6 Test of a Single Variance

A test of a single variance assumes that the underlying distribution is normal . The null and alternative hypotheses are stated in terms of the population variance (or population standard deviation). The test statistic is:

χ² = (n − 1)s² / σ²

where:

  • n = the total number of data
  • s 2 = sample variance
  • σ 2 = population variance

You may think of s as the random variable in this test. The number of degrees of freedom is df = n - 1. A test of a single variance may be right-tailed, left-tailed, or two-tailed. Example 11.10 will show you how to set up the null and alternative hypotheses. The null and alternative hypotheses contain statements about the population variance.

Example 11.10

Math instructors are not only interested in how their students do on exams, on average, but how the exam scores vary. To many instructors, the variance (or standard deviation) may be more important than the average.

Suppose a math instructor believes that the standard deviation for his final exam is five points. One of his best students thinks otherwise. The student claims that the standard deviation is more than five points. If the student were to conduct a hypothesis test, what would the null and alternative hypotheses be?

Even though we are given the population standard deviation, we can set up the test using the population variance as follows.

  • H 0 : σ 2 = 5 2
  • H a : σ 2 > 5 2

Try It 11.10

A SCUBA instructor wants to record the collective depths each of his students dives during their checkout. He is interested in how the depths vary, even though everyone should have been at the same depth. He believes the standard deviation is three feet. His assistant thinks the standard deviation is less than three feet. If the instructor were to conduct a test, what would the null and alternative hypotheses be?

Example 11.11

With individual lines at its various windows, a post office finds that the standard deviation for normally distributed waiting times for customers on Friday afternoon is 7.2 minutes. The post office experiments with a single, main waiting line and finds that for a random sample of 25 customers, the waiting times for customers have a standard deviation of 3.5 minutes.

With a significance level of 5%, test the claim that a single line causes lower variation among waiting times (shorter waiting times) for customers .

Since the claim is that a single line causes less variation, this is a test of a single variance. The parameter is the population variance, σ 2 , or the population standard deviation, σ .

Random Variable: The sample standard deviation, s , is the random variable. Let s = standard deviation for the waiting times.

  • H 0 : σ 2 = 7.2 2
  • H a : σ 2 < 7.2 2

The word "less" tells you this is a left-tailed test.

Distribution for the test: χ²₂₄, where:

  • n = the number of customers sampled
  • df = n – 1 = 25 – 1 = 24

Calculate the test statistic:

χ² = (n − 1)s²/σ² = (25 − 1)(3.5)²/7.2² = 5.67

where n = 25, s = 3.5, and σ = 7.2.

Probability statement: p -value = P ( χ 2 < 5.67) = 0.000042

Compare α and the p -value: α = 0.05 ; p -value = 0.000042 ; α > p -value

Make a decision: Since α > p -value, reject H 0 . This means that you reject σ 2 = 7.2 2 . In other words, you do not think the variation in waiting times is 7.2 minutes; you think the variation in waiting times is less.

Conclusion: At a 5% level of significance, from the data, there is sufficient evidence to conclude that a single line causes a lower variation among the waiting times or with a single line, the customer waiting times vary less than 7.2 minutes.

Using the TI-83, 83+, 84, 84+ Calculator

In 2nd DISTR , use 7:χ2cdf . The syntax is (lower, upper, df) for the parameter list. For Example 11.11 , χ2cdf(-1E99,5.67,24) . The p -value = 0.000042.
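The same computation in R, as a minimal sketch (our own):

```r
# Example 11.11: left-tailed chi-square test for one variance
n <- 25; s <- 3.5; sigma0 <- 7.2
chisq_obs <- (n - 1) * s^2 / sigma0^2   # 5.67
pchisq(chisq_obs, df = n - 1)           # p-value ~ 0.000042
```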

Try It 11.11

The FCC conducts broadband speed tests to measure how much data per second passes between a consumer’s computer and the internet. As of August 2012, the standard deviation of Internet speeds across Internet Service Providers (ISPs) was 12.2 percent. Suppose a sample of 15 ISPs is taken, and the standard deviation is 13.2 percent. An analyst claims that the standard deviation of speeds is more than what was reported. State the null and alternative hypotheses, compute the degrees of freedom, the test statistic, sketch the graph of the p -value, and draw a conclusion. Test at the 1% significance level.



Hypothesis Testing | A Step-by-Step Guide with Easy Examples

Published on November 8, 2019 by Rebecca Bevans . Revised on June 22, 2023.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics . It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.

There are 5 main steps in hypothesis testing:

  • State your research hypothesis as a null hypothesis (H₀) and alternate hypothesis (Hₐ or H₁).
  • Collect data in a way designed to test the hypothesis.
  • Perform an appropriate statistical test .
  • Decide whether to reject or fail to reject your null hypothesis.
  • Present the findings in your results and discussion section.

Though the specific details might vary, the procedure you will use when testing a hypothesis will always follow some version of these steps.

Table of contents

  • Step 1: State your null and alternate hypothesis
  • Step 2: Collect data
  • Step 3: Perform a statistical test
  • Step 4: Decide whether to reject or fail to reject your null hypothesis
  • Step 5: Present your findings
  • Frequently asked questions about hypothesis testing

After developing your initial research hypothesis (the prediction that you want to investigate), it is important to restate it as a null (H o ) and alternate (H a ) hypothesis so that you can test it mathematically.

The alternate hypothesis is usually your initial hypothesis that predicts a relationship between variables. The null hypothesis is a prediction of no relationship between the variables you are interested in.

  • H 0 : Men are, on average, not taller than women. H a : Men are, on average, taller than women.


For a statistical test to be valid , it is important to perform sampling and collect data in a way that is designed to test your hypothesis. If your data are not representative, then you cannot make statistical inferences about the population you are interested in.

There are a variety of statistical tests available, but they are all based on the comparison of within-group variance (how spread out the data is within a category) versus between-group variance (how different the categories are from one another).

If the between-group variance is large enough that there is little or no overlap between groups, then your statistical test will reflect that by showing a low p -value . This means it is unlikely that the differences between these groups came about by chance.

Alternatively, if there is high within-group variance and low between-group variance, then your statistical test will reflect that with a high p -value. This means it is likely that any difference you measure between groups is due to chance.

Your choice of statistical test will be based on the type of variables and the level of measurement of your collected data .

In the height example, your statistical test would give you:

  • an estimate of the difference in average height between the two groups.
  • a p -value showing how likely you are to see this difference if the null hypothesis of no difference is true.

Based on the outcome of your statistical test, you will have to decide whether to reject or fail to reject your null hypothesis.

In most cases you will use the p -value generated by your statistical test to guide your decision. And in most cases, your predetermined level of significance for rejecting the null hypothesis will be 0.05 – that is, when there is a less than 5% chance that you would see these results if the null hypothesis were true.

In some cases, researchers choose a more conservative level of significance, such as 0.01 (1%). This minimizes the risk of incorrectly rejecting the null hypothesis ( Type I error ).

The results of hypothesis testing will be presented in the results and discussion sections of your research paper , dissertation or thesis .

In the results section you should give a brief summary of the data and a summary of the results of your statistical test (for example, the estimated difference between group means and associated p -value). In the discussion , you can discuss whether your initial hypothesis was supported by your results or not.

In the formal language of hypothesis testing, we talk about rejecting or failing to reject the null hypothesis. You will probably be asked to do this in your statistics assignments.

However, when presenting research results in academic papers we rarely talk this way. Instead, we go back to our alternate hypothesis (in this case, the hypothesis that men are on average taller than women) and state whether the result of our test did or did not support the alternate hypothesis.

If your null hypothesis was rejected, this result is interpreted as “supported the alternate hypothesis.”

These are superficial differences; you can see that they mean the same thing.

You might notice that we don’t say that we reject or fail to reject the alternate hypothesis . This is because hypothesis testing is not designed to prove or disprove anything. It is only designed to test whether a pattern we measure could have arisen spuriously, or by chance.

If we reject the null hypothesis based on our research (i.e., we find that it is unlikely that the pattern arose by chance), then we can say our test lends support to our hypothesis . But if the pattern does not pass our decision rule, meaning that it could have arisen by chance, then we say the test is inconsistent with our hypothesis .


Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.


Hypothesis Tests for One or Two Variances or Standard Deviations

Chi-Square-tests and F-tests for variance or standard deviation both require that the original population be normally distributed.

Testing a Claim about a Variance or Standard Deviation

To test a claim about the value of the variance or the standard deviation of a population, we use a test statistic that follows a chi-square distribution with $n-1$ degrees of freedom, given by the following formula.

$\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$

The television habits of 30 children were observed. The sample mean was found to be 48.2 hours per week, with a standard deviation of 12.4 hours per week. Test the claim that the standard deviation was at least 16 hours per week.

  • The hypotheses are: $H_0: \sigma = 16$ $H_a: \sigma < 16$
  • We shall choose   $\alpha = 0.05$.
  • The test statistic is   $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{(30-1)12.4^2}{16^2} = 17.418$.
  • The p-value is   $p = \chi^2\text{cdf}(0,17.418,29) = 0.0447$.
  • Since the p-value is less than $\alpha = 0.05$, we reject the null hypothesis and conclude that the variation in television watching was less than 16 hours per week.
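The computations above can be verified in R; a minimal sketch (our own):

```r
# Left-tailed test of H0: sigma = 16 vs Ha: sigma < 16
chisq_obs <- (30 - 1) * 12.4^2 / 16^2   # 17.418
pchisq(chisq_obs, df = 29)              # p-value ~ 0.0447, below alpha = 0.05
```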

Testing the Difference of Two Variances or Two Standard Deviations

Two equal variances would satisfy the equation   $\sigma_1^2 = \sigma_2^2$,   which is equivalent to   $\dfrac{ \sigma_1^2}{\sigma_2^2} = 1$.   Since sample variances are related to chi-square distributions, and the ratio of chi-square distributions is an F-distribution, we can use the F-distribution to test against a null hypothesis of equal variances. Note that this approach does not allow us to test for a particular magnitude of difference between variances or standard deviations.

Given sample sizes of $n_1$ and $n_2$, the test statistic will have   $n_1-1$   and   $n_2-1$   degrees of freedom, and is given by the following formula.

$F = \dfrac{s_1^2}{s_2^2}$

If the larger variance (or standard deviation) is present in the first sample, then the test is right-tailed. Otherwise, the test is left-tailed. Most tables of the F-distribution assume right-tailed tests, but that requirement may not be necessary when using technology.

Samples from two makers of ball bearings are collected, and their diameters (in inches) are measured, with the following results:

  • Acme: $n_1 = 80$, $s_1 = 0.0395$
  • Bigelow: $n_2 = 120$, $s_2 = 0.0428$
  • The hypotheses are: $H_0: \sigma_1 = \sigma_2$ $H_a: \sigma_1 \neq \sigma_2$
  • The test statistic is   $F = \dfrac{s_1^2}{s_2^2} = \dfrac{0.0395^2}{0.0428^2} = 0.8517$.
  • Since the first sample had the smaller standard deviation, this is a left-tailed test. The p-value is   $p = \operatorname{Fcdf}(0,0.8517,79,119) = 0.2232$.
  • There is insufficient evidence to conclude that the diameters of the ball bearings in the two companies have different standard deviations.

If the two samples had been reversed in our computations, we would have obtained the test statistic   $F = 1.1741$,   and performing a right-tailed test, found the p-value   $p = \operatorname{Fcdf}(1.1741,\infty,119,79) = 0.2232$.   Of course, the answer is the same.
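Both versions of the computation are easy to check in R with pf(); a minimal sketch (our own):

```r
# F test of equal variances for the ball-bearing samples
F_obs <- 0.0395^2 / 0.0428^2                # 0.8517
pf(F_obs, df1 = 79, df2 = 119)              # left-tailed p ~ 0.2232
pf(1 / F_obs, df1 = 119, df2 = 79,
   lower.tail = FALSE)                      # same p from the reversed ratio
```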

Module 11: The Chi Square Distribution

Hypothesis Test for Variance

Learning Outcomes

  • Conduct a hypothesis test on one variance and interpret the conclusion in context

Recall: STANDARD DEVIATION AND VARIANCE

The most common measure of variation, or spread, is the standard deviation. The standard deviation is a number that measures how far data values are from their mean.

To calculate the standard deviation, we need to calculate the variance first. The variance is the average of the squares of the deviations (the [latex]x- \overline{x}[/latex] values for a sample, or the [latex]x - μ[/latex] values for a population). The symbol [latex]\sigma ^2[/latex] represents the population variance; the population standard deviation [latex]σ[/latex] is the square root of the population variance. The symbol [latex]s^2[/latex] represents the sample variance; the sample standard deviation [latex]s[/latex] is the square root of the sample variance.

The variance is a squared measure and does not have the same units as the data. Taking the square root solves the problem. The standard deviation measures the spread in the same units as the data.

A test of a single variance assumes that the underlying distribution is normal . The null and alternative hypotheses are stated in terms of the population variance (or population standard deviation). The test statistic is:

[latex]\displaystyle\dfrac{\left(n-1\right)s^2}{\sigma^2}[/latex]

  • [latex]n[/latex] = the total number of data
  • [latex]s^2[/latex] = sample variance
  • [latex]\sigma^2[/latex] = population variance

You may think of [latex]s[/latex] as the random variable in this test. The number of degrees of freedom is [latex]df=n-1[/latex]. A test of a single variance may be right-tailed, left-tailed, or two-tailed.  The example below will show you how to set up the null and alternative hypotheses. The null and alternative hypotheses contain statements about the population variance.

Math instructors are not only interested in how their students do on exams, on average, but how the exam scores vary. To many instructors, the variance (or standard deviation) may be more important than the average.

Suppose a math instructor believes that the standard deviation for his final exam is five points. One of his best students thinks otherwise. The student claims that the standard deviation is more than five points. If the student were to conduct a hypothesis test, what would the null and alternative hypotheses be?

Even though we are given the population standard deviation, we can set up the test using the population variance as follows.

H₀: σ² = 5²;  Hₐ: σ² > 5²

A scuba instructor wants to record the collective depths of each of his students’ dives during their checkout. He is interested in how the depths vary, even though everyone should have been at the same depth. He believes the standard deviation is three feet. His assistant thinks the standard deviation is less than three feet. If the instructor were to conduct a test, what would the null and alternative hypotheses be?

Recall: ORDER OF OPERATIONS

parentheses | exponents | multiplication or division | addition or subtraction
[latex]( \ )[/latex] | [latex]x^2[/latex] | [latex]\times \ \mathrm{or} \ \div[/latex] | [latex]+ \ \mathrm{or} \ -[/latex]

To calculate the test statistic follow the following steps:

1st find the numerator:

Step 1: Calculate [latex](n-1)[/latex] by reading the problem or counting the total number of data points and then subtract [latex]1[/latex].

Step 2: Calculate [latex]s^2[/latex], and find the variance from the sample. This can be given to you in the problem or can be calculated with the following formula described in Module 2.

[latex]s^2= \frac{\sum (x- \overline{x})^2}{n-1}[/latex]

Step 3: Multiply the values you got in Step 1 and Step 2.

Note: if you are performing a test of a single standard deviation, in step 2, calculate the standard deviation, [latex]s[/latex], by taking the square root of the variance.

2nd find the denominator: If you are performing a test of a single variance, read the problem or calculate the population variance with the data. If you are performing a test of a single standard deviation, read the problem or calculate the population standard deviation with the data.

Formula for the Population Variance: [latex]\sigma ^2 = \frac{\sum (x- \mu)^2}{N}[/latex]

Formula for the Population Standard Deviation: [latex]\sigma = \sqrt{\frac{\sum (x- \mu)^2}{N}}[/latex]

3rd take the numerator and divide by the denominator.

With individual lines at its various windows, a post office finds that the standard deviation for normally distributed waiting times for customers on Friday afternoon is 7.2 minutes. The post office experiments with a single, main waiting line and finds that for a random sample of 25 customers, the waiting times for customers have a standard deviation of 3.5 minutes.

With a significance level of 5%, test the claim that a single line causes lower variation among waiting times (shorter waiting times) for customers .

Since the claim is that a single line causes less variation, this is a test of a single variance. The parameter is the population variance, σ 2 , or the population standard deviation, σ .

Random Variable: The sample standard deviation, s , is the random variable. Let s = standard deviation for the waiting times.

H₀: σ² = 7.2²;  Hₐ: σ² < 7.2²

The word “less” tells you this is a left-tailed test.

Distribution for the test:  [latex]\chi^2_{24}[/latex], where:

  • n = the number of customers sampled
  • df = n – 1 = 25 – 1 = 24

Calculate the test statistic:

χ² = [latex]\dfrac{\left ( n-1 \right )s^2}{\sigma^2}[/latex] = [latex]\dfrac{\left ( 25-1 \right )\left ( 3.5 \right )^2}{7.2^2}[/latex] = 5.67

where n = 25, s = 3.5, and σ = 7.2.

(Figure: a nonsymmetrical chi-square curve with 0 and 5.67 marked on the horizontal axis; 5.67 lies to the left of the peak, and the shaded region to its left equals the p-value.)

Probability statement: p-value = [latex]P(\chi^2 < 5.67) = 0.000042[/latex]

Compare α and the p -value:

α = 0.05; p -value = 0.000042; α > p -value

Make a decision: Since α > p-value, reject [latex]H_0[/latex]. This means that you reject [latex]\sigma^2 = 7.2^2[/latex]. In other words, you do not think the variation in waiting times is 7.2 minutes; you think the variation in waiting times is less.

Conclusion: At a 5% level of significance, from the data, there is sufficient evidence to conclude that a single line causes lower variation among the waiting times; that is, with a single line, the customer waiting times vary less than 7.2 minutes.

Using a calculator:

In 2nd DISTR , use 7:χ2cdf . The syntax is (lower, upper, df) for the parameter list. For this example, χ 2 cdf(-1E99,5.67,24) . The p -value = 0.000042.
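The same left-tail area can be obtained without a TI calculator; for example, a one-line sketch with scipy:

```python
from scipy.stats import chi2

# P(chi-square with 24 df is less than 5.67), the left-tail p-value above
p_value = chi2.cdf(5.67, df=24)
print(round(p_value, 6))  # approximately 0.000042
```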

The FCC conducts broadband speed tests to measure how much data per second passes between a consumer’s computer and the internet. As of August 2012, the standard deviation of Internet speeds across Internet Service Providers (ISPs) was 12.2 percent. Suppose a sample of 15 ISPs is taken, and the sample standard deviation is 13.2 percent. An analyst claims that the standard deviation of speeds is more than what was reported. State the null and alternative hypotheses, compute the degrees of freedom and the test statistic, sketch the graph of the p-value, and draw a conclusion. Test at the 1% significance level.
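One way to check your work on this exercise, sketched in Python (the numbers simply restate the problem; the claim of "more than" makes this a right-tailed test):

```python
from scipy.stats import chi2

n, s, sigma0 = 15, 13.2, 12.2       # sample size, sample sd, hypothesized sd
df = n - 1                          # 14 degrees of freedom
chi_sq = df * s**2 / sigma0**2      # test statistic
p_value = chi2.sf(chi_sq, df)       # right-tail area for the "more than" claim
print(df, chi_sq, p_value)          # compare the p-value with alpha = 0.01
```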

  • Introductory Statistics. Authored by : Barbara Illowsky, Susan Dean. Provided by : OpenStax. Located at : https://openstax.org/books/introductory-statistics/pages/1-introduction . License : CC BY: Attribution . License Terms : Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction


12.1 - One Variance

Yeehah again! The theoretical work for developing a hypothesis test for a population variance \(\sigma^2\) is already behind us. Recall that if you have a random sample of size n from a normal population with (unknown) mean \(\mu\) and variance \(\sigma^2\), then:

\(\chi^2=\dfrac{(n-1)S^2}{\sigma^2}\)

follows a chi-square distribution with n −1 degrees of freedom. Therefore, if we're interested in testing the null hypothesis:

\(H_0 \colon \sigma^2=\sigma^2_0\)

against any of the alternative hypotheses:

\(H_A \colon\sigma^2 \neq \sigma^2_0,\quad H_A \colon\sigma^2<\sigma^2_0,\text{ or }H_A \colon\sigma^2>\sigma^2_0\)

we can use the test statistic:

\(\chi^2=\dfrac{(n-1)S^2}{\sigma^2_0}\)

and follow the standard hypothesis testing procedures. Let's take a look at an example.

Example 12-1


A manufacturer of hard safety hats for construction workers is concerned about the mean and the variation of the forces its helmets transmit to wearers when subjected to an external force. The manufacturer has designed the helmets so that the mean force transmitted by the helmets to the workers is 800 pounds (or less) with a standard deviation of less than 40 pounds. Tests were run on a random sample of n = 40 helmets, and the sample mean and sample standard deviation were found to be 825 pounds and 48.5 pounds, respectively.

Do the data provide sufficient evidence, at the \(\alpha = 0.05\) level, to conclude that the population standard deviation exceeds 40 pounds?

We're interested in testing the null hypothesis:

\(H_0 \colon \sigma^2=40^2=1600\)

against the alternative hypothesis:

\(H_A \colon\sigma^2>1600\)

Therefore, the value of the test statistic is:

\(\chi^2=\dfrac{(40-1)48.5^2}{40^2}=57.336\)

Is the test statistic too large for the null hypothesis to be true? Well, the critical value approach would have us finding the threshold value such that the probability of rejecting the null hypothesis if it were true, that is, of committing a Type I error, is small... 0.05, in this case. Using Minitab (or a chi-square probability table), we see that the cutoff value is 54.572.

That is, we reject the null hypothesis in favor of the alternative hypothesis if the test statistic \(\chi^2\) is greater than 54.572. It is. That is, the test statistic falls in the rejection region.

Therefore, we conclude that there is sufficient evidence, at the 0.05 level, to conclude that the population standard deviation exceeds 40.

Of course, the P-value approach yields the same conclusion. In this case, the P-value is the probability that we would observe a chi-square(39) random variable more extreme than 57.336. The P-value is 0.029 (as determined using the chi-square probability calculator in Minitab). Because \(P = 0.029 \le 0.05\), we reject the null hypothesis in favor of the alternative hypothesis.

Do the data provide sufficient evidence, at the \(\alpha = 0.05\) level, to conclude that the population standard deviation differs from 40 pounds?

In this case, we're interested in testing the null hypothesis:

\(H_0 \colon \sigma^2=1600\)

against the alternative hypothesis:

\(H_A \colon\sigma^2 \neq 1600\)

The value of the test statistic remains the same. It is again:

\(\chi^2=\dfrac{(40-1)48.5^2}{40^2}=57.336\)

Now, is the test statistic either too large or too small for the null hypothesis to be true? Well, the critical value approach would have us dividing the significance level \(\alpha = 0.05\) by 2, to get 0.025, and putting one of the halves in the left tail, and the other half in the other tail. Doing so (and using Minitab to get the cutoff values), we get that the lower cutoff value is 23.654 and the upper cutoff value is 58.120.

That is, we reject the null hypothesis in favor of the two-sided alternative hypothesis if the test statistic \(\chi^2\) is either smaller than 23.654 or greater than 58.120. It is not. That is, the test statistic does not fall in the rejection region.

Therefore, we fail to reject the null hypothesis. There is insufficient evidence, at the 0.05 level, to conclude that the population standard deviation differs from 40.

Of course, the P -value approach again yields the same conclusion. In this case, we simply double the P -value we obtained for the one-tailed test yielding a P -value of 0.058:

\(P=2\times P\left(\chi^2_{39}>57.336\right)=2\times 0.029=0.058\)

Because \(P = 0.058 > 0.05\), we fail to reject the null hypothesis in favor of the two-sided alternative hypothesis.

The above example illustrates an important fact, namely, that the conclusion for the one-sided test does not always agree with the conclusion for the two-sided test. If you have reason to believe that the parameter will differ from the null value in a particular direction, then you should conduct the one-sided test.
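For readers who want to reproduce the numbers quoted above, here is a minimal sketch in Python using scipy (the cutoffs and p-values should match the Minitab output to rounding):

```python
from scipy.stats import chi2

n, s, sigma0 = 40, 48.5, 40.0
df = n - 1
stat = df * s**2 / sigma0**2                  # test statistic, 57.336

# One-sided test, H_A: sigma^2 > 1600
upper_cut = chi2.ppf(0.95, df)                # cutoff 54.572; reject if stat exceeds it
p_one = chi2.sf(stat, df)                     # right-tail P-value, about 0.029

# Two-sided test, H_A: sigma^2 != 1600
lo_cut, hi_cut = chi2.ppf([0.025, 0.975], df) # cutoffs 23.654 and 58.120
p_two = 2 * min(chi2.sf(stat, df), chi2.cdf(stat, df))  # doubled tail, about 0.058

print(stat, upper_cut, p_one, lo_cut, hi_cut, p_two)
```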

One Sample Variance Test Calculator

Chi-square test for the variance: the test checks whether an assumed variance \(\sigma_0^2\) is statistically plausible, based on a sample variance \(S^2\) (equivalently, whether an assumed standard deviation \(\sigma_0\) is plausible, based on a sample standard deviation \(S\)). The associated effect size is \(\varphi = \sqrt{\chi^2 / n}\).

Chi-Square test for One Pop. Variance

Instructions: This calculator conducts a Chi-Square test for one population variance (\(\sigma^2\)). Please select the null and alternative hypotheses, type the hypothesized variance, the significance level, the sample variance, and the sample size, and the results of the Chi-Square test will be presented for you:


Chi-Square test for One Population Variance

More about the Chi-Square test for one variance so you can better understand the results provided by this solver: A Chi-Square test for one population variance is a hypothesis test that attempts to make a claim about the population variance (\(\sigma^2\)) based on sample information.

Main Properties of the Chi-Square Distribution

The test, as every other well formed hypothesis test, has two non-overlapping hypotheses, the null and the alternative hypothesis. The null hypothesis is a statement about the population variance which represents the assumption of no effect, and the alternative hypothesis is the complementary hypothesis to the null hypothesis.

The main properties of a one sample Chi-Square test for one population variance are:

  • The distribution of the test statistic is the Chi-Square distribution, with n-1 degrees of freedom
  • The Chi-Square distribution is one of the most important distributions in statistics, together with the normal distribution and the F-distribution
  • Depending on our knowledge about the "no effect" situation, the Chi-Square test can be two-tailed, left-tailed or right-tailed
  • The main principle of hypothesis testing is that the null hypothesis is rejected if the test statistic obtained is sufficiently unlikely under the assumption that the null hypothesis is true
  • The p-value is the probability of obtaining sample results as extreme or more extreme than the sample results obtained, under the assumption that the null hypothesis is true
  • In a hypothesis test there are two types of errors: a Type I error occurs when we reject a true null hypothesis, and a Type II error occurs when we fail to reject a false null hypothesis

Chi-Square test for one variance

Can you use Chi-square for one variable?

Absolutely! The Chi-Square statistic is very versatile: it can be used in a one-way situation (one variable), for example for testing one variance or for a goodness of fit test.

But it can also be used in a two-way situation (two variables), for example for a Chi-Square test of independence.

How do you do a hypothesis test for a single population variance?

The sample variance \(s^2\) has some very interesting distributional properties. In fact, based on how the variance is constructed, we can think of it as a sum of pieces, each of which has a standard normal distribution after standardization and is then squared.

Without getting into much detail, a sum of squared standard normal variables is tightly related to the Chi-Square distribution, as we will see in the next section.

What is the Chi-Square Formula?

The formula for a Chi-Square statistic for testing for one population variance is

\(\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}\)

where \(\sigma^2\) is the hypothesized population variance.

The null hypothesis is rejected when the Chi-Square statistic lies in the rejection region, which is determined by the significance level (\(\alpha\)) and the type of tail (two-tailed, left-tailed or right-tailed).

To compute critical values directly, please go to our Chi-Square critical values calculator
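Alternatively, the critical values can be computed directly; a short sketch with scipy, under the assumed values \(\alpha = 0.05\) and 24 degrees of freedom:

```python
from scipy.stats import chi2

alpha, df = 0.05, 24  # assumed significance level and degrees of freedom

right_cut = chi2.ppf(1 - alpha, df)                 # right-tailed: reject if stat > right_cut
left_cut = chi2.ppf(alpha, df)                      # left-tailed: reject if stat < left_cut
two_cuts = chi2.ppf([alpha / 2, 1 - alpha / 2], df) # two-tailed: reject outside this interval
print(right_cut, left_cut, two_cuts)
```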


    We test whether a widespread pharmaceutical pollutant, fluoxetine (Prozac), disrupts the trade-off between individual-level (co)variation in behavioural, life-history and reproductive traits of freshwater fish. ... (i.e. the proportion of behavioural variance observed within and between individuals of a population) in jumping spiders (Royauté ...