
Statistics By Jim

Making statistics intuitive

Null Hypothesis: Definition, Rejecting & Examples

By Jim Frost

What is a Null Hypothesis?

The null hypothesis in statistics states that there is no difference between groups or no relationship between variables. It is one of two mutually exclusive hypotheses about a population in a hypothesis test.


  • Null Hypothesis H 0 : No effect exists in the population.
  • Alternative Hypothesis H A : The effect exists in the population.

In every study or experiment, researchers assess an effect or relationship. This effect can be the effectiveness of a new drug, building material, or other intervention that has benefits. There is a benefit or connection that the researchers hope to identify. Unfortunately, no effect may exist. In statistics, we call this lack of an effect the null hypothesis. Researchers assume that this notion of no effect is correct until they have enough evidence to suggest otherwise, similar to how a trial presumes innocence.

In this context, the analysts don’t necessarily believe the null hypothesis is correct. In fact, they typically want to reject it because that leads to more exciting findings about an effect or relationship. The new vaccine works!

You can think of it as the default theory that requires sufficiently strong evidence to reject. Like a prosecutor, researchers must collect sufficient evidence to overturn the presumption of no effect. Investigators must work hard to set up a study and a data collection system to obtain evidence that can reject the null hypothesis.

Related post : What is an Effect in Statistics?

Null Hypothesis Examples

Null hypotheses start as research questions that the investigator rephrases as a statement indicating there is no effect or relationship.

  • Does the vaccine prevent infections? Null hypothesis: The vaccine does not affect the infection rate.
  • Does the new additive increase product strength? Null hypothesis: The additive does not affect mean product strength.
  • Does the exercise intervention increase bone mineral density? Null hypothesis: The intervention does not affect bone mineral density.
  • As screen time increases, does test performance decrease? Null hypothesis: There is no relationship between screen time and test performance.

After reading these examples, you might think they’re a bit boring and pointless. However, the key is to remember that the null hypothesis defines the condition that the researchers need to discredit before suggesting an effect exists.

Let’s see how you reject the null hypothesis and get to those more exciting findings!

When to Reject the Null Hypothesis

So, you want to reject the null hypothesis, but how and when can you do that? To start, you’ll need to perform a statistical test on your data. The following is an overview of performing a study that uses a hypothesis test.

The first step is to devise a research question and the appropriate null hypothesis. After that, the investigators need to formulate an experimental design and data collection procedures that will allow them to gather data that can answer the research question. Then they collect the data. For more information about designing a scientific study that uses statistics, read my post 5 Steps for Conducting Studies with Statistics .

After data collection is complete, statistics and hypothesis testing enter the picture. Hypothesis testing takes your sample data and evaluates how consistent they are with the null hypothesis. The p-value is a crucial part of the statistical results because it quantifies how strongly the sample data contradict the null hypothesis.

When the sample data provide sufficient evidence, you can reject the null hypothesis. In a hypothesis test, this process involves comparing the p-value to your significance level .

Rejecting the Null Hypothesis

Reject the null hypothesis when the p-value is less than or equal to your significance level. Your sample data favor the alternative hypothesis, which suggests that the effect exists in the population. For a mnemonic device, remember—when the p-value is low, the null must go!

When you can reject the null hypothesis, your results are statistically significant. Learn more about Statistical Significance: Definition & Meaning .

Failing to Reject the Null Hypothesis

Conversely, when the p-value is greater than your significance level, you fail to reject the null hypothesis. The sample data provide insufficient evidence to conclude that the effect exists in the population. When the p-value is high, the null must fly!
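The decision rule itself is mechanical once you have the p-value. Here is a minimal sketch in Python; the p-values passed in below are placeholders, not results from any real study:

```python
# Minimal sketch of the reject / fail-to-reject decision rule.
# The p-values passed in below are placeholders, not results from a real study.

def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value to the significance level and return the decision."""
    if p_value <= alpha:
        return "Reject the null hypothesis (statistically significant)"
    return "Fail to reject the null hypothesis (not statistically significant)"

print(decide(0.03))  # p is low, the null must go
print(decide(0.20))  # p is high, the null must fly
```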

Note that failing to reject the null is not the same as proving it. For more information about the difference, read my post about Failing to Reject the Null .

That’s a very general look at the process. But I hope you can see how the path to more exciting findings depends on being able to rule out the less exciting null hypothesis that states there’s nothing to see here!

Let’s move on to learning how to write the null hypothesis for different types of effects, relationships, and tests.

Related posts : How Hypothesis Tests Work and Interpreting P-values

How to Write a Null Hypothesis

The null hypothesis varies by the type of statistic and hypothesis test. Remember that inferential statistics use samples to draw conclusions about populations. Consequently, when you write a null hypothesis, it must make a claim about the relevant population parameter . Further, that claim usually indicates that the effect does not exist in the population. Below are typical examples of writing a null hypothesis for various parameters and hypothesis tests.

Related posts : Descriptive vs. Inferential Statistics and Populations, Parameters, and Samples in Inferential Statistics

Group Means

T-tests and ANOVA assess the differences between group means. For these tests, the null hypothesis states that there is no difference between group means in the population. In other words, the experimental conditions that define the groups do not affect the mean outcome. Mu (µ) is the population parameter for the mean, and you’ll need to include it in the statement for this type of study.

For example, an experiment compares the mean bone density changes for a new osteoporosis medication. The control group does not receive the medicine, while the treatment group does. The null states that the mean bone density changes for the control and treatment groups are equal.

  • Null Hypothesis H 0 : Group means are equal in the population: µ 1 = µ 2 , or µ 1 – µ 2 = 0
  • Alternative Hypothesis H A : Group means are not equal in the population: µ 1 ≠ µ 2 , or µ 1 – µ 2 ≠ 0.
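As a rough illustration of how such a comparison could be run, here is a sketch using Python's SciPy library; the bone-density change values are invented for demonstration and are not data from any real trial:

```python
# Hypothetical bone-density changes (%) for control and treatment groups.
# The numbers are invented purely to illustrate the two-sample t-test.
from scipy import stats

control   = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2, 0.1, 0.2]
treatment = [0.9,  0.7, 1.1, 0.6, 1.0,  0.8, 0.5, 0.9]

# Two-tailed test of H0: mu1 = mu2 against HA: mu1 != mu2.
t_stat, p_value = stats.ttest_ind(control, treatment)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

if p_value <= 0.05:
    print("Reject H0: the mean bone-density changes differ between groups.")
else:
    print("Fail to reject H0: insufficient evidence of a difference in means.")
```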

Group Proportions

Proportions tests assess the differences between group proportions. For these tests, the null hypothesis states that there is no difference between group proportions. Again, the experimental conditions did not affect the proportion of events in the groups. P is the population proportion parameter that you’ll need to include.

For example, a vaccine experiment compares the infection rate in the treatment group to the control group. The treatment group receives the vaccine, while the control group does not. The null states that the infection rates for the control and treatment groups are equal.

  • Null Hypothesis H 0 : Group proportions are equal in the population: p 1 = p 2 .
  • Alternative Hypothesis H A : Group proportions are not equal in the population: p 1 ≠ p 2 .
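A sketch of this comparison in Python follows. The infection counts and group sizes are invented for illustration, and the test shown (a two-proportion z-test from statsmodels) is one common way to compare two proportions, not necessarily the analysis used in any particular vaccine trial:

```python
# Hypothetical vaccine trial: infections out of 500 participants per group.
# Counts are invented for illustration only.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

infections = np.array([12, 45])    # treatment group, control group
group_size = np.array([500, 500])

# Two-tailed test of H0: p1 = p2 against HA: p1 != p2.
z_stat, p_value = proportions_ztest(infections, group_size)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
```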

Correlation and Regression Coefficients

Some studies assess the relationship between two continuous variables rather than differences between groups.

In these studies, analysts often use either correlation or regression analysis . For these tests, the null states that there is no relationship between the variables. Specifically, it says that the correlation or regression coefficient is zero. As one variable increases, there is no tendency for the other variable to increase or decrease. Rho (ρ) is the population correlation parameter and beta (β) is the regression coefficient parameter.

For example, a study assesses the relationship between screen time and test performance. The null states that there is no correlation between this pair of variables. As screen time increases, test performance does not tend to increase or decrease.

  • Null Hypothesis H 0 : The correlation in the population is zero: ρ = 0.
  • Alternative Hypothesis H A : The correlation in the population is not zero: ρ ≠ 0.
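Both null hypotheses (ρ = 0 for correlation and β = 0 for the regression slope) can be tested directly from sample data. Here is a small sketch with SciPy; the screen-time and test-score values are invented for illustration:

```python
# Hypothetical screen time (hours/day) and test scores, invented for illustration.
from scipy import stats

screen_time = [1.0, 2.5, 3.0, 4.5, 5.0, 6.5, 7.0, 8.0]
test_score  = [88, 85, 86, 80, 78, 75, 74, 70]

# H0: rho = 0 (no correlation) vs HA: rho != 0
r, p_corr = stats.pearsonr(screen_time, test_score)
print(f"correlation r = {r:.3f}, p = {p_corr:.4f}")

# H0: beta = 0 (no linear relationship) vs HA: beta != 0
reg = stats.linregress(screen_time, test_score)
print(f"regression slope = {reg.slope:.3f}, p = {reg.pvalue:.4f}")
```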

For all these cases, the analysts define the hypotheses before the study. After collecting the data, they perform a hypothesis test to determine whether they can reject the null hypothesis.

The preceding examples are all for two-tailed hypothesis tests. To learn about one-tailed tests and how to write a null hypothesis for them, read my post One-Tailed vs. Two-Tailed Tests .

Related post : Understanding Correlation

Neyman, J.; Pearson, E. S. (1933). On the Problem of the Most Efficient Tests of Statistical Hypotheses. Philosophical Transactions of the Royal Society A, 231 (694–706): 289–337.

Reader Interactions


January 11, 2024 at 2:57 pm

Thanks for the reply.

January 10, 2024 at 1:23 pm

Hi Jim, In your comment you state that equivalence test null and alternate hypotheses are reversed. For hypothesis tests of data fits to a probability distribution, the null hypothesis is that the probability distribution fits the data. Is this correct?

' src=

January 10, 2024 at 2:15 pm

Those are two separate things: equivalence testing and normality tests. But, yes, you’re correct for both.

Hypotheses are switched for equivalence testing. You need to “work” (i.e., collect a large sample of good quality data) to be able to reject the null that the groups are different to be able to conclude they’re the same.

With typical hypothesis tests, if you have low quality data and a low sample size, you’ll fail to reject the null that they’re the same, concluding they’re equivalent. But that’s more a statement about the low quality and small sample size than anything to do with the groups being equal.

So, equivalence testing makes you work to obtain a finding that the groups are the same (at least within some amount you define as a trivial difference).

For normality testing, and other distribution tests, the null states that the data follow the distribution (normal or whatever). If you reject the null, you have sufficient evidence to conclude that your sample data don’t follow the probability distribution. That’s a rare case where you hope to fail to reject the null. And it suffers from the problem I describe above where you might fail to reject the null simply because you have a small sample size. In that case, you’d conclude the data follow the probability distribution but it’s more that you don’t have enough data for the test to register the deviation. In this scenario, if you had a larger sample size, you’d reject the null and conclude it doesn’t follow that distribution.

I don’t know of any equivalence testing type approach for distribution fit tests where you’d need to work to show the data follow a distribution, although I haven’t looked for one either!


February 20, 2022 at 9:26 pm

Is a null hypothesis regularly (always) stated in the negative? “there is no” or “does not”

February 23, 2022 at 9:21 pm

Typically, the null hypothesis includes an equal sign. The null hypothesis states that the population parameter equals a particular value. That value is usually one that represents no effect. In the case of a one-sided hypothesis test, the null still contains an equal sign but it’s “greater than or equal to” or “less than or equal to.” If you wanted to translate the null hypothesis from its native mathematical expression, you could use the expression “there is no effect.” But the mathematical form more specifically states what it’s testing.

It’s the alternative hypothesis that typically contains does not equal.

There are some exceptions. For example, in an equivalence test where the researchers want to show that two things are equal, the null hypothesis states that they’re not equal.

In short, the null hypothesis states the condition that the researchers hope to reject. They need to work hard to set up an experiment and data collection that’ll gather enough evidence to be able to reject the null condition.


February 15, 2022 at 9:32 am

Dear sir, I always read your notes on research methods. Kindly tell me, is there a book available covering all of these? Wonderful. Urgent.


Module 9: Hypothesis Testing With One Sample

Null and Alternative Hypotheses

Learning Outcomes

  • Describe hypothesis testing in general and in practice

The actual test begins by considering two  hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.

H 0 : The null hypothesis: It is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.

H a : The alternative hypothesis : It is a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 .

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision: “reject H 0 ” if the sample information favors the alternative hypothesis, or “do not reject H 0 ” (equivalently, “decline to reject H 0 ”) if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in  H 0 and H a :

H 0 symbols: equal (=), greater than or equal to (≥), less than or equal to (≤)
H a symbols: not equal (≠), greater than (>), less than (<)

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

H 0 : No more than 30% of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30

H a : More than 30% of the registered voters in Santa Clara County voted in the primary election. p > 0.30

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.

H 0 : The drug reduces cholesterol by 25%. p = 0.25

H a : The drug does not reduce cholesterol by 25%. p ≠ 0.25

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:

H 0 : μ = 2.0

H a : μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 66 H a : μ __ 66

  • H 0 : μ = 66
  • H a : μ ≠ 66

We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:

H 0 : μ ≥ 5

H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 45 H a : μ __ 45

  • H 0 : μ ≥ 45
  • H a : μ < 45

In an issue of U.S. News and World Report , an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.

H 0 : p ≤ 0.066

H a : p > 0.066
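A one-proportion test like this can be carried out with an exact binomial test. The sketch below assumes a hypothetical survey of 1,000 U.S. students in which 80 took an AP exam; the counts are invented for illustration, and scipy.stats.binomtest requires a reasonably recent SciPy version:

```python
# Hypothetical survey: 80 of 1,000 sampled U.S. students took an AP exam.
# Tests H0: p <= 0.066 against Ha: p > 0.066 (one-tailed, exact binomial test).
from scipy.stats import binomtest

k, n = 80, 1000
result = binomtest(k, n, p=0.066, alternative="greater")
print(f"sample proportion = {k / n:.3f}, p-value = {result.pvalue:.4f}")
```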

On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : p __ 0.40 H a : p __ 0.40

  • H 0 : p = 0.40
  • H a : p > 0.40

Concept Review

In a hypothesis test , sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

  • Evaluate the null hypothesis , typically denoted with H 0 . The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality (=, ≤, or ≥).
  • Always write the alternative hypothesis , typically denoted with H a or H 1 , using less than, greater than, or not-equals symbols (≠, >, or <).
  • If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
  • Never state that a claim is proven true or false. Keep in mind that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.

Formula Review

H 0 and H a are contradictory.

  • OpenStax, Statistics, Null and Alternative Hypotheses. Provided by : OpenStax. Located at : http://cnx.org/contents/[email protected]:58/Introductory_Statistics . License : CC BY: Attribution
  • Introductory Statistics . Authored by : Barbara Illowski, Susan Dean. Provided by : Open Stax. Located at : http://cnx.org/contents/[email protected] . License : CC BY: Attribution . License Terms : Download for free at http://cnx.org/contents/[email protected]
  • Simple hypothesis testing | Probability and Statistics | Khan Academy. Authored by : Khan Academy. Located at : https://youtu.be/5D1gV37bKXY . License : All Rights Reserved . License Terms : Standard YouTube License

Independent t-test for two samples

Introduction

The independent t-test, also called the two-sample t-test, independent-samples t-test, or Student's t-test, is an inferential statistical test that determines whether there is a statistically significant difference between the means of two unrelated groups.

Null and alternative hypotheses for the independent t-test

The null hypothesis for the independent t-test is that the population means from the two unrelated groups are equal:

H 0 : µ 1 = µ 2

In most cases, we are looking to see if we can show that we can reject the null hypothesis and accept the alternative hypothesis, which is that the population means are not equal:

H A : µ 1 ≠ µ 2

To do this, we need to set a significance level (also called alpha), which is the threshold the p-value must fall below before we reject the null hypothesis in favour of the alternative. Most commonly, this value is set at 0.05.

What do you need to run an independent t-test?

In order to run an independent t-test, you need the following:

  • One independent, categorical variable that has two levels/groups.
  • One continuous dependent variable.

Unrelated groups

Unrelated groups, also called unpaired groups or independent groups, are groups in which the cases (e.g., participants) in each group are different. Often we are investigating differences in individuals, which means that when comparing two groups, an individual in one group cannot also be a member of the other group and vice versa. An example would be gender - an individual would have to be classified as either male or female – not both.

Assumption of normality of the dependent variable

The independent t-test requires that the dependent variable is approximately normally distributed within each group.

Note: Technically, it is the residuals that need to be normally distributed, but for an independent t-test, both will give you the same result.

You can test for this using a number of different tests, but the Shapiro-Wilk test of normality or a graphical method, such as a Q-Q Plot, are very common. You can run these tests using SPSS Statistics, the procedure for which can be found in our Testing for Normality guide. However, the t-test is described as a robust test with respect to the assumption of normality. This means that some deviation away from normality does not have a large influence on Type I error rates. The exception to this is if the ratio of the largest to the smallest group size is greater than 1.5.
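Outside SPSS, the same normality check can be sketched in Python with SciPy's Shapiro-Wilk function; the group values below are invented for illustration:

```python
# Shapiro-Wilk normality check, run separately for each group.
# The cholesterol-like values are invented for illustration.
from scipy import stats

group_a = [5.9, 6.1, 6.3, 5.8, 6.4, 6.0, 6.2, 5.7, 6.5, 6.1]
group_b = [5.5, 5.8, 5.6, 6.0, 5.7, 5.9, 5.4, 5.8, 5.6, 5.7]

for name, data in [("group A", group_a), ("group B", group_b)]:
    w_stat, p_value = stats.shapiro(data)
    verdict = "no evidence against normality" if p_value > 0.05 else "normality questionable"
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.3f} ({verdict})")
```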

What to do when you violate the normality assumption

If you find that one or both of your groups' data are not approximately normally distributed and the group sizes differ greatly, you have two options: (1) transform your data so that the data become normally distributed (to do this in SPSS Statistics see our guide on Transforming Data ), or (2) run the Mann-Whitney U test, a non-parametric test that does not require the assumption of normality (to run this test in SPSS Statistics see our guide on the Mann-Whitney U Test ).
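For option (2), here is a hedged sketch of the Mann-Whitney U test in Python, reusing the invented data from the normality example above:

```python
# Mann-Whitney U test: a non-parametric alternative to the independent t-test.
# Uses the same invented data as the normality-check sketch above.
from scipy import stats

group_a = [5.9, 6.1, 6.3, 5.8, 6.4, 6.0, 6.2, 5.7, 6.5, 6.1]
group_b = [5.5, 5.8, 5.6, 6.0, 5.7, 5.9, 5.4, 5.8, 5.6, 5.7]

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```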

Assumption of homogeneity of variance

The independent t-test assumes the variances of the two groups you are measuring are equal in the population. If your variances are unequal, this can affect the Type I error rate. The assumption of homogeneity of variance can be tested using Levene's Test of Equality of Variances, which is produced in SPSS Statistics when running the independent t-test procedure. If you have run Levene's Test of Equality of Variances in SPSS Statistics, you will get a result similar to that below:

[Figure: Levene's Test for Equality of Variances output from the SPSS independent t-test procedure]

This test for homogeneity of variance provides an F -statistic and a significance value ( p -value). We are primarily concerned with the significance value – if it is greater than 0.05 (i.e., p > .05), our group variances can be treated as equal. However, if p < 0.05, we have unequal variances and we have violated the assumption of homogeneity of variances.

Overcoming a violation of the assumption of homogeneity of variance

If Levene's Test for Equality of Variances is statistically significant, which indicates that the group variances are unequal in the population, you can correct for this violation by not using the pooled estimate for the error term for the t -statistic, but instead using an adjustment to the degrees of freedom via the Welch-Satterthwaite method. In practice, you may never have heard of these adjustments because SPSS Statistics hides this information and simply labels the two options as "Equal variances assumed" and "Equal variances not assumed" without explicitly stating the underlying methods used. However, you can see the evidence of these adjustments as below:

[Figure: Differences in the t-statistic and the degrees of freedom when homogeneity of variance is not assumed]

From the result of Levene's Test for Equality of Variances, we can reject the null hypothesis that there is no difference in the variances between the groups and accept the alternative hypothesis that there is a statistically significant difference in the variances between groups. The effect of not being able to assume equal variances is evident in the final column of the above figure, where we see a reduction in the value of the t -statistic and a large reduction in the degrees of freedom (df). This has the effect of increasing the p -value above the critical significance level of 0.05. In this case, we therefore fail to reject the null hypothesis and conclude that there is no statistically significant difference between the means. This would not have been our conclusion had we not tested for homogeneity of variances.
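In Python, the same correction corresponds to requesting Welch's t-test instead of the pooled-variance version. The sketch below uses invented data, not the values from the SPSS example:

```python
# Welch's t-test (SPSS's "Equal variances not assumed" row) vs the pooled version.
# equal_var=False applies the Welch-Satterthwaite degrees-of-freedom adjustment.
from scipy import stats

group_a = [5.9, 6.1, 6.3, 5.8, 6.4, 6.0, 6.2, 5.7, 6.5, 6.1]  # invented data
group_b = [5.5, 5.8, 5.6, 6.0, 5.7, 5.9, 5.4, 5.8, 5.6, 5.7]

t_pooled, p_pooled = stats.ttest_ind(group_a, group_b, equal_var=True)
t_welch,  p_welch  = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"pooled: t = {t_pooled:.3f}, p = {p_pooled:.4f}")
print(f"Welch:  t = {t_welch:.3f}, p = {p_welch:.4f}")
```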


Reporting the result of an independent t-test

When reporting the result of an independent t-test, you need to include the t -statistic value, the degrees of freedom (df) and the significance value of the test ( p -value). The format of the test result is: t (df) = t -statistic, p = significance value. Therefore, for the example above, you could report the result as t (7.001) = 2.233, p = 0.061.

Fully reporting your results

In order to provide enough information for readers to fully understand the results when you have run an independent t-test, you should include the result of normality tests, Levene's Equality of Variances test, the two group means and standard deviations, the actual t-test result and the direction of the difference (if any). In addition, you might also wish to include the difference between the groups along with a 95% confidence interval. For example:

Inspection of Q-Q Plots revealed that cholesterol concentration was normally distributed for both groups and that there was homogeneity of variance as assessed by Levene's Test for Equality of Variances. Therefore, an independent t-test was run on the data with a 95% confidence interval (CI) for the mean difference. It was found that after the two interventions, cholesterol concentrations in the dietary group (6.15 ± 0.52 mmol/L) were significantly higher than the exercise group (5.80 ± 0.38 mmol/L) ( t (38) = 2.470, p = 0.018) with a difference of 0.35 (95% CI, 0.06 to 0.64) mmol/L.

To know how to run an independent t-test in SPSS Statistics, see our SPSS Statistics Independent-Samples T-Test guide. Alternatively, you can carry out an independent-samples t-test using Excel, R and RStudio .


Why must a null hypothesis contain equality?

Using the inference of mean as an example, the null and alternative hypothesis could be

$$H_0: \mu \le 0 \Leftrightarrow H_1: \mu > 0$$

It is often argued that this makes the calculation of the $p$ -value (or critical region) easier, but from my understanding the $p$ -value is $P(\bar{X} > 0|H_0) \stackrel{\color{red}{?}}{=} \sup_{\theta \in H_0} P(\bar{X} > 0 | \theta )$ . Since I'm taking its supremum, the $p$ -value doesn't change regardless of whether the equality sign is included. The same result goes for the critical region. However, I'm not sure if my understanding is correct, and my doubt is marked with a red question mark above.

Anyway, I wonder if it's correct (I know that would be unconventional, though) to write

$$H_0: \mu < 0 \Leftrightarrow H_1: \mu \ge 0$$

If that's OK, what about its $p$ -value and critical region? If that's not OK, I would be really surprised, because in that way we are changing the question ("Is $\mu$ less than or equal to zero?") to something different ("Is $\mu$ less than zero?")

  • hypothesis-testing
  • mathematical-statistics
  • type-i-and-ii-errors


  • I guess Ho and H1 have to be mutually exclusive, don't they? –  Isabella Ghement, Jun 28, 2019 at 15:05
  • You implicitly assume something that is not generally true: namely, that the null hypothesis includes values of $\mu$ arbitrarily close to $0.$ What if $\mu$ is known to be integral, for instance? I realize this is somewhat artificial, but it might provide some insight. –  whuber ♦, Jun 28, 2019 at 15:33
  • @IsabellaGhement that's my understanding as well, but apparently it's controversial... see the discussion in the comments of this post: One sided test $H_0:\mu=0$ or $H_0:\mu\leq 0$? –  Scholar, Jun 28, 2019 at 15:57
  • Consider a continuous, composite null containing no boundary (i.e. an open set) - for simplicity a "standard" one tailed test of a population mean. Where do you calculate the significance level at? (Hint: you actually end up evaluating it under the alternative...) –  Glen_b, Jun 29, 2019 at 7:16
  • This is a case where use of standard Bayesian posterior probabilities cuts through a lot of the complexities; just compute $P(\mu \in S)$ for any set $S$. –  Frank Harrell, Jun 29, 2019 at 11:26

2 Answers

The principle of the hypothesis test requires the computation of the rejection probability assuming H0 (type I error probability). In fact, if the null hypothesis is not just one point, we are fine with the supremum, so that we can state that the type I error probability of the test is smaller or equal to that value. So far this would be in line with what you're writing and wouldn't require the equality case to be included.

However, if in fact the rejection probability for the "equal" is equal to the supremum over the rest of the null hypothesis (as is the case for most tests although one could imagine tests for which this doesn't hold), the test will not distinguish the equality case from the H0, i.e., you will not have a higher rejection probability for "equal" than for the H0 as a whole, and therefore it makes sense to include "equal" in the H0.

In the words of E.S. Pearson (if I remember correctly), tests for which parameter values in the H1 have a lower rejection probability than cases in the H0 are "worse than useless", and if they have the same rejection probability, they are just useless.

Why would you want to reject a null hypothesis in favour of an alternative that has a rejection probability that isn't higher? The idea of tests is that you collect evidence in favour of the alternative from observing a rejection event that has a probability under the alternative that is higher than under the H0.

PS (added later): Technically, according to the general definition of hypothesis tests, you do not always have to include $\mu=0$ in the null hypothesis (note also that not all tests are Neyman-Pearson tests). I'm not giving you a reason why this isn't permitted, I'm giving you a reason why in most cases it doesn't make sense. As written above, I can even imagine to construct a problem with a weird model and parametrisation according to which the rejection probability at $\mu=0$ is in fact higher than the supremum over $\mu<0$ , in which case I wouldn't object at all against testing $H_0: \mu<0$ against $H_1: \mu\ge 0$ . However pretty much all the standard tests that you find in the literature are not of this kind. In all the standard cases in fact the test level is computed on the hypothesis border, i.e. (using the parametrisation referred to in the question), $\mu=0$ , and then $\mu<0$ is attached based on the fact that the supremum over those parameter values is not bigger. So it would be illogical to then attach $\mu=0$ , the basis of the level computation under H0, to the H1.

PPS, prompted by your comment: In my (hopefully not too unconventional) terminology the "rejection probability" is the probability, given $\mu$ , to reject the H0. This depends on $\mu$ . Particularly, if $\mu$ is in the H0, it is a type I error probability, whereas if $\mu$ is in the H1, it is the power. In a good test, you want the type I error probability to be low and the power to be high. A test in which the power is not higher than the type I error probability (or be it the supremum of these) doesn't do what it's supposed to do, it doesn't indicate the H1 in case of rejection. Or let's say it doesn't indicate the $\mu=0$ part of the H1 in case you chose to include $\mu=0$ in the H1. It's nonsense to say "this level $\alpha$ -test indicates evidence in favour of $\mu\ge 0$ " if the power in case of $\mu=0$ is no better than that $\alpha$ .


  • You said $H_0: \mu \le 0$ and $H_0: \mu < 0$ can have the same rejection region, and I totally agree, but I don't get the "reject a null hypothesis in favour of an alternative that has a rejection probability that isn't higher" part. Isn't the rejection probability always 5% when the significance level is 0.05? –  nalzok, Jun 28, 2019 at 17:20
  • Lehmann (AFAIK): see here . But if the null hypothesis were $\mu < 0$, then the supremum of the rejection probability over the null wouldn't be equal to the rejection probability when $\mu=0$, would it? @nalzok: It's always 5% or less under the null . –  Scortchi - Reinstate Monica ♦, Jun 28, 2019 at 17:39
  • @nalzok: See added PS and PPS. –  Christian Hennig, Jun 28, 2019 at 22:22
  • I get the problem that you describe. A test which rejects $H_0$ in favor of an equally likely or even worse likely $H_1$ is not so great. But is this solved by switching? From $$H_0: \mu < 0 \Leftrightarrow H_1: \mu \geq 0$$ to $$H_0: \mu \le 0 \Leftrightarrow H_1: \mu > 0$$ To me this seems more a problem that the hypothesis testing is not such a great method because it doesn't involve priors (and replaces this with some worse case scenario). In practice, when I observe some high value of a statistic which is very unlikely under $H_0: \mu < 0$ then I have little problems accepting $H_1$.... –  Sextus Empiricus, Feb 1, 2020 at 10:29
  • "Accepting the $H_1$" is not a standard way to interpret a hypothesis test. Anyway, I have no issues with the question addressed by hypothesis tests, namely whether the data is compatible with the $H_0$, and if not, in what direction it points, which doesn't commit me to "accept" anything as true. That's relevant enough in many situations. And if I don't have information that comes in a way nicely expressible as a prior, I'm happy about a method that doesn't require one. –  Christian Hennig, Feb 3, 2020 at 0:14

In traditional hypothesis testing the null hypothesis always contains an $=$-sign, whether as $=$, $\le$, or $\ge$.

The null hypothesis determines the null distribution of the test statistic, and hence also the critical value used in testing at a particular level or, in computer programs, the P-value.

Example: In a simple binomial test of whether a coin is fair, when the coin is suspected of bias towards Heads, the hypotheses might be $H_0: p = 1/2$ vs $H_a: p > 1/2.$

Suppose data consist of 100 coin tosses: We can reject $H_0$ at the 4.43% level, if the number $X$ of Heads matches or exceeds the 'critical value' $c= 59.$ (Because of the discreteness of the binomial distribution a nonrandomized test at exactly the 5% level is not available.)

In R, $P(X \ge 59\,|\,p=.5) = 1 - P(X \le 58\,|\,p = .5) = 0.0443.$


Intuitively, it might seem that getting 55 Heads in 100 should arouse suspicion that the coin is biased for Heads. "Suspicion" maybe Yes, but this would not be a statistically significant result at the stated level of significance. By random variation a truly fair coin can show 55 or more Heads in 100 tosses with probability about 0.184.

Also, if we observe $X = 62$ Heads then the P-value is $P(X \ge 62\,|\,p=.5) = 0.0105,$ leading to rejection of $H_0$ because the P-value $0.0105 < 0.0443.$ The P-value is defined as the probability, under $H_0,$ of an outcome at least as extreme (in the direction of the alternative) as what was observed.

If the hypotheses were formulated as $H_0: p \le 1/2$ vs. $H_a: p > 1/2,$ then the $=$-sign in $H_0$ would still govern these probability computations.
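The tail probabilities quoted in this answer are easy to reproduce. Here is a short Python check using SciPy's binomial distribution; the only inputs are n = 100 tosses and p = 0.5 from the example:

```python
# Reproduce the binomial tail probabilities used in the coin-toss example (n = 100, p = 0.5).
from scipy.stats import binom

n, p = 100, 0.5
print(binom.sf(58, n, p))  # P(X >= 59) ~ 0.0443  -> critical value c = 59
print(binom.sf(54, n, p))  # P(X >= 55) ~ 0.184   -> 55 heads is not significant
print(binom.sf(61, n, p))  # P(X >= 62) ~ 0.0105  -> p-value for 62 observed heads
```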


  • I can understand what you are saying, but could you elaborate on how does it answer my question? –  nalzok, Jun 28, 2019 at 17:09
  • You seem to propose a null hypothesis $H_0: \mu < 0,$ which does not contain an $=$-sign. That is not a correct null hypothesis. It provides no specific value of $\mu$ that can be used to determine a null distribution. // First rule of hypothesis testing: Put an $=$-sign in $H_0.$ –  BruceET, Jun 28, 2019 at 17:23
  • I noticed this issue and thus asked if the $p$-value can be calculated with $P(\bar{X} > 0|H_0) = \sup_{\theta \in H_0} P(\bar{X} > 0 | \theta)$. If this equation is correct, then I can work on $\lim_{n \to \infty} P(\bar{X} > 0 | \mu \le -\frac{1}{n}) = \lim_{n \to \infty} P(\bar{X} > 0 | \mu = \frac{1}{n})$. This is calculatable because each value of $\mu$ is specific. –  nalzok, Jun 28, 2019 at 17:28
  • @BruceET: "First rule of hypothesis testing: Put an =-sign in 𝐻0". Nope, see my answer, particularly the PS part. I'm with nalzok on this matter. You won't find such a rule in any sufficiently general treatment of hypothesis tests. –  Christian Hennig, Jun 29, 2019 at 11:28
  • @Lewian. Saw it. Hard to recognize it as traditional hypothesis testing. Pondering whether there are practical situations where that framework is useful. Disagreements about hypothesis testing are hardly new to statistics. –  BruceET, Jun 29, 2019 at 15:51


Null & Alternative Hypotheses | Definitions, Templates & Examples

Published on May 6, 2022 by Shaun Turney . Revised on June 22, 2023.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

  • Null hypothesis ( H 0 ): There’s no effect in the population .
  • Alternative hypothesis ( H a or H 1 ) : There’s an effect in the population.


The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”:

  • The null hypothesis ( H 0 ) answers “No, there’s no effect in the population.”
  • The alternative hypothesis ( H a ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses .

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.


The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept . Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error . When you incorrectly fail to reject it, it’s a type II error.

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

Research question: Does tooth flossing affect the number of cavities?
General null hypothesis: Tooth flossing has no effect on the number of cavities.
Test-specific null hypothesis (two-sample t test): The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2.

Research question: Does the amount of text highlighted in the textbook affect exam scores?
General null hypothesis: The amount of text highlighted in the textbook has no effect on exam scores.
Test-specific null hypothesis (linear regression): There is no relationship between the amount of text highlighted and exam scores in the population; β = 0.

Research question: Does daily meditation decrease the incidence of depression?
General null hypothesis: Daily meditation does not decrease the incidence of depression.*
Test-specific null hypothesis (two-proportions test): The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the proportion in the no-meditation group (p2) in the population; p1 ≥ p2.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .

The alternative hypothesis ( H a ) is the other answer to your research question . It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Research question: Does tooth flossing affect the number of cavities?
General alternative hypothesis: Tooth flossing has an effect on the number of cavities.
Test-specific alternative hypothesis (two-sample t test): The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2.

Research question: Does the amount of text highlighted in a textbook affect exam scores?
General alternative hypothesis: The amount of text highlighted in the textbook has an effect on exam scores.
Test-specific alternative hypothesis (linear regression): There is a relationship between the amount of text highlighted and exam scores in the population; β ≠ 0.

Research question: Does daily meditation decrease the incidence of depression?
General alternative hypothesis: Daily meditation decreases the incidence of depression.
Test-specific alternative hypothesis (two-proportions test): The proportion of people with depression in the daily-meditation group (p1) is less than the proportion in the no-meditation group (p2) in the population; p1 < p2.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question.
  • They both make claims about the population.
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

  • Definition: The null hypothesis is a claim that there is no effect in the population; the alternative hypothesis is a claim that there is an effect in the population.
  • Mathematical symbols: The null hypothesis contains an equality symbol (=, ≥, or ≤); the alternative hypothesis contains an inequality symbol (≠, <, or >).
  • Outcome of the test: When the result is statistically significant (p ≤ α), the null hypothesis is rejected and the alternative hypothesis is supported; otherwise, the null hypothesis is not rejected and the alternative hypothesis is not supported.


To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

General template sentences

The only thing you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable ?

  • Null hypothesis ( H 0 ): Independent variable does not affect dependent variable.
  • Alternative hypothesis ( H a ): Independent variable affects dependent variable.

Test-specific template sentences

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Two-sample t test (or one-way ANOVA with two groups):
  • Null hypothesis (H0): The mean dependent variable does not differ between group 1 (µ1) and group 2 (µ2) in the population; µ1 = µ2.
  • Alternative hypothesis (Ha): The mean dependent variable differs between group 1 (µ1) and group 2 (µ2) in the population; µ1 ≠ µ2.

One-way ANOVA with three groups:
  • Null hypothesis (H0): The mean dependent variable does not differ between group 1 (µ1), group 2 (µ2), and group 3 (µ3) in the population; µ1 = µ2 = µ3.
  • Alternative hypothesis (Ha): The mean dependent variables of group 1 (µ1), group 2 (µ2), and group 3 (µ3) are not all equal in the population.

Pearson correlation:
  • Null hypothesis (H0): There is no correlation between the independent variable and the dependent variable in the population; ρ = 0.
  • Alternative hypothesis (Ha): There is a correlation between the independent variable and the dependent variable in the population; ρ ≠ 0.

Simple linear regression:
  • Null hypothesis (H0): There is no relationship between the independent variable and the dependent variable in the population; β = 0.
  • Alternative hypothesis (Ha): There is a relationship between the independent variable and the dependent variable in the population; β ≠ 0.

Two-proportions test:
  • Null hypothesis (H0): The dependent variable expressed as a proportion does not differ between group 1 (p1) and group 2 (p2) in the population; p1 = p2.
  • Alternative hypothesis (Ha): The dependent variable expressed as a proportion differs between group 1 (p1) and group 2 (p2) in the population; p1 ≠ p2.

Note: The template sentences above assume that you’re performing two-tailed tests . Two-tailed tests are appropriate for most studies.


Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“ x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses . In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.


9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses . They are called the null hypothesis and the alternative hypothesis . These hypotheses contain opposing viewpoints.

H 0 , the null hypothesis: a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

H a , the alternative hypothesis: a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 .

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are reject H 0 if the sample information favors the alternative hypothesis or do not reject H 0 or decline to reject H 0 if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H 0 and H a :

H 0 symbols: equal (=), greater than or equal to (≥), less than or equal to (≤)
H a symbols: not equal (≠), greater than (>), less than (<)

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

Example 9.1

H 0 : No more than 30 percent of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30 H a : More than 30 percent of the registered voters in Santa Clara County voted in the primary election. p > 0.30

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following: H 0 : μ = 2.0 H a : μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 66
  • H a : μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on the average. The null and alternative hypotheses are the following: H 0 : μ ≥ 5 H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 45
  • H a : μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses. H 0 : p ≤ 0.066 H a : p > 0.066

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : p __ 0.40
  • H a : p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Apr 16, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Levene Test for Equality of Variances


What is the Levene Test?


Levene’s test is used to check that variances are equal for all samples when your data come from a non-normal distribution . You can use Levene’s test to check the assumption of equal variances before running a test like One-Way ANOVA .

If you’re fairly certain your data comes from a normal or nearly normal distribution , use Bartlett’s Test instead.

The null hypothesis for Levene’s is that the variances are equal across all samples. In more formal terms, that’s written as: H 0 : σ 1 2 = σ 2 2 = … = σ k 2 . The alternate hypothesis (the one you’re testing), is that the variances are not equal for at least one pair: H 0 : σ 1 2 ≠ σ 2 2 ≠… ≠ σ k 2 .


  • Robustness is a measure of how well the test avoids falsely reporting unequal variances when the variances are actually equal.
  • Power is a measure of how well the test correctly reports unequal variances when they truly differ.

According to Brown and Forsythe:

  • Trimmed means work best with heavy-tailed distributions like the Cauchy distribution.
  • For skewed distributions, or if you aren’t sure about the underlying shape of the distribution, the median may be your best choice.
  • For symmetric and moderately tailed distributions, use the mean.

Levene’s test is built into most statistical software. For example, the Independent Samples T Test in SPSS generates a “Levene’s Test for Equality of Variances” column as part of the output. The result from the test is reported as a p-value, which you can compare to your alpha level for the test. If the p-value is larger than the alpha level, the null hypothesis stands and the variances can be treated as equal; if the p-value is smaller than the alpha level, the implication is that the variances are unequal.

Levene’s Test tests whether the variances of two samples are approximately equal. Ideally, you want a non-significant result for this test, which means your variances meet the assumption of equal variances. SPSS will automatically run a Levene’s test any time you run an independent samples t-test, but you can also run the test on its own using the following steps.

Note: The following steps are run on a set of data with two levels of the independent variable: At Home and In Office. The test scores are measured at the scale level.

Step 1 : Go to Analyze → General Linear Model → Univariate.

Step 2 : Move your independent variable over to the Fixed Factor box.

Step 3 : Move your dependent variable over to the Dependent Variable box.

Step 4 : Click “Options”, then place a check mark next to Homogeneity Tests. At the bottom, you can change your alpha level, if desired. It’s common to set the alpha level at .01 or .001, especially with large sample sizes. This is because you’re more likely to get a statistically significant result for larger samples.

Step 5 : Click Continue, then click OK to run the test.

Reading the Output

Read the result from the Sig column (Based on Mean) in the Levene’s Test of Equality of Error Variances box. A non-significant result here (greater than .05) indicates you have met the assumption of homogeneity of variance (i.e., equal variances are assumed). A significant result here (less than .05) indicates you have violated the assumption of homogeneity of variance (i.e., equal variances are not assumed).

If you are reading the output as part of a t-test, this tells you whether you should interpret the t-test for equal variances assumed, or not assumed.
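If you prefer to run the test in code rather than SPSS, the same check takes only a few lines. Below is a minimal Python sketch using scipy.stats.levene; the two sample arrays are made-up scores standing in for the At Home and In Office groups, and the center argument mirrors the mean/median/trimmed-mean choice discussed above.

```python
from scipy import stats

# Hypothetical test scores for the two groups (illustration only).
at_home   = [88, 92, 75, 81, 94, 70, 85, 90, 78, 84]
in_office = [82, 79, 88, 91, 73, 86, 80, 77, 90, 83]

# center="median" is the robust (Brown-Forsythe) choice for skewed or
# unknown distributions; use "mean" for symmetric, moderately tailed data
# or "trimmed" for heavy-tailed data.
stat, p = stats.levene(at_home, in_office, center="median")
print(f"Levene statistic = {stat:.3f}, p-value = {p:.3f}")

# A p-value above your alpha level means the equal-variance assumption
# stands; a p-value below it suggests the variances are unequal.
```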

Reference : Brown, M. B. and Forsythe, A. B. (1974), "Robust Tests for the Equality of Variances," Journal of the American Statistical Association, 69, pp. 364-367.

Hypothesis Testing - Chi Squared Test

Lisa Sullivan, PhD

Professor of Biostatistics

Boston University School of Public Health


Introduction

This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific tests considered here are called chi-square tests and are appropriate when the outcome is discrete (dichotomous, ordinal or categorical). For example, in some clinical trials the outcome is a classification such as hypertensive, pre-hypertensive or normotensive. We could use the same classification in an observational study such as the Framingham Heart Study to compare men and women in terms of their blood pressure status - again using the classification of hypertensive, pre-hypertensive or normotensive status.  

The technique to analyze a discrete outcome uses what is called a chi-square test. Specifically, the test statistic follows a chi-square probability distribution. We will consider chi-square tests here with one, two and more than two independent comparison groups.

Learning Objectives

After completing this module, the student will be able to:

  • Perform chi-square tests by hand
  • Appropriately interpret results of chi-square tests
  • Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples

Tests with One Sample, Discrete Outcome

Here we consider hypothesis testing with a discrete outcome variable in a single population. Discrete variables are variables that take on more than two distinct responses or categories and the responses can be ordered or unordered (i.e., the outcome can be ordinal or categorical). The procedure we describe here can be used for dichotomous (exactly 2 response options), ordinal or categorical discrete outcomes and the objective is to compare the distribution of responses, or the proportions of participants in each response category, to a known distribution. The known distribution is derived from another study or report and it is again important in setting up the hypotheses that the comparator distribution specified in the null hypothesis is a fair comparison. The comparator is sometimes called an external or a historical control.   

In one sample tests for a discrete outcome, we set up our hypotheses against an appropriate comparator. We select a sample and compute descriptive statistics on the sample data. Specifically, we compute the sample size (n) and the proportions of participants in each response category.

Test Statistic for Testing H 0 : p 1 = p 10 , p 2 = p 20 , ..., p k = p k0

The test statistic is χ² = Σ (O − E)²/E, and we find the critical value in a table of probabilities for the chi-square distribution with degrees of freedom (df) = k-1. In the test statistic, O = observed frequency and E = expected frequency in each of the response categories. The observed frequencies are those observed in the sample and the expected frequencies are computed as described below. χ² (chi-square) is another probability distribution and ranges from 0 to ∞. The test statistic formula above is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories.

When we conduct a χ 2 test, we compare the observed frequencies in each response category to the frequencies we would expect if the null hypothesis were true. These expected frequencies are determined by allocating the sample to the response categories according to the distribution specified in H 0 . This is done by multiplying the observed sample size (n) by the proportions specified in the null hypothesis (p 10 , p 20 , ..., p k0 ). To ensure that the sample size is appropriate for the use of the test statistic above, we need the following: min(np10, np20, ..., npk0) > 5.

The test of hypothesis with a discrete outcome measured in a single sample, where the goal is to assess whether the distribution of responses follows a known distribution, is called the χ 2 goodness-of-fit test. As the name indicates, the idea is to assess whether the pattern or distribution of responses in the sample "fits" a specified population (external or historical) distribution. In the next example we illustrate the test. As we work through the example, we provide additional details related to the use of this new test statistic.  

A University conducted a survey of its recent graduates to collect demographic and health information for future planning purposes as well as to assess students' satisfaction with their undergraduate experiences. The survey revealed that a substantial proportion of students were not engaging in regular exercise, many felt their nutrition was poor and a substantial number were smoking. In response to a question on regular exercise, 60% of all graduates reported getting no regular exercise, 25% reported exercising sporadically and 15% reported exercising regularly as undergraduates. The next year the University launched a health promotion campaign on campus in an attempt to increase health behaviors among undergraduates. The program included modules on exercise, nutrition and smoking cessation. To evaluate the impact of the program, the University again surveyed graduates and asked the same questions. The survey was completed by 470 graduates and the following data were collected on the exercise question:

 

                     No Regular Exercise   Sporadic Exercise   Regular Exercise   Total
Number of Students   255                   125                 90                 470

Based on the data, is there evidence of a shift in the distribution of responses to the exercise question following the implementation of the health promotion campaign on campus? Run the test at a 5% level of significance.

In this example, we have one sample and a discrete (ordinal) outcome variable (with three response options). We specifically want to compare the distribution of responses in the sample to the distribution reported the previous year (i.e., 60%, 25%, 15% reporting no, sporadic and regular exercise, respectively). We now run the test using the five-step approach.  

  • Step 1. Set up hypotheses and determine level of significance.

The null hypothesis again represents the "no change" or "no difference" situation. If the health promotion campaign has no impact then we expect the distribution of responses to the exercise question to be the same as that measured prior to the implementation of the program.

H 0 : p 1 =0.60, p 2 =0.25, p 3 =0.15,  or equivalently H 0 : Distribution of responses is 0.60, 0.25, 0.15  

H 1 :   H 0 is false.          α =0.05

Notice that the research hypothesis is written in words rather than in symbols. The research hypothesis as stated captures any difference in the distribution of responses from that specified in the null hypothesis. We do not specify a specific alternative distribution, instead we are testing whether the sample data "fit" the distribution in H 0 or not. With the χ 2 goodness-of-fit test there is no upper or lower tailed version of the test.

  • Step 2. Select the appropriate test statistic.  

The test statistic is:

χ² = Σ (O − E)²/E

We must first assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) > 5. The sample size here is n=470 and the proportions specified in the null hypothesis are 0.60, 0.25 and 0.15. Thus, min(470(0.60), 470(0.25), 470(0.15)) = min(282, 117.5, 70.5) = 70.5. The sample size is more than adequate so the formula can be used.

  • Step 3. Set up decision rule.  

The decision rule for the χ 2 test depends on the level of significance and the degrees of freedom, defined as degrees of freedom (df) = k-1 (where k is the number of response categories). If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ 2 statistic will be close to zero. If the null hypothesis is false, then the χ 2 statistic will be large. Critical values can be found in a table of probabilities for the χ 2 distribution. Here we have df=k-1=3-1=2 and a 5% level of significance. The appropriate critical value is 5.99, and the decision rule is as follows: Reject H 0 if χ 2 > 5.99.
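Rather than looking the critical value up in a printed table, it can be computed directly from the chi-square distribution. A minimal Python sketch (scipy is assumed) showing where the 5.99 comes from:

```python
from scipy.stats import chi2

alpha = 0.05
df = 3 - 1  # k - 1, with k = 3 response categories

# Upper-tail critical value: reject H0 when the test statistic exceeds it.
critical_value = chi2.ppf(1 - alpha, df)
print(f"Critical value for df = {df}, alpha = {alpha}: {critical_value:.2f}")  # about 5.99
```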

  • Step 4. Compute the test statistic.  

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) and the expected frequencies into the formula for the test statistic identified in Step 2. The computations can be organized as follows.

   

                           No Regular Exercise   Sporadic Exercise   Regular Exercise    Total
Observed Frequencies (O)   255                   125                 90                  470
Expected Frequencies (E)   470(0.60) = 282       470(0.25) = 117.5   470(0.15) = 70.5    470

Notice that the expected frequencies are taken to one decimal place and that the sum of the observed frequencies is equal to the sum of the expected frequencies. The test statistic is computed as follows:

χ² = (255 − 282)²/282 + (125 − 117.5)²/117.5 + (90 − 70.5)²/70.5 = 2.59 + 0.48 + 5.39 = 8.46

  • Step 5. Conclusion.  

We reject H 0 because 8.46 > 5.99. We have statistically significant evidence at α=0.05 to show that H 0 is false, or that the distribution of responses is not 0.60, 0.25, 0.15. The p-value is approximately 0.015.

In the χ 2 goodness-of-fit test, we conclude that either the distribution specified in H 0 is false (when we reject H 0 ) or that we do not have sufficient evidence to show that the distribution specified in H 0 is false (when we fail to reject H 0 ). Here, we rejected H 0 and concluded that the distribution of responses to the exercise question following the implementation of the health promotion campaign was not the same as the distribution prior to the campaign. The test itself does not provide details of how the distribution has shifted. A comparison of the observed and expected frequencies will provide some insight into the shift (when the null hypothesis is rejected). Does it appear that the health promotion campaign was effective?

Consider the following: 

 

                           No Regular Exercise   Sporadic Exercise   Regular Exercise   Total
Observed Frequencies (O)   255                   125                 90                 470
Expected Frequencies (E)   282                   117.5               70.5               470

If the null hypothesis were true (i.e., no change from the prior year) we would have expected more students to fall in the "No Regular Exercise" category and fewer in the "Regular Exercise" category. In the sample, 255/470 = 54% reported no regular exercise and 90/470 = 19% reported regular exercise. Thus, there is a shift toward more regular exercise following the implementation of the health promotion campaign. There is evidence of a statistical difference, but is this a meaningful difference? Is there room for improvement?
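The goodness-of-fit calculation above is easy to reproduce in software. A minimal Python sketch using scipy.stats.chisquare with the observed counts from the survey and the expected counts implied by the null distribution (0.60, 0.25, 0.15):

```python
from scipy.stats import chisquare

observed = [255, 125, 90]               # no, sporadic, regular exercise
n = sum(observed)                        # 470 graduates
null_props = [0.60, 0.25, 0.15]          # distribution specified in H0
expected = [n * p for p in null_props]   # 282, 117.5, 70.5

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p-value = {p:.3f}")  # about 8.46, p about 0.015
```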

The National Center for Health Statistics (NCHS) provided data on the distribution of weight (in categories) among Americans in 2002. The distribution was based on specific values of body mass index (BMI) computed as weight in kilograms over height in meters squared. Underweight was defined as BMI< 18.5, Normal weight as BMI between 18.5 and 24.9, overweight as BMI between 25 and 29.9 and obese as BMI of 30 or greater. Americans in 2002 were distributed as follows: 2% Underweight, 39% Normal Weight, 36% Overweight, and 23% Obese. Suppose we want to assess whether the distribution of BMI is different in the Framingham Offspring sample. Using data from the n=3,326 participants who attended the seventh examination of the Offspring in the Framingham Heart Study we created the BMI categories as defined and observed the following:

 

                           Underweight (BMI < 18.5)   Normal Weight (BMI 18.5-24.9)   Overweight (BMI 25-29.9)   Obese (BMI ≥ 30)   Total
Observed Frequencies (O)   20                         932                             1374                       1000               3326

  • Step 1.  Set up hypotheses and determine level of significance.

H 0 : p 1 =0.02, p 2 =0.39, p 3 =0.36, p 4 =0.23     or equivalently

H 0 : Distribution of responses is 0.02, 0.39, 0.36, 0.23

H 1 :   H 0 is false.        α=0.05

  • Step 2. Select the appropriate test statistic.

The formula for the test statistic is:

χ² = Σ (O − E)²/E

We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) > 5. The sample size here is n=3,326 and the proportions specified in the null hypothesis are 0.02, 0.39, 0.36 and 0.23. Thus, min(3326(0.02), 3326(0.39), 3326(0.36), 3326(0.23)) = min(66.5, 1297.1, 1197.4, 765.0) = 66.5. The sample size is more than adequate, so the formula can be used.

  • Step 3. Set up decision rule.

Here we have df=k-1=4-1=3 and a 5% level of significance. The appropriate critical value is 7.81 and the decision rule is as follows: Reject H 0 if χ 2 > 7.81.

  • Step 4. Compute the test statistic.

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) into the formula for the test statistic identified in Step 2. We organize the computations in the following table.

 

                           Underweight   Normal Weight   Overweight   Obese   Total
Observed Frequencies (O)   20            932             1374         1000    3326
Expected Frequencies (E)   66.5          1297.1          1197.4       765.0   3326

The test statistic is computed as follows:

χ² = (20 − 66.5)²/66.5 + (932 − 1297.1)²/1297.1 + (1374 − 1197.4)²/1197.4 + (1000 − 765.0)²/765.0 = 32.52 + 102.77 + 26.05 + 72.19 = 233.53

  • Step 5. Conclusion.

We reject H 0 because 233.53 > 7.81. We have statistically significant evidence at α=0.05 to show that H 0 is false or that the distribution of BMI in Framingham is different from the national data reported in 2002, p < 0.005.

Again, the χ 2 goodness-of-fit test allows us to assess whether the distribution of responses "fits" a specified distribution. Here we show that the distribution of BMI in the Framingham Offspring Study is different from the national distribution. To understand the nature of the difference we can compare observed and expected frequencies or observed and expected proportions (or percentages). The frequencies are large because of the large sample size; the observed percentages of patients in the Framingham sample are as follows: 0.6% underweight, 28% normal weight, 41% overweight and 30% obese. In the Framingham Offspring sample there are higher percentages of overweight and obese persons (41% and 30% in Framingham as compared to 36% and 23% in the national data), and lower proportions of underweight and normal weight persons (0.6% and 28% in Framingham as compared to 2% and 39% in the national data). Are these meaningful differences?

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable in a single population. We presented a test using a test statistic Z to test whether an observed (sample) proportion differed significantly from a historical or external comparator. The chi-square goodness-of-fit test can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square goodness-of-fit test.

The NCHS report indicated that in 2002, 75% of children aged 2 to 17 saw a dentist in the past year. An investigator wants to assess whether use of dental services is similar in children living in the city of Boston. A sample of 125 children aged 2 to 17 living in Boston are surveyed and 64 reported seeing a dentist over the past 12 months. Is there a significant difference in use of dental services between children living in Boston and the national data?

We presented the following approach to the test using a Z statistic. 

  • Step 1. Set up hypotheses and determine level of significance

H 0 : p = 0.75

H 1 : p ≠ 0.75                               α=0.05

We must first check that the sample size is adequate. Specifically, we need to check min(np 0 , n(1-p 0 )) = min(125(0.75), 125(1-0.75)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the following formula can be used:

Z = (p̂ − p 0 ) / √( p 0 (1 − p 0 )/n )

This is a two-tailed test, using a Z statistic and a 5% level of significance. Reject H 0 if Z < -1.960 or if Z > 1.960.

We now substitute the sample data into the formula for the test statistic identified in Step 2. The sample proportion is p̂ = 64/125 = 0.512, so

Z = (0.512 − 0.75) / √( 0.75(0.25)/125 ) = −0.238/0.0387 = −6.15


We reject H 0 because -6.15 < -1.960. We have statistically significant evidence at α = 0.05 to show that there is a significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001).

We now conduct the same test using the chi-square goodness-of-fit test. First, we summarize our sample data as follows:

 

                    Saw a Dentist in Past 12 Months   Did Not See a Dentist in Past 12 Months   Total
# of Participants   64                                61                                        125

H 0 : p 1 =0.75, p 2 =0.25     or equivalently H 0 : Distribution of responses is 0.75, 0.25 

We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) > 5. The sample size here is n=125 and the proportions specified in the null hypothesis are 0.75, 0.25. Thus, min(125(0.75), 125(0.25)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the formula can be used.

Here we have df=k-1=2-1=1 and a 5% level of significance. The appropriate critical value is 3.84, and the decision rule is as follows: Reject H 0 if χ 2 > 3.84. (Note that 1.96 2 = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

 

                           Saw a Dentist   Did Not See a Dentist   Total
Observed Frequencies (O)   64              61                      125
Expected Frequencies (E)   93.75           31.25                   125

The test statistic is computed as follows:

χ² = (64 − 93.75)²/93.75 + (61 − 31.25)²/31.25 = 9.44 + 28.32 = 37.8

(Note that (-6.15) 2 = 37.8, where -6.15 was the value of the Z statistic in the test for proportions shown above.)

We reject H 0 because 37.8 > 3.84. We have statistically significant evidence at α=0.05 to show that there is a statistically significant difference in the use of dental service by children living in Boston as compared to the national data.  (p < 0.0001). This is the same conclusion we reached when we conducted the test using the Z test above. With a dichotomous outcome, Z 2 = χ 2 !   In statistics, there are often several approaches that can be used to test hypotheses. 
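The equivalence is easy to verify numerically. A small, illustrative Python sketch using the dental-services counts from this example; the squared Z statistic and the chi-square statistic agree:

```python
from math import sqrt
from scipy.stats import chisquare

n, successes, p0 = 125, 64, 0.75
p_hat = successes / n                       # 0.512

# One-sample Z test for a proportion.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # about -6.15

# Chi-square goodness-of-fit test on the same data.
observed = [successes, n - successes]       # 64, 61
expected = [n * p0, n * (1 - p0)]           # 93.75, 31.25
chi2_stat, p = chisquare(observed, f_exp=expected)

print(f"z = {z:.2f}, z squared = {z**2:.2f}, chi-square = {chi2_stat:.2f}")
```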

Tests for Two or More Independent Samples, Discrete Outcome

Here we extend that application of the chi-square test to the case with two or more independent comparison groups. Specifically, the outcome of interest is discrete with two or more responses and the responses can be ordered or unordered (i.e., the outcome can be dichotomous, ordinal or categorical). We now consider the situation where there are two or more independent comparison groups and the goal of the analysis is to compare the distribution of responses to the discrete outcome variable among several independent comparison groups.  

The test is called the χ 2 test of independence and the null hypothesis is that there is no difference in the distribution of responses to the outcome across comparison groups. This is often stated as follows: The outcome variable and the grouping variable (e.g., the comparison treatments or comparison groups) are independent (hence the name of the test). Independence here implies homogeneity in the distribution of the outcome among comparison groups.    

The null hypothesis in the χ 2 test of independence is often stated in words as: H 0 : The distribution of the outcome is independent of the groups. The alternative or research hypothesis is that there is a difference in the distribution of responses to the outcome variable among the comparison groups (i.e., that the distribution of responses "depends" on the group). In order to test the hypothesis, we measure the discrete outcome variable in each participant in each comparison group. The data of interest are the observed frequencies (or number of participants in each response category in each group). The formula for the test statistic for the χ 2 test of independence is given below.

Test Statistic for Testing H 0 : Distribution of outcome is independent of groups

The test statistic is χ² = Σ (O − E)²/E, summed over all r × c cells of the table, and we find the critical value in a table of probabilities for the chi-square distribution with df=(r-1)*(c-1).

Here O = observed frequency, E=expected frequency in each of the response categories in each group, r = the number of rows in the two-way table and c = the number of columns in the two-way table.   r and c correspond to the number of comparison groups and the number of response options in the outcome (see below for more details). The observed frequencies are the sample data and the expected frequencies are computed as described below. The test statistic is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories in each group.  

The data for the χ 2 test of independence are organized in a two-way table. The outcome and grouping variable are shown in the rows and columns of the table. The sample table below illustrates the data layout. The table entries (blank below) are the numbers of participants in each group responding to each response category of the outcome variable.

Table - Possible outcomes are listed in the columns; the groups being compared are listed in the rows.

[Blank two-way table layout: one row per comparison group, one column per response option, with row totals, column totals, and the overall sample size N in the lower right cell.]

In the table above, the grouping variable is shown in the rows of the table; r denotes the number of independent groups. The outcome variable is shown in the columns of the table; c denotes the number of response options in the outcome variable. Each combination of a row (group) and column (response) is called a cell of the table. The table has r*c cells and is sometimes called an r x c ("r by c") table. For example, if there are 4 groups and 5 categories in the outcome variable, the data are organized in a 4 X 5 table. The row and column totals are shown along the right-hand margin and the bottom of the table, respectively. The total sample size, N, can be computed by summing the row totals or the column totals. Similar to ANOVA, N does not refer to a population size here but rather to the total sample size in the analysis. The sample data can be organized into a table like the above. The numbers of participants within each group who select each response option are shown in the cells of the table and these are the observed frequencies used in the test statistic.

The test statistic for the χ 2 test of independence involves comparing observed (sample data) and expected frequencies in each cell of the table. The expected frequencies are computed assuming that the null hypothesis is true. The null hypothesis states that the two variables (the grouping variable and the outcome) are independent. The definition of independence is as follows:

 Two events, A and B, are independent if P(A|B) = P(A), or equivalently, if P(A and B) = P(A) P(B).

The second statement indicates that if two events, A and B, are independent then the probability of their intersection can be computed by multiplying the probability of each individual event. To conduct the χ 2 test of independence, we need to compute expected frequencies in each cell of the table. Expected frequencies are computed by assuming that the grouping variable and outcome are independent (i.e., under the null hypothesis). Thus, if the null hypothesis is true, using the definition of independence:

P(Group 1 and Response Option 1) = P(Group 1) P(Response Option 1).

 The above states that the probability that an individual is in Group 1 and their outcome is Response Option 1 is computed by multiplying the probability that person is in Group 1 by the probability that a person is in Response Option 1. To conduct the χ 2 test of independence, we need expected frequencies and not expected probabilities . To convert the above probability to a frequency, we multiply by N. Consider the following small example.

 

          Response 1   Response 2   Response 3   Total
Group 1   10           8            7            25
Group 2   22           15           13           50
Group 3   30           28           17           75
Total     62           51           37           150

The data shown above are measured in a sample of size N=150. The frequencies in the cells of the table are the observed frequencies. If Group and Response are independent, then we can compute the probability that a person in the sample is in Group 1 and Response category 1 using:

P(Group 1 and Response 1) = P(Group 1) P(Response 1),

P(Group 1 and Response 1) = (25/150) (62/150) = 0.069.

Thus if Group and Response are independent we would expect 6.9% of the sample to be in the top left cell of the table (Group 1 and Response 1). The expected frequency is 150(0.069) = 10.4.   We could do the same for Group 2 and Response 1:

P(Group 2 and Response 1) = P(Group 2) P(Response 1),

P(Group 2 and Response 1) = (50/150) (62/150) = 0.138.

The expected frequency in Group 2 and Response 1 is 150(0.138) = 20.7.

Thus, the formula for determining the expected cell frequencies in the χ 2 test of independence is as follows:

Expected Cell Frequency = (Row Total * Column Total)/N.

The above computes the expected frequency in one step rather than computing the expected probability first and then converting to a frequency.  
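The same one-step calculation can be done for every cell at once. A minimal Python/NumPy sketch using the small example table above:

```python
import numpy as np

observed = np.array([[10,  8,  7],
                     [22, 15, 13],
                     [30, 28, 17]])

row_totals = observed.sum(axis=1)   # 25, 50, 75
col_totals = observed.sum(axis=0)   # 62, 51, 37
n = observed.sum()                  # 150

# Expected cell frequency = (row total * column total) / N, for every cell.
expected = np.outer(row_totals, col_totals) / n
print(np.round(expected, 1))
# Top-left cell is 10.3 (the 10.4 above reflects rounding the probability
# to 0.069); the Group 2 / Response 1 cell is 20.7, matching the hand
# calculation.
```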

In a prior example we evaluated data from a survey of university graduates which assessed, among other things, how frequently they exercised. The survey was completed by 470 graduates. In the prior example we used the χ 2 goodness-of-fit test to assess whether there was a shift in the distribution of responses to the exercise question following the implementation of a health promotion campaign on campus. We specifically considered one sample (all students) and compared the observed distribution to the distribution of responses the prior year (a historical control). Suppose we now wish to assess whether there is a relationship between exercise on campus and students' living arrangements. As part of the same survey, graduates were asked where they lived their senior year. The response options were dormitory, on-campus apartment, off-campus apartment, and at home (i.e., commuted to and from the university). The data are shown below.

 

                       No Regular Exercise   Sporadic Exercise   Regular Exercise   Total
Dormitory              32                    30                  28                 90
On-Campus Apartment    74                    64                  42                 180
Off-Campus Apartment   110                   25                  15                 150
At Home                39                    6                   5                  50
Total                  255                   125                 90                 470

Based on the data, is there a relationship between exercise and students' living arrangement? Do you think where a person lives affects their exercise status? Here we have four independent comparison groups (living arrangement) and a discrete (ordinal) outcome variable with three response options. We specifically want to test whether living arrangement and exercise are independent. We will run the test using the five-step approach.

  • Step 1. Set up hypotheses and determine level of significance.

H 0 : Living arrangement and exercise are independent

H 1 : H 0 is false.                α=0.05

The null and research hypotheses are written in words rather than in symbols. The research hypothesis is that the grouping variable (living arrangement) and the outcome variable (exercise) are dependent or related.   

  • Step 2.  Select the appropriate test statistic.  

The test statistic is χ² = Σ (O − E)²/E, summed over all cells of the two-way table. The condition for appropriate use of this test statistic is that each expected frequency is at least 5. In Step 4 we will compute the expected frequencies and we will ensure that the condition is met.

  • Step 3. Set up decision rule.

The decision rule depends on the level of significance and the degrees of freedom, defined as df = (r-1)(c-1), where r and c are the numbers of rows and columns in the two-way data table. The row variable is the living arrangement and there are 4 arrangements considered, thus r=4. The column variable is exercise and 3 responses are considered, thus c=3. For this test, df=(4-1)(3-1)=3(2)=6. Again, with χ 2 tests there are no upper, lower or two-tailed tests. If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ 2 statistic will be close to zero. If the null hypothesis is false, then the χ 2 statistic will be large. The rejection region for the χ 2 test of independence is always in the upper (right-hand) tail of the distribution. For df=6 and a 5% level of significance, the appropriate critical value is 12.59 and the decision rule is as follows: Reject H 0 if χ 2 > 12.59.

  • Step 4. Compute the test statistic.

We now compute the expected frequencies using the formula,

Expected Frequency = (Row Total * Column Total)/N.

The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency.   The expected frequencies are shown in parentheses.

 

                       No Regular Exercise   Sporadic Exercise   Regular Exercise   Total
Dormitory              32 (48.8)             30 (23.9)           28 (17.2)          90
On-Campus Apartment    74 (97.7)             64 (47.9)           42 (34.5)          180
Off-Campus Apartment   110 (81.4)            25 (39.9)           15 (28.7)          150
At Home                39 (27.1)             6 (13.3)            5 (9.6)            50
Total                  255                   125                 90                 470

Notice that the expected frequencies are taken to one decimal place and that the sums of the observed frequencies are equal to the sums of the expected frequencies in each row and column of the table.  

Recall in Step 2 a condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 9.6) and therefore it is appropriate to use the test statistic. Summing (O − E)²/E over the 12 cells of the table gives χ² = 60.5.

  • Step 5. Conclusion.

We reject H 0 because 60.5 > 12.59. We have statistically significant evidence at α = 0.05 to show that H 0 is false or that living arrangement and exercise are not independent (i.e., they are dependent or related), p < 0.005.

Again, the χ 2 test of independence is used to test whether the distribution of the outcome variable is similar across the comparison groups. Here we rejected H 0 and concluded that the distribution of exercise is not independent of living arrangement, or that there is a relationship between living arrangement and exercise. The test provides an overall assessment of statistical significance. When the null hypothesis is rejected, it is important to review the sample data to understand the nature of the relationship. Consider again the sample data. 

Because there are different numbers of students in each living situation, it makes the comparisons of exercise patterns difficult on the basis of the frequencies alone. The following table displays the percentages of students in each exercise category by living arrangement. The percentages sum to 100% in each row of the table. For comparison purposes, percentages are also shown for the total sample along the bottom row of the table.

                       No Regular Exercise   Sporadic Exercise   Regular Exercise
Dormitory              36%                   33%                 31%
On-Campus Apartment    41%                   36%                 23%
Off-Campus Apartment   73%                   17%                 10%
At Home                78%                   12%                 10%
Total                  54%                   27%                 19%

From the above, it is clear that higher percentages of students living in dormitories and in on-campus apartments reported regular exercise (31% and 23%) as compared to students living in off-campus apartments and at home (10% each).  
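The whole test of independence can be reproduced with one call. Below is a minimal Python sketch using scipy.stats.chi2_contingency on the observed 4 x 3 table from this example (rows: dormitory, on-campus apartment, off-campus apartment, at home; columns: no, sporadic, regular exercise):

```python
from scipy.stats import chi2_contingency

observed = [
    [32,  30, 28],   # Dormitory
    [74,  64, 42],   # On-Campus Apartment
    [110, 25, 15],   # Off-Campus Apartment
    [39,   6,  5],   # At Home
]

# Returns the chi-square statistic, p-value, degrees of freedom,
# and the table of expected frequencies.
chi2_stat, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2_stat:.1f}, df = {dof}, p = {p:.2e}")  # about 60.5, df = 6
```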

Test Yourself

 Pancreaticoduodenectomy (PD) is a procedure that is associated with considerable morbidity. A study was recently conducted on 553 patients who had a successful PD between January 2000 and December 2010 to determine whether their Surgical Apgar Score (SAS) is related to 30-day perioperative morbidity and mortality. The table below gives the number of patients experiencing no, minor, or major morbidity by SAS category.  

Surgical Apgar Score   No Morbidity   Minor Morbidity   Major Morbidity or Mortality
0-4                    21             20                16
5-6                    135            71                35
7-10                   158            62                35

Question: What would be an appropriate statistical test to examine whether there is an association between Surgical Apgar Score and patient outcome? Using 14.13 as the value of the test statistic for these data, carry out the appropriate test at a 5% level of significance. Show all parts of your test.

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable and two independent comparison groups. We presented a test using a test statistic Z to test for equality of independent proportions. The chi-square test of independence can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square test of independence.

A randomized trial is designed to evaluate the effectiveness of a newly developed pain reliever designed to reduce pain in patients following joint replacement surgery. The trial compares the new pain reliever to the pain reliever currently in use (called the standard of care). A total of 100 patients undergoing joint replacement surgery agreed to participate in the trial. Patients were randomly assigned to receive either the new pain reliever or the standard pain reliever following surgery and were blind to the treatment assignment. Before receiving the assigned treatment, patients were asked to rate their pain on a scale of 0-10 with higher scores indicative of more pain. Each patient was then given the assigned treatment and after 30 minutes was again asked to rate their pain on the same scale. The primary outcome was a reduction in pain of 3 or more scale points (defined by clinicians as a clinically meaningful reduction). The following data were observed in the trial.

                         n    # with Reduction of 3+ Points   Proportion with Reduction of 3+ Points
New Pain Reliever        50   23                              0.46
Standard Pain Reliever   50   11                              0.22

We tested whether there was a significant difference in the proportions of patients reporting a meaningful reduction (i.e., a reduction of 3 or more scale points) using a Z statistic, as follows. 

H 0 : p 1 = p 2    

H 1 : p 1 ≠ p 2                             α=0.05

Here the new or experimental pain reliever is group 1 and the standard pain reliever is group 2.

We must first check that the sample size is adequate. Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group, or that:

min(n 1 p̂ 1 , n 1 (1 − p̂ 1 ), n 2 p̂ 2 , n 2 (1 − p̂ 2 )) ≥ 5.

In this example, we have min(23, 27, 11, 39) = 11. Therefore, the sample size is adequate, so the following formula can be used:

Z = (p̂ 1 − p̂ 2 ) / √( p̂ (1 − p̂ )(1/n 1 + 1/n 2 ) ), where p̂ is the overall (pooled) proportion of successes.

Reject H 0 if Z < -1.960 or if Z > 1.960.

We now substitute the sample data into the formula for the test statistic identified in Step 2. We first compute the overall proportion of successes: p̂ = (23 + 11)/(50 + 50) = 34/100 = 0.34.

We now substitute to compute the test statistic: Z = (0.46 − 0.22) / √( 0.34(0.66)(1/50 + 1/50) ) = 0.24/0.0947 = 2.53.

  • Step 5.  Conclusion.

We reject H 0 because 2.53 > 1.960. We have statistically significant evidence at α = 0.05 to show that there is a difference in the proportions of patients reporting a meaningful reduction in pain on the new versus the standard pain reliever.

We now conduct the same test using the chi-square test of independence.  

H 0 : Treatment and outcome (meaningful reduction in pain) are independent

H 1 :   H 0 is false.         α=0.05

The formula for the test statistic is χ² = Σ (O − E)²/E, summed over all cells of the two-way table.

For this test, df=(2-1)(2-1)=1. At a 5% level of significance, the appropriate critical value is 3.84 and the decision rule is as follows: Reject H0 if χ 2 > 3.84. (Note that 1.96 2 = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

We now compute the expected frequencies using: Expected Cell Frequency = (Row Total * Column Total)/N.

The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency. The expected frequencies are shown in parentheses.

                         Reduction of 3+ Points   No Reduction of 3+ Points   Total
New Pain Reliever        23 (17.0)                27 (33.0)                   50
Standard Pain Reliever   11 (17.0)                39 (33.0)                   50
Total                    34                       66                          100

A condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 17.0) and therefore it is appropriate to use the test statistic.

The test statistic is computed as follows:

χ² = (23 − 17)²/17 + (27 − 33)²/33 + (11 − 17)²/17 + (39 − 33)²/33 = 2.12 + 1.09 + 2.12 + 1.09 = 6.4

(Note that (2.53) 2 = 6.4, where 2.53 was the value of the Z statistic in the test for proportions shown above.)

We reject H 0 because 6.4 > 3.84. We have statistically significant evidence at α = 0.05 to show that treatment and outcome are not independent. This is the same conclusion we reached using the Z test for two independent proportions.
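This 2 x 2 result can also be reproduced with scipy.stats.chi2_contingency. A minimal Python sketch; correction=False turns off the Yates continuity correction so the uncorrected Pearson statistic matches the hand calculation and the squared Z statistic:

```python
from scipy.stats import chi2_contingency

# Rows: new pain reliever, standard pain reliever.
# Columns: reduction of 3+ points, no reduction of 3+ points.
observed = [[23, 27],
            [11, 39]]

chi2_stat, p, dof, expected = chi2_contingency(observed, correction=False)
print(f"chi-square = {chi2_stat:.2f}, df = {dof}, p = {p:.4f}")  # about 6.4
print(f"Squared Z statistic: {2.53 ** 2:.2f}")                   # about 6.4
```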

Chi-Squared Tests in R

The video below by Mike Marin demonstrates how to perform chi-squared tests in the R programming language.

Answer to Problem on Pancreaticoduodenectomy and Surgical Apgar Scores

We have 3 independent comparison groups (Surgical Apgar Score) and a categorical outcome variable (morbidity/mortality). We can run a Chi-Squared test of independence.

H 0 : Apgar scores and patient outcome are independent of one another.

H A : Apgar scores and patient outcome are not independent.

Chi-squared = 14.13

With df = (3 − 1)(3 − 1) = 4 and a 5% level of significance, the appropriate critical value is 9.49. Since 14.13 is greater than 9.49, we reject H 0 .

There is an association between Apgar scores and patient outcome. The lowest Apgar score group (0 to 4) experienced the highest percentage of major morbidity or mortality (16 out of 57 = 28%) compared to the other Apgar score groups.


11.1 - When Population Variances Are Equal

Let's start with the good news, namely that we've already done the dirty theoretical work in developing a hypothesis test for the difference in two population means \(\mu_1-\mu_2\) when we developed a \((1-\alpha)100\%\) confidence interval for the difference in two population means. Recall that if you have two independent samples from two normal distributions with equal variances \(\sigma^2_X=\sigma^2_Y=\sigma^2\), then:

\(T=\dfrac{(\bar{X}-\bar{Y})-(\mu_X-\mu_Y)}{S_p\sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}\)

follows a \(t_{n+m-2}\) distribution where \(S^2_p\), the pooled sample variance:

\(S_p^2=\dfrac{(n-1)S^2_X+(m-1)S^2_Y}{n+m-2}\)

is an unbiased estimator of the common variance \(\sigma^2\). Therefore, if we're interested in testing the null hypothesis:

\(H_0:\mu_X-\mu_Y=0\) (or equivalently \(H_0:\mu_X=\mu_Y\))

against any of the alternative hypotheses:

\(H_A:\mu_X-\mu_Y \neq 0,\quad H_A:\mu_X-\mu_Y < 0,\text{ or }H_A:\mu_X-\mu_Y > 0\)

we can use the test statistic:

\(t=\dfrac{\bar{X}-\bar{Y}}{S_p\sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}\)

and follow the standard hypothesis testing procedures. Let's take a look at an example.

Example 11-1


A psychologist was interested in exploring whether or not male and female college students have different driving behaviors. There were several ways that she could quantify driving behaviors. She opted to focus on the fastest speed ever driven by an individual. Therefore, the particular statistical question she framed was as follows:

Is the mean fastest speed driven by male college students different than the mean fastest speed driven by female college students?

She conducted a survey of a random \(n=34\) male college students and a random \(m=29\) female college students. Here is a descriptive summary of the results of her survey:

Males:   \(n = 34\), \(\bar{x} = 105.5\), \(s_x = 20.1\)
Females: \(m = 29\), \(\bar{y} = 90.9\), \(s_y = 12.2\)

and here is a graphical summary of the data in the form of a dotplot:

Is there sufficient evidence at the \(\alpha=0.05\) level to conclude that the mean fastest speed driven by male college students differs from the mean fastest speed driven by female college students?

Because the observed standard deviations of the two samples are of similar magnitude, we'll assume that the population variances are equal. Let's also assume that the two populations of fastest speed driven for males and females are normally distributed. (We can confirm, or deny, such an assumption using a normal probability plot, but let's simplify our analysis for now.) The randomness of the two samples allows us to assume independence of the measurements as well.

Okay, assumptions all met, we can test the null hypothesis:

\(H_0:\mu_M-\mu_F=0\)

against the alternative hypothesis:

\(H_A:\mu_M-\mu_F \neq 0\)

using the test statistic:

\(t=\dfrac{(105.5-90.9)-0}{16.9 \sqrt{\dfrac{1}{34}+\dfrac{1}{29}}}=3.42\)

because, among other things, the pooled sample standard deviation is:

\(s_p=\sqrt{\dfrac{33(20.1^2)+28(12.2^2)}{61}}=16.9\)

The critical value approach tells us to reject the null hypothesis in favor of the alternative hypothesis if:

\(|t|\geq t_{\alpha/2,n+m-2}=t_{0.025,61}=1.9996\)

We reject the null hypothesis because the test statistic (\(t=3.42\)) falls in the rejection region (\(|3.42| \geq 1.9996\)).

There is sufficient evidence at the \(\alpha=0.05\) level to conclude that the average fastest speed driven by the population of male college students differs from the average fastest speed driven by the population of female college students.

Not surprisingly, the decision is the same using the \(p\)-value approach. The \(p\)-value is 0.0012:

\(P=2\times P(T_{61}>3.42)=2(0.0006)=0.0012\)

Therefore, because \(p=0.0012\le \alpha=0.05\), we reject the null hypothesis in favor of the alternative hypothesis. Again, we conclude that there is sufficient evidence at the \(\alpha=0.05\) level to conclude that the average fastest speed driven by the population of male college students differs from the average fastest speed driven by the population of female college students.

By the way, we'll see how to tell Minitab to conduct a two-sample t -test in a bit here, but in the meantime, this is what the output would look like:

Two-Sample T:   For Fastest

Gender N Mean StDev SE Mean
1 34 105.5 20.1 3.4
2 29 90.9 12.2 2.3

Difference = mu (1) - mu (2)
Estimate for difference: 14.6085
95% CI for difference: (6.0630, 23.1540)
T-Test of difference = 0 (vs not =): T-Value = 3.42   P-Value = 0.001   DF = 61
Both use Pooled StDev = 16.9066
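The same analysis can be reproduced from the summary statistics alone. A minimal Python sketch using scipy.stats.ttest_ind_from_stats with equal_var=True (the pooled-variance test used above):

```python
from scipy.stats import ttest_ind_from_stats

t_stat, p = ttest_ind_from_stats(
    mean1=105.5, std1=20.1, nobs1=34,   # males
    mean2=90.9,  std2=12.2, nobs2=29,   # females
    equal_var=True,                     # pooled variance, df = n + m - 2 = 61
)
print(f"t = {t_stat:.2f}, p-value = {p:.4f}")  # about 3.42, p about 0.001
```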


Null Hypothesis

Null Hypothesis, often denoted as H 0 , is a foundational concept in statistical hypothesis testing. It represents an assumption that no significant difference, effect, or relationship exists between variables within a population. It serves as a baseline assumption, positing that no change or effect has occurred, and hypothesis testing evaluates this default claim against the data.

In this article, we will discuss the null hypothesis in detail, along with some solved examples and questions on the null hypothesis.


What is Null Hypothesis?


Null Hypothesis in statistical analysis suggests the absence of statistical significance within a specific set of observed data. Hypothesis testing, using sample data, evaluates the validity of this hypothesis. Commonly denoted as H 0 or simply “null,” it plays an important role in quantitative analysis, examining theories related to markets, investment strategies, or economies to determine their validity.

Null Hypothesis Meaning

Null Hypothesis represents a default position, often suggesting no effect or difference, against which researchers compare their experimental results. The Null Hypothesis, often denoted as H 0 asserts a default assumption in statistical analysis. It posits no significant difference or effect, serving as a baseline for comparison in hypothesis testing.

Null Hypothesis Symbol

The null hypothesis is represented as H 0 ; it symbolizes the absence of a measurable effect or difference in the variables under examination.

A simple example would be asserting that the mean score of a group equals a specified value, such as stating that the average IQ of a population is 100.

Formula of Null Hypothesis

The Null Hypothesis is typically formulated as a statement of equality or absence of a specific parameter in the population being studied, which provides a clear and testable prediction for comparison with the alternative hypothesis. Some common formulations are given below.

Mean Comparison (Two-sample t-test)

H 0 : μ 1 = μ 2

This asserts that there is no significant difference between the means of two populations or groups.

Proportion Comparison

H 0 : p 1 − p 2 = 0

This suggests no significant difference in proportions between two populations or conditions.

Equality in Variance (F-test in ANOVA)

H 0 : σ 1 ² = σ 2 ²

This states that there’s no significant difference in variances between groups or populations.

Independence (Chi-square Test of Independence):

H 0 : Variables are independent

This asserts that there’s no association or relationship between categorical variables.
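Each of these null hypotheses corresponds to a standard test in most statistical software. The sketch below is one possible pairing in Python (scipy), with small made-up samples purely for illustration; note that Levene's test is used here for the equal-variance hypothesis, whereas the F-test mentioned above is another option.

```python
from scipy import stats

# Made-up samples for illustration only.
group1 = [4.1, 3.8, 5.0, 4.6, 4.3, 3.9, 4.8, 4.4]
group2 = [3.6, 4.0, 3.5, 4.2, 3.8, 3.7, 4.1, 3.9]

# H0: mu1 = mu2 (two-sample t-test)
print(stats.ttest_ind(group1, group2))

# H0: sigma1^2 = sigma2^2 (Levene's test for equal variances)
print(stats.levene(group1, group2))

# H0: p1 - p2 = 0, and H0: the variables are independent;
# both can be assessed with a chi-square test on a 2 x 2 table
# of (successes, failures) per group.
table = [[30, 70],
         [45, 55]]
chi2_stat, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square = {chi2_stat:.2f}, p = {p:.3f}")
```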

Types of Null Hypothesis

Null hypotheses come in several forms, including simple and composite forms, each tailored to the complexity of the research question. Understanding these types is pivotal for effective hypothesis testing.

Equality Null Hypothesis (Simple Null Hypothesis)

The Equality Null Hypothesis, also known as the Simple Null Hypothesis, is a fundamental concept in statistical hypothesis testing that assumes no difference, effect or relationship between groups, conditions or populations being compared.

Non-Inferiority Null Hypothesis

In some studies, the focus might be on demonstrating that a new treatment or method is not significantly worse than the standard or existing one.

Superiority Null Hypothesis

The concept of a superiority null hypothesis comes into play when a study aims to demonstrate that a new treatment, method, or intervention is significantly better than an existing or standard one.

Independence Null Hypothesis

In certain statistical tests, such as chi-square tests for independence, the null hypothesis assumes no association or independence between categorical variables.

Homogeneity Null Hypothesis

In tests like ANOVA (Analysis of Variance), the null hypothesis suggests that there’s no difference in population means across different groups.

Null Hypothesis Examples

  • Medicine: Null Hypothesis: “No significant difference exists in blood pressure levels between patients given the experimental drug versus those given a placebo.”
  • Education: Null Hypothesis: “There’s no significant variation in test scores between students using a new teaching method and those using traditional teaching.”
  • Economics: Null Hypothesis: “There’s no significant change in consumer spending pre- and post-implementation of a new taxation policy.”
  • Environmental Science: Null Hypothesis: “There’s no substantial difference in pollution levels before and after a water treatment plant’s establishment.”

Principle of Null Hypothesis

The principle of the null hypothesis is a fundamental concept in statistical hypothesis testing. It involves making an assumption about the population parameter or the absence of an effect or relationship between variables.

In essence, the null hypothesis (H 0 ) proposes that there is no significant difference, effect, or relationship between variables. It serves as a starting point or a default assumption that there is no real change, no effect or no difference between groups or conditions.

The null hypothesis is usually formulated to be tested against an alternative hypothesis (H 1 or H a ), which suggests that there is an effect, difference or relationship present in the population.

Null Hypothesis Rejection

Rejecting the Null Hypothesis occurs when statistical evidence suggests a significant departure from the assumed baseline. It implies that there is enough evidence to support the alternative hypothesis, indicating a meaningful effect or difference. Null Hypothesis rejection occurs when statistical evidence suggests a deviation from the assumed baseline, prompting a reconsideration of the initial hypothesis.

How Do You Find the Null Hypothesis?

Identifying the null hypothesis involves defining the status quo, asserting no effect, and formulating a statement suitable for statistical analysis.

When is Null Hypothesis Rejected?

The Null Hypothesis is rejected when statistical tests indicate a significant departure from the expected outcome, leading to the consideration of alternative hypotheses. It occurs when statistical evidence suggests a deviation from the assumed baseline, prompting a reconsideration of the initial hypothesis.

In statistical hypothesis testing, researchers begin by stating the null hypothesis, often based on theoretical considerations or previous research. The null hypothesis is then tested against an alternative hypothesis (Ha), which represents the researcher’s claim or the hypothesis they seek to support.

The process of hypothesis testing involves collecting sample data and using statistical methods to assess the likelihood of observing the data if the null hypothesis were true. This assessment is typically done by calculating a test statistic, which measures the difference between the observed data and what would be expected under the null hypothesis.

In the realm of hypothesis testing, the null hypothesis (H 0 ) and alternative hypothesis (H₁ or Ha) play critical roles. The null hypothesis generally assumes no difference, effect, or relationship between variables, suggesting that any observed change or effect is due to random chance. Its counterpart, the alternative hypothesis, asserts the presence of a significant difference, effect, or relationship between variables, challenging the null hypothesis. These hypotheses are formulated based on the research question and guide statistical analyses.

Difference Between Null Hypothesis and Alternative Hypothesis

The null hypothesis (H 0 ) serves as the baseline assumption in statistical testing, suggesting no significant effect, relationship, or difference within the data. It often proposes that any observed change or correlation is merely due to chance or random variation. Conversely, the alternative hypothesis (H 1 or Ha) contradicts the null hypothesis, positing the existence of a genuine effect, relationship or difference in the data. It represents the researcher’s intended focus, seeking to provide evidence against the null hypothesis and support for a specific outcome or theory. These hypotheses form the crux of hypothesis testing, guiding the assessment of data to draw conclusions about the population being studied.

Criteria          Null Hypothesis                                Alternative Hypothesis
Definition        Assumes no effect or difference                Asserts a specific effect or difference
Symbol            H 0                                            H 1 (or H a )
Formulation       States equality or absence of parameter        States a specific value or relationship
Testing Outcome   Rejected if evidence of a significant effect   Accepted if evidence supports the hypothesis

Let’s envision a scenario where a researcher aims to examine the impact of a new medication on reducing blood pressure among patients. In this context:

Null Hypothesis (H 0 ): “The new medication does not produce a significant effect in reducing blood pressure levels among patients.”

Alternative Hypothesis (H 1 or Ha): “The new medication yields a significant effect in reducing blood pressure levels among patients.”

The null hypothesis implies that any observed alterations in blood pressure subsequent to the medication’s administration are a result of random fluctuations rather than a consequence of the medication itself. Conversely, the alternative hypothesis contends that the medication does indeed generate a meaningful alteration in blood pressure levels, distinct from what might naturally occur or by random chance.


Example 1: A researcher claims that the average time students spend on homework is 2 hours per night.

Null Hypothesis (H 0 ): The average time students spend on homework is equal to 2 hours per night.
Data: A random sample of 30 students has an average homework time of 1.8 hours with a standard deviation of 0.5 hours.
Test Statistic and Decision: Using a one-sample t-test, t = (1.8 − 2)/(0.5/√30) ≈ −2.19 with 29 degrees of freedom. At α = 0.05 the two-tailed critical values are ±2.045, so the test statistic falls in the rejection region.
Conclusion: We reject the null hypothesis; the sample provides evidence that the average homework time differs from 2 hours per night.

Example 2: A company asserts that the error rate in its production process is less than 1%.

Null Hypothesis (H 0 ): The error rate in the production process is 1% or higher (p ≥ 0.01).
Data: A sample of 500 products shows an error rate of 0.8%.
Test Statistic and Decision: Using a one-sample z-test for a proportion, z = (0.008 − 0.01)/√(0.01 × 0.99/500) ≈ −0.45, which does not fall in the rejection region for a lower-tailed test at α = 0.05 (critical value −1.645).
Conclusion: We fail to reject the null hypothesis; the sample does not provide enough evidence to support the company's claim that the error rate is below 1%.
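The decisions in the two examples above can be checked directly from the summary numbers. A minimal Python sketch (scipy supplies the critical values; all other numbers come from the examples):

```python
from math import sqrt
from scipy.stats import t, norm

# Example 1: one-sample t-test of H0: mu = 2, from summary statistics.
mean, sd, n, mu0 = 1.8, 0.5, 30, 2.0
t_stat = (mean - mu0) / (sd / sqrt(n))      # about -2.19
t_crit = t.ppf(0.975, df=n - 1)             # about 2.045 (two-tailed, alpha = 0.05)
print(f"t = {t_stat:.2f}, two-tailed critical value = {t_crit:.3f}")

# Example 2: one-sample z-test for a proportion, H0: p >= 0.01 (lower-tailed).
p_hat, p0, m = 0.008, 0.01, 500
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / m)  # about -0.45
z_crit = norm.ppf(0.05)                     # about -1.645
print(f"z = {z:.2f}, lower-tail critical value = {z_crit:.3f}")
```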

Q1. A researcher claims that the average time spent by students on homework is less than 2 hours per day. Formulate the null hypothesis for this claim.

Q2. A manufacturing company states that their new machine produces widgets with a defect rate of less than 5%. Write the null hypothesis to test this claim.

Q3. An educational institute believes that their online course completion rate is at least 60%. Develop the null hypothesis to validate this assertion.

Q4. A restaurant claims that the waiting time for customers during peak hours is not more than 15 minutes. Formulate the null hypothesis for this claim.

Q5. A study suggests that the mean weight loss after following a specific diet plan for a month is more than 8 pounds. Construct the null hypothesis to evaluate this statement.

Summary – Null Hypothesis and Alternative Hypothesis

The null hypothesis (H 0 ) and alternative hypothesis (H a ) are fundamental concepts in statistical hypothesis testing. The null hypothesis represents the default assumption, stating that there is no significant effect, difference, or relationship between variables. It serves as the baseline against which the alternative hypothesis is tested. In contrast, the alternative hypothesis represents the researcher’s hypothesis or the claim to be tested, suggesting that there is a significant effect, difference, or relationship between variables. The relationship between the null and alternative hypotheses is such that they are complementary, and statistical tests are conducted to determine whether the evidence from the data is strong enough to reject the null hypothesis in favor of the alternative hypothesis. This decision is based on the strength of the evidence and the chosen level of significance. Ultimately, the choice between the null and alternative hypotheses depends on the specific research question and the direction of the effect being investigated.

FAQs on Null Hypothesis

What Does the Null Hypothesis Stand For?

The null hypothesis, denoted as H 0 ​, is a fundamental concept in statistics used for hypothesis testing. It represents the statement that there is no effect or no difference, and it is the hypothesis that the researcher typically aims to provide evidence against.

How to Form a Null Hypothesis?

A null hypothesis is formed based on the assumption that there is no significant difference or effect between the groups being compared or no association between variables being tested. It often involves stating that there is no relationship, no change, or no effect in the population being studied.

When Do We Reject the Null Hypothesis?

In statistical hypothesis testing, if the p-value (the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true) is lower than the chosen significance level (commonly 0.05), we reject the null hypothesis. This suggests that the data provide enough evidence to refute the assumption made in the null hypothesis.

What is a Null Hypothesis in Research?

In research, the null hypothesis represents the default assumption or position that there is no significant difference or effect. Researchers often try to test this hypothesis by collecting data and performing statistical analyses to see if the observed results contradict the assumption.

What Are Alternative and Null Hypotheses?

The null hypothesis (H0) is the default assumption that there is no significant difference or effect. The alternative hypothesis (H1 or Ha) is the opposite, suggesting there is a significant difference, effect or relationship.

What Does it Mean to Reject the Null Hypothesis?

Rejecting the null hypothesis implies that there is enough evidence in the data to support the alternative hypothesis. In simpler terms, it suggests that there might be a significant difference, effect or relationship between the groups or variables being studied.

How to Find Null Hypothesis?

Formulating a null hypothesis often involves considering the research question and assuming that no difference or effect exists. It should be a statement that can be tested through data collection and statistical analysis, typically stating no relationship or no change between variables or groups.

How is Null Hypothesis denoted?

The null hypothesis is commonly symbolized as H 0 in statistical notation.

What is the Purpose of the Null hypothesis in Statistical Analysis?

The null hypothesis serves as a starting point for hypothesis testing, enabling researchers to assess if there’s enough evidence to reject it in favor of an alternative hypothesis.

What happens if we Reject the Null hypothesis?

Rejecting the null hypothesis implies that there is sufficient evidence to support an alternative hypothesis, suggesting a significant effect or relationship between variables.

Which Tests Are Used to Evaluate the Null Hypothesis?

Various statistical tests, such as t-tests or chi-square tests, are employed to evaluate the null hypothesis in different scenarios.
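
As a purely illustrative example of one such test, the sketch below runs a chi-square test of independence on a hypothetical 2×2 table using SciPy; the counts are invented and not taken from any study discussed in this article.

```python
# Chi-square test of independence on a hypothetical 2x2 table.
# H0: the row and column variables are independent (no association).
from scipy.stats import chi2_contingency

observed = [[30, 70],   # hypothetical: group 1 -> 30 "yes", 70 "no"
            [15, 85]]   # hypothetical: group 2 -> 15 "yes", 85 "no"
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
# If p < 0.05, we would reject H0 and conclude the variables are associated.
```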


Chapter 13: Inferential Statistics

Understanding Null Hypothesis Testing

Learning Objectives

  • Explain the purpose of null hypothesis testing, including the role of sampling error.
  • Describe the basic logic of null hypothesis testing.
  • Describe the role of relationship strength and sample size in determining statistical significance and make reasonable judgments about statistical significance based on these two factors.

The Purpose of Null Hypothesis Testing

As we have seen, psychological research typically involves measuring one or more variables for a sample and computing descriptive statistics for that sample. In general, however, the researcher’s goal is not to draw conclusions about that sample but to draw conclusions about the population that the sample was selected from. Thus researchers must use sample statistics to draw conclusions about the corresponding values in the population. These corresponding values in the population are called  parameters . Imagine, for example, that a researcher measures the number of depressive symptoms exhibited by each of 50 clinically depressed adults and computes the mean number of symptoms. The researcher probably wants to use this sample statistic (the mean number of symptoms for the sample) to draw conclusions about the corresponding population parameter (the mean number of symptoms for clinically depressed adults).

Unfortunately, sample statistics are not perfect estimates of their corresponding population parameters. This is because there is a certain amount of random variability in any statistic from sample to sample. The mean number of depressive symptoms might be 8.73 in one sample of clinically depressed adults, 6.45 in a second sample, and 9.44 in a third—even though these samples are selected randomly from the same population. Similarly, the correlation (Pearson’s  r ) between two variables might be +.24 in one sample, −.04 in a second sample, and +.15 in a third—again, even though these samples are selected randomly from the same population. This random variability in a statistic from sample to sample is called  sampling error . (Note that the term error  here refers to random variability and does not imply that anyone has made a mistake. No one “commits a sampling error.”)
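
To make sampling error concrete, the short simulation below (a sketch using NumPy, with made-up population values) draws several random samples of 50 scores from the same population and prints their means, which differ from sample to sample even though the population never changes.

```python
# Sampling error demo: sample means vary across random samples drawn from
# one and the same population (hypothetical mean 8, SD 3).
import numpy as np

rng = np.random.default_rng(0)
population_mean, population_sd = 8.0, 3.0

sample_means = [rng.normal(population_mean, population_sd, size=50).mean()
                for _ in range(5)]
print([round(m, 2) for m in sample_means])
# The five means all differ from each other and from 8.0 purely because of
# random variability -- no one has made a mistake.
```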

One implication of this is that when there is a statistical relationship in a sample, it is not always clear that there is a statistical relationship in the population. A small difference between two group means in a sample might indicate that there is a small difference between the two group means in the population. But it could also be that there is no difference between the means in the population and that the difference in the sample is just a matter of sampling error. Similarly, a Pearson’s  r  value of −.29 in a sample might mean that there is a negative relationship in the population. But it could also be that there is no relationship in the population and that the relationship in the sample is just a matter of sampling error.

In fact, any statistical relationship in a sample can be interpreted in two ways:

  • There is a relationship in the population, and the relationship in the sample reflects this.
  • There is no relationship in the population, and the relationship in the sample reflects only sampling error.

The purpose of null hypothesis testing is simply to help researchers decide between these two interpretations.

The Logic of Null Hypothesis Testing

Null hypothesis testing  is a formal approach to deciding between two interpretations of a statistical relationship in a sample. One interpretation is called the   null hypothesis  (often symbolized  H 0  and read as “H-naught”). This is the idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error. Informally, the null hypothesis is that the sample relationship “occurred by chance.” The other interpretation is called the  alternative hypothesis  (often symbolized as  H 1 ). This is the idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Again, every statistical relationship in a sample can be interpreted in either of these two ways: It might have occurred by chance, or it might reflect a relationship in the population. So researchers need a way to decide between them. Although there are many specific null hypothesis testing techniques, they are all based on the same general logic. The steps are as follows:

  • Assume for the moment that the null hypothesis is true. There is no relationship between the variables in the population.
  • Determine how likely the sample relationship would be if the null hypothesis were true.
  • If the sample relationship would be extremely unlikely, then reject the null hypothesis  in favour of the alternative hypothesis. If it would not be extremely unlikely, then  retain the null hypothesis .

Following this logic, we can begin to understand why Mehl and his colleagues concluded that there is no difference in talkativeness between women and men in the population. In essence, they asked the following question: “If there were no difference in the population, how likely is it that we would find a small difference of  d  = 0.06 in our sample?” Their answer to this question was that this sample relationship would be fairly likely if the null hypothesis were true. Therefore, they retained the null hypothesis—concluding that there is no evidence of a sex difference in the population. We can also see why Kanner and his colleagues concluded that there is a correlation between hassles and symptoms in the population. They asked, “If the null hypothesis were true, how likely is it that we would find a strong correlation of +.60 in our sample?” Their answer to this question was that this sample relationship would be fairly unlikely if the null hypothesis were true. Therefore, they rejected the null hypothesis in favour of the alternative hypothesis—concluding that there is a positive correlation between these variables in the population.

A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the  p value . A low  p  value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A high  p  value means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis. But how low must the  p  value be before the sample result is considered unlikely enough to reject the null hypothesis? In null hypothesis testing, this criterion is called  α (alpha)  and is almost always set to .05. If there is less than a 5% chance of a result as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be  statistically significant . If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained. This does not necessarily mean that the researcher accepts the null hypothesis as true—only that there is not currently enough evidence to conclude that it is true. Researchers often use the expression “fail to reject the null hypothesis” rather than “retain the null hypothesis,” but they never use the expression “accept the null hypothesis.”

The Misunderstood  p  Value

The  p  value is one of the most misunderstood quantities in psychological research (Cohen, 1994) [1] . Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!

The most common misinterpretation is that the  p  value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the  p  value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect . The  p  value is really the probability of a result at least as extreme as the sample result  if  the null hypothesis  were  true. So a  p  value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.

You can avoid this misunderstanding by remembering that the  p  value is not the probability that any particular  hypothesis  is true or false. Instead, it is the probability of obtaining the  sample result  if the null hypothesis were true.
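
One way to internalize this definition is to simulate it. The sketch below (illustrative NumPy code with invented numbers) builds a world in which the null hypothesis is true and counts how often a mean difference at least as extreme as a hypothetical observed one occurs; that proportion is the p value.

```python
# Simulated p value: the probability of a result at least as extreme as the
# observed one, computed in a world where the null hypothesis is true.
import numpy as np

rng = np.random.default_rng(1)
n, n_sims = 25, 100_000
observed_diff = 0.6   # hypothetical observed mean difference (in SD units)

group_a = rng.normal(0, 1, (n_sims, n)).mean(axis=1)
group_b = rng.normal(0, 1, (n_sims, n)).mean(axis=1)
null_diffs = group_a - group_b          # mean differences when H0 is true

p_value = np.mean(np.abs(null_diffs) >= observed_diff)
print(f"simulated two-tailed p = {p_value:.3f}")
# Note: this is P(result this extreme | H0 true), not P(H0 true | result).
```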

Role of Sample Size and Relationship Strength

Recall that null hypothesis testing involves answering the question, “If the null hypothesis were true, what is the probability of a sample result as extreme as this one?” In other words, “What is the  p  value?” It can be helpful to see that the answer to this question depends on just two considerations: the strength of the relationship and the size of the sample. Specifically, the stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true. That is, the lower the  p  value. This should make sense. Imagine a study in which a sample of 500 women is compared with a sample of 500 men in terms of some psychological characteristic, and Cohen’s  d  is a strong 0.50. If there were really no sex difference in the population, then a result this strong based on such a large sample should seem highly unlikely. Now imagine a similar study in which a sample of three women is compared with a sample of three men, and Cohen’s  d  is a weak 0.10. If there were no sex difference in the population, then a relationship this weak based on such a small sample should seem likely. And this is precisely why the null hypothesis would be rejected in the first example and retained in the second.

Of course, sometimes the result can be weak and the sample large, or the result can be strong and the sample small. In these cases, the two considerations trade off against each other so that a weak result can be statistically significant if the sample is large enough and a strong relationship can be statistically significant even if the sample is small. Table 13.1 shows roughly how relationship strength and sample size combine to determine whether a sample result is statistically significant. The columns of the table represent the three levels of relationship strength: weak, medium, and strong. The rows represent four sample sizes that can be considered small, medium, large, and extra large in the context of psychological research. Thus each cell in the table represents a combination of relationship strength and sample size. If a cell contains the word  Yes , then this combination would be statistically significant for both Cohen’s  d  and Pearson’s  r . If it contains the word  No , then it would not be statistically significant for either. There is one cell where the decision for  d  and  r  would be different and another where it might be different depending on some additional considerations, which are discussed in Section 13.2 “Some Basic Null Hypothesis Tests”

Table 13.1 How Relationship Strength and Sample Size Combine to Determine Whether a Result Is Statistically Significant
Sample Size | Weak relationship | Medium-strength relationship | Strong relationship
Small (N = 20) | No | No | d = Maybe, r = Yes
Medium (N = 50) | No | Yes | Yes
Large (N = 100) | d = Yes, r = No | Yes | Yes
Extra large (N = 500) | Yes | Yes | Yes

Although Table 13.1 provides only a rough guideline, it shows very clearly that weak relationships based on medium or small samples are never statistically significant and that strong relationships based on medium or larger samples are always statistically significant. If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment. One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses. For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis. If for some reason your formal null hypothesis test indicates otherwise, then you need to double-check your computations and interpretations. A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.
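
A rough numerical companion to Table 13.1: for a two-group comparison, the p value is determined by the effect size and the sample size alone. The sketch below (assuming SciPy is available, and treating n as the number of participants per group, an assumption the table itself does not spell out) computes the p value that would result if the sample effect exactly equalled a given Cohen's d.

```python
# How effect size (Cohen's d) and per-group sample size n combine to
# determine the p value for an independent-groups t-test.
from math import sqrt
from scipy import stats

def p_from_d(d, n):
    """Two-tailed p when the sample effect size is exactly d with n per group."""
    t = d * sqrt(n / 2)      # t statistic implied by d and n
    df = 2 * n - 2
    return 2 * stats.t.sf(abs(t), df)

for d, label in [(0.1, "weak"), (0.3, "medium"), (0.5, "strong")]:
    for n in (20, 50, 100, 500):
        print(f"d = {d} ({label}), n = {n} per group: p = {p_from_d(d, n):.3f}")
```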

Statistical Significance Versus Practical Significance

Table 13.1 illustrates another extremely important point. A statistically significant result is not necessarily a strong one. Even a very weak result can be statistically significant if it is based on a large enough sample. This is closely related to Janet Shibley Hyde’s argument about sex differences (Hyde, 2007) [2] . The differences between women and men in mathematical problem solving and leadership ability are statistically significant. But the word  significant  can cause people to interpret these differences as strong and important—perhaps even important enough to influence the college courses they take or even who they vote for. As we have seen, however, these statistically significant differences are actually quite weak—perhaps even “trivial.”

This is why it is important to distinguish between the  statistical  significance of a result and the  practical  significance of that result.  Practical significance refers to the importance or usefulness of the result in some real-world context. Many sex differences are statistically significant—and may even be interesting for purely scientific reasons—but they are not practically significant. In clinical practice, this same concept is often referred to as “clinical significance.” For example, a study on a new treatment for social phobia might show that it produces a statistically significant positive effect. Yet this effect still might not be strong enough to justify the time, effort, and other costs of putting it into practice—especially if easier and cheaper treatments that work almost as well already exist. Although statistically significant, this result would be said to lack practical or clinical significance.
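
The distinction is easy to see numerically. In the hedged sketch below (hypothetical numbers, SciPy assumed), a Cohen's d of 0.05, far too small to matter in most real-world settings, still yields a minuscule p value once the sample is large enough.

```python
# Statistical significance without practical significance: a trivially small
# effect becomes "significant" when the sample is huge.
from math import sqrt
from scipy import stats

d, n_per_group = 0.05, 50_000        # very weak effect, very large study
t = d * sqrt(n_per_group / 2)
p = 2 * stats.t.sf(abs(t), 2 * n_per_group - 2)
print(f"d = {d}, n = {n_per_group} per group -> p = {p:.2e}")
```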

Key Takeaways

  • Null hypothesis testing is a formal approach to deciding whether a statistical relationship in a sample reflects a real relationship in the population or is just due to chance.
  • The logic of null hypothesis testing involves assuming that the null hypothesis is true, finding how likely the sample result would be if this assumption were correct, and then making a decision. If the sample result would be unlikely if the null hypothesis were true, then it is rejected in favour of the alternative hypothesis. If it would not be unlikely, then the null hypothesis is retained.
  • The probability of obtaining the sample result if the null hypothesis were true (the  p  value) is based on two considerations: relationship strength and sample size. Reasonable judgments about whether a sample relationship is statistically significant can often be made by quickly considering these two factors.
  • Statistical significance is not the same as relationship strength or importance. Even weak relationships can be statistically significant if the sample size is large enough. It is important to consider relationship strength and the practical significance of a result in addition to its statistical significance.
Exercises

  • Discussion: Imagine a study showing that people who eat more broccoli tend to be happier. Explain for someone who knows nothing about statistics why the researchers would conduct a null hypothesis test.
  • Practice: Use Table 13.1 as a rough guide to decide whether each of the following results is statistically significant.
  • The correlation between two variables is r = −.78 based on a sample size of 137.
  • The mean score on a psychological characteristic for women is 25 (SD = 5) and the mean score for men is 24 (SD = 5). There were 12 women and 10 men in this study.
  • In a memory experiment, the mean number of items recalled by the 40 participants in Condition A was 0.50 standard deviations greater than the mean number recalled by the 40 participants in Condition B.
  • In another memory experiment, the mean scores for participants in Condition A and Condition B came out exactly the same!
  • A student finds a correlation of r = .04 between the number of units the students in his research methods class are taking and the students’ level of stress.

Notes

  • 1. Cohen, J. (1994). The world is round: p < .05. American Psychologist, 49, 997–1003.
  • 2. Hyde, J. S. (2007). New directions in the study of gender similarities and differences. Current Directions in Psychological Science, 16, 259–263.

Glossary

  • Parameters: Values in a population that correspond to variables measured in a study.
  • Sampling error: The random variability in a statistic from sample to sample.
  • Null hypothesis testing: A formal approach to deciding between two interpretations of a statistical relationship in a sample.
  • Null hypothesis: The idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error.
  • Alternative hypothesis: The idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.
  • Reject the null hypothesis: When the relationship found in the sample would be extremely unlikely if the null hypothesis were true, the idea that the relationship occurred “by chance” is rejected.
  • Retain the null hypothesis: When the relationship found in the sample is likely to have occurred by chance, the null hypothesis is not rejected.
  • p value: The probability that, if the null hypothesis were true, the result found in the sample would occur.
  • α (alpha): How low the p value must be before the sample result is considered unlikely enough to reject the null hypothesis.
  • Statistically significant: When there is less than a 5% chance of a result as extreme as the sample result occurring if the null hypothesis were true, and the null hypothesis is therefore rejected.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Published: 04 September 2024

CDK5–cyclin B1 regulates mitotic fidelity

  • Xiao-Feng Zheng   ORCID: orcid.org/0000-0001-8769-4604 1   na1 ,
  • Aniruddha Sarkar   ORCID: orcid.org/0000-0002-9393-1335 1   na1 ,
  • Humphrey Lotana 2 ,
  • Aleem Syed   ORCID: orcid.org/0000-0001-7942-3900 1 ,
  • Huy Nguyen   ORCID: orcid.org/0000-0002-4424-1047 1 ,
  • Richard G. Ivey 3 ,
  • Jacob J. Kennedy 3 ,
  • Jeffrey R. Whiteaker 3 ,
  • Bartłomiej Tomasik   ORCID: orcid.org/0000-0001-5648-345X 1 , 4   nAff7 ,
  • Kaimeng Huang   ORCID: orcid.org/0000-0002-0552-209X 1 , 5 ,
  • Feng Li 1 ,
  • Alan D. D’Andrea   ORCID: orcid.org/0000-0001-6168-6294 1 , 5 ,
  • Amanda G. Paulovich   ORCID: orcid.org/0000-0001-6532-6499 3 ,
  • Kavita Shah 2 ,
  • Alexander Spektor   ORCID: orcid.org/0000-0002-1085-3205 1 , 5 &
  • Dipanjan Chowdhury   ORCID: orcid.org/0000-0001-5645-3752 1 , 5 , 6  

Nature (2024)

CDK1 has been known to be the sole cyclin-dependent kinase (CDK) partner of cyclin B1 to drive mitotic progression 1 . Here we demonstrate that CDK5 is active during mitosis and is necessary for maintaining mitotic fidelity. CDK5 is an atypical CDK owing to its high expression in post-mitotic neurons and activation by non-cyclin proteins p35 and p39 2 . Here, using independent chemical genetic approaches, we specifically abrogated CDK5 activity during mitosis, and observed mitotic defects, nuclear atypia and substantial alterations in the mitotic phosphoproteome. Notably, cyclin B1 is a mitotic co-factor of CDK5. Computational modelling, comparison with experimentally derived structures of CDK–cyclin complexes and validation with mutational analysis indicate that CDK5–cyclin B1 can form a functional complex. Disruption of the CDK5–cyclin B1 complex phenocopies CDK5 abrogation in mitosis. Together, our results demonstrate that cyclin B1 partners with both CDK5 and CDK1, and CDK5–cyclin B1 functions as a canonical CDK–cyclin complex to ensure mitotic fidelity.


Data availability.

All data supporting the findings of this study are available in the Article and its Supplementary Information . The LC–MS/MS proteomics data have been deposited to the ProteomeXchange Consortium 60 via the PRIDE 61 partner repository under dataset identifier PXD038386 . Correspondence regarding experiments and requests for materials should be addressed to the corresponding authors.

Wieser, S. & Pines, J. The biochemistry of mitosis. Cold Spring Harb. Perspect. Biol. 7 , a015776 (2015).

Dhavan, R. & Tsai, L. H. A decade of CDK5. Nat. Rev. Mol. Cell Biol. 2 , 749–759 (2001).

Malumbres, M. Cyclin-dependent kinases. Genome Biol. 15 , 122 (2014).

Coverley, D., Laman, H. & Laskey, R. A. Distinct roles for cyclins E and A during DNA replication complex assembly and activation. Nat. Cell Biol. 4 , 523–528 (2002).

Desai, D., Wessling, H. C., Fisher, R. P. & Morgan, D. O. Effects of phosphorylation by CAK on cyclin binding by CDC2 and CDK2. Mol. Cell. Biol. 15 , 345–350 (1995).

Brown, N. R. et al. CDK1 structures reveal conserved and unique features of the essential cell cycle CDK. Nat. Commun. 6 , 6769 (2015).

Strauss, B. et al. Cyclin B1 is essential for mitosis in mouse embryos, and its nuclear export sets the time for mitosis. J. Cell Biol. 217 , 179–193 (2018).

Gavet, O. & Pines, J. Activation of cyclin B1-Cdk1 synchronizes events in the nucleus and the cytoplasm at mitosis. J. Cell Biol. 189 , 247–259 (2010).

Barbiero, M. et al. Cell cycle-dependent binding between cyclin B1 and Cdk1 revealed by time-resolved fluorescence correlation spectroscopy. Open Biol. 12 , 220057 (2022).

Pines, J. & Hunter, T. Isolation of a human cyclin cDNA: evidence for cyclin mRNA and protein regulation in the cell cycle and for interaction with p34cdc2. Cell 58 , 833–846 (1989).

Clute, P. & Pines, J. Temporal and spatial control of cyclin B1 destruction in metaphase. Nat. Cell Biol. 1 , 82–87 (1999).

Potapova, T. A. et al. The reversibility of mitotic exit in vertebrate cells. Nature 440 , 954–958 (2006).

Basu, S., Greenwood, J., Jones, A. W. & Nurse, P. Core control principles of the eukaryotic cell cycle. Nature 607 , 381–386 (2022).

Santamaria, D. et al. Cdk1 is sufficient to drive the mammalian cell cycle. Nature 448 , 811–815 (2007).

Zheng, X. F. et al. A mitotic CDK5-PP4 phospho-signaling cascade primes 53BP1 for DNA repair in G1. Nat. Commun. 10 , 4252 (2019).

Fagerberg, L. et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol. Cell. Proteom. 13 , 397–406 (2014).

Pozo, K. & Bibb, J. A. The emerging role of Cdk5 in cancer. Trends Cancer 2 , 606–618 (2016).

Sharma, S. & Sicinski, P. A kinase of many talents: non-neuronal functions of CDK5 in development and disease. Open Biol. 10 , 190287 (2020).

Sun, K. H. et al. Novel genetic tools reveal Cdk5’s major role in Golgi fragmentation in Alzheimer’s disease. Mol. Biol. Cell 19 , 3052–3069 (2008).

Sharma, S. et al. Targeting the cyclin-dependent kinase 5 in metastatic melanoma. Proc. Natl Acad. Sci. USA 117 , 8001–8012 (2020).

Nabet, B. et al. The dTAG system for immediate and target-specific protein degradation. Nat. Chem. Biol. 14 , 431–441 (2018).

Simpson, L. M. et al. Target protein localization and its impact on PROTAC-mediated degradation. Cell Chem. Biol. 29 , 1482–1504 e1487 (2022).

Vassilev, L. T. et al. Selective small-molecule inhibitor reveals critical mitotic functions of human CDK1. Proc. Natl Acad. Sci. USA 103 , 10660–10665 (2006).

Janssen, A. F. J., Breusegem, S. Y. & Larrieu, D. Current methods and pipelines for image-based quantitation of nuclear shape and nuclear envelope abnormalities. Cells 11 , 347 (2022).

Thompson, S. L. & Compton, D. A. Chromosome missegregation in human cells arises through specific types of kinetochore-microtubule attachment errors. Proc. Natl Acad. Sci. USA 108 , 17974–17978 (2011).

Kline-Smith, S. L. & Walczak, C. E. Mitotic spindle assembly and chromosome segregation: refocusing on microtubule dynamics. Mol. Cell 15 , 317–327 (2004).

Prosser, S. L. & Pelletier, L. Mitotic spindle assembly in animal cells: a fine balancing act. Nat. Rev. Mol. Cell Biol. 18 , 187–201 (2017).

Zeng, X. et al. Pharmacologic inhibition of the anaphase-promoting complex induces a spindle checkpoint-dependent mitotic arrest in the absence of spindle damage. Cancer Cell 18 , 382–395 (2010).

Warren, J. D., Orr, B. & Compton, D. A. A comparative analysis of methods to measure kinetochore-microtubule attachment stability. Methods Cell. Biol. 158 , 91–116 (2020).

Gregan, J., Polakova, S., Zhang, L., Tolic-Norrelykke, I. M. & Cimini, D. Merotelic kinetochore attachment: causes and effects. Trends Cell Biol 21 , 374–381 (2011).

Etemad, B., Kuijt, T. E. & Kops, G. J. Kinetochore-microtubule attachment is sufficient to satisfy the human spindle assembly checkpoint. Nat. Commun. 6 , 8987 (2015).

Tauchman, E. C., Boehm, F. J. & DeLuca, J. G. Stable kinetochore-microtubule attachment is sufficient to silence the spindle assembly checkpoint in human cells. Nat. Commun. 6 , 10036 (2015).

Mitchison, T. & Kirschner, M. Microtubule assembly nucleated by isolated centrosomes. Nature 312 , 232–237 (1984).

Fourest-Lieuvin, A. et al. Microtubule regulation in mitosis: tubulin phosphorylation by the cyclin-dependent kinase Cdk1. Mol. Biol. Cell 17 , 1041–1050 (2006).

Ubersax, J. A. et al. Targets of the cyclin-dependent kinase Cdk1. Nature 425 , 859–864 (2003).

Yang, C. H., Lambie, E. J. & Snyder, M. NuMA: an unusually long coiled-coil related protein in the mammalian nucleus. J. Cell Biol. 116 , 1303–1317 (1992).

Yang, C. H. & Snyder, M. The nuclear-mitotic apparatus protein is important in the establishment and maintenance of the bipolar mitotic spindle apparatus. Mol. Biol. Cell 3 , 1259–1267 (1992).

Kotak, S., Busso, C. & Gonczy, P. NuMA phosphorylation by CDK1 couples mitotic progression with cortical dynein function. EMBO J. 32 , 2517–2529 (2013).

Kitagawa, M. et al. Cdk1 coordinates timely activation of MKlp2 kinesin with relocation of the chromosome passenger complex for cytokinesis. Cell Rep. 7 , 166–179 (2014).

Schrock, M. S. et al. MKLP2 functions in early mitosis to ensure proper chromosome congression. J. Cell Sci. 135 , jcs259560 (2022).

Sun, M. et al. NuMA regulates mitotic spindle assembly, structural dynamics and function via phase separation. Nat. Commun. 12 , 7157 (2021).

Chen, Q., Zhang, X., Jiang, Q., Clarke, P. R. & Zhang, C. Cyclin B1 is localized to unattached kinetochores and contributes to efficient microtubule attachment and proper chromosome alignment during mitosis. Cell Res. 18 , 268–280 (2008).

Kabeche, L. & Compton, D. A. Cyclin A regulates kinetochore microtubules to promote faithful chromosome segregation. Nature 502 , 110–113 (2013).

Hegarat, N. et al. Cyclin A triggers mitosis either via the Greatwall kinase pathway or cyclin B. EMBO J. 39 , e104419 (2020).

Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596 , 583–589 (2021).

Wood, D. J. & Endicott, J. A. Structural insights into the functional diversity of the CDK-cyclin family. Open Biol. 8 , 180112 (2018).

Brown, N. R., Noble, M. E., Endicott, J. A. & Johnson, L. N. The structural basis for specificity of substrate and recruitment peptides for cyclin-dependent kinases. Nat. Cell Biol. 1 , 438–443 (1999).

Tarricone, C. et al. Structure and regulation of the CDK5-p25 nck5a complex. Mol. Cell 8 , 657–669 (2001).

Poon, R. Y., Lew, J. & Hunter, T. Identification of functional domains in the neuronal Cdk5 activator protein. J. Biol. Chem. 272 , 5703–5708 (1997).

Oppermann, F. S. et al. Large-scale proteomics analysis of the human kinome. Mol. Cell. Proteom. 8 , 1751–1764 (2009).

van den Heuvel, S. & Harlow, E. Distinct roles for cyclin-dependent kinases in cell cycle control. Science 262 , 2050–2054 (1993).

Nakatani, Y. & Ogryzko, V. Immunoaffinity purification of mammalian protein complexes. Methods Enzymol. 370 , 430–444 (2003).

Tyanova, S., Temu, T. & Cox, J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protoc. 11 , 2301–2319 (2016).

Tyanova, S. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 13 , 731–740 (2016).

Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43 , e47 (2015).

R Core Team. R: a language and environment for statistical computing (2021).

Wickham, H. ggplot2: elegant graphics for data analysis (2016).

Slowikowski, K. ggrepel: automatically position non-overlapping text labels with “ggplot2” (2018).

Wu, T. et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation 2 , 100141 (2021).

Deutsch, E. W. et al. The ProteomeXchange consortium in 2020: enabling ‘big data’ approaches in proteomics. Nucleic Acids Res. 48 , D1145–D1152 (2020).

Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47 , D442–D450 (2019).

Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26 , 139–140 (2010).

Nagahara, H. et al. Transduction of full-length TAT fusion proteins into mammalian cells: TAT-p27Kip1 induces cell migration. Nat. Med. 4 , 1449–1452 (1998).

Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19 , 679–682 (2022).

Lu, C. et al. OPLS4: improving force field accuracy on challenging regimes of chemical space. J. Chem. Theory Comput. 17 , 4291–4300 (2021).

Obenauer, J. C., Cantley, L. C. & Yaffe, M. B. Scansite 2.0: proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res. 31 , 3635–3641 (2003).

Acknowledgements

We thank D. Pellman for comments on the manuscript; W. Michowski, S. Sharma, P. Sicinski, B. Nabet and N. Gray for the reagents; J. A. Tainer for providing access to software used for structural analysis; and S. Gerber for sharing unpublished results. D.C. is supported by grants R01 CA208244 and R01 CA264900, DOD Ovarian Cancer Award W81XWH-15-0564/OC140632, Tina’s Wish Foundation, Detect Me If You Can, a V Foundation Award, a Gray Foundation grant and the Claudia Adams Barr Program in Innovative Basic Cancer Research. A. Spektor would like to acknowledge support from K08 CA208008, the Burroughs Wellcome Fund Career Award for Medical Scientists, Saverin Breast Cancer Research Fund and the Claudia Adams Barr Program in Innovative Basic Cancer Research. X.-F.Z. was an American Cancer Society Fellow and is supported by the Breast and Gynecologic Cancer Innovation Award from Susan F. Smith Center for Women’s Cancers at Dana-Farber Cancer Institute. A. Syed is supported by the Claudia Adams Barr Program in Innovative Basic Cancer Research. B.T. was supported by the Polish National Agency for Academic Exchange (grant PPN/WAL/2019/1/00018) and by the Foundation for Polish Science (START Program). A.D.D is supported by NIH grant R01 HL52725. A.G.P. by National Cancer Institute grants U01CA214114 and U01CA271407, as well as a donation from the Aven Foundation; J.R.W. by National Cancer Institute grant R50CA211499; and K.S. by NIH awards 1R01-CA237660 and 1RF1NS124779.

Author information

Bartłomiej Tomasik

Present address: Department of Oncology and Radiotherapy, Medical University of Gdańsk, Faculty of Medicine, Gdańsk, Poland

These authors contributed equally: Xiao-Feng Zheng, Aniruddha Sarkar

Authors and Affiliations

Division of Radiation and Genome Stability, Department of Radiation Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA

Xiao-Feng Zheng, Aniruddha Sarkar, Aleem Syed, Huy Nguyen, Bartłomiej Tomasik, Kaimeng Huang, Feng Li, Alan D. D’Andrea, Alexander Spektor & Dipanjan Chowdhury

Department of Chemistry and Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, USA

Humphrey Lotana & Kavita Shah

Translational Science and Therapeutics Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA

Richard G. Ivey, Jacob J. Kennedy, Jeffrey R. Whiteaker & Amanda G. Paulovich

Department of Biostatistics and Translational Medicine, Medical University of Łódź, Łódź, Poland

Broad Institute of Harvard and MIT, Cambridge, MA, USA

Kaimeng Huang, Alan D. D’Andrea, Alexander Spektor & Dipanjan Chowdhury

Department of Biological Chemistry & Molecular Pharmacology, Harvard Medical School, Boston, MA, USA

Dipanjan Chowdhury

Contributions

X.-F.Z., A. Sarkar., A. Spektor. and D.C. conceived the project and designed the experiments. X.-F.Z. and A. Sarkar performed the majority of experiments and associated analyses except as listed below. H.L. expressed relevant proteins and conducted the kinase activity assays for CDK5–cyclin B1, CDK5–p35 and CDK5(S46) variant complexes under the guidance of K.S.; A. Syed performed structural modelling and analysis. R.G.I., J.J.K. and J.R.W. performed MS and analysis. B.T. and H.N. performed MS data analyses. K.H. provided guidance to screen CDK5(as) knocked-in clones and performed sequence analysis to confirm CDK5(as) knock-in. F.L. and A.D.D. provided reagents and discussion on CDK5 substrates analyses. X.-F.Z., A. Sarkar, A. Spektor and D.C. wrote the manuscript with inputs and edits from all of the other authors.

Corresponding authors

Correspondence to Alexander Spektor or Dipanjan Chowdhury .

Ethics declarations

Competing interests.

A.D.D. reports consulting for AstraZeneca, Bayer AG, Blacksmith/Lightstone Ventures, Bristol Myers Squibb, Cyteir Therapeutics, EMD Serono, Impact Therapeutics, PrimeFour Therapeutics, Pfizer, Tango Therapeutics and Zentalis Pharmaceuticals/Zeno Management; is an advisory board member for Cyteir and Impact Therapeutics; a stockholder in Cedilla Therapeutics, Cyteir, Impact Therapeutics and PrimeFour Therapeutics; and reports receiving commercial research grants from Bristol Myers Squibb, EMD Serono, Moderna and Tango Therapeutics. The other authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Yibing Shan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Inhibition of CDK5 in the analogue-sensitive (CDK5-as) system.

a , Schematics depicting specific inhibition of the CDK5 analogue-sensitive ( as ) variant. Canonical ATP-analogue inhibitor (In, yellow) targets endogenous CDK5 (dark green) at its ATP-binding catalytic site nonspecifically since multiple kinases share structurally similar catalytic sites (left panel). The analogue-sensitive ( as , light green) phenylalanine-to-glycine (F80G) mutation confers a structural change adjacent to the catalytic site of CDK5 that does not impact its catalysis but accommodates the specific binding of a non-hydrolysable bulky orthogonal inhibitor 1NM-PP1(In*, orange). Introduction of 1NM-PP1 thus selectively inhibits CDK5- as variant (right panel). b , Immunoblots showing two clones (Cl 23 and Cl 50) of RPE-1 cells expressing FLAG-HA-CDK5- as in place of endogenous CDK5. Representative results are shown from three independent repeats. c , Proliferation curve of parental RPE-1 and RPE-1 CDK5- as cells. Data represent mean ± s.d. from three independent repeats. p -value was determined by Mann Whitney U test. d , Immunoblots showing immunoprecipitated CDK1-cyclin B1 complex or CDK5- as -cyclin B1 complex by the indicated antibody-coupled agarose, from nocodazole arrested RPE-1 CDK5- as cells with treated with or without 1NM-PP1 for inhibition of CDK5- as , from three independent replicate experiments. e , In-vitro kinase activity quantification of immunoprecipitated complex shown in d . Data represent mean ± s.d. from three independent experiments. p -values were determined by unpaired, two-tailed student’s t-test. f , Immunoblots of RPE-1 CDK5- as cells treated with either DMSO or 1NM-PP1 for 2 h prior to and upon release from RO-3306 and collected at 60 min following release. Cells were lysed and blotted with anti-bodies against indicated proteins (upper panel). Quantification of the relative intensity of PP4R3β phosphorylation at S840 in 1NM-PP1-treated CDK5- as cells compared to DMSO-treatment (lower panel). g , Experimental scheme for specific and temporal abrogation of CDK5 in RPE-1 CDK5- as cells. Data represent mean ± S.D from quadruplicate repeats. p -value was determined by one sample t and Wilcoxon test. h , Hoechst staining showing primary nuclei and micronuclei of RPE-1 CDK5- as with indicated treatment; scale bar is as indicated (left panel). Right, quantification of the percentage of cells with micronuclei after treatment. Data represent mean ± s.d. of three independent experiments from n = 2174 DMSO, n = 1788 1NM-PP1 where n is the number of cells. p- values were determined by unpaired, two-tailed student’s t-test. Scale bar is as indicated. Uncropped gel images are provided in Supplementary Fig. 1 .

Extended Data Fig. 2 Degradation of CDK5 in degradation tag (CDK5- dTAG ) system.

a , Schematic depicting the dTAG-13-inducible protein degradation system. Compound dTAG-13 links protein fused with FKBP12 F36V domain (dTAG) to CRBN-DDB1-CUL4A E3 ligase complex, leading to CRBN-mediated degradation. b , Immunoblots showing two clones of RPE-1 cells that express dTAG -HA-CDK5 in place of endogenous CDK5 (Cl N1 and Cl N4). Representative results are shown from three independent repeats. c , Proliferation curve of parental RPE-1 and RPE-1 CDK5-dTAG. Data represent mean ± s.d. of three independent repeats. p -value was determined by Mann Whitney U test. d and e , Representative images of RPE-1 CDK5- dTAG clone 1 (N1) ( d ) and RPE-1 CDK5- dTAG clone 4 (N4) ( e ) treated with DMSO or dTAG-13 for 2 h prior to and upon release from G2/M arrest and fixed at 120 min after release (top panel); quantification of CDK5 total intensity per cell (lower panels). Data represent mean ± s.d. of at least two independent experiments from n = 100 cells each condition. p- values were determined by unpaired, two-tailed student’s t-test. f , Immunoblots showing level of indicated proteins in RPE-1 CDK5- dTAG cells. Cells were treated with either DMSO or dTAG-13 for 2 h prior to and upon release from RO-3306 and lysed at 60 min following release (upper panel). Quantification of the relative intensity of PP4R3β phosphorylation at S840 in dTAG13-treated CDK5- dTAG cells compared to DMSO-treatment (lower panel). Data represent mean ± s.d. of four independent experiments. p -value was determined by one sample t and Wilcoxon test. g , Experimental scheme for specific and temporal abrogation of CDK5 in RPE-1 CDK5- dTAG cells. h , Hoechst staining showing primary nuclei and micronuclei of RPE-1 CDK5- dTAG with indicated treatment; scale bar is as indicated (left panel). Right, quantification of the percentage of cells with micronuclei after treatment. Data represent mean ± s.d. of three independent experiments from n = 2094 DMSO and n = 2095 dTAG-13, where n is the number of cells. p- values were determined by unpaired, two-tailed student’s t-test. Scale bar is as indicated. Uncropped gel images are provided in Supplementary Fig. 1 .

Extended Data Fig. 3 CDK5 abrogation renders chromosome alignment and segregation defects despite an intact spindle assembly checkpoint and timely mitotic duration.

a and b , Live-cell imaging snapshots of RPE-1 CDK5- as cells ( a ) and RPE-1 CDK5- dTAG cells ( b ) expressing mCherry-H2B and GFP-α-tubulin, abrogated of CDK5 by treatment with 1NM-PP1 or dTAG-13, respectively. Imaging commenced in prophase following release from RO-3306 into fresh media containing indicated chemicals (left); quantification of the percentage of cells with abnormal nuclear morphology (right). c and d , Representative snapshots of the final frame prior to metaphase-to-anaphase transition from a live-cell imaging experiment detailing chromosome alignment at the metaphase plate of RPE- CDK5- as (c) and RPE-1 CDK5- dTAG ( d ) expressing mCherry-H2B, and GFP-α-tubulin (left); quantification of the percentage of cells displaying abnormal chromosome alignment following indicated treatments (top right). e , Representative images showing the range of depolymerization outcomes (low polymers, high polymers and spindle-like) in DMSO- and 1NM-PP1-treated cells, as shown in Fig. 2e , from n = 50 for each condition, where n is number of metaphase cells . f , Quantifications of mitotic duration from nuclear envelope breakdown (NEBD) to anaphase onset of RPE-1 CDK5- as (left ) and RPE-1 CDK5- dTAG (right) cells, following the indicated treatments. Live-cell imaging of RPE-1 CDK5- as and RPE-1 CDK5- dTAG cells expressing mCherry-H2B and GFP-BAF commenced following release from RO-3306 arrest into fresh media containing DMSO or 1NM-PP1 or dTAG-13. g , Quantifications of the percentage of RPE-1 CDK5- as (left) and RPE-1 CDK5- dTAG (right) cells that were arrested in mitosis following the indicated treatments. Imaging commenced in prophase cells as described in a , following release from RO-3306 into fresh media in the presence or absence nocodazole as indicated. The data in a, c , and g represent mean ± s.d. of at least two independent experiments from n = 85 DMSO and n = 78 1NM-PP1 in a and c ; from n = 40 cells for each treatment condition in g . The data in b , d , and f represent mean ± s.d. of three independent experiments from n = 57 DMSO and n = 64 dTAG-13 in b and d ; from n = 78 DMSO and n = 64 1NM-PP1; n = 59 DMSO and n = 60 dTAG-13, in f , where n is the number of cells. p- values were determined by unpaired, two-tailed student’s t-test. Scale bar is as indicated.

Extended Data Fig. 4 CDK5 and CDK1 regulate tubulin dynamics.

a, b , Immunostaining of RPE-1 cells with antibodies against CDK1 and α-tubulin ( a ); and CDK5 and α-tubulin ( b ) at indicated stages of mitosis. c, d , Manders’ overlap coefficient M1 (CDK1 versus CDK5 on α-tubulin) ( c ); and M2 (α-tubulin on CDK1 versus CDK5) ( d ) at indicated phases of mitosis in cells shown in a and b . The data represent mean ± s.d. of at least two independent experiments from n = 25 cells in each mitotic stage. p- values were determined by unpaired, two-tailed student’s t-test.

Extended Data Fig. 5 Phosphoproteomics analysis to identify mitotic CDK5 substrates.

a , Scheme of cell synchronization for phosphoproteomics: RPE-1 CDK5- as cells were arrested at G2/M by treatment with RO-3306 for 16 h. The cells were treated with 1NM-PP1 to initiate CDK5 inhibition. 2 h post-treatment, cells were released from G2/M arrest into fresh media with or without 1NM-PP1 to proceed through mitosis with or without continuing inhibition of CDK5. Cells were collected at 60 min post-release from RO-3306 for lysis. b , Schematic for phosphoproteomics-based identification of putative CDK5 substrates. c , Gene ontology analysis of proteins harbouring CDK5 inhibition-induced up-regulated phosphosites. d , Table indicating phospho-site of proteins that are down-regulated as result of CDK5 inhibition. e , Table indicating the likely kinases to phosphorylate the indicated phosphosites of the protein, as predicted by Scansite 4 66 . Divergent score denotes the extent by which phosphosite diverge from known kinase substrate recognition motif, hence higher divergent score indicating the corresponding kinase is less likely the kinase to phosphorylate the phosphosite.

Extended Data Fig. 6 Cyclin B1 is a mitotic co-factor of CDK5 and of CDK1.

a , Endogenous CDK5 was immunoprecipitated from RPE-1 cells collected at time points corresponding to the indicated cell cycle stage. Cell lysate input and elution of immunoprecipitation were immunoblotted by antibodies against the indicated proteins. RPE-1 cells were synchronized to G2 by RO-3306 treatment for 16 h and to prometaphase (M) by nocodazole treatment for 6 h. Asynch: Asynchronous. Uncropped gel images are provided in Supplementary Fig. 1 . b , Immunostaining of RPE-1 cells with antibodies against the indicated proteins at indicated mitotic stages (upper panels). Manders’ overlap coefficient M1 (Cyclin B1 on CDK1) and M2 (CDK1 on Cyclin B1) at indicated mitotic stages for in cells shown in b (lower panels). The data represent mean ± s.d. of at least two independent experiments from n = 25 mitotic cells in each mitotic stage. p- values were determined by unpaired, two-tailed student’s t-test. c , Table listing common proteins as putative targets of CDK5, uncovered from the phosphoproteomics anlaysis of down-regulated phosphoproteins upon CDK5 inhibition (Fig. 3 and Supplementary Table 1 ), and those of cyclin B1, uncovered from phosphoproteomics analysis of down-regulated phospho-proteins upon cyclin B1 degradation (Fig. 6 and Table EV2 in Hegarat et al. EMBO J. 2020). Proteins relevant to mitotic functions are highlighted in red.

Extended Data Fig. 7 Structural prediction and analyses of the CDK5-cyclin B1 complex.

a , Predicted alignment error (PAE) plots of the top five AlphaFold2 (AF2)-predicted models of CDK5-cyclin B1 (top row) and CDK1-cyclin B1 (bottom row) complexes, ranked by interface-predicted template (iPTM) scores. b , AlphaFold2-Multimer-predicted structure of the CDK5-cyclin B1 complex. c , Structural comparison of CDK-cyclin complexes. Left most panel: Structural-overlay of AF2 model of CDK5-cyclin B1 and crystal structure of phospho-CDK2-cyclin A3-substrate complex (PDB ID: 1QMZ ). The zoomed-in view of the activation loops of CDK5 and CDK2 is shown in the inset. V163 (in CDK5), V164 (in CDK2) and Proline at +1 position in the substrates are indicated with arrows. Middle panel: Structural-overlay of AF2 model of CDK5-cyclin B1 and crystal structure of CDK1-cyclin B1-Cks2 complex (PDB ID: 4YC3 ). The zoomed-in view of the activation loops of CDK5 and CDK1 is shown in the inset. Cks2 has been removed from the structure for clarity. Right most panel: structural-overlay of AF2 models of CDK5-cyclin B1 and CDK1-cyclin B1 complex. The zoomed view of the activation loops of CDK5 and CDK1 is shown in the inset. d , Secondary structure elements of CDK5, cyclin B1 and p25. The protein sequences, labelled based on the structural models, are generated by PSPript for CDK5 (AF2 model) ( i ), cyclin B1 (AF2 model) ( ii ) and p25 (PDB ID: 3O0G ) ( iii ). Structural elements ( α , β , η ) are defined by default settings in the program. Key loops highlighted in Fig. 4d are mapped onto the corresponding sequence.

Extended Data Fig. 8 Phosphorylation of CDK5 S159 is required for kinase activity and mitotic fidelity.

a , Structure of the CDK5-p25 complex (PDB ID: 1h41 ). CDK5 (blue) interacts with p25 (yellow). Serine 159 (S159, magenta) is in the T-loop. b , Sequence alignment of CDK5 and CDK1 shows that S159 in CDK5 is the analogous phosphosite as that of T161 in CDK1 for T-loop activation. Sequence alignment was performed by CLC Sequence Viewer ( https://www.qiagenbioinformatics.com/products/clc-sequence-viewer/ ). c , Immunoblots of indicated proteins in nocodazole-arrested mitotic (M) and asynchronous (Asy) HeLa cell lysate. d , Myc-His-tagged CDK5 S159 variants expressed in RPE-1 CDK5- as cells were immunoprecipitated from nocodazole-arrested mitotic lysate by Myc-agarose. Input from cell lysate and elution from immunoprecipitation were immunoblotted with antibodies against indicated protein. EV= empty vector. In vitro kinase activity assay of the indicated immunoprecipitated complex shown on the right panel. Data represent mean ± s.d. of four independent experiments. p -values were determined by unpaired two-tailed student’s t-test. e , Immunoblots showing RPE-1 FLAG-CDK5- as cells stably expressing Myc-His-tagged CDK5 WT and S159A, which were used in live-cell imaging and immunofluorescence experiments to characterize chromosome alignment and spindle architecture during mitosis, following inhibition of CDK5- as by 1NM-PP1, such that only the Myc-His-tagged CDK5 WT and S159A are not inhibited. Representative results are shown from three independent repeats. f , Hoechst staining showing nuclear morphology of RPE-1 CDK5- as cells expressing indicated CDK5 S159 variants following treatment with either DMSO or 1NMP-PP1 and fixation at 120 min post-release from RO-3306-induced arrest (upper panel); quantification of nuclear circularity and solidity (lower panels) g , Snapshots of live-cell imaging RPE-1 CDK5- as cells expressing indicated CDK5 S159 variant, mCherry-H2B, and GFP-α-tubulin, after release from RO-3306-induced arrest at G2/M, treated with 1NM-PP1 2 h prior to and upon after release from G2/M arrest (upper panel); quantification of cells displaying abnormal chromosome alignment in (lower panel). Representative images are shown from two independent experiments, n = 30 cells each cell line. h , Representative images of RPE-1 CDK5- as cells expressing indicated CDK5 S159 variants in metaphase, treated with DMSO or 1NM-PP1 for 2 h prior to and upon release from RO-3306-induced arrest, and then released into media containing 20 µM proTAME for 2 h, fixed and stained with tubulin and DAPI (upper panel); metaphase plate width and spindle length measurements for these representative cells were shown in the table on right; quantification of metaphase plate width and spindle length following the indicated treatments (lower panel). Data in f and h represent mean ± s.d. of at least two independent experiments from n = 486 WT, n = 561 S159A, and n = 401 EV, where n is the number of cells in f ; from n = 65 WT, n = 64 S159A, and n = 67 EV, where n is the number of cells in h . Scale bar is as indicated. Uncropped gel images are provided in Supplementary Fig. 1 .

Extended Data Fig. 9 The CDK5 co-factor-binding helix regulates CDK5 kinase activity.

a, Structure of the CDK5-p25 complex (PDB ID: 1h41). CDK5 (blue) interacts with p25 (yellow) at the PSSALRE helix (green). Serine 46 (S46, red) is in the PSSALRE helix; serine 159 (S159, magenta) is in the T-loop. b, Sequence alignment of CDK5 and CDK1 showing that S46 is conserved between CDK1 and CDK5. The sequence alignment was performed with CLC Sequence Viewer ( https://www.qiagenbioinformatics.com/products/clc-sequence-viewer/ ). c, Immunoblots of CDK5 immunoprecipitated from lysates of E. coli BL21 (DE3) expressing His-tagged human CDK5 WT or CDK5 S46D, mixed with lysates of E. coli BL21 (DE3) expressing His-tagged human cyclin B1. Immunoprecipitated CDK5, alone or in the indicated complexes, was used in the kinase activity assay shown in Fig. 5b. Representative results are shown from three independent repeats. d, Immunoblots of RPE-1 FLAG-CDK5-as cells stably expressing Myc-His-tagged CDK5 S46 phospho-variants, which were used in live-cell imaging and immunofluorescence experiments to characterize chromosome alignment and spindle architecture during mitosis following inhibition of CDK5-as by 1NM-PP1, such that only the Myc-His-tagged CDK5 S46 phospho-variants remain uninhibited. Representative results are shown from three independent repeats. e, Immunostaining of RPE-1 CDK5-as cells expressing Myc-His-tagged CDK5 WT or S46D with an anti-PP4R3β phospho-S840 (pS840) antibody following the indicated treatment (DMSO vs 1NM-PP1). Scale bar is as indicated (left). Normalized intensity of PP4R3β S840 phosphorylation (right). Data represent mean ± s.d. of at least two independent experiments from n = 40 WT and n = 55 S46D metaphase cells. p-values were determined by unpaired two-tailed Student's t-test. f, Immunoblots showing levels of the indicated proteins in RPE-1 CDK5-as cells expressing Myc-His-tagged CDK5 WT or S46D. Cells were treated with either DMSO or 1NM-PP1 for 2 h prior to and upon release from RO-3306, then collected and lysed at 60 min after release (left). Quantification of the intensity of PP4R3β phosphorylation at S840 (right). Data represent mean ± s.d. of four independent experiments. p-values were determined by a two-tailed one-sample t-test and Wilcoxon test. g, Representative snapshots from live-cell imaging of RPE-1 CDK5-as cells harbouring the indicated CDK5 S46 variants and expressing mCherry-H2B and GFP-α-tubulin, treated with 1NM-PP1 as shown in Fig. 5d, from n = 35 cells. Imaging commenced in prophase following release from RO-3306 into fresh media containing the indicated chemicals. Uncropped gel images are provided in Supplementary Fig. 1.

Extended Data Fig. 10 Localization of CDK5 S46 phospho-variants.

Immunostaining of RPE-1 CDK5-as cells stably expressing Myc-His-tagged CDK5 WT ( a ), S46A ( b ) and S46D ( c ) with antibodies against the indicated proteins in prophase, prometaphase and metaphase. Data represent at least two independent experiments from n = 25 cells for each condition in each mitotic stage.

Extended Data Fig. 11 RPE-1 cells harbouring CDK5-as introduced by CRISPR-mediated knock-in recapitulate the chromosome mis-segregation defects observed in RPE-1 cells overexpressing CDK5-as upon inhibition of CDK5-as by 1NM-PP1 treatment.

a, Chromatogram showing RPE-1 cells harbouring the homozygous CDK5-as mutation F80G introduced by CRISPR-mediated knock-in (lower panel), replacing the endogenous WT CDK5 sequence (upper panel). b, Immunoblots showing the level of CDK5 expressed in parental RPE-1 cells and in RPE-1 cells harbouring the CDK5-as F80G mutation in place of endogenous CDK5. c, Representative images of CDK5-as knock-in RPE-1 cells exhibiting lagging chromosomes following the indicated treatments. d, Quantification of the percentage of cells exhibiting lagging chromosomes following the indicated treatments shown in (c). Data represent mean ± s.d. of three independent experiments from n = 252 DMSO-treated and n = 220 1NM-PP1-treated cells. The p-value was determined by a two-tailed Mann–Whitney U test.
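The comparison in panel d uses a two-tailed Mann–Whitney U test. As a minimal sketch of that kind of test (with placeholder percentages rather than the paper's data), it could be computed as follows:

```python
# Minimal sketch of a two-tailed Mann-Whitney U test comparing the percentage of
# cells with lagging chromosomes between treatments (placeholder values only).
from scipy import stats

dmso_pct  = [2.0, 3.1, 2.5]     # % lagging chromosomes, DMSO, per experiment
nmpp1_pct = [18.0, 22.5, 20.3]  # % lagging chromosomes, 1NM-PP1, per experiment

u_stat, p_value = stats.mannwhitneyu(dmso_pct, nmpp1_pct, alternative="two-sided")
print(f"U = {u_stat}, two-tailed p = {p_value:.4f}")
```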

Extended Data Fig. 12 CDK5 is highly expressed in post-mitotic neurons and overexpressed in cancers.

a, CDK5 RNA-seq expression in tumours (left) and matched normal tissues (right). Data from 22 TCGA projects were analysed. Note that CDK5 expression is higher in many cancers than in the matched normal tissues. BLCA, urothelial bladder carcinoma; BRCA, breast invasive carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL, cholangiocarcinoma; COAD, colon adenocarcinoma; ESCA, esophageal carcinoma; HNSC, head and neck squamous cell carcinoma; KICH, kidney chromophobe; KIRC, kidney renal clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LIHC, liver hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; PAAD, pancreatic adenocarcinoma; PCPG, pheochromocytoma and paraganglioma; PRAD, prostate adenocarcinoma; READ, rectum adenocarcinoma; SARC, sarcoma; STAD, stomach adenocarcinoma; THCA, thyroid carcinoma; THYM, thymoma; and UCEC, uterine corpus endometrial carcinoma. p-values were determined by two-sided Student's t-test. ****, p ≤ 0.0001; ***, p ≤ 0.001; **, p ≤ 0.01; *, p ≤ 0.05; ns, not significant (p > 0.05). b, Scatter plots showing cells of the indicated cancer types that are more dependent on CDK5 and less dependent on CDK1. Each dot represents a cancer cell line. The RNAi dependency data (DEMETER2) for CDK5 and CDK1 were obtained from the Dependency Map ( depmap.org ). The slope line represents a simple linear regression for the indicated cancer type. The four indicated cancer types (Head/Neck, Ovary, CNS/Brain and Bowel) showed a trend of more negative CDK5 RNAi effect scores (indicating greater dependency) with increasing CDK1 RNAi effect scores (indicating lesser dependency). The p-value represents the significance of the correlation computed from a simple linear regression of the data. The red circle highlights the subset of cell lines that are relatively less dependent on CDK1 but more dependent on CDK5. c, Scatter plot showing bowel cancer cells that express CDK5 while being less dependent on CDK1. Each dot represents a cancer cell line. The CDK1 CRISPR gene-effect and CDK5 mRNA-level data were obtained from the Dependency Map ( depmap.org ). The slope line represents a simple linear regression. The red circle highlights the subset of cell lines that are relatively less dependent on CDK1 but express higher levels of CDK5. For b and c, the solid line represents the best-fit line from simple linear regression using GraphPad Prism, and the dashed lines represent the 95% confidence bands of the best-fit line. The p-value is determined by the F-test of the null hypothesis that the slope is zero. d, Scatter plots showing rapidly dividing cells of the indicated cancer types that are more dependent on CDK5 and less dependent on CDK1. Each dot represents a cancer cell line. The doubling-time data on the x-axis were obtained from the Cell Model Passports ( cellmodelpassports.sanger.ac.uk ). The RNAi dependency data (DEMETER2) for CDK5 or CDK1 on the y-axis were obtained from the Dependency Map ( depmap.org ). Only cell lines with doubling times of less than 72 h are displayed and included in the analysis. Each slope line represents a simple linear regression for one cancer type. The three indicated cancer types were analysed and displayed because they showed a trend of faster proliferation (lower doubling time) with more negative CDK5 RNAi effect scores (greater dependency) but increasing CDK1 RNAi effect scores (lesser dependency). The p-value represents the significance of the association for the three cancer types combined, computed from a multiple linear regression of the combined data using cancer type as a covariate. The red circle depicts the subset of fast-dividing cells that are relatively more dependent on CDK5 (left) and less dependent on CDK1 (right). Solid lines represent the best-fit lines from individual simple linear regressions using GraphPad Prism. The p-value is for the test of the null hypothesis that the effect of doubling time is zero in the multiple linear regression RNAi ~ Intercept + Doubling Time (hours) + Lineage.
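The legend above describes two regression analyses: a simple linear regression whose slope is tested against zero by an F-test, and a multiple linear regression of the form RNAi ~ Intercept + Doubling Time (hours) + Lineage. The authors used GraphPad Prism; the sketch below only illustrates how analogous models could be fitted in Python with statsmodels, on an entirely hypothetical table whose column names and values are placeholders, not the paper's data.

```python
# Illustrative sketch of the two regression analyses described above.
# The DataFrame is hypothetical; in the paper the inputs come from DepMap
# (RNAi effect scores) and Cell Model Passports (doubling times).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "rnai_effect":   [-0.8, -0.5, -0.2, -0.9, -0.4, -0.3],   # placeholder DEMETER2 scores
    "doubling_time": [30.0, 45.0, 60.0, 28.0, 50.0, 65.0],   # placeholder hours
    "lineage":       ["Bowel", "Bowel", "CNS/Brain", "CNS/Brain", "Ovary", "Ovary"],
})

# Simple linear regression: with a single predictor, the overall F-test is
# equivalent to testing the null hypothesis that the slope is zero.
simple = smf.ols("rnai_effect ~ doubling_time", data=df).fit()
print(f"slope = 0 F-test p-value: {simple.f_pvalue:.4f}")

# Multiple linear regression with cancer type (lineage) as a covariate,
# mirroring RNAi ~ Intercept + Doubling Time (hours) + Lineage. The p-value of
# interest is the one attached to the doubling-time coefficient.
multi = smf.ols("rnai_effect ~ doubling_time + C(lineage)", data=df).fit()
print(f"doubling-time effect p-value: {multi.pvalues['doubling_time']:.4f}")
```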

Supplementary information

Supplementary Figure 1

Full scanned images of all western blots.

Reporting Summary

Peer Review File

Supplementary Table 1

Phosphosite changes in 1NM-PP1-treated cells versus DMSO-treated controls as measured by LC–MS/MS.

Supplementary Table 2

Global protein changes in 1NM-PP1-treated cells versus DMSO-treated controls as measured by LC–MS/MS.

Supplementary Video 1

RPE-1 CDK5(as) cell after DMSO treatment, ×100 imaging.

Supplementary Video 2

RPE-1 CDK5(as) cell after 1NM-PP1 treatment (example 1), ×100 imaging.

Supplementary Video 3

RPE-1 CDK5(as) cell after 1NM-PP1 treatment (example 2), ×100 imaging.

Supplementary Video 4

RPE-1 CDK5(dTAG) cell after DMSO treatment, ×100 imaging.

Supplementary Video 5

RPE-1 CDK5(dTAG) cell after dTAG-13 treatment (example 1), ×100 imaging.

Supplementary Video 6

RPE-1 CDK5(dTAG) cell after dTAG-13 treatment (example 2), ×100 imaging.

Supplementary Video 7

RPE-1 CDK5(as) cells expressing MYC-CDK5(WT) after 1NM-PP1 treatment, ×20 imaging.

Supplementary Video 8

RPE-1 CDK5(as) cells expressing MYC-EV after 1NM-PP1 treatment, ×20 imaging.

Supplementary Video 9

RPE-1 CDK5(as) cells expressing MYC-CDK5(S159A) after 1NM-PP1 treatment (example 1), ×20 imaging.

Supplementary Video 10

RPE-1 CDK5(as) cells expressing MYC-CDK5(S159A) after 1NM-PP1 treatment (example 2), ×20 imaging.

Supplementary Video 11

RPE-1 CDK5(as) cells expressing MYC-CDK5(WT) after 1NM-PP1 treatment, ×100 imaging.

Supplementary Video 12

RPE-1 CDK5(as) cells expressing MYC-CDK5(S46A) after 1NM-PP1 treatment (example 1), ×100 imaging.

Supplementary Video 13

RPE-1 CDK5(as) cells expressing MYC-CDK5(S46A) after 1NM-PP1 treatment (example 2), ×100 imaging.

Supplementary Video 14

RPE-1 CDK5(as) cells expressing MYC-CDK5(S46D) after 1NM-PP1 treatment (example 1), ×100 imaging.

Supplementary Video 15

RPE-1 CDK5(as) cells expressing MYC-CDK5(S46D) after 1NM-PP1 treatment (example 2), ×100 imaging.

Supplementary Video 16

RPE-1 CDK5(as) cells expressing MYC-EV after 1NM-PP1 treatment, ×100 imaging.


About this article

Cite this article

Zheng, XF., Sarkar, A., Lotana, H. et al. CDK5–cyclin B1 regulates mitotic fidelity. Nature (2024). https://doi.org/10.1038/s41586-024-07888-x


Received: 24 March 2023

Accepted: 30 July 2024

Published: 04 September 2024

DOI: https://doi.org/10.1038/s41586-024-07888-x

