📈 ANOVA
One-way ANOVA / Kruskal-Wallis + post-hoc
What Is ANOVA?
Analysis of Variance (ANOVA) is a statistical technique that tests whether the means of three or more groups differ significantly. While the name says "variance," it is fundamentally a test about means. The core idea: ANOVA partitions the total variability in your data into between-group variability and within-group variability, then asks whether the between-group variability is large enough to be unlikely by chance.
ANOVA was developed by Ronald A. Fisher in the 1920s and remains one of the most commonly used methods in experimental research. Any time you have a categorical independent variable (e.g., treatment group A, B, C) and a continuous dependent variable (e.g., tumor volume, test score), one-way ANOVA is likely your starting point.
ANOVA vs. T-Test: When to Use Which
A t-test compares exactly two groups. ANOVA extends this to three or more groups. You might ask: "Why not just run multiple t-tests?" The answer is inflation of the family-wise Type I error rate — the multiple comparisons problem. If you compare 4 groups using 6 pairwise t-tests, each at alpha = 0.05, the chance of at least one false positive balloons to approximately 26% (1 − 0.95⁶ ≈ 0.265). ANOVA controls this by performing a single omnibus test first.
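The 26% figure is just the family-wise error rate formula 1 − (1 − α)^m for m independent tests; a quick sketch using the numbers from the example above:

```python
# Family-wise error rate (FWER) for m independent tests at level alpha:
# FWER = 1 - (1 - alpha)^m. Four groups give 4*3/2 = 6 pairwise tests.
alpha = 0.05
m = 6  # pairwise comparisons among 4 groups
fwer = 1 - (1 - alpha) ** m
print(f"FWER for {m} tests at alpha={alpha}: {fwer:.3f}")  # ~0.265
```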
| Scenario | Recommended Test |
|---|---|
| 2 groups | T-test (or Welch's t-test) |
| 3+ groups, 1 factor | One-way ANOVA |
| 3+ groups, 2+ factors | Two-way / Factorial ANOVA |
| Repeated measures on same subjects | Repeated-measures ANOVA |
| Non-normal data, 3+ groups | Kruskal-Wallis test |
How ANOVA Works: The F-Statistic
ANOVA calculates the F-statistic, which is the ratio of between-group variance to within-group variance:
F = MSbetween / MSwithin
MSbetween (Mean Square Between) captures how much group means differ from the grand mean. MSwithin (Mean Square Within, also called Mean Square Error) captures the average variability within groups. A large F means the groups differ much more than you would expect from random variation alone.
The F-statistic follows an F-distribution with (k−1, N−k) degrees of freedom, where k is the number of groups and N is the total sample size.
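The calculation above can be sketched in plain Python (illustrative only — in practice a library routine such as `scipy.stats.f_oneway` does this and also returns the p-value; the example data is made up):

```python
# Minimal sketch of the one-way ANOVA F-statistic.
def one_way_f(groups):
    k = len(groups)                      # number of groups
    N = sum(len(g) for g in groups)      # total sample size
    grand_mean = sum(sum(g) for g in groups) / N
    # Between-group sum of squares: group means vs. the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations vs. their own group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)    # df_between = k - 1
    ms_within = ss_within / (N - k)      # df_within = N - k
    return ms_between / ms_within

# Hypothetical data: three groups of three observations each
groups = [[4.2, 4.8, 5.1], [6.0, 6.3, 5.9], [7.8, 8.1, 7.5]]
print(one_way_f(groups))
```

The resulting F is then compared against the F-distribution with (k−1, N−k) degrees of freedom to obtain a p-value.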
Understanding Post-hoc Tests
A significant ANOVA result tells you that at least one group mean differs — but not which ones. Post-hoc (Latin for "after this") tests perform pairwise comparisons while controlling for multiple testing.
Tukey's HSD (Honestly Significant Difference)
The most commonly used post-hoc test. It compares every pair of group means and controls the family-wise error rate at your chosen alpha level. Best suited when you have equal or nearly equal sample sizes and want to examine all possible pairwise comparisons.
Bonferroni Correction
A simpler but more conservative approach: divide your alpha by the number of comparisons. For 3 groups (3 comparisons), use alpha = 0.05/3 = 0.0167. Easy to understand but can be overly strict with many groups.
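The adjustment itself is one line; for k groups there are k(k−1)/2 pairwise comparisons (a small sketch, function name is our own):

```python
# Bonferroni-adjusted significance threshold: alpha divided by the
# number of pairwise comparisons, m = k*(k-1)/2 for k groups.
def bonferroni_alpha(alpha, k):
    m = k * (k - 1) // 2
    return alpha / m

print(bonferroni_alpha(0.05, 3))  # 3 comparisons -> ~0.0167
```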
When Neither ANOVA Nor Post-hoc Is Needed
If your ANOVA is not significant (p > 0.05), do not proceed to post-hoc tests. The omnibus test has already told you there is insufficient evidence for group differences.
Assumptions of One-Way ANOVA
- Independence. Observations must be independent within and between groups.
- Normality. The dependent variable should be approximately normally distributed within each group. ANOVA is robust to moderate violations when sample sizes are similar and n > 15 per group.
- Homogeneity of variances (homoscedasticity). Group variances should be roughly equal. Levene's test can check this. When violated, use Welch's ANOVA or the Kruskal-Wallis test.
The Kruskal-Wallis Test: Non-parametric Alternative
When normality assumptions are violated or your data is ordinal, the Kruskal-Wallis test is the non-parametric equivalent of one-way ANOVA. It ranks all observations and tests whether the mean ranks differ between groups. It does not assume normality or equal variances, but it is less powerful than ANOVA when assumptions are met.
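The rank-based H statistic can be sketched in plain Python (ties get the average of their ranks; this sketch omits the usual tie correction factor and the p-value, which `scipy.stats.kruskal` handles):

```python
# Sketch of the Kruskal-Wallis H statistic (no tie-correction factor).
def kruskal_h(groups):
    pooled = sorted(x for g in groups for x in g)
    # Assign average ranks to tied values (ranks start at 1)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    N = len(pooled)
    rank_sums = [sum(ranks[x] for x in g) for g in groups]
    # H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)
    return 12 / (N * (N + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (N + 1)

# Hypothetical data: three clearly separated groups
groups = [[7, 9, 8], [5, 6, 4], [2, 3, 1]]
print(round(kruskal_h(groups), 4))  # 7.2
```

Under the null hypothesis, H is compared against a chi-squared distribution with k−1 degrees of freedom.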
Effect Size: Eta-squared
Just as Cohen's d accompanies the t-test, eta-squared (η²) accompanies ANOVA. It represents the proportion of total variance explained by the grouping variable:
| η² | Interpretation |
|---|---|
| 0.01 | Small effect |
| 0.06 | Medium effect |
| 0.14 | Large effect |
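Eta-squared is simply the between-group sum of squares divided by the total sum of squares; a minimal sketch with made-up data:

```python
# Sketch: eta-squared = SS_between / SS_total for grouped data.
def eta_squared(groups):
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    return ss_between / ss_total

# Hypothetical data: grouping explains most of the variance here
groups = [[2.1, 2.4, 2.2], [4.8, 5.1, 4.9], [7.0, 7.3, 7.1]]
print(round(eta_squared(groups), 3))
```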
Frequently Asked Questions
Can I use ANOVA with only 2 groups?
Yes, and the result will be mathematically identical to an independent t-test (F = t²). However, for 2 groups a t-test is simpler and more conventional.
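The F = t² identity is easy to verify numerically; a pure-Python sketch with made-up data (pooled-variance t-test vs. two-group ANOVA):

```python
# Numeric check that one-way ANOVA on 2 groups gives F = t^2,
# where t is the pooled-variance independent-samples t statistic.
def mean(xs):
    return sum(xs) / len(xs)

def t_stat(a, b):
    na, nb = len(a), len(b)
    ssa = sum((x - mean(a)) ** 2 for x in a)
    ssb = sum((x - mean(b)) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)          # pooled variance
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def f_stat(a, b):
    N = len(a) + len(b)
    gm = (sum(a) + sum(b)) / N                 # grand mean
    ss_between = len(a) * (mean(a) - gm) ** 2 + len(b) * (mean(b) - gm) ** 2
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    return (ss_between / 1) / (ss_within / (N - 2))  # df = (1, N-2)

# Hypothetical data for two groups
a, b = [5.1, 4.9, 5.4, 5.0], [6.2, 5.8, 6.1, 6.4]
print(round(t_stat(a, b) ** 2, 6), round(f_stat(a, b), 6))  # the two values match
```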
What if my group sizes are very unequal?
Unequal group sizes reduce the robustness of ANOVA to violations of homogeneity of variances. Use Welch's ANOVA or the Kruskal-Wallis test when variances differ and sample sizes are unbalanced.
How do I report ANOVA results?
APA style: "A one-way ANOVA revealed a significant effect of treatment on tumor volume, F(2, 45) = 8.73, p < .001, η² = .28. Tukey's HSD indicated that Group A (M = 12.3, SD = 3.1) differed significantly from Group C (M = 18.7, SD = 4.2), p = .002."
My data is ordinal (e.g., Likert scale ratings). Can I use ANOVA?
This is debated. Many researchers use ANOVA on Likert data with 5+ points, arguing it is robust enough. The conservative approach is to use the Kruskal-Wallis test for ordinal data. Either way, report your rationale.