By: Sneka. P

We are grappling with a pandemic that’s operating at a never-before-seen scale. Researchers all over the globe and frantically trying to develop a vaccine or a cure for COVID-19 while doctors are just keeping the pandemic from overwhelming the entire world.
So let’s consider a situation where doctors have four medical treatments to apply to cure the patients. Once we have the test results, one approach is to assume that the treatment which took the least time to cure the patients is the best among them. But what if some of these patients had been partially cured already, or if any other medication was already working on them?
In this article, let me initiate the ANOVA test and its different types that are being used to make better decisions. So I’ll demonstrate each type of ANOVA test in python to visualize how they work on covid-19 data. So let’s get going
WHAT IS ANOVA TEST?
An Analysis of Variance test, or ANOVA, can be thought of as a generalization of the t-test for more than 2 groups. The independent t-test is used to differentiate means of a situation between two groups. ANOVA is used when we want to compare the means of a condition between more than two groups.
ANOVA tests if there is a difference in the mean somewhere in the model, but it does not tell us where the difference is (if there is one). To find where the difference among the groups, we have to conduct the post-hoc tests.
To perform any tests, we first need to define the null and alternate hypothesis:
- NULL HYPOTHESIS: There is no significant difference between the groups.
- ALTERNATE HYPOTHESIS: There is significant difference between the groups.
Basically, ANOVA is performed by comparing two types of variation, the variation between the sample means, as well as the variation within each of the samples. The below mentioned formula represents one-way Anova test. The formula is given below.

Assumptions of an ANOVA Test
There are certain assumptions we need to make before performing ANOVA:
- The observation are obtained independently and randomly from the population defined by the factor levels.
- The data for each factor level is normally distributed.
- Independence of cases: the sample cases should be independent of each other.
- Homogeneity of variance: Homogeneity means that the variance among the groups should be approximately equal.
The premise of homogeneity of variance can be tested using tests such as Levene’s test or the brown-Forsythe test. Normality of the distribution of the scores can be tested using histograms, the values of skewness and kurtosis, or using tests such as Shapiro or q-q- plot.
The assumption of independence can be determined from the design of the study. It is important to note that ANOVA is not robust to violations to the assumptions of independence. This is to say that even if you violate the premise of homogeneity or normality you can perform the test and basically trust findings.
However, the results of ANOVA are invalid if the indecency assumption is violated. In general, with violation of homogeneity, the analysis is considered robust if you have equal-sized groups. With violations of normality, continuing with ANOVA is generally ok if you have a large sample size.
Types of ANOVA Tests:
- One-way ANOVA: It has just one independent variable.
- For e.g : differences in corona cases can be assessed by country, and a country can have 2, 20 or more different categories to compare.
- Two-way ANOVA: A two- way anova(also called factorial ANOVA) refers to an ANOVA using two independent variables
- For e.g: old age group may have higher corona cases overall compared to the Young Age group, but this difference could be greater (or less) in Asian countries compared to European countries.
- N-way ANOVA: A researcher can also use more than two independent variables, and this is an n-way ANOV (with n being the number of independent variables)
- For e.g: potential differences in corona cases can be examined by country, gender, age, group, ethnicity etc.
WITH REPLICATION (VS) WITHOUT REPLICATION
- Two-way ANOVA with replication: Two groups and the members of those groups are doing more than one thing
- For e.g: let’s say a vaccine has not been developed for covid-19, and doctors are trying two different treatments to cure two groups of covid-19 infected patients
- Two-way ANOVA without replication: Its used when you only have one group and you are double-testing that same group
- For e.g: let’s say a vaccine has been developed for covid-19, and researchers are testing one set of volunteers before and after they have been vaccinated to see if it works or not.
- POST-ANOVA Test
- When we conduct an ANOVA, we are attempting to determine if there is a statically significant difference between the groups. If we find that there is a difference, we will then need to examine where the group differences lay.
So this is the end I have tried to explain the ANOVA test using a relevant case study in these pandemic times. It was fun experience putting this all together for our community!