# Parametric vs. Non-parametric Procedures
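A **parametric test** assumes that the data come from a population whose distribution has a known form (typically normal) described by a fixed set of parameters. The usual assumptions are: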
**1. Normality**: Data in each group should be normally distributed
**2. Independence**: Data in each group should be sampled randomly and independently
**3. No Outliers**: Data should contain no extreme outliers
**4. Equal Variance**: Data in each group should have approximately equal variance
A **non-parametric test** (sometimes referred to as a _distribution-free test_) does not assume a particular form for the underlying distribution, although it still requires independent observations.
We can assess normality visually using a [[Q-Q (quantile-quantile) plot|Q-Q (quantile-quantile) plot]]. In these plots, the quantiles of the observed data are plotted against the quantiles expected under a normal distribution. The Python demo here draws a random sample from a normal distribution and plots it. If the data are normal, the points fall approximately along a straight line.
```python
import numpy as np
import statsmodels.api as statmod
import matplotlib.pyplot as plt

# create dataset with 100 values that follow a normal distribution
data = np.random.normal(0, 1, 100)

# create Q-Q plot with 45-degree line added to plot
fig = statmod.qqplot(data, line='45')
plt.show()
```
![[qq_plot_demo.png]]
- Tests to check for normality
    - Shapiro-Wilk
    - Kolmogorov-Smirnov
*The null hypothesis of both of these tests is that the sample was drawn from a normal (or Gaussian) distribution. Therefore, if the p-value is below the chosen significance level (commonly 0.05), the null hypothesis is rejected and the data are treated as non-normal. A non-significant p-value does not prove normality; it only means the test found no evidence against it.*
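
As a rough illustration of the two tests listed above, the sketch below runs both on a normal sample and on a clearly skewed one using `scipy.stats`. The sample sizes, the seed, and the helper name `normality_pvalues` are assumptions made for this example. Note that the Kolmogorov-Smirnov test compares against a fully specified distribution, so standardizing with the sample's own mean and standard deviation makes its p-value only approximate.

```python
import numpy as np
from scipy import stats

# Illustrative data (assumed for this sketch): one normal, one right-skewed sample
rng = np.random.default_rng(0)
normal_sample = rng.normal(loc=0, scale=1, size=200)
skewed_sample = rng.exponential(scale=1.0, size=200)

def normality_pvalues(sample):
    """Return (Shapiro-Wilk p-value, Kolmogorov-Smirnov p-value) for a 1-D sample."""
    _, shapiro_p = stats.shapiro(sample)
    # K-S against the standard normal after standardizing the sample.
    # Estimating mean and sd from the same data makes this p-value approximate.
    z = (sample - sample.mean()) / sample.std(ddof=1)
    _, ks_p = stats.kstest(z, "norm")
    return shapiro_p, ks_p

normal_p = normality_pvalues(normal_sample)
skewed_p = normality_pvalues(skewed_sample)

print(f"normal sample: Shapiro-Wilk p={normal_p[0]:.3f}, K-S p={normal_p[1]:.3f}")
print(f"skewed sample: Shapiro-Wilk p={skewed_p[0]:.3g}, K-S p={skewed_p[1]:.3g}")
```

A small p-value for the skewed sample means the null hypothesis of normality is rejected for that sample.
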
![[parametric_vs_nonparametric_tests.webp]]
Kaplan-Meier (KM) estimation, used in [[Customer Lifetime Value — Survival Analysis|survival analysis]], is a non-parametric procedure, whereas Cox regression is a semi-parametric procedure.
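
To show why Kaplan-Meier counts as non-parametric, here is a minimal sketch of the estimator written directly in NumPy. It uses only counts of subjects at risk and observed events, with no assumed distribution for the survival times. The function name `kaplan_meier` and the toy durations are assumptions for illustration, not taken from any library.

```python
import numpy as np

def kaplan_meier(durations, event_observed):
    """Return (event_times, survival_probabilities) for right-censored data.

    durations: time until the event or until censoring
    event_observed: 1/True if the event occurred, 0/False if censored
    """
    durations = np.asarray(durations, dtype=float)
    event_observed = np.asarray(event_observed, dtype=bool)
    event_times = np.unique(durations[event_observed])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)                    # still under observation at t
        events = np.sum((durations == t) & event_observed)  # events exactly at t
        s *= 1.0 - events / at_risk                         # product-limit update
        survival.append(s)
    return event_times, np.array(survival)

# Toy data (assumed): five subjects, two of them censored
times, surv = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(times, surv)
```
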
# Advantages and Disadvantages
Non-parametric tests have several advantages, including:
- More statistical power when assumptions of parametric tests are violated.
- Assumption of normality does not apply
- Small sample sizes are acceptable
- They can be used for all data types, including ordinal, nominal and interval (continuous)
- Can be used with data that has outliers
Disadvantages of non-parametric tests:
- Less powerful than parametric tests if assumptions haven’t been violated
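
The trade-off described above can be sketched by running a parametric test (Student's t-test) and a non-parametric counterpart (Mann-Whitney U) on the same data. The two groups below are made up for illustration: group A is consistently lower than group B except for a single extreme outlier, which violates the no-outliers assumption.

```python
import numpy as np
from scipy import stats

# Made-up data: group A is lower than group B, apart from one extreme outlier
group_a = np.array(list(range(1, 15)) + [1000], dtype=float)
group_b = np.arange(16, 31, dtype=float)

# Parametric: the outlier inflates group A's mean and variance
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric: works on ranks, so the outlier counts as just one large value
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:         p = {t_p:.3f}")
print(f"Mann-Whitney U: p = {u_p:.5f}")
```

With this data the t-test fails to detect the difference at the 0.05 level, while the rank-based test does. When the parametric assumptions hold, the t-test would generally be the more powerful of the two.
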