Checking normality of residuals
Residuals are calculated by lm() and can be checked for normality by functions from the olsrr package.
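As a minimal sketch of what this can look like, the example below fits a model and hands it to two olsrr functions. The PlantGrowth dataset (built into R) and the name fit are assumptions chosen for illustration only; they are not the data of this tutorial.

```r
# Illustration only: PlantGrowth is a built-in R dataset, used here as a stand-in
library(olsrr)

# Fit a linear model: plant weight explained by treatment group
fit <- lm(weight ~ group, data = PlantGrowth)

# Print a table of normality tests on the residuals (includes Shapiro-Wilk)
ols_test_normality(fit)

# Draw a QQ-plot of the residuals
ols_plot_resid_qq(fit)
```

Both functions take the fitted model itself, so you do not have to extract the residuals first.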
When to check normality of residuals?
Each time you use lm(), you should check the normality of the residuals, unless you have fewer than 7 measurements. With so few measurements you cannot reliably check normality, so you have to assume a normal distribution without being sure (but be aware that the statistical test you then perform is prone to generating false positives).
Can you use lm() to calculate residuals if assumptions are not met?
Are the residuals reliable if the assumptions of lm() are not met? It depends.
- When the data are not normal, you can still use lm() to calculate the residuals and check their normality.
- When the groups have unequal spreads, you can still use lm() to calculate the residuals and check their normality.
- When the data are dependent (repeated measures or paired), the residuals are calculated differently and lm() is not suited to calculating them.
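To see why the first two cases are unproblematic, it helps to look at what lm() actually calculates for independent groups: each residual is simply the measurement minus the mean of its group. The sketch below checks this by hand. It assumes the built-in PlantGrowth dataset as example data; the names fit, group_means and manual_res are made up for this illustration.

```r
# Illustration only: PlantGrowth is a built-in R dataset with independent groups
fit <- lm(weight ~ group, data = PlantGrowth)

# For every measurement, look up the mean of the group it belongs to
group_means <- ave(PlantGrowth$weight, PlantGrowth$group)

# Residual by hand: measurement minus its group mean
manual_res <- PlantGrowth$weight - group_means

# Compare with the residuals calculated by lm() (names are dropped first)
all.equal(unname(residuals(fit)), manual_res)
```

This calculation treats every measurement as independent, which is why it does not carry over to paired or repeated measures data.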
Do you have to use the olsrr package?
No, the residuals are accessible in the list that is generated by lm().
You can run shapiro.test() on the residuals as follows:
shapiro.test(fit$residuals)
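A self-contained version of this check might look as follows. The PlantGrowth dataset and the names fit and sw are assumptions for illustration; replace them with your own data and model. The example uses residuals(fit), which returns the same values as fit$residuals.

```r
# Illustration only: PlantGrowth is a built-in R dataset, used here as a stand-in
fit <- lm(weight ~ group, data = PlantGrowth)

# Shapiro-Wilk test on the residuals of the model
sw <- shapiro.test(residuals(fit))

# Full test output: W statistic and p-value
print(sw)

# The p-value on its own; a small value (e.g. below 0.05) suggests
# that the residuals are not normally distributed
sw$p.value
```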
You can use ggplot2 to create a QQ-plot of the residuals:
library(ggplot2)
ggplot(data.frame(res = residuals(fit)), aes(sample = res)) + stat_qq() + stat_qq_line()
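If you do not want to load ggplot2 either, base R can draw a comparable QQ-plot with qqnorm() and qqline(). This is an alternative to the ggplot2 code, not a replacement for it, and the PlantGrowth dataset and the names fit and res are again only assumptions for illustration.

```r
# Illustration only: PlantGrowth is a built-in R dataset, used here as a stand-in
fit <- lm(weight ~ group, data = PlantGrowth)
res <- residuals(fit)

# QQ-plot of the residuals against the theoretical normal quantiles
qqnorm(res)

# Reference line through the first and third quartiles
qqline(res)
```

Points that follow the reference line closely indicate residuals that are approximately normal.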
Quizzes