lm() for comparing groups

You can use lm() as a parametric test to compare groups.

Linear regression for comparing groups

lm() can do the following things:

  • t-test: when you give it 2 groups to compare (equivalent to the equal-variance t-test)
  • ANOVA: when you give it more than 2 groups
  • linear regression: when you give it a continuous variable

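In all cases, lm() fits a linear model by least squares: it chooses the line that minimizes the sum of squared residuals. When the predictor is a single grouping variable, the fitted values are simply the group means.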
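
The three uses listed above can be sketched with datasets that ship with base R. The choice of PlantGrowth and cars, and all variable names, are assumptions made for illustration only. The sketch checks that lm() reproduces the p-value of an equal-variance t-test for 2 groups and of a one-way ANOVA for 3 groups.

```r
# PlantGrowth: weight (numeric) and group (factor with levels ctrl, trt1, trt2)
two_groups <- droplevels(subset(PlantGrowth, group %in% c("ctrl", "trt1")))

# 2 groups: the p-value of the group coefficient matches the
# equal-variance (Student's) t-test
fit_two <- lm(weight ~ group, data = two_groups)
p_lm <- summary(fit_two)$coefficients["grouptrt1", "Pr(>|t|)"]
p_t  <- t.test(weight ~ group, data = two_groups, var.equal = TRUE)$p.value

# More than 2 groups: the overall F-test matches a one-way ANOVA
fit_three <- lm(weight ~ group, data = PlantGrowth)
p_f_lm  <- anova(fit_three)[["Pr(>F)"]][1]
p_f_aov <- summary(aov(weight ~ group, data = PlantGrowth))[[1]][["Pr(>F)"]][1]

# Continuous predictor: ordinary linear regression (intercept and slope)
fit_line <- lm(dist ~ speed, data = cars)
coef(fit_line)

# Both comparisons should be TRUE
all.equal(p_lm, p_t)
all.equal(p_f_lm, p_f_aov)
```

Note that t.test() uses the Welch (unequal-variance) test by default, so var.equal = TRUE is needed for the results to agree with lm().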

Assumptions lm() makes about the data

Note that lm() assumes:

  • the measurements are continuous
  • the residuals are normally distributed
  • the groups have equal variances (similar spread)
  • the observations are independent
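
One possible way to check two of these assumptions is sketched below. It uses only base R functions; the PlantGrowth dataset and the choice of the Shapiro-Wilk and Bartlett tests are assumptions made for illustration, not the only reasonable options.

```r
# Fit the model first: the normality assumption concerns the residuals
fit <- lm(weight ~ group, data = PlantGrowth)
res <- residuals(fit)

# Normality of residuals: Shapiro-Wilk test
# (a small p-value suggests the residuals are not normal)
normality_check <- shapiro.test(res)

# Equal variances across groups: Bartlett test
# (a small p-value suggests the group variances differ)
variance_check <- bartlett.test(weight ~ group, data = PlantGrowth)

normality_check$p.value
variance_check$p.value
```

For a visual check, plot(fit, which = 1:2) draws the residuals-versus-fitted plot and the normal Q-Q plot. Independence cannot be tested this way; it depends on how the data were collected.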

Defining the formula for statistical tests

Most statistical tests in R accept the following input:

  • a data frame with the data
  • a formula (model) of the form y ~ x that specifies what you want to compare
    • y is the column that contains the measurements
    • x is the column that defines the groups you want to compare. You can specify multiple x columns by joining them with +, e.g. y ~ x1 + x2.
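
A minimal sketch of what such formulas look like in R. The column names come from the built-in PlantGrowth and ToothGrowth datasets, which are assumptions made for illustration.

```r
# One grouping column: compare weight between the levels of group
one_factor <- weight ~ group

# Two x columns: len explained by supp and dose (columns of ToothGrowth)
two_terms <- len ~ supp + dose

# A formula is an ordinary R object that can be stored and reused
class(one_factor)

# all.vars() lists the columns the formula refers to, y first
all.vars(two_terms)

# The column names are only looked up when the formula is used with data
fit <- lm(two_terms, data = ToothGrowth)
```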

Running lm()

Like these tests, lm() accepts a formula and a data frame. Save the result of lm() in a variable so that you can inspect it later.
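
A minimal sketch of a typical call, assuming the built-in PlantGrowth dataset and a hypothetical variable name fit:

```r
# Fit the model and store the result for later use
fit <- lm(weight ~ group, data = PlantGrowth)

# Coefficient table with estimates, standard errors and p-values
summary(fit)

# Estimates only: the intercept is the mean of the first group (ctrl),
# the other coefficients are differences from that group
coef(fit)

# Overall F-test for the group effect
anova(fit)
```

Because the fitted model is stored in fit, the same object can be passed to summary(), coef(), anova() and other functions without refitting.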