Linear Regression

Regression, in its simplest form, is a matter of finding the line-of-best-fit through data that relates one or more independent variables to a continuous dependent variable. (With more than one independent variable, the 'line' is really a plane or hyperplane.) The question is: what makes one line-of-fit better than another?

Linear regression defines the line-of-best-fit as the line for which the sum of the squared vertical distances (the residuals) between the line and each point in the training set is smallest. This approach is known as ordinary least squares. It has a number of properties that many other models lack:

  1. There is a closed-form solution: In order to find the line-of-best-fit, we don't need to test out lots of potential candidates. There is a simple formula which we can apply to find the line-of-best-fit directly. The solution is unique provided that no independent variable is an exact linear combination of the others.
  2. Linear regression is very interpretable. Each fitted coefficient gives the expected change in the dependent variable for a one-unit change in the corresponding independent variable, holding the others fixed. This lets us answer questions such as: 'If we change this independent variable by 10%, how would we expect the dependent variable to change as a result?'
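
Both properties can be illustrated with a short sketch. The code below is not taken from the course notebooks; it is a minimal example using NumPy, and the function name fit_line, the helper sum_squared_error, and the toy data are all hypothetical. It solves the normal equations (X^T X) beta = X^T y for a single independent variable, which is one common way to write the closed-form solution.

```python
import numpy as np


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y).

    Hypothetical helper: solves the normal equations (X^T X) beta = X^T y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones (for the intercept) next to x.
    X = np.column_stack([np.ones_like(x), x])
    # Solving the linear system avoids forming an explicit matrix inverse.
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    return float(beta[0]), float(beta[1])


def sum_squared_error(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and a line."""
    residuals = np.asarray(y, dtype=float) - (intercept + slope * np.asarray(x, dtype=float))
    return float(residuals @ residuals)


# Made-up toy data, roughly following y = 2 + 3x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.1, 4.9, 8.2, 10.8, 14.0]

intercept, slope = fit_line(x, y)
best_sse = sum_squared_error(x, y, intercept, slope)

# Any other line, such as one with a slightly different slope, fits worse.
other_sse = sum_squared_error(x, y, intercept, slope + 0.1)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"SSE of fitted line = {best_sse:.4f}, SSE of nudged line = {other_sse:.4f}")
```

The fitted slope is read directly as the interpretation described in point 2: for each one-unit increase in x, the model expects y to change by the value of slope. In practice a library routine such as np.linalg.lstsq is usually preferred over solving the normal equations by hand, because it behaves better when the independent variables are nearly collinear.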

Online resources

Click the links below to access the Jupyter Notebooks for linear regression