Linear Regression

Predicts the target as a weighted sum of the feature inputs.
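
Written out, the model for an instance with features $x_1, \dots, x_p$ is

$$\hat{y} = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$

where the $\beta_j$ are the learned weights and $\beta_0$ is the intercept.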

Assumptions

  • Linearity (no interactions and no non-linearities)
  • Normality (target variable follows a Normal Distribution given the features)
  • Homoscedasticity (constant variance of the error)
    • e.g. the error spread for high house prices should not be larger than for low house prices (see the residual-plot sketch after this list)
  • Independence
    • if violated, use a Mixed Effect Model or GEEs (Generalized Estimating Equations)
  • Fixed Features (there is no measurement error or uncertainty in the data)
    • otherwise a more complex model has to be used to account for the possible measurement error
  • Uncorrelated Features (no strong correlation / multicollinearity)
    • with correlated features it becomes hard to tell which of them actually has the effect on the target
    • use LDA to get less correlated features
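
A quick visual check for the homoscedasticity assumption is a residual-vs-fitted plot; a minimal sketch on synthetic data (all names and numbers here are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + rng.normal(scale=1.0, size=200)  # constant-variance noise

# Fit ordinary least squares via the design matrix [1, x]
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# Under homoscedasticity the residual spread stays constant across fitted values
plt.scatter(fitted, residuals, s=10)
plt.axhline(0, color="gray")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()
```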

Problem to solve

Given datapoints $x_i$ with targets $y_i$, find weights $\beta$ that minimize the error $\sum_i (y_i - \beta^\top x_i)^2$.

There is a closed-form solution (see below), or you can use Gradient Descent to obtain the weights by minimizing the Mean Squared Error.
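
The closed form is the normal equation $\hat{\beta} = (X^\top X)^{-1} X^\top y$; a minimal numpy sketch on synthetic data (names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=100)

# Normal equation; solve() is numerically safer than an explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to true_beta
```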

If our data has Gaussian noise we can even get confidence intervals for $\beta$.
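
Under the Gaussian-noise assumption the standard errors of the weights are well defined; one way to get confidence intervals is statsmodels (a sketch, data again illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.5, -0.5]) + rng.normal(scale=0.3, size=100)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.params)      # estimated weights (incl. intercept)
print(model.conf_int())  # 95% confidence interval per weight
```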

Explanations

Linear Regression has Intrinsic explanations.

We can get Explanations from a Linear Regression model via the estimated weights.

  • Numerical
    • one unit change → outcome changes by its weight (all other features held fixed)
  • Binary
    • changing from 0 to 1 → outcome changes by its weight
  • Categorical
    • with dummy (one-hot) encoding, each category's weight is the change relative to the reference category
  • Intercept
    • predicted value when all features are at their mean (when standardized)
    • else when all features are zero

Feature Importance can be measured with the absolute t-statistic $t_{\hat\beta_j} = \hat\beta_j / SE(\hat\beta_j)$: importance increases with the estimated weight and decreases with the weight's uncertainty (its standard error).
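
With statsmodels the t-statistics come for free after fitting; a small sketch (synthetic data, illustrative weights):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, 0.0, -0.7]) + rng.normal(scale=0.5, size=200)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.tvalues)             # large |t| -> more important feature
print(model.params / model.bse)  # the same t-statistic, computed by hand
```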

We can plot the importances with a Weight Plot and the feature effects with an Effect Plot.
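
A minimal sketch of an effect plot: the effect of feature $j$ on instance $i$ is $w_j x_{ij}$, and the plot shows the distribution of these effects per feature (the weights here are illustrative, not fitted):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
weights = np.array([2.0, 0.1, -0.7])  # illustrative fitted weights

# Per-instance effects: contribution of each feature to the prediction
effects = X * weights

plt.boxplot(effects)
plt.xticks([1, 2, 3], ["x1", "x2", "x3"])
plt.ylabel("effect on prediction")
plt.show()
```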

Quality of Explanations

  • Contrastive
    • to the all-zero instance
    • or the mean instance (when features are standardized)
  • Fidelity
    • Explanations are truthful if the assumptions are held
    • otherwise it reduces the actual dependencies to simple linear ones, which are then not truthful
  • Stability
    • Stable by design
  • Not selective by default

Evaluation

To evaluate a linear regression model, one can use the R-Squared Metric.
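
R-Squared measures the fraction of target variance explained by the model; a minimal implementation (equivalent to sklearn's r2_score):

```python
import numpy as np

def r_squared(y_true, y_pred):
    # R^2 = 1 - SSE / SST
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - sse / sst

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 7.2, 8.9])
print(r_squared(y_true, y_pred))  # close to 1 -> good fit
```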

Disadvantages

All nonlinearities and interactions have to be hand-crafted and given as input features. With highly correlated features (e.g. number of rooms and size of a house), the individual weights become unreliable and hard to interpret.

Other

What model describes our data the best?

Simple Linear Regression

A calculation with the Linear Gaussian Model yields a solution to the simple linear regression, provided not all datapoints $x_i$ are the same:

$$\hat\beta_1 = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$$

or, in terms of the data,

$$\hat\beta_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

The regression line is defined as $\hat{y} = \hat\beta_0 + \hat\beta_1 x$, which goes through the Center of Mass $(\bar{x}, \bar{y})$ of the data sequence.
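
A quick numerical check of the closed form against numpy's degree-1 polynomial fit (synthetic data, illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=50)
y = 1.2 + 0.8 * x + rng.normal(scale=0.5, size=50)

# Closed-form simple linear regression
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()

print(beta1, beta0)
print(np.polyfit(x, y, 1))  # [slope, intercept] -- matching the printout above
```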

Empirical Correlation Coefficient

$$r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

Steps

  1. Linear Model
  2. Optimization of a Linear Model
  3. Basis Function Expansion (see the sketch after this list)
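
A minimal sketch of step 3, basis function expansion: map $x$ to polynomial features and fit an ordinary linear model in the expanded space (data and degree are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(-3, 3, size=100)
y = np.sin(x) + rng.normal(scale=0.1, size=100)

# Expand x into the polynomial basis [1, x, x^2, x^3]
Phi = np.vander(x, N=4, increasing=True)
beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# The model is still linear in the weights, but nonlinear in x
print(beta)
```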

Use Cases for Linear Regression

  • Computer vision and trend analysis.
  • Dataset is rather small and a Linear Model is enough to approximate the function.
  • We use regression to predict a real-valued output.

Class Prediction

By applying Logistic Regression we can also do Classification.
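
A minimal classification sketch with scikit-learn's LogisticRegression (synthetic data, illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic binary labels

# A linear model passed through the sigmoid turns the weighted sum
# into a class probability
clf = LogisticRegression().fit(X, y)
print(clf.coef_, clf.intercept_)
print(clf.predict_proba(X[:3]))  # [P(class 0), P(class 1)] per instance
```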