# Linear Regression

## What is Linear Regression?

---

>[!Abstract]
>Linear regression is a form of [[Supervised Learning]], where a model is trained on labeled input data. Linear regression has a quick runtime and is generally interpretable. The goal of linear regression is to estimate a function $f(x)$ such that each feature has a linear relationship to the target variable $y$:
>$$y = X\beta$$
>where $X$ is a matrix of predictor variables and $\beta$ is a vector of parameters that determines the weight of each variable in predicting the target variable.

Regression got its name from the observation that extreme values tend to drift back towards the population average:

>[!Quote]
>Galton called this phenomenon **regression**.
>A father's son's height tends to *regress* (or drift) towards the mean height.

```R
library(caTools)  # provides sample.split()

# By convention, pass the feature we're trying to predict (G3)
# to the splitter
sample <- sample.split(df$G3, SplitRatio = 0.7)

# Train/test split
train <- subset(df, sample == TRUE)
test  <- subset(df, sample == FALSE)

# Fit the model; `.` uses all remaining columns as predictors
model <- lm(G3 ~ ., data = train)

# Inspect the fit; R flags each feature's statistical significance with *'s
summary(model)
```

## Evaluating Linear Regression

Evaluation of regression models is built on the concept of a residual: the distance between the model's prediction and the actual value (see [[Residuals]]). Linear regression generally estimates $\beta$ by minimizing the residual sum of squares (RSS):

$$RSS(\beta) = (y - X\beta)^T(y - X\beta)$$

The total goodness-of-fit is given by:

$$R^2 = 1 - \frac{RSS}{TSS}$$

where $TSS = ESS + RSS$ is the total sum of squares and $ESS$ the explained sum of squares. Ranging from 0 to 1, $R^2$ gives the proportion of the variability in the data that is explained by the model.

Other metrics of goodness-of-fit include the [[Mean Squared Error (MSE)]] and [[Mean Absolute Error (MAE)]]: the former averages the squared residuals, while the latter averages their absolute values. Both the closed-form fit and these metrics are sketched in R at the end of this note.

>[!Note]
>The MSE penalizes larger errors more than the MAE, making it more sensitive to outliers.

## Pros and Cons of Linear Regression

**Pros:**
- Simple to explain
- Highly interpretable
- Model training and prediction are fast
- No tuning is required (excluding regularization)
- Features don't need scaling
- Can perform well with a small number of observations
- Well understood

**Cons:**
- Assumes a linear relationship between the features and the response
- Performance is (generally) not competitive with the best supervised learning methods due to high bias
- Can't automatically learn feature interactions

In the plot below, a linear regression model would not perform well: the number of rentals differs by season, a non-linear relationship the model cannot capture.

![[R_boxplot.png]]

![[Regression to the Mean]]
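## Sketches in R

As a minimal sketch of the RSS minimization above: the $\beta$ that minimizes $RSS(\beta)$ has the closed-form solution $\hat{\beta} = (X^TX)^{-1}X^Ty$ (the normal equations). The toy data below is purely illustrative, not from the dataset used earlier.

```R
# Toy data (illustrative): y depends linearly on x
set.seed(42)
x <- runif(50, 0, 10)
y <- 3 + 2 * x + rnorm(50)

X <- cbind(1, x)  # design matrix with an intercept column

# Solve the normal equations (X^T X) beta = X^T y
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

beta_hat         # close to the true coefficients c(3, 2)
coef(lm(y ~ x))  # lm() recovers the same estimates
```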
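And a sketch of computing the evaluation metrics by hand, assuming the `model`, `test`, and `G3` objects from the code block earlier in this note:

```R
# Predict on the held-out test set (assumes model/test/G3 from above)
preds <- predict(model, test)
res   <- test$G3 - preds  # residuals

RSS <- sum(res^2)                        # residual sum of squares
TSS <- sum((test$G3 - mean(test$G3))^2)  # total sum of squares

R2  <- 1 - RSS / TSS   # proportion of variance explained
MSE <- mean(res^2)     # mean squared error
MAE <- mean(abs(res))  # mean absolute error
```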