Prediction Problems

Prediction problems can be categorized into two main types:

Regression: Predicting continuous numerical values, such as sales figures, stock prices, or weather forecasts. It refers to how much/how many questions.
Classification: Predicting discrete categorical labels, such as whether an email is spam or not, whether a customer will churn or not, or whether a patient has a specific disease or not. It refers to which on questions.

In the terminology of machine learning:

The dataset is called a training dataset or training set.
Each row in the dataset is called a sample, example, data point or instance.
The thing we are trying to predict is called a label or target (Y).
The variables upon which the predictions are based are called features or covariates (X).
The model that describes how features can be transformed into an estimate of the target.
The weights or slopes determine the influence of each feature on our prediction.
The bias, offset or intercept determines the value of the estimate when all features are zero.
Transformation involves changing the scale or form of the variables in order to improve the fit of the model or to make the results easier to interpret. For example, you might transform a variable by taking its natural logarithm, its square root, or its reciprocal.
Translation involves adding or subtracting a constant value to all of the observations of a variable. Translation does not change the relationship between the variables, but it can be used to shift the results of the regression model up or down. For example, you might translate a variable by adding or subtracting the mean of the variable.

Linear Regression Model: \({\hat{\mathbf{y}}} = \mathbf{X} \mathbf{w} + b\)

\(\mathbf{X}\): Design Matrix \(\mathbf{X} \in \mathbb{R}^{n \times d}\). Here, \(\mathbf{X}\) contains one row for every example and one column for every feature.
Element-wise Operations: Matrix broadcasting with vector.
The goal of linear regression is to find the weight vector \(\mathbf{w}\) and the bias term \(b\).

Regression