Linear regression
A type of regression method that uses functions linear in the parameters to predict the label.
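As a minimal sketch of "linear in the parameters": the prediction is a dot product of a weight vector with the feature vector plus an intercept. The names below (predict, w, b) are illustrative, not from the source.

```python
import numpy as np

def predict(X, w, b):
    # A function linear in the parameters w (weights) and b (intercept).
    # X: (n_samples, n_features) feature matrix.
    return X @ w + b
```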
Input variable
The feature vector for the machine learning function. In the context of regression, it is also called the independent variable, predictor variable, or explanatory variable.
Output variable
The output of the machine learning function. In the context of regression, it is also called the dependent variable or response variable.
Ordinary Least Squares (OLS)
Linear regression method derived from Maximum Likelihood Estimation (MLE) under a Gaussian noise assumption. The learning objective for optimization is the sum of squared errors.
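A minimal OLS sketch using NumPy's least-squares solver; the synthetic data and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                            # synthetic features
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

X1 = np.hstack([X, np.ones((100, 1))])                   # ones column for the intercept
w, *_ = np.linalg.lstsq(X1, y, rcond=None)               # minimizes the sum of squared errors
```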
Residual sum of squares (RSS)
The sum of the squared differences between the predicted values and the labels.
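As a one-function sketch (rss is an illustrative name):

```python
import numpy as np

def rss(y_hat, y):
    # Sum of squared differences between predictions and labels.
    return np.sum((y_hat - y) ** 2)
```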
The coefficient of determination
An evaluation metric of how well the linear function fits the data; it compares the learned linear function against the best constant function (the mean of the labels). Its value lies in [0, 1], and a larger value means the learned linear function fits better.
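A minimal sketch of that comparison: one minus the ratio of the linear function's squared error to the squared error of the best constant function, the label mean (function names are illustrative).

```python
import numpy as np

def r_squared(y_hat, y):
    rss = np.sum((y - y_hat) ** 2)           # error of the learned linear function
    tss = np.sum((y - np.mean(y)) ** 2)      # error of the best constant function
    return 1.0 - rss / tss
```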
Ridge Regression
Linear regression method derived from Maximum A Posteriori (MAP) estimation under a Gaussian prior on the parameters. The learning objective for optimization consists of a sum of squared errors plus a regularization term: a regularization coefficient times the squared 2-norm of the parameter vector.
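A closed-form sketch of this objective, assuming the intercept is folded into the features and regularized along with the other parameters (ridge_fit and lam are illustrative names).

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Minimizes ||X w - y||^2 + lam * ||w||^2, which has the
    # closed-form solution w = (X^T X + lam * I)^(-1) X^T y.
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)
```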
Logistic regression
Linear classification method. It uses the logistic function (for binary classification) or the softmax function (for multi-class classification) to transform linear functions of the input into a probability vector, and then derives the learning objective for optimization by MLE or MAP.
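A sketch of the two transformations; the names are illustrative, and the stability shift in softmax is a standard numerical trick, not from the source.

```python
import numpy as np

def logistic(z):
    # Maps a linear score to a probability in (0, 1); binary classification.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Maps a vector of per-class linear scores to a probability vector.
    z = z - np.max(z)          # shift for numerical stability; result unchanged
    e = np.exp(z)
    return e / np.sum(e)
```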
Gradient descent
An optimization algorithm. For minimization, it iteratively moves the current solution in the direction of the negative gradient; for maximization, in the direction of the gradient.
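A minimal minimization sketch; the function names, learning rate, and example objective are illustrative.

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, n_steps=100):
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        w = w - lr * grad(w)     # step in the negative gradient direction
    return w

# Example: minimize f(w) = ||w||^2, whose gradient is 2w; the minimizer is w = 0.
w_star = gradient_descent(lambda w: 2 * w, w0=[3.0, -4.0])
```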
Stochastic gradient descent
An optimization algorithm similar to gradient descent, but it replaces the exact gradient over the whole dataset with a stochastic estimate: the gradient of the term(s) associated with one or a few randomly sampled data points.
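A minimal sketch, assuming the objective is an average of per-example terms so that a random example's gradient is an unbiased estimate of the full gradient (grad_i and the other names are illustrative).

```python
import numpy as np

def sgd(grad_i, w0, n_points, lr=0.01, n_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    for _ in range(n_steps):
        i = rng.integers(n_points)     # pick a random data point
        w = w - lr * grad_i(w, i)      # step along its stochastic gradient
    return w
```

Sampling a small mini-batch instead of a single point is a common variant that reduces the variance of the gradient estimate.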