Baseline

A baseline is used to evaluate how an ML algorithm performs compared to simpler, naive, or trivial solutions.

A model or algorithm that provides a reference point for comparison:

rule-based algorithm
heuristic algorithm
a simple statistic about training data
another ML algorithm
etc.

#Random Prediction

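Like rolling a die. Be careful, though, and look at the probability distribution of the classes or targets: on an imbalanced dataset, uniform guessing behaves differently from guessing in proportion to how often each value occurs.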

classification: randomly picking a class from all the classes of the problem.
regression: randomly selecting a target from all unique target values in the training data.
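
A minimal sketch of both random baselines, using only the Python standard library. The function names and the `weighted` and `seed` parameters are assumptions made for illustration; they are not part of the original notes. With `weighted=True`, classes are drawn in proportion to their training frequency instead of uniformly.

```python
import random
from collections import Counter


def random_baseline_classification(train_labels, n_predictions, seed=0, weighted=False):
    """Pick a class at random for every prediction.

    weighted=False: every class is equally likely (uniform guessing).
    weighted=True: classes are drawn in proportion to their training frequency.
    """
    rng = random.Random(seed)  # fixed seed so the baseline is reproducible
    counts = Counter(train_labels)
    classes = sorted(counts)
    weights = [counts[c] for c in classes] if weighted else None
    return rng.choices(classes, weights=weights, k=n_predictions)


def random_baseline_regression(train_targets, n_predictions, seed=0):
    """Pick a random value from the unique target values seen in training."""
    rng = random.Random(seed)
    unique_targets = sorted(set(train_targets))
    return [rng.choice(unique_targets) for _ in range(n_predictions)]
```

Because the predictions are random, the score of this baseline varies from run to run unless the seed is fixed; averaging over several seeds gives a more stable reference point.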

#Zero Rule

classification: predict the most common class in the training dataset, independent of the input value. E.g., with 100 records (80 non-spam, 20 spam), always predicting non-spam gives a baseline of 80% accuracy.
regression: predict the sample average of the target values in the training data.
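
The Zero Rule can be sketched in a few lines of standard-library Python. The helper names (`zero_rule_classification`, `zero_rule_regression`, `accuracy`) are hypothetical, chosen for this example. The usage at the bottom reproduces the spam example above and, like it, scores the baseline on the same records it was fitted on.

```python
from collections import Counter
from statistics import mean


def zero_rule_classification(train_labels, n_predictions):
    """Always predict the most common class in the training labels."""
    most_common_class, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_class] * n_predictions


def zero_rule_regression(train_targets, n_predictions):
    """Always predict the sample average of the training targets."""
    average = mean(train_targets)
    return [average] * n_predictions


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# The example from the notes: 80 non-spam and 20 spam records.
labels = ["non-spam"] * 80 + ["spam"] * 20
predictions = zero_rule_classification(labels, len(labels))
print(accuracy(labels, predictions))  # 0.8
```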

#Machine Learning

Text classification, machine translation: SVM with a linear kernel.
General numerical dataset: Linear Regression or Logistic Regression or KNN (k=5)
Image classification: SVM with a linear kernel, or a convolutional [[neural network]].
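
As an illustration of how small such a baseline can be, here is a rough sketch of KNN classification with k=5 in plain Python. The function name `knn_predict` and the use of Euclidean distance are assumptions; in practice a library implementation would normally be used instead.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated clusters (made up for this example).
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
train_y = ["a"] * 5 + ["b"] * 5

print(knn_predict(train_X, train_y, (0.2, 0.2)))  # a
print(knn_predict(train_X, train_y, (5.8, 5.8)))  # b
```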

#Rule-based or groups of humans

E.g., hand-written rules, or the performance of human experts such as doctors on the same task.
