What is machine learning?

Definition

Literally, “machine” refers to a programmed computer and “learning” refers to learning from data.
In a general sense, machine learning means that a computer acquires some ability without being explicitly programmed for it.
From an engineering perspective, given a task $T$, corresponding experience (training data) $E$, and a performance measure $P$, machine learning aims to learn from $E$ so that the performance $P$ on task $T$ improves.

Machine learning is an interdisciplinary field that draws on computer science, statistics, mathematics, and other disciplines.

Data. Every instance is called a sample. The data used for training and for testing are called the training set and the testing set, respectively. Because some algorithms have parameters that must be tuned (hyperparameters), we also hold out a subset of the training set, called the validation set (sometimes called the evaluation set), which is used to judge how good or bad a parameter setting is.
Model. A model can be viewed as a function $f$ that maps an input $x$ to an output $y$. The model may depend on adjustable parameters $\theta$, and learning is the process of updating $\theta$.
Performance measurement. It is used to evaluate the model. A utility function or fitness function measures how good a model is, while a cost function measures how bad it is.
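The three ingredients above can be tied together in a small sketch. It assumes a one-parameter model $f(x) = \theta x$, mean squared error as the cost function, plain gradient descent as the update rule, and synthetic data generated from $y = 3x$ plus noise. The names `mse`, `fit`, and the candidate learning rates are illustrative choices, not part of the original text.

```python
import random

def mse(theta, data):
    """Cost function: mean squared error of f(x) = theta * x on (x, y) pairs."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

def fit(train, lr, steps=200):
    """Learn theta by gradient descent on the training cost."""
    theta = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to theta.
        grad = sum(2 * (theta * x - y) * x for x, y in train) / len(train)
        theta -= lr * grad
    return theta

rng = random.Random(0)

# Synthetic experience E (an assumption for illustration): y = 3x + noise.
xs = [rng.uniform(-1, 1) for _ in range(100)]
samples = [(x, 3.0 * x + rng.gauss(0, 0.1)) for x in xs]

# Split into training, validation, and testing sets.
rng.shuffle(samples)
train, val, test = samples[:60], samples[60:80], samples[80:]

# Tune the learning rate on the validation set, never on the testing set.
candidates = [0.001, 0.01, 0.1]
best_lr = min(candidates, key=lambda lr: mse(fit(train, lr), val))

theta = fit(train, best_lr)
print("learning rate:", best_lr)
print("theta:", theta)
print("test cost:", mse(theta, test))
```

The testing set is touched only once, at the end, so the reported cost is an estimate of performance on unseen data.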

There are many categories of machine learning algorithms. Generally, we can classify them from the following perspectives.

In supervised learning, each training sample $x\in \mathscr{X}$ has a label $y\in\mathscr{Y}$.

Classification. The label set $\mathscr{Y}$ is finite, such as $\{0,1\}$ or $\{\text{Yes}, \text{No}\}$. The classification task is to determine which class a given sample belongs to.
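As a minimal sketch of a classifier over the label set $\{0,1\}$, the following uses a nearest-centroid rule: each class is summarized by the mean of its training samples, and a new sample receives the label of the closest mean. This rule and the toy data are assumptions chosen for brevity; many other classifiers exist.

```python
def fit_centroids(X, y):
    """Return {label: mean feature vector} computed from the training samples."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))

# Toy training set: two well-separated classes in the plane.
X_train = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]]
y_train = [0, 0, 0, 1, 1, 1]

centroids = fit_centroids(X_train, y_train)
print(predict(centroids, [1.0, 1.0]))  # prints 0
print(predict(centroids, [3.0, 3.0]))  # prints 1
```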

Regression. The label set $\mathscr{Y}$ is continuous, such as the interval $[0,1]$, or has an even more complex structure. The regression task is to find a suitable map from $\mathscr{X}$ to $\mathscr{Y}$.
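One simple instance of such a map is a straight line fitted by ordinary least squares. The sketch below assumes scalar inputs and labels in $[0,1]$; the helper name `fit_line` and the data are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying on the line y = 0.2x + 0.1, with all labels inside [0, 1].
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.3, 0.5, 0.7, 0.9]

slope, intercept = fit_line(xs, ys)
prediction = slope * 2.5 + intercept  # continuous output for an unseen input
print(round(slope, 3), round(intercept, 3), round(prediction, 3))
```

Unlike a classifier, the fitted map returns a real number for any input, including inputs that never appeared in the training set.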

Ranking. The samples are split into groups, and the label set can be either discrete or continuous. This is a special task commonly used in recommender systems; it aims to rank the samples within each group.
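The grouping aspect of ranking can be sketched as follows. The example assumes a scoring function is already available (here the relevance score is simply stored with each item) and shows only how samples are ordered within their own group. The group names, items, and scores are hypothetical.

```python
from collections import defaultdict

def rank_within_groups(samples, score):
    """samples: (group_id, item) pairs. Returns {group_id: items sorted best-first}."""
    groups = defaultdict(list)
    for group_id, item in samples:
        groups[group_id].append(item)
    return {g: sorted(items, key=score, reverse=True) for g, items in groups.items()}

# Hypothetical recommendation candidates: (user, (item name, predicted relevance)).
samples = [
    ("user_a", ("book", 0.2)),
    ("user_a", ("film", 0.9)),
    ("user_b", ("song", 0.5)),
    ("user_b", ("game", 0.7)),
    ("user_a", ("song", 0.4)),
]

ranked = rank_within_groups(samples, score=lambda item: item[1])
names = {g: [name for name, _ in items] for g, items in ranked.items()}
print(names["user_a"])  # prints ['film', 'song', 'book']
print(names["user_b"])  # prints ['game', 'song']
```

Items are compared only with other items in the same group, so a score of 0.7 for one user says nothing about its position in another user's list.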

In unsupervised learning, the training samples have no labels. Some common unsupervised learning algorithms are given below:

Clustering
- K-Means
- DBSCAN
- Hierarchical Cluster Analysis (HCA)
Anomaly detection and novelty detection
- One-class SVM
- Isolation Forest
Visualization and dimensionality reduction
- Principal Component Analysis (PCA)
- Kernel PCA
- Locally-Linear Embedding (LLE)
- t-distributed Stochastic Neighbor Embedding (t-SNE)
Association rule learning
- Apriori
- Eclat
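To make one entry of the list concrete, here is a minimal sketch of K-Means using Lloyd's algorithm: alternate between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The fixed initial centroids, the iteration count, and the toy points are assumptions made so that the run is deterministic; practical implementations choose the initialization more carefully.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm from given initial centroids. Returns (centroids, labels)."""
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        centroids = new_centroids
    return centroids, labels

# Unlabeled toy data: two visible groups, no label is ever provided.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

The cluster indices are arbitrary identifiers discovered from the data, not labels supplied with the training set.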