Machine learning is a subfield of artificial intelligence (AI) that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed. The “learning” part requires data (samples, examples, or observations) in which to look for underlying patterns. The computer system finds these patterns without a human writing explicit rules, and the patterns usually take the form of functions or decision boundaries. Depending on the characteristics of the data and the learning scenario, machine learning algorithms are commonly categorized as supervised, unsupervised, or reinforcement learning.
In this article, let’s explore supervised learning and unsupervised learning.
As the word “supervised” suggests, in supervised learning a model learns from a labeled dataset, with the labels acting as guidance. For this category of machine learning, the researcher needs a dataset of observations together with metadata (labels or classes) for each observation. For example, the observations could be short audio recordings and the labels could describe what is in each recording (e.g. dog bark, piano music, etc.). A pipeline illustration of supervised learning is shown in Figure 1.
Figure 1: Pipeline of supervised learning
Supervised ML models learn from the labeled dataset and are then typically used to predict or classify future events. The input is a training dataset with known labels, which means that for each training example, the correct answer is given as well. The model uses these answers to learn a mapping from inputs to outputs. After sufficient training, the model can provide predictions for new inputs it has not seen before.
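To make the train-then-predict pipeline concrete, here is a minimal sketch in Python. It is not the method behind Figure 1; it uses a nearest-centroid classifier, chosen only because it is short. The two numeric features per recording, the sample values, and the function names `train` and `predict` are all assumptions made up for illustration.

```python
import math
from collections import defaultdict

def train(samples, labels):
    """Learn one centroid (mean feature vector) per label from labeled data."""
    groups = defaultdict(list)
    for features, label in zip(samples, labels):
        groups[label].append(features)
    return {
        label: tuple(sum(column) / len(points) for column in zip(*points))
        for label, points in groups.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the new observation."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Hypothetical training set: each recording is summarized by two made-up features.
train_x = [(0.9, 0.2), (0.8, 0.3), (0.2, 0.9), (0.1, 0.8)]
train_y = ["dog bark", "dog bark", "piano music", "piano music"]

model = train(train_x, train_y)          # learning from labeled examples
print(predict(model, (0.85, 0.25)))      # prediction for a new, unlabeled input
print(predict(model, (0.15, 0.85)))
```

The labels are used only inside `train`. Once the centroids are learned, `predict` needs nothing but the features of the new observation.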
There are two main types of supervised learning problems: classification problems and regression problems. We will discuss these problems in future articles.
In contrast to supervised learning, the input data in unsupervised learning contains only raw data, without any labels or metadata. Unsupervised learning studies how systems can infer a function that describes hidden structure in unlabeled data. Instead of predicting the right output, an unsupervised learning system explores the data and finds previously unknown patterns in the dataset. A pipeline illustration of unsupervised learning is shown in Figure 2.
Figure 2: Pipeline of unsupervised learning
Since an unsupervised learning model receives a dataset without any instructions about the desired output, grouping the data is a typical approach. When new input data comes in, the model compares it with the groups it has found in order to guess the output. Clustering and association are two well-known families of unsupervised learning problems and will be discussed in our future articles. To conclude, unsupervised learning is where the machine is trained on unlabeled data without any guidance. Given the scarcity of labeled data in the world, the future of AI depends in large part on getting better at unsupervised learning.
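The grouping idea can be sketched with k-means clustering, one common clustering algorithm (the article does not name a specific one, so this choice is an assumption). The sample points, the function names `kmeans` and `assign`, and the shortcut of using the first `k` points as starting centres are all illustrative simplifications.

```python
import math

def kmeans(points, k, iterations=10):
    """Group unlabeled points into k clusters; returns (centroids, assignments)."""
    # Simplification: start from the first k points instead of a random choice.
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave the centroid in place if its cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments

def assign(centroids, point):
    """Compare a new point with the learned groups and return the nearest one."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

# Hypothetical unlabeled data: no labels are passed anywhere.
points = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
print(assign(centroids, (4.9, 5.1)))
```

The cluster indices carry no meaning of their own. The algorithm only says which points belong together, and a person still has to interpret what each group represents.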
Besides the supervised and unsupervised learning families, there is a mixture of the two called semi-supervised learning. The main difference is that semi-supervised models use both labeled and unlabeled data in the training stage.
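One simple way to combine both kinds of data is self-training (also called pseudo-labeling): fit a model on the labeled examples, let it label the unlabeled ones, then refit on everything. The sketch below assumes that approach, which is only one of several semi-supervised techniques. It runs a single round with no confidence threshold, and the data and the names `fit_centroids`, `predict`, and `self_train` are hypothetical.

```python
import math
from collections import defaultdict

def fit_centroids(samples, labels):
    """Compute one mean feature vector per label."""
    groups = defaultdict(list)
    for features, label in zip(samples, labels):
        groups[label].append(features)
    return {
        label: tuple(sum(col) / len(points) for col in zip(*points))
        for label, points in groups.items()
    }

def predict(centroids, features):
    """Return the label of the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

def self_train(labeled_x, labeled_y, unlabeled_x):
    """One round of self-training: pseudo-label the unlabeled data, then refit."""
    initial = fit_centroids(labeled_x, labeled_y)
    pseudo_y = [predict(initial, x) for x in unlabeled_x]
    return fit_centroids(labeled_x + unlabeled_x, labeled_y + pseudo_y)

# Hypothetical data: only two labeled examples, four unlabeled ones.
labeled_x = [(0.9, 0.2), (0.1, 0.8)]
labeled_y = ["dog bark", "piano music"]
unlabeled_x = [(0.8, 0.3), (0.7, 0.1), (0.2, 0.9), (0.3, 0.7)]

model = self_train(labeled_x, labeled_y, unlabeled_x)
print(predict(model, (0.75, 0.2)))
```

A practical version would usually keep only the pseudo-labels the model is confident about and repeat the process for several rounds, because wrong pseudo-labels can reinforce themselves.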
Hope you like our articles! We will discuss reinforcement learning and some machine learning examples in our future blog posts.
Editor: Chieh-Feng Cheng
Ph.D. in ECE, Georgia Tech
Technical Writer, inwinSTACK