For tens of thousands of years humanity has applied its enormous capacity for environmental adaptation through the creation and use of tools, which Joseph Campbell referred to as “the beginning of a machine.” Since the mid-1950s, scientists and engineers have been attempting to develop machines that can think and behave more like human beings.

In 1959, Arthur Lee Samuel, one of the early pioneers of what we now call artificial intelligence, coined the term machine learning, which he defined as “the programming of a digital computer to behave in a way which, if done by human beings or animals, would be described as involving the process of learning (by) […] experience” (p. 210), i.e., trial and error. Machine learning couples algorithms with typically vast amounts of data to improve the accuracy of the resulting models.

Artificial intelligence is the scientific study of the computational principles behind thought and intelligent behavior (“Great Debate”). Machine learning (ML) is a subfield of artificial intelligence (AI) that uses data to learn and improve continuously. In other words, an ML system is not explicitly programmed to accomplish its task, which gives it a powerful, if extremely narrowly focused, capacity for something like educational autonomy. The models developed are only as good as the algorithms employed and the data they are fed. This is a classic example of “garbage in, garbage out”: neither the sheer volume of data nor the statistical sophistication of the algorithms can compensate for bad data. It is worth noting that most of the AI advancements in the news today are attributable to ML.

Machine learning is classified, or grouped, by what its algorithms have been designed to accomplish. Broadly, the goals are classification, clustering, and regression.

The approaches by which ML pursues these goals are supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

Classification — Inputs are divided into groups, or classes, which are defined, or labeled, at the outset. Classification involves supervised learning. Think of the objective here as providing a label to a discrete value, in a logical “either/or” paradigm. Examples: “Spam/Not Spam,” “Approved/Denied,” “Yellow/Blue,” “Fast/Slow,” “Legitimate/Fraudulent,” etc.
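A minimal sketch of the idea: a 1-nearest-neighbor classifier assigns each new input the "either/or" label of the closest labeled example. The single numeric feature (a hypothetical count of "spammy" words in a message) is an illustrative assumption, not a real spam-filtering feature set.

```python
# Minimal classification sketch: 1-nearest-neighbor on a single
# hypothetical feature (count of "spammy" words in a message).

def classify(train, x):
    """Return the label of the training point whose feature is closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Labeled training data: (feature value, class label) pairs.
train = [(0, "Not Spam"), (1, "Not Spam"), (8, "Spam"), (12, "Spam")]

print(classify(train, 2))   # near the "Not Spam" examples
print(classify(train, 10))  # near the "Spam" examples
```

The point is the shape of the problem: discrete, predefined labels and labeled examples to learn from.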

Clustering — Inputs are divided into groups, or clusters, which are not defined, or not labeled, at the outset. Clustering involves unsupervised learning. Think of the objective here as discovering natural groupings in the data rather than assigning predefined labels. Examples: segmenting customers by purchasing behavior, grouping similar documents, flagging unusual network traffic.
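To make the contrast with classification concrete, here is a minimal k-means sketch on one-dimensional data: the algorithm is given no labels at all and discovers the two groups on its own. The data values and the naive "first k points" initialization are illustrative assumptions.

```python
# Minimal clustering sketch: k-means on unlabeled 1-D data.

def kmeans(points, k=2, iters=20):
    centers = list(points[:k])  # naive initialization: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center.
            i = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[i].append(p)
        # Move each center to the mean of its assigned points.
        centers = [sum(cl) / len(cl) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans([1.0, 1.2, 0.8, 9.0, 9.5, 10.1])
print(centers)  # one center near 1.0, the other near 9.5
```

No labels were ever provided; the grouping emerges from the data itself, which is the essence of the unsupervised approach described below.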

Regression — Determining a continuous value from a set of data. Regression involves supervised learning. Think of the objective here as predicting a quantity or numeric value. Regression problems sometimes deal with time series analysis. Examples: “Height,” “Age,” “Revenue,” “Account Balance,” “Car Value.”
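A minimal sketch of regression: an ordinary least-squares fit of a line y = slope · x + intercept. The (x, y) pairs are invented for illustration and happen to lie exactly on a line, so the fit recovers it exactly.

```python
# Minimal regression sketch: ordinary least-squares fit of a line
# to numeric (x, y) data, predicting a continuous value.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares slope and intercept.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # generated from y = 2x + 1
slope, intercept = fit_line(xs, ys)
print(slope, intercept)     # recovers slope 2.0, intercept 1.0
```

Unlike classification, the output here is a quantity on a continuous scale, not a label drawn from a fixed set.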

Supervised Learning — Machines are fed labeled training data containing input-output pairs, where the possible outputs are known beforehand. Classic examples are images of cats, dogs, hot dogs, and so on, fed into the machine with the proper label assigned to each. Typically, a data scientist or similar professional actively “corrects” the machine as necessary throughout the learning process. Some consider supervised learning a very weak example of AI. Supervised learning typically involves classification or regression.

Unsupervised Learning — Machines are fed unlabeled training data whereby the possible outputs are not known beforehand. The approach can unearth hidden patterns and correlations unknown at the outset. Some consider unsupervised learning to be a fairly strong example of AI. Unsupervised learning typically involves clustering.

Semi-Supervised Learning — This method resides somewhere between supervised and unsupervised learning. In the most common scenario, you provide the machine with a small amount of labeled data and a much larger amount of unlabeled data. Many data scientists prefer this approach because labeling vast amounts of data takes tremendous time and effort. In addition, human involvement in the labeling process can itself introduce varying degrees of bias.
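One common semi-supervised technique, "self-training," can be sketched in a few lines: fit a simple model on the few labeled points, use it to pseudo-label the unlabeled pool, then retrain on the combined data. The 1-D feature values and the 1-nearest-neighbor "model" are illustrative assumptions, not a production pipeline.

```python
# Semi-supervised "self-training" sketch: a little labeled data,
# a larger unlabeled pool, and pseudo-labels bridging the two.

def nearest_label(train, x):
    """Return the label of the training pair whose feature is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

labeled = [(1.0, "low"), (9.0, "high")]   # small labeled set
unlabeled = [0.5, 1.5, 8.2, 9.7]          # much larger unlabeled pool

# Step 1: pseudo-label the unlabeled data with the initial model.
pseudo = [(x, nearest_label(labeled, x)) for x in unlabeled]

# Step 2: retrain on labeled + pseudo-labeled data combined.
train = labeled + pseudo
print(nearest_label(train, 2.0))  # predicted from the enlarged training set
```

Only two points were ever labeled by a human; the other four labels were produced by the model itself, which is exactly the labor savings (and the potential bias-propagation risk) described above.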

Reinforcement Learning — This approach allows software “agents,” acting on behalf of a user, for example, to learn optimal behavior through the old “carrot and stick,” or reward-based, motivational model. The agent can be a program playing Fortnite, a program controlling a leg of a robot, a self-driving vehicle, etc.
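A minimal sketch of the reward-driven idea: an epsilon-greedy agent learns action-value estimates (Q-values) for a two-armed bandit purely from the rewards it receives. The reward amounts, learning rate, and exploration rate are invented for illustration.

```python
# Minimal reinforcement-learning sketch: epsilon-greedy value
# learning on a two-armed bandit, driven only by rewards.
import random

def train_agent(rewards, episodes=200, epsilon=0.1, alpha=0.1, seed=0):
    rng = random.Random(seed)        # seeded for reproducibility
    q = [0.0] * len(rewards)         # one value estimate per action
    for _ in range(episodes):
        # Explore a random action with probability epsilon,
        # otherwise exploit the current best estimate.
        if rng.random() < epsilon:
            a = rng.randrange(len(rewards))
        else:
            a = max(range(len(rewards)), key=q.__getitem__)
        r = rewards[a]               # the "carrot": reward for action a
        q[a] += alpha * (r - q[a])   # nudge the estimate toward the reward
    return q

q = train_agent([1.0, 5.0])  # action 1 pays more
print(q)                      # the agent's estimate for action 1 wins out
```

No one tells the agent which action is correct; it discovers the better-paying action by trying things and reinforcing what earned the larger reward.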

Scientists and engineers employ any number of algorithms depending upon the ML task: classification, clustering, or regression. The most common are Linear Regression, Logistic Regression, Classification and Regression Trees, K-Nearest Neighbors, Naïve Bayes, Support Vector Machines, Bagging with Random Forests, and Adaptive Boosting, all of which will be covered in future articles.

Machine learning is a very exciting field with unlimited applications, and we have just scratched the surface of what it can help us accomplish. This article has presented ML at a fairly high level, assuming some industry and technical knowledge, and there is quite a bit more to this story. Please look for additional insights in future articles as we explore this dynamic aspect of artificial intelligence and how, if understood and applied correctly, it can transform many industries and our future.


ASU Originals Project: “Great Debate: Artificial Intelligence: Who is in Control?”

Samuel, A. L., “Some Studies in Machine Learning Using the Game of Checkers,” IBM Journal of Research and Development, vol. 3, no. 3, July 1959, pp. 210–229.