In this series, we explore, at a high level, some of the ideas, techniques, and algorithms at the foundation of AI. AI covers a wide range of techniques, and this series covers several categories of problems.
In this post, we will explore Learning.
So far, we have seen AI problems in which the computer had to be given explicit instructions to perform a task.
In Machine Learning, we do not give the computer explicit instructions. Instead, we give it access to information in the form of data or patterns, and the computer learns from this data so that it can perform the task on its own.
There are three main categories of machine learning: (a) Supervised Learning, (b) Reinforcement Learning, and (c) Unsupervised Learning.
In Supervised Learning, we feed the computer input-output pairs, and the computer arrives at a function that maps inputs to outputs.
Classification Tasks - categorizing data into discrete categories.
Identifying counterfeit banknotes - The computer is trained on a dataset containing physical attributes of several banknotes and a label for each indicating whether the note is authentic or counterfeit. Then, given the attributes of a new banknote, the AI will be able to classify it as authentic or counterfeit.
Weather Forecasting - Similar to the example above, we give the computer training data containing humidity and pressure for a set of days and a label indicating if it rained on those days. Using this data and today’s humidity and pressure, the computer will be able to predict if it will rain today.
Regression Tasks - when the output is a continuous value or real number.
Commonly used algorithms include k-Nearest Neighbors classification, Linear Regression, the Perceptron Learning Rule, and Support Vector Machines.
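To make the first of these concrete, here is a minimal sketch of k-nearest neighbors classification applied to the rain example. The function name `knn_classify` and the humidity and pressure readings are made up for illustration; they are not taken from the course.

```python
import math
from collections import Counter

def knn_classify(training, point, k=3):
    """Label `point` by majority vote among its k nearest training examples.

    training: list of (features, label) pairs; point: a feature tuple.
    """
    # Sort the training examples by Euclidean distance to the query point.
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], point))
    # Collect the labels of the k closest examples and return the most common.
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical (humidity, pressure) readings, scaled to the range 0..1.
training = [
    ((0.90, 0.20), "rain"),
    ((0.85, 0.30), "rain"),
    ((0.80, 0.25), "rain"),
    ((0.30, 0.80), "no rain"),
    ((0.20, 0.90), "no rain"),
    ((0.25, 0.75), "no rain"),
]

prediction = knn_classify(training, (0.82, 0.28), k=3)
print(prediction)
```

There is no separate training step here: the "model" is simply the stored examples, and all the work happens at prediction time.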
Overfitting happens when a model fits its training dataset too closely and, as a result, fails to generalize to future data. Hold-out cross-validation is a way to check for overfitting. In this method, we split the data into a training set and a test set, so that learning happens on the training set and the result is evaluated on the test set. Regularization can help avoid overfitting by penalizing more complex solutions, balancing the complexity of the solution against the accuracy of its output.
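The hold-out idea can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `holdout_split` and `accuracy`, the 25% test fraction, and the toy dataset are all hypothetical, and the "learning" step is deliberately trivial.

```python
import random

def holdout_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into (training, test) lists."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, examples):
    """Fraction of (features, label) examples that predict() labels correctly."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)

# Hypothetical labelled data: one feature, labelled True when it is at least 50.
data = [((x,), x >= 50) for x in range(100)]
training, test = holdout_split(data)

# "Learning" is deliberately trivial: pick a threshold halfway between the
# largest negative and the smallest positive example in the training set.
largest_negative = max(f[0] for f, label in training if not label)
smallest_positive = min(f[0] for f, label in training if label)
threshold = (largest_negative + smallest_positive) / 2

def predict(features):
    return features[0] >= threshold

print("training accuracy:", accuracy(predict, training))
print("test accuracy:", accuracy(predict, test))
```

A model that scores much better on the training set than on the test set is a sign of overfitting, because the test set contains only examples the model never saw while learning.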
Unlike Supervised Learning, which trains the AI on input-output pairs up front, Reinforcement Learning learns from experience.
The AI takes an action and is rewarded if it does well or punished if it does poorly. From the outcomes of individual actions, it learns what to do and what not to do.
A game playing agent that learns from experience - we let the AI play the game several times, rewarding it with 1 if it wins and -1 if it loses. We don’t have to tell the AI which actions to take; it figures that out by playing against itself or against others.
A physical robot trying to walk around.
A Markov Decision Process is a model for decision making that the AI uses to learn what actions to take in the future. Q-learning is an algorithm that the AI can use to estimate the value of each action in a given state, and so to find the best action to take in the current state.
In complex games such as Chess, where it is not possible to explore every state that exists, Function Approximation can be used to estimate the value of a state based on other states with similar features.
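As a rough sketch, the Q-learning update nudges the current estimate for a state-action pair toward the reward received plus the best value estimated for the next state. The function name `q_update`, the learning rate `alpha`, the discount factor `gamma`, and the tiny two-step "game" are all assumptions chosen for illustration.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=1.0):
    """Apply one Q-learning update to the table q in place.

    q maps (state, action) pairs to estimated values; unseen pairs start at 0.
    """
    # Best value we currently expect from the next state (0 if it is terminal).
    best_future = max((q[(next_state, a)] for a in next_actions), default=0.0)
    old = q[(state, action)]
    # Move the old estimate a fraction alpha toward the new estimate.
    q[(state, action)] = old + alpha * (reward + gamma * best_future - old)

# Hypothetical two-step game: s0 -> s1 -> end, with a reward of 1 at the end.
q = defaultdict(float)
q_update(q, "s0", "right", 0, "s1", ["left", "right"])
q_update(q, "s1", "right", 1, "end", [])
q_update(q, "s0", "right", 0, "s1", ["left", "right"])
print(dict(q))
```

Notice that the reward at the end of the game gradually propagates backwards: once the final move has a positive value, the move leading up to it starts to gain value too.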
Exploitation is when the AI takes only actions that have led to a reward earlier; that is, the AI uses only the knowledge it already has.
Exploration is when the AI takes actions it doesn’t know about in the hope of finding a better overall result.
An agent that only exploits and never explores might be able to reach a reward, but it may not be able to maximize that reward because it doesn’t know what other possibilities are out there. With the Epsilon-Greedy algorithm, we can configure how often the AI should explore, thus maintaining a balance between exploration and exploitation.
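A minimal sketch of the epsilon-greedy choice might look like this. The function name `epsilon_greedy`, the dictionary of value estimates, and the default `epsilon` of 0.1 are illustrative assumptions.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Pick an action from q_values, a dict mapping actions to estimated values.

    With probability epsilon, explore by picking a random action;
    otherwise, exploit by picking the action with the highest estimate.
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)          # explore
    return max(actions, key=q_values.get)   # exploit

# Hypothetical value estimates for three moves.
estimates = {"left": 0.2, "right": 0.7, "stay": 0.1}
print(epsilon_greedy(estimates, epsilon=0.1))
```

Setting `epsilon` to 0 gives an agent that only exploits, and setting it to 1 gives an agent that only explores; values in between trade one off against the other.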
In Unsupervised Learning, we feed the computer input data without any additional feedback (such as the labels or categories used in Supervised Learning), and the computer learns patterns.
Clustering is a task under Unsupervised Learning, where a set of objects is organized into groups in such a way that similar objects tend to be in the same group.
k-means Clustering is an algorithm that groups data by repeatedly assigning each point to the cluster with the nearest center and then updating those clusters’ centers.
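The assign-then-update loop can be sketched as follows. This is a simplified illustration: the function name `k_means`, the choice of random data points as starting centers, and the sample points are assumptions, and real implementations usually take more care with initialization.

```python
import math
import random

def k_means(points, k, iterations=100, seed=0):
    """Group points (tuples of numbers) into k clusters.

    Returns (centers, clusters), where clusters[i] holds the points
    assigned to centers[i].
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centers = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its assigned points.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster
            else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Nothing moved, so the assignments are stable.
        centers = new_centers
    return centers, clusters

# Hypothetical 2D points forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, clusters = k_means(points, k=2)
print(centers)
print(clusters)
```

Note that we never told the algorithm which group a point belongs to; the grouping emerges only from the distances between the points.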
Reference: CS50’s Introduction to Artificial Intelligence with Python