The “No Free Lunch” theorem is a well-known idea in machine learning. It states that no single machine learning method is optimal for every task, and it is especially relevant to supervised learning (that is, predictive modeling). You can’t conclude that neural networks are always superior to decision trees, or vice versa. Many factors are at work, including the size and structure of your dataset.
As a result, you should try several methods on your problem, holding out a “test set” of data to evaluate performance and choose the winner. Of course, the algorithms you try must be appropriate for your problem, which is where picking the right machine learning task comes in. For example, if you need to clean your house, you might use a vacuum, broom, or mop, but you wouldn’t get a shovel and start digging.
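The idea of trying several methods and letting a held-out test set pick the winner can be sketched roughly as follows. The two candidate models, the toy data, and every function name here are made up for illustration; a real comparison would use real learners and usually cross-validation as well.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, test_rows):
    """Fraction of held-out rows whose label is predicted correctly."""
    hits = sum(1 for x, y in test_rows if predict(x) == y)
    return hits / len(test_rows)

def fit_majority(train_rows):
    """Candidate A (hypothetical): always predict the most common label."""
    labels = [y for _, y in train_rows]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def fit_threshold(train_rows):
    """Candidate B (hypothetical): cut at the midpoint of the class means.

    Assumes labels are 0/1 and that class 1 tends to have larger x.
    """
    xs0 = [x for x, y in train_rows if y == 0]
    xs1 = [x for x, y in train_rows if y == 1]
    cut = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2
    return lambda x: 1 if x > cut else 0

# Toy data (made up): class 0 has small x values, class 1 has large ones.
rows = [(x, 0) for x in range(10)] + [(x, 1) for x in range(20, 30)]
train, test = train_test_split(rows)

candidates = {"majority": fit_majority, "threshold": fit_threshold}
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> winner:", best)
```

The key design point is that each model is fitted on `train` only and scored on `test` only, so the comparison reflects performance on data the model has not seen.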
The Major Principles
However, all supervised machine learning methods for predictive modeling share a common principle.
Machine learning algorithms work by learning a target function (f) that best maps input variables (X) to an output variable (Y): Y = f(X)
This is a general learning task in which we want to make predictions (Y) for new examples of the input variables (X). We don’t know what the function (f) looks like or what form it takes. If we did, we would use it directly and would not need to learn it from data with machine learning methods.
The most common type of machine learning is learning the mapping Y = f(X) to predict Y given new X. This is known as predictive modeling or predictive analytics, and the goal is to make the most accurate predictions possible.
For machine learning newcomers who want to understand the foundations, here is a quick tour of the top 10 machine learning algorithms used by data scientists.
1st — Linear Regression
Linear regression is one of statistics and machine learning’s most well-known and well-understood algorithms.
Predictive modeling is primarily concerned with minimizing a model’s error, or making the most accurate predictions possible, at the expense of explainability. We will borrow, reuse, and steal algorithms from many different fields, including statistics, and use them toward these ends.
Linear regression is represented by an equation that describes the line that best fits the relationship between the input variables (x) and the output variable (y). It does so by finding specific weightings for the input variables, known as coefficients (B).
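To make the idea of coefficients concrete, here is a minimal sketch for a single input variable, where the line is y = B0 + B1 * x. It uses ordinary least squares, which is one common way to find the coefficients and is an assumption here, since the text does not name a fitting method. The function name and the toy data are made up.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: forces the line through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Toy data (made up) that lies exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
b0, b1 = fit_simple_linear_regression(xs, ys)
prediction = b0 + b1 * 5  # predict y for a new x
print(b0, b1, prediction)
```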
2nd — Logistic Regression
Logistic regression is another technique that machine learning has borrowed from statistics. It is the go-to method for binary classification problems (problems with two class values).
As with linear regression, the goal of logistic regression is to find the values of the coefficients that weight each input variable. Unlike linear regression, the prediction for the output is transformed using the logistic function.
The logistic function, which looks like a big S, transforms any value into the range 0 to 1. This is useful because we can apply a rule to the output of the logistic function to snap values to 0 and 1 (for example, IF less than 0.5, output 0, otherwise output 1) and predict a class value.
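The transformation and the snapping rule can be sketched as below. The coefficients `b0` and `b1` are hypothetical values chosen for illustration, not fitted from data, and the sketch follows the usual convention of mapping outputs of 0.5 or more to class 1.

```python
import math

def logistic(z):
    """Squash any real number into the range 0 to 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_class(x, b0, b1, threshold=0.5):
    """Weight the input, apply the logistic function, then snap to 0 or 1."""
    p = logistic(b0 + b1 * x)
    return 1 if p >= threshold else 0

# Hypothetical coefficients, chosen by hand for this example.
b0, b1 = -1.0, 1.0
print(logistic(0.0))              # the midpoint of the S curve
print(predict_class(2.0, b0, b1)) # z = 1, probability above 0.5
print(predict_class(0.0, b0, b1)) # z = -1, probability below 0.5
```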
3rd — Linear Discriminant Analysis
Logistic regression is a classification algorithm traditionally limited to two-class classification problems. When there are more than two classes, Linear Discriminant Analysis (LDA) is the preferred linear classification technique.
The representation of LDA is straightforward. It consists of statistical properties of your data, calculated for each class. For a single input variable, these are the mean value for each class and the variance calculated across all classes.
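As a rough illustration of what those per-class statistics look like, the sketch below handles a single numeric input. It stores a mean and a prior for each class plus one variance pooled across classes, and predicts with the textbook linear discriminant score. That particular formulation is an assumption here, and the function names and toy data are made up.

```python
import math
from collections import defaultdict

def fit_lda_1d(xs, ys):
    """Compute class means, class priors, and one pooled variance."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    n = len(xs)
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    priors = {k: len(v) / n for k, v in groups.items()}
    # Pooled variance: squared deviations from each class's own mean,
    # divided by (number of rows - number of classes).
    squared = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    variance = squared / (n - len(groups))
    return means, priors, variance

def predict_lda_1d(x, means, priors, variance):
    """Return the class with the largest discriminant score."""
    def score(k):
        return (x * means[k] / variance
                - means[k] ** 2 / (2 * variance)
                + math.log(priors[k]))
    return max(means, key=score)

# Toy data (made up): three well-separated classes.
xs = [1, 2, 3, 5, 6, 7, 9, 10, 11]
ys = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
means, priors, variance = fit_lda_1d(xs, ys)
print(means, variance)
print(predict_lda_1d(2.5, means, priors, variance))
```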
4th — Classification and Regression Trees
Decision trees are an important type of algorithm for predictive modeling in machine learning.
The decision tree model is represented as a binary tree. This is the ordinary binary tree from algorithms and data structures. Each node represents a single input variable (x) and a split point on that variable (assuming the variable is numeric).
The tree’s leaf nodes contain an output variable (y) that is used to make a prediction. Predictions are made by walking the splits of the tree until a leaf node is reached, then outputting the class value at that leaf node.
Trees are fast to learn and fast at making predictions. They are also often accurate across a broad range of problems and do not require any special data preparation.
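Walking the splits down to a leaf can be sketched as follows. The tree here is built by hand purely for illustration (learning the splits from data is a separate step that is not shown), and the `Node` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Node:
    feature: Optional[int] = None   # index of the input variable tested here
    split: Optional[float] = None   # split point on that variable
    left: Optional["Node"] = None   # branch taken when x[feature] < split
    right: Optional["Node"] = None  # branch taken otherwise
    value: Optional[str] = None     # class value, set only on leaf nodes

def predict(node: Node, x: Sequence[float]) -> str:
    """Follow the splits from the root until a leaf is reached."""
    while node.value is None:
        node = node.left if x[node.feature] < node.split else node.right
    return node.value

# Hand-built tree (made up): first test x[0], then x[1] on the right branch.
tree = Node(
    feature=0, split=5.0,
    left=Node(value="A"),
    right=Node(
        feature=1, split=2.0,
        left=Node(value="B"),
        right=Node(value="C"),
    ),
)

print(predict(tree, [3.0, 9.0]))
print(predict(tree, [7.0, 1.0]))
print(predict(tree, [7.0, 3.0]))
```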
5th — Naive Bayes
Naive Bayes is a simple but surprisingly powerful predictive modeling technique.
The model comprises two types of probabilities, which can be calculated directly from the training data: 1) the probability of each class and 2) the conditional probability of each x value given each class. Once calculated, the probability model can be used to make predictions for new data using Bayes’ theorem. When dealing with real-valued data, it is common to assume a Gaussian distribution (bell curve) so that these probabilities are easy to estimate.
Naive Bayes is called “naive” because it assumes that each input variable is independent of the others. Although this is a strong assumption and unrealistic for real-world data, the technique is remarkably effective on a wide range of complex problems.
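A compact sketch of the Gaussian variant follows. It estimates a prior for each class and a mean and variance for each input variable within each class, then combines them with Bayes' theorem in log space. The small variance floor, the function names, and the toy data are assumptions added for this illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-variable (mean, variance) per class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # floor avoids division by zero
        model[label] = (prior, stats)
    return model

def log_gaussian(x, mean, var):
    """Log of the Gaussian density at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_gaussian_nb(model, row):
    """Pick the class with the highest log posterior (up to a constant)."""
    def log_posterior(label):
        prior, stats = model[label]
        # Independence assumption: per-variable log densities simply add up.
        return math.log(prior) + sum(
            log_gaussian(x, m, v) for x, (m, v) in zip(row, stats)
        )
    return max(model, key=log_posterior)

# Toy data (made up): two classes, two real-valued inputs.
rows = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.1], [3.2, 3.9], [2.8, 4.0]]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, [1.1, 2.0]))
print(predict_gaussian_nb(model, [3.1, 4.0]))
```

Working with log probabilities instead of multiplying raw probabilities avoids numerical underflow when there are many input variables.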
6th — K-Nearest Neighbors
The KNN algorithm is both simple and powerful. The model representation for KNN is the entire training dataset. Simple, right?
Predictions are made for a new data point by searching the entire training set for the K most similar examples (the neighbors) and summarizing the output variable for those K instances. For a regression problem this might be the mean output value; for a classification problem it might be the mode (most common) class value.
The trick is deciding how to measure the similarity between data instances. If your attributes are all on the same scale (for example, all in inches), the simplest approach is to use the Euclidean distance, a number you can calculate directly from the differences between each input variable.
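The procedure described above can be sketched in a few lines. The training data below is made up, the function names are hypothetical, and the brute-force scan over every training row is the simplest possible search rather than what an optimized library would do.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance computed from per-variable differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def k_nearest(train, query, k):
    """Scan the whole training set and keep the k closest (x, y) pairs."""
    return sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]

def knn_classify(train, query, k=3):
    """Classification: the most common class among the neighbors."""
    votes = Counter(label for _, label in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Regression: the mean output value among the neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(y for _, y in neighbors) / len(neighbors)

# Toy classification data (made up): two clusters of points.
points = [((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
          ((8, 8), "blue"), ((8, 9), "blue"), ((9, 8), "blue")]
print(knn_classify(points, (2, 2)))
print(knn_classify(points, (7, 7)))

# Toy regression data (made up): one input variable.
values = [((1,), 10), ((2,), 20), ((3,), 30), ((10,), 100)]
print(knn_regress(values, (2,)))
```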