Machine Learning Guide for Newbies


Do concepts like machine learning and artificial intelligence sound weird and mysterious?

No worries!

I will accompany you to discover this fascinating world starting from the basics.

At last, the technological secrets that permeate our modern daily life will become clearer to you.


Machine learning surrounds us!

You wake up, take some bread and heat it in the microwave, which in a few seconds does its job as expected.

You drive to work and your smartphone suggests another route because of an accident. Once again today, you have been spared a dozen insults.

Are you Italian like me? More than a dozen.

After four hours in the office, you spend your lunch break eating a sandwich and scrolling through photos on Tinder, which seems to know you better than your best friend.

What's that? You have never used Tinder?

Of course, my dear, of course. Me neither! WINK WINK.

After the break, you spend your time PRODUCTIVELY on the laptop browsing Netflix, which recommends several interesting new titles.


Magic? No, Machine learning!

All these interactions between you and the machines now seem obvious and automatic.

Yet, if you think about it, they have something almost magical. This magic, which is NOT magic, is called machine learning.

Except for the microwave.

That really works by magic.


What is Machine learning?

Machine learning is basically a branch of artificial intelligence.

More specifically, it studies and builds algorithms that can learn from a set of data and make predictions about it.

In other words, our algorithms will inductively build a model based on samples. It is, therefore, a process closely related to pattern recognition.

In the field of computer science, machine learning is an alternative to traditional programming.

It provides systems the ability to automatically learn and improve from experience without being explicitly programmed.

In practice, this means having algorithms that can extract interesting information from a set of data without anyone writing code specific to the problem.

Instead of writing that code, we feed the data into a generic algorithm, and the algorithm builds its own logic from the data it receives.




A mathematical example to understand Machine learning

To give a small example, in traditional programming we provide an input X and a function f that processes it, giving an output Y as a result.

In machine learning we could instead give the computer a series of inputs and the related outputs, asking it to find out which mathematical function connects them.

[Image: machine learning schema]

This is called supervised learning, and we will analyze the concept in detail later.


Traditional programming: Y = f(X), where X and f are known and Y is computed.

Machine learning: Y = f(X), where X and Y are known and f is learned.

As a trivial example, we could give the machine the numbers you see in the table below.

It will identify the correct function, which in this case is obviously squaring (raising the input to the power of two).

[Image: machine learning equation]
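Here is a minimal Python sketch of the same idea. Keep in mind that the number pairs and the three candidate functions are made up by me for illustration (they are not the ones in the table), and that a real algorithm searches a far larger space of functions than three hand-written candidates.

```python
# Traditional programming: we know X and f, and we compute Y.
def f(x):
    return x ** 2

print(f(3))

# Machine learning (toy version): we know X and Y, and we look for f.
# These pairs are illustrative, not taken from the article's table.
inputs = [1, 2, 3, 4, 5]
outputs = [1, 4, 9, 16, 25]

# Hypothetical candidate functions the "machine" is allowed to try.
candidates = {
    "double": lambda x: 2 * x,
    "square": lambda x: x ** 2,
    "cube": lambda x: x ** 3,
}

def find_function(xs, ys):
    """Return the name of the first candidate that reproduces every output."""
    for name, candidate in candidates.items():
        if all(candidate(x) == y for x, y in zip(xs, ys)):
            return name
    return None  # no candidate explains the data

print(find_function(inputs, outputs))
```

Notice that we never told the program which function was the right one. It worked that out from the examples alone.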

Machine learning… in practice!

You don’t like numbers? Do you hate math?

Fear not!

Let’s move on to a more practical example, one that experts commonly use to explain machine learning to beginners.

It is the legendary house example. Let’s imagine we are a real estate company.

Over the years, we have accumulated hundreds or thousands of records of real estate sales.

At this point, our problem is to find a link between the square footage of houses and their potential commercial value.

We decide to do it through machine learning.

In essence, the computer will have to identify the connection between input (square footage) and output (price).

[Image: function for the real estate example]

Exactly as our agents already do, thanks to the intuition they have developed over years of experience.
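To see what "identifying the connection" might look like, here is a small sketch that fits a straight line through a handful of sales. The sales figures are invented, and choosing a straight line fitted by ordinary least squares is my assumption for the example: the connection between footage and price could just as well be a curve.

```python
# Hypothetical sales records: (square footage, sale price in dollars).
sales = [
    (800, 160_000),
    (1000, 200_000),
    (1200, 250_000),
    (1500, 300_000),
    (2000, 410_000),
]

def fit_line(points):
    """Ordinary least squares fit of: price = slope * footage + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of products of deviations from the means.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sum_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sales)

def predict_price(square_footage):
    """Estimate the price of a house the model has never seen."""
    return slope * square_footage + intercept

print(round(predict_price(1300)))
```

Once the line is fitted, `predict_price` plays the role of the experienced agent: give it a square footage and it returns an estimate.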


How does Machine learning represent reality?


At this point, it is better to stop for a second and give a couple of definitions.

The values of the houses that I will have to predict are called labels. The values I enter as input are the features.

Finally, every sample I give to the computer is a data point.

Okay, I gave three definitions and not two.

Be patient.
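If you like to see things written down, the three terms can be pictured like this. The layout (a list of dictionaries) and all the numbers are just my choice for illustration.

```python
# Each data point pairs the features (inputs) with the label (value to predict).
# All numbers are made up for illustration.
data_points = [
    {"features": {"square_footage": 800}, "label": 160_000},
    {"features": {"square_footage": 1200}, "label": 250_000},
    {"features": {"square_footage": 2000}, "label": 410_000},
]

features = [point["features"]["square_footage"] for point in data_points]
labels = [point["label"] for point in data_points]

print(len(data_points))  # how many data points we have
print(features)          # the inputs given to the algorithm
print(labels)            # the outputs the model should learn to predict
```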

Once the input and output data have been entered, the machine will try to work out a function that connects them, using the generic algorithm mentioned above.

This is a process of trial and error, repeated many times, in which we try to obtain a model that deviates as little as possible from reality.

In short, we aspire to obtain the smallest possible gap between the model’s expectations and the actual results.
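A very naive version of this trial and error can be sketched in a few lines. The data, the model (price = price per square foot × footage) and the range of guesses are all assumptions of mine; real algorithms adjust their guesses far more cleverly than this brute-force loop.

```python
# Hypothetical data points: (square footage, sale price in dollars).
data_points = [(800, 160_000), (1200, 250_000), (2000, 410_000)]

def total_gap(price_per_square_foot):
    """Sum of absolute differences between predicted and actual prices."""
    return sum(
        abs(price_per_square_foot * footage - price)
        for footage, price in data_points
    )

best_guess = None
best_gap = float("inf")

# Try one candidate price per square foot at a time, from $100 to $300,
# and remember the guess that lands closest to reality.
for guess in range(100, 301):
    gap = total_gap(guess)
    if gap < best_gap:
        best_guess, best_gap = guess, gap

print(best_guess, best_gap)
```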


What is the Cost function? Reality vs expectations

This difference between model and reality is measured by the so-called cost function.

Graphically, it corresponds to the distance between the dots and the model's curve in the image below.


[Image: three models fitted to the same data points]
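There are many possible cost functions. As a sketch, here is one common choice, the mean squared error, applied to two candidate models. The data and both models are hypothetical, and I am not claiming this is the cost function behind the image.

```python
# Hypothetical data points: (square footage, sale price in dollars).
data_points = [(800, 160_000), (1200, 250_000), (2000, 410_000)]

def cost(model, points):
    """Mean squared error: average squared gap between prediction and reality."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Two hypothetical candidate models to compare.
def flat_model(square_footage):
    return 250_000  # ignores the input entirely

def linear_model(square_footage):
    return 205 * square_footage  # $205 per square foot

print(cost(flat_model, data_points))
print(cost(linear_model, data_points))
```

The lower the number, the closer the model sits to the dots. Here the linear model wins by a wide margin.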

However, as you can see in the image, no model is perfect. The first example clearly fails to follow the data: it underfits.

The second, on the other hand, already seems more consistent with the data.

Finally, the third is the one with the lowest cost. The curve passes through practically every dot.

What this last model lacks is generalization!


How do we improve our models in Machine learning?

Generalization is the ability to adapt adequately to new data, without fixating on the initial data alone.

Otherwise, the model will tend to overfit, as in the third example. It will be perfect for the data it was built on but not representative in a different context.

The solution to this problem is to divide the data into two parts, namely the training set and the test set.

The algorithm learns only from the training set and is then evaluated on the test set, that is, on data it has never seen. In practice, we are measuring its ability to generalize.
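The split itself can be sketched as follows. Everything here is an assumption for the sake of the example: the data, the 70/30 proportion, the fixed random seed, and the use of a least squares line with mean squared error as the cost.

```python
import random

# Hypothetical data points: (square footage, sale price in dollars).
data_points = [
    (700, 150_000), (800, 160_000), (950, 185_000), (1000, 200_000),
    (1200, 250_000), (1350, 265_000), (1500, 300_000), (1800, 365_000),
    (2000, 410_000), (2200, 430_000),
]

def train_test_split(points, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and hold out a fraction as the test set."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]

def fit_line(points):
    """Ordinary least squares fit of: price = slope * footage + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sum_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sum_xy / sum_xx
    return slope, mean_y - slope * mean_x

def mean_squared_error(slope, intercept, points):
    """Average squared gap between predicted and actual prices."""
    return sum((slope * x + intercept - y) ** 2 for x, y in points) / len(points)

training_set, test_set = train_test_split(data_points)

# The model only ever sees the training set...
slope, intercept = fit_line(training_set)

# ...and is judged on the test set, which it has never seen.
print("training cost:", mean_squared_error(slope, intercept, training_set))
print("test cost:", mean_squared_error(slope, intercept, test_set))
```

If the test cost turns out much higher than the training cost, that is a warning sign that the model has memorized its training data instead of generalizing.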