Imagine that you are in a contest. The host presents you with three doors. Behind one of those doors is a brand new Audi, and behind the other two doors are goats. The host asks you to choose one of these three doors, and then proceeds to open one of the two remaining doors to reveal a goat. Now tell me, do you stick with your initial door or do you switch to the other unopened door to get your Audi?

This is the famous Monty Hall Problem. What if I tell you that by learning just the basics of probability, you have a higher chance of winning this contest and going home with a brand new Audi?
Statistics and Probability are widely overlooked when it comes to Machine Learning. Many people tend to ignore them because they come across as difficult and perhaps not as cool as Machine Learning. But to grasp the core concepts behind some of the most widely used Machine Learning algorithms, it is important to be at least familiar with the basics of Statistics and Probability. The aim of this article is to give you an introduction to Probability and its various types. Along the way, we also need to figure out the Monty Hall problem, so let’s go over a few important things.

Probability

Probability, as the name suggests, is an estimate of how likely it is that an event takes place. Also known as Marginal Probability when it concerns a single event on its own, it is simply a number that reflects that likelihood. It can be a number between 0 and 1, or it can be expressed as a percentage. Let us take it step by step.

Experiment

We will define an Experiment within the context of Probability Theory, the branch of mathematics that deals with probability. An Experiment is a procedure that can be repeated any number of times and has a well-defined set of possible outcomes.

Random v/s Deterministic Experiment | Image by Author

An Experiment can be of two types, and we differentiate between them in terms of their outcomes. An example of each:

  • Random Experiment: Rolling a die can give us any one of 6 values, { 1, 2, 3, 4, 5, 6 }, and we cannot know in advance which one
  • Deterministic Experiment: Adding up some particular bunch of numbers will always give the same result
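To make the distinction concrete, here is a minimal Python sketch. The function names `roll_die` and `add_numbers` are made up for this illustration, and the numbers being added are arbitrary.

```python
import random

def roll_die():
    """Random experiment: the outcome is one of six values, unknown in advance."""
    return random.randint(1, 6)

def add_numbers(numbers):
    """Deterministic experiment: the same input always gives the same result."""
    return sum(numbers)

# Repeat each experiment ten times and compare the outcomes.
rolls = [roll_die() for _ in range(10)]
totals = [add_numbers([3, 5, 8]) for _ in range(10)]

print(rolls)   # changes from run to run
print(totals)  # the same total every time
```

Running this a few times should show the list of rolls changing while the list of totals stays the same.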

Event

An event is an outcome, or a set of outcomes, of an experiment for which we can calculate the probability. The collection of all possible outcomes of an experiment forms the sample space, so an event is a subset of the sample space.
Let’s say I take a coin from my wallet and flip it. The experiment here is flipping the coin. What are the possible outcomes?
It could be either heads or tails. You know that one of these will be the outcome, but you can’t say which one. That means there are two possible events: landing heads or landing tails.

Random Variable

In such cases, as mentioned before, we say that the experiment is random. Any variable representing the outcome of such a random experiment is called a random variable. But can you say how likely heads is? Or tails?

Probability Theory Intuition

Because heads is one of two possible outcomes, {heads, tails}, we say that there is a 50% chance of the event being heads. By the same reasoning, there is a 50% chance of flipping tails.

Heads or Tails? | Image by rupixen.com on Unsplash

If we extend this logic, we can come up with a similar approach which can be applied to other areas and problems. Let us go through one more such problem.

Which color? | Image by Author

Say we have a bag. We place three balls inside the bag, each of a different colour: {Blue, Green, Red}.
Can we know how likely it is to pull out the red ball?
Going by the same logic as in our coin flipping experiment, we see that we have three possible outcomes here. When we pull out a ball from the bag we can get one of those three colors. Hence, the probability that we get the red one should be 1/3.

Equally Likely Events

For this intuition to work, note that the events must all be equally likely. This means that each event has exactly the same chance of happening. We are as likely to get heads as we are to get tails. Similarly, we are as likely to pull the red ball out of the bag as the green one. Nothing favours one event over another; they all have the same chance of occurring, and so these events are called Equally Likely events.
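One way to check this intuition is to simulate a fair coin many times and watch the share of heads. The sketch below is only an illustration: the helper `heads_frequency` is a made-up name, and Python's pseudo-random generator stands in for a real coin.

```python
import random

def heads_frequency(num_flips, seed=0):
    """Flip a simulated fair coin num_flips times and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.choice(["heads", "tails"]) == "heads" for _ in range(num_flips))
    return heads / num_flips

# The share of heads should settle near 0.5 as the number of flips grows.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

With only a handful of flips the share can be far from 0.5, which is why the larger runs are the more informative ones.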

Formal Definition

For equally likely events such as the ones described above, the probability of an event is the ratio of the number of ways that particular event can take place to the total number of possible outcomes.

Now if you go through the examples again, you might notice that this is what is happening:

P(event) = (number of ways the event can occur) / (total number of possible outcomes)

As an example, also look at this:
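The ratio of favourable outcomes to all possible outcomes can also be written as a few lines of Python. This is a sketch that assumes every outcome in the sample space is equally likely; the function name `probability` is chosen here purely for illustration.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(event) for equally likely outcomes: favourable outcomes / all outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

coin = {"heads", "tails"}
bag = {"blue", "green", "red"}
die = {1, 2, 3, 4, 5, 6}

print(probability({"heads"}, coin))  # 1/2
print(probability({"red"}, bag))     # 1/3
print(probability({2, 4, 6}, die))   # 1/2, the event "rolling an even number"
```

Using `Fraction` keeps the results as exact ratios instead of rounded decimals.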

Properties of Probability

  • All Probabilities are between 0 and 1, both inclusive
  • A Probability of 0 means the event is impossible; it cannot happen
  • A Probability of 1 means the event is certainly going to happen

Independent Events

In probability, two events are said to be independent if the outcome of one event does not affect the outcome of the other. An example is rolling a pair of dice: the outcome of one die does not affect the outcome of the other, and we can get a total of 36 different combinations of outcomes.

Combinations from a pair of dice | Image by Author

But consider pulling cards from a single shuffled deck without putting them back, and every draw alters the probabilities. Say we want to find the probability of pulling out the ace of diamonds from a well-shuffled deck of cards. Pulling out this card on the very first attempt has a probability of 1/52. But let’s say the first draw pulls out the 7 of spades. Now, on the second attempt, the probability of pulling out the ace of diamonds is 1/51, which implies that these are not independent events.
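Both situations can be checked with a short Python sketch. The specific dice values (a 3 on the first die, a 5 on the second) are arbitrary picks for illustration, and the variable names are made up here.

```python
from fractions import Fraction
from itertools import product

# Two dice: list every (first die, second die) combination.
faces = range(1, 7)
pairs = list(product(faces, repeat=2))
print(len(pairs))  # 36

# Independence check: P(first is 3 and second is 5) equals the product
# of the two individual probabilities.
p_first = Fraction(sum(1 for a, b in pairs if a == 3), len(pairs))
p_second = Fraction(sum(1 for a, b in pairs if b == 5), len(pairs))
p_both = Fraction(sum(1 for a, b in pairs if a == 3 and b == 5), len(pairs))
print(p_both == p_first * p_second)  # True

# Cards drawn without replacement: the deck shrinks after every draw.
deck_size = 52
p_ace_on_first_draw = Fraction(1, deck_size)
# Second draw, given that the first card was some other card (say the 7 of spades).
p_ace_on_second_draw = Fraction(1, deck_size - 1)
print(p_ace_on_first_draw, p_ace_on_second_draw)  # 1/52 1/51
```

The dice probabilities multiply cleanly, while the card probability depends on what was drawn before.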

Joint Probability

To explain what Joint Probability is, we’ll quickly go over another interesting topic: Set Theory, a branch of mathematics which deals with sets, which, simply stated, are collections of objects or elements. Look at this Venn diagram:

Set Theory Example | Image by Author

As we can see, there are three circles A, B and C which intersect each other, each denoting something. There are also weird little symbols in the labels for the intersecting areas: ∩. In Set Theory, this symbol denotes an intersection. If we consider A, B and C to be sets, ∩ denotes the elements that are common to the sets on either side of the symbol. For example, let’s assume:
A = { 4, 22, 10, 19, 97 }
B = { 30, 3, 9, 19, 97 }
This will imply that A ∩ B = { 19, 97 }
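Python's built-in `set` type supports this operation directly, so the example above can be reproduced in a few lines. This is just a quick sketch using the same numbers.

```python
A = {4, 22, 10, 19, 97}
B = {30, 3, 9, 19, 97}

# The & operator returns the elements common to both sets.
common = A & B
print(sorted(common))  # [19, 97]

# The intersection method gives the same result.
print(A.intersection(B) == common)  # True
```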

Let us take another example –
Event A = People who follow football (I mean soccer, for my dear American readers)
Event B = People who follow cricket
Event C = People who follow F1

A ∩ B = People who follow both football and cricket
B ∩ C = People who follow both cricket and F1
A ∩ C = People who follow both football and F1

A ∩ B ∩ C = People who follow all three sports

Now back to Joint Probability: we can define it as the probability of two or more events occurring simultaneously. In our example above, (A ∩ B) are the people who follow both football and cricket. The probability that a person chosen at random follows both football and cricket is P(A ∩ B). If you look at the Venn diagram above, this person will be found in the area where A and B intersect.


Joint Probability is an important quantity to measure, and a deck of cards offers plenty of little problems to practise on. The concept requires that two events happen at the same time, and cards give us many such situations, like:

  • What is the probability of pulling out a card that is both red and a 2?
  • What is the probability of pulling out a card that is both red and odd-numbered?

For Independent events, Joint Probability is simply the product of their Marginal Probabilities. So if events A and B are independent:

-> P(A ∩ B) = P(A) * P(B)
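As a rough check of the product rule, the sketch below counts cards in a standard 52-card deck to answer the "red and a 2" question. The helper names (`prob`, `is_red`, `is_two`) are invented for this example, and the colour and rank of a card are assumed to be the two events A and B.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(predicate):
    """Share of cards in the deck for which predicate(card) is true."""
    return Fraction(sum(1 for card in deck if predicate(card)), len(deck))

def is_red(card):
    return card[1] in ("hearts", "diamonds")

def is_two(card):
    return card[0] == "2"

p_red = prob(is_red)                                     # 1/2
p_two = prob(is_two)                                     # 1/13
p_red_and_two = prob(lambda c: is_red(c) and is_two(c))  # 1/26

print(p_red, p_two, p_red_and_two)
print(p_red_and_two == p_red * p_two)  # True: colour and rank are independent
```

Counting the joint event directly and multiplying the two marginals give the same answer here because knowing a card's colour tells you nothing about its rank.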

Conditional Probability

Conditional Probability is the probability that an event will occur given that we know some other event has already taken place. So if A and B are two events, conditional probability tells us the likelihood of event A happening given that event B has already taken place. Wait a minute, this sounds an awful lot like Joint Probability. Don’t worry: we saw the die roll example earlier, so let’s go over it again. Let’s have two events:

  • Event A — Rolling the number 4
  • Event B — Rolling an even number

From what we know, we can easily work out the probability of each event on its own: P(A) = 1/6 and P(B) = 3/6 = 1/2.

Conditional Probability Example | Image by Author

Having these as two separate events is fine, but what happens when we know for sure that event B has already taken place? Do you think the probability of getting a 4 will still be 1/6 if we know for sure that the die has already rolled an even number?
The short answer is NO. Now that we know that our outcome is definitely an even number, our set of possible outcomes shrinks from {1, 2, 3, 4, 5, 6} to {2, 4, 6}. This means that the probability of rolling a 4 from this set is 1/3, so our chances of getting 4 as the outcome are twice what they were before. Clearly, it is unwise to use just the Marginal Probability, which ignores this new information about event B. If we formalize this approach, we arrive at this:

P(A / B) = P(A ∩ B) / P(B)

Conditional Probability — probability of A given that B has happened.
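The die example can be verified by dividing the joint probability by the probability of the known event. The sketch below assumes a fair six-sided die; the helper names `prob` and `conditional` are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {4}        # rolling the number 4
B = {2, 4, 6}  # rolling an even number

def prob(event):
    """Marginal probability of an event on a fair die."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional(a, b):
    """Probability of a given b: joint probability divided by P(b)."""
    return prob(a & b) / prob(b)

print(prob(A))            # 1/6
print(prob(B))            # 1/2
print(conditional(A, B))  # 1/3
```

Knowing that the roll is even doubles the chance of a 4, from 1/6 to 1/3, matching the reasoning in the text.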

And this is how Joint Probability and Marginal Probability come together to form Conditional Probability. The term on the left-hand side, P(A / B), stands for the probability of Event A given that Event B has already occurred. For example, let’s talk about clouds and rain (and how they make you feel maybe?). Let’s say you want to find out how likely it is to rain today given that it is cloudy. All we need is the joint probability of rain and clouds occurring together, along with the Marginal Probability of clouds, and we can find this out.

Let’s do another sample problem to get this right. Here we have a Venn diagram for a class of students studying Astrophysics and Arts (interesting bunch of students). Can we find the probability of a student studying Arts if they are studying Astrophysics?

Conditional Probability Example | Image by Author

From the image we can calculate these things:

  • P (Astrophysics) = 112/358
  • P (Arts) = 240/358
  • P (Astrophysics ∩ Arts) = 6/358

Now, we want to find the probability of a student studying Arts given that they are already studying Astrophysics:

P (Arts / Astrophysics) = P (Astrophysics ∩ Arts) / P (Astrophysics)
=> (6/358) / (112/358) = 6/112 ≈ 0.054

This gives us the ratio of students who are studying both subjects to all the students who are studying Astrophysics, and that is how we find conditional probability.
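The same calculation takes only a few lines of Python, using the counts quoted above. The variable names are chosen here for illustration, and the reverse probability at the end is an extra check that is not part of the original problem.

```python
from fractions import Fraction

total_students = 358
astrophysics = 112  # students studying Astrophysics
arts = 240          # students studying Arts
both = 6            # students studying both subjects

p_astro = Fraction(astrophysics, total_students)
p_arts = Fraction(arts, total_students)
p_both = Fraction(both, total_students)

# Probability of Arts given Astrophysics: joint divided by P(Astrophysics).
p_arts_given_astro = p_both / p_astro
print(p_arts_given_astro, float(p_arts_given_astro))  # 3/56, roughly 0.054

# Reversing the condition gives a different number.
p_astro_given_arts = p_both / p_arts
print(p_astro_given_arts, float(p_astro_given_arts))  # 1/40, which is 0.025
```

Note that the class size cancels out, so the answer is simply 6/112, and that swapping the two events changes the result.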

Now, let’s go back to our Monty Hall problem. There are many solutions to this problem, and its Wikipedia page walks through several of them.

We will go over the ‘simple solution’. Look at this table, assuming we always choose door 1:

Image by Author

As you can see, a simple table of the outcomes shows that you have a higher probability of winning the contest if you switch: 2/3.
Let’s quickly walk through this example. In the first scenario, our Audi is behind door 3. We have already chosen door 1, so the host opens door 2 to reveal a goat. If we stay with door 1, we lose; if we switch to door 3, we win. In the second scenario, the Audi is behind door 2, so the host opens door 3 to reveal a goat, and switching again wins us the contest. In the third scenario, the Audi is behind door 1, the door we chose; the host opens either of the other two doors, and only here does staying win. So switching wins in 2 out of 3 equally likely scenarios, a probability of 2/3, while staying wins with a probability of only 1/3.
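If the table still feels counterintuitive, a simulation can back it up. The sketch below is one possible way to model the game, not the only one: it assumes the car is placed at random, the host always opens a goat door other than your pick, and the names `play` and `win_rate` are made up here.

```python
import random
from fractions import Fraction

DOORS = (1, 2, 3)

def play(switch, rng):
    """Play one game and return True if the player ends up with the car."""
    car = rng.choice(DOORS)
    pick = rng.choice(DOORS)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in DOORS if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in DOORS if d != pick and d != opened)
    return pick == car

def win_rate(switch, games=100_000, seed=0):
    """Share of games won over many simulated rounds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(play(switch, rng) for _ in range(games)) / games

# Exact answer from the three scenarios, assuming we always pick door 1:
# switching wins whenever the car is not behind door 1.
exact_switch = Fraction(sum(1 for car in DOORS if car != 1), len(DOORS))
exact_stay = 1 - exact_switch

print(exact_switch, exact_stay)  # 2/3 1/3
print(win_rate(switch=True))     # close to 0.667
print(win_rate(switch=False))    # close to 0.333
```

The simulated win rates should land near the exact values, with switching winning about twice as often as staying.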


In this article I wanted to give you guys a brief introduction to the concepts around Probability. I hope you liked the post; like / share / subscribe to thedatascienceportal. Let me know what you think about the content down in the comments!
Thank you for reading!