Helping you Understand Lean and Six Sigma tools and concepts in the easiest way possible.

# Probability Distribution for Lean Six Sigma

In this post, we will look at what probability is and how it applies in a Lean Six Sigma context. We will also talk about discrete and continuous probability distributions and look at their characteristics.

Before we delve into probability distributions, let's take a quick look at the basic concept of probability.

## What is Probability?

You have surely flipped a coin or rolled a die at some point. We know that flipping a coin results in either heads or tails, and that rolling a die gives a number from 1 to 6, both included.

Simply put, probability is the chance of getting heads (or tails) each time you flip a coin.

Generalizing the above example, probability is the likelihood that a particular outcome (event) will occur in a trial.

Here, a 'trial' is each instance when you flip the coin. The possible outcomes are heads and tails. The 'event' is the particular outcome that we are looking for. Thus, probability is the likelihood that a specific event or outcome will occur in a trial. Simple.

You can calculate the probability of a specific event by dividing the number of outcomes that count as the event by the total number of possible outcomes for each trial, as long as every outcome is equally likely:

Probability of an event = (number of outcomes that count as the event) / (total number of possible outcomes)

So you need to know the total number of possible outcomes for each trial. (2 possible outcomes when you flip a coin and 6 possible outcomes when you roll a die, right!)

The probability of getting heads (the event) in the next coin flip (the trial) is 1/2, that is 0.5 or 50%, as there are only two possible outcomes and each is equally likely. Similarly, the probability of rolling a 6 with a die is 1/6, that is about 0.167 or 16.67%.
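If you like to see the arithmetic spelled out, here is a minimal Python sketch of the coin and die examples. The function name `classical_probability` is my own, made up for illustration, and the calculation assumes every outcome is equally likely.

```python
from fractions import Fraction


def classical_probability(favourable: int, total: int) -> Fraction:
    """Probability of an event when all outcomes are equally likely."""
    if total <= 0 or not 0 <= favourable <= total:
        raise ValueError("need total > 0 and 0 <= favourable <= total")
    return Fraction(favourable, total)


# One favourable outcome (heads) out of two possible outcomes.
p_heads = classical_probability(1, 2)
# One favourable outcome (a six) out of six possible outcomes.
p_six = classical_probability(1, 6)

print(p_heads, float(p_heads))        # 1/2 0.5
print(p_six, round(float(p_six), 4))  # 1/6 0.1667
```

Using `Fraction` keeps 1/6 exact until the moment you want to show it as a decimal or a percentage.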


## Probability in Six Sigma

We also need the basic probability concepts in Lean Six Sigma. We are surely not dealing with flipping a coin or rolling a die. However, if you think about it, business processes are quite similar.

Just replace the process of flipping a coin with a manufacturing process. Think of each unit manufactured as a trial and a defective unit as the event. Each unit has only two possible outcomes, defective or not defective, so applying the coin logic directly would put the probability of a unit being defective at 1/2, that is 0.5 or 50%. Sounds strange, right?

It sounds strange because the two outcomes are not equally likely, the way heads and tails are. A unit will either be defective or not, but a working process produces far more good units than defective ones, so the real probability has to be estimated from data. The fun starts when you look at the units collectively and want to know the possible number of defective units in a given sample size. That is where probability distribution comes into the picture.

Instead of looking at just one unit on its own, let's say you look at a sample of 100 units and want to calculate the probability of receiving a defective unit.

Six Sigma professionals will collect a defined sample from the manufactured units and check these units for defects. Let's say they find 2 defective units per 100 units produced. From this we estimate that, on average, every 100 units produced will include about 2 defective units. So the probability of receiving a defective unit is 2/100, i.e. 0.02 or 2%.
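The same estimate can be sketched in a few lines of Python. The 2-in-100 figures come from the example above. The helper `expected_defectives` and the batch sizes are hypothetical, and the last calculation assumes that units turn out defective independently of each other at a constant rate, which a real process may not satisfy.

```python
inspected_units = 100
defective_units = 2

# Estimated probability that any single unit is defective.
p_defective = defective_units / inspected_units


def expected_defectives(batch_size: int, p: float) -> float:
    """Average number of defective units expected in a batch."""
    return batch_size * p


# Chance that a batch of 100 contains no defective unit at all,
# assuming units are independent and the defect rate stays constant.
p_clean_batch = (1 - p_defective) ** 100

print(f"P(defective unit)     = {p_defective:.2%}")
print(f"Expected in 500 units = {expected_defectives(500, p_defective):.1f}")
print(f"P(no defect in 100)   = {p_clean_batch:.3f}")
```

Notice that even at a 2% defect rate, a batch of 100 with zero defective units is fairly unlikely under these assumptions. Questions like this one are what the rest of the post builds towards.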

Now that we understand the basic probability concept, let us look at probability distribution.


## Probability Distribution

We now know that probability is the likelihood of a particular event occurring in a trial. A probability distribution, on the other hand, describes how likely each value (or range of values) of a random variable is. Confused? Let us elaborate.

Let us assume you pack food delivery orders in a restaurant, and you have been working there for the last 3 years. Will you know how much time it takes to pack one order? You will surely answer "Yes" to this question.

Will you also know the probability of the next order you pack taking more than 15 minutes? That would be tough to answer.

To answer this, you need to collect the packing time data for a large number of orders and build the probability distribution. Once done, it will help you answer such questions quite easily.
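As a rough illustration, here is how the 15-minute question could be answered straight from collected data. The packing times below are made up for this sketch and the function name is hypothetical. With only 20 orders the estimate is crude, so a real study would use far more data.

```python
# Hypothetical packing times, in minutes, for 20 past orders.
packing_minutes = [
    8.5, 12.0, 9.2, 16.4, 11.1, 7.8, 15.3, 10.6, 13.9, 9.9,
    18.2, 12.7, 8.1, 14.5, 11.8, 10.2, 6.9, 13.1, 17.0, 9.4,
]


def empirical_probability_over(times: list[float], threshold: float) -> float:
    """Share of past observations that exceeded the threshold."""
    if not times:
        raise ValueError("need at least one observation")
    over = sum(1 for t in times if t > threshold)
    return over / len(times)


p_over_15 = empirical_probability_over(packing_minutes, 15)
print(f"P(packing time > 15 min) is roughly {p_over_15:.0%}")  # 20%
```

Four of the 20 orders took longer than 15 minutes, so the estimate is 4/20, or 20%.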

We will talk about the types of probability distributions and their characteristics in the next section. Since the probability distribution describes the possibility of values a random variable can take, it depends on the type of random variable.

## Types of Probability Distribution

In my previous post on data types for Lean Six Sigma projects, we talked about two types of data: discrete and continuous. Similarly, there are two broad types of probability distribution, discrete and continuous, depending on the data type of the random variable.

In the restaurant example, the time to pack an order is continuous data, so its probability distribution is a continuous probability distribution. The results of flipping a coin or rolling a die are discrete data, so the distribution you get is a discrete probability distribution.

Before we get into the details of continuous and discrete probability distributions, here are a few key points to remember.

The probabilities of all the values that a variable can take always add up to 1. The probability of a particular event is always between 0 and 1, both included. It can't go above 1 (that is, 100%) and it can't fall below 0 (that is, 0%), so it can never be negative.


## Continuous Probability Distribution

You get a continuous probability distribution when the random variable is continuous in nature. It can also be represented by a frequency plot or histogram built from past data. Do read my previous post on histograms to understand it better.

Let us look at an example. We spoke about the time taken to pack food delivery orders. If you collect the time taken for the last 50 orders, you will get the frequency data for each interval or bin. Using this, you plot the histogram. When you connect the tops of all the bars on the histogram with a smooth curve, you get the continuous probability distribution curve.
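Here is a small sketch of the binning step, using only the Python standard library. The 50 packing times are simulated, not real: the average of 10 minutes, the spread of 2.5 minutes, the 2-minute bin width and the random seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical data: 50 simulated packing times in minutes.
times = [max(0.5, rng.gauss(10, 2.5)) for _ in range(50)]

bin_width = 2.0
# Map each time to the index of the 2-minute interval it falls in.
counts = Counter(int(t // bin_width) for t in times)

rows = []
for b in sorted(counts):
    low = b * bin_width
    high = low + bin_width
    freq = counts[b]
    share = freq / len(times)  # relative frequency of this bin
    rows.append((low, high, freq, share))
    print(f"{low:4.0f}-{high:<4.0f} min | {'#' * freq} ({share:.0%})")
```

Each row of `#` marks is one bar of the histogram turned on its side. The relative frequencies add up to 1, and this is the property the smooth curve keeps.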

#### Continuous Probability Distribution Calculations

The total area under this curve is always equal to 1. If the data follows a perfect normal distribution, the mean is at the center of the curve. The probability of an order taking more than the mean time for packing is 0.5 or 50%. The probability of the order taking less than the mean time is also 0.5 or 50%.

However, for a continuous probability distribution, the probability of the random variable taking one specific value is effectively zero. This is because the random variable, being continuous in nature, can take any of infinitely many possible values. The time taken for one particular order can be anything from 0 minutes upwards, measured as finely as you like. Hence the possible outcomes are infinite, and you would be looking at the probability of one specific outcome out of infinitely many.

Hence statisticians always calculate the probability of a range of values for a continuous variable and not for a specific value. This means you can calculate the probability of the next order taking 3 to 4 minutes to pack, or more than 4 minutes, or less than 3 minutes, and so on. But you can't meaningfully calculate the probability of an order taking exactly 3 minutes.
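To make the range calculations concrete, here is a minimal sketch using `statistics.NormalDist` from the Python standard library. It assumes packing times follow a normal distribution with a mean of 3.5 minutes and a standard deviation of 0.5 minutes. Both numbers are invented for illustration and would have to be estimated from your own data.

```python
from statistics import NormalDist

# Assumed model: normal packing times, mean 3.5 min, std dev 0.5 min.
packing_time = NormalDist(mu=3.5, sigma=0.5)

# Probabilities are areas under the curve, read off the cumulative
# distribution function (cdf).
p_less_than_3 = packing_time.cdf(3)
p_3_to_4 = packing_time.cdf(4) - packing_time.cdf(3)
p_more_than_4 = 1 - packing_time.cdf(4)
p_above_mean = 1 - packing_time.cdf(packing_time.mean)

print(f"P(less than 3 min) = {p_less_than_3:.4f}")
print(f"P(3 to 4 min)      = {p_3_to_4:.4f}")
print(f"P(more than 4 min) = {p_more_than_4:.4f}")
print(f"P(above the mean)  = {p_above_mean:.2f}")
```

The three ranges cover every possible packing time, so their probabilities add up to 1, which is the total area under the curve.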

There are multiple distributions under continuous probability distribution. The shape of the distribution curve also decides the type of distribution that the data follows.

Some of the well-known and common distributions are the Normal, Weibull and Lognormal distributions. The Normal distribution is the most common and most important continuous distribution for Lean Six Sigma practitioners, and it is discussed in detail in a separate post.

## Discrete Probability Distribution

When you are dealing with a discrete random variable, you will get a discrete probability distribution.

The variable assigned to the result of flipping a coin or rolling a die is discrete in nature, as it can only take a fixed set of values. The same goes for the number of units produced every day. These will give you discrete probability distributions.

As we saw earlier, continuous probability distributions are always displayed in the form of a curve. This is because the variable can take any of infinitely many possible values. And the total area below this curve is equal to 1.

However, for discrete probability distributions, the variable can only take a countable set of distinct values. Rolling a die can only give one of 6 values. And flipping a coin can only result in 2 outcomes.

Thus, discrete probability distributions are not represented in the form of a curve. They can simply be laid out in a table. Each outcome has a non-zero probability, and the sum of the probabilities of all the outcomes is equal to 1.
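For a fair die, the table is short enough to build by hand. The sketch below does it in Python, with variable names of my own choosing, and assumes all six faces are equally likely.

```python
from fractions import Fraction

# Discrete probability distribution of a fair six-sided die:
# each face maps to its probability.
die_distribution = {face: Fraction(1, 6) for face in range(1, 7)}

print("Face | Probability")
for face, p in die_distribution.items():
    print(f"{face:4d} | {p} ({float(p):.4f})")

# All probabilities together must add up to exactly 1.
total = sum(die_distribution.values())
print("Total:", total)

# Probability of an event made of several outcomes: rolling 5 or 6.
p_at_least_5 = die_distribution[5] + die_distribution[6]
print("P(5 or 6):", p_at_least_5)
```

Unlike the continuous case, a single outcome such as rolling a 6 has a probability you can read straight off the table.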

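Some of the well-known discrete probability distributions that you can use are the Poisson distribution, the Binomial distribution and the Uniform distribution.

The Poisson distribution is usually useful for discrete count data (such as the number of defects per unit), the Binomial distribution for discrete binary data (such as defective or not defective), and the discrete Uniform distribution for outcomes that are all equally likely (such as the roll of a fair die).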
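The textbook formulas for these three distributions are short enough to sketch with the Python standard library. The function names and the example numbers (an average of 3 defects per unit, 10 units with a 10% defect rate) are hypothetical and only there to show the calculations.

```python
from math import comb, exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events) when events occur at an average rate lam."""
    return exp(-lam) * lam**k / factorial(k)


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials, each with chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def discrete_uniform_pmf(k: int, low: int, high: int) -> float:
    """P(value k) when every integer from low to high is equally likely."""
    return 1 / (high - low + 1) if low <= k <= high else 0.0


# Count data: chance of 0 defects on a unit that averages 3 defects.
print(f"Poisson  P(0 defects)   = {poisson_pmf(0, 3):.4f}")
# Binary data: chance of exactly 1 defective unit among 10, at a 10% rate.
print(f"Binomial P(1 defective) = {binomial_pmf(1, 10, 0.1):.4f}")
# Equally likely outcomes: chance of rolling a 4 with a fair die.
print(f"Uniform  P(roll a 4)    = {discrete_uniform_pmf(4, 1, 6):.4f}")
```

Which formula fits depends on how the data was generated, so treat the pairing of data type and distribution as a starting point and check it against your own data.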

Here is another great resource to further understand statistical distributions for Lean Six Sigma from ISIXSIGMA. Do give it a read.

Thus, probability plays an important role in the Lean Six Sigma context as well. I would like to know if you have come across any instance where a probability distribution helped, or if you have any questions. Let me know in the comments below.