Project Metric Data Type for Lean Six Sigma Project
Identifying the data type for a project metric can sometimes be confusing. In this post, we discuss how to identify the data type for your Lean Six Sigma project metric and, more importantly, what mistakes to avoid.
Before we start talking about the data type of your project metric, we need to understand base and derived metrics. It is also important to understand common pitfalls and best practices for identifying and defining your project metric. Please read through my detailed posts on Project Metric and on Data Types first.
Let us take a few simple examples. Your project is about reducing the number of calls received at your customer care centre. Or it is about improving the accuracy of transaction processing in your back office operations. Or you wish to reduce the defective units produced in your manufacturing unit.
In the first example, you will collect data on the number of calls you receive each day at the customer care centre. It will be in the form of day and number of calls received. This is your base data and ‘number of calls’ is your base metric. However, you can’t really put all these data points into your goal statement as your baseline. Hence, you need a metric that summarises these data points as one single data point. So you define your project metric as ‘Calls received per day’. Such a metric is called a derived metric. Its value is derived by dividing the total calls received in a defined time period (week or month) by the total number of days in that period. This is referred to as the derived metric value.
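To make the base versus derived distinction concrete, here is a minimal Python sketch. The daily figures and the names `daily_calls` and `calls_per_day` are made up for illustration only; they are not data from a real project.

```python
# Hypothetical base data: one record per day (day label, calls received).
# Each value is a whole number, because you cannot receive part of a call.
daily_calls = [
    ("Mon", 352),
    ("Tue", 340),
    ("Wed", 361),
    ("Thu", 329),
    ("Fri", 352),
]


def calls_per_day(records):
    """Derived metric: total calls in the period divided by days in the period."""
    if not records:
        raise ValueError("need at least one day of base data")
    total_calls = sum(calls for _, calls in records)
    return total_calls / len(records)


baseline = calls_per_day(daily_calls)
print(f"Calls received per day: {baseline:.1f}")
```

Notice that the base values are all integers, while the single derived value that goes into the goal statement usually is not.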
In the second example, the base data will be in the form of a transaction reference number and its accuracy, recorded as Yes or No (or Accurate or Inaccurate). However, your project metric will be ‘Accuracy percentage’, a metric derived by dividing the total accurate transactions by the total number of transactions processed in the given time period (day, week or month) and expressing the result as a percentage. So accuracy percentage is a derived metric.
Similarly, for a manufacturing unit, the number of defective units for each batch is your base data. But, the metric ‘Defective units per batch’ is derived by dividing total number of defective units by the total batches produced.
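The same calculation can be sketched for the other two examples. The transaction references, batch labels and function names below are hypothetical and exist only to show the arithmetic.

```python
# Hypothetical base data for the back office example:
# (transaction reference, was it processed accurately?)
transactions = [
    ("TXN-001", True),
    ("TXN-002", True),
    ("TXN-003", False),
    ("TXN-004", True),
    ("TXN-005", True),
]

# Hypothetical base data for the manufacturing example:
# defective units counted in each batch.
defects_by_batch = {"B1": 3, "B2": 0, "B3": 5, "B4": 2}


def accuracy_percentage(records):
    """Derived metric: accurate transactions / total transactions, as a percentage."""
    accurate = sum(1 for _, is_accurate in records if is_accurate)
    return 100.0 * accurate / len(records)


def defective_units_per_batch(defects):
    """Derived metric: total defective units / total batches produced."""
    return sum(defects.values()) / len(defects)


accuracy = accuracy_percentage(transactions)
defect_rate = defective_units_per_batch(defects_by_batch)
print(f"Accuracy percentage: {accuracy:.1f}%")
print(f"Defective units per batch: {defect_rate:.2f}")
```

In both cases the inputs are Yes/No flags or whole-number counts, and only the calculated summary takes fractional values.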
In all three examples above, the base data is of discrete data type whereas the derived metric value looks like continuous data. This brings us to the obvious question. What is the correct data type for these project metrics?
Project Metric Data Type
This actually is quite simple if you remember a few pointers. As mentioned in my previous post on data types, you should always decide the data type by looking at the base data. And look not just at the values in the base data but at the nature of the base metric.
In our first example, the base metric is the number of calls received each day. This data will always be in integers as you can’t really receive a fraction of a call. Hence, it is of Discrete Count data type.
In the other two examples, it’s the accuracy of the processed transaction and the defectiveness of the manufactured units. The data that you collect in these scenarios will always tell you whether the transaction is accurately processed or not, and whether the unit manufactured is defective or not. In both cases, there will only be two possible values in the data set. Hence, these metrics are of Discrete Binary data type.
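The rule can be written down as a rough heuristic in Python. This is only a sketch: the function name `base_data_type` is my own, it covers just the cases discussed in this post, and looking at values can never fully replace your own judgement about the nature of the base metric.

```python
def base_data_type(base_values):
    """Rough guess at the data type from BASE data values (never derived values)."""
    distinct = set(base_values)
    if not distinct:
        raise ValueError("no base data supplied")
    # Yes/No, Accurate/Inaccurate, True/False: at most two possible labels.
    if all(isinstance(v, (bool, str)) for v in distinct):
        if len(distinct) <= 2:
            return "Discrete Binary"
        raise ValueError("more than two categories: not covered in this sketch")
    # Whole-number counts such as calls received per day.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in distinct):
        return "Discrete Count"
    # Anything that can genuinely take fractional values.
    return "Continuous"


calls_type = base_data_type([352, 340, 361, 329, 352])
accuracy_type = base_data_type(["Yes", "Yes", "No", "Yes"])
print(calls_type)
print(accuracy_type)
```

If you fed the calculated ‘Calls received per day’ value into this function instead of the daily counts, it would report Continuous, which is exactly the mistake to avoid.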
“We always decide the data types by looking at the base nature of our base metric, never on the basis of the calculated value of the derived metric”
Now the other obvious question is, why?
Why the base metric data type?
The answer lies in the way we use the project metric data in our data analysis. We know that the data type helps us choose the correct statistical test. Not all tests can be used for all data types. There are separate tests for discrete data and separate tests for continuous data.
Once we choose the test, we use the base metric data values as the input to that test, and never the values of the derived metric. In the calls example stated above, the input for the statistical tests will always be the number of calls received each day. We will never use a derived value such as 346.8 calls per day as an input to any statistical test. Hence knowing the data type for such a value, or for the derived metric as such, is not relevant.
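As an illustration of base data going into a test, here is a small sketch for the accuracy example. This post does not prescribe a particular test; the two-proportion z-test below is my choice for the illustration, and the before and after data sets are invented.

```python
import math

# Hypothetical base data: one True/False flag per transaction.
before = [True] * 80 + [False] * 20   # before the improvement
after = [True] * 92 + [False] * 8     # after the improvement


def two_proportion_z_test(sample_a, sample_b):
    """Two-sided z-test comparing the proportion of True values in two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    hits_a, hits_b = sum(sample_a), sum(sample_b)
    p_a, p_b = hits_a / n_a, hits_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z_stat, p_val = two_proportion_z_test(before, after)
print(f"z = {z_stat:.3f}, p-value = {p_val:.4f}")
```

The test works on counts of Accurate and Inaccurate transactions. The two accuracy percentages on their own would not be enough, because the test also needs to know how many transactions sit behind each percentage.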
In closing, always look at the base metric and the values that this base metric can take, and then decide the data type for your Lean Six Sigma project metric.
Do let me know if you have any comments or thoughts on the above in the comments section below and I will try to respond to / incorporate the same.