I suspect it will simply be “since Ppk is always worse than Cpk, as long as your Ppk is better than our min requirement for transfer then you’re good”. What we’ll do with that information is another question. In fact I almost had a Cpk of 9.43, until someone asked if it was normal, and I had to transform it, which brought it down to a pitiful 1.37.

Also, this is more of a theoretical question so that I can describe why there is no Cpk on a distribution to my co-workers and understand the difference myself more clearly.

It’s not about maximizing Cpk, it’s about maximizing fit. If the data is not normal, then the resulting Cpk on a normal graph can be just as misleading as if you transformed data that shouldn’t be.
How does this work when you set your sub-group size as

Yes, a transform or distribution should only be used if it fits the data better, which you can tell using “Individual Distribution Identification”.

“However, the within-subgroup standard deviation does not account for … Therefore, capability indices that use the within-subgroup standard deviation represent better performance than the actual one.”
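For what it's worth, here is a minimal Python sketch of the two standard deviation calculations being quoted. It is an illustration under assumptions, not necessarily how any particular software package computes it: the within-subgroup value is a plain pooled standard deviation with no unbiasing constant, the data are simulated normal subgroups whose centers drift between subgroups, and the spec limits (5 and 15) and all function names are made up for the example.

```python
import math
import random
import statistics


def overall_sd(subgroups):
    """Treat every measurement as one big sample and take its sample std dev."""
    values = [x for group in subgroups for x in group]
    return statistics.stdev(values)


def pooled_within_sd(subgroups):
    """Pooled within-subgroup std dev (assumption: no unbiasing constant)."""
    numerator = sum((len(g) - 1) * statistics.variance(g) for g in subgroups)
    denominator = sum(len(g) - 1 for g in subgroups)
    return math.sqrt(numerator / denominator)


def nearest_spec_index(mean, sd, lsl, usl):
    """min(CPU, CPL) form, used here for both Cpk and Ppk."""
    return min((usl - mean) / (3 * sd), (mean - lsl) / (3 * sd))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# 25 subgroups of 5 readings; each subgroup's center shifts a little,
# which is variation the within-subgroup std dev does not see.
subgroups = []
for _ in range(25):
    center = 10.0 + rng.gauss(0.0, 1.0)
    subgroups.append([rng.gauss(center, 1.0) for _ in range(5)])

all_values = [x for group in subgroups for x in group]
grand_mean = statistics.fmean(all_values)
lsl, usl = 5.0, 15.0  # hypothetical spec limits

s_within = pooled_within_sd(subgroups)
s_overall = overall_sd(subgroups)
cpk = nearest_spec_index(grand_mean, s_within, lsl, usl)
ppk = nearest_spec_index(grand_mean, s_overall, lsl, usl)

print(f"within sd  = {s_within:.3f}, Cpk = {cpk:.3f}")
print(f"overall sd = {s_overall:.3f}, Ppk = {ppk:.3f}")
```

With drifting subgroup centers, the within-subgroup value comes out smaller than the overall one, so the index built on it looks better, which is the point of the quoted passage.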
Isn’t the whole point of using a distribution to make the data normal? Can’t you make a secondary normal x-axis and calc the standard deviation based on that, now that it’s normal on the non-normal distribution? “the properties of the normal distribution that make these methods possible are not shared by nonnormal distributions” Why can you calc standard deviation for “all measurements”, but not “within-subgroup”? Sounds to me like they both need to be able to calc standard deviation. “The first method of calculating the standard deviation is to assume all measurements make one big sample, and calculate its sample standard deviation. … The second method is to use the within-subgroup standard deviation to calculate the capability indices.” OK, I’ve slept on it and I’m still confused.

Cpk assumes that one unit away in either direction from the mean will have equal probability, whereas for a skewed distribution like this that is not even remotely close to reality. Your Cpk will be about 0.33 (based on CPL), whereas only CPU is realistic and would give a value of about 0.67.
If you utilized Cpk, your value would be based on the mean being about 1 unit from the lower spec even though it is obvious you will never see a single data point below the lower spec. Imagine your specs are 0 and 3, and make a histogram of the data. However, almost every nonnormal distribution you come across is NOT symmetric, so Cpk is useless.

As a quick example, go make some exponential data with a mean of 1. For Cpk to make sense, you have to have equal likelihood of falling outside of each spec given the same distance away for the metric to be useful, so you have to have a symmetric distribution. Think about the formulas for Cpk and Cp: with Cpk, you are comparing to the closest spec, whereas with Cp you are comparing to the entire range of the specs. I’ll give the short answer and then provide the long one in a link. And you guys are way too concerned about Cpk for nonnormal data on a weekend!
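If anyone wants to try the exponential example without a stats package, here is a rough Python sketch using only the standard library. The sample size and seed are arbitrary choices of mine, and the CPL/CPU formulas are the usual normal-theory ones applied blindly to skewed data, which is exactly the mistake being illustrated.

```python
import random
import statistics

rng = random.Random(1)  # arbitrary seed, only for repeatability

# Exponential data with mean 1 (rate parameter 1.0).
data = [rng.expovariate(1.0) for _ in range(100_000)]
lsl, usl = 0.0, 3.0  # the specs from the example

mean = statistics.fmean(data)
sd = statistics.stdev(data)

# Normal-theory one-sided indices, applied blindly to skewed data.
cpl = (mean - lsl) / (3 * sd)
cpu = (usl - mean) / (3 * sd)
naive_cpk = min(cpl, cpu)

# What the data actually do at each spec limit.
frac_below = sum(x < lsl for x in data) / len(data)
frac_above = sum(x > usl for x in data) / len(data)

# What a normal model with the same mean and sd would predict below the LSL.
normal_below = statistics.NormalDist(mean, sd).cdf(lsl)

print(f"CPL = {cpl:.2f}, CPU = {cpu:.2f}, naive Cpk = {naive_cpk:.2f}")
print(f"observed fraction below LSL = {frac_below:.4f}")
print(f"observed fraction above USL = {frac_above:.4f}")
print(f"normal model's predicted fraction below LSL = {normal_below:.4f}")
```

The naive Cpk is driven by CPL, even though an exponential variable can never fall below 0; the only spec the process can actually violate is the upper one.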