Tuesday, November 21, 2017

Statistics of multi-dimensional data, example

In the previous blog post, Statistics of multi-dimensional data, theory, I introduced a generalization of the standard deviation to three-dimensional data. I called it ellipsification. In this blog post I am going to apply this ellipsification thing to real data to demonstrate the application to statistical process control of color.

I posted this cuz there just aren't enough trolls on the internet

Is the data normal?

In traditional SPC, the assumption is almost always made that the underlying variation is normally distributed. (This assumption is rarely challenged, so we blithely use the hammers that are conveniently in our toolbox -- standard SPC tools -- to drive in screws. But that's another rant.)

The question of normalcy is worth addressing. First off, since I am at least pretending to be a math guy, I should at least pay lip service to stuff that has to do with math. Second, we are venturing into uncharted territory, so it pays to be cautious. Third, we already have a warning that deltaE color difference is not normal. Ok, maybe a bunch of warnings. Mostly from me.

I demonstrate in the next section that my selected data set can be transformed into another data set with components that are uncorrelated, have zero mean and standard deviation of 1.0, and which give every indication of being normal. So, one could us this transform on the color data and apply traditional SPC techniques on the individual components, but you will see that I take this one step further.

    Original data

I use the solid magenta data from the data set that I describe below in the section below called "Provenance of the data". I picked magenta because it is well known that it has a "hook". In other words, as you increase pigment level or ink film thickness, it changes hue. The thicker the magenta ink, the redder it goes. Note that this can be seen in the far left graph as a tilt to the ellipsoid.

I show three views of the data below. The black ellipses are slices through the middle of the ellipsification in the a*b* plane, the L*a* plane, and the L*b* plane, respectively.

View from above

View from the b* axis

View from the a* axis

    Standardized data

Recall for the moment when you were in Stats 201. I know that probably brings up memories of that cute guy or girl that sat in the third row, but that's not what I am talking about. I am talking about standardizing the data to create a Z score. You subtracted the mean and then divided by the standard deviation so that the standardized data set has zero mean, and standard deviation of 1.0.

I will do the same standardization, but generalized to multiple dimensions. One change, though. I need an extra step to rotate the axes of the ellipsoid so that all the axes are aligned with the coordinate axes. The cool thing is that the new scores (call them Z1, Z2, and Z3, if you like) are now all uncorrelated.

Geometrically, the operations are as follows: subtract the mean, rotate the ellipsoid, and then squish or expand the individual axes to make the standard deviations all equal to 1.0. The plot below show three views of the data after standardization. (Don't ask me which axes are L*, a*, and b*, by the way. These are not L*, a*, or b*.)

Standardized version of the L*, a*, and b* variation charts

Not much to look at -- some circular blobs with perhaps a tighter pattern nearer the origin. That's what I would hope to see. 

Here are the stats on this data:

Mean Stdev Skew Kurtosis
Z1  0.000  1.000 -0.282  -0.064
Z2  0.000   1.000  0.291   0.163
Z3  0.000  1.000 -0.092  -0.658

The mean and standard deviation are exactly 0.000 and 1.000. This is reassuring, but not a surprise. It just means that I did the arithmetic correctly. I designed the technique to do this! Another thing that happened by design is that the correlations between Z1 and Z2, and between Z1 and Z3 are both exactly 0.000. Again, not a surprise. Driving those correlations to zero was the whole point of rotating the ellipsoid, which I don't mind saying was no easy feat.

The skew and kurtosis are more interesting. For an ideal normal distribution, these two values will be zero. Are they close enough to zero? None of these numbers are big enough to raise a red flag. (In the section below entitled "Range for skew and kurtosis", I give some numbers to go by to scale our expectation of skew and kurtosis.)

In the typical doublespeak of a statistician, I can say that there is no evidence that the standardized  color variation is not normal. Of course, that's not to say that the standardized color variation actually is normal, but a statement like that would be asking too much from a statistician. Suffice it to say that it walks like normally distributed data and quacks like normally distributed data.

Dr. Bunsen Honeydew lectures on proper statistical grammar

This is an important finding. At least for this one data set, we know that the standardized scores Z1, Z2, and Z3 can be treated independently as normally distributed variables. Or, as we shall see in the next section, we can combine them into one number that has a known distribution.

Can we expect that all color variation data behaves this nicely when it is standardized by ellipsification? Certainly not. If the data is slowly drifting, the standardization might yield something more like a uniform distribution. If the color is bouncing back and forth between two different colors, then we expect the standardized distributions to be bi-modal. But I intend to look at a lot of color to try to see if 3D normal distribution is the norm for processes that are in control.

In the words of every great research paper every written, "clearly more research is called for".

The Zc statistic

I propose a statistic for SPC of color, which I call Zc. This is a generalization of the Z statistic that we all know and love. This new statistic could be applied to any multi-dimensional data that we like, but I am reserving the name to apply to three-dimensional data, in particular, to color data. (The c stands for "color". If you have trouble remembering that, then note that c is the first letter of my middle name.)

Zc is determined by first ellispifying the data set. The data set is then standardized, and then each data point is reduced to a single number (a scalar), as described in the plot below. The red points are a standardization of the data set we have been working with.the data set we have been working with. I have added circles at Zc of 1, 2, 3, 4. Any data points on one of these circles will have a Zc score of the corresponding circle. Points in between will have intermediate values, which are the distance from the origin. Algebraically, Zc is the sum in quadrature of the individual three components, that is to say, the square root of the sum of the squares of the three individual components.

A two-dimensional view of the Z scores

Now that we have standardized our data into three uncorrelated random variables that are (presumably) Gaussian with zero mean and unit standard deviation, we can build on some established statistics. The sum of the squares of our standardized variable will follow a chi-squared distribution, and the square root of the sums of the squares will follow a chi distribution. Note that this quantity is the distance from the data point to the origin.

Chi is the Greek version of our letter X. It is pronounced with the hard K sound, although I have heard neophytes embarrass themselves by pronouncing it with the ch sound. To make things even more confusing, there is a Hebrew letter chai which is pronounced kinda like hi, only with that rasping thing in the back of your throat. Even more confusing is the fact that the Hebrew chai looks a lot like the Greek letter pi, which is the mathematical symbol for all things circular like pie and cups for chai tea. But the Greek letter chi has nothing to do with either chai tea, or its Spoonerism tai chi.

Whew. Glad I got that outa the way.

Why is it important that we can put a name on the distribution? This gives us a yardstick from which to gauge the probability that any given data point belongs to the set of typical data. The table below gives some probabilities for the Zc distribution. Here is an example that will explain the table a bit. The fifth row of the table says that 97% of the data points that represent typical behavior will have Zc scores of less than 3.0. Thus the chance that a given data point will have a Zc score larger than that is 1 in 34.

Levels of significance of Zc

Zc  P(Zc)Chance
1.00.19875     1
1.50.47783     2
2.00.73854     4
2.50.89994    10
3.00.97071    34
3.50.99343   152
4.00.99887   882
4.50.99985  6623
5.00.99999 66667

The graph below is a run time chart of the Zc scores for the 204 data points that we have been dealing with. The largest score is about 3.5. We would be hard pressed at calling this an aberrant point, since the table above says that there is a 1 in 152 chance of such data happening at random. By the way, we had close to 152 data points, so we should expect 1 data point above 3.5. A further test: I count eight data points where the Zc score is above 3.0. Based on the table, I expect about 6.

My conclusion is that there is nothing funky about this data.

Runtime chart for Zc of the solid magenta patches

Where do we draw the line between common cause and special cause variation? In traditional SPC, we use Z > 3 as the test for individual points. Note that for a normal distribution, the probability of Z < 3 is 0.99865, or one chance in 741 of Z < 3.0. This is pretty close to the probability of Zc < 4 for a chi distribution. In other words, if you are using Z > 3 as a threshold for QC with normally distributed data, then you should use Zc > 4 when using my proposed Zc statistic for color data. Four is the new three.

Provenance for this data

In 2006, the SNAP committee (Specifications for Newspaper Advertising Production) took on a large project to come to some consensus about what color you get when you mix specific quantities of CMYK ink on newsprint. A total of 102 newspapers printed a test form on its presses. The test form had 928 color patches. All of the test forms were measured by one very busy spectrophotometer. The data was averaged by patch type, and it became known as CGATS TR 002.

Some of the patches were duplicated on the sheet for quality control. In particular all of the solids were duplicated. Thus, in the blog post, I was dealing with 204 measurements of a magenta solid patch from 102 different newspaper printing presses.

Range for skew and kurtosis

How do we decide when a value of skew or kurtosis is indicative of a non-normal distribution? Skew should be 0.0 for normal variation, but can it be 0.01 and still be normal? Or 0.1? Where is the cutoff?

Consider this: the values for skew and kurtosis that we compute from a data set are just estimates of some metaphysical skew and kurtosis. If we asked all the same printers to submit another data set the following day, we would likely have a somewhat different value of all the statistics. If we had the leisure of collecting a Gillian or a Brilliant or even a vermillion measurements, we would have a more accurate estimate of these statistical measures. 

Luckily some math guy figgered out a simple formula that allows us to put a reliability on the estimates of skew and kurtosis that we compute.

Our estimate of skew has a standard deviation of sqrt (6 / N). For N = 204 (as in our case) this works out to 0.171. So, an estimate of skew that is outside of the range from -0.342 to 0.342 is suspect, and outside the range of -0.513 to 0.513 is very suspect.

For kurtosis, the standard deviation of the estimate is sqrt (24/N), which gives us a range of +/- 0.686 for suspicious and +/- 1.029 for very suspicious.

Tuesday, November 14, 2017

Statistics of multi-dimensional data, theory

This blog post is the culmination of a long series of blog posts on the statistics of color difference data. Most of them just basically said "yeah, normal stats don't work". Lotta help that is, eh? Several blog posts alluded to the fact that I did indeed have a solution. The most recent of which alluded to a method that works in the very title of the blog post: Statistical process control of color, approaching a method that works.

Now it's time to unveil the method.

Generalization of the standard deviation

One way of describing the technique is to call it a generalization of the standard deviation to multiple dimensions -- three dimensions if we are dealing with color data. That's a rather abstract concept, so I will explain.

     One dimensional standard deviation

We can think of our good friends, the standard deviation and mean, as describing a line segment on the number line, as illustrated below. If the data is normally distributed (also called Gaussian, or bell curve), then you would expect that about 68% of the data will fall on the line segment within one standard deviation unit (one sigma) of the mean, 95.45% of the data will fall within two sigma of the mean, and 99.73% of the data will be within three sigma of the mean.

As an aside, note that not all data is normally distributed. This holds true for color difference data, which is the issue that got me started down this path!

So, a one-dimensional standard deviation can be thought of as a line segment that is 2 sigma long, and centered on the mean of the data. It is a one-dimensional summary of all the underlying data.

     Two-dimensional standard deviation

Naturally, a two-dimensional standard deviation is a two-dimensional summary of the underlying two-dimensional data. But instead of a (one-dimensional) line segment, we get an ellipse in two dimensions.

In the simplest case, the two-dimensional standard deviation is a circle (shown in orange below) which is centered on the average of the data points. The circle has a radius of one sigma. If you want to get all mathematical about this, the circle represents a portion of a two-dimensional Gaussian distribution with 39% of the data falling inside the circle, and 61% falling outside.

Two dimensional histogram of a simple set of two dimensional data
The orange line encompasses 39% of the volume.

I slipped a number into that last paragraph that deserves to be underlined: 39%. Back when we were dealing with one-dimensional data, +/- one sigma would encompass 68% of normally distributed data. The number for two-dimensional data is 39%. Toto, I have a feeling we're not in one-dimensional-normal-distribution-ville anymore.

Of course, not all two-dimensional standard deviations are circular like the one in the drawing above. More generally, they will be ellipses. The the length of the semi-major and semi-minor axes of the ellipse are the major and minor standard deviation.

--- Taking a break for a moment

I better stop to review some pesky vocabulary terms. A circle has a radius, which is the distance from the center of the circle to any point on the circle. A circle also has a diameter, which is the distance between opposite points on the circle. The diameter is twice the radius.

When we talk about ellipses, we generally refer to the two axes of the ellipse. The major axis is the longest line segment that goes through the center of the ellipse. The minor axis is the shortest line segment that goes through the center of the ellipse. The lengths of the major and minor axes are essentially the extremes of the diameters of the ellipse. They run perpendicular to each other.

An ellipse, showing off the most gorgeous set of axes I've ever seen

There is no convenient word for the two "radii" of an ellipse. All we have is the inconvenient phrases semi-major axis and semi-major axis. These are half the length of the major and minor axes, respectively.

--- Break over, time to get back to work

The axes of the ellipses won't necessarily be straight up and down and left-to-right on a graph. So, the full description of the two-dimensional standard deviation must include information to identify the orientation of these axes.

The image below shows a set of hypothetical two-dimensional data that has been ellipsified. The red dots are random data that was generated using Mathematica. I asked it to give me 200 normally distributed x data points with a standard deviation of 3, and 200 normally distributed y data points  with a standard deviation of 1. These original data points (the x and y values) were uncorrelated.

This collection of data points were then rotated by 15 degrees so that the new x values had a bit of y in them, and the new y values had a bit of x in them. In other words, there was some correlation (r = 0.6) between the new x and y. I then added 6 to the new x values and 3 to the new y values to move the center of the ellipse. So, the red data points are designed to represent some arbitrary data set that could just happen in real life.

I performed an ellipsification, and have plotted the one, two, and three sigma ellipses (in pink). The major and minor axes of the one sigma ellipse are shown in blue.

Gosh darn! that's purdy!

The result of ellipsifying this data is all the parameters pertaining to the innermost of the ellipses in the image above. This is an ellipse that is centered on {6.11, 3.08}, with major axis of 3.19 and minor axis of 1.00. The ellipse is oriented at 15.8 degrees. These are all rather close to the original parameters that I started with, so I musta done sumthin right.

I also counted the number of data points within the three ellipses. I counted 38.5% in the 1 sigma ellipse, 88.5% in the 2 sigma ellipse, and 99% in the 3 sigma ellipse. (Of course when I say I did this, I really mean that Mathematica gave me a little help.) If the data follows a two-dimensional normal distribution, then the ellipses will encompass 39%, 86.5%, and 98.9% of the data. This is one indication that this condition is met.

The following pieces of information are determined in the ellipsification process of two-dimensional data:

     a) The average of the data which is the center of the ellipse (two numbers, for the horizontal and vertical values)
     b) The orientation of the ellipse (which could be a single number, such as the rotation angle)
     c) The lengths of the semi-major and semi-minor axes of the ellipse (two numbers)

The ellipsification can be described in other ways, but these five numbers will tell me everything about the ellipse. The ellipse is the statistical proxy for the whole data set.

     Three-dimensional standard deviation

The extension to three-dimensional standard deviation is "obvious". (Obvious is a mathematician's way of saying "I don't have the patience to explain this to mere mortals.")

The result of ellipsifying three-dimensional data is the following nine pieces of information that are necessary to describe an arbitrary (three-dimensional) ellipsoid:

    a) The average of the data (three numbers, for the average of the x, y, and z values)
    b) The orientation of the ellipsoid (three numbers defining the direction that the axes point)
    c) The lengths of the semi-major, semi-medial, and semi-minor axes of the ellipse (three numbers)

The image below is an ellipsification of real color data. The data is the color of a solid patch as produced by 102 different newspaper printing presses. There were two samples of this patch from each press, so the number of dots is 204.

The 204 dots were used to compute the three-dimensional standard deviation, which is represented by the three lines. The longest line, the red line, is the major axis of the ellipse, and has a length of 5.6 CIELAB units. The green and blue lines are the medial and minor axes, respectively. They 2.2 and 2.1 CIELAB units long. All three of the axes meet at the mean of all the data points, and all three are two-sigma long (+/-1 sigma from the mean). Depending on the angle you are looking, it may not appear that the axes are all perpendicular to each other, but they are.

Ellipsification of some real data, as shown with the axes of the ellipsoid

Trouble viewing the image above? The image is a .gif image, so you should see it rotating. If it doesn't, then try a different browser, or download it to your computer and view it directly.

What can we do with the ellipsification?

The ellipsification of color data is a three-dimensional version of the standard deviation, so in theory, we can use it for anything that we would use the standard deviation for. The most common use (in the realm of statistical process control) is to decide whether a given data point is an example of typical process variation, or if there is some nefarious agent at work. (Deming would call that a special cause.)

We will see an example of this on real data in the next blog post on the topic: Statistics of multi-dimensional data, example.

Saturday, October 7, 2017

Can a light be gray?

I follow Quora. I am not saying I am proud of that, but at least I will admit it. I look for questions in my topic of expertise, which is to say, color. And of course, my motives for answering the questions are purely altruistic. Answering the questions is part of my crusade to overcome the general public's ignorance of that noblest of the Sciences, Color Science.

Or maybe I just like to hear myself talk.

Regardless of the reasons, here is a question that I recently answered. My answer has been embellished a bit for this blog post -- mostly to make me sound more important. But I also added some cool pictures.

Is there gray light?

This is an interesting question! I have some experiments that will answer the question.

First experiment

If you connect a white LED to a variable power supply, and gradually turn the voltage down, it will always appear white, even though gray is somewhere between full white and black. I show that in the image below. This is a white LED (color temperature of maybe 6500K) run at 2.4V. This is about as low as it will go without flickering out. The camera clearly sees it as white.

There is no way to suppress the whiteness of this LED

Don't have a variable power supply? You can buy a white LED flashlight and leave it on until the batteries almost run down to nuthin. The LEDs will still be white.

So, first answer: No, there ain't no such thing as gray light. 

Second experiment

But if your arrange those white LEDs into a matrix and call that matrix a computer screen, then you can dim a portion of those white LEDs and get gray. Yes, Virginia, there is gray light, and it's what's coming at you when you look at the image below! Contrary to the guy who wrote about the first experiment, you can make gray light with white LEDs.

You're looking at gray light, right this very minute!!

(I should clarify... some, but not all, computer displays use white LEDs as a backlight behind filters. The idea here is that in principle, you could make a computer display with white LEDs, and you could display gray on that screen.)

Third experiment

Turn your entire computer screen into “gray” (RGB values of 128, for example), and turn out the lights in your room. After a few minutes, you will perceive the screen as white.

I am totally dumbfounded by how white my gray screen looks
Or maybe just dumb? 

Why did that happen? Gray is not a perceivable color all by itself. To see it, you need a white to reference it to.

In the first experiment, the white LED is not putting out a huge amount of light, but the light from the white LED is all coming from a small point. This means that the intensity at that point is very, very high, and likely much brighter than anything else in your field of view.

In the second experiment, I didn’t say this, but it is likely that there are some pixels on your computer screen that are close to white (RGB=255), so the area with RGB=128 will appear gray in comparison. In the third experiment, the only white reference that you have is the computer screen itself, so once your eyes have accommodated to the new lighting, the computer screen will be perceived as white.

Fourth experiment

I came up with a startling way to demonstrate this idea that "gray is perceived only in comparison to a reference white". Brilliant idea, really. I used the same set up I did for that first cool picture of a white LED. But in this case, I used two LEDs, wired in series. Note that I had to crank up the voltage to 4.8V. Electricity passes through each of the LEDs, so in principle they produce the same amount of total light.

The difference between the two LEDs is that the one on the right doesn't have a clear plastic bubble -- the plastic bubble on the LED on the right is a translucent white. The one on the right is a diffuse LED. The light from the diffuse LED is about the same, but it is spread out over a larger area, and not focused, so the amount of light hitting my eye is much less.

My camera saw the diffuse LED as being somewhat dimmer than the one on the left. Maybe from the picture you would call this a gray LED? My eyes looked at the two white LEDs and saw the one on the right as being gray. Honest to God, it was emitting gray light. My eyes saw the LED on the left, and used that as the white reference. The fact that I was drinking heavily during this fourth experiment is largely irrelevant.

Today's special - gray LEDs

So, I can definitively say that "gray" light exists, since I built a system with both a white and a gray LED. I'm sure if I would have introduced this a few weeks ago, I would have gotten an early morning call from Mr. Nobel about some sort of prize. Well, maybe next year. I will try to look surprised.


This blog has to have a moral. It was a bit hard for me to set up an experiment that demonstrated the emission of gray light. Why? Light, when taken in isolation, can never be gray. We only see gray when it is viewed in contrast to another brighter, whiter color.

Wednesday, September 13, 2017

Just plain stupid

Just in case you were hoping for another blog post about stupid memes, I can top those last few blogposts!

Tuesday, September 12, 2017

Statistical process control of color, approaching a method that works

As many of you know, I have been on a mission: to bring the science of statistical process control to the color production process. In my previous blog posts (too numerous to mention) I have wasted a lot of everyone's time describing obvious methods that (if you choose to believe me) don't work as well as we might hope.

Today, I'm going to change course. I will introduce a tool that is actually useful for SPC, although one big caveat will require that I provide a sequel to this interminable series of blog posts.

The key goal for statistical process control

Just as a reminder, the goal of statistical process control is to rid the world of chaos. But to rid the world of chaos, we must first be able to identify chaos. Process control freaks have a whole bunch of tools for this rolling around in their toolboxes.

The most common tool for identifying chaos is the run-time chart with control limits. Step one is to establish a baseline by analyzing a database of some relevant metric collected over a time period when everything was running hunky-dory. This analysis gives you control limits. When a new measurement is within the established control limits, then you can continue to watch reruns of Get Smart on Amazon Prime or cat videos on YouTube, depending on your preference.

Run-time chart from Binney and Smith

But when a measurement slips out of those control limits, then it's time to stop following the antics of agents 86 and 99 and start running some diagnostics. It's a pretty good bet that something has changed. I described all that statistical process control stuff before, just with a few different jokes. But don't let me dissuade you from reading the previous blog post.

There are a couple other uses for the SPC tool set. If you have a good baseline, you can make changes to the process (new equipment, new work flow, training...) and objectively tell whether the change has improved the process. This is what continuous process improvement is all about.

Another use of the SPC tool set is to ask a pretty fundamental question that is very often ignored: Is my process capable of reliably meeting my customer's specifications?

I would be remiss if I didn't point out the obvious use of SPC. Just admit it, there is nothing quite so hot as someone of your preferred gender saying something like "process capability index".

What SPC is not

Statistical process control is something different from "process control". The whole process control shtick is finding reliable ways to adjust knobs on a manufacturing gizmo to control the outcome. There are algorithms involved, and a lot of process knowledge. Maybe there is a PID control loop, or maybe a highly trained operator has his/her hand on the knob. But that subject is different from SPC.

Statistical process control is also something different from process diagnostics. SPC won't tell you whether you need more mag in your genta. It will let you know that something about the magenta has changed, but immediate job of SPC is not to figger out what changed. This should give the real process engineers some sense of job security!

Quick review of my favorite warnings

I don't want to appear to be a curmudgeon by belaboring the points I have made repeatedly before. (I realize that as of my last birthday, I qualify for curmudgeonhood. I just don't want to appear that way.) But for the readers who have stumbled upon this blog post without the benefit of all my previous tirades, I will give a quick review  of my beef with ΔE based SPC.

Distribution of ΔE is not normal

I would hazard a guess that most SPC enthusiasts are not kept up at night worrying about whether the data that they are dealing with is normally distributed (AKA Gaussian, AKA bell curve). But the assumption of normality underlies practically everything in the SPC toolbox. And ΔE data does not have a normal distribution.

I left warnings about the assumption of abnormality in Mel Brooks' movies

To give an idea of how long I have been on this soapbox, I first blogged about the abnormal distribution of color difference data almost five years ago. And to show that it has never been far from my thoughts and dreams, I blogged about this abnormal distribution again almost a year ago.

A run-time chart of ΔE can be deceiving

Run-time charts can be seen littering the living rooms of every serious SPC-nic. But did you know that using run-time charts of ΔE data can be misleading? A little bit of bias in your color rendition can completely obscure any process variation, lulling you into a false sense of security.

My famous example of a run-time chart with innocuous variation (on upper right)
that hides the ocuous variation in the underlying data (underlying, on lower left)

The Cumulative Probability Density function is obtuse

The ink has barely had a chance to dry on my recent blog post showing that the cumulative relative frequency plot of ΔE values is just lousy as a tool for SPC.

As alluring and seductive as this plot is,
don't mix CRF with SPC!

It can be a useful predictor of whether the customer is going to pay you for the job, but don't try to infer anything about your process from it. Just don't.

A useful quote misattributed to Mark Twain

Everybody complains about the problems with using statistical process control on color difference data, but nobody does anything about it. I know, you think Mark Twain said that, but he never did. Contrary to popular belief, Mark Twain was not much of a color scientist.

The actual quote from Mark Twain

So now it's time for me to leave the road of "if you use these techniques, I won't invite you over to my place for New Year's eve karaoke", and move onto "this approach might work a bit better; what N'Sync song would you like to sing?"

Part of the problem with ΔE is that it is an absolute value. It tells you how far, but not which direction. Another part of the problem is that color is inherently three dimensional, so you need some way to combine three numbers, either explicitly or implicitly.

Three easy pieces

Many practitioners have taken the tack of treating the three axes of color separately. They look at ΔL*, Δa*, and Δb* each in isolation. Since these individual differences can be either positive or negative, they at least have a fighting chance of being somewhat normally distributed. In my vast experience, when a color process is in control, the variations of  ΔL*, Δa*, and Δb* are not far from being normal. I briefly glanced at one data set, and somebody told me something or other about another data set, so this is pretty authoritative.

Let's take an example of real data. The scatter plot below shows the a*b* values of yellow during a production run. These are the a*b* values of two solid yellow patches from each of 102 newspaper printers around the US. This is the data that went into the CGATS TR 002 characterization data set. The red bars are upper and lower control limits for a* and for b*, each computed as the mean, plus and minus three standard deviation units. This presents us with a nice little control limit box.

Scatterplot of a*b* values of solid yellow patches on 102 printers

There is a lot of variation in this data. There is probably more than most color geeks are used to seeing. Why so much? First off, this is newspaper printing, which is on an inexpensive stock, so even within a given printing plant, the variation is fairly high. Second, the printing was done at 102 different printing plants, with little more guidance other than "run your press normally, and try to hit these L*a*b* values".

The variation is much bigger in b* than in a*, by a factor of about 4.5. A directionality like this is to be expected when the variation in color is largely due to a single factor. In this case, that factor is the amount of yellow ink that got squished onto the substrate, and it causes the scatterplot to look a lot like a fat version of the ideal ink trajectory. Note that often, the direction of the scatter is toward and away from neutral gray.

This is actually a great example of SPC. If we draw control limits at 3 standard deviation units away from the mean, then there is a 1 in 200 chance that normally distributed data will fall outside those limits. There are 204 data points in this set, so we would expect something like one data point outside any pair of limits. We got four, which is a bit odd. And the four points are tightly clustered, which is even odder. This kicks in the SPC red flag: it is likely that these four points represent what Deming called "special cause".

I had a look at the source of the data points. Remember when I said that all the printers were in the US? I lied. It turns out that there were two newsprinters from India, each with two representatives of the solid yellow. All the rest of the printers were from either the US or from Canada. I think it is a reasonable guess that there is a slightly different standard formulation for yellow ink in that region of the world. It's not necessarily wrong, it's just different.

I'm gonna call this a triumph of SPC! All I did was look at the numbers and I determined that something was fishy. I didn't know exactly what was fishy, but SPC clued me in that I needed to go look.

Two little hitches

I made a comment a few paragraphs ago, and I haven't gotten any emails from anyone complaining about some points that I blissfully neglected. Here is the comment: "If we draw our control limits at 3 standard deviation units away from the mean, then there is a 1 in 200 chance that normally distributed data will fall outside those limits." There are two subtle issues with this. One makes us over-vigilant, and the other makes us under-vigilant.

Two little hitches with the approach, one up and one down

I absentmindedly forgot to mention that there are two sets of limits in our experiment. There is a 1 in 200 chance of random data wandering outside of one set of limits, and 1 in 200 chance of random data wandering outside of the other set of limits. If we assume that a* and b* are uncorrelated, then this strategy will give us about a 2 in 200 chance of accidentally heading off to investigate random data that is doing nothing more than random data does. Bear in mind that we have only been considering a* and b* - we should also look at L*. So, if we set the control limits to three standard deviation units, then we have a 3 in 200 chance of flagging random data.

So, that's the first hitch. It's not huge, and you could argue that it is largely academic. The value of "three standard deviation units" is somewhat arbitrary. Why not 2.8 or 3.4? The selection of that number has to do with how much tolerance we have for wasting time looking for spurious problems. So we could correct this minor issue by adjusting the cutoff to about 3.15 standard deviation units. Not a big problem.

The second hitch is that we are perhaps being a bit too tolerant of cases where two or more of the values (L*, a*, and b*) are close to the limit. The scatter of data kinda looks like an ellipsoid, so if we call the control limits a box, we are not being suspicious enough of values near the corners. These values that we should be a bit more leery of are shown in pink below. For three dimensional data, we should raise a flag on about half of the regions within the box.

We need to be very suspicious of intruders in the pink area

The math actually exists to fix this second little hitch, and it has been touched on in previous blog posts, in particular this blog post on SPC of color difference data. This math also fixes the problem of the first hitch. Basically, if you scale the data axes appropriately and measure distance from the mean, the statistical distribution is chi-squared with three degrees of freedom.

(If you understood that last sentence, then bully for you. If you didn't get it, then please rest assured that there are at least a couple dozen color science uber-geeks who are shaking their head right now, saying, "Oh. Yeah. Makes sense.")

So, in this example of yellow ink, we would look at the deviations in L*, a*, and a* and normalize them in terms of the standard deviations in each of the directions, and then add them up according to Pythagoras. This square root of the sums of the squares is then compared against a number that was pulled from a table of chi-squared values to determine whether the data point is an outlier. Et voila, or as they say in Great Britain, Bob's your uncle.

Is it worth doing this? I generally live in the void between (on the one side) practical people like engineers and like people who get product out the door, and (on the other side) academics who get their jollies doing academic stuff. That rarified space gives me a unique perspective in answering that question. My answer is a firm "Ehhh.... ". Those who are watching me type can see my shoulders shrugging. So far, maybe, maybe not. But the next section will clear this up.

Are we done?

It would seem that we have solved the basic problem of SPC of color difference data, but is Bob really your uncle? It turns out that yellow was a cleverly chosen example that just happens to work out well. There is a not-so-minor hitch that rears it's ugly head with most other data.

The image below is data from that same data set. It is the L* and b* values of the solid cyan patches. Note that, in this view, there is an obvious correlation between L* deviation and b* variation. (The correlation coefficient is 0.791.) This reflects the simple physics: as you put more cyan ink on the paper, the color gets both more saturated and darker.

This image is not supposed to look like a dreidel

Once again, the upper and lower control limit box has been marked off in dotted red lines. According to the method which has been described so far, everything within this box will be considered "normal variation". (Assuming the a* value is also also within its limits.)

But here, things get pretty icky. The upper left and lower right corners are really, really unlikely to appear under normal variation. I mean really, really, really. Those corner points are around 10 standard deviation units (in the 45 degree direction) from the mean. Did I say really, really, really, really unlikely. Like, I dunno, one chance in about 100 sextillion? I mean, the chance of me getting a phone call from Albert Munsell while giving a keynote at the Munsell Centennial Symposium are greater than that.

Summarizing, the method that has been discussed - individually applying standard one dimensional SPC tools to each of the L*, a*, and b* axes individually - can fail to flag data points that are far outside of the normal variability of color. This happens whenever there is a correlation in the variation between any two of the axes. I have demonstrated with real data that such variation is not unlikely, in fact, it is likely to happen when a single pigment is used to create color at hue angles of 45, 135, 225, or 315 degrees.

What to do?

In the figure above, I also added an ellipse as an alternative control limit. All points within the ellipse are considered normal variation, and all points outside the ellipse are an indication of something weird happening. I would argue that the elliptical control limit is far more accurate than the box.

If we rotated the axes in the L*b* scatter plot of cyan by 44 degrees counter-clockwise, we have an ellipse that is perpendicular to the new horizontal and vertical axes. When we look at the points in this new coordinate system, we have rekindled the beauty that we saw in the yellow scatter plot. We can meaningfully look at the variation in the horizontal direction separately from the variation in the vertical direction. From there, we can do the normalization that I spoke of before and compare against the chi-squared distribution. This gives us the elliptical control limit shown below. (Or ellipsoidal, if we take in all three dimensions.)

It all makes sense if we rotate our head half-way on its side

This technique alleviates hitches #1 and #2, and also fixes the not-so-minor hitch #3. But, this all depends on our ability to come up with a way to rotate the L*a*b* coordinates around so that the ellipsoid lies along the axes. Not a simple problem, but I hear someone in the back of the room whispering "principle component analysis". That technique, tied in with singular value decomposition, and eigenvectors and eigenvalues, can tell us how to rotate the coordinates so that the individual components are all uncorrelated.

Wednesday, September 6, 2017

Interpreting color difference data - a practical discussion of the CRF

My avid readers (yes, both of you) will realize that I have been on a mission, a holy quest, for process control of color. The holy grail that I seek is a technique for looking at the color data of a color-making process, and distinguishing between normal and abnormal variations in the process. This is subtly different from setting acceptance tolerances. The first is inward looking, seeking to improve the process. The second is outward-focused, with the ultimate goal of getting paid by a happy customer.

I'm not talking about getting paid. Unfortunately, getting paid is rarely the outcome of this blog!

In this blog post, I talk about a tool for color geeks. This tool is appropriate for analysis of whether the color is acceptable. I will discuss whether this tool is acceptable for identifying changes in the process. I promise to write about this tool for customer satisfaction later.

Recap of exciting events leading up to this

Just to recap, I wrote a lengthy and boring sequence of blogs on statistical process control of color difference data. The four part series was aptly called Statistical process control of color difference data. (Note that the link is to the first part of the series. For those who tend toward masochism, each of the first three posts have a link to the next in the series, so feel free to indulge.)

The topic for today's blog post is a tool from statistics called the CPDF (cumulative probability density function). At least that's the name that it was given in the stats class that I flunked out of in college. It is also called CPD (cumulative probability distribution), CDF (cumulative distribution function), and in some circles it's affectionately known as Clyde. In the graphic arts standards committee clique, it has gone by the name of CRF (cumulative relative frequency).

I blogged on the CPDF/CPD/CDF/Clyde/CRF before, and by golly, I just realized that I used two of the names in that blog post. This violates rule number 47 in technical writing, which states that you have to be absolutely consistent in your use of technical terms. This is also rule number 7 in patent writing, where it is generally assumed that different words mean different things. I will try to be consistent in calling this tool the CRF. Please forgive me if I slip up and call it Clyde.

Now that the nomenclature is out of the way, today's blog is an extension of the previous blog post on CRF. Today, I want to discuss the practical aspects. Looking at a CRF, what can we discern about the nature of a collection of color differences? And importantly, what conclusions should we avoid coming to, however tempting they may be.

Brief refresher

Below we have a picture of a CRF. The horizontal axis is color difference. The purplish blue curve represents the percentage of color differences that are smaller than that particular DE.

Picture of Clyde from another blog post of mine

It is rather easy from this plot to determine what are called the rank order statistics. The red arrows show that the median (AKA the 50th percentile) is a shade under 1.5 DE. The green arrows show the 95th percentile as being 3.0 DE.

So, one use of the CRF is to quickly visualize whatever percentile is your favorite. You like the 68th percentile? No problem. Your neighbor's dog prefers the 83rd? Yup, it's there as well.

The plot above is real data from one set of real measurements of real printed sheets. The plot below shows nine different sets of real data. The different data sets show medians of anywhere from 0.3 DE to 4 DE. It's clear from looking at the plots that some are tighter distributions than others. Most people would say that the tighter ones are better than the broader ones, but I try not to be so judgmental. I try to love all CRFs equally.

Picture of Bonnie, from the same blog post, who exhibits multiple personality disorder

So, we see that another use of the CRF is to visualize the overall magnitude of a set of color differences. Of course, the median or 95th percentile also give you that impression, but the CRF plot is visual (great for people who are visual learners), and it incorporates all the percentiles into one picture.

Do we need more than one number?

This begs a question. If I know the median, do I need the 95th percentile as well? Is there any additional information in having both numbers?

I assessed this in that prior blog post that I keep referring back to. Based on the data sets that I had at my disposal, I found that there is a very strong correlation between the median and the 95th percentile (r-squared = 0.903). You could get a pretty good guess at the 95th percentile just by multiplying the median by 1.840. That's actually good news for those who have only a small amount of data to draw conclusions from. The median is a lot more stable than trying to infer the median from (say) 20 points.

But I should add a caveat -- the data that I was analyzing was pretty much all well-behaved. If the data is not well behaved, then the ratio between the two will probably not be close to 1.840. So, having both numbers might provide exactly the information that you are looking for. I mean, why would you be wasting your time looking at color difference data, if not to identify funkiness???!?

The next graph illustrates this point. I started with that same set of 9 CRFs from the previous graph. I scaled each of them horizontally so that they had a median color difference of 2.4 DE. If all the different percentiles have the same basic information, then all the curves would lay right atop one another. 

But they don't all lay atop each other. After doing the rubber band thing to adjust for scaling, they don't all have the same shape. The orange plot is below the others in the lower percentiles, and mostly above the others at higher percentiles. The red plot is kinda the opposite.

What does this observation about the two CRFs tell us about the DE values from which the CRFs were created? And more importantly, what does this tell us about the set of color differences that went into creating the CRF?

(That was a rhetorical question. Please don't answer it. If you answered it, it would steal the thunder from the whole rest of this blog post. And that would make me sad, considering the amount of time that I am going to put into analyzing data and the writing this post! You may actually learn something, cuz I think that no one in the history of the known universe has investigated this topic to the depth that I did to write this post.)

The simple answer to the rhetorical question is that the orange plot is closer to having color difference values that are all the same, and the red plot has more of a range of color difference values. But the answer is actually more nuanced than that. (I recently heard the word nuanced, and I admit that I have been waiting for the opportunity to show it off. It's such a pretentious word!)

Here is the third thing we learn by looking at the CRF: Not all CRFs are created equal. The shape of the CRF tells us something about the color difference data, but we aren't quite sure what it tells us. Yet.

Looking at cases

To help build our intuition, I will look at a few basic cases. I will show the distribution of points in an a*b* plot, and then look at the associated CRF. Conclusions will be drawn and we will find ourselves with a deeper appreciation of the merits and limitations of CRFology as applied to color difference data.

About the data

The data that I use for the different cases is all random data generated deep in the CPU of my computer. I used a random number generator set up to give me DL*, Da* and Db* data that is normally distributed (Gaussian). Note that while DE values are definitely not normally distributed, the variations in the individual components of L*a*b* are more likely to follow a normal distribution, at least when the color process is under "good control". Whatever that means.

A peek inside my computer as it generated data for CRFs

In most of the cases following, I have looked at data that is strictly two-dimensional (only a* and b* values, with the ridiculous assumption that L* has no variation). I will bring three-dimensional variation in on the last two cases. Those two will blow your mind.

All of this work is done with the 1976 DE formula, simply because it is sooooo much easier to work with. The conclusions that we draw from analysis of these cases will not change if we use the DE2000 color difference formula. This is not immediately evident (at least not right now), but trust me that I kinda have a feeling that what I said was true.

I should mention one other thing. The figures below all show scatter plots and CRF plots. I decided to use 200 data points for the scatter plots, since that made a plot where the scatter was pretty clear. If I went larger, the dots all merge into a blob, which is bad. But worse than that is the fact that the size of the blob depends on the number of points in the scatter plot. (I have a solution to that, but it it several blog posts away).

For the CRF plots, 200 points would give a jaggy appearance that would mask the true underlying shape. So, for the CRFs I chose the luxury of enlisting a few more CPU cycles to generate 10,000 data points. My computer didn't seem to mind. At least it didn't say anything.

In all the scatter plots below, I have plotted Da* and Db*, and not a* and b*. The values are relative to some hypothetical target a*b* value.

Case 1

This first case is the most basic. We assume that the distribution of measurements in a*b* is a circular cloud. The values for Da* and Db* are both normally distributed with standard deviation of 1.0 DE, and Da* and Db* are uncorrelated. You can probably see this; I show the scatter plot in the upper left hand corner of the figure, and the associated CRF below.

Case 1, uncorrelated variation of equal size in a* and b*

The lower right corner of the figure introduces a new metric for analysis of the CRF; the median-to-95th percentile ratio. I intend this to be a parameter which describes the shape of the CRF. Dividing by the median makes it independent of the overall spread of the CRF. I have decided to give this ratio a sexy name: MedTo95. Kinda rolls off the fingertips as I type it. 

Note that a lower value of MedTo95 corresponds to a CRF that has an abrupt transition (high slope at the median), and a lower number indicates a more laid back shape.

(The astute reader will note that the number 1.840 that I described in a previous blog as a way to estimate the 95th percentile from the median was a MedTo95 metric based on real DE data. The really astute reader will note that 1.840 is not equal to 2.11. The really, really astute reader will draw an inference from this inequality. The CRF shown in Case 1 is a bit abnormal.)

Case 2

The most natural question to ask is what happens if the cloud of Ddata is moved over a little bit. To put this into context, this is a DE cloud where there is a bias. We have the usual variation in color, but we are also not quite dead-on the target value.

The Figure below shows the Case 1 data in black and this second Ddata in green. I thought it was pretty clever of me to come up with a figure like this that is so incredibly intuitive. If you had to scratch your head for a while before catching on, then I guess I am not as clever as I would like to think.

Case 2, which is the same as Case 1, except shifted over

It should come as no surprise that the CRF has been expanded outward toward larger DE values. We have added a second source of color difference, so we expect larger DE values.

Important note for the analysis of the CRFs: The bias in the color (of production versus target a*b*) caused a reduction in MedTo95.

I should also point out that it makes no difference whether the cloud in a*b* has shifted to the left by 2 DE, to the right by 2 DE, or upward or downward. The DE values will all be the same, so the CRF will be the same.

Case 3

It is hard to compare the shape of the black and green plots of Case 2, since one has been stretched. I have altered the cloud of points in the figure below so that the CRFs have the same median. Note that this scaling of the CRF required a commensurate scaling of the a*b* cloud. So, the green cloud is more compact than the black cloud of points. The standard deviations, shown at the bottom left corner of each of the scatter plots, were cut in half.

Case 3 - two data sets with the same median, but different offset

The MedTo95 ratio was 1.70 in Case 1, and is 1.69 in this case -- almost identical. That's reassuring. I mean, that's why I introduced this shape parameter as a ratio.  

Tentative conclusion #1: MedTo95 kinda works as a parameter identifying the shape.

We see that introducing a bias in our process (the color is varying about a color somewhat different than the target color) changed the shape of the CRF. The CRF of the biased data makes a faster transition, that is, it has a higher slope at the median, that is, MedTo95 is smaller.

Tentative conclusion #2: A lower MedTo95 is an indication of bias - that you're not hitting the target color.

(Please note that I underlined and boldfaced the word tentative. This might be foreshadowing or something.)

Case 4

The next most obvious change we could make to the distribution is to change the relative amount of variation in a* and b*, in other words, to make the scatter plot elliptical. This is to be expected in real data. Imagine that the plot below is of the variation in a* and b* measurements of a solid yellow throughout a press run. The predominant cause of variation is the variation in yellow ink film thickness, which is reflected mostly in the b* direction. 

I will take this a step further. The eccentricity (as opposed to circularity) of the scatter plot is an indication that one cause of variation is predominant over the others. 

The figure below shows the effect that a 5:1 aspect ratio has on the CRF.

Case 4 - elliptical variation

This is cool. The shape of the CRF has changed in the opposite direction. I would call this a mellow, laid-back, sixties kind of cool and groovy CRF, one which makes the transition gradually. The slope at the median is smaller. Graciously, MedTo95 has responded to this by getting much larger. Once again, MedTo95 has given us an indication of the shape of the CRF.

Perhaps we can tie the change in MedTo95 back to the variation in a*b*? 

Tentative conclusion #3: A higher value of MedTo95 is an indication of a departure from circularity of the scatter plot. 

Now we're getting somewhere! We can look at MedTo95 and understand something important about the shape of the scatter plot. If we see MedTo95 go up, it is a sign that we have one source of variation which is rearing its ugly head.

But once again, the word tentative is bold faced and underlined. It's almost like I am warning the reader of some broader conclusion that may eclipse this one.

Case 5

Case 4 looked at eccentricity in the b* axis. This is typical of variation in yellow, but magenta (for example) tends to vary a lot more in a*. What if the angle of the eccentricity changes? 

I played with my random number generator to simulate a variation which has been rotated so that the major axis is in the a* direction. I show the figures below.

Case 5 - a comparison of eccentricity in two different directions

This is reassuring. The CRFs are pretty much identical, and a so are the MedTo95 values. This shouldn't be much of a surprise. A moment's consideration should convince one that the color difference values (in DE76) would be the same, so the CRF shouldn't change. This will not be the case for DE2000 values.

This is good news and perhaps not-so-good news. The good news is that the CRF and MedTo95 of DE76 values is irrespective of the predominant direction of the variation. The bad news is that the CRF and MedTo95 of DE76 values are irrespective of the predominant direction of the variation.

Tentative conclusion #4: CRF and MedTo95 don't know nothing from the direction of variation. 

Case 6

We have looked at a bias and eccentricity of variation in isolation. How about if both occur? In the figure below we look at one possible example of that. The blue cloud has been shifted to the right, and squished horizontally. It was also squished just a bit vertically, just so the median is that same as all the other CRFs. 

Case 6, in which we encounter both sliding over and squishing

From the figure, it is clear that the combination of these two effects causes a CRF that is more uptight than the standard scatter plot that we started with. The new CRF is reluctant to change initially, but changes rapidly once it decides to change.

Thus, we re-iterate tentative conclusion #2: A lower MedTo95 is an indication of bias - that you're not hitting the target color. And, of course, we can forget about tentative conclusion #3: A higher value of MedTo95 is an indication of a departure from circularity of the scatter plot.

How does elliptical distribution with offset (Case 6) compare Case 3, where the scatter plot shows a circular distribution with offset? The two are compared in the figure below.

A comparison of two biased distributions

Here we see two CRFs that are pretty darn close if you look at the area above the median. The MedTo95 of the two are also (to no one's surprise) very close. If I may remind you, the CRFs represent a collection of a whopping 10,000 data points where the initial distributions were algorithmically designed to be normal distributions. You ain't never gonna see no real CRF that is as pristine as these CRFs.

Our tentative conclusions are starting to unravel. :(  

Tentative conclusion #5: There ain't no way, no how, that you can use MedTo95 to diagnose ellipticity when there is a bias.

But, these  conclusions are based on what's going on above the median. There is still some hope that the stuff below the median might pan out. We would need, of course, an additional parameter. Maybe MedTo25? 

Case 7

In Case 6, we looked at elliptical variation that is perpendicular to the bias. The bias was off to the right, and the principle axis of the variation was up and down. Let's look at bias and variation that are both along the a* axis. This is shown in the next figure.

Case 7 - comparison of variation parallel and perpendicular to bias 

The new curve - the one in violet - kinda looks like red curve shown in Case 4. Things have certainly gotten complicated! I will try to capture this in another tentative conclusion.

Tentative conclusion #6: In the presence of both elliptical variation and bias, elliptical variation in the direction of the bias looks similar to elliptical variation. Elliptical variation perpendicular to the direction of the bias looks like bias. 

Ummm... I didn't stop to consider what happens when the elliptical variation is at 45 degrees to the bias. Presumably, it looks a lot like circular variation with no bias. That ain't so good. 

I probably should actually show an example of this. I think the CRF of elliptical variation at 45 degrees to direction of the bias would look a lot like the black CRF that we have been using as a reference, at least above the waist. But, rather than head further down the rabbit hole, I have one more consideration.

Case 8

All of the examples so far have made the assumption that the variation is strictly two-dimensional, that is, in a* and b*. That's a simplification that I made in order to aid in our understanding of the interpreting of a CRF. One would expect that three-dimensional variation is more likely to be encountered in the real world.

In the cyan of the figure below, I modeled spherical variation which is of equal magnitude in L*, a*, and b*.

Case 8 - the effect of dimensionality on the CRF

By comparing the cyan CRF to the black (two-dimensional), we see that adding a third dimension has the effect of making the transition sharper and of decreasing MedTo95. The red CRF has been added to suggest the effect of reducing the dimensionality to something closer to one.

(Some readers may be thinking something along the line of "chi-squared distribution with n degrees of freedom, where n is some number between 1 and 3, where that number might not be an integer." If those words are gobbledy-gook, then that's ok.)

This next figure compares the CRF of a three dimensional spherical distribution with a two dimensional circular distribution with a little bias. 

Comparison of spherical distribution with circular distribution with bias

I think that this might be just a tad scary for those who wish to use the CRF to divine something about the scatter of points in color space. We see two distributions that are nothing like each other, but yet have CRFs that are very similar. 

In a theoretical world, one might be able to tell the difference between these two. But, there are two things fighting against us. First, we never have CRF plots that are as clean as the ones I have created. 

Second, this blog post shows that we have a lot of knobs to play with. The shape of the CRF is effected by the length of all three of the axes of the ellipsoid, as well as by the magnitude and direction of the bias with respect to the axes of the ellipsoid. Without a lot of trying, I have twice come up with pairs of dissimilar distributions where the CRFs are similar. And I haven't even considered variations that are non-normal. Given a bit more time, I think I could get some pairs of CRFs that would boggle the mind. 

The non-tentative conclusion

If two CRFs are different, we can pretty definitively make the statement that there is some difference between the distributions of color differences. But, one might just as well look at tea leaves to divine the nature of the difference in the distributions. Furthermore, if the CRFs of two sets of color difference data are similar, one cannot come to any conclusions about the similarity between the underlying color variation in CIELAB space.

This blog post and the exciting conclusion was not an Aha! experience for me. The Aha! moment occurred earlier, when I was coming to grips with the fact that a bias will mask variation in CIELAB. This was described in one of my blog posts about process control of color difference data

So, what's the answer? How can we look at a cloud of points in CIELAB and draw any inferences about it? The following diagram is a clue -- drawing ellipsoids that "fit" the variation of points in CIELAB. The ellipses are an extension of the standard deviation to three-dimensions. This involves the co-variance matrix of DL*, Da*, and Db* values. It involves a strange concept of taking the square root of a matrix - not to be confused with the square root of the components of a matrix. And it involves principle component analysis. And it is related to Hotelling's T-squared statistic.

The ellipses below were generated using this technique. I will get around to blogging about this eventually!

I gratefully acknowledge the comments are proofreading from a good friend, Bruce Bayne. He won't admit it, but he is a pretty sharp guy.