## Sunday, May 28, 2017

### Chromaticity Diagrams

Humans can see light of the wavelengths between around 380 nm and 780 nm. We see many photons at a time, and we recognize the collection of photons as a particular color. Each photon has a frequency, which means that a particular color is the effect of how much power the photons have at each particular frequency. Put another way, a color is a distribution of power throughout the visible wavelengths of light. For example, if 1/3 of your power is at 700 nm and 2/3 of your power is at 400 nm, the color is a deep purple. This color can be described by the 2-dimensional function:

Different curves over this domain represent different colors. Here is the curve for daylight:

So, if we want to represent a color, we can describe the power function over the domain of visible wavelengths. However, we can do better if we include some biology.

## Biology

We have three types of cells (called “cones”) in our eyes which react to light. Each of the three kinds of cones are sensitive to different wavelengths of light. Cones only exist the the center of our eye (the “fovea”) and not in our peripheral vision, so this model is only accurate to describe the colors we are directly looking at. Here is a graph of the sensitivities of the three kinds of cones:

Here, you can see that the “S” cones are mostly sensitive to light at around 430 nm, but they still respond to light within a window of about 75 nm around it. You can also see that if all the light entering your eye is at 540 nm, the M cones will respond the most, the L cones will respond strongly (but not as much as the M cones), and the S cones will respond almost not at all.

This means that the entire power distribution of light entering your eye is encoded as three values on its way to your brain. There are significantly fewer degrees of freedom in this encoding than there are in the source frequency distribution. This means that information is lost. Put another way, there are many different frequency distributions which get encoded the same way by your cones’ response.

This is actually a really interesting finding. It means we can represent color by three values instead of a whole function across the frequency spectrum, which results in a significant space savings. It also means that, if you have two colors which appear to “match,” their frequency distributions may not match, so if you perform a modification the same way to both colors, they may cease to match.

If you think about it, though, this is the principle that computer monitors and TVs use. They have phosphors in them which emit light at a particular frequency. When we watch TV, the frequency diagram of the light we are viewing contains three spikes at the three frequencies of phosphors. However, the images we see appear to match our idea of nature, which is represented by a much more continuous and flat frequency diagram. Somehow, the images we see on TV and the images we see in nature match.

## Describing color

So color can be represented by a triple of numbers: (Response strength of S cones, Response strength of M cones, Response strength of L cones). Every combination of these three values represents every color we can perceive.

It would be great if we could simply represent a color by the response strength of each of the particular kinds of cones in our eyes; however, this is difficult to measure. Instead, let’s pick frequencies of light which we can easily produce. Let’s also select these frequencies such that they will correspond as well as possible to each of the three cones. By varying the power these lights produce, we should be able to produce many of the three triples, and therefore many of the colors we can see.

In 1931, two experiments attempted to “match” colors using lights of 435.8 nm, 546.1 nm, and 700 nm (let’s call them “blue,” “green,” and “red” lamps). The first two wavelengths are easily created by using mercury vapor tubes, and correspond to the S and M cones, respectively. The last frequency corresponds to the L cones, and, though isn’t easily created with mercury, is insensitive to small errors because the L cones’ frequency is close to flat in this neighborhood.

So, which colors should be matched? Every color can be decomposed to a collection of power values at particular frequencies. Therefore, if we could find a way to match every frequency in the observable range by humans, this data would be sufficient to match any color. For example, if you have a color with a peak at 680 nm and a peak at 400 nm, and you know that 680 nm light corresponds to our lamp powers of (a, b, c) and 400 nm corresponds to our lamp powers of (d, e, f), then the (a + d, b + e, c + f) should match our color.

This was performed by two people: Guild and Wright, using 7 and 10 samples, respectively (and averaging the results). They went through every frequency of light in the visible range, and found how much power they had to make each of the lamps emit in order to match the color.

However they found something a little upsetting. Consider the challenge of matching light at wavelength 510 nm. At this wavelength, we can see that the S cones would react near 0, and that the M cones would react maybe 20% more than the L cones. So, we are looking for how much power our primaries should emit to construct this same response in our cones.

(The grey bars are our primaries, and the light blue bar is our target)

Our primaries lie at 435.8 nm, 546.1 nm, and 700 nm. So, the blue lamp should be at or near 0; so far so good. If we select a power of the green light which gives us the correct M cone response, we find that it causes too high of an L cone response (because of how the cones overlap). Adding more of the red light only causes the problem to grow. Therefore, because the cones overlap, it is impossible to achieve a color match with this wavelength of light using these primaries.

The solution is to subtract the red light instead of adding it. The reason we couldn’t find a match before is because our green light added too much L cone response. If we could remove some of the L cone’s response, our color would match. We can do this by, instead of matching against 520 nm, let’s instead match against the sum of 520nm plus some of our red lamp. This has the effect of subtracting out some of the L cone response, and lets us match our color.

Using this approach, we can construct a graph, where for each wavelength, the three powers of the three lights are plotted. It will include negative values where matches would otherwise be impossible.

## X Y Z color space

Once we have this, we  now can represent any color by a triple, possibly negative, where each value in the triple represents the power of our particular primary. However, the fact that these values can be negative kind of sucks. In particular, machines were created which can measure colors, but the machines would have to be more complicated if some of the values could be negative.

Luckily, the power of light follows mathematical operations. In particular, addition and multiplication hold. Color A plus color B yields a consistent result, no matter what frequency distribution color A is represented by. The same is true for multiplication. This means that we are actually dealing with a vector space. A vector space can be transformed via a linear transformation.

So, the power values of each of the lights at each frequency were transformed such that the resulting values were positive at each frequency. This new, transformed, vector space, is called X Y Z, and is not physically-based.

Given these new non-physical primaries, you can construct a similar graph. It shows, for each frequency of light, how much of each primary is necessary to represent that frequency.

## Chromaticity graphs

So, for each frequency of light, we have an associated (X, Y, Z) triple. Let’s plot it on a 3-D graph!

(Best viewed in Safari Nightly build.)
Click and drag to control the rotation of the graph!

The white curve is our collection of (X, Y, Z) triples. (The red is our unit axes.)

Remember that every visible color is represented as a linear combination of the vectors from the origin to points on this (one-dimensional) curve.

The origin is black, because it represents 0 power. The hue and saturation of the color is described by the orientation of the point, not the distance the point is from the origin. If we want to represent this space in two dimensions, it would make sense to eliminate the brightness component and instead only show the hue and saturation. This can be done by projecting each point onto the X + Y + Z = 1 plane, as seen by the green on the above chart.

Note that this shape is convex. This is particularly interesting: any point on the contour of the shape, plus any other point on the contour of the shape, yields a point within the interior of the shape (when projected back to our projection plane). Recall that all visible colors are equal the the linear combination of points on the contour of the curve. Therefore, all visible colors equal all the points in the interior of this curve. Points on the exterior of this curve represent colors with negative cone response for at least one type of cone (which cannot happen).

So, the inside of this horseshoe shape represents every visible color. This shape is usually visualized by projecting it down to the (X, Y) plane. That yields the familiar diagram:

Inside this horseshoe represents every color we can see. Also, you can notice that, because X Y Z was constructed so that every visible color has positive coordinates, and the projection we are viewing is onto the X + Y + Z = 1 plane, all the points on the diagram are below the X + Y = 1 line.

## Color spaces

A color space is usually represented as three primary colors, as well as a white point (or a maximum bounds on the magnitude of each primary color). The colors in the color space are usually represented as a linear combination of the primary colors (subject to some maximum). In our chromaticity diagram, we aren’t concerned with brightness, so we can ignore these maximums values (and associated white-points). Because we know the representable colors in a color space are a linear combination of the primaries, we can plot the primaries in X, Y, Z color space and project them to the same X + Y + Z = 1 plane. Using the same logic we used above, we know that the representable colors in the color space are on the interior of the triangle realized by this projection.

You can see the result of this projection for the primaries of sRGB in the shaded triangle in the above chart. As you can see, there are many colors that human eyes can see which aren’t representable within sRGB. The chart also allows you to toggle the bounding triangle for the DCI-P3 color space, which Apple recently released on some of its devices. You can see how Display P3 includes more colors than sRGB.

Because the shape of all visible colors isn’t a triangle, it isn’t possible to create a color space where each primary is a visible color and the colorspace encompasses every visible color. If your color space encompasses every visible color, the primaries must lie outside of the horseshoe and are therefore not visible. If your primaries lie inside the horseshoe, there are visible colors which cannot be captured by your primaries. Having your primaries be real physical colors is valuable so that you can, for example, actually build physical devices which include your primaries (like the phosphors in a computer monitor). You can get closer to encompassing every visible color if you increase the number of primaries to 4 or 5, at the cost of making each color "fatter."

Keep in mind that these chromaticity diagrams (which are the ones in 2D above) are only useful for plotting individual points. Specifically, 2-D distances across this diagram are not meaningful. Points that are close together on the diagram may not be visually similar, and points which are visually similar may not be close together on the above diagram.

Also, when reading these horseshoe graphs, realize that they are simply projections of a 3D graph onto a somewhat-arbitrary plane. A better visualization of color would include all three dimensions.