Monday, January 6, 2014

How Multisampling Works in OpenGL

I’ve always been a little confused about exactly how multisampling works in OpenGL, and I’ve always seemed to be able to get away without knowing. Recently I did some reading and some playing around with it, and I thought I’d explain it here. I’m speaking from the perspective of OpenGL 4.3 (my machine only supports 4.1, but everything here should work the same in both). There is a lot of outdated information on the Internet, so I thought I’d specify the version up front.

Multisampled Textures and Renderbuffers


Multisampling only makes logical sense if you’re rendering into a destination that is multisampled. When you bind a texture for the first time, if you bind it to GL_TEXTURE_2D_MULTISAMPLE, that texture is defined to be a multisampled texture (and similarly for renderbuffers). A multisampled texture works similarly to a regular texture, except without mipmaps. In lieu of mipmaps, each texel gets a number of slots for writing values into. It’s similar to a texture array (except without mipmaps).

You create storage for a multisampled texture with glTexImage2DMultisample() instead of glTexImage2D(). There are four main differences between these two calls:

  • You can't specify pixel data for initializing the texture
  • You don't specify a LOD number (because there are no mipmaps to choose between)
  • You specify the number of slots the texture holds per texel (the number of samples)
  • You specify a fixedsamplelocations boolean, which I explain later

Reading from a Multisampled Texture


Shaders can read from multisampled textures, though they work differently than regular textures. In particular, there is a new type, sampler2DMS, that refers to the multisampled texture. You can’t use any of the regular texture sampling operations with this new type. Instead, you can only use texelFetch(), which means that you don’t get any filtering at all. The relevant signature of texelFetch() takes an additional argument which refers to which of the slots you want to read from.
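
To make the "extra argument" concrete, here is a small C model of what a per-sample fetch does. This is only a sketch: the MsTexture struct, the texel_fetch_ms() name, and the memory layout (all samples of one texel stored next to each other) are my own assumptions for illustration. The real storage layout is up to the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side model of a single-channel multisampled texture.
   Assumed layout: the samples of one texel are contiguous in memory. */
typedef struct {
    int width;
    int height;
    int samples;
    const float *data; /* width * height * samples values */
} MsTexture;

/* Models a texelFetch() on a sampler2DMS: integer texel coordinates plus a
   sample index, and no filtering of any kind. */
float texel_fetch_ms(const MsTexture *tex, int x, int y, int sample) {
    assert(x >= 0 && x < tex->width);
    assert(y >= 0 && y < tex->height);
    assert(sample >= 0 && sample < tex->samples);
    size_t index = ((size_t)y * (size_t)tex->width + (size_t)x)
                   * (size_t)tex->samples + (size_t)sample;
    return tex->data[index];
}
```

Note that nothing is blended here: you get back exactly the value stored in the one slot you asked for.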


Writing to a Multisampled Texture or Renderbuffer


The only way you can write to a multisampled texture or renderbuffer is by attaching it to a framebuffer and issuing a draw call.

Normally (with multisampling off), if you run a glDraw*() call, a fragment shader invocation is run for every fragment whose center is deemed to be inside the geometry that you’re drawing. You can think of this as each fragment having a single sample located at its center. With multisampling on, there is a collection of sample points located throughout the fragment. If any of these sample points lies within the rendered geometry, an invocation of the fragment shader is run (but only a single invocation for the entire fragment; more on this later). The fragment shader produces a single output value for the relevant texel in the renderbuffer, and this same value is copied into each slot whose sample was covered by the rendered geometry. Therefore, the number of samples is dictated by the number of slots in the destination renderbuffer. Indeed, if you try to attach textures/renderbuffers with differing slot counts to the same framebuffer, the framebuffer won’t be complete.
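
Here is a rough C model of that process for a single pixel with 4 samples. It is a sketch under some loud assumptions: the sample offsets in SAMPLE_X are made up (real positions are implementation-defined), the "geometry" is just the half-plane to the left of a vertical edge, and coverage_mask() and shade_pixel() are hypothetical names of mine, not anything OpenGL provides.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_COUNT 4

/* Made-up horizontal sample offsets inside a unit pixel. */
const float SAMPLE_X[SAMPLE_COUNT] = {0.375f, 0.875f, 0.125f, 0.625f};

/* Stand-in for rasterization: a sample is covered if it lies to the left of
   a vertical edge at edge_x (in pixel-local coordinates). */
uint32_t coverage_mask(float edge_x) {
    uint32_t mask = 0;
    for (int i = 0; i < SAMPLE_COUNT; ++i) {
        if (SAMPLE_X[i] < edge_x) {
            mask |= 1u << i;
        }
    }
    return mask;
}

/* The "fragment shader" result (shader_output) is computed once per pixel
   and copied into every covered slot. Returns the number of shader
   invocations that were needed: 0 if no sample is covered, otherwise 1. */
int shade_pixel(float *slots, float edge_x, float shader_output) {
    assert(slots != 0);
    uint32_t mask = coverage_mask(edge_x);
    if (mask == 0) {
        return 0;
    }
    for (int i = 0; i < SAMPLE_COUNT; ++i) {
        if (mask & (1u << i)) {
            slots[i] = shader_output;
        }
    }
    return 1;
}
```

With the edge at 0.5, only samples 0 and 2 are covered, so only those two slots receive the new value and the other two keep whatever was there before.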

There is even a fragment shader input variable, gl_SampleMaskIn, which is a bitmask of which samples are covered by this fragment. It’s actually an array of ints because you might have more than 32 samples per pixel (though I’ve never seen that). You can also modify which slots will be written to by using the gl_SampleMask fragment shader output variable. However, you can’t use this variable to write to slots that wouldn’t have been written to originally (corresponding to samples that aren’t covered by your geometry).
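
The bit layout of these masks is easy to model in C. In the sketch below, bit n of word M stands for sample 32*M + n, which is how the mask arrays are defined. The function names sample_in_mask() and effective_coverage() are hypothetical helpers of mine, and effective_coverage() only handles the common case of 32 or fewer samples.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if sample s is set in an array-of-ints sample mask, else 0.
   Bit n of word M corresponds to sample 32 * M + n. */
int sample_in_mask(const int32_t *mask_words, int s) {
    assert(s >= 0);
    uint32_t word = (uint32_t)mask_words[s / 32];
    return (int)((word >> (s % 32)) & 1u);
}

/* The shader's output mask can only remove samples from the rasterized
   coverage; it can never add samples the geometry did not cover. */
uint32_t effective_coverage(uint32_t rasterized, uint32_t shader_mask) {
    return rasterized & shader_mask;
}
```

For example, if the geometry covers samples 0 and 2 (mask 0x5) and the shader writes a mask of 0x6, only sample 2 ends up being written.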

Eventually, you want to render to the screen. When you create your OpenGL context, you can specify that you want a multisampled pixel format. This means that when you render to framebuffer 0 (corresponding to the screen), you are rendering to a multisampled renderbuffer. However, each pixel on the screen itself can only show a single color. Therefore, one of the last stages in the OpenGL pipeline is to average all of the slots in a given texel. Note that this automatic averaging only happens when you’re drawing to the screen.

Note that because we’re writing to the specific samples covered by the geometry (and not using something like alpha to determine coverage), adjacent geometry works properly. If two adjacent triangles cover the same fragment, some of the samples will be written to by one of the triangles, and the rest of the samples will be written to by the other triangle. Therefore, when averaging all these samples, you get a nice blend of both triangles.
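
The averaging step is simple enough to sketch in C. This assumes a plain unweighted average of the slots, as described above. The Color struct and resolve() name are my own for illustration, and a real implementation is free to do the resolve differently.

```c
#include <assert.h>

typedef struct {
    float r;
    float g;
    float b;
} Color;

/* Averages all of the slots of one texel into the single color that ends up
   on screen. */
Color resolve(const Color *slots, int sample_count) {
    assert(sample_count > 0);
    Color sum = {0.0f, 0.0f, 0.0f};
    for (int i = 0; i < sample_count; ++i) {
        sum.r += slots[i].r;
        sum.g += slots[i].g;
        sum.b += slots[i].b;
    }
    sum.r /= (float)sample_count;
    sum.g /= (float)sample_count;
    sum.b /= (float)sample_count;
    return sum;
}
```

If a red triangle wrote three of a pixel's four slots and a neighboring blue triangle wrote the fourth, the resolved pixel is (0.75, 0, 0.25): mostly red with a little blue, which is exactly the smooth edge you want.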

There is a related feature of OpenGL called Sample Shading. This allows you to run multiple invocations of your fragment shader for each fragment. Each invocation of the fragment shader will correspond to a subset of the samples in each fragment. You turn this feature on by saying glEnable(GL_SAMPLE_SHADING) (multisampling itself can also be turned on and off with glEnable(GL_MULTISAMPLE), but its default value is “on”). Then, you can configure how many invocations of the fragment shader you’d like to run with glMinSampleShading(). You pass this function a normalized float, where 1.0 means to run one invocation of the fragment shader for each sample, and 0.5 means to run one invocation of the fragment shader for every two samples. There is a GLSL input variable, gl_SampleID, which tells you which sample the current invocation corresponds to. If you set the minimum sample shading to 1.0, each invocation handles exactly one sample, so gl_SampleMaskIn will always have a single bit set (in other words, it will be a power of two).
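
As a quick sanity check on the numbers, here is a C sketch of how the value passed to glMinSampleShading() turns into an invocation count. As far as I can tell from the spec, the guaranteed minimum is max(ceil(value * samples), 1), and an implementation is allowed to run more invocations than that. The function names here are hypothetical, and per_sample_mask() assumes 32 or fewer samples.

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound on fragment shader invocations per fragment:
   max(ceil(min_sample_shading * samples), 1), with the value clamped
   to the range [0, 1]. */
int min_shader_invocations(float min_sample_shading, int samples) {
    assert(samples > 0);
    float value = min_sample_shading;
    if (value < 0.0f) value = 0.0f;
    if (value > 1.0f) value = 1.0f;
    float scaled = value * (float)samples;
    int count = (int)scaled;            /* truncates toward zero */
    if ((float)count < scaled) {
        ++count;                        /* round up: manual ceil */
    }
    return count < 1 ? 1 : count;
}

/* With one invocation per sample, the input sample mask for the invocation
   handling sample_id has only that sample's bit set. */
uint32_t per_sample_mask(int sample_id) {
    assert(sample_id >= 0 && sample_id < 32);
    return 1u << sample_id;
}
```

So with 8 samples, a value of 1.0 gives 8 invocations, 0.5 gives 4, and 0.0 still gives 1, since the fragment always has to be shaded at least once.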

For example, if you want to copy one multisampled texture into another one, you could attach the destination texture to the framebuffer, turn on sample shading with a minimum rate of 1.0, draw a fullscreen quad, and have your fragment shader say “outColor = texelFetch(s, ivec2(gl_FragCoord.xy), gl_SampleID);”.

There is a function, glGetMultisamplefv(), which lets you query for the location of a particular sample within a fragment. You can also get the number of samples with glGetIntegerv() and GL_SAMPLES. You can then upload this data to your fragment shader in a Uniform Buffer if you want to use it during shading. However, the function that creates a multisampled texture, glTexImage2DMultisample(), takes a boolean argument, fixedsamplelocations, which dictates whether or not the implementation has to keep the sample count and arrangement the same for each fragment. If you specify GL_FALSE for this value, then the output of glGetMultisamplefv() doesn’t necessarily apply.

It’s also worth noting that multisampling is orthogonal to the various framebuffer attachments. It works the same for depth buffers as it does for color buffers.

Now, when you play a game at "8x MSAA," you know exactly what's going on!