Wednesday, May 12, 2021

Understanding CVDisplayLink

I found it surprisingly difficult to understand how to use CVDisplayLink, but after playing around with it for a while, I think I've got a pretty good handle on it. It's not too complicated.

The main use of a CVDisplayLink is to have a callback that runs once per vsync of a screen. It's also stateful, so you can stop and start the callback stream.

Creation

When you create one of these objects, you have to tell the system which screen to match - because different screens can have different refresh rates. CVDisplayLink identifies the screen by its CGDirectDisplayID. You can get this from an NSScreen* as follows:

NSDictionary<NSDeviceDescriptionKey, id> *deviceDescription = theScreen.deviceDescription;

NSNumber *directDisplayIDNumber = deviceDescription[@"NSScreenNumber"];

CGDirectDisplayID directDisplay = directDisplayIDNumber.unsignedIntValue;

Then, you can use CVDisplayLinkCreateWithCGDisplay() to create the object for that display.

Setup

The callback can be supplied either as a block or as a C function. The block version doesn't need a void* userInfo argument because the block implicitly captures whatever context it needs. So, you just say:

CVDisplayLinkSetOutputHandler(displayLink, ^CVReturn (CVDisplayLinkRef displayLink, const CVTimeStamp *inNow, const CVTimeStamp *inOutputTime, CVOptionFlags flagsIn, CVOptionFlags *flagsOut) {

    ...

    return kCVReturnSuccess;

});

And then you start it with just CVDisplayLinkStart(displayLink). Easy peasy. There are also functions for stopping, retaining, and releasing the CVDisplayLink: CVDisplayLinkStop(), CVDisplayLinkRetain(), and CVDisplayLinkRelease().

Interpreting the Arguments

It actually took me quite a while to figure out what each of the arguments means. The docs say that flagsIn and flagsOut are 0, and the displayLink is the CVDisplayLink that you started, so there are only really two interesting arguments: inNow, and inOutputTime, both of which are of type CVTimeStamp. inNow represents the time that this callback is being run, and inOutputTime represents the time that anything you draw in the callback is supposed to show up at.

So, let's dig into CVTimeStamp. The version and reserved fields are 0, and the flags field tells you which of the fields in the CVTimeStamp are valid. I don't know what SMPTE time is, but it never seems to be set/valid, so I'm going to ignore that one. So these are the ones that are remaining:

  • hostTime
  • rateScalar
  • videoRefreshPeriod
  • videoTime
  • videoTimeScale
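As a rough sketch of the flags check described above: each bit in flags marks one field as valid. The enum names and the helper timestamp_has_fields() below are hypothetical; the bit values are meant to mirror the kCVTimeStamp...Valid constants in CoreVideo, and real code should use the SDK's own constants rather than these stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local names mirroring the kCVTimeStamp...Valid bits.
   On macOS, use the constants from CoreVideo instead of these. */
enum {
    TS_VIDEO_TIME_VALID           = 1u << 0,
    TS_HOST_TIME_VALID            = 1u << 1,
    TS_SMPTE_TIME_VALID           = 1u << 2,
    TS_VIDEO_REFRESH_PERIOD_VALID = 1u << 3,
    TS_RATE_SCALAR_VALID          = 1u << 4
};

/* True when every bit in `wanted` is set in the timestamp's flags field. */
static bool timestamp_has_fields(uint64_t flags, uint64_t wanted) {
    assert(wanted != 0); /* asking about no fields is probably a bug */
    return (flags & wanted) == wanted;
}
```

Inside a callback you would pass inNow->flags as the first argument and only read, say, videoTime once the matching bit is confirmed.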

The thing you have to realize is that there are two timelines happening concurrently: "host" time and "video" time. So, a "point" in time actually has two different representations: one for each of the timelines.

The hostTime field uses the same tick count that mach_absolute_time() uses. To convert it to seconds, you have to use mach_timebase_info(), which gives you the numerator and denominator for turning ticks into nanoseconds. And, the "meaning" of the hostTime field is the current time as measured by your application - exactly what mach_absolute_time() returns.
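Here is a minimal sketch of that conversion. To keep it portable, HostTimebase is a hypothetical stand-in for mach_timebase_info_data_t (which has the same numer and denom fields); on macOS you would fill it in by calling mach_timebase_info(). The helper name host_ticks_to_seconds() is also made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for mach_timebase_info_data_t:
   one tick lasts numer / denom nanoseconds. */
typedef struct {
    uint32_t numer;
    uint32_t denom;
} HostTimebase;

/* Convert a hostTime tick count to seconds. */
static double host_ticks_to_seconds(uint64_t ticks, HostTimebase tb) {
    assert(tb.denom != 0);
    double nanoseconds = (double)ticks * (double)tb.numer / (double)tb.denom;
    return nanoseconds / 1e9;
}
```

With an illustrative timebase of 125/3, 24,000,000 ticks come out as one second. Going through double loses a little precision for very large tick counts, which is fine when you mostly care about deltas between nearby frames.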

The videoTime field does not use those same tick counts. Instead, it uses the videoTimeScale field. It's a rational number: videoTime / videoTimeScale = seconds. videoRefreshPeriod is a rational number too, using the same denominator, but it represents the delta between adjacent video frames.
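The rational arithmetic could look something like this. VideoFields is a hypothetical struct that copies just the three relevant CVTimeStamp fields (with the same integer widths as far as I can tell), and the helper names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copy of the video-timeline fields of a CVTimeStamp. */
typedef struct {
    int64_t videoTime;
    int32_t videoTimeScale;
    int64_t videoRefreshPeriod;
} VideoFields;

/* videoTime / videoTimeScale = seconds on the video timeline. */
static double video_time_seconds(VideoFields f) {
    assert(f.videoTimeScale > 0);
    return (double)f.videoTime / (double)f.videoTimeScale;
}

/* Delta between adjacent frames, in seconds (same denominator). */
static double refresh_period_seconds(VideoFields f) {
    assert(f.videoTimeScale > 0);
    return (double)f.videoRefreshPeriod / (double)f.videoTimeScale;
}

/* Nominal refresh rate in Hz: the reciprocal of the period. */
static double refresh_rate_hz(VideoFields f) {
    assert(f.videoRefreshPeriod > 0);
    return (double)f.videoTimeScale / (double)f.videoRefreshPeriod;
}
```

For example, with made-up values videoTimeScale = 240 and videoRefreshPeriod = 4, the period is 1/60 of a second and the rate is 60 Hz.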

For CVDisplayLink, the "video" time represents time as measured by vsyncs. You can think of vsyncs as an independent clock - it ticks every so often, and those ticks don't have to be in exact cadence with any of the other clocks on the system. They're supposed to be, but when you actually measure them, they won't perfectly line up. So, if videoRefreshPeriod / videoTimeScale equals 1/60, and you record adjacent frames' hostTime and convert them to seconds using mach_timebase_info(), you'll get something that's close to 1/60, but it won't be exact, because nothing is ever that exact all the time.

So that's what rateScalar tries to measure. It's the only field that is floating point, and it measures the speed of the video timeline relative to the speed of the host timeline. Ideally, it would always be 1.0, but, of course, nothing is ever that perfect. It's not sensitive to workload, just as time doesn't dilate when you start asking your computer to do some work.
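To get a feel for what that ratio means, you can estimate something similar yourself from two callbacks. This is only a sketch of the idea, not how CoreVideo actually computes rateScalar: FrameSample and estimate_rate() are hypothetical, and hostSeconds is assumed to be hostTime already converted to seconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of one callback's timestamp. */
typedef struct {
    double  hostSeconds;     /* hostTime converted to seconds */
    int64_t videoTime;
    int32_t videoTimeScale;
} FrameSample;

/* Elapsed video time divided by elapsed host time between two samples.
   Above 1.0, the video clock advanced faster than the host clock. */
static double estimate_rate(FrameSample earlier, FrameSample later) {
    assert(later.videoTimeScale > 0);
    assert(earlier.videoTimeScale == later.videoTimeScale);
    assert(later.hostSeconds > earlier.hostSeconds);
    double videoElapsed = (double)(later.videoTime - earlier.videoTime)
                        / (double)later.videoTimeScale;
    double hostElapsed = later.hostSeconds - earlier.hostSeconds;
    return videoElapsed / hostElapsed;
}
```

Samples taken a second or more apart give a steadier estimate than adjacent frames, since any jitter in when the callback fires matters less over a longer span.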

The video time is time based on vsyncs, not time based on the window server render loop or the core animation render loop. If some other application loads up a big Core Animation scene, your CVDisplayLink isn't going to tick slower.

Also, I assume the fact that videoRefreshPeriod is passed into each callback indicates that displays can change their refresh rate ... but I'm not sure.