Saturday, April 6, 2013

Design of Mesa 3D Part 1: __glXInitialize()


Lately I’ve become interested in the Mesa 3D project. Mesa 3D is a free (libre) OpenGL implementation*, which means that I can read the source for exactly how my OpenGL programs get executed. I’m fairly familiar with the OpenGL API, and I’ve read ryg’s incredibly helpful “A Trip Through the Graphics Pipeline” series. However, We Can Go Deeper! There’s no substitute for actually reading the source. Each card manufacturer also writes its own proprietary OpenGL implementation, but those aren’t open source and are therefore dead to me. Yeah.

I got interested in Mesa 3D because I’d like to use the OpenVG API on FreeBSD (as opposed to Cairo, because OpenVG is easily accelerated). However, the FreeBSD version of Mesa 3D is old enough that it doesn’t support OpenVG yet (or even the Gallium drivers). Since I’m a professional software developer, I figured I’d see what I could do about getting a newer version of Mesa 3D into FreeBSD. Along the way I realized that I would benefit from just reading the source proper.

So, I'm going to focus on the version of Mesa 3D that comes with the version of FreeBSD I'm running (9.1). According to the ports tree, this is currently 7.6 or 7.11. There are branches for both of these in the Mesa 3D source repository, so I'll focus on 7.6. I've also got an NVidia card, so I'm going to focus on that codepath as well.

All right, let's dig in. Before we can get to OpenGL proper, we must understand how GLX works and how it interacts with the rest of the system. There are a variety of players here:
  • libGL itself (implemented by Mesa 3D)
  • the X11 server (implemented by X.Org)
  • libdrm, which talks to the kernel directly, bypassing the X11 server (this is why the direct rendering path is "direct")
  • Mesa's DRI drivers: the DRM interface is actually a little different for each card, so Mesa ships a separate driver per card family, in files named like drivername_dri.so
  • the kernel's DRM subsystem, which receives the DRM commands and coordinates the various DRM clients
  • X11's 2D driver (called the DDX)
In general, one of the first GLX functions that you'd call would be something like glXChooseFBConfig(), implemented in src/glx/x11/glxcmds.c. That call in particular immediately calls glXGetFBConfigs(), which immediately calls __glXInitialize(), implemented in src/glx/x11/glxext.c. Alright, here's where the fun starts.

At this point it would be useful to mention that the code I've read so far uses, in effect, subclasses implemented in C. I've seen this pattern used in a variety of other programs. So, how does it work?

You want a parent struct, and a child struct that contains everything the parent contains along with some other stuff, and you want to be able to cast safely between the two. Well, if you put an instance of the parent struct as the first member of the child struct, then the child contains everything the parent contains, accessible as "child->base.item" (where "base" is the name of that first member). You can upcast by taking "&child->base", and if you have a parent pointer and know which type of child it really is, you can safely cast the pointer straight to the child type. This works because C guarantees that items in a struct are laid out at increasing addresses, with the first member at offset 0 (C99 §6.7.2.1 clause 13). Alignment isn't a problem either: a struct's alignment requirement is the strictest of its members', so the child is aligned at least as strictly as the parent, and a parent sitting at offset 0 of a properly aligned child is itself properly aligned (yay math!).

You also want the child's functions to be able to override the parent's functions; this can be done by explicitly creating your own vtable in the parent struct and making the child fill in the vtable at creation time. This is essentially how C++ solves the same problem; the main difference is that C++'s vtables are static and shared among all instances of a particular class, whereas in Mesa 3D the vtable is a series of function pointers inside each instance of the struct itself. (This is similar to prototype-based programming.)
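The whole pattern fits in a few lines. Here's a minimal compilable sketch of it; the shape/rect names are mine for illustration, not Mesa's:

```c
#include <assert.h>
#include <stdlib.h>

/* "Parent": common data plus a per-instance vtable slot. */
struct shape {
   const char *name;
   int (*area)(struct shape *s);   /* overridable, like Mesa's function-pointer fields */
};

/* "Child": the parent is the FIRST member, so C99 6.7.2.1p13 guarantees
 * it sits at offset 0 and pointer casts in both directions are safe. */
struct rect {
   struct shape base;
   int w, h;
};

static int rect_area(struct shape *s)
{
   struct rect *r = (struct rect *) s;  /* downcast: we know it's a rect */
   return r->w * r->h;
}

/* The constructor fills in the vtable at creation time, as Mesa does. */
struct shape *rect_create(int w, int h)
{
   struct rect *r = malloc(sizeof *r);
   if (!r)
      return NULL;
   r->base.name = "rect";
   r->base.area = rect_area;
   r->w = w;
   r->h = h;
   return &r->base;                     /* upcast: child -> parent */
}
```

A caller holding only the `struct shape *` still dispatches to `rect_area` through the function pointer, which is exactly how the `__GLXDRI*` structs below behave.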

__glXInitialize() takes a Display pointer and returns a __GLXdisplayPrivate pointer. One of the fields in the return type is a pointer to the original Display object — a looser variant of the subclassing described earlier (aggregating a pointer rather than embedding the parent at offset 0). This function is tasked with filling in the members of the __GLXdisplayPrivate struct, which is defined in src/glx/x11/glxclient.h. A shortened copy of the struct is pasted below:

struct __GLXdisplayPrivateRec
{
   Display *dpy;
   int majorOpcode;
   int majorVersion, minorVersion;
   const char *serverGLXvendor;
   const char *serverGLXversion;
   __GLXscreenConfigs *screenConfigs;
#ifdef GLX_DIRECT_RENDERING
   __GLXDRIdisplay *driswDisplay;
   __GLXDRIdisplay *driDisplay;
   __GLXDRIdisplay *dri2Display;
#endif
};

As you can see, there's not actually that much information there. The first thing the function does is make sure that the server supports the GLX extension, which lets us fill in the majorOpcode field. Then we query the server's GLX version and make sure it's one we can deal with. Then we run dri2CreateDisplay(), driCreateDisplay(), and driswCreateDisplay() to populate dri2Display, driDisplay, and driswDisplay, respectively.
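The version check amounts to a min-of-two-versions negotiation: the usable GLX version is the lower of what the client library and the server each support. A sketch (the struct and function names here are mine, not Mesa's):

```c
#include <assert.h>

struct glx_version { int major, minor; };

/* Pick the lower of the client's and the server's supported versions. */
struct glx_version glx_negotiate(struct glx_version client,
                                 struct glx_version server)
{
   if (server.major < client.major ||
       (server.major == client.major && server.minor < client.minor))
      return server;
   return client;
}
```

So a client library that knows GLX 1.4 talking to a server that only speaks 1.2 must restrict itself to the 1.2 feature set.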

I'll focus on DRI2 here to the exclusion of the other two. The full signature of the function is "__GLXDRIdisplay * dri2CreateDisplay(Display * dpy)", defined in src/glx/x11/dri2_glx.c. The body isn't actually that complicated. Here's the child struct definition:

struct __GLXDRIdisplayPrivateRec
{
   __GLXDRIdisplay base;
   int driMajor;
   int driMinor;
   int driPatch;
};

It populates the major and minor versions with a call to DRI2QueryVersion(), defined in src/glx/x11/dri2.c, which just asks the X11 server for the DRI2 version. In fact, every function in that file just sends a command to the X11 server and waits for a response back, and each one's name starts with the string "DRI2" (in all caps).

Here's the parent struct definition, in src/glx/x11/glxclient.h:

struct __GLXDRIdisplayRec
{
   void (*destroyDisplay) (__GLXDRIdisplay * display);
   __GLXDRIscreen *(*createScreen) (__GLXscreenConfigs * psc, int screen,
                                    __GLXdisplayPrivate * priv);
};

This sure looks like a vtable! dri2CreateDisplay simply initializes these two fields to two static functions, defined in the same file.
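Putting the last two structs together, the body of dri2CreateDisplay() boils down to the following shape. This is a compilable simulation with simplified, made-up `my_*` names (the real types live in glxclient.h): allocate the child, fill in the base's vtable slots with static functions, and hand back a pointer to the base:

```c
#include <assert.h>
#include <stdlib.h>

struct my_dri_display {                    /* stand-in for __GLXDRIdisplay */
   void (*destroyDisplay)(struct my_dri_display *dpy);
};

struct my_dri_display_private {            /* stand-in for __GLXDRIdisplayPrivateRec */
   struct my_dri_display base;             /* base first, at offset 0 */
   int driMajor, driMinor, driPatch;
};

static void my_destroy_display(struct my_dri_display *dpy)
{
   free(dpy);   /* base is at offset 0, so this frees the whole child struct */
}

struct my_dri_display *my_create_display(void)
{
   struct my_dri_display_private *pdp = calloc(1, sizeof *pdp);
   if (!pdp)
      return NULL;
   pdp->driMajor = 2;                      /* really filled in by DRI2QueryVersion() */
   pdp->base.destroyDisplay = my_destroy_display;
   return &pdp->base;                      /* callers only ever see the base */
}
```

Callers never see the private struct at all; they only hold the base pointer and call through its function pointers.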

Now, back to __glXInitialize(). The next thing we do is call AllocAndFetchScreenConfigs() to populate the screenConfigs field. This field is an array of __GLXscreenConfigs pointers, one for each attached screen on the X11 display. The full function signature is "static Bool AllocAndFetchScreenConfigs(Display * dpy, __GLXdisplayPrivate * priv)". Here's the __GLXscreenConfigs struct:

struct __GLXscreenConfigsRec
{
   const char *serverGLXexts;
   char *effectiveGLXexts;
#ifdef GLX_DIRECT_RENDERING
   __DRIscreen *__driScreen;
   const __DRIcoreExtension *core;
   const __DRIlegacyExtension *legacy;
   const __DRIswrastExtension *swrast;
   const __DRIdri2Extension *dri2;
   __glxHashTable *drawHash;
   Display *dpy;
   int scr, fd;
   void *driver;
   __GLXDRIscreen *driScreen;
   const __DRIconfig **driver_configs;
   const __DRIcopySubBufferExtension *driCopySubBuffer;
   const __DRIswapControlExtension *swapControl;
   const __DRIallocateExtension *allocate;
   const __DRIframeTrackingExtension *frameTracking;
   const __DRImediaStreamCounterExtension *msc;
   const __DRItexBufferExtension *texBuffer;
   const __DRI2flushExtension *f;
#endif
   __GLcontextModes *visuals, *configs;
   unsigned char direct_support[8];
   GLboolean ext_list_first_time;
};

So, we iterate over the screens on the display, and for each one do two things: first we call getVisualConfigs() and getFBConfigs() on the screen, then we try to create a relevant screen.

getVisualConfigs() and getFBConfigs() populate the visuals and configs fields in the __GLXscreenConfigsRec struct. They do it by simply asking the X11 server for the visuals and configs (xGLXGetVisualConfigsReq and xGLXGetFBConfigsReq), then converting the data the server gives us into our own representation (__glXInitializeVisualConfigFromTags()).

This creation step prefers filling in the screen with a DRI2 screen, then a DRI screen, then a software-rasterized (swrast) screen. So it first tries to call "(*priv->dri2Display->createScreen) (psc, i, priv)"; if that fails, it tries "(*priv->driDisplay->createScreen) (psc, i, priv)"; and if that fails too, it finally tries "(*priv->driswDisplay->createScreen) (psc, i, priv)".
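That fallback chain can be simulated in a few lines. Everything here is a made-up stand-in (the real createScreen implementations live in the three backend source files); the point is just the "first success wins" loop:

```c
#include <assert.h>
#include <stddef.h>

struct screen { const char *kind; };        /* stand-in for __GLXDRIscreen */
typedef struct screen *(*create_screen_fn)(int screen_num);

/* Try each backend in order of preference; first one that succeeds wins. */
struct screen *pick_screen(create_screen_fn backends[], int n, int screen_num)
{
   for (int i = 0; i < n; i++) {
      if (!backends[i])
         continue;
      struct screen *s = backends[i](screen_num);
      if (s)
         return s;
   }
   return NULL;                             /* no direct rendering at all */
}

/* Stub backends: pretend DRI2 is unavailable but swrast works. */
static struct screen swrast_screen = { "swrast" };
static struct screen *create_dri2_screen(int n)   { (void) n; return NULL; }
static struct screen *create_swrast_screen(int n) { (void) n; return &swrast_screen; }
```

With the stubs above, the DRI2 attempt fails and the loop falls through to the software screen, mirroring what happens on a system without working DRI2.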

Once again, I'll only focus on DRI2. dri2CreateScreen(), defined in src/glx/x11/dri2_glx.c, is actually somewhat complicated, because Mesa's DRI implementation is driver-based. The full function signature is "static __GLXDRIscreen * dri2CreateScreen(__GLXscreenConfigs * psc, int screen, __GLXdisplayPrivate * priv)". We've got to figure out which driver to load, load it, and then delegate to the driver. The first thing we do is run DRI2Connect() to ask the X11 server for the name of the driver to load (this ultimately comes from the DDX), and for the device name to run our libdrm IOCTLs on (that's how libdrm is implemented: IOCTLs on an fd. The source for libdrm is really simple). The X11 server replies with this information, and we call driOpenDriver(), defined in src/glx/x11/dri_common.c, with the name of the driver. This function has some logic in it, but generally it runs dlopen() on drivername_dri.so and returns the void *.

We then dlsym() the symbol named by the __DRI_DRIVER_EXTENSIONS macro, defined in include/GL/internal/dri_interface.h. This is an array of pointers to __DRIextension objects, __DRIextension being a base class using the trick mentioned above. We then iterate through the extensions (since we have the layout of the struct) to find the ones we care about (namely __DRI_CORE and __DRI_DRI2). Here's the layout of the parent extension struct:

struct __DRIextensionRec {
    const char *name;
    int version;
};

When we find one with the name that we're looking for, we can just cast the pointer to the correct type. The child classes of __DRIextension objects are essentially vtables. Here's the __DRIdri2Extension struct definition, for example:

struct __DRIdri2ExtensionRec {
    __DRIextension base;
    __DRIscreen *(*createNewScreen)(int screen, int fd,
                                    const __DRIextension **extensions,
                                    const __DRIconfig ***driver_configs,
                                    void *loaderPrivate);
    __DRIdrawable *(*createNewDrawable)(__DRIscreen *screen,
                                        const __DRIconfig *config,
                                        void *loaderPrivate);
    __DRIcontext *(*createNewContext)(__DRIscreen *screen,
                                      const __DRIconfig *config,
                                      __DRIcontext *shared,
                                      void *loaderPrivate);
};
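The scan over the dlsym()'d array can be simulated like this. The structs mirror the shape of dri_interface.h's, but the extension names and the "answer" slot are made up for the demo:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ext_base { const char *name; int version; };   /* like __DRIextensionRec */

struct ext_core {                                     /* a child "extension" / vtable */
   struct ext_base base;
   int (*answer)(void);
};

static int forty_two(void) { return 42; }

static struct ext_core core_ext = { { "DRI_Core", 1 }, forty_two };
static struct ext_base misc_ext = { "DRI_Misc", 1 };

/* What the driver exports: a NULL-terminated array of base pointers. */
static const struct ext_base *driver_extensions[] = {
   &misc_ext, &core_ext.base, NULL
};

const struct ext_base *find_extension(const struct ext_base **exts,
                                      const char *name)
{
   for (int i = 0; exts[i] != NULL; i++)
      if (strcmp(exts[i]->name, name) == 0)
         return exts[i];
   return NULL;
}
```

Once find_extension() returns a match, casting the base pointer to the child type exposes the vtable: `((const struct ext_core *) find_extension(driver_extensions, "DRI_Core"))->answer()`.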

Back to dri2CreateScreen(). Once we've looped through the extensions, assuming we've found all the ones we're looking for, we actually open the device name that we got back from the X11 server. We then run drmGetMagic() on the fd (part of libdrm, so the kernel answers this request) and feed the result into DRI2Authenticate(), which allows the server and the kernel to match up their clients. At this point, we actually do the delegation and ask the DRI driver to create the screen for us; this looks like "psc->__driScreen = psc->dri2->createNewScreen(...)". We then call driBindExtensions() to bind all the optional extensions that the DRI driver may have (the ones that weren't required), which sets all those const pointers inside the __GLXscreenConfigsRec struct.

There's one more thing to do: the DRI driver might not actually support all the visuals that the X11 server previously returned in getVisualConfigs() and getFBConfigs(). So we have to loop through the X11 server's modes and try to match each one up with one of the modes the DRI driver returned to us via createNewScreen(). This is done in driConvertConfigs(), in src/glx/x11/dri_common.c; the old psc->configs are overwritten by the new filtered list that driConvertConfigs() outputs.

At this point, we can actually start creating the output __GLXDRIscreen and start populating it. As you can see, __GLXDRIscreen is essentially just a vtable:

struct __GLXDRIscreenRec
{
   void (*destroyScreen) (__GLXscreenConfigs * psc);
   __GLXDRIcontext *(*createContext) (__GLXscreenConfigs * psc,
                                      const __GLcontextModes * mode,
                                      GLXContext gc,
                                      GLXContext shareList, int renderType);
   __GLXDRIdrawable *(*createDrawable) (__GLXscreenConfigs * psc,
                                        XID drawable,
                                        GLXDrawable glxDrawable,
                                        const __GLcontextModes * modes);
   void (*swapBuffers) (__GLXDRIdrawable * pdraw);
   void (*copySubBuffer) (__GLXDRIdrawable * pdraw,
                          int x, int y, int width, int height);
   void (*waitX) (__GLXDRIdrawable * pdraw);
   void (*waitGL) (__GLXDRIdrawable * pdraw);
};

So, populating this is straightforward enough: point each function pointer at a static function defined in the local file, then return the struct.

Now we're almost done with __glXInitialize(); there's only one last thing to do in that function. __glXInitialize() actually caches its output __GLXdisplayPrivate on an XExtData list. In particular, one of the first steps the function performs (which I previously left out) is to call XFindOnExtensionList() to see if we've already got our output cached. However, instead of simply storing the __GLXdisplayPrivate object, the actual object that's cached is an XExtData object which holds the output __GLXdisplayPrivate in its private_data field. So the last step is to construct this XExtData object, set its relevant fields, and call XAddToExtensionList() to remember it. Now we're done :-)
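The find-or-create idiom here is easy to demonstrate with a plain linked list. XExtData's real fields are richer and the real lookups go through XFindOnExtensionList()/XAddToExtensionList(); this stripped-down version keeps only the number and private_data fields:

```c
#include <assert.h>
#include <stdlib.h>

struct ext_data {                  /* stripped-down XExtData */
   int number;                     /* which extension this entry belongs to */
   void *private_data;
   struct ext_data *next;
};

/* Return the cached private_data for `number`, building it on first use. */
void *find_or_create(struct ext_data **list, int number,
                     void *(*create)(void))
{
   for (struct ext_data *e = *list; e != NULL; e = e->next)
      if (e->number == number)
         return e->private_data;            /* cache hit */

   struct ext_data *e = malloc(sizeof *e);  /* miss: build and remember */
   e->number = number;
   e->private_data = create();
   e->next = *list;
   *list = e;
   return e->private_data;
}

static int builds;                          /* counts how often create ran */
static void *make_private(void) { builds++; return &builds; }
```

Calling find_or_create() twice with the same number returns the same pointer and only runs the expensive constructor once, which is exactly why a second __glXInitialize() on the same Display is cheap.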

Alright, cool. Now I've got a handle on the basic setup that's performed when you first start using Mesa 3D. Now when successive functions start looking through data structures, I'll know what they mean! I'll keep reading Mesa sources and post more about what I learn when/if I have time.

* Technically not, but it’s close enough that you would never know the difference
