Saturday, December 12, 2015

Binary Compatibility of Windows Programs

COM


Around 1993, there was an issue around binary compatibility of object-oriented programs. The problem is that callers and callees must agree on the internal layout of structs in memory. This means, however, that if a callee is updated (as libraries often are) and changes this memory layout, the program will not behave as expected.

Microsoft came up with a solution to this problem by creating an “object model” (named “COM”) which objects that traverse library boundaries should adhere to. The object model has a few rules:
  • Objects must not have any data that is exposed. Therefore, the only thing publicly available from an object is functions to call on the object.
  • The object’s functions are represented in memory as an array of pointers to functions. The order of this array matches the order that the functions are declared in source code.
Therefore, given these requirements, everybody can agree on how to interact with objects; all interactions with the objects must go through the object’s functions, and everyone agrees on where the functions exist.

You may recognize this design as an “interface.” The function array is a vtable. Therefore, mapping this idea to C and C++ is very natural. Because the only things that exist in the objects is a vtable, there are no problems with padding or alignment of data members or anything like that. Inheritance is implemented by simply adding more function pointers on to the end of the array.

This memory layout also means that many languages can interact with these COM objects.

There are two more interesting pieces to this object model: there are 3 functions that every object must implement. This can be thought of as the root of the inheritance hierarchy. In C++, this base class is known as IUnknown. The three functions are:
  • AddRef()
  • QueryInterface()
  • Release()
AddRef() and Release() are for reference counting; every COM object is reference counted using these functions.

QueryInterface() is pretty interesting. It is a way of converting one object into another type. The idea is that all COM types have an associated ID number. When you call QueryInterface(), you pass in an identifier of a type that you want. If the object knows how to represent itself as that type, it returns an interface of the correct type. Therefore, it’s sort of like a dynamic_cast, except that the source and destination types don’t have to be related at all. This is actually a really powerful concept, because it provides a mechanism for interoperability between APIs. A new library can be released, and its objects might want to support being used as the objects in another library; this can naturally be achieved by supporting QueryInterface().

In every COM library I’ve used, functions use out params instead of return values. Instead, they return an HRESULT error code. Therefore, it’s common to see every call to a COM object wrapped in a block which checks the return code.

.NET


Around 10 years later, Java was becoming popular. One of Java’s goals was to allow programs to run on any machine without requiring recompilation. It achieved this goal by compiling to a “virtual machine” bytecode. Then, at runtime, an interpreter would run the bytecode.

This approach means that all memory accesses and function calls operate through an interpreter, so the interpreter can make sure that nothing bad happens (for example: a program can never segfault). It also means that the virtual machine is free to implement any object model it wants; all object accesses pass through it, so it is free to, for example, move objects around in memory whenever it wants. This naturally leads to a garbage-collected language, where the garbage collector is implemented in the virtual machine.

(Now, I’d like to make an aside about this “managed code” business. It isn’t really true that all Java programs are interpreted; the virtual machine will JIT the program and point the program counter into the JITted code. Whenever the interpreter needs to do some work, it just makes the generated bytecode call back into its own code. Therefore, nothing is being done here that couldn’t be done with a simple software library. Indeed; this managed “runtime” is the same fundamental idea as the raw C “runtime”; it’s just that many operations which would be raw instructions in C are instead implemented as function calls into the runtime. If you are diligent about all your object accesses, you can make your C++ program garbage collected. “Managed code” doesn’t really mean anything mechanical; instead, it is a way of thinking about the runtime model of your program.)

Microsoft wanted to be able to use this model of “managed code” in their own apps, so they invented a system quite similar to Java’s. You would compile your apps to a “CLI” form, which is a platform-agnostic spec describing a virtual machine. Then, the Microsoft CLR interprets and runs your program. A new language (C#) was invented which nicely maps all its concepts to CLI concepts.

There are a couple extra benefits of having a well-specified intermediary form. One is that there can be multiple languages which compile to your CLI. Indeed, Microsoft has many languages which can be compiled to the CLI, such as F# and VB.NET. Another benefit is that the standard library can exist in the runtime, which means that all of those languages can call into the standard library for free. Each language doesn’t need its own standard library.

A pretty interesting benefit of this virtual machine is that it solves the binary compatibility problem. The fundamental problem with binary compatibility is that one library might reach inside an object vended by another library; with a virtual machine, that “reaching” must go through the runtime. The runtime can then make sure that everything will work correctly. So, we now have another solution for the binary compatibility problem.

C++/CLI


Remember how I mentioned earlier how that this concept of “runtime” is really just a library of functions that you should call when you want to do things (like call a function or access a property of an object)? Well, C++ supports linking with libraries… there is no reason why one shouldn’t be able to call into these runtime libraries from a C++ program.

The difficulty, however, is being diligent with your accesses. Every time you do anything with one of these “managed” objects, you must call through the runtime. This is a classic case of where the language can help you; you could imagine augmenting C++ to be able to naturally represent the concept of a managed object, and enforcing all interactions with such objects to be done through the runtime.

This is exactly what C++/CLI is. It is a superset of the C++ language, and it understands what managed objects are. These managed objects are fundamentally different from C++ objects; they are represented using different syntax. For example, when you create a managed object, you use the keyword “gcnew” instead of “new”, in order to remind you that these objects are garbage collected. A reference to a managed object uses the ^ character instead of the * character. Creating a reference to a managed object is done with the % character instead of the & character. These new types are a supplement to the language; all the regular C++ stuff still exists. This way, the compiler forces you to interact with the CLI runtime in the correct way.

Windows Runtime


The currently accepted way to write Windows programs is using the Windows Runtime (WinRT). You may think that WinRT is a runtime just like the .NET runtime discussed above; however, this isn’t quite right. WinRT objects are integrated with the .NET runtime, and are considered “managed” in the same way that .NET objects are considered “managed.” However, they aren’t binary compatible with .NET objects; you can’t interact with WinRT objects using the same functions you use to interact with .NET objects.

Therefore, C++/CLI doesn’t work for WinRT objects. Instead, a similarly-looking language, C++/CX, was invented to interact with these WinRT objects. It keeps some of the syntax from C++/CLI (such as ^ and %) but replaces others (such as “ref new” instead of “gcnew”).

No comments:

Post a Comment