Saturday, September 29, 2018

Texture Sampling

Textures are one of the fundamental data types in 3D graphics. Any time you want to show an image on a 3D surface, you use a texture.

Texture Types


First of all, there are many kinds of textures. The simplest kind to understand is a 2D texture, whose purpose is to act like a rectangular image. The format of each element in the image is configurable; you can specify that it’s a float, or an int, or 4 floats (one for each channel of RGBA), etc. These “elements” are usually called “texels.” There are also 1D textures and 3D textures, which act analogously.

Then, you’ve got 1D texture arrays, and 2D texture arrays, which are not simply arrays-of-textures. Instead, they are distinct types, where each element in the array is the relevant texture type. They are their own distinct resource types because GPUs can operate on them in hardware, so the array doesn’t have to be implemented in software. As such, the hardware restricts each element in the array to have the same dimensions. If you don’t like this requirement, you can create a software array of textures, and it will go slower but the requirement won’t apply. (Or you could even have an array of texture arrays!)

Mipmaps


There’s one other important piece to textures: mipmaps. Generally, textures are mapped onto arbitrary 3D geometry, which means that the number of on-screen pixels the texture is stretched over is totally arbitrary. Using regular projection matrices, the farther the geometry is from the viewer, the fewer pixels the texture is mapped onto.

Consider drawing a single pixel of geometry that is far away from the camera. Here, the entire texture will be squished to fit into a small number of pixels, so that single pixel will be covered by many texels. So, if the renderer wanted to compute an accurate color for that pixel, it would have to average all the covered texels together. However, what if that geometry moves closer to the camera, such that each pixel contains only ~1 texel? In this situation, no averaging is necessary; you can just do a single read of the texture data.

So, if the texture is big relative to the size it’s drawn on-screen, that’s a problem, but if it’s small, that’s no problem. Think about that for a second - big data sizes are a problem, but small data sizes are okay. So what if the system could just reduce the big texture to a small texture as a preprocess? In fact, if there were a collection of reductions of various sizes, there would always be a size that is appropriate for the number of pixels being drawn.

That’s exactly what a mipmap is. If a 2D texture has dimensions m * n, the object also has storage for an additional level of m/2 * n/2, and an additional level of m/4 * n/4, etc, down to a single texel. This doesn’t even waste that much memory: each 2D level has a quarter as many texels as the previous one, and it’s provable that x + x/4 + x/16 + … = 4x/3, so the memory overhead is only a third of the original texture (even in the worst case, a 1D texture, x + x/2 + x/4 + … = 2x, so the overhead never exceeds one extra texture’s worth). This storage requirement also assumes that texture sizes are always powers-of-two, which is generally required, though nowadays many implementations have extensions that relax this requirement.
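For the record, the two geometric series written out:

\[ x + \frac{x}{4} + \frac{x}{16} + \cdots = \frac{x}{1 - \frac{1}{4}} = \frac{4x}{3} \qquad \text{(2D: each level has a quarter of the texels)} \]
\[ x + \frac{x}{2} + \frac{x}{4} + \cdots = \frac{x}{1 - \frac{1}{2}} = 2x \qquad \text{(1D worst case)} \]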

So, naïvely, addressing a 2D texture requires 3 components: x, y, and which mipmap level. 3D textures require 4 components, and 1D textures require 2 components. 2D texture arrays require 4 components (there’s an extra one for the layer in the array) and 1D texture arrays require 3 components. With these components, the system only has to do a single read at runtime - no looping over texels required.

Automatic Miplevel Selection


The shader API, however, can calculate the mipmap level for you, so you don’t have to do that yourself in the shader (though you can if you want to). The key here is to figure out how many texels per pixel the texture is getting squished down to. If the answer is 2, you should use the second mipmap level. If the answer is 4, you should use the third mipmap level (since each level is half as large as the previous in each dimension).

So how does the system know how many texels cover your pixel? Well, if you think about it, this is the screen-space derivative of the sampling coordinate in the base level. Stated differently, it’s the rate of change of the texture coordinate (in texels) across the screen. So, how do you calculate this?

If the code you’re writing is differentiable, you could just calculate it yourself in closed form and use that. However, the system can approximate it automatically, using the fact that the GPU scheduler can schedule fragment shader threads however it likes. If the scheduler chooses to dispatch fragment shader threads in 2x2 blocks, then the threads in the block can share data with each other. Then, approximating this derivative is easy: it’s simply rise over run - the difference of adjacent sampling coordinates divided by the difference of adjacent screen-space coordinates. Because we are sampling adjacent pixels, the difference of adjacent screen-space coordinates is just 1, so this derivative is calculated by just subtracting the sampling positions of adjacent pixels. The pixels in the 2x2 block can share the result. (Of course this sharing only works if every fragment shader in the 2x2 block is at the same point in the shader so they can cooperate together.)
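Here’s a rough CPU-side sketch of that finite-difference trick, assuming a hypothetical 2x2 quad of texture coordinates measured in texels of the base level (the names are mine, not any particular API’s):

// The texture coordinate (in texels) each fragment of a hypothetical 2x2 quad wants to sample.
struct Quad {
    var topLeft: SIMD2<Float>
    var topRight: SIMD2<Float>
    var bottomLeft: SIMD2<Float>
    var bottomRight: SIMD2<Float>
}

// Approximate the screen-space derivatives by differencing neighbors.
// Adjacent pixels are exactly 1 apart in screen space, so no division is needed.
func derivatives(of quad: Quad) -> (ddx: SIMD2<Float>, ddy: SIMD2<Float>) {
    let ddx = quad.topRight - quad.topLeft      // change per pixel moving right
    let ddy = quad.bottomLeft - quad.topLeft    // change per pixel moving down
    return (ddx, ddy)
}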

So, the system does this subtraction of adjacent sampling coordinates to estimate the derivative, and takes the log base 2 of the derivative to select which miplevel to use. The result of this may not be exactly integral, so the sampler describes whether to just round to the nearest integer miplevel or to read both straddling miplevels and use a weighted average. You can also short-circuit this computation by explicitly specifying derivatives to use (which means the derivatives won’t be automatically calculated, but everything else will work the same way) or by just specifying which miplevel to use directly.
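Putting the level selection into code - just a sketch, not any real API, and assuming the derivative has already been reduced to a single “texels per pixel” number as described in the next section:

import Foundation

// Given the number of texels that map onto one pixel, pick the straddling miplevels.
func selectMiplevels(texelsPerPixel rho: Double, levelCount: Int) -> (lower: Int, upper: Int, blend: Double) {
    let lambda = log2(max(rho, 1))                     // level 0 when one texel (or less) per pixel
    let clamped = min(lambda, Double(levelCount - 1))
    let lower = Int(clamped.rounded(.down))            // the more detailed of the two levels
    let upper = min(lower + 1, levelCount - 1)         // the less detailed of the two levels
    return (lower, upper, clamped - Double(lower))     // blend is the weight given to `upper`
}
// A “nearest” mipmap filter would round lambda and read one level; a “linear” filter reads
// both straddling levels and mixes them with `blend`.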

Dimension Reduction


But I’ve breezed over one of the details here - 2D textures have 2-dimensional texel coordinates, and screens also have 2-dimensional coordinates, so these derivatives are vectors, not scalars. How do we reduce them to a single miplevel? The Vulkan spec doesn’t actually describe exactly how to reduce the multi-dimensional derivatives into a single scalar, but it does say in section 15.6.7:

ρ_x and ρ_y may be approximated with functions f_x and f_y, subject to the following constraints:
f_x is continuous and monotonically increasing in each of m_ux, m_vx, and m_wx
f_y is continuous and monotonically increasing in each of m_uy, m_vy, and m_wy
max(|m_ux|, |m_vx|, |m_wx|) <= f_x <= sqrt(2) * (|m_ux| + |m_vx| + |m_wx|)
max(|m_uy|, |m_vy|, |m_wy|) <= f_y <= sqrt(2) * (|m_uy| + |m_vy| + |m_wy|)

(Here, m_ux means the derivative of the texture coordinate u with respect to screen-space x, measured in texels, and so on for the other components.) So, you reduce the n-dimensional derivative of the texture coordinate to a scalar by making up a formula that fits the above requirements. You apply the function twice - once for the horizontal screen direction, and once for the vertical screen direction.
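One function that satisfies those constraints - and, as far as I can tell, a common choice - is just the length of the derivative vector:

// One legal f_x: the Euclidean length of (m_ux, m_vx, m_wx). It is at least as big as the
// largest component and no bigger than the sum of the components, so it sits inside the
// bounds the spec gives.
func scaleFactor(_ mux: Double, _ mvx: Double, _ mwx: Double) -> Double {
    return (mux * mux + mvx * mvx + mwx * mwx).squareRoot()
}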

So this tells you (roughly) how many texels fit in the pixel vertically, and how many texels fit in the pixel horizontally. But these values don’t have to be the same. Imagine looking out in first-person across a rendered floor. There are many texels squished vertically, but not that many horizontally.

This is called anisotropy. The amount of anisotropy is just the ratio of these two values. By default, texture sampling will just use the larger of these two values when figuring out which miplevel to use, which avoids aliasing along the more-squished direction at the cost of over-blurring the other one. Remember - miplevels are zero-indexed, so the smaller the index, the more data is in that level, and the smaller the miplevel index, the higher the level of detail. However, there are some techniques in this area that involve doing extra work to improve the quality of the result.
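Anisotropic filtering is the main such technique: take several samples along the more-squished axis, each from a sharper miplevel, instead of one blurry sample. Roughly, and this is my paraphrase of the kind of math the spec describes rather than spec text:

import Foundation

// Sketch: divide the larger scale factor by the anisotropy ratio (clamped to the
// sampler's maximum), which selects a sharper miplevel, and take that many samples
// spread along the elongated axis.
func anisotropicSelection(rhoX: Double, rhoY: Double, maxAnisotropy: Double) -> (lambda: Double, sampleCount: Int) {
    let rhoMax = max(rhoX, rhoY)
    let rhoMin = max(min(rhoX, rhoY), .ulpOfOne)    // avoid dividing by zero
    let eta = min(rhoMax / rhoMin, maxAnisotropy)   // how elongated the pixel's footprint is
    return (log2(rhoMax / eta), Int(eta.rounded(.up)))
}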

Wrapping Things Up


At this point, the sampler provides shader authors some control over the miplevel selection. The sampler / optional arguments can include a “LOD Bias” which gets added to this value, so the author can get higher or lower detail as necessary. The sampler / optional arguments can also include a “LOD Clamp” which is applied here - useful if, for example, not all the miplevels of the texture have their contents populated yet.
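In other words, the final level-of-detail is something like this (names mine):

// Apply the sampler's LOD bias, then clamp to the allowed range of levels.
func finalLOD(computed lambda: Double, bias: Double, minLOD: Double, maxLOD: Double) -> Double {
    return min(max(lambda + bias, minLOD), maxLOD)
}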

So, now that you have a miplevel, you can do the rest of the operation. If the sampler says the sampling coordinate is normalized, you denormalize it by multiplying by the dimensions of the miplevel, and modulus / mirror / whatever the sampler tells you to do. Then, depending on the sampler settings, you either round the denormalized coordinates to the nearest integer, or you read all the straddling texels and perform a weighted average. Then, if the sampler tells you to, you do it all again at the next miplevel, and perform yet another weighted average.

There’s one last tiny detail I’ve skipped over, and that is the fact that texel elements are considered to lie at the center of the texel. So, if you have a 1D texture with 2 texels, where one is black and one is white, 1/4 of the way through the texture will be full black, and 3/4 of the way through the texture will be full white, and 1/4 - 3/4 will be a gradient from black to white. But what is drawn from 0 - 1/4 and from 3/4 - 1? What about values less than 0 or greater than 1? The sampler allows for configuring this. The modulus / mirroring operation results in a value that is either on the interior of the texture, or 1 texel around the edge. These texels around the edge either get values from being repeated / mirrored / whatever, or they can just be set to a constant “border color.” This color is fed as input to the weighted average calculation, so everything just works correctly.
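Here’s a sketch of that 1D example with linear filtering, texel centers at half-integer positions, and the out-of-range behavior configured as a constant border color (all names mine):

// Sample a 1D texture with linear filtering. Texel i is centered at i + 0.5, and reads
// outside the texture fall back to a constant border color.
func sample1D(texels: [Double], borderColor: Double, normalizedCoordinate: Double) -> Double {
    let x = normalizedCoordinate * Double(texels.count) - 0.5   // denormalize, shift to texel centers
    let lower = Int(x.rounded(.down))
    let blend = x - Double(lower)
    func read(_ i: Int) -> Double {
        return (0..<texels.count).contains(i) ? texels[i] : borderColor
    }
    return read(lower) * (1 - blend) + read(lower + 1) * blend
}

// With two texels [black, white] = [0, 1]:
// sample1D(texels: [0, 1], borderColor: 0, normalizedCoordinate: 0.25) is 0.0 (full black)
// sample1D(texels: [0, 1], borderColor: 0, normalizedCoordinate: 0.5)  is 0.5 (the middle of the gradient)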

Sunday, September 9, 2018

Comparison of Entity Framework to Core Data

Object Relational Mapping libraries connect in-memory object graphs to relational databases. Object-oriented programming is built upon the idea that there is an in-memory object graph, where each object is an instance of a class. An ORM is the software that can save that object graph to a database, either on-disk or using a service across the network.

Entity Framework is Microsoft’s premier ORM library, and Core Data is Apple’s premier ORM library. Both have the same goals - to persist an object graph to a database - but they were developed by different companies for different languages. It stands to reason that they made some different design choices.

Which Entity Framework?


Microsoft is infamous for creating multiple ways to do the same thing, and ORM libraries are no different. There are two versions of Entity Framework: Entity Framework 6, and Entity Framework Core. The documentation says that Entity Framework Core is the new hotness. Also, Entity Framework Core is open source.

So let’s start using Entity Framework Core, right? Well, not so fast. It turns out that you have to pick a runtime that Entity Framework Core will run on top of.

Which Runtime?


Entity Framework was originally developed for .NET. So that’s fine, but it turns out there are multiple versions of .NET.
  • .NET Framework only runs on Windows
  • .NET Core is written by Microsoft, and runs on Windows, Linux, and macOS. The documentation says that .NET Core is better than .NET Framework. Also, .NET Core is open source.
  • .NET Standard is just a standard. It isn’t a piece of software - it’s a specification that describes a level of support that a runtime needs to have in order to be compliant. Xamarin is another .NET runtime that supports the .NET Standard (and it runs on iOS / Android). Targeting the standard means your app will work on every compliant .NET implementation, but it won’t have access to some of the libraries only present in .NET Core.
  • The Universal Windows Platform is a runtime compliant with the .NET Standard. The Entity Framework documentation says that UWP is now supported. One interesting note: as part of the compilation process, the platform-independent .NET bytecode is run through the .NET Native toolchain, which produces a platform-dependent binary. They say this is to improve performance. (So I guess this means that the Universal Windows Platform isn’t really universal?) This compilation is somewhat lossy because reflection doesn’t fully work in native apps, and it sounds like Entity Framework had some bugs here that they had to fix.
There’s an example in the Entity Framework Core documentation about how to use it with the Universal Windows Platform, and UWP is the new hotness, so I’ll use that. If you dig into the example, you’ll find that the Entity Framework tools don’t work with UWP projects, so they had to make a dummy .NET Core project with nothing inside it, just to run the tools. How unfortunate.

Getting Entity Framework


Entity Framework is not built in to the system. Instead, you’ll have to get it from Visual Studio’s blessed package manager, named NuGet. When you install packages with NuGet, they’re not installed across the whole system; instead, they’re installed only for a single project. NuGet is built in to Visual Studio - simply go to Project -> Manage NuGet Packages to search/install packages.

Entity Framework is designed to be pluggable to different kinds of databases, and each database has its own package inside NuGet. The example uses a SQLite database, so it uses the Microsoft.EntityFrameworkCore.Sqlite package. There is also another package, Microsoft.EntityFrameworkCore.Tools, which includes command-line tools to generate migration code / apply migrations, so that one is included too.



How to get Core Data


It’s already part of the platform, and there’s only one version. Just use it.

High Level


Both libraries have a concept of a “context” which is the thing that holds the link to all the objects in the object graph. For Entity Framework, this is the Microsoft.EntityFrameworkCore.DbContext, and for Core Data, this is the NSManagedObjectContext. When you create an object, you register it with the context, and when you delete an object, you notify the context that it has been deleted. After you’ve done all your modifications, you tell the context to “save,” which stores all the changes in the database.

Entity Framework:
var blog = new Blog { Url = url };
db.Blogs.Add(blog);
db.SaveChanges();


Core Data:
let blog = Blog(context: context, url: url)
try context.save()


Read/Modify/Write operations are also quite similar:

Entity Framework:
var blog = db.Blogs.First();
blog.Url = url;
db.SaveChanges();


Core Data:
let fetchRequest: NSFetchRequest<Blog> = Blog.fetchRequest()
fetchRequest.fetchLimit = 1
let blog = try context.fetch(fetchRequest)[0]
blog.url = url
try context.save()



Context


In Core Data, the NSManagedObjectContext is just a class. When modifications are made to the object graph, the NSManagedObjectContext makes a strong reference to the object (because Swift objects are reference-counted, the distinction between strong and weak references is important). When it gets saved, the NSManagedObjectContext knows what to save.

However, in Entity Framework, the DbContext is magical. The application needs to subclass DbContext, and the subclass needs to have DbSet properties. These DbSets refer to the various tables in the database. When the DbContext’s constructor is run, it uses reflection to inspect itself, find all the DbSet properties, and inspect the generic type argument to determine the data model. It builds up a Microsoft.EntityFrameworkCore.ModelBuilder, and lets you make any last-minute changes you want inside DbContext.OnModelCreating().

Objects


In Core Data, each object in the object graph is represented by NSManagedObject. This object acts like a dictionary; you can “set properties” by using the Key-Value Coding functions value(forKey:) and setValue(_:forKey:). You can get better type-safety if you subclass NSManagedObject for each of your entities, and add typed properties. However, if you do this, you have to make sure that getting/setting these properties calls the Key-Value Coding methods on the inner NSManagedObject. Swift has a helpful keyword, @NSManaged, which does this for you. Going further, Xcode will even generate the subclass for you at compilation time, with the appropriately typed @NSManaged properties, if you select the entity and choose the appropriate value for “Codegen” in the right sidebar. (Or you can use the managedObjectClassName string property on NSEntityDescription when building the NSManagedObjectModel, and Core Data will construct this class at runtime using the Objective-C runtime.)
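For example, such a subclass (whether hand-written or generated) looks roughly like this, using the Blog entity from the examples above:

import CoreData

// @NSManaged tells the compiler not to synthesize storage; the accessors are provided
// at runtime by Core Data and routed through the key-value coding machinery.
@objc(Blog)
public class Blog: NSManagedObject {
    @NSManaged public var url: String?
}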



NSManagedObjects know which context they belong to - their initializer requires you to pass in the context. This is presumably so that, when values get modified, the NSManagedObject can notify the NSManagedObjectContext.

In Entity Framework, each object in the object graph is just a regular object. No subclassing required, and no manifest or custom model creation code either. The DbContext learns about the object’s shape from reflection. This means that the ChangeTracker in the DbContext doesn’t automatically know about changes; instead, it has to call DetectChanges(), which iterates through the known objects. This is done automatically when it’s required.

Connection Between Classes and Data


In Core Data, when the system wants to populate a property of an object, it can do it dynamically, because the property’s accessors are routed through value(forKey:) and setValue(_:forKey:). This way, the accessors don’t have to know the name of the field at compilation time, which is necessary when the data model is only created at runtime.

However, in Entity Framework, objects are just regular classes. This is a problem, though; how can Entity Framework set the correct property on the class when the name of the property is only known at runtime (because the model can be modified at runtime)? Well, it turns out it uses Linq expression trees to build a program at runtime that can set properties whose names are only known at runtime. This is extremely powerful; it looks like you can use Linq to express almost anything that you could write in C#.

Data Model


In Entity Framework, the DbContext constructor uses reflection to discover the data model. You get a chance to modify the model at runtime in DbContext.OnModelCreating(), which is called inside the DbContext’s constructor. Adding an entity to the model requires a class to match that entity. Properties, on the other hand, can be present in the model but absent from the class (Entity Framework calls these “shadow properties”). This is valuable for things like automatically saved date fields.

In Core Data, there is a separate data file that describes the model declaratively (with the file extension .xcdatamodeld). You can edit these with a GUI inside Xcode. This file corresponds to a NSManagedObjectModel, which you can build at runtime instead, if you want. Then, when you bring up the Core Data stack, you can specify this model.

Fetch Queries


In Entity Framework, the DbSet implements the IQueryable interface. This is an interface that represents a query node inside the Linq framework. Functions like .Where() and .OrderBy() operate on these nodes and return other nodes, letting you chain up these operators. These operators aren’t actually applied at the time you call the function; instead they form a sort of retained-mode program. Whenever you actually pull data out of the query at the end, the runtime will look at the chain of operators and figure out how best to apply it (usually by creating SQL that matches the operation). However, some of the operations need to be applied by the client; this works transparently, but it obviously isn’t great for performance.

Core Data uses the same sort of thing, encapsulated by NSPredicate and NSExpression. NSExpression is the same kind of node inside a retained-mode program. These are quite powerful; you can even call arbitrary selectors on arbitrary objects. The big difference between this and Linq is that, in true Objective-C style, NSExpression isn’t typed, but Linq is typed.
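For example, a filtered, sorted fetch looks like this (reusing the Blog entity and context from the examples above):

import CoreData

// The predicate and sort descriptor are data describing the query, not code that runs
// eagerly; Core Data translates them (e.g. into SQL for a SQLite-backed store).
let fetchRequest: NSFetchRequest<Blog> = Blog.fetchRequest()
fetchRequest.predicate = NSPredicate(format: "url CONTAINS %@", "example.com")
fetchRequest.sortDescriptors = [NSSortDescriptor(key: "url", ascending: true)]
let matchingBlogs = try context.fetch(fetchRequest)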

Parallelism


Both Entity Framework and Core Data’s contexts are single-threaded, which means the managed objects all have to live on the same thread as their context. However, fetches and stores involve round trips to databases, which can be quite slow and would block the main thread. Entity Framework gets around this by providing Async versions of the fetching / saving functions. In this model, the objects live on the main thread, but the UI can still be redrawn during the slow database operations.

Core Data has two approaches to this. One way is to host the entire Core Data object graph in another thread. You get this if the NSManagedObjectContext is initialized with the concurrencyType argument set to .privateQueueConcurrencyType. If you do this, the NSManagedObjectContext will create its own private queue, and operations on the NSManagedObjectContext are only valid from that queue. You run code on that queue by using NSManagedObjectContext’s perform(_:) function. Inside the callback, you can execute your fetch requests, build up some data, and post a message back to the main queue with your data (but not with NSManagedObjects!).
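Sketched out, that looks something like this (assuming `coordinator` is the app’s existing persistent store coordinator; only plain values, never the managed objects themselves, get posted back):

import CoreData
import Dispatch

// A context bound to its own private queue; all access to it goes through perform(_:).
let backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
backgroundContext.persistentStoreCoordinator = coordinator

backgroundContext.perform {
    let fetchRequest: NSFetchRequest<Blog> = Blog.fetchRequest()
    guard let blogs = try? backgroundContext.fetch(fetchRequest) else { return }
    let urls = blogs.compactMap { $0.url }   // plain strings, safe to hand to another thread
    DispatchQueue.main.async {
        // update the UI with `urls` here
    }
}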

Alternatively, you can use the main queue, and use NSAsynchronousFetchRequest to create objects asynchronously. As far as I can tell, there is no equivalent call for NSManagedObjectContext.save(), and from my sampling, it appears that NSManagedObjectContext.save() is synchronous (though perhaps it doesn't have to be?)

Entity Framework:
var blog = await db.Blogs.FirstAsync();
blog.Url = url;
await db.SaveChangesAsync();


Core Data:
let fetchRequest: NSFetchRequest<Blog> = Blog.fetchRequest()
fetchRequest.fetchLimit = 1
let asynchronousFetch = NSAsynchronousFetchRequest(fetchRequest: fetchRequest) { (result) in
    let blog = result.finalResult![0]
    blog.url = url
    do {
        try context.save()
    } catch {
        …
    }
}
try context.execute(asynchronousFetch)


Edit: The Core Data Best Practices video from 2012 describes how you can achieve asynchronous saves by using a parent/child NSManagedObjectContext pair. You set the child to live on one thread and the parent to live on the other thread, and when you tell the child to save, it will just push its changes to the other context on the other thread. Then you can asynchronously tell the other thread to save by using perform(_:).
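Sketched out (again assuming `coordinator` is the app’s existing persistent store coordinator):

import CoreData

// The parent owns the connection to the store and lives on its own private queue;
// the child lives on the main queue, and saving it only pushes changes to the parent.
let parent = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
parent.persistentStoreCoordinator = coordinator
let child = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
child.parent = parent

try child.save()          // cheap: transfers the changes to the parent in memory
parent.perform {
    try? parent.save()    // the actual database write, off the main thread
}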

Migrations


In Entity Framework, a migration is modeled as a chunk of code. However, this code is written by one of the tools inside Microsoft.EntityFrameworkCore.Tools. The command line tool saves a snapshot of whatever the current database schema is, and can create a new schema by using the same mechanism that DbContext uses when it creates a schema at runtime. Then, after you’ve created a migration, you can apply it, which involves running the code on your local development machine to upgrade the database to the new version. These tools even have documentation. You have a chance to fine-tune the migration by editing the source code the tool created, because creating the migration code and applying it to the database are two distinct steps. Because the migration is generated code, you can run it in your app instead of on your local development machine.

But wait, not so fast! The command-line tools use reflection on your code to generate a model? Yep. That means the command-line tools build your source code. Then they look in the compiled assembly for the new model. If the command-line tools are supposed to perform the migration, then they’re supposed to connect to the database, too. But wait, how do they connect to the database? Well, your source code connects to the database … and the command-line tools will just run that code. The documentation describes what functions / classes they will look for in your code and run on your local machine.

Core Data handles migrations totally differently. Some simple migrations can happen automatically, right when you open the database (and you can check whether your change is “simple” by using a class function on NSMappingModel.) But, more complicated migrations are described declaratively in a .xcmappingmodel file, which Xcode lets you edit with a GUI. The expressions are described by strings, which (presumably) are the same strings that NSExpression accepts. This file corresponds to a NSMappingModel, which you can construct at runtime instead of loading from a bundle. Then, when you want to run the migration, you can use NSMigrationManager and pass in the NSMappingModel you want it to use. (One gotcha: to create a .xcmappingmodel in Xcode, it has to be between two different versions of the same model. You can create a new version of a model by selecting Editor -> Add Model Version.)
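In code, checking for and setting up such a migration looks roughly like this (assuming the two NSManagedObjectModel versions, oldModel and newModel, are already loaded):

import CoreData

// Ask Core Data whether it can infer the mapping between two model versions (the
// "simple" case); if it can't, this throws and you need an explicit .xcmappingmodel.
let mapping = try NSMappingModel.inferredMappingModel(forSourceModel: oldModel,
                                                      destinationModel: newModel)

// The migration manager then applies a mapping model (inferred or hand-built) to move
// a store file from the old schema to the new one, via its migrateStore(from:...) method.
let migrationManager = NSMigrationManager(sourceModel: oldModel, destinationModel: newModel)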

Configuring the Database


The constructor to Microsoft.EntityFrameworkCore.DbContext requires a Microsoft.EntityFrameworkCore.DbContextOptions, which is built by a Microsoft.EntityFrameworkCore.DbContextOptionsBuilder. C# has this nifty feature called extension methods, where you can declare a free function, give the first argument the “this” keyword, and that free function will appear as if it were inside the class definition. So, the individual database package adds an extension method to the DbContextOptionsBuilder. (I haven’t investigated what the package does inside this function.) Then, the client code calls optionsBuilder.UseSqlite(connectionString), for example. You can use Microsoft.Data.Sqlite.SqliteConnectionStringBuilder() to build the connection string. You do this inside the DbContext.OnConfiguring() function so the command-line tools know how to configure the database.

Core Data works differently. Each persistent store is described via an NSPersistentStoreDescription, which includes a string “type” property. This “type” refers to the registeredStoreTypes registry inside NSPersistentStoreCoordinator, which can be extended with additional subclasses of NSPersistentStore. There are also four built-in strings for well-known store types.
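For example, pointing the stack at a SQLite file (storeURL here is wherever you want the database file to live, and “Model” names the .xcdatamodeld):

import CoreData

// Describe a SQLite-backed store and hand it to the persistent container.
let description = NSPersistentStoreDescription(url: storeURL)
description.type = NSSQLiteStoreType   // one of the built-in store type strings

let container = NSPersistentContainer(name: "Model")
container.persistentStoreDescriptions = [description]
container.loadPersistentStores { _, error in
    if let error = error {
        fatalError("Failed to load store: \(error)")
    }
}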