Litherum: How Cocoa Apps Are Written

Cocoa is the framework on Mac OS X and iOS that you use if you want to make an app. It serves as a high-level API to anything and everything that you ask the platform to do. If you use Cocoa, then all the code in your app is simply business logic, directing Cocoa to do all the heavy lifting for you.

First off, apps have GUIs. On OS X, you’ve got a Window (or multiple) that display(s) stuff. On iOS, you’ve got the whole screen (which, for the sake of brevity, I’m going to call a Window).

One possible system would be to give the developer a canvas the size of the window and tell him/her to go wild. (This actually seems to be the approach that Wayland uses). However, this doesn’t have the benefit of composability or reusability. Indeed, a better approach would be to section off a portion of the window and deal with that piece in isolation - a sort of mini-window inside the window. (Ironically, Microsoft Windows even uses this terminology somewhat.) Then, we can re-use this mini-window. In order to make these things composable, we could put mini-windows inside mini-windows. That way, we can build large pieces out of a collection of small pieces.

In Cocoa, these things are called Views. A view is a rectangular portion of the window which contains its own conceptual canvas. Views are arranged in a tree, so you can put Views inside Views - these subviews will appear within the superview. A View can draw into its own canvas, and all drawing operations are clipped to the view’s bounds. Each View has a “frame” attribute, which refers to where it lies within its superview. Also note that Views are drawn in order, so if sibling views overlap, the last sibling appears on top.
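As a rough illustration of the tree, here is a minimal sketch using AppKit's NSView on OS X. The sizes and variable names are made up for the example.

```swift
import AppKit

// A superview with two sibling subviews whose frames overlap.
let container = NSView(frame: NSRect(x: 0, y: 0, width: 200, height: 100))
let first = NSView(frame: NSRect(x: 10, y: 10, width: 80, height: 80))
let second = NSView(frame: NSRect(x: 50, y: 10, width: 80, height: 80))

// Each frame is expressed in the coordinate system of the superview.
container.addSubview(first)
container.addSubview(second) // added last, so it comes last in sibling order

print(container.subviews.count)        // number of direct children
print(second.superview === container)  // every view knows its parent
```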

Now, you can create different kinds of Views by subclassing the View class. Apple has actually already done this for many kinds of commonly used Views. For example, if you want to show some text, use a TextView, which has a “string” property. If you want to show an image, use an ImageView, which has an “image” property. Cocoa includes many of these prebuilt View subclasses. Cocoa is also smart enough to realize that when you change the content of one of these built in views, the window is dirty and needs to be repainted, and it handles all of the machinery required to do that. There are also built-in Views, such as StackView, which don’t draw anything themselves, but are simply designed to be a container to position all of their subviews.

You can also create your own custom View by creating a new subclass of View. If you want it to display something within the bounds of your View, override the drawRect: method. In this method, you can use the Cocoa 2D drawing functions, such as calling “stroke” on a BezierPath. All 2D drawing APIs use a “context” object, and in Cocoa, the context is stashed in global (per-thread) state before drawRect: is called, so the Cocoa drawing functions can interrogate it.
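Here is a small sketch of a custom view, assuming AppKit. In Swift, drawRect: is spelled draw(_:). The class name DiagonalView is hypothetical.

```swift
import AppKit

// Hypothetical custom view that strokes a diagonal line across its bounds.
final class DiagonalView: NSView {
    override func draw(_ dirtyRect: NSRect) {
        super.draw(dirtyRect)
        // No context is passed in explicitly; the calls below draw into
        // whatever context Cocoa made current before calling this method.
        NSColor.black.setStroke()
        let path = NSBezierPath()
        path.move(to: NSPoint(x: bounds.minX, y: bounds.minY))
        path.line(to: NSPoint(x: bounds.maxX, y: bounds.maxY))
        path.lineWidth = 2
        path.stroke()
    }
}

let view = DiagonalView(frame: NSRect(x: 0, y: 0, width: 100, height: 100))
view.needsDisplay = true // marks the view dirty; Cocoa decides when to call draw(_:)
```

Note that the code never calls draw(_:) itself. It only marks the view as needing display, and Cocoa schedules the actual drawing once the view is in a window.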

Views are bidirectional, meaning that they transfer information from the program to the user, but they also transfer information from the user to the program. This takes the form of mouse events, touch events, keyboard events, and stuff like that. If the user interacts with a View, Cocoa calls a callback on the View. This is how buttons work.

Apps also have data. There is some graph of objects which represents the source of truth for the app. This data model is conceptually distinct from the Views, which are all about presentation. Surely the data itself is different from how it is presented. This data model is often persisted to some stable store so that it can be loaded for the next time the app is started. The way to do this with Cocoa is to use Core Data. You can give a tree of predicates to Core Data and ask it to create in-memory objects representing all database records that pass the expression. Once you have the in-memory objects, you can update them or delete them, and Core Data is responsible for updating the database to reflect that. You can also ask Core Data to insert a new record, and then you manually populate it yourself just like any other update. In order for Core Data to know what database schema to use (it is in charge of managing all aspects of the database - the only interface you have to it is by way of the objects it produces), you have to provide it with a data model, which describes all the entities and relations between entities. In short, it is a template for your object graph. Core Data will then take it from there.
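To make that concrete, here is a sketch of the insert / fetch / update / delete cycle. It makes several assumptions that are not from the discussion above: the data model is built in code rather than in Xcode's model editor, it has a single made-up entity called “Note” with one “title” attribute, and it uses an in-memory store so nothing touches the disk.

```swift
import CoreData

// Data model: one hypothetical entity, "Note", with a string attribute.
let titleAttribute = NSAttributeDescription()
titleAttribute.name = "title"
titleAttribute.attributeType = .stringAttributeType

let noteEntity = NSEntityDescription()
noteEntity.name = "Note"
noteEntity.properties = [titleAttribute]

let model = NSManagedObjectModel()
model.entities = [noteEntity]

// In-memory store, chosen only to keep the sketch self-contained.
let container = NSPersistentContainer(name: "Sketch", managedObjectModel: model)
let storeDescription = NSPersistentStoreDescription()
storeDescription.type = NSInMemoryStoreType
container.persistentStoreDescriptions = [storeDescription]
container.loadPersistentStores { _, error in
    if let error = error { fatalError("Store failed to load: \(error)") }
}
let context = container.viewContext

// Insert: ask Core Data for new records, then populate them.
for text in ["Groceries", "Gift ideas", "Taxes"] {
    let note = NSManagedObject(entity: noteEntity, insertInto: context)
    note.setValue(text, forKey: "title")
}
try context.save()

// Fetch: a small tree of predicates (an OR node with two leaves).
let request = NSFetchRequest<NSManagedObject>(entityName: "Note")
request.predicate = NSCompoundPredicate(orPredicateWithSubpredicates: [
    NSPredicate(format: "title BEGINSWITH %@", "Gro"),
    NSPredicate(format: "title BEGINSWITH %@", "Gif"),
])
let matches = try context.fetch(request)

// Update one in-memory object and delete another; save() writes both changes.
if let first = matches.first {
    first.setValue("Done", forKey: "title")
}
if matches.count > 1 {
    context.delete(matches[1])
}
try context.save()

let remaining = try context.count(for: NSFetchRequest<NSManagedObject>(entityName: "Note"))
print(matches.count, remaining)
```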

Okay, so now we’ve got data and a presentation of that data. Because these things are conceptually distinct, we need some code to populate information inside the View hierarchy given our ground truth data model. We also need some code to react to user events (usually by modifying the data model). This code is generally called “Controller” code.

There we have it - we have a data model, Controller code, and a hierarchy of Views. The Controller is the intermediary between the data and the Views, and information passes through it in both directions (from views to data and from data to Views). The data doesn’t know anything about the Views, and the Views don’t know anything about the underlying data model. The Views are just dumb displays of information, and the data is just that dumb data itself.

Therefore, the Controller is where the app is. All the executive logic about what to modify, when, under which situations, is in the Controller. The important bit - the transitions - occur in the Controller. The controller is said to own its data and its Views. This is actually enforced with reference counting - a Controller has strong references to its subtree of Views and its data.
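The ownership and the two directions of information flow can be sketched without any Cocoa at all. Everything below (Counter, CounterView, CounterController, simulateTap) is a toy stand-in invented for this example, not a Cocoa class.

```swift
// Model: plain data, unaware of any view.
final class Counter {
    var value = 0
}

// View: a dumb display plus a callback for user interaction.
final class CounterView {
    var text = ""
    var onTap: (() -> Void)?
    func simulateTap() { onTap?() } // stands in for a real click
}

// Controller: owns both and moves information in both directions.
final class CounterController {
    let model = Counter()    // strong reference to the data
    let view = CounterView() // strong reference to the view

    init() {
        // The reference pointing back up at the controller is weak,
        // so the controller and view do not keep each other alive.
        view.onTap = { [weak self] in self?.increment() }
        refresh()
    }

    private func increment() {
        model.value += 1 // view -> data
        refresh()
    }

    private func refresh() {
        view.text = "Count: \(model.value)" // data -> view
    }
}

let controller = CounterController()
controller.view.simulateTap()
controller.view.simulateTap()
print(controller.view.text)
```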

Now, it turns out that apps usually have one portion of their UI operate on a portion of data fairly independently from the rest of the app. It makes sense - data is organized visually, so one subtree of views operates on one subset of data in the app. This means that we can actually apply the same composable logic to Controllers as we did for Views. The Controller tree, however, is much more sparse than the View tree, as you don’t need a controller for every last button and text field in your app. Instead, you have a hierarchy of conceptual pieces of your app, each one gets a Controller, and each Controller gets a subtree of Views. The Views are all connected up into a single big tree (so that everything gets drawn properly), but only certain Views correspond to a matching Controller.

This actually affects how you think about the internal workings of your app. Let’s say one Controller somewhere realizes that some View somewhere needs to be updated. If the Controller owns the View in question, great - it just updates the view and is done. However, if the Controller doesn’t own the View, it must notify either its parent Controller or one of its child Controllers that something happened. Now - this part is critical - it is up to this other Controller to determine how to react to the message it just received. This means that updates to a particular View can’t come from just anywhere; instead, all updates must go through one place whose job it is to make sure the View is showing the correct information. This makes it straightforward to implement policy regarding either presentation or persistence of data. Your app has just turned from a giant monolithic item to a network of messages flowing between pieces, each of which has its own concerns.
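One common way to shape these messages is a small protocol that the child uses to report upward. The sketch below is plain Swift with made-up names (ItemListDelegate, ItemListController, DetailController, RootController); it only illustrates the routing, not any real Cocoa class.

```swift
// Hypothetical protocol through which a child reports events upward.
protocol ItemListDelegate: AnyObject {
    func itemList(_ list: ItemListController, didSelect item: String)
}

final class ItemListController {
    weak var delegate: ItemListDelegate? // upward reference is weak

    func userSelected(_ item: String) {
        // This controller does not own the detail view, so it only reports.
        delegate?.itemList(self, didSelect: item)
    }
}

final class DetailController {
    // Only this controller is allowed to change what its view shows.
    private(set) var shownTitle = "(nothing)"
    func show(_ item: String) { shownTitle = item.uppercased() }
}

final class RootController: ItemListDelegate {
    let list = ItemListController() // strong references to children
    let detail = DetailController()

    init() { list.delegate = self }

    func itemList(_ list: ItemListController, didSelect item: String) {
        detail.show(item) // the parent decides how to react
    }
}

let root = RootController()
root.list.userSelected("inbox")
print(root.detail.shownTitle)
```

Because the list only knows about the protocol, it can be reused or tested with any object that implements ItemListDelegate.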

I’ll now take this opportunity to mention that Views are implemented by the UIView & NSView classes (for iOS & OS X, respectively), and Controllers are implemented by the UIViewController & NSViewController classes. The ViewController classes have a strong reference to a “view” which is the root of the subtree they own. They also have a weak reference to their “parentViewController” and a strong reference to an array of “childViewControllers.”
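A short sketch of these references with NSViewController follows. Current Swift spells the properties “parent” and “children” rather than the Objective-C names used above. PaneController is a hypothetical class that builds its view in code so that no .xib is needed.

```swift
import AppKit

// Hypothetical controller that creates its root view in code.
final class PaneController: NSViewController {
    override func loadView() {
        view = NSView(frame: NSRect(x: 0, y: 0, width: 100, height: 100))
    }
}

let parent = PaneController()
let child = PaneController()

// Link the controller tree and the view tree separately.
parent.addChild(child)
parent.view.addSubview(child.view)

print(child.parent === parent)              // upward reference
print(parent.children.count)                // downward references
print(child.view.superview === parent.view) // view tree mirrors the link
```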

Now, it turns out that most apps set up a template view hierarchy at app launch time, and then keep it around indefinitely. This, coupled with Views’ tree structure, makes them ripe for a declarative description of a view hierarchy. Once you have a declarative description, you can create an editor to build the hierarchy as if it is data. This editor exists as a part of Xcode called Interface Builder. The declarative description of a View hierarchy is contained within .xib files. Interface Builder has a tree view on the left where you can describe your hierarchy, and a details view on the right which allows you to describe attributes and properties that each view should have (such as initial text content, font color, position, etc.).

So what happens at runtime with these .xib files? Well, Cocoa has this concept of a “bundle.” A bundle is a folder where a framework or app has all of its constituent files. Certainly, shared libraries or the executable are within a bundle, but any required data files are inside a bundle as well. Artwork and shaders required by the framework or app go inside the bundle. (A bit of terminology: A Framework exists on disk as a bundle, and one of the files inside the bundle is a library - either static or shared.) Bundles all have an Info.plist file, a sort of manifest, describing their contents. (A plist file is just a hierarchical file with a particular schema which allows typed key/value pairs, readable by Cocoa. It stands for “property list.”) When you start an app or use a Framework, the app / Framework has access to all the files in its particular bundle.
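From code, a bundle is reached through Foundation's Bundle class. The sketch below pokes at the main bundle; the resource name “Shaders.metallib” is a made-up example, and a bare command-line tool will simply report that nothing was found.

```swift
import Foundation

let bundle = Bundle.main
print(bundle.bundlePath) // where the bundle lives on disk

// Values from Info.plist are exposed as typed key/value pairs.
// NSMainNibFile is the key an OS X app uses to name its main nib.
let mainNib = bundle.object(forInfoDictionaryKey: "NSMainNibFile") as? String
print(mainNib ?? "no main nib declared")

// Look up a data file shipped inside the bundle (hypothetical name).
let shaderURL = bundle.url(forResource: "Shaders", withExtension: "metallib")
print(shaderURL?.path ?? "resource not found in this bundle")
```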

Now, the main() of a Cocoa app usually just has a single call to NSApplicationMain() in it. One of the things that occurs within this call is that the Info.plist file is read, and inside it is the filename of a .xib file. Cocoa will then open the .xib file and instantiate all the views inside it and set up the attributes on the views and relationships between the views that are described by the file. This means that when you run your app, you can have something that looks halfway decent without even writing a single line of code! You can even tell Cocoa to use a particular View subclass (as a string) and Cocoa will use the magic of Objective-C’s dynamism at runtime to instantiate that subclass instead. Cool stuff!
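The “subclass as a string” trick can be sketched with NSClassFromString. This is only an illustration of the runtime lookup, not the exact path the nib loader takes; BadgeView is a hypothetical class, and the @objc(BadgeView) annotation is there so that its runtime name has no Swift module prefix.

```swift
import AppKit

// Hypothetical View subclass with a fixed Objective-C runtime name.
@objc(BadgeView) final class BadgeView: NSView {}

// All we have is a string, as if it had been read from a .xib file.
let className = "BadgeView"

// Ask the runtime for the class, then instantiate it through the metatype.
let viewClass = NSClassFromString(className) as? NSView.Type
let view = viewClass?.init(frame: NSRect(x: 0, y: 0, width: 40, height: 40))

print(view is BadgeView)
```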

This is all well and good, but there are two new problems from the perspective of code that is running in a Controller:

1. How do we refer to a View which Interface Builder has created in order to push information to it?
2. Users will interact with the views which Interface Builder has created. However, all the interactions which the user performs are interactions with Views, not Controllers. How does flow control get from Views (as caused by a user action) to Controllers?

Interface Builder has a solution to each of these problems. The solution to the first problem is called an IBOutlet. In your controller, you annotate a variable (with type of some View subclass) with the keyword “@IBOutlet” and then tell Interface Builder to “connect” this variable with a particular View inside Interface Builder (done by dragging and dropping). This actually just sets a string inside the .xib. At runtime, due to the magic of Objective-C’s (and Swift’s) dynamic nature, Cocoa can look up the variable by name (with just a string!) and set it to whatever it wants.
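Here is roughly what that looks like from the Controller's side. Since there is no .xib in this sketch, the connection is made by hand with key-value coding, which stands in for what nib loading does with the stored name; GreetingController and nameLabel are made-up names.

```swift
import AppKit

final class GreetingController: NSViewController {
    // Normally connected by dragging in Interface Builder.
    @IBOutlet weak var nameLabel: NSTextField!
}

let controller = GreetingController()

// In a real app the view hierarchy keeps the label alive; here a local does.
let label = NSTextField(labelWithString: "placeholder")

// Look the property up by its string name and assign it.
controller.setValue(label, forKey: "nameLabel")

// The Controller can now push information to the View.
controller.nameLabel.stringValue = "Hello"
print(label.stringValue)
```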

The second problem is solved by something similar called an IBAction. This is the same kind of idea, except this time it’s performed on functions instead of variables. In your Controller, you annotate a function with “@IBAction,” and by dragging and dropping, you associate it with a View inside Interface Builder, which sets a string inside the .xib. Then, at runtime, Cocoa will actually set 2 variables on the View: a “target” and an “action.” The “target” is a weak reference to the controller you identified when you dragged and dropped, and the “action” is the selector to call on it when the View (really, “Control,” a subclass of View) is activated. “Activation” means different things to different Controls - a button activates when the user clicks on it, a text field activates when the user presses enter or when it loses focus, etc. Note that the “target” reference is weak, because it usually points up to the owning Controller.
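The target / action pair can also be set by hand, which makes it easier to see what the .xib connection amounts to. In this sketch FormController and submit(_:) are hypothetical, and the “click” is simulated by sending the selector directly instead of going through a real event.

```swift
import AppKit

final class FormController: NSViewController {
    private(set) var submitCount = 0

    // Normally connected by dragging in Interface Builder.
    @IBAction func submit(_ sender: Any?) {
        submitCount += 1
    }
}

let controller = FormController()
let button = NSButton(title: "Submit", target: nil, action: nil)

// The two properties that the connection fills in.
button.target = controller // held weakly by the control
button.action = #selector(FormController.submit(_:))

// Simulate activation: send the action selector to the target.
if let action = button.action, let target = button.target as? NSObject {
    _ = target.perform(action, with: button)
}
print(controller.submitCount)
```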

But let’s think a little more about this “target” thing. In particular, what happens when you drag onto a class that is completely unrelated to Interface Builder, so Interface Builder has no concept of it? In this case, the dragging operation fails, and no IBAction is created. The IBAction (and IBOutlet) dragging operation only succeeds on classes which Interface Builder recognizes.

Well, which classes does Interface Builder recognize? Well, .xibs also have a collection of objects (not Views) which are instantiated along with everything else inside the .xib. If you specify the class of one of these objects as a subclass of ViewController, you can then set the “view” property on that object to a particular view in the hierarchy. (At runtime, one instance of this ViewController will be created along with all the Views, and the “view” property will be set accordingly.) If you set the class of this ViewController, then this is a class that Interface Builder understands, and you can then drag IBActions and IBOutlets to it.

There is actually another way to tell Interface Builder about a class, and that is the magical “File’s Owner.” Each .xib file has an object in it called “File’s Owner.” Now, the actual interface to the Cocoa function that opens and instantiates .xib files requires an extra argument called “owner.” Anything in the .xib file which refers to this “File’s Owner” will then be set to this object. For the main .xib file (the one that is listed in Info.plist), the File’s Owner is set to one of the autogenerated class stubs that Xcode creates for you when you create the project (and is therefore sensitive to the checkboxes you provide during the creation wizard). It is important, though, to realize that you can change the recorded type of “File’s Owner” inside a .xib file. Once you’ve done this, you have told Interface Builder about the type of object which will be opening this .xib, which means you can then drag IBOutlets and IBActions onto it. Note that this “File’s Owner” is usually the controller for a subtree of views.

You can also create secondary .xib files, and instantiate them (read: cause Cocoa to instantiate all the objects listed within) from code. When you do this you can set the owner argument. Also, when you do this, you immediately have access to all the objects which Cocoa created, so you are free to set any upward-pointing weak references inside the new tree as you will.
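On OS X, one way to do this is Bundle's loadNibNamed(_:owner:topLevelObjects:). The sketch below assumes a secondary file called “Inspector.xib” whose File's Owner is set to the hypothetical InspectorController class; both names are invented for the example, and the call returns false if no such file is in the bundle.

```swift
import AppKit

final class InspectorController: NSObject {
    // Would be connected to File's Owner inside the hypothetical Inspector.xib.
    @IBOutlet weak var rootView: NSView!

    // Top-level objects must be kept alive by whoever loads the nib.
    var topLevelObjects: NSArray?

    func load() -> Bool {
        // Passing self as the owner resolves every "File's Owner"
        // connection in the file to this instance.
        return Bundle.main.loadNibNamed("Inspector",
                                        owner: self,
                                        topLevelObjects: &topLevelObjects)
    }
}

let inspector = InspectorController()
```

After a successful load(), the outlets on the owner are populated and topLevelObjects holds everything Cocoa created, so the caller can wire up any remaining references itself.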

We are now at a point where we can characterize the setup of Cocoa apps that use Interface Builder. Apps have a main .xib file which is instantiated at startup. There is one object which acts as the “File’s Owner” of the .xib file, and Xcode creates stubs for this class at project creation time. During .xib instantiation, upward-pointing weak references are set up inside Views created therein, and downward-pointing strong (or weak, you have the option for either when you drag and drop) references are set up inside any classes that Interface Builder knows about. When the user interacts with views, messages are passed upward via weak references, and the ViewController is free to do whatever it wants with the message, possibly routing it to a parent ViewController via a weak reference (and, hopefully, an interface which allows for reusability and testability) or down to another child ViewController via a strong reference (or handled itself). ViewControllers own their data and View subtrees, and can interact with them as they want, thanks to IBOutlets which were set up by Cocoa at creation time. ViewControllers can instantiate child ViewControllers and View subtrees in code by calling into Cocoa and passing self as the file’s owner. Then these controllers can interact with the newly created instances. All the while, a ViewController can interact with the data that it owns, and the data is responsible for persisting itself to a database. Some Views have a weak reference to a delegate or a dataSource, which are usually implemented by a higher-level ViewController.

Throughout this discussion I have neglected to mention layout; that will be the subject for another post. Overall, though, there must be some way for each view to know where it should end up relative to its parent, so that all views get a place (and size) on screen. This computation must be repeated whenever the owning window or a superview resizes (otherwise, you wouldn’t be able to implement a behavior of something like “stick to the right edge”). This is an entire subsystem inside Cocoa named Auto Layout.
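Just to give a taste of the “stick to the right edge” behavior, here is a small sketch using layout anchors. The sizes and the 8-point margin are arbitrary choices for the example.

```swift
import AppKit

let container = NSView(frame: NSRect(x: 0, y: 0, width: 300, height: 100))
let badge = NSView()

// Opt this view out of frame-based positioning so constraints decide its frame.
badge.translatesAutoresizingMaskIntoConstraints = false
container.addSubview(badge)

NSLayoutConstraint.activate([
    // Stay 8 points inside the right edge, however wide the container gets.
    badge.trailingAnchor.constraint(equalTo: container.trailingAnchor, constant: -8),
    badge.centerYAnchor.constraint(equalTo: container.centerYAnchor),
    badge.widthAnchor.constraint(equalToConstant: 40),
    badge.heightAnchor.constraint(equalToConstant: 40),
])

// Ask Cocoa to run the layout computation now rather than at the next pass.
container.layoutSubtreeIfNeeded()
print(badge.frame)
```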

Litherum

Monday, May 4, 2015
