Monday, March 18, 2024

So I wrote a double delete...

I wrote a double delete. Actually, it was a double autorelease. Here's a fun story describing the path it took to figure out the problem.

I'm essentially writing a plugin to a different app, and the app I'm plugging-in-to is closed-source. So, I'm making a dylib which gets loaded at runtime. I'm doing this on macOS.

The Symptom

When running my code, I'm seeing this:


Let's see what we can learn from this.

First, it's a crash. We can see we're accessing memory that we shouldn't be accessing.

Second, it's inside objc_release(). We can use a bit of deductive reasoning here: If the object we're releasing has a positive retain count, then the release shouldn't crash. Therefore, either we're releasing something that isn't an object, or we're releasing something that has a retain count of 0 (meaning: a double release).

Third, we can actually read a bit of the assembly to understand what's happening. The first two instructions are just a way to check if %rdi is null, and, if so, jump to an address that's later in the function. Therefore, we can deduce that %rdi isn't null.

%rdi is interesting because it's the register that holds the first argument. It's probably a safe assumption to make that objc_release() probably just takes a single argument, and that argument is a pointer, and that pointer is stored in %rdi. This assumption is somewhat-validated by reading the assembly: nothing seems to be using any of the other parameter registers.

The next 3 lines check if the low bit in %rdi is 1 or not. If it's 1, then we again jump to an address that's later in the function. Therefore, we can deduce that %rdi is an even number (its low bit isn't 1).

The next 3 lines load a value that %rdi is pointing to, and mask off most of its bits. The next line, which is the line that's crashing, is trying to load the value that the result points to.

All this makes total sense: Releasing a null pointer should do nothing, and releasing tagged pointers (which I'm assuming are marked by having their low bit set to 1) should do nothing as well. If the argument is an Objective-C object, it looks like we're trying to load the isa pointer, which probably holds something useful at offset 0x20. That's the point where we're crashing.

That leads to the deduction: Either the thing we're trying to release isn't an Objective-C object, or it's already been released, and the release procedure clears (or somehow poisons) the isa value, which caused this crash. Either way, we're releasing something that we shouldn't be releasing.

One of the really useful observations about the assembly is that nothing before the crash point clobbers the value of %rdi. This means that a pointer to the object that's getting erronously released is *still* in %rdi at the crash site.

We can also see that the crash is happening inside AutoreleasePool:

This doesn't indicate much - just that we're autoreleasing the object instead of releasing it directly. It also means that, because autorelease is delayed, we can't see anything useful in the stack trace. (If we were releasing directly instead of autoreleasing, we could see exactly what caused it in the stack trace.)

The First Thing That Didn't Work

The most natural solution would be "Let's use Instruments!" It's supposed to have a tool that shows all the retain stacks and release stacks for every object.

When running with Instruments, we get a nice crash popup showing us that we crashed:

The coolest part about this is that it shows us the register state at the crash site, which gives us %rdi, the pointer to the object getting erroneously released.

Cool, so the object which is getting erroneously released is at 0x600002a87b40. Let's see what instruments lists for that address:

Well, it didn't list anything for that address. It listed something for an address just before it, and just after it, but not what we were looking for. Thanks for nothing, Instruments.

The Second Thing That Didn't Work

Well, I'm allocating and destroying objects in my own code. Why don't I try to add logging all my own objects to see where they all get retained and released! Hopefully, by cross referencing the address of the object that gets erroneously deleted with the logging of the locations of my own objects, I'll be able to tell what's going wrong.

We can do this by overriding the -[NSObject release] and -[NSObject retain] calls:

As well as init / dealloc:

Unfortunately, this spewed out a bunch of logging, but the only thing it told me was the object that was being erroneously released wasn't one of my own objects. It must be some other object (NSString, NSArray, etc.).

The Third Thing That Didn't Work

Okay, we know the object is being erroneously autoreleased. Why don't we log some useful information every time anyone autoreleases anything? We can add a symbolic breakpoint on -[NSObject autorelease].

Here's what it looks like when this breakpoint is hit:

Interesting - so it looks like all calls to -[NSObject autorelease] are immediately redirected to _objc_rootAutorelease(). The self pointer is preserved as the value of the first argument.

If you list the registers at the time of the call, you can see the object being released:

So let's modify the breakpoint to print all the information we're looking for:

Unfortunately, this didn't work because it was too slow. Every time lldb evaluates something, it takes a bunch of time, and this was evaluating 3 things every time anybody wanted to autorelease anything, which is essentially all the time. The closed-source application I'm debugging is sensitive enough, that if anything takes too long, the application just quits.

The Fourth Thing That Didn't Work

Lets try to print out the same information as before, but do it inside the application rather than in lldb. That way, it will be much faster.

The way we can do this is with something called "function interposing." This uses a feature of dyld which can replace a library's function with your own. Note that this only works if you disable SIP and set the nvram variable amfi_get_out_of_my_way=0x1 and reboot.

We can do this to swap out all calls to _objc_rootAutorelease() with our own function.

Inside our own version of _objc_rootAutorelease(), we want to keep track of everything that gets autoreleased. So, let's keep track of a global dictionary, from pointer value to info string.

We can initialize this dictionary inside a "constructor," which is a special function in a dylib which gets run when the dylib gets loaded by dyld. This is a great way to initialize a global.

Inside my_objc_rootAutorelease(), we can just add information to the dictionary. Then, when the crash occurs, we can print the dictionary and find information about the thing that was autoreleased.

However, something is wrong...

The dictionary only holds 315 items. That can't possibly be right - it's inconceivable that only 315 things got autoreleased.

The Fifth Thing That Didn't Work

We're close - we just need to figure out why so few things got autoreleased. Let's verify our assumptions, that [foo autorelease] actually calls _objc_rootAutorelease() by writing such code and looking at its disassembly.

And if you look at the disassembly...

You can see 2 really interesting things: the call to alloc and init got compressed to a single C call to objc_alloc_init(), and the call to autorelease got compressed to a single C call to obc_autorelease(). I suppose the Objective-C compiler knows about the autorelease message, and is smart enough to not invoke the entire objc_msgSend() infrastructure for it, but instead just emits a raw C call for it. So that means we've interposed the wrong function - we were interposing _objc_rootAutorelease() when we should have been interposing objc_autorelease(). So let's interpose both:

This, of course, almost worked - we just have to be super sure that my_objc_autorelease() doesn't accidentally call autorelease on any object - that would cause infinite recursion.

The Sixth Thing That Didn't Work

Avoiding calling autorelease inside my_objc_autorelease() is actually pretty much impossible, because anything interesting you could log about an object will, almost necessarily, call autorelease. Remember that we're logging information about literally every object which gets autoreleased, which is, in effect, every object in the entire world. Even if you call NSStringFromClass([object class]) that will still cause something to be autoreleased.

So, the solution is to set some global state for the duration of the call to my_objc_autorelease(). If we see a call to my_objc_autorelease() while the state is set, that means we're autoreleasing inside being autoreleased, and we can skip our custom logic and just call the underlying objc_autorelease() directly. However, there's a caveat: this "global" state can't actually be global, because Objective-C objects are created and retained and released on every thread, which means this state has to be thread-local. Therefore, because we're writing in Objective-C and not C++, we must use the pthreads API. The pthreads threadspecific API uses a "key" which has to be set up once, so we can do that in our constructor:

Then we can use pthread_setspecific() and pthread_getspecific() to determine if our calls are being nested.

Except this still didn't actually work, because abort() is being called...

The Seventh Thing That Didn't Work

Luckily, when abort() is called, Xcode shows us a pending Objective-C exception:

Okay, something is being set to nil when it shouldn't be. Let's set an exception breakpoint to see what is being set wrong:

Welp. It turns out NSStringFromClass([object class]) can sometimes return nil...

The Eighth Thing That Worked

Okay, let's fix that by checking for nil and using [NSNull null]. Now, the program actually crashes in the right place!

That's more like it. Let's see what the pointer we're looking for is...

Okay, let's look for it in bigDict!

Woohoo! Finally some progress. The object being autoreleased is an NSDictionary.

But that's not enough, though. What we really want is a backtrace. We can't use lldb's backtrace because it's too slow, but luckily macOS has a backtrace() function which gives us backtrace information! Let's build a string out of the backtrace information:

Welp, that's too slow - the program exits. Let's try again by setting frameCount to 6:

So here is the final autorelease function:

Okay, now let's run it, and print out the object we're interested in:

And the bigDict:

Woohoo! It's a great success! Here's the stack trace, formatted:

Excellent! This was enough for me to find the place where I had over-released the object.