I’m Bavarious, (sporadically) writing about Objective-C, Apple frameworks and OS X.

Jul 23

Which framework defines a method? dladdr() to the rescue

I’ve recently had a conflict when trying to use a property called ‘selected’ in my CALayer subclass. It turns out that ImageKit defines a LayerExtra category on CALayer and this category defines -setSelected:. In order to find out which framework (shared library) was responsible for that, I used dladdr():

IMP imp = [layer methodForSelector:@selector(setSelected:)];
if (imp) {
    Dl_info info;
    if (dladdr((void *)imp, &info)) {
        NSLog(@"lib path name is %s", info.dli_fname);
    }
}

dladdr() is declared in <dlfcn.h>.

Feb 8

Example of copyfile() progress via COPYFILE_STATE_COPIED

Since it looks like the Web doesn’t have an example of monitoring progress of a copyfile() operation via COPYFILE_STATE_COPIED, here it goes. Note that the progress callback is disabled in OS X v10.8 even though the man page mentions it. As explained by Doug, copyfile()’s status callback can be used to monitor progress of single files and recursive copies. Check his example with COPYFILE_STATE_STATUS_CB in an NSOperation subclass.

Feb 6

Execnames in DTrace

A few weeks ago, David Smith asked whether it was possible to obtain the executable name of a process (given its PID) in DTrace. There is no built-in function for that, and I suggested the following:

  1. Since the proc provider supplies probes for process creation and termination, use these probes to keep a map of PIDs to execnames for new processes;
  2. For previous processes, use the output from ps(1) and feed it to the DTrace script at startup time.

The DTrace fragments below keep a pnames map (an associative array) mapping PIDs to execnames of new processes. I’m using three probes: proc:::create for forked processes, proc:::exec for execed processes and proc:::exit for processes that have terminated.

Since it is possible that a child process was forked and not execed, copy the parent process execname to the child process:

    this->pid = args[0]->pr_pid;
    this->ppid = args[0]->pr_ppid;

    pnames[this->pid] = pnames[this->ppid];

If the child process was execed after fork, use the child execname instead of the parent execname:

    this->pname = basename(args[0]);
    pnames[pid] = this->pname;

When a process exits, keep its execname and append (exited) for debugging purposes:

    pnames[pid] = strjoin(pnames[pid], " (exited)");

For existing processes, run ps and convert its output to a series of DTrace statements that add PID → execname mappings to pnames like so:

ps -axco "pid= comm=" \
| awk '{printf "pnames[%s]", $1; $1=""; sub(/^ /, "", $0); printf " = \x22%s\x22;\n", $0}' \
> /tmp/pnames.h \
; sudo dtrace -C -I/tmp -s someDTraceScript.d

  1. The first line (ps) outputs a list of running processes with two columns: PID and command name;
  2. The second line (awk) writes pnames[PID] = "execname"; for each process listed in step 1;
  3. The third line (>) stores the output from step 2 in /tmp/pnames.h;

    Sample contents of /tmp/pnames.h:

    pnames[1] = "launchd";
    pnames[11] = "UserEventAgent";
    pnames[12] = "kextd";
    pnames[13] = "taskgated";
    pnames[14] = "notifyd";
    pnames[15] = "securityd";
    pnames[16] = "diskarbitrationd";
  4. The fourth line runs dtrace with the C preprocessor (-C), specifying /tmp as a directory to search for include files.

Using /tmp/pnames.h, the file created in step 3, is easy enough:

#include "pnames.h"

This is not a perfect solution, though: processes created between steps 1 and 4 may not be stored in /tmp/pnames.h or the pnames associative array, but it should be a good enough solution for many cases.
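For readers who’d rather not decipher the awk one-liner, here’s a rough C sketch of the same per-line conversion. The function name format_pname_statement is made up for illustration, and the sketch assumes command names contain no double quotes or backslashes (which would need escaping inside a DTrace string). Unlike awk, it doesn’t collapse runs of spaces inside the command name.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turns one line of `ps -axco "pid= comm="` output,
 * e.g. "   11 UserEventAgent", into a DTrace statement such as
 *     pnames[11] = "UserEventAgent";
 * Returns 0 on success, -1 if the line is malformed or `out` is too small.
 * Assumption: command names contain no double quotes or backslashes.
 */
int format_pname_statement(const char *line, char *out, size_t outsize) {
    /* Skip the padding that ps puts in front of the PID column. */
    while (isspace((unsigned char)*line)) line++;

    /* The PID is the leading run of digits. */
    const char *pid = line;
    while (isdigit((unsigned char)*line)) line++;
    size_t pidlen = (size_t)(line - pid);
    if (pidlen == 0) return -1;

    /* Skip the separator between the two columns. */
    while (*line == ' ' || *line == '\t') line++;

    /* Everything up to the end of the line is the command name,
       which may itself contain spaces. */
    size_t namelen = strcspn(line, "\r\n");
    if (namelen == 0) return -1;

    int written = snprintf(out, outsize, "pnames[%.*s] = \"%.*s\";",
                           (int)pidlen, pid, (int)namelen, line);
    return (written < 0 || (size_t)written >= outsize) ? -1 : 0;
}
```

Feeding it each line read from ps and printing the result would produce a file with the same shape as the sample /tmp/pnames.h above.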

The following DTrace script, signals.d, shows the use of the pnames map to list the names of the sender process and the receiver process in a signal operation. It uses predicates to pattern-match the pid_t argument of kill(2) and print different messages accordingly. For debugging purposes, it also prints a message whenever a process is forked, execed or terminated. Even if you’re not particularly interested in tracing, you may find out something interesting about your system behaviour. Check it out!

Jan 18

Stack-trace-dumping regular-expression-based symbolic breakpoints in LLDB

Even though LLDB supports regular expressions when setting breakpoints, Xcode (as of 4.5.2) does not. Regular expressions can be useful when one’s trying to examine how a certain class is used overall.

In order to use a regular expression in a breakpoint, use the --func-regex option. For example:

(lldb) breakpoint set --func-regex NSLayoutManager
Breakpoint created: 2: regex = 'NSLayoutManager', locations = 312, resolved = 312

Breakpoint number 2 matches NSLayoutManager in 312 locations in this example. Inspecting all locations is easy:

(lldb) breakpoint list 2

LLDB stopping at each NSLayoutManager method quickly becomes annoying. In general, if I’m using a regular expression, I want to examine the stack trace to identify which methods have led to the execution of each location and automatically continue execution after the breakpoint is hit. This is easily accomplished in Xcode by editing a breakpoint, setting the Debugger Command action to bt and checking the Automatically continue after evaluating checkbox.

Since we’re not using Xcode, here’s how to obtain the same behaviour in LLDB. Recall that the example created a breakpoint numbered 2. We’ll add a three-line command to that breakpoint:

(lldb) breakpoint command add 2
Enter your debugger command(s).  Type 'DONE' to end.
> script print "----------"
> bt
> continue
> DONE

Voilà! Now we can resume program execution by issuing continue and LLDB will gladly dump a stack trace whenever a method/function whose name matches NSLayoutManager is executed. Note that this breakpoint does not match methods declared in superclasses unless they have been overridden by NSLayoutManager.

If you want to be more strict and, say, avoid C functions whose names contain NSLayoutManager, e.g. NSLayoutManagerLogDebug, add [ to the regular expression since Objective-C methods are represented by -[…] or +[…]. Recall that [ is a special character in POSIX regular expressions, so it needs to be escaped:

(lldb) breakpoint set --func-regex "\[NSLayoutManager"
Breakpoint created: 2: regex = '\[NSLayoutManager', locations = 311, resolved = 311

Tailor the regular expression to suit your needs, be prepared for many stack traces in the console log, and happy debugging!

Jan 20

YOU MUST EXEC, a Core Foundation fork safety tale

Mike Ash has recently published a post about fork safety. This promptly reminded me of an error message that I, like possibly other developers jumping from Unix into Cocoa, stumbled upon when trying to use fork() in Mac OS X:

The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().

A symbol in all caps is hard to miss or forget.

What’s happening here is that Core Foundation (aka CF) has detected that a process used a fork-unsafe CF operation, fork()ed, didn’t exec(), and now a descendant process is trying to use some fork-unsafe CF functions. But how does CF detect this?

First, consider the following scenarios.

Scenario #1

  1. A process is created from an executable file that is linked against Core Foundation
  2. The process executes a fork-unsafe CF operation (for example, using CFRunLoop)
  3. The process spawns a child process via fork()
  4. The child process tries to execute a fork-unsafe CF operation, and CF complains.

Scenario #2

  1. A process is created from an executable file that is linked against Core Foundation
  2. The process executes fork-safe CF operations
  3. The process spawns a child process via fork()
  4. The child process tries to execute a fork-unsafe CF operation. CF doesn’t complain because this is the first time an unsafe operation is being executed.

Scenario #3, or #2 with more descendants

  1. A process is created from an executable file that is linked against Core Foundation
  2. The process executes fork-safe CF operations
  3. The process spawns a child process via fork()
  4. The child process tries to execute a fork-unsafe CF operation. CF doesn’t complain because this is the first time an unsafe operation is being executed.
  5. The child process spawns another (let’s call it grandchild) process via fork()
  6. The grandchild process tries to execute a fork-unsafe CF operation. CF complains because an ancestor process (in this case, the child process) executed an unsafe operation, so the grandchild process is wading into dangerous waters.

In summary, from a Core Foundation perspective, there’s potential for trouble when a fork-unsafe CF operation is being executed in a process with an ancestor that has already executed a fork-unsafe CF operation. This can be controlled by two variables:

  • Whether the current process or any ancestor process has executed a fork-unsafe CF operation. This should be passed along to its children so that they’re able to set the variable below;

  • Whether an ancestor process has executed a fork-unsafe CF operation.

Core Foundation does the following:

  • At startup, current_or_ancestor_process_has_executed_fork_unsafe_CF_operation = false and ancestor_process_has_executed_fork_unsafe_CF_operation = false

  • Whenever a fork-unsafe CF operation is about to be executed, set current_or_ancestor_process_has_executed_fork_unsafe_CF_operation = true and test whether ancestor_process_has_executed_fork_unsafe_CF_operation == true, in which case, show an error message and bail out

  • Whenever a child process is created via fork(), the child process tests whether current_or_ancestor_process_has_executed_fork_unsafe_CF_operation == true — since variables are copied from the parent process to the child process, at this moment current_or_ancestor_process_has_executed_fork_unsafe_CF_operation is a copy of the parent process value. If it’s true, set ancestor_process_has_executed_fork_unsafe_CF_operation = true.
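To make the bookkeeping concrete, here’s a minimal sketch that models the rules above in plain C. It is not Core Foundation code: the struct and the model_* functions are names I made up for illustration, and fork() is simulated by copying a struct rather than by creating a real process.

```c
#include <stdbool.h>

/* Per-process state, named after the two variables described above
   (shortened). This is a model, not Core Foundation code. */
struct cf_fork_state {
    bool self_or_ancestor_unsafe; /* current or ancestor process ran an unsafe operation */
    bool ancestor_unsafe;         /* an ancestor process ran an unsafe operation */
};

/* State of a freshly started (or exec()ed) process: both flags are false. */
struct cf_fork_state model_startup(void) {
    struct cf_fork_state s = { false, false };
    return s;
}

/* Models fork(): the child starts with a copy of the parent's variables,
   then the child-side check promotes the first flag into the second. */
struct cf_fork_state model_fork(const struct cf_fork_state *parent) {
    struct cf_fork_state child = *parent;
    if (child.self_or_ancestor_unsafe) {
        child.ancestor_unsafe = true;
    }
    return child;
}

/* Models a fork-unsafe operation. Returns true if the operation may proceed,
   false if CF would print the "You MUST exec()" message and bail out. */
bool model_unsafe_operation(struct cf_fork_state *s) {
    s->self_or_ancestor_unsafe = true;
    return !s->ancestor_unsafe;
}
```

Walking the scenarios through the model: in #1 the parent calls model_unsafe_operation() before model_fork(), so the child’s call returns false. In #2 the child’s call returns true because no ancestor has set the first flag. In #3 the grandchild, forked from that child, gets false.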

It’s important to note that CF deals with fork-unsafe operations from a CF perspective only. Whether the child process might be executing fork-unsafe operations in the general case is left to the programmer.

And now, the implementation details.

POSIX provides a pthread_atfork() function to register callbacks that are called before and after fork():

int pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void));

The prepare handler is called before fork(), and the parent and child handlers are called after fork() in the parent and child process, respectively.

Core Foundation uses atfork handlers to implement the test for fork-unsafe operations previously described:

  • The library has two variables with static storage duration. Instead of being called current_or_ancestor_process_has_executed_fork_unsafe_CF_operation and ancestor_process_has_executed_fork_unsafe_CF_operation, they’re called __CF120290 and __CF120293, respectively. They’re initialised with false.

  • Core Foundation is a dynamic library and uses the -init linker parameter to specify a symbol that is run as the first initialiser. The corresponding initialiser function is CFInitialize();

  • CFInitialize() registers a child atfork handler via pthread_atfork(). It’s called, erm, __01123__();

  • Whenever a fork-unsafe CF operation is about to be executed, set current_or_ancestor_process_has_executed_fork_unsafe_CF_operation = true and test whether ancestor_process_has_executed_fork_unsafe_CF_operation == true. If it is, call __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__(), which prints the warning messages.

  • Whenever a child process is created via fork(), the atfork child handler registered by CF is executed. It tests whether current_or_ancestor_process_has_executed_fork_unsafe_CF_operation == true and, if it is, it sets ancestor_process_has_executed_fork_unsafe_CF_operation = true.

If you want to see some code, go check CFRuntime.c and CFInternal.h. Although the open source edition of Core Foundation does not necessarily reflect the actual code in Core Foundation, it should be similar enough.

Dec 4

Testing closures (aka Blocks) for equality

(In which @bbum slaps me around a bit with a large trout. Or a leg of lamb.)

A couple of days ago, @kongtomorrow asked on Twitter whether there were a function to test the equality of two closures:

Is there a func like Block_equals such that Block_equals(block, Block_copy(block)) is true? I could have sworn Block_equals existed…

There doesn’t seem to exist such a function in either Block.h (public closure functions) or Block_private.h (private closure functions). In this post, I show an implementation of testing closure equality. Bear in mind that I am using private details about how closures are currently implemented. This might change in the future and thou shalt never, ever use this information in shipped applications. That said, it should work with current compiler & runtime if you need to test closure equality in, for example, unit tests.

The Problem

Let’s say you’re writing unit tests for an API: you start by writing a closure literal, hand it to the API and, after a few hoops, you want to assert that a certain closure reference is the same as the original closure literal you’ve written.

If the closure literal has static storage, you can simply compare the corresponding pointers since they’ll always be the same. This won’t work for context-capturing, automatic storage (stack) closures that have been copied to the heap, though.

The following code illustrates this problem:

#include <stdio.h>
#include <Block.h>

typedef void (^closure)(void);

int main(void) {
    int answer = 42;
    closure c = ^{ printf("Hello, I'm a closure: %d\n", answer); };

    closure c0 = Block_copy(c);
    closure c1 = Block_copy(c);
    closure c00 = Block_copy(c0);

    printf("closure = copy(closure)? %d\n", c == c0);
    printf("copy(closure) = another copy(closure)? %d\n", c0 == c1);
    printf("copy(closure) = copy(copy(closure))? %d\n", c0 == c00);

    return 0;
}

The program generates the following output:

$ clang BVBlocksTest.c && ./a.out
closure = copy(closure)? 0
copy(closure) = another copy(closure)? 0
copy(closure) = copy(copy(closure))? 1

As you can see, a stack closure has a different address from its heap copy, which is expected. Different copies of the same stack closure are also placed at different addresses. A copy of a copy, on the other hand, has the same address: it’s the same (heap) closure and its reference count has been incremented.

Conceptually, they are all the same closure, though.

Even though one could restrict oneself to never use a stack closure as a basis for comparison, there might be situations where this is needed.

A Solution

When the compiler finds a closure literal in a program, e.g.

void (^c)(void) = ^{ printf("Hey\n"); };

it emits code that includes:

  • A structure that describes the closure;
  • A function containing the code inside the closure.

All of this is transparent and there’s no public API to introspect it.

For the purpose of this post, I use two important facts (which, mind you, are true for the current compiler & runtime but may change in the future):

  • Although different closures may have different underlying structures, they all share the same first members;
  • There’s a one-to-one relation between closure literals and the underlying functions containing the closure code.

In libclosure, there’s a file called BlockImplementation.txt which describes the ABI used by Apple’s implementation of closures. In particular, it specifies the following structure for closure literals:

struct {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *, ...);
    /* further members (descriptor, imported variables) follow */
};

The actual structure varies according to the details of the closure literal but the members listed above are always present. Of particular interest is the invoke member: it’s a pointer to the function that contains the code specified inside the closure literal.

With this information, writing a function that tests for closure equality shouldn’t be hard: given two closures, they are equal if their underlying functions are the same:

bool Block_equals(const void *b0, const void *b1) {
    // The following header is (at the moment) shared by all closure literals
    typedef struct {
        void *isa;
        int flags;
        int reserved;
        void *invoke;
    } Block_header;

    const Block_header *h0 = b0;
    const Block_header *h1 = b1;

    // We consider two closures to be the same if their underlying functions
    // are located at the same memory address, i.e., they're the same functions
    return h0->invoke == h1->invoke;
}

The actual code needs to take care of a couple of issues: adjusting types, including ARC bridge casts if needed, and testing for NULL pointers. I’ve published it on GitHub: https://github.com/bavarious/BVBlock_equals.

Once again, don’t use this in shipped applications. This solution uses details about how the compiler emits code for closures according to the current ABI and it might break in the future. If @bbum ever finds out you’ve used this code in the wild, he’ll slap you around a bit with a large trout. Or a leg of lamb.

Discussion: Psy on Freenode and @ahruman on Twitter pointed out that this equality test doesn’t take into account the values of imported variables. Here’s @ahruman’s example:

IntBlock foo(int x) { return ^{ return x; } };
BVBlock_equals(foo(1), foo(2)); // Should be false. ;-)

In this case, the two closures have the same closure literal so they are equal in that sense. However, since the values of imported variables aren’t the same, one could argue that they’re in fact different.

Maybe my function should be called BVBlock_literal_equals() instead!
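The distinction between “same literal” and “same literal with the same captured values” can be sketched without the Blocks runtime at all. The following is a plain C model under loud assumptions: struct int_closure, make_int_closure() and both comparison functions are hypothetical names, and the structs only mimic the header layout quoted earlier.

```c
#include <stdbool.h>
#include <stddef.h>

/* Plain C model of the shared closure header; the field names follow the
   structure quoted above. No Blocks runtime is involved. */
struct closure_header {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *);
};

/* Hypothetical closure that captures one int. The header comes first, so a
   pointer to the whole object is also a valid pointer to its header. */
struct int_closure {
    struct closure_header header;
    int captured;
};

/* Stands in for the function the compiler emits for one closure literal. */
static void literal_body(void *self) {
    (void)self;
}

/* Stands in for a function such as foo(x): same literal, different capture. */
struct int_closure make_int_closure(int x) {
    struct int_closure c = { { NULL, 0, 0, literal_body }, x };
    return c;
}

/* Same idea as Block_equals: equal if both point at the same function. */
bool closure_literal_equals(const void *a, const void *b) {
    if (a == NULL || b == NULL) return a == b;
    const struct closure_header *ha = a;
    const struct closure_header *hb = b;
    return ha->invoke == hb->invoke;
}

/* Stricter variant for this one closure type: also compare the capture. */
bool int_closure_equals(const struct int_closure *a, const struct int_closure *b) {
    return closure_literal_equals(a, b) && a->captured == b->captured;
}
```

With this model, make_int_closure(1) and make_int_closure(2) are equal according to closure_literal_equals() but not according to int_closure_equals(). A general version of the stricter test would need to know the layout of every closure’s imported variables, which is exactly the information the header doesn’t give us.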

Oct 27

CFLite, Core Foundation’s little cousin

Apple Open Source is the Web site where Apple publish releases of part of the software that’s bundled in Mac OS X, iOS and the developer tools. One of the available open source components is CFLite, a subset of Core Foundation. In this post, I describe how I’ve built CFLite on Mac OS X and implemented a minor optimisation.

Obtaining CFLite

Mac OS X open source releases are grouped into OS versions. Lion is currently v10.7.2 but its corresponding CF version, CF-635.15, is not available yet. I’ve used CF-635 from v10.7.1.

CFLite is licensed under Apple’s APSL licence. Bear in mind that APSL is not as developer-friendly as MIT, BSD or Apache licences. Some other open source components are not APSL-licensed, e.g. libauto (Apache), libclosure (MIT), libdispatch (Apache).

Building CFLite

README_CFLITE states that the default makefile can be used to build CoreFoundation.framework on Mac OS X:

% make

Well, not outside of Apple.

There are a number of private header files being included in the source files. Commenting them out leads to undefined symbols and removing the corresponding function calls (or even entire functions) may be necessary. This means that the local build will lack some functionality but it should be enough to do some simple tests.

It’s also necessary to install ICU, International Components for Unicode. Even though there’s a /usr/lib/libicucore.A.dylib file, I wasn’t able to find the corresponding header files under /usr/include or /Developer/SDKs/MacOSX10.7.sdk/usr/include. Luckily ICU is also available at Apple Open Source and can be built and installed under /usr/local.

After some rather blunt patching in my eagerness to build CFLite, I came up with this patch to make CF-635 build on Mac OS X v10.7.2 with ICU-461.13. If you can improve this patch, let me know.

Testing CFLite

Makefile builds CFLite as a framework, CoreFoundation.framework, which gets installed under ../CF-Root. In order to make sure that programs are linked against CFLite instead of system Core Foundation, set the DYLD_FRAMEWORK_PATH environment variable to that CF-Root directory:

$ export DYLD_FRAMEWORK_PATH=/path/to/CF-Root

You may create a small test program, say:

#include <CoreFoundation/CoreFoundation.h>

int main(void) {
    CFStringRef s = CFSTR("hey there");
    return 0;
}

build it:

$ clang testcf.c -framework CoreFoundation -o testcf

and test it:

$ ./testcf

If you want to be sure that the program is using CFLite instead of Core Foundation, you may set the DYLD_PRINT_LIBRARIES environment variable. This tells dyld, the dynamic linker, to output the location of all libraries that are loaded when running a program:

$ export DYLD_PRINT_LIBRARIES=1
$ ./testcf
dyld: loaded: /path/to/CF-Root/CoreFoundation.framework/Versions/A/CoreFoundation

You may unset that variable:

$ unset DYLD_PRINT_LIBRARIES

after having confirmed that dyld is in fact using CFLite.

Note that a program cannot be linked against both CFLite and system Core Foundation, and CFLite won’t work with Foundation.

Mmm, code

CFLite being open is useful because it allows developers on other platforms to use a subset of Core Foundation — for example, to parse property lists on Linux. Moreover, it’s handy to be able to read source code and (hopefully!) understand what happens under the hood in Cocoa applications, like the array that wasn’t. It can also be used to debug some obscure problem or help security researchers spot potential vulnerabilities in Core Foundation.

And, if one’s feeling adventurous, one might even find opportunities for optimisation.

Whilst browsing CFLite, I stumbled upon a lookup table in CFURL.c:

static const unsigned char sURLValidCharacters[] = {
    /* ' '  32 */   0,
    /* '!'  33 */   VALID | UNRESERVED | PATHVALID ,
    /* '"'  34 */   0,
    /* '#'  35 */   0,
    /* '$'  36 */   VALID | PATHVALID ,
    /* ... */
};

Now, I like lookup tables. I especially like lookup tables indexed by characters since they look rather neat when initialised with C99 designators:

int array[128] = {
    ['a'] = 1,
    ['b'] = 5,
    ['c'] = 13
};

The initialisation of sURLValidCharacters[] can be rewritten with designators like so:

static const unsigned char sURLValidCharacters[128] = {
    [' '] = 0,
    ['!'] = VALID | UNRESERVED | PATHVALID ,
    ['"'] = 0,
    ['#'] = 0,
    ['$'] = VALID | PATHVALID ,
    /* ... */
};

and it’s possible to drop the 0 assignments because an object with static storage which hasn’t been explicitly initialised by a designator is automatically zeroed out:

static const unsigned char sURLValidCharacters[128] = {
    ['!'] = VALID | UNRESERVED | PATHVALID ,
    ['$'] = VALID | PATHVALID ,
};

One may prefer to keep the 0 assignments to make it clear that a character doesn’t have any of the mask bits set.

The table had to grow 32 bytes because ' ' is equivalent to 32, so we end up reserving space for the control characters whose corresponding codes are less than 32. Which, mind you, can be a good thing.

Here’s one of the functions that use that lookup table:

CF_INLINE Boolean isURLLegalCharacter(UniChar ch) {
    return ( ( 32 <= ch ) && ( ch <= 127 ) ) ? ( sURLValidCharacters[ ch - 32 ] & VALID ) : false;
}

The original lookup table starts with the space character, code 32. This means that isURLLegalCharacter() must subtract 32 (ch - 32) from the character code and it must make sure that this subtraction doesn’t result in a negative value (32 <= ch).

However, since the lookup table using designators includes all codes from 0 to 127, isURLLegalCharacter() can be simplified to:

CF_INLINE Boolean isURLLegalCharacter(UniChar ch) {
    return ( ch <= 127 ) ? ( sURLValidCharacters[ ch ] & VALID ) : false;
}

One fewer comparison, no logical AND in the condition of the conditional operator, no subtraction, all at the cost of 32 extra bytes in the lookup table. This same optimisation can be applied to the other functions that use sURLValidCharacters. By doing so, CFURLCreateWithString() is about 25% faster in my tests (~130-character valid URLs). Your mileage may vary.
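Here’s a self-contained sketch that puts the two layouts side by side so they can be checked against each other. It is not the CFURL.c table: there’s a single hypothetical VALID flag, the table and function names are made up, and only three arbitrary characters are marked valid.

```c
#include <stdbool.h>

/* Hypothetical flag; CFURL.c uses several mask bits, one is enough here. */
#define VALID 1

/* Original layout: index 0 corresponds to ' ' (code 32). */
static const unsigned char offset_table[96] = {
    ['!' - 32] = VALID,
    ['$' - 32] = VALID,
    ['a' - 32] = VALID,
};

/* Designator layout: indexed directly by character code. */
static const unsigned char direct_table[128] = {
    ['!'] = VALID,
    ['$'] = VALID,
    ['a'] = VALID,
};

/* Lookup against the offset table: range check on both ends, then subtract. */
bool is_legal_offset(unsigned short ch) {
    return (32 <= ch && ch <= 127) ? (offset_table[ch - 32] & VALID) != 0 : false;
}

/* Lookup against the direct table: one comparison, no subtraction. */
bool is_legal_direct(unsigned short ch) {
    return (ch <= 127) ? (direct_table[ch] & VALID) != 0 : false;
}
```

Both functions should agree for every 16-bit input. The control characters below 32 simply land on zeroed entries in the direct table.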

Thanks to David Smith for the inspiration to play with CFLite and his comments on an early draft, and to Landon Fuller for calling my attention to the fact that Apple have not used APSL in some recent open source components.

Sep 22

Would you please crash my out of scope stack closure?

Programmers who’ve used Apple/Objective-C closures are probably aware that a literal closure should be copied from the stack to the heap via Block_copy() or -copy if it is to be used outside its enclosing block. The documentation says so, Bill Bumgarner has a post explaining it, and dispatch/queue.h even has a warning for this:

// The declaration of a block allocates storage on the stack. 
// Therefore, this is an invalid construct:

dispatch_block_t block;

if (x) {
    block = ^{ printf("true\n"); };
} else {
    block = ^{ printf("false\n"); };
}
block(); // unsafe!!!

// What is happening behind the scenes:

if (x) {
    struct Block __tmp_1 = ...; // setup details
    block = &__tmp_1;
} else {
    struct Block __tmp_2 = ...; // setup details
    block = &__tmp_2;
}

// As the example demonstrates, the address of a stack variable is 
// escaping the scope in which it is allocated. That is a classic C bug.

However, this sample program based on the warning on dispatch/queue.h doesn’t crash:

#include <stdio.h>

int main(void) {
    void (^closure)(void);

    int n;
    scanf("%d", &n);

    if (n) {
        closure = ^{ printf("non-zero\n"); };
    }
    else {
        closure = ^{ printf("zero\n"); };
    }

    closure();

    return 0;
}

Let’s investigate why.


In C, a block is a {}-delimited group of declarations and statements that form a syntactical unit. Identifiers declared inside a block have scope/visibility limited to that block and the objects corresponding to automatic/local variables have storage duration/lifetime limited to that block. Even though C99 uses the term scope for visibility only, it’s not uncommon for scope to mean lifetime scope as well.

Since block already has a meaning in C, I prefer to use the term closure for Apple/Objective-C blocks, much like Apple’s N1451 submission of closures to ISO/IEC JTC1/SC22/WG14.

Global and stack closures

A global closure is a closure that exists throughout the execution of the program, e.g. static or global (file-level visibility) ones. In C99 terminology, that’s called static storage duration. Such a closure therefore does not need to be copied to the heap. In the code below, all closures are global:

#include <stdio.h>

int someInt = 21;
void (^closure0)(void) = ^{ printf("hey there\n"); };
void (^closure1)(void) = ^{ printf("it's %d\n", someInt); };

void someFunction() {
    static void (^closure2)(void) = ^{ printf("hola"); };
}

When generating code, the compiler sets the isa pointer of the corresponding closure structures to _NSConcreteGlobalBlock.

However, if a closure doesn’t need to capture its local context then the compiler turns it into a global closure. Bill writes:

(…) when the compiler detects that a block is effectively constant because it captures no state, the compiler will create a static global for said block and copying is a no-op. Implementation detail. Now you know, but forget it when writing code.

For example, in:

void someFunction() {
    void (^closure)(void) = ^{ printf("salut\n"); };
}

Apple Clang 2.1 considers that closure as being global:

leaq    ___block_literal_global(%rip), %rax
    .quad   __NSConcreteGlobalBlock

If we want to crash a stack closure being referenced outside of its scope, we’d better make sure it is truly a stack one. Making it reference a variable in the current context should do the trick.

#include <stdio.h>

void someFunction(int n) {
    void (^closure)(void) = ^{ printf("%d\n", n); };
}

This time, the generated code refers to _NSConcreteStackBlock:

leaq    ___block_descriptor_tmp(%rip), %rcx
leaq    ___someFunction_block_invoke_0(%rip), %rdx
movq    __NSConcreteStackBlock@GOTPCREL(%rip), %rsi

so we should be good to go.

Stack frame and object lifetime

In a C function or Objective-C method, the stack frame is the range of stack memory where the function’s local/automatic variables are stored. Once the function returns, its stack frame can no longer be assumed to hold valid objects.

Something similar happens to blocks in C. A local, non-static variable has its lifetime restricted to its enclosing block. In C99 terminology, the object associated with such a variable has automatic storage duration. Attempting to reference said object outside of its enclosing block, i.e., after it’s reached the end of its lifetime, is undefined behaviour.

Consider the following program:

#include <stdio.h>
#include <time.h>

int main(void) {
    void *p0, *p1;

    // Block 0
    {
        time_t n0 = time(NULL);
        p0 = &n0;
    }

    // Block 1
    {
        time_t n1 = time(NULL);
        p1 = &n1;
    }

    printf("p0 = %p\np1 = %p\n", p0, p1);

    return 0;
}

Variables n0 and n1 have scope (visibility) limited to their enclosing blocks and, because of their automatic storage duration, the lifetime of the corresponding objects is limited to their enclosing blocks as well. When the program leaves Block 0, the function shouldn’t rely on the memory address of n0 pointing to a valid object. In fact, the compiler may choose to reuse the memory used by Block 0 — conceptually, we could consider that all automatic variables inside Block 0 are pushed onto the stack at the beginning of the block and then popped out of the stack at the end of the block.

I get the following output when running the program above after compiling it with GCC 4.6 (binaries available at the HPC project) with an optimisation level (-O) greater than 0:

$ ./a.out
p0 = 0x7fff6f8a9c08
p1 = 0x7fff6f8a9c08

thus GCC 4.6 is effectively reusing the memory used by Block 0.

When compiling the same program with GCC 4.6, -O0:

$ ./a.out
p0 = 0x7fff6f0cbbf0
p1 = 0x7fff6f0cbbe8

addresses don’t get reused.

With both Apple Clang 2.1 and LLVM-GCC 4.2.1, addresses don’t get reused regardless of -O optimisation level. Specifying a block for if or for structures seemingly doesn’t change that.

(Not) Making a stack closure crash

In summary, there are two circumstances that influence an out of scope stack closure crash inside a single function:

  1. It’s not enough for the closure to be literal and non-static for it to be a stack closure: it must capture and reference a variable in the current context. This is valid as of Apple Clang 2.1 but it may change in the future. Being an implementation detail, you shouldn’t rely on it;

  2. The stack memory used by a block must be reused and rewritten with data that do not represent a valid closure.

The problem with item 2 is that neither Apple Clang 2.1 nor LLVM-GCC 4.2.1 seems to reuse the stack memory used by a block whose execution has ended. This means that every object inside a function will be assigned a unique memory address, hence it will never be overwritten inside that function unless the programmer explicitly does that. And whilst stock GCC 4.6 is more aggressive and can reuse memory used inside a block, it doesn’t support closures.

If you have an example where Clang emits conservative stack usage code, drop me a line! I’m also curious as to which combination of GCC flags applied to -O0 makes it emit conservative stacks. -fconserve-stack is not enough by itself.

Thanks to Sedate Alien for the inspiration for this post.

Aug 16

A CHOCKING MYSTERY: po [NSNumber numberWithBool:NO] outputs 1

Craig Hockenberry posted this on Twitter:

(gdb) po [NSNumber numberWithBool:NO]
1

Does not inspire confidence…

Jens Ayton cleverly noted that NO is a macro, so where would GDB be getting a value from?

A few comments ensued, including speculations about tagged pointers, until Cédric Luthi mentioned how to correctly handle it:

That’s unfortunate but Foundation exports YES and NO symbols, the correct way to use it is *((char*)NO)

I wasn’t aware of that, so I decided to take a look.

In Objective-C source code using Apple’s runtime, the BOOL type is defined as:

typedef signed char     BOOL;

with corresponding boolean values:

#define YES             (BOOL)1
#define NO              (BOOL)0

This means that a boolean value is a signed, 1-byte character, with 1 representing a true value and 0 representing a false value (actually, the C language allows any non-zero value to represent a true value).

In source code, any reference to YES or NO is converted to (BOOL)1 or (BOOL)0 by the preprocessor.

However, GDB does not invoke the preprocessor. This means that in:

po [NSNumber numberWithBool:NO]

NO must be obtained from somewhere else.

In fact:

(gdb) p NO
$1 = {<text variable, no debug info>} 0x9374bc6a73a0a1 <NO>

hence there’s a symbol called NO that was loaded at address 0x9374bc6a73a0a1. GDB uses that address, rather than the value stored there, as the argument. Since the address is different from 0, +numberWithBool: returns an NSNumber representing a true value.

Inspecting the Foundation framework, we can see:

$ nm Foundation | grep NO
00000000002630a1 S _NO

that Foundation indeed exports a symbol called NO. The S flag means that the symbol is in a section other than text, data or bss. Running:

$ nm -m Foundation | grep NO
00000000002630a1 (__TEXT,__const) external _NO

shows that NO is in the (__TEXT,__const) section. Inspecting the contents of that section:

$ nm -s __TEXT __const Foundation | sort
00000000002630a0 S _YES
00000000002630a1 S _NO
00000000002630a2 s ___NSOperationPrios

we can see that the symbols YES and NO use only one byte each, since consecutive symbols are one byte apart. This matches the previous definition of BOOL as one byte long, so it’s reasonable to expect that those symbols point to a 1-byte memory region whose contents are either 1 or 0.

If we cast the NO symbol to a pointer to signed char and then dereference it to obtain its value, we get:

(gdb) p *((signed char *)NO)
$1 = 0 '\0'

As expected, we get 1 for YES:

(gdb) p *((signed char *)YES)
$1 = 1 '\001'

So the correct way to obtain NO, with the same value and representation used in an Objective-C program, is to use the expression:

*((signed char *)NO)

as pointed out by Cédric Luthi. For example:

(gdb) po [NSNumber numberWithBool:*((signed char *)NO)]
(gdb) po [NSNumber numberWithBool:*((signed char *)YES)]

Even though this mimics what happens in Objective-C source code, it’s not strictly necessary. Knowing that NO is 0 and YES is 1 allows us to write the much simpler expressions:

(gdb) po [NSNumber numberWithBool:0]
(gdb) po [NSNumber numberWithBool:1]

without having to reference the symbols exported by Foundation.

Alternatively, Core Foundation exports two constants — kCFBooleanTrue and kCFBooleanFalse — that represent the two possible boxed boolean values:

(gdb) po (id)kCFBooleanFalse
(gdb) po (id)kCFBooleanTrue

In fact, whenever Foundation/Core Foundation returns a boxed boolean value, it returns either kCFBooleanTrue or kCFBooleanFalse:

(gdb) p/a (int)[NSNumber numberWithBool:0]
$1 = 0x7f270f20
(gdb) p/a (int)kCFBooleanFalse
$2 = 0x7f270f20


(gdb) p/a (int)[NSNumber numberWithBool:1]
$1 = 0x7f270eb0
(gdb) p/a (int)kCFBooleanTrue
$2 = 0x7f270eb0

Note how the objects returned by +numberWithBool: have the same addresses as those Core Foundation constants. After Core Foundation is loaded and throughout the execution of a program, there are exactly two NSNumber objects representing true and false.

Also note that the addresses are even integers, hence they aren’t tagged pointers. Even though they are of type NSNumber, they are not treated as (fast-pathed) integers by Core Foundation.

Edit: Since GDB optimises its symbol lookup table, code that doesn’t reference NSNumber might yield the error message: No symbol "NSNumber" in current context. If that happens, replace NSNumber with NSClassFromString(@"NSNumber"):

po [NSClassFromString(@"NSNumber") numberWithBool:0]

Jul 21

Tagged pointers and fast-pathed CFNumber integers in Lion

Tagged pointers are a new feature in Lion and the OS X v10.7 SDK. They provide an alternative representation of objects that exploits the fact that allocators return 16-byte aligned addresses, so not every integer can be the memory address of an actual Objective-C object.

In Lion, the Objective-C runtime and Core Foundation tag pointers as follows:

  • Every tagged pointer has its lowest bit set, hence tagged pointers are odd integers;

  • The next 3 bits (from lowest to highest) define the tagged object class. At the moment, there are classes for integers, managed objects, and dates;

  • The next 4 bits are for type information specific to the tagged object class.

Thus the lowest eight bits in the memory address are used as tag metadata, and the remaining 24 or 56 bits are used as payload.

CFNumber (and, consequently, NSNumber) takes advantage of tagged pointers for integers — if an integer can fit in the payload of a tagged pointer then no actual CFNumber is created. Instead, the memory address represents the integer number itself according to the following layout:

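   6         5         4         3         2         1         0
3210987654321098765432109876543210987654321098765432109876543210
........................................................xxxx0011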
|                                                       |   |  +-- (1 bit) always 1 for tagged pointers
|                                                       |   +----- (3 bits) 001 is the tagged object class for integers
|                                                       +--------- (4 bits) for integers, xxxx is either:
|                                                                           0000 for 8-bit integers,
|                                                                           0100 for 16-bit integers,
|                                                                           1000 for 32-bit integers,
|                                                                           1100 for 64-bit integers
+------------------------------------------------------------------ (56 bits) payload with the actual integer value

Given a tagged pointer, obtaining its underlying integer value is as simple (and fast) as shifting right by 8 bits a value that’s already stored in a register. Preliminary tests by Joshua Weinberg yielded a speedup of 2.42 in both creating NSNumber objects and reading their integer values.

Since there are no actual CFNumber/NSNumber objects, there’s no need for memory management.

In order for a tagged pointer to behave as an object, the Objective-C runtime keeps an isa table (_objc_tagged_isa_table[]) that maps the four lowest bits of a tagged pointer (effectively, its tagged object class) to the isa pointer that’s expected for Objective-C objects. This table is used by object_getClass() (and its corresponding private function) so that the correct isa is returned when querying tagged pointers. Note that directly accessing the isa pointer via ->isa will break in this case. Developers should use object_getClass() instead.

Warning: this post is based upon CF-635. It is internal information, and subject to change in future releases.

