Reference count debugging with gdb

As I was hacking today, I ran into some hard-to-debug reference counting problems with one of my classes. The normal smattering of printf()s didn't help, and neither did this newfangled systemtap, which was a bit disappointing.

It worked, in that my probes were correctly run and correctly highlighted each reference/dereference of the class I was interested in, but printing a backtrace only extended to the g_object_ref()/g_object_unref() call, and no further. I'm guessing this was a problem with the location of the debug symbols for my code (since it was in a development prefix, whereas systemtap was not), but it might be that systemtap hasn't quite finished userspace stuff yet. That's what I read, at least.

In the end, I used conditional breakpoints in gdb. This was a lot slower than systemtap, but it worked. It's the sort of thing I would've killed to know a few years (or even a few months) ago, so hopefully it's useful for someone (even if it's not the most elegant solution out there).

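# Don't pause long backtraces at the pager prompt
set pagination off
# $foo will hold the address of the instance being traced
set $foo=0
# Breakpoint 1: stop in main(), once the libraries have been loaded
break main
run

# Breakpoint 2: print a backtrace for every ref of the traced instance
break g_object_ref
condition 2 _object==$foo
commands
	silent
	bt 8
	cont
	end

# Breakpoint 3: likewise for every unref
break g_object_unref
condition 3 _object==$foo
commands
	silent
	bt 8
	cont
	end

# Breakpoint 4: record the address of the first MyObject to be initialised
break my_object_init
commands
	silent
	set $foo=my_object
	cont
	end
enable once 4
cont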

The breakpoint in main() is there to stop gdb from discarding our breakpoints out of hand because the relevant libraries haven't been loaded yet. $foo contains the address of the first instance of MyObject in the program; if you need to trace the (n+1)th instance instead, add ignore 4 n before the final cont, so that the my_object_init breakpoint only fires on the (n+1)th MyObject instantiation.
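As a rough sketch of that variant, the tail of the script might look like this when tracing the third instance (n = 2). It assumes breakpoints 1 to 3 have been set up exactly as above, so that my_object_init is still breakpoint 4, and that its parameter is named my_object as in the original script.

```gdb
# Breakpoint 4: record the address of the instance being initialised
break my_object_init
commands
	silent
	set $foo=my_object
	cont
	end
# Skip the first two times breakpoint 4 is reached (n = 2)...
ignore 4 2
# ...then let it fire once, on the third instantiation, and disable itself
enable once 4
cont
```

Crossings skipped by the ignore count don't count as stops, so enable once still applies to the first instantiation that actually triggers the commands.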

This can be extended to track (a fixed number of) multiple instances of the object, by using several numbered variables ($foo1, $foo2, and so on) and gdb's if statements to set them as appropriate. This is left as an exercise to the reader!
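For what it's worth, here is one possible answer to that exercise for two instances. It is only a sketch under the same assumptions as the script above (glib debug symbols so that _object resolves, and a my_object_init whose parameter is named my_object). The $seen counter and the echo labels are hypothetical additions of mine, not part of the original script.

```gdb
set pagination off
set $foo1=0
set $foo2=0
# $seen (hypothetical helper) counts MyObject instantiations so far
set $seen=0
break main
run

# Breakpoint 2: refs of either traced instance
break g_object_ref
condition 2 _object==$foo1 || _object==$foo2
commands
	silent
	if _object==$foo1
		echo ref of instance 1\n
	else
		echo ref of instance 2\n
	end
	bt 8
	cont
	end

# Breakpoint 3: unrefs of either traced instance
break g_object_unref
condition 3 _object==$foo1 || _object==$foo2
commands
	silent
	if _object==$foo1
		echo unref of instance 1\n
	else
		echo unref of instance 2\n
	end
	bt 8
	cont
	end

# Breakpoint 4: record the first two instances, then stop watching for more
break my_object_init
commands
	silent
	set $seen=$seen+1
	if $seen==1
		set $foo1=my_object
	end
	if $seen==2
		set $foo2=my_object
		disable 4
	end
	cont
	end
cont
```

Tracking more instances means adding another variable and another if block for each one, which is why this only scales to a small, fixed number of objects.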

I welcome the inevitable feedback and criticism of this approach. It's hacky, ugly and slower than systemtap, but at least it works.

9 thoughts on “Reference count debugging with gdb”

  1. Joe Buck

    It's an old technique. The trick is to try to figure out how to choose the breakpoints so that gdb runs as little as possible, since it's slow. If there is a breakpoint with a condition, gdb wakes up every time, checks the condition, and restarts the program if it isn't satisfied, which can be expensive (though it's a lot faster than doing it by hand). Enabling and disabling breakpoints can sometimes help.

  2. Frank Ch. Eigler

    "[systemtap] worked, in that my probes were correctly run and correctly highlighted each reference/dereference of the class I was interested in, but printing a backtrace only extended to the g_object_ref()/g_object_unref() call, and no further."

    Some major improvements to backtracing are coming soon to stap land. It's possible though that the only thing you were lacking was some '-d /path/to/shlib -d /bin/foo' options to preload unwind data into the systemtap probe module.

    1. Mark Wielaard

      The soon to be released SystemTap 1.3 should at least print the "module" (shared library name) of the last frame of the backtrace (plus address of course). That way you can at least see why SystemTap couldn't unwind further. As Frank says, you could then provide SystemTap with that shared library hint through -d. Also, 1.3 has --ldd, which makes SystemTap pick up everything ldd knows about a program. There were also a couple of plain bug fixes in the unwinder that should improve the output.

      If that isn't enough, maybe we can teach systemtap about pkg-config files to pick up which shared libraries are likely to be used in a gnome program. Or are there other hints about dynamically loaded libraries that systemtap should know about?

  3. Pingback: Reference count debugging with systemtap — drboblog

  4. Arun Raghavan

    Maybe I got what you're doing wrong, but I usually set a watchpoint on the object's refcount. Something like:

    p object (fugly, but this makes a gdb variable, say $1, with the address of object, so your watchpoint is not tied to the symbol 'object')
    watch ((GObject*)$1)->ref_count
