tl;dr: Visualise your main context and sources using Dunfell. Feedback and ideas welcome.
At the DX hackfest, I’ve been working on a new tool for instrumenting and visualising the behaviour of the GLib main context (or main contexts) in your program.
It’s called Dunfell (because I’m a sucker for hills) and at a high level it works by using SystemTap to record various GMainContext interactions in your program, saving them to a log file. The log file can then be examined using a viewer program.
The source is available on GitLab or GitHub because I still haven’t decided which is better.
In the screenshot above, each vertical line is a thread, each blue box is one dispatch phase of the main context which is currently running on that thread, each orange blob is a new GSource being created, and the green blob is a GSource which has been selected for closer inspection.
At the moment, it requires a couple of GLib patches to add some more SystemTap probe points, and it also requires a recent version of GTK+. It needs SystemTap, and I’ve only tested it on Fedora, so it might need some patching to work with the SystemTap installed on other distributions.
This screenshot is of a trace of the buffered-input-stream test from GIO, showing I/O callbacks being made across threads as idle source callbacks.
More visualisation ideas are welcome! At the moment, what Dunfell draws is quite simplistic. I hope it will be able to solve various common debugging problems eventually but suggestions for ways to do this intuitively, or for other problems to visualise, are welcome. Here are the use cases I was initially thinking about (from the README):
- Detect GSources which are never added to a GMainContext.
- Detect GSources which are dispatched too often (i.e. every main context iteration).
- Detect GSources whose dispatch function takes too long (and hence blocks the main context).
- Detect GSources which are never removed from their GMainContext after being dispatched (but which are never dispatched again).
- Detect GMainContexts which have GSources attached or (especially) events pending, but which aren’t being iterated.
- Monitor the load on each GMainContext, such as how many GSources it has attached, and how many events are processed each iteration.
- Monitor ongoing asynchronous calls and GTasks, giving insight into their nesting and dependencies.
- Monitor unfinished or stalled asynchronous calls.
- Allow users to record logs to send to the developers for debugging on a different machine. The users may have to install additional software to record these logs (some component of Dunfell, plus its dependencies), but should not have to recompile or otherwise modify the program being debugged.
- Work with programs which purely use GLib, through to programs which use GLib, GIO and GTK+.
- Allow visualisation of this data, both in a standalone program, and in an IDE such as GNOME Builder.
- Allow visualising differences between two traces.
- Minimise runtime overhead of logging a program, to reduce the risk of disturbing race conditions by enabling logging.
- Connecting to an already-running program is not a requirement, since by the time you’ve decided there’s a problem with a program, it’s already in the wrong state.
Pingback: DX hackfest 2016 aftermath — drboblog
FYI, with GCC 5.3.1:
libdunfell/dfl-parser.c:451:58: error: ‘initial_timestamp’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if ((highest_timestamp == NULL && timestamp_int < initial_timestamp) ||
^
libdunfell/dfl-parser.c:217:11: note: ‘initial_timestamp’ was declared here
guint64 initial_timestamp;
^
I'm using --disable-Werror, nifty that 🙂
Fixed, thanks.
This horrible, horrible hack to dunfell-record.in makes jhbuild work for me:
exec stap --unprivileged --ldd -o "$log_file" -c "/usr/lib64/ld-linux-x86-64.so.2 --library-path $LD_LIBRARY_PATH $*" $STAP_OPTIONS @datadir@/libdunfell-@DFL_API_VERSION@/dunfell-record.stp
Without that, LD_LIBRARY_PATH doesn't work and the system libraries in /usr/lib64 always gets used.
Hmm, not sure I’m going to push that one. Surely it’s a SystemTap bug if it’s not applying LD_LIBRARY_PATH to things?
Pingback: GTK+ hackfest 2016 — drboblog