(const gchar*) vs. (gchar*) and other memory management stories

This article now been upstreamed to the GNOME Programming Guidelines, where future updates will be made to it.

Memory management in C is often seen as a minefield, and while it’s not as simple as in more modern languages, if a few conventions are followed it doesn’t have to be hard at all. Here are some pointers about heap memory management in C using GObject and the usual GNOME types and coding conventions, including GObject introspection. A basic knowledge of C is assumed, but no more. Stack memory allocation and other obscure patterns aren’t covered, since they’re much less commonly used in a typical GObject application.

The most important rule when thinking about memory management is: which code has ownership of this memory?

Ownership can be defined as the responsibility to free or deallocate a piece of memory. If code which does own some memory doesn’t deallocate that memory, that’s a memory leak. If code which doesn’t own some memory deallocates it (in addition to the owner doing so), that’s a double-free. Both are bad.

For example, if a function returns a newly allocated string and doesn’t retain a pointer to it, ownership of the string is transferred to the caller. Note that when thinking about ownership transfer, malloc()/free() and reference counting are treated the same: in the former case, a newly allocated piece of heap memory is being transferred; in the latter, a newly incremented reference.

gchar *transfer_full_function (void) {
	return g_strdup ("Newly allocated string.");
}

/* Ownership is transferred here. */
gchar *my_string = transfer_full_function ();

Conversely, if the function does retain a pointer (e.g. inside an object), ownership is not transferred to the caller.

const gchar *transfer_none_function (void) {
	return "Static string.";
}

const gchar *transfer_none_method (MyObject *self) {
	gchar *new_string = g_strdup ("Newly allocated string.");
	/* Ownership is retained by the object here. */
	self->priv->saved_string = new_string;
	return new_string;
}

/* In both of these cases, ownership is not transferred. */
const gchar *my_string1 = transfer_none_function ();
const gchar *my_string2 = transfer_none_method (some_object);

In all of these examples, you may notice the return type of the functions reflects whether they transfer ownership of their return values. const return types indicate no transfer, and non-const return types indicate some transfer.

While this is useful, it’s by no means completely clear. For example, what if a function returns a non-const GList*? Is ownership of the list elements transferred, or just the list? What if the function’s author forgot to make the return type const, and actually there’s no transfer? This is where documentation comments are useful.

There’s a convention in GNOME documentation comments to specify the function which should be used to free a returned value. If such a function is mentioned, ownership of the returned memory is transferred. If no function is mentioned, ownership probably isn’t transferred, but it’s hard to be sure. That’s why it’s good to always be explicit when writing documentation comments.

/**
 * my_object_get_some_string:
 * @self: a #MyObject
 *
 * Gets the value of the #MyObject:some-string property.
 *
 * Return value: (transfer none): some string, owned by the
 * object and must not be freed
 */
const gchar *my_object_get_some_string (MyObject *self) {
	return self->priv->some_string;
}

/**
 * my_object_build_result:
 * @self: a #MyObject
 *
 * Builds a result string which probably represents something
 * meaningful.
 *
 * Return value: (transfer full): a newly allocated result
 * string; free with g_free()
 */
gchar *my_object_build_result (MyObject *self) {
	return g_strdup_printf ("%s %s",
	                        self->priv->some_string,
	                        self->priv->some_other_string);
}

/**
 * my_object_dup_controller:
 * @self: a #MyObject
 *
 * Gets the value of the #MyObject:controller property,
 * incrementing the controller's reference count.
 *
 * Return value: (transfer full): the object's
 * controller; unref with g_object_unref()
 */
MyController *my_object_dup_controller (MyObject *self) {
	return g_object_ref (self->priv->controller);
}

When GObject introspection was introduced, these kinds of documentation comments were formalised as introspection annotations: (transfer full), (transfer container) or (transfer none), as documented on wiki.gnome.org. These allow the runtimes of language bindings to manage memory correctly. Since a C programmer is essentially doing the same job as a language runtime when writing C code, the information provided by transfer annotations is sufficient for perfect memory management. Unfortunately, not all parameters and return types of all functions have these annotations added. The examples above do, however.

Finally, a few libraries use a function naming convention. Functions named *_get_* do not transfer ownership, whereas functions named *_dup_* (for ‘duplicate’) do transfer ownership. This can be seen in the examples above, or with json_node_get_array() vs. json_node_dup_array(). Be aware, though, that only a few libraries use this convention. Other libraries use *_get_* for both functions which do and do not transfer ownership of their results. Other code, such as that generated by gdbus-codegen, uses a different and incompatible convention: *_get_* methods signify full transfer, and *_peek_* methods signify no transfer. For example, goa_object_get_manager() vs. goa_object_peek_manager(). For this reason, going by function naming conventions only works within libraries, not between different libraries.

Memory management of parameters is analogous to return values: look at whether the parameter is const and whether there are any introspection annotations for it.

In summary, here are a set of guidelines one can follow to determine whether ownership of a return value is transferred, and hence whether the caller needs to free it:

  1. If the type has an introspection (transfer) annotation, look at that.
  2. Otherwise, if the type is const, there is no transfer.
  3. Otherwise, if the function documentation explicitly specifies the return value must be freed, there is full or container transfer.
  4. Otherwise, if the function is named *_dup_*, there is full or container transfer.
  5. Otherwise, if the function is named *_peek_*, there is no transfer.
  6. Otherwise, you need to look at the function’s code to determine whether it intends ownership to be transferred. Then file a bug against the documentation for that function, and ask for an introspection annotation to be added.

Some common pitfalls:

  • If you’re using an explicit typecast (e.g. casting a (const gchar*) return value to (gchar*)), it’s likely something’s wrong.
  • Generally, return values which are not transferred (such as (const gchar*)) are freed when the owning object is destroyed — so if you need such a value to persist, you must copy it or increase its reference count. You then have ownership of the copy or new reference, and are responsible for freeing it (not the original).

How can one check for incorrect memory handling? Use Valgrind. It will detect leaks and double-frees, and is simple to use:

valgrind --tool=memcheck --leak-check=full my-program-name

Or, if running your program from the source directory, use the following to avoid running leak checking on the libtool helper scripts:

libtool --mode=execute valgrind --tool=memcheck --leak-check=full ./my-program-name

Valgrind lists each memory problem it detects, along with a short backtrace (if you’ve compiled your program with debug symbols), allowing the cause of the memory error to be pinpointed and fixed!

4 thoughts on “(const gchar*) vs. (gchar*) and other memory management stories

  1. Aleksander

    Worth noting; another convention (e.g. the one used in gdbus-codegen generated code when getting interface objects from the base dbus object) is to have _peek () when doing transfer-none and _get() when doing transfer-full (check for example the GOA reference).

  2. skagedal

    Excellent post! Very useful.

    Just a tiny thing: the function that should be called "my_object_dup_controller", and is documented as such in the comment, is called "my_object_get_controller" in the code.

Comments are closed.