Passing variable-sized buffers around is a common task in C, especially in networking code. Ignoring GLib convenience API like GBytes, GInputVector and GOutputVector, a buffer is simply an offset and length describing a block of memory. This is just a void* and int, right? What’s the problem?
tl;dr: I’m suggesting to use (uint8_t*, size_t) to describe buffers in C code, or (guint8*, gsize) if you’re using GLib. This article is aimed at those who are still getting to grips with C idioms. See also: (const gchar*) vs. (gchar*) and other memory management stories.
There are two problems here: this effectively describes an array of elements, but void* doesn’t describe the width of each element; and (assuming each element is a byte) an int isn’t wide enough to index every element in memory on modern systems.
But void* could be assumed to refer to an array of bytes, and an int is probably big enough for any reasonable situation, right? Probably, but not quite: a 32-bit signed integer can address 2 GiB (assuming negative indices are ignored) and an unsigned integer can address 4 GiB1. Fine for network processing, but not when handling large files.
How is size_t better? It’s defined as being big enough to refer to every addressable byte of memory in the current computer system (caveat: this means it’s architecture-specific and not suitable for direct use in network protocols or file formats). Better yet, it’s unsigned, so negative indices aren’t wasted.
What about the offset of the buffer — the void*? Better to use a uint8_t*, I think. This has two advantages: it explicitly defines the width of each element to be one byte; and it’s distinct from char* (it’s unsigned rather than signed2), which makes it more obvious that the data being handled is not necessarily human-readable or nul-terminated.
Why is it important to define the width of each element? This is a requirement of C: it’s impossible to do pointer arithmetic without knowing the size of an element, and hence the C standard forbids arithmetic on void* pointers, since void doesn’t have a width (C11 standard, §6.2.5¶¶19,1). So if you use void* as your offset type, your code will end up casting to uint8_t* as soon as an offset into the buffer is needed anyway.
A note about GLib: guint8 and uint8_t are equivalent, as are gsize and size_t — so if you’re using GLib, you may want to use those type aliases instead.
For a more detailed explanation with some further arguments, see this nice article about size_t and ptrdiff_t by Karpov Andrey.
Technically, it’s architecture-dependent whether char is signed or unsigned (C11 standard, §6.2.5¶15), and whether it’s actually eight bits wide (though in practice it is really always eight bits wide), which is another reason to use uint8_t. ↩