NRK's NOTES

A bunch of small technical notes. There's a dedicated RSS feed for this section if you want to be notified of new notes.

Notes on alloc_size attribute

(22 Jun 2024)

For people who use custom allocators, GCC has an alluring alloc_size attribute which informs the compiler about your allocation functions and allows it to understand the size of the resulting allocation. This can be useful for catching out-of-bounds accesses via UBSan when debugging, or for hardening against buffer overflows in production via _FORTIFY_SOURCE.

In theory this is pretty great, but in practice alloc_size is very much dangerously broken in GCC due to this bug which causes GCC to lose the alloc_size information when inlined.

Let's say you have a simple bump allocator with an alloc function that does the allocation and a resize function that resizes the last allocation in place.

__attribute(( alloc_size(1, 2) )) void *alloc(int count, int size);
__attribute(( alloc_size(2, 3) )) void *resize(void *oldptr, int count, int size);

Now you allocate an integer. If the alloc() call isn't inlined, GCC will believe p points to one valid int.

int *p = alloc(1, sizeof(int));  // not inlined.
p[0] = 0; // GCC thinks `p` points to a single `int`, so this is okay.

Now if you extend p but the resize() call gets inlined, then GCC will continue thinking that p points to only a single valid int instead of 2!

p = resize(p, 2, sizeof(int)); // inlined, thus loses the new size information
// GCC *still* thinks `p` points to only a single `int` from the earlier `alloc`
// call and thus thinks `p[1]` is out of bound (!!!)
p[1] = 1;

You can try adding noinline to all your allocators to prevent this bug. But for simple allocators, like an arena allocator with very few instructions, forcing noinline would be a performance regression.
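
As a rough sketch of what that workaround could look like, here is a tiny bump allocator with noinline added next to alloc_size. The arena layout, ARENA_CAP, ARENA_ALIGN and the lazy malloc-backed buffer are all assumptions made up for illustration, not code from any real project, and this doesn't claim that noinline is a complete fix.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena parameters, chosen only for this sketch. */
enum { ARENA_CAP = 1 << 12, ARENA_ALIGN = 16 };

static unsigned char *arena_base;  /* backing memory, obtained lazily from malloc */
static size_t arena_used;          /* bytes handed out so far */
static size_t arena_last;          /* offset of the most recent allocation */

/* noinline keeps the call visible as a call, so alloc_size stays attached. */
__attribute__((noinline, alloc_size(1, 2)))
void *alloc(int count, int size)
{
    if (count < 0 || size <= 0 || (size_t)count > ARENA_CAP / (size_t)size)
        return NULL;
    if (arena_base == NULL && (arena_base = malloc(ARENA_CAP)) == NULL)
        return NULL;
    /* round the start offset up to the next multiple of ARENA_ALIGN */
    size_t start = (arena_used + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    size_t total = (size_t)count * (size_t)size;
    if (total > ARENA_CAP - start)
        return NULL;
    arena_last = start;
    arena_used = start + total;
    return arena_base + start;
}

/* Resizes the most recent allocation in place; fails for anything else. */
__attribute__((noinline, alloc_size(2, 3)))
void *resize(void *oldptr, int count, int size)
{
    if (oldptr == NULL || arena_base == NULL ||
        (unsigned char *)oldptr != arena_base + arena_last)
        return NULL;
    if (count < 0 || size <= 0 || (size_t)count > ARENA_CAP / (size_t)size)
        return NULL;
    size_t total = (size_t)count * (size_t)size;
    if (total > ARENA_CAP - arena_last)
        return NULL;
    arena_used = arena_last + total;
    return oldptr;
}
```

The cost is exactly the one described above: every allocation becomes a real function call, even though the fast path is only a handful of instructions.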

Clang doesn't seem to have this particular inlining bug. But a sibling malloc attribute behaves a bit suspiciously, which does not inspire much confidence.

Not many people use custom allocators (not counting LD_PRELOAD-ed malloc drop-ins), and even fewer people have their allocator code visible to the compiler (e.g. compiling as a single translation unit or via LTO). So it probably shouldn't be surprising that these attributes are buggy when used in non-conventional settings: they barely get any testing.

And so all things considered, if you're using a custom allocator, it's best to simply avoid both the malloc and alloc_size attributes altogether and take any compiler bugs associated with these attributes out of the equation.

Notes on -fwhole-program

(31 Aug 2023)

GCC has a nifty flag called -fwhole-program which basically marks all functions (except "main") and global variables as static so that they can be optimized more aggressively. This is pretty great if you're trying to jumbo-build other programs that weren't written with jumbo builds in mind.

However, if everything is manually marked as static already then (unsurprisingly) -fwhole-program has zero effect. On the other hand, it does have some downsides for "libc-free" programs, such as having to mark non-conventional entry-points as externally_visible or leaving you with undefined references due to buggy optimization. Clang also doesn't support this flag (or the externally_visible attribute), so that adds some more annoyances.

-fwhole-program is still useful for jumbo building other software but if you're writing something with jumbo-build in mind already then it's better and more portable to just mark everything as static manually and just avoid -fwhole-program.
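
A minimal sketch of the "mark everything static manually" approach, assuming a made-up single-file program. All names here are hypothetical, and entry_point just stands in for main or a custom entry point.

```c
/* Hypothetical single-file ("jumbo") program: everything except the entry
 * point gets internal linkage by hand, so no special flag is needed. */
static unsigned calls;  /* file-local global variable */

static unsigned square(unsigned x)
{
    calls++;
    return x * x;
}

static unsigned sum_of_squares(unsigned n)
{
    unsigned sum = 0;
    for (unsigned i = 1; i <= n; i++)
        sum += square(i);
    return sum;
}

/* The only symbol with external linkage. */
unsigned entry_point(void)
{
    calls = 0;
    unsigned sum = sum_of_squares(3);  /* 1 + 4 + 9 */
    return sum + calls;                /* plus the 3 calls to square() */
}
```

Since nothing here depends on a compiler-specific flag or attribute, the same source should build the same way with both GCC and Clang.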

Lua: no true assertions

(29 Mar 2023)

Assertions are an incredibly useful debugging tool (they are also incredibly misunderstood: novices often mistake them for an error-handler and packagers (apparently) mistake them for "security" checks).

Lua unfortunately doesn't have true assertions. It does have an assert function - but unlike a debugging tool (which is what assertions are supposed to be) - the Lua assert sticks around in the source no matter what.

To get around this limitation, I'm using the following method.

-- at the top of the file
local myassert = function() end

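-- and then at "startup"
-- (debug_enabled is a placeholder for your own flag; in stock Lua the global
-- `debug` is the debug library table and therefore always truthy)
if debug_enabled then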
    myassert = assert
end

This turns the assert into a no-op in non-debug environments. However, unlike C, Lua doesn't have a "pre-processor" and since Lua is (typically) interpreted rather than compiled, there's no way (that I'm aware of) to avoid evaluating the argument itself.
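
For contrast, here is a small C sketch of what the pre-processor buys you. MY_DEBUG, MYASSERT, expensive_check and run_checks are hypothetical names invented for this example; the standard assert/NDEBUG pair works along the same lines.

```c
#include <stdlib.h>

/* MY_DEBUG and MYASSERT are made-up names for this sketch. */
#ifdef MY_DEBUG
#  define MYASSERT(cond) ((cond) ? (void)0 : abort())
#else
#  define MYASSERT(cond) ((void)0)  /* the argument is never even compiled */
#endif

static int evaluations;  /* counts how often the check actually ran */

int expensive_check(void)
{
    evaluations++;
    return 1;
}

/* Returns how many times the assertion argument was evaluated. */
int run_checks(void)
{
    evaluations = 0;
    MYASSERT(expensive_check());
    return evaluations;
}
```

Without MY_DEBUG defined, run_checks() reports zero evaluations because the macro discards its argument before compilation. The Lua version above still calls whatever is passed to myassert.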

For my use-cases it's not a fatal flaw. However, it is disappointing that Lua's assert is basically just a fatal-error-handler - which just further propagates the misunderstanding of assertions.

C: realloc also acts as malloc

(25 Feb 2023)

One useful, but relatively unknown feature of realloc is that if the ptr argument is NULL, it acts as malloc. So instead of writing:

if (p == NULL) /* first time around, allocate */
    p = malloc(...);
else /* grow the buffer */
    p = realloc(p, ...);

You can get rid of the if statement and write the following:

p = realloc(p, ...);
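
As a slightly fuller sketch, here is a growable int buffer that starts out as NULL and relies on realloc for the first allocation too. The intvec type and intvec_push are hypothetical names made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable buffer; zero-initialize it before first use. */
struct intvec {
    int *data;   /* NULL until the first push */
    size_t len;  /* elements in use */
    size_t cap;  /* elements allocated */
};

/* Appends value; returns 1 on success, 0 on allocation failure. */
int intvec_push(struct intvec *v, int value)
{
    if (v->len == v->cap) {
        size_t newcap = v->cap ? v->cap * 2 : 8;
        if (newcap > SIZE_MAX / sizeof(int))
            return 0;
        /* data is NULL the first time around, so this acts as malloc */
        int *tmp = realloc(v->data, newcap * sizeof(int));
        if (tmp == NULL)
            return 0;  /* the old buffer is still valid and still owned by v */
        v->data = tmp;
        v->cap = newcap;
    }
    v->data[v->len++] = value;
    return 1;
}
```

The result goes into a temporary first because assigning realloc's return value straight back to the only pointer would leak the old buffer on failure.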

Also worth noting that older C standards say that if the size argument of realloc is 0 and the pointer is non-null, it acts as free. However, not all implementations follow this, so it is non-portable in practice. Because of this, zero-sized realloc has been made undefined in the upcoming C2x standard as well. So you should definitely avoid it.

C: Surprising behavior with strtoul

(19 Feb 2023)

Given that strtoul is supposed to parse an unsigned long, you'd expect something like "-1" to fail with ERANGE for being outside of [0, ULONG_MAX].

However, that's not what it does. Instead it's defined to return -num for "-$num" input, i.e. ULONG_MAX for "-1" (1, 2).

So be wary if you decide to use this function (you need to be wary of locales too).
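
One way to guard against this, sketched as a small wrapper. parse_ulong is a hypothetical helper, not a standard function; it assumes base 10 and treats trailing garbage as an error.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical strict wrapper: returns 1 and stores the value on success,
 * returns 0 for empty, negative, out-of-range or partially parsed input. */
int parse_ulong(const char *s, unsigned long *out)
{
    /* skip the same leading whitespace that strtoul would skip */
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-')
        return 0;  /* strtoul would silently negate instead of failing */

    char *end;
    errno = 0;
    unsigned long v = strtoul(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}
```

This only deals with the sign; isspace and strtoul still depend on the current locale.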

C: Safe bitwise rotation

(19 Feb 2023)

The idiomatic (x >> (32 - r)) | (x << r) isn't safe when the rotation amount r can be 0 (assuming x is 32-bit, it'll lead to x >> 32 which is UB).

Use this instead (note that the information about clang's optimizer in that article is outdated/fixed now): (x >> (-(unsigned)r & 31)) | (x << r)
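
Wrapped into a function, that could look like the sketch below. The name rotl32 is just a hypothetical choice, and the assert spells out the remaining assumption that r stays below 32 (the left shift would be UB otherwise).

```c
#include <assert.h>
#include <stdint.h>

/* Rotates x left by r bits; r == 0 is fine, but r must be < 32. */
uint32_t rotl32(uint32_t x, unsigned r)
{
    assert(r < 32);
    /* -r & 31 equals (32 - r) % 32, so the right shift is never by 32 */
    return (x >> (-r & 31)) | (x << r);
}
```

For example, rotl32(0x12345678, 8) gives 0x34567812, and rotl32(x, 0) simply returns x.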

Tcl: fast max/min integer out of a list

(19 Feb 2023)

[lindex [lsort -integer $list] end] is significantly faster than [tcl::mathfunc::max {*}$list]. I'm not entirely sure why. Might be due to max() needing to deal with floating-point while the integer sort doesn't.


