A bunch of small technical notes. There's a dedicated RSS feed for this section if you want to be notified of new notes.
alloc_size attribute (22 Jun 2024)
For people who use custom allocators, GCC has an alluring alloc_size
attribute which informs the compiler about your allocation
functions and allows it to understand the size of the resulting allocation.
This can be useful for catching out of bound accesses via UBSan when debugging
or hardening against buffer overflows in production via _FORTIFY_SOURCE.
In theory this is pretty great, but in practice alloc_size is
dangerously broken in GCC due to this bug, which causes
GCC to lose the alloc_size information when the allocation function is inlined.
Let's say you have a simple bump allocator: an alloc function to do the allocation
and a resize function that resizes the last allocation in-place.
__attribute(( alloc_size(1, 2) )) void *alloc(int count, int size);
__attribute(( alloc_size(2, 3) )) void *resize(void *oldptr, int count, int size);
Now you allocate an integer. If the alloc() call isn't inlined, GCC will believe
p points to one valid int.
int *p = alloc(1, sizeof(int)); // not inlined.
p[0] = 0; // GCC thinks `p` points to a single `int`, so this is okay.
Now if you extend p but the resize() call gets inlined, then GCC will continue
thinking that p points to only a single valid int instead of 2!
p = resize(p, 2, sizeof(int)); // inlined, thus loses the new size information
// GCC *still* thinks `p` points to only a single `int` from the earlier `alloc`
// call and thus thinks `p[1]` is out of bound (!!!)
p[1] = 1;
You can try adding noinline to all your allocators to prevent this bug.
But for simple allocators, like an arena allocator with
very few instructions, forcing noinline would be a performance regression.
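For reference, the noinline workaround could look roughly like the sketch below. Only the alloc and resize signatures come from the declarations above. The fixed-size arena, its bookkeeping variables and the error handling are assumptions made up for illustration, not the allocator this note is about.

```c
#include <stddef.h>

/* Hypothetical fixed-size arena; the names and the size are illustrative only. */
static _Alignas(max_align_t) unsigned char arena_buf[1 << 16];
static size_t arena_used; /* offset of the first free byte */
static size_t arena_last; /* offset of the most recent allocation */

/* noinline keeps the call visible to GCC so the alloc_size info isn't lost. */
__attribute__((noinline, alloc_size(1, 2)))
void *alloc(int count, int size)
{
    if (count < 0 || size <= 0)
        return NULL;
    size_t align = _Alignof(max_align_t);
    size_t start = (arena_used + align - 1) & ~(align - 1);
    if ((size_t)count > (sizeof arena_buf - start) / (size_t)size)
        return NULL; /* out of space */
    arena_last = start;
    arena_used = start + (size_t)count * (size_t)size;
    return arena_buf + start;
}

/* Resizes the last allocation in-place; anything else is rejected. */
__attribute__((noinline, alloc_size(2, 3)))
void *resize(void *oldptr, int count, int size)
{
    if (count < 0 || size <= 0)
        return NULL;
    if ((unsigned char *)oldptr != arena_buf + arena_last)
        return NULL; /* not the last allocation */
    if ((size_t)count > (sizeof arena_buf - arena_last) / (size_t)size)
        return NULL; /* out of space */
    arena_used = arena_last + (size_t)count * (size_t)size;
    return oldptr;
}
```

The cost is exactly the one described above: every allocation becomes a real function call, even though the fast path is only a handful of instructions.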
Clang doesn't seem to have this particular inlining bug.
But the sibling malloc attribute behaves a bit suspiciously,
which doesn't inspire much confidence.
Not many people use custom allocators (not counting LD_PRELOAD-ed malloc
drop-ins), and even fewer people have their allocator code visible to the
compiler (e.g. compiling as a single translation unit or via LTO).
So it probably shouldn't be surprising that these attributes are buggy when used
in non-conventional settings: they barely get any testing.
And so all things considered, if you're using a custom allocator, it's best to
simply avoid both the malloc and alloc_size attributes altogether and take
any compiler bugs that are associated with these attributes out of the equation.
(31 Aug 2023)
GCC has a nifty flag called -fwhole-program which basically marks all
functions (except "main") and global variables as static so that they can be
optimized more aggressively.
This is pretty great if you're trying to jumbo-build other programs
that weren't written with jumbo builds in mind.
However, if everything is manually marked as static already then
(unsurprisingly) -fwhole-program has zero effect.
On the other hand, it does have some downsides for "libc-free" programs, such as
having to mark non-conventional entry-points as externally_visible or leaving
you with undefined references due to buggy optimization.
Clang also doesn't support this flag (or the externally_visible attribute) so
that adds some more annoyances.
-fwhole-program is still useful for jumbo building other software, but if
you're writing something with jumbo builds in mind already then it's better and
more portable to mark everything as static manually and avoid
-fwhole-program.
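A minimal sketch of what the manual approach might look like in a single translation unit. The ENTRY macro and the function names are hypothetical. The macro only expands to externally_visible on GCC, since Clang doesn't support that attribute.

```c
/* ENTRY is a made-up helper macro for this sketch. */
#if defined(__GNUC__) && !defined(__clang__)
#  define ENTRY __attribute__((externally_visible))
#else
#  define ENTRY
#endif

/* Internal helper: static, so the compiler knows nothing outside this
 * translation unit can call it, with or without -fwhole-program. */
static int add(int a, int b)
{
    return a + b;
}

/* Hypothetical non-conventional entry-point that has to stay visible. */
ENTRY int entry_point(void)
{
    return add(2, 3);
}
```

With everything else already static, the attribute is only needed if you still build with -fwhole-program. Without that flag the macro can be dropped entirely.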
(29 Mar 2023)
Assertions are an incredibly useful debugging tool. They are also incredibly misunderstood: novices often mistake them for an error-handler and packagers (apparently) mistake them for "security" checks.
Lua unfortunately doesn't have true assertions. It does have an
assert function - but unlike a debugging tool (which is what
assertions are supposed to be) - the Lua assert stays active
no matter what.
To get around this limitation, I'm using the following method.
-- at the top of the file
local myassert = function() end
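-- and then at "startup" (`debug` stands for your own flag here; the stock
-- `debug` global is Lua's debug library, which is always truthy)
if debug then
    myassert = assert
end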
This turns the assert into a no-op in a non-debug environment. However, unlike C, Lua doesn't have a "pre-processor" and since Lua is (typically) interpreted rather than compiled, there's no way (that I'm aware of) to avoid evaluating the argument itself.
For my use-cases it's not a fatal flaw. However, it is disappointing that Lua's
assert is basically just a fatal-error-handler - which just further propagates
the misunderstanding of assertions.
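For contrast, here is a small C sketch of what the pre-processor buys you: when the assertion is disabled, the argument expression is removed from the program entirely and never evaluated. MY_DEBUG, MY_ASSERT and the helper functions are hypothetical names. The standard assert with NDEBUG works on the same principle.

```c
#include <stdlib.h>

/* MY_DEBUG and MY_ASSERT are hypothetical names for this sketch. */
#ifdef MY_DEBUG
#  define MY_ASSERT(cond) do { if (!(cond)) abort(); } while (0)
#else
#  define MY_ASSERT(cond) ((void)0) /* the argument is discarded, never evaluated */
#endif

static int check_calls; /* counts how often the check actually runs */

int expensive_check(void)
{
    check_calls++;
    return 1;
}

int run_with_assert(void)
{
    MY_ASSERT(expensive_check());
    return check_calls;
}
```

Built without -DMY_DEBUG, expensive_check() is never called. In the Lua version above, the argument is still evaluated before the no-op function receives it.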
(25 Feb 2023)
One useful, but relatively unknown feature of realloc is that if
the ptr argument is NULL, it acts as malloc. So instead of writing:
if (p == NULL) /* first time around, allocate */
    p = malloc(...);
else /* grow the buffer */
    p = realloc(p, ...);
You can get rid of the if statement and write the following:
p = realloc(p, ...);
Also worth noting that older C standards say that if the size argument of
realloc is 0 and the pointer is non-null, it acts as free. However, not all
implementations follow this, and thus it is non-portable in practice. Due to
this, zero-sized realloc has been made undefined in the upcoming C2x (C23)
standard as well. So you should definitely avoid it.
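As a rough illustration, a growable buffer can rely on the NULL behaviour and skip the separate malloc branch. The IntBuf type, the function name and the growth policy below are made up for this sketch. The capacity is never allowed to be 0, so a zero-sized realloc cannot happen.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable int buffer; zero-initialize it before first use. */
typedef struct {
    int *data;
    size_t len, cap;
} IntBuf;

/* Appends v to the buffer. Returns 1 on success, 0 on allocation failure. */
int intbuf_push(IntBuf *b, int v)
{
    if (b->len == b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 8; /* never 0 */
        if (ncap > SIZE_MAX / sizeof *b->data)
            return 0; /* byte count would overflow */
        /* On the first call b->data is NULL, so this acts as malloc. */
        int *p = realloc(b->data, ncap * sizeof *p);
        if (p == NULL)
            return 0; /* the old buffer is still valid */
        b->data = p;
        b->cap = ncap;
    }
    b->data[b->len++] = v;
    return 1;
}
```

Assigning the result to a temporary first keeps the old pointer around when realloc fails, so nothing leaks.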
(19 Feb 2023)
Given that strtoul
is supposed to parse an unsigned long, you'd expect something like "-1" to
fail with ERANGE for being outside of [0, ULONG_MAX].
However, that's not what it does. Instead it's defined to return -num for
"-$num" input, i.e. ULONG_MAX for "-1" (1, 2).
So be wary if you decide to use this function (you also need to be wary of locales).
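If you still want to use strtoul, one possible way to get the "expected" behaviour is to reject a leading minus sign yourself before calling it. The wrapper below is only a sketch: parse_ulong is a made-up name, and insisting on base 10 with no trailing characters is an assumption rather than anything strtoul requires.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical strict wrapper: returns 1 and stores the value on success,
 * returns 0 on negative, empty, malformed or out-of-range input. */
int parse_ulong(const char *s, unsigned long *out)
{
    /* Skip the leading whitespace that strtoul would skip anyway
     * (isspace is locale-dependent, just like strtoul). */
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-')
        return 0; /* strtoul would silently negate instead of failing */

    char *end;
    errno = 0;
    unsigned long v = strtoul(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}
```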
(19 Feb 2023)
The idiomatic left-rotate (x >> (32 - r)) | (x << r) isn't safe when the rotation amount
r can be 0 (assuming x is 32-bit, it'll lead to x >> 32 which is UB).
Use this instead (note that the
information about clang's optimizer in that article is outdated; the issue has since been fixed):
(x >> (-(unsigned)r & 31)) | (x << r)
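Wrapped in a function, the safe variant might look like the following sketch. The name rotl32 is made up, and the extra mask on r is an addition so that rotation amounts of 32 or more can't make the left shift undefined either.

```c
#include <stdint.h>

/* Rotate x left by r bits; well-defined for any r, including 0. */
uint32_t rotl32(uint32_t x, unsigned r)
{
    r &= 31; /* keeps the left shift in range too */
    return (x << r) | (x >> (-r & 31));
}
```

For r == 0 both shift amounts are 0, so the expression reduces to x | x.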
(19 Feb 2023)
[lindex [lsort -integer $list] end] is significantly faster than
[tcl::mathfunc::max {*}$list]. I'm not entirely sure why. Might be due to
max() needing to deal with floating-point while the integer sort doesn't.