A bunch of small technical notes. There's a dedicated RSS feed for this section if you want to be notified of new notes.
alloc_size attribute (22 Jun 2024)
For people who use custom allocators, GCC has an alluring alloc_size attribute
which informs the compiler about your allocation functions and allows it to
understand the size of the resulting allocation.
This can be useful for catching out-of-bounds accesses via UBSan when debugging
or hardening against buffer overflows in production via _FORTIFY_SOURCE.
In theory this is pretty great, but in practice alloc_size is dangerously
broken in GCC due to this bug, which causes GCC to lose the alloc_size
information when the allocation function is inlined.
Let's say you have a simple bump allocator: an alloc function to do the
allocation and a resize function that resizes the last allocation in-place.
__attribute(( alloc_size(1, 2) )) void *alloc(int count, int size);
__attribute(( alloc_size(2, 3) )) void *resize(void *oldptr, int count, int size);
Now you allocate an integer. If the alloc() call isn't inlined, GCC will
believe p points to one valid int.
int *p = alloc(1, sizeof(int)); // not inlined.
p[0] = 0; // GCC thinks `p` points to a single `int`, so this is okay.
Now if you extend p but the resize() call gets inlined, then GCC will continue
thinking that p points to only a single valid int instead of 2!
p = resize(p, 2, sizeof(int)); // inlined, thus loses the new size information
// GCC *still* thinks `p` points to only a single `int` from the earlier `alloc`
// call and thus thinks `p[1]` is out of bound (!!!)
p[1] = 1;
You can try adding noinline to all your allocators to prevent this bug.
But for simple allocators, like an arena allocator with very few instructions,
forcing noinline would be a performance regression.
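As a rough sketch of the noinline workaround, here is a minimal bump allocator with the same alloc/resize signatures as above. The arena size, the 16-byte alignment and the helper names (arena_buf, alloc_bytes, etc.) are assumptions made up for illustration, not taken from any real allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size arena backing the bump allocator. */
static _Alignas(16) unsigned char arena_buf[4096];
static size_t arena_used; /* offset of the first free byte */
static size_t arena_last; /* offset where the most recent allocation starts */

/* Returns count * size, or (size_t)-1 if it cannot fit in the arena. */
static size_t alloc_bytes(int count, int size)
{
    assert(count >= 0 && size >= 0); /* negative sizes are a caller bug */
    if (size != 0 && (size_t)count > sizeof arena_buf / (size_t)size)
        return (size_t)-1;
    return (size_t)count * (size_t)size;
}

/* noinline keeps the call opaque, so the caller only sees the attribute. */
__attribute__((noinline, alloc_size(1, 2)))
void *alloc(int count, int size)
{
    size_t n = alloc_bytes(count, size);
    size_t start = (arena_used + 15) & ~(size_t)15; /* round up to 16 bytes */
    if (n == (size_t)-1 || n > sizeof arena_buf - start)
        return NULL;
    arena_last = start;
    arena_used = start + n;
    return arena_buf + start;
}

/* Resizes the most recent allocation in place; other pointers are rejected. */
__attribute__((noinline, alloc_size(2, 3)))
void *resize(void *oldptr, int count, int size)
{
    size_t n = alloc_bytes(count, size);
    if (oldptr != (void *)(arena_buf + arena_last))
        return NULL;
    if (n == (size_t)-1 || n > sizeof arena_buf - arena_last)
        return NULL;
    arena_used = arena_last + n;
    return oldptr;
}
```

Each allocation now costs a real function call, which is exactly the performance trade-off mentioned above.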
Clang doesn't seem to have this particular inlining bug.
But the sibling malloc attribute behaves a bit suspiciously,
which doesn't inspire much confidence.
Not many people use custom allocators (not counting LD_PRELOAD-ed malloc
drop-ins), and even fewer people have their allocator code visible to the
compiler (e.g. compiling as a single translation unit or via LTO).
So it probably shouldn't be surprising that these attributes are buggy when used
in non-conventional settings: they barely get any testing there.
And so all things considered, if you're using a custom allocator, it's best to
simply avoid both the malloc and alloc_size attributes altogether and take
any compiler bugs associated with these attributes out of the equation.
(31 Aug 2023)
GCC has a nifty flag called -fwhole-program which basically marks all
functions (except "main") and global variables as static so that they can be
optimized more aggressively.
This is pretty great if you're trying to jumbo-build other programs
that weren't written with jumbo builds in mind.
However, if everything is manually marked as static already then
(unsurprisingly) -fwhole-program has zero effect.
On the other hand, it does have some downsides for "libc-free" programs, such as
having to mark non-conventional entry-points as externally_visible or being left
with undefined references due to buggy optimization.
Clang also doesn't support this flag (or the externally_visible attribute), so
that adds some more annoyances.
-fwhole-program is still useful for jumbo-building other software, but if
you're writing something with jumbo builds in mind already then it's better and
more portable to mark everything as static manually and avoid -fwhole-program.
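A tiny sketch of the "mark everything static manually" approach, assuming a single translation unit. The function names (digit_value, digit_sum, app_entry) are invented for the example; only the entry point is left with external linkage.

```c
#include <assert.h>
#include <stddef.h>

/* Internal helpers: `static` gives them internal linkage, so the compiler
 * knows every caller lives in this translation unit. */
static int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

static int digit_sum(const char *s)
{
    int sum = 0;
    for (; *s != '\0'; s++) {
        int d = digit_value(*s);
        if (d >= 0)
            sum += d; /* non-digit characters are skipped */
    }
    return sum;
}

/* The only externally visible symbol (hypothetical entry point). */
int app_entry(const char *arg)
{
    assert(arg != NULL);
    return digit_sum(arg);
}
```

Since this is plain standard C, it should behave the same on GCC and Clang without any compiler-specific flag or attribute.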
(29 Mar 2023)
Assertions are an incredibly useful debugging tool. They are also incredibly misunderstood: novices often mistake them for an error-handler, and packagers (apparently) mistake them for "security" checks.
Lua unfortunately doesn't have true assertions. It does have an assert
function - but unlike a debugging tool (which is what assertions are supposed
to be) - the Lua assert sticks around in the source no matter what.
To get around this limitation, I'm using the following method.
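-- at the top of the file
local myassert = function() end
-- and then at "startup"
-- (`debug` stands for your own debug flag; in stock Lua the global `debug`
-- is the debug library table, which is always truthy)
if debug then
    myassert = assert
end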
This turns the assert into a no-op in non-debug environments. However, unlike C, Lua doesn't have a "pre-processor", and since Lua is (typically) interpreted rather than compiled, there's no way (that I'm aware of) to avoid evaluating the argument itself.
For my use-cases it's not a fatal flaw. However, it is disappointing that Lua's
assert is basically just a fatal-error-handler - which just further propagates
the misunderstanding of assertions.
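For contrast, here is a small C sketch of what the pre-processor buys you: when the debug macro isn't defined, the assertion's argument is discarded at compile time and never evaluated. MYDEBUG, MYASSERT and the helper functions are hypothetical names picked for this example.

```c
#include <assert.h>

/* MYDEBUG and MYASSERT are hypothetical names used for this sketch. */
#ifdef MYDEBUG
#  define MYASSERT(cond) assert(cond)
#else
#  define MYASSERT(cond) ((void)0) /* argument is discarded, never evaluated */
#endif

static int check_calls; /* counts how often the check actually ran */

int expensive_check(void)
{
    check_calls++;
    return 1;
}

/* Returns how many times the check ran: 0 unless built with -DMYDEBUG. */
int run_checked_step(void)
{
    check_calls = 0;
    MYASSERT(expensive_check());
    return check_calls;
}
```

The Lua version above still pays for evaluating `expensive_check()` even when myassert is the empty function.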
(25 Feb 2023)
One useful but relatively unknown feature of realloc is that if the ptr
argument is NULL, it acts as malloc. So instead of writing:
if (p == NULL) /* first time around, allocate */
    p = malloc(...);
else /* grow the buffer */
    p = realloc(p, ...);
You can get rid of the if statement and write the following:
p = realloc(p, ...);
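A slightly fuller sketch of the same idea: a growable buffer that starts out as a NULL pointer and relies on realloc acting as malloc on the first push. The struct and function names are made up for illustration, and the sketch assigns the result to a temporary first so the old buffer isn't lost if realloc fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable int buffer; start from a zero-initialised struct. */
struct intbuf {
    int *data;  /* NULL until the first push */
    size_t len; /* elements in use */
    size_t cap; /* elements allocated */
};

/* Appends v, growing the buffer as needed. Returns 0 on success, -1 on failure. */
int intbuf_push(struct intbuf *b, int v)
{
    assert(b != NULL);
    if (b->len == b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 8; /* never zero */
        if (ncap > SIZE_MAX / sizeof *b->data)
            return -1;
        /* b->data is NULL the first time around, so this acts as malloc. */
        int *p = realloc(b->data, ncap * sizeof *p);
        if (p == NULL)
            return -1; /* the old buffer is still valid and untouched */
        b->data = p;
        b->cap = ncap;
    }
    b->data[b->len++] = v;
    return 0;
}
```

Usage would look like `struct intbuf b = {0}; intbuf_push(&b, 42);` followed by `free(b.data)` when done.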
It's also worth noting that older C standards say that if the size argument of
realloc is 0 and the pointer is non-null, it acts as free. However, not all
implementations follow this, so it is non-portable in practice. Because of
this, zero-sized realloc has been made undefined in the upcoming C2x (C23)
standard as well. So you should definitely avoid it.
(19 Feb 2023)
Given that strtoul is supposed to parse an unsigned long, you'd expect
something like "-1" to fail with ERANGE because it's outside of [0, ULONG_MAX].
However, that's not what it does. Instead it's defined to return -num (negated
as an unsigned long) for "-$num" input, i.e. ULONG_MAX for "-1" (1, 2).
So be wary if you decide to use this function (you also need to be wary of locales).
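One way to guard against this, sketched under the assumption that you want base-10 input with nothing trailing: reject a leading minus sign yourself before handing the string to strtoul. The parse_ulong name and its return convention are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses a non-negative base-10 number. Returns 0 on success, -1 on failure. */
int parse_ulong(const char *s, unsigned long *out)
{
    assert(s != NULL && out != NULL);
    /* strtoul skips leading whitespace itself, so skip it here first
     * in order to look at the sign character. */
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '-')
        return -1; /* strtoul would silently negate instead of failing */

    char *end;
    errno = 0;
    unsigned long v = strtoul(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1; /* no digits, trailing junk, or out of range */
    *out = v;
    return 0;
}
```

Note that isspace is locale-dependent too, so this doesn't remove the locale caveat.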
(19 Feb 2023)
The idiomatic (x >> (32 - r)) | (x << r) isn't safe when the rotation amount r
can be 0 (assuming x is 32-bit, it'll lead to x >> 32, which is UB).
Use this instead (note that the information about clang's optimizer in that
article is outdated/fixed now):
(x >> (-(unsigned)r & 31)) | (x << r)
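Wrapped in a function, the safe form might look like the sketch below. The rotl32 name is an assumption, and the function still requires r < 32, since x << r is itself undefined for larger shift amounts.

```c
#include <assert.h>
#include <stdint.h>

/* Rotates x left by r bits; r must be in [0, 31]. */
uint32_t rotl32(uint32_t x, unsigned r)
{
    assert(r < 32);
    /* -r & 31 is (32 - r) mod 32, so r == 0 shifts right by 0 instead of 32. */
    return (x << r) | (x >> (-r & 31));
}
```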
(19 Feb 2023)
[lindex [lsort -integer $list] end] is significantly faster than
[tcl::mathfunc::max {*}$list]. I'm not entirely sure why. It might be due to
max() needing to deal with floating-point while the integer sort doesn't.