A gentle introduction to static analyzers for C

16 Sep 2023

One of the great things about C is the number of fairly mature tools in its ecosystem. The bad news, however, is that most educational materials don't teach them, so many beginners (and sometimes even experienced developers) have absolutely no awareness that these tools even exist.

Today I'd like to tackle this a tad bit by focusing on static-analyzers. Static analyzers are tools that can analyze your source code and report potential bugs without having to run the source code (hence the "static" in the name).

Since this is supposed to be a gentle introduction (mainly aimed at beginners) I'll be focusing on static analyzers with the following properties:

Easily accessible.
Zero (or near zero) setup.
Focus on producing the fewest false-positives (FP).

Compiler warnings

Before going into more specialized static-analyzers, it's worthwhile to talk a bit about compiler warnings. While compiler warnings are probably the most used form of static analysis, they're still criminally underused by beginners.

A lot of static-analyzers also expect that you already have compiler warnings turned on and thus do not attempt to catch mistakes that compilers already can catch. So it's important to set up some decent warning flags for your compiler.

For GCC and clang:

-Wall turns on a group of warnings that can catch common mistakes.
-Wpedantic can warn about certain non-portable extension usage. This is useful so you don't unknowingly end up introducing compiler extensions. However, keep in mind that -Wpedantic isn't a conformance checker: the absence of a warning doesn't mean the code is fully standard compliant.

The above flags alone will almost always end up catching a bug or two (typically more) in a beginner's code. However, despite having "all" in the name, -Wall doesn't actually turn on all warnings. So a third flag, -Wextra, is also useful: it turns on some additional (and sometimes noisy) warnings.
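To make the three flags a bit more concrete, here's a small sketch with one deliberate mistake per flag. The function names are made up for illustration, and the exact wording of the diagnostics varies between compilers and versions.

```c
#include <stddef.h>

/* -Wall (through -Wparentheses) flags the assignment used as a condition. */
int is_zero_buggy(int x)
{
    if (x = 0)          /* meant x == 0; the condition is always false */
        return 1;
    return 0;
}

/* -Wextra (through -Wsign-compare) flags the signed/unsigned comparison. */
int exceeds(int i, size_t len)
{
    return i > len;     /* i is converted to size_t, so -1 compares as huge */
}

/* -Wpedantic flags the case range, which is a GNU extension. */
int is_digit_char(char c)
{
    switch (c) {
    case '0' ... '9':
        return 1;
    default:
        return 0;
    }
}
```

Compiling it with something like `gcc -Wall -Wextra -Wpedantic -c example.c` (the file name is arbitrary) should report all three, while a plain `gcc -c example.c` may stay silent.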

You might also want to selectively disable some noisy warnings by using -Wno-${warning_name}, e.g -Wno-pointer-sign will disable -Wpointer-sign warnings. However, you should always be aware of what a warning is trying to prevent and why that warning exists before disabling it.
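As an example of knowing what you're disabling, this is roughly the kind of code -Wpointer-sign complains about. It's a minimal sketch with hypothetical function names: the first version passes an `unsigned char` pointer where `strlen` expects a `char` pointer, the second makes the conversion explicit.

```c
#include <string.h>

/* Triggers -Wpointer-sign: the pointer target types differ in signedness. */
size_t byte_len_warns(const unsigned char *bytes)
{
    return strlen(bytes);
}

/* Same behaviour, but the explicit cast documents the intent and
 * keeps the warning quiet without disabling it globally. */
size_t byte_len(const unsigned char *bytes)
{
    return strlen((const char *)bytes);
}
```

If a code-base works with `unsigned char` buffers everywhere, casting at every call site may be more friction than it's worth, and that's the sort of case where -Wno-pointer-sign is a defensible choice.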

A couple other flags worth mentioning:

-Wshadow warns when a variable shadows an outer one. This isn't technically a bug, but unintended shadowing often leads to bugs so this ends up being a decent flag to use.
-Wstrict-prototypes warns about obsolete function declarations that leave the number of arguments unspecified.
-Wvla to catch usage of variable-length-arrays, which are non-portable and often used without caution.
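Here's a rough sketch of what each of these three flags reacts to. All names are invented for the example, and `scratch_size` only exists to show a variable-length-array in a harmless setting.

```c
#include <stddef.h>

/* -Wstrict-prototypes: before C23, empty parentheses leave the
 * argument list unspecified. Write (void) if there are none. */
int legacy_callback();

/* -Wshadow: the inner idx hides the outer one, so the result of the
 * search is thrown away and the function always returns -1. */
int find_first_negative_buggy(const int *v, int n)
{
    int idx = -1;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0) {
            int idx = i;    /* meant to assign to the outer idx */
            (void)idx;
            break;
        }
    }
    return idx;
}

/* -Wvla: the array size is only known at run time, and a large n
 * can silently overflow the stack. */
size_t scratch_size(size_t n)
{
    char scratch[n];
    return sizeof scratch;
}
```

Something like `gcc -Wshadow -Wstrict-prototypes -Wvla -c example.c` should point at all three spots.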

Further reading: "My favorite C compiler flags during development" (also includes advice on msvc).

And with some basic compiler flags out of the way, let's look at some more specialized static-analyzers.

A small caveat about GCC: for historical reasons, GCC's optimization and warnings are tied together. In other words, there are certain warnings (e.g -Wnull-dereference) which are only active when certain optimizations are active as well. So you'd want to also enable -O2 or above.
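The following is a minimal sketch of code where this matters. The names are hypothetical, and whether the warning actually fires depends on the GCC version and on what the optimizer manages to see, so treat it as an experiment to try rather than a guaranteed result.

```c
#include <stddef.h>

/* Returns NULL for every id except 1. */
static const char *name_for(int id)
{
    return id == 1 ? "one" : NULL;
}

char initial_for(int id)
{
    const char *name = name_for(id);
    return name[0];     /* name may be NULL when id != 1 */
}
```

Try compiling it once with `gcc -Wnull-dereference -c example.c` and once with `-O2` added, then compare the output.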

Cppcheck

Don't let the name fool you: cppcheck is perfectly capable of analyzing C code. It focuses on producing as few false-positives as possible by default, so you can just run it on a code-base with practically zero-setup:

$ cppcheck -j$(nproc) --enable=portability src/*.c

The -j flag enables multiple threads, which can speed up the process. And the "portability" flag enables some interesting portability warnings that other static-analyzers often miss.
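As one example of what the portability group looks for, here's a small sketch that does arithmetic on a `void` pointer. GCC and clang accept this as an extension, but it isn't standard C. I'm assuming cppcheck reports it under the id `arithOperationsOnVoidPointer`; the id and message may differ between versions. The function names are made up.

```c
#include <stddef.h>

/* Non-portable: arithmetic on void * is a GNU extension. */
void *advance(void *buf, size_t n)
{
    return buf + n;
}

/* Portable alternative: do the arithmetic on a char pointer. */
void *advance_portable(void *buf, size_t n)
{
    return (char *)buf + n;
}
```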

You can also get a bit more "strict" (read: more noisy) analysis using the following:

$ cppcheck -j$(nproc) --enable=style src/*.c

The style group enables all the warnings from the portability, performance and warning groups, and adds some stylistic warnings on top (e.g. reducing variable scope when possible).
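To give an idea of what a stylistic warning looks like, here's a minimal sketch of the variable scope case. I'd expect cppcheck to report it under the id `variableScope`, though that depends on the version. The function is hypothetical.

```c
int sum_if_enabled(int enabled, const int *values, int count)
{
    int i;              /* used only inside the if block below */
    int total = 0;

    if (enabled) {
        for (i = 0; i < count; i++)
            total += values[i];
    }
    return total;
}
```

Nothing here is a bug. The suggestion is merely to declare `i` closer to where it's used, which is why this group counts as the noisier one.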

You can selectively disable a check using the --suppress flag. A couple of other flags worth mentioning: --std to specify a standard, -q to make cppcheck quiet, and --inline-suppr to add support for inline suppression via comments. You can also use -D and -U to define and undefine macros, similar to the -D and -U compiler (technically pre-processor) flags.
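An inline suppression is a comment placed right before the line that gets reported. Here's a sketch, assuming the check being silenced is `variableScope` and that cppcheck is invoked with --inline-suppr (without that flag the comment is ignored). The function is made up for the example.

```c
#include <stddef.h>

size_t count_spaces(const char *s, int enabled)
{
    // cppcheck-suppress variableScope
    size_t i;
    size_t n = 0;

    if (enabled) {
        for (i = 0; s[i] != '\0'; i++) {
            if (s[i] == ' ')
                n++;
        }
    }
    return n;
}
```

The command line equivalent would be `--suppress=variableScope`, which silences the check everywhere rather than at this one spot.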

GCC's -fanalyzer

With a newer GCC version, you'll have an additional static analyzer which can be enabled simply by appending -fanalyzer to your compiler flags. I recommend using at least GCC v12, since in my experience there were a decent amount of FPs in older versions.

While GCC's analyzer isn't as mature as some other options, the direction looks promising. And the fact that it requires basically zero-setup makes it even more appealing.

Disabling certain checks is the same as disabling warnings, -Wno-${check_name}.

The same caveat about GCC's optimization pass still applies to -fanalyzer as well.
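To illustrate the kind of defect -fanalyzer goes after, here's a small sketch with two problems that ordinary warnings tend to miss: a leak on an early return and an unchecked `malloc` result. The function is hypothetical, and the check names I'd expect (`-Wanalyzer-malloc-leak` and `-Wanalyzer-possible-null-argument`) may vary with the GCC version.

```c
#include <stdlib.h>
#include <string.h>

/* Copies s if it fits in max characters, otherwise returns NULL. */
char *dup_upto(const char *s, size_t max)
{
    char *copy = malloc(max + 1);

    if (strlen(s) > max)
        return NULL;                    /* copy is leaked on this path */

    memcpy(copy, s, strlen(s) + 1);     /* copy may be NULL here */
    return copy;
}
```

Running `gcc -fanalyzer -c example.c` on this should print both problems along with the execution path that leads to each.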

Clang-tidy

I've been hesitant about whether to put clang-tidy in this list or not. On one hand, it's fairly powerful. On the other hand, its default list of checks contains a couple garbage checks and setting it up requires some effort compared to cppcheck or gcc's -fanalyzer.

But ultimately I decided to include it in the list since I think the effort is worthwhile: in my experience, clang-tidy has caught a number of bugs in real-world programs.

The very first thing you need to do is disable the "insecureAPI" check that's enabled by default. All it does is blindly flag standard functions as "unsafe" and recommend non-portable and dubious annex K variants. It's very disappointing that such low effort checks are enabled by default. It doesn't catch actual bugs and steers amateurs who don't know any better into writing non-portable code with a false sense of security.

Checks can be disabled at the command line via --checks or, more conveniently, by creating a .clang-tidy config file. If a check starts with - it's disabled, otherwise it's enabled. Globs are also supported, so -misc-* disables all checks under the misc category.

Here's a config which can serve as a good "baseline":

Checks: >
    performance-*,
    misc-*,
    android-cloexec-*,
    readability-duplicate-include,
    readability-misleading-indentation,
    bugprone-assert-side-effect,
    bugprone-macro-repeated-side-effects,
    bugprone-infinite-loop,
    bugprone-macro-parentheses,
    bugprone-posix-return,
    bugprone-reserved-identifier,
    bugprone-signal-handler,
    bugprone-signed-char-misuse,
    bugprone-sizeof-expression,
    bugprone-branch-clone,
    -clang-analyzer-security.insecureAPI.*,
    -misc-no-recursion,

# treat all warnings as errors
WarningsAsErrors: '*'

CheckOptions:
  - key:             bugprone-assert-side-effect.AssertMacros
    value:           'ASSERT'

ExtraArgs: [-std=c11,-DDEBUG]

It disables some annoying checks and enables a couple useful ones. A couple of notable things:

The "insecureAPI" checks are considered harmful and thus disabled.
The android-cloexec category of checks recommends adding O_CLOEXEC or equivalent flags when opening a fd. However, some of the recommendations are not part of POSIX and thus may not be portable. Feel free to disable this check.
Certain checks may accept "options". For example, I'm using it to tell clang-tidy about my custom ASSERT macro.
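To show why the AssertMacros option is worth setting, here's a sketch of a custom ASSERT macro and the mistake bugprone-assert-side-effect is meant to catch. The macro definition is my assumption about what such a macro might look like, not necessarily how yours is written. Since the macro compiles to nothing without DEBUG, I'm also assuming that's why the config passes -DDEBUG: the check can only see the condition when the assert is active.

```c
#include <stdlib.h>

#ifdef DEBUG
#define ASSERT(cond) do { if (!(cond)) abort(); } while (0)
#else
#define ASSERT(cond) ((void)0)
#endif

/* Flagged: the increment lives inside the assert, so it vanishes
 * in builds where DEBUG is not defined. */
int next_id_buggy(void)
{
    static int counter = 0;
    ASSERT(++counter > 0);
    return counter;
}

/* The side effect is kept outside, the assert only observes. */
int next_id(void)
{
    static int counter = 0;
    ++counter;
    ASSERT(counter > 0);
    return counter;
}
```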

You can find the list of checks, along with some description of what they do, in the clang-tidy documentation.

After all this setup, you'd think it'd now be easy to get going by just doing:

$ clang-tidy src/*.c

Almost... The problem is clang-tidy requires you to pass in various compiler/pre-processor flags in order to function properly. So you basically need to duplicate any compiler flags when invoking clang-tidy after --:

$ clang-tidy src/*.c -- ${CFLAGS} ${CPPFLAGS}

You can arrange your build system to append these flags. Or manually add them to ExtraArgs in your clang-tidy config. Or there's also a tool called scan-build which is an attempt at automating the process.

Lastly, specific warnings can be silenced via NOLINT comments. This can be useful if you want to silence a specific false-positive but don't want to disable that check entirely.
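Here's a minimal sketch of what that looks like. The function is a contrived, hypothetical case where both branches are identical on purpose, which I'd expect bugprone-branch-clone to flag. `NOLINTNEXTLINE` silences the named check for the following line only, while a trailing `// NOLINT(check-name)` does the same for the line it sits on.

```c
/* Both levels intentionally map to the same weight for now. */
int priority_weight(int level)
{
    // NOLINTNEXTLINE(bugprone-branch-clone)
    if (level > 5)
        return 10;
    else
        return 10;
}
```

Naming the check inside the parentheses is optional, but leaving it out silences every check on that line, which can hide a real problem later.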

Having the right mindset

While the above tools do a good job at static analysis, it's also important to have the right mindset about it. It's easy to fall into the trap of aggressively enabling a shitload of noisy checks and fooling yourself into thinking you're being productive by "fixing" them - when in reality you might just be doing busywork.

I've done this early on as well. In hindsight I cannot exactly say it was a mistake, since in the process I did end up learning about some actually useful flags which aren't enabled by default. But nowadays I have much stricter criteria for whether or not to keep a check:

Noise: How many false-positives does it produce?
Effective: Has the check caught any real bugs in the past or does it seem like it would catch real bugs in the future?
Friction: How easy is it to silence the false-positives?

If a check isn't effective and produces false-positives, then it's usually not worth enabling. If it's effective but produces too much noise or friction then I will have it disabled by default, but every now and then I'll enable it and see if it finds any actual bugs (and ignore the false-positives rather than doing some dance to silence them).

The "right" amount of utility to friction ratio will obviously depend on the project (e.g something security sensitive running as root vs some toy cat implementation). But as a general baseline, I've found that the above mindset gives a nice sweet spot where you can get the most amount of utility out of static-analyzers while adding the least amount of friction to your workflow.

Tags: [ c, toolchain ]
