Using Valgrind to Detect Memory Management Problems in C

The C language has a reputation for being difficult to learn and to code in. I think this is unfair as it is actually a very small and simple language; most of the perceived difficulty comes from its memory management, or rather its lack of it. Only the smallest and simplest programs can get away with using nothing but automatic variables: sooner or later you are going to have to use dynamic memory, opening yourself up to an extensive range of tricky bugs. There's no foolproof way round this, but you can catch most of these bugs before they wreak havoc in production with a brilliant little program called Valgrind.

Valgrind

I referred to Valgrind as a "little program" above but it's actually quite complex and sophisticated. If you want to investigate it in detail, the website is valgrind.org. However, the purpose of this article is just to show how to use it to catch a few of the most common memory management problems, and to demonstrate that it is easy to incorporate into the workflow of any C program you write.

Firstly though you need to install it. On Debian-based Linux distributions such as Ubuntu you just need to open up a terminal and enter

Installing Valgrind on Linux

sudo apt-get install valgrind

Once you have entered your password Valgrind will install quickly. You can check it has installed correctly by entering

Checking Valgrind Version

valgrind --version

(I wish developers would standardise on command line switches. I tried -v, -V, --v and --V and none of them worked. I had to Google "valgrind version" to find out that you need to use --version!)

The Project

This program will consist of one C source code file containing a few functions, each of which will commit some kind of memory management sin. Fear not though for our sins will find us out, or rather Valgrind will find us out.

Create a new folder and within it create a file called usingvalgrind.c. You can download it as a zip or use the Github repository if you prefer.

Source Code Links

ZIP File
GitHub

usingvalgrind.c (part 1)

#include<stdio.h>
#include<stdlib.h>
#include<math.h>

//--------------------------------------------------------
// FUNCTION PROTOTYPES
//--------------------------------------------------------
void uninitialized_memory();
void memory_leak();
void use_freed_memory();
void overshoot_memory();
void realloc_memory();

//--------------------------------------------------------
// FUNCTION main
//--------------------------------------------------------
int main(int argc, char* argv[])
{
    puts("-----------------");
    puts("| codedrome.com |");
    puts("| valgrind      |");
    puts("-----------------\n");

    uninitialized_memory();
    memory_leak();
    use_freed_memory();
    overshoot_memory();
    realloc_memory();

    return EXIT_SUCCESS;
}

That's all quite straightforward - just a few function prototypes and a call to each of those functions in main. It should be clear from the function names just what memory management problem each causes. Now let's get on to the functions themselves. As you can see each has a comment describing the fix, in most cases a commented-out line which, if uncommented, will solve the problem. Leave the code as it is for the time being though.

usingvalgrind.c (part 2)

//--------------------------------------------------------
// FUNCTION uninitialized_memory
//--------------------------------------------------------
void uninitialized_memory()
{
    puts("Using uninitialized memory");

    int* data;

    // uncomment to fix error
    //data = malloc(sizeof(int) * 32);

    data[0] = 123;

    printf("%d\n\n", data[0]);

    // uncomment to fix error
    //free(data);
}

//--------------------------------------------------------
// FUNCTION memory_leak
//--------------------------------------------------------
void memory_leak()
{
    puts("Memory leak");

    int* data;

    data = malloc(sizeof(int) * 32);

    data[0] = 234;

    printf("%d\n\n", data[0]);

    // uncomment to fix error
    //free(data);
}

//--------------------------------------------------------
// FUNCTION use_freed_memory
//--------------------------------------------------------
void use_freed_memory()
{
    puts("Using freed memory");

    int* data;

    data = malloc(sizeof(int) * 32);

    data[0] = 345;

    // move to after printf to fix error
    free(data);

    printf("%d\n\n", data[0]);
}

//--------------------------------------------------------
// FUNCTION overshoot_memory
//--------------------------------------------------------
void overshoot_memory()
{
    puts("Overshooting memory");

    int* data;

    data = malloc(sizeof(int) * 32);

    // change indexes to between 0 and 31 to fix error
    data[32] = 456;

    printf("%d\n\n", data[32]);

    free(data);
}

//--------------------------------------------------------
// FUNCTION realloc_memory
//--------------------------------------------------------
void realloc_memory()
{
    puts("realloc memory");

    int* data = NULL;

    for(int i = 0; i < 32; i++)
    {
        data = realloc(data, (sizeof(int)) * (i + 1));

        data[i] = pow(i, 2);

        printf("%d\t%d\n", i, data[i]);
    }

    // uncomment to fix error
    //free(data);
}

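In the function uninitialized_memory we have forgotten to call malloc, so data is an uninitialized pointer holding whatever value happens to be on the stack. Writing and reading through it is undefined behaviour: we are trespassing on memory which doesn't belong to us.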

In memory_leak we have forgotten to call free, so the block stays allocated until the program terminates or, if we make a habit of that sort of thing, runs out of memory and crashes.

In use_freed_memory we have called free but then carry on using the memory anyway. This might appear to work for a while, but sooner or later it will cause problems when the allocator reuses that memory for a different purpose.

The overshoot_memory function is well behaved in terms of calling malloc and free, but tries to write and read memory outside of that which it has been given.

Finally, just to check that Valgrind works just as well with realloc as it does with malloc, I have included a function realloc_memory which forgets to call free on realloc'ed memory rather than malloc'ed memory.

That's the coding finished so we can now compile and run it. In the terminal enter the following.

Compile and Run (without Valgrind)

gcc usingvalgrind.c -g -std=c11 -lm -o usingvalgrind

./usingvalgrind

Note that I have included a -g switch to tell the compiler to include debug information in the executable. This will enable Valgrind to report back the line numbers in the source code where any errors occur. That makes the executable a bit bigger, so you may prefer to leave it out of production builds. The -lm switch links the maths library, which is needed for the call to pow.

However, first time round we are not using Valgrind, just running the program as normal. This is the output.

Program Output

-----------------
| codedrome.com |
| valgrind      |
-----------------

Using uninitialized memory
123

Memory leak
234

Using freed memory
345

Overshooting memory
456

realloc memory
0       0
1       1
2       4
3       9
4       16
5       25
6       36
7       49
8       64
9       81
10      100
11      121
12      144
13      169
14      196
15      225
16      256
17      289
18      324
19      361
20      400
21      441
22      484
23      529
24      576
25      625
26      676
27      729
28      784
29      841
30      900
31      961

So what went wrong as a result of all our memory management errors? Well, nothing at all. At least nothing you would notice when you run the program. It didn't crash or even display an error message, and the output is exactly what we expected, although I cannot guarantee this will be the case for you. It is very important to realise that bugs such as the ones we have deliberately introduced do not necessarily make themselves felt immediately.

However, we know that the code is actually full of bugs, ones that will cause serious problems if we put the code into production. So let's run the program again but this time under the umbrella of Valgrind.

Using Uninitialized Memory

We'll do this one function at a time, so in main comment out all the function calls except uninitialized_memory, go back to the terminal, and run the following.

Compile and Run (with Valgrind)

gcc usingvalgrind.c -g -std=c11 -lm -o usingvalgrind

valgrind --track-origins=yes --leak-check=full ./usingvalgrind

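The --track-origins=yes option asks Valgrind to report where an uninitialised value came from, and --leak-check=full makes it show each leaked block in detail. Valgrind slows down program execution considerably (its documentation quotes a slowdown of roughly 20 to 30 times), so even with a short program like this you might notice a delay of a second or two before anything happens. When you do see some output, it will include the following.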

Program Output (partial)

==10764== Use of uninitialised value of size 8
==10764==    at 0x40079D: uninitialized_memory (usingvalgrind.c:43)
==10764==    by 0x40077F: main (usingvalgrind.c:22)
==10764== Uninitialised value was created by a stack allocation
==10764==    at 0x400787: uninitialized_memory (usingvalgrind.c:35)
==10764==
==10764== Use of uninitialised value of size 8
==10764==    at 0x4007A7: uninitialized_memory (usingvalgrind.c:45)
==10764==    by 0x40077F: main (usingvalgrind.c:22)
==10764== Uninitialised value was created by a stack allocation
==10764==    at 0x400787: uninitialized_memory (usingvalgrind.c:35)

Valgrind tells us we are using an uninitialised value, and gives us the function name, source file name and line number. The "size 8" refers to the pointer data itself, which is 8 bytes on a 64-bit system. The single error of forgetting to call malloc manifests itself twice, first when we write through the pointer and then again when we read through it.

The error isn't actually in line 43 or 45 (your line numbers may differ), but we know it is a "Use of uninitialised value" which was created by a stack allocation in uninitialized_memory, so we just need to track backward from these lines a short way to realise we declared a pointer called data but forgot to allocate any memory for it. The solution is already in place: just uncomment the malloc and free lines, then run the above commands to build and run under Valgrind again.

Program Output (partial)

ERROR SUMMARY: 0 errors from 0 contexts

Now we see "0 errors". Good! Now let's carry out the same process with the next function, memory_leak.

Memory Leak

In main comment out uninitialized_memory and uncomment memory_leak. Run the program with Valgrind again and it will spot that we have not called free.

Program Output (partial)

==3692==     in use at exit: 128 bytes in 1 blocks
==3692==   total heap usage: 1 allocs, 0 frees, 128 bytes allocated
==3692==
==3692== 128 bytes in 1 blocks are definitely lost in loss record 1 of 1

You know what you need to do - uncomment the line which calls free and run again. This time we get:

Program Output (partial)

==4249== All heap blocks were freed -- no leaks are possible

Using Freed Memory

Another problem solved. Now edit main to run use_freed_memory, and run with Valgrind. This time we get:

Program Output (partial)

==4467== Invalid read of size 4
==4467==    at 0x400845: use_freed_memory (usingvalgrind.c:86)
==4467==    by 0x40077F: main (usingvalgrind.c:24)
==4467== Address 0x5502040 is 0 bytes inside a block of size 128 free'd

The last line tells us that the address we are reading lies inside a 128 byte block which has already been freed, so move the call to free to after the printf, and run again. As you can see this fixes the problem.

Program Output (partial)

==4790== ERROR SUMMARY: 0 errors from 0 contexts

Overshooting Memory

Three down, two to go. Edit main to run overshoot_memory, and when run with Valgrind you'll see:

Program Output (partial)

==5087== Invalid write of size 4
==5087==    at 0x400882: overshoot_memory (usingvalgrind.c:101)
==5087==    by 0x40077F: main (usingvalgrind.c:25)
==5087==  Address 0x55020c0 is 0 bytes after a block of size 128 alloc'd
==5087==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5087==    by 0x400875: overshoot_memory (usingvalgrind.c:98)
==5087==    by 0x40077F: main (usingvalgrind.c:25)
==5087==
==5087== Invalid read of size 4
==5087==    at 0x400890: overshoot_memory (usingvalgrind.c:103)
==5087==    by 0x40077F: main (usingvalgrind.c:25)
==5087==  Address 0x55020c0 is 0 bytes after a block of size 128 alloc'd

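The phrase "0 bytes after a block of size 128" means the address is immediately past the end of the block: data[32] is the first int beyond the 32 we allocated. Edit overshoot_memory to use an index between 0 and 31, and run again. Again we see the familiar output.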

Program Output (partial)

==5678== ERROR SUMMARY: 0 errors from 0 contexts

Testing Valgrind with Realloc

We have seen that if we call malloc once Valgrind will check whether we call free once. However, realloc is a bit different: we can call it any number of times but of course only need to call free once. The final function checks that Valgrind can handle this. Edit main to run realloc_memory, and run again. The output is interesting.

Program Output (partial)

==6446==     in use at exit: 128 bytes in 1 blocks
==6446==   total heap usage: 32 allocs, 31 frees, 2,112 bytes allocated
==6446==
==6446== 128 bytes in 1 blocks are definitely lost in loss record 1 of 1
==6446==    at 0x4C2CE8E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6446==    by 0x4008F6: realloc_memory (usingvalgrind.c:119)
==6446==    by 0x40077F: main (usingvalgrind.c:26)
==6446==
==6446== LEAK SUMMARY:
==6446==    definitely lost: 128 bytes in 1 blocks

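Although our code never calls free, Valgrind reports 31 frees. The first call, realloc(NULL, ...), behaves like malloc, and each of the remaining 31 calls is counted as allocating a new block and freeing the old one. Only the final 128 byte block is left over, and that is the one reported as definitely lost. Uncomment the call to free, run again, and you will see just what you expected to see.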

Program Output (partial)

==6922== ERROR SUMMARY: 0 errors from 0 contexts

I hope this post has removed much of the memory management angst associated with C programming - just remember to include a run of your program with Valgrind in your workflow.