*ฅ^•ﻌ•^ฅ* ✨✨  HWisnu's blog  ✨✨ о ฅ^•ﻌ•^ฅ

Debunking the Unsafe C myths part 1

Introduction

In this post we are going to have so much fun debunking myths about unsafe C. I'm going to use the sample code from this YouTube video. Why this video? Well, let's say it's there like a sitting duck and I'm a hungry soldier with a sniper rifle.

I know it's fun to make fun of poor old legacy C language..fear not, we are here to crush some bugses + memowy ewwows and debunk some untrue and misleading myths.

Setup

alias gcc-14_safe='gcc-14 -Wall -Werror -Wextra -Wpedantic -std=c23 -D_POSIX_C_SOURCE=202308L'
alias gcc-14_safest='gcc-14 -ggdb -Wall -Werror -Wextra -Wpedantic -Wconversion -fsanitize=address -fsanitize=undefined -fno-omit-frame-pointer -std=c23 -D_POSIX_C_SOURCE=202308L'

[Important Note] Why the need for two different compile steps? Because running the sanitizers and the static analyzer in the same build makes them conflict: the sanitizers modify the code, and this interferes with -fanalyzer and reduces its effectiveness.

So if you're strict on safety measures, it is best practice to introduce at least two compile steps:

  1. Static analysis step: gcc-14_safe -fanalyzer
  2. Runtime step: gcc-14_safest

I did a bad job not mentioning this in the previous post and I just realized this might confuse some people: "Why this dude sometimes uses gcc-14 and gcc-14_safest??". Sorry for that, it's just like an automatic habit for me to run two compile steps.

Case 1 - uninitialized pointer

#include <stdio.h>

int main(void)
{
    int *pointer;
    *pointer = 10;
    printf("Address: %p\nValue: %d\n", pointer, *pointer);
    return 0;
}

Command: gcc-14_safest -o unsafe_01 unsafe_01.c -fanalyzer --> deliberately combined the compile steps to prove a point.

unsafe-01 Comment:
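Not part of the original sample: a minimal sketch of one way to make Case 1 well-defined, assuming the intent was simply to store 10 through a pointer. The names `store_ten` and `demo_initialized_pointer` are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes 10 through a pointer, but only if it is non-NULL. */
void store_ten(int *target)
{
    if (target != NULL) {
        *target = 10;
    }
}

/* The pointer refers to real storage before it is ever dereferenced. */
int demo_initialized_pointer(void)
{
    int value = 0;
    int *pointer = &value;      /* initialized: points at an actual object */
    store_ten(pointer);
    /* %p expects a void *, hence the cast */
    printf("Address: %p\nValue: %d\n", (void *)pointer, *pointer);
    return value;
}
```

Calling `demo_initialized_pointer()` from `main` prints the address of `value` and the stored 10. The only real change from the sample is that the pointer has something to point at.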

Case 2 - out of bounds

#include <stdio.h>

// Error - Out of bounds
int main(void)
{
    int array[5] = {1, 2, 3, 4 ,5};
    int array2[10] = {6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    int index = 0;
    index += 8;
    int value = array[index];
    printf("value: %d\n", value);
    printf("array 2-8: %d\n", array2[index]);
    
    return 0;
}

Command: gcc-14_safe -o unsafe_02 unsafe_02.c -fanalyzer

unsafe-02 Comment:

Rust compiler error: unsafe-02-rs Yeah I guess it is functional, it does the job very well.
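When the index is only known at runtime, the compiler cannot always help, so an explicit check is the usual fallback. A minimal sketch, assuming the caller knows the array length; `ARRAY_LEN` and `get_checked` are hypothetical names, not from the video or the sample.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of elements in a true array (does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: reads data[index] only when index < len.
   Returns false and leaves *out untouched on an out-of-range index. */
bool get_checked(const int *data, size_t len, size_t index, int *out)
{
    if (data == NULL || out == NULL || index >= len) {
        return false;
    }
    *out = data[index];
    return true;
}
```

With the arrays from Case 2, `get_checked(array, ARRAY_LEN(array), 8, &value)` returns false, while the same call on `array2` succeeds and stores 14.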

Case 3 - memory leak

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

char *getUserInput()
{
    int size = 100;
    char *input = malloc(sizeof(char) * size);
    if (input == NULL) {
        fprintf(stderr, "Memory allocation failed.\n");
        return NULL;
    }

    printf("Enter something: ");
    fgets(input, size, stdin);
    return input;
}

int main(void)
{
    while (true) {
        char *input = getUserInput();
        printf("input: %s\n", input);
        // free(input);     // deliberately commented out to cause memory leak!
    }
    return 0;
}

Command: gcc-14_safe -o unsafe_03 unsafe_03.c -fanalyzer

unsafe-03 Comment:

This is getting boring, don't you think? Everything is caught at COMPILE TIME..where's the thrill?! Okay let's continue...
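For reference, here is a leak-free variant of Case 3 as a minimal sketch. It assumes input comes from an arbitrary stream and that the loop should stop at end of input. `read_line` and `echo_lines` are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of getUserInput() that reads from any stream.
   Returns a heap buffer the caller must free, or NULL on failure/EOF. */
char *read_line(FILE *stream)
{
    const size_t size = 100;
    char *input = malloc(size);
    if (input == NULL) {
        return NULL;
    }
    if (fgets(input, (int)size, stream) == NULL) {
        free(input);            /* nothing was read: do not leak the buffer */
        return NULL;
    }
    return input;
}

/* Echoes every line and releases each buffer before reading the next.
   Returns the number of lines echoed. */
size_t echo_lines(FILE *stream)
{
    size_t count = 0;
    char *input;
    while ((input = read_line(stream)) != NULL) {
        printf("input: %s", input);
        free(input);            /* the call that Case 3 left commented out */
        count++;
    }
    return count;
}
```

Calling `echo_lines(stdin)` from `main` gives the same interactive loop as the sample, minus the leak and minus the `printf` of a possibly NULL pointer.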

Case 4 - double free & use after free

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *pointer = malloc(sizeof(int));
    free(pointer);
    free(pointer);
    return 0;
}

Command: gcc-14_safe -o unsafe_04 unsafe_04.c -fanalyzer

unsafe-04 Comment:

I'm not kidding guys..I'm getting sleepy over here, this is boring! Where's my segmentation fault?? Zzzzzz let's continue our hunt!
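A common defensive habit against the double free is to reset the pointer right after freeing it. A minimal sketch, assuming a single owning pointer; `free_and_clear` is a hypothetical helper and does nothing for other copies of the same pointer.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *slot and resets it to NULL, so a second
   call becomes free(NULL), which the C standard defines as a no-op. */
void free_and_clear(int **slot)
{
    if (slot != NULL) {
        free(*slot);
        *slot = NULL;
    }
}
```

With this, calling `free_and_clear(&pointer)` twice in a row is harmless. It is a habit that limits the damage, not a replacement for the analyzer.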

Case 5 - string error

Now we get to the nasty stuff: STRING!

#include <stdio.h>

int main(void)
{
    // let's be real here...who actually writes a string like this??
    char phrase[12] = {'H','e','l','l','o',' ','W','o','r','l','d','!'};
    printf("'%c'\n", phrase[0]);

    // below is the proper way to write a string
    char frase[] = "Hello World!";
    const char *hello_world = frase;
    printf("'%s'\n", hello_world);
    return 0;
}

Command: gcc-14_safe -o unsafe_05 unsafe_05.c -fanalyzer
Program works fine and prints out:

'H'
'Hello World!'

But okay, let's just follow the example to the letter: unsafe-05 Comment:

What's the score now? Compile time [4] - Runtime [1] - Error [0]
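If the characters really must live in an array without a terminator, one way to print them safely is to pass the length as the precision of `%s`. A minimal sketch, assuming the count is known; `format_unterminated` is a hypothetical helper, not part of the original example.

```c
#include <stdio.h>

/* Hypothetical helper: formats a character array that has no '\0' by
   passing its length as the "%.*s" precision, so at most count bytes
   of chars are read. Returns what snprintf returns. */
int format_unterminated(char *out, size_t out_size, const char *chars, int count)
{
    return snprintf(out, out_size, "'%.*s'", count, chars);
}
```

For the 12-character `phrase` array above, passing `(int)sizeof phrase` as `count` yields 'Hello World!' without reading past the array.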

Case 6 - more string!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const char *trimSpaces(const char* input) 
{
    const size_t input_length = strlen(input);
    char *buffer = malloc(sizeof(char) * input_length);
    // char *buffer = malloc(sizeof(char) * input_length + 1);  // deliberately commented out
    if (buffer == NULL) {
        fprintf(stderr, "Memory allocation failed.\n");
        exit(1);
    }

    size_t buffer_index = 0;
    for (size_t i=0; i<input_length; i++) {
        if (input[i] == ' ') {
            continue;
        }
        buffer[buffer_index] = input[i];
        buffer_index += 1;
    }
    // buffer[buffer_index] = '\0';     // deliberately commented out
    return buffer;
}

int main(void)
{
    const char *text = "This is an example of some text";
    const size_t string_length = strlen(text);
    printf("'%s', length = %zu\n", text, string_length);

    const char *trimmed_string = trimSpaces(text);
    const size_t trimmed_string_length = strlen(trimmed_string);
    printf("'%s' length = %zu\n", trimmed_string, trimmed_string_length);

    free((void*)trimmed_string);
    return 0;
}

Command #1: gcc-14_safe -o unsafe_06 unsafe_06.c -fanalyzer --> it compiles.
Command #2: gcc-14_safest -o unsafe_06 unsafe_06.c --> run the program --> runtime error!

unsafe-06 Comment:

Compile time [4] - Runtime [2] - Error [0]

Note: uncomment the inactive lines to see the proper way of handling this case study.
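Putting the two commented-out lines back gives roughly the following. This is a sketch rather than the exact original: the name `trim_spaces_fixed` is hypothetical, and it returns a plain `char *` and NULL on failure, which are my own choices so the caller can `free()` it without a cast.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of input with spaces removed, or NULL
   if allocation fails. The caller owns the result and must free() it. */
char *trim_spaces_fixed(const char *input)
{
    const size_t input_length = strlen(input);
    char *buffer = malloc(input_length + 1);    /* +1 for the terminator */
    if (buffer == NULL) {
        return NULL;
    }

    size_t buffer_index = 0;
    for (size_t i = 0; i < input_length; i++) {
        if (input[i] != ' ') {
            buffer[buffer_index++] = input[i];
        }
    }
    buffer[buffer_index] = '\0';                /* makes strlen() safe */
    return buffer;
}
```

With the text from `main`, the result is "Thisisanexampleofsometext" with a length of 25, and the sanitizers stay quiet.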

Case 7 - using safer functions when dealing with strings!

I agree completely on this one, it is one of the best practices, particularly when working with strings.

Please consult the C standard to check if a safer function exists. My favorite references are:

  1. The C standard: C23
  2. C secure coding rules: open-std --> this is an underrated document, it provides examples of both non-compliant and compliant code.

So I've got no argument against this case study, instead I want to encourage it further by showing some simple examples of the difference between using unsafe and safer functions.

7a - strnlen

#include <stdio.h>
#include <string.h>

void unsafe_str(void)
{
    char buffer[12] = {'H','e','l','l','o',' ','W','o','r','l','d','!'};
    // Unsafe: strlen
    size_t len_unsafe = strlen(buffer);
    printf("\nUnsafe length: %zu\n", len_unsafe);
}

void safe_str(void)
{
    char buffer[12] = {'H','e','l','l','o',' ','W','o','r','l','d','!'};
    // Safe: strnlen
    size_t len_safe = strnlen(buffer, 12);
    printf("\nSafe length: %zu\n", len_safe);
}

int main(void)
{
    unsafe_str();
    safe_str();
    return 0;
}

Command: gcc-14 -o unsafe_07a_strnlen unsafe_07a_strnlen.c -- note I deliberately compile the program without any compiler flags in order to show the difference in behaviour between strlen and strnlen.

Output:

Unsafe length: 18  --> wrong length!
Safe length: 12    --> correct length!

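strlen keeps reading way beyond the array because there is no null terminator '\0' to stop it, so the 18 it reports is just whatever happened to be in memory (reading past the array is undefined behaviour). strnlen stops after at most maxlen bytes, and since we explicitly set maxlen to 12, it returns the correct length.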
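The same property can be used to detect a missing terminator before calling anything that relies on one. A minimal sketch, assuming a POSIX system (strnlen comes from POSIX, not ISO C); `is_terminated` is a hypothetical helper.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* strnlen() comes from POSIX, not ISO C */
#endif
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when a '\0' occurs within the first
   capacity bytes of buffer, i.e. strlen(buffer) would stay in bounds. */
bool is_terminated(const char *buffer, size_t capacity)
{
    /* strnlen returns capacity itself when it finds no terminator */
    return strnlen(buffer, capacity) < capacity;
}
```

For the 12-byte `buffer` above, `is_terminated(buffer, sizeof buffer)` is false, which is the cue not to hand it to strlen or `%s`.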

Now let's turn on static analyzer: gcc-14_safe -o unsafe_07a_strnlen unsafe_07a_strnlen.c -fanalyzer unsafe-07a-strnlen Comment:

Anyway, if you're curious whether the sanitizers are able to catch the error at runtime, the answer is yes! Using the command: gcc-14_safest -o unsafe_07a_strnlen unsafe_07a_strnlen.c -- it compiles, but upon running the program you'll see an error: stack-buffer-overflow pointing exactly at the strlen.

Compile time [5] - Runtime [2] - Error [0]

7b - strncpy

#include <stdio.h>
#include <string.h>
#include <stddef.h>

void unsafe_copy(char *src)
{
    char dest_unsafe[5];
    strcpy(dest_unsafe, src);
    printf("strcpy: %s\n", dest_unsafe);
}

void safe_copy(char *src)
{
    char dest_safe[5];
    size_t len = sizeof(dest_safe) - 1;     //*
    strncpy(dest_safe, src, len);
    dest_safe[len] = '\0';                  //*
    printf("strncpy: %s\n", dest_safe);
}

int main(void)
{
    char src[] = "HelloWorld!";
    unsafe_copy(src);
    safe_copy(src);
    return 0;
}

Command: gcc-14 -o unsafe_07b_strncpy unsafe_07b_strncpy.c -- as usual, no compiler flags whatsoever to show the error caused by unsafe function: strcpy.

Output:

strcpy: HelloWorld!
[1]    129945 segmentation fault  ./unsafe_07b_strncpy

Yayy, we got a SEGFAULT!! LOL now we're talking!

However, if we never call the unsafe_copy function, then recompile and re-run the program, it's fine:

strncpy: Hell

Comment: pay attention to the //* markers I put in the code: make sure you calculate len correctly and put the null terminator at the end yourself, because strncpy does not add one when src is longer than len.
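strncpy truncates silently, so the caller never learns that "HelloWorld!" became "Hell". If that matters, one alternative (my suggestion, not from the video) is to let snprintf do the copy and report whether everything fit. A minimal sketch; `copy_checked` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dest (capacity dest_size) and
   always terminates it. Returns true only if the whole string fit. */
bool copy_checked(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL || dest_size == 0) {
        return false;
    }
    /* snprintf returns the length it would have written without the limit */
    int needed = snprintf(dest, dest_size, "%s", src);
    return needed >= 0 && (size_t)needed < dest_size;
}
```

With a 5-byte destination and "HelloWorld!" as the source, the buffer still ends up holding "Hell", but the function returns false so the truncation can be handled.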

Static analyzer: gcc-14_safe -o unsafe_07b_strncpy unsafe_07b_strncpy.c -fanalyzer unsafe-07b-strncpy Comment:

Compile time [6] - Runtime [2] - Error [0]

7c - strncmp

#include <stdio.h>
#include <string.h>
#include <stddef.h>

void unsafe_compare(char *str1, char *str2)
{
    if (strcmp(str1, str2) == 0) {
        printf("strcmp: Strings are equal\n");
    } else {
        printf("strcmp: String are not equal\n");
    }
}

void safe_compare(char *str1, char *str2, size_t len)
{
    if (strncmp(str1, str2, len) == 0) {
        printf("strncmp: Strings are equal\n");
    } else {
        printf("strncmp: String are not equal\n");
    }
}

int main(void)
{
    char str1[] = "KingC";
    char str2[5] = {'K','i','n','g','C'};
    size_t len = 5;

    unsafe_compare(str1, str2);
    safe_compare(str1, str2, len);
    return 0;
}

Command: gcc-14 -o unsafe_07c_strncmp unsafe_07c_strncmp.c -- on this one we don't need to use the static analyzer or other compiler flags.

Output:

strcmp: String are not equal
strncmp: Strings are equal

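Comment: as you can see, the strcmp result is wrong while strncmp is correct: both strings are "KingC". str2 has no null terminator, so strcmp reads past the end of the array (undefined behaviour), while strncmp stops after len (5) characters.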
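One caveat worth keeping in mind: strncmp with a fixed len only compares a prefix, so "KingCobra" and "KingC" also compare equal at len 5. A minimal sketch of a stricter comparison, assuming both buffer capacities are known and a POSIX system for strnlen; `equal_bounded` is a hypothetical helper.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* strnlen() comes from POSIX, not ISO C */
#endif
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compares two buffers as strings without reading
   past either capacity. Equal means same length and same bytes. */
bool equal_bounded(const char *a, size_t a_cap, const char *b, size_t b_cap)
{
    size_t a_len = strnlen(a, a_cap);   /* length, capped at the capacity */
    size_t b_len = strnlen(b, b_cap);
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

Called as `equal_bounded(str1, sizeof str1, str2, sizeof str2)` on the arrays above, it reports equal, and it would report not equal if str1 were "KingCobra".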

7d - strncat

#include <stdio.h>
#include <string.h>
#include <stddef.h>

void unsafe_concat(char *dest_unsafe, char *src)
{
    strcat(dest_unsafe, src);
    printf("strcat: %s\n", dest_unsafe);
}

void safe_concat(char *dest_safe, size_t dest_size, char *src)
{
    size_t len = dest_size - strlen(dest_safe) - 1;
    strncat(dest_safe, src, len);
    printf("strncat: %s\n", dest_safe);
}

int main(void)
{
    char dest_unsafe[15] = "SpongeBob";
    char dest_safe[15] = "SpongeBob";
    char src[] = "SquarePants";

    unsafe_concat(dest_unsafe, src);
    safe_concat(dest_safe, sizeof(dest_safe), src);
    return 0;
}

Command: gcc-14 -o unsafe_07d_strncat unsafe_07d_strncat.c

Output:

strcat: SpongeBobSquarePants
strncat: SpongeBobSquar

Comment: strncat appends only as much as fits in the destination's remaining space, while strcat appends everything, which clearly writes out of bounds (20 characters plus a terminator into a 15-byte buffer).
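Like strncpy, strncat truncates without telling anyone. A minimal sketch of a concatenation that also reports truncation, again using snprintf as an alternative of my own choosing; `concat_checked` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends src to the string in dest (capacity
   dest_size). Returns true only if the whole of src fit. */
bool concat_checked(char *dest, size_t dest_size, const char *src)
{
    if (dest == NULL || src == NULL) {
        return false;
    }
    /* locate the current terminator without reading past dest_size */
    const char *end = memchr(dest, '\0', dest_size);
    if (end == NULL) {
        return false;                   /* dest is not a valid string */
    }
    size_t used = (size_t)(end - dest);
    size_t room = dest_size - used;     /* at least 1: the terminator's slot */
    int needed = snprintf(dest + used, room, "%s", src);
    return needed >= 0 && (size_t)needed < room;
}
```

With the 15-byte "SpongeBob" buffer and "SquarePants", the buffer ends up as "SpongeBobSquar", same as strncat, but the return value is false.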

Static Analyzer: gcc-14_safe -o unsafe_07d_strncat unsafe_07d_strncat.c -fanalyzer unsafe-07d-strncat Comment:

Compile time [7] - Runtime [2] - Error [0]

Let's continue the rest on Part 2?

Hey you know what? This post has gotten very long due to the inclusion of safer functions..I prefer to read and write posts that are "bite-sized".

Next part we will deal with:

Maybe I'll write Part-2 over the weekend. Cheers!

C is freedom, it can be as safe or unsafe as you want it to be!