Debunking the Unsafe C myths part 1
Introduction
In this post we are going to have so much fun debunking myths about unsafe C. I'm going to use the sample code from this YouTube video. Why this video? Well, let's say it's sitting there like a duck and I'm a hungry soldier with a sniper rifle.
I know it's fun to make fun of the poor old legacy C language... fear not, we are here to crush some bugses + memowy ewwows and debunk some untrue and misleading myths.
Setup
- I'm only going to use the GCC compiler, as the previous post proved GCC to be far superior to Clang.
- Here are some of the aliases I'm using:
alias gcc-14_safe='gcc-14 -Wall -Werror -Wextra -Wpedantic -std=c23 -D_POSIX_C_SOURCE=202308L'
alias gcc-14_safest='gcc-14 -ggdb -Wall -Werror -Wextra -Wpedantic -Wconversion -fsanitize=address -fsanitize=undefined -fno-omit-frame-pointer -std=c23 -D_POSIX_C_SOURCE=202308L'
[Important Note] Why the need for two different compile steps? Because running the sanitizers and the static analyzer together causes a conflict: the sanitizers modify (instrument) the code, which interferes with and reduces the effectiveness of -fanalyzer.
So if you're strict on safety measures, it is best practice to have at least two compile steps:
- Static analysis step: gcc-14_safe -fanalyzer
- Runtime step: gcc-14_safest
I did a bad job of not mentioning this in the previous post, and I just realized it might confuse some people: "Why does this dude sometimes use gcc-14 and sometimes gcc-14_safest??". Sorry about that, running two compile steps is just an automatic habit for me.
Case 1 - uninitialized pointer
#include <stdio.h>

int main(void)
{
    int *pointer;
    *pointer = 10;
    printf("Address: %p\nValue: %d\n", pointer, *pointer);
    return 0;
}
Command: gcc-14_safest -o unsafe_01 unsafe_01.c -fanalyzer
--> deliberately combined the compile steps to prove a point.
Comment:
- Uninitialized error caught at compile time.
- Additional error caught: error=format.
- Notice I deliberately combined the sanitizers and the static analyzer, which resulted in the same error being caught twice: the first error (yellow) is caught by the compiler flags, while the second error (orange) is caught by the static analyzer. This is redundant, hence it is best to split the compile into two steps: first the static analyzer check, then the runtime sanitizers check.
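For comparison, here is a minimal sketch of how both diagnostics could be avoided. The helper names store_value and describe_pointer are made up for illustration (they are not from the video): the pointer is checked before it is written through, and the argument to %p is cast to void *, which is what the format error asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes through `target` only after checking
 * that it actually points somewhere. */
int store_value(int *target, int value)
{
    if (target == NULL) {
        return -1;
    }
    *target = value;
    return 0;
}

/* Hypothetical helper: formats the address and the value into `out`.
 * %p expects a pointer to void, hence the explicit cast. */
int describe_pointer(char *out, size_t out_size, int *pointer)
{
    assert(out != NULL && pointer != NULL);
    return snprintf(out, out_size, "Address: %p\nValue: %d\n",
                    (void *)pointer, *pointer);
}
```

In main you would declare a real object first (int value = 0; int *pointer = &value;) and then call store_value(pointer, 10), so the pointer never holds an indeterminate address.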
Case 2 - out of bounds
#include <stdio.h>

// Error - Out of bounds
int main(void)
{
    int array[5] = {1, 2, 3, 4, 5};
    int array2[10] = {6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    int index = 0;
    index += 8;
    int value = array[index];
    printf("value: %d\n", value);
    printf("array 2-8: %d\n", array2[index]);
    return 0;
}
Command: gcc-14_safe -o unsafe_02 unsafe_02.c -fanalyzer
Comment:
- Out of bounds error caught at compile time.
- Notice how beautiful the error visualization provided by the GCC static analyzer is! Some praise Rust for having the best compiler errors, but in my opinion the GCC static analyzer's visualizations are great as well.
Rust compiler error:
Yeah I guess it is functional, it does the job very well.
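If the index is only known at runtime, the analyzer cannot always help, so a bounds check has to live in the code. A minimal sketch, assuming a hypothetical helper called array_get that takes the array length as an explicit parameter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies array[index] into *out only when the index
 * is inside the array; returns false instead of reading out of bounds. */
bool array_get(const int *array, size_t length, size_t index, int *out)
{
    assert(array != NULL && out != NULL);
    if (index >= length) {
        return false;
    }
    *out = array[index];
    return true;
}
```

With the arrays from the example, array_get(array, 5, 8, &value) is rejected, while array_get(array2, 10, 8, &value) succeeds.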
Case 3 - memory leak
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

char *getUserInput()
{
    int size = 100;
    char *input = malloc(sizeof(char) * size);
    if (input == NULL) {
        fprintf(stderr, "Memory allocation failed.\n");
        return NULL;
    }
    printf("Enter something: ");
    fgets(input, size, stdin);
    return input;
}

int main(void)
{
    while (true) {
        char *input = getUserInput();
        printf("input: %s\n", input);
        // free(input); // deliberately commented out to cause memory leak!
    }
    return 0;
}
Command: gcc-14_safe -o unsafe_03 unsafe_03.c -fanalyzer
Comment:
- Memory leak error caught at compile time.
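The fix is simply to free each buffer before the next iteration. Here is a rough sketch of that, using hypothetical names (read_line, echo_lines) and taking the streams as parameters so the loop can end when the input runs out; treat it as one possible shape, not the video's code.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of getUserInput: reads one line from `stream`
 * into a fresh buffer. The caller owns the result and must free() it. */
char *read_line(FILE *stream, size_t size)
{
    if (stream == NULL || size == 0 || size > INT_MAX) {
        return NULL;
    }
    char *input = malloc(size);
    if (input == NULL) {
        return NULL;
    }
    if (fgets(input, (int)size, stream) == NULL) {
        free(input); /* nothing was read, so do not hand out the buffer */
        return NULL;
    }
    return input;
}

/* Reads and echoes up to `max_lines` lines, freeing each one before the
 * next iteration; returns how many lines were processed. */
size_t echo_lines(FILE *in, FILE *out, size_t max_lines)
{
    assert(out != NULL);
    size_t count = 0;
    while (count < max_lines) {
        char *input = read_line(in, 100);
        if (input == NULL) {
            break; /* end of input, read error, or allocation failure */
        }
        fprintf(out, "input: %s\n", input);
        free(input); /* the call the original loop leaves out */
        count += 1;
    }
    return count;
}
```

Checking the result of read_line for NULL before printing also covers the allocation-failure path.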
This is getting boring, don't you think? Everything is caught at COMPILE TIME..where's the thrill?! Okay let's continue...
Case 4 - double free & use after free
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *pointer = malloc(sizeof(int));
    free(pointer);
    free(pointer);
    return 0;
}
Command: gcc-14_safe -o unsafe_04 unsafe_04.c -fanalyzer
Comment:
- Double free error and UAF caught at compile time.
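A common defensive habit is to reset the pointer right after freeing it. The sketch below uses a hypothetical helper named free_and_clear; it relies on the fact that free(NULL) is defined by the C standard to do nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees *pointer and resets it to NULL, so a repeated
 * call ends up as free(NULL), which is a no-op. */
void free_and_clear(int **pointer)
{
    assert(pointer != NULL);
    free(*pointer);
    *pointer = NULL;
}
```

This only protects the one variable you pass in; other copies of the same address are still dangling, so it is a habit, not a guarantee.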
I'm not kidding guys..I'm getting sleepy over here, this is boring! Where's my segmentation fault?? Zzzzzz let's continue our hunt!
Case 5 - string error
Now we get to the nasty stuff: STRING!
#include <stdio.h>

int main(void)
{
    // let's be real here...who actually writes strings like this??
    char phrase[12] = {'H','e','l','l','o',' ','W','o','r','l','d','!'};
    printf("'%c'\n", phrase[0]);
    // below is the proper way to write a string
    char frase[] = "Hello World!";
    const char *hello_world = frase;
    printf("'%s'\n", hello_world);
    return 0;
}
Command: gcc-14_safe -o unsafe_05 unsafe_05.c -fanalyzer
Program works fine and prints out:
'H'
'Hello World!'
But okay, let's just follow the example to the letter:
Comment:
- Stack buffer overflow error caught at runtime. --> the first one that we failed to catch at compile time, but as mentioned earlier no one writes strings like that, hence it should be easy to avoid.
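If you really are stuck with a character array that has no '\0' at the end, you can still print it safely by giving %s a precision. A small sketch, assuming the overflow comes from handing the unterminated array to a plain %s; format_chars is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats `count` characters from an array that has
 * no '\0' terminator. With a precision, %s reads at most that many bytes. */
int format_chars(char *out, size_t out_size, const char *chars, size_t count)
{
    assert(out != NULL && chars != NULL);
    return snprintf(out, out_size, "'%.*s'", (int)count, chars);
}
```

Called as format_chars(out, sizeof out, phrase, sizeof phrase), it never looks past the 12 characters of phrase.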
What's the score now? Compile time [4] - Runtime [1] - Error [0]
Case 6 - more string!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const char *trimSpaces(const char* input)
{
    const size_t input_length = strlen(input);
    char *buffer = malloc(sizeof(char) * input_length);
    // char *buffer = malloc(sizeof(char) * input_length + 1); // deliberately commented out
    if (buffer == NULL) {
        fprintf(stderr, "Memory allocation failed.\n");
        exit(1);
    }
    size_t buffer_index = 0;
    for (size_t i=0; i<input_length; i++) {
        if (input[i] == ' ') {
            continue;
        }
        buffer[buffer_index] = input[i];
        buffer_index += 1;
    }
    // buffer[buffer_index] = '\0'; // deliberately commented out
    return buffer;
}

int main(void)
{
    const char *text = "This is an example of some text";
    const size_t string_length = strlen(text);
    printf("'%s', length = %zu\n", text, string_length);
    const char *trimmed_string = trimSpaces(text);
    const size_t trimmed_string_length = strlen(trimmed_string);
    printf("'%s' length = %zu\n", trimmed_string, trimmed_string_length);
    free((void*)trimmed_string);
    return 0;
}
Command #1: gcc-14_safe -o unsafe_06 unsafe_06.c -fanalyzer
--> it compiles.
Command #2: gcc-14_safest -o unsafe_06 unsafe_06.c
--> run the program --> runtime error!
Comment:
- Heap buffer overflow caught at runtime! Yeah, dealing with strings in C is particularly nasty and requires more scrutiny. But the error was caught successfully!
Compile time [4] - Runtime [2] - Error [0]
Note: uncomment the inactive lines to see the proper way of handling this case study.
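For reference, this is roughly what the function looks like with both inactive lines restored. It is a sketch with two liberties of my own: it is renamed trim_spaces, and it returns a plain char * (so the caller can free it without a cast) and NULL on allocation failure instead of calling exit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of trimSpaces with the extra byte and the terminator restored.
 * The caller owns the returned buffer and must free() it. */
char *trim_spaces(const char *input)
{
    assert(input != NULL);
    const size_t input_length = strlen(input);
    char *buffer = malloc(input_length + 1); /* +1 for the '\0' */
    if (buffer == NULL) {
        return NULL;
    }
    size_t buffer_index = 0;
    for (size_t i = 0; i < input_length; i++) {
        if (input[i] == ' ') {
            continue;
        }
        buffer[buffer_index] = input[i];
        buffer_index += 1;
    }
    buffer[buffer_index] = '\0'; /* strlen and %s depend on this */
    return buffer;
}
```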
Case 7 - using safer functions when dealing with strings!
I agree completely on this one, it is one of the best practices, particularly when working with strings.
- strnlen not strlen
- strncpy not strcpy
- strncmp not strcmp
- strncat not strcat
Please consult the C standard to check if a safer function exists. My favorite references are:
- The C standard: C23
- C secure coding rules: open-std --> this is an underrated document, it provides examples of both non-compliant and compliant code.
So I've got no argument against this case study, instead I want to encourage it further by showing some simple examples of the difference between using unsafe and safer functions.
7a - strnlen
#include <stdio.h>
#include <string.h>

void unsafe_str(void)
{
    char buffer[12] = {'H','e','l','l','o',' ','W','o','r','l','d','!'};
    // Unsafe: strlen
    size_t len_unsafe = strlen(buffer);
    printf("\nUnsafe length: %zu\n", len_unsafe);
}

void safe_str(void)
{
    char buffer[12] = {'H','e','l','l','o',' ','W','o','r','l','d','!'};
    // Safe: strnlen
    size_t len_safe = strnlen(buffer, 12);
    printf("\nSafe length: %zu\n", len_safe);
}

int main(void)
{
    unsafe_str();
    safe_str();
    return 0;
}
Command: gcc-14 -o unsafe_07a_strnlen unsafe_07a_strnlen.c
-- note I deliberately compiled the program without any compiler flags in order to show the difference in behaviour between strlen and strnlen.
Output:
Unsafe length: 18 --> wrong length!
Safe length: 12 --> correct length!
strlen keeps on reading way beyond the end of the array since there's no null terminator '\0' (this is undefined behaviour, so the exact number may differ on your machine), while strnlen printed the correct length because we explicitly set maxlen to 12.
Now let's turn on static analyzer: gcc-14_safe -o unsafe_07a_strnlen unsafe_07a_strnlen.c -fanalyzer
Comment:
- Static analyzer caught the error at compile time. Again don't you think the error message is pleasing to look at? And the message is very clear: "argument 1 of 'strlen' must be a pointer to a null-terminated string"
Anyway, if you're curious whether the sanitizers are able to catch the error at runtime, the answer is yes! Using the command: gcc-14_safest -o unsafe_07a_strnlen unsafe_07a_strnlen.c
-- it compiles, but upon running the program you'll see an error: stack-buffer-overflow pointing exactly at the strlen call.
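To make it clearer why the bounded version is safe, here is a rough stand-in for strnlen written with memchr. The name bounded_length is made up; the real strnlen may be implemented differently, this only illustrates the idea of never looking past max bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strnlen: looks for '\0' within the first
 * `max` bytes and never reads past them. */
size_t bounded_length(const char *text, size_t max)
{
    assert(text != NULL);
    const char *end = memchr(text, '\0', max);
    return (end != NULL) ? (size_t)(end - text) : max;
}
```

For the 12-character buffer above, bounded_length(buffer, sizeof buffer) finds no terminator and falls back to the limit.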
Compile time [5] - Runtime [2] - Error [0]
7b - strncpy
#include <stdio.h>
#include <string.h>
#include <stddef.h>

void unsafe_copy(char *src)
{
    char dest_unsafe[5];
    strcpy(dest_unsafe, src);
    printf("strcpy: %s\n", dest_unsafe);
}

void safe_copy(char *src)
{
    char dest_safe[5];
    size_t len = sizeof(dest_safe) - 1; //*
    strncpy(dest_safe, src, len);
    dest_safe[len] = '\0'; //*
    printf("strncpy: %s\n", dest_safe);
}

int main(void)
{
    char src[] = "HelloWorld!";
    unsafe_copy(src);
    safe_copy(src);
    return 0;
}
Command: gcc-14 -o unsafe_07b_strncpy unsafe_07b_strncpy.c
-- as usual, no compiler flags whatsoever, to show the error caused by the unsafe function: strcpy.
Output:
strcpy: HelloWorld!
[1] 129945 segmentation fault ./unsafe_07b_strncpy
Yayy, we got a SEGFAULT!! LOL now we're talking!
However, if we never call the unsafe_copy function, then recompile and re-run the program, it's fine:
strncpy: Hell
Comment: pay attention to the lines I marked with //* in the code; they make sure len is calculated correctly and a null terminator is put at the end.
Static analyzer: gcc-14_safe -o unsafe_07b_strncpy unsafe_07b_strncpy.c -fanalyzer
Comment:
- Static analyzer caught the error at compile time.
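One thing the strncpy pattern does not tell you is whether the source was cut short. A small sketch that wraps the same two //* steps and reports truncation; copy_checked is a hypothetical name, and it assumes src itself is properly null-terminated:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around the strncpy pattern: always terminates
 * `dest` and returns false when `src` had to be truncated. */
bool copy_checked(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    const size_t len = dest_size - 1; /* leave room for '\0' */
    strncpy(dest, src, len);
    dest[len] = '\0';
    return strlen(src) <= len;
}
```

Whether silent truncation is acceptable depends on the program, so returning a flag lets the caller decide.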
Compile time [6] - Runtime [2] - Error [0]
7c - strncmp
#include <stdio.h>
#include <string.h>
#include <stddef.h>

void unsafe_compare(char *str1, char *str2)
{
    if (strcmp(str1, str2) == 0) {
        printf("strcmp: Strings are equal\n");
    } else {
        printf("strcmp: String are not equal\n");
    }
}

void safe_compare(char *str1, char *str2, size_t len)
{
    if (strncmp(str1, str2, len) == 0) {
        printf("strncmp: Strings are equal\n");
    } else {
        printf("strncmp: String are not equal\n");
    }
}

int main(void)
{
    char str1[] = "KingC";
    char str2[5] = {'K','i','n','g','C'};
    size_t len = 5;
    unsafe_compare(str1, str2);
    safe_compare(str1, str2, len);
    return 0;
}
Command: gcc-14 -o unsafe_07c_strncmp unsafe_07c_strncmp.c
-- on this one we don't need the static analyzer or other compiler flags.
Output:
strcmp: String are not equal
strncmp: Strings are equal
Comment: as you can see, the strcmp result is wrong while the strncmp result is correct. Both hold the characters "KingC", but str2 has no null terminator, so strcmp reads past the end of it.
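One caveat worth keeping in mind: strncmp with a length of 5 only compares the first 5 characters, so "KingCobra" would also compare equal to "KingC". If you know both lengths, a stricter check is possible. This sketch swaps in memcmp, and equal_with_lengths is a made-up name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: equality for character data whose lengths are
 * known, so no '\0' terminator is needed on either side. */
bool equal_with_lengths(const char *a, size_t a_len,
                        const char *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```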
7d - strncat
#include <stdio.h>
#include <string.h>
#include <stddef.h>

void unsafe_concat(char *dest_unsafe, char *src)
{
    strcat(dest_unsafe, src);
    printf("strcat: %s\n", dest_unsafe);
}

void safe_concat(char *dest_safe, size_t dest_size, char *src)
{
    size_t len = dest_size - strlen(dest_safe) - 1;
    strncat(dest_safe, src, len);
    printf("strncat: %s\n", dest_safe);
}

int main(void)
{
    char dest_unsafe[15] = "SpongeBob";
    char dest_safe[15] = "SpongeBob";
    char src[] = "SquarePants";
    unsafe_concat(dest_unsafe, src);
    safe_concat(dest_safe, sizeof(dest_safe), src);
    return 0;
}
Command: gcc-14 -o unsafe_07d_strncat unsafe_07d_strncat.c
Output:
strcat: SpongeBobSquarePants
strncat: SpongeBobSquar
Comment: strncat correctly concatenates the strings according to the space available in the destination, while strcat concatenates everything, which clearly writes out of bounds.
Static Analyzer: gcc-14_safe -o unsafe_07d_strncat unsafe_07d_strncat.c -fanalyzer
Comment:
- Static analyzer caught the error at compile time.
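The safe_concat pattern still trusts that the destination is already null-terminated, because it calls strlen on it. A slightly more paranoid sketch is below; concat_checked is a hypothetical name, it looks for the terminator only inside the buffer and also reports whether the source was truncated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends `src` to `dest` without leaving the buffer.
 * Returns false if `dest` is not terminated or `src` did not fully fit. */
bool concat_checked(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    const char *end = memchr(dest, '\0', dest_size);
    if (end == NULL) {
        return false; /* dest is not terminated inside its buffer */
    }
    const size_t used = (size_t)(end - dest);
    const size_t room = dest_size - used - 1; /* keep one byte for '\0' */
    strncat(dest, src, room);
    return strlen(src) <= room;
}
```

With the 15-byte destination from the example, it appends only "Squar" and returns false to signal the truncation.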
Compile time [7] - Runtime [2] - Error [0]
Let's continue the rest on Part 2?
Hey, you know what? This post has gotten very long due to the inclusion of the safer functions... I prefer to read and write posts that are "bite-sized".
In the next part we will deal with:
- Arrays.
- Integer overflows.
- Pointer arithmetic.
- Extremely safe C variant as discussed in the video.
Maybe I'll write Part-2 over the weekend. Cheers!
C is freedom, it can be as safe or unsafe as you want it to be!