Safe User Input Handling in C
Asking users for input is common in C programs, but doing so safely can be tricky. This guide covers best practices for reading user input without falling into the common pitfalls of functions like scanf(). Incorrect handling can lead to undefined behaviour, security vulnerabilities, and crashes. Heartbleed, for example, came from trusting a length value supplied in the input without checking it against the actual data size.
The following is based on the excellent article by Felix Palmen: “A Beginner’s Guide Away from scanf()”.
Why Not Just Use scanf()?
scanf() isn’t inherently broken, but for beginners, it can lead to serious problems. Here’s Rule 0:
Rule 0: Don’t use scanf() unless you know exactly what you’re doing.
1. Reading a Number from the User
The Classic Example
// example1.c
#include <stdio.h>

int main(void) {
    int a;
    printf("enter a number: ");
    scanf("%d", &a);
    printf("You entered %d.\n", a);
}
Looks fine? Try this:
$ ./example1
enter a number: abcdefgh
You entered 38.
Why?
- scanf() tried to parse an integer but found no digits.
- The variable a remains uninitialised, so reading it is undefined behaviour.
Undefined Behaviour in C
C doesn’t protect you from mistakes. If the input doesn’t match the format, scanf() leaves the variable untouched. Reading an uninitialised variable is undefined behaviour: the program may crash or print rubbish values.
Retrying with scanf()
#include <stdio.h>

int main(void) {
    int a;
    printf("enter a number: ");
    while (scanf("%d", &a) != 1) {
        printf("enter a number: ");
    }
    printf("You entered %d.\n", a);
}
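Problem: If the input is invalid (e.g., abc), scanf() never “consumes” it. The offending characters stay in the input buffer, so every retry fails on the same input and the loop never ends. The same happens at end-of-file, where scanf() returns EOF on every call.
Rule 1: scanf() is for parsing (extracting formatted data), not for reading raw input.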
2. Reading a String from the User
Unsafe Example
#include <stdio.h>

int main(void) {
    char name[12];
    printf("What's your name? ");
    scanf("%s", name);
    printf("Hello %s!\n", name);
}
Issue: Buffer overflow if input exceeds 11 characters → leading to the dreaded undefined behaviour.
Fix with Field Width
#include <stdio.h>

int main(void) {
    char name[40];
    printf("What's your name? ");
    scanf("%39s", name);
    printf("Hello %s!\n", name);
}
Rule 2: Always use field widths with %s to prevent overflow.
%s Reads Only One Word
scanf("%s", name) stops at whitespace. For full lines, use scansets:
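scanf(" %39[^\n]", name);
The leading space skips any whitespace, including newlines, so pressing Enter without input does not fail here: scanf() simply keeps waiting. But beware: without the leading space, an empty line makes the conversion fail and leaves name uninitialised.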
Rule 3: scanf() format strings differ from those of printf(). Read carefully.
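As a minimal sketch of the empty-line pitfall, the helper below applies the scanset conversion (without the leading space) to a string via sscanf() and checks the return value. The name first_line() is made up for this illustration; the point is only that the buffer gets a defined value and the caller learns whether anything was read.

```c
#include <stdio.h>

/* Hypothetical helper: copies the first line of `input` (at most 39
 * characters) into `out`. Returns 1 on success and 0 if the line is
 * empty or there is no input at all. */
int first_line(const char *input, char out[40]) {
    out[0] = '\0';  /* give the buffer a defined value even on failure */
    /* sscanf() returns the number of successful conversions, so anything
     * other than 1 means the scanset matched nothing. */
    return sscanf(input, "%39[^\n]", out) == 1;
}
```

With "Ada Lovelace\n" as input the helper stores Ada Lovelace (spaces included) and returns 1; with "\n" it returns 0 and leaves an empty string in the buffer.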
3. A Better Way: fgets()
fgets() reads a whole line safely:
#include <stdio.h>
#include <string.h>

int main(void) {
    char name[40];
    printf("What's your name? ");
    if (fgets(name, 40, stdin)) {
        name[strcspn(name, "\n")] = 0; // remove newline
        printf("Hello %s!\n", name);
    }
}
fgets() stores the newline character (from the user pressing Enter) in the buffer, so we use strcspn() to find it and overwrite it with the string terminator.
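One thing the example above does not handle is a line longer than the buffer: fgets() then stores only the part that fits and leaves the rest for the next read. The following is a rough sketch of one way to deal with that. The function read_line() and its return codes are assumptions made for this illustration, and it assumes a buffer size of at least 2.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` (capacity `size`).
 * Returns 1 on success, 0 on end-of-file or read error, and -1 if the line
 * was too long (the remainder of that line is discarded). */
int read_line(FILE *in, char *buf, int size) {
    if (!fgets(buf, size, in))
        return 0;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';  /* complete line: strip the newline */
        return 1;
    }

    /* No newline stored: either the line was truncated or the input ended
     * without a trailing newline. Drain the rest of the line to find out. */
    int c, extra = 0;
    while ((c = getc(in)) != EOF && c != '\n')
        extra = 1;
    return extra ? -1 : 1;
}
```

Discarding the tail keeps the next call from picking up the leftover half of an overlong line. A caller would typically print a message and prompt again when it sees -1.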
4. Reading Numbers Without scanf()
Use atoi (simple but limited)
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int a;
    char buf[1024];
    do {
        printf("enter a number: ");
        if (!fgets(buf, 1024, stdin)) return 1;
        a = atoi(buf);
    } while (a == 0);
    printf("You entered %d.\n", a);
}
atoi() — ASCII to Integer
- Purpose: Converts a string (e.g., "123") into an int.
- Header: #include <stdlib.h>
- Usage: int value = atoi("123"); // value = 123
- Behaviour:
  - Ignores leading whitespace.
  - Reads an optional sign and digits, stopping at the first non-digit character.
  - Returns 0 if no valid conversion is possible.
- Limitations:
  - No error reporting: you can’t tell whether 0 means “invalid input” or an actual zero.
  - Doesn’t handle overflow or invalid characters well: "123abc" is silently read as 123, and a number larger than INT_MAX results in undefined behaviour.
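To make the ambiguity concrete, here is a tiny sketch. The function show_atoi() is a made-up name used only for this illustration; it prints what atoi() makes of a given string.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints the result of atoi() for the string `s`. */
void show_atoi(const char *s) {
    printf("atoi(\"%s\") = %d\n", s, atoi(s));
}
```

Calling it with "0" and with "abc" reports 0 both times, so the caller cannot tell a valid zero from garbage. With "  42" it reports 42, and with "123abc" it reports 123 without any hint that trailing characters were ignored.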
Another simple function is atof() for floating-point numbers.
atof() — ASCII to Floating-point
- Purpose: Converts a string (e.g., "3.14") into a double.
- Header: #include <stdlib.h>
- Usage: double value = atof("3.14"); // value = 3.14
- Behaviour:
  - Similar to atoi(), but parses floating-point numbers.
  - Stops at the first invalid character.
- Limitations:
  - Same as atoi(): no error checking, no way to detect invalid input reliably.
Better Alternatives
- Use strtol() for integers and strtod() for floating-point values:
  - They provide error checking via endptr and errno.
  - Safer for robust input handling.
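For the floating-point case, the endptr and errno checks might look like the sketch below. The wrapper parse_double() is an assumed name, not a standard function. Note that strtod() itself also accepts forms such as "inf", "nan" and hexadecimal floats, which this sketch does not filter out.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses `s` as a double.
 * Returns 1 and stores the value in *out on success; returns 0 if `s`
 * holds no number, has trailing characters, or is out of range. */
int parse_double(const char *s, double *out) {
    char *end;
    errno = 0;                      /* strtod() only sets errno on error */
    double v = strtod(s, &end);

    if (end == s)
        return 0;                   /* nothing was converted */
    if (errno == ERANGE)
        return 0;                   /* value out of range for a double */
    if (*end != '\0' && *end != '\n')
        return 0;                   /* trailing characters after the number */

    *out = v;
    return 1;
}
```

A trailing newline is tolerated so the helper can be fed a buffer straight from fgets().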
Using strtol() (robust)
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

int main(void) {
    long a;
    char buf[1024];
    int success;
    do {
        printf("enter a number: ");
        if (!fgets(buf, 1024, stdin)) return 1;
        char *endptr;
        errno = 0;
        a = strtol(buf, &endptr, 10);
        if (errno == ERANGE || endptr == buf || (*endptr && *endptr != '\n'))
            success = 0;
        else
            success = 1;
    } while (!success);
    printf("You entered %ld.\n", a);
}
strtol() — String to Long Integer
- Purpose: Converts a string (e.g., "123") into a long integer with error checking.
- Header: #include <stdlib.h> (plus #include <errno.h> for the errno check)
- Function Signature: long strtol(const char *nptr, char **endptr, int base);
- Parameters:
  - nptr: Pointer to the string to convert.
  - endptr: Pointer to a pointer; after conversion, it points to the first character that wasn’t converted.
  - base: Number base (e.g., 10 for decimal, 16 for hex).
How It Works
- Reads characters from the string and converts them to a number.
- Stops at the first invalid character.
- Sets errno to ERANGE if the number is too large or too small for a long.
- Allows you to check:
  - If nothing was converted (endptr == nptr).
  - If there are extra characters after the number (*endptr is neither '\0' nor '\n').
Why strtol() is Better than atoi()
- Error detection: Can tell if conversion failed or if extra characters exist.
- Range checking: Detects overflow/underflow.
- Flexible: Works with different bases (binary, hex, etc.).
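The three checks can be wrapped in a small function so that callers see exactly why a conversion failed. This is only a sketch: the enum parse_status and the function parse_long() are names invented here, not part of the standard library.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum parse_status { PARSE_OK, PARSE_EMPTY, PARSE_TRAILING, PARSE_RANGE };

/* Hypothetical helper: converts `s` in the given base and reports what,
 * if anything, went wrong. *out is only written on PARSE_OK. */
enum parse_status parse_long(const char *s, int base, long *out) {
    char *end;
    errno = 0;                          /* clear any stale error first */
    long v = strtol(s, &end, base);

    if (end == s)
        return PARSE_EMPTY;             /* nothing was converted */
    if (errno == ERANGE)
        return PARSE_RANGE;             /* too large or too small for long */
    if (*end != '\0' && *end != '\n')
        return PARSE_TRAILING;          /* extra characters after the number */

    *out = v;
    return PARSE_OK;
}
```

With base 10, "123\n" yields PARSE_OK and 123, "abc" yields PARSE_EMPTY, and "123abc" yields PARSE_TRAILING. With base 16, "ff" yields PARSE_OK and 255.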
5. Can scanf() Be Fixed?
Yes, but it’s tricky:
#include <stdio.h>

int main(void) {
    int a, rc;
    printf("enter a number: ");
    while ((rc = scanf("%d", &a)) == 0) {
        scanf("%*[^\n]"); // discard invalid input
        printf("enter a number: ");
    }
    if (rc == EOF)
        printf("Nothing more to read.\n");
    else
        printf("You entered %d.\n", a);
}
Rule 4: scanf() is powerful but dangerous. Prefer simpler, safer functions like fgets() for reading input.
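In the spirit of Rule 1, one middle ground is to read the line with fgets() and only then parse it with sscanf(). The sketch below shows the parsing half; parse_int_line() is a hypothetical name. One caveat: with %d an out-of-range number is still undefined behaviour, so strtol() remains the more robust choice.

```c
#include <stdio.h>

/* Hypothetical helper: accepts a line that contains exactly one int,
 * optionally surrounded by whitespace. Returns 1 and stores the value in
 * *out on success, 0 otherwise. */
int parse_int_line(const char *line, int *out) {
    int value;
    char extra;
    /* " %c" skips whitespace and then tries to read one more character.
     * A return value of 2 means something followed the number; 0 or EOF
     * means there was no number at all. Only 1 is a clean match. */
    if (sscanf(line, "%d %c", &value, &extra) != 1)
        return 0;
    *out = value;
    return 1;
}
```

Because the whole line has already been read, a failed parse leaves nothing behind in the input buffer, and the retry loop cannot get stuck on the same characters.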
Best Practices for Safe Input in C
- Avoid scanf() for general input
  - Use fgets() for reading lines and then parse manually.
- Always validate input
  - Check return values of functions like fgets(), strtol(), etc.
- Prevent buffer overflows
  - Use field widths with %s if you must use scanf().
  - Allocate enough space for strings (including the \0 terminator).
- Use robust conversion functions
  - Prefer strtol(), strtod(), etc., over atoi() for better error handling.
- Handle edge cases
  - Empty input
  - Extra characters after numbers
  - Out-of-range values
- Never use fflush(stdin)
  - It’s undefined behaviour in C.
- Read the manual
  - scanf() and printf() format strings differ, so know the rules.
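Several of these points can be combined in one small routine. The following is a sketch under assumptions, not a definitive implementation: read_int_in_range() is a hypothetical helper, the 64-byte buffer is an arbitrary choice, and a line longer than the buffer is simply processed in pieces.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads lines from `in` until one holds a whole number
 * in [min, max]. Returns 1 and stores it in *out, or 0 when the input ends
 * first. */
int read_int_in_range(FILE *in, int min, int max, int *out) {
    char buf[64];
    while (fgets(buf, (int)sizeof buf, in)) {
        char *end;
        errno = 0;
        long v = strtol(buf, &end, 10);

        if (end == buf)
            continue;                       /* empty input or not a number */
        if (*end != '\n' && *end != '\0')
            continue;                       /* extra characters after number */
        if (errno == ERANGE || v < min || v > max)
            continue;                       /* out-of-range value */

        *out = (int)v;                      /* safe: v lies within [min, max] */
        return 1;
    }
    return 0;                               /* end of file or read error */
}
```

An interactive program would print a prompt before each attempt; that is left out here to keep the sketch short.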