Introduction
File handling is a fundamental aspect of programming that allows us to interact with data stored on our computers. Whether it's reading text from a file, writing data to it, or manipulating its contents, mastering Input/Output (IO) operations is essential for any developer.
In this article, we will cover some file handling concepts in C: low-level and high-level file I/O, file descriptors and more.
Importance of File Handling
- When a program terminates, all the data it held in memory is lost. Storing data in a file preserves it after the program exits.
- Entering a large amount of data by hand takes some time. If the data is already in a file, you can easily access its contents with a few commands.
- File handling enables the reading and writing of configuration files, allowing users to tailor software settings to their preferences.
- Through file handling, programs that need to exchange data with other programs or systems can share data in various formats with external entities.
- Regularly saving data to files ensures that valuable information can be restored in case of system failures or data loss.
Getting started with File Handling
At its core, file handling involves performing operations on files, which include opening, reading, writing and closing them. In C programming, file handling is built on the following:
- Streams
- File pointers
- File descriptors.
Let's look at these more closely
Streams
Streams are a fundamental abstraction for file handling in C. They provide a high-level interface for reading from and writing to files. C provides three standard streams, stdin (standard input), stdout (standard output) and stderr (standard error), which are automatically available for input and output.
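As a small illustration of the standard streams, here is a minimal sketch that sends a regular message to one stream and a diagnostic to another. The helper name `report_progress` and its messages are made up for this example; `fprintf()` itself behaves the same on `stdout`, `stderr` or any other open stream.

```c
#include <stdio.h>

/* Hypothetical helper (not part of any library): sends a regular
 * message to `out` and, if some files were skipped, a warning to `err`.
 * Returns 0 on success, -1 if a write failed. */
int report_progress(FILE *out, FILE *err, int done, int total)
{
    if (fprintf(out, "Processed %d of %d files\n", done, total) < 0)
        return -1;
    if (done < total &&
        fprintf(err, "warning: %d files skipped\n", total - done) < 0)
        return -1;
    return 0;
}
```

Calling `report_progress(stdout, stderr, 3, 5)` and running the program as `./prog > out.txt` should put the progress line in the file while the warning still shows up on the terminal, because the two streams are separate.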
File Pointers
In C, a file pointer is a variable of type FILE * that refers to an open stream. Each stream keeps track of the current position within the file, which determines where the next read or write operation will occur. This position advances as you read or write, which is what makes sequential file access work, and it can be moved to navigate through the file's contents.
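To make the idea of a current position concrete, here is a minimal sketch that reads a few characters and then asks the stream where it is using `ftell()`. The helper name `position_after_reading` is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `count` characters from `fp` and
 * returns the stream's position afterwards (or -1L if ftell() fails). */
long position_after_reading(FILE *fp, int count)
{
    for (int i = 0; i < count; i++) {
        if (fgetc(fp) == EOF)
            break;              /* end of file or read error */
    }
    return ftell(fp);           /* offset from the start of the file */
}
```

After reading, `fseek(fp, 0, SEEK_SET)` or `rewind(fp)` moves the position back to the start of the file so the same data can be read again.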
File Descriptors
File descriptors are low-level integer identifiers that the operating system uses to represent open files. Each of the three standard streams we saw above sits on top of its own file descriptor.
Below is a table to summarize the various descriptors and their corresponding streams.
Integer Value | Name | Symbolic Constant | File Stream |
---|---|---|---|
0 | Standard Input | STDIN_FILENO | stdin |
1 | Standard Output | STDOUT_FILENO | stdout |
2 | Standard Error | STDERR_FILENO | stderr |
NOTE
stdin: Used to read input from the user or another program.
stdout: Used to write output to the user or another program.
stderr: Used to write error messages and other diagnostic output to the user.
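The mapping in the table can be checked from code with `fileno()`, a POSIX function that returns the descriptor behind a stream. The sketch below assumes a POSIX system; the helper name `standard_streams_match` is made up for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the three standard streams sit on
 * descriptors 0, 1 and 2, as in the table above, and 0 otherwise. */
int standard_streams_match(void)
{
    return fileno(stdin)  == STDIN_FILENO  &&
           fileno(stdout) == STDOUT_FILENO &&
           fileno(stderr) == STDERR_FILENO;
}
```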
Basic operations in File handling with C
There are four (4) basic operations in file handling with C: opening, reading, writing and closing. A file must be opened before it can be read or written, and it should be closed once you are done with it.
Aside from the four (4) basic operations, there are generally two (2) approaches to carrying them out: the low-level approach with system calls and the high-level approach with the standard library.
Another point to note is that files can hold plain text (including formats such as CSV) or binary data. This article focuses on text files.
Let us look at the basic operations in file handling. For each operation, we will look at its implementation in both the high-level and low-level approaches.
Opening a File
- This is the first step in file handling. It establishes a connection between your program and the file on disk.
- When opening a file, you specify the following parameters:
- File's name
- Location
- Mode in which you want to open the file. The mode specifies what exactly you would like to do with the file you are opening: reading only, writing only, appending, etc. See man fopen
High-Level Approach
FILE *file = fopen("example.txt", "r");
RETURN VALUE: A FILE pointer on success, NULL otherwise.
Low-Level Approach
int fd = open("example.txt", O_RDONLY);
man open
RETURN VALUE: New file descriptor if successful, -1 if an error occurred.
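Since both calls can fail, it helps to check the return value straight away. Below is a minimal sketch of that check for each approach. The helper names `open_for_reading` and `fopen_for_reading` are invented for this example, and it assumes a POSIX system, where `fopen()` also sets `errno` on failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens `path` read-only with open() and prints
 * the reason on stderr if that fails. Returns the descriptor or -1. */
int open_for_reading(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    return fd;
}

/* The same check with the stream interface: fopen() returns NULL on
 * failure (and, on POSIX systems, sets errno as well). */
FILE *fopen_for_reading(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    return fp;
}
```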
Below is a table to understand various modes and how they correspond to each other in high-level and low-level approaches.
fopen() mode | open() flags | Usage |
---|---|---|
r | O_RDONLY | Opens file for reading |
r+ | O_RDWR | Opens file for reading and writing |
w | O_WRONLY \| O_CREAT \| O_TRUNC | Opens for writing. The file is created if it doesn't exist; otherwise it is truncated (cleared). |
w+ | O_RDWR \| O_CREAT \| O_TRUNC | Opens for reading and writing. The file is created if it doesn't exist; otherwise it is truncated. |
a | O_WRONLY \| O_CREAT \| O_APPEND | Opens for appending (writing at end of file). The file is created if it doesn't exist. |
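To see the difference between truncating and appending, here is a rough sketch that writes some text with a given mode and then reports how large the file ended up. The helper `size_after_write` is hypothetical and written only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` with the given fopen() mode, writes
 * `text`, closes the file and returns the file's resulting size in
 * bytes, or -1 on failure. */
long size_after_write(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)
        return -1;

    /* Reopen for reading and seek to the end to measure the file. */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
```

On a POSIX system, writing "abc" with mode "w" should leave a 3-byte file, following that with "de" in mode "a" should grow it to 5 bytes, and a final "x" in mode "w" should shrink it back to 1 byte, since "w" discards the old contents.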
Reading from a File
- This involves retrieving data from an existing file on disk.
- You can read data character by character, line by line or in larger chunks, depending on your program's requirements.
High-Level Approach
There are two common ways to read from files with this approach. They are:
fgets(): This reads text line by line and stores each line in a buffer.
FILE *file = fopen("example.txt", "r");
char buffer[1024];
if (file != NULL) {
    while (fgets(buffer, sizeof(buffer), file) != NULL) {
        // Process each line in the file
    }
    fclose(file);
}
RETURN VALUE: A pointer to the buffer on success, NULL on error or at end of file.
man fgets
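One detail worth knowing about `fgets()` is that a line longer than the buffer arrives in several pieces. The sketch below counts lines with that in mind; the helper name `count_lines` and the deliberately small 64-byte buffer are choices made for this example only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in an open text stream.
 * A piece returned by fgets() completes a line only if it ends in
 * '\n'; a final piece without '\n' is counted as one more line.
 * Returns the line count, or -1 if a read error occurred. */
long count_lines(FILE *fp)
{
    char buffer[64];
    long lines = 0;
    int in_partial_line = 0;

    while (fgets(buffer, sizeof(buffer), fp) != NULL) {
        size_t len = strlen(buffer);
        if (len > 0 && buffer[len - 1] == '\n') {
            lines++;
            in_partial_line = 0;
        } else {
            in_partial_line = 1;    /* long line, or last line without '\n' */
        }
    }
    if (in_partial_line)
        lines++;
    return ferror(fp) ? -1 : lines;
}
```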
fread(): This reads a specified number of bytes from a file into a buffer. It is also the function to use for reading binary files.
man fread
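FILE *file = fopen("example.txt", "r"); // Check for NULL before reading
char buffer[1024];
size_t bytes_to_read = sizeof(buffer);
size_t bytes_read = fread(buffer, 1, bytes_to_read, file);
RETURN VALUE: Number of items read, which is less than the number requested at end of file or on error.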
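Because `fread()` returns a short count both at end of file and on error, a loop usually needs `ferror()` to tell the two apart. Here is a minimal sketch of that pattern; the helper name `count_bytes` is an assumption made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: adds up how many bytes can be read from `fp`.
 * Returns the total, or -1 if fread() stopped because of an error
 * rather than end of file. */
long count_bytes(FILE *fp)
{
    char buffer[1024];
    long total = 0;
    size_t n;

    while ((n = fread(buffer, 1, sizeof(buffer), fp)) > 0)
        total += (long)n;

    /* fread() returns 0 at end of file and on error alike;
     * ferror() tells the two apart. */
    if (ferror(fp))
        return -1;
    return total;
}
```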
Low-Level Approach
This uses the read() function.
char buffer[1024];
ssize_t bytes_read = read(fd, buffer, sizeof(buffer));
if (bytes_read == -1) {
    // Handle the error
} else if (bytes_read == 0) {
    // End of file: nothing left to read
} else {
    // Process bytes_read bytes of data in the buffer
}
NOTE: fd is the return value of the open() function.
man read
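A single `read()` call may return fewer bytes than requested, for example when reading from a pipe or terminal. One common way to cope is to loop until the buffer is full or the input ends. This is only a sketch; the helper name `read_full` is made up here.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until `count` bytes have
 * arrived, end of file is reached, or an error occurs.
 * Returns the number of bytes stored in `buf`, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, (char *)buf + total, count - total);
        if (n == 0)
            break;                  /* end of file */
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```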
Writing to a file
- This adds or updates data in a file.
- You can write character by character, line by line or in larger blocks as well.
- Writing is essential for tasks like creating log files, saving program output or storing user-generated content.
High-Level Approach
This approach uses fprintf(), which writes formatted text data to a file.
FILE *file = fopen("example.txt", "w"); // Open for writing
fprintf(file, "Hello, %s!\n", "World");
man fprintf
fwrite() writes a specified number of bytes from a buffer to a file.
FILE *file = fopen("data.bin", "wb"); // Open for binary writing
char data[] = {0x01, 0x02, 0x03};
size_t bytes_to_write = sizeof(data);
size_t bytes_written = fwrite(data, 1, bytes_to_write, file);
Low-Level Approach
write() is the function used here.
int fd = open("example.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
const char *text = "Hello, World!\n";
ssize_t bytes_written = write(fd, text, strlen(text));
NOTE: Change the flags passed to open() to get other behaviour, such as O_APPEND for appending.
In the low-level approach, you have more control over the writing process and work directly with raw bytes. However, there is no library buffering or formatting, so you need to manage buffering and handle text formatting yourself if you're working with text files.
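One thing you take on with the low-level approach is that `write()` is allowed to accept fewer bytes than you asked for. A minimal sketch of a loop that keeps going until everything is written could look like this; the helper name `write_all` is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write() may accept fewer bytes than requested,
 * so loop until everything in `buf` has been written.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        count -= (size_t)n;
    }
    return 0;
}
```

It would be used in place of a bare `write()` call, for example `write_all(fd, text, strlen(text))`.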
Closing a File
- This is the final step in file handling, and it's essential for releasing system resources and ensuring data integrity.
- Once you finish reading or writing, you should close the file to free up the file descriptor and ensure that all pending changes are saved.
- Failing to close a file properly can result in resource leaks and data loss.
High level approach
FILE *file = fopen("example.txt", "w"); // Open for writing
// Write data to the file
fclose(file); // Close the file when done
Low level approach
int fd = open("example.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644); // Open for writing
// Write data to the file
close(fd); // Close the file descriptor when done
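Because `fclose()` flushes any buffered output, a write can appear to succeed and only fail when the file is closed, so its return value is worth checking too. Here is a small sketch of that idea; the helper name `save_text` is invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes `text` to `path`, replacing any previous
 * contents. Returns 0 only if the data was written AND the stream was
 * closed without error, since fclose() flushes buffered output. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int ok = (fputs(text, fp) != EOF);

    /* Always close, even after a failed write, to release the stream. */
    if (fclose(fp) == EOF)
        ok = 0;

    return ok ? 0 : -1;
}
```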
By now you will have noticed that there are two approaches to handling files in C. The high-level approach is the more commonly used one. However, let us look at some differences between the two. This will help us decide which of them to use in a given situation.
Aspect | High level Approach | Low level Approach |
---|---|---|
File representation | Uses a file stream, represented by FILE *, for file operations | Uses file descriptors, represented by int, for file operations |
Common functions | Common functions include fopen(), fclose(), fgets(), fprintf(), fwrite(). | Common functions include open(), close(), read(), write(). |
Text vs. Binary Files | Suitable for both text and binary files, with functions handling text encoding and formatting. | Suitable for both text and binary files, but you need to handle text encoding and formatting manually if required. |
Error Handling | Functions signal failure through their return values (NULL, EOF or a short count); ferror() and feof() tell an error apart from end of file. | Relies on a return value of -1 from functions like read() and write(), with the cause stored in errno. |
Resource Cleanup | fclose() flushes buffered data and releases the stream's resources. | Requires closing file descriptors with close(); there is no library buffer to flush. |
Now let's look at some examples with practical questions
QUESTION 1
Implement a program to read text from a file
SOLUTION
High level approach
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s source_file\n", argv[0]);
        return 1;
    }
    FILE *source = fopen(argv[1], "r");
    if (!source) {
        perror("Error");
        return 1;
    }
    char buffer[1024];
    while (fgets(buffer, sizeof(buffer), source)) {
        printf("%s", buffer);
    }
    fclose(source);
    return 0;
}
Low level approach
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s source_file\n", argv[0]);
        return 1;
    }
    int source_fd = open(argv[1], O_RDONLY);
    if (source_fd == -1) {
        perror("Error");
        return 1;
    }
    char buffer[1024];
    ssize_t bytes_read;
    while ((bytes_read = read(source_fd, buffer, sizeof(buffer))) > 0) {
        if (write(STDOUT_FILENO, buffer, bytes_read) == -1) {
            perror("Error");
            close(source_fd);
            return 1;
        }
    }
    // read() returns -1 on error and 0 at end of file
    if (bytes_read == -1) {
        perror("Error");
    }
    close(source_fd);
    return bytes_read == -1 ? 1 : 0;
}
QUESTION 2
Implement a program to append some text to a file. The program should take the text from the user (stdin) and then append it to the file.
SOLUTION
High level approach
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s destination_file\n", argv[0]);
        return 1;
    }
    FILE *destination = fopen(argv[1], "a");
    if (!destination) {
        perror("Error");
        return 1;
    }
    char input[1024];
    printf("Enter text (Ctrl-D to end):\n");
    while (fgets(input, sizeof(input), stdin)) {
        fputs(input, destination);
    }
    fclose(destination);
    return 0;
}
Low level approach
#include <stdio.h>
#include <stdlib.h>
#include <string.h> // for strlen()
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s destination_file\n", argv[0]);
        return 1;
    }
    int destination_fd = open(argv[1], O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (destination_fd == -1) {
        perror("Error");
        return 1;
    }
    char input[1024];
    printf("Enter text (Ctrl-D to end):\n");
    while (fgets(input, sizeof(input), stdin)) {
        if (write(destination_fd, input, strlen(input)) == -1) {
            perror("Error");
            close(destination_fd);
            return 1;
        }
    }
    close(destination_fd);
    return 0;
}
QUESTION 3
Implement a File Copy Program (like cp command)
SOLUTION
High level approach
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
    if (argc != 3) {
        printf("Usage: %s source_file destination_file\n", argv[0]);
        return 1;
    }
    // Check each fopen() separately so perror() reports the right failure
    FILE *source = fopen(argv[1], "rb");
    if (!source) {
        perror("Error opening source");
        return 1;
    }
    FILE *destination = fopen(argv[2], "wb");
    if (!destination) {
        perror("Error opening destination");
        fclose(source);
        return 1;
    }
    char buffer[1024];
    size_t bytes_read;
    while ((bytes_read = fread(buffer, 1, sizeof(buffer), source)) > 0) {
        if (fwrite(buffer, 1, bytes_read, destination) != bytes_read) {
            perror("Error");
            fclose(source);
            fclose(destination);
            return 1;
        }
    }
    fclose(source);
    fclose(destination);
    return 0;
}
Low level approach
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[]) {
    if (argc != 3) {
        printf("Usage: %s source_file destination_file\n", argv[0]);
        return 1;
    }
    // Check each open() separately so perror() reports the right failure
    int source_fd = open(argv[1], O_RDONLY);
    if (source_fd == -1) {
        perror("Error opening source");
        return 1;
    }
    int destination_fd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (destination_fd == -1) {
        perror("Error opening destination");
        close(source_fd);
        return 1;
    }
    char buffer[1024];
    ssize_t bytes_read;
    while ((bytes_read = read(source_fd, buffer, sizeof(buffer))) > 0) {
        if (write(destination_fd, buffer, bytes_read) == -1) {
            perror("Error");
            close(source_fd);
            close(destination_fd);
            return 1;
        }
    }
    close(source_fd);
    close(destination_fd);
    return 0;
}
NOTE: In the low-level implementation, the open() function takes an optional third argument known as the mode, which is required when O_CREAT is used. It sets the read, write and execute permissions of a newly created file, represented by their numerical (octal) values, such as 0644.
More about file permissions here
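To see what the permissions argument does in practice, here is a rough sketch that creates a file with a requested mode and then asks the system which permission bits the file actually has. The helper name `create_with_mode` is an assumption made for this example. Note that the process umask can clear some of the requested bits, and that the mode only takes effect when the file is newly created.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates (or truncates) `path`, requesting the
 * given permission bits, and returns the bits the file actually has,
 * or -1 on failure. The umask may clear bits of a newly created file. */
int create_with_mode(const char *path, mode_t mode)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
    if (fd == -1)
        return -1;

    struct stat st;
    int result = -1;
    if (fstat(fd, &st) == 0)
        result = (int)(st.st_mode & 0777);   /* keep only the rwx bits */

    close(fd);
    return result;
}
```

In a value like 0644, each octal digit covers one group (owner, group, others), with read counting as 4, write as 2 and execute as 1. So 0644 means read and write for the owner and read-only for everyone else. With a typical umask of 022, requesting 0666 for a new file should also end up as 0644.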
Congratulations on making it this far. As you have seen, file handling is a very useful programming concept. I would love to hear about your experience with file handling in the comment section.
Follow me on GitHub, let's get interactive on Twitter and form great connections on LinkedIn.
Happy coding!