DEV Community

Kshitij Srivastava

stdint.h et al

One of the first things to realise about C is that a lot of important things are left as implementation details. The C standard doesn't fix the exact size of an int, a float, or even a char; it only guarantees minimum ranges (a char, for instance, has at least 8 bits). Add to that the fact that a simple boolean type was not part of its core data-types until C99 introduced _Bool.
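To see what a given compiler and target actually chose, the sizes can simply be printed. The following is a minimal sketch; `storage_bits` and `print_type_sizes` are hypothetical helper names, and the numbers it prints will differ between platforms.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: storage size in bits of an object that occupies `bytes` bytes.
size_t storage_bits(size_t bytes) {
    return bytes * CHAR_BIT;  // CHAR_BIT is the number of bits in a char (at least 8)
}

// Prints the sizes this particular compiler and target happen to use.
void print_type_sizes(void) {
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("short: %zu bits\n", storage_bits(sizeof(short)));
    printf("int:   %zu bits\n", storage_bits(sizeof(int)));
    printf("long:  %zu bits\n", storage_bits(sizeof(long)));
    printf("float: %zu bits\n", storage_bits(sizeof(float)));
}
```

Calling `print_type_sizes()` from `main` on two different targets is a quick way to see which of these values are guaranteed and which are merely common.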

So, a header file called stdint.h was introduced in C99 to allow programmers to write code with typedefs for common data-types that specified their underlying sizes.

This lets programmers write code whose integer sizes do not depend on the architecture of the target machine. That is especially useful for a language like C, which is designed to run on everything from the smallest embedded systems to the fastest supercomputers.

An int on most modern 64-bit platforms is 32 bits wide, but the standard does not promise that. So, for consistency's sake, use int32_t instead of int, or uint8_t instead of char. As the name hints, uint8_t is an unsigned 8-bit integer, which is sufficient to store any possible ASCII character.

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>  // malloc, free

int main(void) {
    int number_1 = 5;  // ambiguous size (depends on platform, at least 16 bits)
    int32_t number_2 = 10;  // exactly 32 bits wherever int32_t is provided

    char character_1 = 'a';  // mostly 8 bits (but on some embedded architectures, you never know!)
    uint8_t character_2 = 'b';  // exactly 8 bits wherever uint8_t is provided

    size_t length = 5;
    int32_t* array = malloc(sizeof(int32_t) * length);
    if (array == NULL) {
        return 1;  // allocation failed
    }

    for (size_t i = 0; i < length; i++) {
        array[i] = (int32_t)i;  // i stays within [0, length), so the index is always valid.
                                // The cast makes the otherwise implicit size_t to int32_t conversion explicit.
    }

    free(array);

    return 0;
}
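Two places where a fixed width visibly matters are wrap-around arithmetic and formatted output. The sketch below is illustrative only: `wrap_add_u8` and `format_i32` are hypothetical helpers, and the format macro `PRId32` comes from inttypes.h, a companion header to stdint.h that was also added in C99.

```c
#include <inttypes.h>  // PRId32 and friends; also includes <stdint.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: adds two bytes with wrap-around at 256.
uint8_t wrap_add_u8(uint8_t a, uint8_t b) {
    // a + b is computed as int; converting back to uint8_t reduces it modulo 256.
    return (uint8_t)(a + b);
}

// Hypothetical helper: writes `value` as decimal text into `buf`.
// Returns what snprintf returns: the number of characters needed,
// excluding the terminating null character.
int format_i32(char *buf, size_t size, int32_t value) {
    // "%d" is only right when int32_t happens to be int; PRId32 always matches.
    return snprintf(buf, size, "%" PRId32, value);
}
```

For example, `wrap_add_u8(250, 10)` yields 4 on every platform that provides uint8_t, because the result is always reduced modulo 256.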

This also extends to using size_t and bool from 'stddef.h' and 'stdbool.h' respectively. The latter defines bool as a macro for the built-in _Bool type, with false and true as macros for 0 and 1 (since C23, all three are keywords). It makes it clearer that something is a boolean, and because any value converted to bool becomes either 0 or 1, no other value can slip in by accident, as it could with something less restrictive like an int.
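The difference between bool and an int used as a flag shows up when a non-zero value other than 1 is returned. A small sketch, with hypothetical function names:

```c
#include <stdbool.h>

// Returns whether any bit of `mask` is set in `flags`.
// Converting the non-zero result to bool normalises it to true (1).
bool is_flag_set(unsigned flags, unsigned mask) {
    return flags & mask;
}

// The same test returning int: the raw masked value leaks through.
int is_flag_set_int(unsigned flags, unsigned mask) {
    return (int)(flags & mask);
}
```

With `flags = 0x8` and `mask = 0x8`, the bool version returns 1 while the int version returns 8, so a comparison such as `result == 1` only behaves as intended with bool.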

size_t is where things get more interesting. Officially defined as an unsigned integer type that is at least 16 bits wide, it is used to represent the size of objects. That doesn't seem very platform independent, does it? Well, it's not supposed to be.
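The guarantees can be checked at compile time, while the actual width has to be queried per platform. A minimal sketch, assuming a C11 or later compiler for `_Static_assert`; `size_t_storage_bits` is a hypothetical helper:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

// What the standard guarantees: an unsigned type that can hold at least 65535.
_Static_assert(SIZE_MAX >= 65535u, "size_t must hold at least 16 bits");
_Static_assert((size_t)-1 > 0, "size_t must be unsigned");

// What the platform decides: the actual storage size of size_t in bits.
size_t size_t_storage_bits(void) {
    return sizeof(size_t) * CHAR_BIT;
}
```

On typical 64-bit desktop targets the function returns 64, but nothing in the standard requires that.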

size_t is great at holding object sizes, and of particular note is its use in representing array lengths and, therefore, array indices. Unlike an int, which may be negative, too big, or too small, size_t is a natural fit for such scenarios.
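One thing to watch when indexing with size_t is counting downwards: because the type is unsigned, a condition like `i >= 0` is always true. The sketch below uses a hypothetical `reverse_copy` function to show a loop shape that stays correct, including for an empty array.

```c
#include <stddef.h>
#include <stdint.h>

// Copies `len` elements from `src` into `dst` in reverse order.
void reverse_copy(const int32_t *src, int32_t *dst, size_t len) {
    // `i-- > 0` tests i before decrementing, so the body sees len-1 down to 0
    // and the loop stops cleanly instead of wrapping around to SIZE_MAX.
    for (size_t i = len; i-- > 0;) {
        dst[len - 1 - i] = src[i];
    }
}
```

Writing the loop as `for (size_t i = len - 1; i >= 0; i--)` would never terminate, and with `len == 0` it would start from SIZE_MAX.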

Another nice property of size_t is that it is designed to represent the size of any object. Accordingly, the sizeof operator yields a value of type size_t.
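Because sizeof yields a size_t, element counts and allocation sizes can be computed without leaving that type. The following sketch uses a hypothetical `ARRAY_LEN` macro and a hypothetical `checked_array_bytes` helper that guards the multiplication against wrap-around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Number of elements in a true array (not a pointer); the result has type size_t.
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

// Stores count * elem_size in *out and returns true,
// or returns false if the product would not fit in a size_t.
bool checked_array_bytes(size_t count, size_t elem_size, size_t *out) {
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return false;  // multiplication would wrap around
    }
    *out = count * elem_size;
    return true;
}
```

A caller would typically pass the resulting byte count to malloc only when the function returns true.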

Top comments (3)

Paul J. Lucas
  • The exact fixed-width types only exist on platforms where there's an underlying type that supports them. If some platform has a char that is not 8 bits, then the type int8_t will not exist on that platform. This is why the type int_least8_t (and the other "least" types like it) exist.
  • You should use specific-sized types only if the specific size matters, for example reading data either from disk or from a socket. Otherwise you're conveying incorrect information to the reader.
  • FYI, sizeof(char) is always 1.
  • Array indices can be negative.
  • The sizeof operator returns size_t exactly.
 
Kshitij Srivastava

Thanks for the corrections. Not too sure about a couple of points though.

  • sizeof(char) is always 1 from C99 onwards. I don't think it was standard before that. Plus, that 1 is not necessarily a number of bytes but rather the smallest "memory unit" supported by the compiler. All other results are just sizes measured as a number of chars.
  • Array indices can be negative (it is just decomposed into pointer arithmetic), but using negative array indices leads to unexpected behaviour in nearly all cases, since it's rare to index backwards from the origin of an array.
 
Paul J. Lucas

According to The C Programming Language, 1st edition, page 126:

The expression sizeof( object ) yields an integer equal to the size of the specified object. (The size is given in unspecified units called "bytes," which are the same size as a char.)

So sizeof has apparently always been in char-sized units. I'd guess that in most cases sizeof(char) was typically 1 — it's just that the standard didn't require it until C99.

Some historical supercomputers made all sizes be the same for speed at the expense of memory, but sizeof(char) was still 1 even if the underlying addressable unit was 60+ bits.

Yes, using negative indices is rare, but it's possible to do safely. Look at C code generated by yacc sometime. It generates code where yyvsp is a stack pointer and yyvsp[-1] refers to the element just below the top of the stack, yyvsp[-2] to two elements below, etc.

When explaining anything about a programming language, you have to be (1) precise and (2) complete.