DEV Community

Trish

Recreating strlen and strcmp in Assembly: A Step-by-Step Guide

Writing low-level functions in assembly might seem daunting, but it’s an excellent way to deepen your understanding of how things work under the hood. In this post, we’ll recreate two popular C standard library functions, strlen and strcmp, in x86-64 assembly (NASM syntax, on Linux) and learn how to call them from a C program.

This guide is beginner-friendly, so don’t worry if you’re new to assembly programming. Let’s dive in! 🚀


Table of Contents

  1. Introduction to Assembly and C Integration
  2. What Are strlen and strcmp?
  3. Setting Up Your Environment
  4. Writing strlen in Assembly
  5. Writing strcmp in Assembly
  6. Integrating Assembly with C
  7. Compiling and Running the Program
  8. Expected Output
  9. Conclusion
  10. Connect on Twitter

1. Introduction to Assembly and C Integration

Assembly language operates at a very low level, close to machine code. When combined with a high-level language like C, you get the best of both worlds:

  • High-level readability.
  • Low-level control and optimization.

In this guide, we’ll write two functions in assembly—my_strlen and my_strcmp—and call them from C to demonstrate this integration.


2. What Are strlen and strcmp?

  • strlen: This function returns the length of a null-terminated string, that is, the number of bytes before the terminating \0. The terminator itself is not counted.
  • strcmp: This function compares two strings byte by byte. It returns:
    • 0 if the strings are identical.
    • A negative value if the first string sorts before the second (at the first position where they differ, its byte is smaller).
    • A positive value if the first string sorts after the second.

We’ll replicate their behavior in assembly.
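One detail worth keeping in mind: the C standard only promises the sign of strcmp's result, not its exact value. As a small sketch of what that means in practice, the helper below (describe_order is a hypothetical name, not part of any library) turns the standard strcmp result into a word by looking at the sign alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: describe where a sorts relative to b.
 * Only the sign of strcmp's result is inspected, because the exact
 * value is not specified by the C standard. */
const char *describe_order(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    int r = strcmp(a, b);
    if (r < 0) {
        return "before";
    }
    if (r > 0) {
        return "after";
    }
    return "equal";
}
```

Code that tests for specific values such as -1 or 1 may work with one C library and fail with another, so comparing against 0 is the portable choice.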


3. Setting Up Your Environment

Tools Required:

  1. NASM (Netwide Assembler): To assemble the assembly code.
  2. GCC (GNU Compiler Collection): To compile and link the C and assembly code.

Installing the Tools (Debian/Ubuntu Linux):

Run the following commands:

sudo apt update
sudo apt install nasm gcc

4. Writing strlen in Assembly

Logic of strlen:

  1. Start with a counter initialized to 0.
  2. Read each character of the string until the null terminator (\0) is found.
  3. Return the counter as the length.
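If you are more comfortable in C, the three steps above can be sketched as the loop below. This is a minimal reference version for illustration, not how any real libc implements strlen, and the name ref_strlen is made up for this post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference version of the strlen logic. */
size_t ref_strlen(const char *str) {
    assert(str != NULL);       /* the caller must pass a valid string */
    size_t len = 0;            /* step 1: counter starts at 0 */
    while (str[len] != '\0') { /* step 2: read bytes until the terminator */
        len++;
    }
    return len;                /* step 3: the counter is the length */
}
```

The index len plays the same role as a counter register in assembly: it is both the offset of the byte being examined and, once the loop stops, the result.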

Assembly Code for my_strlen:

section .text
global my_strlen

my_strlen:
    xor rax, rax           ; Set RAX (length) to 0
.next_char:
    cmp byte [rdi + rax], 0 ; Compare current byte with 0
    je .done               ; If 0, jump to done
    inc rax                ; Increment RAX
    jmp .next_char         ; Repeat
.done:
    ret                    ; Return length in RAX
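  • Registers Used (per the System V AMD64 calling convention used on 64-bit Linux):
    • RDI: Holds the first argument, a pointer to the input string.
    • RAX: Holds the return value, here the string length.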

5. Writing strcmp in Assembly

Logic of strcmp:

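  1. Compare the two strings one byte at a time.
  2. Stop at the first position where the bytes differ, or where both strings reach the null terminator (\0).
  3. Return:
    • 0 if strings are equal.
    • The difference between the two bytes (first string minus second string) at the first position where they differ.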
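The same logic can be sketched in C before dropping down to assembly. This is a reference version for illustration only (ref_strcmp is a hypothetical name). It reads bytes as unsigned char, which is how the C standard defines the comparison, so bytes above 127 do not flip the sign of the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference version of the strcmp logic. */
int ref_strcmp(const char *s1, const char *s2) {
    assert(s1 != NULL && s2 != NULL);
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;

    /* Advance while the bytes match and the terminator has not been hit. */
    while (*a == *b && *a != '\0') {
        a++;
        b++;
    }

    /* Equal strings stop with both bytes at '\0', so this yields 0.
     * Otherwise it is the difference of the first mismatching bytes. */
    return (int)*a - (int)*b;
}
```

Returning the raw byte difference is one common choice; any value with the correct sign would also satisfy the strcmp contract.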

Assembly Code for my_strcmp:

section .text
global my_strcmp

my_strcmp:
.next_char:
    movzx eax, byte [rdi]  ; Load byte from first string, zero-extended
    movzx ecx, byte [rsi]  ; Load byte from second string, zero-extended
    cmp eax, ecx           ; Compare the two bytes
    jne .diff              ; If not equal, jump to diff
    test eax, eax          ; Check if we've hit \0
    je .done               ; If \0, strings are equal and EAX is already 0
    inc rdi                ; Advance pointers
    inc rsi
    jmp .next_char         ; Repeat
.diff:
    sub eax, ecx           ; Return difference: first byte minus second byte
.done:
    ret
  • Registers Used:
    • RDI and RSI: Pointers to the input strings (first and second arguments).
    • ECX: Scratch register holding the current byte of the second string.
    • RAX: Stores the result (output). C reads the int return value from its low half, EAX.

6. Integrating Assembly with C

Let’s write a C program that calls these assembly functions.

C Code:

#include <stdio.h>
#include <stddef.h>

// Declare the assembly functions
extern size_t my_strlen(const char *str);
extern int my_strcmp(const char *s1, const char *s2);

int main(void) {
    // Test my_strlen
    const char *msg = "Hello, Assembly!";
    size_t len = my_strlen(msg);
    printf("Length of '%s': %zu\n", msg, len);

    // Test my_strcmp
    const char *str1 = "Hello";
    const char *str2 = "Hello";
    const char *str3 = "World";

    int result1 = my_strcmp(str1, str2);
    int result2 = my_strcmp(str1, str3);

    printf("Comparing '%s' and '%s': %d\n", str1, str2, result1);
    printf("Comparing '%s' and '%s': %d\n", str1, str3, result2);

    return 0;
}
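Printing a few results is a good start, but it is easy to miss an edge case by eye. As a rough sketch, the helpers below cross-check any strlen-like or strcmp-like function against the C library on a handful of strings. All of the names here (strlen_fn, strcmp_fn, samples, count_strlen_mismatches, count_strcmp_mismatches) are hypothetical and chosen for this example, and the strcmp check compares signs only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Function-pointer types matching the prototypes declared in main.c. */
typedef size_t (*strlen_fn)(const char *);
typedef int (*strcmp_fn)(const char *, const char *);

/* Sample strings, including the empty string and shared prefixes. */
static const char *const samples[] = {"", "Hell", "Hello", "Hellp", "World"};
enum { SAMPLE_COUNT = sizeof samples / sizeof samples[0] };

/* Collapse a comparison result to -1, 0, or 1. */
static int sign_of(int v) {
    return (v > 0) - (v < 0);
}

/* Count samples where fn disagrees with the C library's strlen. */
int count_strlen_mismatches(strlen_fn fn) {
    assert(fn != NULL);
    int mismatches = 0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++) {
        if (fn(samples[i]) != strlen(samples[i])) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Count sample pairs where fn and strcmp disagree on the result's sign. */
int count_strcmp_mismatches(strcmp_fn fn) {
    assert(fn != NULL);
    int mismatches = 0;
    for (size_t i = 0; i < SAMPLE_COUNT; i++) {
        for (size_t j = 0; j < SAMPLE_COUNT; j++) {
            if (sign_of(fn(samples[i], samples[j])) !=
                sign_of(strcmp(samples[i], samples[j]))) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Calling count_strlen_mismatches(my_strlen) and count_strcmp_mismatches(my_strcmp) from main should return 0 for both if the assembly routines agree with the C library on these samples.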

7. Compiling and Running the Program

  1. Save both assembly functions (my_strlen and my_strcmp) to functions.asm.
  2. Save the C code to main.c.
  3. Compile and link them:
   nasm -f elf64 functions.asm -o functions.o
   gcc main.c functions.o -o main
   ./main

8. Expected Output

Length of 'Hello, Assembly!': 16
Comparing 'Hello' and 'Hello': 0
Comparing 'Hello' and 'World': -15
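Where does the -15 come from? "Hello" and "World" already differ at their first byte, so the result is simply 'H' minus 'W'. In ASCII, 'H' is 72 and 'W' is 87, and 72 - 87 = -15. Here is that arithmetic as a tiny C sketch; the helper name first_byte_difference is hypothetical, and the values assume an ASCII character set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: difference of the first bytes of two strings,
 * each read as an unsigned char value (0..255). */
int first_byte_difference(const char *s1, const char *s2) {
    assert(s1 != NULL && s2 != NULL);
    return (int)(unsigned char)s1[0] - (int)(unsigned char)s2[0];
}
```

The standard library's own strcmp may print a different negative number for the same inputs, since only the sign is guaranteed.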

9. Conclusion

By writing strlen and strcmp in assembly, you gain a better understanding of:

  • Low-level memory operations.
  • How strings are processed at the machine level.
  • Integrating assembly with C to leverage the strengths of both languages.

What other C standard library functions would you like to see recreated in assembly? Let me know in the comments below!


10. Connect on Twitter

Enjoyed this guide? Share your thoughts or ask questions on Twitter! Let’s connect and explore more low-level programming together. 🚀
