In Part 1 of this series we looked at how to use the fork system call to spawn a child process and branch in the child process to run a different code path. In this part we will look at how we can implement a more functional way to do multithreading in assembly.
Goal
We want to be able to use any function/subroutine in our assembly code as a thread: we call a function and that function runs on a different thread. We also need a way for the parent process to wait for the threads we create.
Implementation
We will create a file lib/thread.asm that will have two functions:
thread_run: It takes a pointer to the code we want to execute (stored in rax) and passes the registers rdi, rsi, rdx and r8 unchanged to that function, which runs on the child thread it creates. This function calls fork; on the child thread it then calls the code pointed at by rax, and on the parent it returns the PID of the child process in rax.
thread_wait: This function takes a single argument in rax, the PID of a child process. It waits for that child process to finish executing, then returns the exit code of that child process in rax.
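For readers who find C easier to scan than assembly, here is a rough sketch of the same contract. It is not part of the project: the names thread_run_c, thread_wait_c and say are made up for illustration, only two arguments are forwarded (mirroring the pointer and length that print expects), and it assumes a Linux/POSIX system.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical task signature: a buffer pointer and a length,
   like the rdi/rsi pair our print function takes. */
typedef void (*task_fn)(const char *buf, size_t len);

/* Hypothetical sample task: write the buffer to standard output. */
static void say(const char *buf, size_t len)
{
    ssize_t n = write(STDOUT_FILENO, buf, len);
    (void)n; /* result ignored in this sketch */
}

/* Sketch of thread_run: fork, run fn in the child, exit the child with 0.
   The parent gets the child's PID back (or -1 if fork failed). */
pid_t thread_run_c(task_fn fn, const char *buf, size_t len)
{
    pid_t pid = fork();
    if (pid == 0) {   /* fork returned 0: we are the child */
        fn(buf, len);
        _exit(0);     /* mirrors the "safe exit" after the call */
    }
    return pid;       /* parent */
}

/* Sketch of thread_wait: block until pid exits and return its exit code,
   or -1 if waiting failed or the child did not exit normally. */
int thread_wait_c(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling thread_run_c(say, "hi\n", 3) and then thread_wait_c on the returned PID is the C counterpart of the call thread_run / call thread_wait pair we write in assembly.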
thread_run
# lib/thread.asm
.intel_syntax
.section .text
# thread_run
# A function to start running the given code on a child process
#
# Arguments:
# rax: Pointer to function(label)
# rdi, rsi, rdx, r8 : The arguments to the function
# (rcx and r11 are overwritten by the syscall instruction, r10 by thread_run)
#
# Returns:
# rax: Child process id
#
.global thread_run
thread_run:
push %rax # Push rax value to the stack
# Fork this process
mov %rax, 0x39
syscall
cmp %rax, 0x00 # Compare the returned PID
mov %r10, %rax # Store the PID in r10 temporarily
pop %rax # Get back rax value (Function pointer)
je _invoke # Jump if this is the child process
# If this is the parent process then return
mov %rax, %r10 # Restore the PID
ret
_invoke:
# Call the function that rax points to
call %rax
# Safe exit
mov %rax, 0x3c
mov %rdi, 0x00
syscall
In the above code, the function starts by pushing the pointer in rax onto the stack, because we need the rax register to make the fork system call. Then we compare the return value of fork to zero, store the PID (the current value of rax) in the r10 register temporarily, and pop the function pointer back into rax. Neither mov nor pop changes the flags, so the result of the comparison is still available: if it was equal (we are in the child), we jump to the _invoke subroutine; otherwise we restore the PID in rax and return. The _invoke subroutine calls the function at the address stored in rax and then exits safely with the exit syscall.
thread_wait
# lib/thread.asm
...
# thread_wait
# Wait for a process to exit
#
# Arguments:
# rax: Process PID
#
# Returns:
# rax: Exit code
#
.global thread_wait
thread_wait:
# Allocate space in the stack to store the exit code
sub %rsp, 4
mov %rdi, %rax # Move the PID to %rdi for wait4 syscall
mov %rax, 0x3d # wait4 syscall
mov %rsi, %rsp # Pointer to the exit code space
mov %rdx, 0x00 # Flags
mov %r10, 0x00 # NULL pointer
syscall
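# Load the 4-byte wait status into EAX (this also zeroes the upper half of rax)
mov %eax, [%rsp]
# The exit code sits in bits 8-15 of the status word
shr %eax, 8
and %eax, 0xff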
add %rsp, 4
ret
In the above code we start by allocating 4 bytes of memory on the stack, then we make the wait4 system call. We give it the PID of the child process in rdi and, in rsi, a pointer to the 4 bytes we just allocated on the stack, where the kernel stores the status of the child process that exited. The flags in rdx and the resource-usage pointer in r10 are both zero. wait4 writes a status word rather than the bare exit code: for a child that exited normally, the exit code sits in bits 8 to 15 of that word. After the wait4 syscall returns, we read the value back from the stack and deallocate that memory.
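The difference between the status word and the exit code is easy to miss, so here is a small C sketch of it. The function names are hypothetical and the code assumes Linux, where a normal exit with code N produces the status N << 8.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with `code`, wait for it, and
   return the raw status word the kernel wrote (-1 on failure). */
int raw_status_of_exit(int code)
{
    int status = 0;
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);                 /* child: exit right away */
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Extract the exit code from a status word using the standard macros;
   returns -1 if the child did not exit normally (e.g. killed by a signal). */
int exit_code_from_status(int status)
{
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In assembly the same extraction is a right shift by 8 followed by masking with 0xff, which is only meaningful when the child exited normally.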
NOTE: I tried to do the last step with pop DWORD %rax, but gas refused to assemble it. That is expected on x86-64: in 64-bit mode pop has no 32-bit operand form, so no assembler can encode it, and a plain pop %rax would read and release 8 bytes instead of the 4 we allocated.
Assembly
Now we will assemble this code the same way we did with util.asm using the following command:
$ as lib/thread.asm -o lib/thread.o
If everything works fine we will now have a lib/thread.o file that we will link our code with.
Driver Code
To demonstrate how these two functions work, we will create a new directory hello_threads in the root of our project and put a main.asm file inside of it.
In this file we will write a program that starts a thread running the print function we wrote in util.asm last time, waits for that thread to finish executing, and then prints from the main (parent) process.
Then the program starts another thread with the print function and continues running the main (parent) process without waiting.
# hello_threads/main.asm
.global _start
.intel_syntax
.section .text
# The external functions we need
.extern thread_run
.extern thread_wait
.extern print
_start:
# Create a thread
call _log
lea %rax, [%rip + print]
lea %rdi, [%rip + thread_msg]
mov %rsi, OFFSET thread_msg_len
call thread_run
# Wait for the process (PID is in rax) then print
call thread_wait
call _pmsg
# Create a thread
call _log
lea %rax, [%rip + print]
lea %rdi, [%rip + thread2_msg]
mov %rsi, OFFSET thread2_msg_len
call thread_run
# Print without waiting
call _pmsg
# Exit
jmp _exit
_pmsg:
# Print 'parent_msg'
lea %rdi, [%rip + parent_msg]
mov %rsi, OFFSET parent_msg_len
call print
ret
_log:
# Print 'log_msg'
lea %rdi, [%rip + log_msg]
mov %rsi, OFFSET log_msg_len
call print
ret
_exit:
# Exit safely with status code 0
mov %rax, 0x3c
mov %rdi, 0x00
syscall
.section .data
log_msg:
.ascii "\nCreating a thread..\n" # 21
log_msg_len = . - log_msg
thread_msg:
.ascii "Hello from thread 1!\n" # 21
thread_msg_len = . - thread_msg
thread2_msg:
.ascii "Hello from thread 2!\n" # 21
thread2_msg_len = . - thread2_msg
parent_msg:
.ascii "Hello from parent!!\n" #20
parent_msg_len = . - parent_msg
Now we assemble our code and link the resulting object file with both lib/util.o and lib/thread.o using the following commands:
$ as hello_threads/main.asm -o hello_threads/hello_threads.o
$ ld hello_threads/hello_threads.o lib/util.o lib/thread.o -o hello_threads/hello_threads.elf -nostdlib
If everything goes well, we should now have a hello_threads.elf in the hello_threads directory. If we execute the file we should see output similar to this:
Creating a thread..
Hello from thread 1!
Hello from parent!!
Creating a thread..
Hello from parent!!
Hello from thread 2!
Notice how the first thread prints before the main (parent) thread because we used the thread_wait function that we defined earlier. The second thread's case is different: the parent prints first most of the time, since it does not have to wait for the thread to finish executing, but the exact order is up to the scheduler.
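The effect of waiting on the output order can be checked with a small C sketch. It is only an illustration with a hypothetical function name, assuming Linux/POSIX: the child and the parent each write one byte into a pipe, and the order of the bytes shows who ran first.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 'c' into a pipe and the parent writes 'p'.
   If wait_first is nonzero the parent waits for the child before writing
   (like calling thread_wait), so the pipe always holds "cp". Otherwise the
   order depends on the scheduler. Returns the number of bytes read into
   out (2 when both writes succeeded), or -1 on error. */
int run_order_demo(int wait_first, char out[2])
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0)                       /* child: write one byte, then exit */
        _exit(write(fds[1], "c", 1) == 1 ? 0 : 1);

    if (wait_first)
        waitpid(pid, NULL, 0);          /* child has finished before we write */
    ssize_t written = write(fds[1], "p", 1);
    if (!wait_first)
        waitpid(pid, NULL, 0);          /* reap the child before reading */
    close(fds[1]);

    ssize_t n = (written == 1) ? read(fds[0], out, 2) : -1;
    close(fds[0]);
    return (int)n;
}
```

With wait_first set, the result is always "cp"; without it, both "cp" and "pc" are possible, which matches the "most of the time" behaviour we see in the program output.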
Now that we have a functional way to spawn and wait for threads, we need a way for threads to communicate and share data with each other, so in the next part we will look into how we can allocate and share memory between threads.
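To see why sharing needs extra work, recall that fork gives the child its own copy of the parent's memory. The C sketch below (hypothetical names, assuming Linux/POSIX) lets the child increment a global and report its view through the exit code, while the parent's copy stays untouched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int counter = 0;   /* ordinary global: after fork each process has its own copy */

/* Fork a child that increments counter and exits with the new value.
   Returns the value the child saw, or -1 on failure. The parent's own
   counter is not modified by this call. */
int child_view_after_increment(void)
{
    int status = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter += 1;        /* changes only the child's private copy */
        _exit(counter);      /* report the child's view as the exit code */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child reports 1 while the parent still reads 0, so any data the two are meant to share has to live in memory that is explicitly set up as shared.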
The code is available on the GitHub repository if you want to check it out.