Arseni Kavalchuk

Posted on Nov 18, 2024

iOS String to Kotlin ByteArray Performance Analysis

When working with Kotlin Multiplatform (KMP), interoperability between Kotlin and native code can introduce performance bottlenecks. One such case is converting a Swift String into a Kotlin ByteArray. In this article, we compare the performance of several approaches to this conversion. We write Swift String extension functions and Kotlin ByteArray factory methods that use native pointers and system functions like memcpy to speed up the copy. The full source code is available on GitHub. For details about the KMP library and its integration with iOS, refer to How to set up KMP library in iOS.

Kotlin ByteArray API in iOS

The Kotlin ByteArray is exposed to iOS as the KotlinByteArray class, which provides basic methods like get(index:) and set(index:value:), and an initializer, init(size:). However, this API only works one byte at a time, which makes it inefficient for high-performance operations on byte arrays.

`KotlinByteArray` Interface

__attribute__((objc_subclassing_restricted))
__attribute__((swift_name("KotlinByteArray")))
@interface KmpLibKotlinByteArray : KmpLibBase
+ (instancetype)arrayWithSize:(int32_t)size __attribute__((swift_name("init(size:)")));
- (int8_t)getIndex:(int32_t)index __attribute__((swift_name("get(index:)")));
- (void)setIndex:(int32_t)index value:(int8_t)value __attribute__((swift_name("set(index:value:)")));
@property (readonly) int32_t size __attribute__((swift_name("size")));
@end

Because this interface offers no bulk-copy operation, filling an array from Swift takes one bridged call per byte, which makes element-wise copying slow.

Using Native APIs in KMP

KMP common code cannot access native APIs directly, but the platform-specific (native) parts of the KMP code can call platform functions through the cinterop API. This enables us to optimize byte array copying by using native constructs.

Working with Native Pointers

In Kotlin/Native, the CPointer type is used to interface with raw memory through pointers. Its type parameter must be a CPointed type, so a pointer to bytes is written as CPointer<ByteVar>, not CPointer<Byte>: Byte is an ordinary Kotlin value, while ByteVar is the native variable that holds one.

Kotlin/Native has no separate read-only pointer type, so the same CPointer<ByteVar> is used whether the memory is only read, such as when parsing a buffer or reading from a constant memory region, or also written.

For example:

@OptIn(ExperimentalForeignApi::class)
fun printByteArray(data: CPointer<ByteVar>, size: Int) {
    for (i in 0 until size) {
        println(data[i])
    }
}

In this case, data points to a sequence of bytes, and the function iterates over the memory to print each byte without modifying it.

Writing goes through the same type. The ByteVar type wraps a mutable Byte value in native memory, which allows setting new values or performing in-place modifications on buffers such as those used for receiving data.

For example:

fun setByteArray(data: CPointer<ByteVar>, size: Int, value: Byte) {
    for (i in 0 until size) {
        data[i] = value
    }
}

Here, data is a mutable pointer, and the function writes a specified value to each byte in the memory block.

Default ByteArray Handling

The Kotlin/Native readBytes function, an extension on CPointer<ByteVar>, is a convenient but inefficient way to get a ByteArray because it loops over each byte. It is called as shown below:

@OptIn(ExperimentalForeignApi::class)
fun byteArrayFromPtrReadBytes(data: CPointer<ByteVar>, size: Int): ByteArray =
    data.readBytes(size)

Internally, this call ends up in the following Kotlin implementation:

fun getByteArray(source: NativePointed, dest: ByteArray, length: Int) {
    val sourceArray = source.reinterpret<ByteVar>().ptr
    for (index in 0 until length) {
        dest[index] = sourceArray[index]
    }
}

Optimizing with `memcpy`

Instead of looping, we can copy the whole block at once with the highly efficient memcpy, which Kotlin/Native exposes through platform.posix:

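import kotlinx.cinterop.*
import platform.posix.memcpy

@OptIn(ExperimentalForeignApi::class)
fun byteArrayFromPtrMemcpy(data: CPointer<ByteVar>, size: Int): ByteArray {
    val result = ByteArray(size)
    // addressOf(0) is bounds-checked and fails on an empty array, so skip the copy for size 0.
    if (size > 0) {
        result.usePinned { pinned ->
            // Pinning keeps the array at a stable address while memcpy writes into it.
            memcpy(pinned.addressOf(0), data, size.toULong())
        }
    }
    return result
}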

Testing Approaches

We implemented five test cases to compare performance:

Loop Copy: Convert a Swift String to a byte array one element at a time with set(index:value:).
Data Pointer with readBytes: Convert String.utf8 to a Swift byte array and copy from its pointer with readBytes.
Data Pointer with memcpy: Convert String.utf8 to a Swift byte array and copy from its pointer with memcpy.
UTF8 CString with readBytes: Use String.utf8CString and copy from its pointer with readBytes.
UTF8 CString with memcpy: Use String.utf8CString and copy from its pointer with memcpy.

Swift Implementations

Here are the Swift extension functions. Each one is a method of an extension on String, which is why the bodies refer to self:

Loop Copy

func toKotlinByteArrayLoopCopy() -> KotlinByteArray {
    let utf8Bytes = Array(self.utf8)
    let kotlinByteArray = KotlinByteArray(size: Int32(utf8Bytes.count))
    for (index, byte) in utf8Bytes.enumerated() {
        kotlinByteArray.set(index: Int32(index), value: Int8(bitPattern: byte))
    }
    return kotlinByteArray
}
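
The Int8(bitPattern:) conversion in the loop is there because Kotlin's Byte bridges to a signed Int8, while Swift's UTF-8 code units are UInt8. As a small self-contained sketch (the sample string is only an illustration), bytes above 127 become negative values without losing any bits:

```swift
// UTF-8 code units are UInt8 in Swift, while Kotlin's Byte bridges to Int8.
let utf8Bytes = Array("\u{E9}".utf8)   // "é" encodes as the two bytes 0xC3 0xA9

// Int8(bitPattern:) reinterprets the same 8 bits as a signed value.
let signed = utf8Bytes.map { Int8(bitPattern: $0) }

// UInt8(bitPattern:) restores the original unsigned bytes.
let roundTrip = signed.map { UInt8(bitPattern: $0) }

print(utf8Bytes)               // [195, 169]
print(signed)                  // [-61, -87]
print(roundTrip == utf8Bytes)  // true
```

A plain Int8(byte) initializer would trap at runtime for any byte above 127, so the bit-pattern initializer is the safe choice for non-ASCII text.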

Data Pointer with `readBytes`

func toKotlinByteArrayDataPtrReadBytes() -> KotlinByteArray {
    var data = Array(self.utf8)
    let size = Int32(data.count)
    return data.withUnsafeMutableBytes { ptr in
        ByteArrayUtilKt.byteArrayFromPtrReadBytes(data: ptr.baseAddress!, size: size)
    }
}

Data Pointer with `memcpy`

func toKotlinByteArrayDataPtrMemcpy() -> KotlinByteArray {
    var data = Array(self.utf8)
    let size = Int32(data.count)
    return data.withUnsafeMutableBytes { ptr in
        ByteArrayUtilKt.byteArrayFromPtrMemcpy(data: ptr.baseAddress!, size: size)
    }
}

UTF8 CString with `readBytes` and `memcpy`

func toKotlinByteArrayUtf8CStringReadBytes() -> KotlinByteArray {
    var data = self.utf8CString
    return data.withUnsafeMutableBufferPointer { ptr in
        ByteArrayUtilKt.byteArrayFromPtrReadBytes(data: ptr.baseAddress!, size: Int32(strlen(ptr.baseAddress!)))
    }
}

func toKotlinByteArrayUtf8CStringMemcpy() -> KotlinByteArray {
    var data = self.utf8CString
    return data.withUnsafeMutableBufferPointer { ptr in
        ByteArrayUtilKt.byteArrayFromPtrMemcpy(data: ptr.baseAddress!, size: Int32(strlen(ptr.baseAddress!)))
    }
}
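
One detail of the utf8CString variants is worth spelling out. utf8CString already includes the terminating NUL, so the payload length is count - 1 and matches utf8.count. The sketch below (sample strings are illustrative) shows this, and also that strlen stops at the first NUL, so a string containing an embedded NUL character would be truncated by the strlen-based size used above:

```swift
import Foundation  // for strlen

let text = "h\u{E9}llo"                 // "héllo": 5 characters, 6 UTF-8 bytes
let cString = text.utf8CString          // ContiguousArray<CChar>, NUL-terminated

// Length without the trailing NUL, available without scanning the buffer.
let payloadLength = cString.count - 1
let strlenLength = cString.withUnsafeBufferPointer { strlen($0.baseAddress!) }

print(text.utf8.count)   // 6
print(cString.count)     // 7
print(strlenLength)      // 6

// With an embedded NUL, strlen reports only the bytes before it.
let withNul = "ab\u{0}cd"
let nulCString = withNul.utf8CString
let truncated = nulCString.withUnsafeBufferPointer { strlen($0.baseAddress!) }

print(nulCString.count - 1)  // 5
print(truncated)             // 2
```

Passing count - 1 instead of calling strlen would avoid both the extra scan and the truncation, although that variant is not the one measured in this article.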

Benchmark Results

The results from running 1000 iterations of each method:

Method	Time (ms)
LoopCopy	32.10
DataPtrReadBytes	2.60
Utf8CStringReadBytes	0.89
DataPtrMemcpy	0.06
Utf8CStringMemcpy	0.02
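
The measurement harness itself is not shown in this article. As a rough sketch of how such a loop could be timed, the hypothetical measure helper below runs a closure a fixed number of times and reports the elapsed milliseconds. The helper name, the use of DispatchTime, and the stand-in workload are assumptions, not the code behind the numbers above:

```swift
import Foundation
import Dispatch

/// Hypothetical helper: runs `body` `iterations` times and returns the elapsed milliseconds.
func measure(iterations: Int, _ body: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations {
        body()
    }
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / 1_000_000
}

// Stand-in workload; in a real test this would call one of the String extensions.
let sample = String(repeating: "a", count: 1024)
var totalBytes = 0
let elapsedMs = measure(iterations: 1000) {
    totalBytes += Array(sample.utf8).count
}
print("1000 iterations took \(elapsedMs) ms")
```

Accumulating a result such as totalBytes keeps the compiler from discarding the work inside the loop in optimized builds.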

Insights

LoopCopy is the slowest at 32.10 ms, due to its repeated calls to set(index:value:), one per byte.
Using readBytes is roughly 12 to 36 times faster than the loop (2.60 ms and 0.89 ms) but is still not optimal.
memcpy is the fastest copy method: each memcpy variant is over 40 times faster than its readBytes counterpart.
The combination of Swift's utf8CString and memcpy on the Kotlin side achieves the best result at 0.02 ms.

Conclusion

For optimal performance when converting a Swift String to a Kotlin ByteArray, use the following:

Kotlin: Implement a ByteArray factory using memcpy.
Swift: Use utf8CString with unsafe buffer pointers.

This combination delivers minimal overhead, unlocking high-performance interoperability in KMP.

The full implementation is in the project on GitHub.
