First off, I researched this with some help from DeepSeek. It's easier to find answers that way than by Googling.
Motivation
- I need to do massively parallel processing over a large set of data
- I don't feel secure relying on the type safety of TypeScript and JavaScript.
- I like compiled languages, as the compiler helps catch a lot of errors/mistakes
- Why not WASM? Yeah, why not? The only excuse I can think of is: the team doesn't like it...
Pre-requisites
- Mate, you gotta know Rust to some extent
- You need to know some JavaScript too
Features of the RustDLL
- It's in Rust
- It uses rayon for data parallelism
- It uses serde to serialize and deserialize structs to JSON strings
- It uses a Node.js callback to deal with some JS-side functionality that I am too lazy to convert to Rust
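To make the data-parallelism bullet concrete, here is a rough sketch of the idea: split the data into chunks, process each chunk on its own thread, and stitch the results back together in order. It deliberately uses only std::thread::scope as a stand-in, not rayon itself (rayon layers a work-stealing thread pool on top of this idea). The names parallel_map and workers are made up for illustration.

```rust
use std::thread;

/// Applies `f` to every item, splitting the slice into at most `workers` chunks
/// that run on separate scoped threads. Results keep the input order.
/// (`parallel_map` and `workers` are hypothetical names, not part of rayon.)
fn parallel_map<T, R, F>(items: &[T], workers: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    if items.is_empty() {
        return Vec::new();
    }
    // Ceiling division so that every item lands in some chunk.
    let workers = workers.max(1);
    let chunk_size = (items.len() + workers - 1) / workers;
    // Share the closure by reference so each thread can call it.
    let f = &f;
    thread::scope(|scope| {
        // Spawn one scoped thread per chunk; scoped threads may borrow `items`.
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();
        // Join in spawn order, which preserves the original ordering.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

For example, parallel_map(&[1, 2, 3], 2, |n| n * 2) splits the input into [1, 2] and [3], doubles each chunk on its own thread, and returns [2, 4, 6]. With rayon the same thing is roughly items.par_iter().map(f).collect(), minus the manual chunking.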
Features of the nodeProgram
- It's in JavaScript (duh)
- It uses deasync to run async code inside a sync function
- It hosts a callback for the RustDLL to call to handle some functionality. That functionality may itself be async
Let's get started. (Please read the comments in the code)
The nodeProgram
Required modules
- ffi-napi - This provides the FFI (foreign function interface) used to talk to the DLL
- deasync - This is used to call an async function from a normal function and block the caller until the async work completes
Implementation
import ffi from 'ffi-napi';
import deasync from 'deasync';
import { prepareJson } from './data_prep.js';
// We wrap everything in an async function to demonstrate that
// the DLL can be called from async code
(async () => {
  // Load the Rust DLL
  const loadedDLL = ffi.Library('../path/to/rust.dll', {
    // Define the Rust function signature
    // [output-type, [input-types]]
    // the function is called 'wow_function'
    // it returns nothing
    // it receives a string (JSON string)
    // and a pointer (function pointer for the callback)
    wow_function: ['void', ['string', 'pointer']],
  });
  // Node callback function to be sent into 'wow_function'
  // (output-type, [input-types], handler)
  // returns a string
  // receives a string
  const nodeCallback = ffi.Callback('string', ['string'], (message) => {
    console.log("nodeCallback 0", message);
    // Perform async operations here
    let done = false;
    someAsyncFunction()
      .catch((err) => console.error(err))
      // Set the flag on success and on failure, otherwise the loop below never exits
      .finally(() => {
        done = true;
      });
    // Block this callback until the async operation has settled
    deasync.loopWhile(() => !done);
    return `nodeCallback returns original: ${message}`;
  });
  // Keep a reference to the callback so it is not garbage-collected
  // while Rust still holds the function pointer
  process.on('exit', () => { nodeCallback; });
  // some random long-winded async function called from nodeCallback
  async function someAsyncFunction() {
    return new Promise(resolve => {
      setTimeout(() => {
        console.log('Async operation completed');
        resolve();
      }, 1000);
    });
  }
  // Call the Rust function
  // for the first param, we send in a stringified JSON
  // for the second param, we send in nodeCallback
  // note the use of .async(...): this allows the Rust side to run
  // multiple threads while the Node event loop stays free to run nodeCallback
  const json = prepareJson();
  loadedDLL.wow_function.async(JSON.stringify(json), nodeCallback, (err) => {
    if (err) console.error(err);
    console.log("end");
  });
})();
The rust.dll
Required crates
- libc - For the C-compatible types (such as c_char) used at the FFI boundary
- rayon - For data parallelism (given a vec of data, process the items across threads)
- serde - For serializing and deserializing structs
- serde_json - JSON serialization and deserialization
Project setup
- cargo new [a_bold_project_name] --lib to make a lib project
- modify the Cargo.toml with the following:
[lib]
crate-type = ["cdylib"]
name = "rust" # this sets the output name used by 'cargo build', check target/debug/rust.dll
path = "./src/lib.rs" # this may be different depending on how you structure your project
- cargo add the above crates (serde needs its derive feature: cargo add serde --features derive)
Implementation
// These are for data-parallelism
use rayon::iter::{IntoParallelRefIterator, ParallelIterator};
use serde::{Deserialize, Serialize};
use std::ffi::{CStr, CString};
// Defines a type for the Node callback function pointer
type NodeCallback = extern "C" fn(*const libc::c_char) -> *const libc::c_char;
// no_mangle is important, otherwise the compiler mangles the exported name
// and Node cannot find 'wow_function' in the DLL
// we receive 'string' from Node as '*const libc::c_char'
// and the callback as 'NodeCallback'
#[no_mangle]
pub extern "C" fn wow_function(json_ptr: *const libc::c_char, callback: NodeCallback) {
    // Ensure the input pointer is not null
    assert!(!json_ptr.is_null());
    // Convert the C string to a Rust string
    let json_str = string_from_cstr(json_ptr);
    // Deserialize the JSON string
    let jobs: Vec<Job> = serde_json::from_str(&json_str).expect("Failed to parse JSON");
    let job_handling_results = jobs
        // Process the jobs in parallel
        .par_iter()
        // We are calling the nodeCallback from a worker thread
        .map(|job| handle_job(callback, job))
        .collect::<Vec<_>>();
    println!("{job_handling_results:?}");
}
fn handle_job(callback: NodeCallback, job: &Job) -> String {
    let json = serde_json::to_string(job).expect("cannot convert job to string");
    // Keep the CString alive until the callback has returned
    let message = cstring_from_string(json);
    println!("handle_job calling back");
    let reply = callback(message.as_ptr());
    string_from_cstr(reply)
}
// Converts a '*const libc::c_char' to a Rust 'String'
fn string_from_cstr(ptr: *const libc::c_char) -> String {
    unsafe { CStr::from_ptr(ptr).to_string_lossy().into_owned() }
}
// Converts a Rust 'String' to an owned, NUL-terminated 'CString'
// (a raw pointer into a local String would dangle as soon as this function returns)
fn cstring_from_string(message: String) -> CString {
    CString::new(message).expect("string contains an interior NUL byte")
}
// Struct to hold the incoming JSON string
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Job {
    pub project_uuid: String,
    pub src_path: String,
    pub dest_path: String,
    pub need_correction: bool,
}
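One thing worth spelling out when passing strings across the FFI boundary is who owns the bytes. The sketch below shows one way to hand a Rust string to a C-style callback: a CString owns the NUL-terminated buffer and has to outlive the call. It uses std::os::raw::c_char rather than libc::c_char so it builds with std alone. call_with_string and echo are hypothetical names, and echo is only a stand-in for the Node callback.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Same shape as the callback type in the post, using std's `c_char`.
type StringCallback = extern "C" fn(*const c_char) -> *const c_char;

/// Sends `message` to `callback` as a NUL-terminated C string and copies the
/// reply into an owned `String`. Returns `None` if `message` contains an
/// interior NUL byte or the callback replies with a null pointer.
fn call_with_string(callback: StringCallback, message: &str) -> Option<String> {
    // `CString` owns the buffer and appends the trailing NUL for us.
    let owned = CString::new(message).ok()?;
    // `owned` lives until the end of this function, so the pointer stays
    // valid for the whole duration of the call.
    let reply = callback(owned.as_ptr());
    if reply.is_null() {
        return None;
    }
    // Copy the reply right away; we assume the callee keeps it alive only briefly.
    Some(unsafe { CStr::from_ptr(reply) }.to_string_lossy().into_owned())
}

// Hypothetical stand-in for the Node callback: hands the same pointer back.
extern "C" fn echo(message: *const c_char) -> *const c_char {
    message
}
```

Calling call_with_string(echo, "hello") returns Some("hello"), since echo returns the very buffer that owned still holds. The important design choice is that the CString is a local variable in the function that makes the call, not a temporary created and dropped inside a helper.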
Compile
- simply cargo build (for the debug version)
Test
- go back to the nodeProgram folder
- run node [yourNodeProgramName]
- cross fingers
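Part of the reason for crossing fingers is that a failed assert! or expect inside an extern "C" function cannot unwind into Node. As far as I know, on current Rust toolchains it aborts the whole process. A gentler option is to validate the input and report a status code instead. The sketch below is one way to do that with std only. read_input, check_input, InputError and the numeric codes are all made up for illustration. Unlike the to_string_lossy approach, it rejects invalid UTF-8 instead of replacing it.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Reasons the incoming pointer could not be turned into a Rust string.
#[derive(Debug, PartialEq)]
enum InputError {
    NullPointer,
    InvalidUtf8,
}

/// Copies a C string into an owned `String` without panicking.
/// Assumption: a non-null `ptr` points at a valid NUL-terminated string.
fn read_input(ptr: *const c_char) -> Result<String, InputError> {
    if ptr.is_null() {
        return Err(InputError::NullPointer);
    }
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str
        .to_str()
        .map(str::to_owned)
        .map_err(|_| InputError::InvalidUtf8)
}

/// Hypothetical exported check: 0 means the input is usable,
/// any other value is an error code the caller can inspect.
extern "C" fn check_input(ptr: *const c_char) -> i32 {
    match read_input(ptr) {
        Ok(_) => 0,
        Err(InputError::NullPointer) => 1,
        Err(InputError::InvalidUtf8) => 2,
    }
}
```

If the exported function returned a code like this, the Node side would declare its return type as 'int' rather than 'void' and check the value in the .async(...) completion handler.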
Conclusion
- There we go.
- The setup is not painful and it serves my purpose well
- I hope you guys/gals find it useful