Quick Friday hit for the 0 people following along with my project and the dozen people discovering this years later from Google.
There aren't any good examples of creating a Ruby class with mruby that encapsulates data not used within the VM. You need this when you're using mruby to interface with a C library that provides its own data types: you want to call methods in Ruby that run C code calling out to the external library. Examples include database systems, graphics APIs, and interacting with the operating system.
The process is different from mainline Ruby, though simpler. C Ruby requires a separate allocate method before initialize. With mruby we can create and store the C data entirely within the initialize method.
We'll use a contrived Foo_data struct.
struct Foo_data {
  uint32_t first[16];
  uint32_t second[16];
};
For demonstration we'll create a constructor for a Foo class that takes one integer parameter. This parameter determines how to fill out the first and second members of the Foo_data struct that belongs to each instance of Foo. We build the Ruby Foo class in C as usual:
mrb_state *state = mrb_open();
RClass *Foo = mrb_define_class(state, "Foo", state->object_class);
MRB_SET_INSTANCE_TT(Foo, MRB_TT_DATA);
mrb_define_method(state, Foo, "initialize", Foo_initialize, MRB_ARGS_REQ(1));
mrb_define_method(state, Foo, "check", Foo_check, MRB_ARGS_NONE());
There's one important new thing here. MRB_SET_INSTANCE_TT sets the value type (the tt) that instances of the class are created with. In this case we set it to the special MRB_TT_DATA, which marks each instance as a wrapper around a C pointer. I neglected this step at first and got erratic, difficult-to-debug program behavior. The program would follow pointers to nowhere and quit outright, without throwing the Access Violation error normally seen when following wild pointers.
Next we create an mrb_data_type. This is a struct that tells the mruby VM the data type and the function to call when the GC comes for the object. It has two members: a name and a function pointer. The name is a char * that identifies the data type. The name of the class, "Foo", is fine. The function pointer is what the GC calls when it destroys the object. If you're not doing anything special you can use the built-in function mrb_free:
static const mrb_data_type Foo_type = {
  "Foo", mrb_free
};
Now we fill out the initialize method. We allocate a new Foo_data and save it to the instance:
mrb_value
Foo_initialize(mrb_state* state, mrb_value self) {
  mrb_int n;
  mrb_get_args(state, "i", &n);

  // Free any struct left over from an earlier initialize call on this object.
  Foo_data *foo = (Foo_data *)DATA_PTR(self);
  if(foo) { mrb_free(state, foo); }
  mrb_data_init(self, nullptr, &Foo_type);

  // Allocate through the VM so the mrb_free in Foo_type matches the allocator.
  foo = (Foo_data *)mrb_malloc(state, sizeof(Foo_data));
  for(uint32_t i = 0; i < 16; ++i) {
    foo->first[i] = i * n;
    foo->second[i] = i * n * n;
  }
  mrb_data_init(self, foo, &Foo_type);
  return self;
}
A couple of things need explaining here. DATA_PTR pulls a void * out of the object specified, in this case self. We then cast it to what we want, a Foo_data *. For reasons I don't entirely understand, though it is recommended in this discussion and used in the mruby time gem, we check whether there is already a pointer associated with the instance. If so, we free it. The likely purpose is to avoid leaking the old struct if initialize is ever invoked again on an already constructed object.
We call mrb_data_init first with a null pointer to clear the instance's data pointer and attach Foo_type. Then comes a heap allocation, and we fill out the data using the integer passed into the constructor. Calling mrb_data_init again with the populated Foo_data * saves the data to the instance.
We extract it again in check:
mrb_value
Foo_check(mrb_state* state, mrb_value self) {
  Foo_data *foo;
  Data_Get_Struct(state, self, &Foo_type, foo);
  mrb_assert(foo != nullptr);
  return mrb_nil_value();
}
Data_Get_Struct is a macro that pulls out and type casts the void * saved to the instance, after checking that the instance was registered with Foo_type. All of the definitions and implementations are in data.h and data.c in the mruby code.
Now all that's left is to create instances of Foo with different data and confirm with a debugger that the data is what we expect.
mrb_load_string(state, "a = Foo.new(10); a.check; b = Foo.new(50); b.check");
And that's it! With so little information available, creating a class within mruby that stores arbitrary C data seems daunting at first. I hope this makes the process clear and helps someone wanting to do the same.
The full program
#include <stdint.h>
#include <stdlib.h>
// Header paths follow this project's layout; adjust them to where your mruby headers live.
#include "mruby.h"
#include "ext/mruby/data.h"
#include "ext/mruby/class.h"
#include "ext/mruby/compile.h"

struct Foo_data {
  uint32_t first[16];
  uint32_t second[16];
};

// The GC calls mrb_free on the stored pointer when a Foo is collected.
static const mrb_data_type Foo_type = {
  "Foo", mrb_free
};

mrb_value
Foo_initialize(mrb_state* state, mrb_value self) {
  mrb_int n;
  mrb_get_args(state, "i", &n);

  // Free any struct left over from an earlier initialize call on this object.
  Foo_data *foo = (Foo_data *)DATA_PTR(self);
  if(foo) { mrb_free(state, foo); }
  mrb_data_init(self, nullptr, &Foo_type);

  // Allocate through the VM so the mrb_free in Foo_type matches the allocator.
  foo = (Foo_data *)mrb_malloc(state, sizeof(Foo_data));
  for(uint32_t i = 0; i < 16; ++i) {
    foo->first[i] = i * n;
    foo->second[i] = i * n * n;
  }
  mrb_data_init(self, foo, &Foo_type);
  return self;
}

mrb_value
Foo_check(mrb_state* state, mrb_value self) {
  Foo_data *foo;
  Data_Get_Struct(state, self, &Foo_type, foo);
  mrb_assert(foo != nullptr);
  return mrb_nil_value();
}

int main() {
  mrb_state *state = mrb_open();
  RClass *Foo = mrb_define_class(state, "Foo", state->object_class);
  MRB_SET_INSTANCE_TT(Foo, MRB_TT_DATA);
  mrb_define_method(state, Foo, "initialize", Foo_initialize, MRB_ARGS_REQ(1));
  mrb_define_method(state, Foo, "check", Foo_check, MRB_ARGS_NONE());
  mrb_load_string(state, "a = Foo.new(10); a.check; b = Foo.new(50); b.check");
  // Closing the VM collects a and b, which runs mrb_free on their Foo_data.
  mrb_close(state);
  return 0;
}
Top comments (3)
That's a pretty useful tutorial!
One question though: here you are calling mrb_data_init on the mrb_value self in the Foo constructor (Foo_initialize), so I suppose this mrb_value is already setup to receive a data pointer. How do you initialize an mrb_value with custom C data outside of a constructor, e.g. from within a normal function (doing the equivalent of a Foo.new inside a C function)?
I have tried this:
mystruct* val = ... ; // allocation
mrb_value v;
mrb_data_init(v, (void*)val, &mydatatype);
but the mrb_data_init is producing a segfault. I suppose there is something to do with mrb_data_object_alloc first, but this function returns an RData*, not an mrb_value, and I haven't found any function to convert an RData* into an mrb_value.
I think you're running into confusion about what an mrb_value is, which is understandable. There is little documentation about mruby, and the API is inconsistent about whether it uses the mrb_value or the raw pointers. An mrb_value is temporary. It's a convenience wrapper for interpreting a region of allocated memory. The tt member of the mrb_value informs you, the programmer, how to proceed with the mrb_value. For instance, if the tt is MRB_TT_FIXNUM then you read the value.i member directly. However, with other values you cast the value.p void * based upon the tt type. Another example: if the tt is MRB_TT_CLASS you'd do RClass *Foo = (RClass *)obj.value.p. The same goes for MRB_TT_DATA: the value.p is an RData * and should be cast as such, RData *foo = (RData *)obj.value.p. (Note: strings are really weird in that they have lots of hidden optimizations underneath. Don't ever work with an mruby string directly, no matter what, even if it looks like a normal char *.)
In the few months of working with mruby I've never constructed an mrb_value myself. I let the API or convenience macros build them and do the right thing instead. Most API functions return an mrb_value pre-populated. When I need to create an mrb_value based upon C data I use a provided API function or C macro. boxing_no.h contains some C macros for making an mrb_value from C data. Otherwise the type-specific header file contains C macros for creating an mrb_value for that type, like string.h.
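The tag-then-cast idea described above can be modeled without mruby at all. This is a simplified stand-in, not mruby's actual definitions: every name starting with Toy or toy_ is hypothetical, and the real mrb_value layout depends on which boxing mode mruby was built with.

```cpp
#include <stdint.h>

// Hypothetical tags standing in for MRB_TT_FIXNUM and MRB_TT_DATA.
enum ToyType { TOY_TT_FIXNUM, TOY_TT_DATA };

// Plays the role of RData: a heap object that carries a C pointer.
struct ToyData {
  void *data;
};

// Plays the role of an unboxed mrb_value: a tag plus a union.
struct ToyValue {
  union {
    int64_t i;  // meaningful only when tt == TOY_TT_FIXNUM
    void *p;    // meaningful only when tt == TOY_TT_DATA
  } value;
  ToyType tt;
};

static ToyValue toy_fixnum(int64_t i) {
  ToyValue v;
  v.value.i = i;
  v.tt = TOY_TT_FIXNUM;
  return v;
}

static ToyValue toy_data(ToyData *d) {
  ToyValue v;
  v.value.p = d;
  v.tt = TOY_TT_DATA;
  return v;
}

// Reads value.i only when the tag says it is an integer.
static int64_t toy_to_int(ToyValue v, int64_t fallback) {
  return v.tt == TOY_TT_FIXNUM ? v.value.i : fallback;
}

// Casts value.p only when the tag says it points at a ToyData.
static void *toy_data_ptr(ToyValue v) {
  if (v.tt != TOY_TT_DATA || v.value.p == nullptr) { return nullptr; }
  return ((ToyData *)v.value.p)->data;
}
```

An uninitialized ToyValue has garbage in both tt and value, which is the same failure mode as the uninitialized mrb_value v in the question.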
As to your question: when defining an mruby method in C the calling signature is (mrb_state*, mrb_value). The mruby VM boxes up the current self object into an mrb_value and passes it along as the second parameter of the C function. In the case of the initialize method the mruby VM has already created and allocated the object and passes it along as self, giving you a chance to work with it more.
Knowing that, you can achieve what you're after. You cannot call mrb_data_init on an unpopulated mrb_value. It doesn't point to anything useful in your example and has the classic C problem of uninitialized garbage memory. mrb_data_init is following the v.value.p to some random location and then crashing.
Instead, use the self value passed into every function, which the VM populates properly. If you set the class type using MRB_SET_INSTANCE_TT then the self mrb_value is a properly constructed mrb_value wrapping an RData *. I hope that I helped your understanding and you got a little farther.
One bit of note. Calling a method on an object other than new and having the object allocate memory for itself is surprising to me. If you think about it in pure Ruby land, calling methods on a constructed object will only construct other objects, whose constructors allocate memory, and optionally save the references as instance variables. I'd never expect doing something like foo = Foo.new; foo.some_operation to allocate more memory; I'd expect Foo.new to do all the allocations required for an instance of Foo. Something to consider that may simplify your designs and help your understanding.
Thanks for your answer! I also asked the question as an issue on the mruby repo and got an answer about using the C macros. My code works now.
My use-case for allocating an object in a function that isn't a constructor was not to allocate more memory for "self", but to allocate a new object. Imagine the following code:
class Foo
  ...
end
class Bar
  ...
  def do_something
    ...
    return Foo.new
  end
end
Now imagine Foo and Bar are not defined in Ruby but in C, and do_something is a C function. You would then need to create an instance of Foo from inside a C function.