Matz released mruby, yet another Ruby implementation aimed at small systems, on GitHub. mruby is a new implementation of the Ruby language based on RiteVM, which seems to be a register-based VM with a mark-and-sweep garbage collector that uses tri-color marking.
I'll just take a quick look at the following three aspects of the VM, which are said to be the most costly parts of any virtual machine.
- The Instruction Dispatcher
- Accessing the Operands
- Performing the Computation
The instruction dispatcher
The instruction dispatcher, mrb_run(), is a simple C switch dispatcher. There also seems to be a compile-time option for a label-based dispatcher, but the default is the C switch.
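To make the switch-style dispatch concrete, here is a minimal sketch of a toy register VM in C. It is not mruby's code: the toy_* names, the three opcodes, and the instruction struct are all made up for illustration, and the real mrb_run() is of course far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy register VM; these are NOT mruby's opcodes. */
enum { TOY_LOADI, TOY_ADD, TOY_HALT };

/* One toy instruction: an opcode plus three small operands (hypothetical layout). */
typedef struct {
    uint8_t op, a, b, c;
} toy_insn;

/* Switch-style dispatcher: fetch an instruction, branch on its opcode, repeat.
 * Returns the value left in register 0 when TOY_HALT is reached. */
static int toy_run(const toy_insn *code)
{
    int regs[8] = {0};

    assert(code != NULL);
    for (const toy_insn *pc = code;; pc++) {
        switch (pc->op) {
        case TOY_LOADI:   /* regs[a] = b (immediate); indices masked to 0..7 */
            regs[pc->a & 7] = pc->b;
            break;
        case TOY_ADD:     /* regs[a] = regs[b] + regs[c] */
            regs[pc->a & 7] = regs[pc->b & 7] + regs[pc->c & 7];
            break;
        case TOY_HALT:
        default:          /* unknown opcodes also stop the loop */
            return regs[0];
        }
    }
}

/* Small program: r1 = 2; r2 = 40; r0 = r1 + r2; halt. */
static int toy_demo(void)
{
    static const toy_insn prog[] = {
        { TOY_LOADI, 1, 2,  0 },
        { TOY_LOADI, 2, 40, 0 },
        { TOY_ADD,   0, 1,  2 },
        { TOY_HALT,  0, 0,  0 },
    };
    return toy_run(prog);
}
```

Calling toy_demo() runs the four-instruction program and returns 42. A label-based dispatcher would replace the switch with a jump through a table of label addresses (a GCC extension), so that each handler jumps directly to the next one instead of going back through a single central branch.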
The instruction dispatcher compiles to a 17200-byte (0x4330) function under GCC 4.3 for ARM (the number depends on compiler options, of course). I haven't checked how this interacts with the caches of any particular ARM CPU.
$ objdump -t vm.o | grep mrb_run
000001d8 g F .text 00004330 mrb_run
Accessing the Operands
The VM instructions use a fixed-size 32-bit encoding. There seem to be three instruction formats, which you can find in src/opcode.h.
The opcode is 7 bits long, which means that RiteVM can support 128 opcodes. There are already 75 opcodes and five reserved opcodes defined in src/opcode.h, 80 in total, so 6 bits (64 opcodes) would be too short for RiteVM.
If you look at the comment in src/opcode.h, it suggests that the opcode is at the MSB end, but it is actually at the LSB end.
So, the layout is more like A:B:C:OP than OP:A:B:C.
A:B:C:OP takes three operands: the first two are 9 bits each and the last one is 7 bits. So, to retrieve operand A from this VM instruction, you shift right by 23 (9+7+7) bits, which takes a single instruction on an ARM CPU.
A:Bx:OP takes two operands: the first one is 9 bits and the second one is 16 bits.
Ax:OP takes one 25-bit operand.
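The three layouts above can be sketched as a handful of shift-and-mask helpers. This is only an illustration of the bit arithmetic described in the text: the sk_* names are hypothetical and are not the macros that src/opcode.h actually defines.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from the text; all sk_* names are hypothetical. */
enum { SK_OP_BITS = 7, SK_C_BITS = 7, SK_B_BITS = 9, SK_A_BITS = 9,
       SK_BX_BITS = 16, SK_AX_BITS = 25 };

/* Mask with the low `bits` bits set (bits must be less than 32). */
static uint32_t sk_mask(unsigned bits) { return (UINT32_C(1) << bits) - 1; }

/* Extract a `bits`-wide field that starts `shift` bits above the LSB. */
static uint32_t sk_field(uint32_t insn, unsigned shift, unsigned bits)
{
    return (insn >> shift) & sk_mask(bits);
}

/* The opcode occupies the low 7 bits in all three formats. */
static uint32_t sk_op(uint32_t i) { return sk_field(i, 0, SK_OP_BITS); }

/* A:B:C:OP -> A is above B+C+OP (9+7+7 = 23 bits), B above C+OP (14), C above OP (7). */
static uint32_t sk_a(uint32_t i) { return sk_field(i, 23, SK_A_BITS); }
static uint32_t sk_b(uint32_t i) { return sk_field(i, 14, SK_B_BITS); }
static uint32_t sk_c(uint32_t i) { return sk_field(i, 7, SK_C_BITS); }

/* A:Bx:OP -> Bx is the 16 bits directly above the opcode. */
static uint32_t sk_bx(uint32_t i) { return sk_field(i, 7, SK_BX_BITS); }

/* Ax:OP -> Ax is the 25 bits directly above the opcode. */
static uint32_t sk_ax(uint32_t i) { return sk_field(i, 7, SK_AX_BITS); }

/* Encoders, mainly so the decoders can be round-trip checked. */
static uint32_t sk_make_abc(uint32_t op, uint32_t a, uint32_t b, uint32_t c)
{
    assert(op <= sk_mask(SK_OP_BITS) && a <= sk_mask(SK_A_BITS));
    assert(b <= sk_mask(SK_B_BITS) && c <= sk_mask(SK_C_BITS));
    return (a << 23) | (b << 14) | (c << 7) | op;
}

static uint32_t sk_make_abx(uint32_t op, uint32_t a, uint32_t bx)
{
    assert(op <= sk_mask(SK_OP_BITS) && a <= sk_mask(SK_A_BITS));
    assert(bx <= sk_mask(SK_BX_BITS));
    return (a << 23) | (bx << 7) | op;
}

static uint32_t sk_make_ax(uint32_t op, uint32_t ax)
{
    assert(op <= sk_mask(SK_OP_BITS) && ax <= sk_mask(SK_AX_BITS));
    return (ax << 7) | op;
}
```

For example, sk_a(sk_make_abc(5, 300, 17, 99)) gives back 300. Because A sits in the topmost bits, the mask in sk_a is redundant after the 23-bit shift, which is why fetching A needs only a single shift.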
Performing the Computation
I'm still not sure how much computation each instruction requires. It would be fun to measure, but it takes some time.
I also checked mruby's registers. They are an array of mrb_value, each of which combines a union of C values with a byte for the type indicator, as you can see below.