As you are probably aware, the Ruby interpreter and some of the core libraries are written in C. Over the next few weeks I plan to share a look at some of the internals of Ruby and how it achieves some of the things it does from the C side of things.

The first point of interest is the VALUE - Ruby’s internal representation of its objects. In the general sense, a VALUE is just a C pointer to a Ruby object data type. We use VALUEs in the C code like we would use objects in the Ruby code.

/* some_function and some_method are placeholder names */
VALUE some_function(VALUE arg_object)
{
  return some_method(arg_object);
}

One would expect that the VALUE is just a typedef for a C pointer, with a lookup table recording which object it represents, and this would be partially correct. However, there’s also some trickery involved.

Instead of implementing the VALUE as a pointer, Ruby implements it as an unsigned long. It just so happens that sizeof(void *) == sizeof(long) - at least on the platforms I’m familiar with. After all, what is a pointer? It’s just an n-byte integer that represents a memory address.
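
To make that concrete, here is a minimal sketch of an address taking a round trip through an integer-typed VALUE. The typedef mirrors the unsigned long described above; ptr_to_value, value_to_ptr and value_can_hold_pointer are names I made up for illustration, not part of Ruby's API.

```c
#include <stdint.h>

/* Assumption: a stand-in for Ruby's own typedef, following the text above. */
typedef unsigned long VALUE;

/* Hypothetical helper: store an address in a VALUE. */
static inline VALUE ptr_to_value(void *p)
{
    return (VALUE)(uintptr_t)p;
}

/* Hypothetical helper: get the address back out of a VALUE. */
static inline void *value_to_ptr(VALUE v)
{
    return (void *)(uintptr_t)v;
}

/* Returns 1 when a pointer fits in a VALUE on this platform, 0 otherwise. */
static inline int value_can_hold_pointer(void)
{
    return sizeof(VALUE) >= sizeof(void *);
}
```

The round trip is only safe when value_can_hold_pointer() returns 1, which is exactly the sizeof(void *) == sizeof(long) condition mentioned above.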

But because of this, there are some tricks Ruby can perform.

First, for performance purposes, Ruby doesn’t use the VALUE as a pointer in every instance. For Fixnums, Ruby stores the number value directly in the VALUE itself. That keeps us from having to keep a lookup table of every possible Fixnum in the system.

The trick lies in the fact that pointers are aligned in 4 byte chunks (8 bytes on 64-bit systems). For example, if there were an object stored at 0x0000F000, then the next address an object could start at would be 0x0000F004. This jump from 0 to 4 in the lowest nibble is important. Written out as bits, the low bytes are 00000000 and 00000100. This means that if we use the VALUE as a pointer, the lowest two bits will always be 0s.
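
You can check the alignment claim yourself with a few lines of C. This is only a sketch: low_two_bits and low_bits_of_fresh_allocation are hypothetical helpers, and it uses plain malloc rather than Ruby's own allocator, on the assumption that malloc returns memory aligned to at least 4 bytes (true on the usual platforms).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: the two lowest bits of an address. */
static inline unsigned low_two_bits(const void *p)
{
    return (unsigned)((uintptr_t)p & 0x3u);
}

/* Allocates one block, inspects its address, and frees it again.
 * Returns the two lowest bits of the address, or -1 if malloc failed. */
static inline int low_bits_of_fresh_allocation(size_t size)
{
    void *p = malloc(size);
    int bits;

    if (p == NULL)
        return -1;
    bits = (int)low_two_bits(p);
    free(p);
    return bits;
}
```

For the example addresses above, low_two_bits gives 0 for both 0x0000F000 and 0x0000F004, while an odd value such as 0x0000F001 gives 1, so it can never be mistaken for an aligned pointer.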

Ruby uses this to its advantage. It will tuck a 1 in the lowest bit, and then use the rest of the space (31 bits on a 32-bit build) to store a Fixnum. One of those bits is used for the sign, so a Ruby Fixnum has 30 bits of magnitude and ranges from -(2 ** 30) to 2 ** 30 - 1.

irb(main):021:0> (2 ** 30).class
=> Bignum
irb(main):022:0> (2 ** 30 - 1).class
=> Fixnum
irb(main):024:0> (-(2 ** 30)).class
=> Fixnum
irb(main):025:0> (-(2 ** 30)-1).class
=> Bignum
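
Those boundaries fall straight out of the encoding. Here is a minimal C sketch of it, assuming a 32-bit VALUE. The names value32_t, int_to_fix, fix_to_int, is_fixnum and fixnum_fits are mine, made up for illustration; ruby.h has its own macros for this job (INT2FIX and FIX2LONG, for instance), and this is not a copy of them.

```c
#include <stdint.h>

/* Assumption: model a 32-bit VALUE so the limits match the irb session. */
typedef uint32_t value32_t;

#define FIXNUM_MAX_32 1073741823            /* 2**30 - 1 */
#define FIXNUM_MIN_32 (-FIXNUM_MAX_32 - 1)  /* -(2**30)  */

/* A VALUE is a Fixnum when its lowest bit is set. */
static inline int is_fixnum(value32_t v)
{
    return (int)(v & 1u);
}

/* Does n fit in the 31 bits left over after the tag bit? */
static inline int fixnum_fits(int64_t n)
{
    return n >= FIXNUM_MIN_32 && n <= FIXNUM_MAX_32;
}

/* Shift the number up one bit and tuck a 1 into the lowest bit.
 * Caller must make sure fixnum_fits(n) holds. Examples: 0 -> 1, 1 -> 3. */
static inline value32_t int_to_fix(int32_t n)
{
    return (value32_t)(((uint32_t)n << 1) | 1u);
}

/* Undo int_to_fix. The tagged value is always odd, so subtracting the
 * tag bit and halving is exact. The unsigned-to-signed cast relies on
 * two's-complement wraparound, which mainstream compilers provide. */
static inline int32_t fix_to_int(value32_t v)
{
    return ((int32_t)v - 1) / 2;
}
```

With these definitions fixnum_fits(1073741823) is true and fixnum_fits(1073741824) is false, which is the same Fixnum/Bignum split irb reported for 2 ** 30 - 1 and 2 ** 30.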

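Ruby uses the remaining low-order bits to help distinguish other common special values, like false, true, and nil, which are small constants that could never be valid object addresses. Symbols are stored in a similar way, as an ID tagged with a flag in the low bits, so Ruby recognizes them as special instances and interprets them accordingly.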

The rest of the time a VALUE is a good old fashioned memory address, which points to an object structure in memory.
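
Putting the pieces together, deciding what a VALUE holds comes down to a handful of bit tests. The sketch below is my own illustration rather than Ruby's source: value_kind and classify are made-up names, and the constants for false, true and nil (0, 2 and 4) are modeled on the Ruby 1.8-era headers, so treat them as an assumption; other versions lay these out differently.

```c
/* Assumption: a stand-in for Ruby's own typedef. */
typedef unsigned long VALUE;

/* Assumption: special constants modeled on the Ruby 1.8-era headers. */
#define SKETCH_FALSE ((VALUE)0)
#define SKETCH_TRUE  ((VALUE)2)
#define SKETCH_NIL   ((VALUE)4)

typedef enum {
    KIND_FIXNUM,
    KIND_FALSE,
    KIND_TRUE,
    KIND_NIL,
    KIND_OTHER_IMMEDIATE, /* e.g. a symbol: an ID plus a low-bit flag */
    KIND_POINTER
} value_kind;

/* Hypothetical helper: work out what a VALUE holds from its bits alone. */
static inline value_kind classify(VALUE v)
{
    if (v & 1u)
        return KIND_FIXNUM;          /* lowest bit set: number stored inline */
    if (v == SKETCH_FALSE)
        return KIND_FALSE;
    if (v == SKETCH_TRUE)
        return KIND_TRUE;
    if (v == SKETCH_NIL)
        return KIND_NIL;
    if (v & 2u)
        return KIND_OTHER_IMMEDIATE; /* second bit set, not a known constant */
    return KIND_POINTER;             /* both low bits clear: a real address */
}
```

Only the last case ever dereferences anything; every other kind is decoded directly from the VALUE without touching memory.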

[Figure: ruby_value_diagram.png]

So there you have it. I hope this little snippet was of some VALUE to you.