Now that Ruby 1.9.1 has been out for a while, it's time to revisit my calculations regarding the memory footprint of objects, since 1.9 incorporates some optimizations that improve significantly on 1.8. I also measured the footprint of OCaml objects while I was at it.
Addendum: added note about Ruby Enterprise Edition (Ruby EE), a patched Ruby 1.8.6 used with Phusion Passenger; see below.
This table summarizes the results (sizes in bytes on x86; on x86-64 they are roughly twice as large, and exactly twice for OCaml, though the malloc overhead might differ):
| | Ruby 1.8 | Ruby EE | Ruby 1.9 | OCaml |
|---|---|---|---|---|
| object with no IVs | 20 | 20 | 20 | 12 |
| object 1 IV | 120 | 96 | 20 | 16 |
| object 2 IVs | 144 | 112 | 20 | 20 |
| object 3 IVs | 168 | 128 | 20 | 24 |
| object 4 IVs | 192 | 144 | 48 | 28 |
| struct (Struct or record) 1 elm. | 32 | 24 | 20 | 8 |
| struct 2 elms. | 36 | 28 | 20 | 12 |
| struct 3 elms. | 40 | 32 | 20 | 16 |
| struct 4 elms. | 44 | 36 | 44 | 20 |
(The Ruby EE gains come from the TCMalloc allocator, and these are best-case figures; the actual footprint will lie between them and those for Ruby 1.8.)
Keep in mind that both Ruby 1.8 and 1.9 can suffer from heavy memory fragmentation (both internal and external) when allocating many objects. Objects might also be retained for an arbitrarily long time because the GC is conservative. OCaml has no such problems: it has a generational, exact GC, with a copying collector in the minor heap and an incremental mark & sweep & compact collector in the major heap.
In Ruby 1.8, an object with one instance variable (IV) takes:
5 words for the object slot
4 (+2 = malloc overhead) words for the IV table (st_table struct)
11 (+2) words for the bins
4 (+2) words for the entry
That is, given

    class X; def initialize(x); @x = x end end

X.new(1) will take 30 words, or 120 bytes on x86 (24 of which are used by malloc for internal bookkeeping).
Additional IVs cost 6 words (24 bytes) per IV until we reach 11 IVs (at which point the hash table resizes to 19 bins).
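The arithmetic above can be checked with a small back-of-the-envelope model. This is only a sketch: the constant and method names are made up for illustration, it assumes 4-byte words and a 2-word malloc overhead per allocated block as in the list above, and it only covers 1 to 10 IVs (no table exists with 0 IVs, and the table is resized beyond that range).

```ruby
# Hypothetical model of the Ruby 1.8 figures on 32-bit x86 (not interpreter code).
R18_WORD_BYTES     = 4   # bytes per word on x86
R18_MALLOC_WORDS   = 2   # malloc bookkeeping per allocated block (from the text)
R18_SLOT_WORDS     = 5   # object slot
R18_ST_TABLE_WORDS = 4   # st_table struct
R18_BINS_WORDS     = 11  # initial bin array
R18_ENTRY_WORDS    = 4   # one hash table entry per IV

# Footprint in bytes of an object with n_ivs instance variables.
def ruby18_object_bytes(n_ivs)
  raise ArgumentError, "model covers 1..10 IVs" unless (1..10).include?(n_ivs)
  words = R18_SLOT_WORDS +
          (R18_ST_TABLE_WORDS + R18_MALLOC_WORDS) +
          (R18_BINS_WORDS + R18_MALLOC_WORDS) +
          n_ivs * (R18_ENTRY_WORDS + R18_MALLOC_WORDS)
  words * R18_WORD_BYTES
end

(1..4).each { |n| puts "#{n} IV(s): #{ruby18_object_bytes(n)} bytes" }
```

For 1 to 4 IVs this gives 120, 144, 168 and 192 bytes, matching the Ruby 1.8 column of the table.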
Ruby Enterprise Edition
Ruby EE is a patched Ruby 1.8.6 that uses Google's TCMalloc, which is much faster than the most common malloc implementation, the one based on Doug Lea's. There are no changes to the runtime representation of objects, so all the possible space gains come from TCMalloc. According to its documentation, small blocks can be allocated with virtually no overhead, so Ruby EE will take up to 24 fewer bytes for an object with one IV (and, going by the figures in the table, up to 8 more for each additional IV), and as much as 8 bytes less per Struct.
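The Ruby EE column of the table is consistent with a simple per-block reading: every malloc'ed block loses its 8-byte header in the best case. The sketch below spells that out. The helper names are hypothetical, and the block counts (table, bins and one entry per IV for an object; a single block for a Struct) are my assumption, inferred from the Ruby 1.8 breakdown and the table rather than from the Ruby EE sources.

```ruby
# Best-case saving per malloc'ed block: the 2-word malloc overhead on x86.
EE_MALLOC_HEADER_BYTES = 8

# An object with IVs owns the st_table struct, the bin array and one entry
# per IV; an object without IVs owns no malloc'ed block at all.
def ee_best_case_saving_for_object(n_ivs)
  blocks = n_ivs.zero? ? 0 : 2 + n_ivs
  blocks * EE_MALLOC_HEADER_BYTES
end

# Assumed: a Struct owns a single malloc'ed block holding its elements.
def ee_best_case_saving_for_struct
  EE_MALLOC_HEADER_BYTES
end

(0..4).each { |n| puts "#{n} IV(s): up to #{ee_best_case_saving_for_object(n)} bytes saved" }
puts "Struct: up to #{ee_best_case_saving_for_struct} bytes saved"
```

Subtracting these savings from the Ruby 1.8 column reproduces the Ruby EE column (for example 120 - 24 = 96 and 192 - 48 = 144).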
Ruby 1.9
Ruby 1.9 no longer uses a symbol -> value hash table for IVs. Instead, each class has an IV index table that maps each IV name to an index, and that index is used to look up the value in a per-object IV array.
(Note that the IV index table is shared amongst all the objects of the same class. If each one uses different names for the IVs, the indexes will keep increasing, so in a pathological case the IV array could become arbitrarily large even when the object has got only one IV.)
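To make the pathological case concrete, here is a toy model of the scheme in plain Ruby. It is not the interpreter's code: ToyClass, ToyObject and their methods are invented for illustration, and the model ignores details such as how the real IV array is grown.

```ruby
# Toy stand-in for a class: owns the IV name => index table shared by instances.
class ToyClass
  def initialize
    @iv_index = {}
  end

  # Index for name, assigning the next free index on first use.
  def index_for(name)
    @iv_index[name] ||= @iv_index.size
  end

  # Index for name, or nil if no instance has ever set that IV.
  def lookup(name)
    @iv_index[name]
  end
end

# Toy stand-in for an object: stores IV values in an array indexed via the class.
class ToyObject
  def initialize(klass)
    @klass = klass
    @ivs = []
  end

  def iv_set(name, value)
    @ivs[@klass.index_for(name)] = value  # the Array is nil-padded up to the index
  end

  def iv_get(name)
    index = @klass.lookup(name)
    index && @ivs[index]
  end

  def iv_array_length
    @ivs.length
  end
end

klass = ToyClass.new
objs = (0...100).map do |i|
  o = ToyObject.new(klass)
  o.iv_set(:"@v#{i}", i)  # every object uses a different IV name
  o
end
puts objs.first.iv_array_length  # 1
puts objs.last.iv_array_length   # 100, although the object has a single IV
```

Each object holds one value, but because the indexes are handed out per class, the last object needs an array of 100 slots to store it.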
Ruby 1.9 stores up to 3 instance variables in the object slot without using an external table, so an object with one IV will only take 5 words. Beyond 3 instance variables, it reverts to an external IV array which is resized exponentially (factor 1.25) as new elements are added. For an object with 4 IVs, it'll be of size 5, and the overall footprint will be:
5 words for the object slot
5 (+2) words for the IV array
That adds up to 12 words, or 48 bytes on x86.
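The same kind of rough model works for Ruby 1.9. In this sketch the names are hypothetical, and the exact growth rule (new capacity = n + n/4 in integer arithmetic, where n is the number of IVs at the time of the resize) is my assumption, chosen because it yields the size of 5 quoted above for 4 IVs. It also assumes that IVs are set one at a time and that all objects of the class use the same IV names.

```ruby
R19_WORD_BYTES   = 4  # bytes per word on x86
R19_SLOT_WORDS   = 5  # object slot
R19_MALLOC_WORDS = 2  # malloc bookkeeping for the external IV array
R19_EMBEDDED_IVS = 3  # IVs that fit in the object slot itself

# Capacity of the IV storage after setting n_ivs IVs one by one.
# Assumed growth rule: when index no longer fits, grow to (index + 1) * 1.25.
def ruby19_iv_capacity(n_ivs)
  capacity = R19_EMBEDDED_IVS
  (0...n_ivs).each do |index|
    next if index < capacity
    n = index + 1
    capacity = n + n / 4
  end
  capacity
end

# Footprint in bytes of an object with n_ivs instance variables.
def ruby19_object_bytes(n_ivs)
  words = R19_SLOT_WORDS
  capacity = ruby19_iv_capacity(n_ivs)
  words += capacity + R19_MALLOC_WORDS if capacity > R19_EMBEDDED_IVS
  words * R19_WORD_BYTES
end

(0..4).each { |n| puts "#{n} IV(s): #{ruby19_object_bytes(n)} bytes" }
```

For 0 to 3 IVs the model gives 20 bytes, and 48 bytes for 4 IVs, as in the Ruby 1.9 column of the table. Figures for more than 4 IVs depend on the assumed rounding and should be taken with a grain of salt.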
I'd never bothered to look into the size of OCaml objects before (as you're going for records when you want speed anyway), even though it's really easy using the low-level Obj module, which gives information about the runtime representation:

    # open Obj;;
    # let value_size o = let t = repr o in if is_block t then 1 + size t else 0;;
    val value_size : 'a -> int = <fun>
    # value_size 0;;
    - : int = 0
    # type foo = A | B of int;;
    type foo = A | B of int
    # value_size A;;
    - : int = 0
    # value_size (B 1);;
    - : int = 2
    # value_size (object end);;
    - : int = 3
    # value_size (object val a = 1 end);;
    - : int = 4
    # value_size (object val a = 1 method x = 1 end);;
    - : int = 4

The value_size function returns the size in words of the value in the heap, and returns 0 for immediate values (bool, char, int, constant constructors).
After a look at CamlinternalOO.ml, I now know that, in addition to the 1 word overhead taken for all values in the heap (the block header used by the GC and the runtime), objects take:
1 word for the method table
1 word for a unique object ID
1 additional word per instance variable
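For comparison with the table, the breakdown above translates into the following arithmetic. It is written in Ruby only to stay in one language with the other sketches, and the helper names are made up.

```ruby
# Words taken by an OCaml object with n_ivs instance variables:
# block header + method table + object ID + one word per IV.
def ocaml_object_words(n_ivs)
  1 + 1 + 1 + n_ivs
end

# Size in bytes; word_bytes is 4 on x86 and 8 on x86-64.
def ocaml_object_bytes(n_ivs, word_bytes = 4)
  ocaml_object_words(n_ivs) * word_bytes
end

(0..4).each { |n| puts "#{n} IV(s): #{ocaml_object_bytes(n)} bytes" }
```

This gives 12, 16, 20, 24 and 28 bytes for 0 to 4 IVs on x86, matching the OCaml column, and exactly twice as much with 8-byte words.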
