# How to use `mrb_gc_arena_save()`/`mrb_gc_arena_restore()`/`mrb_gc_protect()`

_This is an English translation of [Matz's blog post][matz blog post]
written in Japanese._
_Some parts are updated to reflect recent changes._

[matz blog post]: http://www.rubyist.net/~matz/20130731.html

When you extend mruby in C, you may run into a mysterious "arena
overflow error", an apparent memory leak, or very slow execution. The
error indicates an overflow of the "GC arena", which mruby uses to
implement "conservative GC"; the memory growth and the slowdown come
from the same mechanism.

The GC (garbage collector) must determine whether an object is "alive",
in other words, whether it is referenced from somewhere in the program.
This can be determined by checking whether the object is reachable,
directly or indirectly, from a root. Local variables, global variables,
constants, and so on are roots.

While the program runs inside the mruby VM, there is nothing to worry
about, because the GC can access all roots owned by the VM.

The problem arises when executing C functions. An object referenced by
a C variable is also "alive", but the mruby GC is not aware of this, so
it might mistakenly treat objects referenced only by C variables as
dead.

If the GC collects a live object, the result can be a fatal bug.

In CRuby, we scan the C stack area and use C variables as roots to
check whether an object is alive. Of course, because we access the C
stack as a plain memory region, we never know whether a given value is
an integer or a pointer. We work around this by assuming that anything
that looks like a pointer is a pointer. We call this approach
"conservative".

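To make "looks like a pointer" concrete, here is a minimal sketch in
plain C. It is not CRuby's implementation: the `toy_` names, the
fixed-size heap, and the word array standing in for the C stack are all
assumptions made for illustration. A word counts as a pointer if it
falls inside the toy heap and lands exactly on a slot boundary, so an
integer that happens to hold such a value keeps the slot alive too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy heap (hypothetical): a fixed pool of equally sized object slots. */
typedef struct toy_object {
  int marked;
  int payload;
} toy_object;

#define TOY_HEAP_SLOTS 8
static toy_object toy_heap[TOY_HEAP_SLOTS];

/* Returns nonzero if `word` could be the address of a slot in toy_heap. */
static int
toy_looks_like_pointer(uintptr_t word)
{
  uintptr_t start = (uintptr_t)&toy_heap[0];
  uintptr_t end = (uintptr_t)&toy_heap[TOY_HEAP_SLOTS];

  if (word < start || word >= end) return 0;
  return (word - start) % sizeof(toy_object) == 0;
}

/* Scans raw words (standing in for the C stack) and marks every heap
   slot whose address appears among them. Returns the number of slots
   that were newly marked. */
static int
toy_conservative_scan(const uintptr_t *words, size_t n)
{
  int newly_marked = 0;

  for (size_t i = 0; i < n; i++) {
    if (toy_looks_like_pointer(words[i])) {
      toy_object *obj = (toy_object *)words[i];
      if (!obj->marked) {
        obj->marked = 1;
        newly_marked++;
      }
    }
  }
  return newly_marked;
}
```
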
However, CRuby's "conservative GC" has some problems.

The biggest problem is that there is no portable way to access the
stack area. Therefore, we cannot use this method if we want to
implement a highly portable runtime like mruby.

So we came up with another plan to implement "conservative GC" in mruby.

Again, the problem is an object that was created in a C function, is
not referenced from the Ruby world, and yet must not be treated as
garbage.

In mruby, we treat all objects created in a C function as alive.
Then we never run into problems such as mistaking a live object for a
dead one.

This means that we cannot collect objects that are truly dead, so we
may lose some efficiency, but in exchange the GC itself is highly
portable. We can say goodbye to the problem, which sometimes occurs in
CRuby, of the GC deleting live objects because of optimization.

Following this idea, we have a table called the "GC arena", which
remembers objects created in C functions.

The arena is a stack structure: when execution returns from a C
function to the mruby VM, all objects registered in the arena are
popped.

This works very well, but it can cause another problem: an "arena
overflow error" or a memory leak.

As of this writing, mruby automatically extends the arena to remember
objects (see `MRB_GC_FIXED_ARENA` and `MRB_GC_ARENA_SIZE` in
doc/guides/mrbconf.md).

If you create many objects in a C function, memory usage keeps
increasing, because the GC cannot reclaim objects that are still
registered in the arena. This growth may look like a memory leak, and
it also slows execution, because more memory has to be allocated.

With a build-time configuration, you can limit the maximum size of the
arena (e.g., to 100). Then, if you create many objects, the arena
overflows and you get an "arena overflow error".

To work around these problems, we have the `mrb_gc_arena_save()` and
`mrb_gc_arena_restore()` functions.

`int mrb_gc_arena_save(mrb)` returns the current position of the stack
top of the GC arena, and `void mrb_gc_arena_restore(mrb, idx)` sets the
stack top position back to the given `idx`.

We can use them like this:

```c
int arena_idx = mrb_gc_arena_save(mrb);

// ...create objects...
mrb_gc_arena_restore(mrb, arena_idx);
```

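The mechanics of save and restore can be pictured with a toy model in
plain C. This is only a sketch under stated assumptions, not mruby's
implementation: the `toy_` names and the four-slot capacity are
hypothetical, and the real arena stores object pointers managed by the
GC. The model shows why bracketing each temporary with save/restore
keeps a fixed-size arena from overflowing.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 4 /* tiny on purpose, so overflow is easy to hit */

/* Hypothetical stand-in for the GC arena: a fixed-size stack. */
typedef struct toy_arena {
  const void *slots[TOY_ARENA_SIZE]; /* objects currently kept alive */
  int top;                           /* index of the next free slot */
} toy_arena;

/* Registers an object. Returns 0 on success, -1 on "arena overflow". */
static int
toy_arena_push(toy_arena *a, const void *obj)
{
  if (a->top >= TOY_ARENA_SIZE) return -1;
  a->slots[a->top++] = obj;
  return 0;
}

/* Toy counterpart of mrb_gc_arena_save(): report the current stack top. */
static int
toy_arena_save(const toy_arena *a)
{
  return a->top;
}

/* Toy counterpart of mrb_gc_arena_restore(): drop everything above idx. */
static void
toy_arena_restore(toy_arena *a, int idx)
{
  a->top = idx;
}

/* Creates n temporaries, optionally bracketed by save/restore.
   Returns the number of objects that could not be registered. */
static int
toy_create_temporaries(toy_arena *a, int n, int use_save_restore)
{
  static const int dummy = 0; /* stands in for a freshly created object */
  int failures = 0;

  for (int i = 0; i < n; i++) {
    int idx = toy_arena_save(a);

    if (toy_arena_push(a, &dummy) != 0) failures++;
    if (use_save_restore) toy_arena_restore(a, idx);
  }
  return failures;
}
```

With `use_save_restore` set, the stack top returns to its starting
point after every iteration, so ten temporaries fit in a four-slot
arena; without it, the fifth registration already fails.
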
mruby already surrounds every C function call with this save/restore
pair, but by adding our own save/restore inside a function we can
reduce memory usage further and avoid arena overflow bugs.

Let's take a real example. Here is the source code of `Array#inspect`:

```c
static mrb_value
inspect_ary(mrb_state *mrb, mrb_value ary, mrb_value list)
{
  mrb_int i;
  mrb_value s, arystr;
  char head[] = { '[' };
  char sep[] = { ',', ' ' };
  char tail[] = { ']' };

  /* check recursive */
  for(i=0; i<RARRAY_LEN(list); i++) {
    if (mrb_obj_equal(mrb, ary, RARRAY_PTR(list)[i])) {
      return mrb_str_new(mrb, "[...]", 5);
    }
  }

  mrb_ary_push(mrb, list, ary);

  arystr = mrb_str_new_capa(mrb, 64);
  mrb_str_cat(mrb, arystr, head, sizeof(head));

  for(i=0; i<RARRAY_LEN(ary); i++) {
    int ai = mrb_gc_arena_save(mrb);

    if (i > 0) {
      mrb_str_cat(mrb, arystr, sep, sizeof(sep));
    }
    if (mrb_array_p(RARRAY_PTR(ary)[i])) {
      s = inspect_ary(mrb, RARRAY_PTR(ary)[i], list);
    }
    else {
      s = mrb_inspect(mrb, RARRAY_PTR(ary)[i]);
    }
    mrb_str_cat(mrb, arystr, RSTRING_PTR(s), RSTRING_LEN(s));
    mrb_gc_arena_restore(mrb, ai);
  }

  mrb_str_cat(mrb, arystr, tail, sizeof(tail));
  mrb_ary_pop(mrb, list);

  return arystr;
}
```

This is a real example, so it is a little complicated, but bear with
me. The essence of `Array#inspect` is that we stringify each element of
the array with the `inspect` method and then join the results to get
the `inspect` representation of the entire array.

After the `inspect` representation is created, we no longer need the
individual string representations. This means that we don't have to
keep these temporary objects registered in the GC arena.

Therefore, to keep the arena small, the `inspect_ary()` function does
the following for each element:

* save the position of the stack top using `mrb_gc_arena_save()`.
* get the `inspect` representation of the element.
* append it to the `inspect` representation of the array under
  construction.
* restore the stack top position using `mrb_gc_arena_restore()`.

Please note that the string holding the final `inspect` representation
of the entire array (`arystr`) was created before the call to
`mrb_gc_arena_save()`, so it stays registered in the arena after
`mrb_gc_arena_restore()`. Otherwise, an object that is still needed
could be deleted by the GC.

We may have a use case where, after creating many temporary objects,
we want to keep some of them. In this case, we cannot use the approach
from `inspect_ary()`, which appends the results to an existing object.
Instead, after `mrb_gc_arena_restore()`, we must re-register the
objects we want to keep in the arena using `mrb_gc_protect(mrb, obj)`.
Use `mrb_gc_protect()` with caution, because it can also lead to an
"arena overflow error".

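The restore-then-protect pattern can be sketched with a toy arena in
plain C. Everything here is hypothetical and for illustration only:
the `toy_` names, the eight-slot capacity, and the use of `int` values
in place of mruby objects are assumptions, and the real
`mrb_gc_protect()` takes an `mrb_state *` and an `mrb_value`. The
function registers every candidate as a temporary, rewinds the arena,
and then re-registers only the one it wants to keep.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 8

/* Hypothetical stand-in for the GC arena: a fixed-size stack. */
typedef struct toy_arena {
  const int *slots[TOY_ARENA_SIZE];
  int top;
} toy_arena;

/* Registers obj in the arena, in the role mrb_gc_protect() plays.
   Returns 0 on success, -1 on "arena overflow". */
static int
toy_arena_protect(toy_arena *a, const int *obj)
{
  if (a->top >= TOY_ARENA_SIZE) return -1;
  a->slots[a->top++] = obj;
  return 0;
}

/* Registers every candidate as a temporary, then keeps only the
   largest one. Returns the kept object, or NULL if n <= 0 or the
   arena overflows. */
static const int *
toy_keep_largest(toy_arena *a, const int *candidates, int n)
{
  int idx = a->top; /* like mrb_gc_arena_save() */
  const int *best = NULL;

  for (int i = 0; i < n; i++) {
    if (toy_arena_protect(a, &candidates[i]) != 0) {
      a->top = idx; /* give up, but leave the arena as we found it */
      return NULL;
    }
    if (best == NULL || candidates[i] > *best) best = &candidates[i];
  }

  a->top = idx; /* like mrb_gc_arena_restore(): drop all temporaries */
  if (best != NULL && toy_arena_protect(a, best) != 0) return NULL;
  return best;
}
```
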
We must also mention that when `mrb_funcall` is called at the top
level, the return value is also registered in the GC arena, so repeated
use of `mrb_funcall` may eventually lead to an "arena overflow error".

Use `mrb_gc_arena_save()` and `mrb_gc_arena_restore()`, or possibly
`mrb_gc_protect()`, to work around this.