# How to use `mrb_gc_arena_save()`/`mrb_gc_arena_restore()`/`mrb_gc_protect()`

_This is an English translation of [Matz's blog post][matz blog post]
written in Japanese._
_Some parts are updated to reflect recent changes._

[matz blog post]: http://www.rubyist.net/~matz/20130731.html

When you extend mruby in C, you may encounter a mysterious "arena
overflow error", a memory leak, or very slow execution. The arena
overflow error indicates that the "GC arena", which implements mruby's
"conservative GC", has overflowed.

A GC (garbage collector) must determine whether an object is "alive", in
other words, whether it is referenced from somewhere in the program. It
does so by checking whether the object can be reached, directly or
indirectly, from a root. Local variables, global variables, constants,
and so on are roots.

If the program runs inside the mruby VM, there is nothing to worry
about, because the GC can access all roots owned by the VM.

The problem arises when executing C functions. An object referenced by a
C variable is also "alive", but the mruby GC is not aware of this, so it
might mistakenly treat objects referenced only by C variables as dead.

If the GC collects a live object, the result can be a fatal bug.

In CRuby, we scan the C stack area and use C variables as roots to check
whether an object is alive. Of course, because we access the C stack
simply as a memory region, we never know whether a value is an integer
or a pointer. We work around this by assuming that anything that looks
like a pointer is a pointer. We call this approach "conservative".

By the way, CRuby's "conservative GC" has some problems.

The biggest problem is that there is no portable way to access the stack
area. Therefore, we cannot use this method if we want to implement a
highly portable runtime like mruby.

So we came up with another plan to implement "conservative GC" in mruby.

Again, the problem is an object that was created in a C function and is
not referenced from the Ruby world, yet must not be treated as garbage.

In mruby, we treat all objects created in a C function as alive. This
way, a live object is never mistaken for a dead one.

This means that we cannot collect objects that are truly dead, so we may
lose some efficiency, but in exchange the GC itself is highly portable.
We can say goodbye to the problem, which sometimes occurs in CRuby, of
the GC deleting live objects because of optimization.

Following this idea, we have a table called the "GC arena", which
remembers objects created in C functions.

The arena is a stack structure: when execution returns from a C function
to the mruby VM, all objects registered in the arena during the call are
popped.

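The stack behaviour described above can be pictured with a small,
self-contained toy model. This is only a sketch under assumptions:
`toy_arena`, `toy_new_object()`, `toy_c_function()`, and `toy_vm_call()`
are hypothetical names invented for illustration, strings stand in for
objects, and mruby's real implementation differs in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in mruby. */
#define TOY_ARENA_CAPA 8

typedef struct toy_arena {
  const char *slots[TOY_ARENA_CAPA]; /* stand-ins for object pointers */
  int idx;                           /* stack top: number of entries */
} toy_arena;

/* Creating an object in C code registers it at the stack top. */
static const char *
toy_new_object(toy_arena *a, const char *name)
{
  assert(a->idx < TOY_ARENA_CAPA);
  a->slots[a->idx++] = name;
  return name;
}

/* A C function that allocates two temporaries. */
static void
toy_c_function(toy_arena *a)
{
  toy_new_object(a, "tmp1");
  toy_new_object(a, "tmp2");
}

/* What the VM does around a C function call in this model:
   remember the stack top, run the function, then pop everything
   the function registered. Returns the peak number of entries. */
static int
toy_vm_call(toy_arena *a, void (*func)(toy_arena *))
{
  int saved = a->idx;
  int peak;

  func(a);
  peak = a->idx;
  a->idx = saved;  /* pop all entries registered during the call */
  return peak;
}
```

In this model, calling `toy_vm_call(&arena, toy_c_function)` on an empty
arena lets the arena grow to two entries while the function runs and
leaves it empty again afterwards.
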
This works very well, but it can cause another problem: an "arena
overflow error" or a memory leak.

As of this writing, mruby automatically extends the arena to remember
objects (see `MRB_GC_FIXED_ARENA` and `MRB_GC_ARENA_SIZE` in
doc/guides/mrbconf.md).

If you create many objects in a C function, memory usage will increase,
since the GC never reclaims these objects. This growth may look like a
memory leak, and it also makes execution slower because more memory
needs to be allocated.

With a build-time configuration, you can limit the maximum size of the
arena (e.g., to 100). In that case, creating many objects overflows the
arena, and you get an "arena overflow error".

To work around these problems, we have the `mrb_gc_arena_save()` and
`mrb_gc_arena_restore()` functions.

`int mrb_gc_arena_save(mrb)` returns the current position of the stack
top of the GC arena, and `void mrb_gc_arena_restore(mrb, idx)` sets the
stack top position back to the given `idx`.

We can use them like this:

```c
int arena_idx = mrb_gc_arena_save(mrb);

// ...create objects...
mrb_gc_arena_restore(mrb, arena_idx);
```

In mruby, each C function call is already surrounded by this
save/restore pair, but we can further reduce memory usage, and avoid
arena overflow bugs, by wrapping smaller regions of our own code in
save/restore as well.

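To see why wrapping a loop body matters, here is a toy model of a
fixed-size arena. It is a sketch under assumptions, not mruby code: the
`toy_*` names are hypothetical, only the stack top is tracked, and the
capacity of 100 simply mirrors the example limit mentioned earlier.

```c
#include <assert.h>

/* Toy model only: these names are hypothetical, not mruby API. */
#define TOY_ARENA_CAPA 100

typedef struct toy_arena {
  int idx;   /* stack top: number of registered objects */
  int peak;  /* highest stack top seen so far */
} toy_arena;

/* Register one new object; returns 0 on success, -1 on overflow. */
static int
toy_register(toy_arena *a)
{
  if (a->idx >= TOY_ARENA_CAPA) return -1;  /* "arena overflow" */
  a->idx++;
  if (a->idx > a->peak) a->peak = a->idx;
  return 0;
}

static int  toy_arena_save(toy_arena *a) { return a->idx; }
static void toy_arena_restore(toy_arena *a, int idx) { a->idx = idx; }

/* Create n temporaries with no save/restore:
   overflows as soon as n exceeds the capacity. */
static int
toy_loop_plain(toy_arena *a, int n)
{
  int i;
  for (i = 0; i < n; i++) {
    if (toy_register(a) < 0) return -1;
  }
  return 0;
}

/* Same loop, but each iteration releases its own temporaries. */
static int
toy_loop_scoped(toy_arena *a, int n)
{
  int i;
  for (i = 0; i < n; i++) {
    int ai = toy_arena_save(a);
    if (toy_register(a) < 0) return -1;
    toy_arena_restore(a, ai);
  }
  return 0;
}
```

With 1000 iterations, `toy_loop_plain()` fails once the arena holds 100
entries, while `toy_loop_scoped()` completes and never holds more than
one entry at a time.
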
Let's take a real example. Here is the source code of `Array#inspect`:

```c
static mrb_value
inspect_ary(mrb_state *mrb, mrb_value ary, mrb_value list)
{
  mrb_int i;
  mrb_value s, arystr;
  char head[] = { '[' };
  char sep[] = { ',', ' ' };
  char tail[] = { ']' };

  /* check recursive */
  for(i=0; i<RARRAY_LEN(list); i++) {
    if (mrb_obj_equal(mrb, ary, RARRAY_PTR(list)[i])) {
      return mrb_str_new(mrb, "[...]", 5);
    }
  }

  mrb_ary_push(mrb, list, ary);

  arystr = mrb_str_new_capa(mrb, 64);
  mrb_str_cat(mrb, arystr, head, sizeof(head));

  for(i=0; i<RARRAY_LEN(ary); i++) {
    int ai = mrb_gc_arena_save(mrb);

    if (i > 0) {
      mrb_str_cat(mrb, arystr, sep, sizeof(sep));
    }
    if (mrb_array_p(RARRAY_PTR(ary)[i])) {
      s = inspect_ary(mrb, RARRAY_PTR(ary)[i], list);
    }
    else {
      s = mrb_inspect(mrb, RARRAY_PTR(ary)[i]);
    }
    mrb_str_cat(mrb, arystr, RSTRING_PTR(s), RSTRING_LEN(s));
    mrb_gc_arena_restore(mrb, ai);
  }

  mrb_str_cat(mrb, arystr, tail, sizeof(tail));
  mrb_ary_pop(mrb, list);

  return arystr;
}
```

This is a real example, so it is a little complicated, but bear with me.
The essence of `Array#inspect` is that we stringify each element of the
array using the `inspect` method and then join the results together to
get the `inspect` representation of the entire array.

Once an element's string has been appended to the result, we no longer
need that individual string. This means these temporary objects do not
have to stay registered in the GC arena.

Therefore, to keep the arena small, the `inspect_ary()` function does
the following for each element:

* save the position of the stack top using `mrb_gc_arena_save()`.
* get the `inspect` representation of the element.
* append it to the `inspect` representation of the array under construction.
* restore the stack top position using `mrb_gc_arena_restore()`.

Please note that the `inspect` representation of the entire array is
built up (each element's string is appended) before the call to
`mrb_gc_arena_restore()`. Otherwise, a temporary object that is still
required may be deleted by the GC.

We may have a use case where, after creating many temporary objects, we
would like to keep some of them. In this case, we cannot use the
approach from `inspect_ary()` of appending objects to an existing one.
Instead, after `mrb_gc_arena_restore()`, we must re-register the objects
we want to keep in the arena using `mrb_gc_protect(mrb, obj)`.
Use `mrb_gc_protect()` with caution, because it can also lead to an
"arena overflow error".

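The restore-then-protect pattern can be sketched with the same kind of
toy model. Everything here is an assumption made for illustration:
`toy_push()`, `toy_is_rooted()`, and `toy_pick_last()` are hypothetical
names, string literals stand in for objects, and the comments only point
at the mruby call each step is meant to resemble.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not mruby's implementation. */
#define TOY_ARENA_CAPA 16

typedef struct toy_arena {
  const char *slots[TOY_ARENA_CAPA]; /* stand-ins for object pointers */
  int idx;                           /* stack top */
} toy_arena;

/* Register obj at the stack top. */
static void
toy_push(toy_arena *a, const char *obj)
{
  assert(a->idx < TOY_ARENA_CAPA);
  a->slots[a->idx++] = obj;
}

/* Is obj currently kept alive by the arena? */
static int
toy_is_rooted(const toy_arena *a, const char *obj)
{
  int i;
  for (i = 0; i < a->idx; i++) {
    if (a->slots[i] == obj) return 1;
  }
  return 0;
}

/* Create several temporaries, then keep only the last one. */
static const char *
toy_pick_last(toy_arena *a)
{
  static const char *names[] = { "t0", "t1", "t2", "keep" };
  const char *result = NULL;
  int ai = a->idx;        /* like mrb_gc_arena_save() */
  size_t i;

  for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
    result = names[i];
    toy_push(a, result);  /* object creation registers it */
  }
  a->idx = ai;            /* like mrb_gc_arena_restore(): drops all four */
  toy_push(a, result);    /* like mrb_gc_protect(): re-register the keeper */
  return result;
}
```

After `toy_pick_last()` returns, the arena holds exactly one entry, the
kept object. Each such call still adds one entry, which is why repeated
protection can itself fill the arena.
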
We must also mention that when `mrb_funcall` is called at the top level,
the return value is also registered in the GC arena, so repeated use of
`mrb_funcall` may eventually lead to an "arena overflow error".

Use `mrb_gc_arena_save()` and `mrb_gc_arena_restore()`, possibly
together with `mrb_gc_protect()`, to work around this.