Baby’s First Garbage Collector › 标志标记 Marky mark [#268]

第一阶段是标记。我们需要遍历所有可到达的对象并设置它们的标记位。我们需要的第一件事是向 Object 添加一个标记位:
The first phase is marking. We need to walk all of the reachable objects and set their mark bit. The first thing we need then is to add a mark bit to Object:

typedef struct sObject {
  unsigned char marked;
  /* Previous stuff... */
} Object;

当我们创建了这个新对象,我们还要修改 newObject() 以将 marked 初始化为零。
When we create a new object, we modify newObject() to initialize marked to zero.

为了标记所有可到达的对象,我们从内存中的变量开始,这意味着遍历栈。看起来像这样:
To mark all of the reachable objects, we start with the variables that are in memory, so that means walking the stack. That looks like this:

void markAll(VM* vm)
{
  for (int i = 0; i < vm->stackSize; i++) {
    mark(vm->stack[i]);
  }
}

这又调用了 mark。我们将分阶段构建它。首先是:
That in turn calls mark. We’ll build that in phases. First:

void mark(Object* object) {
  object->marked = 1;
}

这是字面上最重要的一点(bit)。我们已将对象本身标记为可达。但记住,我们还需要处理对象中的引用⸺可达性是递归的。如果该对象是一个对,则它的两个字段也是可达的。处理递归很简单:
This is the most important bit, literally. We’ve marked the object itself as reachable. But remember we also need to handle references in objects—reachability is recursive. If the object is a pair, its two fields are reachable too. Handling that is simple:

void mark(Object* object) {
  object->marked = 1;

  if (object->type == OBJ_PAIR) {
    mark(object->head);
    mark(object->tail);
  }
}

你看到这里有一个 bug 了吗?我们现在正在递归,但我们没有检查成环。如果你有一堆在循环中相互指向的对,这将使 C 调用栈溢出并崩溃。
There’s a bug here. Do you see it? We’re recursing now, but we aren’t checking for cycles. If you have a bunch of pairs that point to each other in a loop, this will overflow the C callstack and crash.

为了解决这个问题,我们只需要在到达已经处理过的对象时退出即可。所以完整的 mark() 函数是:
To handle that, we simply need to bail out if we get to an object that we’ve already processed. So the complete mark() function is:

void mark(Object* object) {
  /* If already marked, we're done. Check this first
     to avoid recursing on cycles in the object graph. */
  if (object->marked) return;

  object->marked = 1;

  if (object->type == OBJ_PAIR) {
    mark(object->head);
    mark(object->tail);
  }
}

现在我们可以调用 markAll(),它将正确标记内存中每个可达的对象。我们已经完成一半了!
Now we can call markAll() and it will correctly mark every reachable object in memory. We’re halfway done!