Version 2 (modified by bacek, 12 years ago)

Minor fix

One of the bottlenecks in the current mark and sweep GC is the sweep portion. We have to walk through every memory pool in every arena to find every dead GCable. The more GCables in the system, the longer this process takes. In the worst case, marking requires us to touch every GCable in the system, that's slightly worse than O(n) (plus the overhead of recursing through the object graph and pruning out cycles). In this case, all GCables are live, but we still have to sweep all of the pools in another slightly worse than O(n) operation -- to say nothing of the memory cache performance.

We can avoid that sweep if we have some mechanism to segregate live from dead GCables. (This also gets us further along to a generational system.)

Every time we allocate a new pool, allocate additional space to store three pointers:

  • a pointer to the GCable
  • a previous pointer
  • a next pointer

... that is, a doubly-linked list.

The GC structure needs to contain four additional pointers:

  • a pointer to the head of the LIVE list
  • a pointer to the head of the FRESH list
  • a pointer to the head of the DEAD list
  • a pointer to the head of the MAYBE list

Every time we grab a fresh new GCable, we add it to the the FRESH list.

At the start of the mark phase, we move the FRESH list to the MAYBE list and start a new FRESH list. This avoids the freshly-allocated-but-not-marked problem.

During the mark phase, we append every live GCable to the LIVE list, if it's in the MAYBE list. (We'll have to manage the existing flags carefully to avoid adding and re-adding; perhaps some toggle will help. "Last time, 0 meant live. This time, 1 means live.")

By the end of the mark phase, all live GCables should be in the LIVE list. Everything in the MAYBE list should now be DEAD, so we can append it to that list. We can walk the DEAD list and perform any custom destruction then.

The benefit to this system is that the O(n) operation of the sweep (which tends to hew to the full extent of n) of sweeping becomes much more efficient. The cost of allocating an object is slightly higher due to the need to manipulate pointers in the linked lists, but the cost of freeing dead objects is much cheaper.

We may be able to find smaller and cheaper data structures than the doubly linked list, though the cheapness of splicing elements and lists together is compelling.