Ticket #1805 (new todo)

Opened 11 years ago

Last modified 11 years ago

GC threshold and other values need to be set during configuration

Reported by: jkeenan Owned by:
Priority: normal Milestone:
Component: GC Version: 2.8.0
Severity: medium Keywords:
Cc: bacek, dukeleto, luben, fperrad Language:
Patch status: Platform:


Since the merge of the gc_massacre branch into trunk at r49269, we have experienced a number of problems during make and make test. These have been discussed on parrot-dev as well as on #parrot. In the course of this discussion, attention was called to the fact that certain values affecting garbage collection are hard-coded deep inside .c source code files.

As a general rule, it is not a good idea to bury hard-coded values deep inside source code. They should be made available for determination by the user during the configuration process. We need to pull these values out of source code and make them more visible during Parrot configuration.

Here's a GC-amateur's list of GC-related values which appear to be hard-coded into our source code:

./src/gc/gc_ms2.c:626:        self->gc_threshold = 256 * 1024 * 1024;
./src/main.c:429:                if (interp->gc_threshold > 1000) {


#define RECLAMATION_FACTOR 0.20

/* show allocated blocks on stderr */
#define RESOURCE_DEBUG 0
#define RESOURCE_DEBUG_SIZE 1000000

static void
check_fixed_size_obj_pool(ARGIN(const Fixed_Size_Pool *pool))
{
    ASSERT_ARGS(check_fixed_size_obj_pool)
    size_t total_objects;
    size_t last_free_list_count;
    Fixed_Size_Arena * arena_walker;
    size_t free_objects;
    size_t count;
    GC_MS_PObj_Wrapper * pobj_walker;

    count = 10000000; /*detect unendless loop just use big enough number*/

    count = 10000000;

static void
check_var_size_obj_pool(ARGIN(const Variable_Size_Pool *pool))
{
    ASSERT_ARGS(check_var_size_obj_pool)
    size_t count;
    Memory_Block * block_walker;
    count = 10000000; /*detect unendless loop just use big enough number*/

We currently have one configuration step, auto::gc, which configures garbage collection. But since we currently offer users only one garbage collection system, this step (found in config/auto/gc.pm) doesn't do much other than identify src/gc/alloc_resources.c as the location where GC is determined. auto::gc currently conducts no C probes of the user's machine. It might as well be init::gc; its status as an 'automatic probe' configuration step merely reflects legacy settings.

(To recap how configuration works (oversimplified): init:: steps take configuration values from Perl 5 %Config or other hard-coded locations; inter:: steps work similarly but, on request, will offer the configurer several options; auto:: steps calculate settings, largely based on running C probes of the user's machine, that should not need user intervention; and gen:: steps take all the values discovered in the first three kinds of steps, write Makefiles and other files needed for the build, and record the values in lib/Parrot/Config/Generated.pm and config_lib.pir.)

The hard-coded values listed above should, at least in certain cases, be user-configurable. That means that the configuration of garbage collection needs to be moved to some combination of init::, inter:: and auto:: steps. For example, if we determine that on machines with 'small' physical memory the value of gc_threshold needs to be set an order of magnitude smaller than 256M, then we need to probe the machine for the size of its physical memory, then calculate an appropriate value for gc_threshold, then provide the user the option of confirming/altering this choice at a command-line prompt.
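The probe-then-calculate part of that example could look roughly like the C sketch below. It is only an illustration, not a proposal for the actual configuration step: the function names and the "one eighth of physical memory" policy are invented here, and sysconf(_SC_PHYS_PAGES) is an assumption about the platform (it is available on Linux and some other Unix systems but is not universal, so the probe reports 0 when it cannot answer).

```c
#include <stddef.h>
#include <unistd.h>

/* The value currently hard-coded in src/gc/gc_ms2.c. */
#define DEFAULT_GC_THRESHOLD ((size_t)256 * 1024 * 1024)

/* Hypothetical policy: use at most 1/8 of physical memory as the
 * threshold, and never more than the current default. A phys_bytes
 * of 0 means "unknown" and yields the default. */
static size_t
suggest_gc_threshold(unsigned long long phys_bytes)
{
    unsigned long long cap = phys_bytes / 8;

    if (phys_bytes == 0 || cap >= DEFAULT_GC_THRESHOLD)
        return DEFAULT_GC_THRESHOLD;
    return (size_t)cap;
}

/* Returns physical memory in bytes, or 0 if it cannot be determined.
 * Assumes a platform that provides _SC_PHYS_PAGES and _SC_PAGESIZE. */
static unsigned long long
probe_physical_memory(void)
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages     = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);

    if (pages > 0 && page_size > 0)
        return (unsigned long long)pages * (unsigned long long)page_size;
#endif
    return 0;
}
```

A configuration step would call suggest_gc_threshold(probe_physical_memory()) and offer the result to the user as the default at the prompt. With this policy a machine with 512M of physical memory would be offered 64M, while one with 2G or more would keep the existing 256M.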

I can't yet propose specific modifications of the configuration system to handle GC values. But I am convinced that some or all of these values need to be extracted from .c source code files and made explicit during configuration.

Thank you very much.


Change History

follow-up: ↓ 3   Changed 11 years ago by chromatic

On Sunday 26 September 2010 at 07:38, Parrot wrote:

> I am convinced that some or all of these values
>  need to be extracted from .c source code files and made explicit
>  during configuration.

I disagree.  Consider packagers, who may build Parrot on a machine with plenty 
of memory and distribute binaries to users with much less memory.

Instead we can provide user-configurable limits through command-line options or 
environment variables, and we can catch malloc errors when attempting to 
allocate more memory and adjust the GC thresholds accordingly.

-- c
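For illustration, the environment-variable route suggested above might be sketched in C as follows. The variable name PARROT_GC_THRESHOLD and both function names are hypothetical and do not refer to anything Parrot provides; the sketch only shows a run-time override that falls back to the current hard-coded default whenever the variable is unset or malformed. It does not attempt the malloc-failure handling.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The value currently hard-coded in src/gc/gc_ms2.c. */
#define DEFAULT_GC_THRESHOLD ((size_t)256 * 1024 * 1024)

/* Parse a decimal byte count. Returns fallback when text is NULL,
 * does not start with a digit, has trailing junk, is zero, or does
 * not fit in size_t. */
static size_t
parse_gc_threshold(const char *text, size_t fallback)
{
    char *end = NULL;
    unsigned long long value;

    /* Rejects NULL, empty strings, signs and leading whitespace. */
    if (text == NULL || *text < '0' || *text > '9')
        return fallback;

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0' || value == 0 || value > SIZE_MAX)
        return fallback;

    return (size_t)value;
}

/* Hypothetical run-time override, read once at interpreter start-up. */
static size_t
gc_threshold_from_env(void)
{
    return parse_gc_threshold(getenv("PARROT_GC_THRESHOLD"),
                              DEFAULT_GC_THRESHOLD);
}
```

Because the value is read when the interpreter starts rather than when it is built, a packaged binary built on a large machine could still be tuned by a user with much less memory.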

  Changed 11 years ago by nwellnhof

I agree with chromatic. No user should be required to specify internal parameters at compile time. This should be configurable at run time, especially because these values may need some testing. OTOH, neither command-line options nor environment variables look like the perfect solution.

Catching malloc errors sounds nice, but to do it right we would need a GC that never allocates more memory during a GC, does real compacting, and can never be blocked. We're a long way from all that.

in reply to: ↑ 1   Changed 11 years ago by jkeenan

Replying to chromatic:

> Instead we can provide user-configurable limits through command-line options or environment variables, ...

Well, if you're handling them through command-line options, then you're handling them through the configuration system, albeit more likely in an init:: step rather than an inter:: or auto::.

So, how would you handle the situation I and others face right now: i.e., that I had to change a hard-coded value deep inside a .c file whose existence I was previously unaware of -- in order to get Parrot to run its own tests at all?

Thank you very much.


  Changed 11 years ago by jkeenan

Related ticket, per observation by coke: TT #827

  Changed 11 years ago by bacek


I think we can close this ticket as a duplicate of #67. Or at least merge them.

-- Bacek
