Changes between Version 26 and Version 27 of GCTasklist

04/27/10 18:07:49



  • GCTasklist

    v26 v27  
     1 See also ["Fixing GC Bugs"] 
    1 3 == Tasks == 
     5 - Fix memory leaks (See #1465, #791, #748, #721, #706) 
     7 - Consider deleting src/gc/res_lea.c (doesn't work anyway) (See #655 and #490) 
     9 - Apply patch to remove _synchronize (See #978) 
     11 - Kill non-working GC cores (See #655) 
     13 - Rearrange the GC interface (See #670) 
     15 - Compacting/copying GC (See #616, CopyingGarbageCollector) 
    3 17 - Create an incremental tri-color mark GC module 
    5 19 - Integrate the new incremental GC into the existing system (See #670) 
     21 - Stress-test GC with concurrency 
     23 - Separate out GC String Core (See #828) 
     25 - Add a command-line argument to limit memory allocation (See #67, #827) 
     27 - Implement a sweep-free GC (See #1522) 
    11  - Consider deleting src/gc/res_lea.c (doesn't work anyway) (See #655 and #490) 
     29 - Realtime garbage collector for RTMS (See #1352) 
     31 - Deprecate mem_internal_*alloc functions (See #1402) 
     33 - Write test for #945 
     35 - Fix system-dependent code in src/gc/system.c (See #273) 
     37== Problems to Solve == 
     39 - Speed. The current GC is slow, and its performance affects the performance of other systems. Large numbers of objects (e.g. NQP/PGE) are particularly problematic. 
     41 - Concurrency. The current gc does not play well with threads. Running GC in a separate thread could help with speed. 
     43 - compact_pool causes cache thrashing: it copies all pools even if they're completely full or almost full. bacek has a branch for it (compact_pool_revamp, slower than trunk) 
     45 - need to be able to free allocated pools (a precise compacting collector) 
     47 - need to be able to identify and collect short-lived garbage much more cheaply (we avoid that to some extent by reducing the amount of garbage we create, but it would be a significant win for making sub/method calls less expensive). Needs escape analysis. 
     49 - copying collection 
    13 51 == Completed Tasks == 
    15  - Rename files in src/gc for sanity, suggested names: 
    16   * memory.c -> alloc_memory.c or mem_allocate.c (r39002) 
    17   * register.c -> alloc_registers.c or reg_allocate.c (r39003) 
    18   * resources.c -> alloc_resources.c or resource_allocate.c (r39004) 
    20  - Collapse src/gc/smallobject.c into src/gc/api.c (r39022 and r39023) 
    22  - Move src/malloc.c and src/malloc-trace.c into src/gc (not strictly GC, but want to group all memory management), consider deleting if only used by src/gc/res_lea.c (r39006) 
    25 ---- 
    27 Improve abstraction/encapsulation for existing GC modules. (r38654 and later) 
    29 If there are any non-API functions in src/gc/api.c move them into another file, possibly src/gc/common.c to indicate that they're internal to the GC system only, but shared between all the GC modules. (r38654 file currently named "mark_sweep.c") 
    31 ---- 
    33 Renamed all API functions to Parrot_gc_* (r34775). 
    36 ---- 
    38 Renamed files: 
    40   * dod.c -> api.c and dod.h -> gc_api.h (r34774) 
    41   * gc_gms.c -> generational_ms.c (r34795) 
    42   * gc_ims.c -> incremental_ms.c (r34796) 
    45 ---- 
    47 == Branch History == 
    49 svn copy \ 
    50   \ 
    51       -m "Creating a branch for the first round of GC refactoring." 
    53 initial revision: r34100 
    56 SVK merged r34113 
    58 ---- 
    60 (allison) 
    61 svn copy \ 
    62   \ 
    63       -m "Creating a branch for a second round of GC refactoring, cleanups and code reorganization." 
    65 initial revision: r34686 
    68 (chromatic) 
    70 Brought the GC refactoring branch up to date with trunk r35194. 
    72 new revision: r35195 
    75 (allison) 
    77 svn merge -r35194:HEAD 
    79 new revision: r35369 
    81 [pdd09gc] Bringing the pdd09gc_part2 branch up-to-date with trunk r35369. 
    84 (allison) 
    86 svn merge -r35369:HEAD 
    88 new revision: r35373 
    90 [pdd09gc] Bringing the pdd09gc_part2 branch up-to-date with trunk r35373. 
    93 (allison) 
    95 svn merge -r34686:HEAD 
    97 new revision: r35374 
    99 [pdd09gc] Merging the pdd09gc_part2 branch into trunk for r34686 to r35374. GC refactor: reorganize code for sanity and maintainability. 
    102 (allison) 
    104 svn delete -m "Removing second GC development branch from the repository" 
    106 new revision: r35380 
    109 ---- 
    111 == GC rewrapping talk ==  
    112 {{{ 
    113 <bacek> Whiteknight: I want to break something in parrot. And I'll need your help with this idea :) 
    114 <Whiteknight> i like breaking, and ideas! 
    115 <bacek> Current GC API. 
    116  It's broken-by-design. 
    117 <bacek> E.g. using "Small_Object_Pool" directly for allocating objects enforce particular implementation. 
    118 <bacek> Whiteknight: also, current GC API doesn't allow something like "Gimme chunk of memory with this size" 
    119  Whiteknight: and without this ability I can merge "PMC" and "ATTRibutes" into single allocation. 
    120  Or convert Context to use ATTRibutes. 
    121 <Whiteknight> bacek: we need PMCs to be located together so we can sweep 
    122 <bacek> So, "mark-sweep" enforces particular design. And it's bad 
    123  What about compacting GCs? 
    124 <Whiteknight> bacek: any GC is going to have a sweep 
    125 <bacek> Or treadmills 
    126  Whiteknight: not all of them 
    127 <Whiteknight> bacek: I can't think of a single algorithm that doesn't use some sort of sweep 
    128 <bacek> In many cases it can be implicit 
    129 <Whiteknight> ACTUALLY, chromatic's idea might work here 
    130 <bacek> "Uniprocessor Garbage Collection Techniques" 
    132  anyway, "sweep" doesn't require all objects to be same size 
    133  and allocated from "Small_Object_Pool" 
    134 <dukeleto> 67 pages of GC goodness 
    135 <bacek> dukeleto: indeed 
    136  Whiteknight: check src/gc/api.c, line 329 
    137 <Whiteknight> dukeleto: You jest, but I'm going to print and read every one of them 
    138 <bacek> My idea is not to have "arenas" in interp, but some "struct gc*" with few functions and "void * gc_private" 
    139 <Whiteknight> bacek: Then you are going to love the new patch from jrtayloriv: It does exactly that 
    140 <bacek> So, all implementation details will be decoupled from Interp. 
    141 <dukeleto> Whiteknight: I don't jest. that looks like a really good introductory/overview paper. 
    142 <bacek> Whiteknight: Yay! jrtayloriv++ 
    143 <Whiteknight> bacek: I like that idea a lot. I've wanted to do it myself for a while 
    144 <bacek> Just ensure that these functions have an explicit "size" argument 
    145 <Whiteknight> bacek: if we changed the fixed-size attributes pool to include PMC and ATTR together, we could get the effect you are talking about 
    146 <bacek> yes 
    147 <Whiteknight> then we could sweep those pools because all objects would be guaranteed to be PMCs 
    148 <bacek> Why sweeping require all objects to be PMC? 
    149  It's just chunks of memory 
    150 <Whiteknight> bacek: If we sweep linearly over a pool, we don't know if the block is a PMC or not 
    151 <bacek> We know size. 
    152 <Whiteknight> PMCs have flags, need destruction, etc. We need to know if an object is a PMC 
    153  but I can allocate an X byte buffer in the same pool I allocate an X byte PMC 
    154  and I don't want to call VTABLE_destroy on the buffer 
    155 <bacek> Ok. 
    156  We just need pairs of function "get_raw_chunk" and "get_pmc" in gc* 
    157  no, bad idea. 
    158  All we need is pass (optional) pointer to "destroy function". 
    159  Which will be invoked by GC during sweep. 
    160 <Whiteknight> so store the destroy function in the pool? 
    161  that idea actually has some merit 
    162 <bacek> In allocated object. 
    163 <Whiteknight> so for every object we add a header with a pointer to a destroy function? 
    164 <bacek> Whiteknight: only when requested. GC can be smart enough to not store it always. 
    165  E.g. in most PMC without "destroy" VTABLE it's useless. 
    166 <bacek> Bah! We have to change Pmc2s to set "active_destroy" as pragma, not in "VTABLE_init" 
    167 <bacek> This step is too big... Ok, scratch last idea. 
    168 <Whiteknight> yes, and custom_mark 
    169 <bacek> Let's just have pair of functions for allocate memory. 
    170  Both of them should accept explicit size. 
    171 <Whiteknight> I like that idea, but we need to be able to tell whether an item is a PObj or not 
    172  and then we need to be able to find all STRING headers that point to a string buffer 
    173  because of COW, it can be N:1 
    174 <bacek> Immutable strings ftw! 
    175 <Whiteknight> I want immutable strings. That would be a very good idea 
    176 <dukeleto> i like cows 
    177 <Whiteknight> but, it would be a huge job 
    178 <bacek> Actually, with this approach we can just drop STRING and always use String PMC. 
    179  Just reimplement it use ATTRibutes and allocate raw chunk of memory for real string 
    180  Then, String,mark will do all required job 
    181 <Whiteknight> there is some merit to that. I think the string compactor is very expensive 
    182 <bacek> And incomprehensible :) 
    183 <NotFound> Note that in order to make PMC variable size we might need to forbid morphing assignments 
    184 <Whiteknight> NotFound: Yes, I hadn't even thought about that 
    185 <bacek> (morphing) VTABLE_morph is way-too-premature optimisation 
    186 <Whiteknight> yes, but some people do use the morph opcode 
    187  so we can't kill it entirely 
    188  at least, not yet 
    189 <bacek> Fortunately, I've put broad deprecation warning for all VTABLEs in 1.4 :) 
    190 <NotFound> bacek: morph is not the only way of morphing. Look for example at the Undef pmc. 
    191 <Whiteknight> yeah, we lose a lot of flexibility 
    192  I really like this idea. Hard part is finding a way to do it without breaking the whole damn codebase 
    193 <Whiteknight> bacek: yes, dump some of your ideas onto the wiki. I think there is a "GCTasklist" page you can use 
    194 <NotFound> You can start thinking about how to avoid assignment to Undef 
    195 <bacek> NotFound: anyway, we have to have something for implementing compacting GC. 
    196  And it's closely related to "assign to Undef" 
    197  Because we have to update all pointers to PMC. 
    198  Or wrap "PMC" into struct with single pointer. 
    199 <Whiteknight> a "PObj Header" could be allocated to provide indirect access to a PMC 
    200  it would be fixed size and we could store it in a very simple pool 
    201 <bacek> Whiteknight: +1 
    202 }}}
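
The interface bacek describes in the talk above (a "struct gc*" with a few functions, a "void * gc_private", an explicit size on every allocation, and an optional destroy function invoked during sweep) could look roughly like the sketch below. This is only an illustration under stated assumptions: every name in it (gc_sys, destroy_fn, toy_allocate, toy_mark, toy_sweep, count_destroy) is hypothetical and not part of Parrot, and the toy backend is a plain malloc-backed linked list rather than any of the existing GC cores. Unlike the idea in the chat, it stores the destroy pointer in every block header for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of this is existing Parrot API. */
typedef void (*destroy_fn)(void *obj);

/* The interpreter would hold only this; details live behind gc_private. */
typedef struct gc_sys {
    void *(*allocate)(struct gc_sys *gc, size_t size, destroy_fn destroy);
    void  (*sweep)(struct gc_sys *gc);
    void  *gc_private;              /* backend-specific state */
} gc_sys;

/* Toy backend: each block carries a small header in front of the payload. */
typedef struct block_header {
    struct block_header *next;
    destroy_fn           destroy;   /* NULL for raw buffers */
    int                  live;      /* set by mark, cleared by sweep */
} block_header;

typedef struct toy_state {
    block_header *blocks;           /* all blocks handed out so far */
} toy_state;

/* Allocate `size` bytes; `destroy` may be NULL (e.g. for a plain buffer).
 * Payload alignment here is that of block_header, which is enough for
 * pointers and ints but is an assumption, not a general guarantee. */
static void *toy_allocate(gc_sys *gc, size_t size, destroy_fn destroy)
{
    assert(gc != NULL);
    toy_state    *st = gc->gc_private;
    block_header *h  = malloc(sizeof *h + size);
    if (!h)
        return NULL;
    h->destroy = destroy;
    h->live    = 0;
    h->next    = st->blocks;
    st->blocks = h;
    return h + 1;                   /* payload starts right after the header */
}

/* Mark one object as reachable (a real collector would trace from roots). */
static void toy_mark(void *obj)
{
    ((block_header *)obj - 1)->live = 1;
}

/* Free every unmarked block, calling its destroy function if it has one.
 * The sweep never needs to know whether a block is a PMC or a buffer. */
static void toy_sweep(gc_sys *gc)
{
    toy_state     *st   = gc->gc_private;
    block_header **link = &st->blocks;
    while (*link) {
        block_header *h = *link;
        if (h->live) {
            h->live = 0;            /* survivor: reset for the next cycle */
            link = &h->next;
        } else {
            *link = h->next;
            if (h->destroy)
                h->destroy(h + 1);
            free(h);
        }
    }
}

/* Example destroy callback that only counts its invocations. */
static int destroyed_count = 0;
static void count_destroy(void *obj)
{
    (void)obj;
    destroyed_count++;
}
```

A caller would fill in a gc_sys with toy_allocate, toy_sweep and a pointer to a toy_state, then allocate through gc.allocate(&gc, size, destroy). Swapping in a different collector would mean providing a different set of functions and private state, which is the decoupling from the interpreter that the discussion is after.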