Changes between Initial Version and Version 1 of PCCPerformanceImprovements

Timestamp:
08/17/10 21:11:20
Author:
chromatic
Comment:

initial braindump ready for comments, critiques, and questions (need a third hard-k sound)

     1The two biggest expenses in PCC right now are creating unnecessary CallSignatures and allocating memory for register sets and the like. 
     2 
     3We know most of our call signatures at PIR compilation time: call this function, pass these specific registers, and receive these values back in those specific registers.  From a PBC point of view, if signatures are immutable, we can store each signature in bytecode once and use the frozen signature for every call.  Because we have constant caching, calls with the same logical signature can share a single signature PMC.  Likewise, any code which uses the C API to make calls into Parrot can create a single signature PMC for each logical signature. 
     4 
     5This is similar to what NCI does, if you like prior art. 
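
As a rough illustration of sharing one frozen signature per logical signature, here is a minimal C sketch of an interning cache. Everything in it is an assumption made for illustration: `FrozenSig`, `intern_sig`, and the type-letter strings are hypothetical names, not Parrot's actual API, and a linked list stands in for whatever constant table the bytecode would really use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical frozen signature: written once, then only read. */
typedef struct FrozenSig {
    char *args;                /* argument types, e.g. "PP" for two PMCs */
    char *returns;             /* return types, e.g. "P" */
    struct FrozenSig *next;    /* next entry in the cache list */
} FrozenSig;

static FrozenSig *sig_cache = NULL;

static char *copy_string(const char *s) {
    char *copy = malloc(strlen(s) + 1);
    assert(copy != NULL);
    strcpy(copy, s);
    return copy;
}

/* Return the one shared signature for (args, returns),
 * creating it the first time this combination is seen. */
const FrozenSig *intern_sig(const char *args, const char *returns) {
    FrozenSig *sig;

    for (sig = sig_cache; sig != NULL; sig = sig->next) {
        if (strcmp(sig->args, args) == 0 && strcmp(sig->returns, returns) == 0)
            return sig;        /* cache hit: no new allocation */
    }

    sig = malloc(sizeof *sig);
    assert(sig != NULL);
    sig->args = copy_string(args);
    sig->returns = copy_string(returns);
    sig->next = sig_cache;
    sig_cache = sig;
    return sig;
}
```

Two call sites that both pass two PMCs and receive one PMC would get the same pointer back from `intern_sig("PP", "P")`, so only the first of them pays for an allocation.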
     6 
     7To make this work, we need to separate the mutable portion of CallSignature from the immutable portion.  The immutable portion should describe: 
     8 
     9 * the parameter information (number, type, and any flags such as slurpy or flat) 
     10 * the return parameter information (ditto) 
     11 
     12The mutable portion should describe: 
     13 
     14 * the calling context (a reference to the caller's caller, a reference to the signature) 
     15 * the registers themselves 
     16 * the destination PC 
     17 * active exception handlers 
     18 * (likely) the storage for the callee's registers 
     19 
     20In effect, the mutable portion should represent enough information to serve as a continuation.  If this data structure supports cloning, we can even treat it simultaneously as a continuation and return continuation. 
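
One possible shape for this split, sketched in C under loud assumptions: `SigShape`, `CallFrame`, and the `frame_*` functions are hypothetical names, `long` stands in for real register storage, and exception handlers are left out to keep the sketch short. The point it tries to show is that cloning copies only the mutable half while both copies keep pointing at the same immutable signature.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Immutable half: built once, shared by every call with this shape. */
typedef struct {
    const char *arg_types;       /* one letter per argument, e.g. "PP" */
    const char *return_types;    /* one letter per return value */
} SigShape;

/* Mutable half: one per call in flight (all names hypothetical). */
typedef struct CallFrame {
    const SigShape   *shape;     /* shared, never written through */
    struct CallFrame *caller;    /* calling context */
    size_t            dest_pc;   /* where execution resumes */
    size_t            n_regs;
    long             *regs;      /* stand-in for real register storage */
} CallFrame;

CallFrame *frame_new(const SigShape *shape, CallFrame *caller, size_t n_regs) {
    CallFrame *f = malloc(sizeof *f);
    assert(f != NULL);
    f->shape   = shape;
    f->caller  = caller;
    f->dest_pc = 0;
    f->n_regs  = n_regs;
    f->regs    = n_regs > 0 ? calloc(n_regs, sizeof *f->regs) : NULL;
    assert(f->regs != NULL || n_regs == 0);
    return f;
}

/* Cloning copies the mutable state but shares the immutable shape. */
CallFrame *frame_clone(const CallFrame *src) {
    CallFrame *copy = frame_new(src->shape, src->caller, src->n_regs);
    copy->dest_pc = src->dest_pc;
    if (src->n_regs > 0)
        memcpy(copy->regs, src->regs, src->n_regs * sizeof *src->regs);
    return copy;
}

void frame_free(CallFrame *f) {
    free(f->regs);
    free(f);
}
```

After `frame_clone`, writes to the clone's registers leave the original frame untouched, which is the property a continuation taken from a live frame would need.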
     21 
     22== speculation == 
     23 
     24For additional fun, we could consider *avoiding* the copying of values between registers during PCC with a smarter register allocation strategy.  Assume the code: 
     25 
     26{{{ 
     27.sub f 
     28    .local pmc x, y, z 
     29    ... 
     30    y = g( x, z ) 
     31.end 
     32 
     33.sub g 
     34    .param pmc x 
     35    .param pmc y 
     36 
     37    .local pmc z 
     38    z = x + y 
     39    .return( z ) 
     40.end 
     41}}} 
     42 
     43If we mandate that all arguments of a specific type to an invocation must 
     44occupy successive registers, f() could desugar to: 
     45 
     46{{{ 
     47.sub f 
     48    P3 = g( P1, P2 ) 
     49.end 
     50}}} 
     51 
     52... and the register set passed in as parameters to g() could merely point to 
     53the appropriate place to *start* finding these linear registers.  In other words, instead of copying registers into the register set for g() and only then being able to use them, g() could operate on its caller's registers directly: 
     54 
     55{{{ 
     56.sub g 
     57    .alias pmc R1, P1 
     58    .alias pmc R2, P2 
     59    R3 = P1 + P2 
     60    .return() 
     61.end 
     62}}} 
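
The same idea can be sketched in C as a register window. This is only an illustration under stated assumptions: `RegWindow` and `window_over` are hypothetical names, `long` stands in for PMC registers, and the result is assumed to land in the register immediately after the arguments, as in the desugared f() above.

```c
#include <assert.h>
#include <stddef.h>

/* A callee's view of registers: a pointer into its caller's register file. */
typedef struct {
    long  *base;     /* first argument register in the caller's storage */
    size_t count;    /* registers visible through this window */
} RegWindow;

/* Build a window over caller_regs[first .. first + count - 1].
 * Nothing is copied; the window only records where to start. */
RegWindow window_over(long *caller_regs, size_t first, size_t count) {
    RegWindow w;
    w.base  = caller_regs + first;
    w.count = count;
    return w;
}

/* Stand-in for g(): read two arguments, write the result to the next slot. */
void g(RegWindow w) {
    assert(w.count >= 3);
    w.base[2] = w.base[0] + w.base[1];
}
```

With a caller register file holding P0 through P3, `g(window_over(regs, 1, 3))` reads P1 and P2 and writes the sum straight into P3, with no argument or return copying in between.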
     63 
     64It's not entirely clear how this would work in the case of complex return 
     65handling (such as slurpy/flat -- though named is fairly simple), but we can 
     66resolve this at compile time and avoid calculating at runtime the things we need to know. 
     67It's also not obvious how this would work with complex continuations.  We would 
     68also have to revise how we refer to registers in the caller context, but that's 
     69doable as well.