|Version 23 (modified by allison, 5 years ago)|
The purpose of the PCC branch is to unify the internal implementation of the calling conventions into one code path. The current implementation of subroutine, method, and NCI invocation uses multiple different styles of passing and receiving arguments. A call to a subroutine or method from PIR saves a pointer to the location of the opcode where the arguments were passed, and later extracts the variables or constants from that opcode instruction. A call to a subroutine or method from C may either pass around a varargs list and extract the arguments from it later, or create a call state struct.
The new implementation standardizes argument passing for all calls (whether from PIR or C). Each call creates a call signature object right away, inserting arguments into the signature object from the opcode pointer or varargs list as soon as the call is made. Parameters are always extracted from a call signature object, and know nothing about the source of the call (they don't need to know what kind of environment invoked them, since all calls follow the same code path, and pass arguments the same way). NCI calls and what was previously known as PCCMETHOD calls also use the same argument passing strategy. Signature strings will be standardized on the PCC form (e.g. PI->S, for a call with a pmc and an integer argument, that returns a string result). The exception is NCI signatures, which will be left alone for now, but may be standardized later.
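To make the PCC signature form concrete, here is a minimal sketch in C that splits a string such as PI->S at the arrow and counts the type letters on each side. It is not Parrot's parse_signature_string: the names sig_counts, is_type_letter, and count_sig are hypothetical, only the four register type letters (P, S, I, N) are accepted, and treating a string without an arrow as arguments only is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of a PCC-style signature such as "PI->S". */
typedef struct {
    size_t n_args;     /* type letters before "->" */
    size_t n_returns;  /* type letters after "->"  */
} sig_counts;

/* Returns 1 for the register type letters: PMC, string, integer, number. */
int is_type_letter(char c)
{
    return c == 'P' || c == 'S' || c == 'I' || c == 'N';
}

/* Counts argument and return type letters in sig.
 * Returns 0 on success, -1 if sig contains anything unexpected.
 * Assumption of this sketch: a string with no "->" is all arguments. */
int count_sig(const char *sig, sig_counts *out)
{
    const char *arrow;
    const char *p;

    assert(sig != NULL && out != NULL);
    arrow = strstr(sig, "->");
    out->n_args = 0;
    out->n_returns = 0;

    for (p = sig; *p != '\0'; p++) {
        if (p == arrow) {
            p++;        /* step over '-'; the loop increment steps over '>' */
            continue;
        }
        if (!is_type_letter(*p))
            return -1;
        if (arrow != NULL && p > arrow)
            out->n_returns++;
        else
            out->n_args++;
    }
    return 0;
}
```

With this sketch, count_sig("PI->S", &c) reports two arguments and one return, matching the description above of a call with a PMC and an integer argument that returns a string.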
The following opcodes will be kept, but their implementation changed.
src/ops/core.ops:
- set_args
- get_params
- set_returns
- get_results
The following functions are deprecated and will be removed, as well as any functions only called by these functions.
src/call/pcc.c:
- parrot_pass_args
- parrot_pass_args_fromc
- Parrot_PCCINVOKE
- Parrot_init_ret_nci
- Parrot_init_arg_nci
- count_signature_elements (static)
- set_context_sig_returns_varargs (static)

src/call/ops.c:
- runops_args
- Parrot_run_meth_fromc
- Parrot_runops_fromc_args
- Parrot_runops_fromc_args_event
- Parrot_runops_fromc_args_reti
- Parrot_runops_fromc_args_retf
- Parrot_run_meth_fromc_args
- Parrot_run_meth_fromc_args_reti
- Parrot_run_meth_fromc_args_retf
- Parrot_runops_fromc_arglist
- Parrot_runops_fromc_arglist_reti
- Parrot_runops_fromc_arglist_retf
- Parrot_run_meth_fromc_arglist
- Parrot_run_meth_fromc_arglist_reti
- Parrot_run_meth_fromc_arglist_retf

src/extend.c (these functions are part of the defined public API, so will be modified to be backward compatible for now, and removed after 2.0):
- Parrot_call_sub
- Parrot_call_sub_ret_int
- Parrot_call_sub_ret_float
- Parrot_call_method
- Parrot_call_method_ret_int
- Parrot_call_method_ret_float

src/nci.c:
- set_nci_I
- set_nci_N
- set_nci_S
- set_nci_P
The following code generators have been updated to produce new-style argument retrieval, instead of old-style argument retrieval:
The following functions are added and act as replacements for the deprecated functions.
src/call/pcc.c:
- Parrot_pcc_invoke_sub_from_c_args
- Parrot_pcc_build_sig_object_from_op
- Parrot_pcc_build_sig_object_returns_from_op
- Parrot_pcc_build_sig_object_from_varargs (added in an earlier refactor, but for this purpose)
- Parrot_pcc_fill_params_from_op
- Parrot_pcc_fill_params_from_c_args
- Parrot_pcc_fill_returns_from_op
- Parrot_pcc_fill_returns_from_c_args
The following static functions are added, and support the new argument passing behavior:
src/call/pcc.c:
- dissect_aggregate_arg
- extract_named_arg_from_op
- parse_signature_string

src/call/callsignature.c:
- Parrot_pcc_get_call_sig_raw_args
- Parrot_pcc_get_call_sig_raw_returns
- Parrot_pcc_set_call_sig_raw_args
- Parrot_pcc_set_call_sig_raw_returns
The signature object is stored in the current_sig element of the caller's context. This means that argument storing and return value extraction happen on current_sig in CURRENT_CONTEXT(interp), while parameter extraction and return value storing must retrieve the caller's context and use its current_sig element. The reason for this distinction is that the subroutine's own current_sig element is overwritten by every call the subroutine itself makes, so it cannot hold the signature of the call that invoked it. (This will be simplified when contexts and call_signatures are collapsed into one.)
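The caller/callee distinction can be illustrated with a small C sketch. The structs and function names below (call_sig, context, sig_for_outgoing_call, sig_for_incoming_call) are hypothetical stand-ins, not Parrot's actual context or CallSignature definitions; the sketch only shows which context each side consults.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a call signature object (hypothetical). */
typedef struct call_sig {
    const char *sig;
} call_sig;

/* Stand-in for a context (hypothetical layout). */
typedef struct context {
    struct context *caller;   /* context that invoked this one */
    call_sig *current_sig;    /* signature of the call this context is making */
} context;

/* Caller side: arguments are stored into, and results read from,
 * the current context's own current_sig. */
call_sig *sig_for_outgoing_call(context *current)
{
    assert(current != NULL);
    return current->current_sig;
}

/* Callee side: parameters are read from, and return values stored into,
 * the caller's current_sig. Returns NULL if there is no caller. */
call_sig *sig_for_incoming_call(context *current)
{
    assert(current != NULL);
    return current->caller != NULL ? current->caller->current_sig : NULL;
}
```

In this model a subroutine that is itself in the middle of calling something else still finds its own parameters through its caller, because its local current_sig describes only the call it is making.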
See also the earlier wishlist/tasklist CallingConventionsTasklist which mentions some of the motivations and reasoning for these changes.
The Parrot_call_sub_* and Parrot_call_method_* variants in src/extend.c don't all have the necessary changes to allow them to work with the new calling conventions (these are temporary implementations for backward compatibility before they're removed at the next deprecation point). Parrot_call_sub does have the right changes, and the others can be largely copied from it.
NCI hasn't been fully updated on the caller and receiver sides to use the new argument passing style.
- Return argument processing (Parrot_pcc_fill_returns_from_op and Parrot_pcc_fill_returns_from_c_args) doesn't currently support :named, :slurpy, :flatten, :optional, :opt_flag etc. (t/compilers/imcc/syn/tail.t, t/pmc/coroutine.t)
- Parrot_pcc_fill_params_from_op and Parrot_pcc_fill_params_from_c_args are monolithic functions (basically finite state machines iterating over the arguments) that contain a great deal of nearly repeated code (one or two things different each time). Parrot_pcc_fill_returns_from_op and Parrot_pcc_fill_returns_from_c_args will be just as bad once they support all the options. Refactoring these into smaller subroutines is not a requirement before the merge, but it should be done. It needs to be thought through carefully, though: a similar plan led to the current mess.
- Flattening an argument doesn't alter the signature string stored, so multiple dispatch can't handle the resulting string. Need to modify the signature string while building the CallSignature object. (t/pmc/multisub.t and t/pmc/multidispatch.t)
- Edge cases on auto boxing/unboxing argument types, e.g. "Unable to set PMC value, the pointer is not a PMC" (t/oo/methods.t, t/oo/subclass.t, t/pmc/hashiteratorkey.t, t/pmc/object-meths.t, t/pmc/resizablestringarray.t, t/pmc/string.t)
- GC attempting to mark a bad variable (t/op/box.t). (Actually, IMCC creates a bad PackFile_Constants segment for box.t.)
- Insufficient checking on positional arguments passed inside named arguments (t/op/calling.t)
- Insufficient checking on missing named arguments (t/op/cc_state.t)
- Insufficient checking on too many arguments passed (t/pmc/exporter.t)
- Some tests need to be updated to match the current error messages (t/op/calling.t, t/op/cc_params.t)
- Segfault in set_returns, possibly a null call object? (t/op/gc.t, t/pmc/exceptionhandler.t)
- Probably a problem with argument handling (t/op/lexicals.t, t/pmc/class.t, t/pmc/codestring.t, t/pmc/object-meths.t, t/pmc/sub.t)
- Two NCI edge cases (t/pmc/nci.t)
- Likely cases of code still calling into the old functions for invoking from C, instead of the new functions (t/pmc/sub.t, t/pmc/threads.t)
- Likely bug in Parrot_call_sub reimplementation (t/src/embed.t)
- miniparrot fails to compile if an installed parrot is not already present, because it has -lparrot in the compile options
- Make sure the API and documentation consistently use "args" and "returns" to mean "the things passed into the call or return" and "params" and "results" to mean "the things extracted from the call or return".
- Currently, test_more.pir cannot be used on pcc_reapply. Attempting to use test_more.pir results in:

    FixedIntegerArray: index out of bounds!
    current instr.: 'parrot;Test;Builder;Test;_initialize' pc 44 (runtime/parrot/library/Test/Builder/Test.pir:46)
    called from Sub 'parrot;Test;Builder;_initialize' pc 0 (runtime/parrot/library/Test/Builder.pir:51)
    ... call repeated 1 times
Argument Processing Algorithms
This section documents two ways the argument processing algorithms (Parrot_pcc_fill_params_from_op and Parrot_pcc_fill_params_from_varargs) could work. The two approaches differ in whether we iterate over the list of parameters and pull arguments, or iterate over the list of arguments and push them into parameters. First, iterating over parameters (similar to what is done now):
Infinite loop
    get next positional arg
    get next positional parameter slot
    if parameter slot is NULL
        if positional arg is NULL
            # No more of either, done. Break
            break
        if error checking
            throw exception
    if positional arg is NULL
        if error checking
            throw exception
    if parameter slot is slurpy
        create new RPA PMC
        insert new RPA PMC into destination context parameter slot
    if we have a slurpy array pmc
        add positional arg to slurpy array pmc
    else
        Insert positional into destination context parameter slot
        if current parameter slot is optional
            set corresponding opt_flag TRUE

Infinite loop
    get next named arg name
    get next named arg value
    get next named parameter slot
    if arg name/value pair is NULL
        if parameter slot is NULL
            break # Even, we're done
        else if error checking
            throw exception
    if parameter slot is NULL
        throw exception
    if parameter slot is slurpy
        create new Hash PMC
        insert new Hash PMC into destination context parameter slot
    if we have a slurpy hash PMC
        add name/value pair to slurpy hash PMC
    else
        Insert value into destination context parameter slot
        if current parameter slot is optional
            set corresponding opt_flag to TRUE

Loop over all remaining opt_flag parameters
    set to FALSE
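The positional half of the parameter-driven approach can be sketched in C as follows. This is a simplified model, not the code in src/call/pcc.c: arguments are plain ints, the slurpy array is a fixed-size buffer standing in for an RPA PMC, and all names (param_slot, fill_positional, the PARAM_* and FILL_* constants) are hypothetical. It also folds the final "set remaining opt_flags to FALSE" step into the main loop, and always reports errors rather than making error checking optional; both are choices made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum { PARAM_PLAIN = 0, PARAM_OPTIONAL = 1, PARAM_SLURPY = 2 };
enum { FILL_OK = 0, FILL_TOO_MANY = -1, FILL_TOO_FEW = -2 };

#define MAX_SLURPY 16

/* Hypothetical parameter slot; ints stand in for register values. */
typedef struct {
    int flags;                /* one of PARAM_* */
    int value;                /* filled for plain and optional slots */
    int opt_flag;             /* 1 if an optional slot received an argument */
    int slurpy[MAX_SLURPY];   /* stand-in for the slurpy array PMC */
    size_t n_slurpy;          /* number of arguments collected by a slurpy slot */
} param_slot;

/* Iterate over the parameter slots and pull positional arguments into them. */
int fill_positional(param_slot *params, size_t n_params,
                    const int *args, size_t n_args)
{
    size_t pi = 0;  /* index of the next parameter slot */
    size_t ai = 0;  /* index of the next argument */

    assert(params != NULL || n_params == 0);
    assert(args != NULL || n_args == 0);

    while (pi < n_params) {
        param_slot *slot = &params[pi++];

        if (slot->flags == PARAM_SLURPY) {
            /* A slurpy slot collects every remaining positional argument. */
            slot->n_slurpy = 0;
            while (ai < n_args) {
                if (slot->n_slurpy == MAX_SLURPY)
                    return FILL_TOO_MANY;
                slot->slurpy[slot->n_slurpy++] = args[ai++];
            }
            continue;
        }

        if (ai < n_args) {
            slot->value = args[ai++];
            if (slot->flags == PARAM_OPTIONAL)
                slot->opt_flag = 1;
        }
        else if (slot->flags == PARAM_OPTIONAL) {
            slot->opt_flag = 0;   /* optional slot left unfilled */
        }
        else {
            return FILL_TOO_FEW;  /* required slot with no argument */
        }
    }

    /* Arguments left over with no slot to receive them. */
    return ai < n_args ? FILL_TOO_MANY : FILL_OK;
}
```

For example, with slots declared as plain, optional, slurpy and arguments 1, 2, 3, 4, the first two slots receive 1 and 2, the optional slot's opt_flag is set, and the slurpy buffer collects 3 and 4.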
Next, iterating over arguments. This requires iterator functionality installed into CallSignature, which does not currently exist:
Infinite loop
    get next positional arg
    get next parameter flag
    if no parameter flag
        if no positional arg
            # No more of either, done.
            end loop
        if error checking
            throw exception
    if no positional arg
        set all remaining positional opt_flags to FALSE
        if error checking
            throw exception
    if parameter flag is slurpy
        create new RPA PMC
        insert new RPA PMC into destination context parameter slot
        add positional arg to slurpy array pmc
        add all remaining positional args to slurpy array pmc
        end loop
    if parameter flag is named
        insert positional arg into next param slot
        mark name as used
    if parameter flag is lookahead
        if have named arg
            insert named arg into next param slot
            mark name as used
            fetch next param flag for positional
    else
        insert positional arg into next param slot
    Insert positional arg into destination parameter slot
    if current parameter slot is optional
        set corresponding opt_flag TRUE

Infinite loop
    get next named arg name
    get next named arg value
    get next parameter flag
    if no parameter flag
        if no arg name/value pair
            end loop # Even, we're done
        if error checking
            throw exception
    if no arg name/value pair
        set all remaining named opt_flags to FALSE
        if error checking
            throw exception
    if parameter flag is slurpy
        create new Hash PMC
        insert new Hash PMC into destination context parameter slot
        add name/value pair to slurpy hash PMC
        add all remaining name/value pairs to slurpy hash PMC
    Insert value into destination context parameter slot
    if current parameter slot is optional
        set corresponding opt_flag to TRUE
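The argument-driven approach depends on being able to step through the arguments one at a time and to drain whatever remains into a slurpy. A minimal sketch of such an iterator in C is shown below. Nothing like this exists in CallSignature according to the text above, so every name here (arg_iter, arg_iter_init, arg_iter_next, arg_iter_drain) is hypothetical, and ints again stand in for argument values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over the positional arguments of a call. */
typedef struct {
    const int *args;  /* argument values (ints as stand-ins) */
    size_t count;     /* total number of arguments */
    size_t pos;       /* index of the next argument to hand out */
} arg_iter;

void arg_iter_init(arg_iter *it, const int *args, size_t count)
{
    assert(it != NULL);
    it->args = args;
    it->count = count;
    it->pos = 0;
}

/* "get next positional arg": stores it in *out and returns 1,
 * or returns 0 when the arguments are exhausted. */
int arg_iter_next(arg_iter *it, int *out)
{
    assert(it != NULL && out != NULL);
    if (it->pos >= it->count)
        return 0;
    *out = it->args[it->pos++];
    return 1;
}

/* "add all remaining positional args to slurpy array": copies up to cap
 * remaining arguments into dest and returns how many were copied. */
size_t arg_iter_drain(arg_iter *it, int *dest, size_t cap)
{
    size_t n = 0;

    assert(it != NULL && dest != NULL);
    while (n < cap && it->pos < it->count)
        dest[n++] = it->args[it->pos++];
    return n;
}
```

With an interface of this shape, the slurpy branch of the loop reduces to a single drain call followed by ending the loop, instead of tracking a separate "we have a slurpy array" state across iterations.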
Pseudocode for return/result handling, parallel to arg/param handling:
Infinite loop
    Get next result slot
    Get next return value
    If there are no more returns
        if there are no more results
            we're done, no error
        else
            error, too many positional returns
    If result slot is slurpy
        if result slot is named
            end positional loop (handle named slurpy in named loop)
        create a new array PMC
        Loop over all remaining positional returns
            Insert positional into slurpy array
        put the slurpy array in result slot
        break loop (all positionals used up)
    if have more results to fill
        if result is named
            fill result slot with positional return
            mark name as used
        else
            if we have already seen a named result
                error, positionals must be before named
            add return to next result slot
        if result slot is optional
            set the corresponding opt_flag to 1
    else if result is optional
        if result is named
            break loop (handle named in named loop)
        set the corresponding opt_flag to 0
    else
        if result is named
            break loop (handle named in named loop)
        error (too few positional returns)

Loop over remaining returns
    if return is not named
        error (all remaining returns should be named)
    store named return in temporary named returns hash

Infinite loop
    get next result slot
    if result slot is not named
        error (positionals must come before named)
    if result slot is slurpy
        if temporary named returns hash is not null
            put temporary named returns hash in result slot
        else
            put new empty hash in result slot
    get result slot name
    if the name has an element in temp named returns hash
        get next result slot
        put return value in result slot
        delete the entry from the temp named returns hash
        if result is optional
            set the corresponding opt_flag to 1
    else if the result is optional
        set the corresponding opt_flag to 0
    else if error checking is on
        error (too few named returns)

if there are any items left in the temporary named returns hash
    loop over temporary named returns hash
        collect key names for error message
    error (too many named returns)
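The named half of the return handling (collect named returns, match them to named result slots by name, then complain about leftovers) can be sketched in C as below. This is an illustrative model only: a small array with a used flag stands in for the temporary named returns hash, slurpy named results are left out, errors are always reported, and all names (named_return, named_result, fill_named_results, the NAMED_* constants) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NAMED_OK = 0, NAMED_TOO_FEW = -1, NAMED_TOO_MANY = -2 };

/* One entry of the temporary named returns table (hypothetical). */
typedef struct {
    const char *name;
    int value;
    int used;       /* set to 1 once a result slot has consumed the entry */
} named_return;

/* One named result slot on the caller's side (hypothetical). */
typedef struct {
    const char *name;
    int optional;   /* 1 if the slot may be left unfilled */
    int value;
    int opt_flag;   /* 1 if an optional slot was filled, 0 otherwise */
} named_result;

/* Match named returns to named result slots by name. */
int fill_named_results(named_result *results, size_t n_results,
                       named_return *returns, size_t n_returns)
{
    size_t r, i;

    assert(results != NULL || n_results == 0);
    assert(returns != NULL || n_returns == 0);

    for (i = 0; i < n_returns; i++)
        returns[i].used = 0;

    for (r = 0; r < n_results; r++) {
        named_result *slot = &results[r];
        named_return *match = NULL;

        /* Look the slot's name up among the not-yet-used returns. */
        for (i = 0; i < n_returns; i++) {
            if (!returns[i].used && strcmp(returns[i].name, slot->name) == 0) {
                match = &returns[i];
                break;
            }
        }

        if (match != NULL) {
            slot->value = match->value;
            match->used = 1;   /* plays the role of deleting the hash entry */
            if (slot->optional)
                slot->opt_flag = 1;
        }
        else if (slot->optional) {
            slot->opt_flag = 0;
        }
        else {
            return NAMED_TOO_FEW;
        }
    }

    /* Any return that no slot consumed is an error. */
    for (i = 0; i < n_returns; i++) {
        if (!returns[i].used)
            return NAMED_TOO_MANY;
    }
    return NAMED_OK;
}
```

A real implementation would also gather the leftover key names for the error message, as the last step of the pseudocode describes; the sketch only reports that leftovers exist.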