|Version 21 (modified by jkeenan, 5 years ago)|
Lorito Design Questions
Lorito is still a somewhat nebulous concept: people seem to be certain about only a few specific details, and those details are not the same between any two people. This is a list of questions we need to consider when designing, planning and implementing Lorito. Feel free to add more. When adding answers, please ensure that there is some agreement among Parrot developers about the answer.
How should I think about Lorito to understand its purpose and to answer these questions?
At its lowest level, think of Lorito as a small set of primitives which could form the basis of a virtual machine to run code written in C without having to follow such features of C as its calling conventions and memory management.
Put another way, what if we had a small and fast virtual machine with a handful of simple ops and native garbage collection? What if we could build the rest of Parrot in a language which supports CPS, multiple dispatch, and PCC natively?
What are the main goals that Lorito should accomplish?
We want Lorito to have the same capabilities as C in a form that is easy to parse, instrument, transform, analyze, etc. Much of the slowness in the current (circa 2.6.0) pre-Lorito Parrot comes from the impedance mismatch where C code and PIR code need to interact (inferior runloops, for instance). By creating Lorito and using it to implement code that is currently written in C, we can eliminate that bottleneck and make Parrot more amenable to further optimization and analysis.
How much of Parrot's current core do we want to eventually rewrite in Lorito?
Ideally, everything. Some very low-level code such as GC's stack walking may need to be partially implemented in C or assembly, but the endgame is that almost all of Parrot is implemented in an HLL that compiles down to Lorito ops.
Will PIR compile down to Lorito assembly or will it be a superset of Lorito assembly?
We expect that current PIR code will continue to work unchanged on a Lorito-based Parrot.
How many layers below PIR will exist in Lorito?
PIR is currently syntactic sugar and parser magic layered over Parrot ops, which themselves are written in C. Lorito will replace C. Thus, PIR can remain unchanged at least at the architecture level.
Will Lorito have the option of compiling to C?
Yes. We will support transforming Lorito ops into equivalent snippets of C code as an alternative to the function-based runcore.
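As a rough illustration of what "transforming Lorito ops into equivalent snippets of C code" could look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the op names (add, sub, goto, ret), the `reg` array, the label scheme, and the `run` function name are hypothetical and not part of any agreed Lorito design.

```python
# Hypothetical op-to-C templates. {0}, {1}, {2} are the op's three operands.
# None of these op names or the `reg` array come from a Lorito specification.
C_TEMPLATES = {
    "add": "reg[{0}] = reg[{1}] + reg[{2}];",
    "sub": "reg[{0}] = reg[{1}] - reg[{2}];",
    "goto": "goto L{0};",
    "ret": "return;",
}

def op_to_c(index, name, a, b, c):
    """Emit one labelled C statement for the op at position `index`."""
    body = C_TEMPLATES[name].format(a, b, c)  # unused operands are ignored
    return f"L{index}: {body}"

def program_to_c(program):
    """Emit a C function whose body is one statement per op."""
    lines = ["void run(long *reg) {"]
    for index, (name, a, b, c) in enumerate(program):
        lines.append("    " + op_to_c(index, name, a, b, c))
    lines.append("}")
    return "\n".join(lines)

program = [("add", 0, 1, 2), ("sub", 3, 0, 1), ("ret", 0, 0, 0)]
c_source = program_to_c(program)
print(c_source)
```

Giving every op its own label keeps a goto op trivial to translate, at the cost of some unused labels in the generated C.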
Will Lorito use a different bytecode format from the existing PBC?
This hasn't been decided yet.
If there is a distinct LBC (Lorito bytecode) format, will instructions be fixed-length?
Yes. All Lorito ops will take three arguments.
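To make the consequence of fixed-length, three-argument ops concrete, here is a minimal Python sketch of one possible encoding. The layout (a 32-bit opcode followed by three 32-bit operands, little-endian), the opcode numbers, and the op names are all assumptions made for illustration; the real format, if any, has not been decided.

```python
import struct

# Hypothetical layout: opcode word plus three operand words,
# each a little-endian unsigned 32-bit integer.
OP_FORMAT = "<4I"
OP_SIZE = struct.calcsize(OP_FORMAT)  # bytes per instruction

# Made-up opcode numbers, for illustration only.
OPCODES = {"noop": 0, "add": 1, "sub": 2, "goto": 3}
NAMES = {number: name for name, number in OPCODES.items()}

def encode(program):
    """Pack a list of (name, a, b, c) tuples into fixed-length bytecode."""
    return b"".join(
        struct.pack(OP_FORMAT, OPCODES[name], a, b, c)
        for name, a, b, c in program
    )

def decode_at(bytecode, index):
    """Decode instruction number `index` without scanning earlier ones."""
    opcode, a, b, c = struct.unpack_from(OP_FORMAT, bytecode, index * OP_SIZE)
    return (NAMES[opcode], a, b, c)

program = [("add", 0, 1, 2), ("sub", 3, 0, 1), ("goto", 0, 0, 0)]
bytecode = encode(program)
```

Because every instruction has the same size, the byte offset of instruction `n` is simply `n * OP_SIZE`, so a jump target can be an instruction index. In this sketch, ops that need fewer than three operands pad the unused slots with zero.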
Will Lorito have the same calling conventions as Parrot currently does?
Yes, at least from the PIR/PCC level. One of the goals of Lorito is to replace the C/PCC boundary, such that we can use PCC throughout the system.
If Lorito's calling conventions differ from Parrot's current ones, what will they be?
PIR that works with Parrot now is expected to run without modification on a Lorito-based Parrot. This includes calling conventions.
Will Lorito have a stack?
Will Lorito distinguish between data types at the lowest level?
Will there be separate storage for different types of data at the lowest level?
Will Lorito have an object model built-in?
Will Lorito still have the same core object model as Parrot?
Parrot's core object model will be implemented on top of Lorito.
Will Lorito have a single op that does method dispatch at the lowest level or will it be simulated using a series of ops?
MMD will be implemented on top of Lorito.
Will Lorito have some declarative syntax at the lowest level for creating classes/types?
Will there be a declarative syntax at some level below HLLs for creating classes/types?
Yes, though this is not a function of Lorito.
Will PMCs and Objects be merged?
Will Strings and PMCs/Objects be merged?
This idea has been mentioned but we haven't reached a decision.
What requirements will Lorito impose on the memory layout of objects?
Unknown; likely we can escape the tyranny of the C memory model (though we should keep in mind things like struct layout and padding where it matters on various architectures).
Will objects have a static vtable in addition to method dispatch?
VTABLE functions and methods will be unified post-Lorito. The Parrot development community hasn't decided what this mechanism will be, but it is likely that the internal representation of vtable functions will change from its current form.
How should method dispatch work?
Not a question for Lorito.
Should method dispatch be tied to classes, to objects, to some vtable/prototype object associated with each object?
Yes, unless the system uses pervasive multidispatch at the lowest level. This needs more discussion, especially with regard to the needs of HLLs.
Should method dispatch use strings or symbols?
Symbols are likely but this design decision hasn't been widely discussed yet.
How will Lorito support native types?
(Unknown; which native types?)
How will Lorito support calling C functions in existing libraries?
Lorito primitives at the lowest level will support building platform-specific call frames and calling C functions; this will be the basis of the new FFI/NCI.
How will Lorito support advanced control flow constructs such as coroutines, continuations, exceptions and CPS in general?
These will all be implemented on top of Lorito in terms of goto. Internally, Lorito will use CPS.
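One way to picture control flow built "in terms of goto" is a call that passes its return address explicitly, which is continuation passing in its most minimal form. The Python sketch below is only an illustration under stated assumptions: the op names (set, add, goto, goto_reg, halt), the register file, and the convention of keeping the continuation in register 7 are all hypothetical, not Lorito's design.

```python
def run(program, registers):
    """Interpret a tiny three-operand program until it reaches halt."""
    pc = 0
    while True:
        name, a, b, c = program[pc]
        if name == "set":         # registers[a] = literal b
            registers[a] = b
            pc += 1
        elif name == "add":       # registers[a] = registers[b] + registers[c]
            registers[a] = registers[b] + registers[c]
            pc += 1
        elif name == "goto":      # jump to the literal address a
            pc = a
        elif name == "goto_reg":  # jump to the address held in register a
            pc = registers[a]
        elif name == "halt":
            return registers
        else:
            raise ValueError(f"unknown op: {name}")

program = [
    ("set", 0, 20, 0),      # 0: first argument
    ("set", 1, 22, 0),      # 1: second argument
    ("set", 7, 4, 0),       # 2: continuation: resume at address 4
    ("goto", 5, 0, 0),      # 3: "call" the routine at address 5
    ("halt", 0, 0, 0),      # 4: the continuation lands here
    ("add", 2, 0, 1),       # 5: routine body: r2 = r0 + r1
    ("goto_reg", 7, 0, 0),  # 6: "return" by jumping to the continuation
]
result = run(program, [0] * 8)
```

There is no call stack here: the caller decides where execution resumes, and the routine honours that by jumping to whatever address it was handed. Exceptions or coroutines could in principle be modelled by passing more than one such address, though that goes beyond what this sketch shows.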
What kind(s) of memory access will Lorito support? How will memory management (automatic and manual) work?
Lorito will support direct memory access with its lowest level of primitives.
Will Lorito allow direct memory access/pointer arithmetic?
Will Lorito allow manual memory allocation/deallocation?
Lorito will allow manual memory management through the same interface used to access other C-level functions.
Will Lorito require us to port existing code from C to Lorito?
Every time we port code from C to Lorito, we get more benefits -- not just from the cleanliness of reimplementation and rethinking, but from the performance improvements of not crossing the C/Parrot boundary. Even so, Lorito *must* allow the calling of C functions from its ops, so we can migrate existing Parrot to Lorito ops gradually.
This means that cleaning up existing C code is very valuable.
How will addressing work? Will there be a different instruction for local goto vs far-away goto?*
How will CPS work if all we have is goto?*
How will security work?*
What is Lorito's concept of memory vs registers? There seems to be an assumption there, but it's not documented.*
What will Lorito have that will give us flexibility equivalent to C's pointers? More concretely, what will the data structure for RPA look like?*
What bytecode segments will Lorito-based bytecode have and how will they differ from the current PBC?**
How can we fix freeze/thaw so that rakudo can use it to decrease its startup time?**
How tightly-defined are PMCs in Lorito, i.e. are they more like structs or chunks of memory? Will there be a way for Lorito to generically understand PMC guts, especially references to other PMCs?**
How will composed ops work, both on a VM level and in bytecode?**
How will Lorito code be translated for stack VMs?**
The object metamodel is important, especially since Object and PMC will be covered by the same concept. If we're not using P&W, we need to nail down what we are going to use.**
* - These questions are important to figure out before we dig in to a final prototype.
** - These questions we can work on as we work on the final prototype.