Version 8 (modified by jhorwitz, 13 years ago)

--

mod_parrot HLL Module Developer's Guide

Overview

This is the mod_parrot HLL module developer's guide. The target audience is developers wishing to embed their language in Apache using mod_parrot. The benefits of this are one-time compilation of scripts, a persistent execution environment, direct access to the Apache API, and the ability to write custom hooks in the embedded language. Some languages can be self-hosted, meaning the code to implement mod_foo is written in the "foo" language.

Most examples are taken from the PIR HLL module, with several from mod_perl6 to illustrate self-hosting.

Prerequisites

Any language targeted to Parrot can use mod_parrot to execute code in a persistent environment. However, to best take advantage of mod_parrot's features, including self-hosting, languages should support the following:

  • namespaces
  • lexical and global variables
  • a Parrot-compatible object model

Using mod_parrot without these features is still possible with some PIR scaffolding.

Bootstrapping

Each HLL in mod_parrot is contained in its own Apache module, known as an HLL module. The steps leading up to and including the registration of the HLL module with Apache are collectively called bootstrapping. The bootstrapping process typically follows this procedure:

  1. Load the HLL compiler.
  2. Declare server and directory configuration hooks.
  3. Declare Apache directive hooks.
  4. Declare Apache directives.
  5. Declare metahandlers (hooks for Apache phases).
  6. Register the Apache module.

As of mod_parrot 0.5, step 1 must be written in (or compiled to) PIR, but all subsequent steps can be written in the HLL itself (a self-hosting HLL module). The PIR bootstrap file MUST be compiled to bytecode and located in ModParrot/HLL/hllname.pbc in Parrot's library path. HLL code can be located anywhere, though conventions will eventually be defined. If there is HLL bootstrap code, it must be loaded and executed from PIR in the bootstrap file.

Bootstrap code must be placed in a PIR subroutine marked with the :load adverb so it is run when the file is loaded. This subroutine can be named or anonymous (using the :anon adverb).

Example: mod_perl6

The first part of the bootstrap file from mod_perl6 loads the compiler and supporting libraries, then executes Perl 6 code from mod_perl6.pm:

.sub __onload :anon :load
    load_bytecode 'languages/perl6/perl6.pbc'
    load_bytecode 'ModParrot/Apache/Module.pbc'
    load_bytecode 'ModParrot/Constants.pbc'

    # load mod_perl6.pm, which may be precompiled
    $P0 = compreg 'Perl6'
    $P1 = $P0.'compile'('use mod_perl6')
    $P1()

    ...

Configuration

Each HLL module provides two configuration data structures to Apache: server and directory. Server configurations are specific to the main server and to individual virtual hosts. Directory configurations are specific to individual configuration sections, which can be real directories or locations defined in the Apache configuration file. All configuration structures can be merged with parent configurations to implement inheritance or overriding behavior.

Creating HLL Configurations

Each Apache module is responsible for defining and creating its own configuration data structures. When Apache asks an HLL module for a server or directory configuration, mod_parrot will look for a "constructor" subroutine in the ModParrot;HLL;hllname namespace to execute. Server configs are provided by server_create, while directory configs are provided by dir_create. These subroutines should create a data structure, possibly populated with default values, and return it. The type of the structure is up to the implementor, as long as it is a valid Parrot PMC.

Signatures

  • PMC server_create()
  • PMC dir_create()

Example: the PIR configuration constructors

.namespace [ 'ModParrot'; 'HLL'; 'PIR' ]

.sub server_create
     $P0 = new 'Hash'
     .return($P0)
.end

.sub dir_create
     $P0 = new 'Hash'
     .return($P0)
.end

Merging HLL Configurations

Like the constructor subroutines above, HLL modules can provide subroutines to merge two configurations. This is useful when, for example, a particular configuration setting for a directory should be overridden by a different setting in a subdirectory. Or perhaps your HLL is maintaining an array of values for a virtual host that should be concatenated with the array defined in the main server. All of this behavior is performed by the merge subroutines.

The server merge subroutine is called server_merge, and the directory merge subroutine is called dir_merge. Each is passed the "base" configuration and the "new" configuration, and is expected to return a merged configuration. These subroutines should create a new PMC for the merged configuration rather than reusing the PMC from the "new" configuration. Because Parrot passes PMCs by reference, modifying the "new" PMC in place would change the original configuration directly, resulting in unexpected behavior.

Signatures

  • PMC server_merge(PMC basecfg, PMC newcfg)
  • PMC dir_merge(PMC basecfg, PMC newcfg)

Example: A self-hosted server merge from mod_perl6

sub server_merge(%base, %new)
{
    my %merged;

    # merge handlers -- never inherit
    for @server_phases.map({$_ ~ '_handler'}) -> $h {
        %merged{$h} = %new{$h};
    }

    return %merged;
}
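
For a module written in PIR, a directory merge could be sketched along the following lines. This is an illustrative sketch, not code taken from the PIR HLL module: it assumes both configurations are Hash PMCs (as returned by the PIR constructors above), that settings in the "new" configuration should override those in the "base" configuration, and that the running Parrot provides the iter opcode. The helper copy_settings is a hypothetical name introduced for this example.

```pir
.namespace [ 'ModParrot'; 'HLL'; 'PIR' ]

.sub dir_merge
    .param pmc basecfg
    .param pmc newcfg

    # Build a fresh Hash so that neither input PMC is modified.
    .local pmc merged
    merged = new 'Hash'

    # Copy the base settings first, then let the new settings override them.
    copy_settings(merged, basecfg)
    copy_settings(merged, newcfg)

    .return(merged)
.end

# Hypothetical helper: copy every key/value pair from src into dest.
.sub copy_settings
    .param pmc dest
    .param pmc src

    .local pmc it
    .local string key
    it = iter src
  next_key:
    unless it goto done
    key = shift it
    $P0 = src[key]
    dest[key] = $P0
    goto next_key
  done:
    .return()
.end
```

The copy is shallow: the merged Hash is a new PMC, but each value it holds is still the same PMC referenced by the original configuration. A module that stores mutable values such as arrays would need to clone them before modifying them.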

Custom Apache Directives

The server and directory configurations would be fairly useless without support for adding custom Apache directives for an HLL module. Here we will learn how to define a custom directive. Registration of the directive occurs when you add the module to Apache.

A directive can be defined as a Parrot Hash, or any HLL type that implements a keyed-by-string interface. There are five keys for which you need to provide values:

  • name: the name of the directive as specified in the Apache configuration file
  • args_how: a constant defining how arguments are processed
  • func: a reference to a callback subroutine that will process the arguments
  • req_override: a constant defining where in the Apache configuration file the directive can be used
  • errmsg: a message displayed when the directive is misused
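
As a rough sketch, a directive definition built in PIR might look like the code below. The directive name, the callback, and both numeric constants are hypothetical placeholders introduced for illustration. This guide does not describe how mod_parrot exposes the args_how and req_override constants or what arguments it passes to the callback, so the callback simply collects whatever it receives.

```pir
.namespace [ 'ModParrot'; 'HLL'; 'PIR' ]

# Hypothetical callback. The real callback signature is not described
# here, so all arguments are collected into a slurpy array.
.sub example_directive_cb
    .param pmc args :slurpy
    .return()
.end

# Hypothetical helper that builds one directive definition.
.sub example_directive
    # Placeholder values only: replace these with the args_how and
    # req_override constants that mod_parrot provides.
    .const int ARGS_HOW_PLACEHOLDER = 0
    .const int REQ_OVERRIDE_PLACEHOLDER = 0

    .local pmc callback, directive
    callback = get_global 'example_directive_cb'
    directive = new 'Hash'

    # The five keys a directive definition needs.
    directive['name'] = 'ExampleDirective'
    directive['args_how'] = ARGS_HOW_PLACEHOLDER
    directive['func'] = callback
    directive['req_override'] = REQ_OVERRIDE_PLACEHOLDER
    directive['errmsg'] = 'usage: ExampleDirective value'

    .return(directive)
.end
```

Any HLL type with a keyed-by-string interface could stand in for the Hash, which is what lets a self-hosted module define its directives in the HLL itself.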

Values for args_how

  • NO_ARGS: no arguments
  • TAKE1: one argument
  • TAKE2: two arguments
  • TAKE3: three arguments
  • TAKE12: one or two arguments
  • TAKE23: two or three arguments
  • TAKE123: one, two, or three arguments
  • ITERATE: a list of arguments passed to the callback one at a time
  • ITERATE2: one argument followed by a list of arguments passed to the callback one at a time, along with the first argument
  • FLAG: a single On or Off argument, passed to the callback as 0 (off) or 1 (on)
  • RAW_ARGS: no parsing, passes the entire configuration line to the callback

Values for req_override

  • ACCESS_CONF: can be used in directory sections, but not in .htaccess files
  • OR_NONE: cannot be overridden by AllowOverride
  • OR_ALL: can be used anywhere in the configuration
  • OR_AUTHCFG: can be used inside directory sections (and .htaccess with the AuthConfig override)
  • OR_FILEINFO: can be used anywhere (and .htaccess with the FileInfo override)
  • OR_INDEXES: can be used anywhere (and .htaccess with the Indexes override)
  • OR_OPTIONS: can be used anywhere (and .htaccess with the Options override)
  • OR_LIMIT: can be used in directory sections (and .htaccess with the Limit override)
  • RSRC_CONF: can be used outside of a directory section (not allowed in .htaccess)
  • OR_UNSET: not yet implemented
  • EXEC_ON_READ: not yet implemented

Accessing mod_parrot's Configuration

Metahandlers

Naming Conventions

Registering Apache Hooks

The Context Object

Capturing Output

Registering the HLL Apache Module

Miscellany

Persistence

Sharing an Interpreter with Other Languages