Version 2 (modified by octo, 8 years ago)

PIR Tutorial

(Currently being moved from the old Parrot wiki; this article is probably incomplete!)


Welcome to the beginners' tutorial for Parrot Intermediate Representation (PIR)! So, you want to program the newest and most exciting virtual machine for dynamic languages, eh? This is the place to start! Although Parrot is still under development, it can already solve a lot of your programming problems, and it will get even better as time progresses. Moreover, because Parrot is currently not compiled with optimizations, it will get faster too! Note that the syntax of Parrot's internal language is not set in stone and may change in places. However, if you stick to the syntax described in this tutorial, you should be quite safe.

If you're comfortable with Backus–Naur Form (BNF), a notation for the grammars of context-free languages, you may take a look at the grammar of PIR in languages/PIR. Note that this is not the official implementation; it is an attempt to be as close to it as possible. If you don't know what BNF is, you may forget about it :-).

Now, let's get started!

PIR Basics

Your first Parrot program

As always, we start with the simplest program imaginable:

  .sub main
    print "Hello Parrot!\n"
  .end

Save this program to a file called hello.pir. Then, to run this program, type this on the command line (assuming you successfully compiled Parrot):

  parrot hello.pir

And the output will be:

  Hello Parrot!

That was not too hard, now was it? Before we continue to more complex examples, let's first analyze what happened. The first line in the file is .sub main. This indicates that we're defining a subroutine that goes by the name main. Note that it is not necessary to name your subroutine like this, even if it's the only subroutine. Unlike in C, the name main does not indicate that execution will start at that subroutine. In PIR, execution starts at the top-most subroutine defined in the file, no matter what its name is. (There are ways to change this, but we will forget about that for now. More on that later.) As you can see on the third line, the subroutine is closed with the .end directive.
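
To see that the name of a subroutine does not decide where execution starts, consider the following small sketch. The subroutine names first and main are made up for this illustration, and it assumes a Parrot build that behaves as described above:

  .sub first
    # this is the top-most subroutine, so execution starts here
    print "first runs\n"
  .end

  .sub main
    # nothing calls this subroutine, so it is never executed
    print "main runs\n"
  .end

Running this file should print only the line from first: the program ends when the top-most subroutine reaches its .end, and main is never called.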

In between these subroutine directives, you can define the subroutine body, which consists of PIR or Parrot assembly (PASM) instructions. In this simple program we just stuck to the print instruction. It takes one parameter that can be of any type, as long as it is something (i.e. it is not undefined or null). Please note that all instructions should be in between a .sub/.end pair.
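
As a small illustration of print accepting different types, the sketch below prints an integer, a floating-point number and a string in turn. The values are arbitrary examples:

  .sub main
    print 42          # an integer constant
    print "\n"
    print 3.14        # a floating-point constant
    print "\n"
    print "done\n"    # a string constant
  .end

Note that print does not add a newline by itself, which is why the "\n" strings are printed separately.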

More instructions

Parrot has a lot of instructions. I mean, a lot. This tutorial will not discuss all of them; instead, we will introduce them as the need arises. First, we will see how to do some calculations, so you can do some useful stuff. We'll do it step by step and explain things as they come up. (Do note, however, that this is not a tutorial on assembly programming, so some knowledge of registers etc. is expected.)

Storing things

Before we continue, we need to explain some details on how Parrot stores numbers, strings and objects. As Parrot is a register-based virtual machine (as opposed to stack-based VMs like the Java VM), you store things in registers. There are four types of registers: for integers (I registers), floating-point numbers (N registers), strings (S registers) and objects (P registers). So, if we need to store some things, we could do it like this:

  I0  = 42              # store 42 in integer register 0
  N10 = 3.14            # store 3.14 in numeric register 10
  S20 = "Hello world!"  # store this string in string register 20
  P30 = new .String     # create a new String object in PMC register 30. See "Where to read further?" for links to more tutorials.

Above we used Parrot registers, and there's only a limited number of them. Instead, it's better to use temporary registers; they look almost the same as registers, but have a $ prefix. They can be considered as variables that don't need any declaration (and you can use as many of them as you need). Some examples:

  $I0 = 42
  $S9999999 = "Hi" # use *any* register number
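
Put together in a complete program, temporary registers can be used like this. This is only a sketch; the register numbers and values are arbitrary:

  .sub main
    $I0 = 6
    $I1 = 7
    $I2 = $I0 * $I1      # integer arithmetic on temporary registers
    $S0 = "6 * 7 = "     # a temporary string register
    print $S0
    print $I2
    print "\n"
  .end

This should print the line 6 * 7 = 42.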

However, if you prefer to give things meaningful names, you might consider using named temporary variables. Unlike temporary registers, these do need a declaration, which looks like this:

  .local int answer
  .local num PI, e
  answer = 42
  PI = 3.14
  e = 2.7

This declares some temporary variables. Although this declares an integer and some numeric variables, you could use any of the following types:

  • int - declare an integer variable
  • num - declare a floating-point number variable
  • string - declare a string variable
  • pmc - declare a Parrot Magic Cookie (PMC) variable
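
For instance, a string variable is declared in the same way as the int and num variables above. The sketch below uses the made-up variable names greeting and count:

  .sub main
    .local string greeting
    .local int count
    greeting = "Hello"
    count = 3
    print greeting
    print " x "
    print count
    print "\n"
  .end

This should print the line Hello x 3.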

You might wonder what the heck a Parrot Magic Cookie is. This is where Parrot's magic comes in. In fact, it's so magical that there's a separate document written about it. Have a look at the section "Where to read further?".

Now that we know how to store numbers and strings, let's do some operations on those values.

Calculating things

Calculating things is as trivial as you might expect. We'll give some full examples below, so you can copy+paste the code and run it yourself:

ABC formula

  .sub foo
    .local num a, b, c, det

    # give a, b and c some value for now; later specify them as parameters
    a = 2
    b = -3
    c = -2

    # calculate -b and b squared.
    $N0 = -b
    $N1 = b * b

    # calculate 4ac
    $N2 = 4 * a
    $N2 = $N2 * c
    # calculate 2a, the discriminant b^2 - 4ac, and its square root
    $N3 = 2 * a
    det = $N1 - $N2
    $N4 = sqrt det

    .local num x1, x2   
    x1 = $N0 + $N4
    x1 = x1 / $N3

    x2 = $N0 - $N4
    x2 /= $N3      # fancy way of saying x2 = x2 / $N3, but more efficient

    print "Answers to ABC formula are:\n"
    print "x1 = "
    print x1
    print "\nx2 = "
    print x2
    print "\n"
  .end

Of course, as Parrot offers operations at a more abstract level than hardware processors, you can also do fancier things such as manipulating strings, as in the example below:

  .sub joe
    .local string name
    name = " Joe!"
    $S0 = "Hi"
    $S1 = $S0 . name
    $S1 .= "\n"  # extend $S1 with "\n"
    print $S1
  .end

In PIR, the dot is shorthand for the concat operation. It takes two strings and concatenates them. Just like the compound assignment in the ABC formula example (x2 /= $N3), concatenation can be combined with assignment by using the .= operator.
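
The following sketch combines a few of these compound assignment operators in one program. The variable names total and line are made up for this example:

  .sub main
    .local int total
    .local string line
    total = 10
    total += 5       # total is now 15
    total *= 2       # total is now 30
    line = "total = "
    $S0 = total      # convert the integer to a string
    line .= $S0      # append it to the line
    line .= "\n"
    print line
  .end

This should print the line total = 30.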

As mentioned, Parrot has many instructions. This tutorial will not list all of them; instead, you can take a look at the list of ops by category.

Where to read further?

Take a look at these documents:

  • docs/glossary.pod - contains explanations of some often used terms
  • docs/art/pp001-intro.pod - a general introduction
  • docs/art/pp002-pmc.pod - a good introduction to PMCs
  • docs/art/pp003-oop.pod - an introduction to Object Oriented Programming in Parrot
  • docs/imcc/ - all files in this directory
  • docs/compiler_faq.pod - a document describing how to implement various language constructs in PIR
  • docs/pdds/pdd03_calling_conventions.pod - the Parrot Design Document on Parrot's calling conventions
  • docs/pdds/pdd20_lexical_vars.pod - the Parrot Design Document on Lexical variables
  • languages/PIR/docs/pirgrammar.pod - the grammar of PIR as implemented using PGE (matches about 90% of PIR)
  • compilers/pirc - A top-down recursive descent parser for PIR, with embedded specification
  • Parrot Docs - all kinds of files on particular subjects