One of the sub-projects of Pugs is a series of Perl 6 sanity tests which define a minimal set of useful Perl 6 features. The idea behind those tests is that a Perl 6 implementation which can pass the sanity tests supports enough features so that it’s possible to bootstrap the rest of Perl 6 in that minimal implementation.
The Parrot project recently borrowed those sanity tests for the Perl 6 on Parrot implementation. (I work on Parrot in part because I believe that Parrot’s compiler tools are much more suitable for building compilers and languages than anything else I’ve ever used.)
Though I spend more of my Parrot time these days applying submitted patches, fixing bugs, and refactoring code, I try to make time for new development. I heard that we almost had all of the first suite of sanity tests passing and decided to see if I could improve the situation.
Most of the 19 tests passed. The most apparent failure was in t/01-sanity/08-test.t, which loads Test.pm and calls two functions exported from that module:
#!/usr/bin/pugs
# Checking that testing is sane: Test.pm
use v6;
use Test;
plan 1;
my $x = '0';
ok $x == $x;
Within languages/perl6, run make and then run
../../parrot perl6.pbc t/01-sanity/08-test.t. As of Parrot
r18892, you’d see errors about invoking null subroutines, because Perl 6
didn’t actually load Test.pm.
Inside the Perl 6 Compiler
Though perl6.pbc looks like a single monolithic chunk of
Parrot code, it’s actually a combination of several different stages of
compilation. The first is a PGE-based grammar (built on Perl 6 Synopsis
05). The rest are successive refinements of the Perl 6 program in
different tree forms, from the parse tree generated from PGE to actual code
in Parrot’s native language. A Parrot class named HLLCompiler
performs most of the work to read in source code and shuttle the trees
through the transformations; the Perl 6 compiler subclasses this class to
add some very basic additional behavior.
Though multiple stages add conceptual complexity to understanding the
compiler when you start, they often add clarity and simplicity in practice
when adding features. For example, I only had to modify the first two
stages of compilation (out of four) to support use and
require.
Parsing Support
I was fortunate, though, that the other files in the sanity tests all had the
line use v6;. The grammar (in
src/parser/grammar_rules.pg) already had a token defined to handle
use statements:
token use_statement {
    <?ws> use <?ws> <version> <?ws> <expression: ;>?
}
token version { <ident> }
In brief, a use statement starts with optional whitespace, contains the
literal term use, has optional whitespace, contains something
matching the version rule, has optional whitespace, and
optionally contains an expression followed by a semicolon. A
version is just a valid identifier.
This token isn’t quite correct yet, particularly
because version is semantically the wrong name at this point
in the grammar, but it’s close enough to add the feature, and it will be
the target of refactoring when improving use support.
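As a rough illustration of what the token accepts, here is a Python regular expression that approximates it. This is only an analogue, not PGE: the pattern, the name USE_STATEMENT, and the helper parse_use are hypothetical, and the expression part is simplified to "anything up to a semicolon".

```python
import re

# Approximation of the use_statement token (an assumption, not the real grammar):
# optional whitespace, the literal "use", optional whitespace, an identifier
# captured as "version", optional whitespace, then an optional expression
# terminated by a semicolon.
USE_STATEMENT = re.compile(
    r"\s*use\s*(?P<version>[A-Za-z_]\w*)\s*(?:(?P<expression>[^;]*);)?"
)


def parse_use(line):
    """Return the identifier captured as 'version', or None if there is no match."""
    match = USE_STATEMENT.match(line)
    return match.group("version") if match else None


if __name__ == "__main__":
    for line in ("use v6;", "use Test;", "plan 1;"):
        print(repr(line), "->", parse_use(line))
```

Because the whitespace is optional everywhere, the sketch shares the looseness of the original token, which is one more reason to expect refactoring later.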
All I needed to know was that the parser actually matched use
v6; and use Test; appropriately here:
$ parrot perl6.pir --target=PARSE t/01-sanity/08-test.t
...
<statement> => ResizablePMCArray (size:5) [
PMC 'Perl6::Grammar' => "#!/usr/bin/pugs... " @ 0 {
<use_statement> => PMC 'Perl6::Grammar' => "#!/usr/bin/pugs..." @ 0 {
<version> => PMC 'Perl6::Grammar' => "v6" @ 63 {
<ident> => PMC 'Perl6::Grammar' => "v6" @ 63
}
}
}
...
]
The --target option to the Perl 6 compiler allows you to
emit the result of any stage in the compilation process. PARSE
dumps the PGE tree in a visible (if lengthy) form. (README gives the
other flags and options.)
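The idea of stopping after a named stage can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how HLLCompiler is written: the stage names come from this article, the transforms are placeholders that merely wrap their input, and whether --target accepts every one of these names is an assumption.

```python
# Stage names as used in this article; the transforms are toy stand-ins.
STAGES = ["PARSE", "PAST", "POST", "PIR"]


def make_toy_transform(stage):
    """Return a hypothetical transform that wraps its input with the stage name."""
    def transform(tree):
        return (stage, tree)
    return transform


TOY_TRANSFORMS = {stage: make_toy_transform(stage) for stage in STAGES}


def compile_to(source, target, transforms=TOY_TRANSFORMS):
    """Run each stage in order and return the result of the requested stage."""
    if target not in STAGES:
        raise ValueError(f"unknown target: {target}")
    result = source
    for stage in STAGES:
        result = transforms[stage](result)
        if stage == target:
            break
    return result


if __name__ == "__main__":
    print(compile_to("use Test;", "PARSE"))
    print(compile_to("use Test;", "PAST"))
```

Each stage only sees the output of the one before it, which is why inspecting an intermediate result is cheap.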
Transforming the Parse Tree
So far, I didn’t have to change any code (except that the PGE dumping
didn’t work, due to a missing line of code, but that was a bugfix and not a
new feature). The next step was to transform this
use_statement branch appropriately into PAST. (The first
transformation step is from program source code into a PGE–or parse–tree.
The second transformation is from the parse tree into a Parrot Abstract
Syntax Tree.)
The Perl 6 compiler’s PAST transformations are individual rules in a TGE
grammar found in src/PAST/Grammar.tg. There’s usually at least one
rule for each named node in the PGE output. I searched the file for
use_statement and found it mentioned in the transformation
rule for Perl6::Grammar::statement nodes:
transform past (Perl6::Grammar::statement) :language('PIR') {
    $P0 = node['use_statement']
    unless null $P0 goto use_statement
    ...
  use_statement:
    null $P0
    .return ($P0)
node represents the particular node of the parse tree.
Child nodes are available through keyed access–thus this code looks in a
Perl6::Grammar::statement node for a
use_statement node. If that’s found, it jumps to the
use_statement label, which does nothing. Returning a null
value here effectively splices this node out of the resulting PAST.
When I saw this, I knew I needed to add some code.
TGE transformations create PAST objects. Each TGE transform is basically a map from particular elements in one tree structure to another. They take whatever information they need from the input tree and create nodes in the output tree with that data.
From personal experience and poking around in the file (and liberal use
of the --target=PAST option to the Perl 6 compiler), I decided
that a use statement needed a PAST::Op node to
represent the use operator with a child PAST::Val
node to represent the name of the module to load.
Here’s a fun piece of magic: whatever other rule invoked this
transformation will splice the PAST::Op node returned from
this rule into the final PAST automatically. If you’ve ever had to rethread
a syntax or operator tree on your own, you may appreciate this.
The code was remarkably easy, though again my experience made this easier than if this were my first attempt:
  use_statement:
    .local pmc name
    name = $P0['version';'ident']
    .local string name_string
    name_string = name
    if name_string == 'v6' goto use_pragma
    if name_string == 'lib' goto use_pragma
    .local pmc use_op
    use_op = new 'PAST::Op'
    use_op.'init'('node' => $P0, 'name' => 'use')
    .local pmc module_name
    module_name = new 'PAST::Val'
    module_name.'init'('node' => $P0, 'vtype' => '.Perl6Str', 'name' => name, 'ctype' => 's~')
    use_op.'push'(module_name)
    .return (use_op)
This code grabs the value of the version (from its
contained ident; this is a fixable wart in the grammar),
converts that from a PMC to a native Parrot string, and then compares the
result against two hard-coded cases. (Only the v6 case should
survive past the near future.)
Everything else represents a genuine Perl module to load.
The rest of the code creates a PAST::Op object, whose
important attribute is name (set to use here to represent a use
operation), and a PAST::Val object which represents a Perl6Str
PMC holding the module name as a string. (This is the only spot that gave me trouble; I
had to guess at what 'ctype' => 's~' signified, but I have
confidence that it’s accurate.)
After attaching the PAST::Val as a child of the
PAST::Op with push(), the code returns the
PAST::Op. That’s all it takes to add support in the compiler.
Subsequent transformations (from PAST to POST and from POST to PIR) are all
free; the third transformer knows how to handle PAST::Op nodes
already.
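For readers who don't know PIR, the shape of this transformation can be modeled in Python. This is a sketch only: the Op and Val classes, the PRAGMAS set, and transform_use are hypothetical stand-ins for PAST::Op, PAST::Val, and the TGE rule, and it assumes the pragma branch returns nothing, as the earlier stub did.

```python
from dataclasses import dataclass, field

# Names treated as pragmas rather than modules, mirroring the two hard-coded cases.
PRAGMAS = {"v6", "lib"}


@dataclass
class Val:
    """Toy stand-in for a PAST::Val node holding a module name."""
    name: str
    vtype: str = ".Perl6Str"


@dataclass
class Op:
    """Toy stand-in for a PAST::Op node with child nodes."""
    name: str
    children: list = field(default_factory=list)


def transform_use(use_node):
    """Map a parse-tree use_statement node (a nested dict here) to an Op node."""
    name = use_node["version"]["ident"]
    if name in PRAGMAS:
        # Assumption: pragmas are spliced out of the tree, like the null return.
        return None
    op = Op(name="use")
    op.children.append(Val(name=name))
    return op


if __name__ == "__main__":
    print(transform_use({"version": {"ident": "Test"}}))
    print(transform_use({"version": {"ident": "v6"}}))
```

The caller that receives the returned node is responsible for placing it in the larger tree, which is the splicing behavior described earlier.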
Adding a Built-In
Of course, that didn’t actually define the use
operator, merely made it callable from Perl 6 programs. Builtins are PIR
subroutines in the src/builtins/*.pir files. I chose
src/builtins/io.pir as a likely candidate for two new operators:
.sub 'use'
    .param pmc module
    .local string module_string
    module_string = module
    .local pmc path
    path = split '::', module_string
    .local string file_string
    file_string = join '/', path
    .local pmc filename
    filename = new .Perl6Str
    filename = file_string
    filename .= '.pm'
    require(filename)
.end
use takes the name of a module and converts it into a
filename, represented as a Perl 6-style string. Then it calls the
require operator.
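The name-to-filename conversion is the same split-and-join that the PIR performs. A Python equivalent, with the hypothetical function name module_to_filename, looks like this:

```python
def module_to_filename(module_name):
    """Turn a module name such as 'Foo::Bar' into a relative path 'Foo/Bar.pm'."""
    parts = module_name.split("::")   # mirrors: split '::', module_string
    return "/".join(parts) + ".pm"    # mirrors: join '/', path, then append '.pm'


if __name__ == "__main__":
    for name in ("Test", "Foo::Bar"):
        print(name, "->", module_to_filename(name))
```

As in the PIR version, the result is a relative path; nothing here searches any include directories.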
.sub 'require'
    .param pmc filename
    .local pmc p6compiler
    p6compiler = compreg 'Perl6'
    p6compiler.'evalfiles'(filename)
.end
require is a lot shorter, mostly because it cheats. I wrote
earlier that the Perl 6 compiler extends HLLCompiler to add a
few features. One of those features is registering itself as the
Perl6 compiler with Parrot. This allows require
to grab the Perl 6 compiler object from the compiler registry, then call
its evalfiles() method on the passed filename.
That merely starts the compilation process over on the named file.
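The registry lookup can be modeled with a plain dictionary. The following Python is a toy sketch rather than Parrot's implementation: COMPILER_REGISTRY, the compreg function, and RecordingCompiler are hypothetical stand-ins, and the evalfiles method here only records the filename instead of compiling anything.

```python
# Hypothetical registry mapping language names to compiler objects.
COMPILER_REGISTRY = {}


class RecordingCompiler:
    """Toy compiler that records which files it was asked to evaluate."""

    def __init__(self):
        self.evaluated = []

    def evalfiles(self, filename):
        self.evaluated.append(filename)


def compreg(name, compiler=None):
    """Register a compiler under a name, or look up an existing one."""
    if compiler is not None:
        COMPILER_REGISTRY[name] = compiler
    return COMPILER_REGISTRY[name]


def require(filename):
    """Fetch the registered Perl6 compiler and hand it the file."""
    compreg("Perl6").evalfiles(filename)


if __name__ == "__main__":
    compiler = compreg("Perl6", RecordingCompiler())
    require("Test.pm")
    print(compiler.evaluated)
```

The point of the indirection is that require never needs to know how the compiler was built, only the name it registered under.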
Conclusion
This isn’t the final implementation of either use or
require. In particular, use doesn’t call any
import() method on the required module, nor does
require respect any include paths or perform any caching to
avoid re-loading unchanged files.
However, it took more time to write this description than to write this code and make it work. Though as of the current Parrot revision (r18935), the test still isn’t passing (because the compiler needs a few more grammar rules to parse all of Test.pm properly), a small amount of work added a new feature to Perl 6 on Parrot that brought bootstrapping one step closer.
Perhaps even better, a fuller Perl 6 implementation on Parrot will make writing compilers even easier, as you can use Perl 6 to write your transformation rules instead of PIR.

