2009Q3 Grant Proposal: Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step

Name:

Gerard Goossen

Email:

[hidden email]

Amount Requested

$ 5,000

Synopsis

This project is to change the Perl 5 optree build process into a step where the parser generates an Abstract Syntax Tree (AST) and a step where the executable code in the form of a list of function pointers (probably with arguments in some form) is generated from the AST.

Because this project is refactoring of a major part of the Perl 5 core, a good benchmark of Perl 5 to measure the effect on performance of the changes would be required.

Benefits to the Perl Community

Changing the Perl 5 optree build process into two steps does have the following advantages:

Changing the target for the code generation will be easier. Especially using LLVM could benefit from the more flexible code generation step.

The Abstract Syntax Tree will be simpler than the current optree, and could be easier analysed.

The separation of the generation of the Abstract Syntax Tree and the code generation increases the maintainability.

Having two steps separate steps for the code generation might introduce some overhead, but the increased flexibility should compensate for this and might even improve performance.

Optimisations which don't fit very will into the current model of one function per OP would be possible.

code generation could be just-in-time, i.e. at the first time a sub is called. Saving the cost of generating code for not-used subs, which is might also make it advantages to do more aggressive optimisations.

The change does not require any change to Perl code (except of course some B:: modules) nor would it require changes to XS code.

Deliverables

The Perl 5 optree build process is changed to an Abstract Syntax Tree and code generation step.


=item

Documentation of the Abstract Syntax Tree.

Documentation of the code generation from the Abstract Syntax Tree.

A benchmark for Perl 5

Investigate/try using LLVM as backend for Perl 5.

Project Details

For example the current optree for program for(@ARGV) { print "arg: $_\n"; } is:

    {
    1   TYPE = leave  ===> DONE
        TARG = 1
        FLAGS = (VOID,KIDS,PARENS)
        PRIVATE = (REFCOUNTED)
        REFCNT = 1
        {
    2       TYPE = enter  ===> 3
        }
        {
    3       TYPE = nextstate  ===> 4
            FLAGS = (VOID)
            LINE = 1
            PACKAGE = "main"
        }
        {
    20      TYPE = leaveloop  ===> 1
            FLAGS = (VOID,KIDS)
            {
    8           TYPE = enteriter  ===> 18
                FLAGS = (LIST,KIDS,STACKED)
                {
                    TYPE = null  ===> (4)
                      (was pushmark)
                    FLAGS = (SCALAR)
                }
                {
                    TYPE = null  ===> (7)
                      (was list)
                    FLAGS = (LIST,KIDS,MOD)
                    {
    4                   TYPE = pushmark  ===> 5
                        FLAGS = (SCALAR,MOD)
                    }
                    {
    6                   TYPE = rv2av  ===> 7
                        TARG = 1
                        FLAGS = (SCALAR,KIDS,REF,MOD)
                        {
    5                       TYPE = gv  ===> 6
                            FLAGS = (SCALAR)
                        }
                    }
                }
                {
    7               TYPE = gv  ===> 8
                    FLAGS = (SCALAR)
                }
            }
            {
                TYPE = null  ===> (20)
                FLAGS = (VOID,KIDS)
                {
    19              TYPE = and  ===> 20
                    FLAGS = (VOID,KIDS)
                    OTHER ===> 9
                    {
    18                  TYPE = iter  ===> 19
                        FLAGS = (SCALAR)
                    }
                    {
    21                  TYPE = lineseq  ===> 0
                        FLAGS = (VOID,KIDS)
                        {
    9                       TYPE = nextstate  ===> 10
                            FLAGS = (VOID)
                            LINE = 1
                            PACKAGE = "main"
                        }
                        {
    16                      TYPE = print  ===> 17
                            FLAGS = (VOID,KIDS)
                            {
    10                          TYPE = pushmark  ===> 11
                                FLAGS = (SCALAR)
                            }
                            {
                                TYPE = null  ===> (16)
                                  (was stringify)
                                FLAGS = (SCALAR,KIDS)
                                {
                                    TYPE = null  ===> (11)
                                      (was pushmark)
                                    FLAGS = (SCALAR)
                                }
                                {
    15                              TYPE = concat  ===> 16
                                    TARG = 3
                                    FLAGS = (SCALAR,KIDS,STACKED)
                                    {
    13                                  TYPE = concat  ===> 14
                                        TARG = 2
                                        FLAGS = (SCALAR,KIDS)
                                        {
    11                                      TYPE = const  ===> 12
                                            FLAGS = (SCALAR)
                                            SV = PV("arg: "\0)
                                        }
                                        {
                                            TYPE = null  ===> (13)
                                              (was rv2sv)
                                            FLAGS = (SCALAR,KIDS)
                                            {
    12                                          TYPE = gvsv  ===> 13
                                                FLAGS = (SCALAR)
                                            }
                                        }
                                    }
                                    {
    14                                  TYPE = const  ===> 15
                                        FLAGS = (SCALAR)
                                        SV = PV("\n"\0)
                                    }
                                }
                            }
                        }
                        {
    17                      TYPE = unstack  ===> 18
                            FLAGS = (VOID)
                        }
                    }
                }
            }
        }
    }

This would be changed to an Abstract Syntax Tree of:

    {
        TYPE = block
        FLAGS = (VOID)
        {
            TYPE = forloop
            FLAGS = (VOID)
            {
                TYPE = null
            }
            {
                TYPE = rv2av
                FLAGS = (SCALAR,REF,MOD)
                {
                    TYPE = gv
                    FLAGS = (SCALAR)
                    REF = *ARGV
                }
            }
            {
                TYPE = block
                {
                    TYPE = print
                    FLAGS = (VOID)
                    {
                        TYPE = stringify
                        FLAGS = (SCALAR)
                        {
                            TYPE = const
                            FLAGS = (SCALAR)
                            SV = PV("arg: "\0)
                        }
                        {
                            TYPE = rv2sv
                            FLAGS = (SCALAR,KIDS)
                            {
                                TYPE = gvsv  ===> 13
                                FLAGS = (SCALAR)
                            }
                        }
                        {
                            TYPE = const
                            FLAGS = (SCALAR)
                            SV = PV("\n"\0)
                        }
                    }
                }
            }
        }
    }

Which would be converted to the code:

    pp_enter()
    pp_nextstate(state)
    pp_pushmark()
    pp_gv(gv)
    pp_rv2av(SCALAR|REF|MOD)
    pp_gv(gv)
    pp_enteriter()
   loop:
    pp_iter()
    pp_and(&leaveloop)
    pp_nextstate(state)
    pp_pushmark()
    pp_const("arg: ")
    pp_gvsv(gvsv)
    pp_concat()
    pp_const("\n")
    pp_concat() 
    pp_print(VOID)
    pp_unstack()
    goto loop
   leaveloop:
    pp_leaveloop()
    pp_leave()

But for example the print could be more aggressively compiled into something like:

    ...
    pp_pushmark()
    res1 = concat(PV("arg: "), gvsv)
    res2 = concat(res1, "\n")
    pp_push(res2)
    pp_print(VOID)
    ...

Inch-stones

There is a lot of flexibility in this project, because a lot of time has to be spent benchmarking/profiling/optimizing the code.

Writing a benchmark for Perl 5.

Leave the optree building process as it currently is (including linked-list generation and peephole optimisation). But add a code generation step, which just uses the op->op_next linked-list to convert them to pp_xxx(op) calls

Adjust B modules and other modules which use the hook into runops and similar things.

Directly generate the code without the intermediate op->op_next linked-list.

Move the peephole optimisations and constant folding to the code generation step.

Simplify the Abstract Syntax Tree, by moving things the logic to the code generation step, like the for loop in the example above.

Optimise everything using the benchmark.

Use LLVM as backend.

Project Schedule

One month full-time, starting about a month after acceptance of this proposal.

Bio

About Two and a halve years ago I start Perl Kurila, an experimental fork of Perl 5, through it I have any about every part of the Perl internals, specifically I have done some optree rewriting. And some optree analysis as part of the Perl5-to-Perl5 converter using the annotated optree generated by Perl with Misc Attribute Decoration (MAD).

About TPF

The Perl Foundation - supporting the Perl community since 2000. Find out more at www.perlfoundation.org.

About this Entry

This page contains a single entry by Alberto Simões published on August 10, 2009 12:23 PM.

2009Q3 Grant Proposal: Enhancing CPANHQ was the previous entry in this blog.

2009Q3 Grant Proposal: Corporate, Embedded, and Multi-user Perl on Windows is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.

OpenID accepted here Learn more about OpenID
Powered by Movable Type 4.38