2009Q3 Grant Proposal: Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step

Mon, 10-Aug-2009 by Alberto Simões edit post

<dl> <dt>Name: <dd> Gerard Goossen </dd> </li> <dt>Email: <dd> [hidden email] </dd> </li> <dt>Amount Requested <dd> $ 5,000 </dd> </li> </dl> <h2>Synopsis</h2> This project is to change the Perl 5 optree build process into a step where the parser generates an Abstract Syntax Tree (AST) and a step where the executable code in the form of a list of function pointers (probably with arguments in some form) is generated from the AST. Because this project is refactoring of a major part of the Perl 5 core, a good benchmark of Perl 5 to measure the effect on performance of the changes would be required. <h2>Benefits to the Perl Community</h2> Changing the Perl 5 optree build process into two steps does have the following advantages: <dl> <dt> <dd> Changing the target for the code generation will be easier. Especially using LLVM could benefit from the more flexible code generation step. </dd> </li> <dt> <dd> The Abstract Syntax Tree will be simpler than the current optree, and could be easier analysed. </dd> </li> <dt> <dd> The separation of the generation of the Abstract Syntax Tree and the code generation increases the maintainability. </dd> </li> <dt> <dd> Having two steps separate steps for the code generation might introduce some overhead, but the increased flexibility should compensate for this and might even improve performance. </dd> </li> <dt> <dd> Optimisations which don't fit very will into the current model of one function per OP would be possible. </dd> </li> <dt> <dd> code generation could be just-in-time, i.e. at the first time a sub is called. Saving the cost of generating code for not-used subs, which is might also make it advantages to do more aggressive optimisations. </dd> </li> <dt> <dd> The change does not require any change to Perl code (except of course some B:: modules) nor would it require changes to XS code. </dd> </li> </dl> <h2>Deliverables</h2> <dl> <dt> <dd> The Perl 5 optree build process is changed to an Abstract Syntax Tree and code generation step. </dd> <dd> <pre> =item</pre> </dd> <dd> Documentation of the Abstract Syntax Tree. </dd> </li> <dt> <dd> Documentation of the code generation from the Abstract Syntax Tree. </dd> </li> <dt> <dd> A benchmark for Perl 5 </dd> </li> <dt> <dd> Investigate/try using LLVM as backend for Perl 5. </dd> </li> </dl> <h2>Project Details</h2> For example the current optree for program <code> for(@ARGV) { print "arg: $_\n"; } </code> is: <pre> { 1 TYPE = leave ===> DONE TARG = 1 FLAGS = (VOID,KIDS,PARENS) PRIVATE = (REFCOUNTED) REFCNT = 1 { 2 TYPE = enter ===> 3 } { 3 TYPE = nextstate ===> 4 FLAGS = (VOID) LINE = 1 PACKAGE = "main" } { 20 TYPE = leaveloop ===> 1 FLAGS = (VOID,KIDS) { 8 TYPE = enteriter ===> 18 FLAGS = (LIST,KIDS,STACKED) { TYPE = null ===> (4) (was pushmark) FLAGS = (SCALAR) } { TYPE = null ===> (7) (was list) FLAGS = (LIST,KIDS,MOD) { 4 TYPE = pushmark ===> 5 FLAGS = (SCALAR,MOD) } { 6 TYPE = rv2av ===> 7 TARG = 1 FLAGS = (SCALAR,KIDS,REF,MOD) { 5 TYPE = gv ===> 6 FLAGS = (SCALAR) } } } { 7 TYPE = gv ===> 8 FLAGS = (SCALAR) } } { TYPE = null ===> (20) FLAGS = (VOID,KIDS) { 19 TYPE = and ===> 20 FLAGS = (VOID,KIDS) OTHER ===> 9 { 18 TYPE = iter ===> 19 FLAGS = (SCALAR) } { 21 TYPE = lineseq ===> 0 FLAGS = (VOID,KIDS) { 9 TYPE = nextstate ===> 10 FLAGS = (VOID) LINE = 1 PACKAGE = "main" } { 16 TYPE = print ===> 17 FLAGS = (VOID,KIDS) { 10 TYPE = pushmark ===> 11 FLAGS = (SCALAR) } { TYPE = null ===> (16) (was stringify) FLAGS = (SCALAR,KIDS) { TYPE = null ===> (11) (was pushmark) FLAGS = (SCALAR) } { 15 TYPE = concat ===> 16 TARG = 3 FLAGS = (SCALAR,KIDS,STACKED) { 13 TYPE = concat ===> 14 TARG = 2 FLAGS = (SCALAR,KIDS) { 11 TYPE = const ===> 12 FLAGS = (SCALAR) SV = PV("arg: "\0) } { TYPE = null ===> (13) (was rv2sv) FLAGS = (SCALAR,KIDS) { 12 TYPE = gvsv ===> 13 FLAGS = (SCALAR) } } } { 14 TYPE = const ===> 15 FLAGS = (SCALAR) SV = PV("\n"\0) } } } } { 17 TYPE = unstack ===> 18 FLAGS = (VOID) } } } } } }</pre> This would be changed to an Abstract Syntax Tree of: <pre> { TYPE = block FLAGS = (VOID) { TYPE = forloop FLAGS = (VOID) { TYPE = null } { TYPE = rv2av FLAGS = (SCALAR,REF,MOD) { TYPE = gv FLAGS = (SCALAR) REF = *ARGV } } { TYPE = block { TYPE = print FLAGS = (VOID) { TYPE = stringify FLAGS = (SCALAR) { TYPE = const FLAGS = (SCALAR) SV = PV("arg: "\0) } { TYPE = rv2sv FLAGS = (SCALAR,KIDS) { TYPE = gvsv ===> 13 FLAGS = (SCALAR) } } { TYPE = const FLAGS = (SCALAR) SV = PV("\n"\0) } } } } } }</pre> Which would be converted to the code: <pre> pp_enter() pp_nextstate(state) pp_pushmark() pp_gv(gv) pp_rv2av(SCALAR|REF|MOD) pp_gv(gv) pp_enteriter() loop: pp_iter() pp_and(&leaveloop) pp_nextstate(state) pp_pushmark() pp_const("arg: ") pp_gvsv(gvsv) pp_concat() pp_const("\n") pp_concat() pp_print(VOID) pp_unstack() goto loop leaveloop: pp_leaveloop() pp_leave()</pre> But for example the print could be more aggressively compiled into something like: <pre> ... pp_pushmark() res1 = concat(PV("arg: "), gvsv) res2 = concat(res1, "\n") pp_push(res2) pp_print(VOID) ...</pre> <h2>Inch-stones</h2> There is a lot of flexibility in this project, because a lot of time has to be spent benchmarking/profiling/optimizing the code. <dl> <dt> <dd> Writing a benchmark for Perl 5. </dd> </li> <dt> <dd> Leave the optree building process as it currently is (including linked-list generation and peephole optimisation). But add a code generation step, which just uses the <code>op->op_next</code> linked-list to convert them to <code>pp_xxx(op)</code> calls </dd> </li> <dt> <dd> Adjust B modules and other modules which use the hook into runops and similar things. </dd> </li> <dt> <dd> Directly generate the code without the intermediate <code>op->op_next</code> linked-list. </dd> </li> <dt> <dd> Move the peephole optimisations and constant folding to the code generation step. </dd> </li> <dt> <dd> Simplify the Abstract Syntax Tree, by moving things the logic to the code generation step, like the <code>for</code> loop in the example above. </dd> </li> <dt> <dd> Optimise everything using the benchmark. </dd> </li> <dt> <dd> Use LLVM as backend. </dd> </li> </dl> <h2>Project Schedule</h2> One month full-time, starting about a month after acceptance of this proposal. <h2>Bio</h2> About Two and a halve years ago I start Perl Kurila, an experimental fork of Perl 5, through it I have any about every part of the Perl internals, specifically I have done some optree rewriting. And some optree analysis as part of the Perl5-to-Perl5 converter using the annotated optree generated by Perl with Misc Attribute Decoration (MAD).

Category: Grants

2009Q3 Grant Proposal: Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step

Comments (0)

Sign in to add comment

About TPF

Categories

Popular Tags

Recent Tags

Get Perl

Links