Grant Report: Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step

Mon, 05-Apr-2010 by Alberto Simões edit post

TPF is proud to announce the completion of Gerard Goossen grant. As Gerard sent a big report, please check the complete report on the extended new page. I would like to thank you both Gerard for completing the grant, and Jeff Horwitz for managing it. <meta http-equiv="content-type" content="text/html; charset=utf-8">Below are my results of the "Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step" grant.  For the original grant request see <a class="moz-txt-link-freetext" href="http://news.perlfoundation.org/2009/08/2009q3-grant-proposal-changing.html">http://news.perlfoundation.org/2009/08/2009q3-grant-proposal-changing.html</a> In short the idea was to have a new code generation step instead of  the linked op nodes. This would allow the optree to become more of an  Abstract Syntax Tree. The code is available at github at <a class="moz-txt-link-freetext" href="http://github.com/ggoossen/perl/tree/codegen">http://github.com/ggoossen/perl/tree/codegen</a>  blead is currently frozen for the 5.12 release and thus it can't be  merged yet. The new code for the code generation step is in codegen.c. instruction.c and instruction.h contain the code for the instructions. Optree The optree no longer also functions as a instruction sequence, and as a result has been simplified. As can be seen by the example below derived from the output of: /*./perl -Dx -e 'for (@ARGV) { print "arg: $_\n"; }' BEFORE: 1   TYPE = leave  ===> DONE 2       TYPE = enter  ===> 3 3       TYPE = nextstate  ===> 4 20      TYPE = leaveloop  ===> 1 8           TYPE = enteriter  ===> 18                 TYPE = null  ===> (4)                 TYPE = null  ===> (7) 4                   TYPE = pushmark  ===> 5 6                   TYPE = rv2av  ===> 7 5                       TYPE = gv  ===> 6 7               TYPE = gv  ===> 8             TYPE = null  ===> (20) 19              TYPE = and  ===> 20 18                  TYPE = iter  ===> 19 21                  TYPE = lineseq  ===> 0 9                       TYPE = nextstate  ===> 10 16                      TYPE = print  ===> 17 10                          TYPE = pushmark  ===> 11                             TYPE = null  ===> (16)                                 TYPE = null  ===> (11) 15                              TYPE = concat  ===> 16 13                                  TYPE = concat  ===> 14 11                                      TYPE = const  ===> 12                                         TYPE = null  ===> (13) 12                                          TYPE = gvsv  ===> 13 14                                  TYPE = const  ===> 15 17                      TYPE = unstack  ===> 18 AFTER: 1   TYPE = leave 2       TYPE = enter 3       TYPE = nextstate 4       TYPE = foreach 5           TYPE = list 6               TYPE = rv2av 7                   TYPE = gv 8           TYPE = gv 9           TYPE = lineseq 10              TYPE = nextstate 11              TYPE = print 12                  TYPE = stringify 13                      TYPE = concat 14                          TYPE = concat 15                              TYPE = const 16                              TYPE = rv2sv 17                                  TYPE = gv 18                          TYPE = const Code generation How the code generation works is documented in codegen.c. Here I will only show the result of the code generation step as obtained using the -DG flag: ./perl -DG -e 'for (@ARGV) { print "arg: $_\n"; }' Instructions of codeseq (0x96439b8): 0x9648270: enter 0x9648280: nextstate 0x9648290: pushmark 0x96482a0: gv 0x96482b0: rv2ax 0x96482c0: go 0x96482d0: enteriter    redo=label1    next=label2    last=label3 label5: 0x96482e0: iter 0x96482f0: instr_cond_jump    label4 label1: 0x9648300: nextstate 0x9648310: pushmark 0x9648320: const 0x9648330: gvsv 0x9648340: concat 0x9648350: const 0x9648360: concat 0x9648370: print label2: 0x9648380: unstack 0x9648390: instr_jump    label5 label4: 0x96483a0: leaveloop label3: 0x96483b0: leave 0x96483c0: instr_end 0x96483d0: (finished) Memory consumption The change causes the memory consumption to go down.  For example the maximum amount of memory used by 'perldoc perlfunc' is after 2.91MB compared to 2.98 that (a 2% reduction), or for example 'perlcritic autodie/exception.pm' now uses 13.5MB compared to 14.4 before (a 6% reduction).  The memory usage reduction is caused by a smaller optree, the intermediate code is never generated as long as the code isn't run. Thus amount of memory reduction would be large for large program  with many modules of which most parts are not used. Performance  There doesn't appear to be any significant performance difference as measured by perlbenchapps. perlbench gives mixed result and is probably too artificial to be indicative of anything. I find this a bit strange because I would expect it to perform better given that it uses less memory, and that the instructions are sequential in memory, which should result in better cache performance, but apparently this doesn't matter. A = /home/gerard/perl/inst/blead-compile/bin/perl5.12.0 B = /home/gerard/perl/inst/blead/bin/perl lcov   A: 4.42(3)*100   B: 4.392(15)*100 lwp-request   A: 1.791(14)*100   B: 1.807(13)*100 net_http_get   A: 1.13(2)*100   B: 1.11(2)*100 net_http_get_config   A: 6.08(3)*10-2   B: 6.25(2)*10-2 perlcritic   A: 1.0875(13)*100   B: 1.084(3)*100 perldoc   A: 9.72(10)*10-1   B: 9.65(4)*10-1 spamassassin   A: 1.726(3)*100   B: 1.773(9)*100 spamd   A: 2.733(16)*101   B: 2.76(2)*101 Easier optimisations  Optimisation which require a different instruction sequence are easier to make.  Previously this would require manipulating the optree and its linked list, which is rather difficult.  Now it can simply be done by generating a different instruction list.  For making the sort work in-place is reduced from about 40 lines of code to about 10 lines. Influence on existing code Bugs There are no known new bugs, caused by the change. Possible area where new bugs are introduced is that compiling happens, when the subroutine is first needed, and part of the environment might "leak" into the compiling, for example initially the PL_taiting wasn't set to false, causing constant folded expression to be tainted if the first time the subroutine was called was inside a tainted expression. B modules Of course things which analyse the opcode tree using B::OP will notice some changes in the tree, but the new tree should be easier to analyse. Some of the B modules which can be used to output things about the optree in execution order are obviously gone/broken. Incompatible changes The only change concerning Perl code is the removal of '"goto" into a construct', but that is already deprecated in 5.12 thus that shouldn't be problem. The ops are no longer made into a linked list. Also the optree is a bit simplified.  Also the peephole analyser is gone, instead after the optree is build finish_optree is called which does some additional checking and makes the tree thread-safe.  When a sub is called if it doesn't already have a instruction list in CvCODESEQ the optree is compiled using the compile_cv/compile_op function.  The pp_xxx functions now have two arguments ppflags and pparg (most functions ignore them and still use PL_op for arguments) and instead of returning the next op it should set the next instruction if it isn't the default RUN_SET_NEXTINSTRUCTION. Future improvements  There are many more things which can be moved from semantic analysis step to the intermediate code generation step (i.e. from op.c to codegen.c), for example a 'scope' still generate an 'enter' and a 'leave' op, this could be merged to only one op which is split into two when generating the instruction list, another example would be the sort code, where determining whether a optimised comparison can be used is determined in the intermediate code generation.  Doing this makes the syntax tree more simple and thus easier to analyse, also it would probably make the code simpler. Use of instruction flags To pass arguments to the instruction the op is still used, as an experiment I changed the pp_helem instruction to use the instruction arguments instead of the op, with the idea that it should perform better because it wouldn't have to read from the op memory. But my benchmarks don't show any significant difference. But if this would be combined with LLVM the improvement might be significant because the compiler could optimize away the passed constant. LLVM The idea with LLVM was that the instruction list could be compiled by the LLVM compiler on the fly to one real function, thus optimising away the overhead of function calls, and possibly optimising the away code based on the constant values (flags) passed to the instruction functions.  Unfortunately I couldn't figure out how to do this. Opcode/instruction list The list of opcode in opcode.pl functions both for the opcode and for instruction list. Although there are a lot of operations where have exactly one opcode and one instruction (for example 'add', 'helem', 'open'), there are also instruction-only opcodes and op-only opcodes (for example 'instr_jump', 'instr_end' which are instruction-only and 'foreach' which op-only). It might be a good idea to separate these two (especially since I would expect the difference to grow).

Category: Grants

Comments (1)

[ 737 ] | Tue, 06-Apr-2010 by nicholasclark

It doesn't even pass its own tests. How on on earth is this considered complete?

Test Summary Report

-------------------

re/qr.t                                                         (Wstat: 0 Tests: 5 Failed: 1)

  Failed test:  5

op/stash.t                                                      (Wstat: 0 Tests: 31 Failed: 0)

  TODO passed:   26

porting/manifest.t                                              (Wstat: 0 Tests: 9463 Failed: 1)

  Failed test:  9463

../cpan/B-Debug/t/debug.t                                       (Wstat: 65280 Tests: 4 Failed: 3)

  Failed tests:  1-3

  Non-zero exit status: 255

  Parse errors: Bad plan.  You planned 7 tests but ran 4.

../cpan/B-Lint/t/lint.t                                         (Wstat: 0 Tests: 29 Failed: 22)

  Failed tests:  1-7, 12-20, 22-27

../dist/B-Deparse/t/deparse.t                                   (Wstat: 256 Tests: 85 Failed: 2)

  Failed tests:  61, 85

  Non-zero exit status: 1

  Parse errors: Bad plan.  You planned 84 tests but ran 85.

../ext/B/t/concise-xs.t                                         (Wstat: 65024 Tests: 1658 Failed: 774)

  Failed tests:  1, 4-6, 10, 17, 21-22, 24, 26, 37, 41-43

                272-281, 283-653, 655-659, 661-673, 676-743

                797, 801, 804-810, 812-831, 833-842, 849

                861-867, 869-870, 872-878, 881-882, 887-902

                905, 909-911, 917-921, 925, 928-930, 932

                934-935, 950-954, 960, 962-964, 968, 975

                980, 984-986, 988, 990-992, 996-1000, 1006-1009

                1022-1024, 1030, 1032-1034, 1038-1039, 1045

                1050, 1054-1055, 1060-1069, 1071, 1075-1076

                1078, 1080, 1089, 1094, 1096-1103, 1105-1107

                1109-1117, 1119, 1121, 1124, 1126, 1131

                1133-1142, 1144-1151, 1153, 1156, 1158

                1163-1168, 1172, 1174, 1176-1177, 1179-1181

                1184-1185, 1195, 1198-1214, 1216-1224, 1226-1231

                1233-1234, 1237-1256, 1259, 1264-1265, 1267

                1269, 1272-1275, 1277-1278, 1280-1284, 1286

                1289, 1292, 1352, 1379, 1398-1399, 1460

                1469, 1472, 1474, 1538, 1657

  Non-zero exit status: 254

../ext/B/t/concise.t                                            (Wstat: 65280 Tests: 30 Failed: 5)

  Failed tests:  1-5

  Non-zero exit status: 255

  Parse errors: Bad plan.  You planned 157 tests but ran 30.

../ext/B/t/f_map.t                                              (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/f_sort.t                                             (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_check.t                                       (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_concise.t                                     (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_constants.t                                   (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_misc.t                                        (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_samples.t                                     (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_sort.t                                        (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_specials.t                                    (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/optree_varinit.t                                     (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/showlex.t                                            (Wstat: 65280 Tests: 0 Failed: 0)

  Non-zero exit status: 255

  Parse errors: No plan found in TAP output

../ext/B/t/terse.t                                              (Wstat: 65280 Tests: 1 Failed: 1)

  Failed test:  1

  Non-zero exit status: 255

  Parse errors: Bad plan.  You planned 16 tests but ran 1.

../ext/POSIX/t/posix.t                                          (Wstat: 0 Tests: 66 Failed: 0)

  TODO passed:   11

../lib/utf8.t                                                   (Wstat: 65280 Tests: 46 Failed: 0)

  Non-zero exit status: 255

  Parse errors: Bad plan.  You planned 150 tests but ran 46.

Files=1803, Tests=349370, 701 wallclock secs (119.20 usr 30.02 sys + 644.36 cusr 179.50 csys = 973.08 CPU)

Result: FAIL

*** Error code 41

Some of those failures are quite serious. It also fails a slew of assertions when assertions are enabled.

Grant Report: Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step

Comments (1)

Sign in to add comment

About TPF

Categories

Popular Tags

Recent Tags

Get Perl

Links