Grant Report: Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step

1 Comment

TPF is proud to announce the completion of Gerard Goossen grant. As Gerard sent a big report, please check the complete report on the extended new page. I would like to thank you both Gerard for completing the grant, and Jeff Horwitz for managing it.

Below are my results of the "Changing the Perl 5 optree build process into a Abstract Syntax Tree generation and a code generation step" grant.  For the original grant request see http://news.perlfoundation.org/2009/08/2009q3-grant-proposal-changing.html

In short the idea was to have a new code generation step instead of  the linked op nodes. This would allow the optree to become more of an  Abstract Syntax Tree.

The code is available at github at http://github.com/ggoossen/perl/tree/codegen  blead is currently frozen for the 5.12 release and thus it can't be  merged yet.

The new code for the code generation step is in codegen.c. instruction.c and instruction.h contain the code for the instructions.

Optree

The optree no longer also functions as a instruction sequence, and as a result has been simplified. As can be seen by the example below derived from the output of:

/*./perl -Dx -e 'for (@ARGV) { print "arg: $_\n"; }'

BEFORE:

1   TYPE = leave  ===> DONE
2       TYPE = enter  ===> 3
3       TYPE = nextstate  ===> 4
20      TYPE = leaveloop  ===> 1
8           TYPE = enteriter  ===> 18
                TYPE = null  ===> (4)
                TYPE = null  ===> (7)
4                   TYPE = pushmark  ===> 5
6                   TYPE = rv2av  ===> 7
5                       TYPE = gv  ===> 6
7               TYPE = gv  ===> 8
            TYPE = null  ===> (20)
19              TYPE = and  ===> 20
18                  TYPE = iter  ===> 19
21                  TYPE = lineseq  ===> 0
9                       TYPE = nextstate  ===> 10
16                      TYPE = print  ===> 17
10                          TYPE = pushmark  ===> 11
                            TYPE = null  ===> (16)
                                TYPE = null  ===> (11)
15                              TYPE = concat  ===> 16
13                                  TYPE = concat  ===> 14
11                                      TYPE = const  ===> 12
                                        TYPE = null  ===> (13)
12                                          TYPE = gvsv  ===> 13
14                                  TYPE = const  ===> 15
17                      TYPE = unstack  ===> 18

AFTER:

1   TYPE = leave
2       TYPE = enter
3       TYPE = nextstate
4       TYPE = foreach
5           TYPE = list
6               TYPE = rv2av
7                   TYPE = gv
8           TYPE = gv
9           TYPE = lineseq
10              TYPE = nextstate
11              TYPE = print
12                  TYPE = stringify
13                      TYPE = concat
14                          TYPE = concat
15                              TYPE = const
16                              TYPE = rv2sv
17                                  TYPE = gv
18                          TYPE = const


Code generation

How the code generation works is documented in codegen.c. Here I will only show the result of the code generation step as obtained using the -DG flag:

./perl -DG -e 'for (@ARGV) { print "arg: $_\n"; }'

Instructions of codeseq (0x96439b8):
0x9648270: enter
0x9648280: nextstate
0x9648290: pushmark
0x96482a0: gv
0x96482b0: rv2ax
0x96482c0: go
0x96482d0: enteriter    redo=label1    next=label2    last=label3
label5:
0x96482e0: iter
0x96482f0: instr_cond_jump    label4
label1:
0x9648300: nextstate
0x9648310: pushmark
0x9648320: const
0x9648330: gvsv
0x9648340: concat
0x9648350: const
0x9648360: concat
0x9648370: print
label2:
0x9648380: unstack
0x9648390: instr_jump    label5
label4:
0x96483a0: leaveloop
label3:
0x96483b0: leave
0x96483c0: instr_end
0x96483d0: (finished)


Memory consumption

The change causes the memory consumption to go down.  For example the maximum amount of memory used by 'perldoc perlfunc' is after 2.91MB compared to 2.98 that (a 2% reduction), or for example 'perlcritic autodie/exception.pm' now uses 13.5MB compared to 14.4 before (a 6% reduction).  The memory usage reduction is caused by a smaller optree, the intermediate code is never generated as long as the code isn't run. Thus amount of memory reduction would be large for large program  with many modules of which most parts are not used.

Performance 

There doesn't appear to be any significant performance difference as measured by perlbenchapps. perlbench gives mixed result and is probably too artificial to be indicative of anything. I find this a bit strange because I would expect it to perform better given that it uses less memory, and that the instructions are sequential in memory, which should result in better cache performance, but apparently this doesn't matter.

A = /home/gerard/perl/inst/blead-compile/bin/perl5.12.0
B = /home/gerard/perl/inst/blead/bin/perl

lcov
  A: 4.42(3)*100
  B: 4.392(15)*100
lwp-request
  A: 1.791(14)*100
  B: 1.807(13)*100
net_http_get
  A: 1.13(2)*100
  B: 1.11(2)*100
net_http_get_config
  A: 6.08(3)*10-2
  B: 6.25(2)*10-2
perlcritic
  A: 1.0875(13)*100
  B: 1.084(3)*100
perldoc
  A: 9.72(10)*10-1
  B: 9.65(4)*10-1
spamassassin
  A: 1.726(3)*100
  B: 1.773(9)*100
spamd
  A: 2.733(16)*101
  B: 2.76(2)*101

Easier optimisations 

Optimisation which require a different instruction sequence are easier to make.  Previously this would require manipulating the optree and its linked list, which is rather difficult.  Now it can simply be done by generating a different instruction list.  For making the sort work in-place is reduced from about 40 lines of code to about 10 lines.

Influence on existing code

Bugs

There are no known new bugs, caused by the change. Possible area where new bugs are introduced is that compiling happens, when the subroutine is first needed, and part of the environment might "leak" into the compiling, for example initially the PL_taiting wasn't set to false, causing constant folded expression to be tainted if the first time the subroutine was called was inside a tainted expression.

B modules

Of course things which analyse the opcode tree using B::OP will notice some changes in the tree, but the new tree should be easier to analyse. Some of the B modules which can be used to output things about the optree in execution order are obviously gone/broken.

Incompatible changes

The only change concerning Perl code is the removal of '"goto" into a construct', but that is already deprecated in 5.12 thus that shouldn't be problem.

The ops are no longer made into a linked list. Also the optree is a bit simplified.  Also the peephole analyser is gone, instead after the optree is build finish_optree is called which does some additional checking and makes the tree thread-safe.  When a sub is called if it doesn't already have a instruction list in CvCODESEQ the optree is compiled using the compile_cv/compile_op function.  The pp_xxx functions now have two arguments ppflags and pparg (most functions ignore them and still use PL_op for arguments) and instead of returning the next op it should set the next instruction if it isn't the default RUN_SET_NEXTINSTRUCTION.

Future improvements 

There are many more things which can be moved from semantic analysis step to the intermediate code generation step (i.e. from op.c to codegen.c), for example a 'scope' still generate an 'enter' and a 'leave' op, this could be merged to only one op which is split into two when generating the instruction list, another example would be the sort code, where determining whether a optimised comparison can be used is determined in the intermediate code generation.  Doing this makes the syntax tree more simple and thus easier to analyse, also it would probably make the code simpler.

Use of instruction flags

To pass arguments to the instruction the op is still used, as an experiment I changed the pp_helem instruction to use the instruction arguments instead of the op, with the idea that it should perform better because it wouldn't have to read from the op memory. But my benchmarks don't show any significant difference. But if this would be combined with LLVM the improvement might be significant because the compiler could optimize away the passed constant.

LLVM

The idea with LLVM was that the instruction list could be compiled by the LLVM compiler on the fly to one real function, thus optimising away the overhead of function calls, and possibly optimising the away code based on the constant values (flags) passed to the instruction functions.  Unfortunately I couldn't figure out how to do this.

Opcode/instruction list

The list of opcode in opcode.pl functions both for the opcode and for instruction list. Although there are a lot of operations where have exactly one opcode and one instruction (for example 'add', 'helem', 'open'), there are also instruction-only opcodes and op-only opcodes (for example 'instr_jump', 'instr_end' which are instruction-only and 'foreach' which op-only). It might be a good idea to separate these two (especially since I would expect the difference to grow).

1 Comment

It doesn't even pass its own tests. How on on earth is this considered complete?

Test Summary Report
-------------------
re/qr.t                                                         (Wstat: 0 Tests: 5 Failed: 1)
  Failed test:  5
op/stash.t                                                      (Wstat: 0 Tests: 31 Failed: 0)
  TODO passed:   26
porting/manifest.t                                              (Wstat: 0 Tests: 9463 Failed: 1)
  Failed test:  9463
../cpan/B-Debug/t/debug.t                                       (Wstat: 65280 Tests: 4 Failed: 3)
  Failed tests:  1-3
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 7 tests but ran 4.
../cpan/B-Lint/t/lint.t                                         (Wstat: 0 Tests: 29 Failed: 22)
  Failed tests:  1-7, 12-20, 22-27
../dist/B-Deparse/t/deparse.t                                   (Wstat: 256 Tests: 85 Failed: 2)
  Failed tests:  61, 85
  Non-zero exit status: 1
  Parse errors: Bad plan.  You planned 84 tests but ran 85.
../ext/B/t/concise-xs.t                                         (Wstat: 65024 Tests: 1658 Failed: 774)
  Failed tests:  1, 4-6, 10, 17, 21-22, 24, 26, 37, 41-43
                272-281, 283-653, 655-659, 661-673, 676-743
                797, 801, 804-810, 812-831, 833-842, 849
                861-867, 869-870, 872-878, 881-882, 887-902
                905, 909-911, 917-921, 925, 928-930, 932
                934-935, 950-954, 960, 962-964, 968, 975
                980, 984-986, 988, 990-992, 996-1000, 1006-1009
                1022-1024, 1030, 1032-1034, 1038-1039, 1045
                1050, 1054-1055, 1060-1069, 1071, 1075-1076
                1078, 1080, 1089, 1094, 1096-1103, 1105-1107
                1109-1117, 1119, 1121, 1124, 1126, 1131
                1133-1142, 1144-1151, 1153, 1156, 1158
                1163-1168, 1172, 1174, 1176-1177, 1179-1181
                1184-1185, 1195, 1198-1214, 1216-1224, 1226-1231
                1233-1234, 1237-1256, 1259, 1264-1265, 1267
                1269, 1272-1275, 1277-1278, 1280-1284, 1286
                1289, 1292, 1352, 1379, 1398-1399, 1460
                1469, 1472, 1474, 1538, 1657
  Non-zero exit status: 254
../ext/B/t/concise.t                                            (Wstat: 65280 Tests: 30 Failed: 5)
  Failed tests:  1-5
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 157 tests but ran 30.
../ext/B/t/f_map.t                                              (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/f_sort.t                                             (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_check.t                                       (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_concise.t                                     (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_constants.t                                   (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_misc.t                                        (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_samples.t                                     (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_sort.t                                        (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_specials.t                                    (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/optree_varinit.t                                     (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/showlex.t                                            (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
../ext/B/t/terse.t                                              (Wstat: 65280 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 16 tests but ran 1.
../ext/POSIX/t/posix.t                                          (Wstat: 0 Tests: 66 Failed: 0)
  TODO passed:   11
../lib/utf8.t                                                   (Wstat: 65280 Tests: 46 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 150 tests but ran 46.
Files=1803, Tests=349370, 701 wallclock secs (119.20 usr 30.02 sys + 644.36 cusr 179.50 csys = 973.08 CPU)
Result: FAIL
*** Error code 41

Some of those failures are quite serious. It also fails a slew of assertions when assertions are enabled.

About TPF

The Perl Foundation - supporting the Perl community since 2000. Find out more at www.perlfoundation.org.

Recent Comments

  • Nicholas Clark: It doesn't even pass its own tests. How on on read more

About this Entry

This page contains a single entry by Alberto Simões published on April 5, 2010 7:16 PM.

Call for Grant Managers was the previous entry in this blog.

Fixing Perl5 Core Bugs: First Monthly Report is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.

OpenID accepted here Learn more about OpenID
Powered by Movable Type 4.38