Apply some efficiency optimisations to generated optrees in order to gain faster runtime performance. There are three ideas covered by this proposal; any or all of them can be created independently. In each case, it is hard to estimate upfront whether it would actually provide a measurable benefit to programs in practice, so part of the project involves attempting to measure the impact it creates.
## Project Details
Inside a running `perl` interpreter, source code is compiled into a more direct representation of its behaviour, called an optree. This is comprised of individual elements called ops. Each is executed in sequence, and performs a small specific part of work involved in executing the program as a whole.
Due to the small nature of the specific ops, there are several optimisations that are performed by the interpreter before it starts executing a program; mostly on a theme of combining the effects of several small ops into fewer, larger ones. This allows them to overall work more efficiently as they can either share intermediate results, or can avoid the various data-passing overheads between them.
This project aims to look at three new ideas on this theme of skipping certain elements of the optree by combining their effects together. A full description of each is given in the appendix section below. Each of the three ideas were first mentioned on the [Perl5-Porters mailing list](https://www.nntp.perl.org/group/perl.perl5.porters/2021/12/msg262094.html).
Each of the three optimisations would process through the following stages:
**1. Create an "ideal-case" benchmark test **
Begin by creating a benchmark containing some simple code that is entirely composed of one particular kind of operation, which is the target of the
optimisation. Measure this benchmark case in order to obtain a baseline performance measurement.
**2. Implement the optimisation **
Apply code to the Perl interpreter by defining new opcode flags, adjusting the peephole optimiser, or whatever other techniques may be required to achieve it.
Take a second measurement of performance with the optimisation applied. This should give an indication of the maximal possible gain that could be achieved.
**3. Analyse large programs to estimate extent of application**
While the figure gained in the above step gives a best-case value, it is unlikely that real-world programs would be able to gain as much benefit. It would be useful to analyse the generated optree of real-world programs to get an estimate of how likely these optimisations are to be hit, and a guess at what proportion of the potential benefit could actually be achieve in a real case.
Actually this step could be performed first, for each of the three optimisations, to get a suggestion on which of them are likely to be the most useful, and thus how to assign the remaining project time to each of them.
## Author Information
I am Paul Evans, PEVANS on [CPAN](https://metacpan.org/author/PEVANS) and current member of the Perl Steering Committee.
I have been a CPAN maintainer for over 12 years, and currently have over 200 distributions under my name. Recently I have been working on a variety of perl core features; adding the `isa` operator to Perl 5.32, `try/catch` syntax to 5.34, and the `builtin::` namespace of additional core functions expected to be part of the upcoming 5.36 release.
I have successfully completed two TPF projects before, to improve the implementation of the `Future::AsyncAwait` module; and to create bindings for the `libuv` event system.
## Amount Requested
The three optimisations described in more detail:
### `OA_TARGLEX` on `OP_CONST`
Give the `OP_CONST` opcode the `OA_TARGLE` flag, meaning that code such as
$var = 123;
gains a performance optimisation, discarding the `OP_PADSV` and `OP_SASSIGN` which is normally used to implement scalar lexical variable assignment from a constant, leaving just a single `OP_CONST` in its place.
### `OA_TARGLEX` with `OPpLVAL_INTRO`
Create a new `OA_...` constant, or adjust the semantics of the existing `OA_TARGLEX`, such that it can also apply in `OPpLVAL_INTRO` situations. This would allow the (currently fairly-rare) `OA_TARGLEX` optimisation to
also apply on variables introduced in `my` expressions, such as
my $zero = 0;
by once again discarding the `OP_PADSV` and `OP_SASSIGN` ops.
### Fold away `OP_PADSV` arguments to `UNOP`s
Create a similar optimisation to the `->targ` opcode field, applicable to `UNOP`s to contain the pad offset of a lexical variable argument for arguments being passed into `UNOP`s. Thus for example a statement like
could discard the `OP_PADSV` of its incoming argument, and similarly avoid using the stack for the lexical variable.