Dave Mitchell writes:
I spent the majority of my time last month continuing to work on fixing and refactoring the Perl_re_intuit_start() function, which is the main run-time optimisation facility in the regex engine.
Part of my goal has been to simplify the structure of the code, which uses no large-scale structural features like while-loops, but instead relies on lots of labels and gotos. Ignoring the three "go here on failure" labels (giveup, fail_finish, fail), there were 10 other labels in this function. I've now reduced that to 6, and expect to reduce it further soon.
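The difference between the two styles can be sketched in C. This is only an illustration and not code from Perl_re_intuit_start(): the names find_goto, find_loop and other_check are hypothetical, and the "even offset" rule merely stands in for whatever extra condition a candidate position has to satisfy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical secondary check: accept a candidate only if it starts
 * at an even byte offset from the beginning of the string. */
static int other_check(const char *start, const char *pos)
{
    return ((pos - start) % 2) == 0;
}

/* Label-and-goto style: a failed check jumps back to "restart". */
static const char *find_goto(const char *s, const char *needle)
{
    const char *from = s;
    const char *hit;

  restart:
    hit = strstr(from, needle);
    if (!hit)
        goto fail;
    if (!other_check(s, hit)) {
        from = hit + 1;     /* try again just past the rejected candidate */
        goto restart;
    }
    return hit;

  fail:
    return NULL;
}

/* The same logic as a loop: the retry is now visible in the structure. */
static const char *find_loop(const char *s, const char *needle)
{
    const char *from = s;
    const char *hit;

    while ((hit = strstr(from, needle)) != NULL) {
        assert(hit >= from);    /* strstr never returns a position before "from" */
        if (other_check(s, hit))
            return hit;
        from = hit + 1;
    }
    return NULL;
}
```

Both functions return the same pointer for any input. The loop version needs no labels, which is the kind of simplification a label count tracks.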
I've also managed to merge two very similar (but maddeningly different) blocks of code that handled the "second-longest substring is anchored" and "... is floating" cases, while removing inconsistencies between the two branches.
The utf8 handling is now much more efficient, and no longer goes quadratic on certain classes of long utf8 strings.
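The report doesn't say exactly where the quadratic behaviour came from. One common way for utf8 code to go quadratic, shown here purely as an assumed illustration, is to convert byte offsets to character offsets by recounting from the start of the string each time. The function names below are hypothetical, and the input is assumed to be valid UTF-8 with ascending offsets.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in the first n bytes of s by skipping
 * continuation bytes (those of the form 10xxxxxx). */
static size_t utf8_count(const unsigned char *s, size_t n)
{
    size_t chars = 0;
    for (size_t i = 0; i < n; i++)
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    return chars;
}

/* Recount from the start for every offset. Returns the number of bytes
 * examined, which grows with the sum of all the offsets. */
static size_t char_offsets_naive(const unsigned char *s, const size_t *byte_off,
                                 size_t n, size_t *char_off)
{
    size_t work = 0;
    for (size_t i = 0; i < n; i++) {
        char_off[i] = utf8_count(s, byte_off[i]);
        work += byte_off[i];
    }
    return work;
}

/* Remember the previous position and count only the bytes since then.
 * Returns the number of bytes examined, at most the largest offset. */
static size_t char_offsets_incremental(const unsigned char *s, const size_t *byte_off,
                                       size_t n, size_t *char_off)
{
    size_t work = 0, prev_byte = 0, prev_char = 0;
    for (size_t i = 0; i < n; i++) {
        assert(byte_off[i] >= prev_byte);   /* offsets must be ascending */
        prev_char += utf8_count(s + prev_byte, byte_off[i] - prev_byte);
        work += byte_off[i] - prev_byte;
        prev_byte = byte_off[i];
        char_off[i] = prev_char;
    }
    return work;
}
```

Both functions produce the same character offsets. On a long string with many offsets, the first examines a number of bytes proportional to the square of the length, while the second stays linear.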
The main legacy of this refactoring should be that the code is finally comprehensible. This has involved a lot of "why is that set so? Add assert; see what fails in the test suite" iterations.
My work so far can be found in the branch davem/intuit2.
- 41:39 RT#120692 Slow global pattern match with input from utf8
- 5:47 RT#40565 Windows fork emulation's child pseudo process cannot restore local scalar values
- 3:12 fix smoke issues
- 29:10 process p5p mailbox
79:48 Total (HH:MM)
79.8 total hours
18.0 average hours per week
As of 2014/01/31, since the beginning of the grant:
263.1 total hours
16.7 average hours per week
There are 137 hours left on the grant.