2010 Grant Proposal: Enhancing Perl 6 Pattern Matching

Category: Grants

Comments (7)


Way back when, MJD and I proposed SNOBOL-style pattern matching for Perl 6. (I'll see if I can dig up the relevant document, though I'm not sure I still have it.) As I recall, it didn't get a lot of traction, perhaps because we didn't present it in a compelling enough way.

I'm excited to hear that this is back on the table. SNOBOL's pattern matching was lovely to work with, and having something similar in Perl would be great.


Morris clearly feels strongly about his design, having worked on it for so long purely on personal initiative. The Perl 6 community interacts mainly in #perl6. It would be nice of Morris to appear there to discuss his ideas. Presenting a semi-complete concept at YAPC|10 with more Snobol4 details than Perl 6 suggestions would not have been very persuasive.

Most programmers react very positively to the existing Perl 6 grammar and regex design (Synopsis 5). P6ers often rate it as a "killer feature". Support for abandoning it would be nil. So Morris proposes coexistence to enable gradual migration. S05 already allows for :Perl5 or :P5 regex modifiers, so an :Alt modifier should be no problem. But do enough people want it? The average reaction, as reported within the proposal, can be summarised as lukewarm.

The proposal withholds the details, instead touting Snobol4 features.

Feature (A) contrasts with S05 by adding a build time phase. S05 justified omitting one for execution efficiency. There are occasional questions in #perl6 about building patterns at runtime, so it merits consideration.
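To make the "build time phase" idea concrete, here is a minimal sketch in Python (not Perl 6, and not the design from the proposal). It assumes a pattern is just a function from `(text, pos)` to a new position or `None`. The names `lit`, `seq`, `alt` and `build_keyword_pattern` are hypothetical, and `alt` uses simple ordered choice with no backtracking into earlier alternatives.

```python
from typing import Callable, Iterable, Optional

# Assumption: a pattern is a plain function (text, pos) -> end position or None.
Pattern = Callable[[str, int], Optional[int]]

def lit(s: str) -> Pattern:
    """Match the literal string s at the current position."""
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def seq(*parts: Pattern) -> Pattern:
    """Match every part in order; fail as soon as one part fails."""
    def match(text: str, pos: int) -> Optional[int]:
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

def alt(*choices: Pattern) -> Pattern:
    """Ordered choice: return the result of the first alternative that matches."""
    def match(text: str, pos: int) -> Optional[int]:
        for choice in choices:
            end = choice(text, pos)
            if end is not None:
                return end
        return None
    return match

def build_keyword_pattern(words: Iterable[str]) -> Pattern:
    """Build phase: assemble a pattern from data only known at runtime."""
    return alt(*(lit(w) for w in words))

# The keyword list could come from a config file or user input.
keywords = ["while", "if", "else"]
stmt = seq(build_keyword_pattern(keywords), lit(" "), lit("x"))
```

Because patterns are ordinary values here, the build step is ordinary code, and the match step (`stmt("if x", 0)`) happens later against whatever was built.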

Feature (B) contrasts with S05 by executing actions only after match time, eliminating the problem of undesired side effects. If Perl 6 needs that behaviour, it can just specify it in S05 now.
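A rough sketch of what "actions only after match time" could mean, again in Python and under my own assumptions rather than anything in the proposal or S05: each matcher returns its end position together with a list of pending actions, and the hypothetical `run` helper fires them only once the whole match has succeeded. Actions collected on a branch that later fails are simply dropped.

```python
from typing import Callable, List, Optional, Tuple

Action = Callable[[], None]
# Assumption: a successful match yields (end position, actions still to run).
Result = Optional[Tuple[int, List[Action]]]
Pattern = Callable[[str, int], Result]

def lit(s: str, action: Optional[Action] = None) -> Pattern:
    """Match literal s; record (but do not run) the optional action."""
    def match(text: str, pos: int) -> Result:
        if not text.startswith(s, pos):
            return None
        return pos + len(s), ([action] if action else [])
    return match

def seq(*parts: Pattern) -> Pattern:
    """Match parts in order, accumulating their pending actions."""
    def match(text: str, pos: int) -> Result:
        pending: List[Action] = []
        for part in parts:
            r = part(text, pos)
            if r is None:
                return None  # pending actions are discarded, never run
            pos, acts = r
            pending.extend(acts)
        return pos, pending
    return match

def alt(*choices: Pattern) -> Pattern:
    """Ordered choice: first alternative that matches wins."""
    def match(text: str, pos: int) -> Result:
        for choice in choices:
            r = choice(text, pos)
            if r is not None:
                return r
        return None
    return match

def run(pattern: Pattern, text: str) -> bool:
    """Match from position 0, then run the collected actions on success."""
    r = pattern(text, 0)
    if r is None:
        return False
    for action in r[1]:
        action()
    return True

log: List[str] = []
pattern = alt(
    seq(lit("ab", lambda: log.append("saw ab (first branch)")), lit("X")),
    seq(lit("ab", lambda: log.append("saw ab (second branch)")), lit("c")),
)
matched = run(pattern, "abc")
```

With immediate actions, the first branch would have logged its message before failing on `X`; with deferred actions only the second branch's message ends up in `log`.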

Feature (C) suggests a minor tweak of assertions, which S05 might also be stretched to incorporate, if desired.

I think Perl 6 has an adequate specification already, and critically lacks implementation. The current form of this proposal does not change this. Retroactive funding for unsolicited work is strange; the money should be spent on removing project blockers. Since the proposal is negotiable, I recommend that some working proof of concept, preferably implemented in Perl 6 or Perl 5, be added as a deliverable.

Disclaimer: I am not a regex engine developer, and don't know SNOBOL ;)


I just want to highlight this bit of the grant - "My grant request is intended to retroactively fund some of my past development"

I don't think that's how the grant program is supposed to work.

Of course, if the proposed future work is sufficient, this is a non-issue, but it just seems weird.


To me, this is the most exciting grant proposal in the current batch. I've always appreciated Perl's regex abilities. The innovations in Perl 6 impressed me greatly. I like the idea of Captures and the ability to unpack a regex into a grammar. However, the whole design of regexes still seems somewhat ad hoc. Though innovative, Perl 6 regexes do not show the complete makeover, and the resulting tidiness, reflected in other aspects of the Perl 6 language specification. Regex is a central and extraordinarily complex feature. The community, and Perl 6, could benefit from the experienced input of somebody like Siegel.


The language [Lua](www.lua.org) has a great pattern-matching library, [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html), which also follows the Snobol tradition.

A module lpeg.re (written with lpeg) supports a more conventional regex syntax (http://www.inf.puc-rio.br/~roberto/lpeg/re.html).

LPeg has a formal foundation (PEG: Parsing Expression Grammars), and some papers are available:
- A Text Pattern-Matching Tool based on Parsing Expression Grammars (http://www.inf.puc-rio.br/%7Eroberto/docs/peg.pdf)
- A Parsing Machine for PEGs (http://www.inf.puc-rio.br/%7Eroberto/docs/ry08-4.pdf)
- Parsing Expression Grammars: A Recognition-Based Syntactic Foundation (http://www.brynosaurus.com/pub/lang/peg.pdf)
- Packrat Parsing: Simple, Powerful, Lazy, Linear Time (http://www.brynosaurus.com/pub/lang/packrat-icfp02.pdf)


If there's no interest from the Perl 6 core developers and language designers, this seems like a non-starter.

I think Morris should start by working with them to come up with a proposal that they'd endorse.


I think this grant proposal is quite interesting, but one thing bothers me: it doesn't even mention multi dispatch as an alternative way to match non-strings, especially nested data structures.

When last I talked with Larry about tree matching, he said that multi dispatch and subsignatures were the currently intended way to do such a thing. I think it would be very beneficial to investigate a unification of patterns and multi dispatch. The proposed approach moves in the opposite direction, if I understood that correctly.
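For readers who haven't seen dispatch-based tree matching, here is a loose analogy in Python. It is only a sketch: `functools.singledispatch` selects a candidate by the type of the first argument alone, so it is much weaker than Perl 6 multi dispatch with subsignatures, which can also destructure the argument in the signature. The node classes `Num`, `Add` and `Mul` are hypothetical.

```python
from dataclasses import dataclass
from functools import singledispatch

@dataclass
class Num:
    value: int

@dataclass
class Add:
    left: object
    right: object

@dataclass
class Mul:
    left: object
    right: object

@singledispatch
def evaluate(node):
    """Fallback candidate: no registered implementation for this node type."""
    raise TypeError(f"no candidate for {type(node).__name__}")

@evaluate.register
def _(node: Num):
    return node.value

@evaluate.register
def _(node: Add):
    # The "pattern" is the node type; the fields are unpacked in the body.
    return evaluate(node.left) + evaluate(node.right)

@evaluate.register
def _(node: Mul):
    return evaluate(node.left) * evaluate(node.right)

tree = Add(Num(2), Mul(Num(3), Num(4)))
result = evaluate(tree)  # 2 + (3 * 4)
```

Each candidate plays the role of one alternative in a tree pattern, and recursion over the children walks the nested structure.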

