Grant Proposal: Pegex Grammar for YAML
Mon, 15-Sep-2014 by
Makoto Nozaki
_We have received the following grant application "Pegex Grammar for YAML". Please leave feedback in the comments field by September 25th, 2014._
# Pegex Grammar for YAML
- Name: Ingy döt Net, David Oswald
- Amount Requested: USD $3500
## Synopsis
Make YAML.pm and YAML::Tiny driven by a common formal grammar.
## Benefits to the Perl Community
Perl has four major YAML implementations:
- [YAML](https://metacpan.org/pod/YAML)
- [YAML::Tiny](https://metacpan.org/pod/YAML::Tiny)
- [YAML::XS](https://metacpan.org/pod/YAML::XS)
- [YAML::Syck](https://metacpan.org/pod/YAML::Syck)
They all have major incompatibilities. In the past year the #yaml channel on
irc.perl.org has gotten all the right people together to resolve this. A great
next step would be to make the two pure-Perl implementations, [YAML](https://metacpan.org/pod/YAML) and
[YAML::Tiny](https://metacpan.org/pod/YAML::Tiny), grammar driven.
[Pegex](https://metacpan.org/pod/Pegex) is a Perl 6 Rules-inspired framework that greatly lowers the barriers
to writing parsers. The main goal of Pegex is to make grammars for parsing a
language or syntax as human-friendly as possible. Pegex is also extremely
fast for pure-Perl code.
Making the Load functions for [YAML](https://metacpan.org/pod/YAML) and [YAML::Tiny](https://metacpan.org/pod/YAML::Tiny) grammar driven
would bring the following benefits:
1. Both modules would parse the same YAML in exactly the same way
2. Bugs could easily be fixed for both modules in the same grammar
3. YAML::Tiny would be tinier
4. YAML.pm would become faster
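The first two benefits can be illustrated with a minimal sketch. It is written in Python rather than Perl, it does not use Pegex or Pegex grammar syntax, and it handles only a flat `key: value` subset chosen for illustration. The names `GRAMMAR`, `parse`, `load_full`, and `load_tiny` are hypothetical stand-ins, not part of any of the modules discussed here.

```python
import re

# Hypothetical shared grammar: rule name -> regex. This is NOT Pegex syntax;
# it only covers comments, blank lines, and flat "key: value" pairs.
GRAMMAR = {
    "comment": re.compile(r"^\s*(#.*)?$"),
    "pair": re.compile(r"^(?P<key>[^:#\s][^:]*?)\s*:\s*(?P<value>.*?)\s*$"),
}


def parse(text):
    """Yield (key, value) events for each pair line, driven only by GRAMMAR."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        if GRAMMAR["comment"].match(line):
            continue  # blank line or comment
        match = GRAMMAR["pair"].match(line)
        if match is None:
            raise ValueError(f"line {lineno}: no grammar rule matches {line!r}")
        yield match.group("key"), match.group("value")


def load_full(text):
    """Stand-in for a full-featured loader: rejects duplicate keys."""
    result = {}
    for key, value in parse(text):
        if key in result:
            raise ValueError(f"duplicate key {key!r}")
        result[key] = value
    return result


def load_tiny(text):
    """Stand-in for a minimal loader: the last value for a key wins."""
    return dict(parse(text))


doc = "# settings\nname: pegex\nport: 8080\n"
full = load_full(doc)
tiny = load_tiny(doc)
```

Because both loaders consume events from the same `parse` function, they agree on any document the grammar accepts that has no duplicate keys, and a fix to a rule in `GRAMMAR` reaches both at once.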
## Deliverables
This project will provide:
- Pegex grammar for YAML
- [YAML](https://metacpan.org/pod/YAML) and [YAML::Tiny](https://metacpan.org/pod/YAML::Tiny) parser/loaders based on the grammar
- Common test suite proving compatibility
## Project Details
Pegex is four years old and has several CPAN modules based on it. It makes
language-defining grammars appear crystal clear. It has undergone an
optimization development phase that makes it very fast.
Recent work was done to get YAML indentation working in Pegex. This was a
major hurdle. This is now a good time to make a complete YAML grammar. Since
Pegex works in many languages, eventually there will be exactly compatible
YAML parsers in Perl, Python, Ruby, JS, etc.
## Inch-stones
- Write a grammar for YAML in Pegex
  - Grammar will be well documented
  - Each grammar rule will be tested
- Convert [YAML](https://metacpan.org/pod/YAML) to use the grammar for its loader
- Convert [YAML::Tiny](https://metacpan.org/pod/YAML::Tiny) to use (a subset of) the grammar for its loader
- Both modules pass a common test suite
## Project Schedule
This project will take 2-3 months and can be started immediately upon
acceptance.
## Completeness Criteria
Both modules released to CPAN, using the new Pegex grammar and passing the
same tests. Pegex/YAML grammar published in its own GitHub repo.
## Bio
Ingy döt Net invented the YAML language, is the author and maintainer of
[YAML](https://metacpan.org/pod/YAML) and [YAML::XS](https://metacpan.org/pod/YAML::XS), and is one of the people currently actively maintaining
[YAML::Tiny](https://metacpan.org/pod/YAML::Tiny). He is also the creator of the Pegex parsing framework.
David Oswald has been a Perl user for over a decade, is an author of several
CPAN modules, and maintainer of more. David also runs Salt Lake Perl Mongers.
Ingy and David work well together and have decided to collaborate on a number
of projects that will benefit Perl and Software Development.
Comments (4)
Given the move over the past few years away from YAML to JSON (for example, the switch to META.json) what is the relative importance of YAML to the ongoing success of Perl?
While there might be some preference to use JSON over YAML on some Perl projects/files, we don't think YAML is losing relevance in Perl. YAML adoption on the whole (Open Software) seems to be firmly on the rise.
Note that while YAML and JSON are in a common space, YAML is a complete serialization language where JSON is a data language with a limited (albeit very useful) scope. To my knowledge there is no other format that offers the following (in Perl, let alone across languages):
* Complete object serialization
* Plain text, human readable/writable
* References and circular references
* Support for any data types
* Comments
This proposal makes a great stride towards having our Perl YAML parsers be readily understandable, community maintainable, spec compliant, and perfectly in sync.
That said, it is definitely the most ambitious of the proposals. As with all of these proposals, these are things that we plan to work on over time (regardless of compensation), but the grant allows us to be able to immediately focus on them to completion. Perhaps this one will be more compelling after a bit more work.
— Ingy and David
Is it possible to change the implementation of YAML::Tiny, given that it is currently wrapped in a thin API and released as a dual-life module with perl, as CPAN::Meta::YAML, and therefore cannot use any non-core modules?
One point of ::Tiny is that the implementation must be tiny, and that includes not just the module itself but also its runtime dependencies. Pegex is not so tiny, so I don't think that changing YAML::Tiny to use Pegex would preserve its tininess.
But I would be supportive of a grant for Pegex::YAML or YAML::Pegex.
Replacement of the current YAML.pm implementation would also need discussion once we could benchmark it (on performance and features) against other implementations.