Grant Report: Robust Perl 6 Unicode Support - June 2017

Samantha McVey has made progress on her grant to improve the robustness of Unicode support in Rakudo. She is working in the following repos: https://github.com/samcv/UCD, https://github.com/samcv/Unicode-Grant.

Here are a few highlights from her complete blog post.

The script reads the GraphemeClusterBreak.txt file from the Unicode 9.0 test suite and tests the contents of each grapheme individually.

Previously we only checked the total ‘.chars’ count for each string as a whole. Obviously we want something more precise than that, since the test specifies the location of each of the breaks between codepoints. The new code checks that codepoints are put in the correct graphemes, in the proper order. We also still check the string length.

This new test uses a grammar to parse the file and is generally much more robust than the previous script.
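
To illustrate why checking each grapheme is stricter than counting them, here is a minimal Python sketch. It is not the Perl 6 grammar used in the actual test. It assumes the break-test line format in which ÷ marks a break, × marks "no break", code points are written in hexadecimal, and anything after # is a comment. The function names are hypothetical.

```python
def parse_break_test_line(line):
    """Return the expected graphemes as lists of code points, or None for a blank/comment line."""
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not data:
        return None
    clusters, current = [], []
    for token in data.split():
        if token == "÷":                   # break: close the current grapheme
            if current:
                clusters.append(current)
                current = []
        elif token == "×":                 # no break: keep extending the grapheme
            continue
        else:
            current.append(int(token, 16)) # hexadecimal code point
    if current:
        clusters.append(current)
    return clusters


def check_graphemes(expected, actual):
    """Compare grapheme contents in order, and also the overall count."""
    problems = []
    if len(expected) != len(actual):
        problems.append(f"expected {len(expected)} graphemes, got {len(actual)}")
    for index, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            problems.append(f"grapheme {index}: expected {want}, got {got}")
    return problems


expected = parse_break_test_line("÷ 0020 × 0308 ÷ 0020 ÷\t# sample line")
# A wrong segmentation that still yields two graphemes: a count-only check would pass it.
wrong = [[0x20], [0x308, 0x20]]
print(check_graphemes(expected, wrong))
```

The `wrong` segmentation has the same number of graphemes as the expected one, so only the per-grapheme comparison reports a problem.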

  • I have some tests which are not yet merged and will have to wait. Sections of them are complete, though, and that code is being reused as part of the larger Unicode Database Retrofit.

  • I have written grammars and modules to process and provide data on the PropertyValueAliases and PropertyAliases files. They will be used for testing that all of the canonical property names and all the property values themselves properly resolve to separate property codes, as well as that they are usable in regexes.

  • As part of my grant work I am working on making Unicode property values distinct per property, and also on allowing all canonical Unicode property values to work.

  • I've also started adding some documentation to my Unicode-Grant wiki with information about what is contained in each of the Unicode data files; there are a few other pages as well. The plan is to expand this wiki to have many more sections than it does currently.
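
As a rough illustration of what "distinct per property" means for the alias data mentioned in the list above, here is a small Python sketch. It is not the grant's Perl 6 modules. It assumes semicolon-separated lines of the form `property ; short value ; long value ; further aliases`, with # starting a comment; the sample lines are written in that style for illustration, and the function name is hypothetical. Entries that use a different field layout (such as numeric combining classes) are not handled.

```python
def parse_value_aliases(lines):
    """Map (property, value alias) to the long value name, keeping properties separate."""
    table = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()        # drop comments and blank lines
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        prop, names = fields[0], fields[1:]
        # Assumption: the second value field is the long name when present.
        long_name = names[1] if len(names) > 1 else names[0]
        for name in names:
            table[(prop, name)] = long_name
    return table


SAMPLE = [
    "# illustrative lines only",
    "ea    ; N ; Neutral",
    "Alpha ; N ; No ; F ; False",
]

table = parse_value_aliases(SAMPLE)
# The short value "N" resolves differently depending on the property it belongs to.
print(table[("ea", "N")], table[("Alpha", "N")])
```

Keying the table on the pair of property and value, rather than on the value alone, is what keeps an alias like `N` from colliding across properties.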

MAJ

About this Entry

This page contains a single entry by Mark A Jensen published on June 6, 2017 3:26 AM.
