February 2013 Archives

I am pleased to announce that Ricardo Signes's grant application for travel to the Perl QA Hackathon has been accepted. I would like to thank everyone who took the time to provide feedback on this grant application.

The Perl QA Hackathon is taking place in Lancaster from Friday April 12th to Sunday April 14th 2013. If you would like to contribute directly to the 2013 Perl QA Hackathon they are accepting donations.

If you would like to help fund grants like this or any of our other projects please visit our donation system.

Nicholas Clark writes:

As per my grant conditions, here is a report for the January period.

After catching up with some of the e-mail backlog, I tried having another prod at the awkward clang/ASAN problem described last month.

It turned out that I missed something. Key to replicating the problem was that one must turn on clang's optimiser. Of course, trying to debug a problem in the C code, I'd gone off down the path of trying debugging builds, i.e. -g, which by default don't enable the optimiser. (Debugging code with the optimiser enabled tends to screw with your head, because the debugger reports execution jumping backwards and forwards.) I hadn't tried compiling with -O on a local machine, and that does replicate the problem. Better still, on that machine, -O -g together still replicates the problem. So, now I can stick gdb on it:

    1..31
    ok 1 - $^H{foo} doesn't exist initially
    ok 2 - $^H doesn't contain HINT_LOCALIZE_HH initially
    ok 3 - $^H{foo} is now 'a'
    ok 4 - $^H contains HINT_LOCALIZE_HH while compiling
    ok 5 - $^H{foo} is now 'b'
    ok 6 - $^H{foo} restored to 'a'
    ok 7 - $^H{foo} doesn't exist while finishing compilation
    ok 8 - $^H doesn't contain HINT_LOCALIZE_HH while finishing compilation
    Breakpoint 1, 0x000000000042e440 in __asan_report_error ()
    (gdb) up
    #1  0x000000000042f7a7 in __asan_report_load8 ()
    (gdb)
    #2  0x00000000004fa030 in Perl_re_op_compile (patternp=<optimized out>,
        pat_count=-1292, expr=0xffffffffb6e, eng=<optimized out>, old_re=0x0,
        is_bare_re=0x7fffffffdba4, pm_flags=<optimized out>,
        orig_rx_flags=<optimized out>) at regcomp.c:5634
    5634        if (   old_re
    (gdb) p old_re
    $1 = (REGEXP * volatile) 0x0
    (gdb) p &old_re
    Address requested for identifier "old_re" which is in register $r8

Odd. How come a volatile variable thinks that it's in a register? And it's an LVALUE, so why can't I take its address? What does the compiler think - let's try some "printf" debugging:

    diff --git a/regcomp.c b/regcomp.c
    index a8b27dc..342e772 100644
    --- a/regcomp.c
    +++ b/regcomp.c
    @@ -5631,6 +5631,7 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_cou
         /* return old regex if pattern hasn't changed */
    +    PerlIO_printf(PerlIO_stderr(), "%p %p\n", old_re, &old_re);
         if (   old_re
             && !recompile
            && !!RX_UTF8(old_re) == !!RExC_utf8

and we stop at the same place:

    1..31
    ok 1 - $^H{foo} doesn't exist initially
    ok 2 - $^H doesn't contain HINT_LOCALIZE_HH initially
    ok 3 - $^H{foo} is now 'a'
    ok 4 - $^H contains HINT_LOCALIZE_HH while compiling
    ok 5 - $^H{foo} is now 'b'
    ok 6 - $^H{foo} restored to 'a'
    ok 7 - $^H{foo} doesn't exist while finishing compilation
    ok 8 - $^H doesn't contain HINT_LOCALIZE_HH while finishing compilation
    0 7fffffffd760
    0 7fffffffd760
    Breakpoint 1, 0x000000000042e440 in __asan_report_error ()
    (gdb) up
    #1  0x000000000042f7a7 in __asan_report_load8 ()
    (gdb) up
    #2  0x00000000004fa060 in Perl_re_op_compile (patternp=<optimized out>,
        pat_count=-1292, expr=0xffffffffb6e, eng=<optimized out>,
        old_re=<optimized out>, is_bare_re=0x7fffffffdba4,
        pm_flags=<optimized out>, orig_rx_flags=<optimized out>) at regcomp.c:5634
    5634        PerlIO_printf(PerlIO_stderr(), "%p %p\n", old_re, &old_re);
    (gdb) p old_re
    $2 = <optimized out>
    (gdb) p &old_re
    Can't take address of "old_re" which isn't an lvalue.

Can't take the address of it? That's despite the printf clearly showing that its address is 0x7fffffffd760.

So something is clearly going badly wrong with the code generation here. It's probably a bug in clang. But (a) it's not clear how to reduce it to a terser test case, and (b) that's not actually helpful, as I really need to make this code work on the clang we have here and now, as without this, we can't usefully use ASAN any more to find bugs.

But some things started to come together. Whilst this specific failure is probably a bug in clang, the function isn't exactly innocent. A couple of people had already reported that they could get everything to pass if they marked more variables as volatile. But playing whack-a-mole with volatile is papering over the symptoms, not finding the cause. However, digging further into the blame log for the function revealed a couple of previous commits in the past few years adding volatile qualifiers, to fix compiler warnings due to the addition of a call to setjmp(). So what's that all about?

The setjmp() relates to an optimisation added by commit bbd61b5ffb7621c2 in September 2010. The backstory is that the compiled format for regular expressions differs if it needs to store Unicode. (The documentation refers to the human-readable form as the pattern, and the compiled form as the program.) The Unicode representation for the program is larger, and cannot be matched as efficiently, so it's preferable to use the tighter byte-based format where possible. Unfortunately, it's not always possible to know for sure which is the right decision until midway through parsing. If the pattern contains literal Unicode, it's obvious that the program needs to store Unicode. Otherwise, the parser optimistically assumes that the more efficient representation can be used, and starts sizing on this basis. However, if it then encounters something in the pattern which must be stored as Unicode, such as an \x{...} escape sequence representing a character literal, then this means that all previously calculated sizes need to be redone, using values appropriate for the Unicode representation.

The problem is that the point where the parser discovers that it needs to redo everything from the start is at least 4 functions down in the best case, and possibly a lot further if the "problem" construction is within parentheses groups. (The parser calls back to the top level to parse parentheses.) Before the commit mentioned above, the recalculation was implemented by setting a flag of "really, this needs Unicode", and simply carrying on. The top level call spotted the flag, and restarted the parse. Obviously, this is rather wasteful, particularly if the problem is detected very early in the pattern.

So that commit refactored things so that the top level caller used setjmp() to store a checkpoint, and upon detecting the need for a restart used longjmp() to warp straight back to it. This skips doing needless work, which is good. The problem is the warp bit - it's far more "warp" than "jump", as the C compiler is not expecting the perfectly innocent looking setjmp() function to return twice. C doesn't have continuations, so this doesn't happen. But of course, it does, and it is breaking the rules, introducing control flow that the compiler has no idea about, so you have to start (effectively) lying to the compiler left right and centre, by declaring variables volatile, forbidding it to do any (sane, reasonable, normal) optimisation, so that the non-local control flow doesn't come unstuck. And it's easy to miss one that matters, as the compiler doesn't warn.
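To make the hazard concrete, here is a minimal sketch of the setjmp()/longjmp() restart pattern. It is not the code from regcomp.c: the names restart_point, parse_deeply and compile_with_longjmp are invented for illustration. What it tries to show is that any local variable modified after setjmp() and read after the longjmp() has to be declared volatile, and that a compiler is not obliged to tell you if you forget.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical names throughout; this is a toy, not code from regcomp.c. */
static jmp_buf restart_point;

/* Stands in for a deeply nested parser routine. On the byte-based attempt
 * it "discovers" that everything must be redone and jumps straight back to
 * the checkpoint, skipping every caller in between. */
static void parse_deeply(int use_unicode)
{
    if (!use_unicode)
        longjmp(restart_point, 1);
}

/* Returns the number of attempts the toy "compile" needed. */
int compile_with_longjmp(void)
{
    /* Both locals are modified after setjmp() and read after longjmp().
     * Without volatile their values after the jump are indeterminate,
     * because the optimiser may have kept them in registers. */
    volatile int attempts = 0;
    volatile int use_unicode = 0;

    if (setjmp(restart_point) != 0) {
        /* Second return from setjmp(): we got here via longjmp(). */
        use_unicode = 1;
    }

    attempts++;
    assert(attempts <= 2);      /* the toy parser restarts at most once */
    parse_deeply(use_unicode);
    return attempts;
}
```

Calling compile_with_longjmp() goes round once in byte mode, jumps back, and completes on the second attempt. Dropping either volatile is exactly the kind of mistake that only shows up once the optimiser is enabled.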

So the question then became, can I implement the early return in some way other than longjmp()? It sort of looked plausible from an initial look at the functions - they only return NULL for special cases, and everywhere checks the return value and propagates NULL onward. So hook into that.

All, that is, except this call to S_regbranch():

        if (is_define)
            vFAIL("(?(DEFINE)....) does not allow branches");
        lastbr = reganode(pRExC_state, IFTHEN, 0); /* Fake one for optimizer. */
        regbranch(pRExC_state, &flags, 1,depth+1);

So is that a bug I've just found? Or something more subtle that doesn't matter? What's with the return value of S_regbranch(), and is it safe to ignore? The code in question dates from November 1997, and is part of Ilya's "Jumbo regexp patch" (commit c277df42229d99fe). More confusingly, it added code with two calls from S_reg() to S_regbranch(), one of which carefully checks the return value and generates a LONGJMP node if it returns NULL, the other of which is called in void context, and so ignores both any return value and the possibility that it is NULL.

So, to try to figure this out...

As documented in pod/perlreguts.pod, the call graph for regex parsing involves several levels of functions in regcomp.c, sometimes recursing more than once.

The top level compiling function, S_reg(), calls S_regbranch() to parse each single branch of an alternation. In turn, that calls S_regpiece() to parse a simple pattern followed by quantifier, which calls S_regatom() to parse that simple pattern. S_regatom() can call S_regclass() to handle classes, but can also recurse into S_reg() to handle subpatterns and some other constructions. Some other routines call S_reg(), sometimes using an alternative pattern that they generate dynamically to represent their input.

These routines all return a pointer to a regnode structure, and take a pointer to an integer that holds flags, which is also used to return information. After quite a bit of head scratching and figuring things out I have untangled the possible return values from these 5 functions (and related functions which call S_reg()). The return value is either a pointer to the last node added to the program, or NULL. The pointer is to something already accessible elsewhere, so it's just a convenience return. Nothing leaks by ignoring it. But what about those NULL returns?

The obvious cause of NULL returns is when S_reg() encounters a pragma-like construction - an embedded pattern match operator, such as (?i). In this case, it consumes it, acts on it, and sets the flags to TRYAGAIN and returns NULL. But the code seemed to be prepared to cope with other NULL returns, without the TRYAGAIN flag. So, can those happen?

Starting from the top:

S_reg() will return NULL and set the flags to TRYAGAIN at the end of pragma-like constructions that it handles. Otherwise, historically it would return NULL if S_regbranch() returned NULL. In turn, S_regbranch() would return NULL if S_regpiece() returned NULL without setting TRYAGAIN. If S_regpiece() returns TRYAGAIN, S_regbranch() loops, and ultimately will not return NULL.

S_regpiece() returns NULL with TRYAGAIN if S_regatom() returns NULL with TRYAGAIN, but (historically) if S_regatom() returns NULL without setting the flags to TRYAGAIN, S_regpiece() would too. Where S_regatom() calls S_reg() it has similar behaviour when passing back return values, although often it is able to loop instead on getting a TRYAGAIN.

Which gets us back to S_reg(), which can only generate NULL in conjunction with TRYAGAIN. NULL without TRYAGAIN could only be returned if a routine it called generated it. All other functions that these call that return regnode structures cannot return NULL. Hence

1) in the loop of functions called, there is no source for a return value of NULL without the TRYAGAIN flag being set

2) a return value of NULL with TRYAGAIN set from an inner function does not propagate out past S_regbranch()

Hence the only return values that most functions can generate are non-NULL, or NULL with TRYAGAIN set, and as S_regbranch() catches these, it cannot return NULL. The longest sequence of functions that can return NULL (with TRYAGAIN set) is S_reg() -> S_regatom() -> S_regpiece() -> S_regbranch(). Rapidly returning right round the loop back to S_reg() is not possible.

Hence code added by commit c277df42229d99fe to handle a NULL return from S_regbranch(), along with some other code, is dead.
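The argument above can be modelled in a few lines. This is a toy, not the real parser: toy_piece and toy_branch, the flag value, and the use of '!' to stand in for a pragma such as (?i) are all assumptions made for illustration, as is the placeholder node emitted for an empty branch. It only aims to show why a NULL that always comes with TRYAGAIN can never escape from the branch level.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TRYAGAIN 0x1        /* hypothetical flag value */

typedef struct { char what; } toy_node;

/* Stands in for "the last node added to the program". */
static toy_node last_node;

/* Toy piece parser: consumes one character. '!' stands in for a
 * pragma-like construction: it is consumed, emits nothing, and is
 * reported as NULL with TOY_TRYAGAIN set. */
static toy_node *toy_piece(const char **p, int *flags)
{
    char c = *(*p)++;
    if (c == '!') {
        *flags |= TOY_TRYAGAIN;
        return NULL;
    }
    last_node.what = c;
    return &last_node;
}

/* Toy branch parser: loops on TRYAGAIN, so NULL never propagates out. */
toy_node *toy_branch(const char *p)
{
    toy_node *last = NULL;

    assert(p != NULL);
    while (*p) {
        int flags = 0;
        toy_node *piece = toy_piece(&p, &flags);
        if (piece == NULL) {
            if (flags & TOY_TRYAGAIN)
                continue;       /* pragma consumed; parse the next piece */
            return NULL;        /* dead code here: no other source of NULL */
        }
        last = piece;
    }
    if (last == NULL) {
        /* Assumption of this model: an empty branch still emits a node. */
        last_node.what = '\0';
        last = &last_node;
    }
    return last;
}
```

In this model the `return NULL` inside the loop plays the role of the dead code: it is there to cope with a case that nothing below it can produce.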

More usefully, this means that all the twisted maze of code makes some sense - the only "can happen" NULL return value is with the flags TRYAGAIN. Which means that it is going to be possible to use a NULL return value with a different flag to signal "restart with Unicode", and eliminate the longjmp().

However, a couple of things still stood in the way. One was working out that all other possible return paths from calls to S_reg() were covered. Various routines in the parser for tricky constructions (eg case insensitive Unicode folds) are implemented by building up a pattern that represents the construction in question, and then making a call to S_reg() to parse that as if it were written in place of the original. So I needed to check each code path that could get to one of these, to work out whether it could trigger the restart. Most couldn't, but three or four could, and two of them didn't have tests. They do now.

Finally I killed the longjmp(). Firstly by moving it up into the same function as the setjmp() and then by replacing it with a goto. Yes, unfortunately still evil. But definitely the lesser of two evils. Particularly as clang was now happy, and address sanitiser reports no errors. I pushed it to a smoke-me branch to let it stew.
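As a rough sketch of the shape this takes, assuming invented names (RESTART_UTF8, parse_atom, parse_branch, compile_pattern) and using the letter 'U' to stand in for something like an \x{...} escape: the inner routine returns NULL with a distinct flag, every caller propagates the NULL, and only the top level acts on it, with an ordinary goto. None of this is the actual regcomp.c code.

```c
#include <assert.h>
#include <stddef.h>

#define RESTART_UTF8 0x2        /* hypothetical flag: redo sizing as Unicode */

typedef struct { int unused; } node;
static node dummy_node;

/* Innermost level. In byte mode, meeting 'U' means the sizes computed so
 * far are wrong, so report that upwards instead of jumping. */
static node *parse_atom(const char **p, int utf8, int *flags)
{
    if (**p == 'U' && !utf8) {
        *flags |= RESTART_UTF8;
        return NULL;
    }
    (*p)++;
    return &dummy_node;
}

/* Intermediate level: just propagates NULL; the flags say why. */
static node *parse_branch(const char **p, int utf8, int *flags)
{
    node *last = &dummy_node;
    while (**p) {
        last = parse_atom(p, utf8, flags);
        if (last == NULL)
            return NULL;
    }
    return last;
}

/* Top level: returns 1 if the Unicode format was needed, else 0, and
 * stores the number of passes over the pattern in *passes. */
int compile_pattern(const char *pattern, int *passes)
{
    int utf8 = 0;
    const char *p;
    int flags;

    assert(pattern != NULL && passes != NULL);
    *passes = 0;
  restart:
    p = pattern;
    flags = 0;
    (*passes)++;
    if (parse_branch(&p, utf8, &flags) == NULL && (flags & RESTART_UTF8)) {
        utf8 = 1;
        goto restart;           /* plain local control flow, no volatile needed */
    }
    return utf8;
}
```

Because the restart is now ordinary control flow inside one function, the compiler can see it, and none of the locals need to be volatile.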

As part of this, I discovered a recent regression in the SV dumping code. Father Chrysostomos fixed problems with the interaction between regex objects and LVALUE magic by changing the location in the SV wrapper of the pointer to the pattern. However, no-one noticed that the SV dumping code wasn't aware of this change, hence dumping regexs no longer worked. Dumping regexs is rather useful when you're trying to figure out what the regex parser is doing :-(. So I added tests for the dumper, and fixed the bug.

During all this I'd had a couple of further insights into hashing, so spent some more time investigating them. I sent an updated summary to the security list. There's nothing I want to report publicly, other than the fact that I'm still comfortable with the current state of all stable releases and blead.

Finally, of note this month, I killed the Rhapsody port. I've been forgetting to note that I'd been removing code related to one dead OS each month. This month was Rhapsody, an Apple OS that later evolved into Darwin and Mac OS X. It was initially only released to developers, but later became Mac OS X Server, with releases in 1999 and 2000. It was obsoleted by Mac OS X 10.0, released in March 2001. That's 142 lines gone, no longer getting in the way:

    Configure               |   2 +-
    Cross/Makefile-cross-SH |   2 +-
    MANIFEST                |   1 -
    Makefile.SH             |   2 +-
    hints/rhapsody.sh       | 138 ---------------------------------------------
    installperl             |   4 +-
    pod/perldelta.pod       |   6 +--
    t/op/stat.t             |   3 +-
    8 files changed, 8 insertions(+), 150 deletions(-)

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules

    Hours   Activity
     0.25   MOP
    36.50   PERL_HASH
     1.00   POSIX::strptime
     0.25   RT #24689
     0.75   Rhapsody
     0.75   STRANGE_MALLOC
     6.00   SVt_REGEXP and sv_dump
     7.50   clang
     0.50   ext/B/t/optree_misc.t
     1.00   hashing
     1.00   investigating security tickets
     0.50   module evictions
     6.25   process, scalability, mentoring
    26.00   reading/responding to list mail
    42.25   regcomp/setjmp
            regcomp/setjmp (killed longjmp)
     1.00   smoke-me/jrobinson/configure-for-cross
     0.50   smoke-me/threads-shared-stress
     0.25   timely destruction

132.25 hours total

YAPC::NA::2014 Call for Location

We need people who are excited about Perl and want YAPC to come to their city. Forget the
Mississippi rule, let's go all across America! (Come on, West Coast groups, and get those bids done.)

Please have a prepared bid with location and simple budget emailed to me (heath at perlfoundation.org) by May 15th, 2013!

So get your ideas and plans together and let's get YAPC::NA::2014 planned.

As chair of the Grants Committee, I am sorry to report that the committee did not receive any grant proposals to be funded this quarter.

This is the second consecutive quarter without grant proposals.

We have received the following Perl 5 grant application from Ricardo Signes.

Before we vote on this proposal we would like to have a period of community consultation that will last seven days. Please leave feedback in the comments or, if you prefer, send email with your comments to karen at perlfoundation.org.

Name

Ricardo Signes

Project Title

Perl QA Hackathon 2013

Amount Requested:

$1200

Synopsis

This grant will be used to pay for travel for Ricardo Signes to and from the Perl QA Hackathon in England in Q2 2013.

Benefits to Perl 5

I have attended four of the five Perl QA Hackathons (Oslo, Birmingham, Amsterdam, Paris) and have, at each of them, been able to contribute several solid work days of very productive work to the infrastructure behind the CPAN and related tools. Specifically, I was one of the chief implementors of the new CPAN Testers platform (Metabase) and built the Fake CPAN system for testing CPAN tools, and several reusable software libraries that are used to power both Metabase and Fake CPAN. In 2012, I worked on refactoring PAUSE, adding tests and improving maintainability. PAUSE is the system which processes contributor uploads to the CPAN, manages CPAN contributor identity, and builds the CPAN indexes used by CPAN clients to locate libraries for installation.

In previous years, I also spent a significant amount of time working with other attendees on their contributions, and plan to do the same this year. This is one of the several reasons that attendance in person is incomparably superior to "virtual attendance."

Deliverable Elements

The QA Hackathon does not have a set agenda, so promising specific work product from it up front seems unwise. I have detailed, above, the sort of work that I am almost certain to do, however. Further, I will provide a public, written report of my activities at the Hackathon.

I hope, in particular, to work on PAUSE, to discuss issues relating to the toolchain in the core perl distribution, and to discuss mechanisms for improving core and CPAN issue tracking. There's also a good chance that I'll use some of this time to work on some of the Pod toolchain issues currently underway in core: Pod::Checker, Pod::Html, installhtml, and so on.

The hackathon takes place over the course of three days, with eight to ten hour workdays. I'll probably also be working while travelling and in the evenings.

Any software that I produce will be released under the Perl 5 standard license terms.

Applicant Biography

I have been building software in Perl professionally for about thirteen years. I am a frequent contributor of original software to the CPAN and a frequent contributor to, or maintainer of, other popular CPAN libraries. I am also a contributor to the core Perl 5 project, and its current project lead.

I have been the recipient of TPF grants three times before, all of which were successful.

Dr. Nicholas Clark has requested an extension of $20,000 for his Improving Perl 5 grant. This grant started in September 2011 and is on track to finish successfully in February 2013. The requested extension would allow Nicholas to devote another 400 hours to the project. The funds for this extension would come from the Perl 5 Core Maintenance Fund.

As well as the weekly reports posted on the p5p mailing list, Nicholas provides detailed monthly reports. The most recent of these can be found in the following blog posts:

November 2012
December 2012

Before we make a decision on this extension we would like to have a period of community consultation that will last for seven days. Please leave feedback in the comments or, if you prefer, email your comments to karen at perlfoundation.org.

About TPF

The Perl Foundation - supporting the Perl community since 2000. Find out more at www.perlfoundation.org.

About this Archive

This page is an archive of entries from February 2013 listed from newest to oldest.
