Improving Perl 5: Grant Report for Month 5

Nicholas Clark writes:

I finally got time to address the weak reference global destruction "panic" described in November's report. Now understanding the problem, the interim fix seemed clean and fast to implement. Of course - no plan survives contact with the enemy. In this case, fixing the first panic revealed a second panic at a different point in the code. In turn, that took a lot of time to nail down (saved debugger logs, hardware watchpoints, re-runs, occasionally crashing gdb. The usual), but in the end turned out to be the backreference between a CV and its typeglob, mutually destroyed at the same time.

This investigation was hindered by the form of the backref panic messages.
Perl_sv_del_backref() contained these:

    if (!svp || !*svp)
        Perl_croak(aTHX_ "panic: del_backref");

    else {
        /* optimisation: only a single backref, stored directly */
        if (*svp != sv)
            Perl_croak(aTHX_ "panic: del_backref");
        *svp = NULL;
    }

It took a while to realise that the "new" panic was not from the same spot as the old, now-fixed panic, because the text is identical. Moreover, in each case diagnostic information is thrown away. "panic"s aren't supposed to occur, hence when the "impossible" does happen it can often be genuinely impossible to work out what the chain of events leading to it was, let alone replicate them. So I searched the codebase for panic messages, eliminating duplicates such as above, and changing as many as possible to output the values that caused panic. Hence the above two are now:

    if (!svp)
        Perl_croak(aTHX_ "panic: del_backref, svp=0");
    if (!*svp) {
        /* It's possible that sv is being freed recursively part way through the
           freeing of tsv. If this happens, the backreferences array of tsv has
           already been freed, and so svp will be NULL. If this is the case,
           we should not panic. Instead, nothing needs doing, so return. */
        if (PL_phase == PERL_PHASE_DESTRUCT && SvREFCNT(tsv) == 0)
            return;
        Perl_croak(aTHX_ "panic: del_backref, *svp=%p phase=%s refcnt=%" UVuf,
                   *svp, PL_phase_names[PL_phase], (UV)SvREFCNT(tsv));
    }

and

    else {
        /* optimisation: only a single backref, stored directly */
        if (*svp != sv)
            Perl_croak(aTHX_ "panic: del_backref, *svp=%p, sv=%p", *svp, sv);
        *svp = NULL;
    }
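
As a standalone illustration of the idea (not the Perl core code), the sketch below gives every failure site its own message and includes the offending values. The function check_single_backref(), its return codes and the buffer-based reporting are all hypothetical, chosen so the example compiles without any Perl headers.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a single-backref slot check; NOT the Perl core code.
 * On failure it writes a message into buf that is unique to the failing
 * check and returns a distinct nonzero code. On success it clears the
 * slot and returns 0. */
static int check_single_backref(void **slot, void *expected,
                                char *buf, size_t len)
{
    if (!slot) {
        /* No slot at all. */
        snprintf(buf, len, "panic: del_backref, svp=0");
        return 1;
    }
    if (!*slot) {
        /* Slot exists but is already empty. */
        snprintf(buf, len, "panic: del_backref, *svp=0");
        return 2;
    }
    if (*slot != expected) {
        /* Slot points somewhere else: report both pointers. */
        snprintf(buf, len, "panic: del_backref, *svp=%p, sv=%p",
                 *slot, expected);
        return 3;
    }
    *slot = NULL;
    if (len)
        buf[0] = '\0';
    return 0;
}
```

With messages like these, a report containing "svp=0" can only have come from the first check, and the pointer values in the third message show which two objects disagreed.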

I spent a fun-filled day digging into installhtml and Pod::Html. As the names suggest, Pod::Html generates HTML documentation from Pod, and installhtml generates HTML with it and then installs it. As of 5.14.0, both were, um, full of the wrong sort of tentacles*. Thanks to Google Summer of Code, Marc Green was able to tame Pod::Html and update it to use modern Pod infrastructure. However, all change carries risks - the potential for introducing bugs, and in this case it was the interaction between Pod::Html and installhtml. Nailing down the actual problem was not as simple as it initially looked, as problems were caused by the interactions of several changes in more than one branch, and the various programs produced a lot of diagnostic output which had to be analysed to determine whether this was the same bug or different. In the process I opened 5 bugs in RT for other issues: #107866, #107870, #107874, #107880 and #107882. Most of these predate 5.14.0 and the refactoring.

At this point I thought I was done, and reported back to the list that it should all work. The response was that it was still causing the failure of make install on Win32. One of the problems was that installhtml used to accept a --libpods argument, and if present pass this on to Pod::Html::pod2html(). The GSoC refactoring removed this argument from pod2html(), but no-one spotted that installhtml was still passing it, hence one of the failures fixed the previous week was to purge it from installhtml, and from anything that called it. So, obviously, I grep'd every file in MANIFEST for 'libpods'. That should fix it, right? Wrong. The problem is that Getopt::Long::GetOptions() defaults to accepting abbreviations of options (I believe because this is the GNU standard), and the Win32 makefiles were invoking installhtml with --libpod. No 's'. So my grep missed that.
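
The abbreviation rule is easy to sketch. The following is a minimal illustration in C of unambiguous-prefix matching, not Getopt::Long's actual implementation; match_option() and its return codes are hypothetical, and any option list passed to it is whatever the caller supplies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper illustrating unambiguous-prefix option matching.
 * Returns the index of the option name that `arg` abbreviates,
 * -1 if no name starts with `arg`, or -2 if more than one does.
 * An exact match always wins, even when it is also a prefix of a
 * longer name. */
static int match_option(const char *arg, const char *const names[],
                        size_t count)
{
    size_t arg_len = strlen(arg);
    int found = -1;

    for (size_t i = 0; i < count; i++) {
        if (strncmp(names[i], arg, arg_len) != 0)
            continue;               /* arg is not a prefix of this name */
        if (names[i][arg_len] == '\0')
            return (int)i;          /* exact match */
        /* First prefix match is remembered; any further one is ambiguous. */
        found = (found == -1) ? (int)i : -2;
    }
    return found;
}
```

Given a made-up list such as { "libpods", "podpath", "podroot" }, the argument "libpod" resolves to "libpods" while "pod" is rejected as ambiguous. A text search for the full string "libpods" cannot find a caller that spells the option "--libpod", which is exactly how the Win32 makefiles slipped through.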

In the middle of January perl5-porters traffic lit up with discussion about distributions, and how one size doesn't fit all. I think David Golden coined the jargon term "redhat-mini-onion" to refer specifically to how we work with vendors to identify the best way to meet their needs for packaging Perl for their distributions, particularly their install media and other constrained environments. I've intentionally kept the somewhat jargon term in the summary below, as it's unambiguous about what the focus should be here. Historically these sorts of threads get distracted by more general talk about creating different distributions to suit different end-user cases, but it never seems to end up with any of the passionate participants following through after the excitement dies down.

I've spent a fair chunk of January working on Pod::Functions. It's a seemingly innocuous module shipped in the perl core, allegedly only used by splitpod, a script never installed, but run from installhtml as part of installing HTML documentation. Its curious claim to fame is that the installers are special-cased not to install its documentation, and buildtoc not to mention it in perltoc, claiming

return if $file =~ qr!/lib/Pod/Functions.pm\z!; # Used only by pod itself

Except that it is used by at least 4 modules on CPAN:

http://grep.cpan.me/?q=%28use%7Crequire%29%5Cs%2BPod%3A%3AFunctions

and I infer to generate this page:

http://perldoc.perl.org/index-functions-by-cat.html

from which you can see that its obscurity has made it very very stale. For some years now perlfunc has had a section "Keywords related to the switch feature", but Pod::Functions was never updated to match. In fact, Pod::Functions is full of Don't Repeat Yourself violations, containing data which mostly duplicates perlfunc.pod.

So the intent is to eliminate the duplication by generating Pod::Functions's data structures from perlfunc.pod. My original intent was to locate the Pod file as installed in lib, but the signal in the redhat-mini-onion discussion made me realise that this would make Pod::Functions fragile and awkward, because it would depend at runtime on a 324K Pod file being present so that it could extract a few K of metadata. So instead I changed my design to parse perlfunc.pod at build time, and install a self-contained module. This work is ongoing, and threw up a few other discussion items on the list.

One of the problems we've had with managing dual life modules is knowing which code is "owned" by the core, and which code is "upstream" CPAN, where bugs should be forwarded onwards, instead of editing the files locally. To simplify tracking this, a couple of years ago I worked to get the extension-building code simplified so that we could then adapt it to use more than one directory, so that "upstream" CPAN modules would all be together, and obviously so. A side effect of this is that when an extension changes status it has to move directory, which isn't an issue with anyone building from a release tarball, but causes problems if it is done by a git update on a built tree. Until now this has often ended up with a seemingly unrelated messy build failure, and it's not possible to automatically determine the right cleanup to make to fix it. However, this week I figured out that it was possible to detect the problem at Configure time, and that bailing out with a notification was a lot better than what we had. That fix is now in blead.

A side effect of other discussions, including redhat-mini-onion, was that I also had inspiration on a long standing problem. There is a desire to make custom installations as easy as "just drop the tarballs into the core distribution and it will handle the rest". The problem is that the toolchain (eg CPAN.pm) assumes that it's running against an installed perl and builds serially, but is able to figure out dependencies and pre-requisites as it goes. In contrast the core's build system needs to know the complete list of modules up front, including build dependencies, needs to be able to build everything before installing, and has three clear steps:

1: Configure everything
2: build everything
3: test everything

where the second and third can run in parallel, maximally exploiting modern hardware. I think I can see how to reconcile the two. My thoughts are here:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2012-01/msg01422.html

Whilst this isn't my itch to scratch, I seem to be the expert on how the build system works (across *nix, Win32 and VMS), so that's my kickstarter gift to anyone who does want to work on this.

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules

 Hours  Activity
  1.00  %POSIX::SIGRT
  0.50  BBC (CPAN #73665)
  0.25  CPAN #70924
  0.50  Copyright years
  0.50  PL_check
 22.75  Pod::Functions
  0.25  RT #106864
  0.50  RT #107000
  0.25  RT #107086
  0.25  RT #107528
  0.25  RT #107962
  0.25  RT #108398
  0.50  RT #108848
  0.50  RT #37033
  0.25  RT #93428
  2.00  approaches to pruning maybe not dead code
  2.50  better panics
  1.25  checking and merging smoke-me branches
  2.00  copy overlap check
  5.75  defined @::array
  1.50  detecting duplicate extension directories
  1.50  diagnostics.pm
  0.25  documentation of supported versions
  2.25  fc
  1.00  g++ smoke failure
 13.00  global destruction backref panic
  0.50  global destruction optimisations
  9.75  installhtml
        installhtml on Win32
        installhtml, Pod::Html
  0.25  installperl
  0.25  lib/diagnostics.t
  0.25  mro
  1.75  mymalloc
  2.75  perlio
  0.75  perllocale
 11.25  process, scalability, mentoring
 52.75  reading/responding to list mail
  2.50  redhat-mini-onion
  0.25  strange File::Glob error
  2.00  strange Win32 smoke failures with enabled features
  0.75  t/re/regexp.t, Data::Dumper, miniperl
  1.00  typemap files
  0.50  untarring extra packages

148.75 hours total

* Cthulhu, rather than Flying Spaghetti Monster.

About this Entry

This page contains a single entry by Karen published on February 17, 2012 8:33 PM.
