Improving Perl 5: Grant Report for Month 4

1 Comment

Nicholas Clark writes:

The start of December was mostly spent unpicking and simplifying how embed.h is included in perl.h. embed.h is one of the core's generated headers, containing C pre-processor directives to simplify use of the C API functions, by wrapping them such that the programmer doesn't need to deal with the implementation details of ithreads.

Over the years, embed.h has come to be included by perl.h in 4 different places depending on which OS perl is being built on. The confusion started 13 years ago when a second #include "embed.h" was added later in perl.h to support the now long defunct "PERL_OBJECT" system. This later migrated to be used for all Win32 builds. Subsequently VMS and Symbian gained special-cased includes of their own in perl.h, as like Win32 they defined HAVE_INTERP_INTERN, which needed to be done before embed.h was included.

The silly part is that it turns out that there was no reason why the include of embed.h had to be early on any OS. So it's possible to simplify perl.h by taking the Win32-only #include embed.h near the end, and removing the conditional code, so that all operating systems use it.

I also eliminated code from a2p to support using Perl's malloc, which was "temporarily" commented out 13 years ago, pending a proper fix. The lack of bug reports suggest that no-one has missed this. I worked with Craig Berry to simplify vmsish.h, which contained support code for a2p on VMS that isn't actually necessary.

Related to investigating both, I spotted that conditional code for NO_EMBED could be removed from sdbm.h. The functionality of NO_EMBED was removed 13 years ago, and most every other reference purged in 2003.

There is probably quite a bit more conditional compilation code that could be removed from sdbm.h. However, I'm not sure that it's worth it, as the cost of verifying that one has done it without error is likely to be much greater than the benefit that it brings. There's a lot more low hanging fruit elsewhere, plenty more than we have people to harvest it.

The "highlight" of the month was installman, the script that make install uses to locate documentation and install it as man pages. Initially it was something I could work on offline whilst travelling by train to London to meet Jesse (and others) for Dim Sum. Except that it turned out to be a lot more messy than it first seemed.

installman is responsible for finding Pod documentation in .pm and .pod files, passing the Pod to Pod::Man to generate manpages, and then installing the manpages in the correct location. There are exceptions - we don't install documentation for certain modules, we don't install certain modules (so we shouldn't install their documentation), and we should skip test modules. Half of this overlaps with what pod/buildtoc does. However, the code was completely different, with partially inconsistent exception lists, exceptions that are no longer needed, etc.

installman and buildtoc now share as much code as possible, with the installation exception rules in one place, and all dual life Pods now install in man1/

While the code still fresh in my mind, I also refactored the common Pod installation to eliminate pod.lst. This was the master Pod data file, added in 2003 by commit 416302502f485afa as part of a refactoring to reduce the amount of manual work needed to maintain Pod-related rules in various files, and to reduce errors and inconsistencies between them. It turned out that pod.lst contained quite a lot of data which were only needed to generate the table of contents in pod/perl.pod, and that there wasn't that much extra in pod.lst that wasn't in pod/perl.pod. Hence the two were near duplicates of each other, and simplification should be possible. By careful use of =begin, =end and =for Pod directives in pod/perl.pod is has been possible to readably encode the extra data into the source of pod/perl.pod without affecting the rendering and thus eliminate pod.lst. This both simplifies the distribution, and the tasks the release manager needs to perform each month.

In the process of doing this, I discovered that the changes that permitted dual-life Pods to move from pod/ to their owner distributions in dist/ and cpan/ had introduced another regression post v5.15.0 - the Pod files were now installed as if they were modules, into lib/, instead of the correct location with all the other core Pod files. In theory, this was a simple change to the installperl script to get it to use the code common to installman, buildtoc and a couple of other helper scripts. In practice, it turned out to be more complex, as there was another organic mess that needed simplification.

Key to this was changing the invocation of File::Find::find() that processes lib/ to run with 'no_chdir'. This allows that code to call other routines that use pathnames relative to the top of the build tree to access files, and as a side effect eliminates the localised package variable $::depth which was used to maintain a faked path prefix for display purposes to counter the effects of the directory change.

There is still a considerable amount of sanity refactoring that could be done on installperl. In particular, the code that processes lib/ has accumulated a lot of exception rules, at least some of which are now redundant and should be removed. This is left as an exercise to any reader who enjoys the mental challenge of figuring out the history of code from the "blame" log:

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in
RT #... is a bug in
CPAN #... is a bug in
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules

0.2593ea32b83e27783a and vmsish.h
0.25NO_EMBED in sdbm.h
POSIX::dup2 RT #98912
a2p and vmsish.h
8.25abolish pod.lst
1.00checking and merging smoke-me branches
0.75cross compiling
12.75embed.h in perl.h, perl.h in x2p
6.50fc [foldcase]
4.75fileno on a localised tied handle
1.50global destruction backref panic
global destruction backref panic (analysis)
6.25process, scalability, mentoring
1.25questions about saving stacks
19.50reading/responding to list mail
0.25suidperl purge
0.25t/re/regexp.t cleanup and POSIX::set_locale

114.50 hours total

1 Comment

great work

About TPF

The Perl Foundation - supporting the Perl community since 2000. Find out more at

About this Entry

This page contains a single entry by Karen Pauley published on January 8, 2012 12:29 PM.

2012Q1: Call for Grant Proposals was the previous entry in this blog.

craigslist Charitable Fund Donates $100,000 to the Perl Foundation is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.


OpenID accepted here Learn more about OpenID
Powered by Movable Type 6.2.2