Improving Perl 5: Grant Report for Month 4
Sun, 08-Jan-2012 by
Karen Pauley
edit post
_Nicholas Clark writes:_
The start of December was mostly spent unpicking and simplifying how embed.h is included in perl.h. embed.h is one of the core's generated headers, containing C pre-processor directives to simplify use of the C API functions, by wrapping them such that the programmer doesn't need to deal with the implementation details of ithreads.
Over the years, embed.h has come to be included by perl.h in 4 different places depending on which OS perl is being built on. The confusion started 13 years ago when a second #include "embed.h" was added later in perl.h to support the now long defunct "PERL_OBJECT" system. This later migrated to be used for all Win32 builds. Subsequently VMS and Symbian gained special-cased includes of their own in perl.h, as like Win32 they defined HAVE_INTERP_INTERN, which needed to be done before embed.h was included.
The silly part is that it turns out that there was no reason why the include of embed.h had to be *early* on any OS. So it's possible to simplify perl.h by taking the Win32-only #include embed.h near the end, and removing the conditional code, so that all operating systems use it.
I also eliminated code from a2p to support using Perl's malloc, which was "temporarily" commented out 13 years ago, pending a proper fix. The lack of bug reports suggest that no-one has missed this. I worked with Craig Berry to simplify vmsish.h, which contained support code for a2p on VMS that isn't actually necessary.
Related to investigating both, I spotted that conditional code for NO_EMBED could be removed from sdbm.h. The functionality of NO_EMBED was removed 13 years ago, and most every other reference purged in 2003.
There is probably quite a bit more conditional compilation code that could be removed from sdbm.h. However, I'm not sure that it's worth it, as the cost of verifying that one has done it without error is likely to be much greater than the benefit that it brings. There's a lot more low hanging fruit elsewhere, plenty more than we have people to harvest it.
The "highlight" of the month was installman, the script that make install uses to locate documentation and install it as man pages. Initially it was something I could work on offline whilst travelling by train to London to meet Jesse (and others) for Dim Sum. Except that it turned out to be a lot more messy than it first seemed.
installman is responsible for finding Pod documentation in .pm and .pod files, passing the Pod to Pod::Man to generate manpages, and then installing the manpages in the correct location. There are exceptions - we don't install documentation for certain modules, we don't install certain modules (so we shouldn't install their documentation), and we should skip test modules. Half of this overlaps with what pod/buildtoc does. However, the code was completely different, with partially inconsistent exception lists, exceptions that are no longer needed, etc.
installman and buildtoc now share as much code as possible, with the installation exception rules in one place, and all dual life Pods now install in man1/
While the code still fresh in my mind, I also refactored the common Pod installation to eliminate pod.lst. This was the master Pod data file, added in 2003 by commit 416302502f485afa as part of a refactoring to reduce the amount of manual work needed to maintain Pod-related rules in various files, and to reduce errors and inconsistencies between them. It turned out that pod.lst contained quite a lot of data which were only needed to generate the table of contents in pod/perl.pod, and that there wasn't that much extra in pod.lst that wasn't in pod/perl.pod. Hence the two were near duplicates of each other, and simplification should be possible. By careful use of =begin, =end and =for Pod directives in pod/perl.pod is has been possible to readably encode the extra data into the source of pod/perl.pod without affecting the rendering and thus eliminate pod.lst. This both simplifies the distribution, and the tasks the release manager needs to perform each month.
In the process of doing this, I discovered that the changes that permitted dual-life Pods to move from pod/ to their owner distributions in dist/ and cpan/ had introduced another regression post v5.15.0 - the Pod files were now installed as if they were modules, into lib/, instead of the correct location with all the other core Pod files. In theory, this was a simple change to the installperl script to get it to use the code common to installman, buildtoc and a couple of other helper scripts. In practice, it turned out to be more complex, as there was another organic mess that needed simplification.
Key to this was changing the invocation of File::Find::find() that processes lib/ to run with 'no_chdir'. This allows that code to call other routines that use pathnames relative to the top of the build tree to access files, and as a side effect eliminates the localised package variable $::depth which was used to maintain a faked path prefix for display purposes to counter the effects of the directory change.
There is still a considerable amount of sanity refactoring that could be done on installperl. In particular, the code that processes lib/ has accumulated a lot of exception rules, at least some of which are now redundant and should be removed. This is left as an exercise to any reader who enjoys the mental challenge of figuring out the history of code from the "blame" log: http://perl5.git.perl.org/perl.git/blame/HEAD:/installperl
A more detailed breakdown summarised from the weekly reports. In these:
16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules
| Hours | Activity |
| 0.50 | #105928 |
| 0.25 | 93ea32b83e27783a and vmsish.h |
| 0.25 | NO_EMBED in sdbm.h |
| 2.00 | POSIX |
| | POSIX::dup2 RT #98912 |
| | POSIX::sleep |
| 0.50 | Pod::Functions |
| 0.50 | Porting/corelist-perldelta.pl |
| 0.75 | Porting/expand-macro.pl |
| 0.25 | Porting/pod_rules.pl |
| 0.25 | Porting/podtidy |
| 0.50 | SSNEW |
| 0.50 | a2p |
| | a2p and vmsish.h |
| 8.25 | abolish pod.lst |
| 1.00 | checking and merging smoke-me branches |
| 0.75 | cross compiling |
| 12.75 | embed.h in perl.h, perl.h in x2p |
| 6.50 | fc [foldcase] |
| 4.75 | fileno on a localised tied handle |
| 1.50 | global destruction backref panic |
| | global destruction backref panic (analysis) |
| 39.25 | installman |
| | installman/installperl |
| 0.50 | pod/perlmodlib.PL |
| 6.25 | process, scalability, mentoring |
| 1.25 | questions about saving stacks |
| 19.50 | reading/responding to list mail |
| 0.25 | sfio |
| 0.25 | suidperl purge |
| 2.75 | t/porting/utils.t |
| 0.25 | t/re/regexp.t cleanup |
| 2.00 | version.pm and POSIX::set_locale |
| 0.50 | x2p/malloc.c |
**114.50 hours total**
Comments (1)
great work