Improving Perl 5: Grant Report for Month 19

Nicholas Clark writes:

As per my grant conditions, here is a report for the May period.

This month I simplified part of the implementation of pack and unpack, removing about 130 lines of code, and reducing the object code size by about 2K. The only casualty was support for mixed-endian systems. Sorry, PDP-11 users.

In the medal stakes for "risk to sanity", the implementation of pack and unpack are strong contestants, although only for bronze. (They're fighting it out with the implementation of sprintf. Compared with sprintf, they have the advantage that they've been seen as sufficiently dangerous to nearby code that they've been quarantined into their own file. The regular expression engine gets silver, and the reigning champion and gold medal holder remains the parser itself.)

pack/unpack is one of those pieces of code which has grown organically as features were added, and so has become too large to easily comprehend the big picture. In particular, it's hard to look at it at more than one level at once, and hence see where it could be profitably refactored.

The main code of pp_pack and pp_unpack are large switch statements, which are used to dispatch a single template code, and perform the correct sized and typed read or write of bytes to or from the string. There seems to be a lot of repetition in there, but it's more at a "template" level than a code level, because the types of the variables in each case differ. Pretty much all the things that are the same size and could be merged have been merged. There's no obvious way to squeeze it further with this design.

However, one thing did strike me as odd - a pattern of repeated macros in the unpack code, particularly in all the cases that dealt with the various sizes and signedness of integers. They exist due to the order that the code was written and then enhanced. The original code (dating from Perl 3 era) read numbers as chunks of memory. It copied bytes to/from the (possibly unaligned) input string to a local variable of the correct type. Much much later Marcus Holland-Moritz added code to provide the pack modifiers "<" and ">" (for endianness) and implemented the endian swapping they require, by endian swapping the values in the integer variables. Subsequently Ton Hospel added code to cope with the unpack input string having been internally transcoded to UTF-8, by having the copy transcode back if needed. These two later additions are respectively the second and first macros.

So there was one macro to read a number into a variable, and then a second macro to (optionally) spin the bits to re-order for a different endianness. This looked more complex than it might be. On digging further I realised that it was more subtle than it first looked - the code didn't just support the obvious big and little endian byte orders, but also mixed endian. (Technically, actually, I think it could support any arbitrary endianness.)

Big endian and little endian are familiar - on a big endian system the 32 bit number 0x04030201 would be stored in memory as the bytes 0x04 0x03, 0x02, 0x01; on a little endian system as 0x01, 0x02, 0x03, 0x04. Mixed endian is quirky - 0x03, 0x04, 0x01, 0x02

We haven't had many bug reports from PDP-11 users since, hmmm, actually since 5.000 was shipped, which made me wonder just how much of the code still worked, and if anyone at all would notice*.

So I went digging. Mixed-endian support first was first added in Perl 3, as part of adding 'N' and 'n' templates to pack. Fallback functions were provided in util.c in case the system didn't define htonl() and ntohl().

The fallback functions contain conditionally compiled special-case code for little endian systems, and a fallback loop for other values of BYTEORDER. The intent is that the the fallback loop will work on any byteorder. In fact, it only works correctly on 32 bit little endian systems, because it always swaps the order of the bytes within the word. So irony is that the special-purpose little endian code is actually correct for all platforms, and the intended to-be generic code only works on little endian.

Hence I infer that they were never needed and hence never used used, because if they had been, regression tests would have failed. Moreover, it seems that no-one ever even needed to link with them, because as is noted in "perl 4.0 patch 19" (commit 988174c19bcf26f6, Nov 1991):

+ * I think my_swap(), htonl() and ntohl() have never been used.
+ * perl.h contains last-chance references to my_swap(), my_htonl()
+ * and my_ntohl().  I presume these are the intended functions;
+ * but htonl() and ntohl() have the wrong names.  There are no
+ * functions my_htonl() and my_ntohl() defined anywhere.
+ * -DWS
+ */

(DWS appears to be "David W. Sanderson", who provided code for 'V' and 'v')

Because I wasn't confident about my code inspection, and it all looked too unlikely to be actually true, I tested some of this on an emulator. I don't know how much a PDP-11 cost in its day, but it's amusing that you can emulate it on Raspberry Pi. (Which is a lot smaller, uses considerably less power, and costs $25).

The nice thing about Unix is that an ancient BSD is familiar enough to be workable. The not so nice thing is that the C compiler misses a few things - it only understands K&R argument style, doesn't do function prototypes, and it's missing some *printf formats such as "%lX" and "%hx". (Which matter, because int is 16 bits, and so there are a whole lot more size errors you can't get away with for varargs functions such as printf.) There's no way that that toolchain is going to build a perl from this decade, and I'm not even sure if a viable C89 toolchain exists. So I didn't spend any more time looking at it. (And it wouldn't have been trivial, as my "connectivity" with the machine was typing and copy/paste on the logged in console. Getting emulated networking set up would require root on the host machine figuring out how to set up a bunch of IP-level games. This is a lot more complex than running apt-get to install the emulator and then downloading the relevant disk image for NetBSD.)

I made the executive decision that all the mixed endian support code can go. Patches welcome to restore it, but we're only really interested if you have a genuine reason to keep supporting PDP-11s going forward, and are in a position to run a smoker.

With it gone, the simplification becomes tractable. Endianness is now either "big endian" or "little endian", and the problem of "which endian?" reduces to either "no-op" or "swap all the bytes". At which point, it's possible to implement the endian-swap as part of the copy/transcode fixup, because the number of bytes wanted is fixed and known, and the transformation is always "reverse". So if endian swapping is needed, simply process those bytes from the input in reverse order. The pair of macros (actually, an entire brace of macros) simplify massively, to an inline call to memcpy() it if it's all fine, and a call to a fixup function if it is not. This removes a lot of inline code related to endian swapping, and results in the 2K object code shrinkage. The common use cases remain inline, so the code shouldn't be any slower (although I haven't tried to measure this).

A bug that consumed a lot of time this month was RT #118055. For v5.18.0 Father Chrysostomos significantly re-wrote the OP generation code to use a slab allocation scheme, which solves many previous problems with memory leaks when syntax errors cause compilation to abort. (The "blame" here seems to be one of those subtle problems of using code in ways it wasn't originally intended. yacc doesn't complicate its design by making efforts to clean up on compilation failure, because it assumes that compilation failure will mean process exit, and everything gets freed anyway. This holds true for the main Perl program too - syntax errors abort. But Perl 5 gained the ability to be embedded (where total cleanup matters), and is now often used as a persistent process (where even the occasional failed eval will start to add up), so the cost of the problem has slowly been growing.)

However, there was an unfortunate bug slab allocator none of us spotted. We were actually being naughty with alignment - the slab allocator assumes that everything in the OP structure only needs alignment with the platform's pointer size, but because PMOP was using an IV when it should have be using a pointer-sized integer this didn't always hold true.

The problem was only relevant on 32 bit platforms when optionally compiling with 64 bit IVs and ithreads (neither of which is the default), and the only platform we're aware of that enforces the alignment is sparc32. Frustratingly it was only spotted (by the Debian Per team) just after v5.18.0 was released, else it would have been a release blocker.

Ideally we'd fix the header to change the structure to use the correct type but that's obviously not viable for a maintenance fix, as it's not binary compatible. Having chewed over various ways to work around the problem, all of which felt complex, fragile or both and had the potential to be slower, I spotted a different approach which was simple and could be constrained to just the class of affected builds.

OP allocation cannot only be from slabs - for various reasons OPs need to be allocated without a slab being available. Hence each OP already had a flag saying whether it was from a slab, or had been allocated directly using malloc(). malloc() will, by design, always return suitably aligned memory. Hence the fix is for the affected configurations to conditionally compile in code that intercepts allocations for PMOP, and always allocates them with malloc(). It's 2 lines of C code, 2 lines of C pre-processor, and will be in v5.18.1.

A second bug of note is RT #118139. Reini Urban figured out that Storable can crash when used in DESTROY blocks which are being run during global destruction, because it is accessing an already freed PL_modglobal or the internal ctx.

I feel guilty about this, because I believe that it's the same problem that I first hit in 2003 (while writing slides for the German Perl Workshop). I saw that the C stacktrace of the SEGV meant that it was during global destruction, and concluded roughly "OMG global destruction" and "I don't have time to figure this out and write my slides", and so, strangely enough elected to deal with my slides. And there it languished.

The problem with global destruction is that it has to destroy everything. What's the problem with that? In a word, cycles. If objects A and B each reference each other, then (obviously) B needs to stick around longer than A, in case A's DESTROY needs to follow the reference to B for some reason. But A needs to stick around longer than B, in case B's DESTROY... "You go first". "No, you go first". Global destruction cuts the Gordian knot by destroying things in any order it feels like. The upshot is that anything you needed might already be gone, so you code gets to live in interesting times. So interesting that it may not actually be able to work. This, I had assumed, was quite the difficulty with Storable - that a solution might not exist - so I didn't investigate further.

Unlike me back then, Reini Urban hadn't given up so easily, and distilled it to an excellent clear test case, and suggested a patch to avoid the SEGV, by causing Storable to die if used during global destruction. (An exception which quite possibly wouldn't be seen, but that's considerably better than a SEGV.)

However, it gave Leon Timmermans an insight. Why is Storable using DESTROY? It's not actually running any Perl code. It's just using a blessed reference as a hook to ensure free() is called on some memory allocations in its per-interpreter context. The problem is that during global destruction references to that memory are still visible to Storable routines called from Perl code, use after free bug.

So he figured that we can ensure the same cleanup is run by attaching magic to the context scalar instead of blessing it. The mg_free hook is called when all scalars are finally freed. This is much later during global destruction, after all destructors have been called and no more Perl code is running. Hence this ensures the desired ordering constraint - that the context survives until after all code no longer needs it, but is freed eventually. I implemented his suggestion, and verified that this alternative solution also made Reini's test case pass. So this is the solution that we went with, as it permits Storable to be used reliably during global destruction, instead of reliably forbidding it.

I simplified the build infrastructure for the x2p/ subdirectory. x2p/ contains the various "to Perl" utilities, along with tools needed to build them. In x2p/ it seems that the only consistency is inconsistency. The names are a2p, s2p and find2perl (not f2p). The sed and find converters are written in Perl. The awk converter is written in C. So it's not simply repeat the same build rules for 3 things.

I think I've read somewhere that Larry considered the awk to Perl converter sufficiently important for Perl adoption that he delayed the Perl 1 release until a2p was done. The C code itself uses a yacc grammar to parse awk, and looks suspiciously like Perl 1, down to using the same data structures for scalars and the same stdio cheating code for fgets(). (See if you're interested)

However useful it was 25 years ago to drive Perl adoption, it's not needed for that these days, but it is still the only awk to Perl converter, so it would be nice to ship it independently somehow. Frustratingly, it's not at all obvious how to do this. It's not XS code linking against the core, so ExtUtils::MakeMaker doesn't have anything helpful. In fact, the situation is worse than this - the shipped x2p/Makefile.SH is only used for the build on *nix systems. It's not a true little self contained sub-perl, as VMS and Win32 include rules directly in their top level Makefiles to build the contents of the x2p directory. Additionally, it has no tests - for some reason I feel more uncomfortable with the idea of shipping something to CPAN with no tests than I do with the status quo. Maybe it's because if it's an independent distribution it is completely obvious that it has no tests.

Tests would be useful. I've searched and failed to find any suitably licensed for inclusion in the core. Effectively to ship dual APL/GPL code needs to be either exactly that (unlikely), or BSD licensed, or public domain. Which seems surprising, as there are BSD licensed awk implementations.

So as it seems that evicting a2p is actually more work than continuing to host it in the core distribution, I attempted to reduce duplication in the build setup, even if that means I've bound it a bit more tightly. In that, for the past 25 years x2p has been shipping a second copy of the makedepend shell script that automates calculation of C file dependencies to generate Makefile rules, and a second copy of the cflags shell script used to invoke the compiler. This seems wasteful. However, it took a bit of untangling to remove some assumptions from the top-level versions of each about only dealing with the current directory. But once that was fixed it was possible to rework x2p's Makefile to use the versions at the top level, and delete its private forks.

I was also easily able to move running makedepend on x2p into its own rule in the main Makefile, so that it can happen in parallel with C compilation for miniperl. Previously it ran as part of the main makedepend, which happens before make is invoked, meaning it's bottle necked on only one core. Now a little more can happen concurrently, and maybe a second is shaved off the elapsed build time for a parallel make.

In the x2p/ Makefile I spotted a few more tentacles of dead things that need purging - rules that turned out to be for deleting files generated when building on VM/ESA (an obsolete platform we have removed support for), rules for generating targets related to the Perl compiler (removed in 2006), and rules related to man page targets deleted by 5.000. They're not the first dead things (eg installperl rules related to 5.000 to 5.001 changes), and they won't be the last. But in something this complex, it's never easy to spot what's actually dead, versus what's obscure but still needed.

Whilst on the subject of extermination, this month I finally removed fakethr.h, and eliminated all references to it and FAKE_THREADS. fakethr.h and FAKE_THREADS were for a "green" threads implementation of 5005threads. 5005threads itself is long gone, and it's not clear that -DFAKE_THREADS ever built correctly. Certainly it did not work for the 5.005 release, and it did not work at the time of the commits for the initial checkin. The closest that it seems to have been to working is around commit c6ee37c52f2ca9e5 (Dec 1997), where the headers no longer contained errors, but perl.c failed to compile.

I had spotted this dead code a couple of years ago, but stalled on the problem that ExtUtils::MakeMaker contained a hard-coded list of headers which it uses to generate dependency rules for XS code, and fakethr.h was one of them. Hence simply innocently removing the file will have the less than desirable effect of causing all XS code to fail. (It's very easy to break most of CPAN. It's a lot harder to not break CPAN)

Manually removing another line felt like a bodge, as it doesn't actually fix the real problem, so with plenty of other things to make progress on, I took no action. Fortunately as part of his hash work, Yves fixed the header list in ExtUtils::Makemaker properly (in a way that had eluded me), such that there is no longer list to maintain (and get stale). Hence, with that pre-requisite unblocked, I carefully purged the code once v5.18.0 shipped. So that's the last part of 5.005 threads removed. That is, until the next part is found.

As I've been removing code for obsolete platforms, I've been searching the entire codebase for references to them. It's impressive how many nooks and crannies need to be checked to clean up properly, and not surprising that sometimes the odd reference get missed. (Just like fakethr.h, hiding in plain sight for a decade.) Part of the fun is because platform specific README files are installed as man pages. For example, the file README.vos becomes perlvos.pod, which is installed as a man page, which you can read with `man perlvos`. The fun comes because various parts of the codebase want a list of all the man pages, and these platform specific files (along with various other things generated or copied) should be in that list. So you can't get the complete list by globbing , at which point do you create a file listing everything, or just the exceptions to add to that glob? Or something else? It's a trade off.

One place that some time ago I had noticed had a list of perl manpages was the perl debugger. The intent is that you can save some typing by being able to abbreviate "doc perlfunc" as "doc func" etc, omitting the "perl" prefix from the man page name. It needs to know the correct name because on *nix platforms it runs the system's man command to display the page. If `man func` failed to run, and "func" was in the list, then it would try a second time with `man perlfunc`. To implement this abbreviation feature, the debugger had a hard-coded list of man page names. *Inevitably* this list was out of date, missing more recently added man pages as well as having the odd reference to a page now removed.

(I think it's fair to say of anything that if it's neither automatically generated nor tested for up-to-dateness, then it runs the risk of becoming stale, and for a large enough project and an old enough project, the probabilities aren't in favour of fresh.)

Seeking a proper solution to the problem (which doesn't create an ongoing requirement for human intervention), I had considered using some of the existing code in regen/ to keep the list up to date. But this feels ugly, so I held off. But recently I figured that one could completely eliminate the list by looking at run time in the perl library tree for the installed Pod files. If a pod file is found, it's a legitimate abbreviation. If the Pod file isn't found, then it's assumed not to be a legitimate abbreviation, and the second man command isn't attempted. If for whatever reason your distribution chose not to install the Pod files, all that "breaks" is that you have to type "doc perlfunc" each time. It seemed like the right trade off.

So I started work. Only to find that in the shipping the man command had been inadvertently broken by a recent refactoring, and no-one had spotted this. Obviously, I fixed this first, and wrote the simplest test I could to try to ensure that the doc command didn't break again. Try to run "man perlrules", and expect that to fail. And "it fails" isn't quite good enough, as the test needs to be sure that it gets the "not found" error message from running an external command, given that the regression it's trying to protect against is failure to run that. So I wrote a minimal test that causes the debugger to run `man perlrules` and check that it's man that is failing.

Fail it does, but not always with the same message. I'd assumed that the failure message was the same on all *nix systems. Tony Cook fixed the test to skip on all platforms except Linux. Whilst only testing on one platform isn't ideal, it's a common platform, so the tests are going to get run enough, and it's better than no tests, which seemed to be the only alternative. So Linux only should be safe, right?

If the question is in the title, you know that the answer is "no". In this one also need to override the locale, in case man has been localised to report translated error messages. Tony tested a fix on the failing platforms, localising $ENV{LANG} and $ENV{LC_MESSAGES} to "C", and pushed this to blead. Which worked on those platforms. But whack-a-mole wasn't quite over, because LC_ALL overrides LC_MESSAGES, as demonstrated by subsequent platform failures. So it took one more tweak to the test before we had something reliable enough to have no failures. Everything takes longer than you expect it to, because testing portably and robustly is hard, and you usually don't get it exactly right first or even second time.

The final purge this month was the code to support DG/UX. DG/UX was a Unix sold by Data General. The last release was in April 2001. It only runs on Data General's own hardware.

It should have been removed before v5.18.0 but it got missed, and then seemed an inappropriate thing to do during the code freeze. There is still a lot of code for platforms we don't have the infrastructure to support. In some cases we doubt that Perl even compiles natively. Specifically, if you're using Perl on any of z/OS, Windows CE, NeXT, AmigaOS, DJGPP, NetWare (natively), OS/2 or Plan 9, we'd really like help in getting Perl 5 buildable and regularly smoked. Otherwise it's not clear how long we can keep considering those platforms as viable.

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in
RT #... is a bug in
CPAN #... is a bug in

HoursActivity RCs
4.75Floating point stringification (RT #108378, RT #115800)
0.25HP-UX smoke
1.25PVHV (critbit)
0.50RC testing
0.50RT #113920
0.25RT #114878
0.25RT #116098
0.25RT #116407
1.00RT #116943
1.00RT #116989
3.00RT #117031
0.25RT #117209
0.25RT #117501
0.50RT #117793
1.50RT #117835
0.25RT #117885
1.00RT #117893
0.75RT #117903
0.50RT #117907
0.25RT #117917
0.75RT #117941
0.50RT #117967
0.25RT #117969
0.25RT #117977
13.25RT #118055
RT #118055 sparc32 use64bitint
0.25RT #118121
6.50RT #118139 (Storable)
3.00RT #118157
0.25RT #118187
1.00RT #118197
0.25RT #118225
0.25RT #5087
0.25RT #63402
0.75RT #78674 (and the hazards of PUTBACK)
0.25SelfLoader (CPAN #85572)
TAP::Parser (CPAN #84939)
TAP::Parser (Win32 failure)
2.75Unicode Names
8.25makedepend.SH, x2p/Makefile.SH
pp_pack.c (BYTEORDER)
pp_pack.c (hton* and htov*)
6.00process, scalability, mentoring
20.75reading/responding to list mail
1.00smoke-me branches
0.25used only once

139.25 hours total

* If you weren't aware of it, GE in Canada need PDP-11 programmers for the next 27 years. That's more job security than most other firms offer. But assembler is the skill they care about, not Perl:

About TPF

The Perl Foundation - supporting the Perl community since 2000. Find out more at

About this Entry

This page contains a single entry by Karen Pauley published on July 30, 2013 10:01 AM.

Hague Grant Application was the previous entry in this blog.

Improving Perl 5: Grant Report for Month 20 is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.


OpenID accepted here Learn more about OpenID
Powered by Movable Type 6.2.2