Nicholas Clark writes:
Possibly the most unexpected discovery of May was determining precisely why Merijn's HP-UX smoker wasn't able to build with certain configuration options. The output summary grid looked like this, which is most strange:
O = OK F = Failure(s), extended report at the bottom
X = Failure(s) under TEST but not under harness
? = still running or test results not (yet) available
Build failures during: - = unknown or N/A
c = Configure, m = make, M = make (after miniperl), t = make test-prep
v5.15.9-270-g5a0c7e9 Configuration (common) none
----------- ---------------------------------------------------------
O O O m - -
O O O O O O -Duse64bitall
O O O m - - -Duseithreads
O O O O O O -Duseithreads -Duse64bitall
| | | | | +- LC_ALL = univ.utf8 -DDEBUGGING
| | | | +--- PERLIO = perlio -DDEBUGGING
| | | +----- PERLIO = stdio -DDEBUGGING
| | +------- LC_ALL = univ.utf8
| +--------- PERLIO = perlio
+----------- PERLIO = stdio
As the key says, 'O' is OK. It's what we want. 'm' is very bad - it means that it couldn't even build miniperl, let alone build extensions or run any tests. But what is strange is that ./Configure ... will fail, but the same options plus -Duse64bitall will work just fine. And this is replicated with ithreads - default fails badly, but use 64 bit IVs and pointers and it works. Usually it's the other way round - the default configuration works, because it is "simplest", and attempting something more complex such as 64 bit support, ithreads, shared perl library, hits a problem.
As it turns out, the key is that the failing ./Configure ... invocations contain -DDEBUGGING. The -DDEBUGGING parameter to Configure causes it to add -DDEBUGGING to the C compiler flags, and to add -g to the optimiser settings (without removing anything else there). So on HP-UX, with HP's compiler, that changes the optimiser setting from '+O2 +Onolimit' to '+O2 +Onolimit -g'. Which, it seems, the compiler doesn't accept for building 32 bit object code (the default) but does in 64 bit. Crazy thing.
Except that, astoundingly, it's not even that simple. The original error message was actually "Can't handle preprocessed file". Turns out that that detail is important. The build is using ccache to speed things up, so ccache is invoking the pre-processor only, not the main compiler, to create a hash key to look up in its cache of objects. However, on a cache miss, ccache doesn't run the pre-processor again - to save time by avoiding repeating work, it compiles the already pre-processed source. And that is the key distinction, between invoking the pre-processor and then compiling, versus compiling without the pre-processor:
$ echo 'int i;' >bonkers.c
$ cc -c -g +O2 bonkers.c
$ cc -E -g +O2 bonkers.c >bonkers.i
$ cc -c -g +O2 bonkers.i
cc: error 1414: Can't handle preprocessed file "bonkers.i" if -g and -O specified.
$ cat bonkers.i
# 1 "bonkers.c"
int i;
$ cc -c -g +O2 +DD64 bonkers.c
$ cc -E -g +O2 +DD64 bonkers.c >bonkers.i
$ cc -c -g +O2 +DD64 bonkers.i
$ cat bonkers.i
# 1 "bonkers.c"
int i;
No, it's not just a crazy compiler, it's insane! It handles -g +O2 just fine normally, but for 32 bit mode it refuses to accept pre-processed input. Whereas for 64 bit mode it does.
If HP think that this isn't a bug, I'd love to know what their excuse is.
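To make the sequence of events concrete, here is a minimal sketch of the flow described above: pre-process, hash, and on a miss compile the pre-processed file. It is not ccache's actual implementation. The function name cached_compile, the CACHE_DIR location and the use of cksum as the hash are all assumptions made for illustration.

```shell
#!/bin/sh
# Simplified sketch of a compiler cache. NOT ccache's real code.
# CC, CACHE_DIR and cached_compile are hypothetical names for illustration.
CC=${CC:-cc}
CACHE_DIR=${CACHE_DIR:-${TMPDIR:-/tmp}/sketch-cache}

cached_compile() {
    # usage: cached_compile source.c [compiler flags...]
    src=$1; shift
    base=${src%.c}
    mkdir -p "$CACHE_DIR" || return 1

    # Step 1: run only the pre-processor.
    "$CC" -E "$@" "$src" >"$base.i" || return 1

    # Step 2: hash the flags plus the pre-processed text to form the cache key.
    key=$( { printf '%s\n' "$*"; cat "$base.i"; } | cksum | tr -d ' \t' )

    if [ -f "$CACHE_DIR/$key.o" ]; then
        # Cache hit: reuse the stored object, the compiler proper never runs.
        cp "$CACHE_DIR/$key.o" "$base.o"
        echo hit
    else
        # Cache miss: compile the ALREADY pre-processed file, not the .c file.
        "$CC" -c "$@" "$base.i" -o "$base.o" || return 1
        cp "$base.o" "$CACHE_DIR/$key.o"
        echo miss
    fi
}
```

The miss branch is the interesting one: it hands a .i file to the compiler together with the original flags, which is exactly the combination that HP's compiler rejects in 32 bit mode when both -g and +O2 are present.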
A close contender for "unexpected cause" came about as a result of James E Keenan, Brian Fraser and Darin McBride's recent work going through RT looking for old stalled bugs related to old versions of Perl on obsolete versions of operating systems, to see whether they are still reproducible on current versions. If the problem isn't reproducible, it's not always obvious whether the bug was actually fixed, or merely that the symptom was hidden. This matters if the symptom was revealing a buffer overflow or similar security issue, as we'd like to find these before the blackhats do. Hence I've been investigating some of these to try to get a better idea whether we're about to throw away our only easy clue about a still-present bug.
One of these was RT #6002, reported back in 2001 in the old system as ID 20010309.008. In this case, the problem was that glob of a long filename would fail with a SEGV. Current versions of perl on current AIX don't SEGV, but did we fix it, did IBM, or is it still lurking? In this case, it turned out that I could replicate the SEGV by building 5.6.0 on current AIX. At which point, I have a test case, so start up git bisect, and the answer should pop out within an hour. Only it doesn't, because it turns out that git bisect gets stuck in a tarpit of "skip"s because some intermediate blead version doesn't build. So this means a digression into bisecting the cause of the build failure, and then patching Porting/bisect-runner.pl to be able to build the relevant intermediate blead versions, so that it can then find the true cause. This might seem like a lot of work that is used only once, but it tends not to be. It becomes progressively easier to bisect more and more problems without hitting any problems, and until you have it you don't realise how powerful a tool automated bisection is. It's a massive time saver.
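For readers who haven't used it, the reason an unbuildable revision produces a "skip" rather than a wrong answer is the exit-code convention of git bisect run: 0 marks a revision good, 125 asks git to skip it, and any other code from 1 to 127 marks it bad. A minimal sketch of such a step follows. It is not what Porting/bisect-runner.pl does internally, and bisect_step plus the build and test commands passed to it are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: bisect_step and its two arguments are hypothetical placeholders.
bisect_step() {
    build_cmd=$1   # a command that configures and builds this revision
    test_cmd=$2    # a command that runs the test case against the build

    # A revision that cannot be built cannot be judged: 125 means "skip".
    sh -c "$build_cmd" >/dev/null 2>&1 || return 125

    # Exit status 0 tells git bisect that the revision is "good".
    sh -c "$test_cmd" >/dev/null 2>&1 && return 0

    # Map every failure, including death by signal (status above 127,
    # which would otherwise abort the bisect), to a plain "bad".
    return 1
}
```

A wrapper script built around this would be handed to git bisect run. A long run of revisions that all return 125 is the tarpit: git cannot narrow the range through commits it has no verdict on. When hunting for the commit that made a failure go away, as here, the sense of the 0 and the 1 has to be swapped.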
But, as to the original bug and the cause of its demise. It turned out to be interesting. And completely not what I expected:
commit 61d42ce43847d6cea183d4f40e2921e53606f13f
Author: Jarkko Hietaniemi
Date: Wed Jun 13 02:23:16 2001 +0000
New AIX dynaloading code from Jens-Uwe Mager.
Does break binary compatibility.
p4raw-id: //depot/perl@10554
The SEGV (due to an illegal instruction) goes away once perl switched to using dlopen() for dynamic linking on AIX. So my hunch that this bug was worth digging into was right, but not for the reason I'd guessed.
A couple of bugs this month spawned interesting subthreads and digressions. RT #108286 had one, relating to the observation that code written like this, with each in the condition of a while loop:
while ($var = each %hash) { ... }
while ($_ = each %hash) { ... }
actually has a defined check automatically added, eg
$ perl -MO=Deparse -e 'while ($_ = each %hash) { ... }'
while (defined($_ = each %hash)) {
die 'Unimplemented';
}
-e syntax OK
whereas code that omits the assignment does not have defined added:
$ perl -MO=Deparse -e 'while (each %hash) { ... }'
while (each %hash) {
die 'Unimplemented';
}
-e syntax OK
contrast with (say) readdir, where defined is added, and an assignment to $_:
$ perl -MO=Deparse -e 'while ($var = readdir D) { ... }'
while (defined($var = readdir D)) {
die 'Unimplemented';
}
-e syntax OK
$ perl -MO=Deparse -e 'while (readdir D) { ... }'
while (defined($_ = readdir D)) {
die 'Unimplemented';
}
-e syntax OK
Note, this is only for readdir in the condition of a while loop - it doesn't usually default to assigning to $_.
So, is this intended, or is it a bug? And if it's a bug, should it be fixed?
Turns out that the answer is, well, involved.
The trail starts with a ruling from Larry back in 1998:
As usual, when there are long arguments, there are good arguments for both sides (mixed in with the chaff). In this case, let's make
while ($x = <whatever>)
equivalent to
while (defined($x = <whatever>))
(But nothing more complicated than an assignment should assume defined().)
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1998-04/msg00133.html
Nick Ing-Simmons asks for a clarification:
Thanks Larry - that is what the patch I posted does.
But it also does the same for glob, readdir and each - i.e. the same cases that solicit the warning in 5.004. Is extending the defined insertion to those cases desirable? (glob and readdir seem to make sense, I am less sure about each).
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1998-04/msg00182.html
(it's clarified in a later message that Nick I-S hadn't realised that each in scalar context returns the keys, so it's an analogous iterator which can't return undef for any entry)
In turn, the "RULING" dates back to a thread discussing/complaining about a warning added in 5.004:
$ perl5.004 -cwe 'while ($a = <>) {}'
Value of <HANDLE> construct can be "0"; test with defined() at -e line 1.
-e syntax OK
The intent of the changes back then appears to be to retain the 5.003 and earlier behaviour on what gets assigned for each construction, but change the loop behaviour to terminate on undefined rather than simply falsehood for the common simple cases:
while (OP ...)
and
while ($var = OP ...)
And there I thought it made sense - fixed in 1998 for readline, glob and readdir, but introducing the inconsistency because each doesn't default to assigning to $_. Except, it turned out that there was a twist in the tail. It turns out that while (readdir D) {...} didn't use to implicitly assign to $_. Both the implicit assignment to $_ and defined test were added in 2009 by commit 114c60ecb1f7, without any fanfare, just like any other bugfix. And the world hasn't ended.
$ perl5.10.0 -MO=Deparse -e 'while (readdir D) {}'
while (readdir D) {
();
}
-e syntax OK
$ perl5.12 -MO=Deparse -e 'while (readdir D) {}'
while (defined($_ = readdir D)) {
();
}
-e syntax OK
Running a search of CPAN reveals that almost no code uses while (each %hash) [and why should it? The construction does a lot of work only to throw it away], and nothing should break if it's changed. Hence it makes sense to treat this as a bug, and fix it. Which has now happened, but I can't take credit for it - post 5.16.0, Father Chrysostomos has now fixed it in blead.
To conclude this story, the mail archives from 15 years ago are fascinating. Lots of messages. Lots of design discussions, not always helpful. And some of the same unanswered questions as today.
The other digression arose from trying to replicate another old bug (ID 20010918.001, now #7698). I'd dug an old machine with FreeBSD 4.6 out from the cupboard under the stairs in the hope of reproducing the period problem with a period OS. Sadly I couldn't do that, but out of curiosity I tried to build blead on it. This is the same 16M machine whose swapping hell prompted my investigation of enc2xs the better part of a decade ago, resulting in various optimisations on its build time memory use, that in turn led to ways to roughly halve the size of the built shared objects, and a lot of the material then used in a tutorial I presented at YAPC::Europe and The German Perl Workshop, "When Perl is not quite fast enough". This machine has pedigree.
Once again, it descended into swap hell, this time on mktables. (And with swap on all 4 hard disks, it's very effective at letting you know that it's swapping.) Sadly after 10 hours, and seemingly nearly finished, it ran out of virtual memory. So I wondered if