June 2012 Archives

Nicholas Clark has requested an extension of $20,000 for his Improving Perl 5 grant. The grant, which is currently running, is on track to finish successfully at the end of June. This extension would allow Nicholas to devote another 400 hours to the project.

Details of the recent work completed can be found in the following blog posts:

March 2012
April 2012
May 2012

Before we make a decision on this extension we would like to have a period of community consultation that will last for seven days. Please leave feedback in the comments or, if you prefer, send email with your comments to karen at perlfoundation.org.

Nicholas Clark writes:

Possibly the most unexpected discovery of May was determining precisely why Merijn's HP-UX smoker wasn't able to build with certain configuration options. The output summary grid looked like this, which is most strange:

O = OK F = Failure(s), extended report at the bottom
X = Failure(s) under TEST but not under harness
? = still running or test results not (yet) available
Build failures during: - = unknown or N/A
c = Configure, m = make, M = make (after miniperl), t = make test-prep

v5.15.9-270-g5a0c7e9 Configuration (common) none
----------- ---------------------------------------------------------
O O O m - -
O O O O O O -Duse64bitall
O O O m - - -Duseithreads
O O O O O O -Duseithreads -Duse64bitall
| | | | | +- LC_ALL = univ.utf8 -DDEBUGGING
| | | | +--- PERLIO = perlio -DDEBUGGING
| | | +----- PERLIO = stdio -DDEBUGGING
| | +------- LC_ALL = univ.utf8
| +--------- PERLIO = perlio
+----------- PERLIO = stdio

As the key says, 'O' is OK. It's what we want. 'm' is very bad - it means that it couldn't even build miniperl, let alone build extensions or run any tests. But what is strange is that ./Configure ... will fail, yet the same options plus -Duse64bitall will work just fine. And this is replicated with ithreads - the default fails badly, but use 64 bit IVs and pointers and it works. Usually it's the other way round - the default configuration works, because it is "simplest", and attempting something more complex, such as 64 bit support, ithreads, or a shared perl library, hits a problem.

As it turns out, what's key is that the ./Configure ... invocation contains -DDEBUGGING. The -DDEBUGGING parameter to Configure causes it to add -DDEBUGGING to the C compiler flags, and to add -g to the optimiser settings (without removing anything else there). So on HP-UX, with HP's compiler, that changes the optimiser setting from '+O2 +Onolimit' to '+O2 +Onolimit -g'. Which, it seems, the compiler doesn't accept when building 32 bit object code (the default), but does in 64 bit. Crazy thing.

Except that, astoundingly, it's not even that simple. The original error message was actually "Can't handle preprocessed file". It turns out that that detail is important. The build is using ccache to speed things up, so ccache invokes the pre-processor only, not the main compiler, to create a hash key to look up in its cache of objects. However, on a cache miss, ccache doesn't run the pre-processor again - to save time by avoiding repeating work, it compiles the already pre-processed source. And the key is that distinction - invoking the pre-processor and then compiling, versus compiling without the pre-processor:

$ echo 'int i;' >bonkers.c
$ cc -c -g +O2 bonkers.c
$ cc -E -g +O2 bonkers.c >bonkers.i
$ cc -c -g +O2 bonkers.i
cc: error 1414: Can't handle preprocessed file "bonkers.i" if -g and -O specified.
$ cat bonkers.i
# 1 "bonkers.c"
int i;

$ cc -c -g +O2 +DD64 bonkers.c
$ cc -E -g +O2 +DD64 bonkers.c >bonkers.i
$ cc -c -g +O2 +DD64 bonkers.i
$ cat bonkers.i
# 1 "bonkers.c"
int i;

No, it's not just a crazy compiler, it's insane! It handles -g +O2 just fine normally, but in 32 bit mode it refuses to accept pre-processed input, whereas in 64 bit mode it accepts it.

If HP think that this isn't a bug, I'd love to know what their excuse is.

A close contender for "unexpected cause" came about as a result of James E Keenan, Brian Fraser and Darin McBride's recent work going through RT looking for old stalled bugs related to old versions of Perl on obsolete versions of operating systems, to see whether they are still reproducible on current versions. If the problem isn't reproducible, it's not always obvious whether the bug was actually fixed, or merely that the symptom was hidden. This matters if the symptom was revealing a buffer overflow or similar security issue, as we'd like to find these before the blackhats do. Hence I've been investigating some of these to try to get a better idea of whether we're about to throw away our only easy clue to a still-present bug.

One of these was RT #6002, reported back in 2001 in the old system as ID 20010309.008. In this case, the problem was that glob of a long filename would fail with a SEGV. Current versions of perl on current AIX don't SEGV, but did we fix it, did IBM, or is it still lurking? In this case, it turned out that I could replicate the SEGV by building 5.6.0 on current AIX. At which point I had a test case, so start up git bisect, and the answer should pop out within an hour. Only it doesn't, because it turns out that git bisect gets stuck in a tarpit of "skip"s, because some intermediate blead versions don't build. So this means a digression into bisecting the cause of the build failure, and then patching Porting/bisect-runner.pl to be able to build the relevant intermediate blead versions, so that it can then find the true cause. This might seem like a lot of work that is used only once, but it tends not to be. Each such fix makes it progressively easier to bisect the next problem without hitting any obstacles, and until you have it you don't realise how powerful a tool automated bisection is. It's a massive time saver.
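These days the whole dance is automated: a bisection run can be kicked off with a single command, something like this (paraphrased from the Porting/bisect.pl documentation - exact options vary by version):

$ cd perl                       # a git checkout of perl.git
$ ./Porting/bisect.pl --start=v5.10.0 --end=v5.12.0 \
      --expect-fail -e 'my $a := 2;'

bisect.pl drives git bisect, using bisect-runner.pl to Configure, build and probe each candidate commit - by default a zero exit status from the probe counts as "good" - and reports the first bad commit at the end.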

But, as to the original bug and the cause of its demise. It turned out to be interesting. And completely not what I expected:

commit 61d42ce43847d6cea183d4f40e2921e53606f13f
Author: Jarkko Hietaniemi
Date: Wed Jun 13 02:23:16 2001 +0000

New AIX dynaloading code from Jens-Uwe Mager.
Does break binary compatibility.

p4raw-id: //depot/perl@10554

The SEGV (due to an illegal instruction) goes away once perl switched to using dlopen() for dynamic linking on AIX. So my hunch that this bug was worth digging into was right, but not for the reason I'd guessed.

A couple of bugs this month spawned interesting subthreads and digressions. RT #108286 had one, relating to the observation that code written like this, with each in the condition of a while loop:

while ($var = each %hash) { ... }
while ($_ = each %hash) { ... }

actually has a defined check automatically added, e.g.

$ perl -MO=Deparse -e 'while ($_ = each %hash) { ... }'
while (defined($_ = each %hash)) {
  die 'Unimplemented';
}
-e syntax OK

whereas code that omits the assignment does not have defined added:

$ perl -MO=Deparse -e 'while (each %hash) { ... }'
while (each %hash) {
   die 'Unimplemented';
}
-e syntax OK

Contrast with (say) readdir, where defined is added and, if the assignment is omitted, an assignment to $_ as well:

$ perl -MO=Deparse -e 'while ($var = readdir D) { ... }'
while (defined($var = readdir D)) {
   die 'Unimplemented';
}
-e syntax OK
$ perl -MO=Deparse -e 'while (readdir D) { ... }'
while (defined($_ = readdir D)) {
   die 'Unimplemented';
}
-e syntax OK

Note, this is only for readdir in the condition of a while loop - it doesn't usually default to assigning to $_.

So, is this intended, or is it a bug? And if it's a bug, should it be fixed?

Turns out that the answer is, well, involved.

The trail starts with a ruling from Larry back in 1998:

As usual, when there are long arguments, there are good arguments for both sides (mixed in with the chaff). In this case, let's make

while ($x = <whatever>)

equivalent to

while (defined($x = <whatever>))

(But nothing more complicated than an assignment should assume defined().)

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1998-04/msg00133.html

Nick Ing-Simmons asks for a clarification:

Thanks Larry - that is what the patch I posted does.

But it also does the same for C<glob>, C<readdir> and C<each> - i.e. the same cases that solicit the warning in 5.004 - is extending the defined insertion to those cases desirable? (glob and readdir seem to make sense, I am less sure about each).

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1998-04/msg00182.html

(it's clarified in a later message that Nick I-S hadn't realised that each in scalar context returns the keys, so it's an analogous iterator which can't return undef for any entry)

In turn, the "RULING" dates back to a thread discussing/complaining about a warning added in added in 5.004

$ perl5.004 -cwe 'while ($a = <>) {}'
Value of <HANDLE> construct can be "0"; test with defined() at -e line 1.
-e syntax OK

The intent of the changes back then appears to be to retain the 5.003 and earlier behaviour of what gets assigned for each construct, but to change the loop behaviour to terminate on undefined rather than simply falsehood for the common simple cases:

while (OP ...)

and

while ($var = OP ...)

And there I thought it made sense - fixed in 1998 for readline, glob and readdir, but introducing the inconsistency because each doesn't default to assigning to $_. Except it turned out that there was a twist in the tail: while (readdir D) {...} didn't use to implicitly assign to $_ either. Both the implicit assignment to $_ and the defined test were added in 2009 by commit 114c60ecb1f7, without any fanfare, just like any other bugfix. And the world hasn't ended.

$ perl5.10.0 -MO=Deparse -e 'while (readdir D) {}'
while (readdir D) {
    ();
}
-e syntax OK
$ perl5.12 -MO=Deparse -e 'while (readdir D) {}'
while (defined($_ = readdir D)) {
    ();
}
-e syntax OK

Running a search of CPAN reveals that almost no code uses while (each %hash) [and why should it? The construction does a lot of work only to throw it away], and nothing should break if it's changed. Hence it makes sense to treat this as a bug, and fix it. Which has now happened, but I can't take credit for it - post 5.16.0, Father Chrysostomos has fixed it in blead.
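Meanwhile, code that wants the terminate-on-undefined behaviour without relying on any implicit transformation can simply spell out the defined test, which behaves the same on every version of perl:

use strict;
use warnings;

my %hash = (0 => 'zero', '' => 'empty');
while (defined(my $key = each %hash)) {
    # Both the "0" and "" keys are visited; a plain truth test
    # would have ended the loop early on either of them.
    print "<$key>\n";
}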

To conclude this story, the mail archives from 15 years ago are fascinating. Lots of messages. Lots of design discussions, not always helpful. And some of the same unanswered questions as today.

The digression arose from trying to replicate another old bug (ID 20010918.001, now #7698). I'd dug an old machine with FreeBSD 4.6 out from the cupboard under the stairs in the hope of reproducing the period problem with a period OS. Sadly I couldn't do that, but out of curiosity I tried to build blead on it. This is the same 16M machine whose swapping hell prompted my investigation of enc2xs the better part of a decade ago, resulting in various optimisations to its build time memory use, which in turn led to ways to roughly halve the size of the built shared objects, and a lot of the material then used in a tutorial I presented at YAPC::Europe and the German Perl Workshop, "When Perl is not quite fast enough". This machine has pedigree.

Once again, it descended into swap hell, this time on mktables. (And with swap on all 4 hard disks, it's very effective at letting you know that it's swapping.) Sadly, after 10 hours, and seemingly nearly finished, it ran out of virtual memory. So I wondered if, like last time, I could get the memory usage down. After a couple of false starts I found a tweak to Perl_sv_grow that gave a 2.5% memory reduction on FreeBSD (but none on Linux), but that wasn't enough. However, the cleanly abstracted internal structure of mktables makes it easy to add code to count the memory usage of the various data structures it generates. One of its low-level types is "Range", which subdivides into "special" and "non-special". There are 368676 of the latter, and the name for each may need to be normalised into a "standard form". The code was taking the approach of calculating the standard form at object creation time. With the current usage patterns of the code, this turns out to be less than awesome - the standard form is only requested for 22047 of them. By changing the code to calculate only when needed (and cache the result) I reduced RAM and CPU usage by about 10% on Linux, and 6% on FreeBSD. Whilst the latter saving is smaller, it was enough to get the build through mktables, and on to completion. The refactoring is now merged to blead, post 5.16.0. Hopefully everyone's build will be a little bit smaller and a little bit faster as a result.
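The change is the classic compute-lazily-and-cache pattern. As a minimal sketch (the field and helper names are illustrative, not the actual mktables code):

sub standard_form {
    my $self = shift;

    # Compute the standard form only on first request, then cache it.
    # Doing this up front for all 368676 Ranges wastes work, as only
    # 22047 of them are ever asked for it.
    $self->{standard_form} //= _standardize($self->{name});

    return $self->{standard_form};
}

sub _standardize {
    # Illustrative only - the real normalisation rules differ.
    my $name = lc shift;
    $name =~ tr/_ //d;      # strip underscores and spaces
    return $name;
}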

To complete the story, I should note that make harness failed with about 100 tests still to run, snatching defeat from the jaws of victory. It turns out that make harness also chews a lot of memory, to store test results. make test, however, did pass (except for one bug in t/op/sprintf.t, patch in RT #112820). Curiously, gcc, even when optimising, isn't the biggest memory hog of the build. It's beaten by mktables, t/harness and a couple of the Unicode regression tests. But even then, our build is very frugal. It should complete just fine with 128M of VM on a 32 bit FreeBSD system, and I'd guess under 256M on Linux (different malloc, different trade offs). I think that this means that blead would probably build and test OK within the hardware of a typical smartphone (without swapping), if they actually had native toolchains. Which they don't. Shame :-(

Part of May was spent getting a VMS build environment set up on the HP Open Source cluster, and using it to test RC1 and then RC2 on VMS.

Long term I'd like to have access to a VMS environment, not to actually do any porting work to VMS, but to permit refactoring of the build system without breaking VMS. George Greer's smoker builds the various smoke-me branches on Win32, so that makes it easy to test changes that would affect the Win32 build system, but no such smoker exists for VMS. Hence historically I've managed to do this by sending patches to Craig Berry and asking him nicely if he'd test them on his system, but this is obviously a slow, inefficient process that consumes his limited time, preventing him using it to instead actually improve the VMS port.

As the opportunity to get access turned up just as 5.16.0 was nearing shipping, I decided to work on getting things set up "right now" to try to get (more) tests of the release candidates on VMS. We discovered various shortcomings in the instructions in README.vms, and as a side effect of debugging a failed build, a small optimisation to avoid needless work when building DynaLoader. So it's likely that my ignorance will continue to be a virtue by finding assumptions and pitfalls in the VMS process that the real experts don't even realise that they are avoiding subconsciously.

We had various scares just before 5.16.0 shipped relating to build or test issues on Ubuntu, specifically on x86_64. This shouldn't happen - x86_64 GNU/Linux is probably the most tested platform, and Ubuntu is a popular distribution, so it feels like there simply shouldn't be any more bugs lurking. However, it seems that they keep breeding.

In this case, it's yet another side effect of Ubuntu going multi-architecture, with the result that the various libraries perl needs to link against are now in system dependent locations, instead of /usr/lib. This isn't a problem (well, wasn't once we coded to cope with it) - we ask the system gcc where its libraries are coming from, and use that library path. The raw output from the command looks like this:

$ /usr/bin/gcc -print-search-dirs
install: /usr/lib/gcc/x86_64-linux-gnu/4.6/

programs: =/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/bin/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/bin/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/bin/

libraries: =/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/:/lib/x86_64-linux-gnu/4.6/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/4.6/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../:/lib/:/usr/lib/

So the hints file processes that, to get the search path. It runs this pipeline of commands:

$ /usr/bin/gcc -print-search-dirs | grep libraries | cut -f2 -d= | tr ':' '\n' | grep -v 'gcc' | sed -e 's:/$::'
/lib/x86_64-linux-gnu/4.6
/lib/x86_64-linux-gnu
/lib/../lib
/usr/lib/x86_64-linux-gnu/4.6
/usr/lib/x86_64-linux-gnu
/usr/lib/../lib
/lib
/usr/lib
$

Except that all of a sudden, we started getting reports of build failures on Ubuntu. It turned out that no libraries were found, with the first problem being the lack of the standard maths library, hence miniperl wouldn't link. Why so? After a bit of digging, it turns out that the reason was that the system now had a gcc which localised its output, and the reporter was running under a German locale.

So, here's what the hints file sees under a German locale:

$ export LC_ALL=de_AT
$ /usr/bin/gcc -print-search-dirs | grep libraries | cut -f2 -d= | tr ':' '\n' | grep -v 'gcc' | sed -e 's:/$::'
$

Oh dear, no libraries. Why so?

$ /usr/bin/gcc -print-search-dirs
installiere: /usr/lib/gcc/x86_64-linux-gnu/4.6/
Programme: =/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/bin/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/bin/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/bin/
Bibliotheken: =/usr/lib/gcc/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/4.6/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/:/lib/x86_64-linux-gnu/4.6/:/lib/x86_64-linux-gnu/:/lib/../lib/:/usr/lib/x86_64-linux-gnu/4.6/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../x86_64-linux-gnu/lib/:/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../:/lib/:/usr/lib/

Because in the full output, the string we were searching for, "libraries", isn't there; it's now translated to "Bibliotheken".

Great. Unfortunately, there isn't an alternative machine-readable output format offered by gcc, so this single output format has to make do for both humans and machines, which means that the thing that we're parsing changes.

This is painful, and often subtle pain because we don't get any indication of the problem at the place where it happens. In this case, a failure in the hints file doesn't become obvious until the end of the link in the build.

The solution is simple - force the locale to "C" when running gcc in a pipeline. But it's whack-a-mole fixing these. It would be nice if more tools made the distinction that git does between porcelain (for humans), and plumbing (for input to other programs).
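Concretely, the fix is just to pin the locale for that one pipeline, along these lines:

$ LC_ALL=C /usr/bin/gcc -print-search-dirs | grep libraries | cut -f2 -d= | tr ':' '\n' | grep -v 'gcc' | sed -e 's:/$::'

so that the literal string "libraries" is present to match, whatever locale the user's shell runs under.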

The second Ubuntu failure report just before 5.16.0 was for t/op/filetest.t failing. It turned out that the test couldn't cope with a combination of circumstances - running the test as root, but with the build tree not owned by root, and the file permissions being such that other users couldn't read files in the test tree. This is all because testing that -w isn't true on a read-only file goes wrong if you're root, so there's special-case code to detect if the test is running as root, which temporarily switches to an arbitrary non-zero UID for that test. Unfortunately it also had a %Config::Config based skip within that section, and the read of obscure configuration information triggers a disk read from lib/, which fails if the build tree's permissions just happen to be restrictive. The problem had actually been around for quite a while, so Ricardo documented it as a known issue and shipped it unchanged.

So post 5.16.0, I went to fix t/op/filetest.t. And this turned into quite a yak shaving exercise, as layer upon layer of historical complexity was revealed. Originally, t/op/filetest.t was added to test that various file test operators worked as expected. (Commit 42e55ab11744b52a in Oct 1998.) It used the file t/TEST and the directory t/op for targets. To test that read-only files were detected correctly, it would chmod 0555 TEST to set it read only.

The test would fail if run as root, because root can write to anything. So logic was added to set the effective user ID to 1 by assigning to $> in an eval (unconditionally), and restoring $> afterwards. (Commit 846f25a3508eb6a4 in Nov 1998.) Curiously, the restoration was done after the C<-r> test, rather than before it.

Most strangely, a skip was then added for the C<-w> test based on $Config{d_seteuid}. The test runs after $> has been restored, so should have nothing to do with setuid. It was added as part of the VMS-related changes of commit 3eeba6fb8b434fcb in May 1999. As d_seteuid is not defined on VMS, this makes the test skip on VMS.

Commit 15fe5983b126b2ad in July 1999 added a skip for the read-only file test if d_seteuid is undefined. Which is actually the only test where having a working seteuid() might matter (but only if running as root, so that $> can be used to drop root privileges).

Commit fd1e013efb606b51 in August 1999 moved the restoration of $> earlier, ahead of the test for C<-r op>, as that test could fail if run as root with the source tree unpacked with a restrictive umask. (Bug ID 19990727.039)

"Obviously no bugs" vs "no obvious bugs". Code that complex can hide anything. As it turned out, the code to check $Config{d_seteuid} was incomplete, as it should also have been checking for $Config{d_setreuid} and $Config{d_setresuid}, as $> can use any of these. So I refactored the test to stop trying to consult %Config::Config to see whether root assigning to $> is going to work - just try it in an eval, and skip if it didn't. Only restore $> if we know we changed it, and as we only change it from root, we already know which value to restore it to.

Much simpler, and it avoids having to duplicate the entire logic of which probed Configure variables affect the operation of $>.
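In outline, the new logic looks something like this (a simplified sketch, assuming a skip() helper in the style of the one in t/test.pl):

my $restore_root;
if ($> == 0) {
    # Try to drop root privileges. On platforms where changing the
    # effective UID is unsupported, assigning to $> dies, hence the eval.
    eval { $> = 1 };
    if ($> == 0) {
        skip("cannot drop root privileges", 1);
    }
    else {
        $restore_root = 1;
    }
}

# ... test that -w is false on a read-only file here ...

# Only restore $> if we actually changed it; we know it was 0 before.
$> = 0 if $restore_root;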

Finally, I spotted that I could get rid of a skip by using the temporary file the test (now) creates rather than t/TEST for a couple of the tests. The skip is necessary when building "outside" the source tree using a symlink forest back to it (./Configure -Dmksymlinks), because in that case t/TEST is actually a symlink.

So now the test is clearer, simpler, less buggy, and skips less often.

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules
ID YYYYMMDD.### is a bug number in the old bug system. The RT # is given
afterwards. You can look up the old IDs at https://rt.perl.org/perlbug/

Hours  Activity
 0.50  ?->
 1.25  AIX bisect
 0.75  AIX ccache
 1.00  HP-UX 32 bit -DDEBUGGING failure
 1.50  ID 20000509.001 (#3221)
 0.25  ID 20010218.002 (#5844)
 1.50  ID 20010305.011 (#5971)
 1.25  ID 20010309.008 (#6002)
 0.25  ID 20010903.004 (#7614)
 0.50  ID 20010918.001 (#7698)
 0.50  ID 20011126.145 (#7937)
 0.50  IO::Socket::IP
 3.75  RT #108286
 0.25  RT #112126
 0.25  RT #112732
 0.50  RT #112786
 0.75  RT #112792
 0.75  RT #112820
 1.00  RT #112866
 0.50  RT #112914
 0.75  RT #112946
 0.50  RT #17711
 0.25  RT #18049
 0.25  RT #29437
 0.25  RT #32331
 0.25  RT #47027
 0.50  RT #78224
 0.50  RT #94682
 0.50  Ubuntu link fail with non-English locales
 4.25  VMS setup/RC1
 7.00  VMS setup/RC2
 1.00  clarifying the build system
 1.00  installhtml
 4.50  mktables memory usage
 1.25  process, scalability, mentoring
46.25  reading/responding to list mail
 1.75  smoke-me branches
 0.25  smoke-me/trim-superfluous-Makefile
 5.25  t/op/filetest.t
 1.00  t/porting/checkcase.t
 0.25  the todo list
 1.00  undefined behaviour from integer overflow

96.00 hours total

Dave Mitchell has requested an extension of $20,000 for his Fixing Perl5 Core Bugs grant. During this grant he has sent weekly reports to the p5p mailing list as well as providing monthly summary reports that have been published on this blog. He has also provided a short summary of the work he completed during his last extension:

My previous $20K extension has, over the last 9 months, been spent almost exclusively working on a single meta-bug:

#34161 METABUG - (?{...}) and (??{...}) regexp issues

This work is almost complete, and I expect to merge the work (approx 120 commits) back into bleadperl shortly (although there will be some minor fixes still to do after that). At this point I will be able to close around 25 outstanding tickets linked to the meta-ticket.

I anticipate using the next round of grant money to work on a larger number of smaller issues!

Before we make a decision on this extension we would like to have a period of community consultation. Please leave feedback in the comments or if you prefer send email with your comments to karen at perlfoundation.org.

Joel Berger wrote:

Well, it's that time again. Thankfully the news is getting better once more!

Much of my time which is earmarked for Perl went to preparing for my YAPC talks. I'm glad they went so well; thanks to all of you who attended. I even got a question about Alien::Base during one of the Q&As, so I'm glad to know that people are interested.

The news this month, hopefully, is that Windows is passing tests! Or it should be, once some Windows test reports show up. It passes on the only Windows dev box that I have access to.

Also I have begun work on the fixes for the Mac problem I described last month. I still would like to figure out some way to ensure that the linker flag -headerpad_max_install_names is passed, rather than ensuring a really long build path is used. Perhaps using the Makefile ENV => Variables trick? Of course that assumes your project uses make and doesn't clobber variables but appends to them.
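For what it's worth, when the library's own Makefile does pick up LDFLAGS from the environment (a big assumption, as noted above), it could be as simple as:

$ LDFLAGS="-Wl,-headerpad_max_install_names" make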

Future plan:

This Mac problem has stymied me for too long, and I think it might be time (now that Windows passes) to move forward. Certainly if the Mac host has the library installed Alien::Base should be able to detect it; the problem is only on the installation side, and it's being looked at. I really need to start getting some real-world feedback. I have heard from several people that they have projects in mind; I think that, assuming the Windows build tests do pass, it's time to move to the alpha phase.

During alpha testing I will want people to start creating their own Alien::MyLibrary modules. I especially need to know where the configuration scheme is not general enough. I also need to know what is confusing in the documentation/examples so I can clear that up too. Of course, during alpha testing I am making no promises that the API won't change, so don't release your dependent Alien:: modules to CPAN just yet, but please do inform me if you make a new GitHub project or branch using Alien::Base; I definitely want to follow those.

I know I have said it before, but the official call to arms should come very soon. The released version will be 0.001. Still, if you want to get started now, you have my blessing.

Finally I want to mention that my original grant proposal had mentioned being done by now. Obviously this isn't the case. However, seeing as I am not being paid by the month (or any other timeframe for that matter), I would hope that seeing that the project is still alive and well should be enough to keep my grant open.

Cheers!

Original article by Joel Berger at blogs.perl.org.

In accordance with the terms of my grant from TPF this is the monthly
report for my work on improving Devel::Cover covering May 2012.

This report is essentially the same as my report for the first week, but
that will not be the usual case. If you read that report you can safely
skip this one.

You may recall that recently I submitted a grant proposal to The Perl
Foundation to work on Devel::Cover. I'd like to thank everyone who
thought that this might be a good idea, and I'm pleased to say that the
proposal was accepted and I have started work. Modelled on the
successful grants of Dave and Nick, one of the conditions of the grant
is that I should produce weekly and monthly reports of my progress, and
these reports will be sent to perl-qa@perl.org. If you are not
interested in these reports, please set up a filter, or just ignore
them.

I suppose there are two deliverables this month - Devel::Cover 0.87 and
http://cpancover.com

My plan is to work primarily on existing, relatively simple bugs and
problems before getting down to the complicated bugs and the more
interesting work of adding functionality, and most of my work so far has
indeed fallen into that category. There have been a couple of
deviations from that - one planned and one not. Much of the rest of my
work was related to Moose and Mouse. I've had a report about Moo which
I need to look at, but I've heard nothing about Mo so I'm not sure if
there are any problems there.

Very shortly after I started work, perl-5.16.0 was released. So the second
bug that I looked at was RT 75314, which concerned a regression in condition
coverage in 5.16. Unfortunately this problem went to the core of one of the
most tricky parts of Devel::Cover and it took me quite some time to get to
the bottom of it. The good news, though, is that without this grant I would
never have been able to solve this problem, and so condition and branch
coverage in 5.16 would have remained broken.

The bulk of the fix is found in f26ba32 and it relates to an
optimisation from David Mitchell:
http://perl5.git.perl.org/perl.git/blobdiff/0e1b3a4b35c4f6798b244c5b82edcf759e9e6806..db4d68cf2dda3f17:/op.c

Nick's bisecting code in the perl core was very helpful in fingering
this commit but, really, I should have remembered it. When Dave
announced this optimisation, I suspected that it would affect
Devel::Cover, but I had not fully followed up on that. The problem for
Devel::Cover is that I was depending on that op ordering to properly
calculate condition coverage.

Devel::Cover has two ways of collecting condition coverage, depending on
whether or not the condition is short circuited. If it is short
circuited, then when we get to the condition all we have to do is look
at the top of the stack to see whether the LHS is true or false. If
not, we have to remember that we are interested in the value of the RHS
when it has been evaluated.
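For example (a contrived snippet - compute_default() is just an illustrative helper):

use strict;
use warnings;

sub compute_default { 42 }

my $x = 0;
my $r = $x || compute_default();   # $x false: the RHS is evaluated
$x = 1;
$r = $x || compute_default();      # $x true: || short-circuits, RHS never runs

It's the first assignment - where the RHS actually runs - that needs the remembering described next.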

Devel::Cover does this by hijacking the op_next pointer to look at the
value of the RHS and collect its coverage before moving on to the
expected op. The optimisation meant that we never evaluated the RHS and
so Devel::Cover didn't know its value. So Devel::Cover fell back to its
default behaviour in such cases which is to assume that the RHS is
false, as it is in cases when we never return a value from the RHS.
This is somewhat simplified, but gives you the idea of what is going on.
(Incidentally, this is also the primary reason that Devel::Cover doesn't
run under threads.)

The solution to this involves grovelling around the optree a little more
to find the conditional ops and, whilst it solves most of the problems,
there is still a little work to be done in this area. (Where "a little"
means that the problem is simple, but the solution may not be.)

There's also a question here of whether Devel::Cover should be doing
this sort of thing, or whether I should be adding some hooks into the
core, for example. Hopefully, I will be able to look into such matters
later.

The other main area I looked at was getting cpancover up and running
again. This is not completely altruistic, since I tend to think of CPAN
as my extended test suite. I think cpancover could be quite useful, and
the current version gets many hits.

To this end, I was able to procure a machine from the nice folk at
bigv.io and I bought the domain name cpancover.com, with the result that
http://cpancover.com now points to an updated list of coverage for some
CPAN modules, run against perl 5.16.0. If there are any modules you
would like to see added here, please let me know, or just make your
changes to utils/install_modules in https://github.com/pjcj/Devel--Cover
and send me a pull request.

So, the work I have completed in the time covered by this report is:

Closed RT tickets:

  • 75944 Warning about Moose clearers when processing Test::TempDir
  • 75314 conditional testing broken in blead?
  • 73452 Bleadperl v5.15.5-331-g4d8ac5c breaks PJCJ/Devel-Cover-0.79.tar.gz
  • 69892 Tests busted on perl 5.15
  • 57174 Devel::Cover 0.66 does not work with Moose+type constraints
  • 68389 Moose's make_immutable call causes Devel::Cover to emit many warnings
  • 77163 t/e2e/amoose_basic.t fails while installing Devel::Cover 0.86
  • 37350 Missing whatis entry of Devel::Cover::Tutorial manual
  • 71680 Devel::Cover spews error-messages when Moose is used
  • 72819 0% branch coverage for methods defined in non-reported files
  • 68353 perl 5.14.0 breaks D:C in t/e2e/ainc_sub.t
  • 63090 conditionals where one element is data from a Moose attrib are not evaluated
  • 63568 Devel::Cover can't handle Test::More's is()

Closed Github tickets:

  • 12 Devel::Cover 0.80 and up fails t/e2e/amoose_basic.t
  • 9 Remove use Data::Dumper from Devel::Cover

Merged pull requests:

  • 14 -launch option opens report in appropriate viewer

Fixed cpantesters reports:

And many more covering the same problems in various guises.

You can see the commits at https://github.com/pjcj/Devel--Cover/commits/master

Hours worked: 44:10

To give a few interested groups more time to prepare and submit their bids for this year, we're extending the deadline to July 1. You can see details on how to submit in previous posts. If you're at YAPC::NA right now and you're interested, you should attend Dan Wright's talk, "So, you want to run a Perl event?", and the YAPC BOF.

For those of you who aren't able to attend YAPC::NA in person, this year's organizers set up live streams of the conference. You can use the conference schedule to find which talks you want to see and then stream the room you are interested in. The rooms are:

You'll need Microsoft Silverlight, Adobe Flash, or Apple Quicktime to be able to view these streams.

The streams will go live at 9am US Central time on June 13th.

Enjoy the conference!

Dave Mitchell writes:

As per my grant conditions, here is a report for the May period.

This month I continued to rework how the code blocks in /(?{code})/ are actually invoked, and the work is mostly finished. In particular,

  • /(?{die/next/last/caller})/ no longer SEGVs;
  • recursive (?{}) calling (?{}) etc works;
  • paren captures work correctly in the presence of recursive and nested (?{}) and (??{});
  • I also found and fixed some buggy behaviour of $^N and $+ when backtracking is involved (but not related to (?{}));
  • and I generally cleaned up the paren capturing code while I was at it;
  • propagating 'use re eval' into the return from (??{});
  • saving paren positions when running (?{}) code.

Over the last month I have averaged 17 hours per week.

As of 2012/05/31: since the beginning of the grant:

116.6 weeks
1237.1 total hours
10.6 average hours per week

There are now 63 hours left on the grant.

Report for period 2012/05/01 to 2012/05/31 inclusive

Summary

Effort (HH::MM):

0:00 diagnosing bugs
78:26 fixing bugs
0:00 reviewing other people's bug fixes
0:00 reviewing ticket histories
0:00 reviewing the ticket queue (triage)
-----
78:26 Total

Numbers of tickets closed:

0 tickets closed that have been worked on
0 tickets closed related to bugs that have been fixed
0 tickets closed that were reviewed but not worked on (triage)
-----
0 Total

Short Detail

78:26 [perl #34161] METABUG - (?{...}) and (??{...}) regexp issues

Moritz has completed his grant and provided the following closing report.

Moritz Lenz writes:

After working for more than a year on my grant on exceptions, I am now confident that I have done all that I promised, and can close the grant.

Deliverables

A short summary of what I did for each deliverable follows

D1: Specification

S32::Exception contains my work in this area. It provides information about the basic exception types, the backtrace printer and how they interact with the rest of Perl 6.

There are certainly still open design questions in the general space of exceptions - for example, how do we indicate that an exception should or should not print its backtrace by default? There are ways to achieve this right now, but it's not as easy as it should be for the end user. However, those open questions are well outside the realm of this grant. I still plan to tackle them in due time.

Several approaches to localization and internationalization are now within reach and only wait for somebody to do it.

D2: Error catalog, tests

The error catalog is also in S32::Exception. It is not comprehensive (i.e. it doesn't cover all possible errors that are thrown from current compilers), but the grant request only required an "initial" catalog. It is certainly enough to demonstrate the feasibility of the design, and to handle very many common cases.

Tests are in the roast repository. At the time of writing (2012-06-07) there are 424 tests, of which Rakudo passes nearly all (the few failures are due to known bugs not related to the exception subsystem). I added a utility test function throws_like to Test::Util which makes testing of typed exceptions very easy.

D3: Implementation, tests, documentation

Rakudo now throws only typed exceptions from its setting (with the exception of internal errors). Note that before my work started it only allowed strings as exceptions.

The tests mentioned above already cover several bug reports where people complained about wrong or less-than-awesome error messages. Since my main motivation was to make error testing more robust, I consider this a big success.

Documentation for compiler writers and test authors is available.

Other Exceptions Progress

I'd also like to mention that I did several things related to exceptions which were not covered by this grant:

  • greatly improved backtrace printer
  • Many exceptions from within the compilation process (such as parse errors, redeclarations etc.) are now typed.
  • I enabled typed exceptions thrown from C code, and as a proof of concept I ported all user-visible exceptions in perl6.ops to their intended types.
  • Exceptions from within the meta model can now be caught in the "actions" part of the compiler, augmented with line numbers and file name and re-thrown
  • The Rakudo developers usually only close bug reports when tests are available. I wrote many tests for specific error conditions in response to such bug reports and closed the tickets.

Acknowledgements

I'd like to thank Ian Hague and the Perl Foundation for funding this grant, Karen Pauley and Will Coleda for managing it, and all the people who helped me designing, programming and wording things, especially Jonathan Worthington.


I am pleased to announce that Jess Robinson's grant application, Improving Cross compilation of Perl 5, has been successful. This grant will be managed by Tom Hukins and Renee Bäcker.

I would like to thank everyone who provided feedback on this grant, both on our blog and during the consultation phase.

Nicholas Clark writes:

The largest part of the month was spent on various topics related to cleaning up the build process - in particular, simplifying the top-level Makefile, by removing duplication and self-recursion.

On platforms that run Configure (so Linux, Unix, Cygwin and OS/2), the Makefile is extracted from a shell script Makefile.SH. Most of it is written out verbatim, but there is some shell logic to generate OS or configuration specific code.

In turn that Makefile is used by makedepend to generate a second Makefile containing dependency information (named makefile on most systems, and GNUmakefile on OS X - both names chosen so that the make utility picks the newer makefile in preference).

The makefile compiles all the object files and links them into miniperl, which is pretty much the real perl, only without any form of dynamic linking. Between 5.005 and 5.6.0 things got a bit more complex, because perl switched to using File::Glob to implement globbing. [File::Glob uses the BSD glob code, written by that well-known Perl contributor Guido van Rossum :-), safely quarantined in its own file to avoid licensing cross-contamination]. As File::Glob needs miniperl to build, and miniperl isn't built yet, to solve the bootstrapping problem op.c is conditionally compiled as opmini.o, and that instead calls the old "shell out to csh" globbing code. This need to link with opmini.o instead of op.o becomes relevant later.

miniperl bootstraps the rest of the build, and is used as much as possible to avoid writing any subsequent build system 3 (or more) times - shell, batch files and DCL scripts. In particular, miniperl builds DynaLoader and perlmain.c, and links both with all those already compiled object files to make the real perl.

There is a configuration option to build a shared perl library. In this case, the C code is compiled to be position independent, and linked into a shared libperl, with the perl binary just a small stub linked against it. This also becomes relevant later, as libperl.so (or .dylib etc) contains the code for op.o, but not opmini.o.

Currently in blead, the Makefile has 20 places where it calls $(MAKE) to run a different target in the top level directory. Of these, this one is the most troubling:

$(LDLIBPTH) $(RUN) ./miniperl$(HOST_EXE_EXT) -w -Ilib -MExporter -e '<?>' || $(MAKE) minitest

as it can cause a fork bomb with a parallel make if miniperl builds but fails to work. Having been bitten by that fork bomb one time too many, I decided to eliminate it. However, the above line is actually replicated in 4 different places - 3 OS specific, and the generic fallback. Hence the first digression into yak-shaving: reduce that number.

So, we have AIX (and BeOS), NeXT, OS X and "general". 3 of those are very much alive, so we can't just kill the code...

Firstly, AIX.

On AIX, when building with a shared perl library, one needs the list of symbols that the library should export when building it. For 5.005 and earlier, that list was generated by a shell script, so there was no bootstrapping problem.

However, the export list had to be manually updated in the shell script, so with commit 549a6b102c2ac8c4 in Jul 1999, the Win32 solution was generalised to work on both AIX and Win32. This uses a Perl script, makedef.pl, to parse various files and generate the current export list automatically. This introduces a bootstrapping problem - miniperl is needed to generate libperl.a, but libperl.a is needed to generate miniperl. The commit solves the problem by introducing a new target, MINIPERL_NONSHR, which builds a special statically-linked miniperl (named miniperl_nonshr) in order to run makedef.pl, which in turn permits libperl.a and the regular miniperl to be built, and the build to proceed.

All was well until commits 52bb0670c0d245e3 and 8d55947163edbb9f (Dec 1999) changed the default for CORE::glob() to use the XS module File::Glob, and linked miniperl against an opmini.o, built from op.c but with compiler flags to use the old glob-via-csh code. The change made for AIX was to build miniperl_nonshr with the bootstrapping glob code, but leave the build of miniperl unchanged. This broke the build on AIX - miniperl would build just fine, but would fail to build any XS extensions, as the ExtUtils::MakeMaker code requires working globbing.

The AIX build was fixed with commit 18c4b137c9980e71 (Feb 2000) by changing Makefile.SH so that AIX used the same rules to build miniperl as NeXT. The rules for NeXT generated miniperl from an explicit list of object files, instead of using libperl.a. The result of this change was that on AIX, miniperl was now identical to miniperl_nonshr. Both correctly use the csh globbing code, but now neither require the shared libperl.a to work.

This makes miniperl_nonshr redundant. So I eliminated it. This starts to converge the code for AIX with the other platforms.

The default case:

Curiously, as a side effect of commit 908fcb8bef8cbab8 (Dec 2006) which moved DynaLoader.o into libperl.so, the default platform build rules for miniperl were changed to use an explicit list of object files, instead of C<-lperl>, which had the side effect of building miniperl non-shared. Hence all platforms now have a non-shared miniperl when building perl with a shared perl library.

Next, OS X:

Commit cb3fc4263509f28c (May 2003) removed the use of -flat_namespace from the link flags, but added it specially in Makefile.SH for miniperl, so that symbols from opmini.o overrode those in libperl.dylib. However, a side effect of commit 908fcb8bef8cbab8 (Dec 2006) was to change the linker line to use explicit object files, meaning that op.o was no longer part of linking, meaning that the override is no longer needed. Hence darwin's link does not need special-casing.

Lastly, NeXT:

Whilst it's not clear whether anyone is still using NeXT, since the previous changes have refactored the other 3 cases to use an explicit list of object files, the only difference remaining between the makefile rule for NeXT and the rest is that NeXT doesn't use $(CLDFLAGS) when linking miniperl, whereas all the others do. It's not clear whether this difference is significant or accidental, but lacking any NeXT system to test on, it was simple enough to preserve the difference but use simpler code to implement it.

The result - all 4 cases are merged. However, I've not yet actually eliminated that particular troublesome C<|| $(MAKE) ...> rule, as testing it properly requires C<make minitest> to pass first time, without first building all the non-XS extensions - work I've made progress on, but not yet completed.

As AIX now uses the regular miniperl to run makedef.pl, the AIX-specific section of the Makefile now looks much more similar to the OS/2-specific section that runs makedef.pl there. With some refactoring of makedef.pl to avoid needing a command-line parameter to specify the platform to build for, and passing in a -DPERL_DLL on AIX which does nothing, the two converge even further. It then became a small matter of using Makefile macros to encode the different dependencies and filenames used, at which point the two can share code, and Makefile.SH gets simpler.

In turn, I also eliminated a lot of redundancy in the various variant install targets (each of which had been implemented with a call to $(MAKE) in the same directory), and the pre-requisites for various debugging targets.

One of these needs to pass a flag to installperl to instruct it to run strip on the installed binaries. installman doesn't need to strip anything, so doesn't accept such a flag. As it's the only difference between the invocations of the two, I decided that the simplest option was actually to make installman accept a --strip flag and do nothing with it. This, as ever, isn't as simple as it seems. installperl's strip flag is currently -s, but installman uses Getopt::Long, so accepts -s as an abbreviation for --silent. Hence the best solution is to have both accept a long option, --strip. This means refactoring installperl to use Getopt::Long, which in turn is "fun" because it accepts both +v and -v as distinct options, which Getopt::Long can't support. Or, more pragmatically, can't directly support, in the general case. Fortunately installperl isn't the general case, as it takes no non-option arguments, which permits a reasonably simple solution.
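The shape of that solution is roughly this (a sketch with illustrative option names, not the actual installperl code):

use strict;
use warnings;
use Getopt::Long;

# installperl takes no non-option arguments, so '+v' (which Getopt::Long
# cannot express) can safely be filtered out of @ARGV beforehand.
my $plus_v = grep { $_ eq '+v' } @ARGV;
@ARGV      = grep { $_ ne '+v' } @ARGV;

GetOptions(
    'verbose|v' => \my $verbose,
    'silent|s'  => \my $silent,
    'strip'     => \my $strip,  # installman accepts --strip too, and ignores it
) or die "Bad options\n";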

With this, more duplication died. En route, I spotted and eliminated some dead code in installperl related to a 5.000->5.001 change - specifically, where in @INC autosplit installs files. Small instances of this sort of cruft accumulate all over the source tree, but generally such code is never annotated sufficiently to make it obvious that it had a specific time-limited purpose. And of course, it's maybe only 1% of the slightly "look twice and wonder why" code that is actually redundant - most is subtly useful on some platform or configuration corner case that is hard to test, but likely still needed somewhere.

This work is in the branches smoke-me/Makefile-miniperl-unification, smoke-me/Makefile-norecurse, smoke-me/make_ext and smoke-me/perlmodlib

Makefile.SH is a lot better than it was, although there's still some more to do that will make it simpler still. As well as fixing minitest, still to do are some further simplifications of how ./miniperl is invoked to run various utilities. Most of the Makefile command lines have -Ilib -Idist/Cwd -Idist/Cwd/lib, dating back to the time when Cwd was detangled from ext/ and lib/ into a single directory in dist/ with the same layout as the CPAN distribution. However, since then I refactored miniperl to use the sitecustomize code to set up the build @INC automatically, meaning that everything after that first -Ilib is now redundant. So that's still to clean up.

As ever, the age and gnarliness of the codebase, combined with the complexities resulting from being able to build various different configurations on many different platforms means that often I spend a lot of time investigating things, but little code changed as a result.

The most obvious example of this was while investigating an unrelated problem, happening to use valgrind on OS X, and discovering that it was reporting an error during interpreter startup [specifically down from S_init_postdump_symbols()]. Strange and troubling - strange because I thought we'd nailed all of these, and troubling because interpreter start up is a fairly common code path. As in, 100% of programs traverse it. So it's important not to mess up and possibly open the door for malice. Except that the more I dug, the more this seemed to be S.E.P. (Somebody Else's Problem) - either a bug in gcc or in valgrind. I'm aware of Klortho's advice on this matter:

#11907 Looking for a compiler bug is the strategy of LAST resort. LAST resort.

but it did really seem to be a bug somewhere in the toolchain - malloc allocating 142 bytes, and then memmove attempting to read 8 bytes from offset 136, 6 bytes before the end of the 8*n+6 sized block. So I noted the symptoms as RT #112198 in case anyone else hit them and searched for them, and then rejected the ticket, thinking I was done.

Of course, I wasn't. Tony Cook recognised the symptoms as being the same as a Mozilla bug: https://bugzilla.mozilla.org/show_bug.cgi?id=710438#c3 which leads to a valgrind bug: https://bugs.kde.org/show_bug.cgi?id=285662 which had been marked as resolved in their svn trunk. I've built valgrind from svn and verified that their r12423 genuinely fixes it.

While on the subject of Advice from Klortho #11907, I investigated further the failure of HP-UX to build blead, failing on File::Glob with very similar symptoms to a mysterious Cygwin failure - lib/File/Glob.pm is reported as containing syntax errors, because a postfix when is not recognised as being enabled syntax - the use feature 'switch'; is being ignored.

However, in this case it turns out that the bug isn't the same as the Cygwin one (lack of a suitable cast in our code), but instead similar to an earlier bug on AIX. In that case, a compiler bug in handling the C99 bool type with paired ! operators caused the control-V byte in the name $^V not to be considered a control character. In this case, a compiler bug with bool and the && operator caused features never to be enabled. Again, it was worked round with a trivial change in a header file in how we express a construction. It's a bit frustrating, but it does seem that 12 years isn't really enough time for the compiler writers to get all the bugs out of these new fangled bool thingymabobs. Stay tuned next month for further fun with HP's compiler.

I investigated what gcc's -ftrapv flag reveals about the prevalence of signed integer overflow in the core C code. Unsigned integer overflow in C is well defined - it wraps. Signed integer overflow, however, is undefined behaviour (not just implementation defined or unspecified), so really isn't something we should be doing. Not having to care about undefined behaviour gives C compilers the loopholes they need to optimise conformant code. Whilst right now compilers usually end up exposing the behaviour of the underlying CPU (which is nearly always 2s complement these days), the core's code is making this assumption, but should not.

The -ftrapv flag for gcc changes all operations that might cause signed overflow to trap the undefined behaviour and abort(). This, of course, is still a conformant compiler, because undefined behaviour is, well, undefined. (Be glad that it didn't format your hard disk. That's fair game too, although one hopes that the operating system stops that.)
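Trying this at home is just a matter of appending the flag at Configure time, something like:

$ sh Configure -des -Dusedevel -Accflags=-ftrapv

where -Accflags appends to the C compiler flags that Configure would otherwise choose.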

It turns out that there are quite a lot of bad assumptions in the code about signed integer overflow. These seem to fall into three groups:

0) The code used to implement the arithmetic operators under use integer;

1) The assumption that the signed bit pattern for the most negative signed value, IV_MIN, is identical to the bit pattern for the unsigned negation of that value. ie on a 32 bit system, the IV -2147483648 and the UV 2147483648 have the same bit representation

2) A lot of code in the regular expression engine uses the type I32

The regular expression engine code is complex. It only causes aborts under -ftrapv on a 32 bit system, probably due to type promotion on 64 bit systems for the problematic expressions, such that the values don't overflow. However, it's clear that in places the engine is relying on sentinel values such as -1, and it's not clear whether these are tightly coupled with pointer arithmetic, so a brute force approach of trying to "fix" the problem by converting I32 to SSize_t is likely to cause far more bugs than it fixes.

Sadly, the use integer code is doing exactly what it's documented to do - "native integer arithmetic (as provided by your C compiler) is used". And as your C compiler's integer arithmetic is undefined on signed overflow, so will your Perl code be. So we have to accept that this is going to raise the ire of -ftrapv. However, this causes problems when testing, as code to test it will abort, and it's not possible to make exceptions from -ftrapv on a function-by-function basis. So the best solution seems to be to break out the integer ops into their own file, and set the C compiler flags specially for that file (which we already have the infrastructure for).

The middle set of code turns out to be relatively easy to fix. In part it can be done by changing conversions between IV_MIN and its negation from obvious-but-overflowing terse expressions to slightly longer expressions that sidestep the overflow. Most of the rest can be fixed by bounds checking - negating values in range, and explicitly using IV_MIN when negating the value 1-(IV_MIN+1). One surprise was the code used to avoid infinite looping when the end point of a .. range was IV_MAX, which needs a special case, but the special case in turn was written assuming signed integer overflow. That code is fixed in smoke-me/iter-IV_MAX, the rest in smoke-me/ftrapv, and some related cleanup of pp_pow in smoke-me/pp_pow. All will be merged to blead after 5.16.0 is released.

To continue this, I investigated using CompCert to compile Perl. CompCert is a "verified compiler, a high-assurance compiler for almost all of the ISO C90 / ANSI C language, generating efficient code for the PowerPC, ARM and x86 processors." -- http://compcert.inria.fr/

However, its description of "almost all" is accurate - it doesn't include variable argument lists, which is kind of a show stopper for us. Still, as an experiment I tried running Configure with it to see what happened. Result! We generate an invalid config.h file. So that's now RT #112494, with a fix ready to go in post 5.16.0.

I also spent some time investigating precisely how and when we lost support for using sfio (in place of stdio). It turns out that the build of miniperl is restored by 7 one-line fixes (now preserved for posterity in the branch nicholas/misc-tidyup, as they're not relevant to getting 5.16.0 released). More usefully, it revealed a section of Storable.xs which is no longer needed, so that's 14 lines of code to kill. However, this doesn't mean that sfio is nearly working and about to be useful again. With this fixed, the build fails thanks to a strange issue in which some Makefiles generated by ExtUtils::MakeMaker are truncated at 8192 bytes. I dug further, because I wanted to be sure that it wasn't a symptom of some mysterious core buffering bug - based on some historical anomalous CPAN smoker results, I'm suspicious that we may well have one. It turns out, fortunately, that in this case it's not our bug. The cause was far more interesting - something I'd never even considered. In C, a pointer to the element immediately after an array is valid for pointer arithmetic. (But, obviously, not the value it points to. So don't dereference it.) The usual use of this is to store a buffer end pointer - increment the active pointer, and if it's equal to the buffer end, do something.
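
In code, the idiom looks something like this generic sketch (the names put_byte, flush_buffer and BUFSIZE are mine, for illustration - this is not sfio's actual code):

#include <stddef.h>

#define BUFSIZE 8192

static char buf[BUFSIZE];
static char *next = buf;
static char *const end = buf + BUFSIZE;  /* one past the end: valid to
                                            compute and compare against,
                                            but never to dereference */

extern void flush_buffer(char *start, size_t len);  /* hypothetical */

void put_byte(char c)
{
    *next++ = c;
    if (next == end) {          /* buffer full: "do something" */
        flush_buffer(buf, BUFSIZE);
        next = buf;
    }
}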

However, what's not obvious from that is that the memory address immediately after the array might also be the start of some other array that the program holds a pointer to. The bug was that sfio's "do something" (i.e. empty the write buffer) was deferred until the next call that wrote to the stream. However, ahead of that check was a check related to sfreserve(), which permits user code to write directly into the buffer. This is implemented by handing out the internal "next write" pointer, and the user code signals that it is done by calling sfwrite() with this (internal) pointer. The special case was intended to express "is the passed-in write-from pointer identical to the internal next write pointer?" However, that check breaks if it just happens that the internal buffer is full ("next write" points one beyond the buffer), and the user's buffer just happens to start at the very next byte of memory. This turns out to be possible on FreeBSD, where malloc() is able to allocate contiguous blocks of memory. So, fortunately, this isn't our bug. I've reported it to Phong Vo, who confirms that it's a bug in sfio.
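
The collision is easier to see in isolation. In this sketch (illustrative, not sfio's code) the equality test is perfectly legal C, but it cannot distinguish "the caller handed back my internal pointer" from "the caller's own buffer happens to start at the next byte":

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *buf  = malloc(64);      /* the stream's internal buffer  */
    char *user = malloc(16);      /* the caller's separate buffer  */
    char *next_write = buf + 64;  /* buffer full: one past the end */

    /* If malloc() happens to hand out contiguous blocks, as
       FreeBSD's can, this is a false positive for a completely
       unrelated buffer. */
    if (user == next_write)
        puts("user buffer starts exactly where the full buffer ends");
    else
        puts("blocks not contiguous on this run");

    free(buf);
    free(user);
    return 0;
}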

Even with this fixed locally, a perl built with sfio fails even more tests than one built without PerlIO enabled. Unless someone external with an ongoing interest in sfio works to fix these issues, the days of the current sfio code are numbered. Dead non-working code just gets in the way.

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules
ID YYYYMMDD.### is a bug number in the old bug system. The RT # is given
afterwards.

Hours  Activity
 0.25  'd' flags (on valid_utf8_to_uv{chr,uni})
 0.75  AIX bisect
 5.25  AIX make bug
       AIX make bug (ccache)
       AIX make bug (https://bugzilla.samba.org/show_bug.cgi?id=8906)
       AIX make bug (xlc -DDEBUGGING segv)
 0.50  DL shared library (RT #40652)
 6.50  HP-UX build failure (cc bug)
 0.25  HP-UX cc bug
 7.25  ID 20000721.002 (#3561)
 0.25  ID 20010801.037 (#7425)
 1.00  ID 20020513.013 (#9319)
 2.50  LDLIBPTH
 0.25  MSDOS died with 5.000, DJGPP only appeared in 5.004
 2.00  Makefile improvements
       Makefile improvements (VMS)
 7.00  Makefile-norecurse
11.00  Makefile.SH miniperl rule unification
 0.25  Pod::PerlDoc
 3.50  RT #108286
 0.50  RT #112312
 4.00  RT #112350
 1.25  RT #112370
 0.25  RT #112404
 0.25  RT #112478
 1.25  RT #112494
 2.00  RT #112504
 0.25  RT #112536
 2.25  RT #24250
 0.50  RT #33159
 0.25  RT #36309
 0.50  RT #40652
 1.50  Solaris -xO3 failure
 0.75  Solaris bisect.
       Solaris bisect. (-xO3)
 0.50  Testing on OpenBSD
12.00  bisect.pl
       bisect.pl (HP-UX and AIX)
       bisect.pl (HP-UX build failure (cc bug))
       bisect.pl (user fixups)
       bisect.pl (valgrind)
 8.50  build process [autodoc, perlmodlib, $(Icwd)]
 0.25  cross compilation
 3.50  embed_lib.pl
 5.50  installman
 6.50  minitest cleaner
 1.50  pp_pow
 0.50  process, scalability, mentoring
24.50  reading/responding to list mail
 4.00  review davem/re_eval
 7.50  sfio
 2.50  split %SIG panic
 1.50  the todo list
 9.50  undefined behaviour caught by gcc -ftrapv
 3.75  valgrind error on OS X (RT #112198)

156.00 hours total

*  http://en.wikipedia.org/wiki/SEP_field
** to be more accurate, an expression involving a variable, ||, && and a function returning bool.

Dave Mitchell writes:

As per my grant conditions, here is a report for the April period.

Mostly spent the month improving how the code blocks in /(?{code})/ are actually invoked. Previously a proper entry wasn't made on the context stack, so things like die, next, last, caller and return inside /(?{...})/ caused erratic behaviour, and even segfaults.

My new method makes use of the MULTICALL API rather than having its own hand-rolled implementation, so it benefits from code reuse and commonality of any future bug fixes.

This was harder to achieve than might be expected because the semantics of regex code-blocks insist that they do not introduce new scopes; so each time a block is called, any new entries on the save stack are accumulated rather than being freed. This clashes badly with the normal expectation of pushing a sub on the context stack (along with some SAVEs on the save stack), then popping everything on block exit.
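
For reference, the documented MULTICALL pattern (see perlcall, "Lightweight Callbacks") looks roughly like the following sketch, modelled on List::Util's first() rather than on the new regex-engine code itself; cv, args and items are assumed to be set up by the surrounding XS code:

dMULTICALL;
I32 gimme = G_SCALAR;

PUSH_MULTICALL(cv);               /* set up the sub's context once     */
while (items--) {
    GvSV(PL_defgv) = *args++;     /* set $_ for this call              */
    MULTICALL;                    /* run the body without a fresh      */
                                  /* ENTER/LEAVE and context push      */
    if (SvTRUE(*PL_stack_sp))     /* the result is left on the stack   */
        break;
}
POP_MULTICALL;                    /* tear the context down once        */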

Over the last month I have averaged 6 hours per week.

As of 2012/04/30, since the beginning of the grant:

112.1 weeks
1158.7 total hours
10.3 average hours per week

There are now 140 hours left on the grant.

Report for period 2012/04/01 to 2012/04/30 inclusive

Summary

Effort (HH::MM):

4:30 diagnosing bugs
19:15 fixing bugs
0:00 reviewing other people's bug fixes
0:00 reviewing ticket histories
0:00 review the ticket queue (triage)
-----
23:45 Total

Numbers of tickets closed:

1 tickets closed that have been worked on
0 tickets closed related to bugs that have been fixed
0 tickets closed that were reviewed but not worked on (triage)
-----
1 Total

Short Detail

14:35 [perl #34161] METABUG - (?{...}) and (??{...}) regexp issues
1:05 [perl #109718] fork.t fails on Win32 since v5.15.4-465-g676a678
4:40 [perl #112326] Bleadperl v5.15.8-77-g5bec93b breaks GFUJI/Data-Util-0.59.tar.gz
3:25 [perl #112444] Bleadperl v5.15.9-131-g2653c1e breaks ECARROLL/nextgen-0.06.tar.gz
