September 2012 Archives

Shlomi Fish reported:

2012-Sep-08:

  • Added a test for the w [expr] command for setting a watch expression.

  • Added a test for the W [expr] command for removing a watch expression.

  • Added a test for the W * command for removing all watch expressions.

  • Added a test for the o command (with no arguments).

  • Added a test for the o anyoption? command.

  • Added a test for the o option=value command.

  • Added a test for the < and < ? commands.

  • Added a test for the < * command.

2012-Sep-09:

  • Added tests for >, > ? and > *.

2012-Sep-10:

  • Spent most of the time today handling breakage in ruby-related packages in Mageia Linux Cauldron, caused by the upgrade to ruby-1.9.x.

  • Added a test for > and < together.

2012-Sep-12:

  • Added a test for the { and { ? commands.

  • Added a test for the { * command.

  • Added a test for !.

  • Added a test for ! -num.

  • Refactored lib/perl5db.t by converting old test assertions to DebugWrap and higher-level generator functions.

2012-Sep-13:

  • Continued to refactor lib/perl5db.t by converting old test assertions to DebugWrap and higher-level generator functions.

  • Added a test for the source command.

  • Fixed a bug where source could not be properly specified inside @typeahead (with a test):

    • Commit: 48b182ca706a9492537db741fb42f0bfce52df70

2012-Sep-15:

  • Added tests for H and H -num commands.

  • Added a test for the = command.

  • Added some tests for the m command.

  • Added a test for the M command.

    • Commit: 9539c6dbccaf8c25dbf1742813399c52512763c2

2012-Sep-19:

  • Added a test for the dieLevel option.

2012-Sep-20:

  • Added a test for the warnLevel=1 option.

    • Commit: 375e0c01a0fc95a5451878cf3b7bddc3873a8fc2

2012-Sep-21:

  • Added some tests for o AutoTrace and the t command and fixed a bug that affected o AutoTrace.

    • Commit: 4600172d02b2e44d15abe0285ac70405b5fa5f73

Nicholas Clark writes:

A fair chunk of the month was taken up with investigating three related areas:

  • How code such as -e 'BEGIN {1}' is compiled, and the interaction between PL_main_cv, CvSTART, PL_main_root and PL_main_start
  • Code in op.c which calls CopLINE_set(PL_curcop, oldline) and warnings from multi-line constructions
  • Building with -DPERL_DEBUG_READONLY_OPS to force shared ops to be read-only, and determining the causes of code trying to write to read-only ops

Of these, I managed to finish the first two in August, but it took until the first week of September, so that won't feature until next month (or next week, if you read the weekly summaries on perl5-porters)

ithreads is implemented on the interpreter cloning infrastructure originally added to provide fork emulation on Win32. Part of the design for that is that under ithreads optrees are read only and shared between threads, to save the time and memory that would be needed to copy them. For building without ithreads, the old rules still hold - there is no restriction that OPs should be read only, and no restriction as to what they can point to. However, to implement the shared OPs for ithreads required locating all places where OPs have mutable fields or pointers to structures that are now per-ithread, and changing the code so that when building under ithreads they move to unshared structures, or otherwise ensure that the OP stays read only once constructed. To my memory no bugs had cropped up post v5.6.0 relating to this, so it was assumed that all was fine.

In 2007 I decided to check this assumption by adding the ability to recompile perl with the OP memory allocations coming from mmap(), and using mprotect() to turn OPs read only once they had been built. I forget what even motivated me to do this, but the approach did find a couple more obscure cases where OPs were being modified at runtime, in violation of the ithreads rules.

Father Chrysostomos recently refactored OP allocation to always use a slab allocator, as part of fixing some very long standing bugs to do with OPs leaking if compilation fails within an eval, and did some further work on it. Because I was having "fun" trying to work out how Perl_newCONSTSUB(), PL_curcop and various other things were interacting in reporting warning filenames and line numbers, I decided to compile with -DPERL_DEBUG_READONLY_OPS to see if enabling that code would shed any light on the problem. As a matter of routine, I did this by doing a full build and test (less than 5 minutes in parallel on reasonable hardware), and noticed that nearly all of the tests passed in this configuration. So I set off identifying the cause of failures, to see if it was possible to get it to zero.

It turns out that it was (at least on the x86_64 Linux system I was testing on), as there were only two underlying causes of failures. Firstly pp_i_modulo contains runtime code to detect a bug in glibc 2.2.5's _moddi3, switching in a slower workaround implementation if the C library is buggy. I think that the reasoning for doing this check at runtime, rather than compile time, is because one is (typically) linking against a shared library here, and so detecting the problem at build time is potentially useless - if the system is upgraded to the buggy version, your build time information that you were safe is now stale, and bugs appear. Meanwhile if you build when the installed version is buggy, but it's then upgraded to a fixed version, you don't get the benefit. So when built on a platform that is "at risk", the code does a check on the first call to pp_i_modulo, and then picks the "right" implementation and rewrites the op to call that directly in future. "rewrite" - that's a SEGVing offence on a read-only page. So the simple solution was to disable all the runtime probing if PERL_DEBUG_READONLY_OPS is defined, effectively treating glibc like every other platform.
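The probe-once-then-rewrite pattern described above can be sketched roughly as follows. This is a simplified illustration, not the actual pp_i_modulo code: struct toy_op, modulo_probe, modulo_fast, modulo_workaround and c_library_is_buggy are hypothetical names invented for this example, and the workaround shown is only a stand-in. The point is that the assignment to o->ppaddr is a write to the op itself, which faults if the op lives on a read-only page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_op;
typedef long (*pp_fn)(struct toy_op *o, long left, long right);

/* A toy "op": nothing but the function that implements it. */
struct toy_op {
    pp_fn ppaddr;
};

/* Hypothetical stand-in for the real runtime check of the C library. */
static bool c_library_is_buggy(void) {
    return false;
}

/* Normal implementation: trust the platform's % operator. */
static long modulo_fast(struct toy_op *o, long left, long right) {
    (void)o;
    assert(right != 0);
    return left % right;
}

/* Illustrative slower path: only ever pass a non-negative right operand. */
static long modulo_workaround(struct toy_op *o, long left, long right) {
    (void)o;
    assert(right != 0);
    return left % labs(right);
}

/* Installed initially. On the first call it probes, then rewrites the op
 * so that later calls go straight to the chosen implementation. */
static long modulo_probe(struct toy_op *o, long left, long right) {
    /* This store modifies the op: a fault if the op is read-only. */
    o->ppaddr = c_library_is_buggy() ? modulo_workaround : modulo_fast;
    return o->ppaddr(o, left, right);
}
```

A caller would start with an op whose ppaddr is modulo_probe; after the first call through o.ppaddr, the field points at one of the two real implementations.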

The only other write action on OPs was the debugger setting breakpoints. When the debugger is enabled, all NEXTSTATE ops are changed at compile time into DBSTATE ops, and if OPf_SPECIAL is set on a DBSTATE op then a callback is made into the debugger. Clearly setting or clearing OPf_SPECIAL on an OP at runtime is a write activity. Given that the debugger itself is aware of threads, and it is documented that setting a breakpoint applies to all threads, I decided that the right solution was to explicitly permit this OP writing, by tweaking the C code to set the OP read/write before altering the flag, and back to read only afterwards.

With these changes, building with -DPERL_DEBUG_READONLY_OPS (and -Dusethreads, obviously) passes all tests.
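As a rough illustration of the read-only page technique, and of the "make writable, alter the flag, make read-only again" approach taken for breakpoints, here is a minimal POSIX sketch. It is not the perl core code: toy_op, TOY_SPECIAL, toy_op_new and toy_op_set_special are hypothetical names, the op is given a whole page to itself for simplicity (the real implementation uses slabs), and MAP_ANONYMOUS is assumed to be available, as it is on Linux.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS with strict -std= flags */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define TOY_SPECIAL 0x01         /* stand-in for OPf_SPECIAL */

struct toy_op {
    unsigned char flags;
};

/* Allocate one page for a single op, build it, then make the page
 * read-only. Returns NULL on failure. */
static struct toy_op *toy_op_new(void) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    struct toy_op *o = mem;
    o->flags = 0;                /* still being built: writes are fine */

    if (mprotect(mem, page, PROT_READ) != 0) {
        munmap(mem, page);
        return NULL;
    }
    return o;                    /* from here on, a stray write faults */
}

/* Set or clear the breakpoint flag on a read-only op by briefly making
 * its page writable. Returns 0 on success, -1 on failure. */
static int toy_op_set_special(struct toy_op *o, int on) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    assert(o != NULL);
    if (mprotect(o, page, PROT_READ | PROT_WRITE) != 0)
        return -1;
    if (on)
        o->flags |= TOY_SPECIAL;
    else
        o->flags &= (unsigned char)~TOY_SPECIAL;
    return mprotect(o, page, PROT_READ);
}
```

Any write to the op outside toy_op_set_special() would be caught by the kernel as a segmentation fault, which is what makes this kind of build useful for finding code that modifies ops at runtime.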

I managed to write tests for the various blocks of code in op.c which call CopLINE_set(PL_curcop, oldline), related to generating better warnings from multi-line constructions. All that is, except this code in newCONSTSUB:

    if (IN_PERL_RUNTIME) {
        /* at runtime, it's not safe to manipulate PL_curcop: it may be
         * an op shared between threads. Use a non-shared COP for our
         * dirty work */
         SAVEVPTR(PL_curcop);
         SAVECOMPILEWARNINGS();
         PL_compiling.cop_warnings = DUP_WARNINGS(PL_curcop->cop_warnings);
         PL_curcop = &PL_compiling;
    }
    SAVECOPLINE(PL_curcop);
    CopLINE_set(PL_curcop, PL_parser ? PL_parser->copline : NOLINE);

This all feels hacky. Why does it need to be set...

So, I think that it contributes to the following bug. Sorry it's not clear, but note that some of the line numbers in the redefined warnings differ depending on whether it's in a BEGIN block:

$ ./perl -Ilib -we 'eval qq{ {\n\n\nDynaLoader::boot_DynaLoader("DynaLoader")}}; eval
 qq{{\n\n\nDynaLoader::boot_DynaLoader("DynaLoader")}}'
Subroutine DynaLoader::dl_load_file redefined at (eval 2) line 4.
Subroutine DynaLoader::dl_unload_file redefined at (eval 2) line 4.
Subroutine DynaLoader::dl_find_symbol redefined at (eval 2) line 4.
Subroutine DynaLoader::dl_undef_symbols redefined at (eval 2) line 4.
Subroutine DynaLoader::dl_install_xsub redefined at (eval 2) line 4.
Subroutine DynaLoader::dl_error redefined at (eval 2) line 4.
$ ./perl -Ilib -we 'eval qq{ {\n\n\nDynaLoader::boot_DynaLoader("DynaLoader")}}; eval 
qq{BEGIN {\n\n\nDynaLoader::boot_DynaLoader("DynaLoader")}}'
Subroutine DynaLoader::dl_load_file redefined at (eval 2) line 1.
Subroutine DynaLoader::dl_unload_file redefined at (eval 2) line 1.
Subroutine DynaLoader::dl_find_symbol redefined at (eval 2) line 1.
Subroutine DynaLoader::dl_undef_symbols redefined at (eval 2) line 1.
Subroutine DynaLoader::dl_install_xsub redefined at (eval 2) line 1.
Subroutine DynaLoader::dl_error redefined at (eval 2) line 1.

ie "line 4" vs "line 1" despite the fact that the only difference between the two overlong 1-liners is the six character string "BEGIN "

So, I'd like to take the above code out. If I remove it, the build fails:

GLOB_CSH is not a valid File::Glob macro at ../lib/File/Glob.pm line 66

The problem comes down to this bit of Perl_gv_fetchpvn_flags():

    if (!stash) {
    no_stash:
        if (len && isIDFIRST_lazy(name)) {
...
            if (global)
                stash = PL_defstash;
            else if (IN_PERL_COMPILETIME) {
                stash = PL_curstash;
...
            }
            else
                stash = CopSTASH(PL_curcop);

$expletive. The behaviour of the function Perl_gv_fetchpvn_flags() differs between "Compile Time" and "Run Time". That's really, um, less than awesome. (This also isn't the only place deep within a function unrelated to parsing or optree building that behaviour differs depending on whether IN_PERL_COMPILETIME is true or false. You can laugh, or you can cry, or maybe you should do both at the same time.)

Specifically, the way that newCONSTSUB_flags() actually controls how gv_fetchpvn_flags() gets a stash to default from is by

  • setting PL_curstash to the stash to use
  • assigning &PL_compiling to PL_curcop to make gv_fetchpvn_flags() notice this.

This is not sane.

I think that the right way to fix this is to have a way to pass the default stash into Perl_gv_fetchpvn_flags(). Or even split it into two - one half that locates the stash to use, and the other half that takes a stash, and does the initialisation. So that code can be changed to something like this:

@@ -1521,7 +1529,9 @@ Perl_gv_fetchpvn_flags(pTHX_ const char *nambeg, STRLEN fu
     if (!stash) {
     no_stash:
-       if (len && isIDFIRST_lazy(name)) {
+        if (def_stash) {
+            stash = def_stash;
+        } else if (len && isIDFIRST_lazy(name)) {
            bool global = FALSE;

            switch (len) {

However, it's not at all clear to me how to cleanly get that def_stash into there. I sent a very ugly proof of concept code to perl5-porters, with which all tests pass and my convoluted example becomes consistent. But it's a total bodge, and it's not clear to me (or anyone who has looked at this previously) what the right way to proceed is. We know where we want to be, but "If I were you sir, I wouldn't start from here".

I also investigated "microperl". "microperl", like "miniperl", is somewhat a misnomer. It's not that much smaller:

-rwxr-xr-x 1 nick nick 1091074 Aug 16 21:55 microperl
-rwxr-xr-x 1 nick nick 1223695 Aug 16 15:15 miniperl
-rwxr-xr-x 1 nick nick 1332163 Aug 16 16:03 perl

So what are the differences?

perl is (hopefully obviously) the thing that you want to install. It's linked with the platform specific dynamic library loading code which implements DynaLoader, and hence enables perl to load compiled XS code at runtime. But that dynamic library loading code is written in XS, so needs a copy of perl to build it. But the build system can't assume that there's a copy of perl on the system to run this, so how does it bootstrap?

That's the job of miniperl. miniperl is a binary linked from (pretty much) all the same object files as go to make up perl, but not DynaLoader.o. It's good enough to run xsubpp, the XS to C translator (and the rest of the build system), but none of the things it needs need perl to build them. (Because we ship the small number of generated files that need perl to be recreated, and now have a regression test to ensure that they're kept up to date.) So "perl" is pretty much "miniperl" + DynaLoader.

So where does microperl come in? It's not specifically intended to be "tiny". My understanding is that microperl was intended as an experiment as to whether it's possible to build perl without needing to run some other tool first to configure it. If you could, you might be able to replace Configure with some sort of bootstrapping approach using a microperl to build the configuration for the real perl. That sounds useful. But work on it pretty much stopped over a decade ago.

Even the "no configuration" idea doesn't really work - you need at least one canned configuration for ILP32 systems, and one for LP64 systems. (And, possibly, a third for LLP64 systems, which may just be Win64)

Because microperl doesn't probe features, and builds off a canned config.sh, that config.sh has to assume that pretty much everything optional isn't present. Meaning that if I happen to take the microperl config and graft it into a regular build, add -DNO_MATHOMS to remove all the legacy support wrappers, and bodge a couple of things that I can't configure away (yet), I find that I can get regular perl pretty close to the size of microperl:

-rwxr-xr-x 1 nicholas p5p 1290000 Aug 17 15:17 microperl
-rwxr-xr-x 1 nicholas p5p 1293153 Aug 17 15:09 miniperl
-rwxr-xr-x 1 nicholas p5p 1387582 Aug 17 15:09 perl

microperl is not much different in size from miniperl.

(I don't know why perl is 94429 bytes bigger than miniperl, as DynaLoader.o is only 9600 bytes, and the other 3 object files that differ between them are only about 10K larger in total)

Which means that all the special-casing with -DPERL_MICRO and the various special config files and Makefile doesn't actually gain anything meaningful in size reduction.

As I also can't see anyone looking to replace Configure at this stage in Perl 5's lifecycle, it's not clear to me that there's any actual case for it.

Given that we've managed to break microperl in two stable releases in the past 3 years without anyone noticing until some time afterwards, and it costs us time and effort to maintain it, I'm proposing that we announce in v5.18.0 that we're planning to eliminate it, and if no-one gives a good use case as to why to keep it, we cull it before v5.20.0 ships.

And still on the subject of removing things that are no longer used, Ricardo announced in the v5.16.0 perldelta that we plan to clean up the core codebase by removing code for various platforms for which we have no programmers to support. (Most likely because the platforms we listed are long dead.) The list of suspected "special biologist word for stable" platforms is here:

https://metacpan.org/module/RJBS/perl-5.16.0/pod/perldelta.pod#Platforms-with-no-supporting-programmers::

and the plan is to remove one per development release (until the code freeze).

So prior to the v5.17.3 release I removed code relating to UTS. UTS was a mainframe version of System V created by Amdahl, subsequently sold to UTS Global. The port has not been touched since before 5.8.0, and UTS Global is now defunct.

 MANIFEST                      |    5 -
 Porting/perlhist_calculate.pl |    2 +-
 README.uts                    |  107 ----------------------
 ext/POSIX/hints/uts.pl        |    9 --
 handy.h                       |    2 +-
 hints/uts.sh                  |   32 -------
 perl.h                        |   23 +----
 plan9/mkfile                  |    2 +-
 pod/perl.pod                  |    1 -
 pod/perl58delta.pod           |    4 +-
 pod/perldelta.pod             |    8 +-
 util.c                        |    3 -
 uts/sprintf_wrap.c            |  196 -----------------------------------------
 uts/strtol_wrap.c             |  174 ------------------------------------
 win32/Makefile                |    5 +-
 win32/makefile.mk             |    5 +-
 x2p/a2p.h                     |    8 --
 17 files changed, 17 insertions(+), 569 deletions(-)

There. That was fun. A 0.25% reduction in the line count of the distribution. What's next?

Well, after Steve Hay shipped v5.17.3, I removed support for VM/ESA. VM/ESA was a mainframe OS. IBM ended service on it in June 2003. It was superseded by z/VM.

 Cross/Makefile-cross-SH       |   3 -
 MANIFEST                      |   6 -
 Makefile.SH                   |   3 -
 Porting/perlhist_calculate.pl |   2 +-
 README.bs2000                 |   2 +-
 README.os390                  |   2 +-
 README.vmesa                  | 140 ----------
 ext/DynaLoader/dl_vmesa.xs    | 196 --------------
 ext/Errno/Errno_pm.PL         |   5 +-
 hints/vmesa.sh                | 342 ------------------------
 lib/perl5db.pl                |   3 +-
 perl.c                        |   4 -
 perl.h                        |   7 +-
 plan9/mkfile                  |   2 +-
 pod/perl.pod                  |   1 -
 pod/perl58delta.pod           |   4 +-
 pod/perldelta.pod             |   9 +-
 pod/perlebcdic.pod            |   4 -
 pod/perlfunc.pod              |   2 +-
 pod/perlport.pod              |  31 +--
 pp_sys.c                      |  12 -
 t/io/pipe.t                   |  10 +-
 t/op/magic.t                  |   2 +-
 thread.h                      |   5 +-
 util.c                        |   6 +-
 vmesa/Makefile                |  15 --
 vmesa/vmesa.c                 | 592 ------------------------------------------
 vmesa/vmesaish.h              |  10 -
 win32/Makefile                |   4 +-
 win32/makefile.mk             |   4 +-
 x2p/a2p.h                     |   2 +-
 31 files changed, 39 insertions(+), 1391 deletions(-)

And that's a further 0.6% reduction. But note how it had its tentacles in many many places. Surprisingly many. All of which are slightly cleaner now.

"Every little helps", as a certain supermarket round here likes to put it.*

Or, "your platform is at risk if someone doesn't keep up the payments on it" to paraphrase the small print on mortgages. But, joking aside, there are only finite people working on the perl core, for a surprisingly small value of "finite". And as a matter of prioritising limited resources, it's a no-brainer to concentrate on the things that matter to the vast majority of people. Code that probably no longer works gets in the way of understanding the code that works, and so acts as a drag on everything else. So it needs to go, to help make perl more maintainable in the long term.

Of course, maintainable code that demonstrably does still work is most welcome (or welcome back) in the core distribution.

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules
ID YYYYMMDD.### is a bug number in the old bug system. The RT # is given
afterwards. You can look up the old IDs at https://rt.perl.org/perlbug/

 Hours  Activity
  4.00  -e 'BEGIN {1}'
  0.25  COW
  1.50  CPAN #78624
  0.25  CPAN #78768
  0.25  Cwd.xs
  2.25  File::Find::find and chdir
  0.75  HP-UX make and symlink targets
  0.25  IO-Socket-IP
  0.25  PTR2NV
  0.25  RT #114102
  1.75  RT #114118
  1.50  RT #114174, RT #114176, RT #114194, RT #114296, RT #114532
  0.25  RT #114312
  1.25  RT #114356
  0.25  RT #114372
  1.00  RT #114410
        RT #114424
  0.75  RT #114576
  0.75  RT #114602
  0.50  Remove support for UTS Global.
  1.50  Remove support for VM/ESA
  0.50  bisect.pl (target 'none')
  5.50  bootstrapping Will Braswell
  0.25  cross compiling
  1.50  dl_aix.xs
  0.75  ext/B/t/optree_misc.t
  0.75  hashes
  0.25  investigating security tickets
  1.00  jemalloc
  3.00  microperl
 10.00  newCONSTSUB
  2.75  optimising sub entry
  0.25  perlport
  4.00  process, scalability, mentoring
 31.25  reading/responding to list mail
  6.25  readonly ops
  6.25  smartmatch
        smartmatch, junctions
  0.25  smoke-me branches
  0.50  smoke-me/dynaloader_silence_xs_warning
  0.25  smoke-me/require
  0.50  t/porting/filenames.t
 17.75  warnings from multi-line constructions

113.00 hours total

* And round quite a few places, as it's the world's third largest retailer.

Paul Johnson writes:

In accordance with the terms of my grant from TPF this is the monthly
report for my work on improving Devel::Cover covering August 2012.

This month I released Devel::Cover 0.93.

The bulk of this report is taken from my weekly reports, so if you have
read them there is little new here.

One of the nice things about having this grant and being able to spend
more time working on Devel::Cover than I would otherwise have been able
to is that I'm getting feedback from my reports and checkins, and that I
have more time to be able to follow up on it.

Over the years I've received a lot of mail regarding Devel::Cover and
I've not always been able to properly reply to it all. In some cases
I'm sure I've not replied at all. If anyone has sent me something that
needs a reply, please feel free to remind me either by mail or,
preferably, on github if appropriate.

So I've started looking back at some of my outstanding mail (starting
with those where I have been prompted). One of the first I looked at
was about a problem I have known about for many years.

When you want to know your coverage there are two main phases: the first
in which you exercise your code and the coverage data is collected, and
the second in which that data is displayed, hopefully in a format which
makes it easy to understand. In each phase you can limit the amount of
coverage information both by limiting the files for which you collect
data and by limiting the criteria.

In general it is best to collect as little data as possible in the first
phase. This will reduce the size of the coverage database but, more
importantly, it will make the running of the tests faster. But there
are times when, having collected the data, you only want to display a
subset of it. Thus you can also filter the data when you generate your
report.

When the report is generated, by default, a summary of the coverage is
printed. This summary has always been a summary of all the data in the
database, rather than a summary of the data in the report. I remember
looking into this many years ago and deciding that it was going to be
complicated to fix and this is probably why I've never seriously looked
at the problem up to now. But now I have, and the fix wasn't as
complicated as I had imagined that it would be. There's probably a
lesson of some sort there.

I also took a look at RT 49916 which relates to file and directory
permissions with respect to code that changes EUID. I think this is
fixed but I'm waiting to hear back before closing the ticket.

I had a look into a cpantesters report on bleadperl which pointed back
to perl #113684, which is a report I've written about before. Father
Chrysostomos has made some commits to address the problem and this has
led to a minor change in B::Deparse which affects Devel::Cover. This
is somewhat dictated by the limitations of B::Deparse. The full details
are in the RT thread, but the long and short of it all is that these
changes are here to stay and so I will adjust Devel::Cover accordingly
as soon as 5.17.3 is released.

I've also been running cpancover jobs every so often. This takes quite
a time to finish but I like to think of CPAN and cpancover as my
extended test suite. When I see something that doesn't look right I can
then go in and investigate in a bit more detail. I started running
cpancover with perl 5.16.1-RC1 to see how it pans out. See the results at
http://cpancover.com

At the QA hackathon this year I put together a Vim plugin to display
coverage information within Vim. I really like this plugin because I
can see my coverage information right where I am editing my code, but I
noticed that this plugin wasn't honouring the coverage criteria it was
given, so I fixed that up too.

Finally, I started looking into RT 69240. Initially, this looked like
it should be a reasonably simple problem, but when I eventually
unearthed a CPAN module where I could reproduce the problem it slowly
became clear that I was on the trail of a longstanding problem that I
had known about for many years but had never been able to reliably
reproduce, and that I had never had a sufficient chunk of time to be
able to track down.

Now, thanks to this grant, I do. This is exactly the sort of problem I
had hoped to be able to solve with this grant. I have started off by
trying to catch up on the backlog of bugs that have been reported, but
then problems such as this are the priority before, with luck, moving on
to new features.

As is often the case, all the work was in finding the problem. Once
found, the solution was all but trivial.

The problem was in the heart of probably the most complicated part of
Devel::Cover. That is the code which tries to manage the dynamic nature
of Perl.

Code coverage in a static language such as C is relatively
straightforward. Source code is compiled to object code and that object
code doesn't change. (Self-modifying code is an illusion. It doesn't
exist.) Everything you need to know about the code is known before a
line of it has been executed.

In Perl, and similar languages, the separation of compile time and run
time is not so clear cut. Code being executed can get perl to compile
new code into already existing modules. This poses challenges to a tool
such as Devel::Cover which tries to collect information about the code
being executed.

If you consider just subroutine coverage, in "normal", static code, I
can note the subroutines in a file and produce a mapping from the
subroutines to the position in the source code where they are defined,
producing an ordering on the subroutines. When a subroutine is executed
I can note that the nth subroutine in the file is covered.

In dynamic code new subroutines can spring into existence whilst the
code is being executed. This happens via a string eval in some guise.
When this happens I can tag the new sub onto the end of my list of subs,
and this works well.

But the problem becomes more difficult when in two different runs,
different subroutines are created. A naïve solution here can lead to
subroutine $n meaning different subroutines in different runs. So we
need to be clever and recognise when one of these new subroutines
matches an identical subroutine from a previous run, and when it has
never been seen before.

If all this functions correctly we should never get to a situation where
we have information that subroutine $n has been executed, but we only
know about n-1 subroutines. If that situation does occur, we get the
"ignoring extra subroutine" message and coverage will be lost.
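
The bookkeeping described above can be illustrated with a small sketch. It is
written in C purely for illustration and is not how Devel::Cover stores its
data: sub_table, sub_index and the idea of identifying a subroutine by a
string key (for example a digest of its source) are assumptions made for this
example. A subroutine seen in an earlier run gets its old index back, and a
subroutine never seen before is appended, so index n means the same
subroutine in every run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBS 64

/* Known subroutines, in the order in which they were first seen. */
struct sub_table {
    const char *keys[MAX_SUBS];   /* hypothetical identifying key per sub */
    size_t count;
};

/* Return the stable index for key, appending it if it has never been
 * seen before. Returns -1 if the table is full. */
static int sub_index(struct sub_table *t, const char *key) {
    assert(t != NULL && key != NULL);

    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->keys[i], key) == 0)
            return (int)i;        /* same sub as in a previous run */
    }
    if (t->count == MAX_SUBS)
        return -1;
    t->keys[t->count] = key;      /* new sub: tag it onto the end */
    return (int)t->count++;
}
```

If one run creates a sub keyed "eval_x" and a later run creates "eval_y"
before "eval_x", the second run still reports "eval_x" under its original
index rather than under whatever position it happened to be created in.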

The bug existed in the code which managed how these lists of criteria
were maintained between runs. There may still be bugs in this area, but
it was great to be able to knock this one over. The actual commit was
https://github.com/pjcj/Devel--Cover/commit/997426eecb16899d0be425853478e2b5bdf9a1ee
if you want to see the simple fix to the hard-to-find problem.

I should probably note that there are other solutions to this problem.
Early versions of Devel::Cover stored the location of each construct
together with the information about its coverage. This does work well
but is very expensive on storage requirements and the CPU required to
manage this extra data. Coverage always has an overhead and the greater
the overhead the less people will be inclined to use it, so I try hard
to keep the overhead to a minimum.

Whilst tracking down and fixing this problem I also fixed more than ten
other bits and bobs that I noticed, as well as other peripheral matters.
The full details are in the commits. A couple of those bits and bobs
are probably quite important, actually.

Oh, and then I did fix up the remainder of the problem in the original
bug report.

I also got a message from Nuno Carvalho who is packaging Devel::Cover
for Debian. It seems that PodVersion isn't happy when the module
description is longer than one line. Since that's not a very good idea
anyway I fixed up the affected modules so with luck the Debian packaging
can go ahead.

Finally, I fixed up Devel::Cover to work around mod_perl2 setting $^X
to httpd. The mod_perl folk are also going to fix that, hopefully for
2.0.8.

As usual, I made other various fixes and updates.

The work I have completed in the time covered by this report is:

Closed RT tickets:

  • 68517 summary, report total from cover tool includes ignored files
  • 77818 tests fail due to spaces in @INC (Devel::Cover::Inc issue)

Fixed cpantesters reports:

You can see the commits at https://github.com/pjcj/Devel--Cover/commits/master

  • Hours worked: 42:55
  • Total hours worked on grant: 131:45

Dave Mitchell writes:

As per my grant conditions, here is a report for the July/August period.

I spent a bit of time fixing a few issues caused by my rewriting of the /(?{})/ implementation, then started to look into the last unclosed ticket still attached to the re_eval meta-ticket. This concerns code within (?{}) that modifies the string being matched against, and generally causes assertion failures or coredumps:

my $text = "a"; $text =~ m/(.(?{ $text .= "x" }))*/;

While trying to understand what's going on, I ended up delving into the issue of how and when perl makes a copy of the string buffer in order to make $1, $& etc continue to show the right value even if the string is subsequently changed. It turns out that in some circumstances this can have a huge performance penalty. For example the following code takes several minutes to run, since it mallocs and copies a 1MB buffer a million times:

$&;
$_ = 'x' x 1_000_000;
1 while /(.)/g;

If you remove the $&, it runs fast (<1s), but this is only because pp_match has a special hack added that says "even if the pattern contains captures, in the presence of /g don't bother copying the string buffer". So the following prints zzz rather than aaa. And if the string buffer gets realloced in the meantime, it could print out garbage:

$_ = 'aaa';
/(\w+)/g;
$_ = 'zzz';
print "[$1]\n";

Attempts to fix this in the past have tried to implement some sort of Copy-On-Write behaviour, but have come up against the difficulty of making an SV always honour COW in all circumstances and/or not making the SV itself "unusual". Also, the regex engine API itself matches against a string buffer not an SV, so you aren't guaranteed to always have a valid SV to mess with.

My approach to this has been to only copy the substring of the string buffer needed to cover $1,$&, etc. The mechanism (PL_sawampersand) used to detect whether $`,$&,$' have been seen in code has been updated to log each of the three variables separately. The code then uses the index range of any captures, plus which of $`,$&,$' are present, plus the presence or not of /p, to decide what part of the string to copy. In the case of

$_ = 'x' x 1_000_000;
1 while /(.)/g;

(with or without $&), the range copied is a single byte rather than a megabyte that gets copied a million times, and the code now runs in subsecond time. This means that the hack can be removed, and printing $1 no longer risks a segfault.

It also means that having just $& in your source code may no longer necessarily be the huge performance hog it used to be, although having $` and $' too will drag things down to previous levels.
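The range calculation described above might look something like the following sketch. It is an illustration under stated assumptions rather than the code in the regex engine: struct span, range_to_copy and the SAW_* flags are hypothetical names, offsets are byte offsets into the subject string, and /p is assumed to be handled by the caller setting all three flags.

```c
#include <assert.h>
#include <stddef.h>

#define SAW_PREMATCH  0x1   /* $` is used somewhere in the program */
#define SAW_MATCH     0x2   /* $& is used somewhere in the program */
#define SAW_POSTMATCH 0x4   /* $' is used somewhere in the program */

struct span { size_t start, end; };   /* half-open: [start, end) */

/* Grow r so that it also covers s. */
static void extend(struct span *r, struct span s) {
    if (s.start < r->start) r->start = s.start;
    if (s.end   > r->end)   r->end   = s.end;
}

/* Work out the smallest part of a subject string of length len that has
 * to be copied so that the capture groups, plus whichever of $`, $& and
 * $' are in use, remain valid after the original string changes. */
static struct span range_to_copy(size_t len, struct span match,
                                 const struct span *caps, size_t ncaps,
                                 unsigned seen)
{
    struct span r = { len, 0 };       /* "empty" until something extends it */

    assert(match.start <= match.end && match.end <= len);

    for (size_t i = 0; i < ncaps; i++)
        extend(&r, caps[i]);          /* $1, $2, ... */
    if (seen & SAW_MATCH)
        extend(&r, match);            /* $& */
    if (seen & SAW_PREMATCH)
        extend(&r, (struct span){ 0, match.start });     /* $` */
    if (seen & SAW_POSTMATCH)
        extend(&r, (struct span){ match.end, len });     /* $' */

    if (r.start > r.end)              /* nothing needs to be kept */
        r.start = r.end = 0;
    return r;
}
```

For a one-character capture in the middle of a million-character string this yields a one-byte range when only $& is in use, and the whole string once $` and $' are in use as well, which matches the behaviour described above.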

In summary:

$_ = 'x' x 1_000_000; 1 while /(.)/g;

before: fast and segfaulty
now: fast and non-segfaulty

$&;
$_ = 'x' x 1_000_000; 1 while /(.)/g;

before: slow and non-segfaulty
now: fast and non-segfaulty

This is all working and tested, but hasn't been pushed out for smoking/merging, since I haven't yet fixed the original bug, i.e. the

my $text = "a"; $text =~ m/(.(?{ $text .= "x" }))*/;

Over the last two months I have averaged 6 hours per week :-(.

As of 2012/08/31: since the beginning of the grant:

129.7 weeks
1353.2 total hours
10.4 average hours per week

There are 343 hours left on the grant.

Report for period 2012/07/01 to 2012/08/31 inclusive

Summary

Effort (HH::MM):

6:25 diagnosing bugs
46:45 fixing bugs
0:00 reviewing other people's bug fixes
0:00 reviewing ticket histories
0:00 review the ticket queue (triage)
-----
53:10 Total

Numbers of tickets closed:

3 tickets closed that have been worked on
0 tickets closed related to bugs that have been fixed
0 tickets closed that were reviewed but not worked on (triage)
-----
3 Total

Short Detail

45:00 [perl #3634] Capture corruption through self-modying regexp (?{...})
3:00 [perl #114242] TryCatch toke error with (??{$any}) $ws \\] )? @
1:00 [perl #114302] Bleadperl v5.17.0-408-g3c13cae breaks DGL/re-engine-RE2-0.10.tar.gz
2:10 [perl #114356] REGEXPs have massive reference counts
2:00 [perl #114378] cond_signal does not wake up a thread

Enrique Nell and Joaquin Ferrero reported:

Project status: https://docs.google.com/spreadsheet/ccc?key=0AkmrG_9Q4x15dC1MNWloU0lyUjhGa2NrdTVTOG5WZVE

CPAN distribution: http://search.cpan.org/~enell/POD2-ES-5.16.1.02/

Project host: https://github.com/zipf/perldoc-es

This month we updated POD2::ES from v5.16.0 to v5.16.1. It was a swift operation, since only one document changed (perlhist.pod). Translated files not included in the distribution were also updated to v5.16.1.

As mentioned in a previous report, v5.16.0 fixed the issues that prevented perldoc from correctly displaying extended characters in UTF-8 encoded files in the console, so we have switched back to UTF-8. To do so, we configured our translation tool (OmegaT) to generate UTF-8 encoded output, and modified the post-processing script to check whether the =encoding utf8 command is present in the pod documents and to add it if it is missing.

Code changes were implemented in ES.pm to fix some issues related to POD2::Base:

  • search_perlfunc_re(). Since perldoc does not decode the text string returned by this method, it couldn't filter the perlfunc.pod introduction. As a result, it couldn't find the documentation section requested by the user when using perldoc with the -f switch. To fix it, the offending characters (those with diacritic marks) were removed from the string returned by this method.

  • print_pod(). To align the actual behavior of POD2::Base with the functionality described in this module's documentation, we had to cover the case where this method is called as a class method and the case where it is called as an object method.

  • print_pods(). As for print_pod(), we covered the two possible call types for this method (class method and object method). print_pods() is used only to call POD2::Base's print_pods() method.

New files added this month:

  • perlmod
  • perlmodinstall
  • perlhacktut
  • perlclib

Reported source pod bugs:

2012/07/25 : [rt.cpan.org #78577] [RT #114260] perlfaq2.pod internal link error
2012/08/14 : [perl #114486] perlvar.pod, line 1337, bad filehandle

Stats

The word count increased because we discovered 5 additional pod files that are generated automatically during setup (perlapi, perlintern, perlmodlib, perltoc, and perluniprops). We only added four of them, since perltoc.pod is generated automagically from the source pods that are being translated. Status of our v5.16 track (currently v5.16.1):

  • Total documents: 167 (100 in core docs)
  • Total words: 945,786 (495,813 in core docs)
  • % translated: 31.21% (46.46% of core docs)
  • % reviewed: 11.60%

Tools

We added more functionality to the post-processing script (see below), new utilities, and renamed some scripts for consistency.

New scripts

  • get_pods.pl
    Gets all the pod files from the Perl distribution and adds them to the OmegaT project source folder

  • test_pod2es_setup.pl
    Checks the POD2::ES setup

  • compare_pods.sh
    Shows source and target pods side-by-side to easily spot formatting differences

Changes in postprocess.pl

Since we changed the output encoding to UTF-8, now the script checks if the =encoding utf8 command (or an equivalent command for a different encoding) is present. It adds the command if it's not present, or updates it if the specified encoding is different from UTF-8.
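As a rough illustration of that check (the real postprocess.pl is a Perl script and is not reproduced here), a minimal Python sketch might look like this. The function name `ensure_utf8_encoding` is hypothetical, and placing a missing command at the very top of the file is an assumption.

```python
import re

# Matches a POD "=encoding NAME" command on a line of its own.
ENCODING_RE = re.compile(r"^=encoding[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def ensure_utf8_encoding(pod_text):
    """Return pod_text with an '=encoding utf8' command guaranteed."""
    found = ENCODING_RE.search(pod_text)
    if found is None:
        # No declaration at all: add one at the top (assumed placement).
        return "=encoding utf8\n\n" + pod_text
    if found.group(1).lower() in ("utf8", "utf-8"):
        return pod_text                      # already declares UTF-8
    # A different encoding is declared: rewrite that line in place.
    return pod_text[:found.start()] + "=encoding utf8" + pod_text[found.end():]


print(ensure_utf8_encoding("=encoding latin1\n\n=head1 NAME\n"))
```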

We also added a switch to generate HTML diff files that show word-oriented differences.

These reports provide a clear view of the changes made by the reviewer, and can be useful to learn the style and terminology used in the project. Here is an example (not exactly the same view, since we had to translate the HTML to Google Docs format):

https://docs.google.com/document/d/1wIzsIk9PS1OPz7ixbdG8KhVLvnRS9fnn4K-mpLDIvi4/edit

Hopefully, this will help to improve global consistency.

On the other hand, these changes can be collected to generate a list of frequent errors/changes in order to do an automated first pass before delivering the files to the reviewers.
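A word-oriented diff of this kind, together with a tally of the changes it finds, could be sketched as follows. This is only an illustration using Python's standard difflib, not the project's actual script; the function name `word_diff_html`, the `<del>`/`<ins>` markup and the sample sentence are all made up for the example.

```python
import difflib
import html
from collections import Counter


def word_diff_html(before, after):
    """Mark word-level changes with <del>/<ins> and count each change."""
    old, new = before.split(), after.split()
    out = []
    changes = Counter()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = " ".join(old[i1:i2])
        added = " ".join(new[j1:j2])
        if tag == "equal":
            out.append(html.escape(removed))
            continue
        if removed:
            out.append("<del>" + html.escape(removed) + "</del>")
        if added:
            out.append("<ins>" + html.escape(added) + "</ins>")
        changes[(removed, added)] += 1     # feeds a "frequent changes" list
    return " ".join(out), changes


markup, changes = word_diff_html("el fichero de entrada", "el archivo de entrada")
print(markup)
print(changes.most_common(5))
```

Summing the counters over many reviewed files would give the kind of frequency list that an automated first pass could be built on.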

Other actions

  • Apertium Offline is now available in our server OmegaT setup. This provides an alternative machine translation engine (the other one is Google Translate) that can be used to get a first draft.

  • During this update we filed the issues found in the distribution files after the post-processing stage in a spreadsheet added to our project status document. We will use this as a checklist for subsequent updates. All these issues fall into two categories:

    • Double-spaces (e.g., after question mark, between full stop and opening parenthesis, etc.). In some cases the problem stems from a segmentation error; our customized segmentation settings cover most of the cases quite well, but not all of them. We should be able to fix most of these issues by adding a few regular expressions to postprocess.pl.

    • Broken links: Links with long names are split across two or three lines. This issue has to do with Pod::Tidy. We must check if there is any way to prevent it.
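For the double-space category, the kind of regular-expression fix mentioned above might look like the following Python sketch. It is not the code in postprocess.pl; the function name is hypothetical, and it assumes that lines starting with whitespace belong to verbatim POD paragraphs and must be left alone.

```python
import re


def collapse_double_spaces(pod_text):
    """Collapse runs of spaces inside ordinary POD paragraphs."""
    fixed = []
    for line in pod_text.split("\n"):
        if line[:1] in (" ", "\t"):
            fixed.append(line)                        # verbatim: keep as-is
        else:
            fixed.append(re.sub(r" {2,}", " ", line))
    return "\n".join(fixed)


print(collapse_double_spaces("Is this it?  Yes.  (note)\n    my  $x;\n"))
```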

Future work

  • perlcheat didn't change in v5.16.1 (i.e., still contains the bug we reported), but we will include an amended ES version in an upcoming release, later this month.

  • We are working on a terminology extraction tool. It will be ready in the next few weeks. A new module will be added to CPAN.

  • Add to the tools section the code that generates our project status spreadsheet and a Readme containing tool usage guidelines.

  • Check how to get an ES perltoc.pod generated automatically.

Shlomi Fish reported:

2012-Sep-01:

2012-Sep-02:

2012-Sep-04:

  • Plans for today:

    • Write more tests for the L command.

      • Done.
    • Write more tests in general.

      • Wrote some for S. There are more commands in perldoc perldebug following it that already have test coverage.

2012-Sep-05:

2012-Sep-06:

  • Added a test for the a [line] command and noticed a failure (which had been reported to perl5-porters).

    • Corrected lib/perl5db.pl, and the test now passes.
  • Added a test for the A line command.

  • Added a test for the A * command.

  • Added two Bash functions to my _Theme perl/core to facilitate testing the entire perl core distribution, and running only lib/perl5db.t.

Jess Robinson writes:

(Report up to Aug 26th anyway, I'll add anything I do in the last few days of August to the next one)

After a bit of a slow start (and thus the reason to merge July/August reports), I've gotten started on the actual work, and managed to tick off one of my tasks.

I eliminated my earlier use of the tool "agcc" from the code base. This was an external tool (with a non-compatible licence) designed to make using the cross-compiled gcc similar to a normal gcc. In the end it turned out to be more of a hindrance. Getting rid of it and using Android's suggestion for "--sysroot" as a gcc option means upgrading the NDK to the latest one is now simple. The SDK is now not required at all.

The last 2 weeks (Weeks #6/#7) I was at the Perl Reunification Summit, and then YAPC::EU, which were inspirational. I got started on the second task I have identified. To enable me to merge (or better, re-port) my original changes to blead, I have started with a fresh copy of blead and will add a piece at a time.

Expanding Configure is my current target. It already has a small piece of code to support cross-compiling, using -Dusecrosscompile; however, this mostly skips all the parts it can't do, like compiling and then running various pieces of code for testing sizes of integers and similar. In order to enable us to have as small a piece of "canned config.sh" as possible, I am adding support for these things.

Currently I have it using `nm` (or rather the copy in the cross-compiler bin-utils) to find the size of the variables declared in the 'try.c' binary. When cross-compiling, this now looks like:

  #include <stdio.h>
  #$i_stdlib I_STDLIB
  #ifdef I_STDLIB
  #include <stdlib.h>
  #endif

     int PERL_INT_SIZE;
     long PERL_LONG_SIZE;
     short PERL_SHORT_SIZE;

  int main()
  {
     exit(0);
  }
  EOCP

Running nm gets us:

  $ /usr/src/android/android-ndk-r8/toolchains/arm-linux-androideabi-4.4.3/prebuilt/linux-x86/bin/arm-linux-androideabi-nm -p -t d -S  UU/try
  00037740 d _DYNAMIC
  00037940 d _GLOBAL_OFFSET_TABLE_
  00033612 A __exidx_end
  00037996 A _bss_end__
  00037724 T __FINI_ARRAY__
  00037960 A __bss_start__
  00037968 00000004 B __dso_handle
  00033612 A __exidx_start
  00037984 00000004 B PERL_LONG_SIZE
  00037988 00000004 B PERL_INT_SIZE
           U __libc_init
  00037992 00000002 B PERL_SHORT_SIZE
  00037732 D __CTOR_LIST__
  00037996 A __bss_end__
  00033552 T _start
  00037716 T __INIT_ARRAY__
  00037960 A __bss_start
  00033600 00000012 T main
  00037996 A __end__
  00037708 D __PREINIT_ARRAY__
  00037960 A _edata
  00037996 A _end
           U exit
  00037960 D __data_start

And we extract the values for the PERL_* variables. The same process is repeated for long longs, long doubles, pointers etc. Assuming cross-compile environments all use gcc and contain the bin-utils, this ought to be fairly portable.
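The extraction step can be illustrated with a short Python sketch (Configure itself is a shell script, so this is not the actual implementation). The function name `symbol_sizes` is hypothetical, and the sketch assumes the four-column `nm -p -t d -S` layout shown above, with sizes printed in decimal.

```python
def symbol_sizes(nm_output, prefix="PERL_"):
    """Map symbol name -> size in bytes for symbols starting with prefix.

    Expects lines of the form: value, size, type letter, name.
    Lines without a size column (e.g. undefined symbols) are skipped.
    """
    sizes = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[3].startswith(prefix):
            _value, size, _type, name = fields
            sizes[name] = int(size, 10)    # assumed decimal because of -t d
    return sizes


sample = """\
00037968 00000004 B __dso_handle
00037984 00000004 B PERL_LONG_SIZE
00037988 00000004 B PERL_INT_SIZE
         U __libc_init
00037992 00000002 B PERL_SHORT_SIZE
00033600 00000012 T main
"""
print(symbol_sizes(sample))
# {'PERL_LONG_SIZE': 4, 'PERL_INT_SIZE': 4, 'PERL_SHORT_SIZE': 2}
```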

To ensure the correct 'nm' (and also later 'ar') is used, I've added support to Configure to allow the binaries to be named differently. The cross-compiler 'nm' for the NDK is, for example, 'arm-linux-androideabi-nm'. If "-Dusecrosscompile=arm-linux-androideabi-" is passed as a Configure argument, binaries will be discovered with that string as a prefix.
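The prefix-based discovery amounts to something like the Python sketch below. It only illustrates the idea and is not how Configure does it; the function name `find_cross_tools` and its parameters are hypothetical.

```python
import shutil


def find_cross_tools(prefix, tools=("nm", "ar"), path=None):
    """Look up each tool under its prefixed name, e.g. 'nm' as
    'arm-linux-androideabi-nm'. Tools that are not found map to None."""
    return {tool: shutil.which(prefix + tool, path=path) for tool in tools}


print(find_cross_tools("arm-linux-androideabi-"))
```

The result depends on what is installed: with the NDK toolchain's bin directory on PATH each entry would be a full path, otherwise None.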

I've now started to add support for -Dsysroot. The name is taken from the gcc option mentioned earlier, and is similar to chroot: it specifies a logical root directory under which headers and libraries can be found. Various parts of the Configure script try to discover which libraries are available, and the location of the libc.so, using absolute paths. When cross-compiling, these are being found on the host system, instead of inside the toolchain for the cross-compiler. Adding sysroot support means the compiler's headers and libraries can now be located correctly. This is applicable to any system with its compiler located in its own chroot, or multiple toolchains / libraries installed.

[Hours] [Activity]

5.75     Admin - Tasks, Emails, Planning
6.50     Remove agcc from x-cross-compile branch
2.00     Update to latest Android NDK and verify working with x-cross-compile
10.50    Update Configure with better cross-compiling support
=====

24.75 hours total

Nicholas Clark writes:

More fun with compilers this month, with make utilities joining in. H. Merijn Brand and I identified this problem with HP's compiler a while back, but I'd not yet had time to fix it:

$ cc -DPERL_CORE -c  -Ae -D_HPUX_SOURCE -Wl,+vnocompatwarnings +DD64 -
DDEBUGGING -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64   +O
2 +Onolimit -g pp_sys.c
cc: line 2976: panic 5172: Backend Assert ** Unimplemented CVT. (5172)
$ echo $?
1

Of course, HP's make just has to be helpfully "special" and despite noticing the failure exit code does not remove build products from the failed step:

$ make pp_sys.o
        `sh  cflags "optimize='+O2 +Onolimit -g'" pp_sys.o`  pp_sys.c
          CCCMD =  ccache cc -DPERL_CORE -c  -Ae -D_HPUX_SOURCE -Wl,+vnocompatwarnings 
+DD64 -DDEBUGGING -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64   +O2 +Onolimit -g
cc: warning 404: Pre-processor not invoked, options ignored.
cc: line 2976: panic 5172: Backend Assert ** Unimplemented CVT. (5172)
*** Error exit code 1
Stop.
$ ls -l pp_sys.o
-rw-r--r--   1 nick       perl         53592 Jul 30 12:04 pp_sys.o
$ file pp_sys.o
pp_sys.o:       awk program text
$ less pp_sys.o
"pp_sys.o" may be a binary file.  See it anyway?

which of course means that if you re-run make it then assumes that pp_sys.o is up to date, carries on and then fails at the link. "Special", as I said.

The compiler is failing to deal with this (valid) macro after the refactorings of commits d2c4d2d1e22d3125 and 8d7906e182f93e18:

#define FT_RETURN_TRUE(X)                \
    RETURNX((void)(                       \
        PL_op->op_flags & OPf_REF          \
            ? (bool)XPUSHs(                 \
                PL_op->op_private & OPpFT_STACKING ? (SV *)cGVOP_gv : (X) \
              )                                                           \
            : (PL_op->op_private & OPpFT_STACKING || SETs(X))             \
    ))

part of a sequence where Father Chrysostomos considerably simplified the filetest ops so that they are consistent and clear in when they manipulate the perl stack. Fortunately his simplification also made it possible for me to see a way to refactor things a bit further to reduce the complexity of the code generally, and the macros in particular. The diff is slightly deceptive

$ git diff -R --stat 4c21785fe645f05e
 pp_sys.c | 80 ++++++++++++++++++++++++++++++++--------------------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

because the forty lines of additions include 6 lines of documenting comments and 3 lines of whitespace.

So, on the subject of HP-UX make, what would one expect this Makefile to do?

$ cat Makefile
bar: foo
        cp $? $@
baz: foo
        ln -s $? $@

Here's AIX, not known for being the most flexible platform:

$ touch foo
$ make bar
        cp foo bar
$ make baz
        ln -s foo baz
and repeat:

$ make bar
Target "bar" is up to date.
$ make baz
Target "baz" is up to date.

Every other make I tried behaves the same. Except HP-UX:

$ touch foo
$ make bar
        cp foo bar
$ make baz
        ln -s foo baz
$ make bar
`bar' is up to date.
$ make baz
        ln -s foo baz
ln: baz exists
*** Error exit code 1

Stop.

Special. So the symlink is never up to date (and one has to work round this by always deleting it). I kiss you!

(And don't let's talk about HP-UX make's "support" for parallel makes.)

Better news - while working on the filetest operators, I noticed this piece of code just below them:

#if defined(atarist) /* this will work with atariST. Configure will
                        make guesses for other systems. */
# define FILE_base(f) ((f)->_base)
# define FILE_ptr(f) ((f)->_ptr)
# define FILE_cnt(f) ((f)->_cnt)
# define FILE_bufsiz(f) ((f)->_cnt + ((f)->_ptr - (f)->_base))
#endif

Atari STs - I remember them. 16 bit machines from 25 years ago. So does perl 5 really build on the Atari ST? Seems unlikely - so is that another platform we need to add to the list of "soon to be culled":

https://metacpan.org/module/perldelta#Platforms-with-no-supporting-programmers::

So with the help of git blame, I went digging...

http://perl5.git.perl.org/perl.git/blame/v5.17.2:/pp_sys.c#l3306

The code was refactored with "patch.1i for perl5.001", but has been in pp_sys.c since it first appeared with perl-5.000 alpha 2. So where was it in 4.036? doio.c:

http://perl5.git.perl.org/perl.git/blame/perl-4.0.36:doio.c#l1192

and it was added as part of patch #20, a rather big patch from June 1992. One line in that patch says:

Subject: added Atari ST portability

What was also added in that patch was an atarist/ directory, containing various files relevant to the port. We don't have an atarist/ directory now, so where did that go? Turns out it was removed on the release of perl-5.000. That rang a bell - there was also an msdos/ directory in perl 4 (added by perl 3 patch #16), but that was removed with perl-5.000.

So the atarist port is like the old perl 4 MSDOS port - the port specific files were removed with perl 5, but the tentacles left in the main parts of the code were not excised.

This I have now done.

And in the process it turned up a bunch of code for I286 support dating from perl 3 times, mostly specifically coping with 16 bit memory models, and using %ld instead of %d (because sizeof(int) is 2). So that's gone too now, and the world is a bit tidier.

This month I fixed some problems with the git bisect wrapper which were preventing me from bisecting to find the cause of other problems. :-/

The bisect wrapper is designed to sensibly default as much as possible, and its approach for defaulting the revision for the start of the bisect run is to try stable (.0) releases old to new until it finds one which can run the test code correctly. It had been using a hardcoded list, which still had 5.14.0 as the most recent stable release. It now uses `git tag -l` to get the list of stable releases. The default for the end of the bisect used to be 'blead'. Now, if there is no 'blead' branch, bisect.pl uses a suitable alternative: if HEAD is more recent than the last stable release, it uses HEAD, else it uses the last stable tag. Also, when it wanted to check out a known good recent version of a file (such as makedepend.SH) it would check out the revision from blead. It now uses the most recent tagged stable release for this.
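The defaulting rules can be sketched in Python as below. This is not the code in bisect.pl (which is Perl); the function names are hypothetical, and the sketch assumes that tags look like 'v5.14.0' and that even minor version numbers mark stable releases.

```python
import re

STABLE_TAG = re.compile(r"^v5\.(\d+)\.0$")


def stable_releases(tags):
    """Return the stable .0 release tags, oldest first."""
    found = []
    for tag in tags:
        m = STABLE_TAG.match(tag)
        if m and int(m.group(1)) % 2 == 0:   # assumed: even minor == stable
            found.append((int(m.group(1)), tag))
    return [tag for _, tag in sorted(found)]


def default_end(branches, head_is_newer_than_last_stable, stable):
    """Pick the default end revision for the bisect run."""
    if "blead" in branches:
        return "blead"
    return "HEAD" if head_is_newer_than_last_stable else stable[-1]


tags = ["v5.17.2", "v5.16.0", "v5.8.0", "v5.14.0", "v5.16.1", "v5.15.0"]
stable = stable_releases(tags)
print(stable)                                   # ['v5.8.0', 'v5.14.0', 'v5.16.0']
print(default_end(["maint-5.16"], False, stable))   # v5.16.0
```

The start of the run would then be found by trying the entries of `stable` in order until one runs the test case correctly.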

We've had quite a bit of fun with Debian and Ubuntu's switch to a multiarch setup. This results in important libraries (such as libm.so) moving from the well known /usr/lib to an architecture specific directory. Without knowing where they are, perl won't build. As of 5.14.0 (and 5.12.4), the hints for Configure have been updated to get the correct library paths from gcc, and I thought that I'd correctly put the analogous changes into the bisect wrapper. However, it turned out that it was only correct on x86_64. On other Linux architectures it failed to pass the multiarch locations to Configure. That is now fixed.

I also added a --timeout feature to permit the bisect runner to time out (and kill) the user's test case if it takes longer than the specified time to run. With this I was able to bisect a problem that I'd noticed had appeared recently with debugging builds seeming to hang if PERL_DESTRUCT_LEVEL=2 is set in the environment (RT #114356).
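The general technique of running a test case under a time limit looks roughly like this in Python. It is a sketch of the idea only, not the bisect runner's Perl implementation; the function name `run_with_timeout` is hypothetical, and how the runner classifies a timed-out revision is not covered here.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its exit status, or None if it had to be killed."""
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising the timeout.
        return None


quick = [sys.executable, "-c", "raise SystemExit(3)"]
hang = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_timeout(quick, 30))   # exit status of the child
print(run_with_timeout(hang, 0.5))   # None: the child was killed
```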

I spent a while digging into the pre-history of the various scalar flags, trying to make sense of how we got to where we are, and why Chip's patch to magic flags makes sense. The full conclusions are here:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2012-07/msg00826.html

but the question comes down to an inconsistency - there are both "public" and "private" flag bits for integers (I), floating point values (N) and strings (P), but there is only one flag for references (R). This seems wrong - why is this?

It turns out that public and private flags were added by 5.000 alpha 4, as part of implementing magic on scalars. Prior to that version, tainting was implemented by building a separate taintperl binary. Magic enabled tainting to be implemented at runtime (with the -T command line option) in the same binary as the regular perl, without a significant speed hit. Magic also permitted the implementation of tie and untie. However at that time there was no SVf_ROK or SvROK(). References could only be in SVs of type SVt_REF, and the code in sv_setsv() downgrades the destination SV to type SVt_REF if needed. Note that one can't get a reference from the environment, so a reference can never be tainted.

Once the alpha went out into the wild, people discovered that this also meant that a reference could not be assigned to a tied variable, as noted in this thread from 1994:

https://groups.google.com/forum/?fromgroups#!msg/comp.lang.perl/TlLd6ttq4o4/-3YuF4n9UysJ

to which Larry replies "I'll fix it. Sounds like we'll want an alpha 5 pretty quick."

And so alpha 5 appeared, and changed SVt_REF to SVt_RV, added SVf_ROK, SvROK and SvRV, thus (pretty much) promoting references to first class scalars with the same semantics as I, N and P.

Alpha 5 also contained a file internals,

http://perl5.git.perl.org/perl.git/blob/ed6116ce9b9d:/internals

which describes the public flags like this:

These tell whether an integer, double or string value is immediately available without further consideration. All tainting and magic (but not objecthood) works by turning off these bits and forcing a routine to be executed to discover the real value. The SvIV, SvNV and SvPV macros that fetch values are smart about all this, and should always be used if possible.

and the private flags:

These shadow the bits in sv_flags for tainted variables, indicated that there really is a valid value available, but you have to set the global tainted flag if you acces them.

which suggests that the lack of public and private flags for references was a mistake. The scheme was designed for tainting and tie, or designed for tainting and extended to tie, and references weren't quite first class then. References became first class one alpha too late, and that's why they never had the proper split public and private flags. Probably it wasn't noticed because references weren't tainted, and most early uses of references were effectively idempotent, with the result that as long as code was called, it didn't notice if it was called multiple times instead of once.

And yes, this does mean that every version from 5.000 to maint-5.16 has been subtly buggy.

This month something finally clicked and I understood the subtleties and assumptions of fold_constants() and of op_linklist(), which it calls.

fold_constants() dates back to perl 5.000, but really it would have been better named "various mandatory and optional optree fixups that we need to do at this point", which is neither very terse, nor shorter than the 32 characters that ANSI guarantees as acceptable for a symbol. Since 5.000, various refactorings have moved the code unrelated to constant folding to other routines, leaving fold_constants() pretty much true to its name.

However, fold_constants() is not as general as its name might suggest. It is only able to analyse and fold a tree consisting of just a single op and entirely constant arguments. It can't fold anything more complex, such as this optree:

       +
     /  \
    1    *
        /  \
       2    3

This doesn't matter to perl's parser, as the optree is constructed from the bottom up, with fold_constants() called immediately as each op is built, hence the above would be folded as 2 * 3 and then 1 + 6. But this does mean that fold_constants() isn't that useful as a general-purpose constant folding API.
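The difference between folding at construction time and folding a ready-made tree can be shown with a toy model. The Python below is only an analogy for the behaviour described, not perl's C implementation; the `Const` and `BinOp` classes and the `new_binop` helper are invented for the illustration.

```python
import operator
from dataclasses import dataclass


@dataclass
class Const:
    value: int


@dataclass
class BinOp:
    op: str
    left: object
    right: object


OPS = {"+": operator.add, "*": operator.mul}


def fold_constants(node):
    """Fold one op whose arguments are all constants; else return it as-is."""
    if isinstance(node.left, Const) and isinstance(node.right, Const):
        return Const(OPS[node.op](node.left.value, node.right.value))
    return node


def new_binop(op, left, right):
    """Build an op and immediately try to fold it, bottom-up parser style."""
    return fold_constants(BinOp(op, left, right))


# Built bottom-up, 1 + (2 * 3) folds in two single-op steps: 2 * 3, then 1 + 6.
print(new_binop("+", Const(1), new_binop("*", Const(2), Const(3))))

# Handed a ready-made tree, the single-level check sees a non-constant
# right-hand argument and leaves the tree alone.
tree = BinOp("+", Const(1), BinOp("*", Const(2), Const(3)))
print(fold_constants(tree) is tree)
```

The first call prints a single folded constant, while the second shows that the same single-level routine does nothing useful on an unfolded multi-level tree.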

As part of this enlightenment, I refactored op_linklist() to be slightly terser, improved the documentation of its wrapper macro LINKLIST, and removed needless duplicate calls from fold_constants() to LINKLIST. I've removed 9 gotos (all vestigial) from fold_constants(), and documented it. I also spotted a long-standing error in perlguts.pod, and fixed that too.

It looked fairly easy to write tests for the documented behaviour, and add it to the public API. It seemed pretty clear that it can return two types of OPs, so both would need testing:

    if (type == OP_RV2GV)
        newop = newGVOP(OP_GV, 0, MUTABLE_GV(sv));
    else
        newop = newSVOP(OP_CONST, OPpCONST_FOLDED<<8, MUTABLE_SV(sv));
    op_getmad(o,newop,'f');
    return newop;

It's clear that the OP_CONST is the common case, and it's obvious how to test it, but what about that newGVOP? I had no idea what called that, so took the brute force approach of replacing it with an abort(), and running a full build and test cycle. (Parallel build and tests mean this takes less than 5 minutes. It's often faster than any other approach if it's not immediately obvious how to reach some code.)

Nothing failed.

Interesting...

So what's going on here? type is the type of the original op that was folded. So the newGVOP route can only be reached if fold_constants() completes for an op of type OP_RV2GV. But fold_constants() will never complete for an op of type OP_RV2GV, as it will return almost immediately:

    if (!(PL_opargs[type] & OA_FOLDCONST))
        return o;

as only ops with the OA_FOLDCONST bit set can be folded. That is set if the op is flagged as 'f' in regen/opcodes, and rv2gv doesn't have the flag. So, did it use to? It turns out that it never had it. The opcode data has moved around a bit in the history of perl, but even back in the earliest revision of perl 5 in git, alpha 2, rv2gv isn't flagged as 'f':

http://perl5.git.perl.org/perl.git/blame/perl-5a2:/opcode.pl#l176

So the code to return newGVOP, also added in alpha 2:

http://perl5.git.perl.org/perl.git/blame/perl-5a2:/op.c#l714

has always been dead code. So on the branch, it's gone.

Which only leaves the obviously testable code. "obvious" - always a danger sign.

Turns out that the first problem with testing the folding of constants is that if you try to build an optree ready to fold, the op constructor functions such as newBINOP() spot this and helpfully fold it for you, returning a single OP_CONST, instead of the tree you were hoping for. So you have to subvert their efficiency by lying to them - build OP_NULL instead of the op you really want to fold, then replace the op_type and op_ppaddr values after it's returned.

So now you have your tree ready to fold, and you pass it to fold_constants(). At which point you hit the second problem - nothing happens. It turns out that when it executes the ops in order to get the result, the BINOP I was using (OP_MULTIPLY) panics because it doesn't have a target allocated. OP_NULL doesn't need a target, so newBINOP() doesn't create one needlessly. However, no error report escapes, because constant folding runs with all warnings and exceptions trapped, and if anything goes wrong, constant folding is abandoned and the original optree remains. So, also allocate a pad slot, and all is happy.

Except that writing more tests reveals that it's not. SEGVs, wrong numbers of tests run, and "interesting" things like that, which valgrind reveals is due to a read from freed memory in pp_iternext(), the implementation of the looping part of for(). The problem turns out to be allocating that pad slot, however it's done, newBINOP() or the XS test code. It's all because XS code doesn't have its own pad - so at the time of the C calls the current pad is that of the calling Perl subroutine. The running subroutine. The problems happen when the pad gets moved as a side effect of being extended to accommodate allocating another slot in it, because the runtime for for() has taken the address of a location within the pad, never expecting it to move. This is a totally reasonable assumption, because the pad moving at runtime simply doesn't happen within the perl interpreter itself - once a subroutine is compiled to ops, neither the optree nor the pad changes again. I don't know how much of the runtime code makes assumptions such as these, but it suggests that the level at which the optree construction functions act doesn't make a good API to call near directly from Perl space.

Rather less successful was an attempt to untangle a bit more of the build process. The distribution 'version' on CPAN ships XS code for dealing with version objects and parsing v-syntax essentially identical to the code in the perl core. However, because that code is needed early in the core bootstrap (specifically, for miniperl, so before the core is able to process XS code), the code for version objects etc has to be shipped as C code. The code in question is within util.c and universal.c, which obviously makes it a real pain for keeping it in sync with the CPAN version.

I had a possible insight - the version distribution also ships with a pure perl implementation, version::vpp - could we use that to bootstrap miniperl? Sadly the answer is no, because "pure perl" has its usual CPAN meaning - "no need to install XS code". version::vpp uses B to extract raw version object metadata from magic attached to the scalar. Even trying to bodge things by forcing that code to return "not found" doesn't work - the build process grinds to a halt with a failure whilst it tries to build lib/Config.pm - ie pretty much the first step after miniperl is ready to run. So, this isn't going to work out. However, it looks like it might be possible to disentangle things using a potentially simpler approach - refactor the version distribution's XS code to split the pertinent C out into separate files, and then #include those directly from universal.c and util.c. John Peacock hopes to be able to look into this further, but can't currently as his time is otherwise spoken for.

A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/
BBC is "bleadperl breaks CPAN" - Andreas König's test reports for CPAN modules
ID YYYYMMDD.### is a bug number in the old bug system. The RT # is given
afterwards. You can look up the old IDs at https://rt.perl.org/perlbug/

Hours   Activity
 1.50   BBC (8be227ab5eaa23f2)
 0.50   Dumper.xs on non-gcc
 3.25   HP-UX compiler chokes
 0.50   HP-UX make and symlink targets
 0.50   ID 20001202.002 (RT #4821)
 0.50   Moose & MOP
 0.50   NetWare
 0.25   OS X non-UTF-8 filenames
 5.75   Old RT tickets
 1.50   PL_main_start/PL_main_root
 3.25   PV in %ENV
 0.50   RT #113786
 2.00   RT #113856
 0.25   RT #113930
 0.50   RT #113980
 0.25   RT #114022
 0.50   RT #114128
 3.00   RT #114142
 2.00   RT #114356
 0.50   RT #22375
 0.25   RT #24652
 0.25   RT #66092
 3.00   RT #77536
 0.75   array test failure on ARM with -Duse64bitint
 2.50   atarist
12.50   bisect.pl
        bisect.pl (Debian multiarch)
        bisect.pl (for RT #114356)
 3.00   build process
 0.25   ck_select
 2.25   code review
 1.25   cross compilation
 1.75   decoupling version
 8.00   fold_constants
 0.50   given/when/smartmatch
 0.25   i5 query
 1.00   investigating security tickets
 3.00   linklist
 7.50   magicflags
 0.25   pp_require
 2.25   process, scalability, mentoring
36.00   reading/responding to list mail
 0.50   scalarvoid
 0.50   smoke-me/magic_setenv
 7.75   smoke-me/require
 0.75   t/re/reg_posixcc.t

123.50 hours total

Shlomi Fish reported:

2012-Aug-28:

  • I received the E-mail that my grant for improving the Perl debugger was accepted in the afternoon today ( 2012-Aug-28 ). My grant manager will be Alan Haggai Alavi, whom I talked with and successfully collaborated with in the past.

  • I decided to start working.

  • I checked the pending tests on my shlomif-perl-d-add-tests-take-2-may-git-burn-in-hell branch (titled so because git has given me some trouble in the past incarnation of the branch), and after merging the changesets from bleadperl, made sure all tests pass (by re-running Configure, make and make test), and submitted a ticket with a patch with new tests:

    • https://rt.perl.org:443/rt3/Ticket/Display.html?id=114644

    • Notes:

      • To run the make in parallel do make -j4 (for 4 cores).
      • To run the tests in parallel do make -j12 test_harness TEST_JOBS=4 (the -j12 does not affect the tests, just makes sure the dependencies scanning goes quickly).
      • make test_harness TEST_FILES='../lib/perl5db.t' tests only the debugger.
  • Then I experimented with measuring coverage in the debugger. My first attempt was this script: https://gist.github.com/3583186

This caused some tests to fail. I decided not to investigate further for the time being (having spent a lot of time on it), and instead tried Devel::Cover.

  • After experimenting with several ways to enable Devel::Cover the most promising way appeared to be adding a use Devel::Cover; statement to the top of lib/perl5db.pl.

    • This caused some of the tests to segfault in lib/perl5db.t.

    • I decided to stop investigating for now.

    • TODO : Report a bug in Devel::Cover.

  • I talked with a friend on IM, and he suggested completely replacing lib/perl5db.pl with either Devel::Ebug or Devel::Trepan. I noted that Devel::Trepan is GPLv2+ + Artistic instead of GPLv1+ + Artistic like perl is, which may be problematic. Its code also exhibits some strange things, such as prototypes for methods and multiple pragmas on the same line, and the author refused to apply a patch that remedied them (for his own reasons).

Devel::Trepan is also not completely compatible with perl -d, so a compatible interface would have to be written for it. Adding tests that pin down the current behaviour of the perl debugger will help test such an interface.
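
The build and test commands from this day's notes can be collected into a small shell sketch. It assumes it is run from the top of a perl source checkout that has already been configured. The DRY_RUN variable and the run helper are hypothetical conveniences added for illustration and are not part of the perl build system. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the build-and-test cycle from the notes above.
# Assumption: the current directory is a configured perl source tree.
# DRY_RUN and run() are illustrative helpers, not part of perl's Makefile.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        # Only show what would be executed.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Build in parallel on 4 cores.
run make -j4

# Run the test suite with 4 parallel test jobs; the -j12 only speeds up
# the dependency scanning and does not affect the tests themselves.
run make -j12 test_harness TEST_JOBS=4

# Run only the debugger tests.
run make test_harness TEST_FILES='../lib/perl5db.t'
```

Setting DRY_RUN=0 in the environment executes the commands instead of printing them.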

2012-Aug-30:

  • I added a test for the f debugger command, by copying, pasting and modifying a previous test (for the b [filename]:[line] command). It runs nicely.

  • While testing the /pattern/ command, I noticed that it failed with an exception. The cause turned out to be that string eval inside the debugger did not handle lexical variables well. I fixed this, added a regression test, and noted some TODOs for the future.

  • I added a test for the ?pattern? command (to search backward), and this time the test worked immediately, though I'm not sure why the bug does not show up here. Perhaps it is due to the earlier use vars (...).

  • I sent an up-to-date patch to the previous bug report.

2012-Aug-31:

  • I fixed the older patch by removing an extraneous $i++ (only $i was needed) that was reported in perl5-porters.

  • Reverted some of the changes that involved converting C-style for(;;) loops into while/continue blocks.

  • Submitted a new patch to the rt.perl.org ticket.

2012-Sep-01:

About this Archive

This page is an archive of entries from September 2012 listed from newest to oldest.
