Shlomi Fish
$1500.
Add tests to the built-in perl debugger and refactor it.
The default perl debugger ("perl -d") is useful and powerful. However, it suffers from a lack of automated tests (which has allowed some recent enhancements to introduce regressions), and its internal code quality leaves a lot to be desired. This project aims to improve the debugger's test coverage, and afterwards to refactor it.
1. More automated tests for verifying the correctness of "perl -d".
2. More modular code with better extensibility.
The default perl debugger (normally invoked with "perl -d") is implemented in lib/perl5db.pl in the Perl 5 core and has a test suite in lib/perl5db.t. I have sent the perl5-porters mailing list patches to add tests to it and to fix bugs (some of which were introduced by recent feature additions).
Currently, the test suite is far from complete, which makes introducing regressions in the debugger fairly easy. Furthermore, the debugger's code suffers from many bad paradigms, and could use a lot of refactoring.
This project aims to first add more automated tests to the debugger (to make sure present and future modifications to the debugger do not cause regressions), and then to refactor, clean up, and modernise its code (to make future modifications easier).
1. Add automated tests for each of the debugger command-line commands listed in http://perldoc.perl.org/perldebug.html in order to reach a minimal level of test coverage.
2. Implement a method to find which lines in the debugger's code are not covered by tests. We may be able to use Devel::Cover for that, but it is possible a different strategy will be needed.
3. Add tests to cover most of the lines that are still not covered in the debugger.
4. Refactor the code.
I predict that the project will take between one and four months. I can begin working on it immediately.
The code will be made available (including during development) in a branch in my git clone of the Perl 5 core repository. During development and afterwards, I hope it will be merged into the Perl 5 core, after review by the perl5-porters.
I am an active user, developer, and advocate of Perl and other open-source technologies. I maintain many modules on CPAN (http://metacpan.org/author/SHLOMIF/, http://perlresume.org/SHLOMIF), and have contributed to other Perl projects. I am also proficient in C, C++, Assembly and other languages and have been actively involved in C and C++ projects.
I have successfully completed the Perl Foundation's "XML-RSS Cleanup" grant:
( short URL: http://is.gd/1nYZM )
I maintain an active homepage at http://www.shlomifish.org/ which contains more information about me.
I think his previous collaborations with p5p have been sufficiently difficult that I do not think he's the right person to do this, sadly.
Hi Leon,
I'm sorry that you feel that my previous collaborations with the Perl 5 Porters have been "sufficiently difficult". It would be helpful for me (and maybe for P5P as well) to understand how you feel that they are "difficult", and how I can try to improve.
I should note that it is my intention to carry on with adding tests to the debugger and refactoring it, even if I don't get the grant, just that I hope that getting this grant will motivate me.
I await your response.
Regards, — Shlomi Fish.
Hi Shlomi -
Good luck with the project!
As an early follower of the Devel::Trepan GitHub project, you might already be implicitly aware of some ideas that might be of benefit in this project. In case not, I'd like to make this explicit here.
But first, let me say a word on what appears to be the plan: writing tests and then performing a refactoring. On new programs, but I think on older ones too, the two generally go hand in hand rather than one after another. Test early and often. Refactor early and often too.
Although it is no doubt true that more tests are needed for perl5db.pl, it is also true that this almost 10K-line file really needs much more modularity, and more so than many other programs, for one odd reason: many other Perl 5 debugger efforts use some part of this code. Devel::Trepan, for example, does. So does the DB API/module. And although I haven't looked closely, I think ActiveState's Komodo does, and possibly Padre. Because perl5db.pl comes as one huge file, what almost invariably happens is that the file is cut and pasted. Then perl5db.pl is improved, such as in this project, and those projects do not benefit. If there were more separation, less code would be cut and pasted, and an improvement in perl5db could benefit other projects across the board without their having to change or update the pasted copies.
So with the above in mind, one of the first things I did in developing Devel::Trepan was to refactor the DB package a little, and over time more. Right now this modified DB is distributed with Devel::Trepan, but perhaps one day it will be split off.
Even to this day, changes to DB continue as I learn about better ways to do things.
I hope you will take some of that reorganization or the ideas behind them and incorporate that back into perl5db.
Because, as I've said, testing and refactoring go hand in hand, you will also find tests for this aspect of the refactoring. Look, for example, for tests in Devel::Trepan that start with t/10test-db; I'll probably create more soon, as I have been changing this recently.
The other broad area where I think ideas, if not code, can be used in this project is the way the tests are written.
There is a spectrum in testing between black-box tests (or "integration tests") and glass-box (or "unit tests"). The unit tests allow one to customize testing based on knowledge of the implementation. And as such they can tailor the tests to boundary conditions and ways to ensure that all of the code is covered. Unit tests are generally faster too and pinpoint errors more precisely. At the other end of the spectrum are the "integration tests", end-to-end or black-box tests that say: I don't care how you implement it, I just want to see that from the end-user standpoint things work. So they tend to be slower because they have to pull in everything. And when tests fail, you generally have to wade through more levels of code in order to pin-point the problem.
In Devel::Trepan, the unit tests start with t/10 while the integration tests start with t/20. I don't feel the naming of the tests is as important as the overall concept of having a clear separation of the two. I followed this convention because I prefer the faster and more pinpointable unit tests to run first in case there is a failure. In other projects, in Ruby for example, I separate these into two directories to make it easier to run tests in one or the other or both.
So again, I suggest it may be useful to follow some sort of broad sorting of the kinds of tests this way.
The other area in terms of testing that I think may be useful to follow is the way the integration tests are handled. There has been a bit of work to set this up so that these integration tests are easy to write. There is an example directory that contains sample Perl programs that one can run a debugger on. There is also a t/data directory which contains two kinds of files. One kind of file is just debugger commands like "step" or "break". Creating such a debugger command file is pretty easy: go into the debugger, think of something to try, type the commands, and then just record what you typed.
The other kind of file contains canonical output from the debugger after running the commands. The canonicalization is needed in order to strip away file paths, or to turn numbers such as hex memory locations or process IDs into fixed placeholder values. Creating this kind of file is also pretty simple. Just take the output that the debugger produces and save it to a file, looking for places where numbers or file paths need to be adjusted.
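To make the canonicalization idea concrete, here is a minimal sketch. It is written in Python purely for illustration (the Devel::Trepan helper itself is Perl), and the function name, the patterns, and the placeholder values are all assumptions rather than the rules Devel::Trepan actually applies.

```python
import re

# Hypothetical patterns for run-specific values in debugger output.
HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")
PROCESS_ID = re.compile(r"\bpid \d+\b")
# An absolute path: one or more "/dir" components followed by "/name".
ABSOLUTE_PATH = re.compile(r"(?:/[^/\s]+)+/([^/\s]+)")


def canonicalize(output: str) -> str:
    """Replace values that change from run to run with fixed placeholders."""
    output = HEX_ADDRESS.sub("0x1234567", output)   # memory locations
    output = PROCESS_ID.sub("pid 1000", output)     # process IDs
    output = ABSOLUTE_PATH.sub(r"\1", output)       # keep only the file name
    return output


if __name__ == "__main__":
    raw = "-- main::(/home/user/example/gcd.pl:18)\nSCALAR(0x9f3a2c0) pid 4242"
    print(canonicalize(raw))
```

The same function would be applied both when recording the expected output and when checking a later run, so that the two are compared in the same normalized form.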
Given this setup, writing an integration test then becomes pretty simple. If you look at the t/20 tests you'll see that they are very short; one is about 25 lines. They run the debugger and compare output.
Even the names of the command and comparison files are often implicitly specified based on the name of the integration test. See t/Helper.pm, which contains most of the boilerplate code needed to make this work.
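As a rough sketch of how a helper can derive the data files from the test name and then compare output, consider the following. It is in Python for illustration only, and the naming convention it assumes (a test named t/20test-NAME.t reading NAME.cmd and NAME.right from t/data), as well as both function names, are hypothetical; t/Helper.pm is the authoritative version.

```python
import re


def data_files_for(test_path, data_dir="t/data"):
    """Derive (command file, expected-output file) from a test's file name.

    Assumed, hypothetical convention: t/20test-step.t uses
    t/data/step.cmd and t/data/step.right.
    """
    base = test_path.rsplit("/", 1)[-1]
    match = re.fullmatch(r"\d+test-(.+)\.t", base)
    if match is None:
        raise ValueError(f"unrecognised test name: {base}")
    name = match.group(1)
    return (f"{data_dir}/{name}.cmd", f"{data_dir}/{name}.right")


def first_difference(actual, expected):
    """Return the 1-based number of the first differing line, or None."""
    actual_lines = actual.splitlines()
    expected_lines = expected.splitlines()
    for number, (got, want) in enumerate(zip(actual_lines, expected_lines), 1):
        if got != want:
            return number
    if len(actual_lines) != len(expected_lines):
        # One output is a strict prefix of the other.
        return min(len(actual_lines), len(expected_lines)) + 1
    return None


if __name__ == "__main__":
    print(data_files_for("t/20test-step.t"))
    print(first_difference("a\nb", "a\nc"))
```

With the file names implied by the test name, an individual integration test only has to name the sample program to debug; everything else is shared boilerplate.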
Again, good luck!