2008Q2 Grant Proposal - Fixing Bugs in the Archive::Zip Perl Module


Name:
Alan Haggai Alavi

Title:
Fixing Bugs in the Archive::Zip Perl Module

Synopsis:
Perl programs often need to manipulate .zip files. Archive::Zip (http://search.cpan.org/dist/Archive-Zip/) is a Perl module that allows a Perl program to manage Zip archive files without calling an external utility.

The Archive::Zip module, however, has some bugs that prevent it from generating fully portable .zip files, that is, files that are handled correctly by all .zip file readers and manipulators. The project's main aim is to address the outstanding bug reports (http://rt.cpan.org/Public/Dist/Display.html?Name=Archive-Zip) by using pyconstruct (http://pyconstruct.wikispaces.com/), a flexible framework for defining dissectors for binary formats in a declarative way.

Deliverables:
A version of the Archive::Zip Perl module that manipulates .zip files correctly, and a reusable .zip file deconstructor.

Project Details:
Archive::Zip is a Perl module that allows a program to create, manipulate, read and write Zip archive files without calling an external utility. This programmatic approach is more robust than calling external utilities, whose output is not easily parsable.

The aim of the project is to fix the existing Archive::Zip module. I will use pyconstruct, a generic analyser for file formats and protocols based on defining data structures declaratively rather than in procedural code, to build a .zip file dissector. With the dissector I will compare the structure of .zip files produced by Archive::Zip with those produced by other .zip archivers. The differences will show what Archive::Zip is doing wrong and how to improve it.

The Archive::Zip Perl module will not depend on pyconstruct for its operation. pyconstruct will be used only for the comparison of binary .zip files.
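
The comparison approach can be illustrated with a minimal sketch. It does not use pyconstruct; Python's standard `struct` module stands in for it, with the local file header described as a declarative field table. The helper names (`dissect_first_header`, `diff_headers`, `make_zip`) are hypothetical, and both archives are generated by Python's `zipfile` purely as stand-ins for "Archive::Zip output" and "another archiver's output".

```python
import io
import struct
import zipfile

# Fixed part of the ZIP local file header, declared as (field name, struct code).
LOCAL_HEADER_FIELDS = [
    ("signature", "4s"),
    ("version_needed", "H"),
    ("flags", "H"),
    ("compression", "H"),
    ("mod_time", "H"),
    ("mod_date", "H"),
    ("crc32", "I"),
    ("compressed_size", "I"),
    ("uncompressed_size", "I"),
    ("name_length", "H"),
    ("extra_length", "H"),
]
# "<" selects little-endian with no padding, as the ZIP format requires.
LOCAL_HEADER = struct.Struct("<" + "".join(code for _, code in LOCAL_HEADER_FIELDS))


def dissect_first_header(data: bytes) -> dict:
    """Decode the local file header at the start of a .zip byte string."""
    values = LOCAL_HEADER.unpack_from(data, 0)
    header = {name: value for (name, _), value in zip(LOCAL_HEADER_FIELDS, values)}
    if header["signature"] != b"PK\x03\x04":
        raise ValueError("data does not start with a local file header")
    start = LOCAL_HEADER.size
    header["name"] = data[start:start + header["name_length"]].decode("cp437")
    return header


def diff_headers(a: dict, b: dict) -> dict:
    """Return the fields whose values differ, as {field: (a_value, b_value)}."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


def make_zip(compression: int) -> bytes:
    """Build a one-entry archive in memory (stand-in for a real archiver)."""
    buf = io.BytesIO()
    info = zipfile.ZipInfo("hello.txt", date_time=(2008, 5, 1, 21, 0, 0))
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, b"hello " * 50, compress_type=compression)
    return buf.getvalue()


stored = dissect_first_header(make_zip(zipfile.ZIP_STORED))
deflated = dissect_first_header(make_zip(zipfile.ZIP_DEFLATED))
for field, (left, right) in diff_headers(stored, deflated).items():
    print(f"{field}: {left!r} vs {right!r}")
```

A field-by-field diff like this narrows a portability problem down to specific header values instead of leaving two opaque binary files to compare by eye.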

Some of the currently outstanding bugs that will be fixed are:
* #22933: Properly extract symbolic links (http://rt.cpan.org/Public/Bug/Display.html?id=22933)
* #19502: Incomplete file (http://rt.cpan.org/Public/Bug/Display.html?id=19502)
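
As background for #22933, here is a small sketch of how symbolic links are conventionally represented in .zip files written on Unix: the file type bits sit in the upper 16 bits of the entry's external attributes, and the link target is stored as the entry's data. The details of the bug report itself are not reproduced here, the sketch uses Python's `zipfile` rather than Archive::Zip, and `is_symlink_entry` and `make_archive` are hypothetical helper names.

```python
import io
import stat
import zipfile

UNIX = 3  # host system code for Unix in the "version made by" field


def is_symlink_entry(info: zipfile.ZipInfo) -> bool:
    """True if the entry's Unix mode bits mark it as a symbolic link."""
    mode = info.external_attr >> 16  # upper 16 bits carry the Unix st_mode
    return info.create_system == UNIX and stat.S_ISLNK(mode)


def make_archive() -> bytes:
    """Build an archive holding one regular file and one symlink entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("target.txt", b"real file\n")
        link = zipfile.ZipInfo("link.txt")
        link.create_system = UNIX
        link.external_attr = (stat.S_IFLNK | 0o777) << 16
        zf.writestr(link, b"target.txt")  # the link target is the entry data
    return buf.getvalue()


with zipfile.ZipFile(io.BytesIO(make_archive())) as zf:
    links = {
        info.filename: zf.read(info).decode()
        for info in zf.infolist()
        if is_symlink_entry(info)
    }
print(links)
```

An extractor that ignores these mode bits writes the target path out as the contents of a regular file, which is one way symlink extraction can go wrong.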

The main risk during the project is failing to reproduce a bug. If I cannot reproduce one, I will contact the bug reporter for more details on how the bug occurred.

In case of a setback, that is, an inherent problem in Archive::Zip that cannot be solved, I will still be able to deliver a .zip file parser in Perl. This parser will dump verbose information that will enable users to understand why certain .zip files are not fully portable. This knowledge will help in creating a new Zip archive module that solves the inherent problems of the Archive::Zip Perl module.

Project Schedule:
The project will take 3 months. I can begin work immediately.

Bio:
I am in my final year of Computer Science and Engineering at College of Engineering Chengannur, Kerala, India (http://cec.ihrd.ac.in). I am a member of the college website team. At college, I am the FLOSS Cell Chairman and have conducted classes and seminars on Free Software at my college as well as at a nearby school. I am an active participant in technical events at college. I program in C/C++, PHP and Visual Basic, and build websites for fun and profit. It has been a year and a half since I fully converted to the GNU/Linux operating system. For a time, I wrote articles on GNU/Linux (http://slashmedia.wordpress.com). Although I have neglected that recently, I hope to revive the site soon. I currently maintain a personal website at http://drchost.com/~haggai/.

I have recently become interested in Perl. I wrote a Webmin clone in Perl as part of a college project, which helped me understand some basic Perl. I use the Vim editor for all programming.

Amount Requested:
$1,500.

Notes:
I have the approval of the Archive::Zip Perl module maintainers, Adam Kennedy and Shlomi Fish.

9 Comments

(repeating my comment on the main section)

Although it is not a sexy area, Archive::Zip really needs the help that the proposal details.

I am the maintainer, but I am not in a position to contribute anything to the module other than maintaining the package itself and doing some structural refactoring. When it comes to the actual zip specification, I'm of little help.

I agree with Adam, not very sexy work, but extremely valuable to lots of people and projects (TAP::Harness::Archive being one of them).

The basic idea seems useful. But it would be good to have a better idea of the scope of the grant. At first glance, $1,500 for fixing two bugs seems off.

But maybe a lot more bugs will be fixed. Some clarification would be good.


There are almost certainly more than 2 bugs that need fixing. :)

The pyconstruct methodology alone should churn up a bunch more.

I for one would add #24036 "WinXP Explorer Exposes Problems", which basically means you can't properly open a Perl-generated zip file in the native Windows Explorer Zip "directory" thingy.

Although I am not really interested in whether the packages can be opened with Windows Explorer Zip (it fails with a lot of zips from other sources), I think that correcting bugs in Archive::Zip is important.

But as Dave Rolsky says, I think $1500 is too much for fixing bugs in a module (although an important one).

On one hand I think that the general concept of giving TPF funds to people for fixing bugs in CPAN modules might be good if it helps channel corporate money to open source developers.

On the other hand, will this encourage people to sit on their bugs until TPF finances the work?

I think the latter is better avoided by keeping the amount of money lower, so that it is only an encouragement instead of a full salary replacement.
$1500 for the two bugs seems to be way too high.

For that amount I would expect fixing more things either in Archive::Zip or in some other module.

I find it conspicuous that the proposal does not mention unit tests at all.

Chris Dolan: naturally, the plan is to add regression tests for bugs, and other unit tests when possible. It was an omission from the original post. We're not going to fix a bug without adding an automated test.

Note that they may not be "unit tests" but possibly "system tests" or "integration tests". I personally tend to write the automated test for a particular bug as a system test rather than a unit test. That may be a bad habit, but it is what I have done until now. Generally, I try to make sure the test checks for meaningful behaviour rather than the particular behaviour of the unit.

@szabgab

With respect to encouraging people to leave bugs until they are funded, I note that the applicant has zero existing relationship with the Archive::Zip module, so it would not apply in this case anyways.

Now if _I_ or the original Archive::Zip author applied, that would be a different story.

About this Entry

This page contains a single entry by Alberto Simões published on May 1, 2008 9:00 PM.
