2008Q2 Grant Proposal - CPAN Stability Project 
        
        
      
      
         
Thu, 01-May-2008 by Alberto Simões

* **Name:** Michael G Schwern
* **Project Title:** CPAN Stability Project
* **Synopsis:** The CPAN Stability Project is intended to improve the usability of CPAN over long term use by providing a way to choose between safe releases vs newest releases as well as to better guarantee that upgrading will not break anything. It draws heavily on Debian's release management and tiers (unstable, testing, stable).
This project largely combines known problems and previously proposed solutions into one overarching vision of stability with flexibility, by adding...
* Multiple module indexes spanning from most stable to most recent.
* The ability to test a module's dependents before upgrading.
* Metadata about the types of changes in a release.
* Backwards compatibility information about a release.
**Benefits to the Perl Community**
The intent is to allow everyone to safely and automatically keep their CPAN modules up to date.
CPAN is Perl's greatest asset, but it is not used widely enough because there is only one way to keep up to date on CPAN and that is to install the latest version. This is a crap shoot, and for large dependency chains there's a very good chance something has broken today. The CPAN Dependency Checker illustrates the problem. http://cpandeps.cantrell.org.uk/
This will allow large-scale CPAN users to feel safe, and be safe, when using CPAN modules while keeping up to date. Those who value rapid development can still get the latest versions, and everyone in between is served as well.
The CPAN Stability Project eases the "dependency hell" problem of a broken dependency bringing an install or upgrade to a halt.
**Deliverables**
The project is split up into four major pieces: Testing Dependents, Release Metadata, Complete Index and Tiered Indexing. Please see the wiki page under Project Details for details of each component.
The deliverables have all been specified as use cases describing what new abilities will be available, rather than how it will be done. This is so the implementation can evolve without having to renegotiate the grant, and so in the end the Perl Community gets their intended benefits.
_Testing Dependents_
Given a CPAN release, a user must be able to...
* Determine what distributions depend on this release.
* Of those, determine which are installed.
* Acquire those dependent releases with their tests.
* Run the dependent's tests against the new release without installing the new release.
* Determine if the dependent's tests pass or fail.
* Configure a CPAN shell to perform this process automatically.
It is intended that these additions be available inside a CPAN shell, but initially they will be written independently to rapidly prototype the problem.
Though it will attempt to work without it, this phase will operate best with a local database of installed releases and one may be developed during the process.
This phase will likely depend on the Complete Index.
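The use cases above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the `REQUIRES` table, the distribution names, and all function names are hypothetical, dependents are followed through whole dependency chains (the proposal does not say whether indirect dependents count), and the actual test run is left as a callable supplied by the caller.

```python
from typing import Callable, Dict, List, Set

# Hypothetical dependency data: distribution -> distributions it requires.
REQUIRES: Dict[str, Set[str]] = {
    "Dist-A": set(),
    "Dist-B": {"Dist-A"},
    "Dist-C": {"Dist-B"},
    "Dist-D": set(),
}


def dependents_of(release: str, requires: Dict[str, Set[str]]) -> Set[str]:
    """Distributions that depend on `release`, directly or through a chain."""
    found: Set[str] = set()
    frontier = [release]
    while frontier:
        current = frontier.pop()
        for dist, deps in requires.items():
            if current in deps and dist not in found:
                found.add(dist)
                frontier.append(dist)
    return found


def installed_dependents(
    release: str, requires: Dict[str, Set[str]], installed: Set[str]
) -> List[str]:
    """Of the dependents, keep only those that are installed locally."""
    return sorted(dependents_of(release, requires) & installed)


def check_dependents(
    dependents: List[str], run_tests: Callable[[str], bool]
) -> Dict[str, bool]:
    """Record pass (True) or fail (False) for each dependent's test run.

    `run_tests` is an assumed hook that runs one dependent's tests against
    the new, not yet installed release.
    """
    return {dist: run_tests(dist) for dist in dependents}
```

A CPAN shell could refuse or confirm an upgrade based on whether any value in the returned mapping is `False`.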
_Release Metadata_
The ability for release authors to document, via META.yml...
* The list of what has changed in this release, in a general sense.
** Documentation, Tests, Security, Features, etc...
* The intended highest stability tier of this release.
* The oldest release with which the release intends to maintain compatibility.
The ability for the indexer to...
* Use the release's intended stability tier to determine how the release is indexed.
The ability for the user to...
* Determine if upgrading to the release will break compatibility with installed dependents.
** Have a CPAN shell make that determination automatically
* [BONUS] Declare to the CPAN shell what releases they want to remain compatible with.
It is intended that these additions be added to the META.yml specification, but initially they can be added as 3rd party defined keys using the provisions for this in the META.yml spec.
It is intended that Module::Build and ExtUtils::MakeMaker have the ability to emit these new keys. If not, Module::Build users have the ability to add their own META.yml keys and it is intended that MakeMaker gain this ability soon.
This phase is largely independent of the other phases.
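As an illustration of how a client might use such metadata, here is a small sketch. The key names (`x_changes`, `x_stability`, `x_compatible_back_to`) are invented for this example and come from no specification, the metadata is shown as an already parsed dictionary, and versions are assumed to be plain decimal strings compared numerically.

```python
from decimal import Decimal
from typing import Optional

# Hypothetical parsed metadata for a new release; the x_* keys are made up.
new_release_meta = {
    "name": "Example-Dist",
    "version": "2.00",
    "x_changes": ["Features", "Tests"],
    "x_stability": "Testing",
    "x_compatible_back_to": "1.50",
}


def upgrade_keeps_compatibility(installed_version: str, meta: dict) -> Optional[bool]:
    """Compare the installed version with the declared compatibility floor.

    Returns True if the installed version is at or above the oldest release
    the new release intends to stay compatible with, False if it is older,
    and None when the author declared nothing.
    """
    oldest = meta.get("x_compatible_back_to")
    if oldest is None:
        return None
    return Decimal(installed_version) >= Decimal(oldest)
```

Returning `None` for undeclared compatibility is a design choice of this sketch; a real client would have to decide whether "unknown" blocks an automatic upgrade.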
_Complete Index_
The ability for an indexer to create and update...
* A "Complete" index mapping a module and version to a release.
** Tie in the ability to get deleted releases from BackPAN mirrors.
** [BONUS] A remote service to query the Complete index
The ability for a user to...
* Install a particular release using the Complete index via a CPAN shell
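Conceptually, the Complete index is a lookup from a module and version to the release that provides it. A minimal sketch, assuming an in-memory dictionary and made-up module names and archive paths (none of this reflects an actual index format):

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical Complete index: (module, version) -> release archive path.
# One release can provide several modules, each with its own version.
COMPLETE_INDEX: Dict[Tuple[str, str], str] = {
    ("Example::Module", "1.00"): "E/EX/EXAMPLE/Example-Dist-1.00.tar.gz",
    ("Example::Module", "1.01"): "E/EX/EXAMPLE/Example-Dist-1.01.tar.gz",
    ("Example::Other", "0.03"): "E/EX/EXAMPLE/Example-Dist-1.01.tar.gz",
}


def find_release(
    module: str, version: str, index: Dict[Tuple[str, str], str]
) -> Optional[str]:
    """Return the release providing exactly this module version, if indexed."""
    return index.get((module, version))


def known_versions(module: str, index: Dict[Tuple[str, str], str]) -> List[str]:
    """List every indexed version of a module, including superseded ones."""
    return sorted(v for (m, v) in index if m == module)
```

Unlike an index that lists only the latest version of each module, this mapping keeps older entries, which is what makes installing a particular release possible.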
_Tiered Indexing_
The ability to generate and keep up to date...
* Tiered release indexes
** Experimental
** Unstable
** Testing
** Stable
** Old Stable
* A compatible 02packages.details.txt index.
The ability for an indexer to...
* Place a release in its starting tier.
** Experimental starts in Experimental.
** Unstable starts in Unstable.
** Testing starts in Testing.
** Stable starts in Testing.
** Untagged X.Y is handled as Stable.
** Untagged X.Y_Z is handled as Unstable.
* Move a release to another tier.
* Progress releases from Testing to Stable after a waiting period.
* Progress Stable releases to Old Stable when a new Stable release appears.
* Remove old Old Stable releases when new Old Stable releases come in.
The ability for release authors to...
* Declare the intended stability tier for their release.
* Change the intended stability for a release.
** For example, change from Stable to Testing if a release fails tests.
The ability for a user to...
* Configure a CPAN shell to draw from a particular index.
* [BONUS] Have the CPAN shell try a more stable index if their intended fails.
It is intended that this all coordinate with PAUSE. Because of the complexity and age of the PAUSE code, it is intended that initial implementation will be with a new, prototype indexer. Should coordination with PAUSE not prove feasible, for whatever reason, or should it prove complex enough that it risks holding up the project, the prototype indexer can be put into production independent of PAUSE. It would publish new indexes independently, but still remain in sync with PAUSE. Users and CPAN clients could use these independent indexes to talk with existing CPAN mirrors.
This phase can operate independently of the other phases. Without the Release Metadata all releases are simply considered Untagged.
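The placement and progression rules in this section can be sketched as below. This is one reading of the rules, not a specification: the function names are hypothetical, an untagged X.Y release is assumed to start in Testing (because it is handled as Stable, and Stable starts in Testing), only releases whose intended tier is Stable are assumed to progress, and the length of the waiting period is left as a parameter because the proposal does not fix it.

```python
from datetime import date, timedelta
from typing import Optional

TIERS = ["Experimental", "Unstable", "Testing", "Stable", "Old Stable"]


def intended_tier(version: str, tag: Optional[str] = None) -> str:
    """Author's tag if present; otherwise X.Y_Z is Unstable and X.Y is Stable."""
    if tag is not None:
        return tag
    return "Unstable" if "_" in version else "Stable"


def starting_tier(version: str, tag: Optional[str] = None) -> str:
    """Tier a new release is first indexed in; Stable releases start in Testing."""
    tier = intended_tier(version, tag)
    return "Testing" if tier == "Stable" else tier


def ready_for_stable(
    intended: str, entered_testing: date, today: date, waiting_period: timedelta
) -> bool:
    """True once a Stable-intended release has waited long enough in Testing."""
    return intended == "Stable" and today - entered_testing >= waiting_period
```

For example, `starting_tier("1.02_01")` gives `"Unstable"`, while `starting_tier("1.02")` gives `"Testing"` until the waiting period has passed.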
**Project Details**
Details of the project can be found here. http://www.perlfoundation.org/perl5/index.cgi?cpan_stability_project
This project is a combination of several smaller, related grants. ambs advised that I should combine them. If the project is too large it can be split back into the smaller stages.
This project works best when coordinated with...
* CPAN.pm
* CPANPLUS
* PAUSE
* ExtUtils::MakeMaker
* Module::Build
* The META.yml spec
**Project Schedule**
I will be available to work on this project after June 20th.
Scheduling for this project is difficult as adapting PAUSE and the CPAN shells can throw the whole thing off. The independent prototypes can be scheduled with more certainty.
In the interest of getting this grant in before the 2008 Q2 deadline, I must leave off a schedule.
**Bio:**
Michael G Schwern will be doing the primary work. He has been programming Perl for over 12 years. He has written, maintained or been heavily involved in all parts of the CPAN module installation process. As maintainer of Test::More and ExtUtils::MakeMaker, at the very root of the CPAN dependency tree, he is acutely aware of the problems involved in keeping CPAN stable. He is currently working as an independent contractor. His work schedule for the summer is open in anticipation of being awarded this grant.
It is intended that the project coordinate with...
* Andreas König for CPAN shell support and PAUSE and CPAN indexing changes.
* Ken Williams for approval of META.yml and Module::Build changes.
* Jos Boumans for CPANPLUS support.
* Adam Kennedy for advice on dependent testing.
* David Golden and rjbs for CPAN Testers advice.
**Amount Requested**
$9000
If this amount is too large for this round of grants, the project can be split back into the following pieces...
Complete Index: $2000
Dependent Testing
* w/Complete Index: $4000
* Tiered Indexing: $4000
Release Metadata: $1000
Additionally, as this project would be of great interest to any Enterprise level CPAN users, it is possible that several companies could be found to match the TPF's funding.
      
      
      
      
Comments (2)
  
  
  
  
    I enthusiastically support this idea.  Even if the project fails, it will make an impact, I believe.
I worked as a packager and maintainer for the Fink project for about a year.  The major problem with tiers, as I experienced in Fink and as has been popularized in Debian, is that everything ends up in "testing" or "unstable".  It's really hard to get the confidence to push something to "stable", especially if the packager only uses a subset of the functionality him/herself.
By pushing the "stable" decision back upstream *and* letting the author change that decision post-facto, the tiers will be much more meaningful and authoritative.
The proposal seems heavyweight for what PAUSE/CPAN have traditionally been about, but I think the plan to simply let the version number drive the stable/unstable decision in the absence of an author tag will make this opt-in and thus successful.
  
  
    
  
  
  
  
  
    looks like a very good idea.
one thing that i feel is missing in our current CPAN design is a common version number convention that people should follow for their package/distribution.
for example, i have seen packages with a first version of 1.00 or 0.01 or 1.0 or 0.1.
also, i have seen version increments something like,
1.20
1.2001
1.2002
1.21
1.22
so users might find difficulty if they are doing things apart from just installing from cpan (like creating pre-built packages from cpan for their use).
so, i guess it would be useful if CPAN sets common version number convention that people should follow for their package/distribution while releasing new version.