2008Q2 Grant Proposal - Revision Control for all of CPAN

Category: Grants

Comments (16)

I like it! Two thumbs up!

Some questions that might be important for this proposal though:

How are CPAN authors to gain access to their repository? How are they to set up fine-grained access control on their own (to enable close collaboration, for instance)? Can they? While not important for setting up the repositories, it seems important to hash these things out so that the repositories can be easily used.

One thing that might be considered for a future proposal, or as an extension to this one: given that the svn repositories exist, could there be a more direct interface to PAUSE (i.e. use the repo to publish directly to PAUSE somehow)?

I've played around with this kind of thing before and realise it's not trivial. It does sound to me like this is two separate projects: a subversion repository that people can play with and a subversion repository containing historical CPAN information.

Once this is done, it sounds like a high-I/O hosting problem. This sounds like a separate part of the project. Have you considered asking the perl.org admins whether they could host it?

Please refrain from flamewars on this point, but the main perl repository is about to move to Git. Would it not make sense for this to use the same version control system?

Honestly, I don't really care whether it's svn or git (although if it's svn it would be compatible with git and svk, but the reverse isn't true). There are already a lot of existing tools to work with SVN, so that's definitely a plus.

But regardless, I think this would be a huge win. Having to set up a separate sourceforge or google code project for a little CPAN module seems like overkill to me. And more openness is always good for open-source code.

I think this is a terrible idea.

It is an attempt to centralize and normalize the CPAN module process, which should specifically NOT be normalized. Svn vs. Git is only one part of this aspect.

Perl needs to diversify and de-standardize, rather than funnel towards being a monoculture.

A lot of people won't use this, simply because they're already happy with their own version control setup (myself included). The end result will be unpredictable - some people in, others out. Those who choose to retain their own setup will be forced to repetitively explain to others that no, they don't use the "central" repository. How tedious.

The idea of free version control for CPAN authors, however, is a good one. A "PerlForge", with good integration with the other standard community tools, would be very helpful for those who don't know how to or can't set up their own repository. (Although people didn't seem to like the idea so much four years ago.) Then again, it would also be a lot of work spent on duplication of an existing, successful system, SourceForge.

Those who have their own setup (myself included) are unlikely to switch to using this. That raises the tedious prospect of having to repeatedly explain to people that no, your repository isn't the central one.

The idea of free version control hosting for CPAN authors (PerlForge?) is a nice one, but has had a lukewarm reception before because it amounts to a lot of work spent duplicating SourceForge, a successful existing system. The saving benefit might come from integration with RT for CPAN.

Plus, yes, there is the what-version-control-system discussion. Pick svn and the git users will see it as a retrograde step. Pick git and the svn users may be unconvinced about having to put in the time to convert their existing repositories.

The whole thing strikes me as a boondoggle, to be honest.

This idea is flawed, because it attempts to mix release management with development management. Those are two very different things, if you study it in more detail.

CPAN is a distribution network, where complete sets of software are finding their way to end-users. It's about consistency, authority, preservation, maintainability, transport.

SVN is for development communication between authors. It's about change, which will only confuse end-users. SVN does not have release management features.

Compare it to a book-shop: the reader of the book really doesn't want to be bothered by the fights between the author and his publisher about the book content (SVN/revisions), but wants to see the book on a bookshelf, nicely printed, affordable, and with all pages in the right order (CPAN/releases).

'diff uploads/downloads' are useless (that's not the way Perl's installation tools work). 'file extracts' are already available via search.cpan.org/browse. Having old releases in SVN without annotation of the separate changes adds no value.

IMHO, this project is not an improvement of CPAN. However, it can be useful to be able to start an SVN repository easily, to work on any Perl module. Hm... we have that provided by sourceforge and google. (I agree there is a need for improvement, but you are thinking too small!)

As Andy says, diversity is good. I would like to continue to have people using their own repositories. I would prefer a proposal for a sourceforge/trac-like system for those who can't have their own, but one that does not complicate the CPAN system.

I vote yes. My only point would be to think about changing the one repo per author to one repo per distribution. This could allow a distro to be taken over by another user without having to copy and fork the history.

Also I see this as a great way to allow a user to track the current progress of any module as well as submit more current patches as needed.

[SVN vs GIT]
I think that Eric is right for picking svn; it's generic enough that everything can build from it. From the standpoint of this being CPAN, you do not gain anything from this being GIT (or anything else). It seems that Eric's goal is to provide a central repo to consolidate BackPAN and provide tools for future development of any module. =IF= you want to pull a test branch then you could; you can also pull a local copy and play in whatever you want. SVN, for all of its issues, is a flexible enough system to allow for just about anything.

I would argue that CPAN is already a centralized and normalized system. What Eric is proposing is keeping everything that we already have but also adding the ability to use SVN if you choose.

I agree that there might not be a mass adoption, though I do not see that as a hindrance for this project. I see this more as a method to move CPAN to a versioned fs that just becomes a more flexible system if we want to use it. Because all the tools are at one location, you do not need to create a separate account, set up a second environment, etc. Most importantly, though, everyone who has ever submitted anything to CPAN would have everything set up for them already. If you want to use it, cool, it's there; if not, no worries.

I agree that it could be confusing to a user who starts to look behind the screen, but I do not think that is Eric's intent. What I am taking away from this is that the current 'trunk' of any dist would be the same as the current tar method. Then each previous 'tag' would correspond to one of the previous tars. So I don't see the UI to CPAN needing to change much, if at all.
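As a rough illustration of that trunk/tag reading, here is a minimal sketch in Python. The `Dist/tags/<version>` and `Dist/trunk` path names and both helper functions are assumptions made up for this example; nothing here is taken from the proposal itself.

```python
import re

# Hypothetical layout, not from the proposal:
#   <dist>/tags/<version>  holds one unpacked release tarball
#   <dist>/trunk           matches the most recent release


def split_release(filename):
    """Split a name like 'Foo-Bar-1.02.tar.gz' into ('Foo-Bar', '1.02')."""
    m = re.match(
        r"^(?P<dist>.+)-(?P<version>v?\d[\d._]*)\.(?:tar\.gz|tgz|zip)$",
        filename,
    )
    if m is None:
        raise ValueError(f"unrecognised release name: {filename}")
    return m.group("dist"), m.group("version")


def layout(releases):
    """Map release tarballs (oldest first) to repository paths."""
    paths = {}
    for name in releases:
        dist, version = split_release(name)
        paths[name] = f"{dist}/tags/{version}"
    if releases:
        # trunk mirrors whatever the newest tarball contains
        dist, _ = split_release(releases[-1])
        paths["trunk"] = f"{dist}/trunk"
    return paths
```

Under this reading, someone browsing the tags sees exactly the set of tarballs that were uploaded, and trunk is simply the latest of them until an author commits something newer.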

The idea of having all of Backpan, unpacked, imported into a revision system isn't a bad one, but it will perforce fail on many levels when it can't track things like file renaming or movement, the semantics of which are lost to the tarballs stored online.
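To make the rename problem concrete, here is a small Python sketch. It assumes the only input is the file listing of two consecutive tarballs; the function name and the example paths are hypothetical.

```python
def diff_listings(old_files, new_files):
    """Compare the file listings of two releases.

    Returns (removed, added). A file that was renamed between the two
    releases shows up once in each list; nothing in the listings says
    that the two entries are the same file.
    """
    old, new = set(old_files), set(new_files)
    return sorted(old - new), sorted(new - old)


removed, added = diff_listings(
    ["lib/Foo.pm", "t/basic.t"],
    ["lib/Foo/Bar.pm", "t/basic.t"],
)
# removed == ["lib/Foo.pm"], added == ["lib/Foo/Bar.pm"]
```

Whether `lib/Foo.pm` was moved or deleted and replaced can only be guessed at (for instance by comparing contents), which is the information the tarballs do not carry.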

The idea of offering a per-user repository seems like a nice enough thing to do at first, but it's already offered by plenty of other hosts, who offer options other than one VCS. They also allow per-project permissions, which would be required for collaboration. That would require adding more ACLs to the system to track "projects." Maybe those are distributions, but there are currently no user-dist permission mappings.

There is also the "puppies make bad presents" aspect to this grant request. It covers building something that then must be continually operated and allowed to grow as needed, without specifying any sure backing.

I don't see much benefit, but I see plenty of costs.

The great thing about this system is that it has different things to offer depending on what you want from it.

The unfortunate thing is that it has different things to offer depending on what you want from it. So, some comments say "we don't need this history", some say "we don't need this version control", etc. No single aspect requires complete buy-in from anyone for it to be useful in many ways to many people.

This is not, however, a trac/sourceforge/etc. Those things (and more) could be built on or linked to it, but they are all one layer up. This is only a versioned filesystem hosted on HTTP with some data arranged in a useful way.

The "I have my own repository" issues are addressed (though not in such great detail as in earlier revisions). Yes, even if your repository is git.

Andy: this does not impose anything on the "process". If anything, its potential as an aggregation technology will enable more discoverable decentralization. The issues of linking to external repositories played a bigger role in a previous (too expensive) proposal, so I have left them out in the hope of at least getting *something* started. In any case, I think you will still find it valuable even if the author-writable edge of the sword does nothing beyond providing a sort of registrar for external repositories.

The git vs svn question is a tough one. I have given it much thought. I decided that svn would be a better fit. Nothing in this work precludes creation of a parallel repository built on git. Further, the majority of this work involves unpacking the backpan -- which would also need to be done by anybody who wants to make it out of git. Yet, here I am proposing to get it to a point where it could easily be cloned directly into git.

Nit note: the "-CPAN-/$dist/" bit appears to be suffering from wiki markup (strikethrough) interpolation.


I am concerned about long-term maintenance. It seems to me that the work of doing the import once is likely to be less in the long run than the work of keeping the repository up and running, answering questions about how to use it, finding hosting, renegotiating hosting periodically, etc.

If someone can line up a promise to take on that work long-term, I'd vote for this. Otherwise I'm planning to vote against it.

I think having an optional central place to version control all the code on CPAN is a good idea.
I don't think there is a need for importing code from backpan though, and I don't think there should be a mandated way of tracking releases. Version control and release management are different, as Mark has already pointed out.

I have my own SVN repository and what I think is missing is an easy way to allow other CPAN authors to contribute to my projects. AdamK has solved that in his SVN repository, others are using Google Code or Sourceforge.

I think it would be great if I could use my PAUSEid to set up a repository on svn.perl.org and then easily allow other CPAN authors to commit to my projects.


I really love this idea and agree with some of the former commenters that an optional svn repository would probably be optimal.

My biggest problem with other project sites like SF or Google Projects is that they are aimed at larger projects, while CPAN is very modularized. Google, for example, reserves (IIRC) 100MB of svn space for you, and limits you to a handful of projects per account. For most libraries I don't need that much space; what I need is a better way to manage a large set of distributions with very different amounts of change over time.

I also think this is a great idea. If people are afraid of centralization, just make it optional. I think many people would be glad to have a standard rcs for CPAN. Why should people be dependent on other project hosting services if CPAN could offer it?

There are already more ways to build a module, and also two ways to install (CPAN, CPANPLUS). While TIMTOWTDI is good, can't there be a default way for hosting your CPAN code - the recommended way for those who would benefit? A standard way doesn't need to forbid other ways.

CPAN is already a killer app. If people who want to write a module and don't already know where to host it see that CPAN offers hosting, it would be even better.

Git would be a much better target for this because:

a) there are already partial conversions underway

b) it's possible to easily import detailed history from other revision control systems to git; the same is simply not possible with svn

c) it is a more flexible way forward.

svn has seen its day, let's move forward please.
