Kjetil Kjernsmo will assist Tope with Perl and CPAN
- Amount Requested:
How much is your project worth?
The Semantic Web community has developed a vocabulary named VoID for describing datasets. It is published as a World Wide Web Consortium Interest Group Note and is a de facto standard: http://www.w3.org/TR/void/. The goal of this project is to generate such descriptions using RDF::Trine, partly automatically and partly from hand-maintained statements.
Benefits to the Perl Community
There is an active community around Semantic Web technologies with Perl. This community considers generating VoID descriptions a very important undertaking: VoID is an important part of the Linked Data technology stack and should be deployable in Linked Data services. A module such as this will be highly useful for users of Perl in web development, and may attract new users to Perl as Perl becomes one of the few viable alternatives for deploying a comprehensive Linked Data service. Despite its importance, the Perl+RDF community has not found the manpower to write this module. The community therefore welcomes the assistance of Tope Omitola, who is a researcher in this field and has been involved, with other researchers from the University of Southampton, in the creation of similar modules for PHP. The module will be supported by the Perl+RDF community (see http://www.perlrdf.org/) and integrated with RDF::LinkedData.
Dataset publishers can use VoID descriptions for datasets' maintenance, administration, and hosting.
Clients can use VoID descriptions to discover, query, crawl, and index datasets, navigate them, get an idea of the type of data available, and optimize queries on them.
A module named RDF::Generator::Void. The module has been started on GitHub (https://github.com/tope/RDF-Generator-Void) by the primary proposer and has received patches from two members of the Perl+RDF community. This module will be uploaded to CPAN.
With the rise in the usage and deployment of Semantic Web technologies, especially Linked Data, in organisations, industries, and governments, a service is needed that can automatically construct service-level descriptions of Linked Data services. Such a service should also let Linked Data developers and maintainers add further data about these services manually.
This project aims to build a Perl module, RDF::Generator::Void, that can be used to set up such services. Perl+RDF community members are already committed to integrating this module with existing modules.
Using RDF::Trine, the module will automatically generate the following for a dataset or a SPARQL endpoint:
The total number of triples contained in the dataset.
The total number of entities that are described in the dataset.
The total number of distinct classes in the dataset.
The total number of distinct properties in the dataset.
The total number of distinct subjects in the dataset.
The total number of distinct objects in the dataset.
- void:documents, void:subset, void:Linkset
express the set of foreign links in a dataset.
- void:feature
used for expressing certain technical features of a dataset, such as its supported RDF serialization formats.
- void:sparqlEndpoint void:dataDump void:exampleResource
Further descriptions of the dataset.
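The counting statistics listed above can be illustrated without any RDF library. The following is a minimal sketch in Python, not the proposed Perl module and not the RDF::Trine API: triples are plain tuples of strings, the function name `void_statistics` is hypothetical, and the `void:` keys are the VoID vocabulary terms assumed to correspond to the listed counts. The criterion for an "entity" (a subject that has an rdf:type statement) is also an assumption made for this example; a real deployment would choose its own.

```python
from typing import Dict, Iterable, NamedTuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Triple(NamedTuple):
    """One RDF statement; terms are kept as plain strings for simplicity."""
    subject: str
    predicate: str
    obj: str


def void_statistics(triples: Iterable[Triple]) -> Dict[str, int]:
    """Compute the counting statistics for an in-memory set of triples."""
    unique = set(triples)  # an RDF graph is a set, so duplicates count once
    typed_subjects = {t.subject for t in unique if t.predicate == RDF_TYPE}
    return {
        "void:triples": len(unique),
        # Assumption: an entity is any subject with an rdf:type statement.
        "void:entities": len(typed_subjects),
        # Classes are the distinct objects of rdf:type statements.
        "void:classes": len({t.obj for t in unique if t.predicate == RDF_TYPE}),
        "void:properties": len({t.predicate for t in unique}),
        "void:distinctSubjects": len({t.subject for t in unique}),
        "void:distinctObjects": len({t.obj for t in unique}),
    }


# Hypothetical sample data using made-up example.org identifiers.
SAMPLE = [
    Triple("http://example.org/alice", RDF_TYPE, "http://example.org/Person"),
    Triple("http://example.org/bob", RDF_TYPE, "http://example.org/Person"),
    Triple("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    Triple("http://example.org/alice", "http://example.org/name", '"Alice"'),
    # Exact duplicate of an earlier statement; it must not be counted twice.
    Triple("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
]

if __name__ == "__main__":
    for term, count in void_statistics(SAMPLE).items():
        print(term, count)
```

Against a SPARQL endpoint the same numbers would more likely come from aggregate queries than from iterating over every triple; the sketch only pins down what each count means.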
A full VoID description also contains other properties that in most cases must be hand-maintained, i.e. in a practical application, added from a file or through configuration. We will not enumerate these properties, but the module must be able to accept such statements.
Furthermore, the module must have methods to prompt a regeneration of the description, both a forced update and an update that will first check if the data has changed.
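The two regeneration modes can be sketched as follows. This is an illustration in Python under stated assumptions, not the module's design: the class `DescriptionCache`, its method names, and the `fingerprint` helper are all hypothetical, and the SHA-256 hash over the sorted triples merely stands in for whatever change indicator the underlying store provides (the task list below mentions the model's etag). The description itself is reduced to a triple count to keep the example short.

```python
import hashlib
from typing import Dict, Iterable, List, Optional, Tuple

SimpleTriple = Tuple[str, str, str]


def fingerprint(triples: Iterable[SimpleTriple]) -> str:
    """Order-independent digest of the data; a stand-in for a store's etag."""
    digest = hashlib.sha256()
    for s, p, o in sorted(set(triples)):
        digest.update(f"{s}\x00{p}\x00{o}\n".encode("utf-8"))
    return digest.hexdigest()


class DescriptionCache:
    """Holds a generated description and regenerates it on demand."""

    def __init__(self, triples: Iterable[SimpleTriple]) -> None:
        self._triples: List[SimpleTriple] = list(triples)
        self._etag: Optional[str] = None
        self._description: Optional[Dict[str, int]] = None
        self.generation_count = 0  # exposed so callers can observe regenerations

    def replace_data(self, triples: Iterable[SimpleTriple]) -> None:
        """Swap in new data without touching the cached description."""
        self._triples = list(triples)

    def generate(self) -> Dict[str, int]:
        """Forced update: always recompute, regardless of the fingerprint."""
        self._description = {"void:triples": len(set(self._triples))}
        self._etag = fingerprint(self._triples)
        self.generation_count += 1
        return self._description

    def regenerate_if_changed(self) -> Dict[str, int]:
        """Conditional update: recompute only if the data has changed."""
        if self._description is None or fingerprint(self._triples) != self._etag:
            return self.generate()
        return self._description
```

Hashing every triple costs about as much as counting them, so the conditional path only pays off when the store can report a cheap change indicator of its own; that is the role the etag would play in practice.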
Learn sufficient Perl
Get an overview of the API provided by RDF::Trine
Get an idea of what can be done with Any::Moose
Write a set of test suites
Create a constructor that can take an RDF::Trine::Model as the basis for computing the description, and a model to add the description to.
Add a method for adding hand-maintained statements.
Add a method to return an RDF::Trine::Model with the description.
Create test data
Write more tests
Create the code to generate the description based on triple counts.
Create a method to unconditionally regenerate the description.
Create a method to regenerate the description only if the model's etag has changed.
Package the module for CPAN distribution
How long will the project take? When can you begin work?
It will take 2 months. Work can begin on 16 May 2012.
Work will happen between other projects, so the active time spent will be much shorter. Learning to use Perl and auxiliary tools is expected to take two weeks. The constructor and initial methods are then expected to take one week, and writing tests another week. Generation of the description is the main effort and can take up to two weeks. Finally, wrapping up and reviewing the code is expected to take two weeks before release.
The module will be released to CPAN, either by the proposer or by other members of the Perl+RDF community. Completeness will be judged by the module's ability to fulfil the goals stated in the detailed project description and by its passing the test suite developed during the project.
Who are you? What makes you the best person to work on this project?
Tope Omitola: Research Fellow at the University of Southampton. Experienced Semantic Web / Linked Data developer, experienced in developing Semantic Web discovery services. Among his research interests is provenance tracking of Linked Data, which involves an extension of VoID called voidp, of which he is the primary author. Has previously been involved in writing a PHP module similar to the one proposed in this project. He is new to Perl.
Kjetil Kjernsmo: Ph.D. Research Fellow at the University of Oslo. Has 16 years of Perl experience and several modules on CPAN. Is an active member of the Perl+RDF community and organized the first International Semantic Web with Perl hackathon in London in March 2011. Active member of Dahut.pm and deputy board member of Oslo.pm. Will help Tope get up to speed with Perl and help with packaging and the test suite.