Some Python Utilities

One of the languages I use for programming is Python. Like (I presume) everybody who uses a particular language, I have piled up some Python modules that I use for various projects. They are listed here, and you can download and use them if you want. None of them is production quality in my view, but they can be useful. A word of warning: my environment is Windows XP and I run Python under Cygwin. There might be dependencies that I do not know about (although I tried to develop the scripts in a portable manner). If you hit such a problem (or any other), I would welcome your feedback. Some of the modules are distributed as a gzipped tar or zip file that contains more detailed documentation. Others are simply a single Python file.

Here are some of these modules:

PyXMLUtils.py (gzipped tar file, zip file)
PyXML is a very rich XML library implementing W3C's DOM (Level 2), including XPath processing. The ElementNode class is a wrapper around PyXML's implementation of a DOM Node that makes it a little easier to program with. (In other languages one would call these macros.) Examples are: appendString (create a Text node and append it as a child node), appendCDATA (similar), appending an XML fragment (parse an XML string and add the result as a child element), copying a full subtree from another document into the node, etc. See separate documentation for further details.
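To give an idea of what such "macros" look like, here is a minimal sketch of three of them. It is not the code of PyXMLUtils.py: it assumes Python 3 and uses the standard library's xml.dom.minidom in place of PyXML, and the function names append_string, append_cdata and append_fragment are hypothetical stand-ins for the methods described above.

```python
from xml.dom import minidom


def append_string(node, text):
    """Create a Text node holding `text` and append it as a child of `node`."""
    child = node.ownerDocument.createTextNode(text)
    node.appendChild(child)
    return child


def append_cdata(node, text):
    """Same as append_string, but the text goes into a CDATA section."""
    child = node.ownerDocument.createCDATASection(text)
    node.appendChild(child)
    return child


def append_fragment(node, xml_string):
    """Parse a well-formed XML string and append its root element to `node`."""
    fragment = minidom.parseString(xml_string)
    # importNode makes a deep copy that belongs to the target document.
    imported = node.ownerDocument.importNode(fragment.documentElement, True)
    node.appendChild(imported)
    return imported


doc = minidom.parseString("<root/>")
root = doc.documentElement
append_string(root, "hello ")
append_fragment(root, "<b>world</b>")
result = root.toxml()
```

Each helper hides the two-step "create the node on the owner document, then append it" pattern that the DOM API otherwise forces on the caller.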
RDFLib utilities, including a SPARQL API (gzipped tar file, zip file)
RDFLib is a Python library to manage RDF triples. myTripleStore is a wrapper around RDFLib's TripleStore to make certain operations a bit easier. Just like PyXMLUtils.py above, it is mostly a set of "macros": getPredicateSubject (useful when one knows that there may be only one), unfolding Collections, and operator methods to concatenate and multiply (i.e., intersect) triple stores. It also manages Seq containers: RDFLib does not ensure that an iterator over the properties of a Seq (i.e., _1, _2, _3, etc.) returns them in sequential order, so I made a wrapper around that. The SPARQL part is an API-level implementation of the W3C SPARQL Query draft. See separate documentation for further details, with a separate description of the SPARQL facilities.

However, these methods and routines are getting out of date. The reason is that the newer releases of RDFLib incorporate most of this code: thanks to Michel Pelletier and Daniel Kresch, this SPARQL API has been included in the latest releases of RDFLib (2.2.2 and higher). See my note on the transition. (However, version 2.2.2 still contains some bugs from the transition; hopefully, all this will be settled soon!)
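The Seq ordering problem mentioned above can be illustrated without RDFLib at all. The sketch below is an assumption-laden stand-in, not the wrapper from the module: it assumes Python 3, represents triples as plain (subject, predicate, object) tuples of strings, and the function name seq_items is hypothetical. The point is that the members must be sorted by the numeric index in rdf:_n, not by the order in which the store happens to return them.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Matches the container membership properties rdf:_1, rdf:_2, ...
_MEMBER = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)$")


def seq_items(triples, seq):
    """Return the members of `seq`, ordered by the numeric index of rdf:_n."""
    indexed = []
    for s, p, o in triples:
        if s != seq:
            continue
        match = _MEMBER.match(p)
        if match:
            indexed.append((int(match.group(1)), o))
    # Sort on the integer index so that _10 comes after _2.
    indexed.sort(key=lambda pair: pair[0])
    return [o for _, o in indexed]


# The triples are deliberately out of order, as a store might return them.
triples = [
    ("ex:list", RDF_NS + "_10", "j"),
    ("ex:list", RDF_NS + "_2", "b"),
    ("ex:list", RDF_NS + "_1", "a"),
    ("ex:list", RDF_NS + "type", RDF_NS + "Seq"),
]
ordered = seq_items(triples, "ex:list")
```

Sorting on the parsed integer rather than on the property URI matters: a plain string sort would put _10 before _2.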

RDFS Closure (gzipped tar file, zip file)
This is also a utility added to RDFLib, but independent of the SPARQL implementation. The function in the module calculates the RDFS Closure graph, following the Semantics of RDFS. More exactly, the module implements the closure algorithm as described in "Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary", by Herman J. ter Horst, Journal of Web Semantics, (2005) 79-115. (Unfortunately, the paper is not on-line.) The nice aspect of ter Horst's algorithm is that it is guaranteed to be finite. However, the datatype inference rules are not implemented in this module, simply because RDFLib does not have the necessary features. There is nothing sophisticated in the module, which also means that it takes its time for larger graphs... See separate documentation for further details.
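As a rough illustration of what computing such a closure involves, here is a naive fixpoint sketch. It is not the module's code and not ter Horst's full rule set: it assumes Python 3, uses plain string tuples instead of an RDFLib graph, covers only the rules commonly labelled rdfs2, rdfs3, rdfs5, rdfs7, rdfs9 and rdfs11, and ignores the special handling of literals. The name rdfs_closure is hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def rdfs_closure(triples):
    """Apply a subset of the RDFS entailment rules until nothing new appears."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                # rdfs5 / rdfs11: subPropertyOf and subClassOf are transitive.
                if p == p2 and p in (SUBCLASS, SUBPROP) and o == s2:
                    new.add((s, p, o2))
                # rdfs9: an instance of a class is an instance of its superclasses.
                if p == RDF_TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, RDF_TYPE, o2))
                # rdfs7: a statement also holds for the superproperty.
                if p2 == SUBPROP and p == s2:
                    new.add((s, o2, o))
                # rdfs2: the subject gets the type given by the property's domain.
                if p2 == DOMAIN and p == s2:
                    new.add((s, RDF_TYPE, o2))
                # rdfs3: the object gets the type given by the property's range.
                if p2 == RANGE and p == s2:
                    new.add((o, RDF_TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


graph = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
    ("ex:owns", DOMAIN, "ex:Person"),
    ("ex:anna", "ex:owns", "ex:rex"),
}
closed = rdfs_closure(graph)
```

The loop stops because no rule introduces new terms, so only finitely many triples can ever be added. Every pass compares all pairs of triples, which is exactly the kind of unsophisticated approach that gets slow on larger graphs.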
xmp.py
Extract Adobe's XMP metadata from JPEG or PDF files, and return it either as a string (in RDF/XML format) or loaded into an RDFLib TripleStore. There is also a method that takes a URI instead of a file name and retrieves the XMP data over the network. Based on some tricky regular expressions provided by Sean B. Palmer.
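The regular-expression approach can be sketched as follows. This is not the expression used in xmp.py, only a simplified assumption: it is written for Python 3, it looks for an x:xmpmeta element stored uncompressed and in a UTF-8 compatible encoding, and the names extract_xmp and extract_xmp_from_file are hypothetical. Files that keep the packet in a compressed stream or in UTF-16 would not be handled by this sketch.

```python
import re

# Non-greedy match of the first <x:xmpmeta ...> ... </x:xmpmeta> element.
XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp(data):
    """Return the first XMP packet found in `data` (bytes) as a str, or None."""
    match = XMP_PACKET.search(data)
    if match is None:
        return None
    return match.group(0).decode("utf-8", errors="replace")


def extract_xmp_from_file(path):
    """Read a file in binary mode and scan its whole content for XMP."""
    with open(path, "rb") as handle:
        return extract_xmp(handle.read())


# Synthetic stand-in for a JPEG: binary noise around an embedded packet.
sample = (
    b"\xff\xd8\xff\xe1junk"
    b"<x:xmpmeta xmlns:x='adobe:ns:meta/'><rdf:RDF/></x:xmpmeta>"
    b"\x00more binary data"
)
packet = extract_xmp(sample)
```

Scanning the raw bytes avoids having to understand the JPEG or PDF container format, which is what makes the trick attractive for a small utility.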
coordDates.py
This sounds very specific, but it comes up quite frequently when one generates graphics with Python: a coordinate interval might correspond, semantically, to an interval of time (usually starting at some past moment, the "epoch"), and one would like to convert dates to coordinates on that interval. This is what this small module does.
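The conversion is a plain linear interpolation between the two ends of the time interval. The sketch below shows the idea under stated assumptions: it is written for Python 3 with the standard datetime module, and the name make_date_mapper and its parameters are hypothetical, not the interface of coordDates.py.

```python
from datetime import datetime


def make_date_mapper(epoch, end, x_min, x_max):
    """Return a function mapping datetimes in [epoch, end] onto [x_min, x_max]."""
    span = (end - epoch).total_seconds()
    if span <= 0:
        raise ValueError("end must be later than epoch")

    def to_coord(moment):
        # Fraction of the time interval elapsed since the epoch.
        fraction = (moment - epoch).total_seconds() / span
        return x_min + fraction * (x_max - x_min)

    return to_coord


# One (non-leap) year mapped onto a 730 unit wide axis: 2 units per day.
to_x = make_date_mapper(datetime(2005, 1, 1), datetime(2006, 1, 1), 0.0, 730.0)
x = to_x(datetime(2005, 1, 11))  # ten days after the epoch
```

Dates outside the interval are not clipped here; they simply map to coordinates outside [x_min, x_max], which is often convenient when drawing.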

Ivan Herman, $Date: 31-12-2005 - 16:47$ ivan@ivan-herman.net

Unless otherwise noted, these works are licensed under a Creative Commons License.