
Extending the Python Wrapper for Calais to Support RDF

16 April 2009


We’ve been working for some time with python-calais, a Python wrapper for the Calais Semantic Text Annotation Service; the wrapper project is led by Jordan Dimov. For Likematter, we extended its native support for Calais’ JSON response type to include support for Calais’ RDF output. Several people expressed interest in using rdflib with Calais at last night’s Python meetup in Cambridge, MA, so I’ve packaged up our RDF extension as a starting point and example for anyone looking to process Calais RDF in Python apps.

The reason to use RDF over JSON is that the RDF output contains a fuller representation of the analysis Calais produces. As one minor example, the full dereferenceable URLs for each entity type (e.g. Country, Person) aren’t present in the JSON. The drawback is that RDF is much harder than JSON to interpret and translate into Python objects. The code here includes SPARQL for only part of what Calais produces, namely entities and categories, but it should serve as a starting point for working with Calais’ RDF.

Our RDF extension has been added to the python-calais Google Code repository. The key SPARQL queries for categories and entities are below.


CATEGORY_QUERY = {
    'fields': ['docId', 'category', 'categoryName', 'score'],
    'SPARQL': """
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX cp: <http://s.opencalais.com/1/pred/>
        SELECT ?docId ?category ?categoryName ?score
        WHERE { ?doc cp:docId ?docId .
                ?doc cp:category ?category .
                ?doc cp:categoryName ?categoryName .
                ?doc cp:score ?score . }
    """,
}

# Note: ?res_score is selected but never bound in the WHERE clause,
# so that column is always unbound in the results.
ENTITY_QUERY = {
    'fields': ['entityId', 'name', 'type', 'relevance',
               'resolves_to_uri', 'resolves_to_type',
               'resolves_to_name', 'resolves_to_score'],
    'SPARQL': """
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX cp: <http://s.opencalais.com/1/pred/>
        SELECT ?entity ?name ?type ?relevance ?res_uri ?res_type ?res_name ?res_score
        WHERE { ?entity cp:name ?name .
                ?entity rdf:type ?type .
                ?rel_uri cp:subject ?entity .
                ?rel_uri cp:relevance ?relevance .
                OPTIONAL { ?res_uri cp:subject ?entity .
                           ?res_uri rdf:type ?res_type .
                           ?res_uri cp:name ?res_name . }
        }
    """,
}
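
As a rough sketch of how query dictionaries like these might be used with rdflib: parse the RDF/XML that Calais returns into a graph, run the SPARQL, and pair each result row with the 'fields' names. This is not the code from the python-calais repository. The helper names load_graph and run_query are made up for illustration, the sketch assumes a recent rdflib release, and it assumes that 'fields' lists the SELECT variables in order.

```python
# Trimmed copy of the category query from this post, for illustration.
CATEGORY_QUERY = {
    'fields': ['docId', 'category', 'categoryName', 'score'],
    'SPARQL': """
        PREFIX cp: <http://s.opencalais.com/1/pred/>
        SELECT ?docId ?category ?categoryName ?score
        WHERE { ?doc cp:docId ?docId .
                ?doc cp:category ?category .
                ?doc cp:categoryName ?categoryName .
                ?doc cp:score ?score . }
    """,
}


def load_graph(rdf_xml):
    """Hypothetical helper: parse an RDF/XML string into an rdflib Graph."""
    # Imported here so the rest of the sketch can be read and exercised
    # without rdflib installed.
    from rdflib import Graph
    graph = Graph()
    graph.parse(data=rdf_xml, format="xml")
    return graph


def run_query(graph, query):
    """Hypothetical helper: run one query dict and return a list of dicts.

    Assumes query['fields'] names the SELECT variables in order.
    """
    results = []
    for row in graph.query(query['SPARQL']):
        # Variables left unbound (e.g. by an OPTIONAL block) come back as
        # None; everything else is converted to a plain string.
        values = [None if value is None else str(value) for value in row]
        results.append(dict(zip(query['fields'], values)))
    return results
```

With the raw RDF/XML response text in hand, calling run_query(load_graph(rdf_text), CATEGORY_QUERY) would give one dictionary per category row. Converting every value to a string is a simplification; keeping the rdflib terms would preserve the distinction between URIs and literals.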

