mulgara - semantic store

Content Handlers

Content handlers work in conjunction with resolvers: they perform the actual conversion of the data in a file into RDF triples for the resolver to constrain. All content handlers use an implementation of the Statements interface to hold the triples extracted from the file and to allow navigation of the results.

While the ContentHandler interface is relatively simple, there are important decisions to be made before implementing it.

 

First, you need to determine the purpose of the content handler. The issue of protocols is dealt with by the Resolver classes, so you can create individual content handlers for each file type without having to worry about which protocol you are using to connect to it. In this case the content handler is for MP3 files.

 

Second, you need to decide how the triples are transferred to the Statements object. The MP3 content handler parses directly into an MP3-specific statements container. However, it is also possible to parse the triples first and then feed them into the Statements object.

 

Finally, in addition to creating the statements, there is the issue of parsing the file. The MP3 content handler uses an ID3 tag parsing utility that is called from the Statements implementation. Whether this is done as part of the content handler or as a separate utility, and whether the statements are generated as a result of parsing or are fed the triples, is up to the implementer.
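The two ways of getting triples into a statements container can be sketched as follows. This is only an illustration under stated assumptions: SketchTriple, SketchTagParser, SelfParsingStatements, FedStatements and TransferSketch are hypothetical stand-ins, not Mulgara classes, and the "key=value;key=value" tag format is invented so that the example stays self-contained. The real Statements interface and the ID3 parsing utility are richer than this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one RDF triple; not a Mulgara class.
final class SketchTriple {
  final String subject;
  final String predicate;
  final String object;

  SketchTriple(String subject, String predicate, String object) {
    this.subject = subject;
    this.predicate = predicate;
    this.object = object;
  }
}

// Hypothetical stand-in for a tag parser. It reads "key=value;key=value"
// text instead of real ID3 tags.
final class SketchTagParser {
  static Map<String, String> parse(String rawTags) {
    Map<String, String> tags = new LinkedHashMap<String, String>();
    for (String pair : rawTags.split(";")) {
      int eq = pair.indexOf('=');
      if (eq > 0) {
        tags.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
      }
    }
    return tags;
  }
}

// Option 1: the container calls the parser itself, the approach the text
// describes for the MP3 content handler.
final class SelfParsingStatements {
  private final List<SketchTriple> triples = new ArrayList<SketchTriple>();

  SelfParsingStatements(String uri, String rawTags) {
    for (Map.Entry<String, String> tag : SketchTagParser.parse(rawTags).entrySet()) {
      triples.add(new SketchTriple(uri, "tag:" + tag.getKey(), tag.getValue()));
    }
  }

  List<SketchTriple> triples() {
    return triples;
  }
}

// Option 2: a generic container that is fed triples parsed elsewhere.
final class FedStatements {
  private final List<SketchTriple> triples = new ArrayList<SketchTriple>();

  void add(SketchTriple triple) {
    triples.add(triple);
  }

  List<SketchTriple> triples() {
    return triples;
  }
}

final class TransferSketch {
  // Parse first, then feed the resulting triples into the container.
  static FedStatements parseThenFeed(String uri, String rawTags) {
    FedStatements statements = new FedStatements();
    for (Map.Entry<String, String> tag : SketchTagParser.parse(rawTags).entrySet()) {
      statements.add(new SketchTriple(uri, "tag:" + tag.getKey(), tag.getValue()));
    }
    return statements;
  }

  public static void main(String[] args) {
    String uri = "file:/music/song.mp3";
    String raw = "title=Example;artist=Someone";
    System.out.println(new SelfParsingStatements(uri, raw).triples().size());
    System.out.println(parseThenFeed(uri, raw).triples().size());
  }
}
```

Both options produce the same triples. The first keeps the handler itself very small, while the second lets one container type be reused by several parsers.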

 
Configuration and Initialisation

Depending on the design choices made (see the Creating the Content Handler section), configuration might not be done in the ContentHandler class. If the parser requires pre-configuration, this can be done in the implementation; the ContentHandler class itself does not require any configuration.

 
Implementing the Interface

Once the usage and structure of the handler are settled, the interface can be implemented. The MP3 content handler performs its parsing as part of the statements container, so little code is required in the handler itself. The implementation looks something like the following (extracted from MP3ContentHandler.java):

package org.mulgara.content.mp3;

// Java 2 standard packages
import java.io.InputStream;
import java.net.URI;
import java.util.Map;

// Java 2 enterprise packages
import javax.activation.MimeType;
import javax.activation.MimeTypeParseException;

// Third party packages
import org.apache.log4j.Logger; // Apache Log4J

// Local packages
import org.mulgara.content.Content;
import org.mulgara.content.ContentHandler;
import org.mulgara.content.ContentHandlerException;
import org.mulgara.resolver.spi.ResolverSession;
import org.mulgara.resolver.spi.Statements;
import org.mulgara.query.TuplesException;

public class MP3ContentHandler implements ContentHandler {

  /** Logger. */
  private static Logger logger =
      Logger.getLogger(MP3ContentHandler.class.getName());

  /**
   * The MIME type of MP3 audio (audio/mpeg).
   */
  private static final MimeType AUDIO_MPEG;

  static {
    try {
      AUDIO_MPEG = new MimeType("audio", "mpeg");
    }
    catch (MimeTypeParseException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  /**
   * Parses the ID3 tags of the MP3 file pointed to by the content object which
   * are then converted to a statements object.
   *
   * @param content The actual content we are going to be parsing
   * @param resolverSession The session in which this resolver is being used
   *
   * @return The parsed statements object
   *
   * @throws ContentHandlerException if the statements object cannot be created
   */
  public Statements parse(Content content, ResolverSession resolverSession) throws
      ContentHandlerException {

    // Container for our statements
    MP3Statements statements = null;

    try {

      // Attempt to create the MP3 statements
      statements = new MP3Statements(content, resolverSession);
    } catch (TuplesException tuplesException) {

      throw new ContentHandlerException("Unable to create statements object from " +
                                        "content object: " + content.getURI().toString(),
                                        tuplesException);
    }

    return statements;
  }

  /**
   * @return true if the content has the audio/mpeg MIME type or, failing
   * that, the path part of its URI has an .mp3 extension
   */
  public boolean canParse(Content content)
  {
    // Prefer the MIME type when it is available
    MimeType contentType = content.getContentType();
    if (contentType != null && AUDIO_MPEG.match(contentType)) {
      return true;
    }

    if (content.getURI() == null)
    {
      return false;
    }

    // Obtain the path part of the URI
    String path = content.getURI().getPath();
    if (path == null) {
      return false;
    }

    // We recognize a fixed extension
    return path.endsWith(".mp3");
  }

}

An analysis of the class is as follows:

package org.mulgara.content.mp3;

// Java 2 standard packages
import java.io.InputStream;
import java.net.URI;
import java.util.Map;

// Java 2 enterprise packages
import javax.activation.MimeType;
import javax.activation.MimeTypeParseException;

// Third party packages
import org.apache.log4j.Logger; // Apache Log4J

// Local packages
import org.mulgara.content.Content;
import org.mulgara.content.ContentHandler;
import org.mulgara.content.ContentHandlerException;
import org.mulgara.resolver.spi.ResolverSession;
import org.mulgara.resolver.spi.Statements;
import org.mulgara.query.TuplesException;

There are no specific requirements for the packaging of the implementation, but it is recommended that related classes be kept in the same package to simplify the implementation. For the interface, you also need to import the classes that its method signatures refer to. In the listing above these are Content, ContentHandler, ContentHandlerException, ResolverSession and Statements.

In most cases the javax.activation.MimeType class also needs to be imported for proper MIME type handling. Any supporting classes for the implementation should also be imported.

public class MP3ContentHandler implements ContentHandler {

All content handlers must implement the ContentHandler interface unless they are extending an existing implementation, in which case the superclass should handle the implementation. Any extra interfaces or extensions are valid.
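The extension case can be illustrated with a small sketch. SketchContentHandler, ExtensionContentHandler and Mp3SketchHandler are hypothetical, simplified stand-ins that work on path strings; the real ContentHandler interface works on Content and Statements objects. The point is only that a subclass of an existing implementation inherits the interface and may add further interfaces of its own.

```java
// Hypothetical, simplified stand-in for a handler interface; not the
// Mulgara ContentHandler interface.
interface SketchContentHandler {
  boolean canParse(String path);
  String parse(String path);
}

// An existing implementation that already satisfies the interface.
class ExtensionContentHandler implements SketchContentHandler {
  private final String extension;

  ExtensionContentHandler(String extension) {
    this.extension = extension;
  }

  public boolean canParse(String path) {
    return path != null && path.endsWith(extension);
  }

  public String parse(String path) {
    return "statements for " + path;
  }
}

// Extends the existing implementation, so it inherits the interface from
// its superclass. The extra Runnable interface is only there to show that
// additional interfaces are allowed.
class Mp3SketchHandler extends ExtensionContentHandler implements Runnable {
  Mp3SketchHandler() {
    super(".mp3");
  }

  public void run() {
    System.out.println(parse("/music/song.mp3"));
  }
}

final class ExtendingSketch {
  public static void main(String[] args) {
    SketchContentHandler handler = new Mp3SketchHandler();
    System.out.println(handler.canParse("/music/song.mp3"));
    System.out.println(handler.canParse("/docs/readme.txt"));
  }
}
```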

  /**
   * The MIME type of MP3 audio (audio/mpeg).
   */
  private static final MimeType AUDIO_MPEG;

  static {
    try {
      AUDIO_MPEG = new MimeType("audio", "mpeg");
    }
    catch (MimeTypeParseException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

Content handlers are written to handle specific content types, and most often these have a MIME type associated with them that can be used to determine whether the handler is able to parse the content. Although not strictly necessary, it is preferable to set up the MIME type in a static initialization block for the class, creating a constant that can be used in the canParse() method.

  /**
   * Parses the ID3 tags of the MP3 file pointed to by the content object which
   * are then converted to a statements object.
   *
   * @param content The actual content we are going to be parsing
   * @param resolverSession The session in which this resolver is being used
   *
   * @return The parsed statements object
   *
   * @throws ContentHandlerException
   */
  public Statements parse(Content content, ResolverSession resolverSession) throws
      ContentHandlerException {

    // Container for our statements
    MP3Statements statements = null;

    try {

      // Attempt to create the MP3 statements
      statements = new MP3Statements(content, resolverSession);
    } catch (TuplesException tuplesException) {

      throw new ContentHandlerException("Unable to create statements object from " +
                                        "content object: " + content.getURI().toString(),
                                        tuplesException);
    }

    return statements;
  }

The purpose of the parse(Content, ResolverSession) method is to convert the resource pointed to by the Content object into a series of triples inside a Statements object. This means there are two parts to consider: the parsing of the resource and the conversion of the results into statements. It is possible to perform both operations in the single method, but in the MP3 implementation the content is parsed directly into the Statements object. The result is that you only need to create an MP3Statements object, which handles both the parsing and the setting up of the statements. See the Creating the Statements section for more information.

  /**
   * @return true if the content has the audio/mpeg MIME type or, failing
   * that, the path part of its URI has an .mp3 extension
   */
  public boolean canParse(Content content)
  {
    // Prefer the MIME type when it is available
    MimeType contentType = content.getContentType();
    if (contentType != null && AUDIO_MPEG.match(contentType)) {
      return true;
    }

    if (content.getURI() == null)
    {
      return false;
    }

    // Obtain the path part of the URI
    String path = content.getURI().getPath();
    if (path == null) {
      return false;
    }

    // We recognize a fixed extension
    return path.endsWith(".mp3");
  }

Before a Content object is sent to a handler, the list of registered handlers is searched to find one that can parse it into statements. To find out which content handlers support which content, the resolver uses the canParse(Content) method. If the method returns true, the content is passed to the parse(Content, ResolverSession) method of that handler. One of the first checks should be whether the content has the correct MIME type. Sometimes the MIME type is unavailable (for example, with a file protocol resolver), so you should also check the URI and its extension. The method should return true if the handler supports the content and false otherwise.
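The search through the registered handlers can be sketched roughly as follows. This is an assumption about the general shape of such a lookup, not the actual Mulgara resolver code: LookupContent, LookupHandler, the two handler classes and HandlerLookup are hypothetical stand-ins, and MIME types are plain strings here to keep the example free of external dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Content object: a MIME type and a URI path,
// either of which may be null.
final class LookupContent {
  final String mimeType;
  final String path;

  LookupContent(String mimeType, String path) {
    this.mimeType = mimeType;
    this.path = path;
  }
}

// Hypothetical, simplified handler interface; not the Mulgara interface.
interface LookupHandler {
  boolean canParse(LookupContent content);
  String name();
}

final class Mp3LookupHandler implements LookupHandler {
  public boolean canParse(LookupContent content) {
    // Check the MIME type first, then fall back to the file extension.
    if ("audio/mpeg".equals(content.mimeType)) {
      return true;
    }
    return content.path != null && content.path.endsWith(".mp3");
  }

  public String name() {
    return "mp3";
  }
}

final class RdfXmlLookupHandler implements LookupHandler {
  public boolean canParse(LookupContent content) {
    if ("application/rdf+xml".equals(content.mimeType)) {
      return true;
    }
    return content.path != null && content.path.endsWith(".rdf");
  }

  public String name() {
    return "rdfxml";
  }
}

final class HandlerLookup {
  private final List<LookupHandler> handlers = new ArrayList<LookupHandler>();

  void register(LookupHandler handler) {
    handlers.add(handler);
  }

  // Returns the first registered handler that accepts the content,
  // or null when no handler supports it.
  LookupHandler find(LookupContent content) {
    for (LookupHandler handler : handlers) {
      if (handler.canParse(content)) {
        return handler;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    HandlerLookup lookup = new HandlerLookup();
    lookup.register(new RdfXmlLookupHandler());
    lookup.register(new Mp3LookupHandler());

    // No MIME type is available, so the extension decides.
    LookupHandler found = lookup.find(new LookupContent(null, "/music/song.mp3"));
    System.out.println(found == null ? "none" : found.name());
  }
}
```

In this sketch the first matching handler wins, so registration order matters when two handlers accept the same content.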
