mulgara - semantic store


Mulgara Semantic Store

Data in electronic form is flourishing. It is, in fact, growing at a rate that makes it hard to manage. Organizations often have so much information in electronic form that it can be hard to find, access, share and reuse. Mulgara is an important part of a solution to this problem.

Metadata is information about data. For example, metadata for a word-processing document or an electronic mail message might include the author, the recipients, the subject, keywords, concepts addressed, people named, dates or places mentioned. Mulgara stores this metadata and creates relationships between it.

Mulgara implements many of the World Wide Web Consortium's Semantic Web concepts (http://www.w3.org/RDF/, http://www.w3c.org/2001/sw). Mulgara databases hold metadata in the form of short subject-predicate-object statements, much like the W3C's Resource Description Framework (RDF) standard. In fact, metadata may be imported into Mulgara in RDF form.
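To make the subject-predicate-object shape concrete, here is a minimal Java sketch. The class names, the in-memory list and the example URIs are all hypothetical and chosen for illustration only; they are not Mulgara's API or storage format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a subject-predicate-object statement.
// None of these names belong to Mulgara's API.
final class Statement {
    final String subject;
    final String predicate;
    final String object;

    Statement(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class StatementDemo {
    // Returns the object of every statement that matches the given
    // subject and predicate, in insertion order.
    static List<String> objectsOf(List<Statement> store, String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (Statement s : store) {
            if (s.subject.equals(subject) && s.predicate.equals(predicate)) {
                result.add(s.object);
            }
        }
        return result;
    }
}
```

With statements such as (`doc:report1`, `dc:creator`, `Alice`) in the list, asking for the `dc:creator` of `doc:report1` returns the author names recorded for that document.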

Using iTQL™ (Interactive Tucana Query Language™) commands, you can query Mulgara databases and receive results that match the query. iTQL is similar to the Structured Query Language (SQL) used to query relational databases, with some significant differences due to the way data is stored in Mulgara. Like relational databases, Mulgara can be used as an underlying data repository for software applications.

 

Overview

The features and benefits of Mulgara™ are outlined in the following sections.

In This Section

General

Performance and Scalability

Reliability

Connectivity

Manageability

Cross OS/Platform Support

Scalability

 

General

 

Performance and Scalability

More information is available in the Scalability section below.

 

Reliability

 

Connectivity

 

Manageability

 

Cross OS/Platform Support

 

Scalability

The storage engine of Mulgara™ is a transactional triplestore known as the XA Triplestore. Much of the scalability of Mulgara is due to the following features of the XA Triplestore.

 

64-bit Data Structures

All relevant fields of in-memory and on-disk data structures are 64 bits wide, thus ensuring that Mulgara™ can store very large amounts of data up to the limits imposed by the host operating system.
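The effect of field width can be shown with a little arithmetic. The sketch below assumes a hypothetical record of three 64-bit identifiers (24 bytes); that layout is an assumption for illustration and not a description of the XA Triplestore's actual format. It compares a 64-bit byte offset with the same calculation done in 32 bits.

```java
import java.nio.ByteBuffer;

// Hypothetical layout for illustration; not Mulgara's actual on-disk format.
final class WideOffsets {
    // Assumed record: subject, predicate and object, each a 64-bit identifier.
    static final int RECORD_BYTES = 3 * Long.BYTES;

    // 64-bit arithmetic: the byte offset stays correct for very large indexes.
    static long offsetOf(long recordIndex) {
        return recordIndex * RECORD_BYTES;
    }

    // 32-bit arithmetic: the same calculation wraps around once the
    // result exceeds Integer.MAX_VALUE (about 2 GB).
    static int narrowOffsetOf(int recordIndex) {
        return recordIndex * RECORD_BYTES;
    }

    // Packs one record into a buffer that is ready for reading.
    static ByteBuffer encode(long subject, long predicate, long object) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES);
        buf.putLong(subject).putLong(predicate).putLong(object);
        buf.flip();
        return buf;
    }
}
```

For record index 100,000,000 the 64-bit offset is 2,400,000,000 bytes, while the 32-bit version overflows to a negative number.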

 

Multiple Sessions with no Lock Contention

A single writing session and multiple reading sessions can access the triplestore concurrently, and the reading sessions are not required to hold a global lock while processing a query. This avoids lock contention during query processing. In general, each session executes in its own thread. As a result, the maximum number of active reading sessions is limited only by the concurrency of the host operating system and I/O subsystem.

When a session initiates a query, which may involve multiple requests to the triplestore, it first takes a snapshot of the entire database. This ensures that all requests to the triplestore during the processing of the query see the database in a consistent state.

The triplestore is designed such that obtaining a snapshot is a very quick operation and does not cause any I/O to be performed. It should take less than a millisecond on current hardware, regardless of the size of the database.

The session must hold a global lock only during this brief period while it obtains the snapshot. Once the snapshot is obtained, no further locking is required regardless of the number of triplestore operations that must be performed or the amount of time required to execute the query.
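The snapshot protocol described above can be sketched in Java as follows. This is a simplified in-memory model under stated assumptions: the class and field names are hypothetical, the "database" is a plain array, and the real triplestore uses on-disk structures. The point it illustrates is that taking a snapshot is only a reference read under a briefly held lock, after which the reader works without any locking.

```java
import java.util.concurrent.locks.ReentrantLock;

// Simplified, hypothetical model of snapshot acquisition; not Mulgara code.
final class SnapshotStore {
    // An immutable view of the store at one point in time.
    static final class Snapshot {
        final long phase;      // increases by one with every commit
        final long[] triples;  // never modified after construction

        Snapshot(long phase, long[] triples) {
            this.phase = phase;
            this.triples = triples;
        }
    }

    private final ReentrantLock snapshotLock = new ReentrantLock();
    private Snapshot current = new Snapshot(0, new long[0]);

    // The lock is held only while the reference is read: no copying, no I/O.
    Snapshot acquire() {
        snapshotLock.lock();
        try {
            return current;
        } finally {
            snapshotLock.unlock();
        }
    }

    // The writer prepares the new version outside the lock and publishes it
    // with a single reference swap, so existing snapshots are unaffected.
    void commit(long[] newTriples) {
        long[] copy = newTriples.clone();
        snapshotLock.lock();
        try {
            current = new Snapshot(current.phase + 1, copy);
        } finally {
            snapshotLock.unlock();
        }
    }
}
```

A reader that called `acquire()` before a commit keeps seeing the old array for as long as it holds the reference, which is what gives every request within one query a consistent view.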

The existence of a snapshot does not by itself cause any additional storage to be consumed but it will cause any modifications to use copy-on-write semantics. The on-disk data structures of the triplestore are designed to minimize the amount of copying required to perform a modification thus improving performance while also maximizing the amount of storage shared between snapshots.
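One common way to get copy-on-write behaviour with little copying is path copying in a tree: a modification copies only the nodes between the root and the change, and every other subtree is shared with older versions. The sketch below uses a plain in-memory binary search tree with hypothetical names. It illustrates the general technique and is not a description of the XA Triplestore's actual on-disk structures.

```java
// Path-copying binary search tree; a hypothetical illustration only.
final class CowTree {
    static final class Node {
        final long key;
        final Node left;
        final Node right;

        Node(long key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the root of a tree that contains the key. Nodes on the search
    // path are copied; subtrees off the path are shared with the old tree.
    static Node insert(Node root, long key) {
        if (root == null) {
            return new Node(key, null, null);
        }
        if (key < root.key) {
            return new Node(root.key, insert(root.left, key), root.right);
        }
        if (key > root.key) {
            return new Node(root.key, root.left, insert(root.right, key));
        }
        return root; // key already present at this node
    }

    static boolean contains(Node root, long key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) {
                return true;
            }
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }
}
```

After inserting a key into the left side of a tree, the old root still describes the old contents, and the right subtree of the new root is the very same object as the right subtree of the old one.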

A snapshot is released once the query processing is complete. Any disk storage used by the snapshot and not shared with any other snapshot is immediately available for reuse. Releasing a snapshot is just as quick as obtaining a snapshot but the session does not even need to hold the global lock during this operation.

A separate global lock (the write lock) is used to ensure that there is only one writing session at any given time. The write lock is released after the writer either commits or rolls back the current transaction.
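The single-writer rule can be sketched with a one-permit semaphore. Everything here is a hypothetical, heavily simplified model: the class name, the single `long` standing in for the database contents, and the method names are assumptions for illustration and not Mulgara's API.

```java
import java.util.concurrent.Semaphore;

// Hypothetical model of a write lock held for the span of one transaction.
final class WriteLockDemo {
    private final Semaphore writeLock = new Semaphore(1);
    private volatile long committedValue = 0; // stands in for committed data
    private long pendingValue = 0;            // visible to the writer only

    // Returns false when another writing session is already active.
    boolean tryBeginWrite() {
        if (!writeLock.tryAcquire()) {
            return false;
        }
        pendingValue = committedValue;
        return true;
    }

    // Must be called only between tryBeginWrite() and commit()/rollback().
    void set(long value) {
        pendingValue = value;
    }

    // Publishes the pending change and releases the write lock.
    void commit() {
        committedValue = pendingValue;
        writeLock.release();
    }

    // Discards the pending change and releases the write lock.
    void rollback() {
        pendingValue = committedValue;
        writeLock.release();
    }

    long committed() {
        return committedValue;
    }
}
```

A semaphore is used here rather than a reentrant lock so that a second attempt to start writing fails even when it comes from the same thread.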

 

On-Line Backups

The XA Triplestore allows modifications and queries to proceed concurrently with a backup operation. The session performing the backup acquires a snapshot of the entire database, just as it would if it were performing a query.

 

Permanent Integrity

System crashes caused by power failures and some types of hardware fault will not cause data corruption.

The on-disk data structures of the triplestore are designed to be kept in a consistent state at all times while minimizing the overhead required to achieve this. Disk writes during a write transaction are unordered thus preserving good write performance. Write ordering is imposed only during a commit operation.
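The idea of imposing write ordering only at commit time can be sketched with `FileChannel`. The file layout below (an 8-byte header that points at the current root, followed by data blocks) is an assumption made for this example, not the XA Triplestore's real format. Data blocks are written in any order; at commit, one `force` makes them durable before the header is updated, and a second `force` makes the header durable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical layout: 8-byte header (root offset) followed by data blocks.
final class OrderedCommit {
    static final int HEADER_SIZE = 8;

    // Writes a data block at the given offset; no ordering is imposed here.
    static void writeBlock(FileChannel ch, long offset, byte[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            offset += ch.write(buf, offset);
        }
    }

    // Ordering is imposed only here: data first, then the header.
    static void commit(FileChannel ch, long newRootOffset) throws IOException {
        ch.force(false); // all data blocks reach the device first
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        header.putLong(newRootOffset);
        header.flip();
        while (header.hasRemaining()) {
            ch.write(header, header.position());
        }
        ch.force(false); // the header now points at durable data
    }

    static long readRoot(FileChannel ch) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
        while (header.hasRemaining()) {
            if (ch.read(header, header.position()) < 0) {
                throw new IOException("header is incomplete");
            }
        }
        header.flip();
        return header.getLong();
    }

    // Small end-to-end run on a temporary file; returns the committed root.
    static long demo() {
        try {
            Path file = Files.createTempFile("ordered-commit", ".bin");
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                writeBlock(ch, 16, new byte[] {1, 2, 3, 4});
                commit(ch, 16);
                return readRoot(ch);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If a crash happens before the header write completes, the old header still points at the old, intact data, which is the property the paragraph above relies on.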

 

Use of Java NIO

The XA Triplestore uses the Java™ NIO (new I/O) API, which was introduced in Java 2 SDK Version 1.4. The NIO API provides access to advanced I/O facilities that were previously available only to native C programs. The use of NIO allows the XA Triplestore to provide transactions, permanent integrity and good performance while still remaining a pure Java implementation.
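As a rough illustration of the kind of facility NIO exposes, the sketch below maps a file into memory, writes a 64-bit value through the mapping, forces it to the storage device, and reads it back through the channel. Which NIO features the XA Triplestore actually uses is not stated here, so the choice of memory-mapped files and all names in the example are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demonstration of NIO memory mapping; not Mulgara code.
final class NioSketch {
    // Writes a value through a memory mapping and reads it back via the channel.
    static long roundTrip(long value) {
        try {
            Path file = Files.createTempFile("nio-sketch", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping in READ_WRITE mode grows the file to 8 bytes.
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
                map.putLong(0, value);
                map.force(); // push the mapped change to the storage device

                ByteBuffer check = ByteBuffer.allocate(Long.BYTES);
                while (check.hasRemaining()) {
                    if (ch.read(check, check.position()) < 0) {
                        throw new IOException("file is shorter than expected");
                    }
                }
                check.flip();
                return check.getLong();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```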

Some of the features of NIO that are used by the triplestore include:
