select
Issues queries to Mulgara servers and displays the corresponding results. The command consists of a sequence of select, from, and where clauses, and may optionally include order by, offset and limit clauses. With all clauses present, the general syntax is as follows:
select columns from models where constraints order by variables limit count offset count;
The purpose of the select command is to find values for some set of variables that satisfy the specified constraints. All variable names start with a dollar sign ($), for example $x or $title.
select Clause
Specifies the variables to solve for and their order in the result. For example:
select $title $author $date ...
Constant resource or literal values may be part of a select clause. In these cases, dummy variable names ($k0, $k1, $k2, … $kn) are created for the constant values.
The following example returns three columns: $k0, $x and $k1, where the values of $k0 and $k1 for every solution are the literal value foo and the resource value http://www.site.domain.net respectively.
select 'foo' $x <http://www.site.domain.net> ...
from Clause
Specifies the model to query. For example:
... from <rmi://mysite.com/server1#model1> ...
Because models are sets of statements, it is logical to compose them using set operations. The from clause permits set union using the or operator and set intersection using the and operator, with parentheses used to control association.
The following example queries only the statements appearing in all three models.
... from <rmi://mysite.com/server1#model1> and <rmi://mysite.com/server1#model2>
and <rmi://mysite.com/server1#model3> ...
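The set semantics of the from clause can be illustrated outside Mulgara. The following is a minimal Python sketch, not Mulgara code: the three models and their statements are hypothetical stand-ins, and each model is represented as a plain set of (subject, predicate, object) tuples.

```python
# Hypothetical stand-ins for three models; each is a set of statements.
model1 = {("doc1", "dc:title", "A"), ("doc2", "dc:title", "B"), ("doc3", "dc:title", "C")}
model2 = {("doc1", "dc:title", "A"), ("doc2", "dc:title", "B")}
model3 = {("doc1", "dc:title", "A"), ("doc4", "dc:title", "D")}

# "model1 and model2 and model3": only statements present in all three models.
in_all = model1 & model2 & model3

# "model1 or model2": statements present in either model.
in_either = model1 | model2

# Parentheses control association: "(model1 or model3) and model2".
grouped = (model1 | model3) & model2

print(sorted(in_all))
print(sorted(grouped))
```

A query is then resolved against the composed set of statements rather than against any single model.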
where Clause
The where clause is usually the largest and most detailed clause of the select command. It specifies the constraints that must be satisfied by the variable values in each solution. A constraint is a sequence of subject, predicate and object that represents an RDF statement. Each of the three positions is either a constant value (a resource or a literal) or a variable. The Mulgara server finds values for any variables such that the resulting statement is present in the model that was specified in the preceding from clause.
For example, the following where clause requires that, in every solution, the value of $title is the title of the resource referred to by $document.
... where $document <dc:title> $title ...
Constraints may be composed using and and or operations, with parentheses to control association.
The following example returns every document with a title, an author and a subject of either botany or zoology. Documents without a known title or author are not returned.
... where $document <dc:title> $title and $document <dc:creator> $author
and ($document <dc:subject> 'botany' or $document <dc:subject> 'zoology') ...
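To make the composition of constraints concrete, here is a small Python sketch of how such a where clause could be resolved. It is an illustration under stated assumptions, not Mulgara's implementation: the helper names match and conj and the sample statements are hypothetical, and prefixed names such as dc:title are treated as opaque strings.

```python
def match(statements, pattern):
    """Return one binding dict per statement matching the pattern.
    Terms starting with '$' are variables; all others are constants."""
    results = []
    for stmt in statements:
        binding = {}
        for term, value in zip(pattern, stmt):
            if term.startswith("$"):
                # A repeated variable must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results

def conj(left, right):
    """'and': merge pairs of bindings that agree on shared variables."""
    return [{**a, **b} for a in left for b in right
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

# Hypothetical data: doc3 has no title, doc4 has the wrong subject.
statements = [
    ("doc1", "dc:title", "Plants"), ("doc1", "dc:creator", "Ann"), ("doc1", "dc:subject", "botany"),
    ("doc2", "dc:title", "Animals"), ("doc2", "dc:creator", "Bob"), ("doc2", "dc:subject", "zoology"),
    ("doc3", "dc:creator", "Cy"), ("doc3", "dc:subject", "botany"),
    ("doc4", "dc:title", "Rocks"), ("doc4", "dc:creator", "Di"), ("doc4", "dc:subject", "geology"),
]

# 'or' is modelled as the union of the two sets of bindings.
subjects = (match(statements, ("$document", "dc:subject", "botany"))
            + match(statements, ("$document", "dc:subject", "zoology")))
solutions = conj(conj(match(statements, ("$document", "dc:title", "$title")),
                      match(statements, ("$document", "dc:creator", "$author"))),
                 subjects)
documents = sorted(s["$document"] for s in solutions)
print(documents)
```

Only doc1 and doc2 survive, because every constraint joined with and must hold for the same value of $document.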
in Specifier
Constraints may optionally contain an in specifier to resolve the constraint against the statements in a specified model, rather than the one specified in the from clause.
The in specifier may be used to specify views as well as models.
The following example constrains the result by titles in the #books model rather than the one specified in the from clause.
... where $document <dc:title> $title in <rmi://mysite.com/server1#books> ...
Assigning a Value to a Variable (mulgara:is)
Constraints may also assign a value to a variable rather than querying for it. The special property http://mulgara.org/mulgara#is (or the aliased form <mulgara:is>) can be used with a variable as its subject and a non-variable value as its object. This assigns the non-variable value to the variable.
The following example results in every document authored by Mendel about genetics, or by Mendeleev about chemistry.
... where $document <dc:title> $title and $document <dc:creator> $author
and $document <dc:subject> $subject
and (($author <mulgara:is> 'Mendel' and $subject <mulgara:is> 'genetics' )
or ($author <mulgara:is> 'Mendeleev' and $subject <mulgara:is> 'chemistry')) ...
Traversing a Graph (walk Function)
Traversing a graph allows a query to return values, based on a predicate, by following up or down a hierarchy of statements. In a schema language such as RDFS, these hierarchies are expressed as a sub-class or sub-property predicate. Traversing a graph is performed with the walk function within a where clause.
The syntax of the walk function is either:
walk ($subject_variable <predicate_URI> <object_URI> and
$subject_variable <predicate_URI> $object_variable)
or
walk (<subject_URI> <predicate_URI> $object_variable and
$subject_variable <predicate_URI> $object_variable)
The walk function must be bound to a select clause using the same triple pattern that matches the second parameter. For example:
select $subject <predicate_URI> $object
...
where walk ($subject <predicate_URI> <object_URI> and
$subject <predicate_URI> $object);
An example of walk is demonstrated using the following statements:
[ ( <kangaroos>, <rdfs:subClassOf>, <marsupials> )
( <marsupials>, <rdfs:subClassOf>, <mammals> )
( <placental-mammals>, <rdfs:subClassOf>, <mammals> )
( <mammals>, <rdfs:subClassOf>, <vertebrates> ) ]
To query a set of statements in the hierarchy ending with <vertebrates> as an object:
select $subject <rdfs:subClassOf> $object
...
where walk($subject <rdfs:subClassOf> <vertebrates>
and $subject <rdfs:subClassOf> $object);
Working from the bottom up, the system:
- Matches ( <mammals>, <rdfs:subClassOf>, <vertebrates> ) and then substitutes <mammals> for <vertebrates> in the constraints.
- Attempts to match for the triples ( *, <rdfs:subClassOf>, <mammals> ).
- Then matches for ( <marsupials>, <rdfs:subClassOf>, <mammals> ) and ( <placental-mammals>, <rdfs:subClassOf>, <mammals> ).
- Then matches for ( *, <rdfs:subClassOf>, <marsupials> ) and ( *, <rdfs:subClassOf>, <placental-mammals> ) and so on.
The result of the query is:
[ ( <mammals>, <rdfs:subClassOf>, <vertebrates> )
( <placental-mammals>, <rdfs:subClassOf>, <mammals> )
( <marsupials>, <rdfs:subClassOf>, <mammals> )
( <kangaroos>, <rdfs:subClassOf>, <marsupials> ) ]
You can also traverse down the graph following the hierarchy. For example:
select $subject <rdfs:subClassOf> $object
...
where walk(<kangaroos> <rdfs:subClassOf> $object
and $subject <rdfs:subClassOf> $object);
This returns:
[ ( <kangaroos>, <rdfs:subClassOf>, <marsupials> )
( <marsupials>, <rdfs:subClassOf>, <mammals> )
( <mammals>, <rdfs:subClassOf>, <vertebrates> ) ]
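The two traversal directions can be sketched in Python as a breadth-first search over the statements. This is an illustration of the behaviour described above, assuming the four example statements; the function name walk and its keyword parameters are hypothetical and do not reflect how Mulgara implements the function.

```python
def walk(statements, predicate, *, to_object=None, from_subject=None):
    """Follow `predicate` repeatedly, either towards a fixed object
    (to_object) or away from a fixed subject (from_subject).
    Exactly one of the two anchors should be given."""
    upward = to_object is not None
    frontier = [to_object if upward else from_subject]
    seen, found = set(), []
    while frontier:
        node = frontier.pop(0)
        if node in seen:
            continue
        seen.add(node)
        for s, p, o in statements:
            if p != predicate:
                continue
            if upward and o == node:
                found.append((s, p, o))
                frontier.append(s)   # substitute the subject for the object
            elif not upward and s == node:
                found.append((s, p, o))
                frontier.append(o)   # substitute the object for the subject
    return found

statements = [
    ("kangaroos", "rdfs:subClassOf", "marsupials"),
    ("marsupials", "rdfs:subClassOf", "mammals"),
    ("placental-mammals", "rdfs:subClassOf", "mammals"),
    ("mammals", "rdfs:subClassOf", "vertebrates"),
]

to_vertebrates = walk(statements, "rdfs:subClassOf", to_object="vertebrates")
from_kangaroos = walk(statements, "rdfs:subClassOf", from_subject="kangaroos")
print(len(to_vertebrates), len(from_kangaroos))
```

The first call collects all four statements and the second collects the three on the path from kangaroos to vertebrates. The order of rows in this sketch depends on the order of the input list and is not meant to mirror the order Mulgara returns.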
Transitive Closure (trans Function)
Transitive closure provides the ability to express a function that generates new statements. Normally, transitive closure produces both existing and new statements. The trans function in iTQL produces only new statements, that is, statements that did not exist in the model before it was executed. The trans function can be further constrained by limiting which statements are inferred and by giving it a starting or termination point.
The simplest form of the trans function defines a predicate to operate on:
select $subject <rdfs:subClassOf> $object
...
where trans($subject <rdfs:subClassOf> $object);
This generates a new statement, $x <rdfs:subClassOf> $z, when it finds two statements that match the pattern $x <rdfs:subClassOf> $y and $y <rdfs:subClassOf> $z.
For example, consider the following set of statements:
[ (<mammals>, <rdfs:subClassOf>, <vertebrates>)
(<eats-leaves>, <rdfs:subPropertyOf>, <herbivore>)
(<marsupials>, <rdfs:subClassOf>, <mammals>)
(<placental-mammals>, <rdfs:subClassOf>, <mammals>)
(<elephants>, <rdfs:subClassOf>, <placental-mammals>)
(<kangaroos>, <rdfs:subClassOf>, <marsupials>)
(<red-kangaroos>, <rdfs:subClassOf>, <kangaroos>) ]
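As a tree, the <rdfs:subClassOf> statements look as follows (each class is indented beneath its superclass):
<vertebrates>
  <mammals>
    <marsupials>
      <kangaroos>
        <red-kangaroos>
    <placental-mammals>
      <elephants>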
Based on the constraint in the query, the first and third statements match the pattern: (<marsupials>, <rdfs:subClassOf>, <mammals>) and (<mammals>, <rdfs:subClassOf>, <vertebrates>). Therefore, the function generates the statement (<marsupials>, <rdfs:subClassOf>, <vertebrates>).
Using the query across the existing set of statements produces the following new set of statements:
[ (<marsupials>, <rdfs:subClassOf>, <vertebrates>)
(<kangaroos>, <rdfs:subClassOf>, <vertebrates>)
(<red-kangaroos>, <rdfs:subClassOf>, <vertebrates>)
(<placental-mammals>, <rdfs:subClassOf>, <vertebrates>)
(<elephants>, <rdfs:subClassOf>, <vertebrates>)
(<kangaroos>, <rdfs:subClassOf>, <mammals>)
(<red-kangaroos>, <rdfs:subClassOf>, <mammals>)
(<elephants>, <rdfs:subClassOf>, <mammals>)
(<red-kangaroos>, <rdfs:subClassOf>, <marsupials>) ]
To provide the results expected from a transitive closure function you would union together these newly generated statements with the original base set of statements, using the following query:
select $subject <rdfs:subClassOf> $object
...
where trans($subject <rdfs:subClassOf> $object)
or $subject <rdfs:subClassOf> $object;
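The distinction between the inferred statements and the full closure can be checked with a short Python sketch. It assumes the seven example statements above; the function name trans_new is hypothetical, and the fixed-point loop is a generic way to compute a transitive closure rather than a description of Mulgara's internals.

```python
def trans_new(statements, predicate):
    """Return only the statements inferred by transitivity over `predicate`,
    excluding those already present in the input."""
    base = {(s, o) for s, p, o in statements if p == predicate}
    closure = set(base)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in base:
                # a -> b and b -> d together imply a -> d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return {(s, predicate, o) for s, o in closure - base}

statements = [
    ("mammals", "rdfs:subClassOf", "vertebrates"),
    ("eats-leaves", "rdfs:subPropertyOf", "herbivore"),
    ("marsupials", "rdfs:subClassOf", "mammals"),
    ("placental-mammals", "rdfs:subClassOf", "mammals"),
    ("elephants", "rdfs:subClassOf", "placental-mammals"),
    ("kangaroos", "rdfs:subClassOf", "marsupials"),
    ("red-kangaroos", "rdfs:subClassOf", "kangaroos"),
]

inferred = trans_new(statements, "rdfs:subClassOf")

# Equivalent of "trans(...) or $subject <rdfs:subClassOf> $object":
# union the inferred statements with the base statements.
base = {st for st in statements if st[1] == "rdfs:subClassOf"}
full_closure = inferred | base
print(len(inferred), len(full_closure))
```

The sketch yields nine inferred statements, matching the list above, and fifteen statements once the six base <rdfs:subClassOf> statements are unioned in.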
You can further restrict the trans function to a sub-set of statements, as shown in the following example.
select $subject <rdfs:subClassOf> $object
...
where trans($subject <rdfs:subClassOf> <mammals>
and $subject <rdfs:subClassOf> $object);
This produces new statements where the object in the inheritance tree begins with <mammals>, eliminating the statements derived from <vertebrates>. It produces the following:
[ (<kangaroos>, <rdfs:subClassOf>, <mammals>)
(<red-kangaroos>, <rdfs:subClassOf>, <mammals>)
(<elephants>, <rdfs:subClassOf>, <mammals>)
(<red-kangaroos>, <rdfs:subClassOf>, <marsupials>) ]
To get the full transitive closure, the newly inferred statements are unioned with the base statements. To only return the sub-graph from <mammals> we can add the results from a walk function:
select $subject <rdfs:subClassOf> $object
...
where trans($xxx <rdfs:subClassOf> <mammals>
and $subject <rdfs:subClassOf> $object) or
walk($subject <rdfs:subClassOf> <mammals>
and $subject <rdfs:subClassOf> $object);
To generate the statements to <marsupials> you can constrain the subject in the function instead:
select $subject <rdfs:subClassOf> $object
...
where trans(<marsupials> <rdfs:subClassOf> $object
and $subject <rdfs:subClassOf> $object) ;
This produces:
[ (<marsupials>, <rdfs:subClassOf>, <vertebrates>) ]
The trans function also allows you to limit what is inferred by dropping the second constraint within the trans definition. For example, to infer only direct statements from <vertebrates>:
select $subject <rdfs:subClassOf> <vertebrates>
...
where trans($subject <rdfs:subClassOf> <vertebrates>);
Which produces:
[ (<marsupials>, <rdfs:subClassOf>, <vertebrates>)
(<kangaroos>, <rdfs:subClassOf>, <vertebrates>)
(<red-kangaroos>, <rdfs:subClassOf>, <vertebrates>)
(<placental-mammals>, <rdfs:subClassOf>, <vertebrates>)
(<elephants>, <rdfs:subClassOf>, <vertebrates>) ]
Likewise, you can also make the subject in the trans constraint a constant and the object a variable.
Graph Difference (minus) Function
The minus function allows you to find the statements which differ between two graphs.
For example, to find the statements which are different between the models <rmi://localhost/server1#input> and <rmi://localhost/server1#output>, we could issue a query like this:
select $subject $predicate $object
from <rmi://localhost/server1#output>
where
$subject $predicate $object in <rmi://localhost/server1#output>
minus
$subject $predicate $object in <rmi://localhost/server1#input> ;
The use of the in specifier is not strictly necessary for the model named in the from clause, but it makes the query a bit more readable by providing symmetry.
To count the number of statements that differ between the two models, we could do something like the following. Note that variables do not carry over into a count function's namespace, so the outer query could be anything.
select count (
select $subject $predicate $object
from <rmi://localhost/server1#output>
where
$subject $predicate $object in <rmi://localhost/server1#output>
minus
$subject $predicate $object in <rmi://localhost/server1#input>
)
from <rmi://localhost/server1#>
where
$s $p $o ;
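In set terms, the two queries above compute a set difference and its size. The following Python sketch illustrates this with hypothetical statements standing in for the #input and #output models; it is not Mulgara code.

```python
# Hypothetical contents of the two models.
input_model = {
    ("doc1", "dc:title", "Draft"),
    ("doc1", "dc:creator", "Ann"),
}
output_model = {
    ("doc1", "dc:title", "Final"),
    ("doc1", "dc:creator", "Ann"),
    ("doc1", "dc:subject", "botany"),
}

# "... in #output minus ... in #input": statements in output but not in input.
differing = output_model - input_model

# The count(...) form returns only the number of such statements.
differing_count = len(differing)
print(sorted(differing), differing_count)
```

Note that the operation is not symmetric: the statement with the title 'Draft' appears only in the input model, so it is not part of this result. Swapping the two operands would return it instead.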
exclude Function
The exclude function allows you to select all statements which do not match a given constraint. Normal constraints match against the graph with the constraints you provide. Constraints enclosed in an exclude function return the values in the graph that do not match the constraints provided.
This function is almost never useful. Consider using the minus operator instead.
Using the following statements as an example for "finding all plants with leaves that are not green":
[ (<maple>, <leaves>, 'green')
(<redMaple>, <leaves>, 'red')
(<oak>, <leaves>, 'green')
(<cactus>, <prickles>, 'yellow') ]
With a small model it is possible to query for all plants that do not have leaves. The following query:
select $s
from ...
where exclude($s <leaves> $o) ;
Returns:
[ ( <cactus> ) ]
However, more statements in the data mean that the results will need to be constrained more carefully. Unfortunately this may cause the exclude operator to miss some required statements. For instance, if the following statements were also included:
[ ( <maple> <rdf:type> <plant> )
( <redMaple> <rdf:type> <plant> )
( <oak> <rdf:type> <plant> )
( <cactus> <rdf:type> <plant> ) ]
The following query would appear appropriate:
select $s
from ...
where $s <rdf:type> <plant>
and exclude($s <leaves> $o);
However, this returns:
[( <maple> )
( <redMaple> )
( <oak> )
( <cactus> ) ]
To understand what this query is doing, examine each constraint individually:
- $s <rdf:type> <plant> returns all statements referring to a type of plant:
  $s          $p
  <maple>     <plant>
  <redMaple>  <plant>
  <oak>       <plant>
  <cactus>    <plant>
- $s <leaves> $o (before the exclude operator is applied) returns all statements which match the predicate <leaves>:
  $s          $o
  <maple>     'green'
  <redMaple>  'red'
  <oak>       'green'
- exclude($s <leaves> $o) returns all statements which do not match the predicate <leaves>:
  $s          $o
  <maple>     <plant>
  <redMaple>  <plant>
  <oak>       <plant>
  <cactus>    <plant>
  <cactus>    'yellow'
Combining the two results (using and) yields every subject that satisfies both constraints, which here is all four plants. This is not the intended result. The required solution can be found if the minus operator is used instead:
select $s
from ...
where $s <rdf:type> <plant>
minus $s <leaves> $o ;
In this case the minus operator will remove all statements matching the specified constraint, rather than joining to all statements which do not match the specified constraint (which is the operation of and exclude).
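The difference between the two queries can be reproduced with Python sets. This sketch assumes the eight example statements from this section and models each constraint as the set of subjects it binds to $s; it illustrates the reasoning above and is not Mulgara code.

```python
statements = [
    ("maple", "leaves", "green"),
    ("redMaple", "leaves", "red"),
    ("oak", "leaves", "green"),
    ("cactus", "prickles", "yellow"),
    ("maple", "rdf:type", "plant"),
    ("redMaple", "rdf:type", "plant"),
    ("oak", "rdf:type", "plant"),
    ("cactus", "rdf:type", "plant"),
]

# $s <rdf:type> <plant>
plants = {s for s, p, o in statements if p == "rdf:type" and o == "plant"}

# $s <leaves> $o
with_leaves = {s for s, p, o in statements if p == "leaves"}

# exclude($s <leaves> $o): subjects of every statement whose predicate is
# not <leaves>. The rdf:type statements qualify, so every plant is included.
excluded = {s for s, p, o in statements if p != "leaves"}

# "... and exclude(...)" joins on $s, keeping subjects found in both sets.
and_exclude = plants & excluded

# "... minus ..." removes the subjects that do have leaves.
minus_result = plants - with_leaves

print(sorted(and_exclude))
print(sorted(minus_result))
```

The join returns all four plants because each of them has at least one statement with a predicate other than <leaves>, while the difference leaves only the cactus.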
order by Clause
Optionally sorts the results of a select command according to the variables specified.
The following example sorts the results numerically by rating (assuming ratings can be parsed as numbers), then alphabetically by author for documents of equal rating.
... order by $rating $author;
The suffixes asc and desc may be used to override the default sort ordering for a variable.
The following example sorts the results such that low ratings display first, and then alphabetically by author for documents of equal rating.
... order by $rating asc $author;
limit Clause
Optionally limits the query result to a specified non-negative number of rows.
When using limit, it is advisable to use the order by clause to constrain the result rows into a unique order. Otherwise you do not know which rows are returned. For example, if you limit a result to 10 rows without specifying an order, you do not know which 10 rows are returned.
offset Clause
Optionally skips a non-negative number of rows at the beginning of a query result. The use of offset usually accompanies a limit clause, making it possible to page through results.
As with limit, it is advisable to use the order by clause to constrain the result rows into a unique order.
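Paging with order by, limit and offset behaves like sorting a list and taking a slice of it. The Python sketch below assumes a hypothetical result set with author and rating columns and a hypothetical helper named page; it illustrates why a stable order matters and is not Mulgara code.

```python
# Hypothetical query result: one dict per row.
rows = [
    {"author": "Darwin", "rating": 5},
    {"author": "Mendel", "rating": 3},
    {"author": "Curie", "rating": 5},
    {"author": "Bohr", "rating": 4},
    {"author": "Abel", "rating": 3},
]

def page(rows, limit, offset=0):
    """Equivalent of 'order by $rating $author limit <limit> offset <offset>'."""
    ordered = sorted(rows, key=lambda r: (r["rating"], r["author"]))
    return ordered[offset:offset + limit]

first_page = page(rows, limit=2)
second_page = page(rows, limit=2, offset=2)
print([r["author"] for r in first_page])
print([r["author"] for r in second_page])
```

Because the rows are put into a unique order before slicing, successive pages never overlap or skip rows. Without the sort, the slice would depend on whatever order the rows happened to arrive in.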
For examples and explanations of complete queries, see the Issuing iTQL Commands section.
subquery Function
Used to nest select commands. Subqueries nest inside select by binding variables in the subquery where clause to the outer select clause.
In the following example the value of $vcard is bound to the inner query. The outer result set contains nested result sets for each subquery.
select $vcard $fn
subquery( select $title
from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#TITLE> $title
order by $title )
from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#FN> $fn
order by $fn;
The above example produces the following XML output from a SOAP call:
<?xml version="1.0"?>
<answer xmlns="http://mulgara.org/tql#">
<query>
<variables>
<vcard/>
<fn/>
<k0/>
</variables>
<solution>
<vcard resource="http://qqq.com/staff/superman"/>
<fn>Superman</fn>
<k0>
<variables>
<title/>
</variables>
</k0>
</solution>
<solution>
<vcard resource="http://qqq.com/staff/spiderman"/>
<fn>Peter Parker</fn>
<k0>
<variables>
<title/>
</variables>
<solution>
<title>Super Hero</title>
</solution>
<solution>
<title>PO2</title>
</solution>
</k0>
</solution>
<solution>
<vcard resource="http://qqq.com/staff/corky"/>
<fn>Corky Crystal</fn>
<k0>
<variables>
<title/>
</variables>
<solution>
<title>Computer Officer Class 3</title>
</solution>
</k0>
</solution>
</query>
</answer>
count Function
Similar to subquery except that it only returns a dummy variable with the value of the total row count of the inner query. See the select clause section for a description of dummy variables. For example:
select $vcard $fn
count( select $title from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#TITLE> $title )
from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#FN> $fn ;
The above example returns the following:
vcard=http://qqq.com/staff/corky fn="Corky Crystal" k0="1"
vcard=http://qqq.com/staff/spiderman fn="Peter Parker" k0="2"
vcard=http://qqq.com/staff/superman fn="Superman" k0="0"
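The way the inner count is evaluated once per outer row can be sketched in Python. The data below is taken from the subquery output shown earlier; the variable names are hypothetical and the code only mimics the result, not how Mulgara evaluates the query.

```python
# FN statements: vcard -> formatted name.
names = {
    "http://qqq.com/staff/superman": "Superman",
    "http://qqq.com/staff/spiderman": "Peter Parker",
    "http://qqq.com/staff/corky": "Corky Crystal",
}

# TITLE statements: (vcard, title).
titles = [
    ("http://qqq.com/staff/spiderman", "Super Hero"),
    ("http://qqq.com/staff/spiderman", "PO2"),
    ("http://qqq.com/staff/corky", "Computer Officer Class 3"),
]

# For each outer row, $vcard is bound in the inner query and its rows counted.
rows = [
    (vcard, fn, sum(1 for v, _ in titles if v == vcard))
    for vcard, fn in sorted(names.items())
]

for vcard, fn, k0 in rows:
    print(f'vcard={vcard} fn="{fn}" k0="{k0}"')
```

A vcard with no matching title still produces an outer row, with a count of 0.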
having Clause
The having clause applies a constraint to a dummy variable that results from a subquery in a select clause of a query. These variables are of the form $k0, $k1, $k2, … $kn and only hold numerical values. See the select clause section for a description of dummy variables.
There are four special predicates that can be used to perform arithmetic comparisons in the having clause, as outlined in the following table.
Predicate | Arithmetic Operation
<http://mulgara.org/mulgara#occurs> | =
<http://mulgara.org/mulgara#occursMoreThan> | >
 | <
 | ?
Expanding on the example shown in the subquery and count sections, the following query restricts the result to those with a count of 1:
select $vcard $fn
count( select $title from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#TITLE> $title )
from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#FN> $fn
having $k0 <http://mulgara.org/mulgara#occurs>
'1.0'^^<http://www.w3.org/2001/XMLSchema#double> ;
The above example returns:
vcard=http://qqq.com/staff/corky fn="Corky Crystal" k0="1"
Similarly, the following query restricts the result to those with a count greater than 0:
select $vcard $fn
count( select $title from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#TITLE> $title )
from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#FN> $fn
having $k0 <http://mulgara.org/mulgara#occursMoreThan>
'0.0'^^<http://www.w3.org/2001/XMLSchema#double> ;
The above example returns:
vcard=http://qqq.com/staff/corky fn="Corky Crystal" k0="1"
vcard=http://qqq.com/staff/spiderman fn="Peter Parker" k0="2"
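The two having queries above act as a filter on the $k0 column of the count result. The Python sketch below models only the two predicates used in this section; the helper name having and the mapping to Python operators are assumptions made for illustration and are not part of Mulgara.

```python
import operator

# Rows as returned by the count example: (vcard, fn, k0).
rows = [
    ("http://qqq.com/staff/corky", "Corky Crystal", 1),
    ("http://qqq.com/staff/spiderman", "Peter Parker", 2),
    ("http://qqq.com/staff/superman", "Superman", 0),
]

# Only the predicates that appear in the queries above are modelled.
PREDICATES = {
    "http://mulgara.org/mulgara#occurs": operator.eq,
    "http://mulgara.org/mulgara#occursMoreThan": operator.gt,
}

def having(rows, predicate, value):
    """Keep rows whose count, compared as a double, satisfies the predicate."""
    compare = PREDICATES[predicate]
    return [row for row in rows if compare(float(row[2]), value)]

exactly_one = having(rows, "http://mulgara.org/mulgara#occurs", 1.0)
more_than_zero = having(rows, "http://mulgara.org/mulgara#occursMoreThan", 0.0)
print([fn for _, fn, _ in exactly_one])
print([fn for _, fn, _ in more_than_zero])
```

The count is converted to a floating-point number before comparison, mirroring the xsd:double typed literal that the having constraint requires.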
The form of the constraint for the having clause must be:
$kx predicate value^^<http://www.w3.org/2001/XMLSchema#double>
Where predicate is one of the predicates from the above table.
Note that compound constraints for the having clause are not allowed. For example, the following query is not legal:
select $vcard $fn
count( select $title from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#TITLE> $title )
from <rmi://mysite.com/server1#vcard>
where $vcard <http://www.w3.org/2001/vcard-rdf/3.0#FN> $fn
having $k0 <http://mulgara.org/mulgara#occurs>
'0.0'^^<http://www.w3.org/2001/XMLSchema#double>
or $k0 <http://mulgara.org/mulgara#occurs>
'2.0'^^<http://www.w3.org/2001/XMLSchema#double> ;