SPARQL can be used inline wherever SQL can be used. The only API functions that one needs to know are the ones for loading RDF data into the store. Dynamic SQL client applications can issue SPARQL queries against Virtuoso through the regular SQL client API, ODBC, JDBC or other, simply by prefixing the SPARQL query with the SPARQL keyword. Parameters work just as with dynamic SQL. Stored procedures can have SPARQL expressions inline and can declare cursors over SPARQL result sets.
Value conversions between SQL and SPARQL are most often automatic and invisible. In some cases one needs to be aware of the different SPARQL value representations (valmodes). SPARQL offers declarations for determining whether graphs to be returned are represented as XML or Turtle text serializations or as hash tables of triples. See dict_new and related functions for a description of the hash table SQL data type. The use of dicts is convenient for further programmatic processing of graphs.
RDF-related procedures use Virtuoso/PL vectors and dictionaries to represent RDF triples and sets of triples.
'Valmode' means the 'format of values returned by an expression', i.e. 'short', 'long' or 'sql value'.
'Triple vector' is a vector (array) of S, P and O, where all values are in 'long' formats, i.e. IRI_ID's for IRI values, vector of 5 items if O is a string, SQL scalar value if O is neither string nor IRI.
'Dictionary of triples' or 'Hash table of triples' is a dictionary object made by the SQL function dict_new () whose keys are 'triple vectors' and whose values are not specified; this is a good storage format for an unordered set of distinct triples.
'Dictionary of blank node names' is a dictionary used for tricky processing of a number of TURTLE or RDF/XML descriptions of subgraphs that come from a common graph. Imagine a situation where different descriptions actually refer to the same blank nodes of the original graph and, moreover, the application that generates these descriptions always generates the same blank node id string for the same node. A reader of descriptions can correctly join the described subgraphs into one big subgraph by filling in a dictionary that contains blank node id strings as keys and the IRI_ID's assigned to those strings as dependent data. As long as all readers of an application share the same dictionary of previously created nodes, no blank node is created twice.
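The idea of a shared dictionary of blank node names can be illustrated outside the server. The following Python sketch is only a model of the technique, not Virtuoso code: the class name `BlankNodeDictionary`, the function `read_description`, the starting ID and the plain integers used in place of IRI_ID's are all assumptions made for illustration. It also assumes that any string starting with `_:` is a blank node label, which a real parser would decide from syntax rather than from the value.

```python
from itertools import count


class BlankNodeDictionary:
    """Maps blank node label strings to stable IDs (stand-ins for IRI_ID's)."""

    def __init__(self, first_id=1000000000):
        self._ids = {}
        self._next_id = count(first_id)

    def node_id(self, label):
        # Allocate an ID only the first time a label is seen.
        if label not in self._ids:
            self._ids[label] = next(self._next_id)
        return self._ids[label]


def read_description(triples, bnodes, graph):
    """Add triples to `graph`, a dict whose keys are triples and whose values are unused."""
    for s, p, o in triples:
        if s.startswith("_:"):
            s = bnodes.node_id(s)
        if isinstance(o, str) and o.startswith("_:"):
            o = bnodes.node_id(o)
        graph[(s, p, o)] = None


# Two descriptions that mention the same blank node "_:x" are read
# with one shared dictionary, so "_:x" resolves to one ID in both.
bnodes = BlankNodeDictionary()
graph = {}
read_description([("_:a", "http://example.org/ns#b", "_:x")], bnodes, graph)
read_description([("_:x", "http://example.org/ns#d", 4)], bnodes, graph)
```

Because both calls share `bnodes`, the object of the first triple and the subject of the second are the same node, and the two subgraphs join into one.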
Virtuoso extends the SQL 92 syntax with SPARQL queries and subqueries. Instead of writing a SQL SELECT query or subquery, one can write the SPARQL keyword and a SPARQL query after the keyword.
SQL> sparql select distinct ?p where { graph ?g { ?s ?p ?o } };
p
varchar
----------
http://example.org/ns#b
http://example.org/ns#d
http://xmlns.com/foaf/0.1/name
http://xmlns.com/foaf/0.1/mbox
...

SQL> select distinct subseq (p, strchr (p, '#')) as fragment
  from (sparql select distinct ?p where { graph ?g { ?s ?p ?o } } ) as all_predicates
  where p like '%#%';
fragment
varchar
----------
#query
#data
#name
#comment
...
It is possible to pass parameters to a SPARQL query via a Virtuoso-specific syntax extension. '??' or '$?' indicates a positional parameter similar to '?' in plain SQL. '??' can be used in graph patterns or anywhere in place of a SPARQL variable. The value of a parameter should be passed in SQL form, i.e. it should be a number or an untyped string. An IRI ID cannot be passed, but an absolute IRI can. Using this notation, any dynamic SQL client, whether ODBC, JDBC or other, can execute parameterized SPARQL queries, binding parameters just as with dynamic SQL.
SQL> create function param_passing_demo ()
{
  declare stat, msg varchar;
  declare mdata, rset any;
  exec ('sparql select ?s where { graph ?g { ?s ?? ?? }}',
    stat, msg,
    vector (    -- Vector of two parameters
      'http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#int1',
      4 ),
    10,         -- Max no of rows
    mdata,      -- Variable to get metadata
    rset );     -- Variable to get result-set
  return rset[0][0];
}

SQL> select param_passing_demo ();
callret
VARCHAR
_______________________________________________________________________________

http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#four

1 Rows. -- 00000 msec.
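On the client side, the same statement text and parameter values are what a dynamic SQL client would hand to its driver. The Python sketch below only prepares that pair and checks it against the rules stated above (markers are '??' or '$?', values are numbers or strings); the function name `prepare_sparql` is hypothetical, and the marker count is a naive substring count that would miscount markers appearing inside string literals.

```python
def prepare_sparql(query, params):
    """Return (statement, args) suitable for a dynamic SQL client call."""
    # Naive count of positional markers; both '??' and '$?' are accepted.
    markers = query.count("??") + query.count("$?")
    if markers != len(params):
        raise ValueError(
            "query has %d parameter markers but %d values were given"
            % (markers, len(params))
        )
    for value in params:
        # Parameters are passed in SQL form: a number or an untyped string.
        if isinstance(value, bool) or not isinstance(value, (int, float, str)):
            raise TypeError("unsupported parameter value: %r" % (value,))
    # Prefixing the text with the SPARQL keyword makes it an inline query.
    return "SPARQL " + query.strip(), tuple(params)


statement, args = prepare_sparql(
    "select ?s where { graph ?g { ?s ?? ?? } }",
    ["http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#int1", 4],
)
```

With a DB-API style driver the pair would then presumably be passed as `cursor.execute(statement, args)`; that call is not part of this sketch and has not been checked against any particular driver.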
An inline SPARQL query can refer to SQL variables that are in scope in the SQL query or stored procedure containing it. Virtuoso extends the SPARQL syntax with a special notation to this effect. A reference to SQL variable X can be written as '?:X' or '$:X'. A reference to column C of a table or sub-select with alias T can be written as '?:T.C' or '$:T.C'. Both notations can be used in any place where a variable name is allowed, except in the 'AS' clause described below.
A column of a result set of a SPARQL SELECT can be used in SQL code inside a for statement just like any column from a SQL select.
SQL rules about double-quoted names are applicable to variables that are passed to a SPARQL query or selected from one. If a variable name contains unusual characters or should not be normalized according to SQL conventions then the name should use double quotes for escaping. E.g., the notation '?:"OrderLine"' will always refer to variable or column titled '"OrderLine"' whereas '?:OrderLine' can be converted to 'ORDERLINE' or 'orderline'.
It is safer to avoid using variable names that conflict with column names of RDF system tables, esp. 'G', 'P', 'S' and 'O'. These names are not reserved now but they may cause subtle bugs when the SPARQL subquery is compiled into SQL code that refers to table columns of same names. Some of these names may be rejected as syntax errors by future Virtuoso versions.
SQL> create procedure sql_vars_demo ()
{
#pragma prefix sort0: <http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#>
  declare RES varchar;
  declare obj integer;
  result_names (RES);
  obj := 4;
  -- ?:obj refers to the SQL variable obj; the selected column is visible as "subj"
  for (sparql select ?subj where { graph ?g { ?subj sort0:int1 ?:obj } } ) do
    result ("subj");
}

SQL> sql_vars_demo ();
RES
VARCHAR
_______________________________________________________________________________

http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#four

1 Rows. -- 00000 msec.
The example also demonstrates the Virtuoso/PL pragma line for procedure-wide declarations of namespace prefixes. This makes the code more readable and eliminates duplicate declarations of namespace prefixes when the procedure contains many SPARQL fragments that refer to a common set of namespaces.
A SPARQL ASK query can be used as an argument of the SQL EXISTS predicate.
create function sparql_ask_demo () returns varchar
{
  if (exists (sparql ask where { graph ?g { ?s ?p 4}}))
    return 'YES';
  else
    return 'NO';
}

SQL> select sparql_ask_demo ();
_______________________________________________________________________________

YES
The compilation of a SPARQL query may depend on the environment that is usually provided by the SPARQL protocol, including the default graph URI. Environment settings that come from the protocol may override settings in the text of the SPARQL query. To let an application configure the environment for a query, the SPARQL syntax is extended with the 'define' clause:
define parameter-qname parameter-value
Supported parameters are 'output:valmode' and 'output:format'.
'output:valmode' sets the SQL representation used for values in the result set. In most cases applications need SPARQL to return SQL values. By default the query returns a result set of values in SQL format and behaves as a typical SQL select. To compose triple vectors in Virtuoso/PL code, an application may need data in 'long' format. If the query contains a
define output:valmode 'LONG'
clause then all returned values are in long format. E.g., the following query returns IRI_ID's instead of IRI strings.
SQL> sparql define output:valmode 'LONG' select distinct ?p where { graph ?g { ?s ?p ?o } };
p
----------
#i1000001
#i1000003
#i1000005
#i1000006
...
'output:format' instructs the SPARQL compiler that the result of the query should be serialized into an RDF document; that document will be returned as a single column of a single-row result set. 'output:format' is especially useful if a SPARQL CONSTRUCT or SPARQL DESCRIBE query is executed directly via an ODBC or JDBC database connection and the client cannot receive the resulting dictionary of triples (there is no way to transfer such an object via ODBC). Using this option, the client can receive the document that contains the whole result set of a SELECT or the dictionary of triples of a CONSTRUCT/DESCRIBE, and parse it locally.
Supported values for 'output:format' are 'RDF/XML' and 'TURTLE' (or 'TTL'). If both 'output:valmode' and 'output:format' are specified, 'output:format' has higher priority; an error is signaled if 'output:valmode' is set to a value other than 'LONG'.
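An application that composes query text can apply these rules before sending anything to the server. The following Python sketch prepends the 'define' clauses to a query and rejects the combination described above as an error. The function name `with_output_defines` is hypothetical, the quoting of values follows the examples in this section, and 'SHORT' in the usage note is an assumed spelling of a non-'LONG' valmode.

```python
SUPPORTED_FORMATS = ("RDF/XML", "TURTLE", "TTL")


def with_output_defines(query, valmode=None, fmt=None):
    """Return inline SPARQL text with optional output:valmode / output:format defines."""
    if fmt is not None:
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError("unsupported output:format %r" % fmt)
        # With output:format present, the only valmode that is not an error is 'LONG'.
        if valmode is not None and valmode != "LONG":
            raise ValueError("output:format requires output:valmode 'LONG'")
    parts = ["sparql"]
    if valmode is not None:
        parts.append("define output:valmode '%s'" % valmode)
    if fmt is not None:
        parts.append('define output:format "%s"' % fmt)
    parts.append(query.strip())
    return " ".join(parts)


text = with_output_defines("ask where {graph ?g { ?s ?p 4 }}", fmt="TTL")
```

Here `text` is the statement `sparql define output:format "TTL" ask where {graph ?g { ?s ?p 4 }}`, while a call such as `with_output_defines(q, valmode="SHORT", fmt="TTL")` raises `ValueError` on the client instead of waiting for the server to reject it.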
When a SPARQL query is compiled, the compiler checks whether the result set is sent to the remote ODBC/JDBC client or used in some other way. The compiler will automatically define 'output:format' 'TURTLE' if compiling for execution by an SQL client.
The example below demonstrates how different values of 'output:format' affect the result of a SPARQL SELECT. Note the 10 rows and 4 columns in the first result, and the single LONG VARCHAR value in the other two. In the ISQL client, use the 'set blobs on;' directive to fetch long texts without a 'data truncated' warning.
SQL> sparql select * where {graph ?g { ?s ?p ?o }} limit 10;
g                                            s                    p                              o
VARCHAR                                      VARCHAR              VARCHAR                        VARCHAR
______________________________________________________________________

http://local.virt/DAV/bound/manifest.rdf     nodeID://1000000000  http://example.com/test#query  http://local.virt/DAV/bound/bound1.rq
. . .
http://local.virt/DAV/examples/manifest.rdf  nodeID://1000000019  http://example.com/test#query  http://local.virt/DAV/examples/ex11.2.3.1_1.rq

10 Rows. -- 00000 msec.

SQL> sparql define output:format "TTL" select * where {graph ?g { ?s ?p ?o }} limit 10;
callret-0
LONG VARCHAR
_______________________________________________________________________________

@prefix :rdf <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix :rs <http://www.w3.org/2005/sparql-results#> .
@prefix :xsd <http://www.w3.org/2001/XMLSchema#> .
[ rdf:type rs:results ;
  rs:result [
      rs:binding [ rs:name "g" ; rs:value <http://local.virt/DAV/bound/manifest.rdf> ] ;
      rs:binding [ rs:name "s" ; rs:value _:nodeID1000000000 ] ;
      rs:binding [ rs:name "p" ; rs:value <http://example.com/test#query> ] ;
      rs:binding [ rs:name "o" ; rs:value <http://local.virt/DAV/bound/bound1.rq> ] ;
      ] ;
  . . .
  rs:result [
      rs:binding [ rs:name "g" ; rs:value <http://local.virt/DAV/examples/manifest.rdf> ] ;
      rs:binding [ rs:name "s" ; rs:value _:nodeID1000000019 ] ;
      rs:binding [ rs:name "p" ; rs:value <http://example.com/test#query> ] ;
      rs:binding [ rs:name "o" ; rs:value <http://local.virt/DAV/examples/ex11.2.3.1_1.rq> ] ;
      ] ;
    ] .

1 Rows. -- 00000 msec.
SQL> sparql define output:format "RDF/XML" select * where {graph ?g { ?s ?p ?o }} limit 10;
callret-0
LONG VARCHAR
_______________________________________________________________________________

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:rs="http://www.w3.org/2005/sparql-results#"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rs:results rdf:nodeID="rset">
    <rs:result rdf:nodeID="sol206">
      <rs:binding rdf:nodeID="sol206-0" rs:name="g"><rs:value rdf:resource="http://local.virt/DAV/bound/manifest.rdf"/></rs:binding>
      <rs:binding rdf:nodeID="sol206-1" rs:name="s"><rs:value rdf:nodeID="1000000000"/></rs:binding>
      <rs:binding rdf:nodeID="sol206-2" rs:name="p"><rs:value rdf:resource="http://example.com/test#query"/></rs:binding>
      <rs:binding rdf:nodeID="sol206-3" rs:name="o"><rs:value rdf:resource="http://local.virt/DAV/bound/bound1.rq"/></rs:binding>
    </rs:result>
    . . .
    <rs:result rdf:nodeID="sol5737">
      <rs:binding rdf:nodeID="sol5737-0" rs:name="g"><rs:value rdf:resource="http://local.virt/DAV/examples/manifest.rdf"/></rs:binding>
      <rs:binding rdf:nodeID="sol5737-1" rs:name="s"><rs:value rdf:nodeID="1000000019"/></rs:binding>
      <rs:binding rdf:nodeID="sol5737-2" rs:name="p"><rs:value rdf:resource="http://example.com/test#query"/></rs:binding>
      <rs:binding rdf:nodeID="sol5737-3" rs:name="o"><rs:value rdf:resource="http://local.virt/DAV/examples/ex11.2.3.1_1.rq"/></rs:binding>
    </rs:result>
  </rs:results>
</rdf:RDF>

1 Rows. -- 00000 msec.
SPARQL CONSTRUCT and SPARQL DESCRIBE results are serialized as one would expect:
SQL> sparql define output:format "TTL" construct { ?s ?p "004" } where {graph ?g { ?s ?p 4 }};
callret-0
LONG VARCHAR
_______________________________________________________________________________

<http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#four> <http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#int1> "004" .
_:b1000000913 <http://www.w3.org/2001/sw/DataAccess/tests/result-set#index> "004" .

1 Rows. -- 00000 msec.

SQL> sparql define output:format "RDF/XML" construct { ?s ?p "004" } where {graph ?g { ?s ?p 4 }};
callret-0
LONG VARCHAR
_______________________________________________________________________________

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description about="http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#four"><ns0pred:int1 xmlns:ns0pred="http://www.w3.org/2001/sw/DataAccess/tests/data/Sorting/sort-0#">004</ns0pred:int1></rdf:Description>
<rdf:Description rdf:nodeID="b1000000913"><ns0pred:index xmlns:ns0pred="http://www.w3.org/2001/sw/DataAccess/tests/result-set#">004</ns0pred:index></rdf:Description>
</rdf:RDF>

1 Rows. -- 00000 msec.
SPARQL ASK returns a non-empty result set if a match is found for the graph pattern, and an empty result set otherwise. If 'output:format' is specified then the query makes a 'boolean result' document instead:
SQL> sparql ask where {graph ?g { ?s ?p 4 }};
__ask_retval
INTEGER
_______________________________________________________________________________

1

1 Rows. -- 00000 msec.

SQL> sparql ask where {graph ?g { ?s ?p "no such" }};
__ask_retval
INTEGER
_______________________________________________________________________________

0 Rows. -- 00000 msec.

SQL> sparql define output:format "TTL" ask where {graph ?g { ?s ?p 4 }};
callret
VARCHAR
_______________________________________________________________________________

@prefix :rdf <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix :rs <http://www.w3.org/2005/sparql-results#> .
[ rdf:type rs:results ; rs:boolean TRUE ]

1 Rows. -- 00000 msec.

SQL> sparql define output:format "RDF/XML" ask where {graph ?g { ?s ?p 4 }};
callret
VARCHAR
_______________________________________________________________________________

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:rs="http://www.w3.org/2005/sparql-results#"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema#" >
  <rs:results rdf:nodeID="rset">
    <rs:boolean rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">1</rs:boolean></results></rdf:RDF>

1 Rows. -- 00000 msec.
-- This parses a TTL (TURTLE or N3) resource and places its triples into DB.DBA.RDF_QUAD.
create procedure DB.DBA.TTLP (
    in strg varchar,   -- text of the resource
    in base varchar,   -- base IRI to resolve relative IRIs to absolute
    in graph varchar)  -- target graph IRI, parsed triples will appear in that graph.

-- This does not load TTL content, instead it returns a dict of triples in 'long valmode'
create function DB.DBA.RDF_TTL2HASH (
    in strg varchar,
    in base varchar,
    in graph varchar) returns any

-- This parses an RDF/XML document and places its triples into DB.DBA.RDF_QUAD.
create procedure DB.DBA.RDF_EXP_LOAD_RDFXML (
    in g any,                          -- target graph IRI, can be IRI string or integer or IRI ref num
    inout ent any,                     -- XML entity that contains the RDF/XML document
    in process_as_large_xper integer,  -- special mode to load large documents in many transactions of relatively small size
    in app_env any := null )           -- application environment for callbacks, unused in current version
-- Simple insertion of a quad where object is a node
create procedure DB.DBA.RDF_QUAD_URI (
    in g_uri varchar, in s_uri varchar, in p_uri varchar, in o_uri varchar)

-- Simple insertion of a quad where object is a literal value in 'SQL valmode'
create procedure DB.DBA.RDF_QUAD_URI_L (
    in g_uri varchar, in s_uri varchar, in p_uri varchar, in o_lit any)
-- Serializes vector of triples into a session, in TURTLE syntax
create procedure DB.DBA.RDF_TRIPLES_TO_TTL (
    inout triples any,  -- vector of triples in 'long valmode'.
    inout ses any)      -- an output stream in server default encoding

-- Serializes vector of triples into a session, in RDF/XML syntax.
-- In current version, every triple is printed in separate rdf:Description tag, no nesting.
create procedure DB.DBA.RDF_TRIPLES_TO_RDF_XML_TEXT (
    inout triples any,           -- vector of triples in 'long valmode'.
    in print_top_level integer,  -- zero if only rdf:Description tags should be written,
                                 -- non-zero if the rdf:RDF top-level element should also be written
    inout ses any)               -- an output stream in server default encoding
-- Local execution of SPARQL via SPARQL protocol, produces a result set of SQL values.
create procedure DB.DBA.SPARQL_EVAL (
    in query varchar,       -- text of SPARQL query to execute
    in dflt_graph varchar,  -- default graph IRI, if not NULL then this overrides what's specified in query
    in maxrows integer)     -- limit on number of rows that should be returned.

-- Similar to SPARQL_EVAL, but returns a vector of vectors of SQL values.
create function DB.DBA.SPARQL_EVAL_TO_ARRAY (
    in query varchar,       -- text of SPARQL query to execute
    in dflt_graph varchar,  -- default graph IRI, if not NULL then this overrides what's specified in query
    in maxrows integer)     -- limit on number of rows that should be returned.
  returns any
-- Remote execution of SPARQL via SPARQL protocol, produces a result set of SQL values.
create procedure DB.DBA.SPARQL_REXEC (
    in service varchar,     -- service URI to call via HTTP
    in query varchar,       -- text of SPARQL query to execute
    in dflt_graph varchar,  -- default graph IRI, if not NULL then this overrides what's specified in query
    in named_graphs any,    -- vector of named graph IRIs, if not NULL then this overrides what's specified in query
    in req_hdr any,         -- additional HTTP header lines that should be passed to the service; 'Host: ...' is most popular.
    in maxrows integer,     -- limit on number of rows that should be returned.
    in bnode_dict any)      -- dictionary of bnode ID references.

-- Similar to SPARQL_REXEC (), but returns a vector of vectors of SQL values.
-- All arguments are the same.
create function DB.DBA.SPARQL_REXEC_TO_ARRAY (
    in service varchar, in query varchar, in dflt_graph varchar, in named_graphs any,
    in req_hdr any, in maxrows integer, in bnode_dict any)
  returns any

-- Similar to SPARQL_REXEC (), but fills in output parameters with metadata (like exec metadata)
-- and a vector of vectors of 'long valmode' values.
-- First seven arguments are the same.
create procedure DB.DBA.SPARQL_REXEC_WITH_META (
    in service varchar, in query varchar, in dflt_graph varchar, in named_graphs any,
    in req_hdr any, in maxrows integer, in bnode_dict any,
    out metadata any,   -- metadata like exec () returns.
    out resultset any)  -- results as 'long valmode' value.
If the query is a CONSTRUCT or DESCRIBE then the result set consists of a single row and a single column; the value inside is a dict of triples in 'long valmode'.
These functions emulate constructor functions from the XQuery Core Function Library.
create function DB.DBA."http://www.w3.org/2001/XMLSchema#boolean" (in strg any) returns integer
create function DB.DBA."http://www.w3.org/2001/XMLSchema#dateTime" (in strg any) returns datetime
create function DB.DBA."http://www.w3.org/2001/XMLSchema#double" (in strg varchar) returns double precision
create function DB.DBA."http://www.w3.org/2001/XMLSchema#float" (in strg varchar) returns float
create function DB.DBA."http://www.w3.org/2001/XMLSchema#integer" (in strg varchar) returns integer
-- Returns 1 if string s matches pattern p, 0 otherwise
create function DB.DBA.RDF_REGEX (
    in s varchar,             -- source string to check
    in p varchar,             -- regular expression pattern string
    in coll varchar := null)  -- unused for now (modes are not yet implemented)

-- Returns 1 if language identifier r matches language pattern t
create function DB.DBA.RDF_LANGMATCHES (
    in r varchar,  -- language identifier (string or NULL)
    in t varchar)  -- language pattern (exact name, first two letters or '*')
Sometimes the default graph IRI is not known when the SPARQL query is composed. It can be added at the very last moment by providing the IRI in a 'define' clause as follows:
define input:default-graph-uri <http://example.com>
Such a definition overrides the default graph URI set in the query by a 'FROM ...' clause (if any).
When Virtuoso receives a SPARQL request via HTTP, the value of the default graph set in the protocol is sent back in the reply header as an 'X-SPARQL-default-graph: ...' header line, for debugging purposes. This value has the highest possible priority and cannot be redefined in the text of the query.
A SPARQL expression can contain calls of Virtuoso/PL functions and built-in SQL functions, both in the WHERE clause and in the result set. Two namespace prefixes, 'bif' and 'sql', are reserved for these purposes. When a function name starts with the 'bif:' namespace prefix, the rest of the name is treated as the name of a SQL BIF (Built-In Function). When a function name starts with the 'sql:' namespace prefix, the rest of the name is treated as the name of a Virtuoso/PL function owned by 'DBA' with database qualifier 'DB', e.g. 'sql:example(...)' is converted into 'DB.DBA."example"(...)'.
In both cases, the function receives arguments in SQL format ('SQL valmode') and returns the result also in SQL format. The SPARQL compiler will automatically add code for format conversion into the resulting SQL code so SQL functions can be used even if "define output:valmode 'LONG'" forces the use of internal RDF representation in the result set.
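The naming rule for the two reserved prefixes can be written down as a small mapping. The Python sketch below covers only the name translation described above; the function name `resolve_function` is hypothetical, and the sketch says nothing about how the SPARQL compiler actually generates SQL or inserts value format conversions.

```python
def resolve_function(qname):
    """Translate a 'bif:' or 'sql:' function name into the SQL name it denotes."""
    prefix, sep, local = qname.partition(":")
    if not sep or not local:
        raise ValueError("expected a prefixed name, got %r" % qname)
    if prefix == "bif":
        # The rest of the name is the name of a SQL built-in function.
        return local
    if prefix == "sql":
        # The rest of the name is a Virtuoso/PL function owned by DBA
        # with database qualifier DB.
        return 'DB.DBA."%s"' % local
    raise ValueError("prefix %r is not reserved for SQL functions" % prefix)


sql_name = resolve_function("sql:example")
bif_name = resolve_function("bif:subseq")
```

`subseq` is used here only because it appears as a built-in function in an earlier example of this section.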
In the same way as 'define input:default-graph-uri' sets the default graph, the following clause adds a named graph to the list of allowed named graphs.
define input:named-graph-uri <http://example.com>
The query may contain many 'define input:named-graph-uri ...' definitions.
If the query does not contain any declarations of a default graph URI, the value of the connection variable ':default_graph' is used. This value should be an absolute IRI string.
If the query does not contain any declarations of named graph URIs, the value of the connection variable ':named_graphs' is used. The value of this variable should be a vector of absolute IRI strings.
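Taken together, the statements above give an order of precedence for the default graph: the value set in the protocol, then a 'define input:default-graph-uri' clause, then a 'FROM' clause, then the ':default_graph' connection variable. The Python sketch below encodes that reading of the text; the function and argument names are hypothetical, and it is an interpretation of this section rather than a description of the server's code.

```python
def effective_default_graph(protocol=None, define=None, from_clause=None,
                            connection_var=None):
    """Return the default graph IRI that wins, or None if none is set."""
    # Candidates are listed from highest to lowest priority.
    for candidate in (protocol, define, from_clause, connection_var):
        if candidate is not None:
            return candidate
    return None


chosen = effective_default_graph(
    define="http://example.com",
    from_clause="http://example.org/ns",
)
```

In this call the 'define' value wins over the 'FROM' value, so `chosen` is `http://example.com`.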
Two functions allow the user to alter RDF storage by inserting or deleting all triples of a result graph of a query. Both functions receive an IRI of a graph that should be altered and a vector of triples that should be added or removed. The graph IRI can be either IRI ID or a string. The return values of functions are not defined and should not be used by applications.
create function DB.DBA.RDF_INSERT_TRIPLES (in graph_iri any, in triples any)
create function DB.DBA.RDF_DELETE_TRIPLES (in graph_iri any, in triples any)
As soon as the Virtuoso SPARQL implementation is extended to provide fine-grained access rights, the SPARQL syntax will be extended with 'INSERT INTO ... CONSTRUCT ...', 'INSERT INTO ... DESCRIBE ...', and 'DELETE FROM ... CONSTRUCT ...' statements.