Closed
Description
Using a FROM statement to query a Dataset induces a number of problems. Consider the following example:
from rdflib import Dataset
from rdflib.plugins import sparql
sparql.SPARQL_LOAD_GRAPHS = False # Needed to ensure remote lookups aren't performed
sparql.SPARQL_DEFAULT_GRAPH_UNION = False # Needed otherwise queries don't work at all!
data = """\
@prefix ex: <http://example.com#> .
ex:Graph1 {
ex:Alice a ex:Person .
}
ex:Graph2 {
ex:Charlie a ex:Person .
}
"""
query = """
PREFIX ex: <http://example.com#>
SELECT ?person
FROM ex:Graph1
WHERE {
?person a ex:Person
}
"""
ds = Dataset()
ds.parse(data=data, format="trig")
for row in ds.query(query):
    print(*row)
# Correctly outputs "http://example.com#Alice"
This snippet loads two named graphs, each with a single triple, then queries ex:Graph1.
After this query, the Dataset contains a duplicate of the ex:Graph1 triple in the default graph:
for quad in ds.quads():
    print([term.fragment for term in quad[0:3]], quad[3])
# Outputs:
# ['Charlie', 'type', 'Person'] http://example.com#Graph2
# ['Alice', 'type', 'Person'] http://example.com#Graph1
# ['Alice', 'type', 'Person'] urn:x-rdflib:default
Aside from inadvertently increasing the size of the Dataset, this also causes a bug when querying other graphs in the Dataset. For example, if we now query ex:Graph2, we get an erroneous result:
query = """
PREFIX ex: <http://example.com#>
SELECT ?person
FROM ex:Graph2
WHERE {
?person a ex:Person
}
"""
res = ds.query(query)
print("Second query results")
for row in res:
    print(*row)
# Outputs
# http://example.com#Alice <---- SHOULDN'T BE HERE
# http://example.com#Charlie
I have no clue as to the cause of this behaviour, but clearly something is wrong with the handling of FROM statements. Furthermore, none of the queries shown here work if sparql.SPARQL_DEFAULT_GRAPH_UNION = True, which appears to be a separate but related problem.