Define term "quad" #5

Closed
gkellogg opened this issue Jan 24, 2023 · 8 comments · Fixed by #12

Comments

@gkellogg
Member

The term quad is used in a number of specifications (e.g., n-quads, rdf-canon, sparql), but never really defined. Even though quads are not part of the abstract syntax, at least an informative definition within 4. RDF Datasets would be useful as a common reference point.

A quad is the combination of an RDF statement and optionally, a named graph.

See w3c/rdf-canon#62

@pchampin
Contributor

If we go down that path, then we should also clarify the relationship between datasets and quads. Proposal (after the definition of dataset):

Alternatively, an RDF Dataset can be seen as a set of quads, where a quad is the combination of an RDF statement and optionally, a graph name that is either an IRI or a blank node.

(NB: I changed the end of the definition of quad, compared to @gkellogg's proposal above, to something more accurate, IMO).

@afs
Contributor

afs commented Jan 27, 2023

All the content suggestions are sketches and can be word-smithed when we have the overall direction.
Long comment to put the full context in but the amount of change suggested isn't that great.


A "quad" in N-Quads is analogous to a triple. "RDF Statement" is at the level of abstract syntax.

https://w3c.github.io/rdf-concepts/spec/#dfn-statement
"This statement corresponding to an RDF triple is known as an RDF statement."

N-Quads uses the word statement (not RDF Statement) in the sense of one item in the grammar.

The NT and NQ text comes from the Turtle spec where "statement" is a declaration or a triple, the repeating unit in the syntax.

rdf-canon needs the n-quads concept of "quads".

The SPARQL definition of RDF Dataset and the more descriptive definition in RDF Concepts do not mention quads. Quads in SPARQL are part of the grammar, not the data model. The word occurs as "quad pattern" as patterns of (name, graph). When in text, they are styled to suggest a grammar rule.

I think the best way forward is to define quads in RDF Concept (in section 3.1, section 1.x) saying a quad is a triple with an optional graph name. "quad" is defined in RDF Concepts as:
"""
A quad is (optional_name, triple), alternatively (optional_name, subject, predicate, object)
A quad without a name is a triple.
"""
and this is repeated in N-Quads (+link to concepts) because it is important to N-Quads.
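
A minimal sketch of the two shapes mentioned above, for illustration only. The class and field names are hypothetical and not taken from any spec; the graph name is placed last here (as in the `< s, p, o, g >` form used by rdf-canon) only because Python requires defaulted fields to come last.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


class Quad(NamedTuple):
    """Flat view: subject, predicate, object, plus an optional graph name."""
    subject: str
    predicate: str
    object: str
    graph_name: Optional[str] = None  # None stands for "no graph name"

    @property
    def triple(self) -> Triple:
        # The triple carried by this quad, ignoring any graph name.
        return Triple(self.subject, self.predicate, self.object)

    def as_pair(self):
        # Nested view: (optional graph name, triple).
        return (self.graph_name, self.triple)
```

Note that in this sketch a quad without a graph name is not equal to a triple; getting the triple is an explicit step (`quad.triple`). Whether the two should be identified is a wording decision for the definition, not something the code settles.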

I see no harm in keeping the concepts section as "3.1 Triples" when quads are first presented in the section as (name, triple). It frames the terminology as an extension of triple (loose sense). It keeps triple as the primary concept which is important because semantics/entailment is about triples. There is no "quoted quad" (:smile:).

I am not that comfortable with "name" on its own, maybe graph name is necessary here.

Some explanatory text that relates quads to RDF Datasets would be useful in RDF concepts. But the definition of RDF dataset remains as default graph + named graphs; anything else is a very big change.
"""
An RDF Dataset can be seen as a set of quads. Because a quad without a name is a triple, this is a set of triples and a set of quads with graph names.
The set of triples forms the default graph.
The set of quads with graph names forms the named graphs. All the quads with the same graph name (IRI or blank node) form a named graph.
"""

"form"/"comprise"/"make up"

@gkellogg
Member Author

If we go down that path, then we should also clarify the relationship between datasets and quads. Proposal (after the definition of dataset):

Alternatively, an RDF Dataset can be seen as a set of quads, where a quad is the combination of an RDF statement and optionally, a graph name that is either an IRI or a blank node.

Defining quad in the dataset section makes sense.

A "quad" in N-Quads is analogous to a triple. "RDF Statement" is at the level of abstract syntax.

https://w3c.github.io/rdf-concepts/spec/#dfn-statement "This statement corresponding to an RDF triple is known as an RDF statement."

N-Quads uses the word statement (not RDF Statement) in the sense of one item in the grammar.

With the proposed update, N-Quads (and maybe TriG) could use the quad term, and not RDF Statement.

I think the best way forward is to define quads in RDF Concept (in section 3.1, section 1.x) saying a quad is a triple with an optional graph name. "quad" is defined in RDF Concepts as:
"""
A quad is (optional_name, triple), alternatively (optional_name, subject, predicate, object)
A quad without a name is a triple.
"""
and this is repeated in N-Quads (+link to concepts) because it is important to N-Quads.

As 3.1 describes statements within graphs, as @pchampin suggests, section 4 might be more appropriate, and not mix the paradigm. It probably deserves its own sub-section, then. Adding a more formal definition along the lines of its being a 4-Tuple composed of subject, predicate, object, and optionally graph name avoids over-loading the "Tuple" too far, so that it's not a Tuple composed of an optional graph name and a statement.

I am not that comfortable with "name" on its own, maybe graph name is necessary here.

graph name is a well-defined term.

Some explanatory text that relates quads to RDF Datasets would be useful in RDF concepts. But the definition of RDF dataset remains as default graph + named graphs; anything else is a very big change.
"""
An RDF Dataset can be seen as a set of quads. Because a quad without a name is a triple, this is a set of triples and a set of quads with graph names.
The set of triples forms the default graph.
The set of quads with graph names forms the named graphs. All the quads with the same graph name (IRI or blank node) form a named graph.
"""

👍
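
The explanatory text quoted above (quads without a graph name form the default graph, quads sharing a graph name form a named graph) can be sketched roughly as follows. This is an illustration under assumptions, not spec text: quads are plain `(s, p, o, g)` tuples with `g` set to `None` when there is no graph name, and the function name is hypothetical.

```python
from collections import defaultdict


def dataset_from_quads(quads):
    """Group (s, p, o, g) tuples into a default graph and named graphs.

    g is None for quads without a graph name (an assumption of this sketch).
    Returns (default_graph, named_graphs), where default_graph is a set of
    (s, p, o) triples and named_graphs maps each graph name to such a set.
    """
    default_graph = set()
    named_graphs = defaultdict(set)
    for s, p, o, g in quads:
        if g is None:
            default_graph.add((s, p, o))
        else:
            named_graphs[g].add((s, p, o))
    return default_graph, dict(named_graphs)
```

One thing this view cannot express is a named graph with no triples in it: there is no quad to carry its name, so such a graph never appears in the result.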

@afs
Contributor

afs commented Jan 28, 2023

Details - let's see if we can wrap this up for RCH.

The term quad is used in a number of specifications (e.g., n-quads, rdf-canon, sparql)

Coming back to the requirement:

rdf-canon needs a concept to refer to. rdf-canon could define it itself, but there is colloquial use of "quad" generally, so it is useful to have a definition in RDF.

It's not supposed to be another top-level concept. It would be used sparingly such as a mention/link in N-Quads.

Where terminology is currently used:

  • N-Quads does not mention quads except as n-quads (the syntax).
  • TriG does not mention quads.
  • JSON-LD does not seem to mention quads.
  • SPARQL uses quad in a different sense ("quad pattern" as a grammar element)

In RDF Concepts: Section 1 is about (was Models-)Statements-Resources (it's where "asserting" is discussed).

"statement" is only in section 1 and section 1 is not normative.

"statement" is also separately used in several concrete syntaxes as a grammar unit.

If the definition of "quad" is in Concepts section 4, then a separate later section is better. It needs to be clear it is not an implied alternative definition of dataset because that has effects throughout (e.g. semantics for quads, quoted quads). "RDF Dataset" is also shared with SPARQL and changing SPARQL end-to-end for quads is more than an errata.

Using the defn of quad:

Parsing N-Quads produces triples.

Parsing TriG produces triples although the section is called "Triple Output". This ties back to TriG extending Turtle: w3c/rdf-new#2.

Given rdf-canon, it is worth mentioning/referencing "quad" in N-Quads. (It's also odd N-Quads doesn't use "quad").
Explanatory text can go in N-Quads section 5.2.

"""
Parsing can also be described as outputting quads [link]. When a statement [link to
grammar] is output, it can be in the form of a quad, "(optional graph name, subject, predicate, object)",
considered as "(optional graph name, triple)".
"""

keeping "triple" somewhere here because the production of both languages is "triples in graphs".
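
To make "outputting quads" concrete, here is a toy sketch that turns one statement line into an `(s, p, o, g)` tuple. It is deliberately not the N-Quads grammar: it only accepts IRIs, simple blank node labels, and plain string literals without escapes, language tags, or datatypes. The function name and the use of `None` for a missing graph name are assumptions of this sketch.

```python
import re

# Toy term patterns, NOT the N-Quads grammar (see the restrictions above).
_IRI = r'<[^<>\s]*>'
_BNODE = r'_:\w+'
_LITERAL = r'"[^"\\]*"'

# subject predicate object [graph name] .
_LINE = re.compile(
    rf'^\s*({_IRI}|{_BNODE})\s+({_IRI})\s+({_IRI}|{_BNODE}|{_LITERAL})'
    rf'\s*({_IRI}|{_BNODE})?\s*\.\s*$'
)


def parse_statement(line):
    """Return (subject, predicate, object, graph_name) for one line.

    graph_name is None when the line carries no graph name.
    Raises ValueError for anything outside the toy subset.
    """
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"not in the supported subset: {line!r}")
    return match.groups()
```

The same tuple can be read as "(optional graph name, triple)" by taking the first three members as the triple, which keeps "triple" visible in the description.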

We don't have to change TriG but we could add the same text around TriG section 5.3.2 but it does not fit in so nicely. Should it be a new 5.3.2.4?

@gkellogg
Member Author

Arguably, those specs don't use the term "quads" because there is no such term. But most parsing can be described as adding a triple to some particular graph (default or named). Nevertheless, the notion of "quad" is firmly embedded in the community consciousness (from my experience), and it's odd that it's never actually defined.

If the definition of "quad" is in Concepts section 4, then a separate later section is better. It needs to be clear it is not an implied alternative definition of dataset because that has effects throughout (e.g. semantics for quads, quoted quads). "RDF Dataset" is also shared with SPARQL and changing SPARQL end-to-end for quads is more than an errata.

I was thinking as in a new sub-section 4.3, and can probably be a non-normative section, and thus a non-normative term definition.

Given rdf-canon, it is worth mentioning/referencing "quad" in N-Quads. (It's also odd N-Quads doesn't use "quad").
Explanatory text can go in N-Quads section 5.2.

"""
Parsing can also be described as outputting quads [link]. When a statement [link to
grammar] is output, it can be in the form of a quad, "(optional graph name, subject, predicate, object)",
considered as "(optional graph name, triple)".
"""

Seems reasonable, in addition to some of the other changes needed for N-Quads. I agree that there's no particular reason to touch TriG.

@pchampin
Contributor

  • I would strongly prefer keeping quads as a "secondary" concept, precisely to avoid the "why not quoted quads?" rathole (and they are clearly secondary, as RDF 1.1 has existed without them all this time)
  • My proposal was not to change the definition of dataset (default graph + 0..n named graphs), on the contrary. "Alternatively, Dataset can be seen" is supposed to come after the definition, and ideally in a non-normative paragraph (if it makes sense to define quads non-normatively)
  • I am not comfortable with saying that "quads with no graph name are triples", just like I am not comfortable with conflating a dataset having no named graph with its default graph...

@afs
Contributor

afs commented Jan 31, 2023

@gkellogg @pchampin -- could you check that a non-normative definition is workable for rdf-canon?

@gkellogg
Member Author

RDF Canon uses phrases such as

an RDF dataset D is represented as a set of quads of the form < s, p, o, g > where the graph component g is empty if and only if the triple < s, p, o > is in the default graph. This algorithm considers an RDF dataset to be a set of quads.

I think that if we have an informative statement in RDF Concepts along these lines, we should be fine. RDF Canon is using quad as a shorthand for saying "a triple, optionally within a named graph of an RDF Dataset".

RDF Canon could perhaps improve the phrasing by saying something like: "For the purpose of this specification, an RDF Dataset is considered to be a set of quads".

If RDF Concepts uses a nomenclature other than < s, p, o, g > to describe the logical tuple members of a quad, RDF Canon should synchronize with that, or simply defer the definition to RDF Concepts.
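
As a rough illustration of the `< s, p, o, g >` reading quoted above from RDF Canon, the following sketch flattens a dataset (default graph plus named graphs) into a set of quads, with the graph component empty exactly for triples from the default graph. The function name and the use of `None` for the empty graph component are assumptions made for this example.

```python
def quads_from_dataset(default_graph, named_graphs):
    """Flatten a dataset into a set of (s, p, o, g) tuples.

    default_graph: a set of (s, p, o) triples.
    named_graphs: a dict mapping each graph name to a set of (s, p, o) triples.
    g is None if and only if the triple comes from the default graph.
    """
    quads = {(s, p, o, None) for (s, p, o) in default_graph}
    for name, graph in named_graphs.items():
        quads |= {(s, p, o, name) for (s, p, o) in graph}
    return quads
```

A named graph that contains no triples contributes no quads, so it is not recoverable from the flattened set. That is one reason to present the set-of-quads view as an informative alternative rather than as the definition of a dataset.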

gkellogg added a commit that referenced this issue Feb 2, 2023
gkellogg added a commit that referenced this issue Feb 6, 2023
Informal definition of quad.

Fixes #5.

3 participants