Lists, sets, and multisets #473

timothee-haudebourg · 2020-04-17T17:48:54Z

Hi it's me again,

I am confused about the use of the term "set" to describe @set objects. In the section Lists and Sets it is said that

A set represents an unordered set of values

In Section 4.3.2 it is said that

While @list is used to describe ordered lists, the @set keyword is used to describe unordered sets.

So I thought I was correct in assuming that I could optimize @set objects by removing duplicate values (using a set data structure internally), since a @set is described as an unordered set, as opposed to @list objects, which are defined as ordered lists.

Now, trying to pass the expansion tests, I cannot pass test 27, as it states that "Duplicate values in @list and @set are not merged". This is not mentioned in the specification sections on @list and @set, and it also means that a @set object is, in fact, not a set but a multiset.

Note that I understand that @set is just syntactic sugar, associating multiple values to a subject property. Forcing an implementation to keep duplicate values in @set objects implies that duplicate RDF triples would have some kind of meaning, which they do not, since an RDF graph is a set of triples, not a multiset.
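To make the mismatch concrete, here is a minimal sketch in Python. It is a toy model rather than a JSON-LD processor: the IRIs and the variable names are made up for illustration, and the array simply stands in for the values of one property after expansion.

```python
# Toy model, not a JSON-LD processor. All names below are hypothetical.
subject = "http://example.org/s"
predicate = "http://example.org/p"

# Values of one property as they would appear in an array, with a duplicate.
values = ["a", "b", "a"]

# Array view: duplicates survive, so the array behaves like a multiset.
array_view = list(values)

# RDF view: a graph is a set of triples, so identical triples collapse.
graph = {(subject, predicate, v) for v in values}

print(len(array_view))  # 3
print(len(graph))       # 2
```

The array keeps three entries while the graph keeps two triples, which is the distinctness that would be lost on a round trip through RDF.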

Am I correct in my conclusion or is there something wrong with the expansion test 27? Or did I miss the explanation in the spec? Or is it just to avoid further complications (like, what should we do if there is a duplicate occurrence of the same node object, with the same id but not the same properties, etc.)?

gkellogg · 2020-04-17T18:07:15Z

The word set is, indeed, something of a misnomer. Really, the distinction is between order-preserving and non-order-preserving arrays (or multiset, as you note). A true set could not allow duplicates.

Although non-list arrays of JSON values do maintain their distinctness in the JSON-LD realm, they do not round-trip through RDF, and strictly speaking, this does not really conform to the RDF Graph model even in the pure JSON-LD realm. But, compatibility with JSON-LD 1.0 restricts our ability to do anything here, other than perhaps to add such a disclaimer.

I'd defer to a future version any change to the semantics of @set (explicit or implied), but that's for the group to discuss.

Am I correct in my conclusion or is there something wrong with the expansion test 27? Or did I miss the explanation in the spec? Or is it just to avoid further complications (like, what should we do if there is a duplicate occurrence of the same node object, with the same id but not the same properties, etc.)?

No, not given the current algorithms: test 27 ensures that duplicate values are not dropped, although it is reasonable to infer that perhaps they should be. Otherwise, step 5.2.3 of the Expansion Algorithm would need to say something about looking for duplicates.

However, note that the Compaction Algorithm does look for duplicates, so re-compacting that value would likely remove them.
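As a rough illustration of that difference, the following Python sketch contrasts an append that keeps duplicates with one that checks for them first. Both helpers are hypothetical stand-ins written for this comment, not the Expansion or Compaction Algorithms from the spec, and the property IRI is made up.

```python
# Hypothetical helpers; simplified stand-ins, not the spec algorithms.
def add_value_keeping_duplicates(result, prop, value):
    # Expansion-like behaviour: every value is appended as-is.
    result.setdefault(prop, []).append(value)


def add_value_skipping_duplicates(result, prop, value):
    # Compaction-like behaviour: append only if an equal value is absent.
    existing = result.setdefault(prop, [])
    if value not in existing:
        existing.append(value)


PROP = "http://example.org/p"  # made-up property IRI
values = [{"@value": "a"}, {"@value": "b"}, {"@value": "a"}]

expanded_like = {}
compacted_like = {}
for v in values:
    add_value_keeping_duplicates(expanded_like, PROP, v)
    add_value_skipping_duplicates(compacted_like, PROP, v)

print(len(expanded_like[PROP]))   # 3
print(len(compacted_like[PROP]))  # 2
```

The duplicate check uses plain equality on the value objects, which is an assumption of this sketch; a real processor would need its own notion of when two values are the same.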

@dlongley thoughts?

azaroth42 · 2020-04-24T16:18:32Z

Add editorial note that expansion does not eliminate duplicates.
Future version might make it a real set without duplicates.

iherman · 2020-04-24T17:08:44Z

This issue was discussed in a meeting.

RESOLVED: Make editorial change to note that @set terms can have duplicate entries during expansion (only)

View the transcript

Gregg Kellogg: One issue (#480) is a bit different: @id: null is not a valid jsonld document, which has to do with the fact that we ignore keyword-like entries. A step in the algorithm has issues with this. So this would be more complicated.
Rob Sanderson: #473
Gregg Kellogg: One of the differences with set versus array is with duplicates. If you compact, there are no duplicates. In expansion, there is nothing that eliminates duplicates. This issue shows something that we may want to look at.
… In a future version, this may require an incompatible change.
… The editorial change is that expansion does not remove duplicates.
Rob Sanderson: We can do this before the document is frozen.
Proposed resolution: Make editorial change to note that @set terms can have duplicate entries during expansion (only) (Rob Sanderson)
Rob Sanderson: +1
Gregg Kellogg: +1
Ruben Taelman: +1
Benjamin Young: +1
Pierre-Antoine Champin: +1
Ivan Herman: +1
Resolution #1: Make editorial change to note that @set terms can have duplicate entries during expansion (only)
David I. Lehn: +1

gkellogg · 2020-04-27T18:54:09Z

@timothee-haudebourg this is addressed in #481. Note that this does not resolve the issue, which will remain deferred.

gkellogg added the wr:open label Apr 17, 2020

gkellogg added the spec:editorial label Apr 24, 2020

gkellogg added a commit that referenced this issue Apr 27, 2020

Add note that unordered arrays may contain duplicate values after expansion. For #473. Other parts of #473 are deferred.

674cd01

gkellogg mentioned this issue Apr 27, 2020

Timothee updates #481

Merged

gkellogg added defer-future-version (Defer this issue until a future version of JSON-LD) and wr:spec-updated-partial labels and removed wr:open label Apr 27, 2020

gkellogg added a commit that referenced this issue Apr 29, 2020

Add note that unordered arrays may contain duplicate values after expansion. For #473. Other parts of #473 are deferred.

4159b40

gkellogg added wr:spec-updated and removed wr:spec-updated-partial labels Apr 29, 2020

pchampin added this to JSON-LD Management May 15, 2024

pchampin moved this to Future Work in JSON-LD Management May 15, 2024
