Lists, sets, and multisets #473

Open
timothee-haudebourg opened this issue Apr 17, 2020 · 4 comments
Labels: defer-future-version (Defer this issue until a future version of JSON-LD), spec:editorial, wr:spec-updated

Comments

timothee-haudebourg commented Apr 17, 2020

Hi it's me again,

I am confused about the use of the term "set" to describe @set objects. In the section Lists and Sets it is said that

A set represents an unordered set of values

In Section 4.3.2 it is said that

While @list is used to describe ordered lists, the @set keyword is used to describe unordered sets.

So I thought I was correct in assuming that I could optimize @set objects by removing duplicate values (using a set data structure internally), since a @set object is described as an unordered set, as opposed to @list objects, which are defined as ordered lists.

Now, trying to pass the expansion tests, I cannot pass test 27, as it states that "Duplicate values in @list and @set are not merged". This is not mentioned in the specification sections associated with @list and @set, and it also means that a @set object is, in fact, not a set but a multiset.
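To make the difference concrete, here is a minimal Python sketch. It is not the Expansion Algorithm and it is not the actual content of test 27: `expand_values` and `expand_values_as_set` are hypothetical helpers, and the input array is made up for illustration. The first helper keeps every entry, which is the behaviour the test description asks for; the second shows what an implementation backed by a set data structure would produce instead.

```python
def expand_values(values):
    """Wrap each scalar in a value object, keeping duplicates (multiset behaviour)."""
    return [{"@value": v} for v in values]


def expand_values_as_set(values):
    """Same wrapping, but drop repeated scalars first, as a true set would."""
    seen = set()
    unique = []
    for v in values:
        if v not in seen:
            seen.add(v)
            unique.append(v)
    return [{"@value": v} for v in unique]


values = [1, 2, 2, 3]  # made-up input, not copied from the test suite
kept = expand_values(values)
merged = expand_values_as_set(values)
print(len(kept), len(merged))  # 4 3
```

The two results differ only in the repeated entry, which is exactly the point where a set-based optimization would diverge from the expected test output.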

Note that I understand that @set is just syntactic sugar for associating multiple values with a subject property. Forcing an implementation to keep duplicate values in @set objects implies that duplicate RDF triples would have some kind of meaning, which they do not, since an RDF graph is a set of triples, not a multiset.
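The argument about RDF can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the IRIs are hypothetical, and literals are modelled as plain Python integers rather than typed RDF literals. Emitting one triple per array entry yields a repeated triple, and placing the triples in a set, which is how an RDF graph is defined, collapses it.

```python
subject = "http://example.org/id"       # hypothetical IRI
predicate = "http://example.com/myset"  # hypothetical IRI
values = [1, 2, 2, 3]                   # made-up @set contents

# One triple per array entry, duplicates included.
emitted = [(subject, predicate, v) for v in values]

# An RDF graph is a set of triples, so identical triples collapse.
graph = set(emitted)

print(len(emitted), len(graph))  # 4 3
```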

Am I correct in my conclusion, or is there something wrong with expansion test 27? Or did I miss the explanation in the spec? Or is it just to avoid further complications (like what we should do if there is a duplicate occurrence of the same node object, with the same id but not the same properties, etc.)?

@gkellogg
Member

The word set is, indeed, something of a misnomer. Really, the distinction is between order-preserving and non-order-preserving arrays (or multiset, as you note). A true set could not allow duplicates.

Although non-list arrays of JSON values do maintain their distinctness in the JSON-LD realm, they do not round-trip through RDF, and strictly speaking, this does not really conform to the RDF Graph model even in the pure JSON-LD realm. But, compatibility with JSON-LD 1.0 restricts our ability to do anything here, other than perhaps to add such a disclaimer.

I'd defer to a future version any change to the semantics of @set (explicit or implied), but that's for the group to discuss.

Am I correct in my conclusion or is there something wrong with the expansion test 27? Or did I miss the explanation in the spec? Or is it just to avoid further complications (like, what should we do if there is a duplicate occurrence of the same node object, with the same id but not the same properties, etc.)

No, not given the current algorithms: test 27 ensures that duplicate values are not dropped, although it is reasonable to infer that perhaps they should be. Otherwise, step 5.2.3 of the Expansion Algorithm would need to say something about looking for duplicates.

However, note that the Compaction Algorithm does look for duplicates, so re-compacting that value would likely remove them.
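As a rough illustration of what such a duplicate check could look like, here is a Python sketch. It is not the text of the Compaction Algorithm or of any existing implementation: `canonical` and `add_unique` are hypothetical helpers, and comparing values by their key-sorted JSON serialization is an assumption chosen so that `1`, `1.0`, and `true` stay distinct, which plain Python equality would not guarantee.

```python
import json


def canonical(value):
    """Serialize a JSON value with sorted keys so equal objects compare equal."""
    return json.dumps(value, sort_keys=True)


def add_unique(target, value):
    """Append value to target unless a structurally equal value is already there."""
    key = canonical(value)
    if all(canonical(existing) != key for existing in target):
        target.append(value)


# Made-up expanded values containing one duplicate.
expanded = [{"@value": 1}, {"@value": 2}, {"@value": 2}, {"@value": 3}]

result = []
for item in expanded:
    add_unique(result, item)

print(len(expanded), len(result))  # 4 3
```

Order is preserved and only structurally identical entries are dropped, so two node objects with the same id but different properties would both survive under this sketch.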

@dlongley thoughts?

@azaroth42
Contributor

Add editorial note that expansion does not eliminate duplicates.
Future version might make it a real set without duplicates.

@iherman
Member

iherman commented Apr 24, 2020

This issue was discussed in a meeting.

  • RESOLVED: Make editorial change to note that @set terms can have duplicate entries during expansion (only)
Transcript:
Gregg Kellogg: One issue (#480) is a bit different: @id: null is not a valid JSON-LD document, which has to do with the fact that we ignore keyword-like entries. A step in the algorithm has issues with this. So this would be more complicated.
Rob Sanderson: #473
Gregg Kellogg: One of the differences with set versus array is with duplicates. If you compact, there are no duplicates. In expansion, there is nothing that eliminates duplicates. This issue shows something that we may want to look at.
… In a future version, this may require an incompatible change.
… The editorial change is that expansion does not remove duplicates.
Rob Sanderson: We can do this before the document is frozen.
Proposed resolution: Make editorial change to note that @set terms can have duplicate entries during expansion (only) (Rob Sanderson)
Rob Sanderson: +1
Gregg Kellogg: +1
Ruben Taelman: +1
Benjamin Young: +1
Pierre-Antoine Champin: +1
Ivan Herman: +1
Resolution #1: Make editorial change to note that @set terms can have duplicate entries during expansion (only)
David I. Lehn: +1

gkellogg added a commit that referenced this issue Apr 27, 2020
@gkellogg
Member

@timothee-haudebourg this is addressed in #481. Note that this does not resolve the issue, which will remain deferred.

gkellogg added the defer-future-version and wr:spec-updated-partial labels and removed the wr:open label Apr 27, 2020
gkellogg added a commit that referenced this issue Apr 29, 2020
pchampin moved this to Future Work in JSON-LD Management May 15, 2024