Framing: expected behavior for @explicit/@embed/@default #641

Closed
pjohnston-wiley opened this issue Apr 20, 2018 · 4 comments
Labels
defer Issue deferred to future Working Group framing

Comments

@pjohnston-wiley
Contributor

pjohnston-wiley commented Apr 20, 2018

Apologies in advance if this turns out to be covered by the spec, but I was unable to determine it from the latest draft or the issues on GitHub. In summary, I am trying to understand how the various switches in framing can be used at different levels to control output, in particular how @explicit and @embed are supposed to interact with each other. I appear to be running into some inconsistencies, and I would like to understand whether these are bugs in the current implementation or whether I am missing the intent. Thanks for your patience.

I have been mostly testing with pyld (I have large files to process, and I would be happy to have a go at fixing it there), though I can reproduce the behavior in both the dev and stable playgrounds. This example isolates the behavior.

In expanded form, it looks like this (minus the context):

{
  "@context": {...},
  "@graph": [
    {
      "label": "Some property.",
      "@id": "ex:knows",
      "@type": "ObjectProperty",
      "domain": "ex:Flinstone",
      "range": "_:1"
    },
    {
      "label": "Rubble class.",
      "@id": "ex:Rubble",
      "@type": "Class",
      "termStatus": "stable"
    },
    {
      "label": "Flintstone class.",
      "@id": "ex:Flintstone",
      "@type": "Class"
    },
    {
      "@id": "_:1",
      "@type": "Class",
      "unionOf": {
        "@list": [
          {
            "@id": "ex:Flintstone"
          },
          {
            "@id": "ex:Rubble"
          }
        ]
      }
    }
  ]
}

By way of context, the exercise is rendering OWL ontologies in a framed form to be used to build documentation. The case to examine here is that the domain and range of a property can be either a single Class or a union of classes. The latter manifests as a blank node with a unionOf property holding an ordered list of class ids. In plain English, here I am saying that Flintstones can know either other Flintstones or Rubbles.

To deal with this scenario, the domain and range entries in my frame are identical and look like this:

"range": {
  "@default": "Thing",
  "@explicit": true,
  "@embed": "@always",
  "unionOf": {
    "@embed": "@never"
  }
},

with the intent being that if the property (1) references just a single class, it will look like:

"range": "ex:Flintstone",

if it (2) references a union of classes it will look like:

"range": {"unionOf": ["ex:Flintstone", "ex:Rubble"]},

and (3) if the property is absent it will have a range of all possible classes:

"range": "Thing",

What I am seeing in the link above is that cases (2) and (3) work fine, but case (1) fails and renders the default, as in (3).

Is this expected behavior?
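
To pin down what I mean by the three cases, here is a minimal sketch of the intended result as a plain function, independent of any JSON-LD library. It describes the desired output only, not the framing algorithm; the names `render_range` and `nodes` and the flat node-map layout are assumptions made for illustration.

```python
def render_range(prop, nodes, default="Thing"):
    """Return the intended framed value for prop["range"].

    prop  -- a flattened property node, e.g. {"@id": "ex:knows", "range": "_:1"}
    nodes -- hypothetical map from node id to flattened node
    """
    ref = prop.get("range")
    if ref is None:
        # Case 3: the property is absent, so fall back to the default.
        return default
    target = nodes.get(ref, {})
    if "unionOf" in target:
        # Case 2: a union of classes, keeping only the member ids.
        members = target["unionOf"]["@list"]
        return {"unionOf": [m["@id"] for m in members]}
    # Case 1: a single class, referenced by id only.
    return ref


nodes = {
    "ex:Flintstone": {"@id": "ex:Flintstone", "@type": "Class"},
    "ex:Rubble": {"@id": "ex:Rubble", "@type": "Class"},
    "_:1": {
        "@id": "_:1",
        "@type": "Class",
        "unionOf": {"@list": [{"@id": "ex:Flintstone"}, {"@id": "ex:Rubble"}]},
    },
}

print(render_range({"@id": "ex:a", "range": "ex:Flintstone"}, nodes))
print(render_range({"@id": "ex:knows", "range": "_:1"}, nodes))
print(render_range({"@id": "ex:b"}, nodes))
```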

By way of comparison, I can set @explicit to false:

"range": {
  "@default": "Thing",
  "@explicit": false,
  "@embed": "@always"
},

but that gives me a lot of unwanted information (other properties on the class that I had wanted to omit), and I then cannot control the output of the unionOf statement (see here):

      "domain": "ex:Flinstone",
      "range": {
        "@id": "_:b0",
        "@type": "Class",
        "unionOf": [
          {
            "@id": "ex:Flintstone",
            "@type": "Class",
            "label": "Flintstone class."
          },
          {
            "@id": "ex:Rubble",
            "@type": "Class",
            "termStatus": "stable",
            "label": "Rubble class."
          }
        ]
      },

There does appear to be a bug: if the term's @type is @vocab and the property is absent, the defaulting mechanism renders incorrectly. Compare ex:termStatus with ex:supersedes in the referenced example: it renders ex:termStatus rather than the alias termStatus. I suspect framing is compacting the output prior to defaulting.

Regards,
Patrick Johnston

@gkellogg
Member

The reason this is not doing what you expect is that, because both domain and range include unionOf, the value does not match unless it includes unionOf.

If you leave out unionOf, it will match, but you have lost the opportunity to specify @embed.

Although the playground does not presently allow you to specify options such as the require all flag, the API allows you to do so; setting requireAll to false should allow it to match. This requires that there be at least one thing to match the frame, so adding "@id": {} to the domain and range bits should allow anything to match.
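
As a rough mental model of the matching rule described here, the following sketch checks a node against a frame on property presence alone. It is not the spec algorithm and not any implementation's code: the function name `matches` is hypothetical, and property values, `@type` matching, and every keyword other than the `"@id": {}` wildcard are deliberately ignored.

```python
def matches(node, frame, require_all):
    """Simplified frame matching based only on which properties are present."""
    checks = []
    for key, value in frame.items():
        if key == "@id" and value == {}:
            # Wildcard: satisfied by any node that has an @id.
            checks.append("@id" in node)
        elif not key.startswith("@"):
            # Ordinary property: satisfied when the node has that property.
            checks.append(key in node)
        # Other keywords (@embed, @explicit, @default, ...) are treated as
        # flags in this sketch, not as match patterns.
    if not checks:
        return True  # nothing to match against, so everything matches
    return all(checks) if require_all else any(checks)


single = {"@id": "ex:Flintstone", "@type": "Class"}
union = {"@id": "_:1", "@type": "Class", "unionOf": {"@list": []}}

frame = {"@explicit": True, "@embed": "@always", "unionOf": {"@embed": "@never"}}
loose = {**frame, "@id": {}}

print(matches(single, frame, require_all=True))   # False: no unionOf on the node
print(matches(union, frame, require_all=True))    # True
print(matches(single, loose, require_all=False))  # True: the @id wildcard matches
```

Under this model, the single-class value only matches once both changes are in place: the wildcard gives it something to match, and turning off require-all stops the missing unionOf from vetoing the match.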

I note that my implementation does this, but the playground generates different results. Using the following frame:

{
  "@context": {
    "@language": "en",
    "ex": "http://example.com/",
    "termStatus": {
      "@id": "ex:termStatus",
      "@type": "@vocab"
    },
    "unstable": "ex:unstable",
    "stable": "ex:stable",
    "termSupersedes": {
      "@id": "ex:termSupersedes",
      "@type": "@id"
    },
    "label": "rdfs:label",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "AnnotationProperty": "owl:AnnotationProperty",
    "Class": "owl:Class",
    "comment": "rdfs:comment",
    "DatatypeProperty": "owl:DatatypeProperty",
    "definition": {
      "@id": "skos:definition",
      "@language": "en"
    },
    "domain": {
      "@id": "rdfs:domain",
      "@type": "@id"
    },
    "ObjectProperty": "owl:ObjectProperty",
    "range": {
      "@id": "rdfs:range",
      "@type": "@id"
    },
    "ConceptScheme": "skos:ConceptScheme",
    "schemeOf": {
      "@container": "@set",
      "@reverse": "skos:inScheme"
    },
    "subClassOf": {
      "@id": "rdfs:subClassOf",
      "@type": "@id"
    },
    "subPropertyOf": {
      "@id": "rdfs:subPropertyOf",
      "@type": "@id"
    },
    "Thing": "owl:Thing",
    "Nothing": "owl:Nothing",
    "unionOf": {
      "@container": "@list",
      "@id": "owl:unionOf",
      "@type": "@id"
    }
  },
  "@graph": [
    {
      "@explicit": true,
      "@embed": "@last",
      "@type": [
        "Class",
        "ObjectProperty",
        "DatatypeProperty",
        "AnnotationProperty",
        "ConceptScheme"
      ],
      "description": {},
      "definition": {},
      "schemeOf": {
        "@explicit": true,
        "value": {},
        "definition": {}
      },
      "termStatus": {
        "@embed": "@never",
        "@default": "unstable"
      },
      "subClassOf": {
        "@default": "Thing",
        "@explicit": true,
        "@embed": "@always",
        "unionOf": {
          "@embed": "@never"
        }
      },
      "termSupersedes": {
        "@embed": "@never"
      },
      "domain": {
        "@id": {},
        "@default": "Thing",
        "@explicit": true,
        "@embed": "@always",
        "@requireAll": false,
        "unionOf": {
          "@embed": "@never"
        }
      },
      "range": {
        "@id": {},
        "@default": "Thing",
        "@explicit": true,
        "@embed": "@always",
        "@requireAll": false,
        "unionOf": {
          "@embed": "@never"
        }
      },
      "subPropertyOf": {
        "@embed": "@never"
      }
    }
  ]
}

I get the following output:

{
  "@context": {
    "@language": "en",
    "ex": "http://example.com/",
    "termStatus": {
      "@id": "ex:termStatus",
      "@type": "@vocab"
    },
    "unstable": "ex:unstable",
    "stable": "ex:stable",
    "termSupersedes": {
      "@id": "ex:termSupersedes",
      "@type": "@id"
    },
    "label": "rdfs:label",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "AnnotationProperty": "owl:AnnotationProperty",
    "Class": "owl:Class",
    "comment": "rdfs:comment",
    "DatatypeProperty": "owl:DatatypeProperty",
    "definition": {
      "@id": "skos:definition",
      "@language": "en"
    },
    "domain": {
      "@id": "rdfs:domain",
      "@type": "@id"
    },
    "ObjectProperty": "owl:ObjectProperty",
    "range": {
      "@id": "rdfs:range",
      "@type": "@id"
    },
    "ConceptScheme": "skos:ConceptScheme",
    "schemeOf": {
      "@container": "@set",
      "@reverse": "skos:inScheme"
    },
    "subClassOf": {
      "@id": "rdfs:subClassOf",
      "@type": "@id"
    },
    "subPropertyOf": {
      "@id": "rdfs:subPropertyOf",
      "@type": "@id"
    },
    "Thing": "owl:Thing",
    "Nothing": "owl:Nothing",
    "unionOf": {
      "@container": "@list",
      "@id": "owl:unionOf",
      "@type": "@id"
    }
  },
  "@graph": [
    {
      "@id": "_:b0",
      "@type": "Class",
      "ex:termStatus": "unstable",
      "rdfs:domain": "Thing",
      "rdfs:range": "Thing",
      "rdfs:subClassOf": "Thing",
      "skos:definition": null,
      "subPropertyOf": null,
      "termSupersedes": null
    },
    {
      "@id": "ex:Flintstone",
      "@type": "Class",
      "ex:termStatus": "unstable",
      "rdfs:domain": "Thing",
      "rdfs:range": "Thing",
      "rdfs:subClassOf": "Thing",
      "skos:definition": null,
      "subPropertyOf": null,
      "termSupersedes": null
    },
    {
      "@id": "ex:Rubble",
      "@type": "Class",
      "rdfs:domain": "Thing",
      "rdfs:range": "Thing",
      "rdfs:subClassOf": "Thing",
      "skos:definition": null,
      "subPropertyOf": null,
      "termStatus": "stable",
      "termSupersedes": null
    },
    {
      "@id": "ex:knows",
      "@type": "ObjectProperty",
      "domain": {
        "@id": "ex:Flinstone",
        "owl:unionOf": null
      },
      "ex:termStatus": "unstable",
      "range": {
        "@id": "_:b0",
        "@type": "Class",
        "unionOf": [
          "ex:Flintstone",
          "ex:Rubble"
        ]
      },
      "rdfs:subClassOf": "Thing",
      "skos:definition": null,
      "subPropertyOf": null,
      "termSupersedes": null
    }
  ]
}

Perhaps not ideal, but better.

Any thoughts on how to improve framing or the documentation would be useful.

cc/ @dlongley

@pjohnston-wiley
Contributor Author

Thanks, that makes sense.

My mistake was assuming the playground would just use the defaults from the spec: "...if the value of the require all flag is false (the default)...". I checked pyld, and that also defaults to true, which explains why the results were consistent; perhaps this was the original implementation default? It would be good if the reference implementations and the documentation could converge here; I think the documentation has it right.

I am actually not quite sure I understand the benefit of having global options specified externally to the frame definition: it would be clearer if these were just set as processing instructions on the root frame, e.g.:

{
  "@context": ...,
  "@requireAll": false,
  ...
}

or if you don't want to pollute the basic JSON-LD vocabulary with stuff that will evolve over time:

{
  "@context": ...,
  "@options": {
     "requireAll": false
  },
  ...
}

I'd go even further and say that one should be able to specify requireAll on any node in the frame, much like the other framing instructions, though I guess the reason this was not done was to keep the frame spec simple? It just seems that if we are already recursing through things like @embed, it would be much the same to do so for requireAll, especially given how crucial it is to getting my example working.
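
In the meantime, the `@options` idea can be approximated on the caller's side. The sketch below is only an illustration of the proposal: `"@options"` is not part of JSON-LD, `split_frame_options` and `frame_with_inline_options` are hypothetical helpers, and the only library call assumed is pyld's `jsonld.frame(input, frame, options)`.

```python
def split_frame_options(frame):
    """Separate a hypothetical "@options" block from the frame proper.

    "@options" is NOT a JSON-LD keyword; it is the syntax proposed above.
    The input frame is left unmodified.
    """
    options = dict(frame.get("@options", {}))
    stripped = {k: v for k, v in frame.items() if k != "@options"}
    return stripped, options


def frame_with_inline_options(doc, frame):
    # Imported lazily so split_frame_options works without pyld installed.
    from pyld import jsonld  # third-party package (PyLD)

    stripped, options = split_frame_options(frame)
    return jsonld.frame(doc, stripped, options=options)


frame = {
    "@context": {"ex": "http://example.com/"},
    "@options": {"requireAll": False},
    "@type": "ex:Class",
}
stripped, options = split_frame_options(frame)
print(options)
print(sorted(stripped))
```

This only covers options on the root frame; it does nothing for per-node settings, which would need support inside the framing algorithm itself.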

I will say that the current documentation on framing is a vast improvement over what was there around 1.0 (and I get that it was more the start of an idea back then), so thank you to the folks who have spent the time to elaborate on it. Back then, I couldn't make head or tail of it and ended up writing what was essentially my own framing algorithm. This is what I am now experimenting with, to see whether it can be refactored to use framing. Where I have been struggling so far:

  1. How the switches interact with each other to achieve a desired result is not clear; in particular, whether the switches can be applied to any node or just the root was not obvious. Some of this is just a learning experience, but a more comprehensive framing example on the playground than the library one would perhaps help. It may be because the examples in the documentation all focus on pretty much the same use case (the contains predicate), which is good for continuity but not so good for learning.
  2. Blank nodes are the bane of my existence, much as they are for anyone working with RDF. I have played with pruneBlankNodeIdentifiers, which isn't all that useful since it focuses on single-use blank nodes. What I suspect a lot of folks will want is an option that eliminates the need for any blank node identifiers: in other words, selective embedding, or something like pruneBlankNodes. If I start with a compacted graph, the framing algorithm would check whether all references to a blank node have been embedded as a result of framing, and if so eliminate the blank node left on the root. As it is, I am finding I have to write some post-processing to achieve this kind of result. I am happy to work on specifying this more precisely if this is something you guys would entertain.
  3. The approach to framing assumes a very homogeneous graph, where all peer subjects are alike. It would be nice if, rather than invoking my least favorite and far-too-ubiquitous error message "a JSON-LD frame must be a single object", you could specify an array of things to match, which also makes for a more powerful filtering mechanism. Using the ontology example:
{
  "graph": [
    {"@type": "owl:ObjectProperty", ...},
    {"@type": "owl:Class", ...}
  ]
}
  4. In the implementations, if a property specified by the frame doesn't exist on a subject and no default is set, the output still populates the property (albeit without applying compaction; see my original comments) and sets it to null. I am not sure I understand why these are output at all: setting the property to null says it is defined where the source said no such thing. Especially given how inconsistently null JSON properties get interpreted by object deserializers, I would tend towards saying nothing rather than something that could be misinterpreted. At the very least, turning this off should be an option. The documentation mentions something called @null but then doesn't explain what it is; maybe that could be used to control when you want it output:
# Frame
{
  "@context": {... "knows": "ex:knows", "loves": "ex:loves"},
   "knows": {},
   "loves": { "@default": "@null"}
}

# Document
{
  "@type": "Flintstone"
}

# Current output (ignoring blank nodes):
{
  "@type": "Flintstone",
  "ex:knows": null,
  "ex:loves": null
}

# Ideally it would be:
{
  "@type": "Flintstone",
  "loves": null
}
  5. It would be really nice if there were some kind of negative frame, in other words a way to exclude results that have a specific property and/or value. This would be much less necessary with (3) and (4) in place.
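
For reference, the kind of post-processing I mean in (2) and (4) looks roughly like the sketch below. It runs on the framed `@graph` array as plain Python data and uses no JSON-LD library; the function names are my own, and the rule "a blank node counts as embedded if it appears below the top level with more than just an @id" is an assumption of this sketch, not anything from the spec.

```python
def drop_nulls(value):
    """Recursively remove properties whose value is None (JSON null)."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


def embedded_blank_ids(value, top_level=True):
    """Collect ids of blank nodes that appear embedded below the top level."""
    found = set()
    if isinstance(value, dict):
        node_id = value.get("@id", "")
        # A bare {"@id": "_:b0"} is only a reference, so require more keys.
        if not top_level and node_id.startswith("_:") and len(value) > 1:
            found.add(node_id)
        for child in value.values():
            found |= embedded_blank_ids(child, top_level=False)
    elif isinstance(value, list):
        for child in value:
            found |= embedded_blank_ids(child, top_level=top_level)
    return found


def prune_embedded_blank_nodes(graph):
    """Drop top-level blank nodes that are already embedded elsewhere."""
    embedded = embedded_blank_ids(graph)
    return [node for node in graph if node.get("@id") not in embedded]


graph = [
    {"@id": "_:b0", "@type": "Class", "termSupersedes": None},
    {
        "@id": "ex:knows",
        "@type": "ObjectProperty",
        "range": {
            "@id": "_:b0",
            "@type": "Class",
            "unionOf": ["ex:Flintstone", "ex:Rubble"],
        },
        "subPropertyOf": None,
    },
]

cleaned = prune_embedded_blank_nodes(drop_nulls(graph))
print(cleaned)
```

Note that `drop_nulls` removes every null, so it cannot tell an explicitly requested null from an incidental one; that distinction is exactly what would need support in the frame itself.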

Typo in the docs

In the main documentation, just above the introduction, the link to JSON-LD Framing in the set of documents points to the JSON-LD API rather than to framing.

A nod to RDFLib would be nice

I understand that the implementers started from first principles when building the internal graph representation for jsonld.js, but the Python port is not directly compatible with rdflib, which limits its usefulness when exploring this alongside prior W3C semantic web standards. In my case, I am doing a lot of RDF-level manipulation prior to serializing, meaning that framing is only a part of the puzzle. It wasn't the end of the world, as a nice fellow called Simeon Warner was kind enough to put together a basic mapping, rdflib_pyld_compat, which I have been able to adapt for my purposes. However, it would be good to include this or something like it in pyld, and I would be happy to help if so. Even for the JavaScript version, being able to tap into something, say, tinkerpop-compatible would be nice, though perhaps outside the scope of this group.

chz
p

@gkellogg
Member

Thanks for the feedback; much of this will likely wait for the WG to start up before any substantive work. I will get the reference updated, however.

Personally, I fully agree that pyld should more natively support rdflib, but that's really an issue to take up in https://github.com/digitalbazaar/pyld.

gkellogg added a commit that referenced this issue Apr 25, 2018
Fix biblio reference for JSON-LD11CG-FRAMING.

For #641.

gkellogg added the defer label (Issue deferred to future Working Group) and removed the question label Apr 25, 2018

@pjohnston-wiley
Contributor Author

Understood. I may get to a pull request on pyld once I've had things working for a while. I have more feedback for framing specifically, but I am happy to hold off until the WG kicks off.
