
Possible solution for non-JSON references support relying on schemaFormat #930

@derberg

This issue summarizes the idea mentioned during the Adding support for non-JSON schemas (April 18th 2023) meeting, which could be a good solution for using JSON Reference with non-JSON structures like Protobuf or XSD.

JSON Reference spec

The JSON Reference currently used in the AsyncAPI spec is JSON Reference v0.3.0.

In general, JSON Reference focuses on defining what a reference object should look like and requires only the following:

  • that it should have property $ref
  • that $ref is a string value of URI
  • that other properties in reference object must be ignored
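
A rough sketch of these three rules in code. The helper names `isReferenceObject` and `normalizeReference` are made up for this example and do not come from any AsyncAPI tool. The check also does not validate that the string is a well-formed URI.

```javascript
// Hypothetical helpers that only illustrate the three rules listed above.

// Rules 1 and 2: a reference object has a $ref property with a string value.
// NOTE: this sketch does not verify that the string is a valid URI.
function isReferenceObject(value) {
  return (
    value !== null &&
    typeof value === 'object' &&
    !Array.isArray(value) &&
    typeof value.$ref === 'string'
  );
}

// Rule 3: every property other than $ref is ignored, so only $ref is kept.
function normalizeReference(value) {
  if (!isReferenceObject(value)) {
    throw new TypeError('Not a JSON Reference object');
  }
  return { $ref: value.$ref };
}
```

Nothing in these rules says that the target of `$ref` has to be a JSON document.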

Key words used to describe requirements, like SHALL or SHOULD, follow https://datatracker.ietf.org/doc/html/rfc2119

Even though the JSON Reference says in the intro:

This specification defines a JSON [RFC4627] structure which allows a JSON value __to reference another value in a JSON document__

later in the spec, in the Resolution section, it uses only SHOULD:

Resolution of a JSON Reference object SHOULD yield the referenced JSON value

Also, for fragments that start after #, JSON Pointer is not required but recommended:

If the representation of the referrant document is JSON, then the fragment identifier SHOULD be interpreted as a [JSON-Pointer].

The conclusion is:

  • We can have a $ref pointing to .proto or .xsd for example
  • We can have a custom solution for fragment resolution in the case of .proto, for example, because JSON Pointer is just the recommended solution. So we can for example do the following:
    • For .proto, when there are nested types that need to be referenced, we can follow the standard Protobuf approach for pointing to nested types. So for $ref: "https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto", if someone wants to reference the FirstLevelNestedMessage type, the fragment would look like, probably 😄, #NestedMessages.FirstLevelNestedMessage. So the final reference would be:
      {
          "$ref": "https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto#NestedMessages.FirstLevelNestedMessage"
      }
    • For .xsd it is much easier, as XPath is already in place and allows creating a fragment that points to a specific part of the XML/XSD. This requires further exploration and some input from experts, but in theory, if we have the https://www.w3.org/2001/XMLSchema schema, the XPath fragment for the formChoice simple type would probably be xs:schema/xs:simpleType[@name='formChoice'], so the resulting reference is something like:
      {
          "$ref": "https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']"
      }
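
To make the two fragment styles above more concrete, here is a minimal sketch of how a tool could split such a `$ref` into the document URL and the fragment, and then turn a Protobuf fragment into a path of nested type names. The helper names `splitRef` and `protoTypePath` are hypothetical, the `example.com` URL is a placeholder, and the dot-separated convention is only the one proposed in this issue, not something defined by Protobuf tooling or by the spec.

```javascript
// Hypothetical helper: split a $ref value at the first '#'.
// Everything before it is the document URL, everything after it is the fragment.
function splitRef(ref) {
  const hashIndex = ref.indexOf('#');
  if (hashIndex === -1) {
    return { document: ref, fragment: '' };
  }
  return {
    document: ref.slice(0, hashIndex),
    fragment: ref.slice(hashIndex + 1),
  };
}

// Assumed convention from this issue: a Protobuf fragment is a
// dot-separated path of (nested) message type names.
function protoTypePath(fragment) {
  return fragment === '' ? [] : fragment.split('.');
}

// Placeholder URL, used only to keep the example short.
const protoRef = splitRef(
  'https://example.com/schemas/example.proto#NestedMessages.FirstLevelNestedMessage'
);
const typePath = protoTypePath(protoRef.fragment);
// typePath is ['NestedMessages', 'FirstLevelNestedMessage']

// For XSD the fragment is kept as-is and would be handed to an XPath engine.
const xsdRef = splitRef(
  "https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']"
);
// xsdRef.fragment is "xs:schema/xs:simpleType[@name='formChoice']"
```

The split is done by hand on the first `#` so that characters like `[`, `@` and `'` in the XPath fragment are passed through untouched.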

Now the question is how to specify, on a spec level, that dereferencing for a given $ref should not use JSON Pointer but XPath or some other solution.

Solutions on a spec level

schemaFormat based

As discussed during the Adding support for non-JSON schemas (April 18th 2023) meeting, we can say that the dereferencing mechanism depends on the schemaFormat.

  • If schemaFormat specifies AsyncAPI Schema, JSON Schema, Avro or any other JSON structure, we follow JSON Reference + JSON Pointer in fragments
  • If schemaFormat specifies Protobuf, we follow JSON Reference + Protobuf Nested Types Reference in fragments
  • If schemaFormat specifies XSD, we follow JSON Reference + XPath in fragments
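
The three rules above boil down to a lookup from schemaFormat to a fragment syntax. Below is a minimal sketch of that lookup. The function name `fragmentSyntaxFor` and the syntax labels are hypothetical, and the two media types in the table are assumptions used only for illustration: which schemaFormat values would stand for Protobuf and XSD is exactly what the spec would have to define. Falling back to JSON Pointer for everything else is also an assumption of this sketch.

```javascript
// Labels for the fragment syntaxes named in the list above (made up for this sketch).
const FragmentSyntax = Object.freeze({
  JSON_POINTER: 'json-pointer',
  PROTOBUF_NESTED_TYPE: 'protobuf-nested-type',
  XPATH: 'xpath',
});

// ASSUMPTION: these media types are illustrative only. The spec would have
// to say which schemaFormat values mean Protobuf and which mean XSD.
const NON_JSON_FORMATS = new Map([
  ['application/vnd.google.protobuf', FragmentSyntax.PROTOBUF_NESTED_TYPE],
  ['application/xml', FragmentSyntax.XPATH],
]);

function fragmentSyntaxFor(schemaFormat) {
  // Drop parameters such as ";version=3" and normalize the case.
  const mediaType = schemaFormat.split(';')[0].trim().toLowerCase();
  // ASSUMPTION: anything not listed above is a JSON structure
  // (AsyncAPI Schema, JSON Schema, Avro, ...) and uses JSON Pointer.
  return NON_JSON_FORMATS.get(mediaType) || FragmentSyntax.JSON_POINTER;
}
```

With this in place, `fragmentSyntaxFor('application/vnd.apache.avro;version=1.9.0')` gives the JSON Pointer label, while the assumed Protobuf media type gives the nested-type label.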

Alternative

Something I mentioned before somewhere in an issue or on Slack, but I can't find the reference.

We can just follow the example from the JSON reference resolver that we use, which explains how custom resolvers work -> https://apitools.dev/json-schema-ref-parser/docs/plugins/resolvers.html. For example, they had a use case where someone keeps schemas in MongoDB and wants to reference the database directly. The solution is that the $ref value starts with "mongodb://", because plugins can detect that with something like canRead: /^mongodb:/i.

So we can do:

  • XSD like "$ref": "xsd|https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']"
  • Proto like "$ref": "proto|https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto#NestedMessages.FirstLevelNestedMessage"
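
A small sketch of how such prefixed values could be detected, in the same spirit as the `canRead: /^mongodb:/i` example. The `xsd|` and `proto|` markers are only the ones proposed above, and the names `parsePrefixedRef`, `canReadXsd` and `canReadProto` are hypothetical. Whether json-schema-ref-parser would pass a value like `xsd|https://...` to a resolver unchanged is not something I verified.

```javascript
// Patterns in the style of the canRead example from the resolver docs.
const canReadXsd = /^xsd\|/i;
const canReadProto = /^proto\|/i;

// Matches "<marker>|<rest>" where marker is one of the proposed prefixes.
const PREFIXED_REF = /^(xsd|proto)\|(.+)$/i;

// Hypothetical helper: separate the marker from the actual reference.
function parsePrefixedRef(ref) {
  const match = PREFIXED_REF.exec(ref);
  if (!match) {
    // No marker: treat it as a regular JSON Reference.
    return { kind: 'default', target: ref };
  }
  return { kind: match[1].toLowerCase(), target: match[2] };
}

const parsed = parsePrefixedRef(
  "xsd|https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']"
);
// parsed.kind is 'xsd', parsed.target is the reference without the marker
```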

Nevertheless, I'm sharing this just for reference, because when I thought about it I did not take schemaFormat into consideration. So relying on schemaFormat is better imho.

Tooling

json-schema-ref-parser that we use is pretty flexible and supports custom resolvers.

Writing them is not hard, and neither is using them. So, again in theory, in the current JS Parser we can do $RefParser.dereference(mySchema, { resolve: { proto: ourProtoResolver }}); whenever we encounter a proto reference. This will of course require some refactoring in the Parser, as dereferencing currently happens once, globally, for the entire AsyncAPI document, while in the new approach it should be done one by one, applying a different resolver depending on schemaFormat.
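
The one-by-one approach could look roughly like the sketch below. To keep it free of dependencies, the `dereference` argument stands in for `$RefParser.dereference` and is passed in by the caller. The names `optionsFor` and `dereferenceBySchemaFormat`, the shape of the `schemas` list and the `resolversByFormat` table are all hypothetical and not part of the current Parser.

```javascript
// Pick the options for one schema. `resolversByFormat` maps a schemaFormat
// value to the `resolve` option for that format, for example
// { proto: ourProtoResolver }. Unknown formats get default options.
function optionsFor(schemaFormat, resolversByFormat) {
  const resolve = resolversByFormat[schemaFormat];
  return resolve ? { resolve } : {};
}

// Dereference every schema on its own instead of once for the whole document.
// `schemas` is a list of { schemaFormat, schema } pairs (assumed shape).
// `dereference` is a function (schema, options) => Promise, a stand-in for
// $RefParser.dereference.
async function dereferenceBySchemaFormat(schemas, dereference, resolversByFormat) {
  const results = [];
  for (const { schemaFormat, schema } of schemas) {
    const options = optionsFor(schemaFormat, resolversByFormat);
    results.push(await dereference(schema, options));
  }
  return results;
}
```

With the real library, the second argument could be something like `(schema, options) => $RefParser.dereference(schema, options)`, and the table would hold one entry per non-JSON schemaFormat.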


I think I covered the entire discussion and summary. Lemme know if something is missing.

pinging meeting participants: @GreenRover @jonaslagoni @fmvilas
