[uss_qualifier] Standardize resource file loading #860


Merged

Conversation

BenjaminPelletier (Member):

This PR standardizes how data files are loaded in uss_qualifier. Previously, there were a number of different places that loaded YAML files or JSON files or both, sometimes supporting web links, sometimes not. This PR uses a single library (fileio.py) to load data files wherever practical within uss_qualifier and it supports:

  • JSON or YAML
  • Local files, web files, or package-based files
  • $ref and allOf similar to OpenAPI so that parts of other files can be referenced in the file of interest
  • All combinations of the above

This PR also makes all package paths relative to uss_qualifier -- previously, test suites were relative to uss_qualifier.suites, configurations relative to uss_qualifier.configurations, resources relative to uss_qualifier.resources, etc. This reduces the need for context to understand what configuration files, test suite definitions, etc are declaring, and makes reuse of standard loading procedures easier.

barroco (Contributor) left a comment:

Please see my comments before merging.


if file_name is None:
    raise NotImplementedError(
        f"Cannot load find a suitable file to load for {data_file}"
    )
Contributor:

Typo: Cannot --load--

BenjaminPelletier (Member Author):

Thanks, done.



def _drop_allof(content: dict) -> None:
    if "allOf" in content:
barroco (Contributor):

Why do we need to drop allOf keys? Seems like the intention is to flatten it in case of multiple $ref children, however, it seems that the fileio library may be used for files which will include allOf keys with other objects.

BenjaminPelletier (Member Author):

My understanding of how OpenAPI references work is:

Say we have some files:

a.json

{"foo": 123}

b.json

{"bar": 456}

c.json

{"baz": 789, "$ref": "a.json"}

The $ref in OpenAPI "works by replacing itself and everything on its level with the definition it is pointing at"[1], but that replacement behavior is actually a really bad idea in certain cases, so this implementation instead works by removing itself and overwriting all keys on its level with the keys in the definition it is pointing at. So, strict OpenAPI would probably render c.json as {"foo": 123} (assuming $ref were allowed at that location), but this more forgiving approach renders c.json as {"baz": 789, "foo": 123}. Here, $ref lets us combine local content with referenced content. But what if we wanted to add the content of both a.json and b.json to c.json? Consider:

d1.json

{"baz": 789, "$ref": "a.json", "$ref": "b.json"}

It turns out RFC 8259 doesn't actually prohibit this, but ECMA-262 says the first-level parser will see {"baz": 789, "$ref": "b.json"} because last value wins there. So, this isn't a good way to use the content of both a.json and b.json in c.json. This is, in my understanding, the main purpose of allOf. With allOf, we can rewrite d1.json:

d2.json

{"baz": 789, "allOf": [{"$ref": "a.json"}, {"$ref": "b.json"}]}

When resolving d2.json, first we can evaluate the $refs:

d2.json, $refs resolved

{"baz": 789, "allOf": [{"foo": 123}, {"bar": 456}]}

The purpose of _drop_allof is to apply the multiple schemas in allOf to the parent schema (to "drop" them a level down to the parent). Dropping the two allOf schemas in the $ref-resolved d2.json above results in:

d2.json, allOfs dropped

{"baz": 789, "foo": 123, "bar": 456}
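The resolution walked through above can be sketched as follows. This is a simplified, in-memory illustration (dict values only, refs looked up in a table rather than loaded from files), not the actual fileio.py implementation:

```python
# Simplified sketch of $ref/allOf resolution as described above; the real
# implementation also fetches local, web, and package-based files.
def resolve(content: dict, documents: dict) -> dict:
    """Resolve $ref and allOf in content, looking refs up in documents."""
    result = {}
    for key, value in content.items():
        if key in ("$ref", "allOf"):
            continue
        result[key] = resolve(value, documents) if isinstance(value, dict) else value
    if "$ref" in content and isinstance(content["$ref"], str):
        # $ref removes itself and overwrites same-level keys with the
        # keys of the referenced definition
        result.update(resolve(documents[content["$ref"]], documents))
    if "allOf" in content and isinstance(content["allOf"], list):
        # allOf "drops" each element's keys down into the parent
        for element in content["allOf"]:
            result.update(resolve(element, documents))
    return result

documents = {"a.json": {"foo": 123}, "b.json": {"bar": 456}}
c = {"baz": 789, "$ref": "a.json"}
d2 = {"baz": 789, "allOf": [{"$ref": "a.json"}, {"$ref": "b.json"}]}
print(resolve(c, documents))   # {'baz': 789, 'foo': 123}
print(resolve(d2, documents))  # {'baz': 789, 'foo': 123, 'bar': 456}
```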

The reason allOf does this relates to the fact that OpenAPI is based on JSON Schema which is designed merely to validate incoming data, not to express a data format. This lineage is most apparent in the "description at the same level as $ref" issue in OpenAPI -- from JSON Schema's perspective, that description is irrelevant because incoming data content is validated against the criteria in the $ref content and a usage of a data type is equivalent to the data type itself (saying "this field is an Address" is the same thing as saying "this field is Address").

allOf means "the incoming data object must validate against all of the specified conditions", so JSON Schema is expecting something like "allOf": [{"$ref": "#/HasValidPhoneNumber"}, {"$ref": "#/HasValidAddress"}]. The contents of HasValidPhoneNumber are probably something like "the field phone_number is a string with between 7 and 12 characters", so we often name that set of validation criteria something like PhoneNumber rather than HasValidPhoneNumber. So, the expectation would actually be written more like "allOf": [{"$ref": "#/PhoneNumber"}, {"$ref": "#/Address"}].

With that in mind, here's an example OpenAPI spec:

openapi.json

{
  "openapi": "3.1.0",
  "info": {
    "title": "Validator inheritance demo",
    "version": "1.0.0"
  },
  "components": {
    "schemas": {
      "Address": {
        "type": "object",
        "required": ["address"],
        "properties": {
          "address": {"type": "string"}
        }
      },
      "PhoneNumber": {
        "type": "object",
        "required": ["phone_number"],
        "properties": {
          "phone_number": {"type": "string"}
        }
      },
      "CustomerByRef": {
        "allOf": [
          {"$ref": "#/components/schemas/Address"},
          {"$ref": "#/components/schemas/PhoneNumber"}
        ]
      },
      "CustomerByValue": {
        "type": "object",
        "required": ["address", "phone_number"],
        "properties": {
          "address": {"type": "string"},
          "phone_number": {"type": "string"}
        }
      }
    }
  },
  "paths": {
    "/by_ref": { "get": { "responses": { "200": {
            "content": {
              "application/json": {
                "schema": {"$ref": "#/components/schemas/CustomerByRef"}
              }
            },
            "description": "N/A"
    } } } },
    "/by_value": { "get": { "responses": { "200": {
            "content": {
              "application/json": {
                "schema": {"$ref": "#/components/schemas/CustomerByValue"}
              }
            },
            "description": "N/A"
    } } } }
  }
}

In openapi.json, the return format for both paths is the same -- the schema components referenced by allOf get combined into a single schema that is the same as adding the content of each of the $refs. The implementation in this PR will not actually load OpenAPI specs correctly -- for instance, the required properties need to be the sum of both $ref'd schemas' required properties. But, it should produce the same spiritual result: allOf combines the content of its elements into its parent. It wouldn't make sense to correctly load OpenAPI in load_dict because load_dict is for loading concrete content whereas OpenAPI describes content formats.

All I really want here is the ability to inherit content from other files/locations. Since neither JSON nor YAML has a standard means to do this, I tried to use an approach as similar as possible to something repo users are most likely to already be familiar with. So, this is a custom adaptation of the parts of the OpenAPI spec relating to content inheritance, applied to concrete data objects, so some concepts like allOf have to be adapted to fit the use case.

barroco (Contributor):

Thanks for the context. I was actually referring to the fact that all objects with allOf keys (even those without a $ref) will be adapted. Since the library is called fileio, as a user of this method I would not expect the content of the file to be adapted. May I suggest either renaming the method to something like load_normalized_dict/load_flatten_dict or, as you proposed, limiting drops to allOf objects containing $refs?

BenjaminPelletier (Member Author):

Sounds good -- I've made changes to:

  1. Only perform reference-import operations on "$ref": <str> entries (e.g., "$ref": 123 would be ignored)
  2. Only do anything with allOf entries when the value is a list of objects, each with a properly-formed $ref in it
  3. Rename load_dict to load_dict_with_references (indicating the special functionality above)

@BenjaminPelletier BenjaminPelletier merged commit 21a225f into interuss:master Oct 26, 2022
@BenjaminPelletier BenjaminPelletier deleted the uss_qualifier/data-files branch October 26, 2022 19:53