-
Notifications
You must be signed in to change notification settings - Fork 89
[uss_qualifier] Standardize resource file loading #860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[uss_qualifier] Standardize resource file loading #860
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, see my comments before merging.
monitoring/uss_qualifier/fileio.py
Outdated
|
||
if file_name is None: | ||
raise NotImplementedError( | ||
f"Cannot load find a suitable file to load for {data_file}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: Cannot --load--
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done.
monitoring/uss_qualifier/fileio.py
Outdated
|
||
|
||
def _drop_allof(content: dict) -> None: | ||
if "allOf" in content: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to drop allOf
keys? Seems like the intention is to flatten it in case of multiple $ref children, however, it seems that the fileio library may be used for files which will include allOf
keys with other objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of how OpenAPI references work is:
Say we have some files:
a.json
{"foo": 123}
b.json
{"bar": 456}
c.json
{"baz": 789, "$ref": "a.json"}
The $ref
in OpenAPI "works by replacing itself and everything on its level with the definition it is pointing at"[1], but 1) that's actually a really bad idea in certain cases, so this implementation instead works by removing itself and overwriting all keys on its level with the keys in the definition it is pointing at. So, strict OpenAPI would probably render c.json
as {"foo": 123}
(assuming $ref
was allowed at that location), but this more forgiving approach would render c.json
as {"baz": 789, "foo": 123}
. So, here $ref
is allowing us to combine both local content and referenced content. But, what if we wanted to add the content of both a.json
and b.json
to c.json
? Consider:
d1.json
{"baz": 789, "$ref": "a.json", "$ref": "b.json"}
It turns out RFC 8259 doesn't actually prohibit this, but ECMA-262 says the first-level parser will see {"baz": 789, "$ref": "b.json"}
because last value wins there. So, this isn't a good way to use the content of both a.json
and b.json
in c.json
. This is, in my understanding, the main purpose of allOf
. With allOf
, we can rewrite d1.json
:
d2.json
{"baz": 789, "allOf": [{"$ref": "a.json"}, {"$ref": "b.json"}]}
When resolving d2.json
, first we can evaluate the $ref
s:
d2.json, $refs resolved
{"baz": 789, "allOf": [{"foo": 123}, {"bar": 456}]}
The purpose of _drop_allof
is to apply the multiple schemas in allOf
to the parent schema (to "drop" them a level down to the parent). Dropping the two allOf
schemas in the $ref
-resolved d2.json
above results in:
d2.json, allOfs dropped
{"baz": 789, "foo": 123, "bar": 456}
The reason allOf
does this relates to the fact that OpenAPI is based on JSON Schema which is designed merely to validate incoming data, not to express a data format. This lineage is most apparent in the "description
at the same level as $ref
" issue in OpenAPI -- from JSON Schema's perspective, that description
is irrelevant because incoming data content is validated against the criteria in the $ref
content and a usage of a data type is equivalent to the data type itself (saying "this field is an Address" is the same thing as saying "this field is Address"). allOf
means "the incoming data object must validate against all of the specified conditions", so JSON Schema is expecting something like "allOf": [{"$ref": "#/HasValidPhoneNumber"}, {"$ref": "#/HasValidAddress"}]
. The contents of HasValidPhoneNumber
are probably something like "the field phone_number
is a string with between 7 and 12 characters", so we often name that set of validation criteria something like PhoneNumber
rather than HasValidPhoneNumber
. So, the expectation would actually be written more like "allOf": [{"$ref": "#/PhoneNumber"}, {"$ref": "#/Address"}]
. With that in mind, here's an example OpenAPI spec:
openapi.json
{
"openapi": "3.1.0",
"info": {
"title": "Validator inheritance demo",
"version": "1.0.0"
},
"components": {
"schemas": {
"Address": {
"type": "object",
"required": ["address"],
"properties": {
"address": {"type": "string"}
}
},
"PhoneNumber": {
"type": "object",
"required": ["phone_number"],
"properties": {
"phone_number": {"type": "string"}
}
},
"CustomerByRef": {
"allOf": [
{"$ref": "#/components/schemas/Address"},
{"$ref": "#/components/schemas/PhoneNumber"}
]
},
"CustomerByValue": {
"type": "object",
"required": ["address", "phone_number"],
"properties": {
"address": {"type": "string"},
"phone_number": {"type": "string"}
}
}
}
},
"paths": {
"/by_ref": { "get": { "responses": { "200": {
"content": {
"application/json": {
"schema": {"$ref": "#/components/schemas/CustomerByRef"}
}
},
"description": "N/A"
} } } },
"/by_value": { "get": { "responses": { "200": {
"content": {
"application/json": {
"schema": {"$ref": "#/components/schemas/CustomerByValue"}
}
},
"description": "N/A"
} } } }
}
}
In openapi.json
, the return format for both paths is the same -- the schema components referenced by allOf
get combined into a single schema that is the same as adding the content of each of the $ref
s. The implementation in this PR will not actually load OpenAPI specs correctly -- for instance, the required
properties need to be the sum of both $ref
'd schemas' required
properties. But, it should produce the same spiritual result: allOf
combines the content of its elements into its parent. It wouldn't make sense to correctly load OpenAPI in load_dict
because load_dict
is for loading concrete content whereas OpenAPI describes content formats.
All I really want here is the ability to inherit content from other files/locations. Since neither JSON nor YAML has a standard means to do this, I tried to use an approach as similar to something repo users were most likely to already be familiar with. So, this is a custom adaptation of the parts of the OpenAPI spec relating to content inheritance, but as applied to concrete data objects, so some concepts like allOf
have to be adapted to fit the use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the context. I was actually referring to the fact that all objects with allOf
keys (even those without ref) will be adapted. Since the library is called fileio I would not expect as a developer of this method that the content of the file to be adapted. May I suggest to either rename the method with something like: load_normalized_dict
/load_flatten_dict
or, as you proposed, limit drops to allOf
objects containing refs ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good -- I've made changes to:
- Only perform reference-import operations on
"$ref": <str>
entries (e.g.,"$ref": 123
would be ignored) - Only do anything with
allOf
entries when the value is a list of objects, each with a properly-formed$ref
in it - Rename
load_dict
toload_dict_with_references
(indicating the special functionality above)
This PR standardizes how data files are loaded in uss_qualifier. Previously, there were a number of different places that loaded YAML files or JSON files or both, sometimes supporting web links, sometimes not. This PR uses a single library (
fileio.py
) to load data files wherever practical within uss_qualifier and it supports:$ref
andallOf
similar to OpenAPI so that parts of other files can be referenced in the file of interestThis PR also makes all package paths relative to uss_qualifier -- previously, test suites were relative to uss_qualifier.suites, configurations relative to uss_qualifier.configurations, resources relative to uss_qualifier.resources, etc. This reduces the need for context to understand what configuration files, test suite definitions, etc are declaring, and makes reuse of standard loading procedures easier.