-
Notifications
You must be signed in to change notification settings - Fork 6
Json Schema Draft 7 Validation for Kson #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Comments from the source JSON-Schema-Test-Suite tests were being inserted into the wrong place in our generated tests (into the schema Json of the test, making invalid Json). Fix by rendering them over the assertion.
Given the KsonApi, the TODO about Json Schema support, and access to running our `SchemaSuiteTest`, Claude 4 was able to design and implement this reasonable-looking approach to adding Json Schema support to Kson. Claude initially set up an object model for various types of schema it wanted to implement. When that was in place, I repeatedly prompted it with the following until it had fixed all the tests: ``` Can you help me get some of these failing tests passing? `./gradlew jvmTest --tests "org.kson.parser.json.generated.SchemaSuiteTest"` ``` It did a very impressive job of inspecting the failures, categorizing them, picking a high-value pattern to address, modifying the code to address that, then reporting back "X tests were failing, now only Y tests are failing" (Note: I interrupted it at one point to note that we were deferring support for `$ref`, so any test related to refs should be left on the SchemaSuite exclude list) This solution is ultimately the wrong object model (validations should not be directly associated with types, for instance), and due to this, hid some gaps in spite of getting the tests passing. And there is a LOT of code duplication here. Regardless: impressive first pass, and we're doing to refactor from there to a robust solution in subsequent commits.
Refactor the errors produced by the new schema validation subsystem to integrate with our parser's MessageSink strategy, properly logging the errors found as full `LoggedMessages` with location data embedded
The initial agent-assisted implementation of our Json Schema validation code not only created a complicated monolithic approach to comparing `KsonValue` objects for equality, but it also duplicated that approach a number of times. Greatly improve things by refactoring into proper equals and hashcode methods on our `KsonValue` classes.
More work towards reshaping the agent-generated JsonSchema support into something we fully own, understand and endorse. This commit fixes a bunch more duplication and provides very good insight into not only how this implementation actually works, but also how JsonSchema itself works (and where this impl diverges from that) The new JsonSchemaTest class demonstrate some of the gaps found. This commit addresses the problem in `JsonSchemaNumberTest`, but leaves `JsonSchemaTestType` failing to be fixed more holistically with a new object model for the validation subsystem
Fully implement of our improved Json Schema support design (modulo a few todos around interface polish and error messaging). This commit is larger that I usually like, but it is focused and mostly represents a refactor, so I won't artificially break it up. We now organize around schema validations rather than schema types. The schema `type` checker becomes just another validation which runs up front, and if it passes, then all all other validations that apply to the current types will be applied. This mirrors the actual Json Schema design much better, solving all problems detected in the previous implementation, including all the code duplication. SchemaParser now tries to parse every supported schema element exactly once, creates the appropriate validation classes, and organizes the validations into a JsonSchema class that can perform the parsed validations. The clarity and simplicity this brings to the implementation is very validating for this approach, and we are now well positioned to maintain this going forward.
Progress on making refined, helpful, and localizable error message for Json Schema validation
Claude 4 helped replace the placeholder SCHEMA_VALIDATION_ERROR messages with more refined, helpful, and localizable messages. It got cranky at the end and started making some unrelated changes and deleting some other MessageType declarations that were being used elsewhere, but mostly it was super helpful and understood the design goals here. Prompt: %% Everywhere in the @Schema dir that uses SCHEMA_VALIDATION_ERROR is a place where we would like to make more refined, localizable. Note how MessageType is organized to be parametrized in such a way that when we localize these messages, it will be trivial, as the only dynamic part of the messages are params that don't need to be localized. Do you see what I mean? Would you be able to create new MessageTypes to replace the too-informal "SCHEMA_VALIDATION_ERROR.create() messages? Please have a look at how the other MessageTypes are implemented for inspiration on how to implement your new messages. %%
Rename schema `node` params to `ksonValue`: Schemas validate KsonValues, but some of our props/params were called `node`, which, sure, but `ksonValue` is much more descriptive here. Also add/improve some docs around our Json Schema implementation while I'm thinking about code clarity
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request implements a (near) full Json Schema parser and validator into Kson. This only "(near) full" because:
We now support specifying a Json Schema to validate against as part of a Kson compile. Simple example:
Note that support for other Json Schema drafts is planned. We started with Draft 7 since there are many signals that it is the most widely used, and we are now well-positioned to expand support to other drafts. The fidelity of our implementation is confirmed against the excellent test suite provided by JSON-Schema-Test-Suite, which we integrated in #77
Note also that I've been using this particular project as something of a test for LLM coding agents—as a well-specified and well-scoped non-trivial project with an extensive test suite, it seemed ideal as a "real" project to test agentic coding tools. I was not yet able to coax any agentic coding systems into fully implementing this, but they (primary Claude 4) finally came close recently and were very helpful here. For more color on this story, the individual commits document some of the saga.