Skip to content

Surface the distinction between an explicit "null" value and an absent value #419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
madsodgaard opened this issue Dec 3, 2023 · 37 comments
Labels
area/generator Affects: plugin, CLI, config file. area/runtime Affects: the runtime library. kind/enhancement Improvements to existing feature.

Comments

@madsodgaard
Copy link

madsodgaard commented Dec 3, 2023

When using a schema that has a field from a reference that is both non-required and "nullable" this is a common way of defining it in OpenAPI: 3.1.0

UpdateTaskOccurrenceCompletionResponse:
      properties:
        old_task:
          $ref: '#/components/schemas/TaskOccurenceDTO'
        next_task:
          oneOf:
            - $ref: '#/components/schemas/TaskOccurenceDTO'
            - type: 'null'
      type: object
      required:
        - old_task

but this currently generates the following API:

/// - Remark: Generated from `#/components/schemas/UpdateTaskOccurrenceCompletionResponse`.
        public struct UpdateTaskOccurrenceCompletionResponse: Codable, Hashable, Sendable {
            /// - Remark: Generated from `#/components/schemas/UpdateTaskOccurrenceCompletionResponse/old_task`.
            public var old_task: Components.Schemas.TaskOccurenceDTO
            /// - Remark: Generated from `#/components/schemas/UpdateTaskOccurrenceCompletionResponse/next_task`.
            @frozen public enum next_taskPayload: Codable, Hashable, Sendable {
                /// - Remark: Generated from `#/components/schemas/UpdateTaskOccurrenceCompletionResponse/next_task/case1`.
                case TaskOccurenceDTO(Components.Schemas.TaskOccurenceDTO)
                /// - Remark: Generated from `#/components/schemas/UpdateTaskOccurrenceCompletionResponse/next_task/case2`.
                case case2(OpenAPIRuntime.OpenAPIValueContainer)
            }
            /// - Remark: Generated from `#/components/schemas/UpdateTaskOccurrenceCompletionResponse/next_task`.
            public var next_task: Components.Schemas.UpdateTaskOccurrenceCompletionResponse.next_taskPayload

This is quite confusing syntax at the callsite, since we have this weird case2, instead of just having only an optional TaskOccurenceDTO. Is this expected behaviour or could we do something to improve the ergonomics before the 1.0.0 release? I'll happily work on a PR, if you could point me in the right direction to where in the codebase the relevant parts are!

@czechboy0
Copy link
Contributor

Hi @madsodgaard,

what's the meaning of marking the value as nullable beyond just defining as

old_task:
  $ref: '#/components/schemas/TaskOccurenceDTO'

and not putting it in the required list?

How would you want to see it represented in Swift, if not as a simple optional property of TaskOccurenceDTO?

@czechboy0 czechboy0 added the status/triage Collecting information required to triage the issue. label Dec 3, 2023
@madsodgaard
Copy link
Author

@czechboy0 One place where I use this pattern a lot is when designing PATCH endpoints that allows partial updates of models. If I have field name that can actually be NULL in the database, I would give that the type:

type:
  - string
  - "null"

since this is a partial update endpoint, I want to differentiate between the user providing a "null" value (thus updating the value of name to NULL) and not providing a value at all (not updating name). That's why I don't include it in the required. From what I understand required means that the property must be present, regardless of the value. As you said I could just not include it in the required and remove the null type. But then I would only be doing that because I know I am going to use this library to generate a client, and I'll get a nicer API.

The ideal situtation, in my mind, is that the library could somehow recognize this pattern and just provide an optional property of TaskOccurenceDTO, as it does currently when not specifying the property in the required list and with only one type. Hope that makes some sense:)

@czechboy0
Copy link
Contributor

Swift OpenAPI Generator delegates JSON parsing to automatic Codable implementations (wherever possible) and to Foundation's JSONDecoder, which doesn't make the distinction between a null value and an absent value, so unfortunately what you're trying to do won't work. Several pieces of the stack, including those not owned by this tool (such as Foundation itself), would have to change.

You'll need to make resetting values somehow an explicit part of the API, not relying on the distinction between null and an absent value. And for values you're not trying to overwrite, just define them as regular properties and omit them from the required array.

@madsodgaard
Copy link
Author

@czechboy0 Thanks! Of course the fact that Codable does not differentiate between absent values and null is quite challenge to this issue. But, this pattern is so common in APIs today, that perhaps we should find a way to support? By "this pattern", I am referring to PATCH endpoints for partial updates instead of PUT endpoints to replace the entire resource. I have used this property wrapper previously in server-side applications. Do you think such a type or something similar would have a place in the library? (or is it too niché?) It would be used in places where you have a non-required nullable field.

@propertyWrapper
struct PotentiallyAbsent<T: Codable>: Codable {
    var wrappedValue: T? {
        switch state {
        case .present(let value):
            return value
        default:
            return nil
        }
    }
    
    enum State {
        case absent
        case null
        case present(T)
    }
    
    var projectedValue: PotentiallyAbsent<T> {
        return self
    }
    
    var state: State
    
    init(_ state: State) {
        self.state = state
    }
    
    static var null: PotentiallyAbsent<T> {
        .init(.null)
    }
    
    static var absent: PotentiallyAbsent<T> {
        .init(.absent)
    }
    
    static func present(_ value: T) -> PotentiallyAbsent<T> {
        .init(.present(value))
    }
    
    init(from decoder: Decoder) throws {
        fatalError("Attempted to decode `PotentiallyAbsent` as a standalone value")
    }
    
    func encode(to encoder: Encoder) throws {
        fatalError("Attempted to encode `PotentiallyAbsent` as a standalone value")
    }
}

extension KeyedDecodingContainer {
    func decode<T>(_ type: PotentiallyAbsent<T>.Type, forKey key: K) throws -> PotentiallyAbsent<T> {
        do {
            if try self.decodeNil(forKey: key) {
                return .init(.null)
            } else {
                return .init(.present(try decode(T.self, forKey: key)))
            }
        } catch DecodingError.keyNotFound {
            return .init(.absent)
        }
    }
}

extension KeyedEncodingContainer {
    mutating func encode<T>(_ value: PotentiallyAbsent<T>, forKey key: KeyedEncodingContainer<K>.Key) throws {
        switch value.state {
        case .absent:
            break
        case .null:
            try encodeNil(forKey: key)
        case let .present(value):
            try encode(value, forKey: key)
        }
    }
}

@czechboy0
Copy link
Contributor

I don't think it's likely for us to try to solve this in Swift OpenAPI Generator. The fix should be in Codable and JSONDecoder. Once it's there, we can see what'd remain for Swift OpenAPI Generator to have to do to support this case.

But only doing it here would be a significant amount of work that introduces a lot of complexity.

One way this could be better represented that'd work today would be a PATCH call that takes two values, a "toSet" dictionary, which has keys matching the property to set, and the value to set. And if a value is nil for a property, the property isn't changed.

And the second value would be called "toReset", which would just be a list of property names whose values to reset to their default.

@czechboy0 czechboy0 changed the title Improve ergonomics of inline nullable refs Surface the distinction between an explicit "null" value and an absent value Dec 11, 2023
@czechboy0
Copy link
Contributor

Renamed the issue to reflect that this is a missing feature. See my comment above explaining that this goes deeper in the stack we're based on, so would likely need some improvements from JSONDecoder first, but keeping open to allow tracking progress.

@czechboy0 czechboy0 added area/generator Affects: plugin, CLI, config file. area/runtime Affects: the runtime library. kind/enhancement Improvements to existing feature. and removed status/triage Collecting information required to triage the issue. labels Dec 11, 2023
@czechboy0 czechboy0 added this to the Post-1.0 milestone Dec 11, 2023
@brandonbloom
Copy link

I ran into this (or a similar) issue. The encoded output of my server omits a field with a nil value, but some code consuming this service expects an explicit null value.

One idea is this property wrapper: https://stackoverflow.com/a/53142581/67280

@czechboy0
Copy link
Contributor

I think it'd be best to first help Foundation get the desired behavior, and then Swift OpenAPI Generator and all other projects relying on JSONDecoder get it for free.

https://github.com/apple/swift-foundation

@MilesV64
Copy link

MilesV64 commented Feb 6, 2025

I'm also wanting this behavior for supporting a PATCH endpoint using an open api spec. I want to use a spec like this:

fieldName:
  type: [integer, 'null']
anotherFieldName:
  oneOf:
    - $ref: '#/components/schemas/MyCustomSchema'
    - type: 'null'

I'm working around this by abandoning the generated types and using a custom encoding like so:

public struct NullableValue<T: Codable>: Codable {
    public let value: T?
}

public struct MyCustomStruct: Encodable {
    public let fieldOne: NullableValue<Int>
    public let fieldTwo: NullableValue<String>
    
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)

        if let fieldOne {
            try container.encode(fieldOne.value, forKey: .fieldOne)
        }

        if let fieldTwo {
            try container.encode(fieldTwo.value, forKey: .fieldTwo)
        }
    }
}

The JSON output when MyCustomStruct(fieldOne: NullableValue(value: nil)) is {"fieldOne": <null>}.

This gives the exact behavior of the key being present even if the value may be null, or nothing if the entire NullableValue is null. I think this generator could absolutely support this with no changes to Foundation/Codable. If the type is [integer, 'null'] it would generate a field: NullableValue<Int>? property on the struct, and the generated struct would have the above encode implementation for those fields.

@nikitaame
Copy link

This definitely feels like something that the library should be able to support without updates to Codable/Foundation. @czechboy0 what're your thoughts on @MilesV64 suggestion above?

@czechboy0
Copy link
Contributor

It's not something we're actively working on right now, but I agree that it's a gap that should be fixed. If someone wants to propose a prototype in a PR, and a lightweight proposal (we have a process for those) for how the generated code would change, we'd appreciate a contribution! 🙏

@nikitaame
Copy link

To the best of your knowledge, is there currently any escape hatch for the functionality of using with PATCH requests or is a fix required?

@czechboy0
Copy link
Contributor

I'm not aware of a way to make it work with how the code is generated today.

So if you have a PATCH endpoint, and you need to pass explicit nulls in the JSON, you might need to use filtering and avoid generating that operation, and implement calling it manually. I admit it's not great, not sure if there's a better workaround.

@czechboy0
Copy link
Contributor

So I've been prototyping some solutions here, and wanted to get your thoughts before I go much further.

TL;DR: Swift OpenAPI Generator today "flattens" double optionals on purpose, as that's what Swift users want in the vast majority of cases. Example:

MyContainer:
  type: object
  properties:
    label:
      type: string
      # nullable: true is the default
  required: [] # label is not a required property

If we translated the above into Swift, we'd have to generate a struct like:

struct MyContainer: Codable {
  var label: String??
}

It's because the two sources of optionality: 1) whether the value itself is optional and 2) whether the property is required to be present in the payload

Generally, even when working with JSONDecoder outside of OpenAPI, we don't make that distinction, thus the single optional in the more expected:

struct MyContainer: Codable {
  var label: String? // the distinction between the field "label" being omitted, or being present as `null` is not surfaced in Swift
}

So - that's how we ended up where we are, and why the default is to flatten double optionals.

Now - the PATCH use case is very valid, and our current behavior doesn't work for it at all. Note that even the default Codable behavior with JSONDecoder doesn't either - the compiler-synthesized init(from:) and encoder(to:) also don't give you that information.

However, when writing Codable structs by hand, you do have an escape hatch: implement these two methods manually, and manually call decodeNil / encodeNil.

To that end, I'd like us to also offer an opt-in way to handle PATCH requests.


Here's a very early idea, and I'm looking for feedback on both the way to opt-into this feature, and the generated code:

  • Q1: Should a whole struct be opted into this behavior, or should it be per-property?
  • Q2: Should it be specified in the OpenAPI doc, or be some config passed only to the generator?
  • Q3: What would be other ways to implement it in the generated code other than overriding the full two Codable methods?
  MyPatchObject:
    type: object
    x-swift-explicit-nullable-properties: true # <<< not the final name, first trying to be very explicit
    properties:
      alias:
        type: string

(Simplified generated Swift code)

public struct MyPatchObject: Codable, Hashable, Sendable {
    public var alias: Swift.String?? // <<< double optional on purpose - allows detecting and providing explicit nulls and telling them apart from the missing JSON field
    public init(alias: Swift.String?? = nil) {
        self.alias = alias
    }
    public enum CodingKeys: String, CodingKey {
        case alias
    }
    public init(from decoder: any Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        self.alias = try container.decodeExplicitNullable(
            Swift.String.self,
            forKey: .alias
        )
    }
    public func encode(to encoder: any Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encodeExplicitNullable(
            self.alias,
            forKey: .alias
        )
    }
}

Thoughts? cc @jpsim @simonjbeaumont

@jpsim
Copy link

jpsim commented Mar 19, 2025

First off, thank you so much for digging in here and thinking about this issue. It's one of a very small number of blockers preventing us from fully adopting swift-openapi-generator for all our endpoints in our app!

Q1: Should a whole struct be opted into this behavior, or should it be per-property?
Q2: Should it be specified in the OpenAPI doc, or be some config passed only to the generator?

I haven't fully thought this through, but ideally neither and this can be a callsite decision, even supporting making this decision at runtime. See my proposal below for details:

Q3: What would be other ways to implement it in the generated code other than overriding the full two Codable methods?

What if codegened endpoints had an optional requiredFields parameter? Like this:

let response = try await client.my_patch_endpoint(
    body: .json(MyPatchObject(alias: nil)),
    requiredFields: [.alias] // as [Components.Schemas.MyPatchObject.CodingKeys]
).ok.body.json

Which would produce an HTTP body of {"alias": null} vs {} today.

@czechboy0
Copy link
Contributor

What would MyPatchObject look like in Swift? What'd be the type of the properties?

@jpsim
Copy link

jpsim commented Mar 19, 2025

What would MyPatchObject look like in Swift? What'd be the type of the properties?

I think this works with the types effectively unchanged from how they're generated today, so like this:

  MyPatchObject:
    type: object
    properties:
      alias:
        type: string
        nullable: true
struct MyPatchObject: Codable, Hashable, Sendable {
    var alias: Swift.String?
    init(alias: Swift.String? = nil) {
        self.alias = alias
    }
    enum CodingKeys: String, CodingKey {
        case alias
    }
}

@czechboy0
Copy link
Contributor

And the Codable implementation of MyPatchObject? How would we get the required keys in there?

@jpsim
Copy link

jpsim commented Mar 20, 2025

In my mind, it should be possible to leverage CodingUserInfoKey to set which keys are required when encoding the body, but maybe that's getting too complex , or maybe this can't cleanly be implemented at all.

I think the x-swift-explicit-nullable-properties approach you propose would work for us.

Q1: Should a whole struct be opted into this behavior, or should it be per-property?

Slight preference for per-struct, assuming this behavior only applies to its optional properties.

Q2: Should it be specified in the OpenAPI doc, or be some config passed only to the generator?

Would prefer the OpenAPI doc for self-documenting reasons.

@czechboy0
Copy link
Contributor

czechboy0 commented Mar 20, 2025

Thanks, @jpsim, this is helpful.

Putting this into the OpenAPI doc has the upside of being quite flexible, and supporting inline object schemas anywhere in the document. The downside would be that for clients who have the doc coming from a different team/company, they'd have to carry and reapply a patch indefinitely. That's easier nowadays with OpenAPI Overlays, and maybe if we go for this approach, we should provide a documented example. We also have #634 that could potentially allow adopters to provide their upstream OpenAPI doc and overlays as separate files, and the generator could stitch them together automatically.

Another alternative would be to specify this in the config file. We'd have to constrain this feature to named schemas only (i.e. #/components/schemas/...), allowing us to use similar syntax to the existing filtering feature - possibly looking like this in the openapi-generator-config.yaml:

# ...
explicitlyNullableSchemas:
  - MyPatchObject

cc more folks from this thread - I'd like to understand if there's a preference here: @madsodgaard @brandonbloom @MilesV64 @nikitaame @simonjbeaumont

@simonjbeaumont
Copy link
Collaborator

Thanks @czechboy0 for revisiting this and for the proposed solution.

  • Q1: Should a whole struct be opted into this behavior, or should it be per-property?

Similar to @jpsim: think my lean would be to per-struct on the basis it only applied to nullable properties.

  • Q2: Should it be specified in the OpenAPI doc, or be some config passed only to the generator?

This would be the first use of an extension but we've discussed using them before and maybe this is a good fit. But I appreciate the argument that folks don't want to patch documents. I don't have a strong preference. Is it worth considering both?

@czechboy0
Copy link
Contributor

Is it worth considering both?

Eventually, sure, but I'd prefer to start with just one and see if there still remains a need to cover the other.

After thinking about this more, I'm now leaning towards this being specified in openapi-generator-config.yaml, similar to how we handle filtering today.

That'd look something like:

generate:
  - types
  - client
namingStrategy: idiomatic
# ...
explicitlyNullableSchemas:
  - MyPatchObject

One important implication is that this will only work for top-level #/components/schemas. I think this is a fine point to start, and doesn't preclude us from also offering a vendor extension in the OpenAPI document later, which would extend support to inline/unnamed schemas as well.

Would this approach work for folks here? If so, I'll write up a full proposal.

@simonjbeaumont
Copy link
Collaborator

One important implication is that this will only work for top-level #/components/schemas. I think this is a fine point to start, and doesn't preclude us from also offering a vendor extension in the OpenAPI document later, which would extend support to inline/unnamed schemas as well.

Would this approach work for folks here? If so, I'll write up a full proposal.

If one approach would only work for a subset of schemas and the other for all schemas, wouldn't it be prudent to prefer the latter if there's a strong desire to only do one?

For users that don't own the document, allowing this only for #/components/schemas schemas locks them further out of the feature. With the extensions they have to patch the document, sure, but the patch will be tractable and additive. If the upstream OpenAPI document uses inline schemas for patch requests, they'll be forced to refactor it, which is a bigger ask.

WDYT?

@czechboy0
Copy link
Contributor

Right, the tradeoff here is basically whether we ask everyone who needs PATCH to update their OpenAPI doc, or only those who don't have the PATCH request payload in a named schema to update theirs.

@simonjbeaumont
Copy link
Collaborator

I understand, but the tradeoff isn't symmetric. Users consuming documents with inline schemas in PATCH requests will be forced to do a significant refactor of the document (cf. just add an annotation).

I'm not suggestion we only do the extension and not config—my original suggestion upthread was to consider both—just that if we're really gonna pick one, I would suggest picking the one that solves the problem for all cases.

I'm sympathetic to folks preferring the config file where that is an option to them: is it significantly more work to offer both?

@simonjbeaumont
Copy link
Collaborator

Or, alternatively, is it possible to identify an inline schema from the config file, by some JSON Path syntax? That way we could do only config but still support inline schema PATCH requests.

@czechboy0
Copy link
Contributor

Yeah maybe, here's the OpenAPIKit impl of it: https://github.com/mattpolzin/OpenAPIKit/blob/fd27b297993923c9a1725dc62553d06579c7a33e/Sources/OpenAPIKit30/JSONReference.swift#L193

But since this is growing in scope, I might not be able to get to this as quickly as I hoped, just a heads up. If anyone else wants to investigate this, go ahead.

@nikitaame
Copy link

Thanks for the continued discussion here!

@czechboy0 what was the reason for the shift in preference to do this in the config rather than the document considering the option to do OpenAPI Overlays?

Btw, here's the implementation of the extension in OpenAPITools.

@czechboy0
Copy link
Contributor

I think in the fullness of time we could support all of these, just that the quickest and easiest to implement is the config approach + named schemas only. That'll only require changes in a handful of methods in the codegen logic, and historically adopters often preferred a codegen option over having to edit their OpenAPI doc.

The OpenAPI doc + overlay, and inline schema support are both great enhancements, they just also add quite a lot of scope both in generator implementation and documentation enhancements to show it all working end-to-end.

@jpsim
Copy link

jpsim commented Mar 25, 2025

I would love if there was at least one way to achieve this and would hope that the approach can then be expanded over time to support the full extent of use cases. Right now there's no way to achieve this and no workaround has been found.

@simonjbeaumont
Copy link
Collaborator

One alternative we discussed a little, is to solve this in a way that uses additive API without the need to opt-in for specific operations or schemas, which would hopefully make it more widely useful, more discoverable, and use a unified implementation under the hood.

The idea would be to add a side-array (likely an OptionSet) to each generated type for schemas with at least one nullable property, with one option per nullable property, matching the generated name of the property. For example:

S:
  type: object
  required: [a, b]
  properties:
    a:
      type: string
      nullable: true
    b:
      type: string
      nullable: false
struct S: Codable, Hashable, Sendable {
    var a: Swift.String?
    var b: Swift.String

    var __swiftOpenAPISetProperties: [...]  // <-- new: populated during decoding

    init(
        a: Swift.String? = nil,
        b: Swift.String,
        explicitlySerializing: [...] = .empty  // <-- new: influences the encoding
    )
}

This would require us to stop relying on the synthesized Codable conformance or to provide a wrapped JSON encoder and decoder.

I'm curious about feedback here as it seems to be a more general solution to the problem of surfacing and influencing explicitly null properties.

@nikitaame
Copy link

nikitaame commented Mar 27, 2025

I like the flexibility and built in discoverability this solution provides!

We'd probably only want this functionality exposed to PATCH endpoints but otherwise looks good to me 🫡

@simonjbeaumont
Copy link
Collaborator

We'd probably only want this functionality exploded to PATCH endpoints but otherwise looks good to me 🫡

This would be on the schema, but I suppose we could only apply it to schemas that are used in at least one PATCH request. Are there any other scenarios where distinguishing explicitly- vs implicitly-nil is useful?

@czechboy0
Copy link
Contributor

Personally, I'd prefer for this to be opt-in more explicitly. Searching all operations and identifying which schemas end up being (transitively) used by PATCH operations could result in unexpected spooky action at a distance.

@nikitaame
Copy link

For simplicity, perhaps doesn't matter to search if the schemas is used in at least one PATCH endpoint and can just generally apply to all schemas that have at least one nullable field.

Not seeing any other scenarios where this distinction can be beneficial but also don't see anything spooky with applying to all schemas with a nullable field.

@czechboy0 thoughts if we apply as originally suggested? Still think it should be more explicit? If so, which of the more explicit solutions are you leaning towards?

@simonjbeaumont
Copy link
Collaborator

For simplicity, perhaps doesn't matter to search if the schemas is used in at least one PATCH endpoint and can just generally apply to all schemas that have at least one nullable field.

This was my original suggestion in #419 (comment). If we managed to achieve this with a custom Encoder/Decoder instead of having to provide a custom Codable conformance then it shouldn't drastically affect the generated code size either.

@nikitaame
Copy link

Yep, totally good with the original suggestion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/generator Affects: plugin, CLI, config file. area/runtime Affects: the runtime library. kind/enhancement Improvements to existing feature.
Projects
None yet
Development

No branches or pull requests

7 participants