Skip to content

Sendable Config #189

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

piotrkowalczuk
Copy link

@piotrkowalczuk piotrkowalczuk commented Mar 7, 2025

This PR is a follow-up for #172 , it introduces a Sendable Config, built around BinaryDistinctString and BinaryDistinctCharacter.

Motivation

  • Ensure Config conforms to Sendable.
  • Provide an explicit API instead of relying on @dynamicMemberLookup.
  • Minimize changes to business logic unless strictly necessary.

TODO

  • Ensure all existing tests pass.
  • Implement JSON serialization for Config, particularly in chat template contexts.
  • Add explicit tests for UTF-16 handling and protocol conformance.
  • Review and address TODO comments in the code.
  • Remove any unused code.
  • Bring back dot notation while accessing Config dict values.

Open Questions

  • Should BinaryDistinctString and BinaryDistinctCharacter be adopted more widely across the package, replacing String and NSString where applicable?
    • In practice, String already contains all the necessary information for binary-distinct comparisons.
    • The key difference lies in how the == operator affects serialization, deserialization, and hashing for uniqueness.
  • Making tokenizers Sendable would require deep changes to the Tokenizers package. What other improvements should be incorporated next to the Experimental: tokenizers with and without templates #168?

Follow-ups

  • Improve the Hub package, particularly by making Downloader conform to Sendable so the entire target can be marked as Swift6 compatible.
  • Redesign the Tokenizers package.

@FL33TW00D
Copy link
Collaborator

This is looking good so far @piotrkowalczuk!

@@ -34,75 +34,6 @@ public extension Hub {
}
}

// MARK: - Configuration files with dynamic lookup

@dynamicMemberLookup
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be planning this already but @dynamicMemberLookup seems to be missing from your new config struct. Would be nice to keep some of the dot syntax for properties that were previously defined like vocab, also general syntactic sugar vs string based access pattern - in your original proposal it was included #172 (comment), has your thinking changed to where that is no longer the case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can certainly revert the change. My initial experience with the codebase was confusing because it was difficult to quickly determine whether a property was statically defined or dynamically accessed. Using explicit subscript notation clarified this distinction and increased my confidence in relying on the type system for accuracy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let @FL33TW00D @pcuenca et al. weigh in, my main concern besides syntax is backwards compatibility if any public properties (that may be accessed in projects that import this one) become inaccessible from this change.

For example, this was possible via public methods before this PR:
config.tokenizerData.model.vocab
and if merged these errors will propagate and require updating
Value of type 'Config' has no member 'model'

Perhaps a path forward could be making those objects with defined properties serializable (and probably defining more common properties that were previously dynamic) and leaving the remaining dynamic properties to your subscript pattern, just one idea.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no further input on this topic, I will revert to the original behavior.

@piotrkowalczuk piotrkowalczuk changed the title WIP: Sendable Config Sendable Config Mar 21, 2025
@piotrkowalczuk
Copy link
Author

@pcuenca @FL33TW00D @ZachNagengast you might want to have a look :)

@FL33TW00D
Copy link
Collaborator

@piotrkowalczuk Could you resolve the conflicts first? <3

ConfigTests, BinaryDistinctDictionary removed, Config JSON serialization/deserialization, Config compatible with jinja templating system
@dynamicMemberLookup brought back for backward compatibility, ConfigTests/ConfigEquatable, Condig.Data equality improved
@dynamicMemberLookup dot notation used in favour of the subscript
formatting
rebase

Test cleanup

Test fix
@piotrkowalczuk
Copy link
Author

@FL33TW00D done

@FL33TW00D
Copy link
Collaborator

Sorry @piotrkowalczuk could you format it with the following instructions: https://github.com/huggingface/swift-transformers/actions/runs/14204745034/attempts/2#summary-39804901716

Will make this process smoother shortly.

@piotrkowalczuk
Copy link
Author

piotrkowalczuk commented Apr 1, 2025

@FL33TW00D done.

Apparently Testing package is not available in CI. On it.

/Users/runner/work/swift-transformers/swift-transformers/Tests/HubTests/ConfigTests.swift:10:8: error: no such module 'Testing'
import Testing
       ^

@piotrkowalczuk
Copy link
Author

I changed the package manifest. Here is where my ability to debug the CI build ends. @FL33TW00D could you help me?

@FL33TW00D
Copy link
Collaborator

FL33TW00D commented Apr 1, 2025

@piotrkowalczuk so Testing is Swift 6 and above.
EDIT: we will need to make a choice on dropping support for Swift <6. Discussing with @pcuenca.

@ZachNagengast
Copy link
Collaborator

@piotrkowalczuk The CI runners are using this workflow on this image: https://github.com/actions/runner-images/blob/macos-14-arm64/20250324.1158/images/macos/macos-14-arm64-Readme.md which uses Xcode 15.4 by default. To get the tests you may need to make some edits such that it can build on that version of xcode. There are some apps that make it easier to switch between versions in your local development environment like https://github.com/XcodesOrg/xcodes or it can also be downloaded directly from the Apple's website.

@piotrkowalczuk
Copy link
Author

I experimented with several options locally using https://github.com/XcodesOrg/xcodes and found that all versions of the Testing package available within the Xcode toolchain support Swift 6.

I also attempted to explicitly import swift-testing:0.5.0, but it is relatively immature and lacks some features, resulting in tests that do not compile.

Available options:

  • Rewrite the tests to use XCTest. This is feasible, but due to the parametrized nature of the ConfigTests, it would require either custom glue code or a significant amount of redundancy.
  • Update the toolchain. Due to limited access to CI, I believe the most effective approach would be for a member of the maintainers team to handle this outside of this PR.

@FL33TW00D I'll wait for the verdict regarding the toolchain.

@FL33TW00D
Copy link
Collaborator

I experimented with several options locally using https://github.com/XcodesOrg/xcodes and found that all versions of the Testing package available within the Xcode toolchain support Swift 6.

I also attempted to explicitly import swift-testing:0.5.0, but it is relatively immature and lacks some features, resulting in tests that do not compile.

Available options:

  • Rewrite the tests to use XCTest. This is feasible, but due to the parametrized nature of the ConfigTests, it would require either custom glue code or a significant amount of redundancy.
  • Update the toolchain. Due to limited access to CI, I believe the most effective approach would be for a member of the maintainers team to handle this outside of this PR.

@FL33TW00D I'll wait for the verdict regarding the toolchain.

I'm really not keen on dropping support for Swift <6, as Swift 6 only came out 6 months ago it seems premature.
However, I do appreciate the difficulties of parameterised tests using XCTest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants