Skip to content

Conversation

thejchap
Copy link
Contributor

@thejchap thejchap commented Aug 1, 2025

Summary

astral-sh/ty#111

adds support for @dataclass(kw_only=True) (https://docs.python.org/3/library/dataclasses.html)

Test Plan

Copy link
Contributor

github-actions bot commented Aug 1, 2025

Diagnostic diff on typing conformance tests

Changes were detected when running ty on typing conformance tests
--- old-output.txt	2025-08-14 01:15:44.974868205 +0000
+++ new-output.txt	2025-08-14 01:15:45.042868491 +0000
@@ -1,5 +1,5 @@
 WARN ty is pre-release software and not ready for production use. Expect to encounter bugs, missing features, and fatal errors.
-fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/918d35d/src/function/execute.rs:215:25 when checking `/home/runner/work/ruff/ruff/typing/conformance/tests/aliases_typealiastype.py`: `infer_definition_types(Id(48ad)): execute: too many cycle iterations`
+fatal[panic] Panicked at /home/runner/.cargo/git/checkouts/salsa-e6f3bb7c2a062968/918d35d/src/function/execute.rs:215:25 when checking `/home/runner/work/ruff/ruff/typing/conformance/tests/aliases_typealiastype.py`: `infer_definition_types(Id(124c6)): execute: too many cycle iterations`
 _directives_deprecated_library.py:15:31: error[invalid-return-type] Function always implicitly returns `None`, which is not assignable to return type `int`
 _directives_deprecated_library.py:30:26: error[invalid-return-type] Function always implicitly returns `None`, which is not assignable to return type `str`
 _directives_deprecated_library.py:36:41: error[invalid-return-type] Function always implicitly returns `None`, which is not assignable to return type `Self@__add__`
@@ -212,13 +212,8 @@
 dataclasses_frozen.py:16:1: error[invalid-assignment] Property `a` defined in `DC1` is read-only
 dataclasses_frozen.py:17:1: error[invalid-assignment] Property `b` defined in `DC1` is read-only
 dataclasses_kwonly.py:23:11: error[too-many-positional-arguments] Too many positional arguments: expected 1, got 2
-dataclasses_kwonly.py:32:1: error[missing-argument] No argument provided for required parameter `a`
-dataclasses_kwonly.py:32:5: error[invalid-argument-type] Argument is incorrect: Expected `int`, found `Literal["hi"]`
-dataclasses_kwonly.py:35:1: error[missing-argument] No argument provided for required parameter `a`
-dataclasses_kwonly.py:35:5: error[invalid-argument-type] Argument is incorrect: Expected `int`, found `Literal["hi"]`
-dataclasses_kwonly.py:35:11: error[parameter-already-assigned] Multiple values provided for parameter `b`
-dataclasses_kwonly.py:38:5: error[invalid-argument-type] Argument is incorrect: Expected `int`, found `Literal["hi"]`
-dataclasses_kwonly.py:38:11: error[invalid-argument-type] Argument is incorrect: Expected `str`, found `Literal[1]`
+dataclasses_kwonly.py:38:11: error[too-many-positional-arguments] Too many positional arguments: expected 1, got 2
+dataclasses_kwonly.py:53:11: error[too-many-positional-arguments] Too many positional arguments: expected 1, got 2
 dataclasses_kwonly.py:61:1: error[missing-argument] No argument provided for required parameter `c`
 dataclasses_kwonly.py:61:9: error[invalid-argument-type] Argument is incorrect: Expected `int`, found `float`
 dataclasses_kwonly.py:61:14: error[parameter-already-assigned] Multiple values provided for parameter `b`
@@ -264,9 +259,14 @@
 dataclasses_transform_converter.py:121:11: error[too-many-positional-arguments] Too many positional arguments to bound method `__init__`: expected 1, got 7
 dataclasses_transform_converter.py:130:31: error[invalid-argument-type] Argument to function `model_field` is incorrect: Expected `(Literal[1], /) -> Unknown`, found `def converter_simple(s: str) -> int`
 dataclasses_transform_field.py:49:43: error[invalid-return-type] Function always implicitly returns `None`, which is not assignable to return type `(...) -> @Todo`
+dataclasses_transform_field.py:75:16: error[too-many-positional-arguments] Too many positional arguments: expected 0, got 1
+dataclasses_transform_func.py:53:8: error[missing-argument] No arguments provided for required parameters `id`, `name`
+dataclasses_transform_func.py:53:18: error[too-many-positional-arguments] Too many positional arguments: expected 0, got 2
 dataclasses_transform_func.py:57:1: error[invalid-assignment] Object of type `Literal[3]` is not assignable to attribute `name` of type `str`
 dataclasses_transform_func.py:61:6: error[unsupported-operator] Operator `<` is not supported for types `Customer1` and `Customer1`
 dataclasses_transform_func.py:65:36: error[unknown-argument] Argument `salary` does not match any known parameter
+dataclasses_transform_func.py:71:8: error[missing-argument] No arguments provided for required parameters `id`, `name`
+dataclasses_transform_func.py:71:18: error[too-many-positional-arguments] Too many positional arguments: expected 0, got 2
 dataclasses_transform_func.py:77:36: error[invalid-return-type] Function always implicitly returns `None`, which is not assignable to return type `T@create_model_frozen`
 dataclasses_transform_func.py:97:1: error[invalid-assignment] Property `id` defined in `Customer3` is read-only
 dataclasses_transform_meta.py:60:36: error[unknown-argument] Argument `other_name` does not match any known parameter

Copy link
Contributor

github-actions bot commented Aug 1, 2025

mypy_primer results

No ecosystem changes detected ✅
No memory usage changes detected ✅

@thejchap thejchap marked this pull request as draft August 1, 2025 02:32
@thejchap thejchap force-pushed the thejchap/kw-only branch 2 times, most recently from 7dd8764 to 0418fe3 Compare August 1, 2025 02:38
@thejchap thejchap changed the title [ty] upport kw_only=True for dataclasses [ty] support kw_only=True for dataclasses Aug 1, 2025
@thejchap thejchap marked this pull request as ready for review August 1, 2025 02:40
@AlexWaygood AlexWaygood added the ty Multi-file analysis & type inference label Aug 1, 2025
sharkdp
sharkdp previously requested changes Aug 1, 2025
Copy link
Contributor

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much!

If you look at the typing conformance PR comment here (source code: https://github.com/python/typing/tree/main/conformance), you'll notice that this adds false positives when a single field opts out of kw_only=True:

@dataclass(kw_only=True)
class DC3:
    a: str = field(kw_only=False)
    b: int = 0

DC3("hi")  # error on this branch

I think it shouldn't be too hard to add support for kw_only of single fields here as well. You can match on the kw_only parameter and add it as additional metadata to FieldInstance, similar to what we do for the init argument:

Some(KnownFunction::Field) => {
if let [default, default_factory, init, ..] = overload.parameter_types()
{
let default_ty = match (default, default_factory) {
(Some(default_ty), _) => *default_ty,
(_, Some(default_factory_ty)) => default_factory_ty
.try_call(db, &CallArguments::none())
.map_or(Type::unknown(), |binding| binding.return_type(db)),
_ => Type::unknown(),
};
let init = init
.map(|init| !init.bool(db).is_always_false())
.unwrap_or(true);
// `typeshed` pretends that `dataclasses.field()` returns the type of the
// default value directly. At runtime, however, this function returns an
// instance of `dataclasses.Field`. We also model it this way and return
// a known-instance type with information about the field. The drawback
// of this approach is that we need to pretend that instances of `Field`
// are assignable to `T` if the default type of the field is assignable
// to `T`. Otherwise, we would error on `name: str = field(default="")`.
overload.set_return_type(Type::KnownInstance(
KnownInstanceType::Field(FieldInstance::new(
db, default_ty, init,
)),
));
}
}
. Let me know if you need help with that or would rather not like to work on that.

@thejchap thejchap force-pushed the thejchap/kw-only branch 2 times, most recently from 23148ae to 557670d Compare August 2, 2025 00:56
@thejchap thejchap marked this pull request as draft August 2, 2025 01:50
@thejchap thejchap force-pushed the thejchap/kw-only branch 2 times, most recently from be2cfb5 to c01e70f Compare August 2, 2025 02:46
@thejchap
Copy link
Contributor Author

thejchap commented Aug 2, 2025

implemented kw_only in field as well - need to take a look at the rest of the conformance diffs, will do that this week

@thejchap thejchap force-pushed the thejchap/kw-only branch 3 times, most recently from 5271700 to 6aafc5d Compare August 6, 2025 12:13
id: int
name: str

# TODO: Should not emit errors
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this should cover the dataclasses_transform_func false positive

@thejchap thejchap marked this pull request as ready for review August 6, 2025 12:19
@thejchap thejchap requested a review from sharkdp August 6, 2025 12:19
@thejchap thejchap changed the title [ty] support kw_only=True for dataclasses [ty] support kw_only=True for dataclass() and field() Aug 6, 2025
@thejchap thejchap changed the title [ty] support kw_only=True for dataclass() and field() [ty] support kw_only=True for dataclass() and field() Aug 6, 2025
@thejchap
Copy link
Contributor Author

thejchap commented Aug 8, 2025

@sharkdp I think this is ready for another look

@AlexWaygood
Copy link
Member

David's on vacation until the end of next week, but we'll try to have someone else on the team take a look soon!

@thejchap thejchap force-pushed the thejchap/kw-only branch 2 times, most recently from 3b7b558 to 2a031d7 Compare August 9, 2025 17:13
Comment on lines 1957 to 1960
let mut parameter = if kw_only_field_seen
|| name == "__replace__"
|| kw_only.unwrap_or(has_dataclass_param(DataclassParams::KW_ONLY))
{
Copy link
Contributor

@abhijeetbodas2001 abhijeetbodas2001 Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not find it specified in the spec, so I tested that unioning all kw_only's is the right way here. It seems that kw_only param of field has a lower precedence than dataclasses.KW_ONLY, for example:

[nav] In [8]: @dataclass
         ...: class C:
         ...:     _: KW_ONLY
         ...:     x: field(kw_only=False)

[nav] In [9]: C(x=3)
Out[9]: C(x=3)

[ins] In [10]: C(1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 C(1)

TypeError: C.__init__() takes 1 positional argument but 2 were given

I think it would be nice to have an mdtest simulating this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abhijeetbodas2001 thanks! fixed the precedence and added a test here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, no, sorry, I didn't mean to say that the earlier code was incorrect. I meant to say that it looks like it was correct (based on the code snippet I pasted above -- x can only be a kw argument), and that we should add a test to make sure it stays correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abhijeetbodas2001 the below snippet passes mypy and works at runtime which makes me think the new precedence is correct (unless i'm missing something) - can you maybe clarify whats making you think the previous implementation was correct?

from dataclasses import field, dataclass, KW_ONLY
from typing import reveal_type

@dataclass
class C:
    _: KW_ONLY
    x: int = field(kw_only=False)

C(x=1)
C(1)
reveal_type(C.__init__)
../test.py:11: note: Revealed type is "def (self: test.C, x: builtins.int)"
Success: no issues found in 1 source file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, the new code is indeed correct. I was under the impression that the earlier one was correct based on the code snippet I pasted above. But I now realise that that snippet is wrong (I've added field as an annotation, while it should be an assignment RHS).

But thanks a lot for fixing this and adding the test! And sorry for the confusion.

@thejchap
Copy link
Contributor Author

Diagnostic diff on typing conformance tests

Changes were detected when running ty on typing conformance tests

notes on the conformance diffs here: https://diffswarm.dev/d-01k2gknwyq82f6x17zqf3apjxc

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you!

Comment on lines +927 to +929
if let [default, default_factory, init, .., kw_only] =
overload.parameter_types()
Copy link
Contributor

@carljm carljm Aug 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not awesome how we're hardcoding a correspondence with the ordering of keyword-only parameters in typeshed here, rather than actually matching by parameter name. But that's orthogonal to this PR.

But this makes it even less robust. Python 3.14 added a new doc parameter which will break this as it's currently written, since it means kw_only is not necessarily last.

We can't just list all the parameters from the front, because then this if let will fail to match at all on older Python versions that don't have the kw_only parameter.

I think we should modify this code to do something a bit more intelligent here. Ideally that more intelligent thing would be to actually match by parameter name. That should be very doable, but it is a bit more involved; it will require adding new API to Binding, I think. So perhaps it would make more sense as a separate PR.

So for now perhaps we just add a TODO comment here:

Suggested change
if let [default, default_factory, init, .., kw_only] =
overload.parameter_types()
// TODO this will break on Python 3.14 -- we should match by parameter name instead
if let [default, default_factory, init, .., kw_only] =
overload.parameter_types()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO added

yeah - i remember thinking this felt a little brittle while i was working on it. if you want to expand a bit on the vision for the Binding API or write up an issue or something i'd be interested in taking a look!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically a Binding includes a vector of types for each parameter (matching up with parameters in source order), and it includes a Signature which tells us all about those parameters (including their names), but it doesn't currently have any way to say "give me the type for the parameter with this name." Unless I'm missing something (which I might be), there's no particular difficulty here, it just requires a bit of wiring things together.

@thejchap
Copy link
Contributor Author

@carljm i think ive addressed everything - thanks!

@carljm carljm dismissed sharkdp’s stale review August 14, 2025 15:01

Requested changes have been made

@carljm carljm merged commit dc2e8ab into astral-sh:main Aug 14, 2025
38 checks passed
dcreager added a commit that referenced this pull request Aug 14, 2025
* main:
  [ty] Add diagnostics for invalid `await` expressions (#19711)
  [ty] Synthesize read-only properties for all declared members on `NamedTuple` classes (#19899)
  [ty] Remove use of `ClassBase::try_from_type` from `super()` machinery (#19902)
  [ty] Speedup project file discovery  (#19913)
  [`pyflakes`] Add secondary annotation showing previous definition (`F811`) (#19900)
  Bump 0.12.9 (#19917)
  [ty] support `kw_only=True` for `dataclass()` and `field()` (#19677)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ty Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants