-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Better alternative for algebraic data types #2464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This is an interesting idea. Here is some early bikeshedding:
class Shape(TaggedUnion):
class Circle:
radius: float
class Rectangle:
width: float
height: float EDIT: Fixed typo pointed by @kirbyfan64 |
But isn't On topic: +10000 to this. ADTs are awesome. :D |
Ah, sorry, that is just a typo, it is not needed. |
Hmm.. it would be less clear that the subclasses are named tuples (we want the default
This would require using qualified names such as |
This is true. On the other hand the definition is a bit more compact, and it is immediately visible later in code that
|
I would have liked the enum-like syntax (I have suggested it some time ago) but it is not exactly a tagged union, since Here's a wild idea: with NamedUnion as Shape:
class Circle(Shape):
radius : float
class Rectangle(Shape):
width: float
height: float This way we have a bounded scope for the definition of subelements of Shape, and we can also use its name in the subclasses. |
Please excuse the noob question, but how would this be different than the classes extending the |
The point is the declaration "A |
Just another idea for the possible syntax. If we allow only named tuples a s members, then one can write: class Shape(NamedUnion):
Circle: {radius: int}
Rectangle: {width: int, height: int} At runtime, this will be transformed so that |
@elazarg I really like your syntax proposition. It can also be used to mimic Haskell's type constraints:
|
In my sum type library I went with syntax like this: class Shape(NamedUnion):
def Circle(radius: float): ...
def Rectangle(width: float, height: float): ... It's similar to a Haskell GADT declaration in syntax and semantics¹ – there's only one class/type, ¹ as close as I could get in Python anyway 😄 (I hope this doesn't sound too much like a plug, just wanted to contribute to the discussion!) |
Since the canonical answer to the "where is the switch/case statement in python?" seems to be "use a dict", I think Sadly mypy does not allow for creating |
This syntax actually already kind of works today. The following type checks and runs: from abc import ABC
from typing import Union, NamedTuple
class Shape(ABC):
class Circle(NamedTuple):
radius: float
class Rectangle(NamedTuple):
width: float
height: float
ShapeType = Union[Shape.Circle, Shape.Rectangle]
S: ShapeType = Shape.Circle(2.0)
S = Shape.Rectangle(2.0, 1.4)
print(S.width)
def print_width(shape: ShapeType) -> None:
if isinstance(shape, Shape.Rectangle):
print(shape.width)
print_width(S) The only ugly thing is that you have to define the |
@thomkeh Yes, the point of special support would be to minimize verbosity, maximize clarity and simplicity. Here you have to juggle with both a base type(useless in itself, except for namespacing the subclasses) and an additional type alias. Personally, I've found myself naturally wanting and trying to emulate this nested class syntax. I think it just makes sense visually. |
It's good to know that sum types can be partly emulated today. But one of the key features of algebraic datatypes is that they can be recursive (e.g. |
Adding support for recursive types is on the roadmap for mypy. |
@dckc Using lazy annotations, I don't see any syntactic issue. from __future__ import annotations
import typing
class List(TaggedUnion):
class Cons:
head: typing.Any
tail: List
class Empty: pass One issue is the incompatibility between |
Will this issue be resolved using https://www.python.org/dev/peps/pep-0622/#sealed-classes-as-algebraic-data-types ? |
Another question (maybe not directly related to mypy...) : what would be recommended to serialize in JSON, since there is no "tag"? Adding the class name? But the class name may be an implementation detail that may change. |
@dfroger PEP 622 was withdrawn; the final version of pattern matching didn't have sealed. However, it's a natural future extension. This isn't the right place to discuss JSON serialization. |
Using the new PEP 695 syntax and pattern matching you can express algebraic data types in a very nice and concise way. For product types you can use dataclasses. For sum types, here's the way I propose. Either from dataclasses import dataclass
type Either[L, R] = Left[L] | Right[R]
@dataclass(frozen=True)
class Left[T]:
value: T
@dataclass(frozen=True)
class Right[T]:
value: T
def handle(either: Either[int, str]) -> Either[int, str]:
match either:
case Left(number):
return Left(number + 1)
case Right(string):
return Right(string.lower())
# Optionally, you can rename Either, Left and Right
# to better document your code
type Result[T, E] = Either[T, E]
Ok = Left
Error = Right
def str_to_int(string: str) -> Result[int, Exception]:
try:
return Ok(int(string))
except Exception as error:
return Error(error) List from dataclasses import dataclass
type List[T] = Empty | Constructor[T]
@dataclass(frozen=True)
class Empty:
pass
@dataclass(frozen=True)
class Constructor[T]:
head: T
tail: List[T]
def append[T](list: List[T], item: T) -> List[T]:
match list:
case Empty():
return Constructor(item, Empty())
case Constructor(head, tail):
return Constructor(head, append(tail, item)) Bool from dataclasses import dataclass
type Bool = Yes | No
@dataclass(frozen=True)
class Yes:
pass
@dataclass(frozen=True)
class No:
pass
def both(first: Bool, second: Bool) -> Bool:
match first, second:
case Yes(), Yes():
return Yes()
case _:
return No() Personally, I think it's not worth it to overcomplicate things by adding a new special syntax or extending the standard library, as it's already fine with the current version of Python. There's also a Scott encoding way of representing algebraic data types. Option from collections.abc import Callable
from typing import Protocol
class Option[T](Protocol):
def __call__[R](self, case_nothing: R, case_some: Callable[[T], R], /) -> R:
...
def nothing[T, R](case_nothing: R, case_some: Callable[[T], R]) -> R:
return case_nothing
def some[T](value: T) -> Option[T]:
def option[R](case_nothing: R, case_some: Callable[[T], R]) -> R:
return case_some(value)
return option
def default_to_0(number: Option[int]) -> int:
return number(0, identity)
def identity[T](value: T) -> T:
return value Number from __future__ import annotations
from collections.abc import Callable
from typing import Protocol
class Number(Protocol):
def __call__[R](self, case_zero: R, case_successor: Callable[[Number], R], /) -> R:
...
def zero[R](case_zero: R, case_successor: Callable[[Number], R]) -> R:
return case_zero
def successor(number: Number) -> Number:
def new_number[R](case_zero: R, case_successor: Callable[[Number], R]) -> R:
return case_successor(number)
return new_number
def add(augend: Number, addend: Number) -> Number:
return augend(addend, lambda augend_predessor: successor(add(augend_predessor, addend))) |
Many languages support algebraic data types, including Haskell, OCaml, Scala, Swift and Rust. Many programmers like them and they have some benefits over Python subclassing, which I won't discuss in detail here.
Currently we can kind of fake algebraic data types by using union types and named tuples:
There are a few problems with this currently:
For (1), mypy could detect at least some undefined variables and recognize (some) if statements used for "pattern matching". For example, assume that we added a new shape,
Triangle
. Now mypy should complain aboutarea()
, as it fails to handle triangles (if we'd have used multiple returns, mypy could already have caught the error):For (2), (3) and (4), we could have some new syntax:
NamedTupleUnion
would have a few special features. All subclasses are named tuples and must be defined in the same module as theNamedTupleUnion
type (i.e. it can't be extended outside the current module). This way we can check whether all item types are handled usingisinstance
checks. Finally, only one level of inheritance would be supported, for simplicity -- though I'm not sure if this restriction is necessary.[This is just a random idea I wanted to write down and it's not urgent in any way.]
The text was updated successfully, but these errors were encountered: