Enums are too heavy for large-scale use, but are the only feature affording easy completion. #50
Comments
Would these approaches allow adding static members like
List<T> values = [foo, bar, baz];
List<T> get evenValues => values.where((e) => e.value % 2 == 0).toList();
to be used like
or be used with extension methods to make such additional methods available? Would it not be better to make the more powerful and flexible old-style enums more convenient to write and use, instead of adding different limited ways to define enums? |
Since enums are all sealed classes, couldn't we continue to pretend they are as they are now, but just compile them down to more-or-less-just integers? That is to say, why are enums necessarily heavy, as currently specified? |
In large part because the following program must print "true" and "false", which implies that in general enums must support dynamic dispatch (which de facto means they pretty much must be represented as heap-allocated values with vtables).
enum Color { red, green, blue }
class NotAColor {
  String index = "hello";
}
void foo(dynamic a) {
  print(a.index is String);
}
void main() {
  foo(new NotAColor());
  foo(Color.red);
} |
Can we use the same strategy as we use to make this work?
class NotAnInt {
  String sign = "hello";
}
void foo(dynamic a) {
  print(a.sign is String);
}
void main() {
  foo(new NotAnInt());
  foo(2);
} |
@Hixie On the VM, ints have two representations: one is a boxed value (an allocated object), the other uses a bit in the pointer representation to distinguish it from a real pointer (Smi). Other values could be shoe-horned into the pointer bits, but that would add complexity everywhere, just as the current Smi representation does.
When compiling to JavaScript, there is no kind of value we can play this kind of trick with. If we can come up with a scheme that works well when compiling to JavaScript, that would be preferable to one where JavaScript is significantly worse. |
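The tagging trick described above can be illustrated with a small simulation. This is a sketch only: the helper names (tagSmi, tagPointer, isSmi, untagSmi) are hypothetical, a machine word is modeled as a plain int, and the choice that a clear low bit marks a small integer is an assumption made for illustration rather than a statement about the VM's exact layout.

```dart
// Simulates pointer tagging: one low bit of a machine word tells a small
// integer apart from a heap pointer. All names here are hypothetical.

/// Encodes [value] as a tagged word: payload shifted left, low bit 0.
int tagSmi(int value) => value << 1;

/// Encodes a (pretend) heap address as a tagged word: low bit 1.
int tagPointer(int address) => address | 1;

/// True when [word] carries a small integer rather than a pointer.
bool isSmi(int word) => (word & 1) == 0;

/// Recovers the integer payload. `>>` is an arithmetic shift, so the sign
/// of the original value is preserved.
int untagSmi(int word) => word >> 1;

void main() {
  final word = tagSmi(-7);
  print(isSmi(word)); // true
  print(untagSmi(word)); // -7
  print(isSmi(tagPointer(0x1000))); // false
}
```

Every operation on such a word has to inspect the tag before it can use the payload, which is the kind of added complexity the comment refers to.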
It is important to note that the 'Enums' from protobufs are not the same type as declared by the |
I don't believe that's true.
The completion issue could be fixed. We could easily add all static fields (or some subset of static fields) to the list of suggestions, but what we'd need to validate is that doing so actually improved the UX. |
FWIW, I think Smi is one of Dart's vestigial features that should be removed. To be fair, it's a very clever trick, and it made a lot of sense back when Dart was a dynamic programming language, where it actually had a huge performance benefit. In the statically typed AOT world I think all it does is make performance worse and hinder new language features. The language should also allow defining custom unboxed types of known size with the same level of efficiency, including enums, grapheme clusters, As a litmus test Dart should be powerful enough that you could implement your own |
@yjbanov Dart can already do many of the optimizations you mention. We can unbox integers in many places, but without whole-program analysis and optimization, we can't do so on API boundaries. And then there are dynamic invocations. We can probably do something clever, but at the cost of making dynamic invocations even more expensive - they will basically have to match the function signature to the argument list signature at run-time to see which arguments can/must be passed unboxed, and then do boxing/unboxing as necessary. |
Smi is not a feature. It's an optimization of boxed integers. If there were no need to box integers, there would be no need for Smis. As @lrhn points out, boxing originates from Dart's dynamic features: if you want to eliminate boxing altogether, you need to eliminate dynamism. |
Performance is a feature :)
The goal is not to eliminate dynamism. The goal is to have predictable performance in code that requires it, which is usually a very small portion of the code. The cost of having to deal with some "advanced" language features (if we have them) is therefore relatively low, and at the same time it's where performance really, really matters. |
Even if we eliminated That in turn means a callsite that's passing an integer needs to use a calling convention that can handle that being stored as an Object parameter. That's also a source of trouble for us, isn't it? We could potentially do what Java and C# do and move numbers out of the object hierarchy. That has its pros and cons, but might be a better trade-off than Dart's current approach which is to force all users to pay the cost of flexibility all the time, everywhere. I'd be really interested to see some data on how often real code actually uses the fact that numbers are a subtype of |
Let's take this example:
abstract class A {
  foo(int a);
}
class B extends A {
  @override
  foo(Object a) {
    print(a);
  }
}
main() {
  var b = B();
  b.foo(5);
  b.foo('hello');
}
Can we compile it to the following?
class B extends A {
  // Entry point for callers that pass an unboxed int.
  foo(int a) {
    foo$Object(box(a));
  }
  // Entry point for callers that pass any other (boxed) object.
  foo$Object(Object a) {
    print(a);
  }
}
main() {
  var b = B();
  b.foo(5);
  b.foo$Object('hello');
}
This would be the cleanest approach IMHO. An alternative to boxing is to encode
+💯 |
Maybe. If you have a method like If you have whole-program analyses, you might be able to cut down on some of those cases. Doing auto-boxing is what Java did. It means that generics only work for objects, so a Both Java and C# are just-in-time compiled, so you only pay for what you use. As for using tuples, I don't see how |
A couple of options:
I cannot rely on whole-program analysis when optimizing performance-sensitive parts of my code. Whole program analysis optimizations are useful as a last-minute vacuum sealer. I need language primitives that are predictable. For example, I should be able to make performance assumptions that cannot be invalidated by an unrelated part of the code in a large application. Unfortunately, by definition, using non-local information for optimizations is precisely what whole program analysis does.
I'm totally fine paying up to maybe 4x code duplication to gain performance of the 5% of code where performance matters.
My gut feeling is that it won't impact JS much, but it will likely not have any benefits either. To get benefits from unboxing in JS we should start thinking about WebAssembly (high time we did tbh).
This is not observable by the developer, so there's no difference. I was only suggesting an alternative in-memory representation. Instead of boxing the value, you can use fat pointers, like Go does.
The size overhead is only proportional to the amount of numerics that are boxed. My guess is that most of them do not need to be boxed. I think Golang is a good source of stats for stuff like that. |
#158 has received a large number of comments on a rather diverse set of topics. One of the threads was about C# enumerations that allow us to use a number as a bit vector, with a certain amount of static protection. Here is a comment that I added to the discussion in #158, which turns out to fit better here:
@JohnGalt1717, I understand that the Dart enum types will not do exactly the same thing unless they are radically redesigned, and, as @munificent mentioned here, those C# enumerations lack a number of capabilities and guarantees that we do have (and presumably won't give up) with Dart enum types. However, Dart is very likely to be extended with a mechanism like views, and they are directly aimed at supporting low-level operations on a highly performant representation, protected by a specific static type. So, setting out from the C# examples you mentioned, here's how we could provide support for it using views:
view BitSet on int {
  bool operator <=(X x) => this & x == this;
  bool operator >=(X x) => this & x == x;
  bool operator <(X x) => this != x && this <= x;
  bool operator >(X x) => this != x && this >= x;
}
view Languages extends BitSet on int {
  static const Languages
      CSharp = 0x0001,
      VBNET = 0x0002,
      VB6 = 0x0004,
      Cpp = 0x0008,
      FortranNET = 0x0010,
      JSharp = 0x0020,
      MSIL = 0x0080,
      All = CSharp | VBNET | VB6 | Cpp | FortranNET | JSharp | MSIL,
      VBOnly = VBNET | VB6,
      NonVB = CSharp | Cpp | FortranNET | JSharp | MSIL;
  Languages operator |(Languages other) => this | other;
  Languages operator &(Languages other) => this & other;
}
view Days extends BitSet on int {
  static const Days
      Monday = 0x0001,
      Tuesday = 0x0002,
      Wednesday = 0x0004,
      Thursday = 0x0008,
      Friday = 0x0010,
      Saturday = 0x0020,
      Sunday = 0x0040,
      Weekend = Saturday | Sunday;
  Days operator |(Days other) => this | other;
  Days operator &(Days other) => this & other;
}
void main() {
  // Usage of `Languages` as a bit set.
  Languages lang = Languages.CSharp | Languages.MSIL;
  print(lang <= Languages.NonVB); // Subset relation: 'true'.
  print(Languages.FortranNET <= lang); // Membership: 'false'.
  print(lang == lang); // Equality: 'true'.
  // Different bit sets are statically separate.
  Days days = Days.Weekend | Days.Wednesday;
  lang = days; // Compile-time error.
}
With views, different bit sets can be declared (like This kind of bit sets do not restrict the values (we can introduce the value I think this illustrates that the whole discussion about this kind of feature in Dart may be important in its own right, but it might not be relevant to discussions about |
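The view syntax above was a proposal. Dart 3.3 and later provide extension types, which give a similar static wrapper around a representation type. The following is a minimal sketch of the same bit-set idea under that assumption; the isSubsetOf helper, the lower-camel-case constant names, and the reduced set of constants are choices made for this illustration, not part of the comment above.

```dart
// Sketch only, assuming Dart 3.3+ (extension types). Each type wraps an int
// at compile time only; no wrapper object exists at run time.

extension type const Languages(int bits) {
  static const cSharp = Languages(0x0001);
  static const vbNet = Languages(0x0002);
  static const vb6 = Languages(0x0004);
  static const msil = Languages(0x0080);
  static const vbOnly = Languages(0x0002 | 0x0004);

  // Combine or intersect two sets by operating on the underlying ints.
  Languages operator |(Languages other) => Languages(bits | other.bits);
  Languages operator &(Languages other) => Languages(bits & other.bits);

  /// True when every bit set in this value is also set in [other].
  bool isSubsetOf(Languages other) => (bits & other.bits) == bits;
}

extension type const Days(int bits) {
  static const saturday = Days(0x0020);
  static const sunday = Days(0x0040);

  Days operator |(Days other) => Days(bits | other.bits);
}

void main() {
  var lang = Languages.cSharp | Languages.msil;
  print(Languages.cSharp.isSubsetOf(lang)); // true
  print(lang.isSubsetOf(Languages.vbOnly)); // false

  final weekend = Days.saturday | Days.sunday;
  print(weekend.bits); // 96

  // Different bit sets are statically separate:
  // lang = weekend; // Compile-time error: Days is not a Languages.
}
```

Because neither type declares `implements int`, the int operators are not exposed on it, so only the operations declared here are available.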
@eernstg That's great, however the root issue with enums as they stand, right now for every Dart developer I've talked to, is that you can't serialize them properly without massive amounts of boilerplate and hackery, which makes Dart incompatible with most APIs that other languages generate.
You're literally adding an entirely new class of language idioms for something that can be solved by making enum be implicitly a bit shifted num which would, by definition of num, still be inherited from object but would allow all of the bitwise operators, and allow value assignment exactly like C# with no ceremony, and wouldn't break anything, because you can easily check at compile time that assignments are done enum to enum or explicitly with a cast, exactly like C# does, which doesn't break the paradigm and still allows full switch validation as well, simply because the switch is on the enum type (which, as far as I can tell, is all that enums in Dart do; there's literally nothing else to them).
And if the user wanted to break out of the box they can explicitly cast the enum to an int and then write the switch based on that and add whatever other arbitrary values they abused enums with. Or you could set an analyzer option that would prevent arbitrary assignment even with a cast and voila, no abuse allowed. You could even make that the default. But of course then you'd break easy deserialization of ints in json/protobuf to enums without either disabling it explicitly or some specific function on the enum that did it manually. The former would probably be preferable and allow ignore commands in the code on the file level as an example.
And since Dart's primary (virtually only) job in life at this point is to be the programming language for Flutter, this is of monumental concern because all you ever do is interoperate with other languages (and even if this wasn't the case, you can see the mess this creates even with Google's own gRPC, even if you're interoperating with a Dart server).
If Dart gets value assignments to enums, then, since I don't use enums in any way that is performant or has a critical path, all of this is immaterial to me on a day to day basis. But the point of this topic is well taken because indeed, the problem is that Dart enums are objects. There's absolutely no downside to making enums inherit from num (or int, or whatever else, since C# allows you to specify any numeric type) which itself inherits from object. No code would break as a result of doing so. Nor would any code break if you then use ordinal position in the enum to assign an int if not specified at compile time to every value of the enum like C# does, and then allow users to assign (with = !) the values explicitly if they so choose like C# does and then enable implicitly the bitwise operators (<<,>>,<,>,=,&,|,&=,|= etc). Nor would it introduce a new set of bugs or possible issues, because the compiler by default would throw on any out of bounds assignment that wasn't explicitly cast. (presumably) which would then enable switches to still be exhaustive. You could further have a separate code path that allows other types to be defined for enums if you so desired by simply allowing the user to assign non-numerics to the enum, and compile/analyser time validating that all types assigned are the same (or not if you want to really shoot yourself in the head). And this could even be a different implementation of the same methodology. Standard num based enums can be hard coded and work as they do right now, but with all of the above with:
This would automatically and implicitly be like the original suggestion of "on int". And if you want to put other values in:
and omitting the type declaration but specifying something other than a num, results in the generic version being used by the compiler/analyzer and it automatically picks the best, most specific type based on the values provided (which Dart already does in lots of cases with numeric values, choosing int instead of double etc.) and if they are mixed types that don't share a type other than object (i.e. strings and ints) then it can just choose dynamic or object as the type. And this implementation then doesn't have the bitwise operations available, which causes a compile time failure if you try and use them, AND uses a set or map in the background.
This would also enable deserialization from the int (or other representation) as simply someVariable = map["field"] as SomeEnum, and the other way would just be map["field"] = someVariable (which would automatically assign the root type to the map on serialization because that could be gotten at runtime when doing the serialization). If you skipped the as SomeEnum in the above, then it would throw because the implicit cast wouldn't be valid, even if map["field"] wasn't dynamic and was a variable of type int, because of no implicit casts being able to be set in your analyzer configuration.
Unless I just don't see it, your views implementation doesn't fix the massive amount of boilerplate that is required in this case with enums as they stand and in the view suggestion (i.e. I have to have a switch to map them still from the int to the actual value, and another switch to map to an int to serialize, which is annoying and error prone).
This might not be the case if operator overriding allowed you to override with types that weren't the type on = and handle an int comparison and have it work as explicit assignment in both directions) As for the request for Java/Kotlin style functionality, I'd suggest again, the C# methodology which is the Enumeration class from which you can inherit, which takes a generic of the enum, which you can define inline and allows you to do whatever you want with methods, constructors, etc. etc. etc. while maintaining all of the functionality of enums. It literally gives you everything that Java/Kotlin does with their implementation while still giving you all of the superior C# properties of enums, with no downside and you don't have to add yet another paradigm to the language. You've incrementally improved Dart's enums, provided all of the functionality that everyone is asking for, and made them more performant by having them, by default, always be operations on a bitmask instead of objects being allocated on the stack. I.e.:
Where TEnumType is restricted to the new Enum generic type restriction. (TEnumType extends Enum) And if you really wanted to be fancy, you could allow the enumeration to be defined inline:
(and you could easily use your view implementation to do the same) Which would of course make the enum not publicly referenceable outside of this class, which would give you exactly the same functionality as Java in one tight little package, while still enabling highly performant operations even within the class on the enum from which the class is based. (and I'm not hung up on the fancy syntax above, you could just have it as a final on the class that has to be passed in or defined or whatever else you want, doesn't really matter) I believe the above addresses every single concern, and since every single example given in the previous comment about functionality in dart that C# doesn't provide is actually a feature in C# (i.e. not allowing overriding and making a more specific definition with an enum because C# enums don't inherit (they do, but the compiler refuses it for really good reasons) from object, are addressed in C# with the new keyword to replace the inherited implementation if you want, and in Dart, you can easily allow this more specific implementation because enum would inherit from int/num/whatever numeric type you want to set it to, and those themselves inherit from object and you can just make the compiler allow it unlike C# that prevents you from doing this because of an entire class of bugs you'll create doing so. If there is other things that Dart enums can do that C# enums can't do, I'd love to see them. I've used C# enums for 2 decades and never once run into any obstacles with them that it wasn't better that I did and approached it differently, especially once they implemented Enumeration but I endlessly am running into issues with Dart enums, all for the exhaustive case ability, for which the C# compiler now warns by default, and can be set to an error if you want, doing exactly what Dart does thus making C# handle every case of Dart enums AND every case of Java/Kotlin enums. 
The only thing I can think of is multiple assignments to the same value (which is insane and a Rust "feature" because of no null instead of nullable types IMHO) but you could easily do this with the above using Unions, sets, maps or anything else as the root type of the enum and you could even validate those values in the map that was created based on the "on ..." type in the definition if you wanted. |
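For reference, the int round trip discussed in the comment above can be sketched with enhanced enums, which are available from Dart 2.17. This is only an illustration under that assumption: Status, wireValue and fromWire are hypothetical names, and the sketch does not give enums bitwise operators or an integer representation.

```dart
// Sketch only, assuming Dart 2.17+ (enhanced enums). All names are
// hypothetical and chosen for this illustration.

enum Status {
  active(1),
  suspended(2),
  deleted(4);

  const Status(this.wireValue);

  /// Integer used on the wire (JSON, protobuf, ...).
  final int wireValue;

  /// Looks up the enum value for [value]; throws on unknown input.
  static Status fromWire(int value) => values.firstWhere(
        (s) => s.wireValue == value,
        orElse: () =>
            throw ArgumentError.value(value, 'value', 'unknown Status'),
      );
}

void main() {
  // Serialize: store the explicit integer rather than the enum object.
  final map = <String, Object?>{'field': Status.suspended.wireValue};
  print(map); // {field: 2}

  // Deserialize: one lookup, no hand-written switch per enum.
  final decoded = Status.fromWire(map['field'] as int);
  print(decoded); // Status.suspended
}
```

The values are still objects, so the memory cost that motivates this issue is unchanged; the sketch only addresses the two-switch boilerplate.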
There are like a hundred things to comment on, but it gets overwhelming, so let me just respond to a couple of things:
I assume you're claiming that there is no downside to do so in Dart, such that the claim is relevant here. The Dart notion of a cast ( In C# we can cast out of an enum type and into another one, and this may involve significant changes to the target of the cast (even when it is implicit):
using System;
enum A { a1, a2, a3 }
enum B { b1, b2 }
public class Program
{
    public static void Main()
    {
        object o = A.a2;
        B b = (B)o;
        Console.WriteLine(b.ToString()); // 'b2'.
    }
}
IIUC, the inline integer representation of the enum In other words, the language makes no attempt to ensure that a variable of type Perhaps you will say "so don't do that", but the situation may not be so simple: You could use software written and maintained by others, and it just takes one line of code in a million line program to introduce the reinterpretation. In Dart, an enum value has a type which is maintained robustly:
enum A { a1, a2, a3 }
enum B { b1, b2 }
void main() {
  Object o = A.a2;
  var b = o as B; // Throws.
  print(b.toString()); // Not reached.
}
If we were to change Dart enums such that they would inherit from a numeric type (any of them) and use the same representation, then we could not maintain the type information at run time. In other words, Dart enums would then be just as leaky as C# enums.
I'm not saying that the "enums are just bits" approach of C# is wrong, I'm just saying that it is completely different from the approach in Dart where every object maintains its own integrity by having a specific type. So it doesn't make sense if you claim that "Dart enums could just do the same thing as C#", because that would be massively breaking. And I'm also not at all convinced that the Dart approach is wrong. It just has different trade-offs, and different trade-offs correspond to different software designs. So of course you will be unhappy if you insist on writing C# enums in Dart using Dart enums. That's probably true for any language mechanism from different languages, for instance, you shouldn't try to write Haskell functions in C.
That is a choice, and I'm probably not the only person who thinks that the discussion stops if you just burst ahead and insist that it is a problem. Different trade-offs in programming languages give rise to different software designs, and it makes a lot of sense that you would use enums in C# in a very different way than you would use enums in Dart. But if the C# mechanism is useful then it is certainly also useful to investigate how a similar mechanism could be expressed in Dart. But it won't be called "enums"! I happen to believe that views would be a good starting point, because they are specifically targeted at building a static harness around a low-level representation, and they allow for all operations to be resolved statically (so they can be inlined, etc). Coincidentally, views are also leaky in the sense that they allow the underlying representation to be accessed (say, a I'll stop now, because it gets overwhelming to respond to every detail. I won't promise to respond to long discussion threads. But, @JohnGalt1717, please keep in mind that if you can't push the world into a shape that fits your head, you may need to reshape your head a little bit such that it fits the world. |
The Dart IDE integration allows completing EnumClass.name where something of type EnumClass is required. However, Dart enums are objects, and a large system using complex protobufs might need to allocate ~10000 such enum objects (this is actually happening). That is a serious memory and start-up time impact on the application.
The currently available alternative is to just use constant integers, like:
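A minimal sketch of that constant-integer pattern, assuming the member names foo, bar and baz that appear elsewhere in this post. The values 0, 1, 2 and the describe helper are hypothetical and only there to make the example runnable.

```dart
// Sketch only: plain integer constants grouped in a class that serves as a
// namespace. No enum objects are allocated.
abstract class EnumClass {
  static const int foo = 0;
  static const int bar = 1;
  static const int baz = 2;
}

/// Hypothetical consumer. The parameter is just an `int`, so nothing stops a
/// caller from passing a value that is not one of the constants.
String describe(int value) {
  switch (value) {
    case EnumClass.foo:
      return 'foo';
    case EnumClass.bar:
      return 'bar';
    case EnumClass.baz:
      return 'baz';
    default:
      return 'unknown($value)';
  }
}

void main() {
  print(describe(EnumClass.bar)); // bar
  print(describe(42)); // unknown(42)
}
```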
However, that pattern is less readable, less writable and doesn't complete well in the editor. You have to write EnumClass before it lets you complete to .foo.
We could introduce a special kind of type-aliased enum, say:
That is equivalent to the above class, and lets EnumClass be used as an alias for int. It also lets the code completion recognize it as an enum.
(If we want to really treat it as a closed enum, we could allow assignment from EnumClass to int, but not the other direction. If we just want to treat it as an int alias and allow foo + baz to have type EnumClass, then that won't work.)
We could generalize that and introduce type aliases for any type, and allow static declarations on those types:
and improve code completion to include all static members returning something of the same type as the containing type. So, if the context type is EnumClass, it would propose completing to EnumClass.foo, EnumClass.bar and EnumClass.baz, just as for an enum, and we would also complete static factory methods, not just constructors.
Or maybe we can use static extension types (#42) to get the type alias, and maybe even cast functions to/from int on EnumClass. If we plan to get static extension types, we should make sure that whatever we do here will work with that syntax too.