Description
Motivation
Its fairly convoluted to add an non-allocating struct enumerator to a class; and yield iterators which have a simpler syntax are allocating and also don't work as a class-level struct enumerable.
Related: System.Linq
is a wonderful feature, however it also allocates for all the IEnumerables; so it would be desirable to find a solution that supports a non-allocating struct-based Linq
or Value Linq
Background
Given a common or garden List class
public partial class List<T>
{
private T[] _items;
private int _size;
private int _version;
}
Adding an indexer is fairly straight forward using the this[]
property, and it would be good to have a similar ease of use for an enumerator; that is also non-allocating for yield enumerators.
Inspired by @davkean's twitter conversion on the verboseness of enumerators, @jaredpar's Rethinking IEnumerable and Immutable's IStrongEnumerable
As well as @nguerrera call to action that 140 chars was too small to convey a design
Proposal
Contract
public interface IValueEnumerable<T, TEnumerator>
where TEnumerator : struct, IValueEnumerator<T>
{
TEnumerator GetValueEnumerator();
}
public interface IValueEnumerator<T> : IDisposable
{
T TryGetNext(out bool success);
}
Contextual keyword
New method_modifier to specify the method is a ValueEnumerable which is genericly typed enumerable<>
method_modifier
: 'new'
| 'public'
| 'protected'
| 'internal'
| 'private'
| 'static'
| 'virtual'
| 'sealed'
| 'override'
| 'abstract'
| 'extern'
| 'async'
| method_modifier_unsafe
+ | method_modifier_enumerable
;
+method_modifier_enumerable
+ : 'enumerable'<' type_argument'>'
+ ;
Used with a member_name it is a struct-based iterator and works with yield
Used without a member_name it is a struct-based class-level enumerable and works with yield
method_header
: attributes? method_modifier* 'partial'? return_type member_name type_parameter_list?
'(' formal_parameter_list? ')' type_parameter_constraints_clause*
+ | class_enumerable_header
;
+class_enumerable_header
+ : attributes? method_modifier* 'partial'? return_type type_parameter_list?
+ '()' type_parameter_constraints_clause*
+ | class_enumerable_pass_through_header
+ ;
+class_enumerable_pass_through_header
+ : attributes? method_modifier* type_parameter_list?
+ '()' type_parameter_constraints_clause*
+ ;
Usage
Class-Level Iterator
public enumerable<T> StructTypeName()
Convention: public enumerable<T> ValueEnumerable()
partial class List<T>
{
// Developer types for class-level enumerator
public enumerable<T> ValueEnumerable() // typeof(ValueEnumerable)
{
int version = _version;
int index = 0;
while ((uint)index < (uint)_size)
{
if (version == _version) throw new InvalidOperationException("List changed");
index++;
yield return _items[index - 1];
}
}
// auto implements: ValueEnumerable.Enumerator GetValueEnumerator()
}
Change to iterator Stack sample to utilize new fast enumerator
partial class Stack<T>
{
// Current: Developer types for interface class-level enumerator
public IEnumerator<T> GetEnumerator() {
for (int i = count - 1; i >= 0; --i) yield return items[i];
}
// New: Developer types for struct class-level enumerator
public enumerable<T> ValueEnumerable() { // typeof(ValueEnumerable)
for (int i = count - 1; i >= 0; --i) yield return items[i];
}
// auto implements: ValueEnumerable.Enumerator GetValueEnumerator()
}
The type name of the Enumerator is part of the method_header declaration so user is at liberty to define it as
public enumerable<T> SomeBadName()
// typeof(SomeBadName) && typeof(SomeBadName.Enumerator)
Method Iterator
public enumerable<T> StructTypeName MethodName()
Convention: public enumerable<T> MethodNameEnumerable MethodName()
Used as return type from a method it is a method iterator and works with yield
// Developer types for method iterator
public enumerable<T> SkipEnumerable Skip(int toSkip) // typeof(SkipEnumerable)
{
int version = _version;
int index = toSkip;
while ((uint)index < (uint)_size)
{
if (version == _version) throw new InvalidOperationException("List changed");
index++;
yield return _items[index - 1];
}
}
The type name of the Enumerator is part of the method_header declaration so user is at liberty to define it as
public enumerable<T> OtherBadName Skip(int toSkip)
// typeof(OtherBadName) && typeof(OtherBadName.Enumerator)
Pass-through Iterator
They should also be able to be pass-through chained without generating another enumerable type. The class enumerable will have to re-specify enumerable<>
to identify the method; a method to method will not:
// Class-level to method iterator, specified return type
public enumerable<T> SkipEnumerable.Enumerator()
=> Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)
As the class-level enumerable; needs to refers to the Enumerator subtype of the Enumerable a simpler inferred type pass-through syntax will be allowed:
// Class-level to method iterator, inferred return type
public enumerable<T>()
=> Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)
Method Iterator pass-through is a simple method mapping
// method to method iterator, return type is always specified
public SkipEnumerable SkipTen() => Skip(toSkip: 10); // typeof(SkipEnumerable)
Code-generation
Class interfaces
Using the class level enumerator will automatically implement the IValueEnumerable
interface on the class as well as IEnumerable<T>
if not already implemented
public partial class List<T>
// Compiler added from named TEnumerator for class
: IValueEnumerable<T, List<T>.ValueEnumerable.Enumerator>
// Added by complier if not already implmented on class
, IEnumerable<T>
{
Class-level enumerator
Example code for the above class enumerator that the compiler could generate
// Compiler emits
public ValueEnumerable.Enumerator GetValueEnumerator()
=> new ValueEnumerable.Enumerator(this);
// Nested static class to maintain naming consistency with method iterators
public static class ValueEnumerable
{
// Value Enumerator
public struct Enumerator : IValueEnumerator<T>
{
private int __state;
private List<T> __thisCapture;
private int version;
private int index;
internal Enumerator(List<T> thisCapture)
{
__state = 0;
__thisCapture = thisCapture;
version = thisCapture._version;
index = 0;
}
public T TryGetNext(out bool success)
{
switch (_state)
{
case 0:
if ((uint)index < (uint)__thisCapture._size)
{
if (version == __thisCapture._version) throw new InvalidOperationException("List changed");
index++;
success = true;
return __thisCapture._items[index - 1];
}
_state = 1;
goto case 1;
case 1:
default:
success = false;
return default(T);
}
}
public void Dispose() {}
public static implicit operator EnumeratorAdapter<Enumerator>(Enumerator enumerator)
{
return new EnumeratorAdapter<Enumerator>(enumerator);
}
}
}
Class-level enumerator Compatibility/Interop
If IEnumerator<T>
was not previously defined on the class so the compiler added it; it would also generate an adapter. Example code for the generated code:
// If GetEnumerator not defined Compiler also generates
public EnumeratorAdapter<ValueEnumerable.Enumerator> GetEnumerator() => GetValueEnumerator();
IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
GetEnumerator/IEnumerable interop
With a common EnumeratorAdapter
shared for all IValueEnumerator
s
public struct EnumeratorAdapter<TValueEnumerator> : IEnumerator<T>
where TValueEnumerator : IValueEnumerator<T>
{
private TValueEnumerator _enumerator;
private T _current;
internal Enumerator(TValueEnumerator enumerator)
{
_enumerator = enumerator;
_current = default(T);
}
public T Current => _current;
public bool MoveNext()
{
_current = _enumerator.TryGetNext(ref _state, out bool success);
return success;
}
public void Dispose() => _enumerator.Dispose();
object IEnumerator.Current => Current;
void IEnumerator.Reset() => throw new NotSupportedException();
}
Method iterator
// Developer types for method iterator
public enumerable<T> SkipEnumerable Skip(int toSkip) // typeof(SkipEnumerable)
{
// ...
}
// Compiler emits
public SkipEnumerable Skip(int toSkip) => new SkipEnumerable(this, toSkip);
public struct SkipEnumerable : IValueEnumerable<T, SkipEnumerable.Enumerator>
{
private List<T> _thisCapture;
private int _toSkip;
public SkipEnumerable(List<T> thisCapture, int toSkip)
{
_thisCapture = thisCapture;
_toSkip = toSkip;
}
public Enumerator GetValueEnumerator()
{
return new Enumerator(_thisCapture, _toSkip);
}
// Value Enumerator
public struct Enumerator : IValueEnumerator<T>
{
private int __state;
private List<T> __thisCapture;
private int version;
private int index;
internal Enumerator(List<T> thisCapture, int toSkip)
{
__state = 0;
__thisCapture = thisCapture;
version = thisCapture._version;
index = toSkip;
}
public T TryGetNext(out bool success)
{
switch (__state)
{
case 0:
if ((uint)index < (uint)__thisCapture._size)
{
if (version == __thisCapture._version) throw new InvalidOperationException("List changed");
index++;
success = true;
return __thisCapture._items[index - 1];
}
__state = 1;
goto case 1;
case 1:
default:
success = false;
return default(T);
}
}
public void Dispose() { }
public static implicit operator EnumeratorAdapter<Enumerator>(Enumerator enumerator)
{
return new EnumeratorAdapter<Enumerator>(enumerator);
}
}
}
Pass-through enumerator
The should return the called type
// Class-level to method iterator, explicit return type
public enumerable<T> SkipEnumerable.Enumerator()
=> Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)
// Class-level to method iterator, inferred return type
public enumerable<T>()
=> Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)
// Compiler emits for both
public SkipEnumerable.Enumerator GetValueEnumerator()
=> Skip(toSkip).GetValueEnumerator();
Already defined error if both type-inferred and type-specified class-level iterator
// method to method iterator, return type is always specified
public SkipEnumerable SkipTen() => Skip(toSkip: 10);
// Compiler behaves as now
Consuming the Enumerable
foreach
will bind to GetValueEnumerator
in preference to GetEnumerator
if available and usage as now:
// Developer types
foreach (T item in list) {
// Loop body
}
Code-generation
// Compiler emits
var enumerator = list.GetValueEnumerator();
try
{
T item = enumerator.TryGetNext(out bool success);
while (success)
{
// Loop body
item = enumerator.TryGetNext(out success);
}
}
finally
{
enumerator.Dispose();
}
Questions
Type inference
Linq style extensions (also see #974 (comment))
public static partial class Enumerable
{
public static int Count<TSource, TEnumerable, TEnumerator>(this TEnumerable source)
where TEnumerable : struct, IValueEnumerable<TSource, TEnumerator>
where TEnumerator : struct, IValueEnumerator<TSource>
{
int count = 0;
foreach (var _ in source)
{
checked
{
count++;
}
}
return count;
}
}
When used
var list = new List<int>();
list.Count();
Will error with
The type arguments for method "Enumerable.Count<TSource, TEnumerable, TEnumerator>(TEnumerable)" cannot be inferred from usage. Try specifying the type arguments directly.
However specify the types degrades the user experience significantly
Value Linq
Overload preference, struct generic to be preferred over interface extensions e.g.
Non-boxing
public static int Count<TSource, TEnumerable, TEnumerator>(this TEnumerable source)
where TEnumerable : struct, IValueEnumerable<TSource, TEnumerator>
where TEnumerator : struct, IValueEnumerator<TSource>
{
Preferred over cast to interface (that will eventually box)
public static int Count<TSource>(this IEnumerable<TSource> source)
{
Return vs out
Return T
or return bool
?
strangest part is the signature of TryGetNext:
T TryGetNext(out bool b)
It's demonstrably faster than
bool TryGetNext(out T elem)
e.g.
public interface IValueEnumerator<T> : IDisposable
{
T TryGetNext(out bool success);
}
vs
public interface IValueEnumerator<T> : IDisposable
{
bool TryGetNext(out T value);
}
Covariance
I said fast right 😉 Though open question...
/cc @JonHanna for thoughts on Value Linq use