Skip to content

ValueEnumerables (fast to code and run) #974

Closed
@benaadams

Description

@benaadams

Motivation

Its fairly convoluted to add an non-allocating struct enumerator to a class; and yield iterators which have a simpler syntax are allocating and also don't work as a class-level struct enumerable.

Related: System.Linq is a wonderful feature, however it also allocates for all the IEnumerables; so it would be desirable to find a solution that supports a non-allocating struct-based Linq or Value Linq

Background

Given a common or garden List class

public partial class List<T>
{
    private T[] _items;
    private int _size;
    private int _version;
}

Adding an indexer is fairly straight forward using the this[] property, and it would be good to have a similar ease of use for an enumerator; that is also non-allocating for yield enumerators.

Inspired by @davkean's twitter conversion on the verboseness of enumerators, @jaredpar's Rethinking IEnumerable and Immutable's IStrongEnumerable

As well as @nguerrera call to action that 140 chars was too small to convey a design

Proposal

Contract

public interface IValueEnumerable<T, TEnumerator>
    where TEnumerator : struct, IValueEnumerator<T>
{
    TEnumerator GetValueEnumerator();
}

public interface IValueEnumerator<T> : IDisposable
{
    T TryGetNext(out bool success);
}

Contextual keyword

New method_modifier to specify the method is a ValueEnumerable which is genericly typed enumerable<>

method_modifier
    : 'new'
    | 'public'
    | 'protected'
    | 'internal'
    | 'private'
    | 'static'
    | 'virtual'
    | 'sealed'
    | 'override'
    | 'abstract'
    | 'extern'
    | 'async'
    | method_modifier_unsafe
+   | method_modifier_enumerable
    ;

+method_modifier_enumerable
+   : 'enumerable'<' type_argument'>'
+   ;

Used with a member_name it is a struct-based iterator and works with yield

Used without a member_name it is a struct-based class-level enumerable and works with yield

method_header
    : attributes? method_modifier* 'partial'? return_type member_name type_parameter_list?
      '(' formal_parameter_list? ')' type_parameter_constraints_clause*
+   | class_enumerable_header
    ;

+class_enumerable_header
+   : attributes? method_modifier* 'partial'? return_type type_parameter_list?
+      '()' type_parameter_constraints_clause*
+   | class_enumerable_pass_through_header
+   ;

+class_enumerable_pass_through_header
+   : attributes? method_modifier* type_parameter_list?
+      '()' type_parameter_constraints_clause*
+   ;

Usage

Class-Level Iterator

public enumerable<T> StructTypeName()

Convention: public enumerable<T> ValueEnumerable()

partial class List<T>
{
    // Developer types for class-level enumerator
    public enumerable<T> ValueEnumerable() // typeof(ValueEnumerable)
    {
        int version = _version;
        int index = 0;

        while ((uint)index < (uint)_size)
        {
            if (version == _version) throw new InvalidOperationException("List changed");

            index++;
            yield return _items[index - 1];
        }
    }

    // auto implements: ValueEnumerable.Enumerator GetValueEnumerator()
}

Change to iterator Stack sample to utilize new fast enumerator

partial class Stack<T>
{
    // Current: Developer types for interface class-level enumerator
    public IEnumerator<T> GetEnumerator() {
        for (int i = count - 1; i >= 0; --i) yield return items[i];
    }

    // New: Developer types for struct class-level enumerator
    public enumerable<T> ValueEnumerable() { // typeof(ValueEnumerable)
        for (int i = count - 1; i >= 0; --i) yield return items[i];
    }

    // auto implements: ValueEnumerable.Enumerator GetValueEnumerator()
}

The type name of the Enumerator is part of the method_header declaration so user is at liberty to define it as

public enumerable<T> SomeBadName()
    //  typeof(SomeBadName) && typeof(SomeBadName.Enumerator) 

Method Iterator

public enumerable<T> StructTypeName MethodName()

Convention: public enumerable<T> MethodNameEnumerable MethodName()

Used as return type from a method it is a method iterator and works with yield

// Developer types for method iterator
public enumerable<T> SkipEnumerable Skip(int toSkip) // typeof(SkipEnumerable)
{
    int version = _version;
    int index = toSkip;

    while ((uint)index < (uint)_size)
    {
        if (version == _version) throw new InvalidOperationException("List changed");

        index++;
        yield return _items[index - 1];
    }
}

The type name of the Enumerator is part of the method_header declaration so user is at liberty to define it as

public enumerable<T> OtherBadName Skip(int toSkip)
    // typeof(OtherBadName) && typeof(OtherBadName.Enumerator) 

Pass-through Iterator

They should also be able to be pass-through chained without generating another enumerable type. The class enumerable will have to re-specify enumerable<> to identify the method; a method to method will not:

// Class-level to method iterator, specified return type
public enumerable<T> SkipEnumerable.Enumerator()
       => Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)

As the class-level enumerable; needs to refers to the Enumerator subtype of the Enumerable a simpler inferred type pass-through syntax will be allowed:

// Class-level to method iterator, inferred return type
public enumerable<T>() 
       => Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)

Method Iterator pass-through is a simple method mapping

// method to method iterator, return type is always specified
public SkipEnumerable SkipTen() => Skip(toSkip: 10); // typeof(SkipEnumerable)

Code-generation

Class interfaces

Using the class level enumerator will automatically implement the IValueEnumerable interface on the class as well as IEnumerable<T> if not already implemented

public partial class List<T>
    // Compiler added from named TEnumerator for class
    : IValueEnumerable<T, List<T>.ValueEnumerable.Enumerator>
    // Added by complier if not already implmented on class
    , IEnumerable<T>
{

Class-level enumerator

Example code for the above class enumerator that the compiler could generate

// Compiler emits
public ValueEnumerable.Enumerator GetValueEnumerator()
       => new ValueEnumerable.Enumerator(this);

// Nested static class to maintain naming consistency with method iterators
public static class ValueEnumerable
{
    // Value Enumerator 
    public struct Enumerator : IValueEnumerator<T>
    {
        private int __state;
        private List<T> __thisCapture;

        private int version;
        private int index;

        internal Enumerator(List<T> thisCapture)
        {
            __state = 0;
            __thisCapture = thisCapture;
            version = thisCapture._version;
            index = 0;
        }

        public T TryGetNext(out bool success)
        {
            switch (_state)
            {
                case 0:
                    if ((uint)index < (uint)__thisCapture._size)
                    {
                        if (version == __thisCapture._version) throw new InvalidOperationException("List changed");

                        index++;
                        success = true;
                        return __thisCapture._items[index - 1];
                    }
                    _state = 1;
                    goto case 1;
                case 1:
                default:
                    success = false;
                    return default(T);
            }

        }
        
        public void Dispose() {}
        
        public static implicit operator EnumeratorAdapter<Enumerator>(Enumerator enumerator)
        {
            return new EnumeratorAdapter<Enumerator>(enumerator);
        }
    }
}

Class-level enumerator Compatibility/Interop

If IEnumerator<T> was not previously defined on the class so the compiler added it; it would also generate an adapter. Example code for the generated code:

// If GetEnumerator not defined Compiler also generates
public EnumeratorAdapter<ValueEnumerable.Enumerator> GetEnumerator() => GetValueEnumerator();
IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

GetEnumerator/IEnumerable interop

With a common EnumeratorAdapter shared for all IValueEnumerators

public struct EnumeratorAdapter<TValueEnumerator> : IEnumerator<T>
        where TValueEnumerator : IValueEnumerator<T>
{
    private TValueEnumerator _enumerator;
    private T _current;

    internal Enumerator(TValueEnumerator enumerator)
    {
        _enumerator = enumerator;
        _current = default(T);
    }

    public T Current => _current;

    public bool MoveNext()
    {
        _current = _enumerator.TryGetNext(ref _state, out bool success);
        return success;
    }

    public void Dispose() => _enumerator.Dispose();
    object IEnumerator.Current => Current;
    void IEnumerator.Reset() => throw new NotSupportedException();
}

Method iterator

// Developer types for method iterator
public enumerable<T> SkipEnumerable Skip(int toSkip) // typeof(SkipEnumerable)
{
    // ...
}

// Compiler emits
public SkipEnumerable Skip(int toSkip) => new SkipEnumerable(this, toSkip);

public struct SkipEnumerable : IValueEnumerable<T, SkipEnumerable.Enumerator>
{
    private List<T> _thisCapture;
    private int _toSkip;

    public SkipEnumerable(List<T> thisCapture, int toSkip)
    {
        _thisCapture = thisCapture;
        _toSkip = toSkip;
    }

    public Enumerator GetValueEnumerator()
    {
        return new Enumerator(_thisCapture, _toSkip);
    }

    // Value Enumerator 
    public struct Enumerator : IValueEnumerator<T>
    {
        private int __state;
        private List<T> __thisCapture;

        private int version;
        private int index;

        internal Enumerator(List<T> thisCapture, int toSkip)
        {
            __state = 0;
            __thisCapture = thisCapture;
            version = thisCapture._version;
            index = toSkip;
        }

        public T TryGetNext(out bool success)
        {
            switch (__state)
            {
                case 0:
                    if ((uint)index < (uint)__thisCapture._size)
                    {
                        if (version == __thisCapture._version) throw new InvalidOperationException("List changed");

                        index++;
                        success = true;
                        return __thisCapture._items[index - 1];
                    }
                    __state = 1;
                    goto case 1;
                case 1:
                default:
                    success = false;
                    return default(T);
            }

        }

        public void Dispose() { }

        public static implicit operator EnumeratorAdapter<Enumerator>(Enumerator enumerator)
        {
            return new EnumeratorAdapter<Enumerator>(enumerator);
        }
    }
}

Pass-through enumerator

The should return the called type

// Class-level to method iterator, explicit return type
public enumerable<T> SkipEnumerable.Enumerator()
       => Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)

// Class-level to method iterator, inferred return type
public enumerable<T>() 
       => Skip(toSkip: 0).GetValueEnumerator(); // typeof(SkipEnumerable.Enumerator)

// Compiler emits for both
public SkipEnumerable.Enumerator GetValueEnumerator()
       => Skip(toSkip).GetValueEnumerator();

Already defined error if both type-inferred and type-specified class-level iterator

// method to method iterator, return type is always specified
public SkipEnumerable SkipTen() => Skip(toSkip: 10);

// Compiler behaves as now

Consuming the Enumerable

foreach will bind to GetValueEnumerator in preference to GetEnumerator if available and usage as now:

// Developer types 
foreach (T item in list) {
    // Loop body
}

Code-generation

// Compiler emits 
var enumerator = list.GetValueEnumerator();
try
{
    T item = enumerator.TryGetNext(out bool success);
    while (success)
    {
        // Loop body

        item = enumerator.TryGetNext(out success);
    }
}
finally
{
    enumerator.Dispose();
}

Questions

Type inference

Linq style extensions (also see #974 (comment))

public static partial class Enumerable
{
    public static int Count<TSource, TEnumerable, TEnumerator>(this TEnumerable source)
        where TEnumerable : struct, IValueEnumerable<TSource, TEnumerator>
        where TEnumerator : struct, IValueEnumerator<TSource>
    {
        int count = 0;
        foreach (var _ in source)
        {
            checked
            {
                count++;
            }
        }
        return count;
    }
}

When used

var list = new List<int>();
list.Count();

Will error with

The type arguments for method "Enumerable.Count<TSource, TEnumerable, TEnumerator>(TEnumerable)" cannot be inferred from usage. Try specifying the type arguments directly.
However specify the types degrades the user experience significantly

Value Linq

Overload preference, struct generic to be preferred over interface extensions e.g.

Non-boxing

public static int Count<TSource, TEnumerable, TEnumerator>(this TEnumerable source)
    where TEnumerable : struct, IValueEnumerable<TSource, TEnumerator>
    where TEnumerator : struct, IValueEnumerator<TSource>
{

Preferred over cast to interface (that will eventually box)

public static int Count<TSource>(this IEnumerable<TSource> source)
{

Return vs out

Return T or return bool?

@jaredpar mentions on twitter

strangest part is the signature of TryGetNext:
T TryGetNext(out bool b)
It's demonstrably faster than
bool TryGetNext(out T elem)

e.g.

public interface IValueEnumerator<T> : IDisposable
{
    T TryGetNext(out bool success);
}

vs

public interface IValueEnumerator<T> : IDisposable
{
    bool TryGetNext(out T value);
}

Covariance

I said fast right 😉 Though open question...

/cc @JonHanna for thoughts on Value Linq use

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions