Skip to content

Using for-loops with the ".." operator produces suboptimal code #9548

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
teo-tsirpanis opened this issue Jun 23, 2020 · 7 comments · Fixed by #16650
Closed

Using for-loops with the ".." operator produces suboptimal code #9548

teo-tsirpanis opened this issue Jun 23, 2020 · 7 comments · Fixed by #16650
Labels
Area-Compiler-Optimization The F# optimizer, release code gen etc. Feature Request
Milestone

Comments

@teo-tsirpanis
Copy link
Contributor

teo-tsirpanis commented Jun 23, 2020

Consider the following snippet:

open System

let f() =
  for x in 0 .. 15 do
    Console.WriteLine x

It produces a nice, C#esque for-loop. Now consider this one:

open System

let g() =
  for x in 0 .. 2 .. 15 do
    Console.WriteLine x

It creates a sequence and then enumerates it. And it also has a consecutive pair of ldnull; pop; twice, as well as a needless unit local.

Producing a for (int i = 0; i < 15; i += 2) in the second case would have been much more efficient. We could even generalize it to arbitrary types with an addition operator, not just ints. Furthermore, in the first example, replacing the loop with for x in 0.0 .. 15.0 creates a sequence too.

In addition, using like for x in [0 .. 15] does correctly produce a list, but the compiler should warn to better remove the brackets when using a ranged list literal in a loop.

@abelbraaksma
Copy link
Contributor

abelbraaksma commented Jun 24, 2020

for x in 0 .. 15 do

It creates a sequence. It's twice as fast if you use the for.. to syntax.

I believe there's a feature underway that allows for such optimizations, though, which would blur the performance differences more.

Ultimately, both your examples should compile into the IL code for a while loop, like your c# suggestion actually is.

@teo-tsirpanis
Copy link
Contributor Author

It creates a sequence. It's twice as fast if you use the for.. to syntax.

No, it doesn't. The compiler generates identical code.

image

This happens only on integers, and not when the double .. .. operator is used.

@abelbraaksma
Copy link
Contributor

abelbraaksma commented Jun 24, 2020

That's interesting. For sure, only a few days ago, I noticed the opposite, and I was considering filing a bug for it. I'll try to find that back, as that may suggest that it depends on the situation.

EDIT @teo-tsirpanis: the only thing I could find back was proof of the inverse, notes in my code saying that for x in y where y is an array, results in optimal code, that is, indexed on the array as opposed to using the ArrayEnumerator. Checking IL and cpu disassembly further backs the claim that there's no difference between for x=0 to 100 and for x in 0 .. 100; the difference is only conceptually.

Of course, differences do exist if the argument after in is indeed a sequence. In that case, even if the sequence were created from an array with Seq.ofArray (as opposed to a cast), the code will be sub-optimal.

Bottom line: as written you are absolutely right with your observation :)

@abelbraaksma
Copy link
Contributor

abelbraaksma commented Jun 24, 2020

Hmm, it can be subtle. Consider this:

> let f y x = for _=0 to y do for i in x do String.length i |> ignore;;
val f : y:int -> x:seq<string> -> unit
> let g y (x: array<_>) = for _=0 to y do for i in x do String.length i |> ignore;;
val g : y:int -> x:string array -> unit

Now, when you call f with an array, the optimization will not kick in, and the enumerator is called. In this case leading to about 4x perf loss. If you call g (which, of course, cannot be called with any sequence), the difference becomes clear.

If, instead, you were to do a type-test before entering the loop and use two paths for sequence and arrays, you get the original perf of g back again.

There's something to be said to optimize this in the compiler too (though not directly your original request, sorry for digressing).

> let values = [|"a";"a";"a";"a";"a";"a";"a";"a";"a"|];;
> f 10_000_000 values;;
Real: 00:00:00.846, CPU: 00:00:00.842, GC gen0: 51, gen1: 1, gen2: 0
> g 10_000_000 values;;
Real: 00:00:00.113, CPU: 00:00:00.124, GC gen0: 0, gen1: 0, gen2: 0

@cartermp
Copy link
Contributor

Note that the optimization does not happen for other numeric types: https://sharplab.io/#v2:DYLgZgzgPmD2BOACAHoglgO0QBgDIDp8BGbPRAE1kQAd5MAXMLAIgAsBTYYWZoA=

Though it is admittedly not as straightforward, since you can't index into an array unless it's an int.

@teo-tsirpanis
Copy link
Contributor Author

If, instead, you were to do a type-test before entering the loop and use two paths for sequence and arrays, you get the original perf of g back again.

These two paths would require the entire loop body code to be duplicated. I'm not sure whether it's worth it. After all, the developer is fully aware of the performance implications the for in loop entails. The optimization that occurs when iterating a known IEnumerable subtype (it also works on lists as well) is good but should need no generalization.

Something better we could do is to perform type tests on Seq.iter but we have to consider the cost of a type test (or many type tests; if the seq to be itered is a real seq, it will first be tested against whether it is an array, a list, a Set and then it will be itered), and the fact that there are so many implementations of IEnumerable and we cannot specially treat them all.

you can't index into an array unless it's an int

@cartermp your snippet should be optimized into for x = 0L to 100L but we can't even do a for to loop with types other than integers.

@abelbraaksma
Copy link
Contributor

abelbraaksma commented Jun 24, 2020

and the fact that there are so many implementations of IEnumerable and we cannot specially treat them all.

I'd only be interested in the following cases, which I think make sense:

  • list and array
  • when the compiler can detect it statically, or the type test does not degrade performance

People should not be punished when using an array in a function that accepts a sequence (when we can help it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compiler-Optimization The F# optimizer, release code gen etc. Feature Request
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants