Skip to content

Severe performance regression #125

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zenazn opened this issue Jun 27, 2016 · 11 comments
Closed

Severe performance regression #125

zenazn opened this issue Jun 27, 2016 · 11 comments

Comments

@zenazn
Copy link
Contributor

zenazn commented Jun 27, 2016

In the process of upgrading an application from 873d14f to 57fcf4a, we noticed that our application performed SFTP uploads roughly 5x slower than before (download speed, oddly enough, was greatly improved!)

Doing a full bisection against production is impractical for us, but I believe I've found a microbenchmark in this package that is indicative of this performance regression: BenchmarkWrite1k. Prior to 2960014, I got the following results on my laptop (Early 2013 15" MBP; Go 1.6.2):

$ go test -run NONE -bench BenchmarkWrite1k -integration -sftp /usr/libexec/sftp-server -count 10
PASS
BenchmarkWrite1k-8         5     257684948 ns/op      40.69 MB/s
BenchmarkWrite1k-8         5     258874739 ns/op      40.51 MB/s
BenchmarkWrite1k-8         5     257781516 ns/op      40.68 MB/s
BenchmarkWrite1k-8         5     258480088 ns/op      40.57 MB/s
BenchmarkWrite1k-8         5     254071386 ns/op      41.27 MB/s
BenchmarkWrite1k-8         5     258690412 ns/op      40.53 MB/s
BenchmarkWrite1k-8         5     298626463 ns/op      35.11 MB/s
BenchmarkWrite1k-8         5     258688959 ns/op      40.53 MB/s
BenchmarkWrite1k-8         5     255352022 ns/op      41.06 MB/s
BenchmarkWrite1k-8         5     262082224 ns/op      40.01 MB/s
ok      github.com/pkg/sftp 23.966s

After 2960014, I get:

$ go test -run NONE -bench BenchmarkWrite1k -integration -sftp /usr/libexec/sftp-server -count 10
PASS
BenchmarkWrite1k-8         1    2363737801 ns/op       4.44 MB/s
BenchmarkWrite1k-8         1    2401106692 ns/op       4.37 MB/s
BenchmarkWrite1k-8         1    2373019226 ns/op       4.42 MB/s
BenchmarkWrite1k-8         1    2316647765 ns/op       4.53 MB/s
BenchmarkWrite1k-8         1    1319017375 ns/op       7.95 MB/s
BenchmarkWrite1k-8         1    2357954728 ns/op       4.45 MB/s
BenchmarkWrite1k-8         1    2379846631 ns/op       4.41 MB/s
BenchmarkWrite1k-8         1    1364000038 ns/op       7.69 MB/s
BenchmarkWrite1k-8         1    1186599504 ns/op       8.84 MB/s
BenchmarkWrite1k-8         1    1259262555 ns/op       8.33 MB/s
ok      github.com/pkg/sftp 19.487s

Note that in each case I've cherry-picked 587af18cfc86e31167f25d5e12b9a3e588b8775b to get the package to compile on Darwin.

I've tried all my usual debugging and profiling tricks, and it's not obviously any of the usual suspects. Any tips about where to look next? I'm also happy to provide more information about our production application, or to test a small number of patches against production (although the microbenchmark seems pretty damning on its own)

@zenazn
Copy link
Contributor Author

zenazn commented Jun 28, 2016

I ended up writing an "end-to-end" microbenchmark in the context of our application logic, and bisected our performance degradations to a complex interaction between our application abstractions, bufio.Writer, and (*sftp.File).ReadFrom (introduced in de236e8; not the commit I falsely accused above; although the performance regression for small writes from that commit still stands):

We've written an abstraction on top of package sftp to wrap all Opened and Created files with bufio Readers and Writers, respectively. This is because our application requires the use of many small reads and writes, which we need to batch together for performance.

We use io.Copy to shuffle bytes to or from an io.Pipe that bridges the sftp code and this small-reads-and-writes application logic. This has the effect of running bufio.Writer's ReadFrom method (or vice versa), called with our io.PipeReader as an argument. Unfortunately for us, the bufio.Writer has yet to buffer any bytes, and as an optimization, if the object bufio.Writer is wrapping itself supports a io.ReaderFrom, the empty bufio will dispatch the call to the wrapped object instead (https://github.com/golang/go/blob/b75b0630fe6d8f66f937f78f15f540b5b6dab24f/src/bufio/bufio.go#L679-L683).

The net result of this chain of optimizations ("optimizations") is that (*sftp.File).ReadFrom gets called with the unbuffered, small-writes stream, defeating the purpose of our use of bufio and causing (at least under our network conditions) the 5x slowdown we were seeing in production.

I'm really not sure where to file this bug (the obvious contenders being this package and bufio), so I'll leave this bug report open in case you folks have opinions on how to triage it. Also, the original microbenchmark performance regression from my original issue still stands, and while I'm not sure how (if at all) it affects real production workloads, it might still be worth a look.

And while hopefully it goes without saying, thanks for maintaining this package and being my rubber duck! :)

@davecheney
Copy link
Member

Thanks for digging into this, please keep us informed.

On Tue, Jun 28, 2016 at 3:43 PM, Carl Jackson [email protected]
wrote:

I ended up writing an "end-to-end" microbenchmark in the context of our
application logic, and bisected our performance degradations to a complex
interaction between our application abstractions, bufio.Writer, and
(*sftp.File).ReadFrom (introduced in de236e8
de236e8;
not the commit I falsely accused above; although the performance regression
for small writes from that commit still stands):

We've written an abstraction on top of package sftp to wrap all Opened
and Created files with bufio Readers and Writers, respectively. This is
because our application requires the use of many small reads and writes,
which we need to batch together for performance.

We use io.Copy to shuffle bytes to or from an io.Pipe that bridges the
sftp code and this small-reads-and-writes application logic. This has the
effect of running bufio.Writer's ReadFrom method (or vice versa), called
with our io.PipeReader as an argument. Unfortunately for us, the
bufio.Writer has yet to buffer any bytes, and as an optimization, if the
object bufio.Writer is wrapping itself supports a io.ReaderFrom, the
empty bufio will dispatch the call to the wrapped object instead (
https://github.com/golang/go/blob/b75b0630fe6d8f66f937f78f15f540b5b6dab24f/src/bufio/bufio.go#L679-L683
).

The net result of this chain of optimizations ("optimizations") is that
(*sftp.File).ReadFrom gets called with the unbuffered, small-writes
stream, defeating the purpose of our use of bufio and causing (at least
under our network conditions) the 5x slowdown we were seeing in production.

I'm really not sure where to file this bug (the obvious contenders being
this package and bufio), so I'll leave this bug report open in case you
folks have opinions on how to triage it. Also, the original microbenchmark
performance regression from my original issue still stands, and while I'm
not sure how (if at all) it affects real production workloads, it might
still be worth a look.

And while hopefully it goes without saying, thanks for maintaining this
package and being my rubber duck! :)


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#125 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAAcA0X9rY0nGq5DMkrGqZdYwnxq7NSKks5qQLSbgaJpZM4I_mbm
.

@eikenb
Copy link
Member

eikenb commented Jan 7, 2017

Do you have anything more to add? If you have a work around I'd be interested in seeing it. Maybe you could submit a pull request with it?

Thanks.

@unclejack
Copy link
Contributor

This performance issue is still an issue. I haven't looked into this much so far.

@eikenb
Copy link
Member

eikenb commented Feb 28, 2017

Based on my reading of your explanations it seems like you expect bufio to work like a framebuffer. Where it stores up data in the buffer and passes it along only when the buffer is full. The go developers seem to have gone more with the more traditional idea of a data buffer that fills up only if there is nothing to read from it. That is a stream that can buffer to deal with the reader/writer being out of sync. The sftp.File objects mimics this behavior.

So I guess I'm not sure if this is a bug with either this package or bufio.

@eikenb
Copy link
Member

eikenb commented Dec 26, 2017

@unclejack Reviewing the issues again and I can see an argument for this being a bug in bufio. You use it explicitly when you want to buffer data but when the underlying writer supports ReadFrom it skips buffering, which was the whole point of using it.

I think I might file a ticket it to see what the go-devs think.

@eikenb
Copy link
Member

eikenb commented Dec 27, 2017

@unclejack Have you tried...

var f *sftp.File = ... // your writer

type writerOnly struct { io.Writer }
bw := bufio.NewWriter(writerOnly{f}) // no ReadFrom()

Thanks @calmh for affirming this as a good workaround.

@eikenb
Copy link
Member

eikenb commented Jan 12, 2018

IMO this is a bug in either the specification of ReadFrom(), that you shouldn't implement it on slow writers, or a bug in the implementation of ReadFrom() skipping buffering.

Either way the fix for the moment seems to be to document it and the way to deal with it. I'm going to write an example for ReadFrom that demonstrates the workaround.

@eikenb eikenb closed this as completed in 72ec6e8 Jan 12, 2018
@puellanivis
Copy link
Collaborator

Small aside here, my understanding has always been that ReadFrom and WriteTo should only be provided when they can be written to be “the most efficient way of doing this operation”.

Basically, because the operations are defined as “write the whole reader from/to” the operation is able to make much more intimate assumptions using knowledge of the internal data type. Taking a look at the bytes.Buffer.ReadFrom for example… wow. That’s some super “clever” code right there…

@eikenb
Copy link
Member

eikenb commented Jan 14, 2018

I'm curious how you came to this understanding? Reading the documentation it describes it as a different semantic for writing, a convenience function for when you have an io.Reader. There is nothing that implies it is for optimizing IO. It only seems that way if/when you read the ReadFrom implementations in the standard library.

@puellanivis
Copy link
Collaborator

Indeed, my understand has been based only upon the implementations and the “niche” use. One could always use io.Copy to do a naive copying, even if ReadFrom doesn’t exist, so… why should such a function exist?

Because in that particular situation one can potentially do a better job than a naive io.Copy (and thus why io.Copy defers to the function if it exists.)

So, it’s kind of an inference about purpose based on its existence, and thus it should have a rational purpose, but since io.Copy exists, “mere convenience” doesn’t seem like a rational one.

But yes, the documentation itself is unclear about this purpose… so, this understanding certainly is an exogenesis.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants