runtime: fatal error: sweep increased allocation count (With go get) #19029
Comments
@aclements, I forget whether this was ever debugged. There was also #16778. |
@bradfitz, we're seeing the same issue (not with Stacktrace:
|
Ping @aclements, @rsc. |
@cynecx, what's the probability of a given 'go get -u -v github.com/nsf/gocode' crashing? I just ran 100 in a row without a problem, using Go 1.7.5 on Linux 3.13.0. @brauner, are you also using Go 1.7.5? Do you have a command that reproduces the problem reliably? And, unrelated but for my curiosity, why are you using GOARCH=386 instead of amd64? |
@rsc, it seems our test for Go 1.7.* ran Go 1.7. We are currently setting up a testrun with Go 1.7.5. @aclements, we've set up a testrun with |
@rsc, one more detail, the error seems independent of the arch we test on. It happens with equal probability on |
@brauner, another question: is there any unsafe or cgo in your application? For example, creating a pointer into the Go heap that points at an unallocated object can result in exactly this failure. |
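To illustrate why a pointer to an unallocated object can trip this check, here is a toy model. It is a simplified sketch and not the runtime's code: the `span` type and its fields are made up for illustration. The assumption is that the sweeper compares the number of marked slots in a span against the number of objects that were allocated in it before the sweep, and treats "more marked than allocated" as a fatal inconsistency.

```go
package main

import "fmt"

// span is a toy stand-in for a runtime span: a block of equally sized
// object slots. All names here are hypothetical.
type span struct {
	allocCount int    // objects known to be allocated before the sweep
	marked     []bool // one mark bit per slot, set by the GC mark phase
}

// sweep frees unmarked objects and returns how many were freed.
// It reports an error if more slots are marked than were ever allocated,
// which is the inconsistency the fatal error describes.
func (s *span) sweep() (freed int, err error) {
	live := 0
	for _, m := range s.marked {
		if m {
			live++
		}
	}
	if live > s.allocCount {
		return 0, fmt.Errorf("sweep increased allocation count: marked=%d allocated=%d",
			live, s.allocCount)
	}
	freed = s.allocCount - live
	s.allocCount = live
	return freed, nil
}

func main() {
	// Healthy span: three objects allocated, two still reachable.
	ok := &span{allocCount: 3, marked: []bool{true, false, true, false}}
	fmt.Println(ok.sweep())

	// Corrupted span: one object allocated, but a stray pointer caused
	// a free slot to be marked as well.
	bad := &span{allocCount: 1, marked: []bool{true, true, false, false}}
	fmt.Println(bad.sweep())
}
```

In this model the second span fails because the mark phase followed a pointer into a slot that was never handed out, which is the situation a bad unsafe or cgo pointer can create.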
@aclements, yes we're using
|
@aclements, the above paste is with |
@rsc, it does not seem to be reproducible with Go 1.7.5. |
@brauner, those traces are great, I see what's going on and I've reproduced at master. Checking to see what earlier versions the problem exists in. But you were seeing other failures (without checkmark) in Go 1.7. Are they gone in Go 1.7.5? Are they gone in Go 1.8rc3? |
@brauner, thanks, so just to confirm, you originally posted a failure beginning:
In a followup then you said that was from Go 1.7, and in your more recent followup you said that with Go 1.7.5 you are unable to reproduce that specific failure anymore. Can you confirm I've got that right? Thanks. |
Or maybe I just replied to the wrong issue... |
@rsc, so far we only managed to reproduce it with current Go master (2017-02-14). All other Go versions build fine. |
@brauner, but you had the original report about "sweep increased allocation count". Was that Go 1.7? |
@rsc, no. The original report was Go master (2017-02-14). |
@cynecx, are you able to reproduce the issue you saw? If so, how often? Can you run with |
Sorry about the delay.
|
Thanks. @aclements, that block is a Package struct from cmd/go, and the field at offset 1024 with the unmarked object is 'buildID string'. That's only updated in one place in the code:
which compiles to:
This looks right to me, provided runtime.writeBarrier and runtime.writebarrierptr are correct. There is another write for binary-only packages, but there aren't any binary-only packages involved here and p.BinaryOnly (the bool at offset 168) is false. |
I've got this too. Very repeatable. Let me know if there's anything I can do to help.
and with
my go env:
|
@skdp Seriously, I have no idea who you are. What's your bitbucket id, so I can add you to a private repo? I think it's better if we don't use this issue thread for having private conversations. |
@greyltc, nice traces, thanks. The next step is to figure out what the memory is. The big memory block being dumped starts with five []byte all taken from the same underlying array. That seems distinctive but I haven't been able to find a struct in the rclone sources or any libraries it uses that starts with five []byte. |
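For readers wondering why that pattern is recognizable in a dump: a slice header is three words (data pointer, length, capacity), so five []byte fields cut from one array show up as five pointers that all fall inside the same address range. The sketch below demonstrates this. The `record` struct and the `sharesBacking` helper are hypothetical and only stand in for whatever the real struct is.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sharesBacking reports whether s points into the backing array of base.
// The addresses are only compared, never converted back to pointers.
func sharesBacking(base, s []byte) bool {
	if cap(base) == 0 || cap(s) == 0 {
		return false
	}
	lo := uintptr(unsafe.Pointer(&base[:1][0]))
	hi := lo + uintptr(cap(base))
	p := uintptr(unsafe.Pointer(&s[:1][0]))
	return p >= lo && p < hi
}

// record is a made-up struct that starts with five []byte fields.
type record struct {
	a, b, c, d, e []byte
}

func main() {
	buf := []byte("alpha beta gamma delta eps")
	r := record{buf[0:5], buf[6:10], buf[11:16], buf[17:22], buf[23:26]}

	for i, f := range [][]byte{r.a, r.b, r.c, r.d, r.e} {
		fmt.Printf("field %d: %q shares buf: %v\n", i, f, sharesBacking(buf, f))
	}
	// Size of the struct in bytes: five slice headers of three words each.
	fmt.Println("struct size:", unsafe.Sizeof(r))
}
```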
That seems likely for the database/sql trybot failure. The other failures were reported on Go 1.7.5 (#19029 (comment)) and Go 1.8rc3 (#19029 (comment)), so it's probably a different failure. |
@greyltc, which exact rclone commit did you get that failure at? Also, it would be handy if you posted a few more failures (at the same commit or a different commit; just let us know). |
@aclements, this was with the 1.35 release, rclone/rclone@5b8b379 here are five more traces: https://gist.github.com/greyltc/5a75b4bdcf8538c54e53472c857b17a9 |
@greyltc, can you gather 5 more traces with GODEBUG=gccheckmark=1 ? |
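GODEBUG is read from the environment by the Go runtime, and gccheckmark=1 asks the collector to verify its concurrent mark phase with a second, stop-the-world mark pass. A minimal sketch of launching a command with that setting added follows. `withGODEBUG` is a hypothetical helper, and the command to run is whatever is passed on the command line.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withGODEBUG returns a copy of env with setting added to GODEBUG.
// An existing GODEBUG value is kept and extended with a comma.
func withGODEBUG(env []string, setting string) []string {
	out := make([]string, 0, len(env)+1)
	found := false
	for _, kv := range env {
		if strings.HasPrefix(kv, "GODEBUG=") {
			found = true
			if kv == "GODEBUG=" {
				kv += setting
			} else {
				kv += "," + setting
			}
		}
		out = append(out, kv)
	}
	if !found {
		out = append(out, "GODEBUG="+setting)
	}
	return out
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: checkmark <command> [args...]")
		return
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withGODEBUG(os.Environ(), "gccheckmark=1")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "command failed:", err)
		os.Exit(1)
	}
}
```

The setting only affects processes that are Go programs, such as the go command itself or a Go binary under test.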
@randall77, any ETA on fixing #19078? It would be nice to eliminate that from future debugging considerations. |
@rsc, here are those five additional traces, this time with https://gist.github.com/greyltc/8cc55ce7c1d9fa00b065f0bfbbf77b29 |
Thanks. The unscanned pointers are all over the place, not just in the same data structure at the same position each time. That's useful data, although still not conclusive. |
@rsc: nothing ready yet. I may have a fix by this weekend. |
For completeness, I analyzed @greyltc's five new traces: fail_10.txt: Clearly still |
@aclements, fail_9 looks like an array of (slice+interface) pairs, perhaps length 3, or perhaps 4 with a final zeroed pair. |
Would it be helpful for me to gather any more failure logs? |
I've been getting this crash after upgrading to 1.7.4 (won't be able to use 1.7.5), with cgo and leveldb possibly playing into it. I'm not sure what actions trigger it, but it's happening routinely in our testing environment. |
@greyltc, unfortunately more logs probably wouldn't help at this point. The logs you sent don't fit into any of the patterns we'd expect for this failure, so we're a bit stumped. Do you (or @ncw) know if other rclone users are having similar problems? I'm wondering if it may be an environment issue in your case. @adamflott, I assume your code isn't public? Can you try reproducing it a few times with |
@aclements I haven't had any other reports of that bug and I haven't seen it myself in testing :-( |
@ncw try these steps to repeat it:
|
@aclements code is not public. With those env vars I get an instant crash on startup, although not always. However I've been running a build with 1.6.3 and I have not seen a crash over a few days.
|
@ncw, @adamflott, same question for both of you: are you still able to reproduce this on Go 1.8 or current master (soon to be 1.9)? I'm particularly wondering if commit c1730ae (released in Go 1.8) fixed this. |
@aclements I never managed to reproduce this and I haven't had any other reports of it. |
Rereading this issue, it seems to combine several different problems, some clearly fixed, some possibly fixed, none clearly still relevant. It's gotten hard to pull out the specific information for a specific problem. Therefore, I'm going to close this issue as at least partially fixed. If you can still produce a problem when using Go 1.9rc1 or later, please open a new issue that references this one and provides the details for your problem. Thanks. |
It looks like bug #21717 |
"Sweep increased allocation count" means "something went wrong somewhere in garbage collector or maybe the runtime or maybe unsafe code", so the similar panic is not a good indicator that it's the same bug. :) |
What version of Go are you using (go version)?
go version go1.7.5 linux/amd64
What operating system and processor architecture are you using (go env)?
What did you do?
What did you expect to see?
Anything except runtime errors.
What did you see instead?
I am currently using a customized (ck/muqss+bfq) kernel but I don't think this is the issue because it is also crashing with a freshly compiled vanilla linux kernel (v4.9.9).