"fatal error: casgstatus: bad incoming values" in runtime package #49513

Closed
skyfireitdiy opened this issue Nov 11, 2021 · 5 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@skyfireitdiy

What version of Go are you using (go version)?

$ go version
go version go1.14.4 linux/amd64

Does this issue reproduce with the latest release?

I don't know. This problem has only appeared twice in our program this year.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/home/wangmaobin/go/bin"
GOCACHE="/home/wangmaobin/.cache/go-build"
GOENV="/home/wangmaobin/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/wangmaobin/go"
GOPRIVATE=""
GOPROXY="https://goproxy.cn,direct"
GOROOT="/home/wangmaobin/tools/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/wangmaobin/tools/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build367083870=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Nothing

What did you expect to see?

Stable operation of software

What did you see instead?

There was a fatal error: casgstatus: bad incoming values.

runtime: casgstatus: oldval=0x1 newval=0x2
fatal error: casgstatus: bad incoming values

runtime stack:
runtime.throw(0x107a53a, 0x1f)
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/panic.go:1116 +0x72
runtime.casgstatus.func1()
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/proc.go:776 +0xa7
runtime.casgstatus(0xc000302a80, 0x200000001)
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/proc.go:774 +0x4a
runtime.execute(0xc000302a80, 0xc000302a00)
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/proc.go:2053 +0x7d
runtime.schedule()
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/proc.go:2567 +0x1a5
runtime.park_m(0xc0003c6300)
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/proc.go:2696 +0x78
runtime.mcall(0x7f4dd49efd90)
        /home/itran_ci_master/workspace/compile_code/05/workspace/rum/go_src/base/go/src/runtime/asm_amd64.s:318 +0x5b

I reviewed the code and, based on the printed oldval and newval, concluded that it should not be possible to enter this branch.
Is there an effective way to debug this kind of problem?
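
One detail in the traceback is consistent with the printed values: the second argument shown for runtime.casgstatus, 0x200000001, appears to be the two uint32 parameters packed into a single 64-bit stack word. The sketch below assumes the traceback prints raw argument words on little-endian amd64, as Go releases of that era did; splitWord is a hypothetical helper for illustration, not a runtime function.

```go
package main

import "fmt"

// splitWord splits a 64-bit argument word from a traceback into the two
// uint32 values it holds, assuming little-endian layout: the first
// parameter sits in the low half, the second in the high half.
// Hypothetical helper, not part of the Go runtime.
func splitWord(w uint64) (first, second uint32) {
	return uint32(w), uint32(w >> 32)
}

func main() {
	// 0x200000001 is the second word printed for runtime.casgstatus above.
	oldval, newval := splitWord(0x200000001)
	fmt.Printf("oldval=%#x newval=%#x\n", oldval, newval)
}
```

Under that assumption the word decodes to oldval=0x1 and newval=0x2, the same values the runtime printed before throwing.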

@ianlancetaylor
Member

Note that Go 1.14 is no longer supported.

I agree that this output is simply impossible. And nobody else is reporting this. That suggests a hardware problem, such as bad memory. Has this happened on more than one machine?

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Nov 11, 2021
@skyfireitdiy
Author

I don't think it's a hardware problem: the same problem appears on two different machines and in different applications.
I reviewed the Go source code:

func casgstatus(gp *g, oldval, newval uint32) {
	if (oldval&_Gscan != 0) || (newval&_Gscan != 0) || oldval == newval {
		systemstack(func() {
			print("runtime: casgstatus: oldval=", hex(oldval), " newval=", hex(newval), "\n")
			throw("casgstatus: bad incoming values")
		})
	}
	// ...
}

oldval and newval are local variables here, so this should not be caused by a data race.
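
To make the argument concrete, the guard quoted above can be evaluated outside the runtime. This is a minimal sketch: the constant values are copied from runtime/runtime2.go as I understand them (_Grunnable = 1, _Grunning = 2, _Gscan = 0x1000) and should be checked against the release in use, and badIncoming is a hypothetical name for the condition.

```go
package main

import "fmt"

// Status values mirroring runtime/runtime2.go. Treat them as assumptions
// and verify them against the Go release being debugged.
const (
	gRunnable = 1      // _Grunnable
	gRunning  = 2      // _Grunning
	gScan     = 0x1000 // _Gscan
)

// badIncoming mirrors the condition at the top of casgstatus.
// Hypothetical helper, not part of the Go runtime.
func badIncoming(oldval, newval uint32) bool {
	return (oldval&gScan != 0) || (newval&gScan != 0) || oldval == newval
}

func main() {
	fmt.Println(badIncoming(gRunnable, gRunning))       // the reported values
	fmt.Println(badIncoming(gRunnable|gScan, gRunning)) // scan bit set on oldval
	fmt.Println(badIncoming(gRunning, gRunning))        // oldval == newval
}
```

For the reported pair (0x1, 0x2) the condition is false, so the throw should be unreachable with those values; only a set scan bit or equal values make it true.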

I have just consulted other colleagues and they have encountered similar problems.

Upgrading the toolchain would have too much impact, and we would need to reassess the risk. So I came to GitHub to ask whether there are any relevant debugging techniques.

@bcmills
Contributor

bcmills commented Nov 11, 2021

This issue occurred intermittently on the linux-arm-scaleway builders for a little while, and on the plan9-arm builders before that (but appears to have been fixed on plan9-arm, or at least no longer has this failure mode).

If I understand correctly, those builders lacked VDSO support (#33574). If this problem still reproduces on a supported Go version, that might be a direction to investigate.

greplogs --dashboard -md -l -e 'casgstatus: bad incoming values'

2021-04-01T15:50:43-45ca9ef/linux-arm-scaleway
2021-04-01T01:26:29-5f646f0/linux-arm-scaleway
2021-04-01T00:51:26-ec721d9/linux-arm-scaleway
2021-04-01T00:51:24-1f29e69/linux-arm-scaleway
2021-04-01T00:51:23-3304b22/linux-arm-scaleway
2021-03-31T20:21:57-5d6581d/linux-arm-scaleway
2020-12-15T16:30:24-5046cb8/plan9-arm
2020-11-19T19:30:38-e73697b/plan9-arm
2020-05-05T18:32:35-8627b4c/plan9-arm
2020-05-05T18:05:10-a8e83d5/plan9-arm
2020-05-05T15:41:37-b4ecafc/plan9-arm
2020-05-05T05:13:26-9b18968/plan9-arm
2020-05-05T00:36:44-c9d5f60/plan9-arm
2020-05-04T17:40:00-a1ffbe9/plan9-arm

@bcmills bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Nov 11, 2021
@ianlancetaylor
Member

Actually, none of those builders are showing the same problem. They happen to have the same string, but for entirely different reasons.

Which is good, because the problem simply can't happen.

I'm sorry, I don't know how to debug this. My only suggestion would be to look closely at the generated code to see how this could happen, since as we both agree the Go code can't produce this result. Is it possible that you have a corrupt compiler that is somehow generating bad code for this case?
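
One possible way to look at the generated code is to disassemble runtime.casgstatus from the exact binary that crashed, using go tool objdump with its -s symbol filter. The sketch below only wraps that command; the program itself and the objdumpArgs helper are hypothetical, and it assumes a go command on PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// objdumpArgs builds the arguments for `go tool objdump`, limiting the
// disassembly to symbols that match symRegexp (the -s flag).
// Hypothetical helper for illustration.
func objdumpArgs(binary, symRegexp string) []string {
	return []string{"tool", "objdump", "-s", symRegexp, binary}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: pass the path to the crashing binary")
		return
	}
	// The pattern also matches the closure runtime.casgstatus.func1.
	cmd := exec.Command("go", objdumpArgs(os.Args[1], `^runtime\.casgstatus`)...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "objdump failed:", err)
		os.Exit(1)
	}
}
```

Comparing that disassembly with the output for the same source built by a known-good copy of the toolchain might show whether the comparison against _Gscan or the oldval == newval test was miscompiled.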

@gopherbot
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang golang locked and limited conversation to collaborators Dec 11, 2022