Skip to content

net/http/pprof: coroutines + pprof makes the program panic #69998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
GiedriusS opened this issue Oct 23, 2024 · 9 comments
Closed

net/http/pprof: coroutines + pprof makes the program panic #69998

GiedriusS opened this issue Oct 23, 2024 · 9 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Critical A critical problem that affects the availability or correctness of production systems built using Go NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@GiedriusS
Copy link

GiedriusS commented Oct 23, 2024

Go version

go version devel go1.24-d0631b90a3 Wed Oct 23 04:48:55 2024 +0000 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/giedriusstatkevicius/.cache/go-build'
GOENV='/home/giedriusstatkevicius/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/giedriusstatkevicius/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/giedriusstatkevicius/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/giedriusstatkevicius/dev/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/giedriusstatkevicius/dev/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='devel go1.24-d0631b90a3 Wed Oct 23 04:48:55 2024 +0000'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/giedriusstatkevicius/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/giedriusstatkevicius/dev/coroutine-panic-pprof/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3486958581=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Run this program:

package main

import (
	"iter"
	"net/http"
	_ "net/http/pprof"
	"time"
)

func infiniteLoopCoroutine() {
	for {
		p, stop := iter.Pull2[int, int](
			func(yield func(int, int) bool) {
				for i := 0; i < 10000; i++ {
					if !yield(i, i) {
						return
					}
				}
			},
		)

		defer stop()
		for {
			_, _, ok := p()
			if !ok {
				break
			}
		}
	}

}

func main() {
	go func() {
		http.ListenAndServe("localhost:8080", nil)
	}()

	for i := 0; i < 1; i++ {
		go infiniteLoopCoroutine()
	}

	go func() {
		for {
			resp, err := http.Get("http://localhost:8080/debug/pprof/goroutine?debug=1")
			if err != nil {
				continue
			}
			resp.Body.Close()
		}
	}()

	time.Sleep(8888 * time.Second)

}

What did you see happen?

I either see:

SIGSEGV: segmentation violation
PC=0x4602d1 m=0 sigcode=1 addr=0x18

goroutine 0 gp=0x9fad00 m=0 mp=0x9fbb20 [idle]:
runtime.(*unwinder).resolveInternal(0x7ffc64224f00, 0x1?, 0xab?)
	/home/giedriusstatkevicius/dev/go/src/runtime/traceback.go:378 +0x1d1 fp=0x7ffc64224de0 sp=0x7ffc64224d78 pc=0x4602d1
runtime.(*unwinder).initAt(0x7ffc64224f00, 0x40deb0?, 0xc000380800?, 0x80?, 0xc0000c4380, 0x2)
	/home/giedriusstatkevicius/dev/go/src/runtime/traceback.go:224 +0x1db fp=0x7ffc64224ed0 sp=0x7ffc64224de0 pc=0x45ff7b
runtime.saveg(0x7ffc64224fc8?, 0x440725?, 0x0?, 0xc0001fe240, {0xc000380800?, 0x40000000066d6b6?, 0xc0000c5340?})
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1686 +0xd2 fp=0x7ffc64224f90 sp=0x7ffc64224ed0 pc=0x433c72
runtime.doRecordGoroutineProfile.func1()
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1570 +0x4c fp=0x7ffc64224fd8 sp=0x7ffc64224f90 pc=0x433b8c
runtime.systemstack(0x47de5f)
	/home/giedriusstatkevicius/dev/go/src/runtime/asm_amd64.s:514 +0x4a fp=0x7ffc64224fe8 sp=0x7ffc64224fd8 pc=0x4796ea

Or:

doRecordGoroutineProfile gp1=8
fatal error: cannot read stack of running goroutine

goroutine 50 gp=0xc0002381c0 m=7 mp=0xc0000a2808 [running]:
runtime.throw({0x76ce27?, 0x40f17e?})
	/home/giedriusstatkevicius/dev/go/src/runtime/panic.go:1074 +0x48 fp=0xc0000f5478 sp=0xc0000f5448 pc=0x4746e8
runtime.doRecordGoroutineProfile(0xc0000c41c0?, {0xc0002b4000?, 0x7fb966e20aa8?, 0x7fb9ae3383e8?})
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1550 +0x14c fp=0xc0000f54d0 sp=0xc0000f5478 pc=0x433b0c
runtime.tryRecordGoroutineProfile(0xc0000c4380, {0xc0002b4000, 0x80, 0x80}, 0x791eb8)
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1533 +0xc5 fp=0xc0000f5508 sp=0xc0000f54d0 pc=0x433905
runtime.goroutineProfileWithLabelsConcurrent.func2(0xd10121c0806?)
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1448 +0x26 fp=0xc0000f5540 sp=0xc0000f5508 pc=0x433766
runtime.forEachGRace(0xc0000f55d8)
	/home/giedriusstatkevicius/dev/go/src/runtime/proc.go:720 +0x49 fp=0xc0000f5570 sp=0xc0000f5540 pc=0x43eb69
runtime.goroutineProfileWithLabelsConcurrent({0xc0002b0000, 0x13, 0x13}, {0xc0002b2000, 0x13, 0x13})
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1447 +0x40b fp=0xc0000f5658 sp=0xc0000f5570 pc=0x4335eb
runtime.goroutineProfileWithLabels({0xc0002b0000?, 0xc0000f56c8?, 0x4765e9?}, {0xc0002b2000?, 0x6ee2e0?, 0xc0000f5601?})
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1323 +0x31 fp=0xc0000f5698 sp=0xc0000f5658 pc=0x433191
runtime.pprof_goroutineProfileWithLabels({0xc0002b0000?, 0x75fbde?, 0xc?}, {0xc0002b2000?, 0x6658f0?, 0xc000286030?})
	/home/giedriusstatkevicius/dev/go/src/runtime/mprof.go:1314 +0x1d fp=0xc0000f56d8 sp=0xc0000f5698 pc=0x47371d
runtime/pprof.writeRuntimeProfile({0x7ed740, 0xc00029e000}, 0x1, {0x75e54a, 0x9}, 0x7927d0)
	/home/giedriusstatkevicius/dev/go/src/runtime/pprof/pprof.go:793 +0xb1 fp=0xc0000f5740 sp=0xc0000f56d8 pc=0x6b0f51
runtime/pprof.writeGoroutine({0x7ed740?, 0xc00029e000?}, 0x30?)
	/home/giedriusstatkevicius/dev/go/src/runtime/pprof/pprof.go:752 +0x45 fp=0xc0000f5780 sp=0xc0000f5740 pc=0x6b0d85
runtime/pprof.(*Profile).WriteTo(0x9ee6d0?, {0x7ed740?, 0xc00029e000?}, 0xc?)
	/home/giedriusstatkevicius/dev/go/src/runtime/pprof/pprof.go:374 +0x14b fp=0xc0000f5880 sp=0xc0000f5780 pc=0x6adbab
net/http/pprof.handler.ServeHTTP({0xc000282011, 0x9}, {0x7efcf8, 0xc00029e000}, 0xc00024c000)
	/home/giedriusstatkevicius/dev/go/src/net/http/pprof/pprof.go:272 +0x52f fp=0xc0000f5938 sp=0xc0000f5880 pc=0x6c7b6f
net/http/pprof.Index({0x7efcf8, 0xc00029e000}, 0xc00024c000?)
	/home/giedriusstatkevicius/dev/go/src/net/http/pprof/pprof.go:388 +0xde fp=0xc0000f5af0 sp=0xc0000f5938 pc=0x6c861e
net/http.HandlerFunc.ServeHTTP(0x9fab20?, {0x7efcf8?, 0xc00029e000?}, 0x66d6b6?)
	/home/giedriusstatkevicius/dev/go/src/net/http/server.go:2220 +0x29 fp=0xc0000f5b18 sp=0xc0000f5af0 pc=0x674109
net/http.(*ServeMux).ServeHTTP(0x46ef59?, {0x7efcf8, 0xc00029e000}, 0xc00024c000)
	/home/giedriusstatkevicius/dev/go/src/net/http/server.go:2746 +0x1c4 fp=0xc0000f5b68 sp=0xc0000f5b18 pc=0x675f84
net/http.serverHandler.ServeHTTP({0xc0002022d0?}, {0x7efcf8?, 0xc00029e000?}, 0x6?)
	/home/giedriusstatkevicius/dev/go/src/net/http/server.go:3214 +0x8e fp=0xc0000f5b98 sp=0xc0000f5b68 pc=0x690ace
net/http.(*conn).serve(0xc00023c000, {0x7f0278, 0xc0002021e0})
	/home/giedriusstatkevicius/dev/go/src/net/http/server.go:2092 +0x5a5 fp=0xc0000f5fb8 sp=0xc0000f5b98 pc=0x672d45
net/http.(*Server).Serve.gowrap3()
	/home/giedriusstatkevicius/dev/go/src/net/http/server.go:3364 +0x28 fp=0xc0000f5fe0 sp=0xc0000f5fb8 pc=0x677628
runtime.goexit({})
	/home/giedriusstatkevicius/dev/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000f5fe8 sp=0xc0000f5fe0 pc=0x47b6a1
created by net/http.(*Server).Serve in goroutine 7
	/home/giedriusstatkevicius/dev/go/src/net/http/server.go:3364 +0x485

What did you expect to see?

No panic.

Removing the coroutines immediately fixes the issue.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Oct 23, 2024
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 23, 2024
@mknyszek mknyszek added this to the Go1.24 milestone Oct 23, 2024
@mknyszek
Copy link
Contributor

This seems like a real bug. iter.Pull does funky things with the goroutine state machine and I could see this interacting poorly with goroutine profiles. I'll take a look.

@mknyszek mknyszek self-assigned this Oct 23, 2024
@mknyszek mknyszek added the Critical A critical problem that affects the availability or correctness of production systems built using Go label Oct 23, 2024
@mknyszek
Copy link
Contributor

Ah, yikes. The goroutine profile expects every time a goroutine is about to start running that it checks to see if it needs to be profiled. We do not do this on the coroutine resume path. It's an easy fix, I'll send a patch.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Oct 23, 2024
@dmitshur dmitshur moved this from Todo to In Progress in Go Compiler / Runtime Oct 23, 2024
@mknyszek
Copy link
Contributor

@gopherbot Please open a backport issue for Go 1.23.

This causes crashes with no workaround when trying to take a goroutine profile in a program that uses iter.Pull. This problem only applies to that release because it pertains to the new iter package.

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #70001 (for 1.23).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/622016 mentions this issue: runtime: uphold goroutine profile invariants in coroswitch

@mknyszek
Copy link
Contributor

Oh no. This is actually quite a bit more annoying because the coroswitch context isn't really safe for all the goroutine profile machinery. Either we have to make it safe, or we spin until somebody else takes the profile for us (very unfortunate). I'll try and see if I can do the former.

@mknyszek
Copy link
Contributor

Just kidding, the problem was actually relative minor. The fix seems OK now.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Go Compiler / Runtime Oct 24, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/622375 mentions this issue: [release-branch.go1.23] runtime: uphold goroutine profile invariants in coroswitch

gopherbot pushed a commit that referenced this issue Oct 30, 2024
…in coroswitch

Goroutine profiles require checking in with the profiler before any
goroutine starts running. coroswitch is a place where a goroutine may
start running, but where we do not check in with the profiler, which
leads to crashes. Fix this by checking in with the profiler the same way
execute does.

For #69998.
Fixes #70001.

Change-Id: Idef6dd31b70a73dd1c967b56c307c7a46a26ba73
Reviewed-on: https://go-review.googlesource.com/c/go/+/622016
Reviewed-by: David Chase <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
(cherry picked from commit 2a98a18)
Reviewed-on: https://go-review.googlesource.com/c/go/+/622375
Reviewed-by: Michael Pratt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Critical A critical problem that affects the availability or correctness of production systems built using Go NeedsFix The path to resolution is known, but the work has not been done.
Projects
Development

No branches or pull requests

5 participants