Skip to content

jsonpb: race marshaling logic #838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
srcLurker opened this issue Apr 18, 2019 · 15 comments · Fixed by #913
Closed

jsonpb: race marshaling logic #838

srcLurker opened this issue Apr 18, 2019 · 15 comments · Fixed by #913
Labels

Comments

@srcLurker
Copy link

Our team is using the grpc library which in turn is using this library. Occasionally we are seeing:

fatal error: concurrent map read and map write

with a stack trace (included the stack trace below). Curious if others have seen something like this.

The version we are using is from Feb 5th (c823c79ea1570fb5ff454033735a8e68575d1d0f). Looking over the commit list, there doesn't seem to be any checkin since then that may be related, but we could have missed something.

-- charles

fatal error: concurrent map read and map write

goroutine 38 [running]:
runtime.throw(0xa0d911, 0x21)
    /usr/local/go/src/runtime/panic.go:616 +0x81 fp=0xc4215eb2a0 sp=0xc4215eb280 pc=0x42d261
runtime.mapaccess2(0x92c580, 0xc420160900, 0xc4309404a0, 0xc4309404a0, 0x10)
    /usr/local/go/src/runtime/hashmap.go:409 +0x225 fp=0xc4215eb2e8 sp=0xc4215eb2a0 pc=0x40b055
reflect.mapaccess(0x92c580, 0xc420160900, 0xc4309404a0, 0xc420160900)
    /usr/local/go/src/runtime/hashmap.go:1195 +0x3f fp=0xc4215eb320 sp=0xc4215eb2e8 pc=0x40d30f
reflect.Value.MapIndex(0x92c580, 0xc429dbc1f0, 0x195, 0x8f9820, 0xc4309404a0, 0x98, 0x8f9a60, 0xc42988a070, 0x8a)
    /usr/local/go/src/reflect/value.go:1081 +0x126 fp=0xc4215eb3a8 sp=0xc4215eb320 pc=0x4aedb6
github.com/golang/protobuf/proto.makeMapMarshaler.func2(0xc429dbc1f0, 0x2, 0x0)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:2321 +0x22a fp=0xc4215eb488 sp=0xc4215eb3a8 pc=0x6c02ca
github.com/golang/protobuf/proto.(*marshalInfo).size(0xc420020d20, 0xc429dbc100, 0xc420095380)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:183 +0xaa fp=0xc4215eb4f0 sp=0xc4215eb488 pc=0x697f0a
github.com/golang/protobuf/proto.makeMessageMarshaler.func1(0xc420318018, 0x1, 0xc4215eb558)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:2223 +0x44 fp=0xc4215eb518 sp=0xc4215eb4f0 pc=0x6bf7c4
github.com/golang/protobuf/proto.makeOneOfMarshaler.func1(0xc4253225b0, 0x0, 0xe)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:2373 +0x143 fp=0xc4215eb578 sp=0xc4215eb518 pc=0x6c0d63
github.com/golang/protobuf/proto.(*marshalInfo).size(0xc420020850, 0xc4253225a0, 0xc4202cf070)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:183 +0xaa fp=0xc4215eb5e0 sp=0xc4215eb578 pc=0x697f0a
github.com/golang/protobuf/proto.makeMessageSliceMarshaler.func1(0xc420098258, 0x1, 0x2)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:2248 +0x74 fp=0xc4215eb630 sp=0xc4215eb5e0 pc=0x6bfaf4
github.com/golang/protobuf/proto.(*marshalInfo).size(0xc420020540, 0xc420098230, 0xc424a82180)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:183 +0xaa fp=0xc4215eb698 sp=0xc4215eb630 pc=0x697f0a
github.com/golang/protobuf/proto.makeMessageMarshaler.func1(0xc420318028, 0x1, 0x0)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:2223 +0x44 fp=0xc4215eb6c0 sp=0xc4215eb698 pc=0x6bf7c4
github.com/golang/protobuf/proto.(*marshalInfo).size(0xc4200203f0, 0xc420318028, 0xc424a82180)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:183 +0xaa fp=0xc4215eb728 sp=0xc4215eb6c0 pc=0x697f0a
github.com/golang/protobuf/proto.(*InternalMessageInfo).Size(0xc424a82180, 0xa6dca0, 0xc420318028, 0x0)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:125 +0x65 fp=0xc4215eb758 sp=0xc4215eb728 pc=0x697b15
github.com/golang/protobuf/proto.(*Buffer).Marshal(0xc4253227b8, 0xa6dca0, 0xc420318028, 0xc42988a050, 0x0)
    /tmpfs/tmp/root/src/github.com/golang/protobuf/proto/table_marshal.go:2757 +0x314 fp=0xc4215eb828 sp=0xc4215eb758 pc=0x6aaf14
google.golang.org/grpc/encoding/proto.marshal(0x962fa0, 0xc420318028, 0xc4253227b0, 0x0, 0x0, 0x959400, 0xc4215eb910, 0x7a595e)
    /tmpfs/tmp/root/src/google.golang.org/grpc/encoding/proto/proto.go:59 +0xda fp=0xc4215eb878 sp=0xc4215eb828 pc=0x71bfaa
google.golang.org/grpc/encoding/proto.codec.Marshal(0x962fa0, 0xc420318028, 0x18, 0x8effc0, 0x79ff0f, 0xc429f9c1dc, 0x8)
    /tmpfs/tmp/root/src/google.golang.org/grpc/encoding/proto/proto.go:74 +0xaf fp=0xc4215eb8f8 sp=0xc4215eb878 pc=0x71c0ff
google.golang.org/grpc/encoding/proto.(*codec).Marshal(0x14c1350, 0x962fa0, 0xc420318028, 0x7f6e1d340458, 0x0, 0xc4206881a8, 0xc4215eb988, 0xc4215ebb40)
    <autogenerated>:1 +0x48 fp=0xc4215eb940 sp=0xc4215eb8f8 pc=0x71c968
google.golang.org/grpc.encode(0x7f6e1d2570e0, 0x14c1350, 0x962fa0, 0xc420318028, 0x0, 0x0, 0xc420689980, 0x5, 0x0)
    /tmpfs/tmp/root/src/google.golang.org/grpc/rpc_util.go:543 +0x61 fp=0xc4215eb9c8 sp=0xc4215eb940 pc=0x79a4a1
google.golang.org/grpc.(*clientStream).SendMsg(0xc429f9c120, 0x962fa0, 0xc420318028, 0x0, 0x0)
    /tmpfs/tmp/root/src/google.golang.org/grpc/stream.go:671 +0x135 fp=0xc4215ebb50 sp=0xc4215eb9c8 pc=0x7a04d5
google.golang.org/grpc.invoke(0xa6e8e0, 0xc424872000, 0xa171ea, 0x31, 0x962fa0, 0xc420318028, 0x951c20, 0x14c1350, 0xc4272ce300, 0xc430940020, ...)
    /tmpfs/tmp/root/src/google.golang.org/grpc/call.go:70 +0xe7 fp=0xc4215ebbd8 sp=0xc4215ebb50 pc=0x78eb57
google.golang.org/grpc.(*ClientConn).Invoke(0xc4272ce300, 0xa6e8e0, 0xc424872000, 0xa171ea, 0x31, 0x962fa0, 0xc420318028, 0x951c20, 0x14c1350, 0xc430940020, ...)
    /tmpfs/tmp/root/src/google.golang.org/grpc/call.go:37 +0x1b3 fp=0xc4215ebc90 sp=0xc4215ebbd8 pc=0x78e7d3
google.golang.org/grpc.Invoke(0xa6e8e0, 0xc424872000, 0xa171ea, 0x31, 0x962fa0, 0xc420318028, 0x951c20, 0x14c1350, 0xc4272ce300, 0xc430940020, ...)
    /tmpfs/tmp/root/src/google.golang.org/grpc/call.go:60 +0xc1 fp=0xc4215ebd10 sp=0xc4215ebc90 pc=0x78ea41
... plus code that is calling the grpc method ...

@neild
Copy link
Contributor

neild commented Apr 18, 2019

I can almost guarantee that the problem is that you're concurrently modifying the request message.

@dsnet dsnet changed the title Race condition in proto/table_marshal.go? proto: race condition in proto/table_marshal.go Apr 20, 2019
@timestee
Copy link

timestee commented May 14, 2019

Similar problem

type Session struct {
}

func (session *Session) Send(msg interface{}) error {
	...
	session.sendChan <- msg
	...
}

func (session *Session) sendLoop() {
	for {
		select {
		case msg, ok := <-session.sendChan:
			// call proto.Marshal(msg) then send the raw data
		case <-session.closeChan:
			return
		}
	}
}
... 
go session.sendLoop()
... 
// ProtoMessageDemo is a proto message.
type ProtoMessageDemo struct {
	Data map[int32]int32
}
pmd := &ProtoMessageDemo{}
session.Send(pmd)

But ProtoMessageDemo changed by user's main goroutine after Session.Send called. Then crash.
Should keep the message marshal in the user's main goroutine too.

Hope this helps you. @neild was right. You must concurrently modify the request message.

@dsnet
Copy link
Member

dsnet commented May 14, 2019

@timestee, I'm not sure how that code snippet helps us see an issue as a protobuf message (either as a proto.Message or a generated message type) is not referenced.

@timestee
Copy link

timestee commented May 14, 2019

@dsnet in my code snippet, ProtoMessageDemo is a proto.Message.

@dsnet
Copy link
Member

dsnet commented May 14, 2019

Sorry, I missed that. Your code in-and-of-itself looks fine. What makes you think there is a race?

@timestee
Copy link

timestee commented May 14, 2019

session.Send(pmd) will push the message 'pmd' to Session's send chan, the message will be marshaled(the Data field will be access here, just read) in the Session's send goroutine, but it would be modified in user's logic goroutine. One read, the other write. @dsnet Are you with me now?

@puellanivis
Copy link
Collaborator

The race condition @timestee presents is an expected and appropriately-caught race condition in the user’s code. Passing pointers (and pointer-semantics values) over a channel is not sharing by communicating, because both receiver and sender end up with a pointer to the same data.

Modifying the same data at the same time as you are Marshaling is appropriately a simultaneous read/write violation.

@timestee
Copy link

timestee commented May 14, 2019

@puellanivis yes, i feel @srcLurker might have the same issue. Concurrently modifying the request message somewhere.

@dsnet
Copy link
Member

dsnet commented Jun 14, 2019

Closing as obsolete. This seems to be a race in user code. Can re-open if that is not the case.

@dsnet dsnet closed this as completed Jun 14, 2019
@riannucci
Copy link

I believe this is actually a real issue. Here's a repro: https://gist.github.com/riannucci/8021fcddc6ba055c2f124bb92ef51b20

Important bits:

  1. The message must have a subfield (simple fields don't trigger it)
  2. It seems to be a race (at least in my case) between jsonpb and the normal marshaler

@dsnet
Copy link
Member

dsnet commented Jul 26, 2019

@riannucci, thank you for your report. I am able to reproduce. However, it is a race that is unlikely to the be the same as originally reported. Probably would have been better to be a newly filed issue. Comments to closed issues have a high probability of being forgotten.

@dsnet dsnet reopened this Jul 26, 2019
@dsnet dsnet changed the title proto: race condition in proto/table_marshal.go jsonpb: race marshaling logic Jul 26, 2019
@riannucci
Copy link

Ah, sorry, I was just guessing that it would have the same underlying race. Should have filed a new bug and mentioned this one instead. Thanks for taking a look :)

dsnet added a commit that referenced this issue Jul 27, 2019
The reflect.Value.Interface method shallow copies the underlying value,
which may copy mutexes and atomically-accessed fields.
Some usages of the Interface method is only to check if the interface value
implements an interface. In which case the shallow copy was unnecessary.
Change those usages to use the reflect.Value.Implements method instead.

Fixes #838
dsnet added a commit that referenced this issue Jul 27, 2019
The reflect.Value.Interface method shallow copies the underlying value,
which may copy mutexes and atomically-accessed fields.
Some usages of the Interface method is only to check if the interface value
implements an interface. In which case the shallow copy was unnecessary.
Change those usages to use the reflect.Value.Implements method instead.

Fixes #838
@riannucci
Copy link

(Thanks for the fast fix; I've verified that this fixes the issue in the original codebase where we detected this)

@dsnet dsnet closed this as completed in #913 Aug 5, 2019
dsnet added a commit that referenced this issue Aug 5, 2019
The reflect.Value.Interface method shallow copies the underlying value,
which may copy mutexes and atomically-accessed fields.
Some usages of the Interface method is only to check if the interface value
implements an interface. In which case the shallow copy was unnecessary.
Change those usages to use the reflect.Value.Implements method instead.

Fixes #838
@awinlei
Copy link

awinlei commented Apr 28, 2020

I also have this bug in table_marshal.go

m := ptr.asPointerTo(t).Elem() // the map n := 0 for _, k := range m.MapKeys() { ki := k.Interface() vi := m.MapIndex(k).Interface() kaddr := toAddrPointer(&ki, false, false) // pointer to key vaddr := toAddrPointer(&vi, valIsPtr, false) // pointer to value siz := keySizer(kaddr, 1) + valSizer(vaddr, 1) // tag of key = 1 (size=1), tag of val = 2 (size=1) n += siz + SizeVarint(uint64(siz)) + tagsize } return n

the log:

fatal error: concurrent map iteration and map write

goroutine 43 [running]:
runtime.throw(0x186c838, 0x26)
/usr/local/go/src/runtime/panic.go:1112 +0x72 fp=0xc00035b838 sp=0xc00035b808 pc=0x435d72
runtime.mapiternext(0xc006d303c0)
/usr/local/go/src/runtime/map.go:853 +0x552 fp=0xc00035b8b8 sp=0xc00035b838 pc=0x411052
reflect.mapiternext(0xc006d303c0)
/usr/local/go/src/runtime/map.go:1337 +0x2b fp=0xc00035b8d0 sp=0xc00035b8b8 pc=0x411fcb
reflect.Value.MapKeys(0x14ef500, 0xc005ccef88, 0x195, 0x14ef500, 0xc005ccef88, 0x195)
/usr/local/go/src/reflect/value.go:1204 +0x10a fp=0xc00035b960 sp=0xc00035b8d0 pc=0x497efa
github.com/golang/protobuf/proto.makeMapMarshaler.func2(0xc005ccef88, 0x1, 0x2)
/data/workspace/publish/server/code/go/src/github.com/golang/protobuf/proto/table_marshal.go:2319 +0xce fp=0xc00035ba40 sp=0xc00035b960 pc=0x556e1e
github.com/golang/protobuf/proto.(*marshalInfo).size(0xc00617ebd0, 0xc005ccef80, 0x2788040)
/data/workspace/publish/server/code/go/src/github.com/golang/protobuf/proto/table_marshal.go:183 +0x9d fp=0xc00035baa8 sp=0xc00035ba40 pc=0x52e97d
github.com/golang/protobuf/proto.(*InternalMessageInfo).Size(0x2788040, 0x1b43800, 0xc005ccef80, 0x7f186e7ab0a0)
/data/workspace/publish/server/code/go/src/github.com/golang/protobuf/proto/table_marshal.go:125 +0x66 fp=0xc00035bad8 sp=0xc00035baa8 pc=0x52e536
proto/out/cl.(*L2C_TanlentAddInfo).XXX_Size(0xc005ccef80, 0x1b43800)
/data/workspace/publish/server/code/go/src/proto/out/cl/msg_life_supplies.pb.go:473 +0x43 fp=0xc00035bb08 sp=0xc00035bad8 pc=0xb71a43
github.com/golang/protobuf/proto.(*Buffer).Marshal(0xc00035bc08, 0x1b43800, 0xc005ccef80, 0x0, 0xc00035bc30)
/data/workspace/publish/server/code/go/src/github.com/golang/protobuf/proto/table_marshal.go:2740 +0x85 fp=0xc00035bbe0 sp=0xc00035bb08 pc=0x5417d5

@neild
Copy link
Contributor

neild commented Apr 28, 2020

@awinlei This is almost certainly a race condition in your code--you are probably modifying the contents of a message at the same time as marshaling it.

In the event that this is not the case, the stack trace indicates that you're using an older version of the proto package. The marshal implementation has been rewritten since that version. The first step would be to try upgrading to a newer version of the package.

@golang golang locked as resolved and limited conversation to collaborators Jun 25, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants