Add Dynamic Config for safe keepalive rollout #7809
dnr
left a comment
Do we really need this? Can't we roll out with the static config changes?
On the server side, the dynamic config isn't dynamic at all: it requires a process restart. On the client side, it would only apply to new connections.
common/resource/fx.go
Outdated
"go.temporal.io/server/common/searchattribute"
"go.temporal.io/server/common/telemetry"
"go.temporal.io/server/common/testing/testhooks"
"go.temporal.io/server/service/history/configs"
This is in common, shared by all services. It shouldn't have a dependency on history service.
common/dynamicconfig/constants.go
Outdated
EnableInterNodeServerKeepAlive = NewGlobalBoolSetting(
	"system.enableInterNodeServerKeepAlive",
	false,
	`enableInterNodeServerKeepAlive is the config to enable keep alive for inter-node connections on server side.`,
-EnableInterNodeServerKeepAlive = NewGlobalBoolSetting(
-	"system.enableInterNodeServerKeepAlive",
-	false,
-	`enableInterNodeServerKeepAlive is the config to enable keep alive for inter-node connections on server side.`,
+EnableInternodeServerKeepalive = NewGlobalBoolSetting(
+	"system.enableInternodeServerKeepalive",
+	false,
+	`enableInternodeServerKeepalive is the config to enable keep alive for inter-node connections on server side.`,
common/dynamicconfig/constants.go
Outdated
EnableInterNodeClientKeepAlive = NewGlobalBoolSetting(
	"system.enableInterNodeClientKeepAlive",
	false,
	`enableInterNodeClientKeepAlive is the config to enable keep alive for inter-node connections on client side.`,
)
-EnableInterNodeClientKeepAlive = NewGlobalBoolSetting(
-	"system.enableInterNodeClientKeepAlive",
-	false,
-	`enableInterNodeClientKeepAlive is the config to enable keep alive for inter-node connections on client side.`,
-)
+EnableInternodeClientKeepalive = NewGlobalBoolSetting(
+	"system.enableInternodeClientKeepalive",
+	false,
+	`enableInternodeClientKeepalive is the config to enable keep alive for inter-node connections on client side.`,
+)
common/resource/fx.go
Outdated
resolver *membership.GRPCResolver,
tracingStatsHandler telemetry.ClientStatsHandler,
monitor membership.Monitor,
dynamicConfig *configs.Config,
Why do we need history service configs here? This RPC factory is for all services.
Removed.
common/rpc/rpc.go
Outdated
if !d.dynamicConfig.EnableInterNodeClientKeepAlive() {
	// default keepalive settings for clients
	return grpc.WithKeepaliveParams(keepalive.ClientParameters{
		Time:                time.Duration(math.MaxInt64),
		Timeout:             20 * time.Second,
		PermitWithoutStream: false,
	})
}
serviceConfig := d.config.Services[string(serviceName)]
return grpc.WithKeepaliveParams(serviceConfig.RPC.ClientConnectionConfig.GetKeepAliveClientParameters())
maybe write this reversed, like:
-if !d.dynamicConfig.EnableInterNodeClientKeepAlive() {
-	// default keepalive settings for clients
-	return grpc.WithKeepaliveParams(keepalive.ClientParameters{
-		Time:                time.Duration(math.MaxInt64),
-		Timeout:             20 * time.Second,
-		PermitWithoutStream: false,
-	})
-}
-serviceConfig := d.config.Services[string(serviceName)]
-return grpc.WithKeepaliveParams(serviceConfig.RPC.ClientConnectionConfig.GetKeepAliveClientParameters())
+// default keepalive settings for clients
+params := keepalive.ClientParameters{
+	Time:                time.Duration(math.MaxInt64),
+	Timeout:             20 * time.Second,
+	PermitWithoutStream: false,
+}
+if d.dynamicConfig.EnableInterNodeClientKeepAlive() {
+	serviceConfig := d.config.Services[string(serviceName)]
+	params = serviceConfig.RPC.ClientConnectionConfig.GetKeepAliveClientParameters()
+}
+return grpc.WithKeepaliveParams(params)
common/resource/fx.go
Outdated
	)
}

func ConfigProvider(
If this stays here (and it shouldn't), it needs a new name: "ConfigProvider" is too generic for something that provides history configs outside the context of the history service.
common/resource/fx.go
Outdated
)
factory.EnableInternodeServerKeepalive = enableServerKeepalive
factory.EnableInternodeClientKeepalive = enableClientKeepalive
logger.Info(fmt.Sprintf("RPC factory created. enableServerKeepalive: %v, enableClientKeepalive: %v", enableServerKeepalive, enableClientKeepalive))
do we really need this log line? I'd rather drop it unless you think it is very important for debugging (and then maybe debug level?)
This line is temporary; I will remove it once the rollout is complete. I want to confirm that the dynamic config change takes effect during the rollout. I think debug level makes sense.
factory.EnableInternodeServerKeepalive = enableServerKeepalive
factory.EnableInternodeClientKeepalive = enableClientKeepalive
I'm guessing you set these externally instead of passing them to NewFactory to avoid having to change all the tests? that's kind of messy but it's okay since it's just temporary.
yeah, and when removing, we don't have to deal with signature change again. I will add todo comments for that as well.
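The trade-off discussed in this thread can be sketched in isolation. The types and functions below (`rpcFactory`, `newFactory`, `provideFactory`) are hypothetical stand-ins, not the actual code in this PR. The sketch only shows why setting temporary flags on the struct after construction leaves the constructor signature, and therefore its existing callers, untouched.

```go
package main

import "fmt"

// rpcFactory is a hypothetical stand-in for the RPC factory under review.
type rpcFactory struct {
	serviceName string

	// TODO: remove both fields once keepalive is fully rolled out.
	EnableInternodeServerKeepalive bool
	EnableInternodeClientKeepalive bool
}

// newFactory keeps its signature unchanged, so existing callers and tests
// need no edits while the temporary flags exist.
func newFactory(serviceName string) *rpcFactory {
	return &rpcFactory{serviceName: serviceName}
}

// provideFactory is the single call site that knows about the rollout flags.
func provideFactory(serviceName string, enableServer, enableClient bool) *rpcFactory {
	f := newFactory(serviceName)
	f.EnableInternodeServerKeepalive = enableServer
	f.EnableInternodeClientKeepalive = enableClient
	return f
}

func main() {
	f := provideFactory("history", true, false)
	fmt.Printf("%+v\n", *f)
}
```

Callers that use `newFactory` directly get both flags as `false`, which matches the defaults of the two dynamic config settings shown earlier.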
What changed?
Add Dynamic Config for safe keepalive rollout
Why?
We need the ability to switch the configured keepalive settings on and off during the rollout.
How did you test it?
Potential risks
No risk is expected.