[v0.14.x] Allow dnsmasq to be backed by a local copy of CoreDNS #1891
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1adcbea to ad76f04
Small merge conflict needs fixing.
builtin/files/cluster.yaml.tmpl
Outdated
@@ -1405,6 +1410,13 @@ kubeProxy:
# It is enabled by default.
#cloudFormationStreaming: true

<<<<<<< Updated upstream
Can we fix this merge conflict?
application: coredns
data:
Corefile: |
cluster.local:9254 {{ .PodCIDR }}:9254 {{ .ServiceCIDR }}:9254 {
Nice.
labels:
application: coredns
data:
Corefile: |
Re-reviewing this Corefile against Zalando's, I see that they actually have their custom (additional) DNS configuration above the cluster.local... block. I wonder if that performs better?
ad76f04 to 1ea50d8
2406955 to a9e301c
This commit introduces a new configuration option for cluster.yaml, `kubeDns.coreDNSLocal`. If this option and `kubeDns.nodeLocalResolver` are set to true, the dnsmasq-node DaemonSet will be configured to use a local copy of CoreDNS for its resolution while setting the global CoreDNS service as a fallback.

This is handy in situations where the number of DNS requests within a cluster grows large and causes resolution issues as dnsmasq reaches out to the global CoreDNS service. See [0] for an investigation into this situation which was instrumental in understanding issues we were facing.

Many thanks to dominicgunn for providing the manifests which I codified into this commit.

[0] https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/docs/postmortems/jan-2019-dns-outage.md
a9e301c to 7145636
With this commit, the user can override the following dnsmasq defaults:

* --cache-size (50000)
* --dns-forward-max (500)
* --neg-ttl (60)
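As a rough illustration of the override behaviour described in that commit message, here is a minimal Go sketch. The `DnsmasqOverrides` type and the `dnsmasqArgs` helper are hypothetical names made up for this example and are not taken from the kube-aws code; only the three flag names and their default values come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// DnsmasqOverrides is a hypothetical type for this sketch, not the
// struct used by kube-aws. A zero field means "keep the default".
type DnsmasqOverrides struct {
	CacheSize     int
	DNSForwardMax int
	NegTTL        int
}

// orDefault returns v unless it is zero, in which case it returns def.
// Limitation of this sketch: an explicit 0 cannot be distinguished from "unset".
func orDefault(v, def int) int {
	if v == 0 {
		return def
	}
	return v
}

// dnsmasqArgs renders the three flags, falling back to the defaults
// listed in the commit message (50000, 500, 60).
func dnsmasqArgs(o DnsmasqOverrides) []string {
	return []string{
		fmt.Sprintf("--cache-size=%d", orDefault(o.CacheSize, 50000)),
		fmt.Sprintf("--dns-forward-max=%d", orDefault(o.DNSForwardMax, 500)),
		fmt.Sprintf("--neg-ttl=%d", orDefault(o.NegTTL, 60)),
	}
}

func main() {
	// Override only the cache size; the other two flags keep their defaults.
	args := dnsmasqArgs(DnsmasqOverrides{CacheSize: 100000})
	fmt.Println(strings.Join(args, " "))
}
```

The real change presumably does this through the cluster.yaml template rather than Go code at runtime; the sketch only shows the "default unless overridden" logic.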
7145636 to e4a412b
The limits of dnsmasq-node's coredns-local container can now be configured (or removed) by specifying the following:

```
dnsmasq:
  coreDNSLocal:
    limits:
      cpu: 100m
      memory: "" # disabled
```
@@ -1058,6 +1058,9 @@ write_files:
"${mfdir}/kube-dns-de.yaml"
{{- end }}
{{ if .KubeDns.NodeLocalResolver -}}
{{ if .KubeDns.dnsmasq.CoreDNSLocal.Enabled -}}
deploy "${mfdir}/dnsmasq-node-coredns-local.yaml"
I know that kube-aws doesn't do a fantastic job of cleanup, but this one seems like it would be easy. Could we throw in the following?
{{- else }}
remove "${mfdir}/canal.yaml"
{{- end }}
Sorry, remove "${mfdir}/dnsmasq-node-coredns-local.yaml"
👍 Will do.
pkg/api/types.go
Outdated
@@ -209,6 +209,23 @@ type IPVSMode struct {
MinSyncPeriod string `yaml:"minSyncPeriod"`
}

type CoreDNSLocalLimits struct {
Could we use `ComputeResources` instead?
Yep -- great idea!
coredns-local's resources can now be configured using:

```
kubeDns:
  dnsmasq:
    coreDNSLocal:
      resources:
        requests:
          cpu: "200m"
          memory: "1000Mi"
        limits:
          cpu: "400m"
          memory: "2000Mi"
```
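For readers who want to see how a `resources` block of this shape might be modelled on the Go side, here is a small sketch. The `ResourceList` and `Resources` types and the `setFields` helper are hypothetical and written for this example; they are not the `ComputeResources` type mentioned in the review, whose fields are not shown in this conversation. The sketch also assumes that an empty string means "not set", following the `memory: ""` convention from the earlier revision of this PR.

```go
package main

import "fmt"

// ResourceList is a hypothetical type for this sketch; it mirrors the
// cpu/memory pair seen in the YAML above.
type ResourceList struct {
	CPU    string `yaml:"cpu"`
	Memory string `yaml:"memory"`
}

// Resources groups requests and limits, as in the YAML above.
type Resources struct {
	Requests ResourceList `yaml:"requests"`
	Limits   ResourceList `yaml:"limits"`
}

// setFields returns one "name: value" line per non-empty field.
// Assumption: an empty string means the value is not set and is omitted.
func (l ResourceList) setFields() []string {
	var out []string
	if l.CPU != "" {
		out = append(out, fmt.Sprintf("cpu: %q", l.CPU))
	}
	if l.Memory != "" {
		out = append(out, fmt.Sprintf("memory: %q", l.Memory))
	}
	return out
}

func main() {
	r := Resources{
		Requests: ResourceList{CPU: "200m", Memory: "1000Mi"},
		Limits:   ResourceList{CPU: "400m"}, // memory limit left unset
	}
	fmt.Println("requests:", r.Requests.setFields())
	fmt.Println("limits:", r.Limits.setFields())
}
```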
/lgtm
This commit allows the user to specify that dnsmasq should be backed by a pod-local copy of CoreDNS rather than relying on the global CoreDNS service. If enabled, the dnsmasq-node DaemonSet will be configured to use a local copy of CoreDNS for its resolution while setting the global CoreDNS service as a fallback. This is handy in situations where the number of DNS requests within a cluster grows large and causes resolution issues as dnsmasq reaches out to the global CoreDNS service.
Additionally, several values passed to dnsmasq are now configurable, including its `--cache-size` and `--dns-forward-max`.
See this postmortem for an investigation into this situation which was instrumental in understanding issues we were facing. Many thanks to @dominicgunn for providing the manifests which I codified into this commit.
These features can be enabled and tuned by setting the following values within cluster.yaml: