
nixos/kubernetes: KubeDNS -> CoreDNS #49516

Merged: 2 commits merged into NixOS:master on Nov 6, 2018

Conversation

@johanot (Contributor) commented Oct 31, 2018

Motivation for this change

In accordance with the Kubernetes defaults as of v1.11, I've updated the Kubernetes DNS addon to provide CoreDNS instead of Kube-DNS.

See: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.11.md#sig-cluster-lifecycle

The Kubernetes specs for the CoreDNS components are heavily based on this template:
https://github.com/coredns/deployment/blob/master/kubernetes/coredns.yaml.sed

I've made an effort to preserve the container ports currently in use, for backward compatibility (see the sketch after this list):
DNS: 10053
Health: 10054 (not exposed)
Metrics: 10055
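
For anyone trying this out, here is a minimal sketch of enabling the DNS addon on a NixOS machine; the option paths are abbreviated and should be double-checked against the module:

# Minimal sketch (assumed option paths; verify against the module):
services.kubernetes = {
  roles = [ "master" "node" ];
  addons.dns.enable = true;   # now deploys CoreDNS instead of Kube-DNS
};
# The addon's container ports stay as listed above:
# DNS 10053, health 10054 (not exposed), metrics 10055.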

This PR also includes a release-note section that explains some gotchas and details of the DNS addon migration.

Things done

CoreDNS, as configured in this PR, has been running in our staging cluster for 3 days now, and testing looks good so far; we've had no issues.

  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@johanot (Contributor, Author) commented Oct 31, 2018

@GrahamcOfBorg test kubernetes.rbac.singlenode kubernetes.dns.singlenode kubernetes.rbac.multinode kubernetes.dns.multinode

@GrahamcOfBorg

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: tests.kubernetes.rbac.singlenode, tests.kubernetes.dns.singlenode, tests.kubernetes.rbac.multinode, tests.kubernetes.dns.multinode

Partial log:


Cannot nix-instantiate `tests.kubernetes.dns.singlenode' because:
error: attribute 'singlenode' in selection path 'tests.kubernetes.dns.singlenode' not found

Cannot nix-instantiate `tests.kubernetes.rbac.multinode' because:
error: attribute 'multinode' in selection path 'tests.kubernetes.rbac.multinode' not found

Cannot nix-instantiate `tests.kubernetes.dns.multinode' because:
error: attribute 'multinode' in selection path 'tests.kubernetes.dns.multinode' not found

@GrahamcOfBorg

Unexpected error: unexpected build failure on x86_64-linux (full log)

Attempted: tests.kubernetes.rbac.singlenode, tests.kubernetes.dns.singlenode, tests.kubernetes.rbac.multinode, tests.kubernetes.dns.multinode

Partial log:

machine1# [  260.497244] etcd[799]: sync duration of 5.138984144s, expected less than 1s
machine2# [  262.982659] kube-proxy[566]: E1031 13:10:13.223672     566 reflector.go:134] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Service: Get https://api.my.zyx/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.1.1:443: connect: connection refused
machine1# [  263.817675] kube-scheduler[1001]: E1031 13:10:16.469783    1001 reflector.go:134] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.StatefulSet: Get https://api.my.zyx/apis/apps/v1/statefulsets?limit=500&resourceVersion=0: dial tcp 192.168.1.1:443: i/o timeout
machine1# [  260.651282] etcd[827]: read-only range request "key:\"/registry/controllers\" range_end:\"/registry/controllert\" count_only:true " took too long (260.04143ms) to execute
machine1# [  265.552769] etcd[827]: sync duration of 4.747217313s, expected less than 1s
machine1# [  265.559326] kube-scheduler[1039]: E1031 13:10:18.589977    1039 reflector.go:134] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.ReplicationController: Get https://api.my.zyx/api/v1/replicationcontrollers?limit=500&resourceVersion=0: dial tcp 192.168.1.1:443: i/o timeout
machine2# [  262.327630] kube-proxy[590]: E1031 13:10:17.945579     590 reflector.go:134] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Service: Get https://api.my.zyx/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.1.1:443: connect: connection refused
machine2# [  262.356909] kube-proxy[590]: E1031 13:10:17.953584     590 reflector.go:134] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Endpoints: Get https://api.my.zyx/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 192.168.1.1:443: connect: connection refused
machine1# [  265.617370] kube-controller-manager[1042]: E1031 13:10:20.846643    1042 leaderelection.go:252] error retrieving resource lock kube-system/kube-controller-manager: Get https://api.my.zyx/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
machine2# [  262.475683] kube-proxy[590]: E1031 13:10:19.849634     590 reflector.go:134] k8s.io/client-go/informers/factory.go:131: Failed to list *v1.Endpoints: Get https://api.my.zyx/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 192.168.1.1:443: connect: connection refused

@johanot (Contributor, Author) commented Oct 31, 2018

@GrahamcOfBorg test kubernetes.dns.singlenode kubernetes.dns.multinode

@GrahamcOfBorg

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: tests.kubernetes.dns.singlenode, tests.kubernetes.dns.multinode

Partial log:

Cannot nix-instantiate `tests.kubernetes.dns.singlenode' because:
error: attribute 'singlenode' in selection path 'tests.kubernetes.dns.singlenode' not found

Cannot nix-instantiate `tests.kubernetes.dns.multinode' because:
error: attribute 'multinode' in selection path 'tests.kubernetes.dns.multinode' not found

@GrahamcOfBorg

Success on x86_64-linux (full log)

Attempted: tests.kubernetes.dns.singlenode, tests.kubernetes.dns.multinode

Partial log:

machine1: running command: sync
machine1: exit status 0
test script finished in 344.75s
cleaning up
killing machine2 (pid 597)
killing machine1 (pid 609)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/xfznlhapy0j5r5kz38ppfv9jlzl1qs77-vm-test-run-kubernetes-dns-singlenode
/nix/store/0a2sriz43gyhql40178ask8l5lzpjhrw-vm-test-run-kubernetes-dns-multinode

@johanot (Contributor, Author) commented Oct 31, 2018

Hmm... tests continue to be somewhat unreliable on ofborg, but they run clean on my machine.
Perhaps there's a higher chance of success if I run fewer test cases on ofborg at a time.

@Mic92 (Member) commented Nov 1, 2018

Hmm. If they don't run well on ofborg, there's a good chance they'll also break horribly on Hydra, which is even more contended at times. Any chance of increasing some timeouts or similar? This isn't meant as a blocker for this pull request, but it would make our CI more reliable in the future.

@srhb (Contributor) commented Nov 1, 2018

Regarding the tests: as far as I know, they are stable on Hydra (modulo the VM connect timeouts that we've been experiencing regularly across the board, which are not Kubernetes-related).

The ofborg failures are caused by the much lower timeout, I believe.

@srhb (Contributor) commented Nov 2, 2018

@johanot If you know of any Kubernetes-on-NixOS users, can you ping them for review over the weekend? I'm happy to merge this on Monday, but more feedback is nice.

@Mic92 (Member) commented Nov 2, 2018

I would say @offlinehacker or @cstrahan?

@steveej (Contributor) left a comment

I noted a few things related to the replicas setting. Other than that I trust the tests.

};
replicas = mkOption {
description = "Number of DNS pod replicas to deploy in the cluster.";
default = 2;
Contributor:

What happens on a one-node cluster?

Contributor:

Upstream doesn't touch this: https://github.com/kubernetes/kubernetes/blob/v1.12.2/cluster/addons/dns/coredns/coredns.yaml.sed#L85-L88; instead, it states that the replica count is handled at runtime by horizontal auto-scaling.

Contributor Author:

@steveej A one-node cluster will have 2 replicas deployed (by default), but both on the same node. I understand that scaling can be handled by horizontal auto-scalers, but 1) we don't deploy the horizontal auto-scaler with this module (as of now), and 2) I prefer multiple replicas not for scaling reasons, but for redundancy reasons. In our production cluster, we cannot tolerate DNS service downtime due to the failure of a single container (or node). I can accept setting replicas to 1 by default, but in that case we really should set maxUnavailable: 0 to ensure that at least 1 DNS pod is always up during a rolling update of the DNS deployment.

But that would still not ensure redundancy in case of node failure. In our production cluster we now have 12 nodes, and we want to be able to guarantee zero downtime of the DNS service, even in the case of multiple nodes failing. We therefore currently set replicas = 4.
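
To make that concrete, a rough sketch follows; the option path is assumed to be services.kubernetes.addons.dns.replicas (per the option under review), and the strategy block uses the Kubernetes apps/v1 Deployment fields, regardless of whether the module ends up setting them:

# Cluster configuration: run 4 CoreDNS replicas for redundancy
# (exact option path is an assumption; see the option under review above).
{
  services.kubernetes.addons.dns.replicas = 4;
}

# Module-side sketch of a rolling-update strategy that keeps at least one DNS
# pod available during updates (apps/v1 Deployment fields as a Nix attrset):
strategy = {
  type = "RollingUpdate";
  rollingUpdate = {
    maxUnavailable = 0;
    maxSurge = 1;
  };
};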

Contributor:

Thanks for explaining. My remarks have been resolved, not to say invalidated ;-)

cache 30
loop
reload
loadbalance
Contributor:

Maybe make this depend on replicas > 1?
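
(For illustration, a sketch of what that suggestion could look like, assuming the Corefile is assembled as a Nix string and the replica count is available as cfg.replicas; this is not the module's actual code:)

# Hypothetical sketch: include the loadbalance plugin only when more than one
# replica is deployed. lib.optionalString yields "" when the condition is false.
''
  cache 30
  loop
  reload
  ${lib.optionalString (cfg.replicas > 1) "loadbalance"}
''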

Contributor Author:

You mean loadbalance? AFAIK that's just round-robin DNS: https://github.com/coredns/coredns/tree/master/plugin/loadbalance

...which is perhaps a bit unnecessary in Kubernetes, since L4 load balancing is implemented through Services and L7 can be enabled through Ingresses. But I guess it can't hurt to enable DNS "loadbalancing", even though I can't think of which queries would return multiple A records in our cluster. :)

Contributor:

I didn't know what exactly that option did ;-) After learning what it does, I see it has no relation to load balancing across the CoreDNS replicas.

@Mic92 (Member) commented Nov 5, 2018

@GrahamcOfBorg test kubernetes

@Mic92 (Member) commented Nov 5, 2018

OK, this was unnecessary; all the changes have already been tested.

@srhb srhb merged commit 81de3e3 into NixOS:master Nov 6, 2018
@GrahamcOfBorg

Success on aarch64-linux

Attempted: tests.kubernetes

No partial log is available.

@GrahamcOfBorg

Success on x86_64-linux

Attempted: tests.kubernetes

No partial log is available.
