
kubernetes: 1.7.9 -> 1.9.1 #33954

Merged
merged 4 commits on Feb 16, 2018

Conversation

@kuznero (Member) commented Jan 16, 2018

Motivation for this change

Upgrade kubernetes to the latest v1.9.1 (as well as kubecfg to v0.6.0 and kubernetes-dashboard to v1.8.2). Related to #30639.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@srhb (Contributor) commented Jan 16, 2018

Nice!
Sorry I missed you in #nixos!

You can run the tests via:

nix-build nixos/tests/kubernetes -A rbac
nix-build nixos/tests/kubernetes -A dns

@kuznero (Member, Author) commented Jan 16, 2018

@srhb Excellent! Thanks, will try that!

@kuznero (Member, Author) commented Jan 16, 2018

Ran the rbac test; it ran for a long time and ended with the following failure:

machine1# [ 1082.420971] kube-apiserver[1013]: I0116 21:27:26.547088    1013 pathrecorder.go:247] kube-aggregator: "/api/v1/resourcequotas" satisfied by prefix /api/
machine1# [ 1082.422157] kube-apiserver[1013]: I0116 21:27:26.547129    1013 handler.go:150] kube-apiserver: GET "/api/v1/resourcequotas" satisfied by gorestful with webserv1
machine1# [ 1082.423348] kube-apiserver[1013]: I0116 21:27:26.548427    1013 get.go:238] Starting watch for /api/v1/resourcequotas, rv=1 labels= fields= timeout=8m15s
machine1# [ 1082.424599] kube-controller-manager[1014]: I0116 21:27:26.548675    1014 round_trippers.go:436] GET https://api.my.zyx/api/v1/resourcequotas?resourceVersion=1&ts
error: action timed out after -1 seconds at /nix/store/lp80aincldbqcdfj2bxshw4ls314lymm-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
action timed out after -1 seconds at /nix/store/lp80aincldbqcdfj2bxshw4ls314lymm-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
cleaning up
killing machine1 (pid 593)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/tmp/nix-build-vm-test-run-kubernetes-rbac-singlenode.drv-0/vde1.ctl': Directory not empty
builder for ‘/nix/store/brq94jki7y0hnx88yimsyl6ycb4dpv6z-vm-test-run-kubernetes-rbac-singlenode.drv’ failed with exit code 255
error: build of ‘/nix/store/brq94jki7y0hnx88yimsyl6ycb4dpv6z-vm-test-run-kubernetes-rbac-singlenode.drv’ failed

The dns test also took quite some time and ended with the following failure:

machine1# [ 1084.051487] kube-proxy[1048]: I0116 21:49:04.148737    1048 iptables.go:321] running iptables-save [-t filter]
machine1# [ 1084.056529] kube-proxy[1048]: I0116 21:49:04.172086    1048 iptables.go:321] running iptables-save [-t nat]
machine1# [ 1084.069982] kube-proxy[1048]: I0116 21:49:04.185701    1048 proxier.go:1664] Restoring iptables rules: *filter
machine1# [ 1084.071917] kube-proxy[1048]: :KUBE-SERVICES - [0:0]
machine1# [ 1084.073313] kube-proxy[1048]: :KUBE-FORWARD - [0:0]
machine1# [ 1084.074508] kube-proxy[1048]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp -p tcp -d 10.0.0.254/32 --dport 53 -jT
machine1# [ 1084.077681] kube-proxy[1048]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp -p udp -d 10.0.0.254/32 --dport 53 -j REJT
error: action timed out after -1 seconds at /nix/store/lp80aincldbqcdfj2bxshw4ls314lymm-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
action timed out after -1 seconds at /nix/store/lp80aincldbqcdfj2bxshw4ls314lymm-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
cleaning up
killing machine1 (pid 593)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/tmp/nix-build-vm-test-run-kubernetes-dns-singlenode.drv-0/vde1.ctl': Directory not empty
builder for ‘/nix/store/9sai7v94s418hnljn68a5ll3r4bnnqdf-vm-test-run-kubernetes-dns-singlenode.drv’ failed with exit code 255
error: build of ‘/nix/store/9sai7v94s418hnljn68a5ll3r4bnnqdf-vm-test-run-kubernetes-dns-singlenode.drv’ failed

Will be running the same tests on master now to compare results.

P.S. I assume a test passes when the exit code is 0.

@kuznero (Member, Author) commented Jan 16, 2018

The rbac test on master gives no error; here is the end of stdout:

machine1# [  666.179324] kubelet[1455]: I0116 22:26:31.283127    1455 generic.go:182] GenericPLEG: Relisting
machine1# [  666.184917] dockerd[901]: time="2018-01-16T22:26:31.319811155Z" level=warning msg="unknown container" container=907a9d0b7cc03c69f3840d141b72b2928757edb55b9631a6y
machine1# [  666.191117] kubelet[1455]: I0116 22:26:31.323034    1455 server.go:794] GET /cri/exec/shK4aP2A: (270.364529ms) hijacked [[kubectl/v1.7.9+7f63532e4ff4f (linux/am]
machine1: exit status 1
collecting coverage data
machine1: running command: test -e /sys/kernel/debug/gcov
machine1# [  666.193438] kube-apiserver[996]: E0116 22:26:31.324203     996 proxy.go:199] Error proxying data from client to backend: write tcp 192.168.1.1:56668->192.168.1.e
machine1: exit status 1
syncing
machine1: running command: sync
machine1# [  666.198385] kube-apiserver[996]: I0116 22:26:31.324481     996 wrap.go:42] POST /api/v1/namespaces/default/pods/kubectl/exec?command=kubectl&command=delete&comm]
machine1: exit status 0
test script finished in 667.18s
cleaning up
killing machine1 (pid 593)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/tmp/nix-build-vm-test-run-kubernetes-rbac-singlenode.drv-0/vde1.ctl': Directory not empty
/nix/store/pk67nb0fdqm027nf0dimraisb1vdmixn-vm-test-run-kubernetes-rbac-multinode
/nix/store/khz2sj56jcyq1aj50rwq5lapg02x4g7l-vm-test-run-kubernetes-rbac-singlenode

The dns test on master also gives no error; here is the end of stdout:

machine1# [  109.500882] kubelet[1476]: I0116 22:33:32.636713    1476 server.go:794] GET /cri/exec/Jf46yACz: (1.342959231s) hijacked [[kubectl/v1.7.9+7f63532e4ff4f (linux/am]
machine1# [  109.586333] kube-apiserver[1029]: I0116 22:33:32.722066    1029 wrap.go:42] POST /api/v1/namespaces/default/pods/probe/exec?command=%2Fbin%2Fhost&command=redis.]
machine1# [  109.626825] kube-apiserver[1029]: I0116 22:33:32.762798    1029 wrap.go:42] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (174.38194ms) ]
machine1: exit status 0
collecting coverage data
machine1: running command: test -e /sys/kernel/debug/gcov
machine1: exit status 1
syncing
machine1: running command: sync
machine1# [  109.668227] kube-controller-manager[1034]: I0116 22:33:32.804238    1034 round_trippers.go:405] GET https://api.my.zyx/api/v1/namespaces/kube-system/endpoints/ks
machine1# [  109.678463] kubelet[1476]: I0116 22:33:32.814565    1476 config.go:101] Looking for [api file], have seen map[file:{} api:{}]
machine1# [  109.680205] kubelet[1476]: I0116 22:33:32.816315    1476 kubelet.go:1959] SyncLoop (housekeeping)
machine1# [  109.717406] dhcpcd[869]: vethb65ee3fc: no IPv6 Routers available
machine1# [  109.743463] kube-apiserver[1029]: I0116 22:33:32.876495    1029 handler.go:160] kube-aggregator: PUT "/api/v1/namespaces/kube-system/endpoints/kube-controller-ml
machine1# [  109.745526] kube-apiserver[1029]: I0116 22:33:32.876524    1029 pathrecorder.go:247] kube-aggregator: "/api/v1/namespaces/kube-system/endpoints/kube-controller-/
machine1# [  109.747288] kube-apiserver[1029]: I0116 22:33:32.876542    1029 handler.go:150] kube-apiserver: PUT "/api/v1/namespaces/kube-system/endpoints/kube-controller-ma1
machine1: exit status 0
test script finished in 110.59s
cleaning up
killing machine1 (pid 593)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/tmp/nix-build-vm-test-run-kubernetes-dns-singlenode.drv-0/vde1.ctl': Directory not empty
/nix/store/a6lw5fgdpicmf0nd9ja6m25cazr9h7g4-vm-test-run-kubernetes-dns-multinode
/nix/store/19ws0gsprrjrd3jxrvw1wb86156h0idx-vm-test-run-kubernetes-dns-singlenode

One observation: with kubernetes 1.9.1 the tests run much, much longer. Perhaps some behavior changed in the new kubernetes version? Hopefully nothing is actually wrong, but in that case the tests would have to be changed. I will try to figure out how exactly those tests work.

@kuznero (Member, Author) commented Jan 16, 2018

On the other hand, how do we test that the kubernetes modules work properly after this upgrade?

@kuznero mentioned this pull request Jan 16, 2018
@offlinehacker (Contributor) commented Jan 17, 2018

I can test this next week on one of our kubernetes clusters that runs NixOS and check whether everything works as it should. We want to update all clusters to kubernetes 1.9 anyway, so this is on the agenda.

@kuznero (Member, Author) commented Jan 17, 2018

@offlinehacker thanks

@NeQuissimus (Member) commented:

@GrahamcOfBorg test kubernetes.rbac

@GrahamcOfBorg left a comment:


Failure for system: x86_64-linux

error: while evaluating ‘hydraJob’ at /var/lib/gc-of-borg/.nix-test-rs/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-zoidberg/lib/customisation.nix:167:14, called from /var/lib/gc-of-borg/.nix-test-rs/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-zoidberg/nixos/release.nix:286:22:
while evaluating the attribute ‘name’ at /var/lib/gc-of-borg/.nix-test-rs/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-zoidberg/lib/customisation.nix:172:24:
attribute ‘name’ missing, at /var/lib/gc-of-borg/.nix-test-rs/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-zoidberg/lib/customisation.nix:172:10

@kuznero (Member, Author) commented Jan 17, 2018

Is there any way to reproduce this on my box?

@NeQuissimus (Member) commented:

@grahamc ? Maybe I kicked it off wrong?

@srhb (Contributor) commented Jan 17, 2018

No, evaluating it via release.nix is also weird here. Building the test directly seems to be the only way, but ofborg doesn't do that, obviously.

@grahamc (Member) commented Jan 17, 2018

Why is it weird to evaluate it through release.nix?

@jirkadanek (Member) commented:

I cherry-picked the two commits from this PR on top of current master (98b35db, Wed Jan 17, eclipse-plugins-ansi-econsole: init at 1.3.5) and ran nix-build nixos/tests/kubernetes -A rbac -I .. Both times I ran this, I got an error:

machine1# [ 1178.071316] kubelet[1742]: I0117 22:56:48.946459    1742 config.go:99] Looking for [api file], have seen map[file:{}]
machine2# [ 1178.304523] kubelet[1114]: I0117 22:56:49.146572    1114 generic.go:183] GenericPLEG: Relisting
machine2# [ 1178.308840] kubelet[1114]: I0117 22:56:49.153175    1114 config.go:99] Looking for [api], have seen map[]
machine1# [ 1178.320576] kubelet[1704]: I0117 22:56:49.087860    1704 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1178.116734] kube-apiserver[1079]: I0117 22:56:48.994321    1079 handler.go:160] kube-aggregator: GET "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by nonGoRestful
machine1# [ 1178.121814] kube-apiserver[1079]: I0117 22:56:48.994365    1079 pathrecorder.go:247] kube-aggregator: "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by prefix /api/
machine1# [ 1178.126070] kube-apiserver[1079]: I0117 22:56:48.994396    1079 handler.go:150] kube-apiserver: GET "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by gorestful with webservice /api/v1
machine1# [ 1178.128363] kube-apiserver[1079]: I0117 22:56:48.997994    1079 wrap.go:42] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (3.928991ms) 200 [[kube-controller-manager/v1.9.1 (linux/amd64) kubernetes/3a1c944/leader-election] 192.168.1.1]
machine1# [ 1178.131345] kube-apiserver[1079]: I0117 22:56:48.999263    1079 handler.go:160] kube-aggregator: PUT "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by nonGoRestful
machine1# [ 1178.133724] kube-apiserver[1079]: I0117 22:56:48.999288    1079 pathrecorder.go:247] kube-aggregator: "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by prefix /api/
machine1# [ 1178.136221] kube-apiserver[1079]: I0117 22:56:48.999317    1079 handler.go:150] kube-apiserver: PUT "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by gorestful with webservice /api/v1
machine2# [ 1178.339793] kube-proxy[528]: I0117 22:56:49.184135     528 config.go:141] Calling handler.OnEndpointsUpdate
machine1# [ 1178.145787] kube-controller-manager[1965]: I0117 22:56:48.998251    1965 round_trippers.go:436] GET https://api.my.zyx/api/v1/namespaces/kube-system/endpoints/kube-controller-manager 200 OK in 4 milliseconds
machine1# [ 1178.148623] kube-controller-manager[1965]: I0117 22:56:49.018441    1965 graph_builder.go:601] GraphBuilder process object: v1/Endpoints, namespace kube-system, name kube-controller-manager, uid 14480d14-fbd7-11e7-b236-525400123456, event type update
machine1# [ 1178.151654] kube-controller-manager[1965]: I0117 22:56:49.021523    1965 round_trippers.go:436] PUT https://api.my.zyx/api/v1/namespaces/kube-system/endpoints/kube-controller-manager 200 OK in 22 milliseconds
machine1# [ 1178.153594] kube-controller-manager[1965]: I0117 22:56:49.021822    1965 leaderelection.go:199] successfully renewed lease kube-system/kube-controller-manager
machine1# [ 1178.155378] kubelet[1742]: I0117 22:56:49.019767    1742 generic.go:183] GenericPLEG: Relisting
machine1# [ 1178.156744] kube-proxy[1086]: I0117 22:56:49.019117    1086 config.go:141] Calling handler.OnEndpointsUpdate
machine1# [ 1178.157994] kube-apiserver[1079]: I0117 22:56:49.021116    1079 wrap.go:42] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (21.953628ms) 200 [[kube-controller-manager/v1.9.1 (linux/amd64) kubernetes/3a1c944/leader-election] 192.168.1.]
machine1# [ 1178.167633] kubelet[1742]: I0117 22:56:49.046376    1742 config.go:99] Looking for [api file], have seen map[file:{}]
machine2# [ 1178.408940] kubelet[1114]: I0117 22:56:49.253301    1114 config.go:99] Looking for [api], have seen map[]
machine1# [ 1178.420664] kubelet[1704]: I0117 22:56:49.187859    1704 config.go:99] Looking for [api file], have seen map[file:{}]
error: action timed out after -1 seconds at /nix/store/22rg6kabcnqjadb6ajfdxqk8sq415dq3-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
action timed out after -1 seconds at /nix/store/22rg6kabcnqjadb6ajfdxqk8sq415dq3-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
cleaning up
killing machine1 (pid 27482)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/tmp/nix-build-vm-test-run-kubernetes-rbac-singlenode.drv-0/vde1.ctl': Directory not empty
builder for ‘/nix/store/5r32c0qs8v4bqy3hy6ym3n6p1nc6kgzd-vm-test-run-kubernetes-rbac-singlenode.drv’ failed with exit code 255
error: build of ‘/nix/store/5r32c0qs8v4bqy3hy6ym3n6p1nc6kgzd-vm-test-run-kubernetes-rbac-singlenode.drv’ failed
nix-build nixos/tests/kubernetes -A rbac -I .  5.45s user 3.49s system 0% cpu 19:45.08 total

@jirkadanek (Member) commented:

Same error with nix-build nixos/tests/kubernetes -A dns -I .:

machine1# [ 1177.817411] kubelet[1734]: I0118 06:35:14.599854    1734 round_trippers.go:436] GET https://api.my.zyx/api/v1/pods?fieldSelector=spec.nodeName%3Dmachine1.my.zyx&limit=500&resourceVersion=0 403 Forbidden in 5 milliseconds
machine1# [ 1177.820796] kubelet[1734]: E0118 06:35:14.615350    1734 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: pods is forbidden: User "apiserver-client-kubelet" cannot list pods at the cluster scope
machine1# [ 1178.098719] kubelet[1698]: I0118 06:35:14.940601    1698 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1177.865278] kubelet[1734]: I0118 06:35:14.659768    1734 config.go:99] Looking for [api file], have seen map[file:{}]
machine2# [ 1178.098806] kubelet[1112]: I0118 06:35:14.960770    1112 config.go:99] Looking for [api], have seen map[]
machine1# [ 1178.198759] kubelet[1698]: I0118 06:35:15.040712    1698 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1177.953699] kube-apiserver[1082]: I0118 06:35:14.748223    1082 handler.go:160] kube-aggregator: GET "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by nonGoRestful
machine1# [ 1177.955672] kube-apiserver[1082]: I0118 06:35:14.750233    1082 pathrecorder.go:247] kube-aggregator: "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by prefix /api/
machine1# [ 1177.958596] kube-apiserver[1082]: I0118 06:35:14.751720    1082 handler.go:150] kube-apiserver: GET "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by gorestful with webservice /api/v1
machine1# [ 1177.966206] kube-apiserver[1082]: I0118 06:35:14.755374    1082 wrap.go:42] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (7.446756ms) 200 [[kube-controller-manager/v1.9.1 (linux/amd64) kubernetes/3a1c944/leader-election] 192.168.1.1]
machine2# [ 1178.177076] kube-proxy[531]: I0118 06:35:15.039014     531 config.go:141] Calling handler.OnEndpointsUpdate
machine1# [ 1177.969819] kube-apiserver[1082]: I0118 06:35:14.756447    1082 handler.go:160] kube-aggregator: PUT "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by nonGoRestful
machine1# [ 1178.235028] kubelet[1698]: I0118 06:35:15.076974    1698 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1177.972678] kube-apiserver[1082]: I0118 06:35:14.756477    1082 pathrecorder.go:247] kube-aggregator: "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by prefix /api/
machine1# [ 1178.236714] kubelet[1698]: I0118 06:35:15.078676    1698 kubelet.go:1921] SyncLoop (housekeeping, skipped): sources aren't ready yet.
machine1# [ 1177.975052] kube-apiserver[1082]: I0118 06:35:14.756503    1082 handler.go:150] kube-apiserver: PUT "/api/v1/namespaces/kube-system/endpoints/kube-controller-manager" satisfied by gorestful with webservice /api/v1
machine1# [ 1177.977558] kube-apiserver[1082]: I0118 06:35:14.759349    1082 wrap.go:42] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (3.069384ms) 200 [[kube-controller-manager/v1.9.1 (linux/amd64) kubernetes/3a1c944/leader-election] 192.168.1.1]
machine1# [ 1177.980150] kube-proxy[1088]: I0118 06:35:14.760057    1088 config.go:141] Calling handler.OnEndpointsUpdate
machine1# [ 1177.981560] kubelet[1734]: I0118 06:35:14.759724    1734 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1177.983307] kube-controller-manager[2064]: I0118 06:35:14.755601    2064 round_trippers.go:436] GET https://api.my.zyx/api/v1/namespaces/kube-system/endpoints/kube-controller-manager 200 OK in 7 milliseconds
machine1# [ 1177.985632] kube-controller-manager[2064]: I0118 06:35:14.758955    2064 graph_builder.go:601] GraphBuilder process object: v1/Endpoints, namespace kube-system, name kube-controller-manager, uid 21f12473-fc17-11e7-a2e9-525400123456, event type update
machine1# [ 1177.988147] kube-controller-manager[2064]: I0118 06:35:14.759510    2064 round_trippers.go:436] PUT https://api.my.zyx/api/v1/namespaces/kube-system/endpoints/kube-controller-manager 200 OK in 3 milliseconds
machine2# [ 1178.198684] kubelet[1112]: I0118 06:35:15.060700    1112 config.go:99] Looking for [api], have seen map[]
machine1# [ 1177.990277] kube-controller-manager[2064]: I0118 06:35:14.763013    2064 leaderelection.go:199] successfully renewed lease kube-system/kube-controller-manager
machine1# [ 1178.298694] kubelet[1698]: I0118 06:35:15.140626    1698 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1178.061839] kubelet[1734]: I0118 06:35:14.856299    1734 config.go:99] Looking for [api file], have seen map[file:{}]
machine1# [ 1178.063498] kubelet[1734]: I0118 06:35:14.858002    1734 kubelet.go:1921] SyncLoop (housekeeping, skipped): sources aren't ready yet.
machine1# [ 1178.065488] kubelet[1734]: I0118 06:35:14.859721    1734 config.go:99] Looking for [api file], have seen map[file:{}]
error: action timed out after -1 seconds at /nix/store/22rg6kabcnqjadb6ajfdxqk8sq415dq3-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
action timed out after -1 seconds at /nix/store/22rg6kabcnqjadb6ajfdxqk8sq415dq3-nixos-test-driver/lib/perl5/site_perl/Machine.pm line 227, <__ANONIO__> line 901.
cleaning up
killing machine1 (pid 20663)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/tmp/nix-build-vm-test-run-kubernetes-dns-singlenode.drv-0/vde1.ctl': Directory not empty
machine2# [ 1178.298946] kubelet[1112]: I0118 06:35:15.160802    1112 config.go:99] Looking for [api], have seen map[]
builder for ‘/nix/store/9d46cpdm16df4rrqwljx6b1xazzghg7j-vm-test-run-kubernetes-dns-singlenode.drv’ failed with exit code 255
error: build of ‘/nix/store/9d46cpdm16df4rrqwljx6b1xazzghg7j-vm-test-run-kubernetes-dns-singlenode.drv’ failed
nix-build nixos/tests/kubernetes -A dns -I .  5.56s user 3.38s system 0% cpu 19:57.85 total

The return code was 100 in both cases.

@jirkadanek (Member) commented:

$ nix-info -m

  • system: "x86_64-linux"
  • host os: Linux 4.9.76, NixOS, 18.03pre125130.3a763b91963 (Impala)
  • multi-user?: yes
  • sandbox: no
  • version: nix-env (Nix) 1.11.16
  • channels(root): "nixos-18.03pre125130.3a763b91963"
  • channels(jdanek): ""
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs

@srhb (Contributor) commented Jan 18, 2018

@grahamc I haven't looked into it, but the tests are sufficiently different from the others in release.nix that the naive approach doesn't work:

❯ nix-build nixos/release.nix -A tests.kubernetes.rbac --arg supportedSystems '[ "x86_64-linux" ]'
error: attribute 'meta' missing, at /home/sarah/src/nixpkgs/lib/customisation.nix:172:10

Hydra knows how to do it, clearly, but I'm not sure whether ofborg does the exact same thing.

@kuznero (Member, Author) commented Jan 22, 2018

@offlinehacker, did you have a chance to see if it works for you in the field?

@jirkadanek (Member) commented:

@kuznero I've been able to set this up as a single-node cluster on my laptop and run it without problems since Monday. Compared to the previous version, I only had to add services.kubernetes.kubelet.extraOpts = "--fail-swap-on=false";, because 1.9 now by default refuses to start on systems with swap enabled.
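
A minimal sketch of that workaround as it would look in a NixOS configuration; the option path services.kubernetes.kubelet.extraOpts and the roles value are assumptions for illustration, not taken from this PR:

  # configuration.nix (sketch; option names assumed, verify against the module)
  { config, pkgs, ... }:
  {
    services.kubernetes = {
      roles = [ "master" "node" ];  # single-node cluster, as in the comment above
      # Kubernetes 1.9 refuses to start the kubelet when swap is enabled,
      # so keep the pre-1.9 behaviour by passing --fail-swap-on=false.
      kubelet.extraOpts = "--fail-swap-on=false";
    };
  }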

@jirkadanek (Member) commented:

A little update: I actually hit an issue with this just now; it matches kubernetes/kubernetes#32796 exactly. Which is weird, as that was supposedly resolved a few releases before 1.9.1...

@kuznero (Member, Author) commented Jan 24, 2018

@jdanekrh thanks for the update.

@srhb (Contributor) commented Feb 2, 2018

@jdanekrh I believe that something has changed in the bootstrapping of the kubelets' authorization.

Adding the following line to the start of every test script works:

      $machine1->waitUntilSucceeds("kubectl create clusterrolebinding kubelet-node-binding --clusterrole=system:node --user=apiserver-client-kubelet");

I'm wondering whether this is just due to the CN we're using, or if we need to do something else to bootstrap the clients.

@srhb (Contributor) commented Feb 4, 2018

How about something like this?

The issue with the current tests is that there is no longer a default ClusterRoleBinding that confers registration access for kubelets with users in the system:nodes group. To alleviate this, I've enabled the Node authorizer, which requires the username (CN) to be system:node:<nodename>. I've also thrown in the NodeRestriction admission controller, which should further limit the kubelets' access drastically.

We could just add them both to the tests, but I think they're sane defaults and match up well with what the k8s community is doing.

nixos/k8s: Enable Node authorizer and NodeRestriction by default
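
For reference, a rough sketch of what enabling those two mechanisms could look like; the option names below (authorizationMode and admissionControl under services.kubernetes.apiserver) are assumptions for illustration and may not match the commit above:

  # Sketch only; option names are assumptions, not taken from the commit.
  {
    services.kubernetes.apiserver = {
      # Node authorizer: kubelets authenticate with CN "system:node:<nodename>"
      # and are only authorized for objects bound to their own node.
      authorizationMode = [ "Node" "RBAC" ];
      # NodeRestriction admission plugin further limits what kubelets may modify;
      # it would be appended to whatever admission controllers are already enabled.
      admissionControl = [ "NodeRestriction" ];
    };
  }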
@voobscout (Contributor) commented:

Will this ever get merged?

@NeQuissimus (Member) commented:

I think this is good to go?! @srhb ? @kuznero ?
The tests pass for me, anyway...

@kuznero (Member, Author) commented Feb 16, 2018

Should be ok

@NeQuissimus merged commit 8755902 into NixOS:master Feb 16, 2018
@Baughn (Contributor) commented Feb 26, 2018

This won't be particularly reliable.

Kubernetes 1.9.x works with Docker 17.03.x. We're currently shipping 17.12.x, and Docker has API changes in minor version bumps, so all sorts of flakiness may ensue. It really should be pinned to Docker 17.03.

(Or better, 1.12. To quote from the documentation: "On each of your machines, install Docker. Version v1.12 is recommended, but v1.11, v1.13 and 17.03 are known to work as well. Versions 17.06+ might work, but have not yet been tested and verified by the Kubernetes node team.")

@srhb (Contributor) commented Feb 27, 2018

@Baughn It appears that snippet is from kubeadm, not from Kubernetes itself. I haven't found anything in the actual Kubernetes docs that is worded as strongly; the release notes simply call those "verified" versions, and 17.12 appears to work just fine. We may still want to do something to signal this to the user, and at least provide (one of) the verified version(s) as an option, but forcing this seems a bit strong.
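
One hedged way to make that choice explicit on a NixOS host would be to pin the Docker package; pkgs.docker_17_03 below is a hypothetical attribute used purely for illustration (nixpkgs may or may not ship such a pinned package):

  # Sketch only; pkgs.docker_17_03 is a hypothetical attribute.
  { pkgs, ... }:
  {
    virtualisation.docker.enable = true;
    # Pin Docker to one of the versions validated for Kubernetes 1.9,
    # instead of following whatever pkgs.docker currently points at.
    virtualisation.docker.package = pkgs.docker_17_03;
  }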

@srhb (Contributor) commented Feb 27, 2018

@Baughn This issue is relevant, too: kubernetes/kubernetes#53221

Until k8s switches to matching some/more Docker API versions, the problem remains that we can either choose an EOL Docker or a K8s-unvalidated version. Fun!

@drdaeman (Contributor) commented Mar 2, 2018

I've tried this today, cherry-picking those commits onto release-17.09. Unfortunately, this has a problem: if the CA/key/cert files are not specified explicitly, they'll be generated under /var/run/kubernetes, so any restart breaks the cluster.

Thanks to @srhb for help on #nixos, I got it solved. I've jotted down a few notes here: https://gist.github.com/drdaeman/fee048df456ced9f604fb554b78f549f (a sample config and a script to generate quick-and-dirty certs that work for a totally insecure local-dev single-node "cluster").

Unfortunately, I'm really brain-dead after the struggle with K8s, so I can't write a proper issue right now. And my weekend is going to be very busy, so I'm not sure I'll have time for this in the next few days. But I thought I'd at least leave this comment here, in case someone else runs into a similar problem.
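
The idea behind the workaround, as a minimal sketch: point the module at persistent certificate files instead of letting them be generated under /var/run/kubernetes on every start. The option names below are assumptions for illustration only; the gist above contains the actual working configuration:

  # Sketch only; option names are illustrative, see the linked gist for a working config.
  {
    services.kubernetes.apiserver = {
      # Keep the CA, server certificate and key somewhere persistent
      # so a restart or reboot does not regenerate them and break the cluster.
      clientCaFile = "/var/lib/kubernetes/ca.pem";
      tlsCertFile  = "/var/lib/kubernetes/apiserver.pem";
      tlsKeyFile   = "/var/lib/kubernetes/apiserver-key.pem";
    };
  }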
