This morning, shortly after starting work, I was sent a message about a problem with an application being deployed into our prod environment:
Get "https://10.96.0.1:443/api/v1?timeout=32s": x509: certificate has expired or is not yet valid: current time 2022-10-26T08:42:46Z is after 2022-10-24T13:25:26Z
Now, I’m fairly new to Kubernetes, but this seemed like it should be a simple fix: just replace the expired certificate with a new one and restart the service, right? But then, it’s never really as simple as that, is it? Well, it actually is almost as simple as that. I did make a wee mistake that I’ll get to later, but first let’s check that the cluster certificates actually have expired. Open a connection to the server(s) that your master nodes are running on, and run this command:
[root@RYD1KMASTERPRD02 ~]# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 24, 2022 13:26 UTC   <invalid>                               no
apiserver                  Oct 24, 2022 13:26 UTC   <invalid>       ca                      no
apiserver-etcd-client      Oct 24, 2022 13:26 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   Oct 24, 2022 13:26 UTC   <invalid>       ca                      no
controller-manager.conf    Oct 24, 2022 13:26 UTC   <invalid>                               no
etcd-healthcheck-client    Oct 24, 2022 13:26 UTC   <invalid>       etcd-ca                 no
etcd-peer                  Oct 24, 2022 13:26 UTC   <invalid>       etcd-ca                 no
etcd-server                Oct 24, 2022 13:26 UTC   <invalid>       etcd-ca                 no
front-proxy-client         Oct 24, 2022 13:26 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             Oct 24, 2022 13:26 UTC   <invalid>                               no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Oct 22, 2031 13:25 UTC   8y              no
etcd-ca                 Oct 22, 2031 13:25 UTC   8y              no
front-proxy-ca          Oct 22, 2031 13:25 UTC   8y              no
(FYI, I’m writing this on the afternoon of the 26th of October.) So, as we can see, these certificates have expired and all need to be replaced. This can be done with a single command:
kubeadm certs renew all
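If you want to double-check the expiry dates outside of kubeadm, you can inspect any of the certificates directly with openssl. This is just a sketch using the default kubeadm certificate path, and it's guarded so it does nothing on a machine that doesn't have that file:

```shell
# Default kubeadm certificate location; guarded so this is a no-op
# on machines that don't have it.
CERT=/etc/kubernetes/pki/apiserver.crt
if [ -f "$CERT" ]; then
  # Print when this certificate expires.
  openssl x509 -noout -enddate -in "$CERT"
  # -checkend succeeds only if the cert is still valid N seconds from now.
  if openssl x509 -noout -checkend $((30 * 24 * 3600)) -in "$CERT"; then
    echo "certificate valid for at least another 30 days"
  else
    echo "certificate expires within 30 days (or has already expired)"
  fi
fi
```

Running this before and after `kubeadm certs renew all` makes it easy to confirm the renewal actually took effect.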
This will renew all of the above certificates on the node you run it from. You’ll also need to update your admin kubeconfig, since it has a client certificate embedded in it; kubeadm writes a fresh copy to /etc/kubernetes/admin.conf, so you can copy it from there.
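That copy step looks something like this (the paths assume the usual ~/.kube/config setup, and the snippet is guarded so it does nothing on a machine without the renewed admin.conf):

```shell
# kubeadm writes the renewed admin kubeconfig here; guarded so this is a
# no-op on machines without it.
SRC=/etc/kubernetes/admin.conf
if [ -f "$SRC" ]; then
  # Keep a backup of the old kubeconfig, then replace it with the new one.
  cp ~/.kube/config ~/.kube/config.bak 2>/dev/null || true
  sudo cp "$SRC" ~/.kube/config
  # admin.conf is root-owned, so hand the copy back to your own user.
  sudo chown "$(id -u):$(id -g)" ~/.kube/config
fi
```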
You now also need to copy these certificates to all the other master nodes. This was the mistake I made: I had assumed that kubeadm would magically replicate the certificates and I wouldn’t have to, so to my surprise, I was still intermittently getting the expired certificate error. It was only afterwards that I realised the certificates also needed replacing on the other master nodes. Yes, it seems obvious in hindsight, but I’ve learned something new, so it’s a success in my book.
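As a sketch of that step, something like the loop below pushes the renewed files to the remaining masters. The hostnames are hypothetical, and the commands are printed rather than executed so you can review them first; drop the leading `echo` to actually run them. (Alternatively, you can simply run `kubeadm certs renew all` on each master in turn.)

```shell
# Hypothetical control-plane hostnames; replace with your own.
MASTERS="master02 master03"

for node in $MASTERS; do
  # Dry run: print the copy commands for review instead of executing them.
  echo scp -r /etc/kubernetes/pki/ "root@$node:/etc/kubernetes/"
  echo scp /etc/kubernetes/admin.conf "root@$node:/etc/kubernetes/admin.conf"
done
```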
Having copied the certificates from the first master to the others, you’ll notice that none of your kube services are using them yet; they have to be restarted before they’ll pick them up. The following commands will do that for you:
kubectl -n kube-system delete pod -l 'component=kube-apiserver'
kubectl -n kube-system delete pod -l 'component=kube-controller-manager'
kubectl -n kube-system delete pod -l 'component=kube-scheduler'
kubectl -n kube-system delete pod -l 'component=etcd'
Once you’ve done that, your cluster should be back to normal, operating as expected.
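To double-check, you can re-run the expiry report and ask the API server which certificate it is now actually serving. The address below is the in-cluster service IP from the original error message; substitute whatever address your clients use. Both steps are guarded so the snippet is harmless on a machine without kubeadm or without connectivity:

```shell
# Re-check expiry via kubeadm (skipped on machines without kubeadm).
if command -v kubeadm >/dev/null 2>&1; then
  kubeadm certs check-expiration
fi

# Ask the API server for the certificate it is actually presenting.
# 10.96.0.1:443 is the in-cluster address from the original error message.
APISERVER=10.96.0.1:443
echo | timeout 5 openssl s_client -connect "$APISERVER" 2>/dev/null \
  | openssl x509 -noout -dates 2>/dev/null \
  || echo "could not reach $APISERVER"
```

If the `notAfter` date printed here is still in the past, one of the masters is likely still serving the old certificate.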
I hope that this has been helpful. If you have read it and have any questions/queries/comments, please feel free to send me an email to the address in the footer of this page. (Adding a comments section is on my to-do list)
