Problem:
The version of Gravity shipped with every Anaconda Enterprise 5 release prior to 5.4.0 contains a bug that prevents it from automatically renewing the certificates that are generated during the initial installation for cluster components such as etcd and Kubernetes.
The RPC certificates are stored as a Gravity package in the cluster package service. They are generated during installation, before the controller DaemonSet, gravity-site, is installed; gravity-site reads them on startup.
These certificates typically expire one year after the AE5 cluster's initial installation and are not rotated automatically.
You can find the certificates at the following path:
/var/lib/gravity/secrets
You can check the expiration dates using the following command, run from that directory:
openssl x509 -noout -text -in etcd.cert | grep -i GMT
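To check every certificate at once rather than one file at a time, the command above can be wrapped in a small helper. This is a minimal sketch, assuming the certificates all use the .cert extension (as etcd.cert does); the function name and the 30-day warning window are illustrative choices, not part of Gravity.

```shell
# Report the expiry date of every *.cert file in a secrets directory.
# Assumption: certificates use the .cert extension, as etcd.cert does.
check_cert_expiry() {
    local dir="${1:-/var/lib/gravity/secrets}"
    local days="${2:-30}"                  # hypothetical warning window, in days
    local seconds=$(( days * 86400 ))
    local cert status

    for cert in "$dir"/*.cert; do
        [ -e "$cert" ] || continue         # the glob matched nothing
        # -checkend exits 0 only if the cert is still valid $seconds from now;
        # it also exits non-zero when the file cannot be parsed.
        if openssl x509 -noout -checkend "$seconds" -in "$cert" >/dev/null 2>&1; then
            status="ok"
        else
            status="CHECK"
        fi
        # -enddate prints the expiry as "notAfter=<date>"
        printf '%-10s %s %s\n' "$status" \
            "$(openssl x509 -noout -enddate -in "$cert" 2>/dev/null)" "$cert"
    done
}

# Example (may need root to read the files):
#   check_cert_expiry /var/lib/gravity/secrets 30
```

Lines marked CHECK are expired, due to expire within the window, or could not be read as a certificate.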
Symptom:
You may see the following error while adding a node to the cluster:
“[ERROR]: bad username or password”
Solution:
Follow the steps below in order, one node at a time, to renew the certificates manually.
Master node
1. Download the gravity binary of version 5.0.0-alpha.11:
curl https://get.gravitational.io/telekube/bin/5.0.0-alpha.11/linux/x86_64/gravity -o /tmp/gravity
Do not replace the existing binary in /usr/bin with the new one. The command above saves the new binary as /tmp/gravity; make it executable (chmod +x /tmp/gravity) before running it.
2. Back up /var/lib/gravity/secrets.
3. Run the following command using the downloaded gravity to renew certs:
/tmp/gravity system rotate-certs <cluster-name>
It backs up the current secrets to /var/lib/gravity/secrets-XXX and generates new secrets that replace the old ones in /var/lib/gravity.
You can find the cluster name with the following command:
gravity status
4. Restart planet and wait for it to come up before moving on to the next node. The first command below lists the planet unit so you can get its full name; use that name in the second command:
sudo systemctl list-units | grep planet
sudo systemctl restart gravity__gravitational.io__planet-dbtools__<id>.service
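The backup in step 2 can be sketched as a small helper that copies the secrets directory to a timestamped sibling. This is only an illustration: the function name and the .bak-<timestamp> suffix are assumptions, chosen so the copy does not collide with the secrets-XXX backup that rotate-certs creates on its own.

```shell
# Copy the secrets directory to a timestamped sibling before rotating certs.
backup_secrets() {
    local src="${1:-/var/lib/gravity/secrets}"
    local dest
    # Hypothetical naming scheme: <src>.bak-YYYYMMDD-HHMMSS
    dest="${src%/}.bak-$(date +%Y%m%d-%H%M%S)"

    if [ ! -d "$src" ]; then
        echo "error: $src is not a directory" >&2
        return 1
    fi
    # -a preserves ownership, permissions, and timestamps
    cp -a "$src" "$dest" || return 1
    echo "$dest"                           # print where the backup landed
}

# Example (run as root on the node being rotated):
#   backup_secrets /var/lib/gravity/secrets
```

The function prints the backup path so it can be recorded before the rotation command is run.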
Worker nodes
Worker nodes do not have the CA, so it must be obtained first. Run the following command on a master node to export the CA to a file:
gravity system export-ca <cluster-name> /tmp/ca.tar
Then distribute this file to the worker nodes, e.g. to /tmp/ca.tar. After that, the process is similar to the master node:
1. Download the gravity binary of version 5.0.0-alpha.11:
curl https://get.gravitational.io/telekube/bin/5.0.0-alpha.11/linux/x86_64/gravity -o /tmp/gravity
Do not replace the existing binary in /usr/bin with the new one. The command above saves the new binary as /tmp/gravity; make it executable (chmod +x /tmp/gravity) before running it.
2. Back up /var/lib/gravity/secrets.
3. Run the following command using the downloaded gravity to renew certs, explicitly specifying the path to the exported CA:
/tmp/gravity system rotate-certs <cluster-name> --ca-path=/tmp/ca.tar
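4. Restart planet and wait for it to come up before moving on to the next node. The first command below lists the planet unit so you can get its full name; use that name in the second command: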
sudo systemctl list-units | grep planet
sudo systemctl restart gravity__gravitational.io__planet-dbtools__<id>.service
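The restart in step 4 needs the full unit name and then a wait until the unit is up again. A rough sketch of both parts follows; the function names, the 5-second poll interval, and the default timeout are assumptions. An active systemd unit does not by itself prove the cluster is healthy, so it is still worth confirming with gravity status before moving to the next node.

```shell
# Print the first planet unit name found in `systemctl list-units` output
# read from stdin.
find_planet_unit() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^gravity__.*planet.*\.service$/) { print $i; exit }
    }'
}

# Poll until the unit reports active, or give up after a timeout (seconds).
wait_for_unit() {
    local unit="$1"
    local timeout="${2:-300}"              # hypothetical default timeout
    local waited=0
    until systemctl is-active --quiet "$unit"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 5
        waited=$(( waited + 5 ))
    done
}

# Example:
#   unit=$(sudo systemctl list-units | find_planet_unit)
#   sudo systemctl restart "$unit" && wait_for_unit "$unit" 600
```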
After the certificates have been rotated, remove the exported CA file from every node it was copied to, including the master node where it was exported.
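Copying the CA file to the worker nodes and removing it afterwards can be scripted. The sketch below assumes SSH access from the master node to each worker. The host names are placeholders and the helper names are hypothetical. By default it only prints the commands; set RUN=1 to execute them.

```shell
# Path of the exported CA, as used in the steps above.
CA_FILE="/tmp/ca.tar"

# Print the command, or execute it when RUN=1 (dry run by default).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Copy the exported CA to the same path on each worker host.
distribute_ca() {
    local host
    for host in "$@"; do
        run scp "$CA_FILE" "$host:$CA_FILE"
    done
}

# Remove the CA from each worker host, then from the local (master) node.
remove_ca() {
    local host
    for host in "$@"; do
        run ssh "$host" rm -f "$CA_FILE"
    done
    run rm -f "$CA_FILE"
}

# Example (worker1 and worker2 are placeholder host names):
#   distribute_ca worker1 worker2
#   ... rotate certificates on each worker ...
#   RUN=1 remove_ca worker1 worker2
```

Reviewing the dry-run output first makes it easier to confirm that no copy of the CA file is left behind on any node.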