Kubernetes root CA certificate update¶
Storyboard: https://storyboard.openstack.org/#!/story/2008675
This feature introduces CLI/REST APIs and execution orchestration for updating the Kubernetes root CA certificate and the certificates issued by the root CA in a rolling fashion, so that the impact on the system is minimized.
This is the updated version of the approved spec security-2008675-kubernetes-rootca-update.rst. This version reflects the adjustments from implementation.
Problem description¶
In a deployed Kubernetes cluster, the root CA certificate signs all the other serving and client certificates used by various components for various purposes. This root CA certificate may need to be updated for security or administrative reasons while the cluster is still running.
An update mechanism is needed to update the root CA certificate and all the certificates signed by the root CA certificate in a rolling fashion (i.e., with minimal impact on the applications and services running in the cluster).
Currently Kubernetes doesn’t provide such a mechanism out of the box. A manual update procedure [1] is possible but it’s lengthy and error-prone. This feature introduces a set of CLI/REST APIs and execution orchestration to simplify the procedure.
Use Cases¶
The cluster’s root CA certificate approaches its expiry date, and the cloud admin needs to update the root CA certificate so that the cluster continues to function.
The cloud admin decides to replace the root CA certificate with a new one for security reasons.
Proposed change¶
Enhance sysinv to support root CA certificate rolling update¶
A rolling update procedure roughly based on [1] has been investigated. The procedure consists of three phases. The first phase is to update kubernetes components and pods to trust the new root CA certificate along with the old one (trust both). The second phase is to update kubernetes components’ server and client certificates with new ones signed by the new root CA certificate. The third phase is to remove the old root CA certificate from components’ and pods’ trusted CA bundle so that only the new root CA certificate is trusted.
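The effect of the first and third phases on a trusted CA bundle can be illustrated with a minimal sketch. It assumes the bundle is a PEM file holding concatenated certificate blocks; the function names are hypothetical and are not part of sysinv or the puppet manifests.

```python
import re

# Matches one PEM certificate block, including its BEGIN/END lines.
CERT_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_blocks(bundle_pem: str) -> list:
    """Return the certificate blocks found in a PEM bundle, in order."""
    return CERT_BLOCK.findall(bundle_pem)


def trust_both(old_ca_pem: str, new_ca_pem: str) -> str:
    """Phase 1 (illustrative): a bundle that trusts the old and the new CA."""
    return old_ca_pem.strip() + "\n" + new_ca_pem.strip() + "\n"


def trust_new_only(bundle_pem: str, old_ca_pem: str) -> str:
    """Phase 3 (illustrative): drop the old CA block from the bundle."""
    kept = [b for b in split_pem_blocks(bundle_pem) if b != old_ca_pem.strip()]
    return "\n".join(kept) + "\n"
```

Between the two calls (phase 2), the components' certificates are reissued by the new CA; because both CAs are trusted at that point, certificates signed by either one still validate.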
We will wrap this update procedure in sysinv CLI commands and supporting APIs. VIM and DC orchestration of the procedure will follow in the future. The aim is to hide the complexities of the underlying procedure, add semantic checks and provide a simpler, less error-prone procedure overall, analogous to the approach taken for other complex multi-host procedures such as kubernetes upgrade, patching and system upgrades.
The overall feature will have multiple layers. The sysinv REST APIs and CLI are the first layer, providing the fundamental implementation of the certificate update. VIM orchestration is the second layer, executing the update across all hosts in a cluster by utilizing support from sysinv. DC orchestration is the third layer, executing VIM orchestration across all subclouds of a DC system.
There will also be a fourth layer in the future, where cert-manager will manage the kubernetes root CA certificate and key. cert-mon will monitor the certificate and raise an alarm when it needs to be updated, so that the user can schedule the orchestration of the update during a maintenance window.
The initial version of the spec will cover only the first layer, the sysinv support for root CA certificate update. Changes include adding new system CLI commands and sysinv REST APIs to the existing framework, adding logic to sysinv conductor to generate the required puppet hieradata, and adding new puppet runtime manifests to be applied by sysinv agent to make the actual certificate update on hosts.
Sysinv operations for root CA certificate update¶
A new set of sysinv CLI commands will be introduced to simplify the update procedure. The procedure is similar to a software upgrade, with a start, execute and complete cycle. The user can retry a step if it fails. There will also be support for “abort”, where the user can choose to exit an ongoing update. After an abort, the user is expected to restart the update procedure by either uploading or regenerating a root CA certificate and to run the update to completion. This also provides a mechanism to restore the original CA certificate, if the user chooses to upload the original CA certificate.
The following is a summary of the CLI commands and the steps to perform a kubernetes root CA certificate update.
1. system kube-rootca-update-start¶
Pre-check to validate the update, initialize the procedure and mark update progress as update-started.
2. system kube-rootca-certificate-generate¶
Generates a new kubernetes root CA certificate
Change progress state to update-new-rootca-cert-generated
2. system kube-rootca-certificate-upload¶
User can choose to use this command to upload a new kubernetes root CA certificate and private key from a file instead of generating one
Change progress state to update-new-rootca-cert-uploaded
3. system kube-rootca-host-update <hostname> --phase=trust-both-cas¶
Update apiserver’s trusted CAs to include the new CA cert
Update scheduler’s trusted CAs to include the new CA cert
Update controller-manager’s trusted CAs to include the new CA cert
Update kubelet’s trusted CAs to include the new CA cert
Update admin.conf’s trusted CAs to include the new CA cert
Change progress state to updated-host-trust-both-cas on success
Change progress state to updating-host-trust-both-cas-failed on failure
4. system kube-rootca-pods-update --phase=trust-both-cas¶
Annotate Daemonsets and Deployments to trigger pod replacement in a safer rolling fashion, so that pods pick up the new root CA cert as a trusted CA along with the old root CA certificate
Change progress state to updated-pods-trust-both-cas on success
Change progress state to updating-pods-trust-both-cas-failed on failure
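The annotation step can be sketched as a patch body. Changing an annotation on a workload's pod template alters the template, so the Deployment or DaemonSet controller replaces pods according to the workload's update strategy. The annotation key and value format below are hypothetical; this spec does not name the ones sysinv uses.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical annotation key, chosen only for illustration.
ANNOTATION_KEY = "example.starlingx.io/rootca-update-phase"


def build_restart_patch(phase: str, now: Optional[datetime] = None) -> dict:
    """Build a merge-patch body that changes a pod template annotation.

    The value embeds a timestamp so that re-running the same phase still
    changes the template and triggers another rolling replacement.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {ANNOTATION_KEY: f"{phase}@{now.isoformat()}"}
                }
            }
        }
    }
```

Such a body would be applied to each Deployment and DaemonSet through the Kubernetes API; the client call itself is left out of this sketch.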
5. system kube-rootca-host-update <hostname> --phase=update-certs¶
Update admin.conf’s client cert/key data with new ones signed by the new root CA
Update apiserver’s server and client certs/keys with new ones signed by the new root CA
Update scheduler’s client cert/key with new one signed by the new root CA
Update controller-manager’s client cert/key with new one signed by the new root CA
Update kubelet’s client cert/key with new one signed by the new root CA
Change progress state to updated-host-update-certs on success
Change progress state to updating-host-update-certs-failed on failure
6. system kube-rootca-host-update <hostname> --phase=trust-new-ca¶
Update admin.conf’s trusted CAs to remove the old root CA
Update apiserver’s trusted CAs to remove the old root CA
Update controller-manager’s trusted CAs to remove the old root CA
Update scheduler’s trusted CAs to remove the old root CA
Update kubelet’s trusted CAs to remove the old root CA
Change progress state to updated-host-trust-new-ca on success
Change progress state to updating-host-trust-new-ca-failed on failure
7. system kube-rootca-pods-update --phase=trust-new-ca¶
Annotate Daemonsets and Deployments to trigger pod replacement in a safer rolling fashion, to remove the old root CA from the pods’ trusted CA list
Change progress state to updated-pods-trust-new-ca on success
Change progress state to updating-pods-trust-new-ca-failed on failure
8. system kube-rootca-update-complete¶
Post-check to verify the update
Change the progress state to update-complete
9. system kube-rootca-host-update-list¶
Run this command anytime to show the update status of all hosts in the cluster
10. system kube-rootca-update-show¶
Run this command anytime to show the overall update status
11. system kube-rootca-update-abort¶
Run this command to abort the update at any step
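The ordering of the steps and the retry-on-failure behaviour can be summarized in a small lookup sketch. It uses the progress state names listed in the steps above and assumes a single-host system, so that the host state and the overall state advance together; on multi-host systems each host update step is repeated per host. The completion command is written as kube-rootca-update-complete, the name used in the Work Items section. The `Step` type and `next_command` helper are hypothetical and not part of sysinv.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    command: str
    success_state: str
    failed_state: Optional[str]  # None where no failure state is listed


STEPS = [
    Step("system kube-rootca-update-start", "update-started", None),
    Step("system kube-rootca-certificate-generate",
         "update-new-rootca-cert-generated", None),
    Step("system kube-rootca-host-update <hostname> --phase=trust-both-cas",
         "updated-host-trust-both-cas", "updating-host-trust-both-cas-failed"),
    Step("system kube-rootca-pods-update --phase=trust-both-cas",
         "updated-pods-trust-both-cas", "updating-pods-trust-both-cas-failed"),
    Step("system kube-rootca-host-update <hostname> --phase=update-certs",
         "updated-host-update-certs", "updating-host-update-certs-failed"),
    Step("system kube-rootca-host-update <hostname> --phase=trust-new-ca",
         "updated-host-trust-new-ca", "updating-host-trust-new-ca-failed"),
    Step("system kube-rootca-pods-update --phase=trust-new-ca",
         "updated-pods-trust-new-ca", "updating-pods-trust-new-ca-failed"),
    Step("system kube-rootca-update-complete", "update-complete", None),
]

# Uploading a certificate is the alternative to generating one in step 2.
UPLOADED = "update-new-rootca-cert-uploaded"


def next_command(state: str) -> Optional[str]:
    """Return the command to run next, or None once the update is complete.

    A failure state maps back to the command that failed (retry); a success
    state maps to the following step.
    """
    if state == UPLOADED:
        state = "update-new-rootca-cert-generated"
    for i, step in enumerate(STEPS):
        if state == step.failed_state:
            return step.command
        if state == step.success_state:
            return STEPS[i + 1].command if i + 1 < len(STEPS) else None
    raise ValueError(f"unknown state: {state}")
```

The show, list and abort commands are not part of this progression, since they can be run at any step.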
VIM Orchestration Operations¶
Refer to future spec
DC Orchestration Operations¶
Refer to future spec
cert-mon monitoring and alarm raising¶
Refer to future spec
Fault Handling¶
After the update has started, the user can retry a step that fails. At any step before update-complete, the user can choose to upload or regenerate a new root CA certificate and start the update procedure again. This provides a mechanism to recover from a step that fails multiple times, as well as a mechanism to restore the original root CA certificate.
CLI Clients¶
We will extend the existing system clients to add the new commands.
Web GUI¶
If we want to allow the update to be handled entirely through the GUI we’d need to add support in the GUI for all the operations from sysinv.
This will not be implemented in the initial release.
Alternatives¶
kubernetes v1.18.1 supports renewing certificates via the “kubeadm alpha certs renew” command [2]. The certificates that kubeadm can renew include admin.conf, apiserver, apiserver-kubelet-client, controller-manager.conf and scheduler.conf. It doesn’t support renewal of the root CA certificate or the kubelet client certificates.
We could update /etc/kubernetes/pki/ca.crt and /etc/kubernetes/pki/ca.key with a new root CA cert and use kubeadm to update the certificates it supports, but this procedure won’t be a rolling update and will cause a service outage. We would also still have to handle the kubelet client certificates, as they are not managed by kubeadm.
Notably, this alternative would be a lengthy, manual and error-prone procedure.
Data model impact¶
In order to track the progress of the update, the following tables are required in the sysinv database.
kube_rootca_update
created/update/delete_at: as per other tables
id: as per other tables
uuid: as per other tables
from_rootca_cert: character (255), the id of the old root CA cert
to_rootca_cert: character (255), the id of the new root CA cert
state: character (255), the state of the update
kube_rootca_host_update
created/update/delete_at: as per other tables
id: as per other tables
uuid: as per other tables
target_rootca_cert: character (255), the id of the new root CA cert
effective_rootca_cert: character (255), the id of the current root CA cert
state: character (255), the state of the update
host_id: foreign key (i_host.id)
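The two tables can be pictured as the following record types. This is a plain-Python stand-in for illustration only, not the sysinv database model; the class names and the length check are assumptions derived from the column list above, and only `created_at` and `updated_at` are shown of the timestamp columns.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MAX_LEN = 255  # the "character (255)" columns listed above


def _check_len(name: str, value: Optional[str]) -> None:
    """Reject values that would not fit a character (255) column."""
    if value is not None and len(value) > MAX_LEN:
        raise ValueError(f"{name} exceeds {MAX_LEN} characters")


@dataclass
class KubeRootcaUpdate:
    """One row of kube_rootca_update: the overall update."""
    id: int
    uuid: str
    from_rootca_cert: Optional[str] = None  # id of the old root CA cert
    to_rootca_cert: Optional[str] = None    # id of the new root CA cert
    state: Optional[str] = None
    created_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        for name in ("from_rootca_cert", "to_rootca_cert", "state"):
            _check_len(name, getattr(self, name))


@dataclass
class KubeRootcaHostUpdate:
    """One row of kube_rootca_host_update: the update on a single host."""
    id: int
    uuid: str
    host_id: int                                 # foreign key to i_host.id
    target_rootca_cert: Optional[str] = None     # id of the new root CA cert
    effective_rootca_cert: Optional[str] = None  # id of the current root CA cert
    state: Optional[str] = None
    created_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        for name in ("target_rootca_cert", "effective_rootca_cert", "state"):
            _check_len(name, getattr(self, name))
```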
REST API impact¶
New sysinv REST APIs will be added to implement the certificate update logic on top of the existing sysinv API framework. The actual certificate update behind the APIs will be performed by sysinv-agent applying runtime puppet manifests on each host.
The following is the list of REST resources and APIs to be added:
The new resource /kube_rootca_update is added¶
URLS:
/v1/kube_rootca_update
Request Methods:
POST /v1/kube_rootca_update
Creates (starts) a new root CA cert update
Response body example:
{"uuid": "47dff2b6-17ba-45a2-b3d3-8b2a85a5dba9", "to_rootca_cert": null, "created_at": "2021-08-25T14:57:13.006034+00:00", "from_rootca_cert": "d70efa2daaee06f8-91764", "updated_at": null, "state": "update-started", "id": 1}
GET /v1/kube_rootca_update
Return the current root CA update
Response body example:
{"uuid": "47dff2b6-17ba-45a2-b3d3-8b2a85a5dba9", "to_rootca_cert": null, "created_at": "2021-08-25T14:57:13.006034+00:00", "from_rootca_cert": "d70efa2daaee06f8-91764", "updated_at": null, "state": "update-started", "id": 1}
PATCH /v1/kube_rootca_update
Modifies the current root CA update. Used to change the state of the update (e.g. to update-completed or update-aborted).
Request body example:
[{"path": "/state", "value": "update-completed", "op": "replace"}]
[{"path": "/state", "value": "update-aborted", "op": "replace"}]
Response body example:
{"uuid": "fb882423-ea26-42bf-b645-fd9de4248fd4", "to_rootca_cert": "d70efa2daaee06f8-176046114160516196064588947858918572907", "created_at": "2021-08-24T13:40:13.318822+00:00", "from_rootca_cert": "d70efa2daaee06f8-199590289735612744821302170157251522966", "updated_at": "2021-08-24T13:52:21.547899+00:00", "state": "update-completed", "id": 20}
{"uuid": "7d07e384-f06d-4213-8e61-5e300aeb9d1c", "to_rootca_cert": null, "created_at": "2021-08-24T13:38:55.376395+00:00", "from_rootca_cert": "d70efa2daaee06f8-199590289735612744821302170157251522966", "updated_at": "2021-08-24T13:39:47.108582+00:00", "state": "update-aborted", "id": 19}
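The PATCH request body is a list of replace operations on the `/state` path. A minimal sketch of building such a body and of what it does to the resource follows; `state_patch` and `apply_replace_ops` are hypothetical helpers, and the semantic checks that sysinv performs before accepting a state change are not modelled.

```python
import copy


def state_patch(new_state: str) -> list:
    """Build the request body that replaces the update's state."""
    return [{"path": "/state", "value": new_state, "op": "replace"}]


def apply_replace_ops(resource: dict, patch: list) -> dict:
    """Apply top-level 'replace' operations to a copy of the resource."""
    result = copy.deepcopy(resource)
    for op in patch:
        if op["op"] != "replace":
            raise ValueError("only 'replace' is handled in this sketch")
        key = op["path"].lstrip("/")
        if key not in result:
            raise KeyError(key)
        result[key] = op["value"]
    return result
```

For example, `apply_replace_ops(current, state_patch("update-aborted"))` returns a copy of `current` whose `state` is `update-aborted`, leaving `current` unchanged.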
The new resource /kube_rootca_update/upload_cert is added¶
URLS:
/v1/kube_rootca_update/upload_cert
Request Methods:
POST /v1/kube_rootca_update/upload_cert
Upload a root CA cert and key from a file
Request body example (the body is the contents of a file containing both the private key and the certificate):
{"-----BEGIN PRIVATE KEY----- ...... -----END PRIVATE KEY----- ...... -----BEGIN CERTIFICATE----- ...... -----END CERTIFICATE-----}
Return body example:
{"success": "8503e172a63b23e6-12808492498813125379", "error": ""}
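Since the uploaded file carries both the private key and the certificate, a client or server may want to separate the two PEM blocks before using them. The following sketch shows one way to do that, assuming standard PEM armor lines; `split_key_and_cert` is a hypothetical helper and performs no cryptographic validation.

```python
import re

# One PEM block of any type; the END label must match the BEGIN label.
PEM_BLOCK = re.compile(r"-----BEGIN ([A-Z ]+)-----.*?-----END \1-----", re.DOTALL)


def split_key_and_cert(pem_text: str) -> tuple:
    """Return (private_key_pem, certificate_pem) from a combined PEM file."""
    key = cert = None
    for match in PEM_BLOCK.finditer(pem_text):
        label = match.group(1)
        if label.endswith("PRIVATE KEY") and key is None:
            key = match.group(0)
        elif label == "CERTIFICATE" and cert is None:
            cert = match.group(0)
    if key is None or cert is None:
        raise ValueError("file must contain a private key and a certificate")
    return key, cert
```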
The new resource /kube_rootca_update/generate_cert is added¶
URLS:
/v1/kube_rootca_update/generate_cert
Request Methods:
POST /v1/kube_rootca_update/generate_cert
Tell sysinv to generate a new root CA cert and key pair
Request body example:
{"expiry_date": "2022-08-25", "subject": "C=CA O=Company CN=kubernetes"}
Return body example:
{"success": "a8942428863f292b-253592702972967198587817983178843995169", "error": ""}
The existing resource /ihosts is modified to add new actions¶
URLS:
/v1/ihosts/<hostid>
Request Methods:
POST /v1/ihosts/<hostid>/kube_update_ca
Update root CA cert on the specified host
Request body example:
{"phase": "trust-both-cas"}
Response body example:
{"target_rootca_cert": "8503e172a63b23e6-12808492498813125379", "created_at": "2021-08-25T17:13:22.571151+00:00", "hostname": "controller-1", "updated_at": "2021-08-25T17:58:59.809264+00:00", "state": "updating-host-trust-both-cas", "personality": "controller", "id": 8, "effective_rootca_cert": "d70efa2daaee06f8-91764", "uuid": "a597c090-731f-48f8-9f3f-344997c41317"}
The new resource /kube_rootca_update/hosts is added¶
URLs:
/v1/kube_rootca_update/hosts
Request Methods:
GET /v1/kube_rootca_update/hosts
Returns the update details of all hosts
Response body example:
{ "kube_host_updates": [ {"target_rootca_cert": null, "created_at": "2021-08-25T17:13:22.558411+00:00", "hostname": "controller-0", "updated_at": null, "state": null, "personality": "controller", "id": 7, "effective_rootca_cert": "d70efa2daaee06f8-91764", "uuid": "7d7d05dd-900f-4004-951d-d92536faac8e" }, {"target_rootca_cert": "8503e172a63b23e6-12808492498813125379", "created_at": "2021-08-25T17:13:22.571151+00:00", "hostname": "controller-1", "updated_at": "2021-08-25T17:59:16.097029+00:00", "state": "updated-host-trust-both-cas", "personality": "controller", "id": 8, "effective_rootca_cert": "d70efa2daaee06f8-91764", "uuid": "a597c090-731f-48f8-9f3f-344997c41317" }, {"target_rootca_cert": null, "created_at": "2021-08-25T17:13:22.584500+00:00", "hostname": "worker-0", "updated_at": null, "state": null, "personality": "worker", "id": 9, "effective_rootca_cert": "d70efa2daaee06f8-91764", "uuid": "a4ca4eed-9b2f-4b4c-8ee7-45bbc573a55f" } ] }
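A caller that updates hosts one at a time can use this response to decide which hosts still need a given phase. The sketch below assumes the response shape shown in the example above; `hosts_pending` is a hypothetical helper.

```python
import json


def hosts_pending(response_body: str, done_state: str) -> list:
    """Return hostnames whose state is not yet `done_state`."""
    data = json.loads(response_body)
    return [
        host["hostname"]
        for host in data["kube_host_updates"]
        if host["state"] != done_state
    ]
```

With the example response above and `done_state="updated-host-trust-both-cas"`, only controller-1 has reached that state, so the result is controller-0 and worker-0.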
The new resource /kube_rootca_update/pods is added¶
URLs:
/v1/kube_rootca_update/pods
Request Methods:
POST /v1/kube_rootca_update/pods
Update root CA cert for pods
Request body example:
{"phase": "trust-both-cas"}
Response body example:
{"uuid": "6cf4157b-75ff-4e86-bc96-8b08e4c9836d", "to_rootca_cert": "8503e172a63b23e6-12808492498813125379", "created_at": "2021-08-25T17:13:22.535798+00:00", "from_rootca_cert": "d70efa2daaee06f8-91764", "updated_at": "2021-08-25T18:37:02.574836+00:00", "state": "updating-pods-trust-both-cas", "id": 3}
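The host and pod update requests can be sketched with the standard library as follows. The sketch assumes the request body is a JSON object with a single `phase` key; the base URL is a placeholder supplied by the caller, and authentication headers are omitted. The requests are only constructed here, not sent.

```python
import json
from urllib.request import Request

# Phases accepted by each endpoint, as listed in the CLI steps of this spec.
HOST_PHASES = ("trust-both-cas", "update-certs", "trust-new-ca")
POD_PHASES = ("trust-both-cas", "trust-new-ca")


def _post(url: str, body: dict) -> Request:
    """Build (but do not send) a JSON POST request."""
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def host_update_request(base_url: str, host_id: str, phase: str) -> Request:
    """POST /v1/ihosts/<hostid>/kube_update_ca for one host and phase."""
    if phase not in HOST_PHASES:
        raise ValueError(f"unsupported host phase: {phase}")
    return _post(f"{base_url}/v1/ihosts/{host_id}/kube_update_ca", {"phase": phase})


def pods_update_request(base_url: str, phase: str) -> Request:
    """POST /v1/kube_rootca_update/pods for one phase."""
    if phase not in POD_PHASES:
        raise ValueError(f"unsupported pods phase: {phase}")
    return _post(f"{base_url}/v1/kube_rootca_update/pods", {"phase": phase})
```

A caller would pass the returned object to `urllib.request.urlopen` after adding whatever authentication the deployment requires.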
Security impact¶
The new sysinv APIs are added within the existing framework, so there are no changes to the existing security model.
The feature provides a mechanism to update kubernetes certificates. Frequent or routine certificate updates will enhance cluster security.
Other end user impact¶
End users will typically perform kubernetes root CA certificate update using the sysinv (i.e. system) CLI. The new CLI commands are shown in the Proposed change section above.
Performance Impact¶
While a root CA certificate update is in progress, kubernetes components (apiserver, scheduler, controller-manager, kubelet) and application pods will be restarted. Because the update is a rolling update, the system will keep functioning as usual, but there will be a small performance impact during the update. The user should update the hosts sequentially so that the impact is minimized.
Other deployer impact¶
Deployers will now be able to update the root CA certificate on a running system in a rolling fashion.
Developer impact¶
Developers working on the StarlingX components that manage container applications may need to be aware that certain operations should be prevented when a root CA update is in progress, since these components will be restarted during the update.
Developers working on application pods may also need to be aware that certain operations should be prevented when a root CA update is in progress as pods will be restarted during the update.
Generally speaking, there shouldn’t be any deployment or development activities on the system when an update is in progress. A maintenance window is a good time to do the update.
Upgrade impact¶
The newly added root CA update tables in sysinv database need to be created during upgrade from a release without this feature to a release with this feature. The tables will have initial empty default values.
Implementation¶
Assignee(s)¶
Primary assignee:
Andy Ning (andy.wrs)
Other contributors:
Soubihe, Joao Paulo (jsoubihe)
Repos Impacted¶
Impacted repo from this spec:
config
stx-puppet
fault
Work Items¶
Sysinv¶
New DB tables and APIs to access them
kube-rootca-update-start CLI/API
basic infrastructure
semantic and system health checks for update start
raise alarm to prevent upgrade, patching, etc.
kube-rootca-certificate-upload CLI/API
basic infrastructure
semantic checks
root CA issuer creation in cert-manager
calculate the ID of the new root certificate
kube-rootca-certificate-generate CLI/API
basic infrastructure
root CA certificate and issuer creation in cert-manager
calculate the ID of the new root certificate
kube-rootca-host-update <hostname> --phase=trust-both-cas CLI/API
basic infrastructure
semantic checks
conductor RPC/implementation (generate hieradata, call agent to apply puppet manifests, handle apply result, update host state etc…)
agent RPC/implementation (apply puppet manifest, report back config status, etc…)
kube-rootca-pods-update --phase=trust-both-cas CLI/API
basic infrastructure
semantic checks
conductor implementation (generate hieradata, trigger puppet manifests apply, handle apply result, update progress state etc…)
kube-rootca-host-update <hostname> --phase=update-certs CLI/API
basic infrastructure
semantic checks
conductor RPC/implementation (generate certificates and hieradata, call agent to apply puppet manifests, handle apply result, update host state etc…)
agent RPC/implementation (apply puppet manifest, report back config status, etc…)
kube-rootca-host-update <hostname> --phase=trust-new-ca CLI/API
basic infrastructure
semantic checks
conductor RPC/implementation (generate hieradata, call agent to apply puppet manifests, handle apply result, update host state etc…)
agent RPC/implementation (apply puppet manifest, report back config status, etc…)
kube-rootca-pods-update --phase=trust-new-ca CLI/API
basic infrastructure
semantic checks
conductor implementation (generate hieradata, trigger puppet manifests apply, handle apply result, update progress state etc…)
kube-rootca-update-complete CLI/API
basic infrastructure
semantic checks
clear the update in progress alarm
system health checks for update complete
kube-rootca-update-show CLI/API
basic infrastructure
conductor database query
kube-rootca-host-update-list CLI/API
basic infrastructure
conductor database query
kube-rootca-update-abort CLI/API
basic infrastructure
semantic checks
system health checks for update abort
clear ‘kube root CA update in progress’ alarm
raise ‘kube root CA update aborted’ alarm
Puppet¶
runtime manifest for host update trust-both-cas phase
runtime manifest for host update update-certs phase
runtime manifest for host update trust-new-ca phase
runtime manifest for pods update trust-both-cas phase
runtime manifest for pods update trust-new-ca phase
System Upgrade¶
Upgrade script to create the new tables in sysinv database when upgrading from a release without this feature. The tables will have default empty values.
Dependencies¶
None
Testing¶
The feature must be tested in the following StarlingX configurations:
AIO-SX
AIO-DX
Standard with at least one kubernetes worker node
The test can be performed on hardware or virtual environments.
Documentation Impact¶
New end user documentation will be required to describe how kubernetes root CA certificate update should be done. The config API reference will also need updates.
References¶
History¶
Release Name | Description
---|---
stx-6.0 | Introduced