Kubernetes Upgrades

Storyboard: https://storyboard.openstack.org/#!/story/2006781

This story will provide a mechanism to upgrade the kubernetes components on a running StarlingX system. This is required to allow bug fixes to be delivered for kubernetes components and to allow upgrades between kubernetes minor versions.

Problem description

The kubernetes components used by StarlingX are delivered in two ways:

  • RPMs (e.g. kubeadm, kubelet)

  • docker images (e.g. kube-apiserver, kube-controller-manager)

StarlingX must provide a mechanism to allow bug fixes to be delivered for both types of component. In addition, the kubernetes version skew policy [1] mandates that kubernetes minor releases cannot be skipped. Since these minor releases occur approximately every three months and the StarlingX release cycle is approximately six months long, StarlingX must provide a mechanism to support kubernetes minor version upgrades on a running system.

Kubernetes versions are in the format major.minor.patch (e.g. 1.16.2). Bug fixes are delivered with a new patch version and releases are delivered with a new minor version. In either case, since the StarlingX kubernetes components are managed using kubeadm, a specific upgrade procedure must be followed [2]. The kubeadm tool provides commands that help perform these upgrades, but the upgrade also involves other actions such as installing new RPMs, pulling images and restarting processes. These steps must all be performed in a specific sequence.
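
The difference between a patch upgrade and a minor upgrade, and the rule that minor releases cannot be skipped, can be sketched as follows. This is only an illustration and not sysinv code: the helper names `parse_version` and `classify_upgrade` are hypothetical, and treating a major version change as out of scope is an assumption.

```python
def parse_version(version):
    """Split a version such as 'v1.16.2' or '1.16.2' into (major, minor, patch)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def classify_upgrade(from_version, to_version):
    """Return 'patch' or 'minor' for an allowed upgrade, else raise ValueError."""
    f_major, f_minor, f_patch = parse_version(from_version)
    t_major, t_minor, t_patch = parse_version(to_version)
    if t_major != f_major:
        # Assumption: major version changes are not considered here.
        raise ValueError("major version changes are out of scope")
    if t_minor == f_minor and t_patch > f_patch:
        return "patch"
    if t_minor == f_minor + 1:
        return "minor"
    raise ValueError("not an upgrade, or the upgrade skips a minor release")


print(classify_upgrade("v1.16.2", "v1.16.4"))
print(classify_upgrade("v1.16.4", "v1.17.1"))
```

An attempt to go from v1.15.3 directly to v1.17.1 raises `ValueError` in this sketch, because it would skip the 1.16 minor release.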

In order to provide a robust and simple kubernetes upgrade experience for users of StarlingX, the entire process must be automated as much as possible and controls must be in place to ensure the steps are followed in the right order.

Use Cases

  • End User wants to upgrade to a new patch version of kubernetes on a running StarlingX system with minimal impact to running applications.

  • End User wants to upgrade to a new minor version of kubernetes on a running StarlingX system with minimal impact to running applications.

  • End User wants to downgrade to a previous patch version of kubernetes on a running system, because they experienced an issue with the new patch version. Note: downgrading between minor versions is not supported.

Proposed change

StarlingX will only support specific kubernetes versions and upgrade paths. For each supported kubernetes version we will track (via metadata in the sysinv component):

  • version (e.g. v1.16.1)

  • upgrade_from (e.g. v1.15.3)

    Specifies which kubernetes versions can upgrade to this version.

  • downgrade_to (e.g. none)

    Specifies which kubernetes versions this version can downgrade to.

  • applied_patches (e.g. PATCH.10, KUBE_PATCH.1)

    These patches must be applied before the upgrade starts.

  • available_patches (e.g. KUBE_PATCH.2)

    These patches must be available (but not applied) before the upgrade starts.
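
As a rough sketch, the per-version metadata above could be modelled as shown below. The field names come from the list above; the `KubeVersion` class, the helper functions and the way patch states are passed in are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KubeVersion:
    """Metadata tracked for one supported kubernetes version."""
    version: str
    upgrade_from: List[str] = field(default_factory=list)
    downgrade_to: List[str] = field(default_factory=list)
    applied_patches: List[str] = field(default_factory=list)
    available_patches: List[str] = field(default_factory=list)


def can_upgrade(current, target):
    """True if the target version lists the current version in upgrade_from."""
    return current in target.upgrade_from


def missing_patches(target, applied, available):
    """Return (patches still to be applied, patches still to be uploaded)."""
    need_applied = [p for p in target.applied_patches if p not in applied]
    need_available = [p for p in target.available_patches if p not in available]
    return need_applied, need_available


v1_16_1 = KubeVersion(
    version="v1.16.1",
    upgrade_from=["v1.15.3"],
    applied_patches=["PATCH.10", "KUBE_PATCH.1"],
    available_patches=["KUBE_PATCH.2"],
)
print(can_upgrade("v1.15.3", v1_16_1))
print(missing_patches(v1_16_1, applied={"PATCH.10"}, available={"KUBE_PATCH.2"}))
```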

The existing patching mechanism will be used to deliver metadata and software to enable upgrades for specific kubernetes versions. The patches for a new kubernetes version will be structured as follows (using an upgrade from v1.16.4 to v1.17.1 as an example):

  • PATCH.X

    • software patches for sysinv component

    • contains metadata for new kubernetes version - for example:

      • version=v1.17.1

      • upgrade_from=v1.16.4

      • applied_patches=PATCH.Y

      • available_patches=PATCH.Z

  • PATCH.Y

    • patches kubeadm RPM to version 1.17.1

    • patch metadata:

      • requires: PATCH.X

      • pre-apply: running kube-apiservers are >= 1.16.4

      • pre-remove: running kube-apiservers are <= 1.16.4

  • PATCH.Z

    • patches kubelet and kubectl RPMs to version 1.17.1

    • patch metadata:

      • requires: PATCH.Y

      • pre-apply: running kube-apiservers are >= 1.17.1

      • pre-remove: running kube-apiservers are <= 1.17.1
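
The pre-apply and pre-remove conditions in the patch metadata above amount to version comparisons against every running kube-apiserver. A minimal sketch, assuming the running versions have already been collected into a list (the function names are hypothetical and do not describe the actual patch script interface):

```python
def version_tuple(version):
    """Convert 'v1.16.4' or '1.16.4' into a comparable tuple (1, 16, 4)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def pre_apply_ok(running_apiservers, minimum):
    """Allow apply only if every running kube-apiserver is >= minimum."""
    return bool(running_apiservers) and all(
        version_tuple(v) >= version_tuple(minimum) for v in running_apiservers
    )


def pre_remove_ok(running_apiservers, maximum):
    """Allow removal only if every running kube-apiserver is <= maximum."""
    return bool(running_apiservers) and all(
        version_tuple(v) <= version_tuple(maximum) for v in running_apiservers
    )


# PATCH.Z (kubelet/kubectl 1.17.1) while only one controller has been upgraded:
print(pre_apply_ok(["v1.17.1", "v1.16.4"], "1.17.1"))
# PATCH.Y (kubeadm 1.17.1) before any control plane upgrade:
print(pre_apply_ok(["v1.16.4", "v1.16.4"], "1.16.4"))
```

Comparing tuples of integers rather than strings avoids ordering mistakes such as "1.9.0" sorting after "1.16.4".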

The following is a summary of the steps the user will take when performing a kubernetes upgrade (using an upgrade from v1.16.4 to v1.17.1 as an example). For each step, a summary of the actions the system will perform is given.

  1. Upload/apply/install metadata patch

    This is PATCH.X in the example above. The existing “sw-patch” CLIs will be used.

  2. List available kubernetes versions

    # system kube-version-list
    +---------+--------+-----------+
    | Version | Target | State     |
    +---------+--------+-----------+
    | v1.16.4 | True   | active    |
    | v1.17.1 | False  | available |
    +---------+--------+-----------+
    

    This list comes from metadata in the sysinv component (updated by PATCH.X). The fields are:

    • Target: denotes version currently selected for installation

    • States:

      • active: version is running everywhere

      • partial: version is running somewhere

      • available: version that can be upgraded to

      The state must be calculated at runtime by querying the kubernetes component versions running on each node.
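
The state calculation described above can be sketched as a small function. It assumes the component versions running on each node have already been queried and collected into a list. The function name is hypothetical, and treating every known but not running version as "available" is a simplification.

```python
def version_state(version, running_versions):
    """Derive the state of one known version from the versions running on all nodes.

    running_versions: one entry per kubernetes component instance queried at runtime.
    """
    running = set(running_versions)
    if running == {version}:
        return "active"      # running everywhere
    if version in running:
        return "partial"     # running on some components, but not all
    return "available"       # known to the system, not running anywhere


mid_upgrade = ["v1.17.1", "v1.16.4", "v1.16.4", "v1.16.4"]
print(version_state("v1.16.4", mid_upgrade))
print(version_state("v1.17.1", mid_upgrade))
```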

  3. Upload/apply/install kubeadm patch

    This is PATCH.Y in the example above. The existing “sw-patch” CLIs will be used. The patch pre-apply script verifies that all kube-apiservers versions are >= 1.16.4.

  4. Upload (but don’t apply) kubelet/kubectl patch

    This is PATCH.Z in the example above. The existing “sw-patch” CLIs will be used.

  5. Start kubernetes upgrade

    # system kube-upgrade-start v1.17.1
    +-------------------+-------------------+
    | Property          | Value             |
    +-------------------+-------------------+
    | from_version      | v1.16.4           |
    | to_version        | v1.17.1           |
    | state             | upgrade-started   |
    +-------------------+-------------------+
    

    This will do semantic checks for applied/available patches, upgrade path, application support for the new kubernetes version, etc…

    The states will include:

    • upgrade-started: semantic checks passed, upgrade started

    • upgrading-first-master: first master node control plane upgrade in progress

    • upgraded-first-master: first master node control plane upgrade complete

    • upgrading-networking: networking plugin upgrade in progress

    • upgraded-networking: networking plugin upgrade complete

    • upgrading-second-master: second master node control plane upgrade in progress

    • upgraded-second-master: second master node control plane upgrade complete

    • upgrading-kubelets: kubelet upgrades in progress

    • upgrade-complete: all nodes upgraded

    • upgrade-failed: upgrade has failed
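
The states listed above form a sequence that the system must enforce. The sketch below models it as a simple linear state machine. Two assumptions are made for illustration only: the order follows the user steps in this section, and only the in-progress ("upgrading-...") states can move to upgrade-failed. Recovery from upgrade-failed is not modelled here.

```python
UPGRADE_STATES = [
    "upgrade-started",
    "upgrading-first-master",
    "upgraded-first-master",
    "upgrading-networking",
    "upgraded-networking",
    "upgrading-second-master",
    "upgraded-second-master",
    "upgrading-kubelets",
    "upgrade-complete",
]
FAILED = "upgrade-failed"


def next_states(state):
    """Return the states reachable from `state` in this simplified linear model."""
    if state == FAILED or state == UPGRADE_STATES[-1]:
        return []
    idx = UPGRADE_STATES.index(state)
    reachable = [UPGRADE_STATES[idx + 1]]
    if state.startswith("upgrading-"):
        # Assumption: only an in-progress action can fail.
        reachable.append(FAILED)
    return reachable


def check_transition(current, requested):
    """Raise ValueError if the requested state does not follow the current one."""
    if requested not in next_states(current):
        raise ValueError("cannot move from %s to %s" % (current, requested))


check_transition("upgraded-first-master", "upgrading-networking")
print(next_states("upgrading-networking"))
```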

  6. Show kubernetes upgrade status for hosts

    # system kube-host-upgrade-list
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    | id | hostname     | personality | target_version | control_plane_version | kubelet_version | status            |
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    | 1  | controller-0 | controller  | v1.16.4        | v1.16.4               | v1.16.4         |                   |
    | 2  | controller-1 | controller  | v1.16.4        | v1.16.4               | v1.16.4         |                   |
    | 3  | compute-0    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
    | 4  | compute-1    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    
    • The control_plane_version must be calculated at runtime by examining the version of the image associated with each control plane pod on each controller.

    • The kubelet_version must be calculated at runtime based on the kubelet version running on each node (i.e. kubectl get nodes).

    • The status field will indicate the current action in progress (e.g. upgrading-control-plane)

    • The data will be retrieved by the sysinv-api, using the kubernetes REST API.
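
To illustrate how a control_plane_version could be derived from the images of the control plane pods, the sketch below extracts the tag from each image reference and rolls the results up into a single version. The helper names are hypothetical, reporting the lowest version found is an assumed roll-up policy, and image references pinned by digest are not handled.

```python
def image_version(image):
    """Extract the tag from an image reference such as 'k8s.gcr.io/kube-apiserver:v1.16.4'."""
    _name, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        # No colon at all, or the last colon belongs to a registry port.
        raise ValueError("image reference has no tag: %s" % image)
    return tag


def control_plane_version(images):
    """Roll up component image versions into one version: the lowest one found."""
    versions = [image_version(image) for image in images]
    return min(versions, key=lambda v: tuple(int(p) for p in v.lstrip("v").split(".")))


images_on_controller = [
    "k8s.gcr.io/kube-apiserver:v1.17.1",
    "k8s.gcr.io/kube-controller-manager:v1.16.4",
    "k8s.gcr.io/kube-scheduler:v1.17.1",
]
print(control_plane_version(images_on_controller))
```

Reporting the lowest version means a controller is only shown at the new version once all of its control plane components are running it.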

  7. Upgrade control plane on first controller

    # system kube-host-upgrade control-plane controller-0
    +-----------------------+-------------------+
    | Property              | Value             |
    +-----------------------+-------------------+
    | target_version        | v1.17.1           |
    | control_plane_version | v1.16.4           |
    | kubelet_version       | v1.16.4           |
    | status                |                   |
    +-----------------------+-------------------+
    
    • Either controller can be upgraded first

    • The control plane upgrade involves the following steps:

      • upgrade kubernetes control plane components (must be done locally):

        • docker login <repo>

        • kubeadm config images pull --kubernetes-version <version> --image-repository=<repo>

        • docker logout <repo>

        • kubeadm upgrade apply <version>

      • update affinity for coredns pod (done through kubernetes API)

    • The local upgrade actions will be done by applying a runtime puppet manifest on the host.

  8. Show kubernetes upgrade status

    # system kube-upgrade-show
    +-------------------+-------------------------+
    | Property          | Value                   |
    +-------------------+-------------------------+
    | from_version      | v1.16.4                 |
    | to_version        | v1.17.1                 |
    | state             | upgraded-first-master   |
    +-------------------+-------------------------+
    
  9. Upgrade networking

    # system kube-upgrade-networking
    +-------------------+----------------------+
    | Property          | Value                |
    +-------------------+----------------------+
    | from_version      | v1.16.4              |
    | to_version        | v1.17.1              |
    | state             | upgrading-networking |
    +-------------------+----------------------+
    
    • The networking upgrade involves the following steps:

      • upgrade calico/multus/sriov if necessary (done through Ansible and kubernetes API)

    • In the future, StarlingX may support different types of networking (e.g. tungsten). The upgrade networking step would perform the steps required for whatever networking was installed.

  10. Upgrade control plane on second controller

    # system kube-host-upgrade control-plane controller-1
    +-----------------------+-------------------+
    | Property              | Value             |
    +-----------------------+-------------------+
    | target_version        | v1.17.1           |
    | control_plane_version | v1.16.4           |
    | kubelet_version       | v1.16.4           |
    | status                |                   |
    +-----------------------+-------------------+
    
    • The control plane upgrade involves the following steps:

      • upgrade kubernetes control plane components (must be done locally):

        • docker login <repo>

        • kubeadm config images pull --kubernetes-version <version> --image-repository=<repo>

        • docker logout <repo>

        • kubeadm upgrade node

  11. Show kubernetes upgrade status

    # system kube-upgrade-show
    +-------------------+--------------------------+
    | Property          | Value                    |
    +-------------------+--------------------------+
    | from_version      | v1.16.4                  |
    | to_version        | v1.17.1                  |
    | state             | upgraded-second-master   |
    +-------------------+--------------------------+
    
  12. Show kubernetes upgrade status for hosts

    # system kube-host-upgrade-list
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    | id | hostname     | personality | target_version | control_plane_version | kubelet_version | status            |
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    | 1  | controller-0 | controller  | v1.17.1        | v1.17.1               | v1.16.4         |                   |
    | 2  | controller-1 | controller  | v1.17.1        | v1.17.1               | v1.16.4         |                   |
    | 3  | compute-0    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
    | 4  | compute-1    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    
  13. Apply/install kubelet/kubectl patch

    This is PATCH.Z in the example above. The existing “sw-patch” CLIs will be used. This will place the v1.17.1 kubelet binary on each host, but will not restart kubelet.

  14. Upgrade kubelet on each controller

    The first controller will first be locked using the existing “system host-lock” CLI (either controller can be done first). This results in services being migrated off the host and applies the NoExecute taint, which will evict any pods that can be evicted.

    The kubelet is then upgraded:

    # system kube-host-upgrade kubelet controller-<n>
    +-----------------------+-------------------+
    | Property              | Value             |
    +-----------------------+-------------------+
    | target_version        | v1.17.1           |
    | control_plane_version | v1.17.1           |
    | kubelet_version       | v1.16.4           |
    | status                |                   |
    +-----------------------+-------------------+
    
    • The kubelet upgrade involves the following steps:

      • restart kubelet (must be done locally)

    The controller is then unlocked using the existing “system host-unlock” CLI. The kubelet on the second controller is then upgraded in the same way.

  15. Show kubernetes upgrade status

    # system kube-upgrade-show
    +-------------------+--------------------------+
    | Property          | Value                    |
    +-------------------+--------------------------+
    | from_version      | v1.16.4                  |
    | to_version        | v1.17.1                  |
    | state             | upgrading-kubelets       |
    +-------------------+--------------------------+
    
  16. Show kubernetes upgrade status for hosts

    # system kube-host-upgrade-list
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    | id | hostname     | personality | target_version | control_plane_version | kubelet_version | status            |
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    | 1  | controller-0 | controller  | v1.17.1        | v1.17.1               | v1.17.1         |                   |
    | 2  | controller-1 | controller  | v1.17.1        | v1.17.1               | v1.17.1         |                   |
    | 3  | compute-0    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
    | 4  | compute-1    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
    +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
    
  17. Upgrade kubelet on all worker hosts

    Each worker host will first be locked using the existing “system host-lock” CLI (worker hosts can be done in any order). This results in services being migrated off the host and applies the NoExecute taint, which will evict any pods that can be evicted.

    The kubelet is then upgraded:

    # system kube-host-upgrade kubelet worker-<n>
    +-----------------------+-------------------+
    | Property              | Value             |
    +-----------------------+-------------------+
    | target_version        | v1.17.1           |
    | control_plane_version | N/A               |
    | kubelet_version       | v1.16.4           |
    | status                |                   |
    +-----------------------+-------------------+
    
    • The kubelet upgrade involves the following steps (must be done locally):

      • download new pause image if the version has changed

      • kubeadm upgrade node

      • restart kubelet

    The worker is then unlocked using the existing “system host-unlock” CLI. Multiple worker hosts can be upgraded at the same time, as long as there is sufficient capacity remaining on other worker hosts.

  18. Show kubernetes upgrade status

    # system kube-upgrade-show
    +-------------------+--------------------------+
    | Property          | Value                    |
    +-------------------+--------------------------+
    | from_version      | v1.16.4                  |
    | to_version        | v1.17.1                  |
    | state             | upgrade-complete         |
    +-------------------+--------------------------+
    

Failure Handling

  • When a failure happens and cannot be resolved without manual intervention, the upgrade state will be set to upgrade-failed. The “kubeadm upgrade” commands will fall back to the previous configuration if (for example) image downloads fail.

  • To recover, the user will resolve the issue that caused the failure and then re-attempt the upgrade (this will require a “system kube-upgrade-resume” command). Based on the kubernetes versions running on each host, the system will reset the upgrade state to the right point and the upgrade will resume.
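
A minimal sketch of how the resume logic might choose the state to restart from, based only on the versions observed on each host. The function name and its inputs are hypothetical. Since the networking upgrade cannot be inferred from host versions alone, this simplified model falls back to upgraded-first-master whenever only some controllers have been upgraded.

```python
def resume_state(to_version, control_plane_versions, kubelet_versions):
    """Pick the upgrade state to resume from, given versions observed on the hosts.

    control_plane_versions: one version per controller.
    kubelet_versions: one version per host (controllers and workers).
    """
    upgraded_masters = sum(1 for v in control_plane_versions if v == to_version)
    if upgraded_masters == 0:
        return "upgrade-started"
    if upgraded_masters < len(control_plane_versions):
        return "upgraded-first-master"
    if all(v == to_version for v in kubelet_versions):
        return "upgrade-complete"
    if any(v == to_version for v in kubelet_versions):
        return "upgrading-kubelets"
    return "upgraded-second-master"


print(resume_state(
    "v1.17.1",
    control_plane_versions=["v1.17.1", "v1.17.1"],
    kubelet_versions=["v1.17.1", "v1.16.4", "v1.16.4", "v1.16.4"],
))
```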

Health Checks

  • To ensure the health and stability of the system, we will likely perform health checks both before allowing a kubernetes upgrade to start and as each upgrade CLI is run.

  • The health checks will include:

    • basic system health (i.e. system health-query)

    • new kubernetes specific checks - for example:

      • verify that all kubernetes control plane pods are running

      • verify that all kubernetes applications are fully applied
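
The two kubernetes specific checks above could look roughly like the following sketch. The input shapes (dicts with `name`/`phase` and `name`/`status` keys) and the function name are assumptions made for illustration; a real implementation would obtain this data from the kubernetes API and the sysinv database.

```python
def kube_health_problems(control_plane_pods, applications):
    """Return a list of human-readable problems; an empty list means healthy.

    control_plane_pods: dicts with 'name' and 'phase' keys (assumed shape).
    applications: dicts with 'name' and 'status' keys (assumed shape).
    """
    problems = []
    for pod in control_plane_pods:
        if pod["phase"] != "Running":
            problems.append("pod %s is %s" % (pod["name"], pod["phase"]))
    for app in applications:
        if app["status"] != "applied":
            problems.append("application %s is %s" % (app["name"], app["status"]))
    return problems


pods = [
    {"name": "kube-apiserver-controller-0", "phase": "Running"},
    {"name": "kube-scheduler-controller-1", "phase": "Pending"},
]
apps = [{"name": "platform-integ-apps", "status": "applied"}]
for problem in kube_health_problems(pods, apps):
    print(problem)
```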

Interactions with container applications

  • Before starting an upgrade, we also need to check that all installed applications are compatible with the new kubernetes version. Ideally this checking should be done by invoking a plugin provided by each application.

  • When a kubernetes upgrade is in progress, we will prevent container application operations (e.g. system application-apply/remove/update). This will be done by introducing semantic checks in these APIs.

  • When a kubernetes upgrade is in progress, we will prevent helm-override operations (e.g. system helm-override-update/delete). These operations can trigger the applications to be re-applied, which we wouldn’t want to do during a kubernetes upgrade. This will be done by introducing semantic checks in these APIs.
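
The semantic checks described above could take a form similar to this sketch. The exception class and function are hypothetical, and the assumption that any existing kube_upgrade record means an upgrade is in progress is a simplification.

```python
class KubeUpgradeInProgress(Exception):
    """Raised when an operation is rejected because a kubernetes upgrade is active."""


def check_no_kube_upgrade(kube_upgrade, operation):
    """Reject `operation` while a kube_upgrade record exists.

    kube_upgrade: the current upgrade record as a dict, or None if there is none.
    """
    if kube_upgrade is not None:
        raise KubeUpgradeInProgress(
            "%s rejected: kubernetes upgrade to %s is in progress (state: %s)"
            % (operation, kube_upgrade["to_version"], kube_upgrade["state"])
        )


# No upgrade record: the operation is allowed to proceed.
check_no_kube_upgrade(None, "application-apply")

try:
    check_no_kube_upgrade(
        {"to_version": "v1.17.1", "state": "upgrading-networking"},
        "helm-override-update",
    )
except KubeUpgradeInProgress as exc:
    print(exc)
```

Calling the same helper at the start of each affected API keeps the rejection message consistent across application and helm-override operations.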

Alternatives

Given that StarlingX is using kubeadm to install and manage kubernetes, this tool is the only reasonable choice for upgrading kubernetes.

The alternative to the approach described above would be to have the user do the kubernetes upgrades by running the docker and kubeadm commands directly. This approach would be very complex and error-prone and would not be acceptable to users of StarlingX.

Data model impact

The following new tables in the sysinv DB will be required:

  • kube_host_upgrade:

    • created/updated/deleted_at: as per other tables

    • id: as per other tables

    • uuid: as per other tables

    • forhostid: foreign key (i_host.id)

    • target_version: character (255)

    • status: character (255)

  • kube_upgrade:

    • created/updated/deleted_at: as per other tables

    • id: as per other tables

    • uuid: as per other tables

    • from_version: character (255)

    • to_version: character (255)

    • state: character (255)
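
For illustration, the two tables can be prototyped with the Python standard library `sqlite3` module as shown below. This is only a sketch of the columns listed above and not the actual sysinv migration: the column types for the "as per other tables" fields and the stub `i_host` table are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE i_host (id INTEGER PRIMARY KEY, hostname TEXT);  -- stub for the foreign key
CREATE TABLE kube_host_upgrade (
    created_at TEXT, updated_at TEXT, deleted_at TEXT,
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36) UNIQUE,
    forhostid INTEGER REFERENCES i_host(id),
    target_version VARCHAR(255),
    status VARCHAR(255)
);
CREATE TABLE kube_upgrade (
    created_at TEXT, updated_at TEXT, deleted_at TEXT,
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36) UNIQUE,
    from_version VARCHAR(255),
    to_version VARCHAR(255),
    state VARCHAR(255)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO kube_upgrade (uuid, from_version, to_version, state) VALUES (?, ?, ?, ?)",
    ("223ba65e-45d1-4383-baa7-f03bb4c46773", "v1.16.4", "v1.17.1", "upgrade-started"),
)
row = conn.execute("SELECT from_version, to_version, state FROM kube_upgrade").fetchone()
print(row)
```

Note that the per-host control_plane_version and kubelet_version are not stored; as described earlier, they are calculated at runtime.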

REST API impact

This impacts the sysinv REST API:

  • The new resource /kube_versions is added.

    • URLs:

      • /v1/kube_versions

    • Request Methods:

      • GET /v1/kube_versions

        • Returns all kube_versions known to the system

        • Response body example:

          {"kube_versions": [{"state": "active",
                              "version": "v1.16.4",
                              "target": true}]}
          
      • GET /v1/kube_versions/{version}

        • Returns details of specified kube_version

        • Response body example:

          {"target": true,
           "upgrade_from": ["v1.16.4"],
           "downgrade_to": [],
           "applied_patches": ["PATCH.Y"],
           "state": "active",
           "version": "v1.17.1",
           "available_patches": ["PATCH.Z"]}
          
  • The new resource /kube_upgrade is added.

    • URLs:

      • /v1/kube_upgrade

    • Request Methods:

      • POST /v1/kube_upgrade

        • Creates (starts) a new kube_upgrade

        • Request body example:

          {"to_version": "v1.17.1"}
          
        • Response body example:

          {"from_version": "v1.16.4",
           "to_version": "v1.17.1",
           "state": "upgrade-started",
           "uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
           "created_at": "2019-10-25T12:04:10.372399+00:00",
           "updated_at": "2019-10-25T12:04:10.372399+00:00"}
          
      • GET /v1/kube_upgrade

        • Returns the current kube_upgrade

        • Response body example:

          {"from_version": "v1.16.4",
           "to_version": "v1.17.1",
           "state": "upgrade-started",
           "uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
           "created_at": "2019-10-25T12:04:10.372399+00:00",
           "updated_at": "2019-10-25T14:45:43.252964+00:00"}
          
      • PATCH /v1/kube_upgrade

        • Modifies the current kube_upgrade. Used to update the state of the upgrade (e.g. to upgrade networking).

        • Request body example:

          {"state": "upgrading-networking"}
          
        • Response body example:

          {"from_version": "v1.16.4",
           "to_version": "v1.17.1",
           "state": "upgrading-networking",
           "uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
           "created_at": "2019-10-25T12:04:10.372399+00:00",
           "updated_at": "2019-10-25T14:45:43.252964+00:00"}
          
      • DELETE /v1/kube_upgrade

        • Deletes the current kube_upgrade (after it is completed)

  • The existing resource /ihosts is modified to add new actions.

    • URLs:

      • /v1/ihosts/<hostid>

    • Request Methods:

      • POST /v1/ihosts/<hostid>/kube_upgrade_control_plane

        • Upgrades the control plane on the specified host

        • Response body example:

          {"id": "4",
           "hostname": "controller-1",
           "personality": "controller",
           "target_version": "v1.17.1",
           "control_plane_version": "v1.16.4",
           "kubelet_version": "v1.16.4",
           "status": ""}
          
      • POST /v1/ihosts/<hostid>/kube_upgrade_kubelet

        • Upgrades the kubelet on the specified host

        • Response body example:

          {"id": "4",
           "hostname": "controller-1",
           "personality": "controller",
           "target_version": "v1.17.1",
           "control_plane_version": "v1.17.1",
           "kubelet_version": "v1.16.4",
           "status": ""}
          

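The requests described above can be sketched with the Python standard library as shown below. The base URL, the token placeholder and the host id are hypothetical, and the use of an `X-Auth-Token` header assumes token-based authentication. The sketch only builds the requests and does not send them.

```python
import json
import urllib.request

BASE_URL = "http://controller:6385/v1"  # hypothetical sysinv API endpoint


def build_request(method, path, body=None, token="<token>"):
    """Build (but do not send) a request for the sysinv REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"X-Auth-Token": token, "Content-Type": "application/json"}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# Start an upgrade, move it to the networking step, then upgrade one host.
start = build_request("POST", "/kube_upgrade", {"to_version": "v1.17.1"})
networking = build_request("PATCH", "/kube_upgrade", {"state": "upgrading-networking"})
control_plane = build_request("POST", "/ihosts/2/kube_upgrade_control_plane")
show = build_request("GET", "/kube_upgrade")

for req in (start, networking, control_plane, show):
    print(req.get_method(), req.full_url)

# To actually send a request against a live system:
#     with urllib.request.urlopen(start) as response:
#         print(json.load(response))
```
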
Security impact

This story is providing a mechanism to upgrade kubernetes from one version to another. It does not introduce any additional security impacts above what is already there regarding the initial deployment of kubernetes.

Other end user impact

End users will typically perform kubernetes upgrades using the sysinv (i.e. system) CLI. The new CLI commands are shown in the Proposed change section above.

Performance Impact

When a kubernetes upgrade is in progress, each host must be taken out of service in order to upgrade the kubelet. This is necessary because running containers can be adversely impacted by the restart of the kubelet. The user must ensure that there is enough capacity in the system to handle the removal from service of one (or more) hosts as the kubelet on each host is upgraded.

Other deployer impact

Deployers will now be able to upgrade kubernetes on a running system.

Developer impact

Developers working on the StarlingX components that manage container applications may need to be aware that certain operations should be prevented when a kubernetes upgrade is in progress. This is discussed in the Proposed change section above.

Upgrade impact

Kubernetes upgrades are independent of StarlingX platform upgrades. However, once StarlingX platform upgrades are supported, checks must be put in place to ensure that a platform upgrade does not change the kubernetes version. In effect, before a platform upgrade starts, the system must already be running the same version of kubernetes that is packaged in the new platform release. This will be enforced through semantic checking in the platform upgrade APIs.

Implementation

Assignee(s)

Primary assignee:

  • Bart Wensley (bartwensley)

Other contributors:

  • Al Bailey (albailey)

  • Don Penney (dpenney)

  • Kevin Smith (kevin.smith.wrs)

Repos Impacted

  • config

  • integ

  • update

Work Items

Sysinv:

  • Define new metadata for kubernetes versions

  • DB API for new tables

  • kube-version-list/show CLI/API

    • basic infrastructure

    • calculate state for each known version

  • kube-upgrade-start/show CLI/API

    • basic infrastructure

    • semantic checks for upgrade start

      • applied/available patches

      • installed applications support new kubernetes version

      • tiller/armada images support new kubernetes version

  • kube-host-upgrade-list/show CLI/API

    • basic infrastructure

    • calculate versions for each host

  • kube-host-upgrade control-plane CLI/API

    • basic infrastructure

    • semantic checks

    • conductor RPC/implementation (trigger puppet manifest apply, wait for completion, update coredns affinity, etc…)

  • kube-upgrade-networking CLI/API

    • basic infrastructure

    • semantic checks

    • conductor RPC/implementation (trigger playbook apply, wait for completion, etc…)

  • kube-host-upgrade kubelet CLI/API

    • basic infrastructure

    • semantic checks

    • conductor RPC/implementation (trigger puppet manifest apply, wait for completion, etc…)

  • kube-upgrade-resume CLI/API

    • basic infrastructure

    • semantic checks

    • conductor RPC/implementation (determine what state the upgrade should be in, etc…)

  • New KubeOperator functions, including:

    • retrieve versions of each control plane component

    • retrieve versions of each kubelet

    • utility to roll up versions into overall kubernetes version

    • update affinity (for coredns pod)

  • Kubernetes specific health checks

    • add to existing health-query CLI

    • verify all control plane pods are running/healthy

    • verify that all applications are fully applied

    • figure out what else we should check

  • Add semantic checks to existing APIs

    • application-apply/remove/etc… - prevent when kubernetes upgrade in progress

    • helm-override-update/etc… - prevent when kubernetes upgrade in progress

Ansible:

  • enhance upgrade networking playbook to support applying different manifests based on what kubernetes version is running

Puppet:

  • kubernetes runtime manifest for control plane upgrade

  • kubernetes runtime manifest for kubelet upgrade

Patching:

  • Pre-apply/remove scripts to check running kubernetes version

Dependencies

None

Testing

Kubernetes upgrades must be tested in the following StarlingX configurations:

  • AIO-SX

  • AIO-DX

  • Standard with controller storage

  • Standard with dedicated storage

  • Distributed cloud

The testing can be performed on hardware or virtual environments.

Documentation Impact

New end user documentation will be required to describe how kubernetes upgrades should be done. The config API reference will also need updates.

History

Revisions

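+--------------+-------------+
| Release Name | Description |
+--------------+-------------+
| stx-4.0      | Introduced  |
+--------------+-------------+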