Install Kubernetes Platform on All-in-one Duplex

Overview

The All-in-one Duplex (AIO-DX) deployment option provides a pair of high availability (HA) servers with each server providing all three cloud functions (controller, worker, and storage).

An AIO-DX configuration provides the following benefits:

  • Only a small amount of cloud processing and storage power is required

  • Application consolidation using multiple containers or virtual machines on a single pair of physical servers

  • High availability (HA) services run on the controller function across two physical servers in either active/active or active/standby mode

  • A storage back end solution using a two-node Ceph deployment across two servers

  • Containers or virtual machines scheduled on both worker functions

  • Protection against an overall server hardware fault, where:

    • All controller HA services go active on the remaining healthy server

    • All containers and/or virtual machines are recovered on the remaining healthy server

Note

If you are behind a corporate firewall or proxy, you need to set proxy settings. Refer to Docker Proxy Configuration for details.

Figure 1: All-in-one Duplex deployment configuration

Note

By default, StarlingX uses IPv4. To use StarlingX with IPv6:

  • The entire infrastructure and cluster configuration must be IPv6, with the exception of the PXE boot network.

  • Not all external servers are reachable via IPv6 addresses (for example Docker registries). Depending on your infrastructure, it may be necessary to deploy a NAT64/DNS64 gateway to translate the IPv4 addresses to IPv6.

For information on getting started quickly using an automated virtual installation, see Automated Virtual Installation.

Minimum hardware requirements

This section describes the hardware requirements and server preparation for a StarlingX r9.0 bare metal Duplex deployment configuration.

The recommended minimum hardware requirements for bare metal servers for various host types are:

Minimum Requirements

Number of servers

  • All-in-one Controller Node: 2

  • Worker Node: 2-99

Minimum processor class

  • All-in-one Controller Node:

    • Dual-CPU Intel® Xeon® E5 26xx family (SandyBridge) 8 cores/socket

    or

    • Single-CPU Intel® Xeon® D-15xx family, 8 cores (low-power/low-cost option)

    Note

    • Platform: 1 physical core with HT enabled or 2 physical cores with HT disabled (by default, configurable)

      The use of a single physical core for the platform function is only suitable for Intel® 4th Generation Xeon® Scalable Processors or above and should not be configured for previous Intel® Xeon® CPU families. For All-In-One systems with older generation processors, two physical cores (or more) must be configured.

    • Application: Remaining cores

  • Worker Node: Same as controller node

Minimum memory

  • All-in-one Controller Node: 64 GB

    • Platform:

      • Socket 0: 10 GB (by default, configurable)

      • Socket 1: 1 GB (by default, configurable)

    • Application:

      • Socket 0: Remaining memory

      • Socket 1: Remaining memory

  • Worker Node: 32 GB

Primary disk

  • All-in-one Controller Node: 500 GB SSD or NVMe (see NVME Configuration)

  • Worker Node: 120 GB (Minimum 10k RPM)

Additional disks

  • All-in-one Controller Node:

    • 1 or more 500 GB (min. 10K RPM) for Ceph OSD

    • Recommended, but not required: 1 or more SSDs or NVMe drives for Ceph journals (min. 1024 MiB per OSD journal)

    • Recommended, but not required: 1 or more 500 GB HDDs (min. 10K RPM), SSDs or NVMe drives for Container ephemeral disk storage.

    • For StarlingX OpenStack, we recommend 1 or more 500 GB (min. 10K RPM) for VM local ephemeral storage

  • Worker Node: For StarlingX OpenStack, we recommend 1 or more 500 GB (min. 10K RPM) for VM local ephemeral storage

Minimum network ports

  • All-in-one Controller Node:

    • MGMT: 1x1GE (Recommended: MGMT 2x10GE LAG)

    • OAM: 1x1GE (Recommended: OAM 2x1GE LAG)

    • Data: 1 or more x 10GE (Recommended: Data 2x10GE LAG)

  • Worker Node:

    • Mgmt/Cluster: 1x10GE (Recommended: MGMT 2x10GE LAG)

    • Data: 1 or more x 10GE (Recommended: Data 2x10GE LAG)

USB

1 (Only required if used for initial installation of controller-0).

Board Management

1 BMC

Power profile

C-states (up to C6) may be configured for some use cases where application workloads can tolerate additional scheduling/timer latency.

Note

C-states may be enabled depending on application requirements.

Boot order

  • All-in-one Controller Node: HD, PXE, USB

  • Worker Node: HD, PXE

BIOS mode

  • All-in-one Controller Node: BIOS or UEFI

    Note

    UEFI Secure Boot and UEFI PXE boot over IPv6 are not supported. On systems with an IPv6 management network, you can use a separate IPv4 network for PXE boot. For more information, see PXE Boot Controller-0.

  • Worker Node: BIOS or UEFI

BIOS settings

  • All-in-one Controller Node:

    • Hyper-Threading technology enabled

    • Virtualization technology enabled

    • VT for directed I/O enabled

    • CPU power and performance policy set to performance

    • CPU C state control disabled

    • Plug & play BMC detection disabled

    Note

    The system will not override the recommended BIOS settings.

  • Worker Node: Same as controller node
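
The memory minimums listed above (64 GB for an All-in-one controller node, 32 GB for a worker node) can be checked with a small helper before installation. The following is a minimal sketch, not part of StarlingX: the function name and the host type labels aio-controller and worker are hypothetical, and the installed memory is passed in by hand as a whole number of GB.

```shell
# Check installed memory (in GB) against the minimum for a host type.
# The minimums come from the requirements above; the host type labels
# "aio-controller" and "worker" are hypothetical names for this sketch.
meets_memory_minimum() {
    case $1 in
        aio-controller) min_gb=64 ;;
        worker) min_gb=32 ;;
        *) echo "unknown host type: $1" >&2; return 2 ;;
    esac
    [ "$2" -ge "$min_gb" ]
}

meets_memory_minimum aio-controller 128 && echo "controller memory OK"
meets_memory_minimum worker 16 || echo "worker memory below minimum"
```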

Installation Prerequisites

Several prerequisites must be completed prior to starting the StarlingX installation.

Before attempting to install StarlingX, ensure that you have the following:

  • The StarlingX host installer ISO image file.

  • The update-iso.sh script.

  • Optionally, if required, update the ISO image to modify installation boot parameters, automatically select boot menu options and/or add a kickstart file to automatically perform configurations such as configuring the initial IP Interface for bootstrapping.

    Use the update-iso.sh script from a StarlingX mirror. The script syntax and options are:

    update-iso.sh --initial-password <password> -i <input bootimage.iso> -o <output bootimage.iso>
        [ -a <ks-addon.cfg> ] [ -p param=value ]
        [ -d <default menu option> ] [ -t <menu timeout> ]
        -i <file>: Specify input ISO file
        -o <file>: Specify output ISO file
        -a <file>: Specify ks-addon.cfg file
        --initial-password <password>: Specify the initial login password for sysadmin user
        -p <p=v>:  Specify boot parameter
    
        Example:
            -p instdev=/dev/disk/by-path/pci-0000:00:0d.0-ata-1.0
    
        -d <default menu option>:
            Specify default boot menu option:
            0 - Standard Controller, Serial Console
            1 - Standard Controller, Graphical Console
            2 - AIO, Serial Console
            3 - AIO, Graphical Console
            4 - AIO Low-latency, Serial Console
            5 - AIO Low-latency, Graphical Console
            NULL - Clear default selection
        -t <menu timeout>:
            Specify boot menu timeout, in seconds
    

    The following example ks-addon.cfg file, used with the -a option, sets up an initial IP interface at boot time by defining a VLAN on an Ethernet interface with statically assigned VLAN addresses:

    #### start ks-addon.cfg
    RAW_DEV=enp24s0f0
    OAM_VLAN=103
    MGMT_VLAN=163
    
    cat << EOF > ${IMAGE_ROOTFS}/etc/network/interfaces.d/auto
    auto ${RAW_DEV} lo vlan${OAM_VLAN} vlan${MGMT_VLAN}
    
    EOF
    
    cat << EOF > ${IMAGE_ROOTFS}/etc/network/interfaces.d/ifcfg-${RAW_DEV}
    iface ${RAW_DEV} inet manual
    mtu 9000
    post-up echo 0 > /proc/sys/net/ipv6/conf/${RAW_DEV}/autoconf;\
    echo 0 > /proc/sys/net/ipv6/conf/${RAW_DEV}/accept_ra;\
    echo 0 > /proc/sys/net/ipv6/conf/${RAW_DEV}/accept_redirects
    EOF
    
    cat << EOF > ${IMAGE_ROOTFS}/etc/network/interfaces.d/ifcfg-vlan${OAM_VLAN}
    iface vlan${OAM_VLAN} inet6 static
    vlan-raw-device ${RAW_DEV}
    address <__address__>
    netmask 64
    gateway <__address__>
    mtu 1500
    post-up /usr/sbin/ip link set dev vlan${OAM_VLAN} mtu 1500;\
    echo 0 > /proc/sys/net/ipv6/conf/vlan${OAM_VLAN}/autoconf;\
    echo 0 > /proc/sys/net/ipv6/conf/vlan${OAM_VLAN}/accept_ra;\
    echo 0 > /proc/sys/net/ipv6/conf/vlan${OAM_VLAN}/accept_redirects
    pre-up /sbin/modprobe -q 8021q
    EOF
    
    cat << EOF > ${IMAGE_ROOTFS}/etc/network/interfaces.d/ifcfg-vlan${MGMT_VLAN}
    iface vlan${MGMT_VLAN} inet6 static
    vlan-raw-device ${RAW_DEV}
    address <__address__>
    netmask 64
    mtu 1500
    post-up /usr/local/bin/tc_setup.sh vlan${MGMT_VLAN} mgmt 10000 > /dev/null;\
    /usr/sbin/ip link set dev vlan${MGMT_VLAN} mtu 1500;\
    echo 0 > /proc/sys/net/ipv6/conf/vlan${MGMT_VLAN}/autoconf;\
    echo 0 > /proc/sys/net/ipv6/conf/vlan${MGMT_VLAN}/accept_ra;\
    echo 0 > /proc/sys/net/ipv6/conf/vlan${MGMT_VLAN}/accept_redirects
    pre-up /sbin/modprobe -q 8021q
    EOF
    
    #### end ks-addon.cfg
    

    After updating the ISO image, create a bootable USB with the ISO or put the ISO on a PXEBOOT server. See the next bullet for details.

  • A mechanism for boot installation of the StarlingX host installer ISO downloaded from a StarlingX mirror. This can be either:

    • a bootable USB drive with the StarlingX host installer ISO.

      Refer to Create Bootable USB for instructions on how to create a bootable USB with the StarlingX ISO on your system.

    • the ISO image on a PXE boot server on the same network as the server that will be used as the initial controller-0. See Appendix PXE Boot Controller-0 for details.

  • For all controller or AIO controller servers, OAM Network connectivity to:

    • the BMC ports of all nodes

    • An external DNS Server. This is required for accessing StarlingX Docker Registry as discussed below.

    • One or more Docker registries, accessible via the OAM Network, containing the Docker images for the StarlingX load.

    You can use one of the following options:

    • The public open source registries (i.e. docker.io, k8s.gcr.io, ghcr.io, gcr.io, quay.io). This is the default option.

    • A private Docker Registry populated with the docker images from the public open source registries.

  • A record of the IP addresses allocated for the public interfaces for your deployment (that is, IP addresses for the OAM Network and SR-IOV Data Networks).
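
The update-iso.sh options described above can be combined into a single invocation. The following is a minimal sketch that only assembles and prints such a command so that it can be reviewed first; the helper name, the ISO file names, and the password placeholder are assumptions made for illustration, and only options listed in the syntax above are used.

```shell
# Print (rather than run) an update-iso.sh command built from the options above.
# The ISO file names and the password are hypothetical placeholders.
build_update_iso_cmd() {
    input_iso=$1
    output_iso=$2
    menu_option=$3
    menu_timeout=$4
    printf 'update-iso.sh --initial-password %s -i %s -o %s -a %s -d %s -t %s\n' \
        "'<password>'" "$input_iso" "$output_iso" ks-addon.cfg \
        "$menu_option" "$menu_timeout"
}

# Menu option 2 is "AIO, Serial Console"; the boot menu waits 10 seconds.
build_update_iso_cmd bootimage.iso bootimage-custom.iso 2 10
```

Printing the command keeps the sketch safe to run anywhere; on a host where update-iso.sh is available, the printed line can be copied and executed after replacing the placeholders.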

Prepare Servers for Installation

Preparing servers is the first step of the StarlingX installation procedure.

Prior to starting the StarlingX installation, ensure that the bare metal servers are in the following state:

  • Physically installed.

  • Cabled for power.

  • Cabled for networking.

    Far-end switch ports should be properly configured to realize the networking shown in the diagram earlier in this topic.

  • All disks are wiped.

    This ensures that servers will boot from either the network or USB storage, if present.

    Note

    The disks and disk partitions need to be wiped before the install. Installing a Debian ISO may fail with a message that the system is in emergency mode if the disks and disk partitions are not completely wiped before the install, especially if the server was previously running a CentOS ISO.

  • BIOS configured with Intel Virtualization (VTD, VTX)

    • Disabled for controller-only servers and storage servers.

    • Enabled for controller+worker (All-in-one) servers and worker servers.

  • The servers are powered off.

Install Software on Controller-0

Note

The disks and disk partitions need to be wiped before the install. Installing a Debian ISO may fail with a message that the system is in emergency mode if the disks and disk partitions are not completely wiped before the install, especially if the server was previously running a CentOS ISO.

  1. Insert the bootable USB into a bootable USB port on the host you are configuring as controller-0.

    Note

    Refer to Create Bootable USB for instructions on how to create a bootable USB with the StarlingX ISO.

    Note

    Alternatively, you can PXE boot controller-0. See PXE Boot Controller-0 for details on how to set up a PXE boot server and PXE boot the StarlingX load on controller-0.

  2. Power on the host.

  3. Attach to a console, ensure the host boots from the USB, and wait for the StarlingX Installer Menus.

  4. Wait for the Install menus, and when prompted, make the following menu selections in the installer:

    Note

    If you configured the default menu options into the ISO with the update-iso.sh script (using the -d option) in Installation Prerequisites, then the Install menu will not appear.

    1. Select the appropriate deployment option for your scenario.

      For All-in-one deployments, choose one of the All-in-One Configurations, either standard kernel or real-time/low-latency kernel.

      Standard Controller Configuration

      For a standard configuration with controller or dedicated storage.

      All-in-one Controller Configuration

      For an AIO Simplex or Duplex configuration.

      All-in-one Controller Configuration (Low Latency)

      For an AIO Simplex or Duplex configuration with Low Latency Kernel.

    2. Choose Graphical Console or Serial Console depending on your terminal access to the console port.

    Wait for the non-interactive software installation to complete and for the server to reboot. This can take 5-10 minutes, depending on the performance of the server.

    Warning

    When using the low latency kernel, you must use the serial console instead of the graphical console, because the graphical console causes RT performance issues.

Bootstrap system on controller-0

  1. Login using the username / password of “sysadmin” / “sysadmin”. When logging in for the first time, you will be forced to change the password.

    Login: sysadmin
    Password:
    Changing password for sysadmin.
    (current) UNIX Password: sysadmin
    New Password:
    (repeat) New Password:
    
  2. Verify and/or configure IP connectivity.

    External connectivity is required to run the Ansible bootstrap playbook. The StarlingX boot image attempts DHCP on all interfaces, so the server may already have obtained an IP address and have external IP connectivity if a DHCP server is present in your environment. Verify this using the ip addr and ping 8.8.8.8 commands.

    Otherwise, manually configure an IP address and default IP route. Use the PORT, IP-ADDRESS/SUBNET-LENGTH and GATEWAY-IP-ADDRESS applicable to your deployment environment.

    sudo ip address add <IP-ADDRESS>/<SUBNET-LENGTH> dev <PORT>
    sudo ip link set up dev <PORT>
    sudo ip route add default via <GATEWAY-IP-ADDRESS> dev <PORT>
    ping 8.8.8.8
    
  3. Specify user configuration overrides for the Ansible bootstrap playbook.

    Ansible is used to bootstrap StarlingX on controller-0. Key files for Ansible configuration are:

    /etc/ansible/hosts

    The default Ansible inventory file. Contains a single host: localhost.

    /usr/share/ansible/stx-ansible/playbooks/bootstrap.yml

    The Ansible bootstrap playbook.

    /usr/share/ansible/stx-ansible/playbooks/host_vars/bootstrap/default.yml

    The default configuration values for the bootstrap playbook.

    sysadmin home directory ($HOME)

    The default location where Ansible looks for and imports user configuration override files for hosts. For example: $HOME/<hostname>.yml.

    Important

    Some Ansible bootstrap parameters cannot be changed or are very difficult to change after installation is complete.

    Review the set of install-time-only parameters before installation and confirm that your values for these parameters are correct for the desired installation.

    Refer to Ansible install-time-only parameters for details.

    Specify the user configuration override file for the Ansible bootstrap playbook using one of the following methods:

    Note

    The Ansible overrides file for the bootstrap playbook ($HOME/localhost.yml) contains security-sensitive information. Use the ansible-vault create $HOME/localhost.yml command to create it; you will be prompted for a password to protect/encrypt the file. Use the ansible-vault edit $HOME/localhost.yml command if the file needs to be edited after it is created.

    1. Use a copy of the default.yml file listed above to provide your overrides.

      The default.yml file lists all available parameters for bootstrap configuration with a brief description for each parameter in the file comments.

      To use this method, run the ansible-vault create $HOME/localhost.yml command and copy the contents of the default.yml file into the ansible-vault editor, and edit the configurable values as required.

    2. Create a minimal user configuration override file.

      To use this method, create your override file with the ansible-vault create $HOME/localhost.yml command and provide the minimum required parameters for the deployment configuration as shown in the example below. Use the OAM IP SUBNET and IP ADDRESSing applicable to your deployment environment.

      Note

      During system bootstrap, the platform does not support the use of quotation characters and $ in the keystone user password.

      cd ~
      
      cat <<EOF > localhost.yml
      
      system_mode: duplex
      
      dns_servers:
        - 8.8.8.8
        - 8.8.4.4
      
      external_oam_subnet: <OAM-IP-SUBNET>/<OAM-IP-SUBNET-LENGTH>
      external_oam_gateway_address: <OAM-GATEWAY-IP-ADDRESS>
      external_oam_floating_address: <OAM-FLOATING-IP-ADDRESS>
      external_oam_node_0_address: <OAM-CONTROLLER-0-IP-ADDRESS>
      external_oam_node_1_address: <OAM-CONTROLLER-1-IP-ADDRESS>
      
      admin_username: admin
      admin_password: <admin-password>
      ansible_become_pass: <sysadmin-password>
      
      # OPTIONALLY provide a ROOT CA certificate and key for k8s root ca,
      # if not specified, one will be auto-generated,
      # see ‘Kubernetes Root CA Certificate’ in Security Guide for details.
      k8s_root_ca_cert: < your_root_ca_cert.pem >
      k8s_root_ca_key: < your_root_ca_key.pem >
      apiserver_cert_sans:
        - < your_hostname_for_oam_floating.your_domain >
      
      EOF
      

      In either of the above options, the bootstrap playbook’s default values will pull all container images required for the StarlingX Platform from Docker Hub.

      If you have set up a private Docker registry to use for bootstrapping, then you will need to add the following lines in $HOME/localhost.yml:

      docker_registries:
        quay.io:
           url: myprivateregistry.abc.com:9001/quay.io
        docker.elastic.co:
           url: myprivateregistry.abc.com:9001/docker.elastic.co
        gcr.io:
           url: myprivateregistry.abc.com:9001/gcr.io
        ghcr.io:
           url: myprivateregistry.abc.com:9001/ghcr.io
        k8s.gcr.io:
           url: myprivateregistry.abc.com:9001/k8s.gcr.io
        docker.io:
           url: myprivateregistry.abc.com:9001/docker.io
        registry.k8s.io:
           url: myprivateregistry.abc.com:9001/registry.k8s.io
        icr.io:
           url: myprivateregistry.abc.com:9001/icr.io
        defaults:
           type: docker
           username: <your_myprivateregistry.abc.com_username>
           password: <your_myprivateregistry.abc.com_password>
      
      # Add the CA Certificate that signed myprivateregistry.abc.com’s
      # certificate as a Trusted CA
      ssl_ca_cert: /home/sysadmin/myprivateregistry.abc.com-ca-cert.pem
      

      See Use a Private Docker Registry for more information.

      If a firewall is blocking access to Docker Hub or your private registry from your StarlingX deployment, you will need to add the following lines in $HOME/localhost.yml (see Docker Proxy Configuration for more details about Docker proxy settings):

      # Add these lines to configure Docker to use a proxy server
      docker_http_proxy: http://my.proxy.com:1080
      docker_https_proxy: https://my.proxy.com:1443
      docker_no_proxy:
         - 1.2.3.4
      

      Configure system_local_ca_cert, system_local_ca_key and system_root_ca_cert to set up a local intermediate CA (signed by an external Root CA) for managing and signing all of the StarlingX certificates. See Platform Issuer (system-local-ca) for more details.

      Refer to Ansible Bootstrap Configurations for information on additional Ansible bootstrap configurations for advanced Ansible bootstrap scenarios.

  4. Run the Ansible bootstrap playbook:

    Note

    Before running the Ansible bootstrap playbook, it is important that you ensure that controller-0 server time is synchronized correctly. Run the following command:

    # check the current server time
    $ date
    
    # if the current server time is not correct, update the NTP Servers configuration.
    # For more information, see Configure NTP Servers.
    
    ansible-playbook --ask-vault-pass /usr/share/ansible/stx-ansible/playbooks/bootstrap.yml
    

    Wait for Ansible bootstrap playbook to complete. This can take 5-10 minutes, depending on the performance of the host machine.
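
The note in step 3 states that quotation characters and $ are not supported in the keystone user password during bootstrap. A candidate password can be screened before it is written to localhost.yml. The following is a minimal sketch under that stated rule only; the function name and the sample password are hypothetical, and it does not check any other password policy the platform may enforce.

```shell
# Return 0 if the candidate password avoids the characters that are
# unsupported during bootstrap: single quotes, double quotes, and "$".
# The function name is hypothetical and not part of StarlingX.
password_is_bootstrap_safe() {
    case $1 in
        *\'*|*\"*|*\$*) return 1 ;;
        *) return 0 ;;
    esac
}

# The sample password below is a made-up value for illustration.
if password_is_bootstrap_safe 'Example-Pass*123'; then
    echo "password accepted"
else
    echo "password contains a quotation character or \$"
fi
```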

Configure controller-0

  1. Acquire admin credentials:

    source /etc/platform/openrc
    
  2. Configure the OAM interface of controller-0 and specify the attached network as “oam”.

    The following example configures the OAM interface on a physical untagged Ethernet port. Use the OAM port name that is applicable to your deployment environment, for example eth0:

    ~(keystone_admin)$ OAM_IF=<OAM-PORT>
    ~(keystone_admin)$ system host-if-modify controller-0 $OAM_IF -c platform
    ~(keystone_admin)$ system interface-network-assign controller-0 $OAM_IF oam
    

    To configure a VLAN or aggregated ethernet interface, see Node Interfaces.

  3. Configure the MGMT interface of controller-0 and specify the attached networks of both “mgmt” and “cluster-host”.

    The following example configures the MGMT interface on a physical untagged ethernet port. Use the MGMT port name that is applicable to your deployment environment, for example eth1:

    ~(keystone_admin)$ MGMT_IF=<MGMT-PORT>
    ~(keystone_admin)$ system host-if-modify controller-0 lo -c none
    ~(keystone_admin)$ IFNET_UUIDS=$(system interface-network-list controller-0 | awk '{if ($6=="lo") print $4;}')
    ~(keystone_admin)$ for UUID in $IFNET_UUIDS; do \
        system interface-network-remove ${UUID}; \
    done
    ~(keystone_admin)$ system host-if-modify controller-0 $MGMT_IF -c platform
    ~(keystone_admin)$ system interface-network-assign controller-0 $MGMT_IF mgmt
    ~(keystone_admin)$ system interface-network-assign controller-0 $MGMT_IF cluster-host
    

    To configure a VLAN or aggregated Ethernet interface, see Node Interfaces.

  4. Configure NTP servers for network time synchronization:

    ~(keystone_admin)$ system ntp-modify ntpservers=0.pool.ntp.org,1.pool.ntp.org
    

    To configure PTP instead of NTP, see PTP Server Configuration.
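
Step 3 above selects the interface-network entries attached to lo by filtering the table printed by system interface-network-list with awk. The following is a minimal sketch of how that filter picks its fields. The sample table is made up: the column layout (hostname, uuid, ifname, network_name) and the UUIDs are assumptions for illustration, not captured output.

```shell
# Hypothetical sample of "system interface-network-list controller-0" output.
# The column layout and UUIDs are assumptions made for illustration only.
sample_output() {
    cat <<'EOF'
+--------------+--------------------------------------+--------+--------------+
| hostname     | uuid                                 | ifname | network_name |
+--------------+--------------------------------------+--------+--------------+
| controller-0 | 11111111-aaaa-bbbb-cccc-000000000001 | lo     | mgmt         |
| controller-0 | 11111111-aaaa-bbbb-cccc-000000000002 | lo     | cluster-host |
| controller-0 | 11111111-aaaa-bbbb-cccc-000000000003 | oam0   | oam          |
+--------------+--------------------------------------+--------+--------------+
EOF
}

# awk splits on whitespace, so each "|" is a field of its own:
# $2 is the hostname, $4 the UUID, and $6 the interface name.
sample_output | awk '{if ($6=="lo") print $4;}'
```

With this sample, only the two UUIDs on the lo rows are printed; the header and border rows are skipped because their sixth field is not lo.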

OpenStack-specific host configuration

Important

These steps are required only if the StarlingX OpenStack application (stx-openstack) will be installed.

  1. For OpenStack only: Assign OpenStack host labels to controller-0 in support of installing the stx-openstack manifest and helm-charts later.

    system host-label-assign controller-0 openstack-control-plane=enabled
    system host-label-assign controller-0 openstack-compute-node=enabled
    system host-label-assign controller-0 openvswitch=enabled

    Note

    If you have a NIC that supports SR-IOV, then you can enable it by using the following:

    system host-label-assign controller-0 sriov=enabled
    
  2. For OpenStack only: Due to the additional OpenStack services running on the AIO controller platform cores, additional platform cores may be required.

    A minimum of 4 platform cores is required; 6 platform cores are recommended.

    Increase the number of platform cores with the following command. This example assigns 6 cores on processor/numa-node 0 on controller-0 to platform.

    ~(keystone_admin)$ system host-cpu-modify -f platform -p0 6 controller-0
    
  3. Due to the additional OpenStack services’ containers running on the controller host, the size of the Docker filesystem needs to be increased from the default size of 30G to 60G.

    # check existing size of docker fs
    system host-fs-list controller-0
    # check available space (Avail Size (GiB)) in cgts-vg LVG where docker fs is located
    system host-lvg-list controller-0
    # if existing docker fs size + cgts-vg available space is less than
    # 80G, you will need to add a new disk to cgts-vg.
    
       # Get device path of BOOT DISK
       system host-show controller-0 | fgrep rootfs
    
       # Get UUID of ROOT DISK by listing disks
       system host-disk-list controller-0
    
       # Add new disk to 'cgts-vg' local volume group
       system host-pv-add controller-0 cgts-vg <DISK_UUID>
       sleep 10    # wait for disk to be added
    
       # Confirm the available space and increased number of physical
       # volumes added to the cgts-vg volume group
       system host-lvg-list controller-0
    
    # Increase docker filesystem to 60G
    system host-fs-modify controller-0 docker=60
    
  4. For OpenStack only: Configure the system setting for the vSwitch.

    StarlingX has OVS (kernel-based) vSwitch configured as default, which:

    • runs in a container; defined within the helm charts of stx-openstack manifest.

    • shares the core(s) assigned to the platform.

    If you require better performance, OVS-DPDK (OVS with the Data Plane Development Kit, which is supported only on bare metal hardware) should be used:

    • Runs directly on the host (it is not containerized). Requires that at least 1 core be assigned/dedicated to the vSwitch function.

    To deploy the default containerized OVS:

    ~(keystone_admin)$ system modify --vswitch_type none
    

    This does not run any vSwitch directly on the host, instead, it uses the containerized OVS defined in the helm charts of stx-openstack manifest.

    To deploy OVS-DPDK, run the following command:

    ~(keystone_admin)$ system modify --vswitch_type ovs-dpdk

    Default recommendation for an AIO-controller is to use a single core for OVS-DPDK vSwitch.

    # assign 1 core on processor/numa-node 0 on controller-0 to vswitch
    ~(keystone_admin)$ system host-cpu-modify -f vswitch -p0 1 controller-0
    

    Once vswitch_type is set to OVS-DPDK, any subsequent nodes created will default to automatically assigning 1 vSwitch core for AIO controllers and 2 vSwitch cores (both on numa-node 0; physical NICs are typically on first numa-node) for compute-labeled worker nodes.

    When using OVS-DPDK, it is recommended to configure 1x 1G huge page (-1G 1) for vSwitch memory on each NUMA node on the host.

    However, due to a limitation with Kubernetes, only a single huge page size is supported on any one host. If your application VMs require 2M huge pages, then configure 500x 2M huge pages (-2M 500) for vSwitch memory on each NUMA node on the host.

    # Assign 1x 1G huge page on processor/numa-node 0 on controller-0 to vswitch
    ~(keystone_admin)$ system host-memory-modify -f vswitch -1G 1 controller-0 0
    
    # Assign 1x 1G huge page on processor/numa-node 1 on controller-0 to vswitch
    ~(keystone_admin)$ system host-memory-modify -f vswitch -1G 1 controller-0 1
    

    Important

    VMs created in an OVS-DPDK environment must be configured to use huge pages to enable networking and must use a flavor with property: hw:mem_page_size=large

    To configure huge pages for VMs in an OVS-DPDK environment on this host, use commands like the following. This example assumes that the 1G huge page size is being used on this host:

    # assign 10x 1G huge pages on processor/numa-node 0 on controller-0 to applications
    ~(keystone_admin)$ system host-memory-modify -f application -1G 10 controller-0 0
    
    # assign 10x 1G huge pages on processor/numa-node 1 on controller-0 to applications
    ~(keystone_admin)$ system host-memory-modify -f application -1G 10 controller-0 1
    

    Note

    After controller-0 is unlocked, changing vswitch_type requires locking and unlocking controller-0 to apply the change.

  5. For OpenStack only: Add an instances filesystem OR set up a disk based nova-local volume group, which is needed for stx-openstack nova ephemeral disks.

    Note

    Both cannot exist at the same time.

    Add an ‘instances’ filesystem

    ~(keystone_admin)$ export NODE=controller-0
    
    # Create ‘instances’ filesystem
    ~(keystone_admin)$ system host-fs-add ${NODE} instances=<size>
    

    Or add a ‘nova-local’ volume group:

    ~(keystone_admin)$ export NODE=controller-0
    
    # Create ‘nova-local’ local volume group
    ~(keystone_admin)$ system host-lvg-add ${NODE} nova-local
    
    # Get UUID of an unused DISK to be added to the ‘nova-local’ volume
    # group. CEPH OSD Disks can NOT be used
    # List host’s disks and take note of UUID of disk to be used
    ~(keystone_admin)$ system host-disk-list ${NODE}
    
    # Add the unused disk to the ‘nova-local’ volume group
    ~(keystone_admin)$ system host-pv-add ${NODE} nova-local <DISK_UUID>
    
  6. For OpenStack only: Configure data interfaces for controller-0. Data class interfaces are vswitch interfaces used by vswitch to provide VM virtio vNIC connectivity to OpenStack Neutron Tenant Networks on the underlying assigned Data Network.

    Important

    A compute-labeled All-in-one controller host MUST have at least one Data class interface.

    • Configure the data interfaces for controller-0.

      ~(keystone_admin)$ NODE=controller-0
      
      # List inventoried host’s ports and identify ports to be used as ‘data’ interfaces,
      # based on displayed linux port name, pci address and device type.
      ~(keystone_admin)$ system host-port-list ${NODE}
      
      # List host’s auto-configured ‘ethernet’ interfaces,
      # find the interfaces corresponding to the ports identified in previous step, and
      # take note of their UUID
      ~(keystone_admin)$ system host-if-list -a ${NODE}
      
      # Modify configuration for these interfaces
      # Configuring them as ‘data’ class interfaces, MTU of 1500 and named data#
      ~(keystone_admin)$ system host-if-modify -m 1500 -n data0 -c data ${NODE} <data0-if-uuid>
      ~(keystone_admin)$ system host-if-modify -m 1500 -n data1 -c data ${NODE} <data1-if-uuid>
      
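      # Create Data Networks that vswitch 'data' interfaces will be connected to
      ~(keystone_admin)$ DATANET0='datanet0'
      ~(keystone_admin)$ DATANET1='datanet1'
      ~(keystone_admin)$ system datanetwork-add ${DATANET0} vlan
      ~(keystone_admin)$ system datanetwork-add ${DATANET1} vlan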
      
      # Assign Data Networks to Data Interfaces
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <data0-if-uuid> ${DATANET0}
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <data1-if-uuid> ${DATANET1}
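
For the Docker filesystem step above, the comments say that a new disk must be added to cgts-vg when the existing docker filesystem size plus the cgts-vg available space is less than 80G. The following is a minimal sketch of that comparison. The helper name is hypothetical, and the two sizes are assumed to be copied by hand, in GiB, from the system host-fs-list and system host-lvg-list output.

```shell
# Decide whether cgts-vg needs another disk before growing the docker filesystem.
# Arguments: current docker fs size and cgts-vg available space, both in GiB.
# Returns 0 (true) when their sum is below the 80G threshold named above.
needs_extra_disk() {
    awk -v docker="$1" -v avail="$2" 'BEGIN { exit !((docker + avail) < 80) }'
}

# Example values (hypothetical): 30 GiB docker fs, 25.5 GiB free in cgts-vg.
if needs_extra_disk 30 25.5; then
    echo "add a disk to cgts-vg first"
else
    echo "enough space to grow the docker filesystem"
fi
```

awk is used for the comparison because the available size may be reported with a fractional part, which shell integer arithmetic cannot handle.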
      

Optionally Configure PCI-SRIOV Interfaces

  1. Optionally, configure PCI-SR-IOV interfaces for controller-0.

    This step is optional for Kubernetes. Do this step if using SR-IOV network attachments in hosted application containers.

    This step is optional for OpenStack. Do this step if using SR-IOV vNICs in hosted application VMs. Note that PCI-SR-IOV interfaces can have the same Data Networks assigned to them as vswitch data interfaces.

    • Configure the pci-sriov interfaces for controller-0.

      ~(keystone_admin)$ export NODE=controller-0
      
      # List inventoried host’s ports and identify ports to be used as ‘pci-sriov’ interfaces,
      # based on displayed linux port name, pci address and device type.
      ~(keystone_admin)$ system host-port-list ${NODE}
      
      # List host’s auto-configured ‘ethernet’ interfaces,
      # find the interfaces corresponding to the ports identified in previous step, and
      # take note of their UUID
      ~(keystone_admin)$ system host-if-list -a ${NODE}
      
      # Modify configuration for these interfaces
      # Configuring them as ‘pci-sriov’ class interfaces, MTU of 1500 and named sriov#
      ~(keystone_admin)$ system host-if-modify -m 1500 -n sriov0 -c pci-sriov ${NODE} <sriov0-if-uuid> -N <num_vfs>
      ~(keystone_admin)$ system host-if-modify -m 1500 -n sriov1 -c pci-sriov ${NODE} <sriov1-if-uuid> -N <num_vfs>
      
      # If not already created, create Data Networks that the 'pci-sriov'
      # interfaces will be connected to
      ~(keystone_admin)$ DATANET0='datanet0'
      ~(keystone_admin)$ DATANET1='datanet1'
      ~(keystone_admin)$ system datanetwork-add ${DATANET0} vlan
      ~(keystone_admin)$ system datanetwork-add ${DATANET1} vlan
      
      # Assign Data Networks to PCI-SRIOV Interfaces
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <sriov0-if-uuid> ${DATANET0}
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <sriov1-if-uuid> ${DATANET1}
      
    • For Kubernetes Only: To enable using SR-IOV network attachments for the above interfaces in Kubernetes hosted application containers:

      • Configure the Kubernetes SR-IOV device plugin.

        ~(keystone_admin)$ system host-label-assign controller-0 sriovdp=enabled
      
      • If you are planning on running DPDK in Kubernetes hosted application containers on this host, configure the number of 1G Huge pages required on both NUMA nodes.

        # assign 10x 1G huge page on processor/numa-node 0 on controller-0 to applications
        ~(keystone_admin)$ system host-memory-modify -f application controller-0 0 -1G 10
        
        # assign 10x 1G huge page on processor/numa-node 1 on controller-0 to applications
        ~(keystone_admin)$ system host-memory-modify -f application controller-0 1 -1G 10
        
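The per-NUMA-node huge page commands above follow one pattern, so they can be generated instead of typed by hand. The following is a minimal sketch and not part of the StarlingX tooling: hugepage_commands is a hypothetical helper that only prints the system host-memory-modify commands shown above so they can be reviewed first. The host name, page count, and NUMA node list are example values.

```shell
#!/bin/sh
# hugepage_commands HOST FUNCTION COUNT NODE...
# Hypothetical helper: prints (does not run) one host-memory-modify command
# per NUMA node, using the same argument order as the commands above.
hugepage_commands() {
    host=$1 func=$2 count=$3
    shift 3
    for node in "$@"; do
        echo "system host-memory-modify -f $func $host $node -1G $count"
    done
}

# Example: 10x 1G huge pages for applications on NUMA nodes 0 and 1.
hugepage_commands controller-0 application 10 0 1
```

Review the printed commands, then run them in a keystone_admin session.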

Optional - Initialize a Ceph Persistent Storage Backend

A persistent storage backend is required if your application requires PVCs.

Important

The StarlingX OpenStack application requires PVCs.

Note

Each deployment model enforces a different structure for the Rook Ceph cluster and its integration with the platform.

There are two options for the persistent storage backend: the host-based Ceph solution and the Rook container-based Ceph solution.

Note

Host-based Ceph will be deprecated and removed in an upcoming release. Adoption of Rook-Ceph is recommended for new deployments.

For host-based Ceph:

  1. Add the Ceph storage backend:

    ~(keystone_admin)$ system storage-backend-add ceph --confirmed
    
  2. Add an OSD on controller-0 for host-based Ceph:

    # List host’s disks and identify disks you want to use for CEPH OSDs, taking note of their UUID
    # By default, /dev/sda is used as the system disk and cannot be used for OSD.
    ~(keystone_admin)$ system host-disk-list controller-0
    
    # Add disk as an OSD storage
    ~(keystone_admin)$ system host-stor-add controller-0 osd <disk-uuid>
    
    # List OSD storage devices
    ~(keystone_admin)$ system host-stor-list controller-0
    

For Rook-Ceph:

  1. Add Storage-Backend with Deployment Model.

    ~(keystone_admin)$ system storage-backend-add ceph-rook --deployment controller
    ~(keystone_admin)$ system storage-backend-list
    +--------------------------------------+-----------------+-----------+----------------------+----------+------------------+------------------------------------------------------+
    | uuid                                 | name            | backend   | state                | task     | services         | capabilities                                         |
    +--------------------------------------+-----------------+-----------+----------------------+----------+------------------+------------------------------------------------------+
    | 45e3fedf-c386-4b8b-8405-882038dd7d13 | ceph-rook-store | ceph-rook | configuring-with-app | uploaded | block,filesystem | deployment_model: controller replication: 2          |
    |                                      |                 |           |                      |          |                  | min_replication: 1                                   |
    +--------------------------------------+-----------------+-----------+----------------------+----------+------------------+------------------------------------------------------+
    
  2. Set up a controllerfs ceph-float filesystem.

    ~(keystone_admin)$ system controllerfs-add ceph-float=20
    
  3. Set up a host-fs ceph filesystem on controller-0.

    ~(keystone_admin)$ system host-fs-add controller-0 ceph=20
    

Unlock controller-0

Unlock controller-0 to bring it into service:

~(keystone_admin)$ system host-unlock controller-0

Controller-0 will reboot in order to apply configuration changes and come into service. This can take 5-10 minutes, depending on the performance of the host machine.

For Rook-Ceph:

  1. List all the disks.

    ~(keystone_admin)$ system host-disk-list controller-0
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    | uuid                                 | device_node | device_num | device_type | size_gib | available_gib | rpm          | serial_id           | device_path                                |
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    | 17408af3-e211-4e2b-8cf1-d2b6687476d5 | /dev/sda    | 2048       | HDD         | 292.968  | 0.0           | Undetermined | VBba52ec56-f68a9f2d | /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0 |
    | cee99187-dac4-4a7b-8e58-f2d5bd48dcaf | /dev/sdb    | 2064       | HDD         | 9.765    | 0.0           | Undetermined | VBf96fa322-597194da | /dev/disk/by-path/pci-0000:00:0d.0-ata-2.0 |
    | 0c6435af-805a-4a62-ad8e-403bf916f5cf | /dev/sdc    | 2080       | HDD         | 9.765    | 9.761         | Undetermined | VBeefed5ad-b4815f0d | /dev/disk/by-path/pci-0000:00:0d.0-ata-3.0 |
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    
  2. Choose empty disks and provide hostname and uuid to finish OSD configuration:

    ~(keystone_admin)$ system host-stor-add controller-0 osd cee99187-dac4-4a7b-8e58-f2d5bd48dcaf
    
  3. Wait for the OSD pods to be ready.

    $ kubectl get pods -n rook-ceph
    NAME                                                     READY   STATUS      RESTARTS      AGE
    ceph-mgr-provision-78xjk                                0/1     Completed   0          4m31s
    csi-cephfsplugin-572jc                                  2/2     Running     0          5m32s
    csi-cephfsplugin-provisioner-5467c6c4f-t8x8d            5/5     Running     0          5m28s
    csi-rbdplugin-2npb6                                     2/2     Running     0          5m32s
    csi-rbdplugin-provisioner-fd84899c-k8wcw                5/5     Running     0          5m32s
    rook-ceph-crashcollector-controller-0-589f5f774-d8sjz   1/1     Running     0          3m24s
    rook-ceph-exporter-controller-0-5fd477bb8-c7nxh         1/1     Running     0          3m21s
    rook-ceph-mds-kube-cephfs-a-cc647757-6p9j5              2/2     Running     0          3m25s
    rook-ceph-mds-kube-cephfs-b-5b5845ff59-xprbb            2/2     Running     0          3m19s
    rook-ceph-mgr-a-746fc4dd54-t8bcw                        2/2     Running     0          4m40s
    rook-ceph-mon-a-b6c95db97-f5fqq                         2/2     Running     0          4m56s
    rook-ceph-operator-69b5674578-27bn4                     1/1     Running     0          6m26s
    rook-ceph-osd-0-7f5cd957b8-ppb99                        2/2     Running     0          3m52s
    rook-ceph-osd-prepare-controller-0-vzq2d                0/1     Completed   0          4m18s
    rook-ceph-provision-zcs89                               0/1     Completed   0          101s
    rook-ceph-tools-7dc9678ccb-v2gps                        1/1     Running     0          6m2s
    stx-ceph-manager-664f8585d8-wzr4v                       1/1     Running     0          4m31s
    
  4. Check ceph cluster health.

    $ ceph -s
        cluster:
            id:     75c8f017-e7b8-4120-a9c1-06f38e1d1aa3
            health: HEALTH_OK
    
        services:
            mon: 1 daemons, quorum a (age 32m)
            mgr: a(active, since 30m)
            mds: 1/1 daemons up, 1 hot standby
            osd: 1 osds: 1 up (since 30m), 1 in (since 31m)
    
        data:
            volumes: 1/1 healthy
            pools:   4 pools, 113 pgs
            objects: 22 objects, 595 KiB
            usage:   27 MiB used, 9.7 GiB / 9.8 GiB avail
            pgs:     113 active+clean
    
        io:
            client:   852 B/s rd, 1 op/s rd, 0 op/s wr
    
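The two checks above (pod readiness and cluster health) can be scripted rather than read by eye. The following is a minimal sketch under stated assumptions: pods_not_ready and ceph_health are hypothetical helper names, and both only parse text in the column layout of the sample output shown above.

```shell
#!/bin/sh
# Hypothetical helpers; both read command output on stdin so they can be
# tried against saved output before being used on a live system.

# pods_not_ready: reads `kubectl get pods` output and prints the name of every
# pod that is neither Completed nor Running with all containers ready.
pods_not_ready() {
    awk 'NR > 1 {
        if ($3 == "Completed") next
        split($2, ready, "/")
        if ($3 == "Running" && ready[1] == ready[2]) next
        print $1
    }'
}

# ceph_health: reads `ceph -s` output and prints the value of the health line.
ceph_health() {
    awk '$1 == "health:" { print $2 }'
}
```

For example, kubectl get pods -n rook-ceph | pods_not_ready prints nothing once every pod is ready, and ceph -s | ceph_health prints the health status (HEALTH_OK in the output shown above).
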
  • For OpenStack only: Due to the additional OpenStack services’ containers running on the controller host, the size of the Docker filesystem needs to be increased from the default size of 30G to 60G.

    # check existing size of docker fs
    ~(keystone_admin)$ system host-fs-list controller-0
    # check available space (Avail Size (GiB)) in cgts-vg LVG where docker fs is located
    ~(keystone_admin)$ system host-lvg-list controller-0
    # if existing docker fs size + cgts-vg available space is less than
    # 80G, you will need to add a new disk to cgts-vg.
    
       # Get device path of BOOT DISK
       ~(keystone_admin)$ system host-show controller-0 | fgrep rootfs
    
       # Get UUID of ROOT DISK by listing disks
       ~(keystone_admin)$ system host-disk-list controller-0
    
       # Add new disk to 'cgts-vg' local volume group
       ~(keystone_admin)$ system host-pv-add controller-0 cgts-vg <DISK_UUID>
       ~(keystone_admin)$ sleep 10    # wait for disk to be added
    
       # Confirm the available space and increased number of physical
       # volumes added to the cgts-vg volume group
       ~(keystone_admin)$ system host-lvg-list controller-0
    
       # Increase docker filesystem to 60G
       ~(keystone_admin)$ system host-fs-modify controller-0 docker=60
    
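The space check described in the comments above is a single comparison: the current docker filesystem size plus the free space in cgts-vg, compared against 80G. The following is a minimal sketch of that arithmetic. docker_fs_needs_disk is a hypothetical helper, and the two sizes passed to it are example values that you would read from system host-fs-list and system host-lvg-list.

```shell
#!/bin/sh
# docker_fs_needs_disk DOCKER_GIB AVAIL_GIB
# Hypothetical helper: succeeds (exit status 0) when the current docker
# filesystem size plus the free space in cgts-vg is below the 80G threshold
# from the comments above, meaning a new disk must be added to cgts-vg.
docker_fs_needs_disk() {
    awk -v docker="$1" -v avail="$2" 'BEGIN { exit !(docker + avail < 80) }'
}

# Example values only: a 30G docker filesystem with 20G free in cgts-vg.
if docker_fs_needs_disk 30 20; then
    echo "add a disk to cgts-vg before resizing docker"
else
    echo "enough space to resize docker"
fi
```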

Install software on controller-1 node

  1. Power on the controller-1 server.

    Power on the controller-1 server and force it to network boot with the appropriate BIOS boot options for your particular server.

    As controller-1 boots, a message appears on its console instructing you to configure the personality of the node.

  2. On the console of controller-0, list hosts to see newly discovered controller-1 host (hostname=None):

    ~(keystone_admin)$ system host-list
    +----+--------------+-------------+----------------+-------------+--------------+
    | id | hostname     | personality | administrative | operational | availability |
    +----+--------------+-------------+----------------+-------------+--------------+
    | 1  | controller-0 | controller  | unlocked       | enabled     | available    |
    | 2  | None         | None        | locked         | disabled    | offline      |
    +----+--------------+-------------+----------------+-------------+--------------+
    
  3. Using the host id, set the personality of this host to ‘controller’:

    ~(keystone_admin)$ system host-update 2 personality=controller
    
  4. Wait for the software installation on controller-1 to complete, for controller-1 to reboot, and for controller-1 to show as locked/disabled/online in ‘system host-list’.

    This can take 5-10 minutes, depending on the performance of the host machine.

    ~(keystone_admin)$ system host-list
    +----+--------------+-------------+----------------+-------------+--------------+
    | id | hostname     | personality | administrative | operational | availability |
    +----+--------------+-------------+----------------+-------------+--------------+
    | 1  | controller-0 | controller  | unlocked       | enabled     | available    |
    | 2  | controller-1 | controller  | locked         | disabled    | online       |
    +----+--------------+-------------+----------------+-------------+--------------+
    
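The wait in step 4 can be automated by polling system host-list. The following is a minimal sketch under stated assumptions: host_state and wait_for_host are hypothetical helper names, the parser relies on the table layout shown above, and the default of 60 tries at 30-second intervals is an arbitrary example.

```shell
#!/bin/sh
# host_state NAME
# Hypothetical helper: reads `system host-list` table output on stdin and
# prints "administrative/operational/availability" for the host named NAME.
host_state() {
    awk -F'|' -v host="$1" '
        NF >= 7 {
            name = $3
            gsub(/^ +| +$/, "", name)
            if (name == host) {
                for (i = 5; i <= 7; i++) gsub(/^ +| +$/, "", $i)
                print $5 "/" $6 "/" $7
            }
        }'
}

# wait_for_host NAME STATE [TRIES] [DELAY_SECONDS]
# Polls `system host-list` until NAME reports STATE; fails after TRIES polls.
wait_for_host() {
    tries=${3:-60}
    while [ "$tries" -gt 0 ]; do
        [ "$(system host-list | host_state "$1")" = "$2" ] && return 0
        tries=$((tries - 1))
        sleep "${4:-30}"
    done
    return 1
}
```

In a keystone_admin session, wait_for_host controller-1 locked/disabled/online returns once the installation has completed, or fails after the configured number of tries.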

Configure controller-1

  1. Configure the OAM interface of controller-1 and specify the attached network of “oam”.

    The following example configures the OAM interface on a physical untagged ethernet port. Use the OAM port name that is applicable to your deployment environment, for example eth0:

    ~(keystone_admin)$ OAM_IF=<OAM-PORT>
    ~(keystone_admin)$ system host-if-modify controller-1 $OAM_IF -c platform
    ~(keystone_admin)$ system interface-network-assign controller-1 $OAM_IF oam
    

    To configure a VLAN or aggregated ethernet interface, see Node Interfaces.

  2. The network install procedure partially sets up the MGMT interface: it configures the port used for network install as the MGMT port and attaches it to the “mgmt” network.

    Complete the MGMT interface configuration of controller-1 by specifying the attached network of “cluster-host”.

    ~(keystone_admin)$ system interface-network-assign controller-1 mgmt0 cluster-host
    

OpenStack-specific host configuration

Important

These steps are required only if the StarlingX OpenStack application (stx-openstack) will be installed.

  1. For OpenStack only: Assign OpenStack host labels to controller-1 in support of installing the stx-openstack manifest and helm-charts later.

    ~(keystone_admin)$ system host-label-assign controller-1 openstack-control-plane=enabled
    ~(keystone_admin)$ system host-label-assign controller-1 openstack-compute-node=enabled
    ~(keystone_admin)$ system host-label-assign controller-1 openvswitch=enabled

    Note

    If you have a NIC that supports SR-IOV, then you can enable it by using the following:

    ~(keystone_admin)$ system host-label-assign controller-1 sriov=enabled
    
  2. For OpenStack only: Due to the additional OpenStack services running on the AIO controller platform cores, additional platform cores may be required.

    Increase the number of platform cores with the following commands:

    # assign 6 cores on processor/numa-node 0 on controller-1 to platform
    ~(keystone_admin)$ system host-cpu-modify -f platform -p0 6 controller-1
    
  3. For OpenStack only: Due to the additional OpenStack services’ containers running on the controller host, the size of the Docker filesystem needs to be increased from the default size of 30G to 60G.

    # check existing size of docker fs
    ~(keystone_admin)$ system host-fs-list controller-1
    # check available space (Avail Size (GiB)) in cgts-vg LVG where docker fs is located
    ~(keystone_admin)$ system host-lvg-list controller-1
    # if existing docker fs size + cgts-vg available space is less than
    # 80G, you will need to add a new disk to cgts-vg.
    
       # Get device path of BOOT DISK
       ~(keystone_admin)$ system host-show controller-1 | fgrep rootfs
    
       # Get UUID of ROOT DISK by listing disks
       ~(keystone_admin)$ system host-disk-list controller-1
    
       # Add new disk to 'cgts-vg' local volume group
       ~(keystone_admin)$ system host-pv-add controller-1 cgts-vg <DISK_UUID>
       ~(keystone_admin)$ sleep 10    # wait for disk to be added
    
       # Confirm the available space and increased number of physical
       # volumes added to the cgts-vg volume group
       ~(keystone_admin)$ system host-lvg-list controller-1
    
       # Increase docker filesystem to 60G
       ~(keystone_admin)$ system host-fs-modify controller-1 docker=60
    
  4. For OpenStack only: Configure the host settings for the vSwitch.

    If using the OVS-DPDK vSwitch, the default recommendation for an AIO-controller is to use a single core for the vSwitch. This should have been configured automatically; if not, run the following command:

    # assign 1 core on processor/numa-node 0 on controller-1 to vswitch
    ~(keystone_admin)$ system host-cpu-modify -f vswitch -p0 1 controller-1
    

    When using OVS-DPDK, configure huge pages for vSwitch memory on each NUMA node on the host. The recommended setting is 1x 1G huge page (-1G 1) per NUMA node.

    However, due to a limitation with Kubernetes, only a single huge page size is supported on any one host. If your application VMs require 2M huge pages, then configure 500x 2M huge pages (-2M 500) for vSwitch memory on each NUMA node on the host.

    # assign 1x 1G huge page on processor/numa-node 0 on controller-1 to vswitch
    ~(keystone_admin)$ system host-memory-modify -f vswitch -1G 1 controller-1 0
    
    # assign 1x 1G huge page on processor/numa-node 1 on controller-1 to vswitch
    ~(keystone_admin)$ system host-memory-modify -f vswitch -1G 1 controller-1 1
    

    Important

    VMs created in an OVS-DPDK environment must be configured to use huge pages to enable networking and must use a flavor with property: hw:mem_page_size=large.

    Configure the huge pages for VMs in an OVS-DPDK environment on this host, assuming 1G huge page size is being used on this host, with the following commands:

    # assign 10x 1G huge page on processor/numa-node 0 on controller-1 to applications
    ~(keystone_admin)$ system host-memory-modify -f application -1G 10 controller-1 0
    
    # assign 10x 1G huge page on processor/numa-node 1 on controller-1 to applications
    ~(keystone_admin)$ system host-memory-modify -f application -1G 10 controller-1 1
    
  5. For OpenStack only: Add an instances filesystem or set up a disk-based nova-local volume group, which is needed for stx-openstack nova ephemeral disks.

    Note

    Both cannot exist at the same time.

    • Add an ‘instances’ filesystem:

    ~(keystone_admin)$ export NODE=controller-1
    
    # Create ‘instances’ filesystem
    ~(keystone_admin)$ system host-fs-add ${NODE} instances=<size>
    

    Or

    • Add a ‘nova-local’ volume group:

    ~(keystone_admin)$ export NODE=controller-1
    
    # Create ‘nova-local’ local volume group
    ~(keystone_admin)$ system host-lvg-add ${NODE} nova-local
    
    # Get UUID of an unused DISK to be added to the ‘nova-local’ volume
    # group. CEPH OSD Disks can NOT be used
    # List host’s disks and take note of UUID of disk to be used
    ~(keystone_admin)$ system host-disk-list ${NODE}
    
    # Add the unused disk to the ‘nova-local’ volume group
    ~(keystone_admin)$ system host-pv-add ${NODE} nova-local <DISK_UUID>
    
  6. For OpenStack only: Configure data interfaces for controller-1. Data class interfaces are vswitch interfaces used by vswitch to provide VM virtio vNIC connectivity to OpenStack Neutron Tenant Networks on the underlying assigned Data Network.

    Important

    A compute-labeled All-in-one controller host MUST have at least one Data class interface.

    • Configure the data interfaces for controller-1.

      ~(keystone_admin)$ export NODE=controller-1
      
      # List inventoried host's ports and identify ports to be used as 'data' interfaces,
      # based on displayed linux port name, pci address and device type.
      ~(keystone_admin)$ system host-port-list ${NODE}
      
      # List host’s auto-configured ‘ethernet’ interfaces,
      # find the interfaces corresponding to the ports identified in previous step, and
      # take note of their UUID
      ~(keystone_admin)$ system host-if-list -a ${NODE}
      
      # Modify configuration for these interfaces
      # Configuring them as 'data' class interfaces, MTU of 1500 and named data#
      ~(keystone_admin)$ system host-if-modify -m 1500 -n data0 -c data ${NODE} <data0-if-uuid>
      ~(keystone_admin)$ system host-if-modify -m 1500 -n data1 -c data ${NODE} <data1-if-uuid>
      
      # Create Data Networks that vswitch 'data' interfaces will be connected to
      ~(keystone_admin)$ DATANET0='datanet0'
      ~(keystone_admin)$ DATANET1='datanet1'
      
      # Assign Data Networks to Data Interfaces
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <data0-if-uuid> ${DATANET0}
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <data1-if-uuid> ${DATANET1}
      
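When a host has more than two data interfaces, the modify and assign commands above repeat with an increasing index. The following is a minimal sketch, not part of the StarlingX tooling: data_if_commands is a hypothetical helper that prints the same two commands per interface UUID for review. The data<N> and datanet<N> names follow the naming used above, and the UUID arguments are placeholders.

```shell
#!/bin/sh
# data_if_commands NODE IF_UUID...
# Hypothetical helper: prints (does not run) the host-if-modify and
# interface-datanetwork-assign commands for each interface UUID, numbering
# the interfaces data0, data1, ... and pairing them with datanet0, datanet1, ...
data_if_commands() {
    node=$1
    shift
    index=0
    for uuid in "$@"; do
        echo "system host-if-modify -m 1500 -n data$index -c data $node $uuid"
        echo "system interface-datanetwork-assign $node $uuid datanet$index"
        index=$((index + 1))
    done
}

# Example with placeholder UUIDs; substitute the values from host-if-list.
data_if_commands controller-1 '<data0-if-uuid>' '<data1-if-uuid>'
```

The helper assumes the data networks already exist and that each interface maps to the data network with the same index; adjust the pairing if your deployment differs.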

Optionally Configure PCI-SRIOV Interfaces

  1. Optionally, configure PCI-SR-IOV interfaces for controller-1.

    This step is optional for Kubernetes. Do this step if using SR-IOV network attachments in hosted application containers.

    This step is optional for OpenStack. Do this step if using SR-IOV vNICs in hosted application VMs. Note that PCI-SR-IOV interfaces can have the same Data Networks assigned to them as vswitch data interfaces.

    • Configure the PCI-SR-IOV interfaces for controller-1.

      ~(keystone_admin)$ export NODE=controller-1
      
      # List inventoried host’s ports and identify ports to be used as ‘pci-sriov’ interfaces,
      # based on displayed linux port name, pci address and device type.
      ~(keystone_admin)$ system host-port-list ${NODE}
      
      # List host’s auto-configured 'ethernet' interfaces,
      # find the interfaces corresponding to the ports identified in previous step, and
      # take note of their UUID
      ~(keystone_admin)$ system host-if-list -a ${NODE}
      
      # Modify configuration for these interfaces
      # Configuring them as 'pci-sriov' class interfaces, MTU of 1500 and named sriov#
      ~(keystone_admin)$ system host-if-modify -m 1500 -n sriov0 -c pci-sriov ${NODE} <sriov0-if-uuid> -N <num_vfs>
      ~(keystone_admin)$ system host-if-modify -m 1500 -n sriov1 -c pci-sriov ${NODE} <sriov1-if-uuid> -N <num_vfs>
      
      # If not already created, create Data Networks that the 'pci-sriov' interfaces
      # will be connected to
      ~(keystone_admin)$ DATANET0='datanet0'
      ~(keystone_admin)$ DATANET1='datanet1'
      
      # Assign Data Networks to PCI-SRIOV Interfaces
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <sriov0-if-uuid> ${DATANET0}
      ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <sriov1-if-uuid> ${DATANET1}
      
    • For Kubernetes only: To enable using SR-IOV network attachments for the above interfaces in Kubernetes hosted application containers:

      • Configure the Kubernetes SR-IOV device plugin.

        ~(keystone_admin)$ system host-label-assign controller-1 sriovdp=enabled
        
      • If planning on running DPDK in Kubernetes hosted application containers on this host, configure the number of 1G Huge pages required on both NUMA nodes.

        # assign 10x 1G huge page on processor/numa-node 0 on controller-1 to applications
        ~(keystone_admin)$ system host-memory-modify -f application controller-1 0 -1G 10
        
        # assign 10x 1G huge page on processor/numa-node 1 on controller-1 to applications
        ~(keystone_admin)$ system host-memory-modify -f application controller-1 1 -1G 10
        

If configuring a Ceph-based Persistent Storage Backend, configure host-specific details

For host-based Ceph:

  1. Add an OSD on controller-1 for host-based Ceph:

    # List host’s disks and identify disks you want to use for CEPH OSDs, taking note of their UUID
    # By default, /dev/sda is used as the system disk and cannot be used for OSD.
    ~(keystone_admin)$ system host-disk-list controller-1
    
    # Add disk as an OSD storage
    ~(keystone_admin)$ system host-stor-add controller-1 osd <disk-uuid>
    
    # List OSD storage devices
    ~(keystone_admin)$ system host-stor-list controller-1
    

For Rook-Ceph:

  1. Set up a host-fs ceph filesystem on controller-1.

    ~(keystone_admin)$ system host-fs-add controller-1 ceph=20
    

Unlock controller-1

Unlock controller-1 in order to bring it into service:

~(keystone_admin)$ system host-unlock controller-1

Controller-1 will reboot in order to apply configuration changes and come into service. This can take 5-10 minutes, depending on the performance of the host machine.

If configuring Rook Ceph Storage Backend, configure the environment

  1. Check if the rook-ceph app is uploaded.

    ~(keystone_admin)$ source /etc/platform/openrc
    ~(keystone_admin)$ system application-list
    +--------------------------+-----------+-------------------------------------------+------------------+----------+-----------+
    | application              | version   | manifest name                             | manifest file    | status   | progress  |
    +--------------------------+-----------+-------------------------------------------+------------------+----------+-----------+
    | cert-manager             | 24.09-76  | cert-manager-fluxcd-manifests             | fluxcd-manifests | applied  | completed |
    | dell-storage             | 24.09-25  | dell-storage-fluxcd-manifests             | fluxcd-manifests | uploaded | completed |
    | deployment-manager       | 24.09-13  | deployment-manager-fluxcd-manifests       | fluxcd-manifests | applied  | completed |
    | nginx-ingress-controller | 24.09-57  | nginx-ingress-controller-fluxcd-manifests | fluxcd-manifests | applied  | completed |
    | oidc-auth-apps           | 24.09-53  | oidc-auth-apps-fluxcd-manifests           | fluxcd-manifests | uploaded | completed |
    | platform-integ-apps      | 24.09-138 | platform-integ-apps-fluxcd-manifests      | fluxcd-manifests | uploaded | completed |
    | rook-ceph                | 24.09-12  | rook-ceph-fluxcd-manifests                | fluxcd-manifests | uploaded | completed |
    +--------------------------+-----------+-------------------------------------------+------------------+----------+-----------+
    
  2. List all the disks.

    ~(keystone_admin)$ system host-disk-list controller-0
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    | uuid                                 | device_node | device_num | device_type | size_gib | available_gib | rpm          | serial_id           | device_path                                |
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    | 7ce699f0-12dd-4416-ae43-00d3877450f7 | /dev/sda    | 2048       | HDD         | 292.968  | 0.0           | Undetermined | VB0e18230e-6a8780e1 | /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0 |
    | bfb83b6f-61e2-4f9f-a87d-ecae938b7e78 | /dev/sdb    | 2064       | HDD         | 9.765    | 9.765         | Undetermined | VB144f1510-14f089fd | /dev/disk/by-path/pci-0000:00:0d.0-ata-2.0 |
    | 937cfabc-8447-4dbd-8ca3-062a46953023 | /dev/sdc    | 2080       | HDD         | 9.765    | 9.761         | Undetermined | VB95057d1c-4ee605c2 | /dev/disk/by-path/pci-0000:00:0d.0-ata-3.0 |
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    
    ~(keystone_admin)$ system host-disk-list controller-1
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    | uuid                                 | device_node | device_num | device_type | size_gib | available_gib | rpm          | serial_id           | device_path                                |
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    | 52c8e1b5-0551-4748-a7a0-27b9c028cf9d | /dev/sda    | 2048       | HDD         | 292.968  | 0.0           | Undetermined | VB9b565509-a2edaa2e | /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0 |
    | 93020ce0-249e-4db3-b8c3-6c7e8f32713b | /dev/sdb    | 2064       | HDD         | 9.765    | 9.765         | Undetermined | VBa08ccbda-90190faa | /dev/disk/by-path/pci-0000:00:0d.0-ata-2.0 |
    | dc0ec403-67f8-40bf-ada0-6fcae3ed76da | /dev/sdc    | 2080       | HDD         | 9.765    | 9.761         | Undetermined | VB16244caf-ab36d36c | /dev/disk/by-path/pci-0000:00:0d.0-ata-3.0 |
    +--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+---------------------+--------------------------------------------+
    
  3. Choose empty disks and provide hostname and uuid to finish OSD configuration:

    ~(keystone_admin)$ system host-stor-add controller-0 osd bfb83b6f-61e2-4f9f-a87d-ecae938b7e78
    ~(keystone_admin)$ system host-stor-add controller-1 osd 93020ce0-249e-4db3-b8c3-6c7e8f32713b
    
  4. Wait for the OSD pods to be ready.

    $ kubectl get pods -n rook-ceph
    NAME                                                     READY   STATUS      RESTARTS      AGE
    ceph-mgr-provision-w55rh                                 0/1     Completed   0             10m
    csi-cephfsplugin-8j7xz                                   2/2     Running     1 (11m ago)   12m
    csi-cephfsplugin-lmmg2                                   2/2     Running     0             12m
    csi-cephfsplugin-provisioner-5467c6c4f-mktqg             5/5     Running     0             12m
    csi-rbdplugin-8m8kd                                      2/2     Running     1 (11m ago)   12m
    csi-rbdplugin-provisioner-fd84899c-kpv4q                 5/5     Running     0             12m
    csi-rbdplugin-z92sk                                      2/2     Running     0             12m
    mon-float-post-install-sw8qb                             0/1     Completed   0             6m5s
    mon-float-pre-install-nfj5b                              0/1     Completed   0             6m40s
    rook-ceph-crashcollector-controller-0-589f5f774-sp6zf    1/1     Running     0             7m49s
    rook-ceph-crashcollector-controller-1-68d66b9bff-zwgp9   1/1     Running     0             7m36s
    rook-ceph-exporter-controller-0-5fd477bb8-jgsdk          1/1     Running     0             7m44s
    rook-ceph-exporter-controller-1-6f5d8695b9-ndksh         1/1     Running     0             7m32s
    rook-ceph-mds-kube-cephfs-a-5f584f4bc-tbk8q              2/2     Running     0             7m49s
    rook-ceph-mgr-a-6845774cb5-lgjjd                         3/3     Running     0             9m1s
    rook-ceph-mgr-b-7fccfdf64d-4pcmc                         3/3     Running     0             9m1s
    rook-ceph-mon-a-69fd4895c7-2lfz4                         2/2     Running     0             11m
    rook-ceph-mon-b-7fd8cbb997-f84ng                         2/2     Running     0             11m
    rook-ceph-mon-float-85c4cbb7f9-k7xwj                     2/2     Running     0             6m27s
    rook-ceph-operator-69b5674578-z456r                      1/1     Running     0             13m
    rook-ceph-osd-0-5f59b5bb7b-mkwrg                         2/2     Running     0             8m17s
    rook-ceph-osd-prepare-controller-0-rhjgx                 0/1     Completed   0             8m38s
    rook-ceph-provision-5glpc                                0/1     Completed   0             6m17s
    rook-ceph-tools-7dc9678ccb-nmwwc                         1/1     Running     0             12m
    stx-ceph-manager-664f8585d8-5lt8c                        1/1     Running     0             10m
    
  5. Check ceph cluster health.

    $ ceph -s
        cluster:
            id:     c18dfe3a-9b72-46e4-bb6e-6984f131598f
            health: HEALTH_OK
    
        services:
            mon: 2 daemons, quorum a,b (age 9m)
            mgr: a(active, since 6m), standbys: b
            mds: 1/1 daemons up, 1 hot standby
            osd: 2 osds: 2 up (since 7m), 2 in (since 7m)
    
        data:
            volumes: 1/1 healthy
            pools:   4 pools, 113 pgs
            objects: 25 objects, 594 KiB
            usage:   72 MiB used, 19 GiB / 20 GiB avail
            pgs:     113 active+clean
    
        io:
            client:   1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
    
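Choosing empty disks in step 3 can be narrowed down by filtering the host-disk-list table. The following is a minimal sketch under a loudly stated assumption: it treats any disk whose available_gib column is greater than zero as a candidate, which is a heuristic and not a rule defined by StarlingX. candidate_disks is a hypothetical helper name, and the parser relies on the column order shown above.

```shell
#!/bin/sh
# candidate_disks
# Hypothetical helper: reads `system host-disk-list` table output on stdin
# and prints "uuid device_node available_gib" for each disk that still has
# available space (assumption: available_gib > 0 marks an unused disk).
candidate_disks() {
    awk -F'|' '
        NF >= 10 {
            for (i = 2; i <= 7; i++) gsub(/^ +| +$/, "", $i)
            if ($7 + 0 > 0) print $2, $3, $7
        }'
}
```

For example, system host-disk-list controller-1 | candidate_disks lists /dev/sdb and /dev/sdc for the sample output above, because /dev/sda reports 0.0 available. Always confirm the selected disk by hand before passing its UUID to system host-stor-add.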

Note

Controller-0 and controller-1 use IP multicast messaging for synchronization. If loss of synchronization occurs a few minutes after controller-1 becomes available, ensure that the switches and other devices on the management and infrastructure networks are configured with appropriate settings.

In particular, if IGMP snooping is enabled on ToR switches, then a device acting as an IGMP querier is required on the network (on the same VLAN) to prevent nodes from being dropped from the multicast group. The IGMP querier periodically sends IGMP queries to all nodes on the network, and each node sends an IGMP join or report in response. Without an IGMP querier, the nodes do not periodically send IGMP join messages after the initial join sent when the link first goes up, and they are eventually dropped from the multicast group.

Optionally Extend Capacity with Worker Nodes

This section describes the steps to extend capacity with worker nodes on a StarlingX All-in-one Duplex deployment configuration.

Install software on worker nodes

  1. Power on the worker node servers.

    Power on the worker node servers and force them to network boot with the appropriate BIOS boot options for your particular server.

  2. As the worker nodes boot, a message appears on their console instructing you to configure the personality of the node.

  3. On the console of controller-0, list hosts to see newly discovered worker node hosts (hostname=None):

    ~(keystone_admin)$ system host-list
    +----+--------------+-------------+----------------+-------------+--------------+
    | id | hostname     | personality | administrative | operational | availability |
    +----+--------------+-------------+----------------+-------------+--------------+
    | 1  | controller-0 | controller  | unlocked       | enabled     | available    |
    | 2  | controller-1 | controller  | unlocked       | enabled     | available    |
    | 3  | None         | None        | locked         | disabled    | offline      |
    | 4  | None         | None        | locked         | disabled    | offline      |
    +----+--------------+-------------+----------------+-------------+--------------+
    
  4. Using the host IDs, set the personality of each new host to ‘worker’ and assign it a hostname:

    ~(keystone_admin)$ system host-update 3 personality=worker hostname=worker-0
    ~(keystone_admin)$ system host-update 4 personality=worker hostname=worker-1
    

    This initiates software installation on the worker nodes, which can take 5-10 minutes, depending on the performance of the host machine.

    Note

    A node with Edgeworker personality is also available. See Deploy Edgeworker Nodes for details.

  5. Wait for the install of software on the worker nodes to complete, for the worker nodes to reboot, and for both to show as locked/disabled/online in ‘system host-list’.

    ~(keystone_admin)$ system host-list
    +----+--------------+-------------+----------------+-------------+--------------+
    | id | hostname     | personality | administrative | operational | availability |
    +----+--------------+-------------+----------------+-------------+--------------+
    | 1  | controller-0 | controller  | unlocked       | enabled     | available    |
    | 2  | controller-1 | controller  | unlocked       | enabled     | available    |
    | 3  | worker-0     | worker      | locked         | disabled    | online       |
    | 4  | worker-1     | worker      | locked         | disabled    | online       |
    +----+--------------+-------------+----------------+-------------+--------------+
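
Rather than re-running the command by hand, the wait can be scripted. The sketch below assumes the `system host-list` table layout shown above (id, hostname, personality, administrative, operational, availability) and an authenticated `keystone_admin` shell. The helper names `host_state` and `wait_for_hosts` and the 30-second interval are hypothetical.

```shell
#!/bin/sh
# host_state HOSTNAME
# Reads `system host-list` output on stdin and prints
# "administrative/operational/availability" for the given hostname.
host_state() {
    awk -F'|' -v host="$1" '
        NF >= 7 {
            name = $3
            gsub(/^ +| +$/, "", name)              # trim column padding
            if (name == host) {
                for (i = 5; i <= 7; i++) gsub(/ /, "", $i)
                print $5 "/" $6 "/" $7
            }
        }'
}

# wait_for_hosts STATE HOSTNAME...
# Blocks until every listed host reports STATE. There is no timeout in
# this sketch, so interrupt it manually if a host never comes up.
wait_for_hosts() {
    want=$1; shift
    for host in "$@"; do
        until [ "$(system host-list | host_state "$host")" = "$want" ]; do
            echo "waiting for $host to reach $want"
            sleep 30
        done
        echo "$host is $want"
    done
}
```

For this step the call would be `wait_for_hosts locked/disabled/online worker-0 worker-1`.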
    

Configure worker nodes

  1. The MGMT interfaces are partially set up by the network install procedure, which configures the port used for network install as the MGMT port and attaches it to the “mgmt” network.

    Complete the MGMT interface configuration of the worker nodes by also attaching the “cluster-host” network.

    for NODE in worker-0 worker-1; do
       system interface-network-assign $NODE mgmt0 cluster-host
    done
    

OpenStack-specific host configuration

Important

These steps are required only if the StarlingX OpenStack application (stx-openstack) will be installed.

  1. For OpenStack only: Assign OpenStack host labels to the worker nodes in support of installing the stx-openstack manifest and helm-charts later.

    for NODE in worker-0 worker-1; do
       system host-label-assign $NODE  openstack-compute-node=enabled
       kubectl taint nodes $NODE openstack-compute-node:NoSchedule
       system host-label-assign $NODE  openvswitch=enabled
       system host-label-assign $NODE  sriov=enabled
    done
  2. For OpenStack only: Configure the host settings for the vSwitch.

    If using the OVS-DPDK vSwitch, the default recommendation for a worker node is to use two cores on NUMA node 0 for the vSwitch, because physical NICs are typically on the first NUMA node. This should have been configured automatically; if not, run the following commands:

    for NODE in worker-0 worker-1; do
       # assign 2 cores on processor/numa-node 0 on worker-node to vswitch
       system host-cpu-modify -f vswitch -p0 2 $NODE
    done
    

    When using OVS-DPDK, it is recommended to configure 1x 1G huge page (-1G 1) for vSwitch memory on each NUMA node on the host.

    However, due to a limitation with Kubernetes, only a single huge page size is supported on any one host. If your application VMs require 2M huge pages, then configure 500x 2M huge pages (-2M 500) for vSwitch memory on each NUMA node on the host.

    for NODE in worker-0 worker-1; do
      # assign 1x 1G huge page on processor/numa-node 0 on worker-node to vswitch
      system host-memory-modify -f vswitch -1G 1 $NODE 0
      # assign 1x 1G huge page on processor/numa-node 1 on worker-node to vswitch
      system host-memory-modify -f vswitch -1G 1 $NODE 1
    done
    

    Important

    VMs created in an OVS-DPDK environment must be configured to use huge pages to enable networking and must use a flavor with property: hw:mem_page_size=large.

    Configure the huge pages for VMs in an OVS-DPDK environment on this host, assuming 1G huge page size is being used on this host, with the following commands:

    for NODE in worker-0 worker-1; do
      # assign 10x 1G huge page on processor/numa-node 0 on worker-node to applications
      system host-memory-modify -f application -1G 10 $NODE 0
      # assign 10x 1G huge page on processor/numa-node 1 on worker-node to applications
      system host-memory-modify -f application -1G 10 $NODE 1
    done
    
  3. For OpenStack only: Set up a disk partition for the nova-local volume group, needed for stx-openstack nova ephemeral disks.

    for NODE in worker-0 worker-1; do
       system host-lvg-add ${NODE} nova-local
    
       # Get UUID of DISK to create PARTITION to be added to ‘nova-local’ local volume group
       # CEPH OSD Disks can NOT be used
       # For best performance, do NOT use system/root disk, use a separate physical disk.
    
       # List host’s disks and take note of UUID of disk to be used
       system host-disk-list ${NODE}
       # ( if using ROOT DISK, select disk with device_path of
       #   'system host-show ${NODE} | fgrep rootfs'   )
    
       # Create new PARTITION on selected disk, and take note of new partition’s ‘uuid’ in response
       # The size of the PARTITION needs to be large enough to hold the aggregate size of
       # all nova ephemeral disks of all VMs that you want to be able to host on this host,
       # but is limited by the size and space available on the physical disk you chose above.
       # The following example uses a small PARTITION size such that you can fit it on the
       # root disk, if that is what you chose above.
       # Additional PARTITION(s) from additional disks can be added later if required.
       PARTITION_SIZE=30
    
       # Replace <disk-uuid> with the UUID noted from the disk list above
       system host-disk-partition-add -t lvm_phys_vol ${NODE} <disk-uuid> ${PARTITION_SIZE}
    
       # Add new partition to ‘nova-local’ local volume group
       # Replace <NEW_PARTITION_UUID> with the ‘uuid’ returned by the previous command
       system host-pv-add ${NODE} nova-local <NEW_PARTITION_UUID>
       sleep 2
    done
    
  4. For OpenStack only: Configure data interfaces for worker nodes. Data class interfaces are vswitch interfaces used by vswitch to provide VM virtio vNIC connectivity to OpenStack Neutron Tenant Networks on the underlying assigned Data Network.

    Important

    A compute-labeled worker host MUST have at least one Data class interface.

    • Configure the data interfaces for worker nodes.

      # Execute the following lines with
      ~(keystone_admin)$ export NODE=worker-0
      
      # and then repeat with
      ~(keystone_admin)$ export NODE=worker-1
      
        # List inventoried host’s ports and identify ports to be used as `data` interfaces,
        # based on displayed linux port name, pci address and device type.
        ~(keystone_admin)$ system host-port-list ${NODE}
      
        # List host’s auto-configured ‘ethernet’ interfaces,
        # find the interfaces corresponding to the ports identified in previous step, and
        # take note of their UUID
        ~(keystone_admin)$ system host-if-list -a ${NODE}
      
        # Modify configuration for these interfaces
        # Configuring them as ‘data’ class interfaces, MTU of 1500 and named data#
        ~(keystone_admin)$ system host-if-modify -m 1500 -n data0 -c data ${NODE} <data0-if-uuid>
        ~(keystone_admin)$ system host-if-modify -m 1500 -n data1 -c data ${NODE} <data1-if-uuid>
      
        # If not already created, create Data Networks that the vswitch 'data'
        # interfaces will be connected to
        ~(keystone_admin)$ DATANET0='datanet0'
        ~(keystone_admin)$ DATANET1='datanet1'
        ~(keystone_admin)$ system datanetwork-add ${DATANET0} vlan
        ~(keystone_admin)$ system datanetwork-add ${DATANET1} vlan
      
        # Assign Data Networks to Data Interfaces
        ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <data0-if-uuid> ${DATANET0}
        ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <data1-if-uuid> ${DATANET1}
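
Because the data interface commands are repeated per node while the data networks only need to exist once, it can help to separate the two. The sketch below wraps the commands from this step in two functions. The function names and the `RUN` switch are hypothetical helpers: by default `RUN=echo` only prints the commands so they can be reviewed, and setting `RUN=""` executes them. The interface UUIDs are whatever was noted from `system host-if-list -a`.

```shell
#!/bin/sh
# RUN is a hypothetical dry-run switch: "echo" prints the commands,
# an empty value (RUN="") executes them.
RUN=${RUN-echo}

DATANET0='datanet0'
DATANET1='datanet1'

# Data networks are not tied to a host, so create them once,
# not once per node.
create_datanets() {
    $RUN system datanetwork-add "$DATANET0" vlan
    $RUN system datanetwork-add "$DATANET1" vlan
}

# configure_data_ifs NODE DATA0_IF_UUID DATA1_IF_UUID
# Renames the two interfaces to data0/data1, sets class 'data' and
# MTU 1500, then attaches each interface to its data network.
configure_data_ifs() {
    node=$1; if0=$2; if1=$3
    $RUN system host-if-modify -m 1500 -n data0 -c data "$node" "$if0"
    $RUN system host-if-modify -m 1500 -n data1 -c data "$node" "$if1"
    $RUN system interface-datanetwork-assign "$node" "$if0" "$DATANET0"
    $RUN system interface-datanetwork-assign "$node" "$if1" "$DATANET1"
}
```

A typical sequence would be to call `create_datanets` once, then `configure_data_ifs` for worker-0 and again for worker-1, each time passing that node's own interface UUIDs.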
      

Optionally Configure PCI-SRIOV Interfaces

  1. Optionally, configure pci-sriov interfaces for worker nodes.

    This step is optional for Kubernetes. Do this step if using SR-IOV network attachments in hosted application containers.

    This step is optional for OpenStack. Do this step if using SR-IOV vNICs in hosted application VMs. Note that pci-sriov interfaces can have the same Data Networks assigned to them as vswitch data interfaces.

    • Configure the pci-sriov interfaces for worker nodes.

      # Execute the following lines with
      ~(keystone_admin)$ export NODE=worker-0
      # and then repeat with
      ~(keystone_admin)$ export NODE=worker-1
      
        # List inventoried host’s ports and identify ports to be used as ‘pci-sriov’ interfaces,
        # based on displayed linux port name, pci address and device type.
        ~(keystone_admin)$ system host-port-list ${NODE}
      
        # List host’s auto-configured ‘ethernet’ interfaces,
        # find the interfaces corresponding to the ports identified in previous step, and
        # take note of their UUID
        ~(keystone_admin)$ system host-if-list -a ${NODE}
      
        # Modify configuration for these interfaces
        # Configuring them as ‘pci-sriov’ class interfaces, MTU of 1500 and named sriov#
        ~(keystone_admin)$ system host-if-modify -m 1500 -n sriov0 -c pci-sriov ${NODE} <sriov0-if-uuid> -N <num_vfs>
        ~(keystone_admin)$ system host-if-modify -m 1500 -n sriov1 -c pci-sriov ${NODE} <sriov1-if-uuid> -N <num_vfs>
      
        # If not already created, create Data Networks that the 'pci-sriov'
        # interfaces will be connected to
        ~(keystone_admin)$ DATANET0='datanet0'
        ~(keystone_admin)$ DATANET1='datanet1'
        ~(keystone_admin)$ system datanetwork-add ${DATANET0} vlan
        ~(keystone_admin)$ system datanetwork-add ${DATANET1} vlan
      
        # Assign Data Networks to PCI-SRIOV Interfaces
        ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <sriov0-if-uuid> ${DATANET0}
        ~(keystone_admin)$ system interface-datanetwork-assign ${NODE} <sriov1-if-uuid> ${DATANET1}
      
    • For Kubernetes only: To enable using SR-IOV network attachments for the above interfaces in Kubernetes-hosted application containers:

      • Configure the Kubernetes SR-IOV device plugin.

        for NODE in worker-0 worker-1; do
           system host-label-assign $NODE sriovdp=enabled
        done
        
      • If planning on running DPDK in Kubernetes-hosted application containers on this host, configure the number of 1G huge pages required on both NUMA nodes.

        for NODE in worker-0 worker-1; do
           # assign 10x 1G huge page on processor/numa-node 0 on worker-node to applications
           system host-memory-modify -f application $NODE 0 -1G 10
        
           # assign 10x 1G huge page on processor/numa-node 1 on worker-node to applications
           system host-memory-modify -f application $NODE 1 -1G 10
        done
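
When choosing huge page counts, a quick calculation shows how much memory a setting reserves. The sketch below is plain shell arithmetic, assuming a 1G page is 1024 MiB and a 2M page is 2 MiB. The function names are hypothetical. It reproduces the 500x 2M figure suggested earlier for vSwitch memory and the total reserved by 10x 1G pages on two NUMA nodes.

```shell
#!/bin/sh
# pages_needed TARGET_MIB PAGE_MIB
# Number of huge pages of PAGE_MIB needed to cover TARGET_MIB (rounded up).
pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# total_mib COUNT PAGE_MIB NUMA_NODES
# Memory reserved when COUNT pages are configured on each NUMA node.
total_mib() {
    echo $(( $1 * $2 * $3 ))
}

pages_needed 1000 2     # prints 500: 500x 2M pages cover 1000 MiB
total_mib 10 1024 2     # prints 20480: 10x 1G pages on 2 NUMA nodes
```

The second result means the example above sets aside 20 GiB per worker for application huge pages, so the host needs at least that much memory free beyond platform and vSwitch needs.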
        

Unlock worker nodes

Unlock worker nodes in order to bring them into service:

for NODE in worker-0 worker-1; do
   system host-unlock $NODE
done

The worker nodes will reboot to apply configuration changes and come into service. This can take 5-10 minutes, depending on the performance of the host machine.
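
To confirm that the workers have come into service, their availability can be polled. This is a minimal sketch, assuming `system host-show` prints a two-column property/value table that includes an `availability` row, and that the shell is already authenticated as `keystone_admin`. The helper names and the 30-second polling interval are hypothetical.

```shell
#!/bin/sh
# host_field FIELD
# Reads `system host-show` output on stdin and prints the value of the
# row whose property column equals FIELD.
host_field() {
    awk -F'|' -v field="$1" '
        NF >= 4 {
            key = $2; val = $3
            gsub(/^ +| +$/, "", key)               # trim column padding
            gsub(/^ +| +$/, "", val)
            if (key == field) print val
        }'
}

# wait_available TIMEOUT_SECONDS HOSTNAME...
# Returns 0 once every host is "available", or 1 if a host is still
# not available after TIMEOUT_SECONDS.
wait_available() {
    timeout=$1; shift
    for host in "$@"; do
        elapsed=0
        until [ "$(system host-show "$host" | host_field availability)" = "available" ]; do
            if [ "$elapsed" -ge "$timeout" ]; then
                echo "$host not available after ${timeout}s" >&2
                return 1
            fi
            sleep 30
            elapsed=$((elapsed + 30))
        done
        echo "$host is available"
    done
}
```

Given the 5-10 minute estimate above, a call such as `wait_available 900 worker-0 worker-1` leaves some margin before giving up.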

Complete system configuration by reviewing procedures in: