Bare metal Standard with Controller Storage R2.0

Description

The Standard with Controller Storage deployment option provides two high availability (HA) controller nodes and a pool of up to 10 compute nodes.

A Standard with Controller Storage configuration provides the following benefits:

  • A pool of up to 10 compute nodes

  • High availability (HA) services run across the controller nodes in either active/active or active/standby mode

  • A storage back end solution using a two-node Ceph deployment across two controller servers

  • Protection against overall controller and compute node failure, where

    • On overall controller node failure, all controller HA services go active on the remaining healthy controller node

    • On overall compute node failure, virtual machines and containers are recovered on the remaining healthy compute nodes

Standard with Controller Storage deployment configuration

Figure 1: Standard with Controller Storage deployment configuration

Note

By default, StarlingX uses IPv4. To use StarlingX with IPv6:

  • The entire infrastructure and cluster configuration must be IPv6, with the exception of the PXE boot network.

  • Not all external servers are reachable via IPv6 addresses (e.g. Docker registries). Depending on your infrastructure, it may be necessary to deploy a NAT64/DNS64 gateway to translate the IPv4 addresses to IPv6.

Hardware requirements

The recommended minimum requirements for the bare metal servers for the various host types are:

Number of Servers

  • Controller Node: 2

  • Compute Node: 2-10

Minimum Processor Class (both node types)

  • Dual-CPU Intel® Xeon® E5 26xx family (SandyBridge), 8 cores/socket

Minimum Memory

  • Controller Node: 64 GB

  • Compute Node: 32 GB

Primary Disk

  • Controller Node: 500 GB SSD or NVMe

  • Compute Node: 120 GB (minimum 10K RPM)

Additional Disks

  • Controller Node:

    • 1 or more 500 GB (min. 10K RPM) disks for Ceph OSD

    • Recommended, but not required: 1 or more SSDs or NVMe drives for Ceph journals (min. 1024 MiB per OSD journal)

  • Compute Node:

    • For OpenStack, 1 or more 500 GB (min. 10K RPM) disks for VM local ephemeral storage are recommended

Minimum Network Ports

  • Controller Node: Mgmt/Cluster: 1x10GE; OAM: 1x1GE

  • Compute Node: Mgmt/Cluster: 1x10GE; Data: 1 or more x 10GE

BIOS Settings (both node types)

  • Hyper-Threading technology enabled

  • Virtualization technology enabled

  • VT for directed I/O enabled

  • CPU power and performance policy set to performance

  • CPU C state control disabled

  • Plug & play BMC detection disabled

Prepare Servers

Prior to starting the StarlingX installation, the bare metal servers must be in the following condition:

  • Physically installed

  • Cabled for power

  • Cabled for networking

    • Far-end switch ports should be properly configured to realize the networking shown in Figure 1.

  • All disks wiped (see the example sketch after this list)

    • This ensures that servers boot from either the network or USB storage (if present)

  • Powered off
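
For example, if a server was previously installed, one way to wipe its disks is to boot it from a Linux live/rescue image and clear the partition tables. This is only a hedged sketch; the device names below are placeholders for the disks actually present in your server:

# WARNING: this destroys all data on the listed disks.
sudo wipefs -a /dev/sda /dev/sdb
# If gdisk is available, also clear any GPT structures:
sudo sgdisk --zap-all /dev/sda
sudo sgdisk --zap-all /dev/sdb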

StarlingX Kubernetes

Installing StarlingX Kubernetes

Create a bootable USB with the StarlingX ISO

Create a bootable USB with the StarlingX ISO.

Refer to Create Bootable USB for instructions on how to create a bootable USB on your system.
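
For example, on a Linux workstation you can write the ISO directly to the USB device. This is a hedged sketch only; bootimage.iso and /dev/sdX are placeholders for your downloaded ISO and USB device, and the linked guide is the authoritative procedure:

# WARNING: double-check the target device; this overwrites it completely.
sudo dd if=bootimage.iso of=/dev/sdX bs=4M status=progress oflag=direct
sync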

Install software on controller-0

  1. Insert the bootable USB into a bootable USB port on the host you are configuring as controller-0.

  2. Power on the host.

  3. Attach to a console, ensure the host boots from the USB, and wait for the StarlingX Installer Menus.

  4. Make the following menu selections in the installer:

    1. First menu: Select ‘Standard Controller Configuration’

    2. Second menu: Select ‘Graphical Console’ or ‘Textual Console’ depending on your terminal access to the console port

    3. Third menu: Select ‘Standard Security Profile’

  5. Wait for non-interactive install of software to complete and server to reboot. This can take 5-10 minutes, depending on the performance of the server.

Bootstrap system on controller-0

  1. Log in using the username / password of “sysadmin” / “sysadmin”. When logging in for the first time, you will be forced to change the password.

    Login: sysadmin
    Password:
    Changing password for sysadmin.
    (current) UNIX Password: sysadmin
    New Password:
    (repeat) New Password:
    
  2. External connectivity is required to run the Ansible bootstrap playbook. The StarlingX boot image will DHCP out all interfaces, so the server may already have obtained an IP address and external IP connectivity if a DHCP server is present in your environment. Verify this using the ip addr and ping 8.8.8.8 commands.

    Otherwise, manually configure an IP address and default IP route. Use the PORT, IP-ADDRESS/SUBNET-LENGTH and GATEWAY-IP-ADDRESS applicable to your deployment environment.

    sudo ip address add <IP-ADDRESS>/<SUBNET-LENGTH> dev <PORT>
    sudo ip link set up dev <PORT>
    sudo ip route add default via <GATEWAY-IP-ADDRESS> dev <PORT>
    ping 8.8.8.8
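
    For example, with hypothetical values (a port named enp0s3, address 10.10.10.3/24, and gateway 10.10.10.1):

    sudo ip address add 10.10.10.3/24 dev enp0s3
    sudo ip link set up dev enp0s3
    sudo ip route add default via 10.10.10.1 dev enp0s3
    ping 8.8.8.8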
    
  3. Specify user configuration overrides for the Ansible bootstrap playbook.

    Ansible is used to bootstrap StarlingX on controller-0:

    • The default Ansible inventory file, /etc/ansible/hosts, contains a single host: localhost.

    • The Ansible bootstrap playbook is at: /usr/share/ansible/stx-ansible/playbooks/bootstrap/bootstrap.yml

    • The default configuration values for the bootstrap playbook are in: /usr/share/ansible/stx-ansible/playbooks/bootstrap/host_vars/default.yml

    • By default, Ansible looks for and imports user configuration override files for hosts in the sysadmin home directory ($HOME), for example: $HOME/<hostname>.yml

    Specify the user configuration override file for the Ansible bootstrap playbook by either:

    • Copying the default.yml file listed above to $HOME/localhost.yml and editing the configurable values as desired, based on the commented instructions in the file.

    or

    • Creating the minimal user configuration override file as shown in the example below, using the OAM IP SUBNET and IP ADDRESSing applicable to your deployment environment:

      cd ~
      cat <<EOF > localhost.yml
      system_mode: standard
      
      dns_servers:
        - 8.8.8.8
        - 8.8.4.4
      
      external_oam_subnet: <OAM-IP-SUBNET>/<OAM-IP-SUBNET-LENGTH>
      external_oam_gateway_address: <OAM-GATEWAY-IP-ADDRESS>
      external_oam_floating_address: <OAM-FLOATING-IP-ADDRESS>
      external_oam_node_0_address: <OAM-CONTROLLER-0-IP-ADDRESS>
      external_oam_node_1_address: <OAM-CONTROLLER-1-IP-ADDRESS>
      
      admin_username: admin
      admin_password: <sysadmin-password>
      ansible_become_pass: <sysadmin-password>
      EOF
      

    If you are using IPv6, provide IPv6 configuration overrides. Note that all addressing, except pxeboot_subnet, should be updated to IPv6 addressing. Example IPv6 override values are shown below:

    dns_servers:
      - 2001:4860:4860::8888
      - 2001:4860:4860::8844
    pxeboot_subnet: 169.254.202.0/24
    management_subnet: 2001:db8:2::/64
    cluster_host_subnet: 2001:db8:3::/64
    cluster_pod_subnet: 2001:db8:4::/64
    cluster_service_subnet: 2001:db8:4::/112
    external_oam_subnet: 2001:db8:1::/64
    external_oam_gateway_address: 2001:db8::1
    external_oam_floating_address: 2001:db8::2
    external_oam_node_0_address: 2001:db8::3
    external_oam_node_1_address: 2001:db8::4
    management_multicast_subnet: ff08::1:1:0/124
    

    Note that the external_oam_node_0_address and external_oam_node_1_address parameters are required for this Standard with Controller Storage configuration; they can be omitted only for an AIO-SX installation.

  4. Run the Ansible bootstrap playbook:

    ansible-playbook /usr/share/ansible/stx-ansible/playbooks/bootstrap/bootstrap.yml
    

    Wait for Ansible bootstrap playbook to complete. This can take 5-10 minutes, depending on the performance of the host machine.

Configure controller-0

  1. Acquire admin credentials:

    source /etc/platform/openrc
    
  2. Configure the OAM and MGMT interfaces of controller-0 and specify the attached networks. Use the OAM and MGMT port names, e.g. eth0, applicable to your deployment environment.

    OAM_IF=<OAM-PORT>
    MGMT_IF=<MGMT-PORT>
    system host-if-modify controller-0 lo -c none
    IFNET_UUIDS=$(system interface-network-list controller-0 | awk '{if ($6=="lo") print $4;}')
    for UUID in $IFNET_UUIDS; do
        system interface-network-remove ${UUID}
    done
    system host-if-modify controller-0 $OAM_IF -c platform
    system interface-network-assign controller-0 $OAM_IF oam
    system host-if-modify controller-0 $MGMT_IF -c platform
    system interface-network-assign controller-0 $MGMT_IF mgmt
    system interface-network-assign controller-0 $MGMT_IF cluster-host
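    # Optional sanity check: list the interface-network assignments that were
    # just created (this command is also used earlier in this step).
    system interface-network-list controller-0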
    
  3. Configure NTP Servers for network time synchronization:

    system ntp-modify ntpservers=0.pool.ntp.org,1.pool.ntp.org
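    # Optional: confirm the configured servers. 'system ntp-show' is assumed to
    # be available in your release; adjust if your CLI differs.
    system ntp-show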
    
OpenStack-specific host configuration

Warning

The following configuration is required only if the StarlingX OpenStack application (stx-openstack) will be installed.

  1. For OpenStack only: Assign OpenStack host labels to controller-0 in support of installing the stx-openstack manifest and helm-charts later.

    system host-label-assign controller-0 openstack-control-plane=enabled
    
  2. For OpenStack only: Configure the system setting for the vSwitch.

    StarlingX has OVS (kernel-based) vSwitch configured as default:

    • Runs in a container; defined within the helm charts of stx-openstack manifest.

    • Shares the core(s) assigned to the platform.

    If you require better performance, OVS-DPDK should be used:

    • Runs directly on the host (it is NOT containerized).

    • Requires that at least 1 core be assigned/dedicated to the vSwitch function.

    To deploy the default containerized OVS:

    system modify --vswitch_type none
    

    Do not run any vSwitch directly on the host; instead, use the containerized OVS defined in the helm charts of the stx-openstack manifest.

    To deploy OVS-DPDK (OVS with the Data Plane Development Kit, which is supported only on bare metal hardware), run the following command:

    system modify --vswitch_type ovs-dpdk
    system host-cpu-modify -f vswitch -p0 1 controller-0
    

    Once vswitch_type is set to OVS-DPDK, any subsequent nodes created will default to automatically assigning 1 vSwitch core for AIO controllers and 2 vSwitch cores for computes.

    When using OVS-DPDK, virtual machines must be configured to use a flavor with property hw:mem_page_size=large (see the example after the note below).

    Note

    After controller-0 is unlocked, changing vswitch_type requires locking and unlocking all computes (and/or AIO controllers) to apply the change.
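
    For example, once OpenStack is installed, the flavor property can be set with the standard OpenStack client. The flavor name below is hypothetical; use a flavor appropriate to your environment:

    openstack flavor set m1.small.dpdk --property hw:mem_page_size=large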

Unlock controller-0

Unlock controller-0 in order to bring it into service:

system host-unlock controller-0

Controller-0 will reboot in order to apply configuration changes and come into service. This can take 5-10 minutes, depending on the performance of the host machine.

Install software on controller-1 and compute nodes

  1. Power on the controller-1 server and force it to network boot with the appropriate BIOS boot options for your particular server.

  2. As controller-1 boots, a message appears on its console instructing you to configure the personality of the node.

  3. On the console of controller-0, list hosts to see the newly discovered controller-1 host (i.e., the host with hostname None):

    system host-list
    +----+--------------+-------------+----------------+-------------+--------------+
    | id | hostname     | personality | administrative | operational | availability |
    +----+--------------+-------------+----------------+-------------+--------------+
    | 1  | controller-0 | controller  | unlocked       | enabled     | available    |
    | 2  | None         | None        | locked         | disabled    | offline      |
    +----+--------------+-------------+----------------+-------------+--------------+
    
  4. Using the host id, set the personality of this host to ‘controller’:

    system host-update 2 personality=controller
    

    This initiates the install of software on controller-1. This can take 5-10 minutes, depending on the performance of the host machine.

  5. While waiting, repeat the same procedure for the compute-0 and compute-1 servers, except set the personality to ‘worker’ and assign a unique hostname, as shown below:

    system host-update 3 personality=worker hostname=compute-0
    system host-update 4 personality=worker hostname=compute-1
    
  6. Wait for the software installation on controller-1, compute-0, and compute-1 to complete, for all servers to reboot, and for all to show as locked/disabled/online in ‘system host-list’.

    system host-list
    
    +----+--------------+-------------+----------------+-------------+--------------+
    | id | hostname     | personality | administrative | operational | availability |
    +----+--------------+-------------+----------------+-------------+--------------+
    | 1  | controller-0 | controller  | unlocked       | enabled     | available    |
    | 2  | controller-1 | controller  | locked         | disabled    | online       |
    | 3  | compute-0    | compute     | locked         | disabled    | online       |
    | 4  | compute-1    | compute     | locked         | disabled    | online       |
    +----+--------------+-------------+----------------+-------------+--------------+
    

Configure controller-1

Configure the OAM and MGMT interfaces of controller-1 and specify the attached networks. Use the OAM and MGMT port names, e.g. eth0, applicable to your deployment environment.

(Note that the MGMT interface is partially set up automatically by the network install procedure.)

OAM_IF=<OAM-PORT>
MGMT_IF=<MGMT-PORT>
system host-if-modify controller-1 $OAM_IF -c platform
system interface-network-assign controller-1 $OAM_IF oam
system interface-network-assign controller-1 $MGMT_IF cluster-host

OpenStack-specific host configuration

Warning

The following configuration is required only if the StarlingX OpenStack application (stx-openstack) will be installed.

For OpenStack only: Assign OpenStack host labels to controller-1 in support of installing the stx-openstack manifest and helm-charts later.

system host-label-assign controller-1 openstack-control-plane=enabled

Unlock controller-1

Unlock controller-1 in order to bring it into service:

system host-unlock controller-1

Controller-1 will reboot in order to apply configuration changes and come into service. This can take 5-10 minutes, depending on the performance of the host machine.

Configure compute nodes

  1. Add the third Ceph monitor to compute-0:

    (The first two Ceph monitors are automatically assigned to controller-0 and controller-1.)

    system ceph-mon-add compute-0
    
  2. Wait for the compute node monitor to complete configuration:

    system ceph-mon-list
    +--------------------------------------+-------+--------------+------------+------+
    | uuid                                 | ceph_ | hostname     | state      | task |
    |                                      | mon_g |              |            |      |
    |                                      | ib    |              |            |      |
    +--------------------------------------+-------+--------------+------------+------+
    | 64176b6c-e284-4485-bb2a-115dee215279 | 20    | controller-1 | configured | None |
    | a9ca151b-7f2c-4551-8167-035d49e2df8c | 20    | controller-0 | configured | None |
    | f76bc385-190c-4d9a-aa0f-107346a9907b | 20    | compute-0    | configured | None |
    +--------------------------------------+-------+--------------+------------+------+
    
  3. Assign the cluster-host network to the MGMT interface for the compute nodes:

    (Note that the MGMT interfaces are partially set up automatically by the network install procedure.)

    for COMPUTE in compute-0 compute-1; do
       system interface-network-assign $COMPUTE mgmt0 cluster-host
    done
    
  4. Configure data interfaces for compute nodes. Use the DATA port names, e.g. eth0, applicable to your deployment environment.

    Note

    This step is required for OpenStack and optional for Kubernetes. For example, do this step if using SRIOV network attachments in application containers.

    For Kubernetes SRIOV network attachments:

    • Configure SRIOV device plugin:

      for COMPUTE in compute-0 compute-1; do
         system host-label-assign ${COMPUTE} sriovdp=enabled
      done
      
    • If planning on running DPDK in containers on these compute hosts, configure the number of 1G huge pages required on both NUMA nodes (a verification sketch follows this loop):

      for COMPUTE in compute-0 compute-1; do
         system host-memory-modify ${COMPUTE} 0 -1G 100
         system host-memory-modify ${COMPUTE} 1 -1G 100
      done
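
      The configured 1G huge pages can be reviewed per NUMA node with the same system CLI, for example (assuming the host-memory-list subcommand is available in your release):

      for COMPUTE in compute-0 compute-1; do
         system host-memory-list ${COMPUTE}
      done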
      

    For both Kubernetes and OpenStack:

    DATA0IF=<DATA-0-PORT>
    DATA1IF=<DATA-1-PORT>
    PHYSNET0='physnet0'
    PHYSNET1='physnet1'
    SPL=/tmp/tmp-system-port-list
    SPIL=/tmp/tmp-system-host-if-list

    # Configure the datanetworks in sysinv, prior to referencing them
    # in the 'system interface-datanetwork-assign' commands below.
    system datanetwork-add ${PHYSNET0} vlan
    system datanetwork-add ${PHYSNET1} vlan

    for COMPUTE in compute-0 compute-1; do
      echo "Configuring interface for: $COMPUTE"
      set -ex
      system host-port-list ${COMPUTE} --nowrap > ${SPL}
      system host-if-list -a ${COMPUTE} --nowrap > ${SPIL}
      DATA0PCIADDR=$(cat $SPL | grep $DATA0IF | awk '{print $8}')
      DATA1PCIADDR=$(cat $SPL | grep $DATA1IF | awk '{print $8}')
      DATA0PORTUUID=$(cat $SPL | grep ${DATA0PCIADDR} | awk '{print $2}')
      DATA1PORTUUID=$(cat $SPL | grep ${DATA1PCIADDR} | awk '{print $2}')
      DATA0PORTNAME=$(cat $SPL | grep ${DATA0PCIADDR} | awk '{print $4}')
      DATA1PORTNAME=$(cat $SPL | grep ${DATA1PCIADDR} | awk '{print $4}')
      DATA0IFUUID=$(cat $SPIL | awk -v DATA0PORTNAME=$DATA0PORTNAME '($12 ~ DATA0PORTNAME) {print $2}')
      DATA1IFUUID=$(cat $SPIL | awk -v DATA1PORTNAME=$DATA1PORTNAME '($12 ~ DATA1PORTNAME) {print $2}')
      system host-if-modify -m 1500 -n data0 -c data ${COMPUTE} ${DATA0IFUUID}
      system host-if-modify -m 1500 -n data1 -c data ${COMPUTE} ${DATA1IFUUID}
      system interface-datanetwork-assign ${COMPUTE} ${DATA0IFUUID} ${PHYSNET0}
      system interface-datanetwork-assign ${COMPUTE} ${DATA1IFUUID} ${PHYSNET1}
      set +ex
    done
    
OpenStack-specific host configuration

Warning

The following configuration is required only if the StarlingX OpenStack application (stx-openstack) will be installed.

  1. For OpenStack only: Assign OpenStack host labels to the compute nodes in support of installing the stx-openstack manifest and helm-charts later.

    for NODE in compute-0 compute-1; do
      system host-label-assign $NODE  openstack-compute-node=enabled
      system host-label-assign $NODE  openvswitch=enabled
      system host-label-assign $NODE  sriov=enabled
    done
    
  2. For OpenStack only: Set up a disk partition for the nova-local volume group, which is needed for stx-openstack nova ephemeral disks.

    for COMPUTE in compute-0 compute-1; do
      echo "Configuring Nova local for: $COMPUTE"
      ROOT_DISK=$(system host-show ${COMPUTE} | grep rootfs | awk '{print $4}')
      ROOT_DISK_UUID=$(system host-disk-list ${COMPUTE} --nowrap | grep ${ROOT_DISK} | awk '{print $2}')
      PARTITION_SIZE=10
      NOVA_PARTITION=$(system host-disk-partition-add -t lvm_phys_vol ${COMPUTE} ${ROOT_DISK_UUID} ${PARTITION_SIZE})
      NOVA_PARTITION_UUID=$(echo ${NOVA_PARTITION} | grep -ow "| uuid | [a-z0-9\-]* |" | awk '{print $4}')
      system host-lvg-add ${COMPUTE} nova-local
      system host-pv-add ${COMPUTE} nova-local ${NOVA_PARTITION_UUID}

      # Wait for this host's partition before moving on; checking it inside the
      # loop ensures the UUID matches the host being queried.
      echo ">>> Wait for partition $NOVA_PARTITION_UUID to be ready."
      while true; do system host-disk-partition-list $COMPUTE --nowrap | grep $NOVA_PARTITION_UUID | grep Ready; if [ $? -eq 0 ]; then break; fi; sleep 1; done
    done
    

Unlock compute nodes

Unlock compute nodes in order to bring them into service:

for COMPUTE in compute-0 compute-1; do
   system host-unlock $COMPUTE
done

The compute nodes will reboot in order to apply configuration changes and come into service. This can take 5-10 minutes, depending on the performance of the host machine.
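
Once the reboots complete, you can confirm that all hosts now report unlocked/enabled/available using the same command shown earlier:

system host-list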

Add Ceph OSDs to controllers

  1. Add OSDs to controller-0:

    HOST=controller-0
    DISKS=$(system host-disk-list ${HOST})
    TIERS=$(system storage-tier-list ceph_cluster)
    OSDs="/dev/sdb"
    for OSD in $OSDs; do
       system host-stor-add ${HOST} $(echo "$DISKS" | grep "$OSD" | awk '{print $2}') --tier-uuid $(echo "$TIERS" | grep storage | awk '{print $2}')
       while true; do system host-stor-list ${HOST} | grep ${OSD} | grep configuring; if [ $? -ne 0 ]; then break; fi; sleep 1; done
    done
    
    system host-stor-list $HOST
    
  2. Add OSDs to controller-1:

    HOST=controller-1
    DISKS=$(system host-disk-list ${HOST})
    TIERS=$(system storage-tier-list ceph_cluster)
    OSDs="/dev/sdb"
    for OSD in $OSDs; do
        system host-stor-add ${HOST} $(echo "$DISKS" | grep "$OSD" | awk '{print $2}') --tier-uuid $(echo "$TIERS" | grep storage | awk '{print $2}')
        while true; do system host-stor-list ${HOST} | grep ${OSD} | grep configuring; if [ $? -ne 0 ]; then break; fi; sleep 1; done
    done
    
    system host-stor-list $HOST
    

Your Kubernetes cluster is up and running.
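
As an optional storage sanity check, you can also query Ceph cluster health on the active controller. This is a hedged example; the ceph client is normally available on StarlingX controllers:

ceph -s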

Access StarlingX Kubernetes

Use local/remote CLIs, GUIs, and/or REST APIs to access and manage StarlingX Kubernetes and hosted containerized applications. Refer to details on accessing the StarlingX Kubernetes cluster in the Access StarlingX Kubernetes guide.
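
For a quick check from controller-0, a minimal sketch (assuming kubectl has been configured with admin credentials as described in that guide):

kubectl get nodes -o wide
kubectl get pods --all-namespaces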

StarlingX OpenStack

Install StarlingX OpenStack

Other than the OpenStack-specific configurations required in the underlying StarlingX/Kubernetes infrastructure (described in the installation steps for the StarlingX Kubernetes platform above), the installation of containerized OpenStack for StarlingX is independent of deployment configuration. Refer to the Install OpenStack guide for installation instructions.

Access StarlingX OpenStack

Use local/remote CLIs, GUIs and/or REST APIs to access and manage StarlingX OpenStack and hosted virtualized applications. Refer to details on accessing StarlingX OpenStack in the Access StarlingX OpenStack guide.

Uninstall StarlingX OpenStack

Refer to the Uninstall OpenStack guide for instructions on how to uninstall and delete the OpenStack application.