Rook Migration Methods

Overview

StarlingX offers multiple migration options for moving from Bare Metal Ceph to Rook Ceph. Every migration method introduces an application outage: removing Bare Metal Ceph and deploying Rook Ceph cannot be done without a service interruption.

Note

Before the migration starts, PVCs must not be attached to any running pods. Users must scale down their applications and wait until the migration is complete before scaling them back up. For other prerequisites, see Rook Migration Prerequisites.
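As a sketch of the scale-down step, assuming a Deployment named my-app in the default namespace (both placeholder names, not part of StarlingX):

```shell
# Scale the workload to zero replicas so its PVCs are no longer
# attached to any running pod ("my-app" and "default" are placeholders).
kubectl scale deployment my-app --namespace default --replicas=0

# Wait until all pods of the workload have terminated.
kubectl wait --for=delete pod --selector app=my-app --namespace default --timeout=300s

# After the migration completes, restore the original replica count.
kubectl scale deployment my-app --namespace default --replicas=3
```

Repeat for every Deployment or StatefulSet that mounts a Ceph-backed PVC.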

AIO-SX, AIO-DX, AIO-DX with Worker nodes, and Standard Systems with Controller Storage support the following migration methods.

Note

Standard Systems with Dedicated Storage Nodes support only Cluster Redeploy because the storage resides on dedicated nodes. This method wipes all storage devices and rebuilds the Ceph environment using Rook Ceph. No user data is preserved.

In-Service Migration

Preserves user data while converting OSDs from Filestore to Bluestore and transitioning storage to Rook Ceph with controlled downtime.

For AIO-SX systems:

  • At least two OSDs are required to proceed with the migration.

  • When operating with replica 1, the system must have sufficient free space to take an OSD out of service (mark it out/down).

For AIO-DX and Standard Systems:

  • Since these configurations always operate with at least replica 2, an entire host can be wiped and migrated in a single step.
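One way to verify OSD count and free capacity before starting, using standard Ceph status commands:

```shell
# Per-OSD utilization and layout; on AIO-SX with replica 1, the
# remaining OSDs must have enough free space to absorb the data of
# the OSD being taken out of service.
ceph osd df tree

# Overall cluster capacity and per-pool usage.
ceph df
```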

Procedure

  1. Migrate all disks from Filestore to Bluestore while the system is still running on Bare Metal Ceph.

  2. Remove Bare Metal Ceph and back up the cluster configuration data.

  3. Deploy a Rook Ceph Pacific (v16) cluster and rebuild the keyrings and mon database (monstore) using the old Bare Metal Ceph data.

  4. Rebuild the filesystem in the Pacific-based Rook Ceph cluster.

  5. Remove the Pacific Rook Ceph deployment while retaining the cluster configuration.

  6. Install the Rook Ceph application running Reef (v18) and rebuild the keyrings and mon database (monstore) using the Pacific data.

  7. Rebuild the filesystem in the Reef-based Rook Ceph cluster.

  8. Recreate the persistent volumes.
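Step 1 follows the standard Ceph procedure for converting an OSD from Filestore to Bluestore in place. A minimal sketch, assuming a hypothetical OSD id and backing device (adjust both for your layout):

```shell
# Placeholder OSD id and device.
OSD_ID=0
DEV=/dev/sdb

# Confirm the current objectstore backend of the OSD.
ceph osd metadata ${OSD_ID} | grep osd_objectstore

# Drain the OSD and wait until Ceph reports it is safe to destroy
# (i.e., its data has been replicated elsewhere).
ceph osd out ${OSD_ID}
while ! ceph osd safe-to-destroy osd.${OSD_ID}; do sleep 60; done

# Destroy the Filestore OSD and recreate it as Bluestore,
# reusing the same OSD id.
systemctl stop ceph-osd@${OSD_ID}
ceph osd destroy ${OSD_ID} --yes-i-really-mean-it
ceph-volume lvm zap ${DEV}
ceph-volume lvm create --bluestore --data ${DEV} --osd-id ${OSD_ID}
```

On AIO-SX with replica 1 this is done one OSD at a time, which is why the cluster must repeatedly settle and the migration can take long.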

Export/Import Migration

Preserves user data by exporting all Ceph data, redeploying Rook Ceph, then restoring the data. This method provides a clean storage redeployment while still maintaining data continuity.

Procedure

  1. Export the RBD and CephFS data while running on Bare Metal Ceph, compressing the data as part of the process.

  2. Remove Bare Metal Ceph and save the configuration details used by the deployment (monitors, OSD layout, and so on).

  3. Install the Rook Ceph application running Reef (v18), using the same configuration parameters previously used by Bare Metal Ceph.

  4. Recreate the persistent volumes.

  5. Import the compressed RBD and CephFS backups in the toolbox pod, decompressing the data during the import process.
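The export and import steps can be sketched with the standard rbd and tar tools, piping through gzip so the backup is compressed on the fly. Pool, image, and path names below are placeholders:

```shell
# Placeholder backup location on the active controller's root disk.
BACKUP_DIR=/opt/backups

# --- On Bare Metal Ceph, before removal ---
# Export an RBD image to stdout and compress it.
rbd export kube-rbd/my-image - | gzip -c > ${BACKUP_DIR}/my-image.rbd.gz

# Archive CephFS contents from a mounted filesystem path.
tar -czf ${BACKUP_DIR}/cephfs.tar.gz -C /mnt/cephfs .

# --- After Rook Ceph is deployed and PVs are recreated ---
# Import the RBD image from the toolbox pod, decompressing on the fly.
gunzip -c ${BACKUP_DIR}/my-image.rbd.gz | rbd import - kube-rbd/my-image

# Restore CephFS contents.
tar -xzf ${BACKUP_DIR}/cephfs.tar.gz -C /mnt/cephfs
```

Because each image and filesystem is processed sequentially through a single pipe, this method offers no parallelism and slows down as cluster data grows.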

Cluster Redeploy Migration

Performs a fast and clean transition to Rook Ceph without preserving user data. All storage devices are wiped, and the user must redeploy applications that interact with PVs/PVCs.

Procedure

  1. Remove the existing Bare Metal Ceph environment. Since this migration does not preserve data, no configuration backup is necessary.

  2. Deploy the Rook Ceph application running Reef (v18).

  3. Recreate all PVCs that will be used by applications.

  4. For Standard Systems with Dedicated Storage Nodes, after the migration, existing storage nodes are reinstalled as worker nodes. These new workers are configured with resource reservations that allocate as much capacity as possible to the platform.

    Details about the resource reservation of these new workers:

    • Processor 0:

      - Reserve two-thirds of the platform memory
      - Reserve all cores except one for the platform

    • Processor 1 (if present):

      - Reserve 1 GB of platform memory
      - Reserve 1 core for the platform
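Since no data is preserved, the storage devices only need to be wiped clean so Rook Ceph can claim them as fresh OSDs. A minimal sketch, with a placeholder device and manifest name:

```shell
# Placeholder device; repeat for every disk previously used as an OSD.
DEV=/dev/sdb

# Wipe partition tables and filesystem signatures so Rook Ceph
# treats the disk as unused.
sgdisk --zap-all ${DEV}
wipefs --all ${DEV}

# After the Rook Ceph application is applied, recreate the PVCs from
# the manifests originally used to deploy the workloads.
kubectl apply -f my-app-pvcs.yaml   # placeholder manifest name
```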

Advantages and Disadvantages of Migration Methods

In-Service Migration

  Advantages:

  • Keeps user data.

  • Uses Ceph's fast internal recovery mechanism to move data efficiently, especially with replica size ≥ 2.

  Disadvantages:

  • AIO-SX requires at least two OSDs for migration.

  • Migration duration can be long because multiple steps require the cluster to settle; this is especially slow with replica size 1 (AIO-SX).

  • Requires multiple Rook Ceph versions: migration cannot go directly from Bare Metal Ceph (Nautilus) to Rook Ceph Reef, so a temporary Rook Ceph Pacific deployment is needed.

Export/Import Migration

  Advantages:

  • Upgrades directly to the latest Rook Ceph version.

  • Suitable for smaller clusters.

  • Preserves user data through backup.

  Disadvantages:

  • Compression and decompression times can significantly affect migration speed, depending on the amount of data in the cluster.

  • Requires additional free space on the active controller's root disk to store the backup. The required space depends on the total data in the cluster; if the data compresses poorly, the root disk must have enough free space to hold the entire cluster's data.

  • Ceph data is exported and imported without parallelism, making this method slower on large clusters.

Cluster Redeploy Migration

  Advantages:

  • Uses the latest Rook Ceph version directly.

  • Fastest migration option.

  Disadvantages:

  • User data is discarded.

  • User must redeploy all workloads.