Performance Configurations on Rook Ceph

When using Rook Ceph, it is important to consider resource allocation and configuration adjustments to ensure optimal performance. Rook introduces additional management overhead compared to a traditional bare-metal Ceph setup and needs more infrastructure resources.

Consequently, increasing the number of platform cores will improve I/O performance for OSD, monitor and MDS pods.

Increasing the number of OSDs in the cluster can also improve performance, reducing the load on individual disks and enhancing throughput.

For memory, Ceph's default for each OSD is 4GB. Decreasing it below 4GB is not recommended, although the system can still operate with as little as 2GB.

Another factor to consider is the size of the data blocks. Reading and writing data in small blocks can degrade Ceph's performance, especially during high-frequency operations.

Pod resource limit tuning

To check the current OSD memory limits:

$ helm get values -n rook-ceph rook-ceph-cluster -o yaml | grep ' osd:' -A2

To improve OSD read/write performance, you can allocate more memory to the OSDs. First create an override file with the new limit:

$ cat << EOF > limit_override.yml
cephClusterSpec:
  resources:
    osd:
      limits:
        memory: <value>
EOF

Make sure to provide the parameter with the correct unit, e.g., 4Gi.

Then reapply the override:

~(keystone_admin)$ system helm-override-update rook-ceph rook-ceph-cluster rook-ceph --values limit_override.yml

Note

The settings applied using helm-override-update remain active until the Rook-Ceph application is deleted. If the application is deleted and reinstalled, these settings will need to be reapplied.

Finally, apply the Rook-Ceph application:

~(keystone_admin)$ system application-apply rook-ceph
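Once the application has been reapplied, one way to confirm that the new limit reached the OSD pods is to query them with kubectl. The sketch below assumes the pods run in the rook-ceph namespace, carry the label app=rook-ceph-osd, and have a container named osd. These names follow common Rook conventions and are assumptions, not values this guide specifies. The function name show_osd_memory_limits is hypothetical.

```shell
# Sketch: print "<pod name><TAB><memory limit>" for each OSD pod.
# Assumed names: label app=rook-ceph-osd and container "osd".
# $1 is the namespace (defaults to rook-ceph).
show_osd_memory_limits() {
  kubectl get pods -n "${1:-rook-ceph}" -l app=rook-ceph-osd \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[?(@.name=="osd")].resources.limits.memory}{"\n"}{end}'
}
```

Calling show_osd_memory_limits with no argument queries the rook-ceph namespace. An empty second column would suggest that no memory limit is set on that pod.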

Bluestore tunable parameters

The osd_memory_cache_min and osd_memory_target parameters control memory management in OSDs. Increasing them can improve performance by giving the OSD more memory for caching, which reduces latency for read/write operations. However, higher values consume more resources, which can affect overall platform resource utilization. For performance similar to a Ceph bare-metal environment, a significant increase in these parameters is required.

To check the current values for these parameters, use:

$ helm get values -n rook-ceph rook-ceph-cluster -o yaml | sed -n '/^configOverride:/,/^[[:alnum:]_-]*:/{/^[[:alnum:]_-]*:/!p}'

To modify these parameters, first create an override with the updated values:

$ cat << EOF > tunable_override.yml
configOverride: |
  [global]
  osd_pool_default_size = 1
  osd_pool_default_min_size = 1
  auth_cluster_required = cephx
  auth_service_required = cephx
  auth_client_required = cephx

  [osd]
  osd_mkfs_type = xfs
  osd_mkfs_options_xfs = "-f"
  osd_mount_options_xfs = "rw,noatime,inode64,logbufs=8,logbsize=256k"
  osd_memory_target = <value>
  osd_memory_cache_min = <value>

  [mon]
  mon_warn_on_legacy_crush_tunables = false
  mon_pg_warn_max_per_osd = 2048
  mon_pg_warn_max_object_skew = 0
  mon_clock_drift_allowed = .1
  mon_warn_on_pool_no_redundancy = false
EOF

Make sure to provide osd_memory_target and osd_memory_cache_min with the correct unit, e.g., 4Gi.

The default value for osd_memory_target is 4Gi. The default value for osd_memory_cache_min is 128Mi.
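Before writing the values into the override, it can help to compare them in bytes. The following is a minimal sketch, not part of the platform tooling: the helper names to_bytes and check_osd_memory are hypothetical, only whole numbers with a Ki, Mi, or Gi suffix (or a plain byte count) are handled, and the check assumes that the cache minimum should stay below the overall memory target.

```shell
# Convert a size such as 4Gi, 128Mi, or a plain byte count to bytes.
to_bytes() {
  num="${1%[KMG]i}"   # numeric part with any Ki/Mi/Gi suffix removed
  case "$num" in
    ''|*[!0-9]*) echo "unsupported size: $1" >&2; return 1 ;;
  esac
  case "$1" in
    *Ki) echo $(( num * 1024 )) ;;
    *Mi) echo $(( num * 1024 * 1024 )) ;;
    *Gi) echo $(( num * 1024 * 1024 * 1024 )) ;;
    *)   echo "$num" ;;
  esac
}

# Usage: check_osd_memory <osd_memory_target> <osd_memory_cache_min>
# Returns 0 if cache_min is below target, 1 if not, 2 on a parse error.
check_osd_memory() {
  target="$(to_bytes "$1")" || return 2
  cache_min="$(to_bytes "$2")" || return 2
  if [ "$cache_min" -lt "$target" ]; then
    echo "ok: osd_memory_cache_min ($2) is below osd_memory_target ($1)"
  else
    echo "warning: osd_memory_cache_min ($2) is not below osd_memory_target ($1)" >&2
    return 1
  fi
}
```

For example, check_osd_memory 8Gi 1Gi reports ok, while swapping the two arguments prints a warning and returns a non-zero status.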

Then run helm-override-update:

~(keystone_admin)$ system helm-override-update rook-ceph rook-ceph-cluster rook-ceph --values tunable_override.yml

Note

The settings applied using helm-override-update remain active until the Rook-Ceph application is deleted. If the application is deleted and reinstalled, these settings will need to be reapplied.

Then reapply the Rook-Ceph application:

~(keystone_admin)$ system application-apply rook-ceph

To change the configuration of a running OSD without restarting it, run the following Ceph config commands:

$ ceph config set osd.<id> osd_memory_target <value>
$ ceph config set osd.<id> osd_memory_cache_min <value>
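When the cluster has many OSDs, the per-OSD commands above can be wrapped in a loop. The sketch below is one possible approach: it relies on ceph osd ls to list the OSD ids, and the function name set_all_osds and the DRY_RUN variable are hypothetical helpers introduced here for illustration.

```shell
# Apply one option to every OSD reported by `ceph osd ls`.
# Usage: set_all_osds <option> <value>
# With DRY_RUN=1 the commands are printed instead of executed.
set_all_osds() {
  option="$1"
  value="$2"
  for id in $(ceph osd ls); do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "ceph config set osd.$id $option $value"
    else
      ceph config set "osd.$id" "$option" "$value"
    fi
  done
}
```

Running DRY_RUN=1 set_all_osds osd_memory_target <value> first shows which commands would be issued. Afterwards, ceph config get osd.<id> osd_memory_target can be used to read back the stored value for a given OSD.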

Note

Changes made with ceph config set commands will persist for the life of the Ceph cluster. However, if the Ceph cluster is removed (e.g., deleted and recreated), these changes will be lost and will need to be reapplied once the cluster is redeployed.