StarlingX: Performance measurement framework

Storyboard: https://storyboard.openstack.org/#!/story/2006406

The measurement of performance metrics in edge cloud systems is a key factor in quality assurance. A common understanding and a practical framework to run performance tests are important and necessary for users and developers in the community. This specification describes the scope of the performance framework, its use cases, and how it can scale through new tests developed by the community or imported from existing performance test frameworks.

Problem description

StarlingX can be deployed on a wide variety of hardware and network environments, and these differences can result in different performance metrics. As of today, the community does not have a consolidated performance framework to measure performance with specific and well-defined test cases.

Use Cases

  • Developers who make fairly significant changes in the codebase want to ensure that no performance regression is introduced by those changes.

  • Developers who work on performance improvements need to measure the results with a consolidated framework that is valid across the community.

  • Community members want to measure StarlingX performance and promote the StarlingX advantages that are relevant to an edge solution.

  • Potential users want to evaluate StarlingX as one of their edge solution candidates; offering such a performance framework will be a positive factor in the evaluation process.

Proposed change

The proposed change is to create a set of scripts and documentation under the StarlingX testing repository (https://opendev.org/starlingx/test) that anyone can use to measure the metrics they need in their hardware and software environment.

Alternatives

The community could use existing performance test frameworks. Some of these are:

  • Rally:

An OpenStack project dedicated to performance analysis and benchmarking of individual OpenStack components.

  • Kubernetes perf-tests:

An open source project dedicated to Kubernetes-related performance test tools.

  • Yardstick by OPNFV:

The Yardstick concept decomposes typical virtual network function workload performance metrics into several characteristics/performance vectors, each of which can be represented by distinct test cases.

The disadvantage of these frameworks is that they live in separate projects: users need to find the test cases across the internet, which is not a centralized and scalable solution. The proposed StarlingX performance measurement framework will allow existing test cases to be imported and re-used in a centralized system that is easy to use for the community.

Data model impact

None

REST API impact

None

Security impact

None

Other end-user impact

End users will interact with this framework from the command line on the controller node through the performance-test-runner.py script, which is the entry point to select and configure a test case. An example of how to launch a test case might be:

::

    ./performance-test-runner.py --test failed_vm_detection --compute=0

This will generate the results of multiple runs in CSV format, which is easy to post-process.
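
A minimal sketch of what such an entry point could look like is shown below, assuming each test case module exposes a run() function that returns a dict of metrics; the --runs and --output options, the results_to_csv helper, and the import path are illustrative assumptions, not part of an existing implementation.

::

    #!/usr/bin/env python3
    """Illustrative sketch of performance-test-runner.py, not a final implementation."""
    import argparse
    import csv
    import importlib
    import sys


    def parse_args(argv):
        # Mirrors the command line shown above; --runs and --output are assumed extras.
        parser = argparse.ArgumentParser(description="StarlingX performance test runner")
        parser.add_argument("--test", required=True,
                            help="test case module name, e.g. failed_vm_detection")
        parser.add_argument("--compute", type=int, default=0,
                            help="index of the compute node to target")
        parser.add_argument("--runs", type=int, default=5,
                            help="number of repetitions of the test case")
        parser.add_argument("--output", default="results.csv",
                            help="CSV file where the collected metrics are written")
        return parser.parse_args(argv)


    def results_to_csv(rows, path):
        # Each row is a dict mapping a metric name to the value measured in one run.
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=sorted(rows[0]))
            writer.writeheader()
            writer.writerows(rows)


    def main(argv=None):
        args = parse_args(sys.argv[1:] if argv is None else argv)
        # Assumed convention: every module under tests_cases/ exposes run(compute)
        # and returns a dict of measured metrics for a single execution.
        module = importlib.import_module("tests_cases." + args.test)
        rows = [module.run(compute=args.compute) for _ in range(args.runs)]
        results_to_csv(rows, args.output)


    if __name__ == "__main__":
        main()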

Performance Impact

Running the performance framework or its test cases should not introduce a noticeable performance impact or overhead on the system under test.

Other deployer impact

None, there will be no impact on how StarlingX is deployed and configured.

Developer impact

There will be a positive impact on developers, since they will have a consolidated and well-aligned framework to measure the performance impact of their code and configuration changes.

Developers can’t improve something if they don’t know it needs improvement. That’s where this performance testing framework comes in: it gives developers the tools to measure how fast, stable, and scalable their StarlingX system is, so they can decide whether it needs improvement.

Upgrade impact

None

Implementation

The test framework will have the following components:

  • Common scripts:

This directory will contain scripts to capture key timestamps or collect other key data for the test cases, so that developers can measure time (latency) or other performance data such as throughput and system CPU utilization. Any common script that other test cases could use should be in this directory.

Despite the existence of a directory for common scripts, each test case is responsible for its own metrics. One example is the time measurement of test cases that run on more than one node. These test cases need a clock synchronization protocol such as the Network Time Protocol (NTP), which is intended to synchronize all participating computers to within a few milliseconds of Coordinated Universal Time (UTC). In other cases, clock synchronization might not be necessary.

Most of the details about whether to re-use a common script from the common directory or handle the measurement within the test itself will be left to the test implementation.
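
As an illustration, a shared timing helper such as the latency_metrics.py module that appears in the directory layout later in this section might look like the following sketch; the function names are assumptions for illustration only.

::

    """Illustrative sketch of a shared timing helper, e.g. common/latency_metrics.py."""
    import statistics
    import time


    def event_timestamp():
        # A monotonic clock is preferred for single-node latency because it is not
        # affected by NTP adjustments; measurements that span nodes still need
        # NTP-synchronized wall-clock timestamps as described above.
        return time.monotonic()


    def elapsed_ms(start, end):
        # Convert a pair of timestamps into a latency value in milliseconds.
        return (end - start) * 1000.0


    def summarize(samples_ms):
        # Basic statistics a test case can include in its CSV results.
        return {
            "min_ms": min(samples_ms),
            "max_ms": max(samples_ms),
            "avg_ms": statistics.mean(samples_ms),
            "stdev_ms": statistics.stdev(samples_ms) if len(samples_ms) > 1 else 0.0,
        }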

  • Performance-test-runner.py:

The main command-line interface to execute the test cases on the system under test.

  • Test cases directory:

This directory can contain wrapper scripts for existing upstream performance test cases as well as new test cases specific to StarlingX.

A view of the directory layout will look like:

::

    performance/
    ├── common
    │   ├── latency_metrics.py
    │   ├── network_generator.py
    │   ├── ntp_metrics.py
    │   └── statistics.py
    └── tests_cases
        ├── failed_compute_detection.py
        ├── failed_control_detection.py
        ├── failed_network_detection.py
        ├── failed_vm_detection.py
        ├── neutron_test_case_1.py
        ├── opnfv_test_case_1.py
        ├── opnfv_test_case_2.py
        ├── rally_test_case_1.py
        ├── rally_test_case_2.py
        └── performance-test-runner.py

The goal is that anyone in the StarlingX community can either define new performance test cases or re-use existing ones from other projects, and create scripts to measure them in a scalable and repeatable framework.
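
For example, a test case module that follows the run() convention assumed in the runner sketch above could look like the skeleton below; the detection logic is a placeholder and the import path is an assumption about the final package layout.

::

    """Illustrative skeleton of a test case module, e.g. tests_cases/failed_vm_detection.py."""
    from common import latency_metrics  # assumed package layout, see the tree above


    def run(compute=0):
        """Measure how long the platform takes to detect a failed VM.

        This docstring is also what pydoc would publish in the generated
        end-user testing guide.
        """
        start = latency_metrics.event_timestamp()
        # Placeholder: a real test would kill a VM on the selected compute node
        # and poll the platform until the failure is reported.
        detected = latency_metrics.event_timestamp()
        return {
            "compute": compute,
            "detection_latency_ms": latency_metrics.elapsed_ms(start, detected),
        }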

In the future, it might be possible to evaluate how to connect some basic performance test cases to Zuul to detect regressions on commits to be merged. This will require setting up a robust infrastructure to run the sanity performance test cases for each commit to be merged.

Assignee(s)

Wang Hai Tao
Elio Martinez
Juan Carlos Alonzo
Victor Rodriguez

Repos Impacted

https://opendev.org/starlingx/test

Work Items

  • Develop automated scripts for core test cases, such as:
    • Detection of failed VM

    • Detection of failed compute node

    • Controller/node failure detection and recovery

    • Detection of network link fail

    • DPDK Live Migrate Latency

    • Avg/Max host/guest Latency (Cyclictest)

    • Swact Time

  • Develop performance-test-runner.py, which calls each test case script

  • Integrate performance-test-runner.py with the pytest framework (see the sketch after this list)

  • Implement the call to clone the OPNFV test cases

  • Implement execution of the OPNFV test cases

  • Implement a monitoring pipeline to automatically detect changes in upstream test cases

  • Automate a weekly mail with the list of upstream performance test cases that changed

  • Implement automatic test case documentation based on pydoc

  • Document framework on StarlingX wiki

  • Send regular status to ML and demos on community call for feedback
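
As referenced in the work items above, one possible way to integrate the test cases with pytest is a thin parametrized wrapper such as the following sketch; the threshold values and the run() convention are illustrative assumptions.

::

    """Illustrative pytest wrapper around the performance test case modules."""
    import importlib

    import pytest

    # Hypothetical mapping of test case module -> maximum acceptable latency (ms).
    THRESHOLDS_MS = {
        "failed_vm_detection": 2000.0,
        "failed_compute_detection": 5000.0,
    }


    @pytest.mark.parametrize("name,limit_ms", sorted(THRESHOLDS_MS.items()))
    def test_detection_latency(name, limit_ms):
        module = importlib.import_module("tests_cases." + name)
        result = module.run(compute=0)
        assert result["detection_latency_ms"] <= limit_ms

With such a wrapper, the performance sanity cases could later be run with a plain pytest invocation, which would also simplify the Zuul integration discussed earlier.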

Dependencies

None

Testing

None

Documentation Impact

The performance test cases will be documented with pydoc (https://docs.python.org/3.0/library/pydoc.html). The pydoc module automatically generates documentation from Python modules. The documentation can be presented as pages of text on the console, served to a Web browser, or saved to HTML files.

The goal is to automatically generate the documentation for the end-user testing guide. This methodology will catch when the community adds or modifies a performance test case. At the same time, if a standard config option changes or is deprecated, the test framework documentation will be updated to reflect the change.
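
As an illustration, the HTML page for a single test case module could be generated with the standard pydoc module; the module path below follows the directory layout shown earlier, and how this step is wired into the documentation build is still to be defined.

::

    """Generate HTML documentation for one test case module with pydoc."""
    import pydoc

    # Builds an HTML page (tests_cases.failed_vm_detection.html in the current
    # directory) from the module and function docstrings of the test case.
    pydoc.writedoc("tests_cases.failed_vm_detection")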

References

History

Revisions

Release Name    Description
------------    -----------
Stein           Introduced