100 Series Alarm Messages

The system inventory and maintenance service reports system changes with different degrees of severity. Use the reported alarms to monitor the overall health of the system.

Alarm messages are numerically coded by the type of alarm.

For more information, see Fault Management Overview.

In the alarm description tables, the severity of the alarms is represented by one or more letters, as follows:

C: Critical

Indicates that a platform service-affecting condition has occurred and immediate corrective action is required. (A mandatory platform service has become totally out of service and its capability must be restored.)

M: Major

Indicates that a platform service-affecting condition has developed and urgent corrective action is required. (A mandatory platform service has developed a severe degradation and its full capability must be restored.)
- or -
An optional platform service has become totally out of service and its capability should be restored.

m: Minor

Indicates that a platform non-service-affecting fault condition has developed and corrective action should be taken in order to prevent a more serious fault. (The fault condition is not currently impacting or degrading the capability of the platform service.)

W: Warning

Indicates the detection of a potential or impending service-affecting fault. Action should be taken to further diagnose and correct the problem in order to prevent it from becoming a more serious service-affecting fault.

A slash-separated list of letters is used when the alarm can be triggered with one of several severity levels.

An asterisk (*) indicates the management-affecting severity, if any. A management-affecting alarm is one that cannot be ignored at the indicated severity level or higher by using relaxed alarm rules during an orchestrated patch or upgrade operation.
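The severity notation described above can be read mechanically. The following is a minimal sketch, not part of the platform tooling: the function name parse_severity and the SeveritySpec type are hypothetical, and the parser also tolerates the word "or" as a separator, which is a leniency of this sketch rather than part of the documented notation.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Letter codes as defined in this section. Case matters: "M" is Major, "m" is Minor.
SEVERITY_NAMES = {"C": "critical", "M": "major", "m": "minor", "W": "warning"}


@dataclass
class SeveritySpec:
    """Hypothetical container for a parsed Severity field."""
    severities: List[str]                 # every severity the alarm can be raised at
    management_affecting: Optional[str]   # severity marked with "*", if any


def parse_severity(field: str) -> SeveritySpec:
    """Parse a Severity field such as 'C/M*' or 'm'."""
    # Split on "/" or on the word "or" (the latter is an assumption of this sketch).
    tokens = [t for t in re.split(r"\s*/\s*|\s+or\s+", field.strip()) if t]
    severities: List[str] = []
    management_affecting: Optional[str] = None
    for token in tokens:
        starred = token.endswith("*")
        letter = token.rstrip("*")
        if letter not in SEVERITY_NAMES:
            raise ValueError(f"unknown severity code: {token!r}")
        name = SEVERITY_NAMES[letter]
        severities.append(name)
        if starred:
            management_affecting = name
    return SeveritySpec(severities, management_affecting)


spec = parse_severity("C/M*")
print(spec.severities, spec.management_affecting)
```

For "C/M*" the sketch reports that the alarm can be raised as critical or major and that major is the management-affecting severity.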

Note

Degrade Affecting Severity: Critical indicates a node will be degraded if the alarm reaches a Critical level.

Alarm ID: 100.101	Platform CPU threshold exceeded; threshold x%, actual y%. CRITICAL @ 95% MAJOR @ 90%
Entity Instance	host=<hostname>
Degrade Affecting Severity:	Critical
Severity:	C/M*
Proposed Repair Action	Monitor and if condition persists, contact next level of support.

Alarm ID: 100.103	Memory threshold exceeded; threshold x%, actual y%. CRITICAL @ 90% MAJOR @ 80%
Entity Instance	host=<hostname>
Degrade Affecting Severity:	Critical
Severity:	C/M
Proposed Repair Action	Monitor and if condition persists, contact next level of support; may require additional memory on Host.

Alarm ID: 100.104	File System threshold exceeded; threshold x%, actual y%. CRITICAL @ 90% MAJOR @ 80%
Entity Instance	host=<hostname>.filesystem=<mount-dir> OR host=<hostname>.volumegroup=<volumegroup-name>
Degrade Affecting Severity:	Critical
Severity:	C*/M
Proposed Repair Action	For a filesystem: reduce usage or resize filesystem. For a volume group: monitor and if condition persists, consider adding additional physical volumes to the volume group.
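The threshold values quoted in alarms 100.101, 100.103, and 100.104 can be turned into a simple classification. This is a sketch under stated assumptions only: the function name classify_usage and the THRESHOLDS table are hypothetical, and treating "exceeded" as a strict greater-than comparison is an assumption, since the exact boundary behavior is not stated here.

```python
from typing import Optional

# Percent thresholds copied from the alarm descriptions in this section.
THRESHOLDS = {
    "100.101": {"critical": 95.0, "major": 90.0},  # platform CPU
    "100.103": {"critical": 90.0, "major": 80.0},  # memory
    "100.104": {"critical": 90.0, "major": 80.0},  # file system
}


def classify_usage(alarm_id: str, actual_percent: float) -> Optional[str]:
    """Return 'critical', 'major', or None for a measured usage value.

    Assumption: a threshold is "exceeded" only when usage is strictly
    greater than it; the platform may handle the boundary differently.
    """
    limits = THRESHOLDS[alarm_id]
    if actual_percent > limits["critical"]:
        return "critical"
    if actual_percent > limits["major"]:
        return "major"
    return None


print(classify_usage("100.101", 92.0))
print(classify_usage("100.103", 92.0))
```

The same measured value can map to different severities depending on the alarm: 92% is above only the major threshold for platform CPU, but above the critical threshold for memory.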

Alarm ID: 100.106	‘OAM’ Port failed.
Entity Instance	host=<hostname>.port=<port-name>
Degrade Affecting Severity:	Major
Severity:	M*
Proposed Repair Action	Check cabling and far-end port configuration and status on adjacent equipment.

Alarm ID: 100.107	‘OAM’ Interface degraded. OR ‘OAM’ Interface failed.
Entity Instance	host=<hostname>.interface=<if-name>
Degrade Affecting Severity:	Major
Severity:	C/M*
Proposed Repair Action	Check cabling and far-end port configuration and status on adjacent equipment.

Alarm ID: 100.108	‘MGMT’ Port failed.
Entity Instance	host=<hostname>.port=<port-name>
Degrade Affecting Severity:	Major
Severity:	M*
Proposed Repair Action	Check cabling and far-end port configuration and status on adjacent equipment.

Alarm ID: 100.109	‘MGMT’ Interface degraded. OR ‘MGMT’ Interface failed.
Entity Instance	host=<hostname>.interface=<if-name>
Degrade Affecting Severity:	Major
Severity:	C/M*
Proposed Repair Action	Check cabling and far-end port configuration and status on adjacent equipment.

Alarm ID: 100.110	‘CLUSTER-HOST’ Port failed.
Entity Instance	host=<hostname>.port=<port-name>
Degrade Affecting Severity:	Major
Severity:	C/M*
Proposed Repair Action	Check cabling and far-end port configuration and status on adjacent equipment.

Alarm ID: 100.111	‘CLUSTER-HOST’ Interface degraded. OR ‘CLUSTER-HOST’ Interface failed.
Entity Instance	host=<hostname>.interface=<if-name>
Degrade Affecting Severity:	Major
Severity:	C/M*
Proposed Repair Action	Check cabling and far-end port configuration and status on adjacent equipment.

Alarm ID: 100.114	NTP configuration does not contain any valid or reachable NTP servers. The alarm is raised regardless of NTP enabled/disabled status. OR NTP address <IP address> is not a valid or a reachable NTP server. OR Connectivity to external PTP Clock Synchronization is lost.
Entity Instance	host=<hostname>.ntp OR host=<hostname>.ntp=<IP address>
Degrade Affecting Severity:	None
Severity:	M/m
Proposed Repair Action	Monitor and if condition persists, contact next level of support.

Alarm ID: 100.118	Controller cannot establish connection with remote logging server.
Entity Instance	host=<hostname>
Degrade Affecting Severity:	None
Severity:	m
Proposed Repair Action	Ensure Remote Log Server IP is reachable from Controller through OAM interface; otherwise contact next level of support.

Alarm ID: 100.119	Major: PTP configuration or out-of-tolerance time-stamping conditions. Minor: PTP out-of-tolerance time-stamping condition.
Entity Instance	host=<hostname>.ptp OR host=<hostname>.ptp=no-lock OR host=<hostname>.ptp=<interface>.unsupported=hardware-timestamping OR host=<hostname>.ptp=<interface>.unsupported=software-timestamping OR host=<hostname>.ptp=<interface>.unsupported=legacy-timestamping OR host=<hostname>.ptp=out-of-tolerance
Degrade Affecting Severity:	None
Severity:	M/m
Proposed Repair Action	Monitor and, if condition persists, contact next level of support.
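The Entity Instance values in these tables follow a dotted key=value pattern, with some keys (such as ntp or ptp) allowed to appear without a value. The following is a minimal sketch of splitting such a string into fields; the function name parse_entity_instance is hypothetical, and the example hostname and interface name are invented for illustration. It assumes keys consist only of letters, hyphens, and underscores, so that dots inside an IP address are not treated as separators. A value whose own dotted suffix looks like a key (for example a mount directory ending in ".d") would be split incorrectly.

```python
import re
from typing import Dict, Optional

# Split on a "." only when it is followed by a key and then "=" or end of string.
# Assumption: keys contain only letters, "-" and "_".
_SEGMENT_BOUNDARY = re.compile(r"\.(?=[A-Za-z_-]+(?:=|$))")


def parse_entity_instance(entity: str) -> Dict[str, Optional[str]]:
    """Split an entity instance string into a key -> value mapping.

    Keys that appear without "=value" (for example "ptp") map to None.
    """
    fields: Dict[str, Optional[str]] = {}
    for segment in _SEGMENT_BOUNDARY.split(entity):
        key, sep, value = segment.partition("=")
        fields[key] = value if sep else None
    return fields


# Hypothetical example values; real hostnames and interfaces will differ.
print(parse_entity_instance("host=controller-0.ntp=10.10.10.1"))
print(parse_entity_instance("host=controller-0.ptp=enp0s8.unsupported=hardware-timestamping"))
print(parse_entity_instance("host=controller-0.ptp"))
```

Keeping the bare-key case as None makes it possible to tell host=<hostname>.ptp apart from host=<hostname>.ptp=no-lock when filtering alarms by entity.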

StarlingX R6.0