100 Series Alarm Messages
Alarm Severities
One or more of the following severity levels is associated with each alarm.
Critical
Indicates that a platform service-affecting condition has occurred and immediate corrective action is required. (A mandatory platform service is completely out of service and its capability must be restored.)
Major
Indicates that a platform service-affecting condition has developed and urgent corrective action is required. (A mandatory platform service has become severely degraded and its full capability must be restored.)
- or -
An optional platform service has become totally out of service and its capability should be restored.
Minor
Indicates that a non-service-affecting platform fault condition has developed and corrective action should be taken to prevent a more serious fault. (The fault condition is not currently impacting or degrading the capability of the platform service.)
Warning
Indicates the detection of a potential or impending service-affecting fault. Action should be taken to further diagnose and correct the problem to prevent it from becoming a more serious service-affecting fault.
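The four levels above can be treated as an ordered scale when comparing alarms. The following is a minimal sketch, assuming the order Warning < Minor < Major < Critical; the `Severity` class, its numeric values, and the `most_severe` helper are hypothetical names introduced for illustration and are not part of the platform.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Higher value means more severe. The numbers are illustrative only.
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


def most_severe(names):
    """Return the most severe level from a non-empty iterable of level names."""
    return max(Severity[name.upper()] for name in names)


# An alarm that lists both 'critical' and 'major' as possible severities:
print(most_severe(["major", "critical"]).name)
```

Using an `IntEnum` keeps comparisons such as `Severity.MAJOR < Severity.CRITICAL` readable without a separate ranking table.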
Alarm ID: 100.101 |
Platform CPU threshold exceeded; threshold x%, actual y%. CRITICAL @ 95%, MAJOR @ 90% |
Entity Instance |
host=<hostname> |
Degrade Affecting Severity: |
critical |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Monitor and if condition persists, contact next level of support. |
Management Affecting Severity |
major |
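The threshold rule for alarm 100.101 can be sketched as a small classification function. This is an illustration only: the function name `classify_usage` is hypothetical, and treating a reading exactly at the threshold as exceeding it (`>=`) is an assumption, since the table does not say whether the comparison is inclusive.

```python
def classify_usage(actual_pct, critical_pct=95.0, major_pct=90.0):
    """Map a usage percentage to 'critical', 'major', or None.

    Defaults are the platform CPU thresholds listed for alarm 100.101.
    Inclusive comparison (>=) is an assumption, not documented behavior.
    """
    if actual_pct >= critical_pct:
        return "critical"
    if actual_pct >= major_pct:
        return "major"
    return None  # below both thresholds: no alarm


print(classify_usage(96.0))  # CPU at 96%
print(classify_usage(92.5))  # CPU at 92.5%
print(classify_usage(50.0))  # CPU at 50%
```

The same shape fits the other threshold alarms in this series by passing their limits explicitly, for example `classify_usage(85.0, critical_pct=90.0, major_pct=80.0)` for the 90% / 80% pair.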
Alarm ID: 100.103 |
Memory threshold exceeded; threshold x%, actual y%. CRITICAL @ 90%, MAJOR @ 80% |
Entity Instance |
host=<hostname> OR host=<hostname>.memory=total OR host=<hostname>.memory=platform OR host=<hostname>.numa=node<number> |
Degrade Affecting Severity: |
critical |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Monitor and if condition persists, contact next level of support; may require additional memory on Host. |
Management Affecting Severity |
none |
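Entity Instance values such as those listed for alarm 100.103 are dot-separated `key=value` segments. A minimal parsing sketch follows, assuming that segment keys consist only of letters and underscores and that a segment may appear without a value; `parse_entity_instance` is a hypothetical helper, not a platform API, and the hostname `controller-0` is an example value.

```python
import re

# Split on a "." only when it is followed by a key and then "=" or end of string.
# Assumption: keys contain letters and underscores only.
_SEGMENT_BOUNDARY = re.compile(r"\.(?=[A-Za-z_]+(?:=|$))")


def parse_entity_instance(entity):
    """Return a list of (key, value) pairs; value is None for a bare key."""
    fields = []
    for segment in _SEGMENT_BOUNDARY.split(entity):
        key, sep, value = segment.partition("=")
        fields.append((key, value if sep else None))
    return fields


print(parse_entity_instance("host=controller-0.memory=platform"))
print(parse_entity_instance("host=controller-0.numa=node0"))
```

This sketch is deliberately naive: a value that itself ends in a dot followed by letters (for example, a mount directory with a dotted name) would be split incorrectly, so a production parser would need stricter rules.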
Alarm ID: 100.104 |
host=<hostname>.filesystem=<mount-dir>: File System threshold exceeded; threshold x%, actual y%. CRITICAL @ 90%, MAJOR @ 80% OR host=<hostname>.volumegroup=<volumegroup-name>: Monitor and if condition persists, consider adding physical volumes to the volume group. |
Entity Instance |
host=<hostname>.filesystem=<mount-dir> OR host=<hostname>.volumegroup=<volumegroup-name> |
Degrade Affecting Severity: |
critical |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Reduce usage or resize filesystem. |
Management Affecting Severity |
critical |
Alarm ID: 100.106 |
‘OAM’ Port failed. |
Entity Instance |
host=<hostname>.port=<port-name> |
Degrade Affecting Severity: |
major |
Severity: |
major |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
Management Affecting Severity |
warning |
Alarm ID: 100.107 |
‘OAM’ Interface degraded. OR ‘OAM’ Interface failed. |
Entity Instance |
host=<hostname>.interface=<if-name> |
Degrade Affecting Severity: |
major |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
Management Affecting Severity |
warning |
Alarm ID: 100.108 |
‘MGMT’ Port failed. |
Entity Instance |
host=<hostname>.port=<port-name> |
Degrade Affecting Severity: |
major |
Severity: |
major |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
Management Affecting Severity |
warning |
Alarm ID: 100.109 |
‘MGMT’ Interface degraded. OR ‘MGMT’ Interface failed. |
Entity Instance |
host=<hostname>.interface=<if-name> |
Degrade Affecting Severity: |
major |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
Management Affecting Severity |
warning |
Alarm ID: 100.110 |
‘CLUSTER-HOST’ Port failed. |
Entity Instance |
host=<hostname>.port=<port-name> |
Degrade Affecting Severity: |
major |
Severity: |
major |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
Management Affecting Severity |
warning |
Alarm ID: 100.111 |
‘CLUSTER-HOST’ Interface degraded. OR ‘CLUSTER-HOST’ Interface failed. |
Entity Instance |
host=<hostname>.interface=<if-name> |
Degrade Affecting Severity: |
major |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
Management Affecting Severity |
warning |
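Alarms 100.106 through 100.111 follow a regular pattern: one port alarm and one interface alarm for each of the OAM, MGMT, and CLUSTER-HOST networks. The lookup below is a sketch of that pattern for use in scripts; the names `NETWORK_ALARMS` and `describe_network_alarm` are hypothetical and introduced only for illustration.

```python
# Alarm ID -> (network, kind, entity-instance key), taken from the entries above.
NETWORK_ALARMS = {
    "100.106": ("OAM", "port", "port"),
    "100.107": ("OAM", "interface", "interface"),
    "100.108": ("MGMT", "port", "port"),
    "100.109": ("MGMT", "interface", "interface"),
    "100.110": ("CLUSTER-HOST", "port", "port"),
    "100.111": ("CLUSTER-HOST", "interface", "interface"),
}


def describe_network_alarm(alarm_id):
    """Return a short label such as 'MGMT interface', or None if unknown."""
    entry = NETWORK_ALARMS.get(alarm_id)
    if entry is None:
        return None
    network, kind, _entity_key = entry
    return f"{network} {kind}"


print(describe_network_alarm("100.109"))
```

All six alarms share the same proposed repair action (check cabling and the far-end port), so a script usually only needs the network and kind to decide which link to inspect.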
Alarm ID: 100.114 |
NTP configuration does not contain any valid or reachable NTP servers. OR NTP address <IP address> is not a valid or reachable NTP server. |
Entity Instance |
host=<hostname>.ntp OR host=<hostname>.ntp=<IP address> |
Degrade Affecting Severity: |
none |
Severity: |
[‘major’, ‘minor’] |
Proposed Repair Action |
Monitor and if condition persists, contact next level of support. |
Management Affecting Severity |
none |
Alarm ID: 100.118 |
Controller cannot establish connection with remote logging server. |
Entity Instance |
host=<hostname> |
Degrade Affecting Severity: |
none |
Severity: |
minor |
Proposed Repair Action |
Ensure Remote Log Server IP is reachable from Controller through OAM interface; otherwise contact next level of support. |
Management Affecting Severity |
none |
Alarm ID: 100.119 |
<hostname> does not support the provisioned PTP mode OR <hostname> PTP clocking is out-of-tolerance OR <hostname> is not locked to remote PTP Primary source OR <hostname> GNSS signal loss state:<state> OR <hostname> 1PPS signal loss state:<state> |
Entity Instance |
host=<hostname>.ptp OR host=<hostname>.ptp=no-lock OR host=<hostname>.ptp=<interface>.unsupported=hardware-timestamping OR host=<hostname>.ptp=<interface>.unsupported=software-timestamping OR host=<hostname>.ptp=<interface>.unsupported=legacy-timestamping OR host=<hostname>.ptp=out-of-tolerance OR host=<hostname>.instance=<instance>.ptp=out-of-tolerance OR host=<hostname>.interface=<interface>.ptp=signal-loss |
Degrade Affecting Severity: |
none |
Severity: |
[‘major’, ‘minor’] |
Proposed Repair Action |
Monitor and if condition persists, contact next level of support. |
Management Affecting Severity |
none |
Alarm ID: 100.120 |
Controllers running mismatched kernels. |
Entity Instance |
host=<hostname>.kernel=<kernel> |
Degrade Affecting Severity: |
none |
Severity: |
minor |
Proposed Repair Action |
Modify controllers using ‘system host-kernel-modify’ so that both are running the desired ‘standard’ or ‘lowlatency’ kernel. |
Management Affecting Severity |
none |
Alarm ID: 100.121 |
Host not running the provisioned kernel. |
Entity Instance |
host=<hostname>.kernel=<kernel> |
Degrade Affecting Severity: |
none |
Severity: |
major |
Proposed Repair Action |
Retry ‘system host-kernel-modify’ and if condition persists, contact next level of support. |
Management Affecting Severity |
major |
Alarm ID: 100.150 |
Service open file descriptor has reached its limit. OR Service open file descriptor is approaching its limit. |
Entity Instance |
host=<hostname>.resource_type=file-descriptor.service_name=<service-name> |
Degrade Affecting Severity: |
critical |
Severity: |
[‘critical’, ‘major’] |
Proposed Repair Action |
Swact to the other controller if it is available. |
Management Affecting Severity |
critical |