100 Series Alarm Messages

Alarm Severities

One or more of the following severity levels is associated with each alarm.

Critical

Indicates that a platform service-affecting condition has occurred and immediate corrective action is required. (A mandatory platform service is totally out of service and its capability must be restored.)

Major

Indicates that a platform service-affecting condition has developed and urgent corrective action is required. (A mandatory platform service is severely degraded and its full capability must be restored.)

- or -

An optional platform service has become totally out of service and its capability should be restored.

Minor

Indicates that a platform non-service-affecting fault condition has developed and corrective action should be taken to prevent a more serious fault. (The fault condition is not currently impacting or degrading the capability of the platform service.)

Warning

Indicates the detection of a potential or impending service-affecting fault. Action should be taken to further diagnose and correct the problem to prevent it from becoming a more serious service-affecting fault.
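
The relationship between the four severities can be sketched in Python as an ordered enumeration. This is only an illustration: the names `Severity` and `most_severe` are hypothetical, and the numeric values are an assumption that encodes nothing more than the ordering implied by the definitions above (warning lowest, critical highest).

```python
from enum import IntEnum


class Severity(IntEnum):
    """Alarm severities, ordered so that a larger value is more severe.

    The numeric values are arbitrary; only their relative order matters.
    """
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


def most_severe(names):
    """Return the most severe level among severity names such as 'major'."""
    return max(Severity[name.upper()] for name in names)
```

For an alarm whose Severity field lists both critical and major, `most_severe(["critical", "major"])` returns `Severity.CRITICAL`.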


Alarm ID: 100.101

Platform CPU threshold exceeded; threshold x%, actual y%.

CRITICAL @ 95%

MAJOR @ 90%

Entity Instance

host=<hostname>

Degrade Affecting Severity:

critical

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Monitor and if condition persists, contact next level of support.

Management Affecting Severity

major
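
As an illustration of how the CRITICAL and MAJOR thresholds for alarm 100.101 relate to the reported severity, here is a minimal Python sketch. The function name `classify_usage` is hypothetical, and treating "exceeded" as a strict greater-than comparison is an assumption; the platform's own monitoring may compare differently.

```python
def classify_usage(actual_pct, major_pct, critical_pct):
    """Map a usage percentage to 'critical', 'major', or None.

    Assumption: a threshold is "exceeded" only when usage is strictly
    greater than it.
    """
    if actual_pct > critical_pct:
        return "critical"
    if actual_pct > major_pct:
        return "major"
    return None


# Thresholds listed for alarm 100.101 (platform CPU).
CPU_MAJOR_PCT = 90
CPU_CRITICAL_PCT = 95
```

With these values, `classify_usage(92, CPU_MAJOR_PCT, CPU_CRITICAL_PCT)` yields `"major"`. The same helper applies to the other threshold alarms in this series by passing their listed percentages.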


Alarm ID: 100.103

Memory threshold exceeded; threshold x%, actual y%.

CRITICAL @ 90%

MAJOR @ 80%

Entity Instance

host=<hostname>

OR

host=<hostname>.memory=total

OR

host=<hostname>.memory=platform

OR

host=<hostname>.numa=node<number>

Degrade Affecting Severity:

critical

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Monitor and if condition persists, contact next level of support; may require additional memory on Host.

Management Affecting Severity

none
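
The Entity Instance strings for alarm 100.103 are dot-separated key=value pairs, so they can be split into fields with a few lines of Python. This is a sketch only: `parse_entity_instance` is a hypothetical helper, the hostnames in the usage note are made-up examples, and the code assumes that no value contains a dot. That assumption does not hold for entity instances that embed an IP address.

```python
def parse_entity_instance(entity_instance_id):
    """Split 'host=<hostname>.memory=total' into a dict of its fields.

    Assumption: values never contain '.', so a plain split is enough.
    A component without '=' is stored with an empty string as its value.
    """
    fields = {}
    for part in entity_instance_id.split("."):
        key, _sep, value = part.partition("=")
        fields[key] = value
    return fields
```

For example, `parse_entity_instance("host=controller-0.memory=platform")` returns `{"host": "controller-0", "memory": "platform"}`.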


Alarm ID: 100.104

For host=<hostname>.filesystem=<mount-dir>:

File System threshold exceeded; threshold x%, actual y%.

CRITICAL @ 90%

MAJOR @ 80%

OR

For host=<hostname>.volumegroup=<volumegroup-name>:

Monitor and if condition persists, consider adding additional physical volumes to the volume group.

Entity Instance

host=<hostname>.filesystem=<mount-dir>

OR

host=<hostname>.volumegroup=<volumegroup-name>

Degrade Affecting Severity:

critical

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Reduce usage or resize filesystem.

Management Affecting Severity

critical
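
When following the repair action for alarm 100.104, it can help to check how full a mount point is. The sketch below uses Python's standard `shutil.disk_usage` to compute a usage percentage. The helper name `filesystem_used_pct` is hypothetical, and the figure may differ slightly from what the platform reports, since the platform's exact calculation is not described here.

```python
import shutil


def filesystem_used_pct(mount_dir):
    """Return the percentage of capacity in use for the filesystem at mount_dir."""
    usage = shutil.disk_usage(mount_dir)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total
```

The result can be compared against the 80% (MAJOR) and 90% (CRITICAL) thresholds listed for this alarm.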


Alarm ID: 100.106

‘OAM’ Port failed.

Entity Instance

host=<hostname>.port=<port-name>

Degrade Affecting Severity:

major

Severity:

major

Proposed Repair Action

Check cabling and far-end port configuration and status on adjacent equipment.

Management Affecting Severity

warning


Alarm ID: 100.107

‘OAM’ Interface degraded.

OR

‘OAM’ Interface failed.

Entity Instance

host=<hostname>.interface=<if-name>

Degrade Affecting Severity:

major

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Check cabling and far-end port configuration and status on adjacent equipment.

Management Affecting Severity

warning


Alarm ID: 100.108

‘MGMT’ Port failed.

Entity Instance

host=<hostname>.port=<port-name>

Degrade Affecting Severity:

major

Severity:

major

Proposed Repair Action

Check cabling and far-end port configuration and status on adjacent equipment.

Management Affecting Severity

warning


Alarm ID: 100.109

‘MGMT’ Interface degraded.

OR

‘MGMT’ Interface failed.

Entity Instance

host=<hostname>.interface=<if-name>

Degrade Affecting Severity:

major

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Check cabling and far-end port configuration and status on adjacent equipment.

Management Affecting Severity

warning


Alarm ID: 100.110

‘CLUSTER-HOST’ Port failed.

Entity Instance

host=<hostname>.port=<port-name>

Degrade Affecting Severity:

major

Severity:

major

Proposed Repair Action

Check cabling and far-end port configuration and status on adjacent equipment.

Management Affecting Severity

warning


Alarm ID: 100.111

‘CLUSTER-HOST’ Interface degraded.

OR

‘CLUSTER-HOST’ Interface failed.

Entity Instance

host=<hostname>.interface=<if-name>

Degrade Affecting Severity:

major

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Check cabling and far-end port configuration and status on adjacent equipment.

Management Affecting Severity

warning
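
Alarms 100.106 through 100.111 follow a regular pattern: one port alarm and one interface alarm for each of the OAM, MGMT, and CLUSTER-HOST networks. A small lookup table in Python makes that pattern explicit. The names `LINK_ALARMS` and `describe_link_alarm` are hypothetical; the IDs and labels are taken from the entries above.

```python
# Alarm ID -> (network, entity kind), as listed in the entries above.
LINK_ALARMS = {
    "100.106": ("OAM", "port"),
    "100.107": ("OAM", "interface"),
    "100.108": ("MGMT", "port"),
    "100.109": ("MGMT", "interface"),
    "100.110": ("CLUSTER-HOST", "port"),
    "100.111": ("CLUSTER-HOST", "interface"),
}


def describe_link_alarm(alarm_id):
    """Return a short label such as 'MGMT interface' for a known alarm ID."""
    network, kind = LINK_ALARMS[alarm_id]
    return f"{network} {kind}"
```

An ID outside this table raises `KeyError`, which keeps unknown alarms from being silently mislabeled.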


Alarm ID: 100.114

NTP configuration does not contain any valid or reachable NTP servers.

OR

NTP address <IP address> is not a valid or a reachable NTP server.

Entity Instance

host=<hostname>.ntp

OR

host=<hostname>.ntp=<IP address>

Degrade Affecting Severity:

none

Severity:

[‘major’, ‘minor’]

Proposed Repair Action

Monitor and if condition persists, contact next level of support.

Management Affecting Severity

none


Alarm ID: 100.118

Controller cannot establish connection with remote logging server.

Entity Instance

host=<hostname>

Degrade Affecting Severity:

none

Severity:

minor

Proposed Repair Action

Ensure Remote Log Server IP is reachable from Controller through OAM interface; otherwise contact next level of support.

Management Affecting Severity

none


Alarm ID: 100.119

<hostname> does not support the provisioned PTP mode

OR

<hostname> PTP clocking is out-of-tolerance

OR

<hostname> is not locked to remote PTP Primary source

OR

<hostname> GNSS signal loss state:<state>

OR

<hostname> 1PPS signal loss state:<state>

Entity Instance

host=<hostname>.ptp

OR

host=<hostname>.ptp=no-lock

OR

host=<hostname>.ptp=<interface>.unsupported=hardware-timestamping

OR

host=<hostname>.ptp=<interface>.unsupported=software-timestamping

OR

host=<hostname>.ptp=<interface>.unsupported=legacy-timestamping

OR

host=<hostname>.ptp=out-of-tolerance

OR

host=<hostname>.instance=<instance>.ptp=out-of-tolerance

OR

host=<hostname>.interface=<interface>.ptp=signal-loss

Degrade Affecting Severity:

none

Severity:

[‘major’, ‘minor’]

Proposed Repair Action

Monitor and if condition persists, contact next level of support.

Management Affecting Severity

none


Alarm ID: 100.150

service open file descriptor has reached its limit

OR

service open file descriptor is approaching to its limit

Entity Instance

host=<hostname>.resource_type=file-descriptor.service_name=<service-name>

Degrade Affecting Severity:

critical

Severity:

[‘critical’, ‘major’]

Proposed Repair Action

Swact to the other controller if it is available.

Management Affecting Severity

critical
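
To make the condition behind alarm 100.150 concrete, the following Python sketch reads the number of open file descriptors and the soft limit for the current process. It is Linux-specific (it relies on `/proc/self/fd`), the helper name `fd_usage` is hypothetical, and it says nothing about how the platform itself measures a service's descriptors or at what point it raises the alarm.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process on Linux.

    open_fds counts the entries in /proc/self/fd, which includes the
    descriptor opened temporarily to list that directory.
    """
    open_fds = len(os.listdir("/proc/self/fd"))
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_fds, soft_limit
```

A process whose `open_fds` value stays close to `soft_limit` is in the situation this alarm describes.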