400 Series Alarm Messages
The system inventory and maintenance service reports system changes with different degrees of severity. Use the reported alarms to monitor the overall health of the system.
Alarm messages are numerically coded by the type of alarm.
For more information, see Fault Management Overview.
In the alarm description tables, the severity of the alarms is represented by one or more letters, as follows:
C: Critical
Indicates that a platform service-affecting condition has occurred and immediate corrective action is required. (A mandatory platform service has become totally out of service and its capability must be restored.)
M: Major
Indicates that a platform service-affecting condition has developed and urgent corrective action is required. (Either a mandatory platform service has developed a severe degradation and its full capability must be restored, or an optional platform service has become totally out of service and its capability should be restored.)
m: Minor
Indicates that a platform non-service-affecting fault condition has developed and corrective action should be taken to prevent a more serious fault. (The fault condition is not currently impacting or degrading the capability of the platform service.)
W: Warning
Indicates the detection of a potential or impending service-affecting fault. Action should be taken to further diagnose and correct the problem to prevent it from becoming a more serious service-affecting fault.
A slash-separated list of letters is used when the alarm can be triggered with one of several severity levels.
An asterisk (*) indicates the management-affecting severity, if any. A management-affecting alarm is one that cannot be ignored at the indicated severity level or higher by using relaxed alarm rules during an orchestrated patch or upgrade operation.
Note
Degrade Affecting Severity: Critical indicates a node will be degraded if the alarm reaches a Critical level.
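The severity notation described above can be decoded mechanically. The following is a minimal Python sketch, not part of the platform: the names `SeveritySpec`, `parse_severity`, and `is_management_affecting` are hypothetical, and the code assumes that the asterisk always directly follows the letter it marks.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Letter codes as defined in the severity legend above.
SEVERITY_NAMES = {"C": "critical", "M": "major", "m": "minor", "W": "warning"}

# Relative ordering used for the "indicated level or higher" comparison.
SEVERITY_RANK = {"warning": 0, "minor": 1, "major": 2, "critical": 3}


@dataclass(frozen=True)
class SeveritySpec:
    """Decoded form of a severity cell such as 'C/M/m*' (hypothetical type)."""
    levels: Tuple[str, ...]
    management_affecting: Optional[str]


def parse_severity(notation: str) -> SeveritySpec:
    """Split a slash-separated severity list and record the starred level."""
    levels = []
    management_affecting = None
    for token in notation.strip().split("/"):
        token = token.strip()
        starred = token.endswith("*")
        letter = token.rstrip("*")
        if letter not in SEVERITY_NAMES:
            raise ValueError(f"unknown severity letter: {letter!r}")
        name = SEVERITY_NAMES[letter]
        levels.append(name)
        if starred:
            management_affecting = name
    return SeveritySpec(tuple(levels), management_affecting)


def is_management_affecting(spec: SeveritySpec, raised: str) -> bool:
    """True if an alarm raised at `raised` is at or above the starred level."""
    if spec.management_affecting is None:
        return False
    return SEVERITY_RANK[raised] >= SEVERITY_RANK[spec.management_affecting]


spec = parse_severity("C/M/m*")
print(spec.levels, spec.management_affecting)
```

Under this reading, a cell of `C/M/m*` means the alarm can be raised as Critical, Major, or Minor, and it is management-affecting from Minor upward.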
Alarm ID: 400.001 |
Service group failure; <list of affected services>. or Service group degraded; <list of affected services>. or Service group warning; <list of affected services>. |
Entity Instance: |
service_domain=<domain_name>.service_group=<group_name>.host=<hostname> |
Degrade Affecting Severity: |
None |
Severity: |
C/M/m* |
Proposed Repair Action |
Contact next level of support. |
Alarm ID: 400.002 |
Service group loss of redundancy; expected <num> standby member<s> but only <num> standby member<s> available. or Service group loss of redundancy; expected <num> active member<s> but no active members available. or Service group loss of redundancy; expected <num> active member<s> but only <num> active member<s> available. |
Entity Instance: |
service_domain=<domain_name>.service_group=<group_name> |
Degrade Affecting Severity: |
None |
Severity: |
M* |
Proposed Repair Action |
Bring a controller node back into service; otherwise, contact next level of support. |
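For monitoring scripts, the counts in the 400.002 reason text can be extracted with a regular expression. This is a sketch only: the function name `parse_redundancy_loss` is hypothetical, and it assumes the text emitted at run time matches the templates in the table above exactly, with `<num>` replaced by an integer and `member<s>` rendered as `member` or `members`.

```python
import re
from typing import Optional

# Pattern built from the three 400.002 message templates listed above.
_REDUNDANCY_RE = re.compile(
    r"Service group loss of redundancy; "
    r"expected (?P<expected>\d+) (?P<role>standby|active) members? "
    r"but (?:only (?P<available>\d+)|no) (?P=role) members? available\."
)


def parse_redundancy_loss(reason_text: str) -> Optional[dict]:
    """Return role, expected count, and available count, or None if no match."""
    match = _REDUNDANCY_RE.search(reason_text)
    if match is None:
        return None
    available = match.group("available")
    return {
        "role": match.group("role"),
        "expected": int(match.group("expected")),
        # The "no active members available" variant carries no number.
        "available": int(available) if available is not None else 0,
    }


example = (
    "Service group loss of redundancy; "
    "expected 1 standby member but only 0 standby members available."
)
print(parse_redundancy_loss(example))
```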
Alarm ID: 400.003 |
License key is not installed; a valid license key is required for operation. or License key has expired or is invalid; a valid license key is required for operation. or Evaluation license key will expire on <date>; there are <num_days> days remaining in this evaluation. or Evaluation license key will expire on <date>; there is only 1 day remaining in this evaluation. |
Entity Instance: |
host=<hostname> |
Degrade Affecting Severity: |
None |
Severity: |
C* |
Proposed Repair Action |
Contact next level of support to obtain a new license key. |
Alarm ID: 400.005 |
Communication failure detected with peer over port <linux-ifname>. or Communication failure detected with peer over port <linux-ifname> within the last 30 seconds. |
Entity Instance: |
host=<hostname>.network=<mgmt | oam | cluster-host> |
Degrade Affecting Severity: |
None |
Severity: |
M* |
Proposed Repair Action |
Check cabling and far-end port configuration and status on adjacent equipment. |
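The Entity Instance values in the tables above are `key=value` pairs joined by dots. A minimal Python sketch for splitting such a string into its fields follows. The function name `parse_entity_instance` and the sample values (`controller`, `web-services`, `controller-0`) are hypothetical, and the code assumes keys consist only of letters, digits, and underscores.

```python
import re


def parse_entity_instance(entity: str) -> dict:
    """Split 'k1=v1.k2=v2' into {'k1': 'v1', 'k2': 'v2'}."""
    fields = {}
    # Split only on dots that are followed by "key=", so that a value
    # which itself contains a dot is not broken apart.
    for part in re.split(r"\.(?=\w+=)", entity.strip()):
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed entity instance segment: {part!r}")
        fields[key] = value
    return fields


# Sample string following the 400.001 entity format; values are illustrative.
sample = "service_domain=controller.service_group=web-services.host=controller-0"
print(parse_entity_instance(sample))
```

Splitting on the lookahead rather than on every dot is a defensive choice; if values never contain dots in a given deployment, a plain `split(".")` would behave the same.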