...

Feature/Functionality  

Gap Identified 

Feedback/Comments 

Alarm Lifecycle 

Our alert states are more complex. We have at least 5 states, of which 4 are represented in the GUI, for example by flashing or by different fonts.

 

Multiple stages of Ack 

We have first- and second-line support and their acknowledgements are represented in the GUI. 
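As a rough illustration of what this requirement implies for the data model, the sketch below records an acknowledgement per support line and derives a label the GUI could display. The names `SupportLine`, `Alarm`, `acknowledge` and `ack_label` are hypothetical and do not come from any existing system.

```python
from dataclasses import dataclass, field
from enum import Enum


class SupportLine(Enum):
    FIRST = 1
    SECOND = 2


@dataclass
class Alarm:
    alarm_id: str
    # Maps SupportLine -> name of the user who acknowledged (hypothetical model).
    acks: dict = field(default_factory=dict)

    def acknowledge(self, line: SupportLine, user: str) -> None:
        """Record an acknowledgement from one support line."""
        self.acks[line] = user

    def ack_label(self) -> str:
        """Return the text a GUI row could show for the current ack stage."""
        if SupportLine.SECOND in self.acks:
            return "acked by 2nd line"
        if SupportLine.FIRST in self.acks:
            return "acked by 1st line"
        return "unacknowledged"
```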

 

Correlation 

The initial information for a particular alarm may change over time (during the Alarm Lifecycle), or the alarm may be removed quickly. For example, multiple alarms that are reported immediately could be “squashed” into a single alarm after a few seconds.
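One possible reading of the "squashing" behaviour, sketched under assumptions: raw alarms that share a cause and arrive within a short window are grouped into a single correlated alarm. The `cause_key` field, the 5-second window and all class and function names are hypothetical. How a real system decides that alarms belong together is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class RawAlarm:
    source: str       # e.g. a hostname
    cause_key: str    # hypothetical key identifying a shared cause
    timestamp: float  # seconds


@dataclass
class CorrelatedAlarm:
    cause_key: str
    first_seen: float
    members: list = field(default_factory=list)


def correlate(raw_alarms, window_seconds=5.0):
    """Group alarms with the same cause_key seen within window_seconds
    of the first alarm in the group."""
    open_groups = {}
    result = []
    for alarm in sorted(raw_alarms, key=lambda a: a.timestamp):
        group = open_groups.get(alarm.cause_key)
        if group is None or alarm.timestamp - group.first_seen > window_seconds:
            # Start a new correlated alarm for this cause.
            group = CorrelatedAlarm(alarm.cause_key, alarm.timestamp)
            open_groups[alarm.cause_key] = group
            result.append(group)
        group.members.append(alarm)
    return result
```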

 

Coalescing 

Multiple instances of an identical alarm need to be “merged” in the GUI, with an indication that this has happened and how often.
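A minimal sketch of coalescing, assuming that "identical" means the same source and the same description. That definition, and the names `CoalescedAlarm` and `coalesce`, are assumptions made for illustration. The `count` field carries the "how often" indication.

```python
from dataclasses import dataclass


@dataclass
class CoalescedAlarm:
    source: str
    description: str
    count: int         # how many identical instances were merged
    first_seen: float
    last_seen: float


def coalesce(alarms):
    """Merge (source, description, timestamp) tuples into one row per
    identical alarm, keeping an occurrence count."""
    merged = {}
    for source, description, ts in alarms:
        key = (source, description)
        row = merged.get(key)
        if row is None:
            merged[key] = CoalescedAlarm(source, description, 1, ts, ts)
        else:
            row.count += 1
            row.first_seen = min(row.first_seen, ts)
            row.last_seen = max(row.last_seen, ts)
    return list(merged.values())
```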

 

Backend health status UI

The GUI must contain a panel, or some other indication, of the results of various real-time health checks on backend systems.

 

Priority 

Severity and priority are different things. We have a priority numbering scheme, and it is used by first- and second-line support.

 

History + Search 

We need to keep all alarms that have ever happened, together with their internal components (for example, individual BGP peerings or link states), so that we can provide other services with reporting on service availability and utilisation.

 

Alarm logical post-processing rules 

The OC can specify logical rules that change the characteristics of alarms. For example, if an alarm description contains particular keywords, or is related to a particular location, then the GUI severity can be changed automatically, comments can be added, or the alarm can perhaps be hidden. It should be possible to apply this logic as new alarms enter some particular state, or to apply it to the existing database of old alarms.
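The rule behaviour described above could be sketched as a list of condition/action pairs. The same `run_rules` function can be fed either newly arriving alarms or a batch read back from the alarm history. Everything here (the `Alarm` fields, `Rule`, the keyword "maintenance", the location "LAB") is a hypothetical example, not an existing rule set.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Alarm:
    description: str
    location: str
    severity: str
    comments: list = field(default_factory=list)
    hidden: bool = False


@dataclass
class Rule:
    matches: Callable[[Alarm], bool]  # condition on the alarm
    apply: Callable[[Alarm], None]    # action that modifies the alarm


def downgrade_maintenance(alarm: Alarm) -> None:
    alarm.severity = "info"
    alarm.comments.append("Downgraded: planned maintenance")


def hide(alarm: Alarm) -> None:
    alarm.hidden = True


# Hypothetical example rules: one keyword-based, one location-based.
RULES = [
    Rule(lambda a: "maintenance" in a.description.lower(), downgrade_maintenance),
    Rule(lambda a: a.location == "LAB", hide),
]


def run_rules(rules, alarms) -> int:
    """Apply every matching rule to every alarm; return the number of
    rule applications performed."""
    applied = 0
    for alarm in alarms:
        for rule in rules:
            if rule.matches(alarm):
                rule.apply(alarm)
                applied += 1
    return applied
```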

 

Filtering 

Complex filtering is needed, with filter groups that can be combined using AND / NOT.
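To make "filter groups" concrete, the sketch below builds filters as small composable predicates, with an AND group and a NOT wrapper. Alarms are modelled as plain dictionaries, and the helper names `field_equals`, `all_of` and `negate` are hypothetical.

```python
def field_equals(name, value):
    """Filter that matches alarms whose field `name` equals `value`."""
    return lambda alarm: alarm.get(name) == value


def all_of(*filters):
    """AND group: matches only if every member filter matches."""
    return lambda alarm: all(f(alarm) for f in filters)


def negate(f):
    """NOT: inverts the wrapped filter."""
    return lambda alarm: not f(alarm)


def apply_filter(f, alarms):
    """Return the alarms accepted by filter f, preserving order."""
    return [a for a in alarms if f(a)]
```

For example, `all_of(field_equals("severity", "critical"), negate(field_equals("location", "LAB")))` would select critical alarms that are not in the (hypothetical) LAB location.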

 

Alarm internal details 

Our OC requires that the internal components of an alarm are easily browsable from the GUI. For example, if an optical cut causes multiple IP circuit and BGP peering failures, the OC must be able to “open” these details (e.g. hostnames, ports, event start/end times, multiple flaps of the same, etc.) by clicking on the top-level alarm row.
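One way to picture the required drill-down is a top-level alarm that owns a list of component events, each with its own hostname, port and flap history. This is only a sketch of a possible data shape. The class names, field names and the `expand` method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentEvent:
    kind: str       # e.g. "ip-circuit" or "bgp-peering"
    hostname: str
    port: str
    # Each flap is a (start, end) pair of timestamps; end is None while ongoing.
    flaps: list = field(default_factory=list)


@dataclass
class TopLevelAlarm:
    title: str
    components: list = field(default_factory=list)

    def expand(self):
        """Return one detail line per component, as a GUI might show
        when the top-level row is clicked."""
        return [
            f"{c.kind} {c.hostname}:{c.port} flaps={len(c.flaps)}"
            for c in self.components
        ]
```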

 

...