Alarm management in Industries


Figure 1. Buzzing Alarm image


Alarms, priorities and management


A brief introduction to alarm management


One of the daily tasks of operators is to intervene and fix the issues that alarms report. Some alarms are more urgent than others, so operators need to manage priorities. This becomes overwhelming when an operator has more than 100 alarms per day to handle. Stress leads to poor decision-making, and poor decisions can have dangerous consequences for the factory. We can reduce this risk with well-designed visual displays and notifications.


A look back in history


Figure 2. Texaco Milford Haven refinery on fire


According to Etymonline, the concept of an alarm (from the French “a l’arme”, which means “spring to arms”) is very old and comes from the military, where a guard warns his fellows of an attack. A more familiar example is a kettle that whistles when the water boils.


According to the Health and Safety Executive report The Explosion and Fires at the Texaco Refinery, Milford Haven, 24 July 1994 [1], a fire broke out at the Texaco Milford Haven refinery on 24 July 1994. The accident was the most expensive incident in the UK between 1974 and 1996. Despite the use of alarms, several external factors caused a fire in the plant. In the last 11 minutes before the explosion, the two operators in the control room had to recognize, acknowledge, and act on 275 alarms (one alarm every 2-3 seconds). They abandoned the DCS (Distributed Control System) as being more of a nuisance than a help (each alarm was announced by both a visual and an audible signal).


This event led to an increase in the gas price across the entire country, and the HSE (Health and Safety Executive) investigated the incident. The investigation and further work sponsored by the HSE led to the writing of the EEMUA 191 guideline.

This event highlighted the need for alarm management, operator interface design, and operator training.

In this article, we will discuss what alarms mean and how to manage them. We will also share some tips and comments on visual displays. Lastly, we will look at the side effects of poorly defined alarms, which visual displays make directly observable.


Notions

First, let’s define the notions we need in order to understand the different types of alarms.


Alarm: an audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a timely response.

Fleeting Alarm: a type of nuisance alarm that appears and disappears frequently. Operators cannot act upon these.

Chattering Alarm: an alarm that transitions frequently between the ON and OFF states within a short period. These alarms cause a lot of noise in the system and are the ones that distract operators the most.

Frequent Alarm: an alarm that occurs very often. Many frequent alarms in your system mean that something is wrong either with the facility or with the alarm settings. Eliminating the top 10 of these alarms can reduce the alarm load significantly, so the most frequent alarms should be reviewed on a daily, weekly, or monthly basis.

Standing Alarm: an alarm that remains in the active state for a long time, even after being acknowledged. Standing alarms result from faulty alarm limits and malfunctioning equipment.
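The definitions of chattering and standing alarms can be turned into simple checks on an alarm log. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, and the thresholds (3 activations within 60 seconds for chattering, 24 hours for standing) are assumptions chosen for illustration that should be set according to your own alarm philosophy.

```python
from datetime import datetime, timedelta


def is_chattering(activations, window=timedelta(seconds=60), threshold=3):
    """Return True if `threshold` or more activations fall within any `window`.

    `activations` is a list of datetimes at which the alarm switched ON.
    The default window and threshold are illustrative assumptions.
    """
    times = sorted(activations)
    start = 0
    for end in range(len(times)):
        # Move the left edge until the span between both edges fits the window.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False


def is_standing(activated_at, now, limit=timedelta(hours=24)):
    """Return True if a still-active alarm has been active for longer than `limit`.

    The 24-hour default is an illustrative assumption.
    """
    return now - activated_at > limit
```

A sliding window is used for the chattering check so that a burst is detected wherever it occurs in the log, not only at fixed clock boundaries.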


Alarm Management


According to Dunn and Sands [2], writing in ISA's InTech magazine (https://www.isa.org/intech-home), alarm management is the taming of the alarm system.

In the past, each alarm in the control room was connected by a wire to its own sensor. A lot of thought went into the design of the control room, because each alarm meant several hundred meters of wire. Nowadays, with the digitalization of the control room, one console has access to every alarm in the plant. This is very convenient for the operator, but it also means that the cost of creating an alarm has dropped dramatically. The result is almost always far too many alarms. Often the definition of an alarm is not respected, and mere events are labelled as alarms. This creates confusion and stress for the operator.

As history shows, order comes after chaos. Just as TCP/IP unified network communication, a standard called ISA SP18.2 emerged to define what an alarm really is, as opposed to an event.

The standard gave birth to the notion of an alarm philosophy.

According to EEMUA in Alarm systems: a guide to design, management and procurement (2016) [3], the alarm philosophy helps determine whether an event is labelled as an alarm or remains an event. It ensures that each alarm created has a purpose and a meaning, resulting in an efficient alarm system.

Alarm management is composed of several steps, of which creating the alarm philosophy is the most critical. This step conditions the rest of the development of alarm management.


Figure 3. Alarm Management life cycle


As noted above, and according to EEMUA 191, Alarm systems: a guide to design, management and procurement [4], developing alarm management within the plant can be very beneficial, not only to prevent risk but also to ensure personnel and environmental safety, equipment integrity, and product quality control. It can also reduce production costs and generate profit.

We have to keep in mind that alarm management is not a one-time job. It is a process of continuous improvement, as shown in figure 3.

A quick example is the number of operators available in your factory: alarm management should adapt as that number increases or decreases. The goal is to take such changing factors into account to ensure the best performance of alarm management.

Now that we have established that alarm management is needed to ensure cost savings and safety across the plant, the next section looks at a powerful way to support it: alarm visualization.


Alarm Visualization


First of all, what is visualization in general? Visualization is any technique for creating images, diagrams, or animations to communicate a message.


The goal of alarm visualization is to help the operator or manager target abnormal conditions by monitoring Key Performance Indicators (KPIs).


Before talking about how to visualize alarms, we have to understand what to visualize. The ISA SP18.2 standard recommends KPIs to prevent operators from being overloaded with alarms. Keep in mind that the recommended values should be adapted to your own plant's specifications. The standard is quite specific about plant performance, and a dedicated diagram can display the KPIs in a way that gives quick access to it.


Figure 4. Performance chart diagram.


Figure 4 shows this performance chart. It gives rapid access to the state of the plant and is an excellent overview of the system before digging further into the problems. It helps to prioritize tasks across the plant. If the performance of the plant is good enough, there is no need to rush to find where the problems are, and the alarm management life cycle can follow its planned schedule. But because alarm management is a continuous improvement process, performance still has to be monitored regularly.
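To make the idea of monitoring KPIs concrete, here is a minimal sketch that derives two rate indicators from a list of alarm timestamps. The text does not list the KPIs behind the performance chart, so the choice of average and peak alarm count per 10-minute bucket is an assumption, as are the names `bucket_start` and `rate_kpis`.

```python
from collections import Counter
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=10)  # assumed bucket size for this sketch


def bucket_start(ts):
    """Round a timestamp down to the start of its 10-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)


def rate_kpis(timestamps):
    """Return the average and peak number of alarms per 10-minute bucket."""
    buckets = Counter(bucket_start(ts) for ts in timestamps)
    if not buckets:
        return {"average_per_10_min": 0.0, "peak_per_10_min": 0}
    # Empty buckets between the first and last alarm count towards the average.
    n_buckets = (max(buckets) - min(buckets)) // BUCKET + 1
    return {
        "average_per_10_min": sum(buckets.values()) / n_buckets,
        "peak_per_10_min": max(buckets.values()),
    }
```

Tracking the peak alongside the average matters because a plant can look calm on average while still producing short floods that overwhelm the operator.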


Alarms are ordered by priority. The priority of an alarm provides two key pieces of information. The first helps with concurrent alarms: it tells the operator which one to handle first. The second relates to the severity of the alarm. For instance, when a critical alarm occurs, the operator might call the supervisor to take emergency measures, thus escalating the issue to the next level of support.

The ISA SP18.2 standard is very specific about the distribution of alarm priorities, and it is easy to understand why: if too many alarms are labelled Critical, for example, the label loses its meaning. The purpose of labelling an alarm as Critical is to make sure that the response is fast when it comes up. But if every alarm is Critical, you can no longer tell the Critical ones from the non-Critical ones.

To manage performance, we need to visualize how priorities are distributed among the alarms.

Figure 5. Pie chart representing severity distribution of alarms.


To visualize the priority distribution (or any distribution), a pie chart tends to be the best representation, because proportions are easy to distinguish as long as they are not too similar. The ISA SP18.2 standard sets a target of about 80% for the proportion of Low priority alarms. The example in figure 5 lets us check this at a glance, without digging into reports or abstract data in tables.
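The numbers behind such a pie chart are simple percentages. The sketch below computes them and flags priorities whose share exceeds a target. Only the 80% figure for Low priority comes from the text; the Medium and High targets, the priority labels, and the function names are illustrative assumptions.

```python
from collections import Counter

# Target shares in percent. Only the Low value is taken from the text;
# Medium and High are placeholders to replace with your plant's own targets.
TARGET_SHARE = {"Low": 80.0, "Medium": 15.0, "High": 5.0}


def priority_distribution(priorities):
    """Return the share of each priority in percent."""
    counts = Counter(priorities)
    total = sum(counts.values())
    return {p: 100.0 * n / total for p, n in counts.items()}


def over_target(distribution, targets=TARGET_SHARE):
    """Return the priorities whose share exceeds its target."""
    return {
        p: share
        for p, share in distribution.items()
        if share > targets.get(p, 0.0)
    }
```

For example, a log with 7 Low, 1 Medium, and 2 High alarms gives a High share of 20%, which this check reports as above target.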


Figure 6. Stacked area chart representing alarm occurrences of different types over a given period.


Another very powerful representation of the alarm distribution is the stacked area chart. It makes it easy to associate a known state of the process on a given date with the number of alarms and their priorities. The stacked area chart in figure 6 shows, for each date, the total number of occurrences and the proportion of each priority. For instance, there was a large increase in severe alarms on 10/28/2021.
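A stacked area chart needs, for each date, the count of alarms per priority and the cumulative height of each layer. The following sketch prepares that data from (date, priority) pairs; the input shape, the priority labels, and the function names are assumptions, and the plotting itself is left to whatever charting tool you use.

```python
from collections import Counter, defaultdict
from datetime import date


def daily_counts_by_priority(events):
    """Count alarms per day and priority from (date, priority) pairs."""
    table = defaultdict(Counter)
    for day, priority in events:
        table[day][priority] += 1
    return {day: dict(table[day]) for day in sorted(table)}


def stacked_layers(table, order):
    """For each day, return the cumulative top of each layer, bottom to top."""
    layers = {}
    for day, counts in table.items():
        running, tops = 0, []
        for priority in order:
            running += counts.get(priority, 0)
            tops.append(running)
        layers[day] = tops
    return layers
```

The last value of each list is the total for that day, so a spike in one priority shows up both in its own layer and in the overall height.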


Overwhelmed operators


The main purpose of an alarm is to inform the operator of a malfunction, and too many alarms defeat that purpose. Alarm overload can occur for several reasons. One of them is incorrectly assigned trigger values.

Such values generate alarm messages for the operator even though the process is running under normal conditions. Identifying and targeting the tags concerned can be a key factor in reducing the operator's stress.


Figure 6. Top 10 Occurrences by Tag


To reduce the number of alarms during normal conditions, we need to identify the most frequently occurring alarms. This tells us which alarms to fix first. The Top 10 Occurrences by Tag chart (the second figure 6) shows the most frequent tags, listed by tag name. Eliminating the top 2 occurring alarms will significantly decrease the number of alarms the operator needs to handle.
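Ranking tags by occurrence is a straightforward count. Here is a minimal sketch, assuming the alarm log can be reduced to a list of tag names with one entry per occurrence; the function name `top_tags` is hypothetical.

```python
from collections import Counter


def top_tags(tags, n=10):
    """Return the n most frequent tags as (tag, count, share in percent)."""
    counts = Counter(tags)
    total = sum(counts.values())
    return [
        (tag, count, 100.0 * count / total)
        for tag, count in counts.most_common(n)
    ]
```

Reporting the share next to the raw count shows how much of the total alarm load would disappear if a given tag were fixed.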


Another key point to manage closely is the duration of each tag. The duration of an alarm is the time elapsed between the tag's change of status to ALM (alarm) and its change to RTN (return to normal). Managing durations also helps to reduce the number of alarms an operator has to handle. If a tag remains active for a very long time while the plant is working under normal conditions, the alarm does not respect its definition: it is not relevant and is a nuisance.


Figure 7. Top 20 Duration by Tag


Figure 7 is a good example: tag 35TI24305C might annoy our operator to the point that they have hidden it from their monitor entirely.

Another way to target problematic tags is to use a TreeMap. The TreeMap plots the distribution of tags at different levels. It can be really powerful if you have multiple Plant Areas or even multiple plants, because it helps you rapidly target the most problematic Plant Areas and tags. You can then take quick actions and decisions before diving further into the investigation.
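The duration of an alarm, defined above as the time between the ALM and RTN status changes, can be computed by pairing those events per tag. The sketch below assumes a log of (timestamp, tag, status) tuples and hypothetical function names; an alarm that has not yet returned to normal is simply left out of the totals.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def alarm_durations(events):
    """Sum the time each tag spent in alarm, from ALM to the next RTN."""
    active = {}  # tag -> timestamp of the ALM that is still open
    totals = defaultdict(timedelta)
    for ts, tag, status in sorted(events, key=lambda e: e[0]):
        if status == "ALM":
            # Keep the first ALM if the tag is already active.
            active.setdefault(tag, ts)
        elif status == "RTN" and tag in active:
            totals[tag] += ts - active.pop(tag)
    return dict(totals)


def top_durations(durations, n=20):
    """Return the n tags with the longest total time in alarm."""
    return sorted(durations.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Tags that dominate this ranking while the plant runs normally are candidates for a review of their alarm limits.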


Figure 8. TreeMap distribution of the occurrences of Tag


Figure 8 makes it easy to see that “TANKAGE” is the most problematic Plant Area, and that “35TI24305C” is the most frequent tag and therefore the most problematic one.
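A TreeMap is drawn from a two-level aggregation: occurrences per Plant Area, then per tag within each area. The sketch below builds that hierarchy from (plant area, tag) pairs. The input shape and function names are assumptions, and any area or tag name other than TANKAGE and 35TI24305C in the usage example is made up for illustration.

```python
from collections import Counter, defaultdict


def treemap_hierarchy(occurrences):
    """Group alarm occurrences by plant area, then by tag."""
    areas = defaultdict(Counter)
    for area, tag in occurrences:
        areas[area][tag] += 1
    return {
        area: {"total": sum(tags.values()), "tags": dict(tags)}
        for area, tags in areas.items()
    }


def most_problematic(hierarchy):
    """Return the busiest plant area and the most frequent tag inside it."""
    area = max(hierarchy, key=lambda a: hierarchy[a]["total"])
    tags = hierarchy[area]["tags"]
    return area, max(tags, key=tags.get)


# Usage with made-up counts:
log = (
    [("TANKAGE", "35TI24305C")] * 4
    + [("TANKAGE", "TAG_B")] * 2
    + [("UTILITIES", "TAG_C")] * 3
)
print(most_problematic(treemap_hierarchy(log)))
```

The area totals give the size of the outer rectangles and the tag counts give the size of the inner ones.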


Side Effects


Now that we have discussed what to visualize and what those visualizations help us target, let's look at the side effects of over-visualizing. As with having too many alarms, trying to visualize everything can be meaningless and can overwhelm the operator.

The purpose of visualization is to help the operator find key information and save time before taking action. Giving the operator every KPI and every representation can leave them lost, and the alarm visualization is then no help at all.


So it is important to remember, as Stephen Few said in Information Dashboard Design [5], that dashboards are “visual display[s] of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.”


Conclusion


To conclude, alarm visualization can be really powerful. It helps to increase the level of safety for operators and equipment, and it helps to prevent abnormal conditions and therefore to save costs.

Alarm management and alarm visualization are not as efficient when one is implemented without the other. They work as a pair and will help you unleash the performance of your plant.


References


[1] Great Britain. Health and Safety Executive (1997). The Explosion and Fires at the Texaco Refinery, Milford Haven, 24 July 1994: A Report of the Investigation by the Health and Safety Executive Into the Explosion and Fires on the Pembroke Cracking Company Plant at the Texaco Refinery, Milford Haven on 24 July 1994.


[2] How to achieve an effective and efficient alarm management program (https://www.isa.org/intech-home/2020/march-april/features/alarm-management-questions-that-everyone-asks) By Donald G. Dunn and Nicholas P. Sands, PE, CAP



[3] Engineering Equipment And Materials Users' Association (2013). Alarm systems : a guide to design, management and procurement. London: The Engineering Equipment And Materials Users’ Association.


[4] Alarm systems : a guide to design, management and procurement. (2016). London: EEMUA.


[5] Few, S. (2006). Information dashboard design : the effective visual communication of data. Sebastopol, Ca: O’reilly.