
List:       nagios-users
Subject:    [Nagios-users] Escalation notification count and state change
From:       Neil Ramsay <neil.ramsay () market-source ! com>
Date:       2009-10-27 1:20:11
Message-ID: ecf3ab400910261820s43aec64er50fb935cbf83e156 () mail ! gmail ! com

Hi,

I've searched and found similar posts but unfortunately no replies to this
type of problem. I'd expect this to be a common problem but maybe I've
misread the documentation.

On Nagios 3.2.0 we have service notifications set to go out for Warning and
for Critical states to an email address 24x7.

In addition, during 'after hours' the on-call engineer receives SMS alerts
for all Critical notifications, and the backup engineer should receive
escalations from the 4th Critical notification onwards. However, last night
the backup engineer received an SMS on the second Critical notification.


define serviceescalation{
        hostgroup_name          switches,primary_nodes,secondary_nodes
        service_description     *
        first_notification      4
        last_notification       0
        notification_interval   5
        contact_groups          PrimaryAH,SecondaryAH
        escalation_period       afterhours
        escalation_options      u,c
}

# Primary After-hours contacts
define contactgroup{
        contactgroup_name       PrimaryAH
        alias                   Primary After-Hours contact
        members                 supportEmail,Engineer1
}

# Secondary After-hours contacts
define contactgroup{
        contactgroup_name       SecondaryAH
        alias                   Secondary After-Hours contact
        members                 supportEmail,Engineer2
}





It appears that the escalation logic ignores the actual state when counting
notifications. If the 1st notification is Critical and the 4th is Critical,
but the 2nd and 3rd are Warnings, the 4th notification is escalated because
it is Critical, even though only one Critical notification had been sent
before it and the other two were Warnings. I was hoping that Nagios would
escalate only on the 4th Critical notification.

See event log below:

Service Notification[10-26-2009 23:48:32] SERVICE NOTIFICATION:
Engineer2;winapps1;NT memory usage;CRITICAL;notify-service-by-sms;Mem: 984
MB (96%) / 1023 MB (3%) Paged Mem: 1189 MB (48%) / 2469 MB (51%)

Service Notification[10-26-2009 23:48:32] SERVICE NOTIFICATION:
Engineer1;winapps1;NT memory usage;CRITICAL;notify-service-by-sms;Mem: 984
MB (96%) / 1023 MB (3%) Paged Mem: 1189 MB (48%) / 2469 MB (51%)



Service Notification[10-26-2009 23:38:32] SERVICE NOTIFICATION:
supportEmail;winapps1;NT memory usage;WARNING;notify-service-by-email;Mem:
950 MB (92%) / 1023 MB (7%) Paged Mem: 1219 MB (49%) / 2469 MB (50%)


Service Notification[10-26-2009 23:28:32] SERVICE NOTIFICATION:
supportEmail;winapps1;NT memory usage;WARNING;notify-service-by-email;Mem:
881 MB (86%) / 1023 MB (13%) Paged Mem: 1192 MB (48%) / 2469 MB (51%)


Service Notification[10-26-2009 23:18:32] SERVICE NOTIFICATION:
supportEmail;winapps1;NT memory usage;CRITICAL;notify-service-by-email;Mem:
1006 MB (98%) / 1023 MB (1%) Paged Mem: 1152 MB (46%) / 2469 MB (53%)

Service Notification[10-26-2009 23:18:32] SERVICE NOTIFICATION:
Engineer1;winapps1;NT memory usage;CRITICAL;notify-service-by-sms;Mem: 1006
MB (98%) / 1023 MB (1%) Paged Mem: 1152 MB (46%) / 2469 MB (53%)
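
To make the counting concrete, here is a minimal Python sketch of the two
behaviours described above. It is only a model of what the event log
suggests, not Nagios code: the function names and the list of states are
made up for illustration, and the constants simply mirror first_notification
and escalation_options from the serviceescalation definition.

```python
# Values mirrored from the serviceescalation definition in this post.
FIRST_NOTIFICATION = 4                        # first_notification 4
ESCALATION_STATES = {"UNKNOWN", "CRITICAL"}   # escalation_options u,c


def escalated_observed(states):
    """Model of the observed behaviour (assumption based on the log):
    every notification advances the counter, whatever its state."""
    result = []
    for number, state in enumerate(states, start=1):
        result.append(number >= FIRST_NOTIFICATION and state in ESCALATION_STATES)
    return result


def escalated_hoped(states):
    """Model of the hoped-for behaviour: only notifications in an
    escalation state advance the counter."""
    result = []
    matching = 0
    for state in states:
        if state in ESCALATION_STATES:
            matching += 1
            result.append(matching >= FIRST_NOTIFICATION)
        else:
            result.append(False)
    return result


# The four notifications from the event log, oldest first (23:18 to 23:48).
log_states = ["CRITICAL", "WARNING", "WARNING", "CRITICAL"]

print(escalated_observed(log_states))  # [False, False, False, True]
print(escalated_hoped(log_states))     # [False, False, False, False]
```

With the observed model the 23:48 notification is the 4th overall and is
Critical, so it goes to the backup engineer; with the hoped-for model it
would only be the 2nd Critical and would not be escalated.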


Is there a way to avoid this behaviour?

Thanks

Neil



