List:       nagios-devel
Subject:    Re: [Nagios-devel] [PATCH] - 3.0.3: only send out a service
From:       Max <perldork () webwizarddesign ! com>
Date:       2009-07-23 18:09:54
Message-ID: f7aebadd0907231109u783e53edibf89ef8f8a41d045 () mail ! gmail ! com

I did an internal code review of this patch, and the history-checking
logic is overly naive and broken :p.  The patch as it stands *does*
work for the case where

* Escalation is in scope
* Previous service problem state *is* listed in the escalation
* Service recovers

If, to use Thomas's example again, the service specifies c,r as
notification states and the escalation specifies c,r as escalation
options and the service goes

critical -> warning -> recovery

then the recovery notification will not be sent out, because the
previous problem state (warning) is not listed in the escalation.
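
To make the failure concrete, here is a minimal sketch. It is not the
patch's code: the state codes, option flags, and function names are all
hypothetical, and it simplifies "a problem notification was sent" to
"a listed problem state occurred". The first function looks only at the
state immediately before recovery, which is the behaviour described
above. The second is one possible alternative that walks back through
the whole problem episode; it is not necessarily what the corrected
patch will do.

```c
#include <stddef.h>

/* Hypothetical state codes and option flags; not Nagios's real constants. */
enum svc_state { SVC_OK = 0, SVC_WARNING = 1, SVC_CRITICAL = 2, SVC_UNKNOWN = 3 };

#define OPT_WARNING  (1u << SVC_WARNING)
#define OPT_CRITICAL (1u << SVC_CRITICAL)
#define OPT_UNKNOWN  (1u << SVC_UNKNOWN)

/* Returns 1 if `state` is a problem state selected in `opts`. */
static int state_listed(enum svc_state state, unsigned opts)
{
    return state != SVC_OK && (opts & (1u << state)) != 0;
}

/* Naive check: only the state immediately before the recovery is examined.
 * history[n-1] is the newest entry and must be the recovery (SVC_OK). */
static int recovery_due_naive(const enum svc_state *history, size_t n, unsigned opts)
{
    if (n < 2 || history[n - 1] != SVC_OK)
        return 0;
    return state_listed(history[n - 2], opts);
}

/* Broader check: walk back from the recovery to the previous OK state and
 * report whether any state in that problem episode was listed. */
static int recovery_due_scan(const enum svc_state *history, size_t n, unsigned opts)
{
    if (n < 2 || history[n - 1] != SVC_OK)
        return 0;
    for (size_t i = n - 1; i-- > 0; ) {
        if (history[i] == SVC_OK)
            break;                      /* start of the problem episode */
        if (state_listed(history[i], opts))
            return 1;
    }
    return 0;
}
```

With the history critical -> warning -> recovery and only critical
listed as a problem state, the naive check sees warning as the previous
state and suppresses the recovery, while the scan finds the earlier
critical state.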

My apologies for sending this patch out without doing this level of
testing first ..

The patch works for us because we use escalations as our primary form
of notification. We make extensive use of the

service -> hostgroup -> host

mappings, and multiple users on separate projects at our organization
use Nagios, so we needed a way to let each group have separate
notification rules without forcing each group to make copies of a
service or define its own version using inheritance.

To do this, we have shared services notify on every state and set the
string 'do_nothing' as the notification command. We then patched
Nagios to recognize the string 'do_nothing' as a null notification
command: it doesn't actually shell out or call the Perl interpreter,
it just returns from the system command routine. That lets us use
escalations as the primary notification mechanism, so each project
user can define whatever notification policy they want without having
to touch any service definitions for shared services.
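
The 'do_nothing' short-circuit described above could look roughly like
the sketch below. This is an illustration only, not our actual patch:
run_notification_command() and execute_command() are hypothetical names
standing in for the system command routine and for whatever really
forks a shell, and the counter exists only so the behaviour can be
observed.

```c
#include <string.h>

/* The magic string used as a null notification command. */
#define NULL_NOTIFICATION_COMMAND "do_nothing"

/* Hypothetical stand-in for the code that would fork a shell or call the
 * embedded Perl interpreter. Here it only counts how often it is reached. */
static int commands_executed = 0;

static int execute_command(const char *cmd)
{
    (void)cmd;
    commands_executed++;
    return 0;
}

/* Returns 0 on success, -1 on a missing command. The null command returns
 * immediately without ever reaching execute_command(). */
static int run_notification_command(const char *cmd)
{
    if (cmd == NULL)
        return -1;
    if (strcmp(cmd, NULL_NOTIFICATION_COMMAND) == 0)
        return 0;                       /* nothing to run */
    return execute_command(cmd);
}
```

The point of returning early is that the notification bookkeeping still
happens for the shared service, but no process is spawned for it.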

So in our case notification_number == state_history_index because
every state change for a service triggers a notification counter
increment .. which means my narrow-minded code works for us just fine
.. but for anyone not using our unique setup it will break.
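
A simplified model of why the two counters agree only in a setup like
ours is sketched below. It is an assumption-laden illustration, not
Nagios code: the struct, the replay() function, and the bit masks are
hypothetical, and the model ignores renotifications and counter resets.
It counts state changes (standing in for state_history_index) and the
subset of those changes whose new state is in the notification options
(standing in for notification_number).

```c
#include <stddef.h>

/* Hypothetical state codes; each state's bit in a mask is (1u << state). */
enum svc_state { SVC_OK = 0, SVC_WARNING = 1, SVC_CRITICAL = 2, SVC_UNKNOWN = 3 };

#define NOTIFY_RECOVERY (1u << SVC_OK)
#define NOTIFY_WARNING  (1u << SVC_WARNING)
#define NOTIFY_CRITICAL (1u << SVC_CRITICAL)
#define NOTIFY_UNKNOWN  (1u << SVC_UNKNOWN)
#define NOTIFY_ALL      (NOTIFY_RECOVERY | NOTIFY_WARNING | NOTIFY_CRITICAL | NOTIFY_UNKNOWN)

struct counters {
    unsigned state_changes;   /* stands in for state_history_index */
    unsigned notifications;   /* stands in for notification_number */
};

/* Replays a sequence of check results and counts state changes, plus the
 * changes whose new state is selected in notify_mask. */
static struct counters replay(const enum svc_state *states, size_t n, unsigned notify_mask)
{
    struct counters c = { 0, 0 };
    for (size_t i = 1; i < n; i++) {
        if (states[i] == states[i - 1])
            continue;                   /* no state change on this check */
        c.state_changes++;
        if (notify_mask & (1u << states[i]))
            c.notifications++;
    }
    return c;
}
```

For ok -> critical -> warning -> ok, notifying on every state gives
equal counters, while notifying on c,r only leaves the notification
count one behind, so any logic that treats the two as interchangeable
breaks.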

Long explanation, but just letting everyone know why my pretty dumb
code works for us and why I didn't catch this in black-box testing
before releasing it.

Will change the code internally, test at our organization using
various service notification state combinations (not just our unique
setup) and do another code review before resending the patch ...
should have a corrected version within a week or so.

Sorry again for the code noise and long-winded explanations.

- Max
