Event monitoring

15 January, 2016 - 09:27
Available under Creative Commons-ShareAlike 4.0 International License. Download for free at http://cnx.org/contents/f6522dce-7e2b-47ac-8c82-8e2b72973784@7.2

Nagios has won many awards. We use it to monitor events from two locations:

  • Our DMZ, where it checks all of our components every 90 seconds and, critically, has thresholds set for Green, Amber and Red. Most components in our large system are duplicated to provide resilience, but it is still vital to know when one of those resilient components has failed, so that the fault can be dealt with before it leads to a system failure.
  • The public Internet. From this location, we can look at the service(s) from the perspective of the end user.
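
The Green, Amber and Red thresholds described above can be illustrated with a minimal sketch. It assumes that the three colours correspond to the standard Nagios plugin return codes OK (0), WARNING (1) and CRITICAL (2), and that a higher reading is worse. The names `Status`, `classify`, `amber_at` and `red_at` are hypothetical and are not taken from our configuration.

```python
from enum import IntEnum

# Interval between checks in the DMZ, as stated in the text.
CHECK_INTERVAL_SECONDS = 90


class Status(IntEnum):
    """Traffic-light states, assumed to map onto Nagios plugin return codes."""
    GREEN = 0  # OK
    AMBER = 1  # WARNING
    RED = 2    # CRITICAL


def classify(value: float, amber_at: float, red_at: float) -> Status:
    """Classify one reading, assuming that higher values are worse."""
    if value >= red_at:
        return Status.RED
    if value >= amber_at:
        return Status.AMBER
    return Status.GREEN


# Hypothetical example: disk usage in percent, amber at 80, red at 95.
for reading in (50.0, 85.0, 97.0):
    print(reading, classify(reading, amber_at=80.0, red_at=95.0).name)
```

Keeping the Amber threshold well below the Red one is what gives staff time to act on a failed resilient component before the service itself is affected.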

Implementing an event monitoring tool such as Nagios is not to be undertaken lightly. In my opinion, two things are more difficult than installing the system in the first place: getting the sensitivity right so that it does not cry wolf, and embedding a culture in which operational staff respond rapidly whenever an alert is sent out.
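
One common way to reduce false alarms is to alert only after several consecutive failed checks, which is the idea behind the `max_check_attempts` setting in Nagios. The sketch below is a simplified illustration of that idea, not a description of how Nagios implements it or of how our installation is tuned. The class `AlertGate` and the threshold of three attempts are assumptions.

```python
class AlertGate:
    """Send one alert only after `max_attempts` consecutive failures."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts  # hypothetical default
        self.failures = 0
        self.alerted = False

    def observe(self, ok: bool) -> bool:
        """Record one check result; return True if an alert should go out."""
        if ok:
            # A successful check resets the count and re-arms the gate.
            self.failures = 0
            self.alerted = False
            return False
        self.failures += 1
        if self.failures >= self.max_attempts and not self.alerted:
            self.alerted = True
            return True
        return False


gate = AlertGate(max_attempts=3)
# Two failures, a recovery, then a sustained outage.
results = [False, False, True, False, False, False, False]
alerts = [gate.observe(ok) for ok in results]
print(alerts)
```

In this run the brief two-check failure raises no alert, and the sustained outage raises exactly one, on its third consecutive failure. A higher threshold means fewer false alarms but a longer delay before staff are told, so the value has to be tuned per service.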