OpenNMS can generate events when it detects outages and their resolutions, and it can produce availability reports for the entire network or for specific subgroups of services.
In addition to service polling, OpenNMS can collect SNMP data from network devices running SNMP agents. It stores the data using RRDTool or JRobin and can display it as reports in the web-based user interface (webUI). Configurable thresholds (on disk space or CPU utilization, for example) generate events when they are exceeded.
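As a rough illustration of threshold-driven events, here is a minimal Python sketch. It assumes an edge-triggered rule: one event each time a collected value rises from below the limit to at or above it. The function name and the sample values are hypothetical, and this is not how OpenNMS itself is configured or implemented.

```python
def threshold_events(samples, limit):
    """Yield (index, value) each time a sample rises to or above the limit.

    Assumption: a sustained breach produces one event, not one per sample.
    """
    above = False
    for i, value in enumerate(samples):
        if value >= limit and not above:
            yield i, value
        above = value >= limit


# Hypothetical CPU utilization samples (percent) checked against a limit of 90.
cpu = [70, 85, 95, 96, 80, 92]
events = list(threshold_events(cpu, 90))
```

With these samples, an event fires at the third reading and again at the sixth, after the value has dropped back below the limit in between.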
One important aspect of data collection on the scale of an enterprise is the need to automate as much of it as possible. It is very difficult to configure data collection on 20,000 devices manually. OpenNMS has the concept of a "system," defined by a particular System Object ID (systemOID), which matches devices with the data to collect from them. Thus, when OpenNMS discovers a Cisco router or a Windows server, it automatically begins data collection without operator intervention.
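The matching idea can be sketched in a few lines of Python, assuming a longest-prefix match between a device's system object ID and a table of data groups. The table, the group names, and the function name are hypothetical and do not reflect actual OpenNMS configuration; only the enterprise prefixes are real (.1.3.6.1.4.1.9 for Cisco, .1.3.6.1.4.1.311 for Microsoft).

```python
# Hypothetical mapping from system OID prefixes to the data groups to collect.
# Group names are illustrative only.
COLLECTION_GROUPS = {
    ".1.3.6.1.4.1.9.": ["mib2-interfaces", "cisco-cpu", "cisco-memory"],
    ".1.3.6.1.4.1.311.": ["mib2-interfaces", "windows-host-resources"],
}
DEFAULT_GROUPS = ["mib2-interfaces"]


def groups_for(system_oid):
    """Return the data groups for a device, using the longest matching prefix."""
    # Append a dot so that ".1.3.6.1.4.1.9" cannot match ".1.3.6.1.4.1.99".
    candidate = system_oid.rstrip(".") + "."
    matches = [prefix for prefix in COLLECTION_GROUPS if candidate.startswith(prefix)]
    if not matches:
        return DEFAULT_GROUPS
    return COLLECTION_GROUPS[max(matches, key=len)]
```

A device that reports an ID under the Cisco prefix would get the Cisco groups as soon as it is discovered, while an unrecognized device falls back to the default group.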
Currently, OpenNMS can collect more than 200,000 data points from 22,000 devices once every 5 minutes--a rate of approximately 2.4 million data points per hour. This limit is due to the speed at which it can write the data to disk; the collection itself takes under 2 minutes.
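The hourly figure follows directly from the numbers above. A short Python check, using only the values quoted in the text:

```python
DATA_POINTS_PER_CYCLE = 200_000   # data points per collection pass (from the text)
DEVICES = 22_000                  # devices polled (from the text)
CYCLE_MINUTES = 5                 # one collection pass every 5 minutes

cycles_per_hour = 60 // CYCLE_MINUTES                       # 12 passes per hour
points_per_hour = DATA_POINTS_PER_CYCLE * cycles_per_hour   # total per hour
points_per_device = DATA_POINTS_PER_CYCLE / DEVICES         # average per device

print(points_per_hour)              # 2400000
print(round(points_per_device, 1))  # 9.1
```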
The last main functional area is event management and notifications. OpenNMS generates events for detected outages, exceeded thresholds, and similar conditions. In addition, it can receive and display external events such as SNMP traps, and there are numerous other ways of getting events into OpenNMS. An included Perl script, send-event.pl, gives even novices an easy way to start using OpenNMS as their main event manager.
One user configured a special emergency email address via procmail that took the contents of any email message sent to that address, turned it into an OpenNMS event, and sent it via send-event.pl to the application, which generated a notification.
Notifications trigger when OpenNMS detects specific events. Each one runs a notification action such as sending an email, page, or SMS; anything that can run from the command line can deliver an OpenNMS notification. Notifications walk a "path" that escalates the issue until someone acknowledges it. Thus the first step can be to send an email, and if no one acknowledges it within 5 minutes, the second step is to send a page. If no one acknowledges that page within, say, 10 minutes, the third step might be to page a manager, and so on. Notifications can also be auto-acknowledged.
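The escalation path described above can be modeled as an ordered list of steps, each with a delay measured from the triggering event. The following Python sketch is only a model of the idea; the class, the function, and the recipient names are hypothetical and are not OpenNMS code or configuration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    delay_minutes: int   # minutes after the triggering event
    action: str          # for example "email" or "page"
    target: str          # hypothetical recipient name


# Email at once, a page 5 minutes later, a page to a manager 10 minutes after that.
PATH = [
    Step(0, "email", "on-call"),
    Step(5, "page", "on-call"),
    Step(15, "page", "manager"),
]


def steps_sent(path, acknowledged_at=None):
    """Return the steps that fire before the notification is acknowledged.

    acknowledged_at is minutes after the event, or None if nobody acknowledges.
    """
    return [
        step for step in path
        if acknowledged_at is None or step.delay_minutes < acknowledged_at
    ]
```

If someone acknowledges after 3 minutes, only the email goes out; after 7 minutes the first page has gone out as well; with no acknowledgment, all three steps fire.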
For example, if a web server experiences an outage and OpenNMS later detects its resolution, it will auto-acknowledge all notifications based on that event. In addition, it is possible to set an initial delay to suppress notifications for the first few minutes of an outage, in order to give the network a chance to correct itself. For those who carry a pager 24/7, this can mean the difference between a bad and a good night's sleep.
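The effect of an initial delay is easy to see in a small model. This Python sketch assumes the notification is held for the delay period and dropped if the outage clears first; the function name and its parameters are hypothetical, chosen for illustration.

```python
def should_notify(outage_start, resolved_at, initial_delay):
    """Decide whether an outage ever produces a notification.

    All arguments are minutes on a shared clock; resolved_at is None while
    the outage is still open. The notification is held back for
    initial_delay minutes and dropped if the outage clears by then.
    """
    notify_time = outage_start + initial_delay
    return resolved_at is None or resolved_at > notify_time
```

With a 5-minute delay, an outage that clears after 2 minutes pages no one, while one that lasts 8 minutes does.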
OpenNMS currently supports most Linux distributions, Solaris, and Mac OS X. It has several dependencies, so the best approach is to read the detailed installation guide. OpenNMS is written mostly in Java and requires a 1.4 SDK.
Because Java 1.4 does not have an ICMP API, OpenNMS uses JNI to access a small portion of code written in C. Java 1.5 addresses this issue. Moving OpenNMS to 100 percent Java is on the road map.