SCOM, SNMP and TRAPS or The Good, the Bad and the Ugly : Part 3

If you have followed along this far, and have not ended up with a white jacket with really long sleeves then this next bit should be no problem.

The Problem :
Although it’s nice to be able to poll a device on a regular schedule and log this for the shiny graphs and alert on it once you have all the monitors working, what we really want is a real-time alarm when something goes wrong. When your UPS goes into bypass or your HVAC fails you may not want to wait for the next polling interval.

The Solution :
Strangely, this is the whole reason I started these three articles: I found SNMP traps to be inconsistent and extremely frustrating. I almost gave up and got another product more suited to handling SNMP monitoring. Of course, as with most things, once you get it all sorted out it’s actually quite simple.

Step 1 : SNMP Services
On your Root Management Server, Management Servers and/or gateways you need to have the SNMP Services installed.
Windows 2003 – Add/Remove Programs, Windows Components, Management and Monitoring Tools, Simple Network Management Protocol.
Windows 2008 – in the Server Manager, Features, SNMP Services.
You should have an SNMP Service and an SNMP Trap Service; make sure both are set to automatic start and are started.
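
If you would rather check this from a script than click through the Services console, here is a minimal sketch. It assumes the usual Windows service names SNMP and SNMPTRAP and the English output format of `sc query`; the helper names are my own, and it only checks whether the services are running, not their start type.

```python
import subprocess

# Assumed Windows service names for the SNMP Service and the SNMP Trap Service.
SNMP_SERVICES = ("SNMP", "SNMPTRAP")


def parse_sc_state(sc_output: str) -> str:
    """Return the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    for line in sc_output.splitlines():
        if line.strip().startswith("STATE"):
            # Typical line: "STATE              : 4  RUNNING"
            parts = line.split(":", 1)[1].split()
            if len(parts) >= 2:
                return parts[1]
    # No STATE line, e.g. the service is not installed.
    return "UNKNOWN"


def service_state(name: str) -> str:
    """Run `sc query <name>` on the local Windows host and parse the state."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout)


def report() -> dict:
    """Map each assumed SNMP service name to its current state."""
    return {name: service_state(name) for name in SNMP_SERVICES}
```

Calling `report()` on a management server should show RUNNING for both names; UNKNOWN usually means the feature is not installed.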

Step 2 : Configure your device
On your device, appliance, server, etc. you will need to go in and set up SNMP and traps. First you will need to set a community name; remember this for later, and remember that it is not really a useful security measure. With luck you can configure a community name, set your device to read-only (personally I don’t trust making changes via SNMP) and configure a destination for traps. Here is the first trick: at a minimum you need to direct the traps at the management server that your device is going to be managed by. I configured my devices to send to my RMS and both of my management servers. You can send traps to your RMS all day long and not get an alert if your device is discovered and managed by another management server.
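
When traps seem to vanish, it helps to confirm that packets from the device reach the server at all before blaming SCOM. The sketch below is not part of SCOM; it is a bare UDP listener with hypothetical function names. Traps normally arrive on UDP port 162, which the SNMP Trap Service already holds, so this assumes you either stop that service briefly or point the device at a spare test port.

```python
import socket


def open_trap_listener(host: str = "0.0.0.0", port: int = 162) -> socket.socket:
    """Bind a UDP socket; SNMP traps are normally sent to UDP port 162."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock: socket.socket, timeout: float = 30.0):
    """Return (sender_ip, payload_length) for the next datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return sender_ip, len(data)
```

Open the listener, trigger a trap on the device, and see whether `wait_for_datagram` returns the device's IP address. If it returns None, the problem is on the device or the network, not in your monitor.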

Step 3 : Discover your device
In the SCOM Management console, Administration tab, you can right-click on any entry and pick Discovery Wizard right at the top. Click Network Devices, Next, enter the IP address or address range, enter the community string you configured earlier on the device, pick the SNMP version (if you are not sure try V1) and pick a management server. This must be the server the traps are being sent to. I send the traps to all my management servers so that, just in case I need to rediscover the device on another server, I don’t have to go reconfiguring the device. SCOM will not send you duplicate alarms if it receives a trap on multiple management servers.
Once the discovery is complete you should be able to select the check boxes of the devices you want to manage and finish.

Step 4 : Create a Monitor.
If you are following from Part 1 and 2 we will be creating this monitor in the management pack where we have set up the discovery for the device type that we are monitoring. If you are adventurous and don’t expect it to get too complex you can target this monitor at SNMP Network Device, thus creating a bulk trap monitor for every device. This may make it harder to filter what you want in the future but it’s up to you.
On the Authoring tab, under Management Pack Objects, right-click on Monitors and select Unit Monitor. We are now looking for SNMP, Trap Based Detection, Simple Trap Detection, Event Monitor – Single Event and Single Event. Place this in the management pack where your discovery is.
Give your monitor a name, a target and a parent monitor, then click Next. Typically you would use the discovery community string, and for this example we are going to check the box “All Traps” at the bottom. Here is the next tricky bit: the expressions. This is an example for setting a critical state for any trap and requiring a manual reset.

First Expression
parameter Name : /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value
Operator : Matches wildcard
Value : *

Second Expression
parameter Name : /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value
Operator : Does not match wildcard
Value : *

The first expression should fire on any trap and the second should never fire, so the First Event Raised is the critical state and the second is the healthy state.
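
To see why this pair behaves like a latch, here is a small sketch of the logic. It uses Python's `fnmatch` as a stand-in for SCOM's wildcard operators, which is an assumption on my part and not how SCOM implements them, and the function names are made up for illustration.

```python
from fnmatch import fnmatchcase


def first_expression(varbind_value: str) -> bool:
    """'Matches wildcard *': true for any value, so any trap trips it."""
    return fnmatchcase(varbind_value, "*")


def second_expression(varbind_value: str) -> bool:
    """'Does not match wildcard *': false for every value, so it never fires."""
    return not fnmatchcase(varbind_value, "*")


def next_state(current_state: str, varbind_value: str) -> str:
    """Apply one incoming trap to the monitor's current state."""
    if first_expression(varbind_value):
        return "Critical"
    if second_expression(varbind_value):
        return "Healthy"
    return current_state
```

Whatever value the first varbind carries, `next_state` lands on Critical and nothing ever moves it back to Healthy, which is why a manual health reset is needed.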

Of course you need to configure subscriptions etc. if you want email alerts, but this should change the state of the device and require that you go into the device and do a Reset Health to get things to go back to green.

If you get fancier with the expressions comment on this post so everyone can see, I have not had to yet so I can’t say.

Edit: Nov 24, 2009

Well that didn’t last long, I had to get more specific with my trap alerting so I figured I would update the people who are following this (all 2 of them)

It’s simple enough: when creating your monitor, on the “FirstSnmpTrapProvider” screen don’t check All Traps; instead put the OID you are looking for under Object Identifier. The best way for me to find the OID is Wireshark. In Wireshark set a filter like “snmp.data==4” or “snmp.data==4 and ip.addr == xxx.xxx.xxx.xxx”, and now you should be seeing just traps from the device in question.

[Screenshot: the trap as captured in Wireshark]

Here you can see the OID of the trap, 1.3.6.1.4.2254.2.4.20, and the specific trap of 3.

This gives us a complete OID of 1.3.6.1.4.2254.2.4.20.0.3. This is what you need to put under Object Identifier for the FirstSnmpTrapProvider; the first expression remains the same as above. In this case I will also start doing automatic recoveries, so for the SecondSnmpTrapProvider you need the recovery trap, in this case 1.3.6.1.4.2254.2.4.20.0.3, and the second expression now changes to match the first expression, “Matches wildcard *”. These examples are for utility failure and recovery of a GE UPS SNMP card using the deltav4 MIB.
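
The pattern used above is the enterprise OID from the trap, then a 0, then the specific trap number. A minimal sketch of that assembly, with a function name of my own choosing:

```python
def full_trap_oid(enterprise_oid: str, specific_trap: int) -> str:
    """Join the enterprise OID, a literal 0, and the specific trap number."""
    if specific_trap < 0:
        raise ValueError("specific trap number must not be negative")
    # Tolerate a leading or trailing dot copied from a capture tool.
    return f"{enterprise_oid.strip('.')}.0.{specific_trap}"


# The values read from the capture in this post.
example = full_trap_oid("1.3.6.1.4.2254.2.4.20", 3)
```

Here `example` holds 1.3.6.1.4.2254.2.4.20.0.3, the same string entered under Object Identifier above.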

Enjoy

39 thoughts on “SCOM, SNMP and TRAPS or The Good, the Bad and the Ugly : Part 3”

  2. Scott Garrett

    I ran into a problem where the initial specification changed and I started needing to respond to specific traps in different ways and ignore others. So my plan to have one SNMP trap rule for all my devices didn’t work anymore. I disabled the monitor at the SNMP Network Device level and started creating multiple trap monitors for each individual device type.

  4. RF

    Just wanted to say thanks! Of all the websites I have looked at so far your “teach by example” method was by far the best. We’ve only successfully done Part 1 so far and I read through part 2 and 3 and it all at least makes sense to me now. The Wireshark tip was especially insightful! We will be attempting part 2 and 3 this week. Good stuff!

  5. Scott Garrett

    I am glad this is helping, good luck. If you run into trouble or anything isn’t clear let me know and I will try to help out.

  6. Ben Hoffman

    I am an intern and the company I am interning at wants to use SCOM to monitor SNMP devices. I have set up your example but it doesn’t seem to receive any traps. I change values so they are in the alert stage but nothing happens.

    Here is an iReasoning MIB value:
    (Temperature of data center)
    Name/OID: Value Type
    1.3.6.1.4.1.16174.1.1.3.3.1.4.0; 71 Integer

    So I would create a Trap detection monitor, go through the steps and get to the First Trap Provider.

    (warning)
    object identifier would be: 1.3.6.1.4.1.16174.1.1.3.3.1.4.0
    parameter would be : /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value
    Operator would be: Greater than
    Value: 75

    (healthy)
    object identifier would be: 1.3.6.1.4.1.16174.1.1.3.3.1.4.0
    parameter would be : /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value
    Operator would be: Less than or equal
    Value: 75

    That should work, right? And to test I would just put the values into the danger zone.

    But for some reason all it ever says is healthy, even if I put the values to where I know the servers are hotter than the threshold. So what am I missing?

    Thanks so much!

    Ben

  7. Ben Hoffman

    Never mind everything is working now. I had the wrong server monitoring them.

  8. Ben Hoffman

    Well, actually it still isn’t. This stuff is hard. The probe based stuff is working but the trap based stuff isn’t. Right now we want to set up a printer, so I tried setting up your collection of all traps example and we pulled a toner cartridge and it didn’t generate a trap. So I am stuck trying to get that to work.

    If you have any ideas let me know.

    Thanks!

    Ben

  9. Scott Garrett

    I am going to continue this in direct email and will post the resolution here when we figure it out.

  10. Scott Garrett

    We have a solution, I am adding a section on multiple detections to the body of the Blog.

  11. Tolga

    I followed your tutorial but stuck at one point: how do you display the monitors in Diagram View? I have created the monitors but I do not see them as healthy/warning/critical under the Diagram View.

  12. Scott Garrett

    I believe the step you are missing is in part one of this series. http://www.blackops.ca/cms/blog/?p=22
    Search for “one thing I believe it lacks is a view in the console”
    There is a small bit of XML you need to add to your new management pack, let me know if I missed what you are looking for.

  13. jman

    you can also create a rule targeted to SNMP device and have it capture any trap. Then you can see the OIDs you want. Might be a bit easier than Wireshark for some.

  14. Scott Garrett

    If I understand your comment I believe this is what I was doing in Part 3 Step 4 before the Nov 24th edit. My personal experience is that many SNMP devices are way too chatty when it comes to traps so I had to get more specific. But if you can get away with it it’s way easier.

  15. Abdul Karim

    Good Stuff ! I am just confused between SNMP Probe Based Detection and SNMP trap Based Detection …

  16. Scott Garrett

    Glad to hear you found it useful
    SNMP traps are defined as asynchronous notifications from agent to manager. This means that as events happen on the device, alerts are sent to your management agent.
    On the other side, with probe based detection your management agent connects to the device and gets the value at that time.
    Consider a UPS: with trap based detection, if the UPS goes to battery for 10 seconds you will get an alert; with probe based you will only get an alert if the probe is sent at that time. If you are only polling every 5 minutes your chances are not very good.
    At the same time, probe based is really the only way to go if you are trying to gather historical data to plot on a chart, but probe based does load your management station as well.
    They both have their place in your solution.
    If you have anything more specific you are looking for let me know.

  17. Jerry

    Can someone clarify – can a device’s SNMP trap destination be set to a SCOM Gateway so the alert ends up in the Management Group the other side of the firewall?

  18. Scott Garrett

    Yes you can direct SNMP traps to a SCOM gateway. This is in fact the preferred design. There is one step you need to watch for.
    It’s step 4 in http://www.blackops.ca/cms/blog/?p=22
    The bottom line is that the snmp device must be discovered and have the gateway that you are sending traps to as its management server.
    You can’t send traps to wherever and have SCOM respond. It will receive traps and ignore them unless the device is discovered on the SCOM server that received the traps. And of course the device can only be discovered on a single SCOM server.

    Let me know if this answered your question and if you need any assistance.

  19. Joep

    Is it sufficient to have SNMP only installed on the gatewayserver that receives the traps? Or does the RMS also need SNMP?

  20. Scott Garrett

    You should only need SNMP installed on end points that appliances are set to send traps to. SCOM will use its own internal communication once the trap is received.

  21. Jipin

    I am working in a telecom company and I am having a problem broadcasting alerts as traps from Operations Manager 2007 R2 to an external system so that it can trap the alerts sent from my system. How can I do that? It will be a great help from your side.

  22. Scott Garrett

    If I understand correctly you want to send SNMP traps from your SCOM 2007 R2 system to an external SNMP receiver.

    SCOM, being what it is, is largely based on receiving information and alerting on it.
    I am not aware of any way to send an SNMP trap from SCOM without using an external script.

    Have a look at the following 2 links. Basically you need a script, or command line program that will generate and send the TRAP you want. Then you attach that to either a new notification recipient or as a task to be launched when an alert is triggered.

    http://blogcastrepository.com/blogs/francoisd/archive/2009/01/27/scom-2007-how-to-generate-snmp-notifications.aspx
    http://poshcode.org/1413

    If I misunderstood and you are having trouble receiving SNMP traps into SCOM 2007 please correct me and I will assist further.

  23. rk3000

    I am SNMP polling my clustered systems via SCOM. I have discovered them each with their specific IP address.
    Now I also need to send traps. However I cannot influence the behaviour that the systems will send out the traps using their shared virtual cluster address. Of course, the SCOM cannot assign these traps to any of the systems.

    Does anyone have an idea for a workaround in SCOM?

  24. Scott Garrett

    Can I ask the details of your cluster and what application is running and sending the traps?
    It can be a problem to separate the host vs the clustered application when dealing via SNMP.

  25. rk3000

    Thanks for the quick reply.

    We are talking about Check Point ClusterXL firewalls. They use NetSNMP.

  26. Scott Garrett

    Watch for an email from blackops.ca
    I can post a comment here once we find the best option.

  27. Demetrius

    Scott,

    Have you had any success or tried this in SCOM 2012? The basic principles are the same in 2012, with the addition of SNMPv3. However, I have set up the SNMP services on the MS servers and the network devices that I need to monitor are directed at the MS servers.

    Question on this – do the MS servers’ SNMP services need to be configured for the community string you are trying to monitor?

    Example:
    Said Networking device is pointed to your MS servers
    you setup the discovery with the IP’s of those devices
    the SNMP service on the MS servers do those need to have the community string setup there too, not just in the discovery?

    Any help is appreciated. Great article.

    Demetrius

  28. Scott Garrett

    I have a client reviewing the System Center 2012 suite, but am not currently working with it.
    I understand that the native SNMP support is much better but the concept should be similar.

    The community string does not need to be on the SNMP service on the SCOM probe server itself.
    (You could be running discoveries for multiple communities from the same server.)
    When you say the network devices are directed at the SCOM server, do you mean for TRAPS or as a management host? (or both)

    Let me know if I can assist more
    Scott

  29. Demetrius

    It would be both so the device (I am trying to Monitor) is setup to direct SNMP traps to the Management servers. I just configured it on the service in case I happened to be missing something.

    I know in 2012 the network MP’s themselves have several OID they have set to discover on said network devices. I am concerned the main reason I am not getting anything is the network device itself is not listed in the included Network MP (I have confirmed all ports are open/services installed/pings)  given this, I am thinking I will end up having to go the old route of creating my own discovery for this device. The main things I am unfamiliar with though, would be, which OID I would use in order to discover it appropriately, since there are a myriad of OID’s to choose from but most of those are from after the device is discovered.

  30. Demetrius

    I did see that but unfortunately the device is a production device and I cannot perform the MIB walk on the device (it shouldn’t impact production but… it’s production – you know how that goes 🙂 ). I have the actual MIB reference guide from the company but just cannot locate the appropriate OID to go with the name of the device. Do you know if they typically name it a set variable that I can search for? Thanks for all your help, much appreciated.
