SCOM, SNMP and TRAPS or The Good, the Bad and the Ugly : Part 2

If you have made your way through Part 1, then you have written your management pack, complete with your own custom discovery, and imported it into SCOM. Once you have ensured that it is discovering only the devices you wish to manage in this pack, it is time to begin writing the monitors and rules that will apply to the detected devices. As mentioned in Part 1, a program such as MIB Browser can be very handy for sorting through all of the OIDs and the healthy values that correspond to each individual OID.

Creating an SNMP Get Monitor in SCOM 2007 R2

I find the easiest way to create a new monitor or rule is to start with the System Center Operations Manager Console. I will admit it’s not the best and does not give you many of the options you probably want, but I find it’s the easiest way to get the XML started. We can then edit the XML afterwards to get exactly what we want.

We will start in the management console. On the Authoring tab, expand Management Pack Objects, right-click Monitors and select Create a Monitor \ Unit Monitor. Once this has been picked you will see the following screen:

 First we will create a simple expression Get Monitor and later we will deal with TRAPS, so we pick SNMP – Probe Based Detection – Simple Event Detection – Event Monitor……

Be sure to create this in the management pack you created for the discovery of the object.

Now we have to name our monitor, select a target (you are looking for the device type you defined in Part 1, and you may have to click the “View all targets” radio button for it to appear) and add a parent monitor (this defines where in the health view tree your new monitor will appear).

Personally I always use the discovery community string, but you could use something custom if you want. The frequency is how often you want the monitor to poll the device. The object identifier, or OID, is the bit that will be used in the SNMP GET call; I find it works most reliably if you don’t have a leading period.

We need to create an expression that causes an alarm. I will keep the expressions simple so you can get a feel for one that works. Click +Insert at the top and you are presented with 3 fields. The first field, Parameter Name, is the magic field.

/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value

This is the value you are going to compare; it is based on the first SnmpProbe from the step before. I have read that if you have more than one SnmpProbe the numbering is in reverse order, so [1] is at the bottom of the list and [2] would be just above it. Personally I have only one OID per provider right now, so I don’t know. Let me know if you figure it out for sure. The operator gives you a drop-down of choices. I will get into it more below, but think about this one carefully. If you can use a simple equals or does not equal you can make things much easier. Think of it like this: if a UPS battery charge of anything less than 100% is bad, then use an expression like “/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value – does not equal – 100” instead of “/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value – less than – 100”. It will save you a bunch of extra steps even if it is not quite as flexible.

 

The second SnmpProbe lets you pick an OID just like the first SnmpProbe. Personally, all the monitors I have so far use the same OID as the first provider, as I am watching for a single value to be either good or bad. The second expression is built exactly the same way as the first. If you want a monitor that will not recover (you have to manually reset the health state), I use something like “/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value – does not match wildcard – *”. Since any GET will have some result, it will never fail to match the wildcard, so this monitor will never recover on its own.

Configure Health lets you decide how the device health will change when the monitor gets tripped. I use Second Event Raised as healthy and First Event Raised as warning or critical, depending on what’s going on.

The last option is if you want to create an alert or not, up to you.

Not so simple expressions

So let’s say you don’t want a simple equals or does not equal kind of expression. It’s there in the drop-down, so what’s the big deal, you say? Well, the SCOM console makes what I consider a bad assumption when creating rules and monitors: all the data types are strings. So although “100” does not equal “10” produces a true result, “100” is greater than “10” has no numeric meaning when the values are strings. Fixing this is actually not so hard and you have 2 choices. If the next bit is clear to you, go for manual XML editing; if that makes you nervous, then hold on for the second option.
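To make the string problem concrete, here is a minimal sketch in Python. It is only an illustration: Python's character-by-character string ordering is assumed to be a reasonable stand-in for what happens when SCOM compares values typed as String, which I have not verified. The function names are made up for this example.

```python
# Illustration only: compare an SNMP value to a threshold as text and as a number.
# The function names are hypothetical and not part of SCOM.

def is_less_as_string(value: str, threshold: str) -> bool:
    # Strings are compared character by character, so "1..." sorts before "9...".
    return value < threshold

def is_less_as_integer(value: str, threshold: str) -> bool:
    # Converting to integers first gives the numeric comparison we actually want.
    return int(value) < int(threshold)

# A battery at 100% charge checked against a threshold of 96.
print(is_less_as_string("100", "96"))
print(is_less_as_integer("100", "96"))
```

Compared as text, a fully charged battery looks like it is below the threshold, which is the kind of surprise that changing the type to Integer avoids.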

Option 1 : Advanced

Export your MP to XML and open it in your favorite xml editor.

Way at the bottom you will find an ElementID linked to the text label you assigned to the monitor. Use this ElementID to find your monitor or rule and alter it as follows. There are 4 places where you must change “String” to “Integer”: the Type attribute on the XPathQuery and on the Value in both the first and the second expression. Save the file, re-import it into SCOM and your monitor should be working.

<UnitMonitor ID="UIGeneratedMonitor38d2a38d163b4c1f971885f7ea686f16" Accessibility="Public" Enabled="true" Target="GEUPS.Single.Phase.Management.Pack.SNMPDevice" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="Snmp!System.SnmpProbe.2SingleEvent2StateMonitorType" ConfirmDelivery="false">
  <Category>Custom</Category>
  <AlertSettings AlertMessage="UIGeneratedMonitor38d2a38d163b4c1f971885f7ea686f16_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/SnmpVarBinds/SnmpVarBind[1]/Value$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="UIGeneratedOpStateId8a4572649aec48b58c336b84182c464b" MonitorTypeStateID="SecondEventRaised" HealthState="Success" />
    <OperationalState ID="UIGeneratedOpStateId2bcf307948d14f4a892a99088603714a" MonitorTypeStateID="FirstEventRaised" HealthState="Error" />
  </OperationalStates>
  <Configuration>
    <FirstInterval>60</FirstInterval>
    <FirstIsWriteAction>false</FirstIsWriteAction>
    <FirstIP>$Target/Property[Type="NetLib!Microsoft.SystemCenter.NetworkDevice"]/IPAddress$</FirstIP>
    <FirstCommunityString>$Target/Property[Type="NetLib!Microsoft.SystemCenter.NetworkDevice"]/CommunityString$</FirstCommunityString>
    <FirstVersion>$Target/Property[Type="NetLib!Microsoft.SystemCenter.NetworkDevice"]/Version$</FirstVersion>
    <FirstSnmpVarBinds>
      <SnmpVarBind>
        <OID>.1.3.6.1.2.1.33.1.2.4.0</OID>
        <Syntax>0</Syntax>
        <Value VariantType="8" />
      </SnmpVarBind>
    </FirstSnmpVarBinds>
    <FirstExpression>
      <SimpleExpression>
        <ValueExpression>
          <XPathQuery Type="Integer">/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value</XPathQuery>
        </ValueExpression>
        <Operator>Less</Operator>
        <ValueExpression>
          <Value Type="Integer">96</Value>
        </ValueExpression>
      </SimpleExpression>
    </FirstExpression>
    <SecondInterval>60</SecondInterval>
    <SecondIsWriteAction>false</SecondIsWriteAction>
    <SecondIP>$Target/Property[Type="NetLib!Microsoft.SystemCenter.NetworkDevice"]/IPAddress$</SecondIP>
    <SecondCommunityString>$Target/Property[Type="NetLib!Microsoft.SystemCenter.NetworkDevice"]/CommunityString$</SecondCommunityString>
    <SecondVersion>$Target/Property[Type="NetLib!Microsoft.SystemCenter.NetworkDevice"]/Version$</SecondVersion>
    <SecondSnmpVarBinds>
      <SnmpVarBind>
        <OID>.1.3.6.1.2.1.33.1.2.4.0</OID>
        <Syntax>0</Syntax>
        <Value VariantType="8" />
      </SnmpVarBind>
    </SecondSnmpVarBinds>
    <SecondExpression>
      <SimpleExpression>
        <ValueExpression>
          <XPathQuery Type="Integer">/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value</XPathQuery>
        </ValueExpression>
        <Operator>GreaterEqual</Operator>
        <ValueExpression>
          <Value Type="Integer">96</Value>
        </ValueExpression>
      </SimpleExpression>
    </SecondExpression>
  </Configuration>
</UnitMonitor>
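If you have a lot of monitors to fix, the same edit could in principle be scripted. The following is a rough Python sketch, not something I have run against a real management pack: it assumes the exported XML elements are not namespaced, the function name and the tiny sample document are invented for the example, and ElementTree will drop comments and the XML declaration, so work on a copy.

```python
import xml.etree.ElementTree as ET

def retype_expressions(mp_xml: str, monitor_id: str) -> str:
    """Set Type="Integer" on the expression nodes of one UnitMonitor."""
    root = ET.fromstring(mp_xml)
    for monitor in root.iter("UnitMonitor"):
        if monitor.get("ID") != monitor_id:
            continue
        for section in ("FirstExpression", "SecondExpression"):
            for expression in monitor.iter(section):
                for node in expression.iter():
                    # Only XPathQuery and Value nodes typed as String are touched.
                    if node.tag in ("XPathQuery", "Value") and node.get("Type") == "String":
                        node.set("Type", "Integer")
    return ET.tostring(root, encoding="unicode")

# Hypothetical, heavily trimmed sample standing in for an exported management pack.
SAMPLE = """<Monitors>
  <UnitMonitor ID="DemoMonitor">
    <Configuration>
      <FirstExpression><SimpleExpression>
        <ValueExpression><XPathQuery Type="String">/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value</XPathQuery></ValueExpression>
        <Operator>Less</Operator>
        <ValueExpression><Value Type="String">96</Value></ValueExpression>
      </SimpleExpression></FirstExpression>
    </Configuration>
  </UnitMonitor>
</Monitors>"""

fixed = retype_expressions(SAMPLE, "DemoMonitor")
print(fixed.count('Type="Integer"'))
```

Restricting the change to the two expression sections leaves the SnmpVarBind values, which use VariantType rather than Type, alone.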

Option 2 : Easy

Export your management pack to XML, then open it using the System Center Operations Manager 2007 R2 Authoring Console. You may be asked for dependencies; these are usually found in “C:\Program Files\System Center Operations Manager 2007”, but it is easy enough to search for *.mp files.

Once you find your monitor or rule of choice, right-click it, choose Properties and go to the Configuration tab. Under each XPathQuery and Value node you will see @Type; you need to change 4 Types to Integer. You can see examples of the last two changed in the image below.

Then once you are finished just save the management pack and re-import it into SCOM and it should work.

 

Note: I am writing this in a somewhat sleep-deprived state. I have not talked about rules at all, but they are simpler than monitors so I hope it’s clear where the magic is. I will also thank David Allen for some blog posts that helped me, although I can’t find them right now. If things here are not clear or more detail is needed, please comment or contact me and I will see what I can do.

Part 1

Part 2

Part 3

SCOM 2007 R2 Automatic Alert Closing or Death by Brackets

Issue:

You have informational alerts, or any other alerts, in the SCOM console that you want to keep for a while but not have stack up forever.

Solution:

PowerShell. Again, I am far from a PowerShell expert; in fact this might be the first script I have created that is more than just calling an existing cmdlet.

For those of you who don’t care how it works, here is a line that will resolve informational alerts more than 12 hours old. (Run it from the Operations Manager Shell, typically C:\WINDOWS\system32\windowspowershell\v1.0\powershell.exe -PSConsoleFile Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Console.psc1 -NoExit .\Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Startup.ps1)

get-alert -criteria "Severity = '0' AND ResolutionState = '0' AND LastModified <= '$((((Get-Date).ToUniversalTime())).addhours(-12))'" | resolve-alert | out-null

For those who want to know how it works, or for me once I forget:

Get-Alert is a SCOM cmdlet (try get-help get-alert for all the details).

-criteria allows us to filter based on whatever we want.

Severity = ‘0’  This is a zero just in case you are wondering and

Severity 0 = informational

Severity 1 = Warning

Severity 2 = Error

ResolutionState = ‘0’ again a zero and means

ResolutionState = ‘0’ is New

ResolutionState = ‘255’ Closed

Anything in the middle would be things you configured as custom resolution states


and now  the one that took all the effort

LastModified is when the alert was last modified. Well, duh, you say, and I agree, but now for the hard part. This is logged in UTC, so it won’t match what you see in the console. We need to feed it a UTC time 12 hours in the past, and for that we need more brackets than I ever figured.

$(Get-Date) processes the get-date cmdlet and passes a date that looks like this “Sunday, February 07, 2010 1:30:00 PM”

$((Get-Date).ToUniversalTime()) Takes the date from above and converts it to UTC based on your time zone offset, resulting in “Sunday, February 07, 2010 9:30:00 PM”

$((((Get-Date).ToUniversalTime())).addhours(-12)) takes the date from above and subtracts 12 hours giving “Sunday, February 07, 2010 9:30:00 AM”

By the magic of PowerShell this is changed into something more like ‘2/7/2010 9:30:00 AM’, and for that magic I am eternally grateful as I always hated date format issues in scripting (yea powershell)

Now PowerShell has gathered a list of all the alerts we want to clear, and we simply pipe that to resolve-alert, then pipe the output from the whole line to out-null so we don’t get any output.
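For anyone who finds the brackets hard to follow, here is the same date arithmetic as a small Python sketch. It only builds the criteria text and does not talk to SCOM; the function name and the exact timestamp format are assumptions made for the illustration.

```python
from datetime import datetime, timedelta, timezone

def build_criteria(now_utc: datetime, max_age_hours: int = 12) -> str:
    # Step back from the current UTC time to get the cutoff.
    cutoff = now_utc - timedelta(hours=max_age_hours)
    # Assumed timestamp format; PowerShell produces its own culture-specific text.
    stamp = cutoff.strftime("%m/%d/%Y %H:%M:%S")
    return ("Severity = '0' AND ResolutionState = '0' "
            f"AND LastModified <= '{stamp}'")

# Same moment as the walkthrough: February 7, 2010, 9:30 PM UTC.
now = datetime(2010, 2, 7, 21, 30, 0, tzinfo=timezone.utc)
print(build_criteria(now))
```

The important part is the order of operations: convert to UTC first, then subtract the 12 hours, then drop the result into the criteria string.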


Scheduling the Task

I never imagined that it would take more time and lines of code to schedule this script than it did to create.

Normally you could just run

The script is saved on the local drive as ClearInfo.ps1:

# Fully qualified domain name of your Root Management Server
$RMSFQDN = "FQDN of your RMS"
# Load the Operations Manager snap-in and map the Monitoring: drive
Add-PSSnapin Microsoft.EnterpriseManagement.OperationsManager.Client
New-PSDrive -Name: Monitoring -PSProvider OperationsManagerMonitoring -Root: \
cd monitoring:\
# Connect to the management group through the RMS
New-ManagementGroupConnection $RMSFQDN
cd $RMSFQDN
# Load the shell helper functions from the install folder
$pf = (gc Env:\ProgramFiles)
cd "$pf\System Center Operations Manager 2007"
.\Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Functions.ps1;
Start-OperationsManagerClientShell -ManagementServerName: $RMSFQDN -PersistConnection: $true -Interactive: $true;
# Resolve new informational alerts last modified more than 12 hours ago (UTC)
get-alert -criteria "Severity = '0' AND ResolutionState = '0' AND LastModified <= '$((((Get-Date).ToUniversalTime())).addhours(-12))'" | resolve-alert | out-null

Then simply call something like C:\WINDOWS\system32\WINDOW~2\v1.0\powershell.exe C:\scripts\ClearInfo.ps1 from a scheduled task.

You may want to have a look at http://technet.microsoft.com/en-us/library/ee176949.aspx

SCOM 2007 R2 Console Command Line

Microsoft.MOM.UI.Console.Exe

Typically installed in C:\Program Files\System Center Operations Manager 2007

Microsoft.MOM.UI.Console.Exe /?

Command Line Syntax:

Microsoft.MOM.UI.Console.Exe {/Option[:Value]}

Option                      Description
/?                          Shows this help window
/ClearCache                 Clear the UI cache (this is the one that made me look for this)
/Server:<ServerName>        Connect to the specified server
/ViewName:<ViewName>        Display a view
/TaskName:<TaskName>        Run a task
/TaskTarget:<ObjectId>      Use in conjunction with /Task
/ManagementPack:<MpName>    Use in conjunction with /TaskName and ViewName options

You can see it says you can find a complete list in the help, but I have not found anything yet. Anyone else?

FYI – Don’t use /Clearcache when connected via RDP unless you know there are no others using the console at the time.

Also remember if you are trying to use viewname etc you should be using the internal name not the display name, something like this :

Microsoft.Mom.UI.Console.exe /viewname:System.Views.AlertView

SCOM 2007 Limited Access – The Shiny Red Button.

Issue:

So you want to have some people have access to see details in the SCOM console but you don’t trust them after having a conversation that goes something like this:

Now, listen. I’ve got a JOB for you. See this button?  DON’T TOUCH IT!

So… what’ll happen?

That’s just IT! You don’t KNOW! Maayyyybeeee something bad?… Mayyyybeeee something good! I guess we’ll never know! ‘Cause you’re not going to touch it! You won’t TOUCH it, will you?

Solution:

Actually this is quite simple and very effective.

Administration Tab: User Roles

SCOM 2007 R2 comes with built-in roles; you may have seen them under Administration, Security, User Roles. What’s that you say? You can’t use these because the users you want to grant access should only see specific servers, like SQL only, and these give access to everything? Well, not to worry, I’ll get to that in a minute.

Default Roles

Each entry below lists the profile type, followed by its profile description and role scope.

Administrator
    Profile description: Has full privileges to Operations Manager; no scoping of the Administrator profile is supported.
    Role scope: Full access to all Operations Manager data, services, administrative, and authoring tools.

Advanced Operator
    Profile description: Has limited change access to Operations Manager configuration; ability to create overrides to rules; monitors for targets or groups of targets within the configured scope. Advanced Operator also inherits Operator privileges.
    Role scope: Can be scoped against any groups, views, and tasks currently present and those imported in the future.

Author
    Profile description: Has ability to create, edit, and delete tasks, rules, monitors, and views within configured scope. Author also inherits Advanced Operator privileges.
    Role scope: Can be scoped against any target, groups, views, and tasks currently present and those imported in the future. The Author role is unique in that this is the only profile type that can be scoped against the targets.

Operator
    Profile description: Has ability to edit or delete alerts, run tasks, and access views according to configured scope. Operator also inherits Read-Only Operator privileges.
    Role scope: Can be scoped against any groups, views, and tasks currently present and those imported in the future.

Read-Only Operator
    Profile description: Has ability to view alerts and access views according to configured scope.
    Role scope: Can be scoped against any groups and views currently present and those imported in the future.

Report Operator
    Profile description: Has ability to view reports according to configured scope.
    Role scope: Globally scoped.

Report Security Administrator
    Profile description: Enables integration of SQL Reporting Services security with Operations Manager roles.
    Role scope: No scope.

Pick a type that has the level of access you are looking for, then right-click User Roles and create a new role.

General Properties : Here you can give your role a name and a description and add members to it. Personally I suggest adding AD groups and not individual users but hey, it’s your environment so your call.

Group Scope : This is half the magic I talked about earlier. Here you define what groups of objects you want the user to be able to affect.

Tasks: You can approve all tasks or only the specific tasks you want this role to be able to run.

Views:  The second half of the magic. Here you can pick specific branches of your monitoring tree, and that’s all this role will be able to see.

Now your console may look something like this for a UPS operator…

 

All right now, wasn’t that fun? Let’s try something else…..


Useful Operations Manager 2007 SQL queries

Kevin Holman’s OpsMgr Blog is great, and provided me with many sql scripts from here

Of course, sometimes I can’t find the right script when I want it, and over time I will likely add things from other sources. So I am duplicating some of his work here, and I will add to it as I personally use the scripts.

List of all management packs and version number

SELECT MPName, MPFriendlyName, MPVersion, MPIsSealed FROM ManagementPack WITH(NOLOCK) ORDER BY MPName 

Find Management Pack name from Workflow name (Change rulename)

SELECT MPName FROM ManagementPack WITH(NOLOCK) WHERE ManagementPackId = (SELECT ManagementPackID FROM Rules WHERE RuleName = 'MomUIGeneratedRuleb0bac41041dc420abc0926dcdd7a8c23')

 Find what alerts are repeating most often

SELECT TOP 20 SUM(RepeatCount+1) AS RepeatCount, AlertStringName, AlertStringDescription, MonitoringRuleId, Name
FROM Alertview WITH (NOLOCK)
WHERE Timeraised is not NULL
GROUP BY AlertStringName, AlertStringDescription, MonitoringRuleId, Name
ORDER BY RepeatCount DESC

Find a rule friendly name from the GUID provided on an alert.  (change out your own rule guid)

SELECT LTValue FROM Rules
INNER JOIN LocalizedText LT ON LT.ElementName = rules.Rulename
WHERE rules.Ruleid = (SELECT ruleid FROM Rules WHERE Rulename = 'MomUIGeneratedRuleb0bac41041dc420abc0926dcdd7a8c23')

Count the number of  discovered objects by type (from Pavlick.Net)

SELECT mt.ManagedTypeID, mt.TypeName, COUNT(*) AS NumEntitiesByType
FROM BaseManagedEntity bme WITH(NOLOCK)
LEFT JOIN ManagedType mt WITH(NOLOCK) ON mt.ManagedTypeID = bme.BaseManagedTypeID
WHERE bme.IsDeleted = 0
GROUP BY mt.ManagedTypeID, mt.TypeName
ORDER BY COUNT(*) DESC

SCOM 2007 R2 – Logical Disk Free Space or Redundancy In Action (and not the good kind)

The Problem:

We want to alert on specific drives that tend to fill up quickly when something goes wrong at different levels than the defaults.


The Process:

Sounds simple enough right? An override and it’s time for Coffee.  Well being the careful person I am I decided perhaps a test is in order.  So I create a folder on the drive and put a 10 MB file in it called 1.log plus the following batch file

:10
set Now=%time:~0,2%%time:~3,2%%time:~6,2%
copy 1.log %now%.log
PING 1.1.1.1 -n 1 -w 2000 >NUL
goto 10

Put simply, this script copies 1.log to a new file whose name is the current hour, minute and second. It then pings an address, waiting up to 2000 ms (2 seconds) for a reply, and does it all again.

Don’t wander off while this is running or use it for evil please 😉

So I let this run till my drive is about 55% full, or was that 45% empty? But I am bothered by the complete absence of any kind of alert, alarm or even status change on the server in question.

If you look up the definition of redundancy here you will find 2 things of note, The good kind of redundancy (6. Electronics Duplication or repetition of elements in electronic equipment to provide alternative functional channels in case of failure.) and the kind we are going to deal with here (2. Something redundant or excessive; a superfluity.)

I didn’t notice the first time, but there is a paragraph on the properties of the Logical Disk Free Space Monitor, and although I am glad it wasn’t harder to find, I am bothered by its content.

Configuration

The Logical Disk Free Space monitoring routine is a high configurable solution that enables Operators to set varying threshold values for system and non-system logical disk volumes. In addition separate threshold values can be set for Warning and Error states.

Since logical disk volumes may vary in size from a few gigabytes to many terabytes or more the Logical Disk Free Space monitoring routine requires that an Operator indicate both the Megabyte and Percentage based threshold values that must be passed before the Warning and Error thresholds reached. This means that in order for the threshold to be reached both the Megabyte and Percentage based threshold values for the System or Non-System Drive must be breached.

So let’s say, like me, you have several drives of varying sizes that you want alerts on, and the defaults from the table below just don’t do it for you. Like me, you probably figured you could just set the Non-System Drive Error Percent Threshold and be done with it. Then, like me, you find that you get no alarm, because although you are below the Non-System Drive Error Percent Threshold you are still over the Non-System Drive Error Mbytes Threshold, which defaults to 1GB. Sadly, your option now is to check the full size of each drive you are monitoring, do the math to figure out how many MB is X% of your drive, and enter that value in Non-System Drive Error Mbytes Threshold in addition to the % you already set. Then an interval later you will get an alert something like this…

 


System Drive Free Space Thresholds (Defaults)

Parameter                                     Default Value
System Drive Error Mbytes Threshold           100
System Drive Error Percent Threshold          5
System Drive Warning Mbytes Threshold         200
System Drive Warning Percent Threshold        10

Non-System Drive Free Space Thresholds (Defaults)

Parameter                                     Default Value
Non-System Drive Error Mbytes Threshold       1000
Non-System Drive Error Percent Threshold      5
Non-System Drive Warning Mbytes Threshold     2000
Non-System Drive Warning Percent Threshold    10


The Solution :

Set overrides for both Mbytes and Percent thresholds as they both have to be breached to throw an alarm.

If you hate math perhaps you could just set the MB alarm to some unreasonably large value so that it is always breached, thus making the % monitor the only one that changes.
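Here is the both-must-be-breached behaviour and the percent-to-MB math as a short Python sketch. This is my reading of the monitor's description, not its actual code: the function names are invented, the strict less-than comparison is an assumption, and the total drive size is a made-up figure chosen to roughly fit the 52% free and 36452 free Mbytes from the alert in this post.

```python
def is_breached(free_mb: float, total_mb: float,
                mb_threshold: float, percent_threshold: float) -> bool:
    # Assumed logic: BOTH the Mbytes and the percent threshold must be breached.
    free_percent = 100.0 * free_mb / total_mb
    return free_mb < mb_threshold and free_percent < percent_threshold

def mb_for_percent(total_mb: float, percent: float) -> float:
    # The math you end up doing per drive: how many MB is X% of this drive?
    return total_mb * percent / 100.0

total = 70100.0   # hypothetical drive size in MB
free = 36452.0    # free Mbytes quoted in the alert text

# Percent override of 55 with the default 1000 MB threshold: no alert.
print(is_breached(free, total, 1000, 55))
# Same percent, with the MB threshold set to 55% of the drive: alert.
print(is_breached(free, total, mb_for_percent(total, 55), 55))
```

Setting the MB threshold to a huge value has the same effect as the second call, since the MB half of the test is then always true.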

Update – Nov 30, 2009

Billy made some comments that started me thinking about a larger solution, and I fear it all comes down to overrides.

First create a series of groups that match your needs, like Alarm System at 100MB, Alarm non-System at 1GB, Alarm System at 100GB, Alarm System at 5%, Alarm non-System at 15%, Alarm System at 50%, really whatever makes you happy. Isn’t that what we all really want? Then create a series of overrides based on the groups. For example, for the override targeted at “Alarm System at 100MB”, set the system MB to 100 and set the system % to a value that is always breached, such as 100%. When creating a percentage-based override, work it the other way: set the % to what you want and the MB to 1,000,000,000,000,000 or something similar. Then as you set up each new machine you just decide how you want it to work for that machine and add it to the static groups you defined earlier. Someone please correct me if I am wrong, but you may want to decide whether % or MB is more important and set the Enforced check box on that override, just in case you ever assign a machine to both groups. I figure this will help SCOM determine which override should apply, but I have not tested that and could be wrong there.


 

Hey Microsoft :

Is it not the point of a percent-based alarm that you don’t need to go to every drive of a different size and figure it out for yourself? I would expect that a person could say “send me an alarm whenever a drive is 50% full”, but at the same time may want to know when some very old small drives have less than 10GB free even if this does not constitute 50% of the drive. I simply can’t wrap my head around the idea that because “logical disk volumes may vary in size from a few gigabytes to many terabytes or more” there would be any situation where you would want to set 2 different thresholds that both have to be triggered to cause an alarm. Does the alarm in your house only go off if a burglar has both your front and back doors open at the same time?

Last modified time: 19/11/2009 3:13:35 PM Alert description: The disk J: on computer X is running out of disk space. The values that exceeded the threshold are 52% free space and 36452 free Mbytes.

SCOM, SNMP and TRAPS or The Good, the Bad and the Ugly : Part 3

If you have followed along this far, and have not ended up with a white jacket with really long sleeves then this next bit should be no problem.

The Problem :
Although it’s nice to be able to poll a device on a regular schedule and log this for the shiny graphs and alert on it once you have all the monitors working, what we really want is a real-time alarm when something goes wrong. When your UPS goes into bypass or your HVAC fails you may not want to wait for the next polling interval.

The Solution :
Strangely this is the whole reason I started these three articles, I found SNMP traps to be inconsistent and extremely frustrating. I almost gave up and got another product more suited to handling SNMP monitoring. Of course as with most things once you get it all sorted out it’s actually quite simple.

Step 1 : SNMP Services
On your Root Management Server, Management Servers and/or gateways you need to have the SNMP Services installed.
Windows 2003 – Add Remove Programs, windows Components, Management and Monitoring Tools, Simple Network Management Protocol.
Windows 2008 – in the Server Manager, Features, SNMP Services
You should have an SNMP Service and a SNMP Trap Service, make sure both are set to automatic start and are started.

Step 2 : Configure your device
On your device, appliance, server etc you will need to go in and setup SNMP and Traps. First you will need to set a community name, remember this for later, and remember this is not really a useful security measure. With luck you can configure a community name, set your device to read only (personally I don’t trust making changes via SNMP) and configure a location for traps. Here is the first trick, at a minimum you need to direct the traps at the management server that your device is going to be managed by, I configured my devices to send to my RMS and both of my management servers. You can send traps to your RMS all day long and not get an alert if your device is discovered and managed by another management server.

Step 3 : Discover your device
In the SCOM Management console, Administration tab, you can right-click on any entry and pick Discovery Wizard right at the top. Click Network Devices, then Next, enter the IP address or address range, enter the community string you configured earlier on the device, pick the SNMP version (if you are not sure try V1) and pick a management server. This must be the server the traps are being sent to. I send the traps to all my management servers so that, just in case I need to rediscover the device on another server, I don’t have to go reconfiguring the device. SCOM will not send you duplicate alarms if it receives a trap on multiple management servers.
Once the discovery is complete you should be able to select the check boxes of the devices you want to manage and finish.

Step 4 : Create a Monitor.
If you are following from Part 1 and 2, we will be creating this monitor in the management pack where we have set the discovery for the device type that we are monitoring. If you are adventurous and don’t expect it to get too complex, you can target this monitor at SNMP Network Device, thus creating a bulk trap monitor for every device. This may make it harder to filter what you want in the future, but it’s up to you.
On the authoring tab, under management Pack Objects, Right Click on Monitors and select unit monitor. We are now looking for SNMP, Trap based Detection, Simple Trap Detection, Event Monitor – Single Event and Single Event. Place this in your management pack where your discovery is.
Give your monitor a name, a target and a parent monitor, then click Next. Typically you would use the discovery community string, and for this example we are going to check the box “All Traps” at the bottom. Here is the next tricky bit: the expressions. This is an example for setting a critical state for any trap and requiring a manual reset.

First Expression
parameter Name : /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value
Operator : Matches wildcard
Value : *

Second Expression
parameter Name : /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value
Operator : Does not match wildcard
Value : *

The first expression should fire on any trap and the second should never fire, so First Event Raised is the critical state and Second Event Raised is the healthy state.
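The reasoning behind the two wildcard expressions can be checked with a quick Python sketch. It assumes the SCOM wildcard behaves like ordinary shell-style globbing, which I have not confirmed; the function names and the sample trap values are made up.

```python
from fnmatch import fnmatchcase

def first_expression(value: str) -> bool:
    # "Matches wildcard *": true for any value a trap could carry.
    return fnmatchcase(value, "*")

def second_expression(value: str) -> bool:
    # "Does not match wildcard *": the negation, so never true.
    return not fnmatchcase(value, "*")

samples = ["3", "onBattery", ""]  # hypothetical var bind values
print([first_expression(v) for v in samples])
print([second_expression(v) for v in samples])
```

Because the second expression can never be satisfied, nothing ever raises the healthy event, and that is what forces the manual health reset.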

Of course you need to configure subscriptions etc. if you want email alerts, but this should change the state of the device and require that you go into the device and do a reset health to get things to go back to green.

If you get fancier with the expressions comment on this post so everyone can see, I have not had to yet so I can’t say.

Edit: Nov 24, 2009

Well that didn’t last long, I had to get more specific with my trap alerting so I figured I would update the people who are following this (all 2 of them)

It’s simple enough: when creating your monitor, on the “FirstSnmpTrapProvider” screen don’t check All Traps; instead put the OID you are looking for under Object Identifier. The best way for me to find the OID is Wireshark. In Wireshark set a filter like “snmp.data==4” or “snmp.data==4 and ip.addr == xxx.xxx.xxx.xxx” and you should see just traps from the device in question.

Here you can see the OID of the TRAP  1.3.6.1.4.2254.2.4.20 and the specific Trap of 3

This gives us a complete OID of 1.3.6.1.4.2254.2.4.20.0.3. This is what you need to put under Object Identifier for the FirstSnmpTrapProvider; the first expression remains the same as above. In this case I will start doing automatic recoveries, so for the SecondSnmpTrapProvider you need the recovery trap, in this case 1.3.6.1.4.2254.2.4.20.0.3, and the second expression now changes to match the first expression, “Matches Wildcard *”. These examples are for utility failure and recovery of a GE UPS SNMP card using the deltav4 MIB.
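Building the complete OID by hand is easy to get wrong, so here is a tiny Python sketch of the rule used above: the enterprise OID, then a 0, then the specific trap number. As far as I know this matches the standard mapping for enterprise-specific SNMPv1 traps, but the helper name is invented, and stripping the leading period simply follows my own habit from Part 2.

```python
def enterprise_trap_oid(enterprise_oid: str, specific_trap: int) -> str:
    # Drop stray whitespace and any leading or trailing period.
    base = enterprise_oid.strip().strip(".")
    # Enterprise OID + ".0." + specific trap number.
    return f"{base}.0.{specific_trap}"

# Values read from the Wireshark capture described above.
print(enterprise_trap_oid("1.3.6.1.4.2254.2.4.20", 3))
```

The result is the string to paste under Object Identifier on the trap provider screen.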

Enjoy

Using SCOM to monitor name resolution on a Kiosk or Schrodinger’s DNS

The Problem:
I have a number of Kiosks that are going to be used by VIP’s and I have been asked to make sure that these kiosks are working properly and to proactively respond to failures. Our monitoring tool is SCOM 2007 R2. In this entry I will cover my attempt to monitor DNS name resolution.

The Process:
First I found that there is an existing monitor that does basically exactly what I want. Of course I know this because it didn’t work properly at first, and thanks to Kevin Holman’s OpsMgr Blog I even got an answer. Now all I need to do is leverage this existing type and life will be grand. Little did I know I was about to enter the quantum world of management pack XML.

Using an excellent little PowerShell script from Boris Yanushpolsky I can open sealed MP’s and have a look. So I crack open Microsoft.Windows.DNSServer.2003 and Microsoft.Windows.DNSServer.Library and quickly find that the monitor type I am looking for is Microsoft.Windows.DNSServer.Library.NSLookupAvailability.

I create my own empty management pack, and add the following reference

<Reference Alias="DNS">
  <ID>Microsoft.Windows.DNSServer.Library</ID>
  <Version>6.0.6480.0</Version>
  <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
</Reference>

 

Then I export my new management pack and open it in the authoring console and basically copy everything from the monitor that I want to my new monitor.  Since I want this monitor to apply only to specific machines I have created a group with dynamic members to target this monitor at.

Some notes on Targeting:
You can’t target monitors, rules or tasks at dynamic groups.  If you want all the gory details then thanks to Jakub and http://www.scom2k7.com/scom-2007-targeting/ but the bottom line is you have to pick an existing class that will be available everywhere. Best practice suggests that you pick the closest existing class that you can, don’t just pick windows computer for everything. The secret is to create whatever it is you are doing as disabled and then use an override to enable it based on the dynamic group you created.

So now that I have my targeting issues worked out I find that this monitor is not becoming active. Initially I thought this was because of my targeting, but now I needed some help.  A call to Microsoft support teaches me a couple of things I will pass on here.

Within the DNS library we find the section <UnitMonitorType ID="Microsoft.Windows.DNSServer.Library.NSLookupAvailability"

Within this unit monitor type we find <ProbeAction ID="Probe" TypeID="Microsoft.Windows.DNSServer.Library.Probe.NSLookupTest.PropertyBag">. This links to another section, <ProbeActionModuleType ID="Microsoft.Windows.DNSServer.Library.Probe.NSLookupTest.PropertyBag", and within this section we finally have <ScriptName>NslookupAllTests.js</ScriptName>

Now that we know the actual script that will really do the work on the host is called NslookupAllTests.js we can search for it in the library and tada, there it is. The script NslookupAllTests.js is only about 2 lines below the property bag, but I wanted to show the progression in case someone else is trying to figure out something similar.

Now I am no JavaScript expert by any means but the following few lines are pretty clear to me

//Check if DNS service is running.  Abort script with a warning if it’s not.
if (!DNSServiceRunning())

So despite the fact that for this specific test the DNS Server service does not need to be running or even installed, there is a single script running for all the DNS testing and it will not run unless the DNS service is running.

Now I guess I could comment out these three lines, reseal the management pack and cross my fingers, but the risk to my existing monitors and dealing with the next MP upgrade is more than I want to deal with right now.

The Solution: NOT
Well unfortunately my development skills are haunting me today.
The final solution would be to create a new library based on Microsoft.Windows.DNSServer.Library, providing new monitor types complete with overrides, and alter the underlying DNS NslookupAllTests.js script to do what I need. Unfortunately JavaScript is not a language I have taken the time to get familiar with, and beyond that the required XML to get the library and monitor working is just taking too long.

Plan B….. I will have to get back to everybody on that

The Solution : Finally

OK, so it was not 4000 lines it took only 953.

With some help from PSS and a lot of theft from the DNS library I have managed to create a management pack to monitor DNS resolution. The monitor is called DNS Resolution Check and falls under windows computer. It is disabled by default and will require that you either configure and enable it or simply create overrides. I also added an override for server so you could direct different clients at different DNS servers. The Server config \ override will take a series of IP addresses comma separated.
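Since the Server override is just a comma separated string, it is easy to fat-finger. Here is a small sketch, entirely my own and not part of the management pack, of how you might sanity check such a value before pasting it into an override. The function name is hypothetical.

```python
import ipaddress
from typing import List


def parse_server_override(value: str) -> List[str]:
    """Split a comma separated list of DNS server IPs and validate each one.

    Raises ValueError if any entry is not a valid IP address.
    """
    servers = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma or doubled commas
        servers.append(str(ipaddress.ip_address(part)))
    return servers


# Example addresses only; substitute your own DNS servers.
print(parse_server_override("10.0.0.1, 10.0.0.2"))
```

A typo such as an extra octet or a stray letter raises a ValueError here instead of silently producing a monitor that never resolves anything.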

Download DNS.Check.mp

Hey Microsoft:

SCOM is an interesting tool, but the authoring console is sadly lacking. Something as simple as a name resolution test should be simple to create. I should not need 4000 lines of XML and JS to handle something this simple.

We should be able to use dynamic groups as Watcher Nodes for web and port monitors.

SCOM 2007 R2 Gateway now with SSL

Need to extend your SCOM environment to another domain, into a DMZ or anywhere else where the typical protocols are not available? Then you need a gateway. Actually this process is reasonably simple but you will need an SSL certificate.

Step 1 :  Certificates

 Don’t have a Certificate Authority? Well then you have 2 options. You could build one, but unless you have more needs than this I would suggest you just go and buy a cheap certificate.  Use the fully qualified DNS name of the server for the subject and the friendly name. Other than that there is not much to worry about. If you do have a CA then you may want to publish a new template. Despite many other blogs talking about exactly what you need, I have found that this very simple template works for me.

[Image: certificate template settings]

Either way all you need to do is install the certificate into the machine store personal folder. If  you are using your own CA you will also need to install the root certificate and any intermediate certificates as trusted root certification authorities or intermediate certification authorities on both this new gateway and on your root management server.

Now you are ready to start the gateway install, so you need some files from your download or CD. You may as well copy the “gateway” and “support tools” directories from the root of your install media to the new gateway server. Even with all that added security, isn’t it nice to be able to copy files via RDP now?

Step 2: Installing the Gateway

Within the gateway folder you copied from the install media you will find directories for i386 and AMD64. Go into the right folder based on whether your OS is 32-bit or 64-bit. Run MomGateway.msi (funny how MOM keeps hanging on).

[Image: gateway installer management group dialog]

Fairly quickly you are provided with the following dialog, and if you are like me you are thinking hmmmmm, what exactly was my management group name? If you remember your exact management group name then bully for you; the rest of us need to go back to our root management server and enter the registry. Under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Server Management Groups\ is another key that is your management group name. Be careful not to change anything, but I like to hit F2, copy this key name and paste it into the installer.  Then enter the name of your root management server, and if you are truly offended by TCP port 5723 then here is your big chance to change it.  When prompted for a Gateway Action Account, well, Local System works well for me, but it’s your environment and you can use whatever you want.  Me, I will stick to the principle of least access and go with Local System.

Step 3: Linking the Certificate

Once the install is complete you need to link the certificate to the service, and herein lies perhaps the only real gotcha. In the support tools folder that you copied over earlier you will find folders for i386, amd64 and ia64; make sure to run the one that goes with your OS. Remember that ia64 refers to Itanium, which nobody uses, and be aware that the 32-bit version will run on x64 and appears to work. The registry keys will look right and all the troubleshooting in the world won’t tell you why your shiny new gateway doesn’t work. So skip all that grief and get this right the first time.  There are lots of instructions showing you the full command line and detailing how to get the cert name etc. etc., but I am lazy with the command line so I just run MOMCertImport.exe by double clicking on it. TADA, you are prompted to pick the certificate you want to use and all you need to do is click OK, just like a nice Windows program.  If you run into problems or just want to make sure this worked, have a look in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Machine Settings; the ChannelCertificateSerialNumber value should contain the cert details.
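If you would rather not trust your memory on which folder goes with which OS, here is a small sketch of the decision. It assumes the standard Windows environment variables PROCESSOR_ARCHITECTURE and PROCESSOR_ARCHITEW6432 (the latter is only set when a 32-bit process runs on 64-bit Windows); the function names and the folder mapping are mine, based on the folder names mentioned above.

```python
import os  # only needed if you pass os.environ on a real Windows host

# Folder names as they appear in the support tools directory.
FOLDERS = {"x86": "i386", "amd64": "amd64", "ia64": "ia64"}


def native_architecture(env) -> str:
    """Return the OS architecture, even when called from a 32-bit process."""
    return env.get("PROCESSOR_ARCHITEW6432") or env.get("PROCESSOR_ARCHITECTURE", "")


def support_tools_folder(env) -> str:
    """Pick the support tools folder that matches the OS architecture."""
    arch = native_architecture(env).lower()
    if arch not in FOLDERS:
        raise ValueError(f"unrecognised architecture: {arch!r}")
    return FOLDERS[arch]


# On the gateway itself you would call: support_tools_folder(os.environ)
print(support_tools_folder({"PROCESSOR_ARCHITECTURE": "AMD64"}))
```

The second variable matters because a 32-bit shell on an x64 box reports x86, which is exactly the trap described above.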

Step 4: Approving the server on your RMS

This step is run from the command line on your Root Management Server. I know, we are Windows people, we don’t DO command line….. well, as our HR department would say, “SUCK IT UP”.  I think we all need to get used to the idea that now there is Core Server and PowerShell we are all going to get much more familiar with doing everything from the command line. That’s enough resistance to change for one day, so on to the command line.  Have a look on your install media again in the “Support Tools” folder and, again being careful to get the correct version for your OS, find Microsoft.EnterpriseManagement.GatewayApprovalTool.exe and copy it to “C:\Program Files\System Center Operations Manager 2007” on your RMS, or wherever you installed SCOM in the first place.  Now open a command line in that location.

As an aside here is a little trick I like: save the following to a reg file and import it. This will give you a new right click context menu on folders that will allow you to quickly open a command prompt right to a folder.

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\Folder\shell\Open_Command_Prompt]
@="Open Command Prompt"

[HKEY_CLASSES_ROOT\Folder\shell\Open_Command_Prompt\command]
@="cmd.exe /k \"cd %1\""

Now back to the command line

Microsoft.EnterpriseManagement.GatewayApprovalTool.exe /managementservername=YOUR-RMS-FQDN_HERE /gatewayname=YOUR-GATEWAY-FQDN-HERE /action=create

See, that’s not so bad. Of course if you don’t have name resolution between your gateway and your RMS then you will likely want to put in some hosts entries in c:\WINDOWS\system32\drivers\etc\hosts

 

Also if your gateway is not a member of a domain it’s likely that the FQDN of your gateway is just the server name, and that’s just fine. Just make sure that if you ping from each side you get name resolution. Of course your security guys probably block ICMP, but as Arthur Schopenhauer said, “Life without pain has no meaning.” and as technical people this just adds meaning to our lives.
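If ICMP is blocked you can still check the part that actually matters, which is whether the name resolves. A minimal sketch using Python's standard `socket` module follows; the `resolves` helper and its injectable `resolver` parameter are my own invention so the check can be tried without touching the network.

```python
import socket


def resolves(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the name resolves to at least one address."""
    try:
        return bool(resolver(hostname, None))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
```

Run `resolves("YOUR-RMS-FQDN_HERE")` on the gateway and `resolves("YOUR-GATEWAY-FQDN-HERE")` on the RMS; both should come back True before you expect the gateway to connect.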

Now that you have created the new gateway you get to do what you like best about SCOM as a reward, take a short vacation. Did you enjoy it? Now back to work (BlackAdder)

Several minutes later you will see your new server appear in your SCOM console, on the Administration tab, under Management Servers. For a while it will appear as “Not Monitored” and then move into a healthy state if everything is OK. If not, it is probably time to crack out Wireshark and look at the traffic, and have a look at the event log to see if you can see any errors on either side.

 Troubleshooting: Getting an error in the logs?

The certificate specified in the registry at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Machine Settings cannot be used for authentication. The error is The credentials supplied to the package were not recognized (0x8009030D).

The fix:

1)    Once you have a certificate imported into the certificate store:

i)     From the MMC “Certificates (Local Computer)” snap-in, locate the certificate (server FQDN name) in the Personal-Certificates folder, right click it, and select “Export”.

ii)    When prompted to “Export Private Key”, select “Yes, export the private key”.

iii)   Under “Export File Format”, select “Personal Information Exchange – PKCS #12 (.pfx)” and sub-option “Export all extended properties”.

iv)   Take note of the location and file name of the pfx file you saved and password if you entered one.

2)    From an Administrator mode command prompt, enter:

MOMCertimport.exe filename.pfx

– enter password if you entered one for the pfx file

            (where filename.pfx is the exported certificate from above)

 

Getting something like “The OpsMgr Connector connected to sceserver, but the connection was closed immediately after authentication occurred”? For me, I just needed to wait a little longer and then look for the gateway server listed in the SCOM Console under Administration, Device Management, Pending Management. I approved it there and a few minutes later everything was working.

SCOM, SNMP and TRAPS or The Good, the Bad and the Ugly : Part 1

Recently I was encouraged to find a way to monitor many of our appliances with our shiny new deployment of Microsoft System Center Operations Manager 2007 R2. Up until this point we had not used SCOM for SNMP monitoring. Little did I know the adventure I was about to embark on.

Requirements:

Log (SCOM Rule) various values from different SNMP appliances

Alert (SCOM Monitor) on various values from different SNMP appliances

Basic availability checks for all appliances

Receive and alert on SNMP TRAPS from all appliances

 

The Process : (no point in trying to skip to the solution)

The first thing I found was a lot of people in various blogs with a warning: “Abandon hope all ye who enter here”. Usually I would save my ranting comments to the end but perhaps it’s best to give you a quick glimpse now. SCOM 2007 R2 is not exactly the pinnacle of SNMP monitoring, but I will get into that more later 😉

In part 1 we will find a way to discover and differentiate between the different kinds of SNMP appliances so that we can target various monitors and rules at specific devices. No point in weighing down SCOM with monitoring all sorts of things that don’t exist on a given device, right?

Step 1: Install required tools

Here I have to start with a major and well deserved tip of the hat to 2 individuals without whom my adventures into SCOM and SNMP would have been long and likely fatal… or very short and almost painless (Hmmmm). Either way, thanks to Raphael Burry and his SNMP Discovery Provider for OpsMgr 2007 and Scott Vintinner with his Example SNMP Management Pack for SCOM 2007. Without these 2 pieces I would likely have left SCOM as an SNMP monitoring tool altogether.

First download the sealed management pack from Raphael, rename it to a zip, break out the management pack and install it into your SCOM installation. This gives us the extensions to start writing our own custom management packs to start discovering different types of SNMP devices.

Second, do yourself a favor and download and install the iReasoning MIB browser and Wireshark (formerly known as Ethereal for those that missed the name change a while back). These tools will be invaluable shortly, trust me.

Step 2:  Figure out how to identify devices

First we need to figure out what makes this device different from any other device in our network; this is where the iReasoning MIB Browser comes in handy. After you have configured your device with an SNMP community name and allowed traffic from the machine you are running the MIB Browser from, it’s time to pick an OID. Start the MIB Browser. You can load manufacturer MIB’s if you want but we don’t need them here. Enter the IP address of the device, clear anything that is in the OID field and use the operations dropdown to pick walk. Then click GO!

Edit : Daniel Morrison makes a good comment – you may also need to hit “Advanced” and enter the SNMP community you configured on the device for the walk operation to work. The default value is Public.  Just in case you missed it, you can see it below just between Address and OID.

[Image: iReasoning MIB Browser]

Now you can go down the list of OID’s that you see and find something specific that will be unique to this type of device. For this example we are looking at an SG series GE UPS.  In this case I am going to go with 1.3.6.1.2.1.1.2.0, which provides an answer of 1.3.6.1.4.1.818.1.100.1.1. With these noted it’s time to start editing our management pack.

Step 3: Your first custom SNMP discovering management pack!

Here again you will want a couple of tools to make life much easier.  First XML Notepad 2007 a Microsoft tool for editing XML, handy for when the next tool does not expose what you want. Second SCOM Authoring Console 2007 R2, from the AuthoringConsole directory in the root of your download or SCOM 2007 R2 CD. This is probably on the website somewhere but what isn’t anymore?

Now you are ready to download Scott Vintinner’s Example Management pack, (EDIT: or the updated version here) this is not installed just used as a framework to build our own custom pack for the device in question.  Make a copy and open it with notepad.  The first thing to notice is right near the top, <ID>RBH.Ecosaire.AC.Management.Pack</ID> we need to change this to match whatever we are doing like <ID>GEUPS.Example.Management.Pack</ID> so I do a find and replace on the whole XML from “RBH.Ecosaire.AC.” to “GEUPS.Example.” then I save and close the file. At this point you have to rename the file  to match the ID in this case GEUPS.Example.Management.Pack.xml

Now for the discovery bit. Open the XML with your preferred editor and look for <Discoveries> this section is all we plan to edit at this stage.  Within <Discoveries> you will find

<SnmpVarBinds>
  <SnmpVarBind>
    <OID>1.3.6.1.2.1.1.2.0</OID>
    <Syntax>0</Syntax>
    <Value VariantType="8" />
  </SnmpVarBind>
</SnmpVarBinds>

The key to this section is the OID; this is what is queried to determine if this device is a GE UPS as defined by our management pack. A few lines further down you have another important section. We will start with a simple expression first.

<Expression>
  <SimpleExpression>
    <ValueExpression>
      <XPathQuery>/DataItem/SnmpVarBinds/SnmpVarBind[OID='1.3.6.1.2.1.1.2.0'][1]/Value</XPathQuery>
    </ValueExpression>
    <Operator>Equal</Operator>
    <ValueExpression>
      <Value Type="String">1.3.6.1.4.1.8072.3.2.10</Value>
    </ValueExpression>
  </SimpleExpression>
</Expression>

A few important things to notice in this section. First, on the line with XPathQuery you see an OID. If you changed the OID in the <SnmpVarBind> section above you need to change it here as well, as this is the pointer to the variable that was read, and it needs to match for the compare or who knows what will happen. I of course know one thing that will happen and that is that you will never discover anything. So I guess with further thought I do know what will happen.

The second thing to note is <Operator>Equal</Operator>, so this is a simple X = Y expression; we will look at another option in a second.

Third, <Value Type="String">1.3.6.1.4.1.8072.3.2.10</Value> is telling us that the data type of this variable is a string, so you can’t try and match based on > or anything like that, and that the value we are looking for is 1.3.6.1.4.1.8072.3.2.10. Of course this is the example value and not the value we determined above, so I will replace it with 1.3.6.1.4.1.818.1.100.1.1 and save the XML.

Optionally, you can look for <Interval>3600</Interval>; this is the number of seconds between discoveries. This can also be altered once the MP has been imported, but unless you like to wait (in which case you will get along great with SCOM) you may want to reduce this now for testing. Try not to forget to change it back later 😉
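To make the "OIDs must match" point concrete, here is a rough Python sketch of what the simple expression is doing. The sample DataItem is a guess at the shape implied by the XPath query, not captured SCOM output, and the `discovered` helper is hypothetical. It uses the standard library's ElementTree, which supports this kind of child-value predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical probe result, shaped after the XPath in the discovery.
SAMPLE = """
<DataItem>
  <SnmpVarBinds>
    <SnmpVarBind>
      <OID>1.3.6.1.2.1.1.2.0</OID>
      <Value VariantType="8">1.3.6.1.4.1.818.1.100.1.1</Value>
    </SnmpVarBind>
  </SnmpVarBinds>
</DataItem>
"""


def discovered(data_item_xml: str, expression_oid: str, expected: str) -> bool:
    """Mimic the Equal expression: look up the varbind by OID, compare as string."""
    root = ET.fromstring(data_item_xml)
    value = root.find(f"SnmpVarBinds/SnmpVarBind[OID='{expression_oid}']/Value")
    return value is not None and value.text == expected


# OID in the expression matches the OID that was queried: device is discovered.
print(discovered(SAMPLE, "1.3.6.1.2.1.1.2.0", "1.3.6.1.4.1.818.1.100.1.1"))
# OID in the expression was not updated: nothing is found, nothing is discovered.
print(discovered(SAMPLE, "1.3.6.1.2.1.1.1.0", "1.3.6.1.4.1.818.1.100.1.1"))
```

The second call is the "you will never discover anything" case: the lookup finds no varbind, so the comparison can never succeed.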

Now if a simple expression is not good enough you may need a regular expression. Here is an example for another device I recently used. To be honest I am still looking for a good source of documentation on all the options for a regular expression, if anyone knows a good one. (EDIT – Thanks Steve for pointing out the document Regular expression support in SCOM 2007.docx from the OpsManJam website. )

Here is the sample

<Expression>
  <RegExExpression>
    <ValueExpression>
      <XPathQuery>/DataItem/SnmpVarBinds/SnmpVarBind[OID='1.3.6.1.2.1.1.1.0'][1]/Value</XPathQuery>
    </ValueExpression>
    <Operator>MatchesRegularExpression</Operator>
    <Pattern>^.*SensorHawk .*$</Pattern>
  </RegExExpression>
</Expression>

With the above section of XML, if the text SensorHawk followed by a space appears in the result then it’s considered a match.
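You can try a pattern like this outside SCOM before committing it to the MP. The sketch below uses Python's `re` module as a stand-in, on the assumption that a pattern this simple behaves the same way in both engines; the sample description strings are made up.

```python
import re

# Same pattern as in the discovery above.
PATTERN = re.compile(r"^.*SensorHawk .*$")


def matches(sys_descr: str) -> bool:
    """True when the description contains 'SensorHawk' followed by a space."""
    return PATTERN.search(sys_descr) is not None


# Hypothetical description strings for illustration only.
for descr in ["Acme SensorHawk 8 firmware 1.0", "SensorHawk", "Some other box"]:
    print(descr, "->", matches(descr))
```

Note the space after SensorHawk in the pattern: a description that ends right after the word would not match.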

Edit: Ben needed a discovery that would detect 2 different kinds of devices. Together we found the following appears to work best.

<Expression>
  <Or>
    <Expression>
      <RegExExpression>
        <ValueExpression>
          <XPathQuery>/DataItem/SnmpVarBinds/SnmpVarBind[OID='1.3.6.1.2.1.1.1.0'][1]/Value</XPathQuery>
        </ValueExpression>
        <Operator>MatchesRegularExpression</Operator>
        <Pattern>^.*RICOH .*$</Pattern>
      </RegExExpression>
    </Expression>
    <Expression>
      <RegExExpression>
        <ValueExpression>
          <XPathQuery>/DataItem/SnmpVarBinds/SnmpVarBind[OID='1.3.6.1.2.1.1.1.0'][1]/Value</XPathQuery>
        </ValueExpression>
        <Operator>MatchesRegularExpression</Operator>
        <Pattern>^.*Canon .*$</Pattern>
      </RegExExpression>
    </Expression>
  </Or>
</Expression>
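The Or wrapper simply means the device is discovered if either inner expression matches. As a sketch, again using Python's `re` as a stand-in for the SCOM regex engine and with invented description strings:

```python
import re

# One compiled pattern per inner <Expression> of the <Or> block.
PATTERNS = [re.compile(r"^.*RICOH .*$"), re.compile(r"^.*Canon .*$")]


def matches_any(sys_descr: str) -> bool:
    """Equivalent of <Or>: true as soon as one pattern matches."""
    return any(p.search(sys_descr) for p in PATTERNS)


# Hypothetical description strings for illustration only.
for descr in ["RICOH printer model X", "Canon copier model Y", "Other vendor device"]:
    print(descr, "->", matches_any(descr))
```

Adding a third device type would just be a third pattern in the list, which corresponds to a third Expression inside the Or.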

As awesome as Scott Vintinner’s example is, one thing I believe it lacks is a view in the console so you can see what’s going on.  We need to add just a bit of XML to the existing pack. To make this easy I have added it to the base pack using the RBH.Ecosaire.AC naming so you can just edit it all at once if you prefer.  Download updated management pack example. Here we have added a section just after the end of </Monitoring>

<Presentation>
  <Views>
    <View ID="RBH.Ecosaire.AC.Management.Pack.AlertView" Accessibility="Internal" Enabled="true" Target="RBH.Ecosaire.AC.Management.Pack.SNMPDevice" TypeID="SC!Microsoft.SystemCenter.AlertViewType" Visible="true">
      <Category>Custom</Category>
      <Criteria />
    </View>
    <View ID="RBH.Ecosaire.AC.Management.Pack.EventView" Accessibility="Internal" Enabled="true" Target="RBH.Ecosaire.AC.Management.Pack.SNMPDevice" TypeID="SC!Microsoft.SystemCenter.EventViewType" Visible="true">
      <Category>Custom</Category>
      <Criteria />
    </View>
    <View ID="RBH.Ecosaire.AC.Management.Pack.PerformanceView" Accessibility="Internal" Enabled="true" Target="RBH.Ecosaire.AC.Management.Pack.SNMPDevice" TypeID="SC!Microsoft.SystemCenter.PerformanceViewType" Visible="true">
      <Category>Custom</Category>
      <Criteria />
    </View>
    <View ID="RBH.Ecosaire.AC.Management.Pack.StateView" Accessibility="Internal" Enabled="true" Target="RBH.Ecosaire.AC.Management.Pack.SNMPDevice" TypeID="SC!Microsoft.SystemCenter.StateViewType" Visible="true">
      <Category>Custom</Category>
      <Criteria />
    </View>
  </Views>
  <Folders>
    <Folder ID="RBH.Ecosaire.AC.Management.Pack.ViewFolder" Accessibility="Internal" ParentFolder="NetLib!Microsoft.SystemCenter.NetworkDevice.AllDevices.ViewFolder.Root" />
  </Folders>
  <FolderItems>
    <FolderItem ElementID="RBH.Ecosaire.AC.Management.Pack.AlertView" Folder="RBH.Ecosaire.AC.Management.Pack.ViewFolder" />
    <FolderItem ElementID="RBH.Ecosaire.AC.Management.Pack.EventView" Folder="RBH.Ecosaire.AC.Management.Pack.ViewFolder" />
    <FolderItem ElementID="RBH.Ecosaire.AC.Management.Pack.PerformanceView" Folder="RBH.Ecosaire.AC.Management.Pack.ViewFolder" />
    <FolderItem ElementID="RBH.Ecosaire.AC.Management.Pack.StateView" Folder="RBH.Ecosaire.AC.Management.Pack.ViewFolder" />
  </FolderItems>
</Presentation>

And a few display strings that will form our text labels for the above entries, these go in the <LanguagePacks> section just before </DisplayStrings>

<DisplayString ElementID="RBH.Ecosaire.AC.Management.Pack.AlertView">
  <Name>Alerts</Name>
</DisplayString>
<DisplayString ElementID="RBH.Ecosaire.AC.Management.Pack.EventView">
  <Name>Events</Name>
</DisplayString>
<DisplayString ElementID="RBH.Ecosaire.AC.Management.Pack.PerformanceView">
  <Name>Performance View</Name>
</DisplayString>
<DisplayString ElementID="RBH.Ecosaire.AC.Management.Pack.StateView">
  <Name>State View</Name>
</DisplayString>
<DisplayString ElementID="RBH.Ecosaire.AC.Management.Pack.ViewFolder">
  <Name>Ecosaire AC</Name>
</DisplayString>

Now all you have to do is install your new MP and you should see in the SCOM monitoring console

[Image: new views in the SCOM monitoring console]

With luck you can now see your base device management pack and may even have enough good karma to see some devices.  Of course if all you have done is followed my instructions you still won’t have anything because there is one last step.

Step 4: You need to discover the device in SNMP. This is done via the operations console, in the Administration tab.  Right click on “Device Management” and run the “Discovery Wizard”

[Image: Discovery Wizard, discovery type selection]

Simple enough, select network devices and select next.

[Image: Discovery Wizard, network device settings]

Here enter the IP or range of IP’s that your devices use, make sure to enter the community name you configured on the device, and drop the SNMP version down to 1, unless of course you know your device is V2. Select the management server that you want to handle traps, monitors and rules and click discover.

If all goes well a couple of minutes later you will get a screen showing the devices that have been discovered. Check the box(es) of the ones you want to be managed, click finish and you are done. Then wait a while and they should start showing up in the management console.  If things didn’t go well there is likely a problem with either the community name or the SNMP configuration on the device allowing your root management server to contact the device using SNMP. Best to configure the device to send SNMP traps to all your management servers and allow SNMP read-only from all of your management servers.

Here are the completed XML files for a few of the MP’s I created if they are of use to you. Download and rename to .XML

GEUPS.GreaterThen.Five.Management.Pack

GEUPS.Single.Phase.Management.Pack

RBH.Ecosaire.AC.Management.Pack

Stay tuned for Part 2 where we will look into creating rules and monitors for the discovered devices.

Part 1

Part 2

Part 3