I have come across the same request from multiple customers, from colleagues, and in several forums: if an agent stops collecting data, how can we find out?

I have developed a management pack to answer this request, built using Kevin’s fragments.

How does this management pack work?

In this management pack, I use a PowerShell probe action module to execute a PowerShell script that runs against the SCOM OperationsManager database server and collects the required information. If the property bag contains any data, the script logs error event 7890 in the Operations Manager event log, listing every server that has a data collection problem.
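The flow described above can be sketched roughly as follows. This is only an illustration written in Python, not the PowerShell script shipped in the management pack. The function name `build_event` and the row layout are assumptions made for the example; only the event ID 7890, the Error level, and the opening words of the message come from this post.

```python
from typing import Optional

EVENT_ID = 7890  # event ID the post says the script logs


def build_event(rows: list[tuple[str, str]]) -> Optional[dict]:
    """Return an error-event payload if any row is BAD, otherwise None.

    rows: (server_name, status) pairs, where status is 'GOOD' or 'BAD'
    (a hypothetical layout for the query result).
    """
    # Collect the distinct servers that stopped sending performance data.
    bad = sorted({name for name, status in rows if status == "BAD"})
    if not bad:
        return None  # nothing to report, so no event is logged
    return {
        "event_id": EVENT_ID,
        "level": "Error",
        "description": "Performance data is not collected for " + "; ".join(bad),
    }


rows = [("web01.domain.com", "GOOD"), ("DC.domain.COM", "BAD")]
print(build_event(rows))
```

When every server is GOOD the function returns nothing, which mirrors the "empty property bag, no event" behaviour described above.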

This monitor runs every 12 hours. The frequency can be overridden, but I recommend running it only twice a day.

You can download the management pack from the link below.

Management-Packs/Custom.Perf.DataCollection.Verify.Monitor.xml at master · souravmahato7/Management-Packs (github.com)

The changes below are required to make this management pack work.

  1. First, update all the references in the MP to match the versions of the dependent MPs in your environment.
  2. Second, change the DatabaseName and Instance Name in the script body in the XML.
  3. Enable this monitor for one of the SCOM management servers. By default, the monitor is disabled.
  4. Once the MP is imported and the monitor is enabled for one of the management servers (Custom.PerfDataCollection.Verify.Monitor is the monitor you need to enable through the Authoring pane), you will see an alert description like the following.
    Custom PerfDataCollection Verify Monitor: detected a bad condition
    
    Please run the below SQL query against the OperationsManager database to get the server details. Please also check event ID 7890 for more details.
    
    select ME.Path As 'Name',
    CAST(Max(TimeSampled) As nvarchar(50)) As 'LastSample',
    CASE
    -- 4 is the number of hours to wait after the last sample collection before marking the server as BAD instead of GOOD.
    -- BAD should be for problematic servers
    -- GOOD should be for working servers
    WHEN Isnull(MAX(TimeSampled),'01-01-80') < DateAdd(hh,-4,getutcdate()) Then 'BAD'
    Else 'GOOD'
    END as 'Status'
    from dbo.ManagedEntityGenericView ME
    inner join dbo.ManagedTypeView MT on ME.MonitoringClassId=MT.Id
    inner join dbo.PerformanceCounterView C on ME.Id = C.ManagedEntityId
    left join dbo.PerformanceDataAllView P on C.PerformanceSourceInternalId=P.PerformanceSourceInternalId
    where (MT.Name like '%Linux%'
    OR MT.Name like '%UNIX%')
    and ME.IsDeleted=0
    group by ME.Path
    order by Status
  5. The server details appear in the Operations Manager log of the management server under event ID 7890, as in the example below.
Log Name: Operations Manager
Source: Health Service Script
Date: 7/23/2021 11:40:02 PM
Event ID: 7890
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: Server.domain.com
Description:
Custom.PerfDataMonitor.ps1 : 
Performance data is not collected for DC.domain.COM DC.domain.COM;DC
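
For reference, the CASE expression in the query from step 4 can be read as the small decision rule below. This is only an illustrative sketch in Python, not part of the management pack. The function name `sample_status` is hypothetical; the 4-hour window and the 1980 fallback date are taken from the query.

```python
from datetime import datetime, timedelta
from typing import Optional

FALLBACK = datetime(1980, 1, 1)  # stands in for Isnull(..., '01-01-80')


def sample_status(last_sample: Optional[datetime], now: datetime, hours: int = 4) -> str:
    """Return 'BAD' if the newest sample is older than `hours`, else 'GOOD'."""
    # A server with no samples at all is treated as last sampled in 1980.
    newest = last_sample if last_sample is not None else FALLBACK
    return "BAD" if newest < now - timedelta(hours=hours) else "GOOD"


now = datetime(2021, 7, 23, 23, 40)
print(sample_status(datetime(2021, 7, 23, 22, 0), now))  # sampled 1h40m earlier
print(sample_status(datetime(2021, 7, 23, 12, 0), now))  # sampled 11h40m earlier
print(sample_status(None, now))                          # never sampled
```

Because of the fallback date, a server that has never returned a sample is always reported as BAD, which is usually what you want from this monitor.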

 

 
