Friday, July 27, 2012

SCOM Subscriptions automatically disabled repeatedly

It was reported to me that certain IT teams were not receiving the alerts they had subscribed to.

After logging on to the SCOM Console, I found that these notification subscriptions were being disabled every 30 minutes. The odd thing was that not all subscriptions were affected, and it was the same set of subscriptions every time. I tried re-enabling them with the same result: the subscriptions kept being disabled. After some digging through the Operations Manager event log I found this warning:


Log Name: Operations Manager
Source: Health Service Modules
Date: 7/27/2012 5:53:22 PM
Event ID: 11452
Task Category: None
Level: Warning
Keywords: Classic
User: N/A

Computer: RMSserver

Description:
Validate alert subscription data source module encountered an alert subscription data source with configuration that has gone out of scope. Disabling the alert subscription data source module.

Alert subscription name: Subscription45c18cec_e95d_4af6_877e_072844d147d0

One or more workflows were affected by this.
Workflow name: Microsoft.SystemCenter.ValidateAlertSubscription
Instance name: RMSServer
Instance ID: {AF86A1AC-F1F5-9BF7-1E89-F60F73982EB6}
Management group: ManagementGRP



The problem turned out to be that someone on the team had recently cleaned up the SCOM Admins user group, and one of the users removed from the group had created these subscriptions. Putting the user back in the SCOM Admins group and re-enabling the subscriptions solved the problem, but we really didn’t want this user (who has left the company) in the SCOM Admins group.

What is the root cause of this? When a subscription is created, the SID of the user who created it is associated with that subscription. A workflow checks every half hour for SIDs that are no longer valid. A SID can become invalid because the account's access has been removed, or possibly because the account has been disabled or deleted.

The Solution

To fix the issue permanently, I exported the management pack “Microsoft.SystemCenter.Notifications.Internal” in XML format.
This management pack is unsealed and contains all subscriptions.
Inside the management pack I searched for one of the subscriptions that was being disabled and one that wasn't. I then replaced the SID in the subscription that was being disabled with the SID from the subscription that stayed enabled.
After replacing the SIDs I re-imported the management pack and re-enabled all subscriptions, and the problem was solved for good.
Here is an example of one of the SIDs I had to replace.

<ExpirationStartTime>12/01/2010 10:00:21</ExpirationStartTime>
<IdleMinutes>5</IdleMinutes>
<PollingIntervalMinutes>1</PollingIntervalMinutes>
<UserSid>S-1-5-21-1202660629-706699826-839522115-63827</UserSid>
<LanguageCode>ENE</LanguageCode>
<ExcludeNonNullConnectorIds>false</ExcludeNonNullConnectorIds>
<RuleId>$MPElement$</RuleId>
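The SID swap described above can also be scripted instead of done by hand in an editor. The following is a minimal sketch, not the procedure used at the time: `replace_user_sid` is a hypothetical helper, the second SID in the sample is made up for illustration, and the only assumption about the exported file is that SIDs appear inside `<UserSid>` elements as in the excerpt above.

```python
import re
from typing import Tuple


def replace_user_sid(mp_xml: str, old_sid: str, new_sid: str) -> Tuple[str, int]:
    """Replace old_sid with new_sid inside <UserSid> elements.

    Returns the modified XML text and the number of replacements made.
    The rest of the management pack text is left untouched.
    """
    pattern = re.compile(
        r"(<UserSid>)\s*" + re.escape(old_sid) + r"\s*(</UserSid>)"
    )
    # A function replacement avoids any escape handling in new_sid.
    return pattern.subn(lambda m: m.group(1) + new_sid + m.group(2), mp_xml)


# SID of the removed user (from the excerpt above).
OLD_SID = "S-1-5-21-1202660629-706699826-839522115-63827"
# Hypothetical SID of a user who is still a valid SCOM administrator.
NEW_SID = "S-1-5-21-1202660629-706699826-839522115-1001"

sample = (
    "<UserSid>" + OLD_SID + "</UserSid>\n"
    "<UserSid>" + NEW_SID + "</UserSid>\n"
)

new_xml, count = replace_user_sid(sample, OLD_SID, NEW_SID)
print(count, "SID(s) replaced")
```

The sketch edits the text directly rather than parsing and re-serializing the XML, so everything outside the matched `<UserSid>` elements stays byte-for-byte identical before the re-import.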

Monday, July 16, 2012

SCCM Package Stuck at "Install Pending" State Persistently


One of my SCCM Primary Site servers encountered some issues over the past week, and during that time a package was being copied to all Primary Site servers, including the one having issues.

After the issue on the Primary Site server was resolved, repeated attempts to remove and re-copy the package all ended with the same frustrating outcome: the package showing up as "Install Pending".

The method used to resolve this modifies the SCCM SQL database tables directly (attempt it at your own risk).

1) Remove the assigned DP from the package and allow some time for the change to take effect. Only proceed to step 2 once you have verified that the package is no longer in the "Install Pending" state.

2) Log on to the SCCM SQL database on both the top-tier and the parent primary site server.
Run the SQL query below against the PkgStatus table in the SCCM database:

Delete FROM PkgStatus WHERE ID='<Package ID>' AND SiteCode = '<Site Code>'

3) Give it some time before adding the DP back to the same package.

The procedure applies to all DPs, including BDPs.
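To illustrate why the DELETE in step 2 filters on both the package ID and the site code, here is a small sketch that runs the same statement against an in-memory SQLite stand-in. The real SCCM database runs on SQL Server and its PkgStatus table has more columns. The table layout, package IDs and site codes below are made up for the demonstration, and `delete_pkg_status` is a hypothetical helper.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a simplified, hypothetical stand-in for the PkgStatus table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PkgStatus (ID TEXT, SiteCode TEXT, Status INTEGER)")
    conn.executemany(
        "INSERT INTO PkgStatus VALUES (?, ?, ?)",
        [
            ("ABC00012", "P01", 1),  # the stuck package on the problem site
            ("ABC00012", "P02", 1),  # same package, healthy site
            ("ABC00034", "P01", 1),  # different package, problem site
        ],
    )
    conn.commit()
    return conn


def delete_pkg_status(conn: sqlite3.Connection, package_id: str, site_code: str) -> int:
    """Run the DELETE from step 2 with parameters; return rows removed."""
    cur = conn.execute(
        "DELETE FROM PkgStatus WHERE ID = ? AND SiteCode = ?",
        (package_id, site_code),
    )
    conn.commit()
    return cur.rowcount


conn = make_demo_db()
removed = delete_pkg_status(conn, "ABC00012", "P01")
remaining = conn.execute("SELECT COUNT(*) FROM PkgStatus").fetchone()[0]
print(removed, "row(s) removed,", remaining, "left")
```

With both predicates in place, only the status row for that package on that site is removed; status rows for the same package on other sites, and for other packages on the same site, are left alone.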

Windows Server 2008 stops responding and hangs at the "Applying User Settings" stage of the logon process

Last week an issue was flagged to me: a Hyper-V guest running Windows Server 2008 SP2 was starting up extremely slowly (Applying computer settings, Applying security policies, etc.), and it could take hours for the server to reach the logon screen.

Even though I could log on to the server, I found that multiple services, including those below, had not started. Weird!

Print Spooler
Terminal Services
Server service
Remote Registry
Windows Management Instrumentation (WMI)
Distributed Transaction Coordinator
Any services that are related to applications

After several rounds of troubleshooting, which included

- Booting to Safe Mode (booting to Safe Mode was fast)
- Re-installing the Hyper-V integration disk
- Tweaking physical NIC settings

I finally came across a Microsoft article (http://support.microsoft.com/kb/2004121) that more or less describes what I was facing.

This issue occurs because of a deadlock in the Service Control Manager database.

The Service Control Manager tries to start the HTTP.sys service and then puts a lock in place in the Service Control Manager database. Then, HTTP.sys makes a call that requires Cryptographic Services during startup. Then, a request is sent to start Cryptographic Services. However, a lock is already in place in the Service Control Manager database. Therefore, a deadlock occurs.

To verify that this is the case, run "sc querylock" from a command prompt.
Output like the below indicates that the Service Control Manager (SCM) database is locked:

QueryServiceLockstatus - Success
IsLocked : True
LockOwner : .\NT Service Control Manager
LockDuration : 1090 (seconds since acquired)
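If this check needs to be repeated across several servers, the output can be parsed rather than read by eye. The sketch below is only an illustration: `parse_querylock` and `lock_seconds` are hypothetical helper names, and the parser assumes the "Key : Value" layout shown in the output above.

```python
from typing import Dict, Optional

# Output copied from the example above (raw string because of the backslash).
SAMPLE_OUTPUT = r"""QueryServiceLockstatus - Success
IsLocked : True
LockOwner : .\NT Service Control Manager
LockDuration : 1090 (seconds since acquired)
"""


def parse_querylock(output: str) -> Dict[str, str]:
    """Collect the 'Key : Value' lines of the output into a dictionary."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the status line
            fields[key.strip()] = value.strip()
    return fields


def lock_seconds(output: str) -> Optional[int]:
    """Return how long the SCM database has been locked, or None if unlocked."""
    fields = parse_querylock(output)
    if fields.get("IsLocked", "").lower() != "true":
        return None
    # LockDuration looks like '1090 (seconds since acquired)'.
    return int(fields["LockDuration"].split()[0])


print(lock_seconds(SAMPLE_OUTPUT))
```

A lock held for many minutes long after boot, as in the sample, is consistent with the deadlock described in the article.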


To resolve the issue

You can make HTTP.sys depend on another service being started first. To do this, perform the following steps:

1) Open Registry Editor.
2) Navigate to HKLM\SYSTEM\CurrentControlSet\Services\HTTP and create the following Multi-String Value: DependOnService
3) Double-click the new DependOnService entry.
4) Type CRYPTSVC in the Value data field and click OK.
5) Reboot the server.
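Steps 1 to 4 above can also be done from the command line with the built-in `reg add` tool instead of Registry Editor. The sketch below only builds the command; `build_reg_add_command` is a hypothetical helper name, and actually running the command (from an elevated prompt on the affected server) is left as a commented-out line.

```python
from typing import List


def build_reg_add_command(depends_on: str = "CRYPTSVC") -> List[str]:
    """Build the 'reg add' call that creates the DependOnService value."""
    return [
        "reg", "add", r"HKLM\SYSTEM\CurrentControlSet\Services\HTTP",
        "/v", "DependOnService",  # value name
        "/t", "REG_MULTI_SZ",     # multi-string value, as in step 2
        "/d", depends_on,         # service that HTTP.sys should wait for
        "/f",                     # overwrite without prompting
    ]


cmd = build_reg_add_command()
print(" ".join(cmd))

# To apply it on the affected server (elevated prompt, Windows only):
# import subprocess
# subprocess.run(cmd, check=True)
```

The reboot in step 5 is still required afterwards for the new dependency to take effect.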

Thursday, July 12, 2012

RDP clients and ICA clients cannot connect to a Windows Server 2003-based terminal server after hotfix 938759 is applied to the server

We encountered an issue where users were unable to log on to some of our Citrix servers running Windows Server 2003 R2 SP2 after deployment of the security update KB2653956 (http://support.microsoft.com/kb/2653956).


It seems to be a known issue that affects only Windows Server 2003, not Windows Server 2008. The issue can be corrected by applying the hotfix listed in the KB below.
http://support.microsoft.com/kb/958476


Interesting that Microsoft did not correct this in the update itself to keep it from affecting Windows Server 2003, and instead provided a separate hotfix for those who may be affected. :(

Tuesday, July 10, 2012

Removing Deleted Computers from the SCOM View


There will be times when a server has been decommissioned but, after some time, its object is still displayed in the SCOM Computers view.

If removal is required, the SQL statement below will do so:

UPDATE [OperationsManager].[dbo].[BaseManagedEntity] SET [IsDeleted] = 1 WHERE [DisplayName] LIKE 'servername'