Tag Archives: IIS

Intermittently Slow IIS web site

TL;DR:

An issue in the Windows Management Instrumentation (WMI) performance counter collection process caused periodic system-wide performance degradation.


This issue became visible when our infrastructure monitoring software invoked specific WMI queries.


We disabled the specific WMI query set which was causing the performance issues, and the problem went away.

A few days ago one of our clients began reporting performance issues on one of their web sites. This site is an IIS web application responsible for rendering visualizations of very large data sets (hundreds of gigabytes). As such, the application pool consumes a corresponding amount of RAM (which is physically available on the server).

Normally, these sites (I manage a few hundred instances) are fast, with most queries returning in under 300ms; however, this one instance proved difficult. To make matters worse, the performance issues were intermittent: most of the time, the site was blazing fast, but sometimes the site would hang for minutes.

Given a few hours of observation, one of my team members noticed a correlation between the performance issues of the site and a seemingly unrelated process on the host: WmiPrvSe.exe

I began digging in, and was able to corroborate this correlation by looking at the process’s CPU usage over time (using ELK / MetricBeat to watch windows processes). Sure enough, there’s a direct correlation between WmiPrvSe.exe using ~3-4% CPU, and IIS logs indicating a timeTaken of greater than 90 seconds. This correlation also established an interval between instances of the issue: 20 minutes.

I fired up Sysinternals’ ProcMon.exe to get a better handle on what exactly WmiPrvSe.exe was doing during these so-called “spikes”. I observed an obscene count of Registry queries to things looking like Performance counters (RegQueryValue, RegCloseKey, RegEnumKey, RegOpenKey). Note that there are multiple instances of WmiPrvSe.exe running on the sytsem, but only one instance was “misbehaving:” the one running as NT AUTHORITY\SYTSEM (which also happens to have the lowest PID). The instances running as NT AUTHORITY\NETWORK SERVICE and as NT AUTHORITY\LOCAL SERVICE did not seem to be misbehaving.

Almost all of the registry keys in question contained the string Performance or PERFLIB; many (but not all) queries were against keys within HKLM\System\CurrentControlSet\Services.

I know that I have the Elastic Co.’s “Beats” agents installed on this host; could Metricbeat, or one of my other monitoring tools be the culprit? So, I tried disabling all of the “beats” agents (filebeat, metricbeat, winlogbeat, etc), but was still seeing these intermittent spikes in WmiPrvSe.exe CPU usage correlating with slow page loads from IIS.

Stumped, I searched for how to capture WMI application logs, and found this article: https://docs.microsoft.com/en-us/windows/desktop/wmisdk/tracing-wmi-activity.

I ran the suggested command (Wevtutil.exe sl Microsoft-Windows-WMI-Activity/Trace /e:true) and fired up Event Veiwer (as admin) to the above path. Bingo.

Log hits inMicrosoft-Windows-WMI-Activity/Trace included mostly checks against the networking devices:select __RELPATH, Name, BytesReceivedPersec, BytesSentPersec, BytesTotalPersec from Win32_PerfRawData_Tcpip_NetworkInterface

These WMI queries were executed by the ClientProcessId owned by nscp.exe.

I perused the source code for NSCP a bit, and discovered that the Network queries for NSCP are executed through WMI(
https://github.com/mickem/nscp/blob/master/modules/CheckSystem/check_network.cpp#L105 ); while the standard performance counter queries were executed through PDH (https://github.com/mickem/nscp/blob/master/modules/CheckSystem/pdh_thread.cpp#L132):

Something else I noticed was that the Microsoft-Windows-WMI-Activity/Operational log contained events directly corresponding to the issue at hand: WMIProv provider started with result code 0x0. HostProcess = wmiprvse.exe; ProcessID = 3296; ProviderPath = %systemroot%\system32\wbem\wmiprov.dll

Some more creative google searches yielded me an interesting issue in a GitHub repo for a different project: CPU collector blocks every ~17 minutes on call to wmi.Query #89 .

Sounds about right.

Skimming through the issue I see this; which sets off the “ah-ha” moment:

Perfmon uses the PDH library, not WMI. I did not test with Perfmon, but PDH is not affected.


leoluk commented on Feb 16, 2018 (https://github.com/martinlindhe/wmi_exporter/issues/89#issuecomment-366195581)

Now knowing that only NSCP’s check_network uses WMI, I found the documentation to disable the network routine in nscp’s CheckSystem module: https://docs.nsclient.org/reference/windows/CheckSystem/#disable-automatic-checks

I added the bits to my nsclient.ini config to disable automatic network checks, restarted NSCP, and confirmed the performance issue was gone:

[/settings/system/windows]
# Disable automatic checks
disable=network

I’ve opened an issue on NSCP’s GitHub page for this problem: https://github.com/mickem/nscp/issues/619

Related Reading:

Forgot SharePoint Farm User Account Password

I had recently patched an inherited SharePoint 2010 Farm up to the December 2014 CU.  I’m currently prepping to migrate the farm to SharePoint 2013, but I needed to get it patched in the interim.

I successfully applied SP2, and the the December 2014 CU (14.0.7140.5000 – Much thanks to Todd Klindt’s SharePoint Admin Blog for the easy build number lookup), and all seemed well.

That is, until I had to change an extranet user’s email address.  These users don’t have mail accounts in our Exchange environment, but we do populate the mail attribute in AD with their corporate email address.  I made the change to the attribute in AD, and attempted to run the User Profile Synchronization (Central Administration | Manage Service Applications | User Profile Service Application | Start Profile Synchronization).

This action failed because the User Profile Synchronization Service was not running on the server! (Central Administration | Manage Services on Server).

I attempted to start the service but was prompted for the DOMAIN\SPFarm account! I searched all archives and documents, but found no reference to this password!  UH OH!!!!!

I finally found this post: http://joelblogs.co.uk/2012/09/22/recovering-passwords-for-sharepoint-2010-farm-web-application-and-service-application-accounts/

I had full administrative access to the server on which Central Administration was installed, so all I had to do was run a “one liner” in PowerShell.  Could it really be that easy?!

Here’s how easy it is:

&$env:windir\\system32\\inetsrv\\appcmd.exe list apppool 
 "SharePoint Central Administration v4" /text:ProcessModel.Password

I ran the command in my dev environment first (we always test foreign code outside of production, right?), and got this!

No Way.  That’s my Farm account password….in PLAIN TEXT! WHOA SCARY!

So, If you ever find yourself forgetting any of your IIS Application Pool Account Passwords, you now have the tool to recover it!

HOO-RAH!