So you built a web application, tested it and deployed it. People are using it and everything is fine, but then one day your site suddenly becomes totally unavailable. You look at the web server and the w3
wp.
exe process is using up 100% of the
cpu time. Well, here's a way to figure out what page is causing the jam without having to pay for any fancy diagnostic tools.
Log ParserFirstly get and install the latest version of Log Parser from Microsoft. Currently that would be version 2.2.
http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&
displaylang=en
Finding Out Which Site is the ProblemAre you running multiple sites and aren't sure which site is causing the problem? Well if you aren't already try running each site on a
separate application pool. This is also a hint at another way to prevent these kind of performance issues which is managing a web garden in
IIS. Hopefully I'll cover that another time. This will mean there is a w3
wp process for each site. When one of these processes hits 100% you can use process explorer (http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx) to find out which pool is causing the issue. The name of the pool will be at the end of the target field when you click on the properties of the task though I'm sure there's a better way of finding that out.
Configuring IIS LoggingTo use log parser effectively you need to ensure that you are logging the right things. Go into
IIS manager, open the website tab, click on the logging properties button. Now click on the Extended Properties tab and check Time Taken. This is going to allow you to find pages that take a long time to execute in your logs.
WaitNow you need to wait for it to happen again. Why not have a cup of tea or dig a hole to nowhere? One thing you could do to speed up this
occurring again is to use a site crawler to hammer your application. There are plenty of free trials for these kinds of apps floating about the
Internet, though it does assume your problem page is causing it's havoc in something executed during page load and not after user interaction.
Using Log ParserHuzzah, it's at 100% again! Put down that spade and get back to your server. Open up a command prompt in the directory you installed
logparser, copy your most recent
logfile for the offending site into the
logparser directory and paste this into the command window:
logparser "SELECT TOP 10 cs-
uri-stem AS
Url, MIN(time-taken) as [Min], AVG(time-taken) AS [Avg], max(time-taken) AS [Max], count(time-taken) AS Hits FROM ex*.log WHERE time-taken < class="blsp-spelling-error" id="SPELLING_ERROR_17">Url ORDER BY [Avg]
DESC" -
rtp:-1
This will give you the top 50 longest taking pages on your site. Next comes a bit of a tedious task. One or more of those pages is likely to be the page causing your problem. The rest are going to be pages that were held up because of that page that was causing the issue, or for some other erroneous reason like the user is trying to load a page with a dying
Internet connection their end. In any case copy the results out of the command prompt and into note pad. Now copy each
url, starting from the top, into your browser, maybe fiddle a bit with the
ui, until you hit the one that maxes out the CPU of your server. This bit would probably be best done on a test server if you have one. When I last had this issue I didn't have to go any further than the second
url to find my problem (though I did check them all for prosperity). It'll likely be something simple and stupid that you didn't expect to happen. Mine was a malformed
url that was being generated on the site somewhere that
omitted a certain parameter I expected to always be there.
Now, back to that hole to nowhere....