Troubleshooting IIS Process Crash

Recently I was asked to help on troubleshooting occasional crashes of the IIS w3wp.exe process on a production site. Other than having the same windows application events generated at the time of the crash, the crashes do not have any obvious pattern and do not appear to be related to the traffic level. After performing a memory dump analysis using windbg, it turned out the crash was caused by a decryption function that was decrypting an empty string.

Problem

The following is the code that caused the exception.

Public Shared Function strDecrypt(ByVal encryptedString As String) As String
    encryptedString = encryptedString.Replace(" ", "+")
    Dim pstrDecrKey As String = "renamedFakeKey"
    Dim byKey As Byte() = {}
    Dim IV As Byte() = {18, 52, 86, 120, 144, 171, 205, 239}
    Dim inputByteArray As Byte() = New Byte() {encryptedString.Length}

    byKey = System.Text.Encoding.UTF8.GetBytes(pstrDecrKey.Substring(0, 8))
    Dim des As DESCryptoServiceProvider = New DESCryptoServiceProvider()
    inputByteArray = Convert.FromBase64String(encryptedString)
    Dim ms As MemoryStream = New MemoryStream()
    Dim cs As CryptoStream = New CryptoStream(ms, des.CreateDecryptor(byKey, IV), CryptoStreamMode.Write)
    cs.Write(inputByteArray, 0, inputByteArray.Length)
    cs.FlushFinalBlock()
    Return System.Text.Encoding.UTF8.GetString(ms.ToArray())
End Function

Solution

The fix is simply to check the input for empty string and return immediately

If string.IsNullOrEmpty(encryptedString) Then
    Return string.Empty
End If

Additional Information

This problem appears to have existed for a while as it was reported here back in 2009 and mentioned here in 2011.

Note that this problem only happens on 64 bit OS but not 32 bit OS. If you upgraded a web server OS from a old windows 32 bit OS to a newer windows 64 bit OS, and the site suddenly experienced random crashes, check if you have similar code and verify if you are having the same problem.

The fix is trivial once the cause is known, but finding out the root cause was a bit more involved. I shared below the process I went through to troubleshoot this.

At the time of the crash, three windows event log entries are generated.

Application Error Event 1023

Log Name:      Application
Source:        .NET Runtime
Date:          2/24/2016 12:53:02 PM
Event ID:      1023
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      xxxxxx
Description:
.NET Runtime version 2.0.50727.5485 - Fatal Execution Engine Error (000007FEF099600A) (80131506)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name=".NET Runtime" />
    <EventID Qualifiers="0">1023</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2016-02-24T19:53:02.000000000Z" />
    <EventRecordID>137453</EventRecordID>
    <Channel>Application</Channel>
    <Computer>xxxxxx</Computer>
    <Security />
  </System>
  <EventData>
    <Data>.NET Runtime version 2.0.50727.5485 - Fatal Execution Engine Error (000007FEF099600A) (80131506)</Data>
  </EventData>
</Event>

Application Event 1001

Log Name:      Application
Source:        Windows Error Reporting
Date:          2/24/2016 12:53:02 PM
Event ID:      1001
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      xxxxxx
Description:
Fault bucket , type 0
Event Name: APPCRASH
Response: Not available
Cab Id: 0

Problem signature:
P1: w3wp.exe
P2: 7.5.7601.17514
P3: 4ce7afa2
P4: mscorwks.dll
P5: 2.0.50727.5485
P6: 53a11d6c
P7: c0000005
P8: 00000000006b9089
P9: 
P10: 

Attached files:

These files may be available here:
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_b75d5a4315af67c6bf1c7ab752f52b4cfb718_611caf08

Analysis symbol: 
Rechecking for solution: 0
Report Id: 3626349d-db30-11e5-b7d1-00155d023016
Report Status: 4
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Windows Error Reporting" />
    <EventID Qualifiers="0">1001</EventID>
    <Level>4</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2016-02-24T19:53:02.000000000Z" />
    <EventRecordID>137455</EventRecordID>
    <Channel>Application</Channel>
    <Computer>xxxxxx</Computer>
    <Security />
  </System>
  <EventData>
    <Data>
    </Data>
    <Data>0</Data>
    <Data>APPCRASH</Data>
    <Data>Not available</Data>
    <Data>0</Data>
    <Data>w3wp.exe</Data>
    <Data>7.5.7601.17514</Data>
    <Data>4ce7afa2</Data>
    <Data>mscorwks.dll</Data>
    <Data>2.0.50727.5485</Data>
    <Data>53a11d6c</Data>
    <Data>c0000005</Data>
    <Data>00000000006b9089</Data>
    <Data>
    </Data>
    <Data>
    </Data>
    <Data>
    </Data>
    <Data>C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_b75d5a4315af67c6bf1c7ab752f52b4cfb718_611caf08</Data>
    <Data>
    </Data>
    <Data>0</Data>
    <Data>3626349d-db30-11e5-b7d1-00155d023016</Data>
    <Data>4</Data>
  </EventData>
</Event>

Application Error Event 1000

Log Name:      Application
Source:        Application Error
Date:          2/24/2016 12:53:02 PM
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      xxxxxx
Description:
Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7afa2
Faulting module name: mscorwks.dll, version: 2.0.50727.5485, time stamp: 0x53a11d6c
Exception code: 0xc0000005
Fault offset: 0x00000000006b9089
Faulting process id: 0x%9
Faulting application start time: 0x%10
Faulting application path: %11
Faulting module path: %12
Report Id: %13
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Application Error" />
    <EventID Qualifiers="0">1000</EventID>
    <Level>2</Level>
    <Task>100</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2016-02-24T19:53:02.000000000Z" />
    <EventRecordID>137454</EventRecordID>
    <Channel>Application</Channel>
    <Computer>xxxxxx</Computer>
    <Security />
  </System>
  <EventData>
    <Data>w3wp.exe</Data>
    <Data>7.5.7601.17514</Data>
    <Data>4ce7afa2</Data>
    <Data>mscorwks.dll</Data>
    <Data>2.0.50727.5485</Data>
    <Data>53a11d6c</Data>
    <Data>c0000005</Data>
    <Data>00000000006b9089</Data>
  </EventData>
</Event>

The information in the event log is limited. To find out what went wrong, I had to dig a little deeper and perform a memory dump analysis.

Collect Memory Dump
First I had to collect a memory dump when the IIS process crashes. In my case, the web server was already configured with WER to do that. The location of the memory dump was indicated in the evnet 1001 event log entry event.

A couple tools can be used to collect memory dump:
WER
DebugDiag

Analyze Memory Dump
To analyze the memory dump, I use use WinDbg. A list of common windbg commands can be found here

Download WinDbg if you do not already have it. Start WinDbg and go to File > Open Crash Dump to load the memory dump.

WinDbg

Enter the following command which will display information about the current exception or bug check in verbose format.

!analyze -v

This gives me the following output.

Loading unloaded module list
....................................
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(5a8.172c): Access violation - code c0000005 (first/second chance not available)
ntdll!ZwWaitForSingleObject+0xa:
00000000`7797d9fa c3              ret
0:070> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

*** WARNING: Unable to verify checksum for mscorlib.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\System.Web\aa5e82ba0882a68ec638bc04d3679a53\System.Web.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for System.Web.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\System\c7fb84e825f6604d7f4684ab96cbd148\System.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for System.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\System.Data\326df77841f23d0e0039374a7d233416\System.Data.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for System.Data.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\Microsoft.VisualBas#\6301f41d5d4d86722d00e611f0899490\Microsoft.VisualBasic.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for Microsoft.VisualBasic.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\System.Xml\d09a5530f1283b469957bf146e2f4d65\System.Xml.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for System.Xml.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\System.Web.Services\0ec14eb4a93723f6f5f24154fa9def53\System.Web.Services.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for System.Web.Services.ni.dll
Unable to load image C:\Windows\assembly\NativeImages_v2.0.50727_64\System.Web.Extensio#\9286bb3f5e8fbfb834fed7588a2271bd\System.Web.Extensions.ni.dll, Win32 error 0n2
*** WARNING: Unable to verify checksum for System.Web.Extensions.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for System.Web.Extensions.ni.dll

DUMP_CLASS: 2

DUMP_QUALIFIER: 400

CONTEXT:  (.ecxr)
rax=00000000050ebfb0 rbx=0000000000000000 rcx=00000000ffffffff
rdx=0000000000000001 rsi=0000000000000000 rdi=0000000180a027e8
rip=000007fefa3c9089 rsp=000000000a82d380 rbp=0000000000000000
 r8=0000000000000000  r9=0000000000000000 r10=e35a172cdd8cf770
r11=000000000a82d2e8 r12=000000000a82d478 r13=0000000000000001
r14=0000000080090005 r15=00000000054670d0
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
mscorwks!COMCryptography::_DecryptData+0x329:
000007fe`fa3c9089 0fb63401        movzx   esi,byte ptr [rcx+rax] ds:00000001`050ebfaf=??
Resetting default scope

FAULTING_IP: 
mscorwks!COMCryptography::_DecryptData+329
000007fe`fa3c9089 0fb63401        movzx   esi,byte ptr [rcx+rax]

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 000007fefa3c9089 (mscorwks!COMCryptography::_DecryptData+0x0000000000000329)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 00000001050ebfaf
Attempt to read from address 00000001050ebfaf

PROCESS_NAME:  w3wp.exe

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_CODE_STR:  c0000005

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  00000001050ebfaf

READ_ADDRESS:  00000001050ebfaf 

FOLLOWUP_IP: 
mscorwks!COMCryptography::_DecryptData+329
000007fe`fa3c9089 0fb63401        movzx   esi,byte ptr [rcx+rax]

WATSON_BKT_PROCSTAMP:  4ce7afa2

WATSON_BKT_PROCVER:  7.5.7601.17514

PROCESS_VER_PRODUCT:  Internet Information Services

WATSON_BKT_MODULE:  mscorwks.dll

WATSON_BKT_MODSTAMP:  53a11d6c

WATSON_BKT_MODOFFSET:  6b9089

WATSON_BKT_MODVER:  2.0.50727.5485

MODULE_VER_PRODUCT:  Microsoft® .NET Framework

BUILD_VERSION_STRING:  6.1.7601.19045 (win7sp1_gdr.151019-1254)

MODLIST_WITH_TSCHKSUM_HASH:  857813e725729c7dd498f79da93398e4b08dead1

MODLIST_SHA1_HASH:  5a768968a550e35d43040d297ed764b41596e31f

NTGLOBALFLAG:  0

APPLICATION_VERIFIER_FLAGS:  0

PRODUCT_TYPE:  3

SUITE_MASK:  272

DUMP_FLAGS:  d96

DUMP_TYPE:  0

APP:  w3wp.exe

From Faulting module name: mscorwks.dll, version: 2.0.50727.5485 and
FAULTING_IP: mscorwks!COMCryptography::_DecryptData+329, We can tell that the crash happens in the mscorwks.dll and in the DecryptData function. Doing a bit of googling led to this blog post. I also found the strDecrypt function in the website’s code base and I verified on a test site that passing empty string to the function will crash the iis process.

It’s been a few weeks after fixing the code and no crashes so far.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s