Vir Gnarus - I think he is referring to these two posts by you and Richard (niemiro) -
Stephen
Thanks a lot for this very informative post.
I have a user with this kernel dump: MEMORY.zip
It looks almost identical to the one you showed here, and I wondered whether you could think of any cause other than hardware. I guess I will ask the user to update the chipset drivers and see if that does it. A bit desperate, I know.
Also, the OP claims that the computer works fine in Safe Mode, but crashes in normal mode. Finally, I notice that this computer seems to have 8 cores. That seems like quite a few. Do you think this may even be a multi-processor machine, perhaps even a small server? I will ask the OP.
Thanks a lot for any insight you may be able to offer.
Code:
Microsoft (R) Windows Debugger Version 6.2.8229.0 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [D:\MEMORY (2).DMP]
Kernel Summary Dump File: Only kernel address space is available

Symbol search path is: SRV*D:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Server 2008/Windows Vista Kernel Version 6002 (Service Pack 2) MP (8 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Built by: 6002.18607.amd64fre.vistasp2_gdr.120402-0336
Machine Name:
Kernel base = 0xfffff800`03003000 PsLoadedModuleList = 0xfffff800`031c7dd0
Debug session time: Sun Jun 24 14:10:20.781 2012 (UTC + 1:00)
System Uptime: 0 days 0:01:39.562
Loading Kernel Symbols
...............................................................
..................................................
Loading User Symbols
Loading unloaded module list
.....
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 101, {18, 0, fffffa60019d8180, 3}

*** ERROR: Module load completed but symbols could not be loaded for intelppm.sys
Probably caused by : Unknown_Image ( ANALYSIS_INCONCLUSIVE )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

CLOCK_WATCHDOG_TIMEOUT (101)
An expected clock interrupt was not received on a secondary processor in an
MP system within the allocated interval. This indicates that the specified
processor is hung and not processing interrupts.
Arguments:
Arg1: 0000000000000018, Clock interrupt time out interval in nominal clock ticks.
Arg2: 0000000000000000, 0.
Arg3: fffffa60019d8180, The PRCB address of the hung processor.
Arg4: 0000000000000003, 0.

Debugging Details:
------------------

BUGCHECK_STR: CLOCK_WATCHDOG_TIMEOUT_8_PROC

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

PROCESS_NAME: System

CURRENT_IRQL: d

STACK_TEXT:
fffff800`04416a98 fffff800`030193a0 : 00000000`00000101 00000000`00000018 00000000`00000000 fffffa60`019d8180 : nt!KeBugCheckEx
fffff800`04416aa0 fffff800`030543aa : 00000000`00000000 fffff800`04416bc0 fffffa80`08765330 fffff800`03548320 : nt! ?? ::FNODOBFM::`string'+0x2de4
fffff800`04416ae0 fffff800`0352b8af : 00000000`00000000 fffff800`04416bc0 fffff800`03548320 fffffa80`08d91170 : nt!KeUpdateSystemTime+0xea
fffff800`04416b10 fffff800`03053b6d : 00000000`00000000 fffff800`03548320 00000000`00000000 fffffa60`0390b6d6 : hal!HalpRtcClockInterrupt+0x127
fffff800`04416b40 fffffa60`00d407a2 : fffffa60`00d3f685 fffff800`04410000 00000000`00000000 00000000`00000001 : nt!KiInterruptDispatchNoLock+0x14d
fffff800`04416cd8 fffffa60`00d3f685 : fffff800`04410000 00000000`00000000 00000000`00000001 00000000`0000000c : intelppm+0x37a2
fffff800`04416ce0 fffff800`0305f173 : 0000003d`d5f3b80e 00000000`00000000 fffffa80`00000001 fffff800`03179a80 : intelppm+0x2685
fffff800`04416d10 fffff800`0305ee91 : fffff800`03176680 fffff800`00000000 00000000`0f088bae 00000000`00000000 : nt!PoIdle+0x183
fffff800`04416d80 fffff800`0322e860 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x21
fffff800`04416db0 00000000`fffff800 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!zzz_AsmCodeRange_End+0x4
fffff800`044100b0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00680000`00000000 : 0xfffff800

STACK_COMMAND: kb

SYMBOL_NAME: ANALYSIS_INCONCLUSIVE

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: Unknown_Module

IMAGE_NAME: Unknown_Image

DEBUG_FLR_IMAGE_TIMESTAMP: 0

FAILURE_BUCKET_ID: X64_CLOCK_WATCHDOG_TIMEOUT_8_PROC_ANALYSIS_INCONCLUSIVE

BUCKET_ID: X64_CLOCK_WATCHDOG_TIMEOUT_8_PROC_ANALYSIS_INCONCLUSIVE

Followup: MachineOwner
---------

0: kd> !prcb 0
PRCB for Processor 0 at fffff80003176680:
Current IRQL -- 13
Threads-- Current fffff8000317bb80 Next 0000000000000000 Idle fffff8000317bb80
Number 0 SetMember 1
Interrupt Count -- 0001471f
Times -- Dpc 000000bc Interrupt 00000018
         Kernel 000018ab User 00000000
0: kd> !prcb 1
PRCB for Processor 1 at fffffa60005ec180:
Current IRQL -- 0
Threads-- Current fffffa60005f5d40 Next 0000000000000000 Idle fffffa60005f5d40
Number 1 SetMember 2
Interrupt Count -- 0000c511
Times -- Dpc 00000000 Interrupt 00000000
         Kernel 0000189a User 00000000
0: kd> !prcb 2
PRCB for Processor 2 at fffffa6001966180:
Current IRQL -- 0
Threads-- Current fffffa600196fd40 Next fffffa80054d7bb0 Idle fffffa600196fd40
Number 2 SetMember 4
Interrupt Count -- 0000bc7e
Times -- Dpc 00000015 Interrupt 00000000
         Kernel 000012ec User 00000000
0: kd> !prcb 3
PRCB for Processor 3 at fffffa60019d8180:
Current IRQL -- 0
Threads-- Current fffffa8009df3bb0 Next 0000000000000000 Idle fffffa60019e1d40
Number 3 SetMember 8
Interrupt Count -- 0000bf6c
Times -- Dpc 00000001 Interrupt 00000002
         Kernel 000012d2 User 00000000
0: kd> !prcb 4
PRCB for Processor 4 at fffffa6001a43180:
Current IRQL -- 0
Threads-- Current fffffa80054e6210 Next 0000000000000000 Idle fffffa6001a4cd40
Number 4 SetMember 10
Interrupt Count -- 0000939e
Times -- Dpc 00000000 Interrupt 00000000
         Kernel 00001897 User 00000000
0: kd> !prcb 5
PRCB for Processor 5 at fffffa6001ab5180:
Current IRQL -- 0
Threads-- Current fffffa6001abed40 Next 0000000000000000 Idle fffffa6001abed40
Number 5 SetMember 20
Interrupt Count -- 000091dc
Times -- Dpc 00000000 Interrupt 00000010
         Kernel 00001895 User 00000000
0: kd> !prcb 6
PRCB for Processor 6 at fffffa6001b27180:
Current IRQL -- 0
Threads-- Current fffffa80054eebb0 Next 0000000000000000 Idle fffffa6001b30d40
Number 6 SetMember 40
Interrupt Count -- 0000bdf3
Times -- Dpc 00000001 Interrupt 00000004
         Kernel 00001155 User 00000000
0: kd> !prcb 7
PRCB for Processor 7 at fffffa6001b99180:
Current IRQL -- 0
Threads-- Current fffffa80069e6bb0 Next 0000000000000000 Idle fffffa6001ba2d40
Number 7 SetMember 80
Interrupt Count -- 0000bff6
Times -- Dpc 00000000 Interrupt 00000001
         Kernel 0000114e User 00000000
0: kd> !prcb 8
Cannot get PRCB address

0: kd> !irql 0
Debugger saved IRQL for processor 0x0 -- 13
0: kd> !irql 1
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
0: kd> !irql 2
Debugger saved IRQL for processor 0x2 -- 0 (LOW_LEVEL)
0: kd> !irql 3
Debugger saved IRQL for processor 0x3 -- 0 (LOW_LEVEL)
0: kd> !irql 4
Debugger saved IRQL for processor 0x4 -- 0 (LOW_LEVEL)
0: kd> !irql 5
Debugger saved IRQL for processor 0x5 -- 0 (LOW_LEVEL)
0: kd> !irql 6
Debugger saved IRQL for processor 0x6 -- 0 (LOW_LEVEL)
0: kd> !irql 7
Debugger saved IRQL for processor 0x7 -- 0 (LOW_LEVEL)

0: kd> ~0
0: kd> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffff800`04416a98 fffff800`030193a0 : 00000000`00000101 00000000`00000018 00000000`00000000 fffffa60`019d8180 : nt!KeBugCheckEx
fffff800`04416aa0 fffff800`030543aa : 00000000`00000000 fffff800`04416bc0 fffffa80`08765330 fffff800`03548320 : nt! ?? ::FNODOBFM::`string'+0x2de4
fffff800`04416ae0 fffff800`0352b8af : 00000000`00000000 fffff800`04416bc0 fffff800`03548320 fffffa80`08d91170 : nt!KeUpdateSystemTime+0xea
fffff800`04416b10 fffff800`03053b6d : 00000000`00000000 fffff800`03548320 00000000`00000000 fffffa60`0390b6d6 : hal!HalpRtcClockInterrupt+0x127
fffff800`04416b40 fffffa60`00d407a2 : fffffa60`00d3f685 fffff800`04410000 00000000`00000000 00000000`00000001 : nt!KiInterruptDispatchNoLock+0x14d (TrapFrame @ fffff800`04416b40)
fffff800`04416cd8 fffffa60`00d3f685 : fffff800`04410000 00000000`00000000 00000000`00000001 00000000`0000000c : intelppm+0x37a2
fffff800`04416ce0 fffff800`0305f173 : 0000003d`d5f3b80e 00000000`00000000 fffffa80`00000001 fffff800`03179a80 : intelppm+0x2685
fffff800`04416d10 fffff800`0305ee91 : fffff800`03176680 fffff800`00000000 00000000`0f088bae 00000000`00000000 : nt!PoIdle+0x183
fffff800`04416d80 fffff800`0322e860 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x21
fffff800`04416db0 00000000`fffff800 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!zzz_AsmCodeRange_End+0x4
fffff800`044100b0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00680000`00000000 : 0xfffff800

0: kd> ~1
1: kd> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffffa60`0191bcd8 fffffa60`00d3f685 : fffffa80`054d7720 fffffa60`005f5d40 fffffa60`00000001 fffffa60`0191bd50 : intelppm+0x37a2
fffffa60`0191bce0 fffff800`0305f173 : 00000000`00000001 fffffa80`054d7818 fffffa80`054d7720 fffffa60`005f5d40 : intelppm+0x2685
fffffa60`0191bd10 fffff800`0305ee91 : fffffa60`005ec180 fffffa60`00000000 00000000`0f096483 00000000`00000000 : nt!PoIdle+0x183
fffffa60`0191bd80 fffff800`0322e860 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x21
fffffa60`0191bdb0 00000000`fffffa60 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!zzz_AsmCodeRange_End+0x4
fffffa60`005efd00 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00680000`00000000 : 0xfffffa60

1: kd> uf fffffa60`00d407a2
intelppm+0x37a0:
fffffa60`00d407a0 fb              sti
fffffa60`00d407a1 f4              hlt
fffffa60`00d407a2 c3              ret

1: kd> uf fffffa60`00d3f685
intelppm+0x267c:
fffffa60`00d3f67c 4883ec28        sub     rsp,28h
fffffa60`00d3f680 e81b110000      call    intelppm+0x37a0 (fffffa60`00d407a0)
fffffa60`00d3f685 33c0            xor     eax,eax
fffffa60`00d3f687 4883c428        add     rsp,28h
fffffa60`00d3f68b c3              ret

1: kd> ~3
3: kd> r if
if=1
3: kd> !thread
THREAD fffffa8009df3bb0 Cid 0234.0238 Teb: 000007fffffdd000 Win32Thread: fffff900c0004d50 RUNNING on processor 3
Not impersonating
DeviceMap fffff880000073d0
Owning Process fffffa8009e03040 Image: csrss.exe
Attached Process N/A Image: N/A
Wait Start TickCount 5915 Ticks: 457 (0:00:00:07.140)
Context Switch Count 140 IdealProcessor: 3 LargeStack
UserTime 00:00:00.000
KernelTime 00:00:00.468
Win32 Start Address 0x0000000049d6153c
Stack Init fffffa600569cdb0 Current fffffa600569b360
Base fffffa600569d000 Limit fffffa6005695000 Call 0
Priority 13 BasePriority 13 PriorityDecrement 0 IoPriority 2 PagePriority 5
*** ERROR: Symbol file could not be found. Defaulted to export symbols for nvlddmkm.sys -
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffffa60`0569b6c8 fffff800`03527699 : 00000000`00000010 00000000`00000246 fffffa60`0569b6f0 00000000`00000018 : hal!HalpPciReadMmConfigUlong+0x7
fffffa60`0569b6d0 fffff800`035274aa : 00000000`00000000 fffffa60`0569b800 00000000`00000040 fffff800`0351b000 : hal!HalpPCIPerformConfigAccess+0x55
fffffa60`0569b700 fffff800`035272ef : fffffa60`0569b800 00000000`00000000 00000000`00000000 fffffa60`0569b8d0 : hal!HalpPCIConfigHoldingConfigLock+0x17a
fffffa60`0569b750 fffff800`035270d8 : 00000000`00000000 fffffa60`0569b8d0 fffffa60`0569b800 00000000`00000040 : hal!HalpPCIConfig+0x87
fffffa60`0569b790 fffff800`03526d1c : 00000000`00000000 00000000`00000000 00000000`00000040 fffff800`0353aa80 : hal!HalpReadPCIConfig+0x60
fffffa60`0569b7d0 fffff800`03528190 : 00000000`00000002 fffff800`03526d9a 00000000`00000000 00000000`0000000a : hal!HalpGetPCIData+0x89
fffffa60`0569b8a0 fffffa60`02c17c44 : 00000000`00000000 00000000`00000000 00000000`00000028 fffffa60`0569b9d0 : hal!HalGetBusDataByOffset+0x9c
fffffa60`0569b990 fffffa60`02cdc48e : 00000000`00000000 00000000`0000ffff 00000000`00000007 00000000`00000000 : nvlddmkm+0x208c44
fffffa60`0569b9d0 fffffa60`02cdffa4 : fffffa80`ffff8086 fffffa80`08a72870 fffffa60`03537888 fffffa80`08a72c41 : nvlddmkm+0x2cd48e
fffffa60`0569ba50 fffffa60`02ce0344 : fffffa80`09f40300 fffffa80`09f4d000 fffffa80`08a72870 fffffa80`08757610 : nvlddmkm+0x2d0fa4
fffffa60`0569bab0 fffffa60`02cd2867 : fffffa80`08a705e3 fffffa80`08a72870 fffffa80`08a710de fffffa80`08a72870 : nvlddmkm+0x2d1344
fffffa60`0569bb40 fffffa60`02c0231e : fffffa80`09f4d000 fffffa80`08a72870 fffffa80`08a72870 fffffa80`09e5e010 : nvlddmkm+0x2c3867
fffffa60`0569bb70 fffffa60`02d7f2db : fffffa80`09e5e010 fffffa80`09e5e010 fffffa80`08a72870 00000000`00000012 : nvlddmkm+0x1f331e
fffffa60`0569bbb0 fffffa60`02cc860c : 00000000`00000000 00000000`00000000 00000000`00000001 00000000`00000012 : nvlddmkm+0x3702db
fffffa60`0569bbf0 fffffa60`02ccd5c8 : 00000000`00000000 00000000`00000012 00000000`00000000 fffffa80`09f63d30 : nvlddmkm+0x2b960c
fffffa60`0569bc20 fffffa60`02c383e5 : 00000000`00000000 fffffa80`09f4d000 00000000`00000000 fffffa80`095ca000 : nvlddmkm+0x2be5c8
fffffa60`0569bcb0 fffffa60`02c0a5cd : fffffa80`09f4d000 fffffa80`09f4d000 00000000`00000001 00000000`00000001 : nvlddmkm+0x2293e5
fffffa60`0569bce0 fffffa60`02c0a73b : fffffa60`00000000 00000000`d0000000 00000000`00000000 00000000`00000000 : nvlddmkm+0x1fb5cd
fffffa60`0569bdb0 fffffa60`02b2ae91 : 00000000`00000000 00000000`00000000 00000000`d0000000 00000000`00000000 : nvlddmkm+0x1fb73b
fffffa60`0569be70 fffffa60`02b2b3d0 : fffffa80`08d3d000 fffffa60`02b2ae0f 00000000`00000001 00000000`00000000 : nvlddmkm+0x11be91
fffffa60`0569bf20 fffffa60`02ae8292 : fffffa80`0017f71e fffffa80`08d3d000 fffffa80`09ea2240 fffffa80`09ea2240 : nvlddmkm+0x11c3d0
fffffa60`0569bf60 fffffa60`02a6473a : fffffa80`08d3d000 fffffa60`00000001 fffffa80`08d3d000 00000000`00000000 : nvlddmkm+0xd9292
fffffa60`0569c020 fffffa60`03749ca9 : fffffa80`08d3d000 fffffa80`08d3d000 fffffa60`0569c990 fffffa60`0569c8d0 : nvlddmkm+0x5573a
fffffa60`0569c560 fffffa60`03753389 : fffffa60`03749c27 fffffa80`08d3d000 fffffa60`0569c990 fffffa80`08b2d72c : nvlddmkm!nvDumpConfig+0x23f999
fffffa60`0569c600 fffffa60`03756d25 : fffffa80`08b2d72c fffffa80`08d3d000 fffffa60`0569c990 00000000`00000000 : nvlddmkm!nvDumpConfig+0x249079
fffffa60`0569c7f0 fffffa60`03882b46 : fffffa80`08d3d000 fffffa80`08b2d72c fffffa80`08b2d728 fffff800`030df8b8 : nvlddmkm!nvDumpConfig+0x24ca15
fffffa60`0569c830 fffffa60`0388073a : 00000000`40020056 00000000`00000000 fffffa80`08b2d040 00000000`00000000 : dxgkrnl!DpiDxgkDdiStartDevice+0x62
fffffa60`0569c880 fffffa60`03880baa : fffffa80`00000000 00000000`00000000 00000000`00000000 fffffa80`08b3fd80 : dxgkrnl!DpiFdoStartAdapter+0x382
fffffa60`0569c9e0 fffffa60`0387b66f : 00000000`00000000 00000000`00000000 00000000`00000000 fffffa60`0569cca0 : dxgkrnl!DpiFdoStartAdapterThread+0x17a
fffffa60`0569ca70 fffffa60`038f71be : fffffa60`00000000 00000000`00000000 00000000`00000000 00000000`00292d00 : dxgkrnl!DpiSessionCreateCallback+0x1b
fffffa60`0569caa0 fffffa60`038f70f6 : 00000000`00000000 00000000`00000054 fffffa80`09e03040 00000000`00000000 : watchdog!SMgrSessionOpen+0x42
fffffa60`0569cae0 fffff960`00043ecb : fffffa80`09e03040 00000000`000007ff fffffa60`0569cb48 00000000`00000000 : watchdog!SMgrNotifySessionChange+0x22
fffffa60`0569cb20 fffff960`00046a9c : fffffa80`0000067b fffffa60`0569cca0 fffffa80`054d5080 00000000`000007ff : win32k!InitializeGreCSRSS+0x23
fffffa60`0569cbe0 fffff800`0305a573 : fffffa80`09df3bb0 000007fe`fd7d8a20 fffffa80`09f3f630 00000000`0018f808 : win32k!NtUserInitialize+0x13c
fffffa60`0569cc20 000007fe`fd72cd9a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`0569cc20)
00000000`0018f768 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x7fe`fd72cd9a

3: kd> r
rax=00000000ffffffff rbx=0000000000000028 rcx=ffffffffffd18000
rdx=fffffa600569b818 rsi=fffffa600569b818 rdi=0000000000000018
rip=fffff80003533b47 rsp=fffffa600569b6c8 rbp=ffffffffffd18000
 r8=0000000000000018  r9=0000000000000018 r10=0000000000000000
r11=0000000000000000 r12=fffff8000353a980 r13=0000000000000003
r14=fffffa600569b907 r15=fffff800035424a0
iopl=0 nv up ei pl zr na po nc
cs=0010 ss=0018 ds=0000 es=0000 fs=0000 gs=0000 efl=00000246
hal!HalpPciReadMmConfigUlong+0x7:
fffff800`03533b47 8902            mov     dword ptr [rdx],eax ds:fffffa60`0569b818=ffffffff

Good job on the approach. Looking at intelppm.sys wasn't really necessary in this case: when you look at the running thread on the faulting proc (proc 3), you can see that intelppm was not involved. The intelppm frames on procs 0 and 1 are just the idle loop sitting in hlt. What was involved is nvlddmkm.sys, or the PCI-E bus, as the most recent frames in proc 3's callstack show. In the specific situation I was dealing with in the OP, the AMD chipset driver was responsible, but not in your case. I'll take a look at the kernel dump myself, but from what I see you'll want to ask the user to remove the graphics card, clean out any foreign material that may be in the slot, and reseat it, as well as update the graphics drivers if they haven't already.
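For anyone following along, here is a rough Python sketch of how proc 3 gets identified as the hung one: bugcheck Arg3 is the PRCB address of the hung processor, so you match it against the `!prcb N` output. The function name and the trimmed-down sample text are my own for illustration; this is not a debugger feature, just text matching on pasted output.

```python
import re

# Excerpt of the `!prcb N` headers from the dump above (processors 2-4 only).
PRCB_OUTPUT = """\
PRCB for Processor 2 at fffffa6001966180:
PRCB for Processor 3 at fffffa60019d8180:
PRCB for Processor 4 at fffffa6001a43180:
"""

def find_hung_processor(prcb_text, arg3):
    """Return the processor number whose PRCB address equals bugcheck Arg3, or None.

    Hypothetical helper: it only parses pasted debugger text.
    """
    target = int(arg3.replace("`", ""), 16)
    pattern = r"PRCB for Processor (\d+) at ([0-9a-fA-F`]+):"
    for match in re.finditer(pattern, prcb_text):
        number, address = match.groups()
        if int(address.replace("`", ""), 16) == target:
            return int(number)
    return None

# Arg3 from "BugCheck 101, {18, 0, fffffa60019d8180, 3}"
print(find_hung_processor(PRCB_OUTPUT, "fffffa60`019d8180"))
```

With the values from this dump it lands on processor 3, which is why the next step is `~3` followed by `!thread` rather than digging into the stack on proc 0.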
I'm concerned about one thing, though: you were able to retrieve a thread, with all its info, from the faulting proc, which isn't really supposed to happen if that proc was actually frozen. My guess would be that the proc was running at an IRQL higher than the clock interrupt but not higher than the bugcheck. But if that were the case, why wasn't that IRQL saved (it shows up as 0)? And if it was saved correctly, why on earth would a thread at IRQL 0 stop a clock interrupt?
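To spell out why this is odd, here is a tiny Python sketch of the masking rule I have in mind. It assumes the clock interrupt runs at IRQL 13 on x64 (which matches `CURRENT_IRQL: d` on proc 0 in the dump) and that an interrupt is only delivered when the interrupt flag is set and the processor's IRQL is below the interrupt's level. The function name is made up for illustration.

```python
# Assumption: clock interrupt IRQL on x64 Windows (0xd, as seen on processor 0).
CLOCK_LEVEL = 13

def clock_interrupt_deliverable(saved_irql, interrupt_flag):
    """True if a clock interrupt should get through, given IRQL and the IF bit.

    Hypothetical helper encoding the simplified rule: interrupts must be
    enabled (if=1) and the current IRQL must be below CLOCK_LEVEL.
    """
    return interrupt_flag == 1 and saved_irql < CLOCK_LEVEL

# Processor 3 in the dump: `!irql 3` reports 0 and `r if` reports if=1.
print(clock_interrupt_deliverable(0, 1))
```

By that rule proc 3 should have taken the clock interrupt, so either the saved IRQL isn't what the proc was really at, or something outside this simple model (such as the hardware stalling on the PCI config read) kept it from responding.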
Perplexing, but I'd like to look into it further. One thing I'd like to check is whether anything in the callstack actually raised the IRQL (KeRaiseIrql). Someone at codemachine.com wrote a script that parses a module for any calls it makes to a function name that you give it, which is very convenient. It's not perfect, but it does the job well. There are other ways of approaching this as well, and I'll work out which fits best when I take a look at the dump.
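This is not the codemachine.com script, just a minimal Python sketch of the same idea: take disassembly text pasted from `uf` and list the `call` instructions whose target mentions a given name. The function name and sample are my own, and the line format is assumed to match the `uf` output shown in the dump above.

```python
import re

# Sample taken from the `uf` output in the dump above.
SAMPLE = """\
fffffa60`00d3f67c 4883ec28        sub     rsp,28h
fffffa60`00d3f680 e81b110000      call    intelppm+0x37a0 (fffffa60`00d407a0)
fffffa60`00d3f685 33c0            xor     eax,eax
"""

def find_calls(disassembly, name):
    """Return (address, target) pairs for call instructions mentioning `name`.

    Hypothetical helper; matching is a case-insensitive substring test.
    """
    hits = []
    for line in disassembly.splitlines():
        # Expected layout: address, opcode bytes, mnemonic, operands.
        m = re.match(r"\s*(\S+)\s+[0-9a-fA-F]+\s+call\s+(.*)", line)
        if m and name.lower() in m.group(2).lower():
            hits.append((m.group(1), m.group(2).strip()))
    return hits

print(find_calls(SAMPLE, "intelppm+0x37a0"))
print(find_calls(SAMPLE, "KeRaiseIrql"))
```

One caveat: as far as I know, x64 code often raises the IRQL with an inline write to CR8 rather than a call, so an empty result for KeRaiseIrql would not by itself prove that nothing in the stack raised it.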
Stephen