[SOLVED] Freezing / BSOD while idle - DPC WATCHDOG VIOLATION bugcheck 133 - FIX: I disabled C states in the BIOS

CodyB · Member · Joined Aug 19, 2024 · Posts: 23
Hello, my system has recently been freezing and blue screening constantly. I found this forum through a post describing very similar issues to mine, but it does not appear to have been resolved yet, so I figured I'd make my own post.

I found that my PC never freezes if I have a YouTube video playing or while I'm in a game, so while I'm working on my other computer I put on a video to keep my PC from freezing. If I do not, the PC will almost without fail either freeze, blue screen, or fail to come out of sleep when I try to wake the screens (I assume it's just freezing in that state).

I get the stop code DPC_WATCHDOG_VIOLATION
Bugcheck 133
The process named in the minidump always seems to be different, just whatever was active at the time maybe.

What I have done so far...
DDU clean up & updated GPU drivers
Windows Updates
Updated Network drivers
Updated AMD drivers
Updated BIOS
Ran Windows Memory Diagnostic - no errors
Ran Memtest86 on every stick - no errors (only did 1 pass on each stick since it takes 30 minutes to do 1 pass)
Ran PC on each stick of RAM individually - it managed to freeze on every stick.
Attempted to run PC with Driver Verifier on, which resulted in my PC freezing within minutes of startup multiple times, so I turned it off.


PC Info:
Self built in 2020.
Windows 10
Motherboard: ASUS X570 PLUS
CPU: AMD Ryzen 5900x
GPU: MSI 3070 Gaming X Trio
RAM: 64GB GSkill (F4-3200C16Q-64GVK)
PSU: 850W
SSD/HDD: 1TB M.2(2020) / 6TB WD BLACK(2020) / 3TB WD BLACK(2014ish)


Any help is greatly appreciated; I would like to keep reformatting as a last resort.
 
Hello and welcome to the forum!

The dumps all have the same bugcheck code, 0x133, but with argument 1 set to 1. These are a special case and they need the kernel dump to diagnose. The kernel dump is the file C:\Windows\Memory.dmp; please upload that to a cloud service and post a link to it here.
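
For reference, Microsoft's bug check documentation distinguishes two cases for 0x133 by the value of argument 1. A minimal Python sketch of that lookup follows; the dictionary and function names are made up for illustration, and the descriptions are paraphrased rather than quoted.

```python
# Meaning of argument 1 for bug check 0x133 (DPC_WATCHDOG_VIOLATION),
# paraphrased from Microsoft's bug check reference.
DPC_WATCHDOG_ARG1 = {
    0: "a single DPC or ISR exceeded its time allotment",
    1: "the system cumulatively spent too long at IRQL DISPATCH_LEVEL or above",
}

def describe_0x133(arg1: int) -> str:
    """Return a short description for a 0x133 bugcheck's first argument."""
    return DPC_WATCHDOG_ARG1.get(arg1, "unrecognised value, check the documentation")

print(describe_0x133(1))
```

With argument 1 set to 1 there is no single offending routine recorded in the minidump, which is why the fuller kernel dump is needed.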

In the meantime, your System log shows a lot of warning messages for a human interface device...
Code:
Log Name:      System
Source:        Microsoft-Windows-Kernel-PnP
Date:          20/08/2024 02:10:18
Event ID:      219
Task Category: (212)
Level:         Warning
Keywords:    
User:          SYSTEM
Computer:      DESKTOP-CODY
Description:
The driver \Driver\WudfRd failed to load for the device HID\VID_B58E&PID_9E84&MI_03&Col02\a&2acd794&0&0001.
The HID\VID_B58E&PID_9E84 identifier shows it to be some sort of human interface device and the VID and PID identifiers suggest it's a Yeti stereo microphone?
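
If you want to pull the vendor and product IDs out of a device path like that yourself, a small Python sketch such as the following would do it. The function name is just for illustration, and looking the IDs up against a USB ID database is left out.

```python
import re

# Device instance path copied from the event log entry above.
device_path = r"HID\VID_B58E&PID_9E84&MI_03&Col02\a&2acd794&0&0001"

def extract_vid_pid(path: str):
    """Return (vendor_id, product_id) as hex strings, or None if absent."""
    match = re.search(r"VID_([0-9A-Fa-f]{4})&PID_([0-9A-Fa-f]{4})", path)
    return match.groups() if match else None

print(extract_vid_pid(device_path))
```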

These warnings are immediately preceded by an additional warning for a Windows Hello device
Code:
Log Name:      System
Source:        Microsoft-Windows-Kernel-PnP
Date:          20/08/2024 02:10:15
Event ID:      219
Task Category: (212)
Level:         Warning
Keywords:    
User:          SYSTEM
Computer:      DESKTOP-CODY
Description:
The driver \Driver\WudfRd failed to load for the device ROOT\WindowsHelloFaceSoftwareDriver\0000.
A microphone isn't a Hello device, so perhaps this is a webcam, possibly with a built-in mic? It's certainly a USB attached human interface device of some sort.

What's interesting about these warnings is that they occur at the same time as error 41 messages, which indicate that Windows wasn't shut down cleanly. This will be where you restarted after a freeze. However, I can also see them around the time of a BSOD and that's too much to be a coincidence.

I'd suggest you unplug all HID USB devices and see whether it still freezes. If it doesn't then plug them back in one by one until it does start freezing.
 
I have a Blue Yeti microphone, I will unplug it for now while not in use.
I also have a Logitech c920 webcam I have unplugged.
Aside from that I have a USB 3 switch with 3 devices plugged into it (mouse, keyboard, numpad). It's cheap and tends to have a spotty connection at times.

I'm shocked by how quickly I can reproduce my freezing and prevent it by putting something on. However, it should make troubleshooting a bit easier. I'll see if I get any crashes with these unplugged.

Memory dump: MEMORY.DMP

thank you,
 
Less than 10 minutes, had another freeze followed by BSOD with the mic & camera unplugged.
 
Unplugged the USB switch, plugged KB&M directly into the PC, took a bit longer this time but did still end up getting a freeze.
 
The kernel dump was very useful! I'll briefly outline my thinking and the process...

The dumps (all of them) show the following failure bucket...
Code:
FAILURE_BUCKET_ID:  0x133_ISR_nt!KeAccumulateTicks
The ISR there indicates that the excessive time that led to the BSOD was spent in an Interrupt Service Routine. This is the front-end of device interrupt processing, where the device signals that data is available by issuing an interrupt. The relevant ISR gets control (the ISR code is in the driver) and all it does is record the buffer address where the data is located, schedule a DPC, and then end. This is where your problem is - in this dump at least.

The scheduled DPC (Deferred Procedure Call - the back-end of device interrupt processing) will be run when a CPU becomes available. The DPC code is also in the device driver, and it tells the waiting thread that the data has arrived and points to where it is. The waiting thread is then marked ready for execution again.

Using the kernel dump we can examine both the ISRs and DPCs that are running on your system and from there see which one(s) are running for too long. We use the Windows Performance Analyzer (WPA) to display the results...

4cqtRhm.jpg


At the top are the DPCs and at the bottom the ISRs. We know the problem is ISR related, so if you look at the far right column (Duration (Fragmented) (ms) Sum) you can see I've sorted on that column. At the top is the longest running ISR: HDAudBus.sys, which ran for 8.6 ms (8607 μs). No ISR should run longer than 25 μs, so this is your problem. You can also see in the DPCs that HDAudBus runs for 1029 μs; Microsoft recommends that no DPC run for more than 100 μs. We have other long running DPCs, but HDAudBus.sys looks to be the problem here.
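
To put those numbers in proportion, here is a rough Python sketch that compares the two HDAudBus.sys figures against the 25 μs and 100 μs guidelines mentioned above. The variable and function names are only for illustration; the durations are the ones read off the WPA table.

```python
# Guideline limits quoted in the post above, in microseconds.
ISR_LIMIT_US = 25
DPC_LIMIT_US = 100

# (driver, kind, duration in microseconds) as read off the WPA table.
observed = [
    ("HDAudBus.sys", "ISR", 8607),
    ("HDAudBus.sys", "DPC", 1029),
]

def over_limit(rows):
    """Yield (driver, kind, duration, ratio) for entries exceeding their limit."""
    limits = {"ISR": ISR_LIMIT_US, "DPC": DPC_LIMIT_US}
    for driver, kind, duration_us in rows:
        limit = limits[kind]
        if duration_us > limit:
            yield driver, kind, duration_us, duration_us / limit

for driver, kind, duration_us, ratio in over_limit(observed):
    print(f"{driver} {kind}: {duration_us} us, about {ratio:.0f}x the guideline")
```

The ISR figure is several hundred times the guideline while the DPC figure is roughly ten times it, which is why the ISR side is the focus here.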

However, HDAudBus.sys is a Windows driver and so it's not at fault. It will be calling lower level third-party drivers, including the Realtek audio driver RTKVHD64.sys. The version of this driver that you have is old, dating from 2020...
Code:
6: kd> lmDvmRTKVHD64
Browse full module list
start             end                 module name
fffff801`84320000 fffff801`849ac000   RTKVHD64   (deferred)            
    Image path: \SystemRoot\system32\drivers\RTKVHD64.sys
    Image name: RTKVHD64.sys
    Browse all global symbols  functions  data  Symbol Reload
    Timestamp:        Tue Jun 16 12:49:25 2020 (5EE895A5)
    CheckSum:         0067AE1D
    ImageSize:        0068C000
    File version:     6.0.8971.1
    Product version:  6.0.8971.1
    File flags:       8 (Mask 3F) Private
    File OS:          40004 NT Win32
    File type:        3.9 Driver
    File date:        00000000.00000000
    Translations:     0409.04b0
    Information from resource tables:
        CompanyName:      Realtek Semiconductor Corp.
        ProductName:      Realtek(r) High Definition Audio Function Driver
        InternalName:     RTKVHD64.sys 8971
        OriginalFilename: RTKVHD64.sys
        ProductVersion:   6.0.8971.1
        FileVersion:      6.0.8971.1 built by: WinDDK
        FileDescription:  Realtek(r) High Definition Audio Function Driver
        LegalCopyright:   Copyright (c) Realtek Semiconductor Corp.1998-2013
You need to look for an update for this driver on your motherboard vendor's website. I would suggest you update any other old drivers whilst you're at it.
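
As a side note, the hex value in parentheses on the Timestamp line can be decoded independently. The sketch below treats it as seconds since the Unix epoch, which is how PE image timestamps are normally stored; the variable names are just for illustration.

```python
from datetime import datetime, timezone

# Hex value shown in parentheses on the "Timestamp:" line of the lm output.
raw_timestamp = 0x5EE895A5

# Interpreted as seconds since 1970-01-01 UTC (the usual PE header convention).
build_time = datetime.fromtimestamp(raw_timestamp, tz=timezone.utc)
print(build_time.isoformat())
```

The date comes out as 16 June 2020, matching the debugger output. The hour may differ from what the debugger prints, since the debugger presumably shows the time in the analyst's local time zone rather than UTC.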

I can see that you also have the Nvidia audio driver nvhda64v.sys installed; it comes with the graphics driver. You ONLY need this driver if you are sending audio over HDMI as well as video. If you're not doing that you don't need it, and it's known to cause conflicts with Realtek audio drivers. Updating the Realtek driver may solve this, however. If it doesn't, then use DDU to fully uninstall the existing graphics driver and reinstall it (or the latest version if there is one). Choose a 'Custom (Advanced)' install and uncheck the box for the audio driver so that nvhda64v.sys is not installed.
 
Ah I was so close already without even knowing it!
I had updated the LAN driver from the ASUS website, and the chipset drivers from AMD (should I get the ones from ASUS for my board specifically instead?), but I did not get the audio drivers while I was there. I should have done them all.
I've now updated the Realtek drivers from ASUS. I will report back after some time has passed and proceed to the next step with the Nvidia audio drivers if needed.

thank you.
 
With Realtek updated, the PC was up for quite a while, but I did end up getting a freeze.
I proceeded to run DDU and reinstall my GPU drivers (minus the audio drivers) and came back to my PC frozen a while later.
 
Looking at that latest kernel dump you seem to be still having the same problem, the failure is with a long-running ISR still...
Code:
FAILURE_BUCKET_ID:  0x133_ISR_nt!KeAccumulateTicks
And looking at the ISR run times via WPA again shows HDAudBus.sys is still the longest running ISR by a long way...
ypKb7VV.jpg

However, dxgkrnl.sys (the Windows DirectX kernel driver) also runs long. This may be because the audio driver runs long, or vice-versa. The dxgkrnl.sys driver is a Windows driver, so that's not at fault. It will call nvlddmkm.sys, the third-party graphics driver. The version of nvlddmkm.sys that you have installed is dated July 30th (560.81) and there is a later version available from the Nvidia website dated August 20th (560.91). It would be wise to update to this driver.

Although this appears to be an ISR failure it's worth looking at the DPC runtimes whilst we have the data open...
uLRS1ye.jpg

There you can see that HDAudBus.sys has a reasonable DPC runtime (though longer than the 100 μs recommended). The tcpip.sys (high-level networking driver), nvlddmkm.sys (graphics driver), and Wdf01000.sys (the Windows Driver Foundation root driver) are much longer running. (The ntoskrnl.exe entry is the Windows kernel, not a DPC.)

Looking at your network (because of the tcpip.sys driver) you seem to have TWO Realtek Gaming GbE Family Controllers, numbered #1 and #2.
Code:
Name    [00000001] Realtek Gaming GbE Family Controller
Adapter Type    Not Available
Product Type    Realtek Gaming GbE Family Controller
Installed    Yes
PNP Device ID    Not Available
Last Reset    8/19/2024 7:37 PM
Index    1
Service Name    rt640x64
IP Address    Not Available
IP Subnet    Not Available
Default IP Gateway    Not Available
DHCP Enabled    Yes
DHCP Server    Not Available
DHCP Lease Expires    Not Available
DHCP Lease Obtained    Not Available
MAC Address    Not Available

Name    [00000010] Realtek Gaming GbE Family Controller
Adapter Type    Ethernet 802.3
Product Type    Realtek Gaming GbE Family Controller
Installed    Yes
PNP Device ID    PCI\VEN_10EC&DEV_8168&SUBSYS_87C31043&REV_26\6&2AD155D1&0&0028000A
Last Reset    8/19/2024 7:37 PM
Index    10
Service Name    rt640x64
IP Address    192.168.0.108, fe80::98fd:b25f:1aef:e31e
IP Subnet    255.255.255.0, 64
Default IP Gateway    192.168.0.1
DHCP Enabled    Yes
DHCP Server    192.168.0.1
DHCP Lease Expires    8/26/2024 7:37 PM
DHCP Lease Obtained    8/19/2024 7:37 PM
MAC Address    3C:7C:3F:50:E1:D0
I/O Port    0x0000F000-0x0000F0FF
Memory Address    0xFC604000-0xFC604FFF
Memory Address    0xFC600000-0xFC603FFF
IRQ Channel    IRQ 4294967247
Driver    C:\WINDOWS\SYSTEM32\DRIVERS\RT640X64.SYS (10.42.526.2020, 1.09 MB (1,146,456 bytes), 8/10/2024 11:23 PM)
You're using #2 but #1 is there too; it's using the same driver and it was last reset at the same time as #2 (8/19/2024 7:37 PM). What is the deal here? Why are there two LAN adapters that seem to be linked?

I can see that you've updated the LAN adapter driver (rt640x64.sys) since your first post, but I'm still concerned as to why there are two LAN adapters using this same driver. I think we need to resolve that before moving on.
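
A rough Python sketch of the check I'm doing by eye: group the adapters by service name and flag any with no PNP device ID. The tuples are trimmed from the listing above, the function names are only for illustration, and treating "Not Available" as a sign of a leftover, non-present device is an assumption on my part.

```python
from collections import defaultdict

# Trimmed copy of the adapter listing above: (name index, service name, PNP device ID).
adapters = [
    ("00000001", "rt640x64", "Not Available"),
    ("00000010", "rt640x64",
     r"PCI\VEN_10EC&DEV_8168&SUBSYS_87C31043&REV_26\6&2AD155D1&0&0028000A"),
]

def group_by_service(rows):
    """Map each service (driver) name to the adapter indexes that use it."""
    groups = defaultdict(list)
    for index, service, _pnp_id in rows:
        groups[service].append(index)
    return dict(groups)

def ghost_adapters(rows):
    """Return indexes of adapters with no PNP device ID (assumed not present)."""
    return [index for index, _service, pnp_id in rows if pnp_id == "Not Available"]

print(group_by_service(adapters))
print(ghost_adapters(adapters))
```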
 
Hi CodyB,

If the crashes persist, can you try using Ryzen Master to disable CCD 1? Meaning, essentially temporarily turn your CPU into a 5600X. I think that's possible, though I haven't had direct experience with a 5900X myself to know for sure. Then see if the crashes continue in that configuration.
 
I had to show hidden devices in the device manager, then it showed 2 controllers.
I uninstalled both, and restarted - windows automatically reinstalled it, and I verified it's the same latest version I had previously downloaded.
Also have updated my nvidia drivers to the latest 560.91

Latest dump MEMORY3.DMP
 
Hi CodyB,

If the crashes persist, can you try using Ryzen Master to disable CCD 1? Meaning, essentially temporarily turn your CPU into a 5600X. I think that's possible, though I haven't had direct experience with a 5900X myself to know for sure. Then see if the crashes continue in that configuration.
I could give it a try, what's the thought behind this angle?

I've seen people with similar issues, mainly on the 5600X, saying they raised their CPU voltage and resolved the issues. However, I haven't tried this and haven't played much with CPU overclocking either, so it's new territory for me.
 
I have a different interpretation of the trace data from ubuysa. It looks to me like driver code is getting stuck waiting for a lock held by other threads running on other cores. My thinking is to remove communication/cooperation between the CCDs as a variable.

edit: Actually, if the CCDs are labeled 0 and 1, please see if you can disable CCD 0 and just use CCD 1 if you try it.
 
I may need some guidance as I have no idea what I'm doing here lol,

I entered advanced mode and selected one of the profiles.
It allows me to disable 5 out of 6 on each CCD, maybe I'm missing another way to do it.
Also, is there anything else I should do before applying any changes?


1724450914402.png
 
Do the buttons (or is it a toggle?) labeled 1 and 2 next to the "Active CCD Mode" text do anything about enabling one or the other CCDs? If not, I might ask you to try something else with Ryzen Master. I suspect logical core 9 (counting from 0) has an issue - which is why I want to disable CCD 0, if possible. If that's not possible, then I might want to try disabling individual CCX cores to include C05 (since Ryzen Master counts cores starting from 1 rather than 0).
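
To spell out the arithmetic behind picking C05, here is a small Python sketch. It assumes SMT is enabled with two adjacent logical processors per physical core and six cores per CCD, none of which has been confirmed on this system; the function name is just for illustration.

```python
# Assumptions (not confirmed in the thread): SMT is enabled, each physical core
# exposes two adjacent logical processors, and each CCD holds six cores.
THREADS_PER_CORE = 2
CORES_PER_CCD = 6

def locate_logical_processor(logical: int):
    """Map a 0-based logical processor to (CCD, 0-based core, Ryzen Master label)."""
    core = logical // THREADS_PER_CORE
    ccd = core // CORES_PER_CCD
    label = f"C{core + 1:02d}"  # Ryzen Master counts cores from 1
    return ccd, core, label

print(locate_logical_processor(9))
```

Under those assumptions, logical processor 9 sits on physical core 4 (counting from 0) in CCD 0, which Ryzen Master would show as C05.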
 
I might want to try disabling individual CCX cores to include C05 (since Ryzen Master counts cores starting from 1 rather than 0).

So should I just disable C05 and apply?
Is there anything else I should be doing first or keeping an eye on? Again, I'm unfamiliar with overclocking, so I'm just proceeding cautiously.

1724458826903.png
 