Freezing/BSOD DPC WATCHDOG VIOLATION

Honestly, I don't really know how I got it to work. I just removed the first file that it was trying to grab; maybe I had messed up that file.

I have unplugged everything from my PC except the displays, keyboard, mouse (BT), and ethernet. I still got some crashing.

I tried to use your command, but it didn't work. I'm getting an error. I thought I was copying too much of the line, but I guess not. I attached an image of the window.
 

Attachment: CommandPromptError.PNG

I was able to track down the path manually in RegEdit, here's a snapshot of what I found. If you need more just let me know.

Do I now try to change this value from 1 to 0, and see if that helps?
 

Attachment: Regedit.PNG

I just built this PC a couple months ago, my settings say that I'm up to date.

Should I just go ahead and update to Windows 11?
 
Personally I would strongly advise against updating a system known to have issues to Windows 11. You just create a whole new set of problems. We need to find out why the system is crashing before it's wise to consider updating to Windows 11.

The most recent dump (Fri Sept 13) is a 0x116 VIDEO_TDR_FAILURE, and the other dumps also highlight nvlddmkm.sys (the Nvidia graphics driver). This does look like a graphics problem. I'm still concerned that the PSU might not be up to that RTX 4090 with power spikes on load. Did you ever upgrade to the 1000W PSU?
 
I haven’t yet. Just waiting on shipment and also will be gone this week for work. So it will be around a week or so before I can do anything else.

I’ll swap the PSU ASAP and update.
 
Windows Error Reporting:


Fault bucket 0x116_TdrBCR:4:C000009A_Tdr:9_IMAGE_nvlddmkm.sys_Ada_SCG3D-AMD#0, type 0
(nvlddmkm.sys is the NVIDIA driver that allows your computer to communicate with NVIDIA devices like the GPU.)
Event Name: BlueScreen

+++ WER5 +++:
Fault bucket 1861905259368572200, type 1
Event Name: APPCRASH
Problem signature:
P1: FMSIScan.exe
P4: atiadlxx.dll

+++ WER6 +++:
Fault bucket , type 0
Event Name: APPCRASH
Problem signature:
P1: FMSIScan.exe
P4: atiadlxx.dll

+++ WER7 +++:
Fault bucket 1448329291139938386, type 5
Event Name: RADAR_PRE_LEAK_WOW64
Problem signature:
P1: asus_framework.exe
^^^^^^^^ RADAR_PRE_LEAK_64 can indicate issues with your system's hardware configuration.

Dump files: these show problems with the CPU and GPU.
 
OK, so we know that the lack of power during GPU spikes was not the issue, but it's still wise to have upgraded the PSU.

The two dumps both point very clearly at either the graphics driver or the graphics card...

One is a 0x116 VIDEO_TDR_FAILURE bugcheck. This happens when the Windows Timeout Detection and Recovery (TDR) feature, which detects a graphics hang and resets the graphics driver and graphics card, fails to recover from the hang. The cause here is almost certainly either the driver or the card.

The other dump is a 0x133 DPC_WATCHDOG_VIOLATION. A DPC is a Deferred Procedure Call; DPCs are typically used in the back-end of device interrupt processing, and the DPC code is part of the device driver. In this dump the graphics driver (nvlddmkm.sys) is where the failure happens...
Code:
FAILURE_BUCKET_ID: 0x133_ISR_nvlddmkm!unknown_function
Note that this failure bucket blames the ISR, the Interrupt Service Routine, which is the front-end of device interrupt processing. The ISR code is also part of the device driver. An ISR or DPC that runs for too long will cause this 0x133 bugcheck.
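
If you end up with a pile of dumps and want to compare them quickly, the bucket strings can be pulled apart with a few lines of Python. This is only a rough sketch: the function and class names are made up for illustration, the name table covers just the two bugchecks seen in this thread, and the patterns are guesses based on the two bucket strings posted here rather than any documented format.

```python
import re
from typing import NamedTuple, Optional

# Names for the two bugcheck codes discussed in this thread only.
BUGCHECK_NAMES = {
    0x116: "VIDEO_TDR_FAILURE",
    0x133: "DPC_WATCHDOG_VIOLATION",
}


class FailureBucket(NamedTuple):
    code: int
    name: str
    module: Optional[str]


def parse_failure_bucket(bucket: str) -> FailureBucket:
    """Split a failure bucket string into bugcheck code and blamed module.

    Assumption: the string starts with the hex bugcheck code, and the module
    appears either as "<module>!<function>" (dump analysis output) or as
    "IMAGE_<module>.sys" (Windows Error Reporting bucket).
    """
    code_match = re.match(r"0x([0-9A-Fa-f]+)", bucket)
    if code_match is None:
        raise ValueError(f"no bugcheck code at start of {bucket!r}")
    code = int(code_match.group(1), 16)

    module_match = (
        re.search(r"_([A-Za-z0-9]+)!", bucket)
        or re.search(r"IMAGE_([A-Za-z0-9]+)\.sys", bucket)
    )
    module = module_match.group(1) if module_match else None
    return FailureBucket(code, BUGCHECK_NAMES.get(code, "UNKNOWN"), module)


dump_bucket = parse_failure_bucket("0x133_ISR_nvlddmkm!unknown_function")
print(hex(dump_bucket.code), dump_bucket.name, dump_bucket.module)
```

Running it on both the dump bucket and the Windows Error Reporting bucket from this thread gives the same module, nvlddmkm, which matches what the dumps show.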

You thus have two dumps here, both pointing very clearly at a graphics problem, just as the earlier dump did. We now know it's not a power problem, although be sure that the additional power cable is securely plugged into the 4090 card and the PSU.

The first thing I'd suggest is that you remove the 4090 card and then re-seat it firmly. You'd be surprised how many times this simple action solves the problem. The slightest bit of dust or dirt between a card pin and the slot can cause all sorts of issues.

If that doesn't help then your next best option is to remove the 4090 and plug the monitor into the motherboard port and use the Radeon graphics iGPU on the CPU and see whether it crashes or BSODs then. If it's stable without the 4090 installed then you know for certain that the problem does lie with the 4090 or the driver. You might combine this test with removing and re-seating the 4090.

The best way to check whether the 4090 or the driver is at fault is to download the four most recent driver versions for that card from the Nvidia website. Also download DDU. Use DDU to uninstall the existing driver and then manually install the most recent driver. If it crashes or BSODs use DDU again to remove that driver and then manually install the next most recent driver. Keep doing this until you either find a driver where it's stable or it BSODs/crashes on every driver version. If it fails on the four most recent driver versions then the problem is likely the 4090 card.
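
To keep track of where you are in that driver test, the stopping rule above can be written down as a tiny decision function. It's a sketch only: the function name and the version labels are placeholders, not real Nvidia driver version numbers, and "card" here means "likely the card", not proof.

```python
from typing import Dict, Optional, Tuple


def diagnose(results: Dict[str, bool], versions_to_try: int = 4) -> Tuple[str, Optional[str]]:
    """Apply the driver test rule to the results recorded so far.

    results maps a driver version label to True if the system was stable on
    it, False if it crashed or BSODed. Each version is assumed to have been
    installed cleanly after a DDU uninstall.
    """
    for version, stable in results.items():
        if stable:
            # Stable on at least one driver: the other drivers are suspect.
            return ("driver", version)
    if len(results) >= versions_to_try:
        # Crashed on every version tested: the card is the likely culprit.
        return ("card", None)
    # Not enough versions tested yet to say either way.
    return ("inconclusive", None)


# Placeholder labels, newest first; fill in what you actually tested.
print(diagnose({"newest": False, "second-newest": False}))
```

With only two failing versions recorded it reports "inconclusive", so keep going until you have either a stable version or four failures.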
 
Check if the temperatures are within the limits.
Using SpeedFan, log the GPU/CPU temperatures and fan speeds to a logfile.

There are also a new BIOS and new chipset drivers available.
Also, you may want to disable the RGB lights on your graphics card as they may cause problems.
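
If SpeedFan doesn't work out, a rough alternative for the GPU side is a small Python script that polls nvidia-smi (installed with the Nvidia driver) and appends each reading to a file. This is not SpeedFan and it does not cover CPU temperatures. It's a sketch that assumes a single GPU and that nvidia-smi is on the PATH; the function names and the log path are made up. Each sample is flushed to disk straight away, so the last readings before a freeze or BSOD are more likely to survive, although that can't be guaranteed.

```python
import os
import subprocess
import time
from typing import Dict, Optional

# Fields requested from nvidia-smi via --query-gpu.
QUERY_FIELDS = ["timestamp", "temperature.gpu", "fan.speed", "power.draw"]


def parse_sample(line: str) -> Dict[str, object]:
    """Turn one CSV line from nvidia-smi into a dict.

    Numeric fields become floats; anything non-numeric (for example "[N/A]"
    for a fan that isn't reported) becomes None.
    """
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != len(QUERY_FIELDS):
        raise ValueError(f"unexpected sample: {line!r}")
    sample: Dict[str, object] = {"timestamp": parts[0]}
    for field, raw in zip(QUERY_FIELDS[1:], parts[1:]):
        try:
            sample[field] = float(raw)
        except ValueError:
            sample[field] = None
    return sample


def log_gpu(path: str, interval_s: float = 1.0, samples: Optional[int] = None) -> None:
    """Append one nvidia-smi reading per interval to the file at path."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    count = 0
    with open(path, "a", encoding="utf-8") as log:
        while samples is None or count < samples:
            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
            log.write(result.stdout.strip() + "\n")
            # Push the line out of Python's and the OS's buffers right away.
            log.flush()
            os.fsync(log.fileno())
            count += 1
            time.sleep(interval_s)
```

Start it with something like log_gpu("gpu_log.csv") before running the stress test, then look at the last few lines of the file after the crash.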
 
Let me start by saying I was using this GPU in a previous build with no crashes. It’s now in a water block as well as using a riser cable. I used the stock cable that came with the T1 and after some crashing I replaced it. I thought this fixed my problems but now obviously I’m crashing more. I’ve seen loads and loads of people expressing issues with riser cables, and I can try to get another one to test or bypass it altogether. My only issue is just the way the build is, it would be very difficult for me to bypass the riser cable. But I think this would have to be something to try as well, bypass the riser and go straight to the motherboard.

I’ve reseated the riser cable/GPU multiple times. I have also used DDU multiple times, but only with the most recent driver. I can try it with some older versions too.
 
I’ll try these and update with the results.

Does SpeedFan log the temps even in the event of a crash? For the most part, my temps have been fine from what I see.

My GPU is in a water block so lighting is not connected.
 
Okay, so I have been replicating the crash by running the 3DMark Steel Nomad stress test. It consistently crashes on loop 2-4 of the 20 loops it should complete.

I unplugged my GPU and used my motherboard's HDMI port. I ran the same Steel Nomad stress test and it went all the way through, very slowly of course.

I have also performed a DDU and install of the two most recent drivers and also the oldest one I was able to download, from around March 2024. I am still getting crashes after these steps.

I saw some post about going into ‘Debug Mode’ in Nvidia Control Panel, this actually let the test run a bit longer, to around loop 5-6 but still crashed.

My next steps will be to bypass the riser cable and connect my GPU directly to the motherboard. If this works, I’ll get a new riser cable. If still crashing, I’ll look into RMA’ing the card.
 