nvlddmkm.sys CONSTANT crash

I have disabled some of the services, mainly NVIDIA-based ones. Got yet another new bugcheck error today too :(

0x144: BUGCODE_USB3_DRIVER - Wondered if that has anything to do with my dying SSD. Formatted and removed it now.

Literally considering buying a new CPU, GPU and RAM at this point icl lol
 
Hmm. I thought I did but it is still here. Uninstalling now.

Edit: it's gone. My previous case used to be NZXT, kept it as I wanted to monitor temps. Know of any good monitoring apps that won't potentially crash my system?
 
Small update - still crashing with new errors in event viewer and reliability history.

Event viewer: Dwminit - The Desktop Window Manager process has exited. (Process exit code: 0x0000042b, Restart count: 1, Primary display device ID: NVIDIA GeForce RTX 2080 SUPER)
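
For anyone reading along, the exit code in that event is just a hexadecimal number. Below is a minimal Python sketch that pulls it out of the event text and converts it to decimal. The function name `extract_exit_code` is made up for illustration. The decimal value, 1067, matches the Win32 system error code ERROR_PROCESS_ABORTED ("The process terminated unexpectedly"), which only says that the process died, not why.

```python
import re

# Event text copied from the Dwminit entry above.
event_text = (
    "The Desktop Window Manager process has exited. "
    "(Process exit code: 0x0000042b, Restart count: 1, "
    "Primary display device ID: NVIDIA GeForce RTX 2080 SUPER)"
)


def extract_exit_code(text: str) -> int:
    """Return the 'Process exit code' field of the event text as an integer."""
    match = re.search(r"Process exit code: (0x[0-9a-fA-F]+)", text)
    if match is None:
        raise ValueError("no exit code found in event text")
    return int(match.group(1), 16)  # parse the hexadecimal string


code = extract_exit_code(event_text)
print(code)  # decimal form, for looking up in the Win32 system error code list
```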

About 30 seconds before that crash there was an Application Hang error, saying:
The program explorer.exe version 10.0.26100.3323 stopped interacting with Windows and was closed. To see if more information about the problem is available, check the problem history in the Security and Maintenance control panel.

That's a new one.

In reliability history, a new Live Kernel Dump.

VIDEO_MINIPORT_BLACK_SCREEN_LIVEDUMP (1b8)

This time caused by dxgkrnl.sys.

I had found a video online saying to move the nvlddmkm.sys file to the drivers folder in system32, which I did. This seemed to make the crash behave differently, in that the keyboard Num Lock, Caps Lock, etc. responded even though I could do nothing on screen. But this could have been a coincidence. Still running a clean boot with just a few services such as iCUE running (otherwise my fans are so loud it drives me nuts).

Still thinking it's the GPU, what do you think? I may ditch NVIDIA for AMD. Currently looking at the RX 7800XT as a replacement.
 
I would try to clean the graphics card thoroughly, also replacing the thermal paste.
I think it costs 20-40 pounds in a well-equipped shop, if you don't want to do it yourself.
It should also reduce the noise.

You can however buy another graphics card later, if the current one still doesn't work properly...
 
I'm sorry I'm late to this, I have a lot going on at the moment.

The triage analysis in those two dumps does finger nvlddmkm.sys as you rightly say, but the triage analysis isn't always right. There is another feature of both dumps that concerns me; a checksum mismatch...
Code:
*** WARNING: Check Image - Checksum mismatch - Dump: 0xc2ed94, File: 0xc31fe0 - W:\SymCache\ntkrnlmp.exe\783A7A83144f000\ntkrnlmp.exe
Both are for the ntkrnlmp.exe module, which is the Windows multiprocessor kernel. A checksum mismatch error means that the copy of the module in memory (in the dump) has a different computed checksum to the copy on the Microsoft servers. That usually indicates either a RAM issue or a module corruption. (I also see this in the additional dump you uploaded, so it's consistent).
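
To illustrate why a checksum mismatch points at corruption, here is a minimal Python sketch. It is an assumption-laden toy, not the routine the debugger uses: `folded_checksum` is a simple 16-bit folding sum plus length, and the `on_disk` bytes are a made-up stand-in for a module image. It shows that flipping a single bit, as a flaky RAM cell might, is enough to change the computed value.

```python
def folded_checksum(data: bytes) -> int:
    """Toy checksum: 16-bit folding sum plus length (illustrative only)."""
    total = 0
    padded = data + b"\x00" if len(data) % 2 else data  # pad to whole words
    for i in range(0, len(padded), 2):
        word = padded[i] | (padded[i + 1] << 8)  # little-endian 16-bit word
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total + len(data)


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    damaged = bytearray(data)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


# Hypothetical stand-in for a module image; these are not real kernel bytes.
on_disk = bytes(range(256)) * 16
in_memory = flip_bit(on_disk, byte_index=1000, bit=3)

print(hex(folded_checksum(on_disk)))
print(hex(folded_checksum(in_memory)))
print("mismatch:", folded_checksum(on_disk) != folded_checksum(in_memory))
```

The sketch cannot tell you where the damage happened (RAM, disk, or the file itself), which is the same limitation the warning in the dump has.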

The easiest way to resolve a corruption within the ntkrnlmp.exe module is to run Windows Reset, keeping all apps and data. This will refresh all the Windows libraries but retain all installed drivers. Back up everything first though - twice.

You also mention a WHEA error for a Cache Hierarchy Error - these can also be caused by flaky RAM. It might be that the graphics driver/card indications are red herrings and you have some sort of rare RAM issue. I see you've run Memtest86 for 8 passes but no memory tester can find 100% of potential issues so I'd suggest that you remove one RAM stick for a week or so and see whether these issues continue. Swap the sticks over and run on just the other for a week to check. Be sure the single stick is in the correct slot (typically A2).
 
Hi Ubuysa

Thanks for the detailed reply. I thought it would be worth mentioning that both my RAM and GPU are the oldest original components still in my system. These are now over 5 years old and my system is used every day. My CPU and motherboard died mid 2021 and were replaced. I'm not sure what the lifespan of these components is but perhaps they are wearing out?

Another thing that has just popped into my head. I can't remember if it was 2022 or 2023, but one day I switched my system on and, as soon as I pushed the power button, there was a spark near the RAM sticks and the system immediately shut down. However, I pushed the power button again and the system was perfectly fine.
 
A spark! Wow, never seen actual sparks myself. That would ring all the alarm bells for me. I really would try removing one RAM stick at a time.

One other thing; your RAM is not on the QVL for the CPU/motherboard. That's often not a worry, and clearly the RAM has been working fine for some time, but non-QVL RAM is always a worry when one is looking at potential RAM issues. If you ever do decide to replace the RAM I would select RAM that is on the QVL.

All that said, you're right that the dumps do point strongly at the GPU or driver. Since you've already done a lot of troubleshooting in that area I thought it might help to look elsewhere...
 
Thanks for the RAM info. Not sure if you saw up there but my system totally died today. I was getting help on another forum but the guy helping chose to give up when he realised I'm inexperienced, so that's nice.

Anyhow, I have bought a new PSU which will be arriving later and hopefully that solves all my issues.
 
The Qualified Vendor List (QVL) is a list of RAM that has been tested and verified as compatible with the motherboard and CPU combination. Other RAM may work perfectly fine, but IMO it's always wise to stick with QVL RAM because you know that it's compatible.

It's also not wise to seek help from two different places at the same time, because neither site knows what the other site has asked you to do. Doing this also introduces changes that we don't know about.

Hardware is not my strongest point (though I build my own PCs) but my advice would be to reduce the system to the minimal possible hardware config. Remove all but one RAM stick. Remove all external devices, except mouse, keyboard, and one monitor; also use a wired mouse and keyboard if possible. Remove all PCIe cards that you can do without. Remove all storage drives that you can do without; ideally you just want the system drive in there. You want to see whether it's stable with that minimal hardware config.

From what you were saying about sparks I fear that the root cause may be the motherboard. I would get a strong light and a magnifying glass and have a VERY close look at the area of the motherboard where you think you saw the spark - both sides of the board. You're looking for discolouration, black marks, soot even. If you see anything that looks off, try photographing it, as close as you can get and under a strong light, and post the photos here.
 
Thank you so much for sticking with me. The system is actually getting absolutely no power whatsoever. Before, I would be able to charge my Xbox controller, or my USB would light up when connected, which is why my first thought was the PSU.

Also the spark was at least 2 years ago, so I'm surprised it would have taken so long to die. I will have a close look at the motherboard later.
 
No lights at all on the motherboard? It's likely the motherboard then. Give it a good blow out with a blower - not your mouth! - paying particular attention to the RAM and PCIe sockets. Check that all power connectors are good. I don't know what else to suggest.
 
