Sporadic DRIVER_IRQL_NOT_LESS_OR_EQUAL and very occasional random restart without error

Tankut (Member, joined Oct 24, 2024, 7 posts)

Hi!

Sometimes the machine BSODs a few times a day while I'm working; sometimes nothing happens for a week. I am not able to replicate it: it can happen when the system is stressed (rendering 3D, running a sim) or when it is just sitting idle with a browser or a text editor open.

My PC is a desktop I had built in 2019, but I've upgraded components over time (system disk M.2, CPU, GPU; only the RAM and motherboard remain from the initial build).

I'm running Windows 10 Pro retail (22H2 build 19045.5011), kept up to date (it says installed on 6/6/2020, though to my recollection it should be older). No additional antivirus software besides Windows Defender.

Quick specs:

motherboard: AMD X470 (AM4) (Gigabyte Aorus Ultra)
cpu: AMD Ryzen 9 5950X
ram: 64 GB (Corsair DDR4 2133 16GB x 4)
gpu: NVIDIA GeForce RTX 4070 SUPER
system disk: Samsung SSD 990 PRO w/ Heatsink 1TB
psu: Thermaltake Toughpower XT 750W
others: a work drive (Samsung M.2), a few internal mechanical drives, Wacom Intuos 4, Logitech webcam, Bluetooth dongle, powered USB hub, printer/scanner, a couple of monitors.

There's no over/underclocking. The BIOS was updated when I changed the CPU. The PSU is not stressed. Cooling is adequate; the case is a big old Thermaltake Armor+ ironmongery thing with good ventilation.

I suspect a buggy driver somewhere is causing this, but I will run MemTest86 and Driver Verifier after I post this.


Thank you very much for any pointers you might share :)
 


Hello and welcome to the forum!

The dumps are all the same, and the BSOD seems to happen as the processor comes out of the idle state. In addition, all the dumps occur on either processor #2 or processor #3, and I can see that you have SMT (AMD's equivalent of hyperthreading) enabled, since there are 32 logical processors. Logical processors #2 and #3 are both on the same physical core (I would imagine), and that's a useful clue.
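
To make that numbering guess concrete, here is a minimal Python sketch. It assumes (and this is only an assumption, not something verified on this machine) that SMT siblings are numbered consecutively, two logical processors per physical core. The names `physical_core` and `share_core` are made up for illustration.

```python
# Assumption: logical processors are enumerated so that SMT siblings are
# adjacent (0-1 on core 0, 2-3 on core 1, ...). Not verified for this system.
THREADS_PER_CORE = 2


def physical_core(logical_cpu: int, threads_per_core: int = THREADS_PER_CORE) -> int:
    """Map a logical processor number to its (assumed) physical core index."""
    if logical_cpu < 0:
        raise ValueError("logical processor number must be non-negative")
    return logical_cpu // threads_per_core


def share_core(cpu_a: int, cpu_b: int) -> bool:
    """True if both logical processors land on the same assumed physical core."""
    return physical_core(cpu_a) == physical_core(cpu_b)
```

Under that assumption, `share_core(2, 3)` is true while `share_core(1, 2)` is false, which is why crashes confined to #2 and #3 point at a single physical core.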

Some CPUs, and AMD seem to be more susceptible to this, develop problems changing from the low-power idle C-State into the high-power running C-State. My first thought from looking at the dumps is that this could potentially be your problem; it's certainly the first thing to look at. The test (and the permanent workaround, since this cannot be fixed) is to go into the BIOS settings and disable C-States for all processors/cores. That will stop any processor/core from entering the low-power state when it goes idle and will thus eliminate the power transition problem when it becomes active again. The only downsides are a tad more heat from the CPU when idle, which a decent cooler should easily handle, and slightly higher idle power consumption.

If it still BSODs after disabling C-States then please run the Sysnative data collection app again and upload the new output.
 
Hi!

I agree with Ubuysa: the first (and simplest) thing to do is to disable C-States.

Also, that BIOS version is not listed on the manufacturer's support webpage for your motherboard. Was it a beta?
Newer BIOS versions have been released; the latest is F65d (Sep 02, 2024).

Your RAM model (CMK64GX4M4B3000C15, CMK32GX4M2B3000C15 or CMK16GX4M1B3000C15, i.e. 4 sticks of 16GB Corsair Vengeance DDR4 C15 3000MHz) is not listed in your motherboard's QVL for a Vermeer CPU like yours.
It is listed in your motherboard's QVL for a Matisse CPU.
What CPU did you have before the AMD Ryzen 9 5950X?
The kit's specs on corsair.com say Memory Compatibility: Intel (100 Series, 200 Series, 300 Series, 400 Series, 500 Series, X99, X299).
Still, you have been lucky enough to see it work well over the past months. Did you change the CPU in March?
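
As a rough illustration of how those part numbers read, here is a minimal Python sketch. The field layout is an assumption inferred only from the three part numbers above and the "4 sticks of 16GB ... C15 3000MHz" description; it is not an official Corsair specification, and `KIT_PATTERN` and `parse_kit` are hypothetical names.

```python
import re

# Assumed layout: CM<family><kit GB>GX<DDR gen>M<module count><rev><speed>C<CAS>
# Inferred from the part numbers quoted in this thread, not from vendor docs.
KIT_PATTERN = re.compile(
    r"^CM(?P<family>[A-Z])(?P<kit_gb>\d+)GX(?P<ddr>\d)"
    r"M(?P<modules>\d)(?P<rev>[A-Z])(?P<speed>\d{4})C(?P<cas>\d+)$"
)


def parse_kit(part_number: str) -> dict:
    """Split a kit part number into the fields assumed above."""
    match = KIT_PATTERN.match(part_number)
    if match is None:
        raise ValueError(f"unrecognised part number: {part_number}")
    kit_gb = int(match["kit_gb"])
    modules = int(match["modules"])
    return {
        "kit_gb": kit_gb,
        "modules": modules,
        "gb_per_module": kit_gb // modules,
        "ddr_generation": int(match["ddr"]),
        "rated_speed": int(match["speed"]),
        "cas_latency": int(match["cas"]),
    }
```

With this reading, all three part numbers come out as 16GB modules rated at 3000 with CAS 15, differing only in how many sticks are in the kit.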
 
Thank you.

The BIOS version is F63c, dated 07/20/2022.

Yes, March is probably when I changed the CPU; I used to have a Ryzen 7 2700X.
 
I wasn't suspecting RAM failure/corruption (glitching, if the word is appropriate), since I know from previous, purely empirical experience how that manifests. I wasn't able to do the RAM test today due to some family issues; I will check that off as soon as possible.
 
I left MemTest86 running overnight with default parameters (all tests on except the last experimental option), 4 passes. When I last looked, it was nearing the end of pass 2 with 0 errors. In the morning the computer had shut down. At this point I hadn't changed the C-State setting yet, so that may be the culprit.

-----------

In BIOS there's a "Global C-State Control" setting, default mode being "Auto". I set this to "Disabled" as you recommended.

I logged into Windows to see if there would be another BSOD. I ran the DISM and SFC tools, as recommended on some tech site, while browsing the web (very low CPU/RAM stress). I ran into a BSOD a couple of times in half an hour. I don't think this is related to the DISM tools, since I was able to run them after a reboot with no issues.

DISM /Online /Cleanup-Image /CheckHealth
DISM /Online /Cleanup-Image /ScanHealth
DISM /Online /Cleanup-Image /RestoreHealth
SFC /scannow

There were no errors or flags.
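
For anyone wanting to repeat that sequence in one go, here is a minimal Python sketch that runs the same four commands in order. It is not how I ran them (I typed them by hand); the name `run_all` and the choice to stop at the first non-zero exit code are my own assumptions, and it would need an elevated prompt on Windows.

```python
import subprocess

# The same four commands as above, in the same order.
COMMANDS = [
    ["DISM", "/Online", "/Cleanup-Image", "/CheckHealth"],
    ["DISM", "/Online", "/Cleanup-Image", "/ScanHealth"],
    ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],
    ["SFC", "/scannow"],
]


def run_all(commands=COMMANDS, runner=subprocess.run):
    """Run each command in order and collect (command line, exit code).

    Assumption: a non-zero exit code is treated as a reason to stop early.
    `runner` can be swapped out so the logic can be tried without Windows.
    """
    results = []
    for command in commands:
        completed = runner(command)
        results.append((" ".join(command), completed.returncode))
        if completed.returncode != 0:
            break
    return results
```

Calling `run_all()` returns one `(command line, exit code)` pair per command that was actually run.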

-----------

BSODs (PAGE_FAULT_IN_NONPAGED_AREA) became frequent with "Global C-State Control" set to "Disabled". In the BIOS I switched this to "Enabled".

-----------

With "Global C-State Control" set to "Enabled" there was one more BSOD after 3 hours (less frequent, but this may mean nothing).

There are a number of other settings in the BIOS that I did not touch (I don't know which are relevant; default values in parentheses):
AMD Cool&Quiet function (Enabled)
PPC Adjustment (P State 0)
Power Supply Idle Control (Auto)
CCD Control (Auto)
CPPC (Auto)
CPPC preferred cores (Auto) (possible options are "1+0 ONE" up to "6+0 SIX")
Downcore Control (Auto)
SMT Mode (Auto)

-----------

I ran the "WhoCrashed Home Edition" software to analyze the dump file (its output report is attached). I also ran the SysnativeFileCollectionApp again; that output is attached too.

Do you have any suggestions?

Thank you very much for all the help :)
 


The RAM test had been running for a few hours, with 1 pass done and 0 errors, when I went to bed.
The PC was off in the morning, so the RAM test probably hadn't finished. I have no idea why; it could even have been a power outage, as I don't have a UPS. The RAM test is running again right now.
 