Obscure BSOD over the past few months

Just an FYI, not sure if I'm allowed to post links to other sites, but I created a thread over at the NVIDIA forums for help and was advised to try switching from the NVIDIA DCH driver type to Standard. After reading some other threads and remarks regarding DCH I am now extremely keen to try the Standard drivers, and will be doing that shortly to see how it goes.

https://www.nvidia.com/en-us/geforc.../obscure-bsod-over-the-past-number-of-months/
 
Just an FYI, not sure if I'm allowed to post links to other sites

It's fine, we aren't that kind of forum. As long as it's not advertising or spam, it's generally acceptable.

Anyhow, thanks for letting us know, please keep us updated!
 
Awesome. I used DDU in safe mode and installed the 431.60 driver, however that brought back the odd double mouse cursor issue, so I then installed 431.68 on top of that. I have not had any crashes since Saturday/Sunday.
 
So it just did it again 3 hours ago while idle, doing nothing. I have posted an updated message on the NVIDIA forum and will have to see what else I can try.
 
Still having issues. I have updated the NVIDIA forum thread but will paste the update here as well for visibility.

At the moment I'm running each RAM stick through an 8-pass Memtest86+ run; once done I will try installing the 431.60 driver again.

Regarding the monitors, below is what I have connected to the RTX 2080:
2 x Dell S2716DG 2560x1440 @ 144Hz - DisplayPort
1 x Samsung 4K Smart TV - HDMI
1 x Oculus Rift - DisplayPort (the Rift has an HDMI connection but I'm using an HDMI to DP adapter to connect it to the video card)

Regarding the RAID, I'm using an LSI MegaRAID 9271-4i RAID card, so no onboard RAID function.

I'll try turning off Fast Boot/Fast Startup, as I believe Fast Boot is currently enabled. I've always turned off hibernation.

Regarding the apps:
VMware - I only installed it the other day; there was no VMware software installed when it crashed twice in the past few weeks.
Oculus - I am considering uninstalling it and disconnecting the 3 USB sensors and the Rift's USB/DP connection to see if it helps. I'm having difficulty remembering, but I'm fairly sure that months back when troubleshooting I had the Oculus disconnected and it still crashed. Will try again anyway.
ESET - Could be, I'd rather not remove this though.

PCI connected hardware:
RTX 2080 - PCIE x16
ASUS Essence STX II - PCIE x1
LSI MegaRAID 9271-4i - PCIE x16
Audient iD4 is over USB

There was something odd which I noticed but have not had time to look into. I'm running Memtest86+ off a USB stick; as soon as it has finished loading it is supposed to start a countdown timer and auto-test with 4 passes. When I had only 1 of the RAM sticks I wanted to test installed, twice Memtest86+ finished loading but did not start the countdown timer, so I had to start it manually. That ran overnight. This morning I removed that stick, put another RAM stick in one of the right bank slots, and started Memtest86+, and this time it auto-started the countdown timer and ran. Not sure if that's something or nothing; once I've finished testing all the RAM sticks I'll test that issue specifically.
 
There was something odd which I noticed but have not had time to look into. I'm running Memtest86+ off a USB stick; as soon as it has finished loading it is supposed to start a countdown timer and auto-test with 4 passes. When I had only 1 of the RAM sticks I wanted to test installed, twice Memtest86+ finished loading but did not start the countdown timer, so I had to start it manually. That ran overnight. This morning I removed that stick, put another RAM stick in one of the right bank slots, and started Memtest86+, and this time it auto-started the countdown timer and ran. Not sure if that's something or nothing; once I've finished testing all the RAM sticks I'll test that issue specifically.

That is very odd, I wonder if it is a possible bug in their software? I haven't had anyone mention that before though.

Oculus - I am considering uninstalling it and disconnecting the 3 USB sensors and the Rift's USB/DP connection to see if it helps. I'm having difficulty remembering, but I'm fairly sure that months back when troubleshooting I had the Oculus disconnected and it still crashed. Will try again anyway.

Please do, that would be a good troubleshooting step to take. Have you ensured that your PSU is able to handle all the additional devices?
 
Seems like it might be a bug in their software. Once I took the RAM stick out of the slot where Memtest86+ did auto-start and put it in one of the left slots, it did the same thing. Not sure what it is, but it's probably not related.

So I purchased new RAM on Friday last week and installed it on Saturday. This RAM is on the motherboard's QVL and was working fine until about an hour ago: same crash. This time I had configured Windows not to reboot automatically, but because the displays were asleep the machine just did a hard hang, with the displays' backlights on but no image, and I had to reboot via the button.

Have now uninstalled the Oculus software and disconnected the Rift's DP connector and the 3 USB sensors, and also disconnected my USB camera. Will see how it goes. I'm starting to think it might be the video card at this point.


14: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff8022c6a4bc2, The address that the exception occurred at
Arg3: fffff480a0e96f38, Exception Record Address
Arg4: fffff480a0e96780, Context Record Address

Debugging Details:
------------------

*** WARNING: Unable to verify timestamp for nvlddmkm.sys
*** WARNING: Unable to verify timestamp for win32k.sys

KEY_VALUES_STRING: 1

Key : AV.Dereference
Value: NullPtr

Key : AV.Fault
Value: Read


PROCESSES_ANALYSIS: 1

SERVICE_ANALYSIS: 1

STACKHASH_ANALYSIS: 1

TIMELINE_ANALYSIS: 1


DUMP_CLASS: 1

DUMP_QUALIFIER: 400

BUILD_VERSION_STRING: 18362.1.amd64fre.19h1_release.190318-1202

DUMP_TYPE: 2

BUGCHECK_P1: ffffffffc0000005

BUGCHECK_P2: fffff8022c6a4bc2

BUGCHECK_P3: fffff480a0e96f38

BUGCHECK_P4: fffff480a0e96780

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

FAULTING_IP:
nvlddmkm+9b4bc2
fffff802`2c6a4bc2 488b01 mov rax,qword ptr [rcx]

EXCEPTION_RECORD: fffff480a0e96f38 -- (.exr 0xfffff480a0e96f38)
ExceptionAddress: fffff8022c6a4bc2 (nvlddmkm+0x00000000009b4bc2)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: 0000000000000000
Attempt to read from address 0000000000000000

CONTEXT: fffff480a0e96780 -- (.cxr 0xfffff480a0e96780)
rax=0000000000000000 rbx=000000280000001d rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000001 rdi=ffffe30f32229000
rip=fffff8022c6a4bc2 rsp=fffff480a0e97170 rbp=ffffe30f32236c58
r8=0000000000000000 r9=0000000000000000 r10=0000000000000001
r11=fffff780000003b0 r12=ffffe30f30b54938 r13=ffffe30f30b4c000
r14=0000000000000000 r15=ffffe30f32076001
iopl=0 nv up ei ng nz na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00050286
nvlddmkm+0x9b4bc2:
fffff802`2c6a4bc2 488b01 mov rax,qword ptr [rcx] ds:002b:00000000`00000000=????????????????
Resetting default scope

CPU_COUNT: 18

CPU_MHZ: db0

CPU_VENDOR: GenuineIntel

CPU_FAMILY: 6

CPU_MODEL: 55

CPU_STEPPING: 4

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: NULL_DEREFERENCE

PROCESS_NAME: System

CURRENT_IRQL: 0

FOLLOWUP_IP:
nvlddmkm+9b4bc2
fffff802`2c6a4bc2 488b01 mov rax,qword ptr [rcx]

BUGCHECK_STR: AV

READ_ADDRESS: fffff80211d733b8: Unable to get MiVisibleState
Unable to get NonPagedPoolStart
Unable to get NonPagedPoolEnd
Unable to get PagedPoolStart
Unable to get PagedPoolEnd
0000000000000000

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE_STR: c0000005

EXCEPTION_PARAMETER1: 0000000000000000

EXCEPTION_PARAMETER2: 0000000000000000

ANALYSIS_SESSION_HOST: ALX

ANALYSIS_SESSION_TIME: 10-21-2019 22:54:24.0121

ANALYSIS_VERSION: 10.0.18362.1 amd64fre

LAST_CONTROL_TRANSFER: from ffffe30f32236c58 to fffff8022c6a4bc2

STACK_TEXT:
fffff480`a0e97170 ffffe30f`32236c58 : ffffe30f`00000000 00000000`00000000 fffff802`2bd84e00 00000000`00000000 : nvlddmkm+0x9b4bc2
fffff480`a0e97178 ffffe30f`00000000 : 00000000`00000000 fffff802`2bd84e00 00000000`00000000 ffffe30f`306b79a0 : 0xffffe30f`32236c58
fffff480`a0e97180 00000000`00000000 : fffff802`2bd84e00 00000000`00000000 ffffe30f`306b79a0 ffffe30f`306b79a0 : 0xffffe30f`00000000


THREAD_SHA1_HASH_MOD_FUNC: d79c3f9e9541b50dff558588ee91b494a55f2aae

THREAD_SHA1_HASH_MOD_FUNC_OFFSET: aec45d7f8a6ffa744cacd570be4de2306ffa003f

THREAD_SHA1_HASH_MOD: d79c3f9e9541b50dff558588ee91b494a55f2aae

FAULT_INSTR_CODE: ff018b48

SYMBOL_STACK_INDEX: 0

SYMBOL_NAME: nvlddmkm+9b4bc2

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nvlddmkm

IMAGE_NAME: nvlddmkm.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 5d8d48d8

STACK_COMMAND: .cxr 0xfffff480a0e96780 ; kb

BUCKET_ID_FUNC_OFFSET: 9b4bc2

FAILURE_BUCKET_ID: AV_nvlddmkm!unknown_function

BUCKET_ID: AV_nvlddmkm!unknown_function

PRIMARY_PROBLEM_CLASS: AV_nvlddmkm!unknown_function

TARGET_TIME: 2019-10-21T19:45:05.000Z

OSBUILD: 18362

OSSERVICEPACK: 418

SERVICEPACK_NUMBER: 0

OS_REVISION: 0

SUITE_MASK: 272

PRODUCT_TYPE: 1

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

OSEDITION: Windows 10 WinNt TerminalServer SingleUserTS

OS_LOCALE:

USER_LCID: 0

OSBUILD_TIMESTAMP: unknown_date

BUILDDATESTAMP_STR: 190318-1202

BUILDLAB_STR: 19h1_release

BUILDOSVER_STR: 10.0.18362.1.amd64fre.19h1_release.190318-1202

ANALYSIS_SESSION_ELAPSED_TIME: 899f

ANALYSIS_SOURCE: KM

FAILURE_ID_HASH_STRING: km:av_nvlddmkm!unknown_function

FAILURE_ID_HASH: {7eea5677-f68d-2154-717e-887e07e55cd3}

Followup: MachineOwner
---------
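
For anyone reading the dump above, a few of its raw hex values can be decoded by hand. The following is only a rough Python sketch using numbers copied from the log; the function names are made up for illustration, and it assumes DEBUG_FLR_IMAGE_TIMESTAMP is an ordinary PE link timestamp (seconds since 1970-01-01 UTC), which is usual for driver images but is not stated in the dump itself.

```python
from datetime import datetime, timezone

# Values copied from the !analyze -v output above.
ARG1 = 0xFFFFFFFFC0000005            # Arg1: exception code, sign-extended to 64 bits
FAULT_ADDRESS = 0xFFFFF8022C6A4BC2   # Arg2 / FAULTING_IP
MODULE_OFFSET = 0x9B4BC2             # offset from "nvlddmkm+9b4bc2"
IMAGE_TIMESTAMP = 0x5D8D48D8         # DEBUG_FLR_IMAGE_TIMESTAMP


def ntstatus_from_arg(arg: int) -> int:
    """Keep only the low 32 bits, which hold the NTSTATUS code."""
    return arg & 0xFFFFFFFF


def module_base(fault_address: int, offset: int) -> int:
    """Load address of the module = faulting address minus offset into the module."""
    return fault_address - offset


def link_time(timestamp: int) -> datetime:
    """Assumption: the timestamp is seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


exception_code = ntstatus_from_arg(ARG1)
base = module_base(FAULT_ADDRESS, MODULE_OFFSET)
built = link_time(IMAGE_TIMESTAMP)

print(f"exception code: 0x{exception_code:08x}")
print(f"nvlddmkm base:  0x{base:016x}")
print(f"driver built:   {built:%Y-%m-%d %H:%M:%S} UTC")
```

Under that assumption the driver image dates from late September 2019, which can be checked against the installed driver version to confirm which nvlddmkm.sys build actually crashed.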
 
Have now uninstalled the Oculus software and disconnected the Rift's DP connector and the 3 USB sensors, and also disconnected my USB camera. Will see how it goes. I'm starting to think it might be the video card at this point.

Yes, unfortunately it seems to be. Have you tried testing it with FurMark?

FurMark Display Card Stress Test

Do you know if the motherboard has an onboard graphics chip which you could use for troubleshooting purposes?
 
I did test the card with FurMark a few times a while back, overnight, with no problems at all. It's watercooled so the temps never went above 50°C or so. I managed to get another video card to test with, so I'm doing that now.

I've restored the image I created when I built the machine, used DDU to uninstall the existing NVIDIA drivers, shut down the machine, put the new video card in and installed the latest NVIDIA Standard drivers. Will see how that goes.

I want to narrow things down one by one instead of changing too many things at once. After the video card, a member on the NVIDIA forums suggested changing the fan-out cable I have on my RAID card, and then possibly RMA'ing the RAID card. I will probably try moving the RAID card to a different PCIe slot before an RMA.

The motherboard is an X299 chipset board, so unfortunately it does not have onboard video.
 
Yes please, test the other PCIe slots on the motherboard too, I've seen rare occasions where the slot was causing the issue and not the card itself.
 
