I'm still looking... I'm just trying to make sense of what I'm seeing.
What you wrote about the GPU made me look at the GPU data in the traces, and there are some strange things going on in the traces from your system. There's a GPU DMA Operations dataset in XMA, and all of your traces show data for only a few hundred milliseconds at the beginning of the trace before it abruptly stops. The same dataset in your friend's trace shows data throughout the trace. I thought it might be a difference between a GTX 1080 and an RTX 2070 Super, so I captured a trace on a system I have with a GTX 1080, and mine captures GPU DMA Operations data throughout the trace, too.
Even though the DMA operations data stops being captured, other data about the GPU is still getting captured, and it shows one processor on the GPU pegged at 100 percent utilization through almost the entire trace. The following is just one sample point on the graph:
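If you want to check where that data stops without eyeballing the graph, here is a rough sketch. It assumes the GPU DMA Operations table has been exported to CSV with a timestamp column in nanoseconds; the column name `Time (ns)`, the helper name `coverage_fraction`, and the sample rows are all made up for illustration and are not taken from your traces.

```python
import csv
import io

def coverage_fraction(csv_text, trace_duration_ns, time_column="Time (ns)"):
    """Return (last_timestamp_ns, fraction of the trace covered by the data).

    csv_text is the exported table as a string; time_column is a
    hypothetical column name and may differ in a real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    timestamps = [int(row[time_column]) for row in reader]
    if not timestamps:
        return None, 0.0
    last = max(timestamps)
    return last, last / trace_duration_ns

# Made-up sample: events only in the first 400 ms of a 10 s trace.
sample = "Time (ns),Operation\n1000000,copy\n250000000,copy\n400000000,copy\n"
last_ns, fraction = coverage_fraction(sample, trace_duration_ns=10_000_000_000)
print(last_ns, round(fraction, 3))
```

A fraction close to 1.0 would mean the data runs through the whole trace; a very small value would match the early cutoff described above.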
[CODE]Unit 901 of 9699
Adapter 2 -Node 10 -Engine 1
Time(ns) 4500000000
Event Local Time 09:49:28:480.6818 06-29-2022
Event Time 09:49:28:480.6818 06-29-2022
CPU 2
Usage (%) 100
[/CODE]
When I hover over the GPU utilization line for that GPU processor, it shows a callstack like this along the entire trace:
[CODE]0 / 0
Callstack at time 6735290500 - I/O Priority: Very low - System (PID-4) - Thread-240 (ntoskrnl.exe!ExpWorkerThread) - Write - \C:\$BitMap - Unit:# 1
ntoskrnl.exe!_output_l
ntoskrnl.exe!_vsnprintf_l
ntoskrnl.exe!_vsnprintf
storport.sys!RtlStringCbPrintfA
storport.sys!StorPortDebugPrint
stornvme.sys!NVMeHwAdapterControl
storport.sys!RaCallMiniportAdapterControl
storport.sys!RaidAdapterSendPoFxActiveToMiniport
storport.sys!StorPortAdapterIdleCondition
ntoskrnl.exe!PopFxIdleWorker
ntoskrnl.exe!PopFxIdleComponent
ntoskrnl.exe!PoFxIdleComponent
storport.sys!RaidAdapterPoFxIdleComponent
storport.sys!RaidUnitCompleteRequest
storport.sys!RaidpAdapterRedirectDpcRoutine
ntoskrnl.exe!KiExecuteAllDpcs
ntoskrnl.exe!KiRetireDpcList
ntoskrnl.exe!KiIdleLoop
[/CODE]
I don't know why debug output from NVMe storage functions would be getting called as much as it seems to be, apparently pegging that GPU processor at 100 percent. It's such an odd thing that it makes me wonder if it's just bad data, but all of the traces from your system show the same thing.
The traces from your friend's system and mine don't show anything like that. GPU utilization in both peaks below 30 percent on any GPU processor and barely registers for the majority of the time.
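To put a number on that comparison, something like the following could work. It is only a sketch: it assumes the utilization samples can be exported as (engine label, usage percent) pairs, and the function name `pegged_fraction`, the engine labels, and the sample values are invented for illustration.

```python
from collections import defaultdict

def pegged_fraction(samples, threshold=100.0):
    """For each engine, return the fraction of samples at or above threshold.

    samples is an iterable of (engine_label, usage_percent) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # engine -> [pegged, total]
    for engine, usage in samples:
        counts[engine][1] += 1
        if usage >= threshold:
            counts[engine][0] += 1
    return {engine: pegged / total for engine, (pegged, total) in counts.items()}

# Invented samples: one engine stuck at 100 percent, another mostly idle.
samples = [
    ("Node 10 Engine 1", 100.0),
    ("Node 10 Engine 1", 100.0),
    ("Node 10 Engine 1", 100.0),
    ("Node 10 Engine 1", 97.0),
    ("Node 0 Engine 0", 2.0),
    ("Node 0 Engine 0", 28.0),
]
result = pegged_fraction(samples)
print(result)
```

An engine that is pegged for nearly the whole trace would come out near 1.0, while the engines in the other two traces would come out at 0.0.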
I'm not sure what to make of it.