!tz and !tzinfo WinDbg Extensions - Thermal Zone ACPI Trip Levels

x BlueRobot

I've just come across two extensions called !tz and !tzinfo; they seem to be very sparsely documented. However, I've managed to find out that they are possibly related to the thermal trip points found in ACPI.

Code:
0: kd> [COLOR=#008000]!tz[/COLOR]
0 - ThermalZone @ [COLOR=#ff0000]0x86657870[/COLOR]
  State:         Read              Flags:              00000002
  Mode:          Active            PendingMode:        Active  
  ActivePoint:   00000000          PendingActivePoint: 00000000
  Throttle:      00000064
  SampleRate:    00000000
  LastTime:      0000000000000000  LastTemp:           00000000 (0.0K)
  PassiveTimer:  0x86657898
  PassiveDpc:    0x866578c0
  OverThrottled: 0x866578e0
  Irp:           0x85d70720
  Thermal Info:  0x866578f4
0: kd> [COLOR=#008000]~1[/COLOR]
1: kd> [COLOR=#008000]!tz[/COLOR]
0 - ThermalZone @ [COLOR=#ff0000]0x86657870[/COLOR]
  State:         Read              Flags:              00000002
  Mode:          Active            PendingMode:        Active  
  ActivePoint:   00000000          PendingActivePoint: 00000000
  Throttle:      00000064
  SampleRate:    00000000
  LastTime:      0000000000000000  LastTemp:           00000000 (0.0K)
  PassiveTimer:  0x86657898
  PassiveDpc:    0x866578c0
  OverThrottled: 0x866578e0
  Irp:           0x85d70720
  Thermal Info:  [COLOR=#ff0000]0x866578f4[/COLOR]

You can pass the Thermal Info address to the !tzinfo extension, which gives this:

Code:
1: kd> [COLOR=#008000]!tzinfo 0x866578f4[/COLOR]
ThermalInfo @ [COLOR=#ff0000]0x866578f4[/COLOR]
  Stamp:         00000016  Constant1:  00000002  Constant2:  00000003
  Period:        0000001e  ActiveCnt:  00000000  AffinityEx: 0x86657900
  Current Temperature:                 00000cd2 (328.2K)
  Passive TripPoint Temperature:       00000e76 (370.2K)
  Critical TripPoint Temperature:      00000000 (0.0K)
  Hibernate TripPoint Temperature:     00000e94 (373.2K)

I would like to gather more information on this so I can add it to my WinDbg Cheat Sheet.

OSR's ntdev List: Critical Shutdown?
 
I find it extremely interesting that the temperatures are measured in Kelvin, as that's what is used for things like planets!

This just made debugging that much cooler :s16:
 
Did a little research on this since I just learned cool new commands thanks to you, Harry.

!tz and !tzinfo display thermal zones and information about them.

In regards to ACPI thermal management, the system designer logically partitions a hardware platform into one or more physical regions known as thermal zones. When a thermal zone overheats, the OS takes possible actions to cool down any devices in that specific zone.

-- Edit.

Evidently thermal zones keep track of a few things, such as what hardware is in the specific zone, what the thresholds are, and what to do when the temperature reaches them. There are also passive and active cooling policies.

There's lots of info regarding thermal management in Chapter 11 of the ACPI (Advanced Configuration and Power Interface) specification.

Pretty neat stuff!
 
Thanks Patrick, great research, I'll have to check the ACPI Specification tonight.
 
I've managed to read through the ACPI documentation and make some notes about Thermal Zone Management. There's a note confirming that the temperatures are measured in Kelvin, although the documentation uses Celsius for clarity.

Kelvins.JPG
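
For anyone who wants the values in Celsius, here is a minimal Python sketch of the conversion. It assumes the raw hex field is in tenths of a Kelvin, which is inferred from the !tzinfo output above (0x00000cd2 is 3282 decimal and is displayed as 328.2K). The function names are my own and not part of any debugger API.

```python
# Assumption: the raw !tzinfo value is in tenths of a Kelvin, inferred from
# the output above where 0x00000cd2 (3282 decimal) is shown as 328.2K.

def raw_to_kelvin(raw: int) -> float:
    """Convert a raw tenths-of-Kelvin value to Kelvin."""
    return raw / 10


def raw_to_celsius(raw: int) -> float:
    """Convert a raw tenths-of-Kelvin value to degrees Celsius.

    Integer arithmetic is used until the final division so that the
    273.15 offset does not add floating point noise.
    """
    return (raw * 10 - 27315) / 100


if __name__ == "__main__":
    # Raw values copied from the first !tzinfo output in this thread.
    samples = [("Current", 0x0CD2), ("Passive", 0x0E76), ("Hibernate", 0x0E94)]
    for label, raw in samples:
        print(f"{label:10} {raw_to_kelvin(raw):6.1f} K  {raw_to_celsius(raw):6.2f} C")
```

The three values are the current temperature and the passive and hibernate trip points of the 32-bit system shown earlier.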

The thresholds for Critical Shutdown, Active Cooling and Passive Cooling are shown in the diagram below:

Thermal Events.JPG

Here's an example of what a Thermal Zone consists of:

Thermal Zone.JPG

I'm planning to write this information up properly tomorrow morning. I'll probably add it both to my blog and to Sysnative in a separate thread. Thanks to Patrick for reminding me to check the ACPI documentation :D
 
That's really interesting, good findings!

I wonder if the Critical shutdown threshold is also the throttle point for the CPU. I'd think so.
 
It probably is, and !tzinfo seems to be the better extension to use.
 
I found the OverThrottled field within the Power Policy section too:

Code:
3: kd> [COLOR=#008000]!popolicy[/COLOR]
SYSTEM_POWER_POLICY (R.1) @ 0xfffff80002e30b44
  PowerButton:          None  Flags: 00000000   Event: 00000010  
  SleepButton:         Sleep  Flags: 00000000   Event: 00000000  
  LidClose:            Sleep  Flags: 00000000   Event: 00000000  
  Idle:                Sleep  Flags: 00000000   Event: 00000000  
  [COLOR=#ff0000]OverThrottled:        None[/COLOR]  Flags: 00000000   Event: 00000000  
  IdleTimeout:             0  IdleSensitivity:        90%
  MinSleep:               S3  MaxSleep:               S3
  LidOpenWake:            S0  FastSleep:              S0
  WinLogonFlags:           0  S4Timeout:               0
  VideoTimeout:            0  VideoDim:                0
  SpinTimeout:           4b0  OptForPower:             0
  FanTolerance:            0% ForcedThrottle:          0%
  MinThrottle:             0% DyanmicThrottle:      None (0)
 
I've published the blog version, now I'm going to write the Sysnative version.

!tz and !tzinfo Tutorial

The !tz and !tzinfo extensions gather information from the ACPI subsystem about the currently allocated thermal zones and the cooling policies being implemented. I would suggest checking Chapter 11 of the ACPI documentation for additional information about ACPI Thermal Management.

Thermal Management mostly uses a component called the OSPM (Operating System-directed configuration and Power Management) to manage the different cooling policies and check the thermal zones.

The OSPM removes device management responsibilities from the legacy devices, and therefore makes thermal management more robust.

The system is partitioned into logical regions called Thermal Zones, which the OSPM manages. Thermal Zones are a key component of Thermal Management. The entire motherboard is one thermal zone, and is usually subdivided further into smaller thermal zones to make management more efficient. A cooling policy is set for each individual device within a thermal zone, and therefore each thermal zone will have multiple cooling policies and cooling resources (e.g. fans).

Thermal Zone.JPG


We can find the Thermal Zones on a system using the !tz extension in WinDbg.

Code:
3: kd> [COLOR=#008000]!tz[/COLOR]
0 - ThermalZone @ 0xfffffa800445e580
  State:         Read              Flags:              00000002
  Mode:          Active            PendingMode:        Active  
  ActivePoint:   00000002          PendingActivePoint: 00000002
  Throttle:      00000064
  SampleRate:    00000000
  LastTime:      0000000000000000  LastTemp:           00000000 (0.0K)
  PassiveTimer:  0xfffffa800445e5b0
  PassiveDpc:    0xfffffa800445e5f0
  OverThrottled: 0xfffffa800445e630
  Irp:           0xfffffa8004057310
  Thermal Info:  0xfffffa800445e650
1 - ThermalZone @ [COLOR=#0000ff]0xfffffa8004807010[/COLOR]
  State:         Read              Flags:              00000002
  Mode:          Active            PendingMode:        Active  
  ActivePoint:   00000000          PendingActivePoint: 00000000
  Throttle:      00000064
  SampleRate:    00000000
  LastTime:      0000000000000000  LastTemp:           00000000 (0.0K)
  PassiveTimer:  0xfffffa8004807040
  PassiveDpc:    0xfffffa8004807080
  OverThrottled: 0xfffffa80048070c0
  Irp:           0xfffffa800445e380
  Thermal Info:  [COLOR=#ff0000]0xfffffa80048070e0[/COLOR]


The most useful part of the !tz output is the Thermal Info address, which we can pass to the !tzinfo extension to get the trip point temperatures of the thermal zone(s).

Code:
3: kd> [COLOR=#008000]!tzinfo 0xfffffa80048070e0[/COLOR]
ThermalInfo @ [COLOR=#ff0000]0xfffffa80048070e0[/COLOR]
  Stamp:         00000007  Constant1:  00000001  Constant2:  00000005
  Period:        0000000a  ActiveCnt:  00000000  AffinityEx: 0xfffffa80048070f0
  Current Temperature:                 00000bd6 ([COLOR=#0000cd]303.0K[/COLOR])
  Passive TripPoint Temperature:       00000ed0 ([COLOR=#ff8c00]379.2K[/COLOR])
  Critical TripPoint Temperature:      00000ed0 ([COLOR=#ff0000]379.2K[/COLOR])
  Hibernate TripPoint Temperature:     00000000 ([COLOR=#ffff00]0.0K[/COLOR])


The extension isn't perfect, as you can see, but it does provide a good indication of the hardware temperature limits for the system, and thus will be useful for debugging Stop 0x124 errors.
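
If you save the debugger output to a text file, the temperature lines can be pulled out with a short script. This is only a sketch in Python: the regular expression is based purely on the !tzinfo layout shown in this thread, the function name is made up, and treating a raw value of 0 as "not set" is my assumption rather than documented behaviour.

```python
import re

# Temperature lines copied from the !tzinfo output above (colour tags removed).
SAMPLE = """\
  Current Temperature:                 00000bd6 (303.0K)
  Passive TripPoint Temperature:       00000ed0 (379.2K)
  Critical TripPoint Temperature:      00000ed0 (379.2K)
  Hibernate TripPoint Temperature:     00000000 (0.0K)
"""

# Matches "<name> Temperature: <8 hex digits> (" at the start of a line.
LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z ]+?) Temperature:\s+(?P<raw>[0-9a-fA-F]{8})\s+\(",
    re.MULTILINE,
)


def parse_tzinfo(text):
    """Return {name: Kelvin}, with None where the raw value is 0.

    Assumption: raw values are tenths of a Kelvin and 0 means the trip
    point is not defined for this zone.
    """
    result = {}
    for match in LINE.finditer(text):
        raw = int(match.group("raw"), 16)
        result[match.group("name")] = raw / 10 if raw else None
    return result


if __name__ == "__main__":
    for name, kelvin in parse_tzinfo(SAMPLE).items():
        print(name, kelvin)
```

Returning None for the zero entries keeps a 0.0K trip point from being mistaken for a real threshold later on.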

These trip point temperatures correspond to the cooling policies that are implemented when each threshold is reached. Each device within a thermal zone will have its own threshold. The two main cooling modes are Active Cooling and Passive Cooling.

Passive Cooling - The operating system will decrease the power consumption of all devices, in order to reduce the temperature of the system, however, the cost is a reduction in system performance.

Active Cooling - The operating system will increase the power consumption of cooling resources such as fans, to decrease the temperature of the system. Active Cooling has better system performance, but with laptops it will reduce the battery life much faster than usual.

There is also a Critical temperature threshold, whereby if any thermal zone breaches this threshold, then the entire system will shut down. The thresholds are managed by objects called Thermal Objects.

Thermal Events.JPG

The _TMP object is the current temperature of a thermal zone, and is compared to the _HOT, _CRT, _PSV and _AC0/_AC1 thermal objects in order to implement the different cooling policies.

If the _TMP object value reaches the _CRT (Critical Temperature Threshold), the entire system will shut down. If the _TMP reaches the _HOT value, then the system will be placed into the S4 sleep state (Hibernation) if this mode is supported.

If the _TMP object reaches the _AC0/_AC1 (Active Cooling) thresholds then the Active Cooling policy will be implemented; there are two levels, which adjust the fan speed. If the _TMP object reaches the _PSV (Passive Cooling) threshold then the Passive Cooling policy is used. The OSPM is notified of Thermal Events through Thermal Change Notifications.
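
The comparisons described above can be sketched as a small Python function. This is a simplified model for illustration only, not how the OSPM is actually implemented: the function name and action strings are made up, undefined trip points are passed as None, and it assumes _AC0 is the higher of the two active cooling thresholds.

```python
from typing import List, Optional


def thermal_actions(tmp: float,
                    crt: Optional[float] = None,
                    hot: Optional[float] = None,
                    psv: Optional[float] = None,
                    ac0: Optional[float] = None,
                    ac1: Optional[float] = None) -> List[str]:
    """Return the cooling actions for temperature tmp (all values in Kelvin).

    Hypothetical model: a trip point of None is treated as not defined.
    """
    def reached(trip: Optional[float]) -> bool:
        return trip is not None and tmp >= trip

    # Critical and hot thresholds override every cooling policy.
    if reached(crt):
        return ["shutdown"]
    if reached(hot):
        return ["hibernate (S4)"]

    actions = []
    # Assumption: _AC0 is the higher threshold, so it is checked first.
    if reached(ac0):
        actions.append("active cooling: _AC0")
    elif reached(ac1):
        actions.append("active cooling: _AC1")
    if reached(psv):
        actions.append("passive cooling")
    return actions


if __name__ == "__main__":
    # Trip points from the first !tzinfo output in this thread.
    print(thermal_actions(328.2, psv=370.2, hot=373.2))
    print(thermal_actions(371.0, psv=370.2, hot=373.2))
    print(thermal_actions(373.2, psv=370.2, hot=373.2))
```

With those trip points, the zone takes no action at its current temperature, would start passive cooling at 370.2K and would hibernate at 373.2K.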


Power State Information:

We can check the supported power state information with the !pocaps extension, and review the current power policy with the !popolicy extension.

Code:
3: kd> [COLOR=#008000]!pocaps[/COLOR]
PopCapabilities @ 0xfffff80002e3eae0
  Misc Supported Features:  [COLOR=#ff0000]PwrButton SlpButton S3 S4 S5 HiberFile FullWake[/COLOR]
  Processor Features:       Thermal
  Disk Features:            SpinDown
  Battery Features:        
  Wake Caps
    Ac OnLine Wake:         Sx
    Soft Lid Wake:          Sx
    RTC Wake:               S4
    Min Device Wake:        Sx
    Default Wake:           Sx


The power states are stored in an enumeration called _SYSTEM_POWER_STATE. Because the enumeration starts with an unspecified state at 0, the Hibernation state has the value 5 but corresponds to S4. Microsoft also documents Hibernation as S4.

Code:
3: kd> [COLOR=#008000]dt nt!_SYSTEM_POWER_STATE[/COLOR]
   PowerSystemUnspecified = 0n0
   PowerSystemWorking = 0n1
   PowerSystemSleeping1 = 0n2
   PowerSystemSleeping2 = 0n3
   PowerSystemSleeping3 = 0n4
   PowerSystemHibernate = 0n5 <-- Hibernation is S4 not S5 (this is documented by Microsoft too)
   PowerSystemShutdown = 0n6
   PowerSystemMaximum = 0n7
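
To make the off-by-one explicit, here is a small Python sketch that mirrors the enumeration above and maps each value to its S-state name. The class and function names are my own; only the member names and values come from the dt output.

```python
from enum import IntEnum


class SystemPowerState(IntEnum):
    """Mirror of nt!_SYSTEM_POWER_STATE as shown in the dt output above."""
    PowerSystemUnspecified = 0
    PowerSystemWorking = 1
    PowerSystemSleeping1 = 2
    PowerSystemSleeping2 = 3
    PowerSystemSleeping3 = 4
    PowerSystemHibernate = 5
    PowerSystemShutdown = 6
    PowerSystemMaximum = 7


def s_state(value: int) -> str:
    """Return the S-state name for an enumeration value.

    Working (1) is S0, so every real state is its enum value minus one.
    Unspecified and Maximum are not power states, so their names are
    returned unchanged.
    """
    state = SystemPowerState(value)
    if SystemPowerState.PowerSystemWorking <= state <= SystemPowerState.PowerSystemShutdown:
        return f"S{state - 1}"
    return state.name


if __name__ == "__main__":
    for state in SystemPowerState:
        print(f"{state.name:24} = {state.value} -> {s_state(state)}")
```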


The Power Policy gives some general information about which policy is being used; it is a formatted version of the _SYSTEM_POWER_POLICY structure.

Code:
3: kd> [COLOR=#008000]!popolicy[/COLOR]
SYSTEM_POWER_POLICY (R.1) @ 0xfffff80002e30b44
  PowerButton:          None  Flags: 00000000   Event: 00000010  
  [COLOR=#ff0000]SleepButton:         Sleep[/COLOR]  Flags: 00000000   Event: 00000000  
  [COLOR=#ff0000]LidClose:            Sleep[/COLOR]  Flags: 00000000   Event: 00000000  
  Idle:                Sleep  Flags: 00000000   Event: 00000000  
  OverThrottled:        None  Flags: 00000000   Event: 00000000  
  IdleTimeout:             0  IdleSensitivity:        90%
  MinSleep:               S3  MaxSleep:               S3
  LidOpenWake:            S0  FastSleep:              S0
  WinLogonFlags:           0  [COLOR=#ff0000]S4Timeout:               0[/COLOR]
  VideoTimeout:            0  VideoDim:                0
  SpinTimeout:           4b0  OptForPower:             0
  FanTolerance:            0% ForcedThrottle:          0%
  MinThrottle:             0% DyanmicThrottle:      None (0)


Code:
[FONT=verdana]3: kd> [COLOR=#008000]dt nt!_SYSTEM_POWER_POLICY[/COLOR]
   +0x000 Revision         : Uint4B
   +0x004 PowerButton      : POWER_ACTION_POLICY
   +0x010 SleepButton      : POWER_ACTION_POLICY
   +0x01c LidClose         : POWER_ACTION_POLICY
   +0x028 LidOpenWake      : _SYSTEM_POWER_STATE
   +0x02c Reserved         : Uint4B
   +0x030 Idle             : POWER_ACTION_POLICY
   +0x03c IdleTimeout      : Uint4B
   +0x040 IdleSensitivity  : UChar
   +0x041 DynamicThrottle  : UChar
   +0x042 Spare2           : [2] UChar
   +0x044 MinSleep         : _SYSTEM_POWER_STATE
   +0x048 MaxSleep         : _SYSTEM_POWER_STATE
   +0x04c ReducedLatencySleep : _SYSTEM_POWER_STATE
   +0x050 WinLogonFlags    : Uint4B
   +0x054 Spare3           : Uint4B
   +0x058 DozeS4Timeout    : Uint4B
   +0x05c BroadcastCapacityResolution : Uint4B
   +0x060 DischargePolicy  : [4] SYSTEM_POWER_LEVEL
   +0x0c0 VideoTimeout     : Uint4B
   +0x0c4 VideoDimDisplay  : UChar
   +0x0c8 VideoReserved    : [3] Uint4B
   +0x0d4 SpindownTimeout  : Uint4B
   +0x0d8 OptimizeForPower : UChar
   +0x0d9 FanThrottleTolerance : UChar
   +0x0da ForcedThrottle   : UChar
   +0x0db MinThrottle      : UChar
   +0x0dc OverThrottled    : POWER_ACTION_POLICY[/FONT]

There are a few more, like !poreqlist, !podev and !poaction, which I haven't used properly yet, so I'll make another tutorial for those extensions.





 
Very good Harry!
This is a great demonstration of research and commitment :)

Oh, by the way, what dumps are required to view this information? Do minidumps contain it, or would Kernel memory dumps be required?
 
Looks like kernel-dump only, as I imagined. I took a random minidump and tried a few:

Code:
0: kd> [COLOR=#006400]!tz[/COLOR]
Could not read PopThermal at fffff80002c45ab0

Code:
0: kd> [COLOR=#006400]!pocaps[/COLOR]
Could not read PopCapabilities at fffff80002c45ae0

Code:
0: kd> [COLOR=#006400]!popolicy[/COLOR]
Could not read PopPolicy at fffff80002c37b38
 
I've only personally tried it with a Kernel Memory Dump, but you might get a lucky Minidump where it has retained enough information for it to work.
 
It could work, I guess, but only if we're lucky. I mean, sometimes Page Table Entries work, but most of the time they don't with minidumps.
 
Sometimes !pool doesn't even work when a minidump is too far gone.

I roll the dice before every minidump :grin1:
 
Sometimes I ask for a Kernel Memory Dump. There's so much more information, and usually you can find the root cause straight away, or at least get a good indication of what has gone wrong.
 