Hi everyone!
Today we're going to look at debugging when a rootkit is likely present on the system, and at how to find out whether that is the case so you can give the user appropriate recommendations. Please do note that in almost all cases a kernel dump will be necessary, simply because a minidump, as many of us know, allows barely any deep debugging at all. In certain situations you can guess whether a rootkit is causing the bug check by taking a look at the stack. However, if you're unaware of the state of the linked list (and other factors I will explain later), it could be a false positive: something legitimate that simply behaves similarly.
There are various ways to find out whether or not there is a rootkit present on the system, but in this thread I will go over two instances in which a rootkit was present when the bug check was SYSTEM_SERVICE_EXCEPTION (3b) and KERNEL_DATA_INPAGE_ERROR (7a).
Let's first start off with the 0x3B bug check!
SYSTEM_SERVICE_EXCEPTION (3b)
This indicates that an exception happened while executing a routine that transitions from non-privileged code to privileged code.
We've all seen this before; it's a fairly common bug check. Let's look further into the dump, starting with the parameters:
Code:
BugCheck 3B, {c0000005, fffff80003cef274, fffff8800cd1ef50, 0}
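As a quick aside, the parameter list can be pulled apart programmatically. This is only a minimal sketch: the helper name parse_bugcheck and the tiny status table are my own for illustration, while the parameter meanings are the documented ones for 0x3B (exception code, faulting instruction address, context record address, zero).

```python
import re

# Documented meaning of the four 0x3B parameters.
PARAM_NAMES_3B = (
    "exception code",
    "address of the faulting instruction",
    "address of the context record",
    "reserved (zero)",
)

# Only the status code that shows up in this dump; not a complete table.
KNOWN_STATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def parse_bugcheck(line):
    """Split a 'BugCheck XX, {a, b, c, d}' line into (code, [params])."""
    match = re.match(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{(.*)\}", line.strip())
    if match is None:
        raise ValueError("not a BugCheck line: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip().replace("`", ""), 16)
              for p in match.group(2).split(",")]
    return code, params


code, params = parse_bugcheck(
    "BugCheck 3B, {c0000005, fffff80003cef274, fffff8800cd1ef50, 0}"
)
for name, value in zip(PARAM_NAMES_3B, params):
    print("%-38s %016x %s" % (name, value, KNOWN_STATUS.get(value, "")))
```

The first parameter decodes to an access violation, and the second is the address we look at next.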
Using the ln command on the 2nd parameter of the bug check (the address of the instruction that caused the bug check) displays the symbols at and/or near the given address.
Code:
0: kd> ln fffff80003cef274
(fffff800`03cef230) nt!IofCallDriver+0x44 | (fffff800`03cef290) nt!ExQueryDepthSList
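The +0x44 in that output is just the distance between the faulting address and the start of the nearest preceding symbol. A small sketch of the arithmetic, using only the three addresses shown by ln (the variable names are mine):

```python
fault_address = 0xFFFFF80003CEF274  # 2nd bug check parameter
symbol_start = 0xFFFFF80003CEF230   # start of nt!IofCallDriver, per ln
next_symbol = 0xFFFFF80003CEF290    # start of nt!ExQueryDepthSList, per ln

# Offset of the faulting instruction inside nt!IofCallDriver.
offset = fault_address - symbol_start
print(hex(offset))  # 0x44

# The address sits between the two symbols that ln reported.
assert symbol_start <= fault_address < next_symbol
```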
From the above, we can see that the exception occurred in nt!IofCallDriver+0x44. We can also see nt!ExQueryDepthSList, which ln lists because it is the nearest symbol after that address; it is a routine that returns the number of entries currently in a given sequenced singly linked list. I mentioned linked lists above, so with that said, this is a good time to go ahead and explain linked lists!
Linked lists in the Windows kernel are a big topic, and it would actually take an entire book to explain them fully. With that said, in the simplest terms, which I will try to sum up a few times here and there with examples, Windows obtains its list of active processes by traversing (better known as 'walking' in infosec/debugging terms) a doubly linked list that runs through the EPROCESS structure of each process. To further expand, a process's EPROCESS structure contains a LIST_ENTRY structure (ActiveProcessLinks). This structure has two members, Flink and Blink. Flink points to the list entry of the next process and Blink points to the list entry of the previous one.
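To make the Flink/Blink idea a bit more concrete, here is a minimal sketch of a circular doubly linked list and a walk over it. This is a plain Python model and not kernel code: the ListEntry class, the helper names and the process names are all made up for illustration.

```python
class ListEntry:
    """Minimal stand-in for LIST_ENTRY: a forward and a backward pointer."""

    def __init__(self, name):
        self.name = name
        self.flink = self  # an empty list head points at itself
        self.blink = self


def insert_tail(head, entry):
    """Link a new entry in just before the head (the end of the list)."""
    last = head.blink
    entry.flink = head
    entry.blink = last
    last.flink = entry
    head.blink = entry


def walk(head):
    """Follow Flink from the head until we arrive back at the head."""
    names = []
    current = head.flink
    while current is not head:
        names.append(current.name)
        current = current.flink
    return names


head = ListEntry("PsActiveProcessHead")
for process_name in ("System", "smss.exe", "csrss.exe"):
    insert_tail(head, ListEntry(process_name))

print(walk(head))
```

The walk stops when Flink leads back to the head, which is the same idea as a debugger walking the active process list from nt!PsActiveProcessHead.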
Linked list + an exception occurring in nt!IofCallDriver+0x44. Well, what driver? This is very suspicious. At this point, I was very concerned a rootkit was present on the system. Why did I become suspicious of a rootkit being present? Well, first we must understand how rootkits work. I am not going to go extremely in-depth, but I will of course explain as always!
In the most basic terms, rootkits (at least in this generation) use a technique known as Direct Kernel Object Manipulation (DKOM). DKOM greatly increases the sophistication of the rootkit and allows it to go undetected by today's basic antivirus suites. Some of the things rootkits of this kind can do are:
1. Hook the Interrupt Descriptor Table (IDT). By doing this, the rootkit can filter exported kernel functions. Remember, interrupts signal the kernel that something needs to be done. That's exactly how today's operating systems work: they are driven by interrupts.
-- It's worth noting that newer (as of this post, at least) rootkits generally no longer hook, as hooking is detectable. However, as far as I know, the IDT is still hooked.
2. Direct access to kernel memory.
3. Modify objects in memory and go undetected in doing so.
4. Hide processes, files, network-based connections (ports), etc.
5. Add privileges/groups to tokens. It can also go one step further and manipulate the token to fool Event Viewer.
etc...
With this said, let's take a look at how processes are overall managed by the OS:
(thanks to the HB Gary .pdf for this)
I am not going to go further in-depth here, but if you'd like to understand a fair amount about this diagram, my good friend Harry has written about it (here).
Essentially, rootkits use DKOM to take advantage of the linked list structure by modifying pointers within the list. They change the Flink and Blink pointers of the neighboring entries (which we can see in the above diagram) so that the list wraps around the process that should be hidden. Remember I went into Flink and Blink a bit above?
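The 'wrap around' can be shown with a toy model. This is only a sketch in plain Python with made-up process names (hidden.exe is hypothetical), but it illustrates the important point: after an entry is unlinked, a walk no longer sees it, and every remaining Flink/Blink pair still agrees.

```python
class Entry:
    def __init__(self, name):
        self.name = name
        self.flink = self.blink = self


def build(names):
    """Build a circular doubly linked list and return its head."""
    head = Entry("head")
    for name in names:
        entry = Entry(name)
        entry.blink, entry.flink = head.blink, head
        head.blink.flink = entry
        head.blink = entry
    return head


def walk(head):
    names, current = [], head.flink
    while current is not head:
        names.append(current.name)
        current = current.flink
    return names


def find(head, name):
    current = head.flink
    while current is not head:
        if current.name == name:
            return current
        current = current.flink
    return None


def is_consistent(head):
    """True when every entry's Flink points at an entry whose Blink points back."""
    current = head
    while True:
        if current.flink.blink is not current:
            return False
        current = current.flink
        if current is head:
            return True


def unlink(entry):
    """Make the neighbours point past the entry, then point it at itself."""
    entry.blink.flink = entry.flink
    entry.flink.blink = entry.blink
    entry.flink = entry.blink = entry


head = build(["System", "explorer.exe", "hidden.exe", "svchost.exe"])
unlink(find(head, "hidden.exe"))
print(walk(head))           # hidden.exe is no longer reachable from the head
print(is_consistent(head))  # the remaining list still checks out
```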
So now that we understand all of that, you should now also understand why I was suspicious when seeing a linked list routine + an exception occurring in nt!IofCallDriver+0x44.
When I saw this, I had the user run aswMBR. This is a rootkit scanner that scans for TDL4/3, MBRoot (Sinowal), Whistler and other rootkits. Here was the log:
Code:
aswMBR version 0.9.9.1771 Copyright(c) 2011 AVAST Software
Run date: 2014-03-28 10:16:54
-----------------------------
10:16:54.620 OS Version: Windows x64 6.1.7601 Service Pack 1
10:16:54.620 Number of processors: 2 586 0x170A
10:16:54.622 ComputerName: **Removed** UserName:
10:16:58.691 Initialze error C0000001 - driver not loaded
10:17:34.248 Service scanning
10:17:35.346 Service 3d0ce9e8976dc0a9 C:\Windows\System32\Drivers\3d0ce9e8976dc0a9.sys **HIDDEN**
10:18:19.964 Modules scanning
10:18:19.972 Disk 0 trace - called modules:
10:18:19.977
10:18:19.982 Scan finished successfully
10:18:59.144 The log file has been saved successfully to "C:\Users\**Removed**\Desktop\aswMBR.txt"
From the above log, we can see a hidden service whose driver likely loads at boot. The 3d0ce9e8976dc0a9.sys driver is most likely the driver behind this call - nt!IofCallDriver+0x44.
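If you end up reading a lot of these logs, picking out the hidden services can be scripted. A minimal sketch follows; the line layout is assumed from this one log only (timestamp, the word Service, service name, driver path, then the **HIDDEN** marker), and hidden_services is my own helper name.

```python
import re

LOG = r"""10:17:34.248 Service scanning
10:17:35.346 Service 3d0ce9e8976dc0a9 C:\Windows\System32\Drivers\3d0ce9e8976dc0a9.sys **HIDDEN**
10:18:19.964 Modules scanning
10:18:19.982 Scan finished successfully"""

# Layout assumed from the log above: time, "Service", name, path, marker.
HIDDEN_LINE = re.compile(r"^\S+\s+Service\s+(\S+)\s+(.+?)\s+\*\*HIDDEN\*\*\s*$")


def hidden_services(log_text):
    """Return (service name, driver path) for every line flagged **HIDDEN**."""
    found = []
    for line in log_text.splitlines():
        match = HIDDEN_LINE.match(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


for name, path in hidden_services(LOG):
    print(name, path)
```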
The user also noted that when attempting to install HijackThis, they got the following message:
Code:
The system administrator has set policies to prevent this installation.
Remember step #5 from above?
Code:
Add privileges/groups to tokens. It can also go one step further and manipulate the token to fool Event Viewer.
This appears to be exactly what the rootkit did: it modified tokens to block the installation of HijackThis, and probably of other common startup/hijacker detection software.
So now that we've seen some 0x3B rootkit debugging, let's take a look at an 0x7A scenario that's just a bit different!
KERNEL_DATA_INPAGE_ERROR (7a)
This bug check indicates that the requested page of kernel data from the paging file could not be read into memory.
Code:
BugCheck 7A, {4, 0, fffffa8009bc11f0, fffff8a009446220}
The 1st parameter of our bug check is 4, which indicates that the 2nd parameter is an error status code (typically I/O status code). With this said, the 3rd parameter in our case AFAIK is the PTE contents, and the 4th parameter is the faulting address.
The 2nd parameter in our case is 0, which is the following NTSTATUS value - STATUS_SUCCESS (0x00000000). Quite simply, it implies that the operation completed successfully.
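For reference, the severity of an NTSTATUS value lives in its top two bits (0 = success, 1 = informational, 2 = warning, 3 = error). A small sketch that classifies the two status values seen in this thread; the function name is mine:

```python
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def ntstatus_severity(status):
    """Return the severity encoded in bits 31-30 of an NTSTATUS value."""
    return SEVERITY[(status >> 30) & 0x3]


print(ntstatus_severity(0x00000000))  # success (STATUS_SUCCESS, the 0x7A here)
print(ntstatus_severity(0xC0000005))  # error (the 0x3B exception code earlier)
```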
Let's have a basic look at the call stack:
Code:
1: kd> k
Child-SP RetAddr Call Site
fffff880`177104d8 fffff801`52f2906c nt!KeBugCheckEx
fffff880`177104e0 fffff801`52eeabb7 nt! ?? ::FNODOBFM::`string'+0x24cc6
fffff880`177105c0 fffff801`52ea8def nt!MiIssueHardFault+0x1b7
fffff880`17710690 fffff801`52e6beee nt!MmAccessFault+0x81f
fffff880`177107d0 fffff801`532ba031 nt!KiPageFault+0x16e
fffff880`17710960 fffff801`532ba8a8 nt!CmEnumerateKey+0x191
fffff880`17710a10 fffff801`52e6d453 nt!NtEnumerateKey+0x308
fffff880`17710b90 000007ff`2b3e2f0a nt!KiSystemServiceCopyEnd+0x13
00000042`a4baf118 00000000`00000000 0x000007ff`2b3e2f0a
Very interesting call stack we have here! The first big red flag/question we are asking ourselves here is 'Why is a low-level NT function running into a page fault?' The answer is... we likely have a rootkit!
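Since k lists the most recent frame first, the frame directly beneath nt!KiPageFault is the code that was running when the fault was taken. A minimal sketch that pulls that out of the stack text above (call_sites and frame_below are my own helper names):

```python
STACK = """\
fffff880`177104d8 fffff801`52f2906c nt!KeBugCheckEx
fffff880`177104e0 fffff801`52eeabb7 nt! ?? ::FNODOBFM::`string'+0x24cc6
fffff880`177105c0 fffff801`52ea8def nt!MiIssueHardFault+0x1b7
fffff880`17710690 fffff801`52e6beee nt!MmAccessFault+0x81f
fffff880`177107d0 fffff801`532ba031 nt!KiPageFault+0x16e
fffff880`17710960 fffff801`532ba8a8 nt!CmEnumerateKey+0x191
fffff880`17710a10 fffff801`52e6d453 nt!NtEnumerateKey+0x308
fffff880`17710b90 000007ff`2b3e2f0a nt!KiSystemServiceCopyEnd+0x13
00000042`a4baf118 00000000`00000000 0x000007ff`2b3e2f0a
"""


def call_sites(stack_text):
    """Return the Call Site column; it is the third field and may contain spaces."""
    sites = []
    for line in stack_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:
            sites.append(parts[2])
    return sites


def frame_below(sites, prefix):
    """Return the frame listed right after the first one starting with prefix."""
    for index, site in enumerate(sites):
        if site.startswith(prefix) and index + 1 < len(sites):
            return sites[index + 1]
    return None


sites = call_sites(STACK)
print(frame_below(sites, "nt!KiPageFault"))  # nt!CmEnumerateKey+0x191
```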
First off, the CmEnumerateKey routine returns information about a subkey of an open registry key, and if we remember, the 2nd parameter of the bug check was 0 (STATUS_SUCCESS). This indicates that it was successful in its attempt to return information regarding the subkey. As we discussed above in our 0x3B example, rootkits use a technique labeled DKOM (Direct Kernel Object Manipulation) to hide themselves among legitimate Windows processes. In our case here, the rootkit appears to have hooked itself into NtEnumerateKey. Remember we also discussed hooking?
Expanding further on DKOM, one of the most common ways of going undetected is hooking registry API functions such as RegOpenKey, RegEnumKey and RegEnumValue. To further increase its ability to go undetected, the rootkit will specifically hook the low-level NT versions of these functions, such as NtOpenKey, NtEnumerateKey and NtEnumerateValueKey.
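One common way to reason about this kind of hook is to ask whether a service routine's address still falls inside the kernel image. The sketch below is a simplified model only: every address and the image range are hypothetical (none come from this dump), and real x64 service tables store their entries in an encoded, relative form rather than as plain absolute addresses.

```python
# Hypothetical kernel image range; NOT taken from the dump in this thread.
KERNEL_IMAGE_START = 0xFFFFF80152E00000
KERNEL_IMAGE_END = 0xFFFFF80153600000

# Hypothetical dispatch table: routine name -> address it resolves to.
dispatch_table = {
    "NtOpenKey": 0xFFFFF801532B1000,
    "NtEnumerateKey": 0xFFFFF88004A12340,  # made up: points outside the image
    "NtEnumerateValueKey": 0xFFFFF801532BB000,
}


def outside_kernel_image(table, start, end):
    """Return the names whose target address is not inside [start, end)."""
    return sorted(name for name, address in table.items()
                  if not (start <= address < end))


print(outside_kernel_image(dispatch_table, KERNEL_IMAGE_START, KERNEL_IMAGE_END))
```

An address outside the image is only a hint. As I get into further down, legitimate security software can hook in the same way.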
With all of that said, if the rootkit is so sophisticated at hiding itself, why is this showing in a call stack of a crash dump? Also, why is the system even crashing in the first place? One of the ways to effectively discover a rootkit hooked to the entries listed above is to directly invoke said functions. In this case, the user was attempting to run scans with various software that would detect the rootkit (such as TDSSKiller, etc). Every time the user ran a scan with such software, the system would bug check with 0x7A. I may be wrong, but this may be a 'defense' mechanism of the rootkit, or the scan may actually be conflicting with the rootkit and what it's trying to accomplish with its hooks, so the system bug checks before kernel corruption can occur.
What's the problem? Well, we can almost confirm a rootkit, but at the same time we cannot be too sure. Why? It is pretty clear that hooks are occurring and that we've caught them, but what's interesting to know is that intrusion prevention-based software also hooks like this. The user is crashing every single time he/she runs a scan, so buggy software is also a possibility. To dig further, we can dump the start of the active process list from nt!PsActiveProcessHead:
Code:
1: kd> dl nt!PsActiveProcessHead 10 2
fffff801`530acc80 fffffa80`03088328 fffffa80`0a0b1828
fffffa80`03088328 fffffa80`067aec28 fffff801`530acc80
fffffa80`067aec28 fffffa80`07314ae8 fffffa80`03088328
fffffa80`07314ae8 fffffa80`07d45c28 fffffa80`067aec28
fffffa80`07d45c28 fffffa80`07d13368 fffffa80`07314ae8
fffffa80`07d13368 fffffa80`07d13ae8 fffffa80`07d45c28
fffffa80`07d13ae8 fffffa80`07d787e8 fffffa80`07d13368
fffffa80`07d787e8 fffffa80`07d65c28 fffffa80`07d13ae8
fffffa80`07d65c28 fffffa80`07d63368 fffffa80`07d787e8
fffffa80`07d63368 fffffa80`07d63c28 fffffa80`07d65c28
fffffa80`07d63c28 fffffa80`091c2368 fffffa80`07d63368
fffffa80`091c2368 fffffa80`091eb7e8 fffffa80`07d63c28
fffffa80`091eb7e8 fffffa80`07428c28 fffffa80`091c2368
fffffa80`07428c28 fffffa80`0742dc28 fffffa80`091eb7e8
fffffa80`0742dc28 fffffa80`0742ac28 fffffa80`07428c28
fffffa80`0742ac28 fffffa80`07463c28 fffffa80`0742dc28
The first entry is the System Process, which we can confirm:
Code:
1: kd> dt nt!_EPROCESS ActiveProcessLinks.Blink poi(PsInitialSystemProcess)
+0x2e8 ActiveProcessLinks : [ 0xfffffa80`067aec28 - 0xfffff801`530acc80 ]
+0x008 Blink : 0xfffff801`530acc80 _LIST_ENTRY [ 0xfffffa80`03088328 - 0xfffffa80`0a0b1828 ]
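The list pointers land on the ActiveProcessLinks field rather than on the start of the EPROCESS, so to get back to the structure itself you subtract the field offset (0x2e8 in this dt output, which is specific to this build). A quick sketch of that arithmetic for the first entry; the function name is mine:

```python
# Offset of ActiveProcessLinks inside _EPROCESS, taken from the dt output
# above. It differs between Windows builds, so treat it as specific to this dump.
ACTIVE_PROCESS_LINKS_OFFSET = 0x2E8


def eprocess_from_links(list_entry_address, offset=ACTIVE_PROCESS_LINKS_OFFSET):
    """Step back from the embedded LIST_ENTRY to the start of the EPROCESS."""
    return list_entry_address - offset


first_entry = 0xFFFFFA8003088328  # first Flink after nt!PsActiveProcessHead
print(hex(eprocess_from_links(first_entry)))  # 0xfffffa8003088040
```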
From here, we can walk along the linked list to confirm whether or not it is corrupt (remember we discussed this?):
Code:
1: kd> !validatelist fffff801`530acc80
Found list end after 118 entries
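Conceptually, a check like this comes down to confirming that each entry's Flink points at an entry whose Blink points back. Here is a minimal sketch of that idea run against the first few rows of the dl output from earlier. It is my own approximation and not how the extension is actually implemented.

```python
DL_OUTPUT = """\
fffff801`530acc80 fffffa80`03088328 fffffa80`0a0b1828
fffffa80`03088328 fffffa80`067aec28 fffff801`530acc80
fffffa80`067aec28 fffffa80`07314ae8 fffffa80`03088328
fffffa80`07314ae8 fffffa80`07d45c28 fffffa80`067aec28
"""


def parse_rows(text):
    """Turn each 'address flink blink' row into a tuple of integers."""
    rows = []
    for line in text.splitlines():
        fields = [int(field.replace("`", ""), 16) for field in line.split()]
        if len(fields) == 3:
            rows.append(tuple(fields))
    return rows


def links_agree(rows):
    """Check Flink/Blink agreement between each pair of consecutive rows."""
    for (address, flink, _), (next_address, _, next_blink) in zip(rows, rows[1:]):
        if flink != next_address or next_blink != address:
            return False
    return True


rows = parse_rows(DL_OUTPUT)
print(links_agree(rows))
```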
The list is not corrupt. However, I don't believe this implies that a rootkit is not present on the system or that the list hasn't been modified, since an entry that has been cleanly unlinked leaves the rest of the list perfectly consistent. More comments on this if/when I learn more regarding linked lists modified by rootkits. Harry (x BlueRobot here on Sysnative) goes into pretty nice detail regarding linked lists, etc, here on his blog - BSODTutorials: Rootkits: Direct Kernel Object Manipulation and Processes
Hope you enjoyed reading! I will make edits to this as time goes by as I see fit to add more info, change things, etc.
References/extra reading:
Professional Rootkits - Ric Vieler - Google Books
Detection of Intrusions and Malware, and Vulnerability Assessment: 7th ... - Google Books
BSODTutorials: Rootkits: Direct Kernel Object Manipulation and Processes