Firstly, I’d like to thank everybody who has decided to read my blog. I hope it will inspire others to start their own and share knowledge in the field of reverse engineering and exploitation with everyone. I am always open to suggestions and criticism of my work, so feel free to send me an email or write a comment down below.
With that out of the way, let’s start with the first blog post, which is going to be something really simple to start things off and should be easy enough to understand with some basic knowledge of the Windows kernel structure.
Note: This project was developed and tested on Windows 10 x64 version 1903.
Introduction
Windows, unlike Linux, is a closed-source OS, so it is much harder to modify its internal structure. Furthermore, Microsoft heavily discourages it, as it can open up the system to a wide range of security threats. That’s why they have introduced various protection mechanisms to the NT kernel, a major one being PatchGuard (PG), Microsoft’s implementation of Kernel Patch Protection (KPP). This severely limits your options when it comes to hooking and modifying system code, unless you manage to bypass PG altogether.
But let’s start at the beginning. If you want to modify the OS’s kernel code, you will first need to gain access to kernel memory. You can achieve this in multiple ways. The only official way provided by Microsoft is to load a driver through the Windows loader (NtLoadDriver). Alternatively, you can use exploits like r0ak (demonstrated by Alex Ionescu) or exploit a vulnerable driver (capcom.sys, asmmap64.sys, iqvw64e.sys…), none of which is supported by Microsoft in any way. I will not go into detail on this, as hundreds of articles already exist on this topic. Instead, I will focus on retaining kernel access once your code is already running at CPL 0, and on doing it in a pretty stealthy way.
Communicating With The Kernel
As in most operating systems, the main way of communicating with kernel code on Windows is through system calls. The Windows Native API (NTAPI) exposes a large number of kernel and executive functions. All of these functions are implemented in the Windows kernel and are easily accessible if you have code running in the kernel. The OS also provides access to a lot of these functions to user mode applications, mainly through ntdll.dll and some other libraries. The NTAPI functions that ntdll exposes have the “Nt” prefix. In user mode, these functions are nothing more than stubs that perform the system call. Each NTAPI function that can be called from user mode has its own SSDT index, which is passed along when performing the system call through the syscall instruction (int 0x2E on older versions of Windows). The syscall instruction simply makes the calling thread transition from user mode to kernel mode and enters the system call handler, which then uses the index to execute the correct system function.
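To make the dispatch step concrete, here is a minimal user-mode sketch in C. It is not Windows code: the routine signature, the table contents and the names (service_table, dispatch_service, INVALID_SERVICE) are all made up for illustration, and the real handler does considerably more work. It only shows the idea of validating an index and calling through a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine signature, used only in this model. */
typedef int64_t (*service_fn)(uint64_t a, uint64_t b);

static int64_t svc_add(uint64_t a, uint64_t b) { return (int64_t)(a + b); }
static int64_t svc_sub(uint64_t a, uint64_t b) { return (int64_t)(a - b); }

/* Stand-in for the service table: service index -> routine. */
static const service_fn service_table[] = { svc_add, svc_sub };

#define SERVICE_COUNT   (sizeof(service_table) / sizeof(service_table[0]))
#define INVALID_SERVICE (-1)

/* Stand-in for the system call handler: reject an out-of-range index,
 * otherwise call the routine stored at that index. */
int64_t dispatch_service(uint32_t index, uint64_t a, uint64_t b)
{
    if (index >= SERVICE_COUNT)
        return INVALID_SERVICE;
    assert(service_table[index] != NULL);
    return service_table[index](a, b);
}
```

A caller only ever supplies the index and the arguments; which code actually runs is decided entirely by the contents of the table, which is why the table is such an attractive target.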
This is great when your application uses the system functions that are already provided by the Windows kernel. But what do you do when you need some data, or need to perform some action, that is only available from kernel mode and is not offered by the NTAPI? Well, in that case, you will need a kernel driver. But how do you then communicate with the driver to perform your desired action? Thankfully, Windows implements a special mechanism, the I/O Manager. One of its purposes is to allow 3rd party device drivers to communicate with user mode. I will not go into specifics on how it works, but in short, Windows exposes a few functions like DeviceIoControl (a Win32 wrapper around the native NtDeviceIoControlFile) to which you pass a handle to the device you are trying to communicate with, along with the input and output data. The system then calls the appropriate driver and the handler function inside it to fulfill your request.
The problem with the I/O Manager is that all requests can easily be spied on, and it requires that your driver is properly loaded and has its own DRIVER_OBJECT structure, which stores the pointers to the handler routines. Our goal in this example is to stay as stealthy and hidden as possible. There are multiple ways of achieving this; today I will demonstrate one that utilizes the Windows built-in system functions.
Hooking System Calls
Before the introduction of PG, a popular method of hooking system calls was via SSDT (System Service Descriptor Table) hooks. The SSDT is used by the system call handler (KiSystemCall64 on newer 64-bit versions of Windows 10) to locate the system function on every system call. SSDT hooks work by simply replacing the SSDT entry of a system function with one that points to a custom function. This works great unless PG is active. If PG detects any modification of the SSDT (or of most of the ntoskrnl image), it will crash the system with the bugcheck CRITICAL_STRUCTURE_CORRUPTION (0x109). So either we need to somehow disable/bypass PG or find another way to hook system calls. An awesome example is InfinityHook, which uses the Windows ETW component to intercept all system calls. This was, however, recently patched.
While I am sure there are still ways to monitor and intercept all system calls, that is not the goal of this project. What we want to do is just make a system call for ourselves. Open-source kernels like the Linux kernel allow you to create your own system calls, but that is not the case in Windows. Consequently, we are forced to exploit some components of the OS.
There have been multiple articles in the past about exploiting the Windows graphics component, usually by changing the pointers in the DxgCoreInterface table inside dxgkrnl.sys, or something similar, which is used by the Win32 subsystem to call the appropriate graphics related system functions. This implementation is pretty bad because the mentioned table is exported by the kernel module and the module itself is not even protected by PG, so you are free to modify it as you wish.
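The pointer swap behind a table hook can be sketched in plain C. This is only a user-mode model with made-up names (service_table, install_hook, remove_hook, intercepted_calls); it does not touch the real SSDT, whose entry format on x64 is also different from a raw pointer. The point is just the mechanics: save the original entry, overwrite it, and forward to the original from the hook.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*service_fn)(long arg);

static long real_service(long arg) { return arg * 2; }

/* Writable stand-in for a service table with a single entry. */
service_fn service_table[] = { real_service };
#define SERVICE_COUNT (sizeof(service_table) / sizeof(service_table[0]))

static service_fn saved_entry;    /* original pointer, kept for forwarding */
unsigned long intercepted_calls;  /* calls that went through the hook */

/* The hook observes the call, then forwards to the original routine. */
static long hook_service(long arg)
{
    intercepted_calls++;
    return saved_entry(arg);
}

/* Returns 1 on success, 0 if the index is invalid or a hook is active. */
int install_hook(size_t index)
{
    if (index >= SERVICE_COUNT || saved_entry != NULL)
        return 0;
    saved_entry = service_table[index];
    service_table[index] = hook_service;
    return 1;
}

/* Puts the original pointer back into the table. */
void remove_hook(size_t index)
{
    assert(index < SERVICE_COUNT && saved_entry != NULL);
    service_table[index] = saved_entry;
    saved_entry = NULL;
}
```

Callers see the same results before and after install_hook(0), which is what makes this kind of hook hard to notice from the outside. The modified table entry itself is exactly what an integrity check looks for.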
As already said, most of the system call functions are implemented in ntoskrnl.exe, but a chunk of them is also located inside the win32k family of drivers, which are primarily used for graphics and mouse/keyboard support. The kernel, ntoskrnl.exe, is almost fully protected by PG (with some exceptions) and the win32k driver subsystem is also supposed to be, judging by the value 0x102 as parameter 4 of the CRITICAL_STRUCTURE_CORRUPTION bugcheck. But after some testing, I soon figured out that this is not the case. I was almost completely fine modifying win32kfull.sys. This leads me to believe that PG is only actively guarding win32k.sys against modifications, for whatever reason. Why did I say “almost completely fine”? Well, for some weird reason, you do crash if you modify win32kfull.sys in the first few minutes after Windows boots. Not sure what the purpose of that is.
So now that we are safe from PG, let’s continue. One day, while browsing through win32kfull.sys, I stumbled upon some interesting functions. NtUserCallxParam, where x is either “No”, “One” or “Two”, is a triplet of system call functions which have a very interesting structure, specifically NtUserCallTwoParam…
The function looks fairly simple at first glance. As we can see, it starts off by entering a critical region. That is fairly normal for a system call function, in order to prevent code from accessing and modifying data at the same time. The function then proceeds by checking the value of the third parameter: if we subtract 0x80 from it and the result is still bigger than 0xF, the function will leave the critical region and return 0. Otherwise we… oh, what’s that. Call code pointed to by a pointer stored inside a table? Ehm, sure… this is where we come in. Since this driver (at least the .rdata section) is not protected by PG, we are free to change the pointers in that table as we wish. This basically means that we get our own system call!
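The control flow described above can be modelled in a few lines of C. This is a user-mode sketch built only from that description, not decompiled code: the names (call_two_param, simple_call_table, critical_depth, install_sample) are hypothetical, the critical region is reduced to a counter, and indexing the table with the raw third parameter is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*simple_call_fn)(uint64_t p1, uint64_t p2);

#define TWO_PARAM_BASE 0x80u  /* lowest index accepted by this function */
#define TWO_PARAM_SPAN 0x0Fu  /* accepted range is base .. base + 0xF */
#define TABLE_SIZE     (TWO_PARAM_BASE + TWO_PARAM_SPAN + 1u)

/* Stand-in for the shared pointer table (assumed to be indexed by the
 * raw third parameter). */
simple_call_fn simple_call_table[TABLE_SIZE];

/* The critical region is modelled as a simple nesting counter. */
int critical_depth;
static void enter_critical(void) { critical_depth++; }
static void leave_critical(void) { critical_depth--; assert(critical_depth >= 0); }

uint64_t call_two_param(uint64_t p1, uint64_t p2, uint32_t index)
{
    uint64_t result = 0;

    enter_critical();
    /* Unsigned subtraction: an index below the base wraps around to a
     * huge value, so one comparison rejects both ends of the range. */
    if (index - TWO_PARAM_BASE <= TWO_PARAM_SPAN &&
        simple_call_table[index] != NULL)
        result = simple_call_table[index](p1, p2);
    leave_critical();

    return result;
}

/* Sample routine and helper to place it in the first accepted slot. */
static uint64_t sample_xor(uint64_t p1, uint64_t p2) { return p1 ^ p2; }
void install_sample(void) { simple_call_table[TWO_PARAM_BASE] = sample_xor; }
```

After install_sample(), calling call_two_param(6, 3, 0x80) runs sample_xor, while 0x7F and 0x90 fall outside the accepted range and return 0. Whoever controls a slot in that table controls what the call does.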
The other two functions are fairly similar. They use the same table to call the functions, but they have a few extra checks. Plus, NtUserCallTwoParam allows us to pass our own function two parameters, as the name suggests.
So all we have to do is replace the pointer to any function inside the apfnSimpleCall table (for which we have to signature scan) and we are done. These 3 functions seem to be called only from some system components, particularly when the user makes a mouse click. So we can simply pass a magic number as the first parameter and a pointer to our data as the second parameter to the system call. Then, when our custom function is called, we check the magic number. If it matches, we continue with performing our operations using the data provided as the second parameter; otherwise, we simply pass the call on to the original function that we replaced. Another way of checking whether the function was called by our user mode process is to check the current process: since a system call is just a transition from user mode to kernel mode, the process context stays the same. Since the apfnSimpleCall table stores pointers to multiple functions, you can replace as many as you want and only change the index you pass to the system call function in order to select which function you want to execute.
There are, however, ways of detecting this. The simplest one is the call stack. Since we are just transitioning from user mode to kernel mode on the same thread, a stack walk shows all calls, and that exposes our hook, since it calls a function that is outside a valid module (i.e. if you call other functions from your code). Another one is the integrity of the .rdata section, which can easily be checked by anti-virus or anti-cheat software.
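The magic number check can be sketched as follows. This is a user-mode model, not the code from the repository: HOOK_MAGIC, OP_DOUBLE, struct hook_request and the single table_slot variable are all hypothetical, and a real driver would additionally have to validate the user-supplied pointer before touching it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*simple_call_fn)(uint64_t p1, uint64_t p2);

#define HOOK_MAGIC 0x1337C0DEull  /* hypothetical magic value */
#define OP_DOUBLE  1u             /* hypothetical request opcode */

/* Hypothetical request block that the caller passes by pointer. */
struct hook_request {
    uint32_t opcode;
    uint64_t input;
    uint64_t output;
};

static uint64_t original_entry(uint64_t p1, uint64_t p2) { return p1 + p2; }

/* Stand-in for one slot of the pointer table. */
simple_call_fn table_slot = original_entry;
static simple_call_fn saved_original;

uint64_t hooked_entry(uint64_t p1, uint64_t p2)
{
    /* Not our caller: behave exactly like the function we replaced. */
    if (p1 != HOOK_MAGIC) {
        assert(saved_original != NULL);
        return saved_original(p1, p2);
    }

    /* Our caller: the second parameter carries a pointer to the request.
     * (A real driver must probe and validate this pointer first.) */
    struct hook_request *req = (struct hook_request *)(uintptr_t)p2;
    if (req == NULL)
        return 0;

    switch (req->opcode) {
    case OP_DOUBLE:
        req->output = req->input * 2;
        return 1;
    default:
        return 0;
    }
}

void install_hook(void)
{
    if (table_slot == hooked_entry)
        return;                   /* already installed */
    saved_original = table_slot;
    table_slot = hooked_entry;
}
```

A legitimate caller that happens to pass the magic value as its first parameter would be misrouted, so the value should be something unlikely to occur naturally, or be combined with the current-process check mentioned above.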
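One plausible form of such an integrity check is to verify that every pointer in the table still lands inside the module that owns it. The sketch below models this in user-mode C with plain integers standing in for addresses; struct module_range and find_foreign_entry are hypothetical names, and real security software may well check things differently (for example by comparing the section against the on-disk image).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address range occupied by a loaded module (hypothetical model). */
struct module_range {
    uintptr_t base;
    size_t    size;
};

static int in_module(const struct module_range *m, uintptr_t addr)
{
    return addr >= m->base && (addr - m->base) < m->size;
}

/* Returns the index of the first table entry that points outside the
 * module, or -1 if every entry is inside it. */
long find_foreign_entry(const struct module_range *m,
                        const uintptr_t *table, size_t count)
{
    assert(m != NULL && table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (!in_module(m, table[i]))
            return (long)i;
    }
    return -1;
}
```

A hook that points into manually mapped or pool memory fails this test immediately, which is the same property that gives the hook away in a stack walk.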
You can find the source code for this simple example on my GitHub.
Conclusion
I have demonstrated one of the methods that still allow system call hooking. It’s fairly similar to standard virtual function table hooking. I am not sure if this is designed this way intentionally for whatever reason or if there is a bug in PG. Either way, it is not really dangerous, since the attacker is required to have access to the kernel in the first place. It’s just a sneaky way to allow stealthy and controlled code execution after you have already gained access to the kernel. There are still a bunch of other methods out there that allow system call hooking and are even stealthier than this, but for now, I’m only releasing this one. I might, however, decide to edit this post in the future to include other methods.