CallStack Evasion

 Introduction: 

Hello all, it's been a while since I last wrote a blog post. I have been busy with a lot of things and hadn't found the time to document much here, but today I took some time out to write about something cool. It's nothing new, but it's very interesting. A few weeks ago I came across an article by DarkVortex (Paranoid Ninja), the creator of Brute Ratel, in which he wrote about how to keep the call stack clean while calling any Windows API.

EDRs and AVs tend to inspect the stack while a program is running. Whenever a Windows API is called, the stack is filled with the function's arguments and return addresses, so one detection method is to look at the stack and determine whether the functions are being used maliciously or not.

He did that using TpAllocWork, an undocumented Windows API. It seemed very interesting to me, and I wanted to implement the same technique with a different function. TpAllocWork and similar APIs are undocumented, which means that to find them you need to reverse engineer NTDLL. So I did, and found some similar undocumented Windows APIs that do the same job.

Today in this blog I'll be showing how you can clear the stack. The undocumented API I am going to use is TpSimpleTryPost.


Stack:

A little background on the stack and how it is set up when a process starts. When a thread starts, a temporary memory region is reserved for it where local variables, function arguments and return addresses are stored. The stack holds information about a thread and the chain of functions it is currently executing.
It is based on the Last In, First Out principle. Every thread in a process gets its own stack.

This is a basic and high level definition of a stack.
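
To make the Last In, First Out behaviour concrete, here is a small portable C sketch. It is not Windows-specific, and the names `descend` and `frame_addr` are made up for illustration. Each nested call records the address of one of its own local variables, which shows that every call gets its own frame on the thread's stack:

```c
#include <stdint.h>

#define MAX_DEPTH 8

/* Address of one local variable per nested call, indexed by call depth. */
static uintptr_t frame_addr[MAX_DEPTH];

/* Calls itself until max_depth frames are live, then unwinds in reverse
 * order (last frame created is the first one destroyed).
 * Returns the number of frames that were live at the deepest point. */
static int descend(int depth, int max_depth)
{
    volatile int local = depth;            /* lives in this call's stack frame */
    frame_addr[depth] = (uintptr_t)&local; /* remember where that frame sits */

    if (depth + 1 < max_depth && depth + 1 < MAX_DEPTH)
        return descend(depth + 1, max_depth) + 1;
    return 1;
}
```

Calling `descend(0, 4)` leaves four distinct addresses in `frame_addr`. On x86-64 they typically decrease with depth because the stack grows towards lower addresses, though the C standard itself does not promise that.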

EDR Detection Technique:

In this blog I am going to discuss how an EDR uses ETW to capture stack telemetry. Some EDRs use ETW or kernel callbacks (as described in my previous blog) to check where a function call originated. The stack trace provides the complete chain of stack frames, with the return address of every function between the start of the thread and the call. In short, if you execute a DLL sideload which executes your shellcode which calls LoadLibrary, it would look like this: 


This means any EDR that hooks LoadLibrary in usermode, or monitors it via kernel callbacks/ETW, can check the memory region of the last return address, that is, where the call came from.

The above example is taken from this link.
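
The kind of check described here can be sketched in portable C. This is only a model under assumptions, not how any particular EDR is implemented: `module_range`, `find_backing_module` and `first_unbacked_frame` are hypothetical names, and the module list is supplied by hand. A real product would obtain this information from the operating system (on Windows, for instance, VirtualQuery reports image-backed pages with the type MEM_IMAGE).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one image-backed module in the process. */
typedef struct {
    const char *name;   /* e.g. "ntdll.dll" */
    uintptr_t   base;   /* first address of the mapped image */
    size_t      size;   /* size of the mapped image in bytes */
} module_range;

/* Returns the module that contains addr, or NULL if no module backs it. */
static const module_range *find_backing_module(const module_range *mods,
                                               size_t mod_count,
                                               uintptr_t addr)
{
    for (size_t i = 0; i < mod_count; i++) {
        if (addr >= mods[i].base && addr - mods[i].base < mods[i].size)
            return &mods[i];
    }
    return NULL;
}

/* Walks the captured return addresses, innermost frame first, and returns
 * the index of the first one that is not backed by any known module
 * (for example shellcode in private memory), or -1 if every frame is backed. */
static int first_unbacked_frame(const module_range *mods, size_t mod_count,
                                const uintptr_t *ret_addrs, size_t frame_count)
{
    for (size_t i = 0; i < frame_count; i++) {
        if (find_backing_module(mods, mod_count, ret_addrs[i]) == NULL)
            return (int)i;
    }
    return -1;
}
```

With a trace such as "LoadLibrary, then shellcode, then ntdll", the function reports the shellcode frame, which is the frame the rest of this post tries to keep off the stack.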
 

Windows Callbacks:

Put simply, a callback is a function that you hand to another function, which then calls it on your behalf. But there's a small issue with callbacks: a callback usually executes in the same thread as its caller. The stack trace then follows a trail like: LoadLibrary returns to the callback function, which returns to our Read-Execute (RX) region. 
So to evade call stack inspection we need to make sure LoadLibrary executes in a separate thread, independent of our RX region. When we use callback functions we also need to be able to pass parameters to the function we want to call, and most callback functions in Windows either don't take parameters or don't forward them 'exactly' to our target function.
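
The parameter-forwarding problem can be modelled in portable C without any Windows headers. Everything below is an illustrative stand-in: `dispatch` plays the role of the thread pool, `load_by_name` plays the role of a one-argument API such as LoadLibraryA, and `adapter` is the glue that moves the second callback argument into the target's first. It is a sketch of the shape of the problem, not of the real thread pool.

```c
#include <stddef.h>
#include <string.h>

/* Shape of a two-argument callback, modelled on PTP_SIMPLE_CALLBACK:
 * the dispatcher's own bookkeeping comes first, our context second. */
typedef void (*simple_callback)(void *instance, void *context);

/* Stand-in for a one-argument API such as LoadLibraryA (hypothetical). */
static size_t last_name_length;
static void load_by_name(const char *name)
{
    last_name_length = strlen(name);   /* pretend to "load" the library */
}

/* Adapter with the signature the dispatcher expects. It ignores the first
 * argument and forwards the context as the target's only argument. */
static void adapter(void *instance, void *context)
{
    (void)instance;
    load_by_name((const char *)context);
}

/* Stand-in for the thread pool: it always passes its own token first. */
static void dispatch(simple_callback cb, void *context)
{
    int instance_token = 0;
    cb(&instance_token, context);
}
```

`dispatch(adapter, (void *)"wininet.dll")` reaches `load_by_name` with the right string. Handing a one-argument function to `dispatch` directly would make it interpret `instance_token` as the name, which is the same mismatch that shows up with LoadLibraryA.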
 
Take the code below as an example:



#include <windows.h>
#include <stdio.h>

typedef NTSTATUS (NTAPI* TPSIMPLETRYPOST)(PTP_SIMPLE_CALLBACK pfnwkCallback, PVOID OptionalArg, PTP_CALLBACK_ENVIRON CallbackEnvironment);


int main() {
    CHAR *libName = "wininet.dll";

    
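    // Resolve the undocumented export from ntdll at runtime
    FARPROC pTpSimpleTryPost = GetProcAddress(GetModuleHandleA("ntdll"), "TpSimpleTryPost");
    
    // Deliberately naive: LoadLibraryA itself is posted as the callback
    ((TPSIMPLETRYPOST)pTpSimpleTryPost)((PTP_SIMPLE_CALLBACK)LoadLibraryA, libName, NULL);
    printf("\nLoaded Library Address: %p\n", GetModuleHandleA(libName)); //check if library is loaded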

    return 0;
}


The above code will crash. Before explaining this let us see the definition of the TpSimpleTryPost Function: 


NTSTATUS NTAPI TpSimpleTryPost (
    _In_ PTP_SIMPLE_CALLBACK Callback,
    _Inout_opt_ PVOID Context,
    _In_opt_ PTP_CALLBACK_ENVIRON CallbackEnviron
);


Now let's discuss the crash. I ran the program in the debugger: 

As you can see, I was trying to load wininet.dll, but the thread pool invokes the callback with the callback instance as the first parameter and our context as the second. The library name therefore ends up as the second parameter, while LoadLibraryA expects it as its first and only parameter, which results in the program crashing. 

How can we get around this problem? We can create a custom function that calls LoadLibraryA from within our callback.

#include <windows.h>
#include <stdio.h>

typedef NTSTATUS (NTAPI* TPSIMPLETRYPOST)(PTP_SIMPLE_CALLBACK pfnwkCallback, PVOID OptionalArg, PTP_CALLBACK_ENVIRON CallbackEnvironment);


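// Callback with the signature the thread pool expects;
// it forwards Context (the library name) to LoadLibraryA
VOID CALLBACK WorkCallback(
  _Inout_     PTP_CALLBACK_INSTANCE Instance,
  _Inout_opt_ PVOID                 Context
) {
    LoadLibraryA((LPCSTR)Context);
}

int main() {
    CHAR *libName = "wininet.dll";

    FARPROC pTpSimpleTryPost = GetProcAddress(GetModuleHandleA("ntdll"), "TpSimpleTryPost");

    // Post our wrapper, not LoadLibraryA itself
    ((TPSIMPLETRYPOST)pTpSimpleTryPost)(WorkCallback, libName, NULL);
    Sleep(1000); // crude wait (my assumption) so the worker thread can run before we check
    printf("\nLoaded Library Address: %p\n", GetModuleHandleA(libName)); //check if library is loaded

    return 0;
}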


The above code fixes our issue, but when WorkCallback calls LoadLibraryA, the call instruction pushes a return address that points into WorkCallback onto the stack. This is exactly what we were trying to avoid: a return address on the stack that leads back to our own code. However, what if we manipulate the stack so that no return address is pushed? We can do that in assembly language: 


#include <windows.h>
#include <stdio.h>


typedef NTSTATUS (NTAPI* TPSIMPLETRYPOST)(PTP_SIMPLE_CALLBACK pfnwkCallback, PVOID OptionalArg, PTP_CALLBACK_ENVIRON CallbackEnvironment);


FARPROC pLoadLibraryA;

UINT_PTR CallLoadLibraryA() {
    return (UINT_PTR)pLoadLibraryA;
}

// Implemented in assembly; takes the two arguments a PTP_SIMPLE_CALLBACK receives
extern VOID CALLBACK WorkCallback(PTP_CALLBACK_INSTANCE Instance, PVOID Context);


int main() {
    // Resolve both functions at runtime
    pLoadLibraryA = GetProcAddress(GetModuleHandleA("kernel32"), "LoadLibraryA");
    FARPROC pTpSimpleTryPost = GetProcAddress(GetModuleHandleA("ntdll"), "TpSimpleTryPost");

    CHAR *libName = "EnterpriseAppVMgmtCSP.dll";

    // WorkCallback is the assembly stub that jumps to LoadLibraryA
    ((TPSIMPLETRYPOST)pTpSimpleTryPost)((PTP_SIMPLE_CALLBACK)WorkCallback, libName, NULL);
    Sleep(1000); // crude wait (my assumption) so the worker thread can run before we check

    printf("Loaded Library Address: %p\n", GetModuleHandleA(libName));
    return 0;
}


Assembly Program to manipulate the stack:

section .text

extern CallLoadLibraryA

global WorkCallback

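WorkCallback:
    mov rcx, rdx            ; Context (library name) becomes LoadLibraryA's first argument
    xor rdx, rdx            ; clear the now-unused second argument
    call CallLoadLibraryA   ; rax = address of LoadLibraryA (relies on the helper leaving rcx untouched)
    jmp rax                 ; jump instead of call, so no return address into our code is pushed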


Now if we compile both files and run the result, WorkCallback simply moves the library name from the RDX register to RCX, clears RDX, gets the address of LoadLibraryA from the ad hoc helper function and then jumps to LoadLibraryA. Because we jump instead of call, LoadLibraryA returns straight to the thread pool code in ntdll that invoked our callback.

As you can see in the above image, the stack is clean: we have rearranged the call so that our return address never appears on it. In the image below we can see that the DLL is loaded, the stack is clean, and there is no sign of anything malicious.
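
The difference between reaching LoadLibraryA with a call and reaching it with a jmp can be illustrated with a toy model in portable C. This is purely a bookkeeping sketch under my own assumptions: `toy_stack`, `toy_call`, `toy_jmp` and `stack_mentions` are invented names, and the addresses are arbitrary numbers rather than real ones. It only tracks which return addresses a stack walker would find.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_SLOTS 16

/* Only the return addresses a stack walker would see, oldest first. */
typedef struct {
    uintptr_t ret[TOY_STACK_SLOTS];
    size_t    depth;
} toy_stack;

/* Models `call target`: the address of the instruction after the call
 * is pushed so that the target can return to it. */
static void toy_call(toy_stack *s, uintptr_t return_addr)
{
    if (s->depth < TOY_STACK_SLOTS)
        s->ret[s->depth++] = return_addr;
}

/* Models `jmp target`: control moves to the target, nothing is pushed. */
static void toy_jmp(toy_stack *s)
{
    (void)s;
}

/* Returns 1 if any recorded return address lies in [lo, hi), else 0. */
static int stack_mentions(const toy_stack *s, uintptr_t lo, uintptr_t hi)
{
    for (size_t i = 0; i < s->depth; i++) {
        if (s->ret[i] >= lo && s->ret[i] < hi)
            return 1;
    }
    return 0;
}
```

If the thread pool calls the callback (pushing an address inside ntdll) and the callback then calls the target, the stack mentions the callback's own region. If the callback jumps instead, the only recorded return address is the one inside ntdll, which is the effect the assembly stub is after.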


Conclusion:

This blog was inspired by Chetan Nayak's stack evasion technique, in which he described how to evade call stack inspection using callback functions. The API he used was TpAllocWork, and he mentioned that there are more APIs like it. This motivated me to look for them, and by reversing ntdll I found a few, including TpSimpleTryPost, TpAllocWait and TpAllocTimer. 

I also implemented the technique with TpAllocTimer and built a process dumper to dump processes such as LSASS. There are more APIs to be explored. It was really interesting and I learned a lot implementing this concept. 

 

Till Next Time.