
Malware Dev – Chapter 07 – Anti-Disassembly Strategies

Continued series from the Malware Development for Ethical Hackers Book.
GitHub repo: EricTurner3 – Malware_Development.

Opcode / Assembly Obfuscation

The main point of opcode obfuscation is to make it harder for the analyst to disassemble or decompile the code. Other sources seem to describe it as making changes directly to the assembly or binary in order to obfuscate it. The book instead just adds junk code (code that runs a bunch of mathematical calculations but serves no other purpose) to obfuscate. While I definitely see junk code as a way to clutter a decompiler’s output, I’m not sure it meets the actual definition of assembly obfuscation.

Nonetheless, using the reverse shell from Ch 1, a new function is added and called within main() to run a bunch of random math calculations. The author runs this code at the end of the program, after the reverse shell process has been created. I don’t like this approach. Instead, I call this function first, and then a few more times at random points during the reverse shell setup. Thus, if someone is debugging line by line with x64dbg, they have to step through a bunch of junk before seeing what really occurs. Each call should hopefully compile to a CALL instruction (plus the jumps inside the junk routine) that keeps pulling the debugger back into the math.
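As a rough illustration of the idea (not the book’s code), a junk routine and the way it gets interleaved with the real steps could look like the sketch below. The names junk_math, setup_step, and run_with_junk are hypothetical, and the volatile qualifier is my own assumption to keep an optimizing compiler from deleting the loop.

```c
#include <assert.h>

/* Hypothetical junk routine: the arithmetic has no purpose beyond
 * padding the instruction stream. volatile forces the compiler to
 * keep every load and store instead of folding the loop away. */
static unsigned int junk_math(unsigned int seed) {
    volatile unsigned int acc = seed;
    for (unsigned int i = 1; i <= 1000; i++) {
        acc = acc * 31u + i;     /* meaningless multiply-add */
        acc = acc ^ (acc >> 3);  /* meaningless bit mixing */
    }
    return acc;
}

/* Hypothetical stand-in for one real setup step of the program. */
static int setup_step(int n) {
    return n + 1;
}

/* Interleave junk calls with the real steps; junk results are ignored. */
static int run_with_junk(void) {
    int state = 0;
    (void)junk_math(7);
    state = setup_step(state);
    (void)junk_math(11);
    state = setup_step(state);
    (void)junk_math(13);
    assert(state == 2);  /* the junk calls never touch the real state */
    return state;
}
```

Someone single-stepping through run_with_junk has to walk through (or step over) three junk calls to see two real steps, which is the whole effect: more noise, no change in behavior.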

The reverse shell works fine. Throwing this into Ghidra, not much really seems to be obfuscated by just adding extra function calls like this. Masking some of the strings or calls used to set up the reverse shell would be more useful than the extra junk code.

I renamed the main and junk functions in Ghidra. The true intent is still rather obvious; however, it did take a bit to find main, as Ghidra’s entry point was not the actual main function (it is the runtime startup code, which eventually calls main).

Function Call Obfuscation

Because I never read the full chapter first and instead work step by step, I only now see that this next section fixes some of the issues with the first obfuscation. Instead of calling functions from the ws2_32 library directly, we declare function pointer types and use GetProcAddress to resolve the functions at run time.

At the top, use the Windows API documentation to rebuild the signatures of the functions we want to call.

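// Pointer types mirroring the documented signatures (ANSI variants, to match "WSASocketA").
#include <winsock2.h>
#include <windows.h>
typedef int (WSAAPI *WSAStartup_t)(WORD, LPWSADATA);
typedef SOCKET (WSAAPI *WSASocket_t)(int, int, int, LPWSAPROTOCOL_INFOA, GROUP, DWORD);
typedef int (WSAAPI *WSAConnect_t)(SOCKET, const struct sockaddr*, int, LPWSABUF, LPWSABUF, LPQOS, LPQOS);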

Then in the main function, we use our definitions along with GetProcAddress to dynamically resolve the real function calls:

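// Load the DLL explicitly; the ANSI variant matches the narrow string literal.
HMODULE hWS2_32 = LoadLibraryA("ws2_32.dll");
if (hWS2_32 == NULL) return 1;  // bail out if the DLL could not be loaded

// Resolve each export by name and cast it to the matching pointer type.
WSAStartup_t st = (WSAStartup_t)GetProcAddress(hWS2_32, "WSAStartup");
WSASocket_t  so = (WSASocket_t)GetProcAddress(hWS2_32, "WSASocketA");
WSAConnect_t co = (WSAConnect_t)GetProcAddress(hWS2_32, "WSAConnect");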

One other trick I used was to remove the hardcoded port 4444 and instead call a function that does a bunch of garbage math and returns the port for later use:

int wow(){
  int number = 8888;
  int number2 = 6666;
  int number3 = 9999;

  number = number * 2;
  number = number / 4;

  number3 = number3 / 3;

  number2 = number * number3;
  number3 = number2 - number;

  return number;
}
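Tracing the arithmetic by hand shows that only the first two operations matter; everything touching number2 and number3 is a decoy. The sketch below repeats the same steps with the value after each one in a comment. The names wow_traced and swap16 are hypothetical, and swap16 only mimics what htons does on a little-endian host.

```c
#include <assert.h>

/* Same arithmetic as wow(), annotated with the value after each step. */
static int wow_traced(void) {
    int number  = 8888;
    int number2 = 6666;
    int number3 = 9999;

    number = number * 2;         /* 17776 */
    number = number / 4;         /* 4444, the real port */

    number3 = number3 / 3;       /* 3333 (decoy) */

    number2 = number * number3;  /* 14811852 (decoy) */
    number3 = number2 - number;  /* 14807408 (decoy) */

    (void)number3;               /* decoys never reach the return value */
    assert(number == 4444);
    return number;
}

/* Byte-swap a 16-bit value, as htons does on a little-endian host. */
static unsigned short swap16(unsigned short v) {
    return (unsigned short)((v << 8) | (v >> 8));
}
```

Since 4444 is 0x115C, the swapped value is 0x5C11, which is a handy constant to look for in a debugger. With optimizations enabled, a compiler is free to fold the whole function into that constant, so how much of the math survives in the binary depends on the build flags.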

Compiled and run on the target machine, the reverse shell pops fine.

Using Ghidra, we can see it is a bit more involved to reverse. The port is passed to htons, but the analyst has to go into another function to determine what the value is. Similar tactics, such as nesting all important strings in functions, could help here.

Function Hashing

This chapter doesn’t dive into the algorithm, but it provides a PowerShell script that takes a Win32 function name, such as CreateProcess, and returns a hash ID. This is expanded upon in the C code with another function, getAPIAddr, which is used to find the address of the function matching a given hash. This function is not explained at this point.
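Since the book leaves the algorithm unexplained, here is a minimal sketch of the general idea under my own assumptions. It uses 32-bit FNV-1a as a stand-in hash (the book’s script may well use something else), and the names fnv1a and find_by_hash are hypothetical. A fixed array of names stands in for a DLL’s export list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 32-bit. A stand-in hash chosen for illustration;
 * the book's script may use a different algorithm. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;   /* FNV offset basis */
    assert(s != NULL);
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;         /* FNV prime */
    }
    return h;
}

/* Hypothetical lookup: return the index of the name whose hash equals
 * target, or -1 if none matches. A real resolver would walk a DLL's
 * export names instead of a fixed array. */
static int find_by_hash(const char *const names[], size_t count, uint32_t target) {
    for (size_t i = 0; i < count; i++) {
        if (fnv1a(names[i]) == target) {
            return (int)i;
        }
    }
    return -1;
}
```

The binary then only needs to carry the numeric hash, never the function name. The same loop is also what an analyst can run offline: hash every export name of the libraries in play and compare the results against the constants found in the sample.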

However, it is used to replace the CreateProcess function call with a hashed value. I took this a step further, calculated hashes for the other functions (WSAStartup, WSASocket, and WSAConnect), and used the hashing there as well. The source code provided by the author tries to use (char *)"kernel32" directly, which does not work for me. I needed to use LoadLibrary and also ensure #include <windows.h> was at the top.

As a side note, I have also seen this before during my Lovely Malware reverse engineering.

After using API hashing for several of the main functions, our reverse shell connects fine as usual.

Opening up Ghidra, things are definitely looking much more complicated. While it is apparent which libraries are being loaded, the functions themselves are not.

If we continued to do this for other functions such as htons and inet_addr, and started to obfuscate or encrypt strings, it would really be a perfect example of full obfuscation.
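For the string side of that, the simplest form I can think of is a single-byte XOR that is undone at run time. This is only a sketch under my own assumptions (the book does not show it in this chapter), and xor_buf is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Single-byte XOR over a buffer. Applying it twice with the same key
 * restores the original bytes, so one routine both encodes and decodes. */
static void xor_buf(char *buf, size_t len, unsigned char key) {
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        buf[i] = (char)((unsigned char)buf[i] ^ key);
    }
}
```

The string would be stored encoded and passed through xor_buf just before use, so a plain strings search no longer shows it. Single-byte XOR is trivial to brute-force, so this only raises the bar slightly.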

Crashing Malware Analysis Tools

This section uses a recursive function with the intent of breaking decompilers by making them run out of memory.

Running this in x32dbg shows the stack overflow, but does not crash the debugger itself.

However, in this instance I wasn’t able to get a reverse shell either: the exe itself crashes. Lowering the number from 1000000 to 250000 has no effect and still causes a crash, presumably because that recursion depth still overflows the stack.
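A back-of-the-envelope check suggests why 250000 is still too deep. The sketch below assumes a 1 MiB stack (the MSVC linker’s default reserve; other toolchains and linker settings differ) and a hypothetical 16 bytes consumed per call. Both figures are assumptions to measure for the actual build, and max_depth and depth_fits are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Rough upper bound on recursion depth: stack bytes divided by
 * bytes consumed per call. Both inputs are assumptions to measure
 * for the actual build, not values taken from the book. */
static size_t max_depth(size_t stack_bytes, size_t frame_bytes) {
    return frame_bytes ? stack_bytes / frame_bytes : 0;
}

/* Returns 1 if the requested depth fits under the estimate, else 0. */
static int depth_fits(size_t depth, size_t stack_bytes, size_t frame_bytes) {
    assert(frame_bytes > 0);
    return depth <= max_depth(stack_bytes, frame_bytes);
}
```

Under those assumptions the ceiling is 1048576 / 16 = 65536 calls, so both 1000000 and 250000 overflow. Real frames are usually larger than 16 bytes, which pushes the ceiling even lower.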

Conclusion

This chapter was interesting. The section I liked best was the API hashing technique, which can be used to mask API calls from a decompiler and requires more advanced analysis to reverse. I have seen malware use this approach before, where every single call is an API hash. The best reversal technique for that was to have Ghidra on one side and x64dbg on the other, and wait to see what library eventually gets loaded once the relevant code runs. Then the function can be properly renamed to give a better sense of what is going on. It definitely creates a lot more work for the malware analyst, though.