/ 2023-11-13-write-ups-for-flare-on-10
back to the top

Write-ups for Flare-On 10

2023-11-13 — last updated 2024-06-26

With the official write-ups and countless community blog posts out there, much has already been said. Still, this blog post serves as a way for me to document my own process and share some notes. For brevity, a bunch of hitting-my-head-against-the-wall was omitted.

As I did not take detailed notes while Flare-on 10 was running, I will try to reconstruct my paths through the challenges as time and memory permits. This page contains write-ups for the following challenges:

1: X

TLDR: click the button 42 times.

Welcome to the 10th Annual Flare-On Challenge!

Statistically, you probably won’t finish every challenge. Every journey toward excellence starts somewhere though, and yours starts here. Maybe it ends here too.

This package contains many files and, I can’t believe i’m saying this, click the one with the “.exe” file extension to launch the program. Maybe focus your “reverse engineering” efforts on that one too.

The first challenge is the classic liveness check. Starting the executable shows a big cross, and a bunch of buttons that we can click to interact with it; presumably we need to find the right 2-digit code and press the unlock button.

X.exe appears to be a plain MSVC/C++ executable, but all bundled DLLs suggest there’s some .NET going on here. In addition to X.exe, there’s also an X.dll that seems interesting. The naming suggests it’s not library code, and Detect It Easy identifies it as .NET.

After loading the .dll in DNSpy and clicking around a bit, I quickly succumbed to frustration of not finding the relevant code (it seems I overlooked monogame.Game1 entirely) and just decide to click the buttons a bunch of times. I should’ve known: the number 42 leads to the flag.

2: ItsOnFire

TLDR: find string AES-CBC and IV. Build the AES key using string slices. Decrypt iv.png.

The FLARE team is now enthusiastic about Google products and services but we suspect there is more to this Android game than meets the eye.

This challenge consists of a single APK file called ItsOnFire.apk. Straight into Android on the second challenge; might as well get it over with.

Without a ready-to-go setup for Android analysis, I scramble a bit and install Android Studio to have a look at what we’re dealing with. I play a bit of Space Invaders.

For static analysis, I open the APK using jadx-gui. It appears the APK is obfuscated; all names were replaced with meaningless placeholders. I start at the top and stop when I find something of interest; in f.b, I identify code that performs CRC32, and there’s symmetric crypto going on. Two of the functions retrieve various values from the APK’s resources

private final File c(int i2, Context context) {  
	Resources resources = context.getResources();  
	Intrinsics.checkNotNullExpressionValue(resources, "context.resources");  
	byte[] e2 = e(resources, i2);  
	String d2 = d(context);  
	Charset charset = Charsets.UTF_8;  
	byte[] bytes = d2.getBytes(charset);  
	Intrinsics.checkNotNullExpressionValue(bytes, "this as java.lang.String).getBytes(charset)");  
	SecretKeySpec secretKeySpec = new SecretKeySpec(bytes, context.getString(R.string.ag));  
	String string = context.getString(R.string.alg);  
	Intrinsics.checkNotNullExpressionValue(string, "context.getString(R.string.alg)");  
	String string2 = context.getString(R.string.iv);  
	Intrinsics.checkNotNullExpressionValue(string2, "context.getString(\n     …             R.string.iv)");  
	byte[] bytes2 = string2.getBytes(charset);  
	Intrinsics.checkNotNullExpressionValue(bytes2, "this as java.lang.String).getBytes(charset)");  
	byte[] b2 = b(string, e2, secretKeySpec, new IvParameterSpec(bytes2));  
	File file = new File(context.getCacheDir(), context.getString(R.string.playerdata));  
	FilesKt__FileReadWriteKt.writeBytes(file, b2);  
	return file;  
}  

This appears to build some sort of crypto object using R.string.ag and R.string.alg to identify the algorithm, and R.string.iv as an IV. I find out that Android apps store their strings in res/values/strings.xml, and identify the following strings:

<string name="alg">AES/CBC/PKCS5Padding</string>
<string name="ag">AES</string>
<string name="iv">abcdefghijklmnop</string>

So, AES it is, then! There’s also a bunch of strings regarding days of the week and a couple URLs, and I resist the urge to browse to them (.. briefly – there’s a Youtube URL that’s not clearly Rick Astley, and I find out it’s Working for the Weekend by Loverboy). More carefully reading through the function, it appears that the function d is responsible for generating the AES key.

private final String d(Context context) {  
	String slice;  
	String string = context.getString(R.string.c2);  
	Intrinsics.checkNotNullExpressionValue(string, "context.getString(R.string.c2)");  
	String string2 = context.getString(R.string.w1);  
	Intrinsics.checkNotNullExpressionValue(string2, "context.getString(R.string.w1)");  
	StringBuilder sb = new StringBuilder();  
	sb.append(string.subSequence(4, 10));  
	sb.append(string2.subSequence(2, 5));  
	String sb2 = sb.toString();  
	Intrinsics.checkNotNullExpressionValue(sb2, "StringBuilder().apply(builderAction).toString()");  
	byte[] bytes = sb2.getBytes(Charsets.UTF_8);  
	Intrinsics.checkNotNullExpressionValue(bytes, "this as java.lang.String).getBytes(charset)");  
	long a2 = a(bytes);  
	StringBuilder sb3 = new StringBuilder();  
	sb3.append(a2);  
	sb3.append(a2);  
	String sb4 = sb3.toString();  
	Intrinsics.checkNotNullExpressionValue(sb4, "StringBuilder().apply(builderAction).toString()");  
	slice = StringsKt___StringsKt.slice(sb4, new IntRange(0, 15));  
	return slice;  
}

The function loads two strings and performs a bit of slicing to extract 16 characters. Let’s grab the relevant resources.

<string name="c2">https://flare-on.com/evilc2server/report_token/report_token.php?token=</string>
<string name="w1">wednesday</string>

Ah, so this is where the URL comes into play! I replicate the functionality in Python, armed with the knowledge that function a is just a wrapper around CRC32.

from binascii import crc32

string = "https://flare-on.com/evilc2server/report_token/report_token.php?token="
string2 = "wednesday"
sb = string[4:10] + string2[2:5]

a2 = crc32(sb.encode('utf-8'))
s = str(a2) + str(a2)
print(s[:16])

This outputs 4508305374508305 (and it has me double-checking whether that’s a 16-byte character string, or an integer).

Okay, now we have AES with a key (4508305374508305) and IV (abcdefghijklmnop), but no ciphertext to decrypt yet. Clicking through the resources, there’s an odd iv.png and ps.png in res/raw; these are not valid PNG files. Could it be..?

Exporting a resource is done by clicking Save as gradle project in the JADX interface. Turning to CyberChef, we can decrypt the images using AES in CBC/NoPadding mode.

ps.png

iv.png

3: Mypassion

TLDR: decompile, step along in a debugger. Gradually work through the stages, tweak input until all checks are satisfied.

This is one of those ones that will work under the right circumstances, Steve. May I call you Steve? Before you take to twitter complaining about a broken challenge, think about how you can be the change you want to see in the world, Steve.

4: Aimbot

I hope this is the only aimbot on your system. Twitch streaming probably pays better than being a mediocre reverse engineer though.

TODO

5: Where_am_i

I wish we had more easy challenges for you this year for you to rack up your personal high score. It wouldn’t help you improve but it would feel great and that is what is most important.

TODO

6: FlareSays

TLDR: run the game in DOSBox-X. Fiddle a bit with the built-in debugger. Give up. Analyze the 16-bit assembly statically. Decrypt strings. Find keyboard input check. Note that file is self-modifying after 128 levels. Patch binary to always set correct input, run until level 128. Run modified file; no success. Note Konami code input check on start screen and that it affects RNG setup. Press Konami code, see blinking screen. Run again. Run modified file.

You’re doing great champ! This challenge is a modern (and retro) take on the classic game Simon Says. The cool thing about this one though is it will make you give up, and maybe take that sales job after all. It’s better for everybody that way.

7: Flake

TLDR: play snake for a bit. Run strings, find d3m0_c0nf.txt file name and XOR key. Decode config file. Set starting score to 10000, run into errors. Try to set up Nuitka to understand how to do injection. Fail miserably. Use Cheat Engine](https://www.cheatengine.org) to set Oliver’s score to 1. Overtake Oliver.

Subject: need your help... Oliver Shrub, Gardens Department Head, keeps bragging about his high score for this rip off Snake game called Flake. I'm pretty sure he found a way to cheat the game because there is no way it's possible to score 10,000 points...I mean the game ships with a sketchy TXT file so there must be something fishy going on here. Can you look at the game and let me know how I can beat Oliver's high score? Best, Nox Shadows Department Head

8: AmongRust

Our customer recently found the following malware executing on one of their machines. The system was left with a very silly looking wallpaper, and a lot of executables on the machine had stopped working. The customer had successfully restored from backup, but we are still interested in understanding all capabilities of the malware. A packet capture file has been provided to aid your analysis.

TODO

9: Mbransom

TLDR: use bochs debugger to start disk image. Identify deobfuscation, set breakpoints. Figure out keyboard input loop. Find key validation. Compute first 12 characters (XOR constant with 0x55). Use Unicorn to emulate the code and brute force the remaining 2 bytes.

You’re doing so great! Go out and celebrate. Take a day off kid, you’ve earned it. Watch your scoreboard position fall like the sand through the hourglass. Avoid this VM and feel the joy the outside world has to offer. Or crush this one and earn even more internet points, those will come in handy.

10: Kupo

TLDR: boot pdp11 using SIMH. Use rawtap to extract binary from tape. Learn some Forth. Try to debug using built-in adb debugger. Parse the PDP11 symbol table to label functions. Distinguish between native functions and Forth code. Statically figure out that decrypt is a xor-loop (use p/q2-q4! as key) and decode is some accumulator loop. Re-implement in Python.

Did you know the PDP-11 and the FOURTH programming language were both released 53 years ago, in 1970? We like to keep our challenges fresh here at Flare-On, in-line with the latest malware trends.

11: Over The Rainbow

TLDR: identify XOR and ChaCha20. Decrypt RSA public key. Attack given ciphertext using Coppersmith and stereotyped message.

I’m told this one is easy if you are really good. Based on your solve times so far Google Bard predicts your performance will be: “1 out of 5 stars. I’d give it 0 stars if I could. Food arrived over an hour late, covered in oil. I wouldn’t feed it to my dog”

We’re presented with a 64-bit PE file and a file called very_important_file.d3crypt_m3. Supposedly this is the encrypted flag; the file contains the following 317 bytes.

3D77B35D ADDBF4F9 CB95A20D 0BA0055E 03AAD1ED 96AA9BA6 7A5D1D14 106A2F7E A5C06133 18D51971 A6A759E9 0433F557 7557BC16 1AB77C85 C7289176 591336E2 80428040 94B2BF03 051257AA AABA7EBA 3E3DD6FA CFF7E3AB DD571E9D 2E2D2C84 F512C014 3B27207A 3EAC0EF9 65A23F4F 4864C7A1 CEB913CE 1803DBA0 2FEB1B56 CD8EBE16 656ABAB2 22E8EDCA 8E9C0DDA 17C370FC E72FE7F6 909EED1E 6B02E92E BF720BA6 051FD7F6 69CF309B A5467C1F B5D7BB2B 7AECA07F 11A57574 6C1047EA 35CC3CE2 46AC0861 F0778880 D18B71FB 2A8D7A73 6A646CF9 9B3DCEC3 62D41341 4BEB9F01 815DB7F7 2F6E081A EE91F191 572A28B9 576F6C53 2349F823 5B6DAF31 B39B5ADD 7ADE0CFB D30F704E B83D983C 215DE326 1F735658 43539F6B B46C9457 DF16E807 449F99F3 DABDDDD5 764FD63D 09BC9C4E 6844EC34 10DC821A B4

Opening up the executable in IDA and clicking our way through the main function, we note that it takes a single path as a command-line argument. Then, two threads are initialized. It appears the functions sub_14007FD20 and sub_14007EE60 serve as the entrypoints for these threads. Let us refer to these threads as thread A and thread B, and the functions as main_A and main_B, respectively. It appears the file path is passed as an argument to main_A; sub_14007EDC0 is responsible for starting the thread with its newly populated context.

In main_A, we notice the string ".3ncrypt_m3", as well as references to recursive_directory_iterator. In general we see several calls to sub_14007E310 with informative strings. This function seems responsible for raising errors, and provides us with useful context.

Digging further into the function, we see comparisons to characters 92 (\) and 47 (/). The loop right after shows a comparison with the . character. Surely we’re looking at some sort of file path handling.

Sure enough, creating a file with the extension .3ncrypt_m3 and passing the path to the directory it is in as an argument results in a .d3crypt_m3 file. It appears our binary is an encryption routine; my 3-byte text file was replaced by a 259-byte file containing seemingly random data. The 256 bytes difference is the tell-tale sign of a 2048-bit RSA ciphertext!

Identifying the crypto

Switching over to main_B, this appears to be where the crypto happens. The constant string expand 32-byte k reveals that we’re likely dealing with a Salsa variant. sub_14007F8F0 contains the permutation routine, which we identify as ChaCha20.

As before, static strings identify library functions for us: the input to sub_140089500 appears to relate to error reporting, and helps us label sub_140086A00 as a random number generator originating from crypto\rand\rand_lib.c. Thus, the ChaCha20 state consists of 48 random bytes, followed by the static strings expand 32-byte k.

Backing up a bit, we note that the function starts by loading a large array of 451 bytes, and deobfuscating it using a simple XOR loop. Starting the binary in a debugger and setting a breakpoint just afterwards allows us to read the decrypted string:

-----BEGIN PUBLIC KEY-----
MIIBIDANBgkqhkiG9w0BAQEFAAOCAQ0AMIIBCAKCAQEAycMwco9oCHr8YKEz5Jud
PeSfD/mZXF4S5cZcEYl7xxjj5NJy1aWM5GN1WyxjRn8NCfk8Mctn/jGICa9/yLLI
xyGrVHzk22Pb3/9dmwbIV5n97mkPkMR5xtC546P2blXWMCnOWgLvhMaq3F4iQWgw
JMxl11ZCr+C6vnbymmd86xWb5IuzJl69K9UZoq9+A2zC5kAcN1VXYagcPR0opFbD
i5G1WQNb/wE92gQ5BTuelvSyePcZ6Tnmd9BYvG6YAFr/IwgUpJerNLf6kCtmbRgN
6E4k6Q91PXnbC3IXrLXEb00apWvuVz8tR6Qzfd0eK5Z+3HA4/usJDex0ktlNlom7
YQIBAw==
-----END PUBLIC KEY-----

Looking a bit more closely at the functions that operate on this public key, we see several references to crypto\bio\bio_lib.c in sub_1400860E0. Each reference is accompanied by an integer; in this function, we observe the integers 76, 89 and 95 – perhaps these are line numbers. By having a look at the openssl source code, we identify BIO_new_ex, which indeed has several error-cases. It’s not an exact match, as only one case results in a call to ERR_raise, but the resemblance is there; perhaps we’re looking at a slightly refactored version (indeed, viewing a random commit from 2021 shows more calls to ERR_raise).

We can further verify this suspicion by looking at xrefs to the supposed BIO_new_ex. It is called from sub_140087DF0, which shows a reference to crypto\bio\bss_mem.c, line 89. That’s right in the middle of BIO_new_mem_buf, which indeed calls BIO_new; a thinly veiled wrapper around BIO_new_ex. Closer inspection suggests the compiler inlined the call into BIO_new, as the ->libctx member is not initialized.

BIO *BIO_new(const BIO_METHOD *method)
{
    return BIO_new_ex(NULL, method);
}

Armed with this new knowledge, we quickly identify the functions that construct the public key object, and trace it to sub_140085470. This supposedly calls the RSA encryption routine. The lines above it reveal that the plaintext consists of 24 random bytes and the initial ChaCha20 state, preceded by 176 null bytes. The random bytes make an appearance in the encryption loop: they’re used as an XOR key in addition to the output of the ChaCha20 keystream generator.

Attacking RSA

Decoding the public key reveals the following RSA parameters:

n = c9c330728f68087afc60a133e49b9d3de49f0ff9995c5e12e5c65c11897bc718
    e3e4d272d5a58ce463755b2c63467f0d09f93c31cb67fe318809af7fc8b2c8c7
    21ab547ce4db63dbdfff5d9b06c85799fdee690f90c479c6d0b9e3a3f66e55d6
    3029ce5a02ef84c6aadc5e2241683024cc65d75642afe0babe76f29a677ceb15
    9be48bb3265ebd2bd519a2af7e036cc2e6401c37555761a81c3d1d28a456c38b
    91b559035bff013dda0439053b9e96f4b278f719e939e677d058bc6e98005aff
    230814a497ab34b7fa902b666d180de84e24e90f753d79db0b7217acb5c46f4d
    1aa56bee573f2d47a4337ddd1e2b967edc7038feeb090dec7492d94d9689bb61

e = 3

With $e = 3$ and so many null bytes in the plaintext, this may only be a matter of performing a cube-root attack; with a small exponent, $m^e\bmod N$ could be equal to $m^e \in \mathbb{N}$. Unfortunately, with 88 bytes set we get $m \approx 2^{2112}$. This significantly exceeds $N \approx 2^{2048}$.

As we know part of the message, we can reformulate the problem slightly and apply Coppersmith’s attack on stereotyped messages, as follows. Refer to this paper or this Youtube video by David Wong for more details.

We define $m_0$ to be known part of the message (i.e., expand 32-byte k), and try to find $x_0$ such that $m = m_0 + x_0 \cdot 2^{128}$. Note that the multiplication by $2^{128}$ accounts for the fact that $x_0$ does not need to include the 16 known bytes defined in $m_0$.

Let the ciphertext be $c = m^e\bmod N$, and define the polynomial $f$:

\begin{align} f(x) = (m_0 + x \cdot 2^{128})^e - c \mod{N} \end{align}

Finding $x_0$ is now equivalent to finding $x$ such that $f(x) = 0 \bmod{N}$. Within the bounds set out by Coppersmith’s Theorem, Coppersmith’s algorithm provides a way to solve this problem efficiently.

Coppersmith’s Theorem. Let $N$ be an integer which has a divisor $b \ge N^\beta$ with $0 < \beta \le 1$. Let $f(x)$ be a univariate monic polynomial of degree $\delta$ and let $\gamma \ge 1.$ Then we can find all solutions $x_0$ such that $f(x_0) = 0 \bmod{b}$ with $|x_0| \le \gamma \cdot N^{\frac{\beta^2}{\delta}}$, in time $\mathcal{O}(\gamma\delta^5\log^9(N))$.

We can simplify this by letting $\beta = 1$ and $\gamma = 1$, and we observe that the degree of $f$ is $e = 3$. The promise of Coppersmith’s algorithm is thus to deliver us all candidates for $x_0$ where $|x_0| \le N^{\frac{1}{3}}$. With $N \approx 2^{2048}$, $N^{\frac{1}{3}} \approx 2^{683}$. As $x_0$ will be 72 bytes (= 576 bits), we know such a solution to exist.

Coppersmith’s algorithm is implemented in sage as the small_roots() method. We find $x_0$ as follows:

sage: c = 0x1336E28042804094B2BF03051257AAAABA7EBA3E3DD6FACFF7E3ABDD571E9D2E2D2C84F512C0143B27207A3EAC0EF965A23F4F4864C7A1CEB913CE1803DBA02FEB1B56CD8EBE16656ABAB222E8EDCA8E9C0DDA17C370FCE72FE7F6909EED1E6B02E92EBF720BA6051FD7F669CF309BA5467C1FB5D7BB2B7AECA07F11A575746C1047EA35CC3CE246AC0861F0778880D18B71FB2A8D7A736A646CF99B3DCEC362D413414BEB9F01815DB7F72F6E081AEE91F191572A28B9576F6C532349F8235B6DAF31B39B5ADD7ADE0CFBD30F704EB83D983C215DE3261F73565843539F6BB46C9457DF16E807449F99F3DABDDDD5764FD63D09BC9C4E6844EC3410DC821AB4
sage: N = 0xc9c330728f68087afc60a133e49b9d3de49f0ff9995c5e12e5c65c11897bc718e3e4d272d5a58ce463755b2c63467f0d09f93c31cb67fe318809af7fc8b2c8c721ab547ce4db63dbdfff5d9b06c85799fdee690f90c479c6d0b9e3a3f66e55d63029ce5a02ef84c6aadc5e2241683024cc65d75642afe0babe76f29a677ceb159be48bb3265ebd2bd519a2af7e036cc2e6401c37555761a81c3d1d28a456c38b91b559035bff013dda0439053b9e96f4b278f719e939e677d058bc6e98005aff230814a497ab34b7fa902b666d180de84e24e90f753d79db0b7217acb5c46f4d1aa56bee573f2d47a4337ddd1e2b967edc7038feeb090dec7492d94d9689bb61
sage: e = 3
sage: m_0 = int.from_bytes(b'expand 32-byte k', byteorder='big')
sage: P.<x> = PolynomialRing(Zmod(N))
sage: f = ((m_0 + x * 2^128)^e - c).monic()
sage: f.small_roots(X=2^576)
[6730722853673567334832243774533669260003790577373139776704376326794058404121535292812433356801604068239665891105595607303018172210866035032633357162971325439700251947045071]

Note that we make $f$ monic to satisfy the requirements of Coppersmith’s theorem; dividing $f$ by its leading coefficient does not change the roots. Furthermore, remark that small_roots() can take an upper bound X as an argument, which we have readily available. Finally, recall that $c$ is the last 256 bytes from the .d3crypt_m3 file.

Decrypting the flag

Now all that remains is decrypting the flag using ChaCha20 and the XOR loop. Rather than reimplement the ChaCha permutation, we can lift it from the original code and run it using Unicorn.

from itertools import chain, repeat
import math
from unicorn import Uc, UC_ARCH_X86, UC_MODE_64
from unicorn.x86_const import (
    UC_X86_REG_RDX,
    UC_X86_REG_R8,
    UC_X86_REG_RSP,
)

root = 6730722853673567334832243774533669260003790577373139776704376326794058404121535292812433356801604068239665891105595607303018172210866035032633357162971325439700251947045071

root_as_bytes = int.to_bytes(root, length=72, byteorder="big")
xorkey = root_as_bytes[:24]
state = root_as_bytes[24:] + b"expand 32-byte k"

with open("very_important_file.d3crypt_m3", "rb") as f:
    ciphertext = f.read()[:-256]

CHACHA_X64 = bytes.fromhex(
    "4c89442418488954241048894c2408555356574154415541564157488bec4883ec680f100248c745f00a0000000f104a100f1145980f1042200f114da8448b55b40f104a30448b4db0448b65ac448b5da80f114dc8448b75d48b75d08b7dcc448b45c80f1145b8448b6dbc8b5db866908b45984103c38bd04133d0c1c2108d0c1a448bc14533c341c1c00c418d1c00448bfb4433fa41c1c708418d040f8945dc4133c0c1c0078945e08b459c4103c48bd033d7c1c210428d0c2a448bc14533c441c1c00c418d3c00448be74433e241c1c408418d040c8b4dc0448bd88945e48b45a04533d84103c141c1c3078bd033d6c1c21003ca448bc14533c141c1c00c418d3400448bee4433ea8b55c441c1c508468d0c298b4da44103ca418bc14433f14133c041c1c6104103d6c1c0074433d24489754841c1c20c894560418bc2458d340a418bce334d48c1c108448d1411418bd233d0418d041bc1c2078955d88bd033d1c1c210428d0c0a448bc14533c3448b4d60448b5de041c1c00c4103c089459833c2c1c0088945488945d403c18945c04133c0c1c0078945e88945ac418d04398bc84133cfc1c110428d1411448b55d84433ca41c1c10c458d04014489459c4433c141c1c008418d04108b55dc4433c88945c4418d043241c1c1078bc844894db04133cc448b65e8c1c11003d14433d241c1c20c418d3c02438d0433897da0448b754833f9c1c7088bc84133cdc1c1108d1c3a8b55e403d14433d34433da41c1c20741c1c30c448955b4418d34038975a433f1c1c608448d2c324533dd41c1c30748836df00144895da80f851ffeffff488b4d584c8d5598895db84c8d5d9c448945c8488d5da04c8b45504c2bd1498bd08975d0482bd1897dcc4c2bd944896dbc482bd9488d410441b904000000908b4c02fc41034c02fc8948fc8b0c1042030c108908418b0c03034c02048948048b0c03034c0208894808488d40104983e90175cc418b48208d41014189402085c9750441ff40244883c468415f415e415d415c5f5e5b5d"
)

CODE_ADDR = 0x1000000
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(CODE_ADDR, 2 * 1024 * 1024)
mu.mem_write(CODE_ADDR, CHACHA_X64)


def chacha_stream(state):
    STATE_ADDR = CODE_ADDR + 0x1000
    OUT_ADDR = CODE_ADDR + 0x2000
    STACK_ADDR = CODE_ADDR + 0x3000
    mu.mem_write(STATE_ADDR, state)
    mu.reg_write(UC_X86_REG_RSP, STACK_ADDR)
    mu.reg_write(UC_X86_REG_RDX, STATE_ADDR)  # state
    mu.reg_write(UC_X86_REG_R8, OUT_ADDR)  # output
    mu.emu_start(CODE_ADDR, CODE_ADDR + len(CHACHA_X64))
    return mu.mem_read(STATE_ADDR, 64), mu.mem_read(OUT_ADDR, 64)


stream = bytearray()
for _ in range(math.ceil(len(ciphertext) / 64)):
    state, output = chacha_stream(state)
    stream += output

xorstream = chain.from_iterable(repeat(xorkey))
for a, b, c in zip(ciphertext, stream, xorstream):
    print(chr(a ^ b ^ c), end="")

Bingpot. Wa5nt_th1s_Supp0s3d_t0_b3_4_r3vers1nG_ch4l1eng3@flare-on.com

12: HVM

TLDR: write code to decrypt RC4-encrypted functions. Note two input parameters. XOR hardcoded strings to get FLARE2023FLARE2023FLARE2023FLARE2023 as arg1. Figure out Salsa20 keystream; re-implement using Unicorn. Invert the XOR-loop function through tedious manual labor. Input the result as arg2.

This is the second smallest challenge this year! If only that mattered.

When we simply run the binary in a virtual machine, we’re presented with an error:

[-] OS/CPU feature not enabled

Opening the binary in IDA, we quickly find out where that error comes from. The first thing the main function does is call WHvGetCapability(WHvCapabilityCodeHypervisorPresent, ...), which I learn to be a way to check whether Hyper-V is available. It.. is not. After a brief foray into VMWare configuration, I set out to approach this challenge statically.

Some clean-up later, the following snippets suggest we’re supposed to supply two arguments with lengths between 9 to 47 and 25 to 64 …

// ...
len_arg1 = strlen(argv[1]);
len_arg2 = strlen(argv[2]);
if ( len_arg1 > 8 && len_arg1 < 48 )
{
  if ( len_arg2 > 24 && len_arg2 < 65 )
// ...

… the latter of which will be used to decrypt the flag!

// ...
if ( should_be_x1337 == 0x1337 )
{
  qmemcpy(xored_flag, &xored_flag_res, 42ui64);
  for ( i = 0; i < 41; ++i )
  printf("%c", argv[2][i] ^ (unsigned int)xored_flag[i]);
  printf("@flare-on.com\n");
}

The majority of the main function seems to be concerned with setting up and interacting with the Hyper-V environment. We can identify a function that initializes registers (sub_140001070), a function that reads RIP, R8 and R9 (sub_140001310) and uses them to derive arguments for RC4 (sub_140001730), and a function that reads RAX (sub_1400013E0). From the above snippet, we’ve figured out that RAX will need to be 0x1337 for us to get the flag.

All of this happens in a loop. The virtual processor is started (WHvRunVirtualProcessor), and runs until it hits an interrupt. At that point the exit reason is checked. Depending on the specific exit code, RC4 is called to decrypt the data pointed to by RIP using the eight-byte key in R8 and the data length in R9.

Setting the appropriate type for the ExitContext variable (WHV_RUN_VP_EXIT_CONTEXT) provides some more insight into what these exit reasons relate to.

// ...
loop_flag = 1;
should_be_x1337 = 0;
while ( loop_flag )
{
  if ( WHvRunVirtualProcessor(Partition, 0, &ExitContext, 0xE0u) >= 0 )
  {
  ExitReason = ExitContext.ExitReason;
  if ( ExitContext.ExitReason == WHvRunVpExitReasonX64IoPortAccess )
  {
    get_RIP_R8_R9(Partition, RIP_R8_R9);
    if ( (*(_BYTE *)(&ExitContext.ApicEoi + 5) & 1) != 0 )
      RC4(buf, RIP_R8_R9[0] - 16 - RIP_R8_R9[2], RIP_R8_R9[2], (_BYTE *)RIP_R8_R9[1]);
    else
      RC4(buf, RIP_R8_R9[0] + 2, RIP_R8_R9[2], (_BYTE *)RIP_R8_R9[1]);
    increment_RIP_with_2(Partition);
  }
  else if ( ExitReason == WHvRunVpExitReasonX64Halt )
  {
    should_be_x1337 = get_RAX(Partition);
    loop_flag = 0;
  }
  else
  {
    loop_flag = 0;
  }
  }
}
// ...

But where does the code live? And what’s in buf? It appears that’s what sub_140001440 takes care of.

__int64 __fastcall sub_140001440(_QWORD *a1)
{
  DWORD Size; // [rsp+20h] [rbp-38h]
  HMODULE hModule; // [rsp+28h] [rbp-30h]
  HRSRC hResInfo; // [rsp+30h] [rbp-28h]
  HGLOBAL hResData; // [rsp+38h] [rbp-20h]
  const __m128i *Src; // [rsp+40h] [rbp-18h]

  hModule = GetModuleHandleA(0i64);
  hResInfo = FindResourceA(hModule, (LPCSTR)0x85, (LPCSTR)0x100);
  hResData = LoadResource(hModule, hResInfo);
  Size = SizeofResource(hModule, hResInfo);
  Src = (const __m128i *)LockResource(hResData);
  return memcpy(*a1, Src, Size);
}

The above function calls FindResourceA(hModule, 0x85, 0x100) to get the resource with ID 0x85 (133) and resource type 0x100 (256). Opening the binary in CFF Explorer shows that indeed, there is exactly one resource, and its ID is 133. We can export the data to a file for closer inspection.

BC 00 80 FA 0F 01 16 26 0D 0F 20 C0 66 83 C8 01
0F 22 C0 EA 18 00 08 00 66 B8 10 00 8E D8 8E E0
8E E8 8E D0 E8 0E 00 00 00 0F 01 15 44 0D 00 00
EA F2 0C 00 00 08 00 BF 00 30 00 00 0F 22 DF 31
C0 B9 00 10 00 00 F3 AB 0F 20 D8 C7 00 03 40 00
...

There’s one more interesting snippet to consider before we dive into the resource code: we observe that the arguments that are passed to the original executable are inserted into the code before it is started, at 0x400 and 0x200 from the end of the allocated buffer. As the buffer is 0x10000 bytes, this puts the arguments at 0xFC00 and 0xFE00, respectively.

// ...
arg1 = &buf[buf_size - 0x400];
memcpy(arg1, argv[1], len_arg1);
arg2 = &buf[buf_size - 0x200];
memcpy(arg2, argv[2], len_arg2);
// ...
Decrypting the code

We open the resource file in IDA and scroll through the disassembly. Note that it is useful to open a new instance here, rather than load it into the existing database: we will want to set the compiler options to a compiler that uses the SystemV calling convention (such as GNU C++).

While most bytes do not represent actual x64 instructions, we notice several snippets that stand out:

...
000000000000082E                 mov     r8, 0D3A5541BC79F6DF3h
0000000000000838                 mov     r9d, 23Bh
000000000000083E                 out     3, al
0000000000000840                 retn
0000000000000840 ; ---------------------------------------------------------------------------
0000000000000841                 db  49h ; I
0000000000000842                 db 0B8h
0000000000000843                 db 73h, 0EAh, 87h, 80h, 0AAh
0000000000000848                 db 0EFh
0000000000000849                 db 29h, 53h, 41h, 0B9h, 2Dh, 2 dup(0)
0000000000000850                 db    0
0000000000000851                 db 0E4h
0000000000000852                 db    3
0000000000000853                 db 0A0h
0000000000000854                 db 0BCh
0000000000000855                 db  53h ; S
0000000000000856                 db 0A6h
0000000000000857                 db 0AFh
0000000000000858                 db 0F5h
0000000000000859                 db 25h, 0D8h, 1Fh, 6Fh, 32h, 0B5h, 30h
0000000000000860                 db  2Eh ; .
0000000000000861                 db 0AEh
0000000000000862                 dw 34C0h, 0AEE6h, 3D1Dh
0000000000000868                 db  8Bh
0000000000000869                 db  72h ; r
000000000000086A                 db  4Ah ; J
000000000000086B                 db 9Bh, 84h, 80h, 62h, 2
0000000000000870                 db  9Ah
0000000000000871                 db  1Dh
0000000000000872                 dw 5903h, 0E382h, 0F3D2h
0000000000000878                 db 0D6h
0000000000000879                 db  85h
000000000000087A                 db 0ADh
000000000000087B                 db 0C4h, 27h, 0CFh, 41h, 65h
0000000000000880 ; ---------------------------------------------------------------------------
0000000000000880                 mov     r8, 5329EFAA8087EA73h
000000000000088A                 mov     r9d, 2Dh ; '-'
0000000000000890                 out     3, al
0000000000000892                 retn
...

Note that earlier we concluded that R8 and R9 would contain the RC4 key and data length. Perhaps these out instructions are the interrupts we’re looking for. The length fields do not make sense, though; they appear to relate to the size of the chunks before the out call rather than after.

If we look at the above example, we can subtract 0x2D from 0x880 to arrive at 0x853. That still leaves us with several bytes preceding the supposedly encrypted data..

Indeed, converting the bytes at 0x841 to code results in the following snippet:

0000000000000841                 mov     r8, 5329EFAA8087EA73h
000000000000084B                 mov     r9d, 2Dh ; '-'
0000000000000851                 in      al, 3

Same length, same key, but an in instruction. It would appear this marks the start of a function, and out marks its end – perhaps to re-encrypt the function after it is executed. Using the Python code below, we decrypt all fragments.

import re


def RC4(inp, offset, length, key):
    S = list(range(0, 256))
    i = 0
    k = 0
    # KSA
    for j in range(256):
        i = (key[k] + i + S[j]) % 256
        S[j], S[i] = S[i], S[j]
        k = (k + 1) % 8
    # PRGA
    s = 0
    j = 0
    out = []
    for i in range(length):
        s = (s + 1) % 256
        j = (j + S[s]) % 256
        S[s], S[j] = S[j], S[s]
        out.append(inp[offset + i] ^ S[(S[s] + S[j]) % 256])
    return bytes(out)


def decrypt(ea, length, key):
    data[ea : ea + length] = RC4(data[ea : ea + length], 0, length, key)
    print(f"Patched {length} bytes at {hex(ea)}")


for match in re.finditer(b"\x49\xB8.{8}\x41\xB9.{4}\xE4\x03", data):
    ea = match.start()
    key = data[ea + 2 : ea + 10]
    length = int.from_bytes(data[ea + 12 : ea + 16], byteorder="little")
    out = ea + 0x12
    decrypt(out, length, key)
    # convert interrupts to NOPs to clean up decompilation
    data[ea : ea + 0x12] = bytes([0x90] * 0x12)
    data[ea + length : ea + 0x12 + length] = bytes([0x90] * 0x12)

This immediately allows IDA to identify 11 distinct functions, and a bit of code at the end that appears to be the entrypoint: this is where the arguments in 0xFE00 and 0xFC00 come into play!

0000000000000BC4                 push    rbp
0000000000000BC5                 mov     rbp, rsp
0000000000000BC8                 sub     rsp, 90h
0000000000000BCF                 mov     esi, 0FE00h
0000000000000BD4                 mov     edi, 0FC00h
0000000000000BD9                 call    sub_B3F
0000000000000BDE                 leave

Diving into sub_B3F, we see it uses sub_918 to check the first argument, and sub_A62 to check the second argument. On success, it returns the 0x1337 value the wrapper code is looking for.

  HIDWORD(v3) = sub_918(a1);
  LODWORD(v3) = sub_A62((int *)a1, (__int64)a2);
  if ( v3 == 0x2400000001 )
    return 0x1337;
  else
    return 0;
Argument 1

Inside sub_918 we find a simple XOR check:

  strcpy(key, "*#37([@AF+ .  _YB@3!-=7W][C59,>*@U_Zpsumloremips");
  strcpy(lipsum, "loremipsumloremipsumloremipsumloremipsumloremips");
  a1_len = strlen(a1);
  v6 = 0;
  for ( i = 0; i < a1_len; ++i )
  {
    if ( ((unsigned __int8)lipsum[i] ^ (unsigned __int8)a1[i]) == key[i] )
      ++v6;
  }
  return v6;

Computing the result of key ^ lipsum, we find the string FLARE2023FLARE2023FLARE2023FLARE2023. With a length of 36 characters, this fits the > 8 and < 48 constraints.

Argument 2

Now to figure out the second argument. Looking into sub_A62, we can break it down further.

__int32 __fastcall sub_A62(char *a1, char *a2)
{
  int v2; // ecx
  char v4[49]; // [rsp+10h] [rbp-40h] BYREF
  int v5; // [rsp+4Ch] [rbp-4h]

  memset(v4, 0, sizeof(v4));
  v2 = strlen(a2);
  v5 = sub_5E1(a2, v2, v4);
  if ( (v5 & 7) != 0 )
    return 0;
  sub_4AF(v4, v5, *(_DWORD *)a1);
  return sub_893(a1, v4, 48);
}

We can readily identify sub_5E1 as base64_decode; it starts off with a tell-tale v24 = 3 * (a2 / 4) output length computation (i.e., each input byte leads to 6 output bits) and proceeds to strip off the padding characters (=).

The final sub_893 is also trivial: it compares 48 bytes in v4 to a1. Recall that a1 is the string FLARE2023.... In the main function of hvm.exe, we can verify that the bytes at 0xFC00 following the user input are initialized to zero.

The function sub_4AF is where the magic happens, though. After a little cleaning, it looks as follows:

void __fastcall sub_4AF(char *arg2, int arg2_len, int arg1)
{
  int v3[16]; // [rsp+10h] [rbp-A0h] BYREF
  int v4[16]; // [rsp+50h] [rbp-60h] BYREF
  __int64 *state; // [rsp+98h] [rbp-18h]
  int v6; // [rsp+A4h] [rbp-Ch]
  int j; // [rsp+A8h] [rbp-8h]
  int i; // [rsp+ACh] [rbp-4h]

  memset(v4, 0, sizeof(v4));
  for ( i = 0; i <= 15; ++i )
    v3[i] = arg1;
  sub_A7(v4, v3);
  v6 = arg2_len / 8;
  state = (__int64 *)arg2;
  for ( j = 0; j < v6; j += 2 )
    sub_421(&state[j], &state[j + 1], (__int64)v4);
}

The local variable v3 is initialized based on the first user-supplied argument, and then passed into sub_A7, which we can identify to be the Salsa20 permutation. Variable v4 contains the output of Salsa20.

It took me longer than I care to admit to realize that v3 is not initialized with the entire first argument, but simply repeats the first four bytes (FLAR). I ended up re-implementing the Salsa20 permutation in Python and using Unicorn to verify that it resulted in the same output before realizing what was going on. So far so good, though: we can compute this value based on arg1.

The resulting byte sequence is:

026124f56d840c78fafa18a3b91c245fb91c245f026124f56d840c78fafa18a3
fafa18a3b91c245f026124f56d840c786d840c78fafa18a3b91c245f026124f5

The function sub_421 is where the second argument enters the mix.

void __fastcall sub_421(__int64 *a1, __int64 *a2, int *a3)
{
  __int64 v4; // [rsp+18h] [rbp-10h]
  int i; // [rsp+24h] [rbp-4h]

  for ( i = 7; i >= 0; --i )
  {
    v4 = *a1;
    *a1 ^= sub_3D1(*a2, i, (__int64 *)a3);
    *a2 = v4;
  }
}

This function appears to be some kind of XOR-loop; sub_3D1 computes *a2 ^ a3[i]. Note that a3 is the output of Salsa permutation, and a1 starts off pointing at arg2.

Recall that we want the resulting output of the loop in sub_4AF to be equal to arg1 in order to pass the check in sub_A62. That means we will have to invert the XOR-loop to work towards the value of arg2.

In an attempt to better understand what is going on here, I re-implement sub_421 in Python as follows. Using Unicorn, we can again verify that it matches the original code.

def xorloop(x, y, z):
    zz = [int.from_bytes(z[i : i + 8], byteorder="little") for i in range(0, 64, 8)]
    x = int.from_bytes(x[:8], byteorder="little")
    y = int.from_bytes(y[:8], byteorder="little")

    for i in range(7, -1, -1):
        t = x
        x ^= y ^ zz[i]
        y = t

    return (
        int.to_bytes(x, length=8, byteorder="little"),
        int.to_bytes(y, length=8, byteorder="little"),
        z,
    )

This is where it gets ugly. As I did not immediately see a way to cleanly invert the loop, I decided to unroll it. We can rely on the XOR operation to cancel terms along the way. The general expression is

\begin{align} x_{i} &= x_{i-1} \oplus y_{i-1} \oplus z_{7-i} \\ y_{i} &= x_{i-1} \\ \end{align}

… which unrolls to …

\begin{align} x_0 &= x\\ y_0 &= y\\ \\ x_1 &= x_0 \oplus y_0 \oplus z_7 \\ &= x \oplus y \oplus z_7 \\ y_1 &= x_0 = x\\ \\ x_2 &= x_1 \oplus y_1 \oplus z_6 \\ &= (x \oplus y \oplus z_7) \oplus x \oplus z_6\\ &= y \oplus z_7 \oplus z_6 \\ y_2 &= x_1 = x \oplus y \oplus z_7\\ \\ x_3 &= x_2 \oplus y_2 \oplus z_5 \\ &= (y \oplus z_7 \oplus z_6) \oplus (x \oplus y \oplus z_7) \oplus z_5\\ &= x \oplus z_6 \oplus z_5 \\ y_3 &= x_2 = y \oplus z_7 \oplus z_6 \\ \\ x_4 &= x_3 \oplus y_3 \oplus z_4 \\ &= (x \oplus z_6 \oplus z_5) \oplus (y \oplus z_7 \oplus z_6) \oplus z_4\\ &= x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \\ y_4 &= x_3 = x \oplus z_6 \oplus z_5\\ \\ x_5 &= x_4 \oplus y_4 \oplus z_3\\ &= (x \oplus y \oplus z_7 \oplus z_5 \oplus z_4) \oplus (x \oplus z_6 \oplus z_5) \oplus z_3\\ &= y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \\ y_5 &= x_4 = x \oplus y \oplus z_7 \oplus z_5 \oplus z_4\\ \\ x_6 &= x_5 \oplus y_5 \oplus z_2\\ &= (y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3) \oplus (x \oplus y \oplus z_7 \oplus z_5 \oplus z_4) \oplus z_2\\ &= x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2 \\ y_6 &= x_5 = y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3\\ \\ x_7 &= x_6 \oplus y_6 \oplus z_1\\ &= (x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2) \oplus (y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3) \oplus z_1\\ &= x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1 \\ y_7 &= x_6 = x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2\\ \\ x_8 &= x_7 \oplus y_7 \oplus z_0\\ &= (x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1) \oplus (x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2) \oplus z_0\\ &= y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \oplus z_1 \oplus z_0\\ y_8 &= x_7 = x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1\\ \end{align}

Thus, we find that the xor-loop effectively computes:

\begin{align} x_8 = y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \oplus z_1 \oplus z_0\\ y_8 = x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1\\ \end{align}

By shuffling the XOR operations around, we can formulate the inverse operation, as well:

\begin{align} y = x_8 \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \oplus z_1 \oplus z_0\\ x = y \oplus y_8 \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1\\ \end{align}

Taking a look at sub_4AF again, we see that the separate DWORDs in the state variable are touched in pairs of two, and do not interleave.

for ( j = 0; j < v6; j += 2 )
  sub_421(&state[j], &state[j + 1], (__int64)salsa_output);

This means we can simply invert the order of the loop to invert its effect. Initializing the state to arg1 and using the Salsa20 output computed before, we derive the desired state before the XOR operations:

cc16294c151626f7fd3141f82ad718bfbb1d51550f723382
894e46e62eb76dbfe74364096f0821eb894d32996fe55de0

Base64-encoding this bytestream leads to the required arg2 input:

zBYpTBUWJvf9MUH4KtcYv7sdUVUPcjOCiU5G5i63bb/nQ2QJbwgh64lNMplv5V3g
Getting the flag

We’re doing this statically, so we cannot simply run the executable and supply the arguments to obtain the flag. Instead, we get the flag by computing the XOR of arg2 and the hardcoded value in the data section at address 0x1400144B0. We find c4n_i_sh1p_a_vm_as_an_exe_ask1ng_4_a_frnd

13: Y0da

TLDR: re-use the deobfuscator from challenge 5; point it at all thread entrypoints. Decompile. Use HashDB to identify API calls. Fix API calls by creating an enum. Find gimmie_advic3 and gimmie_s3cr3t commands. Identify MD5 hash aa321932ccdd8cce334a1c3354eed3b1. Google it. Find password patience_y0u_must_h4v3. Find fake flag . Cry. Obtain base32 strings. Fix string by making input to Mersenne Twister constant. Zero the output of the RNG entirely. Decode base32 using custom alphabet. .. ROP chain? What ROP chain?

So close to the end you can almost taste it. Keep going champ, complete this challenge and take your place as one of the Reverse Engineers of All-Time.