2023-11-13 — last updated 2024-06-26
With the official write-ups and countless community blog posts out there, much has already been said. Still, this blog post serves as a way for me to document my own process and share some notes. For brevity, a bunch of hitting-my-head-against-the-wall was omitted.
As I did not take detailed notes while Flare-on 10 was running, I will try to reconstruct my paths through the challenges as time and memory permits. This page contains write-ups for the following challenges:
TLDR: click the button 42 times.
Welcome to the 10th Annual Flare-On Challenge!
Statistically, you probably won’t finish every challenge. Every journey toward excellence starts somewhere though, and yours starts here. Maybe it ends here too.
This package contains many files and, I can’t believe i’m saying this, click the one with the “.exe” file extension to launch the program. Maybe focus your “reverse engineering” efforts on that one too.
The first challenge is the classic liveness check. Starting the executable shows a big cross, and a bunch of buttons that we can click to interact with it; presumably we need to find the right 2-digit code and press the unlock button.
X.exe appears to be a plain MSVC/C++ executable, but all bundled DLLs suggest there’s some .NET going on here. In addition to X.exe
, there’s also an X.dll
that seems interesting. The naming suggests it’s not library code, and Detect It Easy identifies it as .NET.
After loading the .dll in DNSpy and clicking around a bit, I quickly succumbed to frustration of not finding the relevant code (it seems I overlooked monogame.Game1
entirely) and just decide to click the buttons a bunch of times. I should’ve known: the number 42 leads to the flag.
TLDR: find string AES-CBC and IV. Build the AES key using string slices. Decrypt iv.png.
The FLARE team is now enthusiastic about Google products and services but we suspect there is more to this Android game than meets the eye.
This challenge consists of a single APK file called ItsOnFire.apk
. Straight into Android on the second challenge; might as well get it over with.
Without a ready-to-go setup for Android analysis, I scramble a bit and install Android Studio to have a look at what we’re dealing with. I play a bit of Space Invaders.
For static analysis, I open the APK using jadx-gui
. It appears the APK is obfuscated; all names were replaced with meaningless placeholders. I start at the top and stop when I find something of interest; in f.b
, I identify code that performs CRC32, and there’s symmetric crypto going on. Two of the functions retrieve various values from the APK’s resources
private final File c(int i2, Context context) {
Resources resources = context.getResources();
Intrinsics.checkNotNullExpressionValue(resources, "context.resources");
byte[] e2 = e(resources, i2);
String d2 = d(context);
Charset charset = Charsets.UTF_8;
byte[] bytes = d2.getBytes(charset);
Intrinsics.checkNotNullExpressionValue(bytes, "this as java.lang.String).getBytes(charset)");
SecretKeySpec secretKeySpec = new SecretKeySpec(bytes, context.getString(R.string.ag));
String string = context.getString(R.string.alg);
Intrinsics.checkNotNullExpressionValue(string, "context.getString(R.string.alg)");
String string2 = context.getString(R.string.iv);
Intrinsics.checkNotNullExpressionValue(string2, "context.getString(\n … R.string.iv)");
byte[] bytes2 = string2.getBytes(charset);
Intrinsics.checkNotNullExpressionValue(bytes2, "this as java.lang.String).getBytes(charset)");
byte[] b2 = b(string, e2, secretKeySpec, new IvParameterSpec(bytes2));
File file = new File(context.getCacheDir(), context.getString(R.string.playerdata));
FilesKt__FileReadWriteKt.writeBytes(file, b2);
return file;
}
This appears to build some sort of crypto object using R.string.ag
and R.string.alg
to identify the algorithm, and R.string.iv
as an IV. I find out that Android apps store their strings in res/values/strings.xml
, and identify the following strings:
<string name="alg">AES/CBC/PKCS5Padding</string>
<string name="ag">AES</string>
<string name="iv">abcdefghijklmnop</string>
So, AES it is, then! There’s also a bunch of strings regarding days of the week and a couple URLs, and I resist the urge to browse to them (.. briefly – there’s a Youtube URL that’s not clearly Rick Astley, and I find out it’s Working for the Weekend by Loverboy). More carefully reading through the function, it appears that the function d
is responsible for generating the AES key.
private final String d(Context context) {
String slice;
String string = context.getString(R.string.c2);
Intrinsics.checkNotNullExpressionValue(string, "context.getString(R.string.c2)");
String string2 = context.getString(R.string.w1);
Intrinsics.checkNotNullExpressionValue(string2, "context.getString(R.string.w1)");
StringBuilder sb = new StringBuilder();
sb.append(string.subSequence(4, 10));
sb.append(string2.subSequence(2, 5));
String sb2 = sb.toString();
Intrinsics.checkNotNullExpressionValue(sb2, "StringBuilder().apply(builderAction).toString()");
byte[] bytes = sb2.getBytes(Charsets.UTF_8);
Intrinsics.checkNotNullExpressionValue(bytes, "this as java.lang.String).getBytes(charset)");
long a2 = a(bytes);
StringBuilder sb3 = new StringBuilder();
sb3.append(a2);
sb3.append(a2);
String sb4 = sb3.toString();
Intrinsics.checkNotNullExpressionValue(sb4, "StringBuilder().apply(builderAction).toString()");
slice = StringsKt___StringsKt.slice(sb4, new IntRange(0, 15));
return slice;
}
The function loads two strings and performs a bit of slicing to extract 16 characters. Let’s grab the relevant resources.
<string name="c2">https://flare-on.com/evilc2server/report_token/report_token.php?token=</string>
<string name="w1">wednesday</string>
Ah, so this is where the URL comes into play! I replicate the functionality in Python, armed with the knowledge that function a
is just a wrapper around CRC32.
from binascii import crc32
string = "https://flare-on.com/evilc2server/report_token/report_token.php?token="
string2 = "wednesday"
sb = string[4:10] + string2[2:5]
a2 = crc32(sb.encode('utf-8'))
s = str(a2) + str(a2)
print(s[:16])
This outputs 4508305374508305
(and it has me double-checking whether that’s a 16-byte character string, or an integer).
Okay, now we have AES with a key (4508305374508305
) and IV (abcdefghijklmnop
), but no ciphertext to decrypt yet. Clicking through the resources, there’s an odd iv.png
and ps.png
in res/raw
; these are not valid PNG files. Could it be..?
Exporting a resource is done by clicking Save as gradle project in the JADX interface. Turning to CyberChef, we can decrypt the images using AES in CBC/NoPadding mode.
TLDR: decompile, step along in a debugger. Gradually work through the stages, tweak input until all checks are satisfied.
This is one of those ones that will work under the right circumstances, Steve. May I call you Steve? Before you take to twitter complaining about a broken challenge, think about how you can be the change you want to see in the world, Steve.
I hope this is the only aimbot on your system. Twitch streaming probably pays better than being a mediocre reverse engineer though.
TODO
I wish we had more easy challenges for you this year for you to rack up your personal high score. It wouldn’t help you improve but it would feel great and that is what is most important.
TODO
TLDR: run the game in DOSBox-X. Fiddle a bit with the built-in debugger. Give up. Analyze the 16-bit assembly statically. Decrypt strings. Find keyboard input check. Note that file is self-modifying after 128 levels. Patch binary to always set correct input, run until level 128. Run modified file; no success. Note Konami code input check on start screen and that it affects RNG setup. Press Konami code, see blinking screen. Run again. Run modified file.
You’re doing great champ! This challenge is a modern (and retro) take on the classic game Simon Says. The cool thing about this one though is it will make you give up, and maybe take that sales job after all. It’s better for everybody that way.
TLDR: play snake for a bit. Run strings, find d3m0_c0nf.txt file name and XOR key. Decode config file. Set starting score to 10000, run into errors. Try to set up Nuitka to understand how to do injection. Fail miserably. Use Cheat Engine](https://www.cheatengine.org) to set Oliver’s score to 1. Overtake Oliver.
Subject: need your help... Oliver Shrub, Gardens Department Head, keeps bragging about his high score for this rip off Snake game called Flake. I'm pretty sure he found a way to cheat the game because there is no way it's possible to score 10,000 points...I mean the game ships with a sketchy TXT file so there must be something fishy going on here. Can you look at the game and let me know how I can beat Oliver's high score? Best, Nox Shadows Department Head
Our customer recently found the following malware executing on one of their machines. The system was left with a very silly looking wallpaper, and a lot of executables on the machine had stopped working. The customer had successfully restored from backup, but we are still interested in understanding all capabilities of the malware. A packet capture file has been provided to aid your analysis.
TODO
TLDR: use bochs debugger to start disk image. Identify deobfuscation, set breakpoints. Figure out keyboard input loop. Find key validation. Compute first 12 characters (XOR constant with 0x55
). Use Unicorn to emulate the code and brute force the remaining 2 bytes.
You’re doing so great! Go out and celebrate. Take a day off kid, you’ve earned it. Watch your scoreboard position fall like the sand through the hourglass. Avoid this VM and feel the joy the outside world has to offer. Or crush this one and earn even more internet points, those will come in handy.
TLDR: boot pdp11 using SIMH. Use rawtap to extract binary from tape. Learn some Forth. Try to debug using built-in adb debugger. Parse the PDP11 symbol table to label functions. Distinguish between native functions and Forth code. Statically figure out that decrypt
is a xor-loop (use p/q2-q4!
as key) and decode
is some accumulator loop. Re-implement in Python.
Did you know the PDP-11 and the FOURTH programming language were both released 53 years ago, in 1970? We like to keep our challenges fresh here at Flare-On, in-line with the latest malware trends.
TLDR: identify XOR and ChaCha20. Decrypt RSA public key. Attack given ciphertext using Coppersmith and stereotyped message.
I’m told this one is easy if you are really good. Based on your solve times so far Google Bard predicts your performance will be: “1 out of 5 stars. I’d give it 0 stars if I could. Food arrived over an hour late, covered in oil. I wouldn’t feed it to my dog”
We’re presented with a 64-bit PE file and a file called very_important_file.d3crypt_m3
. Supposedly this is the encrypted flag; the file contains the following 317 bytes.
3D77B35D ADDBF4F9 CB95A20D 0BA0055E 03AAD1ED 96AA9BA6 7A5D1D14 106A2F7E A5C06133 18D51971 A6A759E9 0433F557 7557BC16 1AB77C85 C7289176 591336E2 80428040 94B2BF03 051257AA AABA7EBA 3E3DD6FA CFF7E3AB DD571E9D 2E2D2C84 F512C014 3B27207A 3EAC0EF9 65A23F4F 4864C7A1 CEB913CE 1803DBA0 2FEB1B56 CD8EBE16 656ABAB2 22E8EDCA 8E9C0DDA 17C370FC E72FE7F6 909EED1E 6B02E92E BF720BA6 051FD7F6 69CF309B A5467C1F B5D7BB2B 7AECA07F 11A57574 6C1047EA 35CC3CE2 46AC0861 F0778880 D18B71FB 2A8D7A73 6A646CF9 9B3DCEC3 62D41341 4BEB9F01 815DB7F7 2F6E081A EE91F191 572A28B9 576F6C53 2349F823 5B6DAF31 B39B5ADD 7ADE0CFB D30F704E B83D983C 215DE326 1F735658 43539F6B B46C9457 DF16E807 449F99F3 DABDDDD5 764FD63D 09BC9C4E 6844EC34 10DC821A B4
Opening up the executable in IDA and clicking our way through the main function, we note that it takes a single path as a command-line argument. Then, two threads are initialized. It appears the functions sub_14007FD20
and sub_14007EE60
serve as the entrypoints for these threads. Let us refer to these threads as thread A and thread B, and the functions as main_A
and main_B
, respectively. It appears the file path is passed as an argument to main_A
; sub_14007EDC0
is responsible for starting the thread with its newly populated context.
In main_A
, we notice the string ".3ncrypt_m3"
, as well as references to recursive_directory_iterator
. In general we see several calls to sub_14007E310
with informative strings. This function seems responsible for raising errors, and provides us with useful context.
Digging further into the function, we see comparisons to characters 92 (\
) and 47 (/
). The loop right after shows a comparison with the .
character. Surely we’re looking at some sort of file path handling.
Sure enough, creating a file with the extension .3ncrypt_m3
and passing the path to the directory it is in as an argument results in a .d3crypt_m3
file. It appears our binary is an encryption routine; my 3-byte text file was replaced by a 259-byte file containing seemingly random data. The 256 bytes difference is the tell-tale sign of a 2048-bit RSA ciphertext!
Switching over to main_B
, this appears to be where the crypto happens. The constant string expand 32-byte k
reveals that we’re likely dealing with a Salsa variant. sub_14007F8F0
contains the permutation routine, which we identify as ChaCha20.
As before, static strings identify library functions for us: the input to sub_140089500
appears to relate to error reporting, and helps us label sub_140086A00
as a random number generator originating from crypto\rand\rand_lib.c
. Thus, the ChaCha20 state consists of 48 random bytes, followed by the static strings expand 32-byte k
.
Backing up a bit, we note that the function starts by loading a large array of 451 bytes, and deobfuscating it using a simple XOR loop. Starting the binary in a debugger and setting a breakpoint just afterwards allows us to read the decrypted string:
-----BEGIN PUBLIC KEY-----
MIIBIDANBgkqhkiG9w0BAQEFAAOCAQ0AMIIBCAKCAQEAycMwco9oCHr8YKEz5Jud
PeSfD/mZXF4S5cZcEYl7xxjj5NJy1aWM5GN1WyxjRn8NCfk8Mctn/jGICa9/yLLI
xyGrVHzk22Pb3/9dmwbIV5n97mkPkMR5xtC546P2blXWMCnOWgLvhMaq3F4iQWgw
JMxl11ZCr+C6vnbymmd86xWb5IuzJl69K9UZoq9+A2zC5kAcN1VXYagcPR0opFbD
i5G1WQNb/wE92gQ5BTuelvSyePcZ6Tnmd9BYvG6YAFr/IwgUpJerNLf6kCtmbRgN
6E4k6Q91PXnbC3IXrLXEb00apWvuVz8tR6Qzfd0eK5Z+3HA4/usJDex0ktlNlom7
YQIBAw==
-----END PUBLIC KEY-----
Looking a bit more closely at the functions that operate on this public key, we see several references to crypto\bio\bio_lib.c
in sub_1400860E0
. Each reference is accompanied by an integer; in this function, we observe the integers 76, 89 and 95 – perhaps these are line numbers. By having a look at the openssl source code, we identify BIO_new_ex, which indeed has several error-cases. It’s not an exact match, as only one case results in a call to ERR_raise
, but the resemblance is there; perhaps we’re looking at a slightly refactored version (indeed, viewing a random commit from 2021 shows more calls to ERR_raise
).
We can further verify this suspicion by looking at xrefs to the supposed BIO_new_ex
. It is called from sub_140087DF0
, which shows a reference to crypto\bio\bss_mem.c
, line 89. That’s right in the middle of BIO_new_mem_buf
, which indeed calls BIO_new
; a thinly veiled wrapper around BIO_new_ex
. Closer inspection suggests the compiler inlined the call into BIO_new
, as the ->libctx
member is not initialized.
BIO *BIO_new(const BIO_METHOD *method)
{
return BIO_new_ex(NULL, method);
}
Armed with this new knowledge, we quickly identify the functions that construct the public key object, and trace it to sub_140085470
. This supposedly calls the RSA encryption routine. The lines above it reveal that the plaintext consists of 24 random bytes and the initial ChaCha20 state, preceded by 176 null bytes. The random bytes make an appearance in the encryption loop: they’re used as an XOR key in addition to the output of the ChaCha20 keystream generator.
Decoding the public key reveals the following RSA parameters:
n = c9c330728f68087afc60a133e49b9d3de49f0ff9995c5e12e5c65c11897bc718
e3e4d272d5a58ce463755b2c63467f0d09f93c31cb67fe318809af7fc8b2c8c7
21ab547ce4db63dbdfff5d9b06c85799fdee690f90c479c6d0b9e3a3f66e55d6
3029ce5a02ef84c6aadc5e2241683024cc65d75642afe0babe76f29a677ceb15
9be48bb3265ebd2bd519a2af7e036cc2e6401c37555761a81c3d1d28a456c38b
91b559035bff013dda0439053b9e96f4b278f719e939e677d058bc6e98005aff
230814a497ab34b7fa902b666d180de84e24e90f753d79db0b7217acb5c46f4d
1aa56bee573f2d47a4337ddd1e2b967edc7038feeb090dec7492d94d9689bb61
e = 3
With $e = 3$ and so many null bytes in the plaintext, this may only be a matter of performing a cube-root attack; with a small exponent, $m^e\bmod N$ could be equal to $m^e \in \mathbb{N}$. Unfortunately, with 88 bytes set we get $m \approx 2^{2112}$. This significantly exceeds $N \approx 2^{2048}$.
As we know part of the message, we can reformulate the problem slightly and apply Coppersmith’s attack on stereotyped messages, as follows. Refer to this paper or this Youtube video by David Wong for more details.
We define $m_0$ to be known part of the message (i.e., expand 32-byte k
), and try to find $x_0$ such that $m = m_0 + x_0 \cdot 2^{128}$. Note that the multiplication by $2^{128}$ accounts for the fact that $x_0$ does not need to include the 16 known bytes defined in $m_0$.
Let the ciphertext be $c = m^e\bmod N$, and define the polynomial $f$:
\begin{align} f(x) = (m_0 + x \cdot 2^{128})^e - c \mod{N} \end{align}
Finding $x_0$ is now equivalent to finding $x$ such that $f(x) = 0 \bmod{N}$. Within the bounds set out by Coppersmith’s Theorem, Coppersmith’s algorithm provides a way to solve this problem efficiently.
Coppersmith’s Theorem. Let $N$ be an integer which has a divisor $b \ge N^\beta$ with $0 < \beta \le 1$. Let $f(x)$ be a univariate monic polynomial of degree $\delta$ and let $\gamma \ge 1.$ Then we can find all solutions $x_0$ such that $f(x_0) = 0 \bmod{b}$ with $|x_0| \le \gamma \cdot N^{\frac{\beta^2}{\delta}}$, in time $\mathcal{O}(\gamma\delta^5\log^9(N))$.
We can simplify this by letting $\beta = 1$ and $\gamma = 1$, and we observe that the degree of $f$ is $e = 3$. The promise of Coppersmith’s algorithm is thus to deliver us all candidates for $x_0$ where $|x_0| \le N^{\frac{1}{3}}$. With $N \approx 2^{2048}$, $N^{\frac{1}{3}} \approx 2^{683}$. As $x_0$ will be 72 bytes (= 576 bits), we know such a solution to exist.
Coppersmith’s algorithm is implemented in sage as the small_roots()
method. We find $x_0$ as follows:
sage: c = 0x1336E28042804094B2BF03051257AAAABA7EBA3E3DD6FACFF7E3ABDD571E9D2E2D2C84F512C0143B27207A3EAC0EF965A23F4F4864C7A1CEB913CE1803DBA02FEB1B56CD8EBE16656ABAB222E8EDCA8E9C0DDA17C370FCE72FE7F6909EED1E6B02E92EBF720BA6051FD7F669CF309BA5467C1FB5D7BB2B7AECA07F11A575746C1047EA35CC3CE246AC0861F0778880D18B71FB2A8D7A736A646CF99B3DCEC362D413414BEB9F01815DB7F72F6E081AEE91F191572A28B9576F6C532349F8235B6DAF31B39B5ADD7ADE0CFBD30F704EB83D983C215DE3261F73565843539F6BB46C9457DF16E807449F99F3DABDDDD5764FD63D09BC9C4E6844EC3410DC821AB4
sage: N = 0xc9c330728f68087afc60a133e49b9d3de49f0ff9995c5e12e5c65c11897bc718e3e4d272d5a58ce463755b2c63467f0d09f93c31cb67fe318809af7fc8b2c8c721ab547ce4db63dbdfff5d9b06c85799fdee690f90c479c6d0b9e3a3f66e55d63029ce5a02ef84c6aadc5e2241683024cc65d75642afe0babe76f29a677ceb159be48bb3265ebd2bd519a2af7e036cc2e6401c37555761a81c3d1d28a456c38b91b559035bff013dda0439053b9e96f4b278f719e939e677d058bc6e98005aff230814a497ab34b7fa902b666d180de84e24e90f753d79db0b7217acb5c46f4d1aa56bee573f2d47a4337ddd1e2b967edc7038feeb090dec7492d94d9689bb61
sage: e = 3
sage: m_0 = int.from_bytes(b'expand 32-byte k', byteorder='big')
sage: P.<x> = PolynomialRing(Zmod(N))
sage: f = ((m_0 + x * 2^128)^e - c).monic()
sage: f.small_roots(X=2^576)
[6730722853673567334832243774533669260003790577373139776704376326794058404121535292812433356801604068239665891105595607303018172210866035032633357162971325439700251947045071]
Note that we make $f$ monic to satisfy the requirements of Coppersmith’s theorem; dividing $f$ by its leading coefficient does not change the roots. Furthermore, remark that small_roots()
can take an upper bound X
as an argument, which we have readily available. Finally, recall that $c$ is the last 256 bytes from the .d3crypt_m3
file.
Now all that remains is decrypting the flag using ChaCha20 and the XOR loop. Rather than reimplement the ChaCha permutation, we can lift it from the original code and run it using Unicorn.
from itertools import chain, repeat
import math
from unicorn import Uc, UC_ARCH_X86, UC_MODE_64
from unicorn.x86_const import (
UC_X86_REG_RDX,
UC_X86_REG_R8,
UC_X86_REG_RSP,
)
root = 6730722853673567334832243774533669260003790577373139776704376326794058404121535292812433356801604068239665891105595607303018172210866035032633357162971325439700251947045071
root_as_bytes = int.to_bytes(root, length=72, byteorder="big")
xorkey = root_as_bytes[:24]
state = root_as_bytes[24:] + b"expand 32-byte k"
with open("very_important_file.d3crypt_m3", "rb") as f:
ciphertext = f.read()[:-256]
CHACHA_X64 = bytes.fromhex(
"4c89442418488954241048894c2408555356574154415541564157488bec4883ec680f100248c745f00a0000000f104a100f1145980f1042200f114da8448b55b40f104a30448b4db0448b65ac448b5da80f114dc8448b75d48b75d08b7dcc448b45c80f1145b8448b6dbc8b5db866908b45984103c38bd04133d0c1c2108d0c1a448bc14533c341c1c00c418d1c00448bfb4433fa41c1c708418d040f8945dc4133c0c1c0078945e08b459c4103c48bd033d7c1c210428d0c2a448bc14533c441c1c00c418d3c00448be74433e241c1c408418d040c8b4dc0448bd88945e48b45a04533d84103c141c1c3078bd033d6c1c21003ca448bc14533c141c1c00c418d3400448bee4433ea8b55c441c1c508468d0c298b4da44103ca418bc14433f14133c041c1c6104103d6c1c0074433d24489754841c1c20c894560418bc2458d340a418bce334d48c1c108448d1411418bd233d0418d041bc1c2078955d88bd033d1c1c210428d0c0a448bc14533c3448b4d60448b5de041c1c00c4103c089459833c2c1c0088945488945d403c18945c04133c0c1c0078945e88945ac418d04398bc84133cfc1c110428d1411448b55d84433ca41c1c10c458d04014489459c4433c141c1c008418d04108b55dc4433c88945c4418d043241c1c1078bc844894db04133cc448b65e8c1c11003d14433d241c1c20c418d3c02438d0433897da0448b754833f9c1c7088bc84133cdc1c1108d1c3a8b55e403d14433d34433da41c1c20741c1c30c448955b4418d34038975a433f1c1c608448d2c324533dd41c1c30748836df00144895da80f851ffeffff488b4d584c8d5598895db84c8d5d9c448945c8488d5da04c8b45504c2bd1498bd08975d0482bd1897dcc4c2bd944896dbc482bd9488d410441b904000000908b4c02fc41034c02fc8948fc8b0c1042030c108908418b0c03034c02048948048b0c03034c0208894808488d40104983e90175cc418b48208d41014189402085c9750441ff40244883c468415f415e415d415c5f5e5b5d"
)
CODE_ADDR = 0x1000000
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(CODE_ADDR, 2 * 1024 * 1024)
mu.mem_write(CODE_ADDR, CHACHA_X64)
def chacha_stream(state):
STATE_ADDR = CODE_ADDR + 0x1000
OUT_ADDR = CODE_ADDR + 0x2000
STACK_ADDR = CODE_ADDR + 0x3000
mu.mem_write(STATE_ADDR, state)
mu.reg_write(UC_X86_REG_RSP, STACK_ADDR)
mu.reg_write(UC_X86_REG_RDX, STATE_ADDR) # state
mu.reg_write(UC_X86_REG_R8, OUT_ADDR) # output
mu.emu_start(CODE_ADDR, CODE_ADDR + len(CHACHA_X64))
return mu.mem_read(STATE_ADDR, 64), mu.mem_read(OUT_ADDR, 64)
stream = bytearray()
for _ in range(math.ceil(len(ciphertext) / 64)):
state, output = chacha_stream(state)
stream += output
xorstream = chain.from_iterable(repeat(xorkey))
for a, b, c in zip(ciphertext, stream, xorstream):
print(chr(a ^ b ^ c), end="")
Bingpot. Wa5nt_th1s_Supp0s3d_t0_b3_4_r3vers1nG_ch4l1eng3@flare-on.com
TLDR: write code to decrypt RC4-encrypted functions. Note two input parameters. XOR hardcoded strings to get FLARE2023FLARE2023FLARE2023FLARE2023
as arg1. Figure out Salsa20 keystream; re-implement using Unicorn. Invert the XOR-loop function through tedious manual labor. Input the result as arg2.
This is the second smallest challenge this year! If only that mattered.
When we simply run the binary in a virtual machine, we’re presented with an error:
[-] OS/CPU feature not enabled
Opening the binary in IDA, we quickly find out where that error comes from. The first thing the main function does is call WHvGetCapability(WHvCapabilityCodeHypervisorPresent, ...)
, which I learn to be a way to check whether Hyper-V is available. It.. is not. After a brief foray into VMWare configuration, I set out to approach this challenge statically.
Some clean-up later, the following snippets suggest we’re supposed to supply two arguments with lengths between 9 to 47 and 25 to 64 …
// ...
len_arg1 = strlen(argv[1]);
len_arg2 = strlen(argv[2]);
if ( len_arg1 > 8 && len_arg1 < 48 )
{
if ( len_arg2 > 24 && len_arg2 < 65 )
// ...
… the latter of which will be used to decrypt the flag!
// ...
if ( should_be_x1337 == 0x1337 )
{
qmemcpy(xored_flag, &xored_flag_res, 42ui64);
for ( i = 0; i < 41; ++i )
printf("%c", argv[2][i] ^ (unsigned int)xored_flag[i]);
printf("@flare-on.com\n");
}
The majority of the main function seems to be concerned with setting up and interacting with the Hyper-V environment. We can identify a function that initializes registers (sub_140001070
), a function that reads RIP
, R8
and R9
(sub_140001310
) and uses them to derive arguments for RC4 (sub_140001730
), and a function that reads RAX (sub_1400013E0
). From the above snippet, we’ve figured out that RAX will need to be 0x1337
for us to get the flag.
All of this happens in a loop. The virtual processor is started (WHvRunVirtualProcessor
), and runs until it hits an interrupt. At that point the exit reason is checked. Depending on the specific exit code, RC4 is called to decrypt the data pointed to by RIP using the eight-byte key in R8 and the data length in R9.
Setting the appropriate type for the ExitContext variable (WHV_RUN_VP_EXIT_CONTEXT
) provides some more insight into what these exit reasons relate to.
// ...
loop_flag = 1;
should_be_x1337 = 0;
while ( loop_flag )
{
if ( WHvRunVirtualProcessor(Partition, 0, &ExitContext, 0xE0u) >= 0 )
{
ExitReason = ExitContext.ExitReason;
if ( ExitContext.ExitReason == WHvRunVpExitReasonX64IoPortAccess )
{
get_RIP_R8_R9(Partition, RIP_R8_R9);
if ( (*(_BYTE *)(&ExitContext.ApicEoi + 5) & 1) != 0 )
RC4(buf, RIP_R8_R9[0] - 16 - RIP_R8_R9[2], RIP_R8_R9[2], (_BYTE *)RIP_R8_R9[1]);
else
RC4(buf, RIP_R8_R9[0] + 2, RIP_R8_R9[2], (_BYTE *)RIP_R8_R9[1]);
increment_RIP_with_2(Partition);
}
else if ( ExitReason == WHvRunVpExitReasonX64Halt )
{
should_be_x1337 = get_RAX(Partition);
loop_flag = 0;
}
else
{
loop_flag = 0;
}
}
}
// ...
But where does the code live? And what’s in buf
? It appears that’s what sub_140001440
takes care of.
__int64 __fastcall sub_140001440(_QWORD *a1)
{
DWORD Size; // [rsp+20h] [rbp-38h]
HMODULE hModule; // [rsp+28h] [rbp-30h]
HRSRC hResInfo; // [rsp+30h] [rbp-28h]
HGLOBAL hResData; // [rsp+38h] [rbp-20h]
const __m128i *Src; // [rsp+40h] [rbp-18h]
hModule = GetModuleHandleA(0i64);
hResInfo = FindResourceA(hModule, (LPCSTR)0x85, (LPCSTR)0x100);
hResData = LoadResource(hModule, hResInfo);
Size = SizeofResource(hModule, hResInfo);
Src = (const __m128i *)LockResource(hResData);
return memcpy(*a1, Src, Size);
}
The above function calls FindResourceA(hModule, 0x85, 0x100)
to get the resource with ID 0x85
(133) and resource type 0x100
(256). Opening the binary in CFF Explorer shows that indeed, there is exactly one resource, and its ID is 133. We can export the data to a file for closer inspection.
BC 00 80 FA 0F 01 16 26 0D 0F 20 C0 66 83 C8 01
0F 22 C0 EA 18 00 08 00 66 B8 10 00 8E D8 8E E0
8E E8 8E D0 E8 0E 00 00 00 0F 01 15 44 0D 00 00
EA F2 0C 00 00 08 00 BF 00 30 00 00 0F 22 DF 31
C0 B9 00 10 00 00 F3 AB 0F 20 D8 C7 00 03 40 00
...
There’s one more interesting snippet to consider before we dive into the resource code: we observe that the arguments that are passed to the original executable are inserted into the code before it is started, at 0x400
and 0x200
from the end of the allocated buffer. As the buffer is 0x10000
bytes, this puts the arguments at 0xFC00
and 0xFE00
, respectively.
// ...
arg1 = &buf[buf_size - 0x400];
memcpy(arg1, argv[1], len_arg1);
arg2 = &buf[buf_size - 0x200];
memcpy(arg2, argv[2], len_arg2);
// ...
We open the resource file in IDA and scroll through the disassembly. Note that it is useful to open a new instance here, rather than load it into the existing database: we will want to set the compiler options to a compiler that uses the SystemV calling convention (such as GNU C++).
While most bytes do not represent actual x64 instructions, we notice several snippets that stand out:
...
000000000000082E mov r8, 0D3A5541BC79F6DF3h
0000000000000838 mov r9d, 23Bh
000000000000083E out 3, al
0000000000000840 retn
0000000000000840 ; ---------------------------------------------------------------------------
0000000000000841 db 49h ; I
0000000000000842 db 0B8h
0000000000000843 db 73h, 0EAh, 87h, 80h, 0AAh
0000000000000848 db 0EFh
0000000000000849 db 29h, 53h, 41h, 0B9h, 2Dh, 2 dup(0)
0000000000000850 db 0
0000000000000851 db 0E4h
0000000000000852 db 3
0000000000000853 db 0A0h
0000000000000854 db 0BCh
0000000000000855 db 53h ; S
0000000000000856 db 0A6h
0000000000000857 db 0AFh
0000000000000858 db 0F5h
0000000000000859 db 25h, 0D8h, 1Fh, 6Fh, 32h, 0B5h, 30h
0000000000000860 db 2Eh ; .
0000000000000861 db 0AEh
0000000000000862 dw 34C0h, 0AEE6h, 3D1Dh
0000000000000868 db 8Bh
0000000000000869 db 72h ; r
000000000000086A db 4Ah ; J
000000000000086B db 9Bh, 84h, 80h, 62h, 2
0000000000000870 db 9Ah
0000000000000871 db 1Dh
0000000000000872 dw 5903h, 0E382h, 0F3D2h
0000000000000878 db 0D6h
0000000000000879 db 85h
000000000000087A db 0ADh
000000000000087B db 0C4h, 27h, 0CFh, 41h, 65h
0000000000000880 ; ---------------------------------------------------------------------------
0000000000000880 mov r8, 5329EFAA8087EA73h
000000000000088A mov r9d, 2Dh ; '-'
0000000000000890 out 3, al
0000000000000892 retn
...
Note that earlier we concluded that R8
and R9
would contain the RC4 key and data length. Perhaps these out
instructions are the interrupts we’re looking for. The length fields do not make sense, though; they appear to relate to the size of the chunks before the out
call rather than after.
If we look at the above example, we can subtract 0x2D
from 0x880
to arrive at 0x853
. That still leaves us with several bytes preceding the supposedly encrypted data..
Indeed, converting the bytes at 0x841
to code results in the following snippet:
0000000000000841 mov r8, 5329EFAA8087EA73h
000000000000084B mov r9d, 2Dh ; '-'
0000000000000851 in al, 3
Same length, same key, but an in
instruction. It would appear this marks the start of a function, and out
marks its end – perhaps to re-encrypt the function after it is executed. Using the Python code below, we decrypt all fragments.
import re
def RC4(inp, offset, length, key):
S = list(range(0, 256))
i = 0
k = 0
# KSA
for j in range(256):
i = (key[k] + i + S[j]) % 256
S[j], S[i] = S[i], S[j]
k = (k + 1) % 8
# PRGA
s = 0
j = 0
out = []
for i in range(length):
s = (s + 1) % 256
j = (j + S[s]) % 256
S[s], S[j] = S[j], S[s]
out.append(inp[offset + i] ^ S[(S[s] + S[j]) % 256])
return bytes(out)
def decrypt(ea, length, key):
data[ea : ea + length] = RC4(data[ea : ea + length], 0, length, key)
print(f"Patched {length} bytes at {hex(ea)}")
for match in re.finditer(b"\x49\xB8.{8}\x41\xB9.{4}\xE4\x03", data):
ea = match.start()
key = data[ea + 2 : ea + 10]
length = int.from_bytes(data[ea + 12 : ea + 16], byteorder="little")
out = ea + 0x12
decrypt(out, length, key)
# convert interrupts to NOPs to clean up decompilation
data[ea : ea + 0x12] = bytes([0x90] * 0x12)
data[ea + length : ea + 0x12 + length] = bytes([0x90] * 0x12)
This immediately allows IDA to identify 11 distinct functions, and a bit of code at the end that appears to be the entrypoint: this is where the arguments in 0xFE00
and 0xFC00
come into play!
0000000000000BC4 push rbp
0000000000000BC5 mov rbp, rsp
0000000000000BC8 sub rsp, 90h
0000000000000BCF mov esi, 0FE00h
0000000000000BD4 mov edi, 0FC00h
0000000000000BD9 call sub_B3F
0000000000000BDE leave
Diving into sub_B3F
, we see it uses sub_918
to check the first argument, and sub_A62
to check the second argument. On success, it returns the 0x1337
value the wrapper code is looking for.
HIDWORD(v3) = sub_918(a1);
LODWORD(v3) = sub_A62((int *)a1, (__int64)a2);
if ( v3 == 0x2400000001 )
return 0x1337;
else
return 0;
Inside sub_918
we find a simple XOR check:
strcpy(key, "*#37([@AF+ . _YB@3!-=7W][C59,>*@U_Zpsumloremips");
strcpy(lipsum, "loremipsumloremipsumloremipsumloremipsumloremips");
a1_len = strlen(a1);
v6 = 0;
for ( i = 0; i < a1_len; ++i )
{
if ( ((unsigned __int8)lipsum[i] ^ (unsigned __int8)a1[i]) == key[i] )
++v6;
}
return v6;
Computing the result of key ^ lipsum
, we find the string FLARE2023FLARE2023FLARE2023FLARE2023
. With a length of 36 characters, this fits the > 8
and < 48
constraints.
Now to figure out the second argument. Looking into sub_A62
, we can break it down further.
__int32 __fastcall sub_A62(char *a1, char *a2)
{
int v2; // ecx
char v4[49]; // [rsp+10h] [rbp-40h] BYREF
int v5; // [rsp+4Ch] [rbp-4h]
memset(v4, 0, sizeof(v4));
v2 = strlen(a2);
v5 = sub_5E1(a2, v2, v4);
if ( (v5 & 7) != 0 )
return 0;
sub_4AF(v4, v5, *(_DWORD *)a1);
return sub_893(a1, v4, 48);
}
We can readily identify sub_5E1
as base64_decode
; it starts off with a tell-tale v24 = 3 * (a2 / 4)
output length computation (i.e., each input byte leads to 6 output bits) and proceeds to strip off the padding characters (=
).
The final sub_893
is also trivial: it compares 48 bytes in v4
to a1
. Recall that a1
is the string FLARE2023...
. In the main function of hvm.exe
, we can verify that the bytes at 0xFC00
following the user input are initialized to zero.
The function sub_4AF
is where the magic happens, though. After a little cleaning, it looks as follows:
void __fastcall sub_4AF(char *arg2, int arg2_len, int arg1)
{
int v3[16]; // [rsp+10h] [rbp-A0h] BYREF
int v4[16]; // [rsp+50h] [rbp-60h] BYREF
__int64 *state; // [rsp+98h] [rbp-18h]
int v6; // [rsp+A4h] [rbp-Ch]
int j; // [rsp+A8h] [rbp-8h]
int i; // [rsp+ACh] [rbp-4h]
memset(v4, 0, sizeof(v4));
for ( i = 0; i <= 15; ++i )
v3[i] = arg1;
sub_A7(v4, v3);
v6 = arg2_len / 8;
state = (__int64 *)arg2;
for ( j = 0; j < v6; j += 2 )
sub_421(&state[j], &state[j + 1], (__int64)v4);
}
The local variable v3
is initialized based on the first user-supplied argument, and then passed into sub_A7
, which we can identify to be the Salsa20 permutation. Variable v4
contains the output of Salsa20.
It took me longer than I care to admit to realize that v3
is not initialized with the entire first argument, but simply repeats the first four bytes (FLAR
). I ended up re-implementing the Salsa20 permutation in Python and using Unicorn to verify that it resulted in the same output before realizing what was going on. So far so good, though: we can compute this value based on arg1.
The resulting byte sequence is:
026124f56d840c78fafa18a3b91c245fb91c245f026124f56d840c78fafa18a3
fafa18a3b91c245f026124f56d840c786d840c78fafa18a3b91c245f026124f5
The function sub_421
is where the second argument enters the mix.
void __fastcall sub_421(__int64 *a1, __int64 *a2, int *a3)
{
__int64 v4; // [rsp+18h] [rbp-10h]
int i; // [rsp+24h] [rbp-4h]
for ( i = 7; i >= 0; --i )
{
v4 = *a1;
*a1 ^= sub_3D1(*a2, i, (__int64 *)a3);
*a2 = v4;
}
}
This function appears to be some kind of XOR-loop; sub_3D1
computes *a2 ^ a3[i]
. Note that a3
is the output of Salsa permutation, and a1
starts off pointing at arg2.
Recall that we want the resulting output of the loop in sub_4AF
to be equal to arg1 in order to pass the check in sub_A62
. That means we will have to invert the XOR-loop to work towards the value of arg2.
In an attempt to better understand what is going on here, I re-implement sub_421
in Python as follows. Using Unicorn, we can again verify that it matches the original code.
def xorloop(x, y, z):
zz = [int.from_bytes(z[i : i + 8], byteorder="little") for i in range(0, 64, 8)]
x = int.from_bytes(x[:8], byteorder="little")
y = int.from_bytes(y[:8], byteorder="little")
for i in range(7, -1, -1):
t = x
x ^= y ^ zz[i]
y = t
return (
int.to_bytes(x, length=8, byteorder="little"),
int.to_bytes(y, length=8, byteorder="little"),
z,
)
This is where it gets ugly. As I did not immediately see a way to cleanly invert the loop, I decided to unroll it. We can rely on the XOR operation to cancel terms along the way. The general expression is
\begin{align} x_{i} &= x_{i-1} \oplus y_{i-1} \oplus z_{7-i} \\ y_{i} &= x_{i-1} \\ \end{align}
… which unrolls to …
\begin{align} x_0 &= x\\ y_0 &= y\\ \\ x_1 &= x_0 \oplus y_0 \oplus z_7 \\ &= x \oplus y \oplus z_7 \\ y_1 &= x_0 = x\\ \\ x_2 &= x_1 \oplus y_1 \oplus z_6 \\ &= (x \oplus y \oplus z_7) \oplus x \oplus z_6\\ &= y \oplus z_7 \oplus z_6 \\ y_2 &= x_1 = x \oplus y \oplus z_7\\ \\ x_3 &= x_2 \oplus y_2 \oplus z_5 \\ &= (y \oplus z_7 \oplus z_6) \oplus (x \oplus y \oplus z_7) \oplus z_5\\ &= x \oplus z_6 \oplus z_5 \\ y_3 &= x_2 = y \oplus z_7 \oplus z_6 \\ \\ x_4 &= x_3 \oplus y_3 \oplus z_4 \\ &= (x \oplus z_6 \oplus z_5) \oplus (y \oplus z_7 \oplus z_6) \oplus z_4\\ &= x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \\ y_4 &= x_3 = x \oplus z_6 \oplus z_5\\ \\ x_5 &= x_4 \oplus y_4 \oplus z_3\\ &= (x \oplus y \oplus z_7 \oplus z_5 \oplus z_4) \oplus (x \oplus z_6 \oplus z_5) \oplus z_3\\ &= y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \\ y_5 &= x_4 = x \oplus y \oplus z_7 \oplus z_5 \oplus z_4\\ \\ x_6 &= x_5 \oplus y_5 \oplus z_2\\ &= (y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3) \oplus (x \oplus y \oplus z_7 \oplus z_5 \oplus z_4) \oplus z_2\\ &= x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2 \\ y_6 &= x_5 = y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3\\ \\ x_7 &= x_6 \oplus y_6 \oplus z_1\\ &= (x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2) \oplus (y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3) \oplus z_1\\ &= x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1 \\ y_7 &= x_6 = x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2\\ \\ x_8 &= x_7 \oplus y_7 \oplus z_0\\ &= (x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1) \oplus (x \oplus z_6 \oplus z_5 \oplus z_3 \oplus z_2) \oplus z_0\\ &= y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \oplus z_1 \oplus z_0\\ y_8 &= x_7 = x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1\\ \end{align}
Thus, we find that the xor-loop effectively computes:
\begin{align} x_8 = y \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \oplus z_1 \oplus z_0\\ y_8 = x \oplus y \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1\\ \end{align}
By shuffling the XOR operations around, we can formulate the inverse operation, as well:
\begin{align} y = x_8 \oplus z_7 \oplus z_6 \oplus z_4 \oplus z_3 \oplus z_1 \oplus z_0\\ x = y \oplus y_8 \oplus z_7 \oplus z_5 \oplus z_4 \oplus z_2 \oplus z_1\\ \end{align}
Taking a look at sub_4AF
again, we see that the separate DWORDs in the state
variable are touched in pairs of two, and do not interleave.
for ( j = 0; j < v6; j += 2 )
sub_421(&state[j], &state[j + 1], (__int64)salsa_output);
This means we can simply invert the order of the loop to invert its effect. Initializing the state to arg1 and using the Salsa20 output computed before, we derive the desired state before the XOR operations:
cc16294c151626f7fd3141f82ad718bfbb1d51550f723382
894e46e62eb76dbfe74364096f0821eb894d32996fe55de0
Base64-encoding this bytestream leads to the required arg2 input:
zBYpTBUWJvf9MUH4KtcYv7sdUVUPcjOCiU5G5i63bb/nQ2QJbwgh64lNMplv5V3g
We’re doing this statically, so we cannot simply run the executable and supply the arguments to obtain the flag. Instead, we get the flag by computing the XOR of arg2 and the hardcoded value in the data section at address 0x1400144B0
. We find c4n_i_sh1p_a_vm_as_an_exe_ask1ng_4_a_frnd
TLDR: re-use the deobfuscator from challenge 5; point it at all thread entrypoints. Decompile. Use HashDB to identify API calls. Fix API calls by creating an enum. Find gimmie_advic3
and gimmie_s3cr3t
commands. Identify MD5 hash aa321932ccdd8cce334a1c3354eed3b1
. Google it. Find password patience_y0u_must_h4v3
. Find fake flag . Cry. Obtain base32 strings. Fix string by making input to Mersenne Twister constant. Zero the output of the RNG entirely. Decode base32 using custom alphabet. .. ROP chain? What ROP chain?
So close to the end you can almost taste it. Keep going champ, complete this challenge and take your place as one of the Reverse Engineers of All-Time.