Technological Thoughts by Jerome Kehrli

Ethical hacking : a glimpse on local program vulnerabilities exploitation techniques

by Jerome Kehrli

Posted on Saturday Oct 08, 2016 at 12:19AM in Computer Science

Ethical hacking is a very interesting field, and a pretty funny hobby. Well, y'a all penetration tester out there, don't get me wrong: I am well aware that Penetration testing and Ethical Hacking is a full and challenging Software Engineering Field and an actual profession, don't get upset.

I am rather saying that studying vulnerabilities exploitation techniques in one's free time is pretty fun and intellectually rewarding. With the all time and everywhere connection of everything for all kind of usages (understand Internet of Things), current focus in the field of vulnerabilities exploitation is really on Web application, networks, distributed systems, etc.
In addition, most recent progresses in CPU-level protections and compiler-level protections have made local programs exploitation techniques somewhat outdated and such techniques are not very much presented or discussed anymore.

During my master studies, I followed an extended set of lectures on Ethical Hacking and Software Security in general and got quite interested in the field. I wrote a paper for a study in the context of the university at that time which I am reporting today in this blog.
The following article presents various classical vulnerabilities exploitation techniques on local programs.

1. Ethical hacking methodology

In regards to ethical hacking, there is some sort of a methodology on which pen testers and ethical hackers commonly agree.

This methodology is as follows:

Information Gathering

This is Fully passive phase.

  • The goal is to find all possible useful information about the target.
    • Company / persons
    • DNS / WHOIS / ...
    • Search engines, newsgroups, mailing lists
  • Information mainly collected from public sources
  • Allows to identify the points that will be most likely vulnerable

Network Mapping

First, as a prerequisites, one might need with the results from the previous step to gather the IP address ranges allocated to the target.

The network mapping consists of building a Foot-Print of the network (i.e. foot-printing :

  • Finding of live hosts
  • Port and service scanning
  • Perimeter network mapping (router, firewalls)
  • Identification of critical services
  • OS fingerprinting
  • Service fingerprinting

=> Simply anything the nmap command can do

Vulnerability Identification

At this stage on tries to answer to

  • What type of server ?
  • What OS ?
  • Wich version of OS ? services ? Patched or vulnerable ?

The goals are :

  • Identification of vulnerable services (banners)
  • Vulnerability scan (SecurityFocus, CVE, CERT)
  • Verification
  • Enumerate discovered vulns
  • Estimate probable impact
  • Identify attack paths and scenarios for exploitation

Penetration / Exploitation

Now it's about attempting to exploit the previously discovered vulnerabilities. The exploit can be searched by whatever means (internet, database, tools such as metasploit, etc.)

  • Find proof of concept, tool
  • Development of tools/scripts
  • Testing of proof of concept
  • Verify or disprove vulnerabilities
  • Documentation of findings

Gaining access and Privilege Escalation

Gaining access :

  • Discovery of username/password combinations
  • Blank/default passwords
  • Exploitation of vendor default settings
  • Discovery of public services that allow for certain operations within the system

Privilege Escalation :

  • Get administrative rights, if not yet done
  • Exploitation of a local vulnerability

Further enumeration

This consisis in repeating the previous phases iteratively in order to reach further targets. (for instance from a DMZ to penetrate an internal network)

  • Obtain encrypted passwords (offline cracking)
  • Obtain passwords using sniffing or other techniques
  • Traffic sniffing
  • Cookies gathering
  • Internal network mapping

Compromise of remote users/servers

Break other (internal) services and if possible and iterate the process.

Maintaining access

  • Use of covert channels (protocol tunnels, VPNs, etc.)
  • Backdoors
  • Rootkits

Covering tracks

The problem is that the previous steps usually leave lots of traces in the various system logs. One should attempt to get rid of these traces.

  • Hide files
  • Clear logs : Check history, Edit log files, Identify remote logging, etc.
  • Defeat protection mechanisms : Integrity checking, Anti-virus

2. A focus on Exploitation

This article focuses on exploitation of vulnerabilities related to local programs. We will look at various way to exploit local programs vulnerabilities, especially in the C language.

Most of the techniques presented here, thanks to latest progresses in both hardware (processors) and software (compilers), are not so easily exploitable in practice.
The purpose of the following elements is really to present the basic techniques and the key concepts behind them.

3. Introduction to memory manipulation

Memory manipulation attacks consist in sending malformed data to the target application in such a way that the logical program flow is affected.

Goals :

  • Be able to execute arbitrary code
  • Crash the application (DoS)

3.1 Runtime Memory Organization

The Memory layout of a running process on x86 is as follows :

  • Stack
  • Heap
  • BSS segment
  • Data segment
  • Text segment

Figure 5.1: Memory layout
Image mem_layout

3.1.1 Text Segment

This segment contains all the compiled executable code for the program. It is (Usually) non-writable :

  • Code does not contain any sort of variables
  • Read-only segments can be shared between different copies of the program executing simultaneously

3.1.2 Data and BSS Segments

Those segments contain all the global variables.

  • Read/write/execute rights (at least on Intel)
  • Data segment contains initialized variables
  • BSS segment contains un-initialized variables

3.1.3 Stack

The stack is the region of memory used to dynamically store and manipulate most program functions variables (i.e., local variables)

  • Can (usually) be written, red and executed.
  • When a program enters a function, space on the stack is provided for variables and data (this space is called a stack frame).

The Stack frame contains :

  • Function's argument(s)
  • Stack variables (saved instruction and frame pointers)
  • Space for local variables

3.1.4 Heap

Programs use the heap to store data that must exist after a function returns (and its stack frame is erased). The heap is often the largest segment of memory.

Allocator and de-allocator algorithms manage the heap :

  • In C: malloc() / free ()
  • In C++: new / delete

3.2 Processor Registers (IA32)

eip : instruction pointer
  • points to the next instruction to be executed
  • ebp : frame pointer
  • points at the start of the current stack frame
  • esp : stack pointer
  • points at the end of the stack
  • Figure 5.2: Processor Registers
    Image proc_reg

    Figure 5.3: Registers and the stack
    Image stack_frame

    4. Stack overflows

    The Stack overflow exploit is one of the classical security holes. Currently, OS implements stack smashing protections making such stack overflow exploits more difficult to exploit...

    4.1 Introductory example

    Consider the following tiny C program:

    int main (int argc, char **argv)
        char smallbuf[32];
        strcpy (smallbuf, argv[1]);
        printf ("%s\n", smalbuf);
        return 0;

    In memory, the frame for method main stands as follows. We specifically see where the smallbuf array is located, right below the Saved instruction pointer and the Saved frame pointer :

    Figure 5.4: Method "main" in memory
    Image mem_meth_main

    Let's try to smash the stack by running the program (the main function) this way :

    gcc -o print print.c
    Segmentation fault

    What happened ?!? Let's look at the memory again :

    Figure 5.5: Method "main" in memory - smashed
    Image mem_meth_main_smashed

    4.2 Shell code

    Now smashing the stack is fun but what we want is get access to the compromised host. Most often what we eant is a shell code :

    • Machine code executing something useful
    • Shape is heavily OS- and architecture dependent

    For instance, on Linux / IA32, this is a binary runnable code launching a shell /bin/bash :


    4.2.1 Back to our example program

    We pad the shell code with NOP instructions (NOP sled)


    and the last line contains the values we want to put in both pointers, most importantly the aved instruction pointer.

    And we give it as input to the vulnerable program (playing the exploit)

    $ ./print 'perl -e 'print
    "\x90\x90\x90\x90\x90\x90\x90\x90\x31 \xc0\x50\

    The stack layout during execution is as follows :

    Figure 5.6: Method "main" in memory - exploited
    Image mem_meth_main_exploited

    4.2.2 Principle

    • put a shell code in a buffer
    • pad the beginning with enough NOP
    • put in the Save Instruction Pointer one of the addresses containing a NOP

    5. Stack off-by-one

    This is a variation on the stack overflow. It implies conditions where a program miscalculates a data copying operation and allows one byte to flow past the end of the buffer.
    Exploiting this vulnerability, it may be possible to overwrite the least significant byte (LSB) of the saved frame pointer, which could later allow complete control over program flow.

    5.1 Illustration example

    We wont develop this example as far as exploiting the flaw, but we'll use it to illustrate the vulnerability.

    Let's look at the following program :

    int main(int argc, char *argv[])
        if(strlen(argv[1]) > 32)
            printf("Input string too long!\n");
            exit (1);
        return 0;
    int vulfunc(char *arg)
        char smallbuf[32];
        strcpy(smallbuf, arg);
        printf("%s\n", smallbuf);
        return 0;

    The problem with this code is that there is always an additional NULL character at the end of a string, but the strlen function never takes it into account.

    5.1.1 Let's play with it

    $ gcc -o print print.c
    $ ./print test
    Input string too long!
    Segmentation fault (core dumped)

    5.1.2 What happened ?

    This is the stack as it should be :

    Figure 5.7: Normal situation of the stack
    Image off_by_one_stack_norm

    This is the after the exploit. We can see that we have been able to overwrite the LSB of the Save frame Pointer :

    Figure 5.8: Exploited situation of the stack
    Image off_by_one_stack_exp

    5.2 Exploitation

    The whole game is to find a way to put some shell code somewhere on the stack and to use the buffer off-by-one overflow to redirect the program flow to this buffer. The details exceed the scope of this document.

    5.3 Remarks

    Interestingly, not all architectures are (potentially) vulnerable :

    • x86, Dec Alpha : little endian ordering -> vulnerable
    • Spart, SGm IBM RS/600, PowerPC : big endian ordering -> Off-by-one shoot you to an address far away.

    In practice :

    • Padding added by compilers
    • Canary values
    • Non-executable stacks
    • -> More more work is required...

    6. Format String bugs

    The printf function is a special function which can take an infinite number of arguments. The problem is that if one specifies a format which required a certain number of arguments but omits them, printf will nevertheless take the values located on the stack at the place it expected its arguments.

    In addition, printf supports the %x flag which display what is stored at the address given as argument to printf (which might be omitted).

    6.1 Introductory example

    int main(int argc, char *argv[])
        if(argc < 2)
            printf("You need to supply an argument\n");
            return 1;
        return 0;

    6.1.1 Let's play with it

    $ ./printf "Hello, world!"
    Hello, world!
    $ ./printf %x
    $ ./printf %x.%x.%x.%x

    This program is kind of a Stack dumper : one can dump the content of the stack starting from a certain address. (an the stack contains lots of interesting stuff : environment variables, arguments given to the program, function arguments, etc.)

    6.1.2 Principle

    Figure 5.9: printf vulnerability principle
    Image printf_principle

    With the above example, one can dump pretty much any value located in this area :

    Figure 5.10: printf vulnerability principle
    Image printf_principle_area

    6.2 Read environment variables

    I know that the program environment variables are stored around address 0xBFFFFF600 :

    $ ./printf 'perl -e 'print "%53\$s" . "AA" . "\x80\xf6\xff\xbf"';'

    6.3 Overwriting any word in memory

    Man page of printf tells us about the %n specifier: %n The number of characters written so far is stored into the integer indicated by the int * (or variant) pointer argument. No argument is converted.

    Hence, by supplying a pointer to the memory you wish to overwrite and issuing the %n specifier, you write the number of characters that printf has written so far directly to that memory address.

    Here is a nice format string:

    %.0(pad 1)x%(arg number 1)$hn%.0(pad 2)x%(arg number 2) \
    $hn(address 1)(address 2)(padding)

    (pad 1)
  • is the lowest two bytes of the value you wish to write
  • (pad 2)
  • is the highest two bytes of the value you wish to write minus (pad 1)
  • (arg number 1)
  • is the offset from the first argument to (address 1) in the buffer
  • (arg number 2)
  • is the offset from first argument to (address 2) in the buffer
  • (address 2)
  • is (address 1) + 2
  • (padding)
  • is between 0 and 4, to get the addresses aligned
  • 6.4 Practice : overwriting the .dtors section

    The .dtors section contains a series of pointers on functions to be executed when a program exits. By overwriting them, one can execute any arbitrary code at the end of a program execution.

    We will consider here the introductory program, which we will exploit using the following format string:

    $ gcc -o printf printf.c
    $ ./printf 'perl -e 'print "%.048879x" . "%114\$hn" . "%.08126x" \
    . "%115\$hn" . "\x44\x95\x04" . "\x08\x46\x95\x04\x08" . "A"';'

    The format string is thus equal to


    Knowing that the .dtors section was assumed to begin at address 0x08049540, one can analyse this format string this way:

    The printf() routine begins by writing 48879 characters thanks to the %x format command, whose content is not interesting. Then, the %n command writes the number of bytes written so far (i.e., 48879 or 0xBEEF in hexadecimal) in the content of the address given as the 105th argument of printf(), i.e., the address 0x08049540.
    Note that when one writes a "short", i.e., only 2 bytes. The format string write another 8126 characters (where we do not care about the content).
    Then, it writes the number of characters written so far, which is 48879+8126=57005 (or 0xDEAD in hexadecimal) in the address pointed by the 106th argument, which is 0x08049546. Hence, this format string writes the value 0xDEADBEEF at the address 0x08049544.

    7. Heap overflow

    A common heap implementation is as follows (of course, there might be variations, for instance for performance reasons):

    • Available memory is divided into chunks
    • Each chunk contains :
      • A header structure
      • Free space for the user

    7.1 Heap organisation

    Figure 5.11: Heap organisation
    Image heap_orga

    • The size element also specifies in its LSB whether the previous chunk is free or not
    • This information bit is called PREV_INUSE
    • Example:
      size == 0x30 (40-byte buffer, previous chunk free)
      size == 0x31 (40-byte buffer, previous chunk in use)
    • Consequence : malloc() should always allocate multiple of at least 2 bytes
    • In practice : The minimum size of a chunk is always 16 bytes

    7.2 Free chunks

    Free chunks are stored in a doubly linked list :

    Figure 5.12: Free chunks : doubly-linked list
    Image free_chunks_list

    free() does (approximately) as follows:

    • The PREV_INUSE bit is cleared
    • The addresses of the previous (bk pointer) and next (fd pointer) free chunks are placed in the chunk's data section
    • Adjacent free chunks are merged (defragmentation operation)

    In the light of this, the heap organisation is as follows :

    Figure 5.13: Heap organisation
    Image heap_orga_2

    7.3 Defragmentation

    Defragmentation operation of two chunks happens as follows :

    • Adding the sizes
    • Removing the second chunk from the doubly-linked list using the following unlink() macro below.

    Figure 5.14: Free chunks : doubly-linked list
    Image heap_defrag

    #define unlink(P, BK, FD) \
      FD = P->fd;             \
      BK = P->bk;             \
      FD->bk = BK;            \
      BK->fd = FD;            \

    7.4 Heap overflow

    If you can overflow a buffer on the heap, you may be able to overwrite the chunk header of the next chunk on the heap.

    Example of vulnerable program:

    int main(void)
        char *buff1, *buff2;
        buff1 = malloc(40);
        buff2 = malloc(40);

    • Two 40-bytes buffers are assigned on the heap
    • buff1 is used to store user-supplied input (without length checking thereof)
    • Hence, a heap overflow can potentially occur !

    In memory, this takes place as follows :

    Figure 5.15: Heap overflow : example
    Image heap_over_examp

    7.5 Heap overflow Exploitation

    • Idea: make buff2 to be "merged" - Overwrite the size element of buff2 with the PREV_INUSE bit unset.
    • Some constraints
      • prev_size and size are added to pointers so they must have small absolute values
      • fd + size + 4 must point to a value whose least significant bit is 0 to fool the heap management algorithm into thinking that the chunk after next is also free.
      • There must be no 0x00 byte in the overflow string, or gets() will stop process it.

    Figure 5.16: Heap overflow : exploitation
    Image heap_over_exp_1

    What happens when free() deallocates buff1, as required ?

    • It checks to see if the next forward chunk is free
    • Because the size element of the second chunk (buff2) is equal to -4, the algorithm reads the PREV_INUSE flag from the second check, believing it is the third
    • The algorithm tries to consolidate the chunks into a new, bigger one, processing the fake fd and bk pointers as
      #define unlink(P, BK, FD) { \
          FD = P->fd;             \
          BK = P->bk;             \
          FD->bk = BK;            \
          BK->fd = FD;            \
    • fd + 12 is overwritten with bk
    • bk + 8 is overwritten with fd

    In summary, you can overwrite a 4-byte work of your choice anywhere in memory with the values stored in the fake fd and bk pointers

    How to concretely exploit this ? For instance :

    • Linux Executable File Format (ELF)
      • Global Offset Table (GOT) : Contains addresses of various functions
      • .dtors section : contains addresses of functions that perform cleanup work when a program exit
      • Use fd = GOT address of exit() - 12
      • Use bk = shell-code address (i.e., buff1)
      • It means that you will overwrite the GOT entry for exit() with the address of buff1
      • At the end of the main(), you execute your shellcode instead of exit()

    Figure 5.17: Heap overflow : exploitation
    Image heap_over_exp_2

    8. Integer Overflows

    8.1 Principle

    The idea is to identify where an integer is used for a buffer allocation and use the integer overflow to write a value much bigger that the wrongly allocated buffer.

    Integer overflow over 32-bit values:

    int a = 0xffffffff;
    int b = 1;
    int r = a + b;

    results in r == 0x00000000 ...

    8.2 Example

    Imagine a user can tell the size of a data chunk to an application ...

    Example :

    int myfunction(int *array, int len)
        int *myarray, i;
        myarray = malloc(len * sizeof(int));
        if(myarray == NULL)
            return -1;
        for(i = 0; i < len; i++)
            myarray[i] = array[i];
    return myarray;

    In the code above, if one can master the value of the parameter len, he can screw the malloc right below:

    • If the parameter is very long, like len = 0x40000001, we have
      = 0x40000001*4
      = 0x100000004
      == 0x00000004
    • malloc() will allocate a 4-byte buffer, and we will write a much larger buffer.
    • This can trigger a heap overflow

    9. Protections against Overflows

    9.1 Canaries

    Canaries are specific values placed between a buffer and sensitive data on the stack. This is an analogy to the mine workers who used to take a little canary bird with them in the mine. When the air was getting scarce or when unexpected gaz shown up, the canary died and they knew they had to run out.

    • If an overflow occurs, then the canary will be overwritten
    • Canary check code should detect this.

    Type of canaries:

    • Terminators : Built of 0x00, CR, LF or -1 values -> Known to an adversary.
      These are the simplest ones. They're easily countered : one only needs to write the canary back after the overflow occurred.
    • Random : Random value generated at program initialization and stored as a «protected» global variable
      This is a little more difficult to counter, yet it's still feasible.
    • Random XORing Canary built out of a random XORed with the sensitive data. Hence, it depends also on those sensitive data
      These are much more difficult to control by an attacker. For instance one can XOR the random variable and the return pointer -> depends on the data.

    9.2 Protections implemented in GCC

    These protections are called ProPolice. The Main features are :

    • Protects all registers saved in a function's prologue (saved EBP, saved EIP)
    • Add canaries
    • Sort array variables to the highest part of the stack frame
    • Creates copies of the function's argument and relocates them with local variables

    9.3 Address Space Layout Randomization - ASLR

    The idea is to arrange in a random way in the process address space the positions of key data areas :

    • Executable code
    • Library entry points
    • Heap
    • Stack

    ... either statically or at runtime.

    • Linux : Weak form since kernel 2.6.12. Patches exist for better implementation
    • MS Windows : Vista + Server 2008 have ASLR enabled by default but only for few executables (can be forced for others)
    • Max OS X : Some library offset are randomized as of Mac OS X 10.5

    9.4 Other possible protections

    • Non-executable stack
    • dedicated hardware. For instance, HW-based code authentication or the Same kind of mechanisms to avoid fault attacks
    • Program obfuscation (avoids mass exploits)

    10. Exploitation frameworks

    Exploitation frameworks are software packages that contain

    • Reliable exploit modules
    • Agents for re-positioning
    • Obfuscation methods
    • ...


    • Metasploit Framework (free)
    • CORE IMPACT (commercial)
    • Immunity CANVAS (commercial)

    Metasploit Framework

    Available freely through http://www.metasploit.com

    Three main components :

    • MSF Interface (cli and GUI)
    • Modules
    • Payloads

    Using MSF in a nutshell

    • Select the exploit module to use
    • Select the exploit payload to use
    • Select the target host and delivery vector
    • Set exploit target and payload options

    No one has commented yet.

    Leave a Comment

    HTML Syntax: Allowed