Thursday, August 18, 2011

Extracting TFTP data from pcap files with Scapy


I've come across another challenge on ismellpackets.com. This time it was about analyzing a pcap file. When we open OperationNEPTUNE.pcap in Wireshark and go to Statistics->Conversations, we see a couple of TFTP file transfers.

Usually we extract the data by following the TCP/UDP stream, selecting the direction in which the data is sent, choosing "Save As", and then cleaning out the extra data with a hex editor (Okteta or MadEdit on Linux).

In this case, however, the payload of every data packet begins with a 2-byte TFTP opcode (0x0003 == Data packet) followed by a 2-byte block number that is incremented with each packet (0x0001, 0x0002, 0x0003, etc.).



This means that we have overhead data not only at the beginning of the stream but also in the middle:
00 03 00 01 data1 00 03 00 02 data2 00 03 00 03 data3 00 03 00 04 data4 etc...
Sometimes 4 extra bytes for every 512 bytes of data are not a big deal. A .wav file in this pcap, for example, was still readable and not really altered after removing only the 00 03 00 01 at the beginning (it has to be removed because each file begins with its format's magic number, which is the beginning of data1 in this case).
But in other cases, like a .7z file contained in the same pcap, the data has to be recovered intact.
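To make the layout above concrete, here is a minimal Python 3 sketch that strips the 4-byte header in front of every block of a saved stream. The function name strip_tftp_headers is made up for this illustration, and the code assumes the default 512-byte TFTP block size and blocks that arrive in order without retransmissions.

```python
import struct

TFTP_DATA = 3      # opcode of a DATA packet
HEADER_LEN = 4     # 2-byte opcode + 2-byte block number
BLOCK_SIZE = 512   # assumption: default TFTP block size


def strip_tftp_headers(stream):
    """Remove the 4-byte TFTP header that precedes every block in `stream`."""
    out = bytearray()
    pos = 0
    expected = 1  # block numbers start at 1
    while pos < len(stream):
        # Both header fields are big-endian unsigned 16-bit integers.
        opcode, block = struct.unpack("!HH", stream[pos:pos + HEADER_LEN])
        if opcode != TFTP_DATA or block != expected:
            raise ValueError("unexpected header at offset %d" % pos)
        start = pos + HEADER_LEN
        out += stream[start:start + BLOCK_SIZE]  # the last block may be shorter
        pos = start + BLOCK_SIZE
        expected = (expected + 1) % 65536
    return bytes(out)
```

The block-number check is only a sanity test: if the stream contains anything other than consecutive DATA packets, the function stops instead of silently producing a corrupted file.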

Obviously, it would be very unproductive to clean the whole data stream manually with an editor.
In Wireshark we first filter out the packets that don't interest us by applying this display filter:
tftp.opcode == 3 and tftp.destination_file == "IECache.7z"
Then we go to File->Save As..., select Displayed and choose a file name.

After that we fire up Scapy and read the pcap file.
$ scapy
>>> packets = rdpcap("IECache.pcap")
>>> packets
<IECache.pcap: TCP:0 UDP:569 ICMP:0 Other:0>
Then we iterate over the packets list and write the data to the output file without the first 4 bytes.
>>> f = open("IECache.7z","wb")  # binary mode: the payload is raw bytes
>>> for p in packets:
. . .        f.write(p.load[4:])
. . .
>>> f.close()
And that is it.
$ file IECache.7z
IECache.7z: 7-zip archive data, version 0.3
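The loop above writes every payload exactly as captured, so a retransmitted block would end up in the output twice. As a hedged sketch (Python 3, standard library only), the hypothetical helper reassemble_tftp below keys each payload by its block number, keeps the first copy and joins the blocks in order. It assumes a transfer of fewer than 65536 blocks, so the block number never wraps around.

```python
import struct


def reassemble_tftp(payloads):
    """Rebuild a file from the raw UDP payloads of TFTP DATA packets."""
    blocks = {}
    for payload in payloads:
        if len(payload) < 4:
            continue  # too short to hold a TFTP header
        opcode, block = struct.unpack("!HH", payload[:4])
        if opcode != 3:
            continue  # not a DATA packet (for example an ACK)
        # Keep only the first copy of each block; later copies are retransmissions.
        blocks.setdefault(block, payload[4:])
    # Assumption: fewer than 65536 blocks, so sorting by block number is safe.
    return b"".join(blocks[n] for n in sorted(blocks))
```

With the packets list read by Scapy, a call could look like reassemble_tftp(p.load for p in packets), with the result written to a file opened in binary mode.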
We could have used automated pcap extraction tools like NetworkMiner on Windows or Xplico on Linux (Both are Open Source and free), but nothing beats Scapy, the networking Swiss Army Knife. ;)

Friday, August 12, 2011

Convert hex dump into pcap

Recently, I came across an old packet challenge on the ismellpackets blog which consists of finding a secret in a 4-packet capture. The problem is that the packets were given as a hex dump of this form:
00 0c 29 4c 6d a6 00 0c 29 0e 66 bd 08 00 45 00
00 f6 04 38 40 00 80 06 b7 77 c0 a8 5e 80 c0 a8
5e 81 0d 0d 01 bd 42 d0 33 b5 d2 64 26 85 50 18
fa 97 0e ae 00 00 00 00 00 ca ff 53 4d 42 73 00
...
One possible way to convert it to the pcap file format is to use text2pcap.
But we first have to put it into the appropriate format by adding the offset at the beginning of each line (see man text2pcap and man od for more info).
000000 00 0c 29 4c 6d a6 00 0c 29 0e 66 bd 08 00 45 00
000016 00 f6 04 38 40 00 80 06 b7 77 c0 a8 5e 80 c0 a8
000032 5e 81 0d 0d 01 bd 42 d0 33 b5 d2 64 26 85 50 18
000048 fa 97 0e ae 00 00 00 00 00 ca ff 53 4d 42 73 00
...
The offset could be in decimal, hexadecimal or octal.

Not a very hard task. I copied all the packets into a text file named input and made the changes with a quick Python script.
with open("input") as f:
    count = 0
    for line in f:
        if line=='\n':
            count = 0
        else:
            print("%06d %s" % (count, line.rstrip()))  # rstrip avoids a doubled newline
            count += len(line.split())
The if line=='\n' test resets the offset to 000000 at every blank line, since the 4 packets are separated by blank lines.
We can now pipe the result into text2pcap:
text2pcap <options> <infile> <outfile>
$python myscript.py | text2pcap -o dec - output.pcap
The default offset format for text2pcap is hexadecimal. In our case, the offsets would have been 000000 000010 000020 000030 etc...
We use -o dec to specify that the offsets are in decimal.
The first - specifies that the input is read from stdin.
output.pcap is the output file. We can open it in Wireshark.
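To compare the three offset notations that text2pcap accepts with -o (hex, oct or dec), here is a small Python 3 sketch. The helper name offset_labels is made up for this illustration, and it assumes 16 bytes per line as in the dump above.

```python
def offset_labels(line_count, base="hex", bytes_per_line=16):
    """Return the offset column for `line_count` lines in the given base."""
    fmt = {"hex": "%06x", "dec": "%06d", "oct": "%06o"}[base]
    return [fmt % (i * bytes_per_line) for i in range(line_count)]


for base in ("hex", "dec", "oct"):
    print(base, " ".join(offset_labels(4, base)))
```

Whichever notation the script emits, the value passed to -o has to match it, otherwise text2pcap misreads the offsets.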


We could also have piped the output into tcpdump, for instance, to look at the packets right away:
$python myscript.py | text2pcap -o dec - - | tcpdump -Xnnr -
That's it. There are many other ways and already existing scripts that could do the same job.
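As one example of such another way, the pcap file can be written directly from Python without text2pcap. The sketch below (Python 3, standard library only) is an illustration under a few assumptions: the function name hexdump_to_pcap is hypothetical, the packets are Ethernet frames (link type 1), they are separated by blank lines, and all timestamps are simply set to zero.

```python
import re
import struct


def hexdump_to_pcap(text, linktype=1):
    """Convert blank-line separated hex dumps into the bytes of a pcap file."""
    # Global header: magic number, version 2.4, timezone offset, sigfigs,
    # snapshot length, link type (1 = Ethernet).
    out = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, linktype)
    for chunk in re.split(r"\n\s*\n", text.strip()):
        frame = bytes.fromhex("".join(chunk.split()))
        if not frame:
            continue
        # Record header: ts_sec, ts_usec (both zero here, an assumption),
        # captured length, original length.
        out += struct.pack("<IIII", 0, 0, len(frame), len(frame))
        out += frame
    return out
```

Writing the result with open("output.pcap", "wb") should give a file that Wireshark can open. Unlike text2pcap, this sketch needs no offset column, but it also does no validation beyond rejecting non-hex input.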

Update:
I came across another challenge in which the packet dump was given in this format:
4500 0527 0001 4000 4006 0000 c0a8 0102
c0a8 0101 2b67 0014 0000 006f 0000 006f
5018 0200 aa32 0000 ffd8 ffe0 0010 4a46
Here's the modified version of the script, which supports this format:
import sys
with open(sys.argv[1]) as f:
    count = 0
    for line in f:
        if line == '\n':
            continue  # skip blank lines
        # split every group of hex digits into separate 2-digit bytes
        octets = []
        for group in line.split():
            octets += [group[i:i+2] for i in range(0, len(group), 2)]
        print("%06d %s" % (count, ' '.join(octets)))
        count += len(octets)