Peepdf

Peepdf is a Python-based tool for exploring PDF files to determine whether they may be harmful. It aims to give a security researcher every component needed for PDF analysis in one place, instead of spreading the work across three or four tools. With peepdf you can list all the objects in a document, with the suspicious elements highlighted. It supports the most commonly used filters and encodings, and it can parse different versions of a file, object streams, and encrypted files.

Installation

Download the tool from the official GitHub repository.

Usage

$ ./peepdf.py -i pdffile.pdf

Example

We will now see how to extract a file embedded in a PDF.

Image

As we can see, nothing looks suspicious when the PDF is opened normally in a PDF viewer.

So now let's load the PDF in peepdf's interactive console (the -i option):

$ ./peepdf.py -i nothing.pdf
File: nothing.pdf
MD5: 56572d46b09ef2b3de1faa4c9d5e1cb0
SHA1: 99b73b7d87815f669d54bb1c430b703d4ae827a4
SHA256: 98d1aa64f417da1a331b18c3b57d8d25e642c8f23a661e5298730c01d0a04ad2
Size: 925647 bytes
Version: 1.1
Binary: True
Linearized: False
Encrypted: False
Updates: 0
Objects: 8
Streams: 2
URIs: 0
Comments: 0
Errors: 0

Version 0:
    Catalog: 1
    Info: No
    Objects (8): [1, 2, 3, 4, 5, 6, 7, 8]
    Streams (2): [5, 8]
        Encoded (1): [8]
    Suspicious elements:
        /Names (1): [1]
        /EmbeddedFiles: [1]
        /EmbeddedFile: [8]

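As we can see, the PDF contains an embedded file: object 1 (the catalog) holds the /Names and /EmbeddedFiles entries, and object 8 is the /EmbeddedFile stream itself.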
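
To make the "Suspicious elements" list less of a black box, here is a minimal Python sketch that counts a few name tokens in the raw bytes of a PDF. It is not how peepdf works internally: peepdf parses the object structure, while this sketch is a naive byte search that misses names hidden inside compressed object streams or written with #xx escapes. The SUSPICIOUS_NAMES shortlist and the function name are assumptions chosen for illustration.

```python
import re
import sys
from collections import Counter

# Hypothetical shortlist for illustration; peepdf's own list is longer.
SUSPICIOUS_NAMES = (
    b"/EmbeddedFiles",
    b"/EmbeddedFile",
    b"/Names",
    b"/JavaScript",
    b"/OpenAction",
)


def count_suspicious_names(data: bytes) -> Counter:
    """Count occurrences of each shortlisted name token in raw PDF bytes."""
    counts = Counter()
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops /EmbeddedFile from also matching /EmbeddedFiles.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        hits = len(re.findall(pattern, data))
        if hits:
            counts[name.decode("ascii")] = hits
    return counts


if __name__ == "__main__":
    # Usage: python scan_names.py nothing.pdf
    with open(sys.argv[1], "rb") as handle:
        for name, hits in count_suspicious_names(handle.read()).items():
            print(f"{name}: {hits}")
```

A hit from a scan like this only says that a token is present somewhere in the file. Mapping it to a specific object number, as peepdf does, requires actually parsing the document.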

To extract it, we write the decoded contents of object 8 to a file with the stream command at the peepdf prompt, then inspect the result from the shell:

PPDF> stream 8 > embedfile
$ file embedfile
embedfile: PNG image data, 960 x 640, 8-bit/color RGB, non-interlaced
$ xdg-open embedfile

We can see that the file embedded in the PDF is a PNG image.
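
The check that file performs on the extracted stream can be approximated in a few lines of Python, which is handy when scripting the analysis. The sketch below assumes the stream was saved as embedfile, as above. It only tests the 8-byte PNG signature and reads the width and height from the IHDR chunk, so it is a quick sanity check, not a full PNG validator. The function name png_dimensions is hypothetical.

```python
import struct
import sys

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes):
    """Return (width, height) if data begins like a PNG, otherwise None."""
    # Signature (8) + chunk length (4) + chunk type (4) + width and height (8).
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        return None
    # The first chunk of a PNG must be IHDR.
    if data[12:16] != b"IHDR":
        return None
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


if __name__ == "__main__":
    # Usage: python check_png.py embedfile
    with open(sys.argv[1], "rb") as handle:
        dims = png_dimensions(handle.read(24))
    if dims is None:
        print("not a PNG")
    else:
        print(f"PNG image data, {dims[0]} x {dims[1]}")
```

Checking magic bytes like this matters because an embedded file's name or declared type inside the PDF can say anything. The bytes of the stream are what the analysis should trust.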

Embedded Image