Hello, aspiring Ethical Hackers. In this blog post you will learn how to analyze PDF files. PDF files have recently made a comeback as an initial attack vector: many APTs and cybercriminal groups have been seen using them to gain initial access. So I thought it would be a good idea to write an article on PDF analysis. By the end of this article, you should be able to judge whether a PDF file you want to analyze is likely harmless or malicious.
For this how-to, I will create a malicious PDF with Metasploit using the following exploit.
This exploit hides an EXE inside a PDF file. The PDF can be sent to our target using any social engineering technique. When the target user opens it and allows the embedded program to run, we will get a reverse_tcp connection back. The other file we will be analyzing is a normal PDF file. Both files are shown below.
First, let us analyze the normal PDF, my monthly magazine, as shown below.
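A first pass over a PDF usually comes down to counting its objects and any risky keywords, which is the kind of summary pdfid reports. The following is a minimal Python sketch of that idea, not pdfid's actual code. The keyword list and the function names are my own choices for illustration, and the sketch does not handle obfuscated names or compressed object streams.

```python
import re

# Names often treated as indicators during PDF triage (an illustrative
# selection for this sketch, not pdfid's complete list).
SUSPICIOUS_NAMES = [b"/JS", b"/JavaScript", b"/AA", b"/OpenAction",
                    b"/Launch", b"/EmbeddedFile"]


def count_keywords(data: bytes) -> dict:
    """Count object markers and suspicious names in raw PDF bytes."""
    counts = {"obj": len(re.findall(rb"\b\d+\s+\d+\s+obj\b", data))}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops a short name such as /AA from matching
        # inside a longer, unrelated name that merely starts with it.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts


def scan_file(path: str) -> dict:
    """Read a file from disk and count its keywords."""
    with open(path, "rb") as handle:
        return count_keywords(handle.read())


if __name__ == "__main__":
    # Tiny hand-made fragment standing in for a real file.
    sample = (b"1 0 obj\n<< /OpenAction 2 0 R >>\nendobj\n"
              b"2 0 obj\n<< /S /Launch >>\nendobj\n")
    for key, value in count_keywords(sample).items():
        print(f"{key:15} {value}")
```

On a real file we would call scan_file with its path. A clean document will normally show zeros for most of these names, while a non-zero /OpenAction together with /Launch or /JavaScript is a reason to look closer.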
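Coming to the malicious PDF, we can use pdfid's disarm option (-d) to disable the JavaScript and auto-launch elements of the file, as shown below. pdfid writes the result to a new file and leaves the original untouched. The disarmed copy should no longer trigger those actions when opened, although the embedded payload is still inside it, so treat it as neutralized rather than truly clean.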
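To make the idea of disarming concrete, here is a minimal sketch of one way to do it: swapping the letter case of the trigger names so that a PDF reader no longer recognizes them. As far as I understand it, pdfid works along similar lines, but this is my own simplified illustration and not its code. The name list and the function name are assumptions, and unlike a real tool this sketch does not skip over stream data.

```python
import re

# Names to neutralize (an illustrative selection for this sketch).
TRIGGER_NAMES = [b"/JS", b"/JavaScript", b"/AA", b"/OpenAction", b"/Launch"]


def disarm(data: bytes) -> bytes:
    """Swap the letter case of trigger names, e.g. /Launch -> /lAUNCH."""
    for name in TRIGGER_NAMES:
        # Only match the complete name, not a longer name that starts with it.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        data = re.sub(pattern, lambda m: m.group(0).swapcase(), data)
    return data


if __name__ == "__main__":
    fragment = b"<< /OpenAction 10 0 R >> << /S /Launch >>"
    print(disarm(fragment))
```

Because every replacement has the same length as the name it replaces, the byte offsets in the file's cross-reference table stay valid. The result should always be written to a new file so that the original sample is preserved for analysis.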
If we want to analyze the malicious PDF further, we can use another tool called pdf-parser. It parses a PDF document and identifies the fundamental elements used in the file.
Type the command "pdf-parser /root/Desktop/evil.pdf" (without the quotes).
That will parse the entire PDF and list its objects (we saw earlier that our malicious PDF contains 12 objects). Looking through the output, objects 9 and 10 stand out. pdf-parser can also parse a single object, selected by its number with the -o option. Let us parse object 10 as shown below.
We can see that it has a launch action (/Launch) that runs cmd.exe.
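The same check can be automated across all objects. The sketch below splits raw PDF bytes into objects and reports which ones contain a /Launch name. It is a rough approximation that I wrote for illustration, with made-up function names. It is not how pdf-parser works internally, and it will miss objects hidden inside compressed object streams.

```python
import re

# Matches "N G obj ... endobj" and captures the object number and body.
OBJECT_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj\b", re.DOTALL)


def split_objects(data: bytes) -> dict:
    """Map each object number to the raw bytes of its body."""
    return {int(m.group(1)): m.group(3) for m in OBJECT_RE.finditer(data)}


def objects_with_launch(data: bytes) -> list:
    """Return the numbers of objects whose body contains a /Launch name."""
    return sorted(
        number for number, body in split_objects(data).items()
        if re.search(rb"/Launch(?![A-Za-z0-9])", body)
    )


if __name__ == "__main__":
    # Hand-made fragment; the object numbers mirror the ones in the article.
    sample = (b"9 0 obj\n<< /Type /Action /S /JavaScript >>\nendobj\n"
              b"10 0 obj\n<< /S /Launch /Win << /F (cmd.exe) >> >>\nendobj\n")
    print(objects_with_launch(sample))
```

Any object number this returns is a good candidate for a closer look with pdf-parser's -o option.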
Using pdf-parser with the -c option displays the content of objects that have no stream, or whose stream has no filters.
Looking at the output, we can see a stream in object 8 that looks like shellcode.
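When a stream has no filter, its bytes sit in the file exactly as they were written, so we can pull them out and view them as a hex dump. The following is a minimal sketch of that step, with helper names that I made up for illustration. It only extracts and displays the bytes and cannot decide on its own whether they are shellcode.

```python
import re

# Captures the bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def raw_stream(object_body: bytes):
    """Return the raw stream bytes of one object body, or None if it has no stream."""
    match = STREAM_RE.search(object_body)
    return match.group(1) if match else None


def has_filter(object_body: bytes) -> bool:
    """True if the dictionary part (before 'stream') names a /Filter."""
    dictionary = object_body.split(b"stream", 1)[0]
    return b"/Filter" in dictionary


def hex_preview(data: bytes, width: int = 16, rows: int = 4) -> str:
    """Format the first few rows of data as offset, hex bytes and printable text."""
    lines = []
    for offset in range(0, min(len(data), width * rows), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-made object body with a short unfiltered stream.
    body = b"<< /Length 4 >>\nstream\n\x90\x90\xcc\xc3\nendstream\n"
    if not has_filter(body):
        print(hex_preview(raw_stream(body)))
```

If has_filter returns True, the stream has to be decoded first (FlateDecode, for example, is zlib compression) before the hex dump tells us anything useful.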
That is how we analyze a PDF file.