“The attack surface is the vulnerability. Finding a bug there is just a detail.”
— Mark Dowd
TL;DR
Go grab my Semgrep ruleset for C/C++ vulnerability research, and happy hacking!
Backstory
Over the past few years, I’ve mostly been doing vulnerability research against proprietary and closed-source software. However, at HN Security we’re experiencing an increasing demand for source-assisted penetration tests and white box assessments in general (about time, that’s a welcome evolution!). Therefore, to increase both the speed and the quality of our assessments, we’ve been scouting for tools that could help us automate some of the most boring and repetitive audit tasks. Enter Semgrep.
Semgrep by @r2cdev is amazing! I’ve been postponing checking it out for too long and oh boy I’ve been missing out. https://t.co/vTTC5ZTY0N
— raptor@infosec.exchange (@0xdea) March 13, 2022
A brief intro to Semgrep
According to the official documentation, Semgrep is a lightweight, open-source, static analysis tool for finding bugs and enforcing code standards. It supports many different languages and can find bug variants with patterns that look like source code. Together with the tool, a collection of pre-written rules is provided. You can test rules live using the registry and the playground.
Here’s how a basic rule is able to match a pattern in C source code:
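As a minimal sketch of the idea, the rule below would flag any call to `gets()`, such as `gets(buf);`, whatever its arguments. The rule id and message are illustrative assumptions and are not taken from the official registry or from my ruleset; only the rule keys and the `...` ellipsis operator come from Semgrep's rule syntax.

```yaml
rules:
  # Hypothetical rule id and message, written for illustration only
  - id: example-insecure-gets
    languages: [c]
    severity: ERROR
    message: "gets() cannot limit how much input it reads; consider fgets() instead"
    # The pattern looks like the source code it matches;
    # the ellipsis stands for any sequence of arguments
    pattern: gets(...)
```

Saved to a file (say, a hypothetical `example.yaml`), a rule like this can be run with `semgrep --config example.yaml` followed by the path to scan.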
Nifty, isn’t it? In my opinion, the true strength of Semgrep lies in its simplicity.
My ruleset
We’ve been using Semgrep successfully on our white box web application engagements for a while. However, one thing I noticed when approaching Semgrep’s standard rulesets is that there’s a serious lack of rules for C/C++ code. Since I do a lot of work with these languages, I set out to write my own custom rules for C/C++ vulnerability research. They’re mostly focused on Linux and POSIX systems, and are built on my experience in the field and on some fundamental knowledge resources, such as “The Art of Software Security Assessment” (TAOSSA).
In a few weeks, I managed to put together 36 new rules. I collected them all in this public repository on GitHub: https://github.com/0xdea/semgrep-rules
Here’s my ruleset in action against some sample source code, courtesy of Chris Rohlf:
A word of caution. Semgrep’s support for C/C++ is still experimental and somewhat limited, especially when compared with the mature support that other languages such as Java, Python, and Go enjoy. This means I couldn’t harness the full power of Semgrep’s pattern matching engine. In addition, like most other Semgrep rules, my rules aren’t perfect. That said, they should be able to help you *get things done*. That’s my mantra when it comes to security research.
Conclusion
I hope my Semgrep ruleset will help you with your vulnerability research tasks. I’ll continue to update and improve it in the future. However, no generic ruleset can possibly be a perfect fit for all scenarios. Therefore, I invite you to explore Semgrep yourself and write the custom rules you need. To get started, I recommend their excellent tutorial, which should get you up to speed in an hour or so. Feel free to use my rules as a template. Pull requests are welcome!