Semgrep rules for PHP security assessment

Hi!

According to the official documentation, Semgrep is a lightweight, open-source, static analysis tool for finding bugs and enforcing code standards. It supports many different languages and can find bug variants with patterns that look like source code. Together with the tool, a collection of pre-written rules is provided.

Semgrep is a simple yet powerful tool. Think of it as the Unix grep tool on steroids: it understands the syntax of the language being analyzed, it offers many more features, and it evolves continuously. Support for many languages is mature, but for others, such as PHP, it is still experimental. Many rules are written by the community, so the resulting rulesets are more complete for some languages than for others.
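To give an idea of what a pattern that looks like source code means in practice, here is a minimal sketch of a rule. The rule id, the message, and the choice of exec() as a target are assumptions made for illustration; they are not taken from any published ruleset.

```yaml
rules:
- id: example-exec-call   # hypothetical id, for illustration only
  languages:
    - php
  severity: WARNING
  message: Call to exec() found, review how the command is built
  patterns:
    - pattern: exec(...)   # "..." stands for any list of arguments
```

Unlike a plain grep for the string "exec(", a pattern like this is matched against the parsed code, so occurrences inside comments or string literals should not be reported.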

At the time of writing, only a few public rules are available for PHP, and some of them rely on taint tracking. For SQL Injection, for example, the only public rules are taint-based, and during an assessment I was involved in they did not find much.
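For readers unfamiliar with taint tracking, a taint-mode rule declares sources of untrusted data and sinks where that data becomes dangerous, and it only fires when Semgrep can follow the data from one to the other. The sketch below is a simplified illustration and not one of the public rules; the rule id, the chosen sources, and the mysqli_query() sink are assumptions.

```yaml
rules:
- id: example-tainted-sql-query   # hypothetical id, for illustration only
  mode: taint
  languages:
    - php
  severity: WARNING
  message: User-controlled input may reach a SQL query
  pattern-sources:
    # assumed sources: PHP request superglobals
    - pattern: $_GET
    - pattern: $_POST
  pattern-sinks:
    # assumed sink: any call to mysqli_query()
    - pattern: mysqli_query(...)
```

A rule of this kind stays silent whenever the flow from source to sink cannot be followed, for example when input arrives through a framework abstraction that is not listed among the sources. That is one plausible reason why taint-only rulesets can find little on some codebases.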

So I wrote some new rules of my own: some specific to the engagement (which I obviously cannot publish) and others more general, targeting the PHP language and the Yii PHP framework. My ruleset focuses mostly on SQL Injection, with some rules dedicated to finding instances of Cross-Site Scripting and authorization bypass. These rules were written with limited time; they are not exhaustive and can definitely be optimized. However, they do their job quite well.

The rules can be downloaded from my Semgrep rules GitHub repository: https://github.com/federicodotta/semgrep-rules

A quick note: in PHP (and probably in other languages too), common rules that reference a function argument did not match when the argument had a default value. An example follows:

rules:
- id: test
  languages:
    - php
  severity: ERROR
  message: test
  patterns:
    - pattern-inside: |
        function $FUNC(...,$PAR,...) {
          ...
        }     
    - pattern: $AAAA . <... $PAR ...> . $BBBB;

With the following code, the rule did not match, so I had to duplicate each affected rule with a variant that explicitly declares a default value for the $PAR parameter:

public static function testWorking($aaa, $shortName = "public"): array
{
  $sql = "TEST" . $shortName . "TEST2";
}
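The workaround can be sketched roughly as follows. This is an illustration rather than a rule taken from the repository: the rule id and the $DEFAULT metavariable are names picked for this example, and combining both function signatures with pattern-either (instead of keeping two separate rules) is an assumption about how the duplication could be expressed.

```yaml
rules:
- id: test-with-default-value   # hypothetical id, for illustration only
  languages:
    - php
  severity: ERROR
  message: test
  patterns:
    - pattern-either:
        # original form: parameter without a default value
        - pattern-inside: |
            function $FUNC(...,$PAR,...) {
              ...
            }
        # added form: parameter with an explicit default value
        - pattern-inside: |
            function $FUNC(...,$PAR = $DEFAULT,...) {
              ...
            }
    - pattern: $AAAA . <... $PAR ...> . $BBBB;
```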

After I reported this behavior to Semgrep’s awesome team, they promptly changed it.

If you’re interested in Semgrep and static analysis, you should also check out our Semgrep C/C++ ruleset for vulnerability research.

Cheers!