Cracked password analytics with Kraken

Hi!

We are releasing Kraken, HN Security’s internal tool to analyze passwords cracked during security assessments.

Introduction

During our assessments, it’s common to get our hands on some hashed passwords, often related to an Active Directory (AD) domain. A password dump is usually obtained after compromising a domain controller and it contains the password hash of every account configured in the domain. In the Windows AD case, this hash is usually an NTLM hash, which doesn’t offer strong protection against brute force and dictionary attacks.

At this stage, we usually start a password cracking attack on the dumped hashes to assess the strength of the passwords used to access the domain’s accounts. And guess what? Users’ passwords are usually weak!

This step allows us to get an overview of both the effectiveness of the password policies implemented in the domain and the awareness of the importance of strong passwords. We believe that this information is crucial to assess the security of an organization and, if presented correctly, it can have a significant impact on improving its security posture. However, the critical point is to present the results of this analysis in the best way possible.

First of all, we need to avoid giving anyone the chance to blame the users or, even worse, to find a scapegoat for the analysis results. The focus should always be on password policies and on security awareness among employees, never on individual cases. For this reason, we anonymize the report, excluding any information that would allow identifying the users.

Then, we need to show the impact of the analysis. Indeed, the list of cracked passwords doesn’t provide a lot of information and it is not very useful by itself. For this reason, we need to aggregate the results and extract some useful statistics.

To speed up and standardize this process we developed Kraken. Even if Kraken is designed to integrate with the workflow that we usually follow in this type of activity on Windows AD, it could also be used for similar activities that require password cracking (e.g. Azure AD, LDAP) with minor modifications.

Features

Kraken performs different types of analytics to visualize the results from different points of view, trying to identify patterns in the provided data. We will now take a brief look at the charts automatically generated by Kraken.

Password analysis

This chart is pretty simple, yet crucial in understanding the impact of the analysis. It shows how many of the passwords we were able to crack, giving an immediate idea of the overall security posture.

Character analysis

In this step we analyze the charset used in the cracked passwords. This is important to show potential flaws in the implemented password policy. If we can identify a prevalence of passwords that are based on small charsets, we should probably consider implementing a stricter password policy. However, this chart should be interpreted carefully, knowing that it’s biased because we are considering only the passwords that were successfully cracked. To address this bias, you can also include the not-cracked passwords in this chart by using the --show-not-cracked option.
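
As a rough illustration of what a charset breakdown can look like, here is a minimal sketch. It is not Kraken's actual code: the function names and the four character classes (lower, upper, digit, special) are assumptions chosen for the example.

```python
import string
from collections import Counter


def charset_label(password: str) -> str:
    """Describe which character classes appear in a password."""
    classes = []
    if any(c in string.ascii_lowercase for c in password):
        classes.append("lower")
    if any(c in string.ascii_uppercase for c in password):
        classes.append("upper")
    if any(c in string.digits for c in password):
        classes.append("digit")
    # Anything that is not an ASCII letter or digit counts as "special" here.
    if any(c not in string.ascii_letters + string.digits for c in password):
        classes.append("special")
    return "+".join(classes) or "empty"


def charset_distribution(passwords):
    """Count how many passwords fall into each charset label."""
    return Counter(charset_label(p) for p in passwords)


if __name__ == "__main__":
    for label, count in charset_distribution(["p4ssw0rd", "Password1!", "secret"]).items():
        print(label, count)
```

A distribution dominated by labels such as "lower" or "lower+digit" is the kind of signal that suggests the policy does not enforce much variety.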

Length analysis

This analysis is self-explanatory: we want to check the length of the cracked passwords. This allows us to evaluate the implemented password policy and identify possible flaws. As in the character analysis, we need to be aware that this chart is biased toward shorter passwords, which are easier to crack. Also in this case, the --show-not-cracked option can be used.

Topology analysis

With this analysis, we try to highlight patterns in cracked passwords. The tool will automatically categorize some of the passwords based on some basic patterns. For example, we look for passwords that are based on the username or are composed of repeated words, adding more context to the chart. We can also add custom categories to this analysis with the dictionary and the regex options. Typical examples are passwords that contain the name of the company or the year when the password was changed.

Leaked passwords analysis

This is an optional analysis that automatically checks for the presence of cracked credentials in public data breaches. It gives us an overview of the safety of passwords based on real data, allowing us to get another point of view on password strength. This check is done anonymously: the cleartext passwords never leave the device, to guarantee password safety. We implemented this feature using two different services:

haveibeenpwned

This API offered by haveibeenpwned checks for the presence of the password in a database with hundreds of millions of real-world passwords previously exposed in data breaches.
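
To give an idea of how a password can be checked without sending it anywhere, here is a minimal sketch based on the public Pwned Passwords range endpoint, which accepts only the first five hex characters of the SHA-1 hash and returns the matching suffixes with their counts. We are assuming this endpoint for the example. The helper names and the User-Agent value are hypothetical, and this is not necessarily how Kraken implements the check.

```python
import hashlib
import urllib.request

# Public k-anonymity endpoint of Pwned Passwords (assumed for this sketch).
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str):
    """Return the (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix: str, body: str) -> int:
    """Look for our suffix in a range response made of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Query the range endpoint; only the 5-character hash prefix is sent."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "kraken-example"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_response(suffix, body)
```

The comparison against the returned suffixes happens locally, so neither the password nor its full hash is disclosed to the service.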

Google password manager

This service checks for the user and password pair inside a database of leaked credentials. For more information about the design of this API, you can check out this blog post by Google.

This service is freely available to Google Chrome users. However, the API specification is not publicly available. To overcome this limitation we reverse-engineered the protocol to implement it in our tool.

Note that we need a token to use this API. To extract a valid token, the tool will spawn an instance of Google Chrome. You will need to be logged into the browser with a Google account to generate the token. We only use the token to emulate the requests generated by the browser to check the cracked passwords in the data breaches. However, we suggest using a secondary account for this purpose, to avoid any problems with your main Google account. To avoid launching the browser every time, the tool will save the token in the refresh_token file. If this file is found, the tool will check for the validity of the token, reusing it when possible.

Levenshtein distance analysis

In a lot of cases, we have access to both the current and previous passwords of the users. When this happens, it’s interesting to check if the users tend to always reuse the same password with small changes. This is pretty common when users are required to change their password often, and it basically defeats the purpose of password rotation. Evaluating this phenomenon and visualizing it clearly will help us in understanding if the policies for password changing are effective. For this purpose, we need to define a metric to measure how much two passwords differ. We want a number that summarizes how similar two passwords are to each other.

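The Levenshtein distance is perfect for this purpose as it measures the “distance” between two strings, defined as the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. A low Levenshtein distance value means that the two strings are very similar to each other, while a high value means the opposite. For example, the distance between “Summer2022!” and “Summer2023!” is 1, since a single character substitution is enough to turn the first password into the second.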

Right now, we perform this analysis by computing the distance between the current and the last changed password, without considering the complete password history.
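
For reference, the distance between the current and the previous password can be computed with the classic dynamic programming approach. The following is a minimal, self-contained sketch. The function name is our choice for the example, and Kraken may well rely on a library instead.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("Welcome2022", "Welcome2023!"))  # 2: one substitution, one insertion
    print(levenshtein("kitten", "sitting"))            # 3
```

A user who rotates from "Welcome2022" to "Welcome2023!" ends up with a distance of 2, which is exactly the kind of minimal change this analysis is meant to surface.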

Common passwords

Kraken also creates a ranking of the most common passwords in the cracked password list. This is useful to find more patterns in the password selection. For example, we can identify if a default password is used in the domain. This ranking could also be useful to extract a small set of passwords to add to the final report for the client when presenting the results.

Common hashes

Similarly to the common password ranking, we also rank the most common password hashes. This allows us to identify common passwords that we weren’t able to crack, which can add more information to the previous analysis.

Usage

Kraken is pretty straightforward to use, but, at the same time, it allows customizing the report case by case.

In the most basic case, the tool takes the cracked password file as input and generates the report in the XLSX format. You can try it out with the following command.

python kraken.py --pwd passwords.txt --out report.xlsx

The input file is expected in the format generated by hashcat, the current standard tool for password cracking. This format is pretty simple: every line contains a new set of credentials in the format “username:password”. We also support the hexadecimal format used by hashcat to save passwords that contain special characters. In this case, the password is decoded automatically during the analysis. An example of a valid password file is the following:

username:password
user2:p4ssw0rd
anotheruser:$HEX[706124243a776f726421]
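
To make the format concrete, here is a minimal sketch of how such a line could be parsed, including the $HEX[...] decoding. The function names are hypothetical. We also assume that usernames contain no colon and that decoded bytes can be read as UTF-8, which may not hold for every dump.

```python
import re

# Matches hashcat's $HEX[...] notation with an even number of hex digits.
HEX_PATTERN = re.compile(r"^\$HEX\[((?:[0-9a-fA-F]{2})*)\]$")


def decode_password(raw: str) -> str:
    """Decode a $HEX[...] password, or return the value unchanged."""
    match = HEX_PATTERN.match(raw)
    if match is None:
        return raw
    # Assumption: the bytes are UTF-8; undecodable bytes are replaced.
    return bytes.fromhex(match.group(1)).decode("utf-8", errors="replace")


def parse_line(line: str):
    """Split a 'username:password' line on the first colon only."""
    username, _, raw_password = line.rstrip("\r\n").partition(":")
    return username, decode_password(raw_password)


if __name__ == "__main__":
    print(parse_line("user2:p4ssw0rd"))
    print(parse_line("anotheruser:$HEX[706124243a776f726421]"))
```

In the last line of the example file, the hexadecimal form is needed because the decoded password, pa$$:word!, contains a colon that would otherwise be ambiguous with the field separator.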

Note that the tool also supports the processing of historical password data. Previous passwords are identified by appending to the username the suffix _history and an incremental number. For example, this file contains the password history for the user username.

username:currentpassword
username_history0:previouspassword
username_history1:oldpassword
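
A minimal sketch of how entries following this naming convention could be grouped per user is shown below. The function name and the returned structure are assumptions made for the example, not Kraken's internals.

```python
import re
from collections import defaultdict

# "<user>_history<N>" marks the N-th previous password of <user>.
HISTORY_PATTERN = re.compile(r"^(?P<user>.+)_history(?P<index>\d+)$")


def group_history(entries):
    """Group (username, password) pairs into current and previous passwords.

    Returns {user: {"current": password or None, "history": [passwords by index]}}.
    """
    current = {}
    history = defaultdict(dict)
    for username, password in entries:
        match = HISTORY_PATTERN.match(username)
        if match:
            history[match["user"]][int(match["index"])] = password
        else:
            current[username] = password
    users = set(current) | set(history)
    return {
        user: {
            "current": current.get(user),
            "history": [history[user][i] for i in sorted(history[user])],
        }
        for user in users
    }


if __name__ == "__main__":
    entries = [
        ("username", "currentpassword"),
        ("username_history0", "previouspassword"),
        ("username_history1", "oldpassword"),
    ]
    print(group_history(entries))
```

With this layout, the current password and the first history entry are the pair that a distance-based comparison would use.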

In addition to the password file, the tool can take the following options as input.

Group

This option is used to define a list of users to be considered in the analysis. The information provided with this option has two purposes:

  •  Customize which users to consider in the statistics without needing to modify the password file. In this way, we can limit the analysis to a group of users that we are interested in, ignoring the accounts that are not relevant to our purposes.
  •  Add the information about all the accounts defined in the domain, even the ones that we weren’t able to crack.

This allows the tool to consider in the statistics the non-cracked passwords, providing a better overview of the results.

This option can be passed to the tool with the syntax --group user_file.txt where each line of the user file is a username that we want to consider in the analysis, as in the following example:

user
user1
user2
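
Conceptually, the group file acts as a filter and as the reference list for the accounts that were not cracked. A minimal sketch of this idea follows. The function names are hypothetical, and we assume usernames are compared exactly, including case.

```python
def load_group(lines):
    """Read one username per line, ignoring blank lines."""
    return {line.strip() for line in lines if line.strip()}


def split_by_group(cracked: dict, group: set):
    """Return the cracked credentials in scope and the group users not cracked."""
    in_scope = {user: pwd for user, pwd in cracked.items() if user in group}
    not_cracked = group - set(in_scope)
    return in_scope, not_cracked


if __name__ == "__main__":
    group = load_group(["user\n", "user1\n", "user2\n"])
    cracked = {"user2": "p4ssw0rd", "svc_backup": "Backup2020"}
    in_scope, not_cracked = split_by_group(cracked, group)
    print(in_scope)             # only user2 is both cracked and in the group
    print(sorted(not_cracked))  # user and user1 were not cracked
```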

Show not cracked

This option is used to include the information about the non-cracked passwords, obtained from the group option, in the generated statistics. With this option, we can choose if we want to focus the analysis only on the passwords that we were able to crack or if we want to give a complete overview of the results. This choice is a matter of personal preference and it depends on the actual distribution of the data. Indeed, including this information would make the generated charts unusable if we cracked only a small portion of the passwords. However, if the portion of cracked passwords is significant, adding this information will help give a better overview of the results.

You can toggle this flag by adding the --show-not-cracked option to the command.

Dictionary

As mentioned earlier, this option can be used to define new categories to be used in the password topology analysis. This is useful to customize the analysis and highlight some of the patterns that we identified in the cracked passwords. Indeed, it’s pretty common to find out that a lot of passwords are based on some specific word. Usually, it is something related to the domain that we are testing; the classic example is the name of the company. The tool will take each word in the dictionary file and use it to test the cracked passwords.

To check if a password is based on a specific word, we match it case-insensitively and we also consider substitutions of letters with numbers (leet). For example, if we add the word “password” to the dictionary, all of the following strings will match it and will be categorized based on the word “password”:

  • password123
  • aaaPassWOrd!
  • p4ssw0RD?
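
One simple way to obtain this behavior is to normalize both the password and the dictionary word before comparing them. The sketch below illustrates the idea. The substitution table and the function names are assumptions made for the example, and the rules actually applied by Kraken may differ.

```python
# Hypothetical leet substitutions, applied after lowercasing.
LEET_TABLE = str.maketrans({
    "4": "a", "@": "a",
    "3": "e",
    "1": "i",
    "0": "o",
    "5": "s", "$": "s",
    "7": "t",
})


def normalize(text: str) -> str:
    """Lowercase the text and undo common leet substitutions."""
    return text.lower().translate(LEET_TABLE)


def matching_words(password: str, words):
    """Return the dictionary words that the password is based on."""
    normalized = normalize(password)
    # Words are normalized too, so numeric entries such as years still match.
    return [word for word in words if word and normalize(word) in normalized]


if __name__ == "__main__":
    for candidate in ["password123", "aaaPassWOrd!", "p4ssw0RD?"]:
        print(candidate, matching_words(candidate, ["acme", "2023", "password"]))
```

Normalizing both sides keeps the comparison symmetric, at the cost of some false positives. For instance, a password containing "2o2e" would also be attributed to the word "2023".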

To use this option we can pass it to the tool with the syntax --dictionary dictionary_file. Each line of the dictionary file contains a word that we want to use to categorize passwords, as in the following example:

acme
2023
password

Regex

This option is similar to the dictionary option, but it allows us to be more flexible in the categorization of the passwords.

For example, if we want to highlight the passwords containing a date we could use a regex similar to .*\d{8}.*. This regex would match all of the following passwords:

02121999
test06072022
tt00000000!

To use this option we can pass it to the tool with the syntax --regex regex_file.txt. Each line of the regex file contains a Python regex that will be used to match the cracked passwords, as in the following example:

p(a|4)s.+rd
.*\d{6}.*
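
The following sketch shows how the lines of such a file could be compiled and applied to the cracked passwords. The function names are hypothetical, and we assume the patterns are applied with re.search. The leading and trailing .* in the examples above make them behave the same way with a full match.

```python
import re


def load_patterns(lines):
    """Compile one Python regex per non-empty line."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def categorize(password: str, patterns):
    """Return the source of every pattern that matches the password."""
    return [p.pattern for p in patterns if p.search(password)]


if __name__ == "__main__":
    patterns = load_patterns(["p(a|4)s.+rd\n", ".*\\d{6}.*\n"])
    for candidate in ["p4ssw0rd", "test06072022", "qwerty"]:
        print(candidate, categorize(candidate, patterns))
```

Note that an invalid expression in the file makes re.compile raise re.error, so it can be worth validating the file before running a long analysis.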

Pwdump file

Kraken can also take as input the raw hash dump that we used in hashcat to crack the passwords. This file adds some information on all the passwords, even the ones that we weren’t able to crack. At the moment this file is used only to generate the ranking of the most common password hashes available in the dump.

You can pass this option with the syntax --pwdump hash_dump_file.txt. The file format expected by Kraken is the output format of secretsdump.py. An example of this file format is the following:

octagon.local\Administrator:500:aad3b435b51404eeaad3b435b51404ee:c8ca0f8d1f3ca975464bee8843bceda3:::
Guest:501:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::
krbtgt:502:aad3b435b51404eeaad3b435b51404ee:52db83a2f77be50bdb0f9cad2978d682:::
octagon.local\REVA_JOYCE:1103:aad3b435b51404eeaad3b435b51404ee:b2102052a134db43cdf57649d36aecbb:::
octagon.local\MARYANN_HANSON:1104:aad3b435b51404eeaad3b435b51404ee:57cbd01ad63e05402db21fb22cdedda2:::
octagon.local\MICHELLE_WOLF:1105:aad3b435b51404eeaad3b435b51404ee:65db1dfba7ef28bd9a95c2b4c0a8c213:::
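
As an illustration of how the hash ranking can be derived from this file, here is a minimal sketch that counts the NT hash, the fourth colon-separated field of each line. The function names are hypothetical and this is not Kraken's actual implementation.

```python
from collections import Counter


def parse_pwdump_line(line: str):
    """Return (account, nt_hash) from an 'account:rid:lm:nt:::' line, or None."""
    fields = line.strip().split(":")
    if len(fields) < 4:
        return None
    account, _rid, _lm_hash, nt_hash = fields[:4]
    return account, nt_hash.lower()


def most_common_hashes(lines, top=10):
    """Rank NT hashes by how many accounts share them."""
    counts = Counter()
    for line in lines:
        parsed = parse_pwdump_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts.most_common(top)


if __name__ == "__main__":
    with open("hash_dump_file.txt", encoding="utf-8") as dump:
        for nt_hash, count in most_common_hashes(dump):
            print(count, nt_hash)
```

Hashes shared by many accounts are worth a closer look even when they were not cracked, since they often point to a default or a reused password.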

Check leaked

With this option, you can enable the feature that checks the presence of cracked credentials in public data breaches. As described earlier, this feature supports two different APIs.

You can enable this analysis using the haveibeenpwned API by adding the flag --check-leaked to the command.

If you also want to enable the API used by Google password manager, you can add the --check-leaked-google flag too.

Download

You can find Kraken on GitHub. Enjoy!