MD5 is a very commonly used function for calculating hashes.
Beginners often use it for password hashing, which is not a good idea because there are many ways to retrieve the original password.
This article describes specific methods for doing so.
All security is built on the fact that it takes a disproportionately long time to try all passwords. Well, it should. The problem with the md5()
algorithm in particular is that it is a very fast function. On a normal computer, it is no problem to compute over a million hashes per second.
If we break the password by trying combinations one by one, it is a brute force attack.
There are several strategies:
There are many more methods, this article describes only the most common ones.
All combinations of letters, numbers and other characters are tried one at a time.
The generated attempts are hashed one by one and compared with the original hash.
So, for example:
aaaaaaabaaabaaacaaadaaaaaaaaaf...
The problem with this attack lies in the md5()
algorithm itself, if we were to try only the lower case letters of the English alphabet and numbers, it would take at most tens of minutes to try all the combinations on a commonly available computer.
It is therefore important to choose long passwords, preferably random ones with special characters.
People usually choose weak passwords that exist in the dictionary.
If we take advantage of this fact, we can quickly discard unlikely variants such as 6w1SCq5cs
and instead guess existing words.
Furthermore, we know from previous password leaks of large companies that users choose a capital letter at the beginning of the password and a number at the end. Let's see - does your password have that too? :)
Since one password always corresponds to the same hash, you can easily recalculate a huge database in which passwords will be searched first.
In fact, searching is always orders of magnitude faster than searching hashes over and over again.
In addition, for larger data leaks, passwords can be hashed in parallel in this way and, for example, 10% of all user passwords can be quickly retrieved.
A good password database is for example Crack Station.
Many simple passwords are known directly to Google because it indexes pages that contain hashes.
I always use Google as my first option. :)
The Dirichlet Principle describes that if we have a set of hashes that are always 32 characters long, then there are at least 2 different passwords of 33 characters (one longer) that generate the same hash.
In practice, it doesn't make sense to look for collisions, but sometimes the application author himself makes the guessing easier by recomputing the collisions.
For example:
$password = 'password';for ($i = 0; $i <= 1000; $i++) {$password = md5($password);}echo $password; // hashed 1000 times via md5()
In this case, it makes sense to guess the collision rather than the original hash.
Cheers to basting!
Jan Barášek Více o autorovi
Autor článku pracuje jako seniorní vývojář a software architekt v Praze. Navrhuje a spravuje velké webové aplikace, které znáte a používáte. Od roku 2009 nabral bohaté zkušenosti, které tímto webem předává dál.
Rád vám pomůžu:
Články píše Jan Barášek © 2009-2024 | Kontakt | Mapa webu
Status | Aktualizováno: ... | en