
Has your password been leaked? Theoretical part.

Wojciech Stróżyński 01.04.2019

Today's topic: computers, passwords and the internet! In this first part we will learn a bit about passwords and how they are stored, and how to make sure our data is safe.

In the second part, we will write a simple script in Python and use the web API to securely check our own password.

Has my password ever been leaked? I have asked myself this question from time to time, but how can I really know? And is it a threat to me? Well, it is.

From time to time a service suffers a security breach and a database comes to light, for example a portal's customer database. Such leaks are later compiled into databases of passwords, email addresses and who knows what else. Email addresses alone can be used in a thousand ways. With the address and password for our accounts, someone can collect a very large amount of data about us. After all, how many of us log into different portals with the same email address and password? Certainly a few of us do.

Fortunately, the days when passwords were kept in databases as plain text are largely behind us. Passwords leaked in plain text could easily be turned into dictionaries that were later used in attacks. Today, it is not the passwords themselves that are stored but their hashes. So, to read our password after a leak, an attacker must first find an input that produces the same hash as our password, which is no longer so easy. This is a good point to explain what a hash function is and why we need it at all.

What is a hash function?

To quote Wikipedia:

Hash function (also called a digest function or hashing function): a function that assigns to an arbitrarily large number a short, always fixed-size, non-specific, quasi-random value, the so-called irreversible digest.

In a nutshell, a hash is a fixed-size value computed from an arbitrarily large set of data. Most importantly, it is one-way: the data passed to the hash function cannot be recovered from its result. Of course, depending on the hash function, cryptographic attacks may exist that help find an input, but that topic is too complex for today’s article. There is, however, one simple but very time-consuming way to find this data.

The easiest way is to take a database of passwords, compute the hash of each one, and check whether the hash we are looking for is in the list we generated. Of course, covering everything would require hashes of all possible combinations, which is practically impossible given the number of possibilities. Nevertheless, dictionaries of hashes are built for passwords that have already been leaked. If your password is on such a list, anyone holding its hash can find out which password you used. That’s why it’s so important that your password is unique.
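To make the dictionary idea concrete, here is a minimal sketch in Python. The list of leaked passwords is made up, and SHA-1 is chosen only for illustration; a real service may use a different hash function.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hash of text as an uppercase hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest().upper()


# Hypothetical sample of passwords that appeared in some leak.
leaked_passwords = ["123456", "password", "qwerty"]

# Precompute hash -> password, the way an attacker's dictionary would.
hash_dictionary = {sha1_hex(p): p for p in leaked_passwords}


def recover(password_hash: str):
    """Return the password behind a hash if it is in the dictionary, else None."""
    return hash_dictionary.get(password_hash.upper())


stolen_hash = sha1_hex("qwerty")          # what a leaked database would hold
print(recover(stolen_hash))               # found, because the password was not unique
print(recover(sha1_hex("k7#Vq-unique")))  # None: this password is not in the dictionary
```

The lookup itself is instant. All the work is in building the dictionary, which is why passwords that have already leaked are the cheapest ones to crack.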

Hashes are used very often in computer science, for example as checksums to validate data. Changing a single bit of the input (for example, a CD image) produces a completely different hash than the correct data would. Two identical files, however, always produce identical hashes. This is a very important property.
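We can check these properties ourselves with a few lines of Python. This is only a sketch: SHA-1 and the sample inputs are my own choice for the demonstration, not something a particular service is known to use.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"hello world"
# Flip the lowest bit of the first byte, so exactly one input bit changes.
modified = bytes([original[0] ^ 0x01]) + original[1:]

digest_original = hashlib.sha1(original).digest()
digest_modified = hashlib.sha1(modified).digest()
digest_large = hashlib.sha1(b"x" * 1_000_000).digest()

# Fixed size: a short input and a megabyte of data give digests of equal length.
print(len(digest_original), len(digest_large))

# One flipped input bit changes a large share of the output bits.
print(digest_original.hex())
print(digest_modified.hex())
print(differing_bits(digest_original, digest_modified), "of 160 bits differ")

# Deterministic: the same input always gives the same hash.
print(hashlib.sha1(original).digest() == digest_original)
```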

So, has my password been leaked?

I don't know ¯\_(ツ)_/¯
But I know how to check it!

If you are interested in data leaks, you have probably come across https://haveibeenpwned.com, and if not, I heartily recommend it. The man behind it collects leaked data and loads it into one large database. He says he’s doing it for our own good, and I even believe him. The site lets you register your email address, so if it turns up in a new data leak, the author will notify you. We can also check whether our email address has already appeared in any leaks, and even see whether the password we use has ever been leaked!

Just enter your password on this page: type it into one of the fields and click check. Your password flies off in an unknown form to an unknown server, which replies either that everything is OK or that the password has been leaked. Did a red warning light just come on? It should have.

What not to do online?

While entering an email address on a trusted site is not that bad, entering your password into any form should always make you reflexively check whether the connection is encrypted, whether the address is correct, and whether the certificate is valid. Preferably twice. Also, our password may not be used only by us. I know it sounds strange, but someone else can use the same password as we do. When we check leaks for our email address, we cover only part of the potential threats. That way we cannot confirm that our password is absent from the huge password dictionaries.

Well, we could also take a chance, enter our password on some random website, and trust that it will not store the password in its own database and simply tell us that we are safe. There used to be sites for checking whether your credit card details had been leaked. It was enough to enter the card number, expiration date and CVV code, and we could be sure… that our card details had just fallen into the hands of thieves.

Fortunately, we can do something that makes for a perfect small programming exercise. The site provides an API (no, I won’t explain what that is again). What’s more, when we write the script ourselves, we have full control over what happens to our data and what we send over the Internet. We will write the script itself in the second part of the article, but we can already look at what the API offers and how, concretely, it lets us check whether our password has been leaked.

API – introduction

We know we can’t just send our password. We also know we can’t send the full hash, because anyone who knows that value can look the password up in a dictionary of leaked hashes. So what can we do? Let’s see how cleverly the site’s API handles this.

We can find the API documentation HERE. Thanks to it, we can send a query that contains only partial data: specifically, the first five characters of our password’s hash. In return, the server responds with a collection of hashes that share the same prefix, each with a value indicating how many times that password appeared in the database. This is a good compromise. We get about 550 entries, which we then have to search ourselves. But what is a collection of 550 entries compared with searching a database holding data from 7761152394 leaked accounts? We could even do it by hand. Nothing prevents you from playing with the API right now: for example, just send /12345 as the first five characters of the hash. The server should answer with something. The API will be discussed in more detail in the second part of the article.
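Before we get to the real script, here is a rough offline sketch of the idea. It assumes that the hash is SHA-1 written as uppercase hex and that the server answers with plain text, one SUFFIX:COUNT pair per line, where SUFFIX is the rest of the hash after the first five characters. Check both assumptions against the API documentation. The function names and the sample response are made up for illustration, and no network request is sent.

```python
import hashlib


def split_hash(password: str):
    """Hash the password and split it into (first 5 chars, remaining chars)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def parse_range_response(body: str) -> dict:
    """Turn lines of 'SUFFIX:COUNT' into a {suffix: count} dictionary."""
    result = {}
    for line in body.splitlines():
        suffix, separator, count = line.strip().partition(":")
        if separator:  # skip empty or malformed lines
            result[suffix.upper()] = int(count)
    return result


def leak_count(password: str, body: str) -> int:
    """Return how many times the password appears in the response, 0 if absent."""
    _, suffix = split_hash(password)
    return parse_range_response(body).get(suffix, 0)


prefix, suffix = split_hash("password")
print("Only this part would leave our machine:", prefix)

# Made-up response: one unrelated entry plus our own suffix with an invented count.
fake_body = f"0000000000000000000000000000000000A:2\r\n{suffix}:42\r\n"
print(leak_count("password", fake_body))
print(leak_count("some other password", fake_body))
```

The comparison with our own suffix happens locally, so the server never learns which of the returned entries, if any, we were asking about.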

What do I need to use Python?

We’re going to need a working Python environment and some kind of text editor. HERE you will find information on how to set it up on Windows. Most Linux distributions come with Python already installed.

You will also need to know what a dictionary (an associative container) is. It would also help to have written at least one simple program in Python.
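If dictionaries are new to you, this short sketch covers the operations we will rely on. The names and numbers are invented for the example.

```python
# Hypothetical data: how many times each family member's password showed up in leaks.
leak_counts = {"mom": 0, "dad": 12}

leak_counts["sister"] = 3              # add or overwrite an entry by key
print(leak_counts["dad"])              # look up a value by its key
print(leak_counts.get("brother", 0))   # fall back to a default for a missing key
print("mom" in leak_counts)            # check whether a key is present

# Walk over all key/value pairs.
for name, count in leak_counts.items():
    status = "leaked" if count > 0 else "not found"
    print(f"{name}: {status}")
```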

In the next part, we’ll learn how to use a simple script to speed up checking the passwords of all family members. Maybe it will be a good opportunity to learn a new language, or to put a project on a private GitHub? Practice makes perfect!