Hashing in Python

A hashing function is a one-way function that maps data of arbitrary size to a fixed-size value. Practical applications range from error checking (for example, in packets and data transmission) to secure password storage.

Python provides a built-in module for hashing, called hashlib. It supports many different hashing algorithms and always includes at least sha1, sha224, sha256, sha384, sha512, blake2b and blake2s.

The md5 algorithm is also available unless you are using a FIPS-compliant build of Python (such builds are rare, so you are probably not using one).

Additional algorithms may also be available, depending on the OpenSSL library that Python uses on your platform. On most platforms the sha3_224, sha3_256, sha3_384, sha3_512, shake_128 and shake_256 algorithms are also available.

You can list the algorithms available on your platform by running print(hashlib.algorithms_available).
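
As a minimal sketch, the snippet below compares the two sets that hashlib exposes: algorithms_guaranteed (present on every platform) and algorithms_available (everything usable in the running interpreter). The variable names are illustrative, and the exact contents of the second set depend on your Python and OpenSSL builds, so no output is shown.

```python
import hashlib

# Algorithms that hashlib promises to provide on every platform.
guaranteed = hashlib.algorithms_guaranteed

# Algorithms usable in this particular interpreter; this usually also
# contains extra names exposed by the linked OpenSSL library.
available = hashlib.algorithms_available

print(sorted(guaranteed))
print(sorted(available - guaranteed))  # platform-specific extras, may be empty

# Any available algorithm can also be requested by name.
by_name = hashlib.new("sha256", b"example").hexdigest()
direct = hashlib.sha256(b"example").hexdigest()
print(by_name == direct)  # True
```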

Some hashing algorithms included in the module are insecure - refer to "Hashing collisions" for more info.

Basic hashing

Creating a basic hash is simple. First, you need the input stored as bytes. You can then calculate the hash from it, optionally converting the result into a hexadecimal string using hexdigest.

import hashlib

input_string = "This is a string to be hashed".encode()
output = hashlib.sha512(input_string)

output_hash = output.hexdigest()  # b339bca859a183b05fcde0bd538df0745ab6344e53578afc96e649df060304af65f9223453757a3368a584bdfac54275fbfc0b05bcd86127f45e91475c235a17

# ----- CONDENSED -----
output_hash = hashlib.sha512(input_string).hexdigest()
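
To illustrate the difference between the raw digest and its hexadecimal form, here is a short sketch. It also shows that the output size stays fixed regardless of the input size. The variable names are illustrative only.

```python
import hashlib

h = hashlib.sha512(b"This is a string to be hashed")

raw = h.digest()       # the hash as a bytes object
hexed = h.hexdigest()  # the same hash as a hexadecimal string

print(h.digest_size)         # 64 (bytes, i.e. 512 bits)
print(len(raw), len(hexed))  # 64 128 (two hex characters per byte)
print(raw.hex() == hexed)    # True

# A much larger input still produces a digest of the same size.
big = hashlib.sha512(b"x" * 1_000_000).digest()
print(len(big))  # 64
```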

Adding to a hash

hashlib provides the ability to update a hash with more data, similar to concatenating strings. For example:

import hashlib

h = hashlib.md5()
words = "this is a string"
for char in words:
    h.update(char.encode())  # feed each character into the hash as bytes
h.hexdigest()  # results in the hash of "this is a string"

This outputs the same hash as passing the whole string at once, as in the first method.
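
The equivalence can be checked directly. The sketch below uses sha256 rather than md5 (an arbitrary choice for illustration) and feeds the same bytes in two pieces:

```python
import hashlib

message = "this is a string"

# One-shot: hash the whole message at once.
one_shot = hashlib.sha256(message.encode()).hexdigest()

# Incremental: feed the same bytes to the hash object in two pieces.
h = hashlib.sha256()
h.update(b"this is ")
h.update(b"a string")
incremental = h.hexdigest()

print(one_shot == incremental)  # True
```

Only the concatenated byte sequence matters, not how it is split across update calls.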

Copying a hash instance

In combination with the previous example, you can use the copy method to compute the hashes of two strings that share a common initial substring, without hashing that substring twice.
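
A minimal sketch of this, assuming two example strings that both start with "this is " (the strings and variable names are illustrative):

```python
import hashlib

# Hash the shared prefix once.
prefix = hashlib.sha256(b"this is ")

# Each copy starts from the prefix state and can be extended independently.
first = prefix.copy()
first.update(b"a string")

second = prefix.copy()
second.update(b"another string")

# Each result matches hashing the full string from scratch.
print(first.hexdigest() == hashlib.sha256(b"this is a string").hexdigest())         # True
print(second.hexdigest() == hashlib.sha256(b"this is another string").hexdigest())  # True
```

Updating a copy does not affect the original object, so prefix can be reused for any number of suffixes.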

Hashing collisions