Cryptographic hashing

The Hashable typeclass is suitable for hash maps, but sometimes we need something stronger. A hash function is a good cryptographic hash if cryptographers are confident that nobody can feasibly find two different inputs that produce the same output. Such a pair of inputs is called a collision.

The cryptonite package provides a large collection of cryptographic primitives. Its hash algorithms are in the Crypto.Hash module.

import Crypto.Hash

The input to a hash function is a byte string, so we will use this type from the bytestring library.

import Data.ByteString (ByteString)

In this example we will be hashing text, so we need to convert between character strings and byte strings. We’ll use UTF-8, one of the more popular character encodings. The utf8-string package will help with this.

import qualified Data.ByteString.UTF8 as UTF8
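As a small aside, the following sketch shows what the UTF-8 conversion does to a string. The helper name utf8Bytes is made up for this illustration and is not part of the program we are building; it only assumes the fromString and toString functions from the utf8-string package.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.UTF8 as UTF8
import Data.Word (Word8)

-- Hypothetical helper for illustration: the UTF-8 bytes of a string.
utf8Bytes :: String -> [Word8]
utf8Bytes = BS.unpack . UTF8.fromString

main :: IO ()
main = do
  -- ASCII characters encode to one byte each.
  print (utf8Bytes "abc")   -- [97,98,99]
  -- The character U+00E9 (written here as the escape \233) takes two bytes.
  print (utf8Bytes "\233")  -- [195,169]
  -- Decoding reverses the encoding.
  print (UTF8.toString (UTF8.fromString "na\239ve") == "na\239ve")  -- True
```

Because the hash function sees only bytes, the choice of encoding affects the digest of any text that contains non-ASCII characters.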

The output (or “digest”) of a hash function is also a string of bytes. So in order to print a digest, we need some way to represent its bytes as text. It is common to use a hexadecimal encoding, in which each byte is written as two characters. The convertToBase function in the memory library will do this for us.

import Data.ByteArray.Encoding
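To see the hexadecimal encoding on its own, here is a minimal sketch that encodes three arbitrary bytes. The helper name toHex is an assumption made for this illustration; the only library call it relies on is convertToBase with the Base16 constructor.

```haskell
import Data.ByteArray.Encoding (Base (Base16), convertToBase)
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- Hypothetical helper for illustration: hex-encode a byte string.
toHex :: ByteString -> ByteString
toHex = convertToBase Base16

main :: IO ()
main =
  -- The bytes 0, 171, and 255 become two hex digits each.
  print (toHex (BS.pack [0, 171, 255]))  -- "00abff"
```

Each input byte turns into two output characters, so a 32-byte SHA-256 digest is printed as 64 hexadecimal digits.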

We’ll define a function named sha256. This function will calculate a string’s hash according to SHA-256, a popular cryptographic hash function.

sha256 :: String -> String
sha256 input = result
  where

First we convert the input to a byte string so that it can be fed into the hash function.

    bytes = UTF8.fromString input      :: ByteString

Then we apply the hash function.

    digest = hashWith SHA256 bytes     :: Digest SHA256

We convert the result to base 16 (hexadecimal).

    hex = convertToBase Base16 digest  :: ByteString

Finally we use the UTF-8 library again, this time in reverse; we decode the byte string to interpret it as a list of characters so that we can print it to the terminal.

    result = UTF8.toString hex         :: String

In the sha256 function, we chose to use type annotations; each expression is followed by :: and then the type of the expression. We include these only for clarity; they are optional, and the program would work the same if we had omitted them.

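To illustrate that the type annotations inside sha256 are optional, here is one possible way to write the same steps as a single pipeline with no intermediate annotations. The name sha256' is chosen only to set this sketch apart from the definition above; the compiler can infer every intermediate type from the functions on either side.

```haskell
import Crypto.Hash (SHA256 (SHA256), hashWith)
import Data.ByteArray.Encoding (Base (Base16), convertToBase)
import qualified Data.ByteString.UTF8 as UTF8

-- Hypothetical variant of sha256: the same four steps, read right to left.
sha256' :: String -> String
sha256' =
  UTF8.toString            -- 4. decode the hex bytes into characters
    . convertToBase Base16 -- 3. encode the digest as hexadecimal
    . hashWith SHA256      -- 2. apply the hash function
    . UTF8.fromString      -- 1. encode the input as UTF-8 bytes

main :: IO ()
main = putStrLn (sha256' "abc")
```

Whether the step-by-step version or the pipeline reads better is a matter of taste; both compute the same result.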
main =
  do

To demonstrate the sha256 function, this program prints the hashes of a couple of strings.

    putStrLn ("sha256(abc)   = " ++ sha256 "abc")
    putStrLn ("sha256(hello) = " ++ sha256 "hello")

The output appears random, but it is not; if you run the program repeatedly, it will produce the same output each time.

We have truncated the digests shown here for brevity.

$ runhaskell crypto-hashing.hs
sha256(abc)   = ba7816bf8f01cfea414140de5dae2223[...]
sha256(hello) = 2cf24dba5fb0a30e26e83b2ac5b9e29e[...]

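We can test that these are the same results we get from the sha256sum command-line utility. First we write each string to a file; the -n flag stops echo from appending a newline, which would otherwise change the hash.

$ echo -n "abc" > 1.txt

$ echo -n "hello" > 2.txt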

$ sha256sum 1.txt 2.txt
ba7816bf8f01cfea414140de5dae2223[...]  1.txt
2cf24dba5fb0a30e26e83b2ac5b9e29e[...]  2.txt
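The same check can also be done from Haskell by hashing a file's bytes directly, with no text encoding step. The following is only a sketch: the helper name sha256File is made up for this illustration, and it assumes a file named 1.txt exists in the current directory.

```haskell
import Crypto.Hash (Digest, SHA256, hash)
import qualified Data.ByteString as BS

-- Hypothetical helper (not part of the original program):
-- read a file's raw bytes and compute their SHA-256 digest.
sha256File :: FilePath -> IO (Digest SHA256)
sha256File path = do
  bytes <- BS.readFile path
  return (hash bytes)

main :: IO ()
main = do
  digest <- sha256File "1.txt"
  -- The Show instance for Digest renders the bytes as hexadecimal.
  print digest
```

Here the result type Digest SHA256 tells the hash function which algorithm to use, and printing the digest avoids the separate hexadecimal conversion.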