Hashing

For a type to be used in a HashSet or as the key in a HashMap, the type must implement the hash method of the Hashable typeclass.

When you define a new datatype, you can get an automatically-generated hash function with the help of a few language extensions.

class Hashable a where
    hash :: a -> Int
    {- ... -}

The default instance of Hashable uses generics.

{-# LANGUAGE DeriveGeneric #-}

Since the default Hashable instance does not require us to write any code at all, we can derive it with the anyclass strategy.

{-# LANGUAGE DeriveAnyClass #-}

We use the deriving strategies extension to be explicit about which deriving mechanism is in use for each typeclass.

{-# LANGUAGE DerivingStrategies #-}

The Data.Hashable module comes from the hashable package.

import Data.Hashable (Hashable (hash))
import Data.Word (Word8)
import GHC.Generics (Generic)

For this example, we’ll define a type called Color, with a byte (Word8) representing each of red, green, and blue.

data Color =
    Color
        { red   :: Word8
        , green :: Word8
        , blue  :: Word8
        }

Typeclasses with specific deriving support built into the compiler are called stock-derivable.

    deriving stock (Show, Generic)

The presence of the Generic instance allows us to derive Hashable using the anyclass strategy.

    deriving anyclass (Hashable)

In the REPL, we’ll demonstrate by showing the hash of the colors red and green.

$ ghci hashing.hs
Ok, one module loaded.

It isn’t really important to us what the Int hash results are; all that matters is that it is improbable for two different values to give us the same hash.

λ> hash (Color 255 0 0)
-2971258545394699232

λ> hash (Color 0 255 0)
-2788793491217597546

Please note that this sort of hashing is different from secure cryptographic hashing.

Next: