Lesson 3: Laws and orders

Contents
  • Let’s Review
  • Flipped Either
  • The function type
  • More on kinds
  • Function map
  • What isn’t a functor
  • Coming up

In previous lessons, we have looked at some common functors, what their Functor instances look like, and how to use them. This lesson will look at some unusual but important cases of this cornerstone typeclass, as well as some types that are not instances of Functor, which will hopefully lead us to a better understanding of what Functor is about and why it matters so much.

Let’s Review

By now, we have looked at several common functors, some having one type parameter and some, such as Either and the tuples, having multiple type parameters. We have noted that reducing the arity of type constructors having multiple type parameters requires applying those constructors to one or more type arguments. Only the rightmost type argument, then, is viable for providing the value that fmap will apply a function to.

So most of our instances so far have been a straightforward action of applying the function to the rightmost variable. The InstanceSigs extension.

That isn’t going to change in this lesson, because the rules of Functor are what they are. But today we’ll look at some instances that will expand our expectations of what this means.

Flipped Either

We noted in the previous lesson that the basic Functor instance for the Either type constructor is one reason using Left l as the error or failure case is such a common pattern in Haskell. Often, when something goes wrong, we want that error to propagate and evaluation to stop; often, we do not want to apply a function to that failure or error value.

However, there are times when we do. We may be able to think of times when we want to try to work with that error in some way, rather than merely catch or report it. For those times, we may want to have a sort of flipped-around Either functor that would return a Right value unchanged but apply some function to the Left side instead.

We’re talking about Haskell, so, of course, we can have that, if we write it. Let’s walk through how to write it. We’ll use different names here to prevent a name clash with the datatype and constructors already in Prelude, but you could also hide Either from the current scope or work without the Prelude entirely.

Since we know the Functor instance will have to be written to apply the function to the rightmost value, what we need to do is write a new type. The new type will be equivalent to Either but with the type parameters flipped around.

The order of arguments in the standard Either type is the same on both the left and right sides of the equals sign.

In our FlippedEither type, the order of parameters on the left side of the equals sign is the opposite of the order in which they appear on the right side.

With the standard Either, applying it to its first – that is, leftmost – type parameter means we’ve applied away the type that is part of the Left data constructor. With our FlippedEither type, our partial application will be to the type parameter that appears in the rightmost data constructor.

We note that the names of the data constructors of the Either a b type are generic; the name Left doesn’t say anything about whether it’s an error or not. However, as we pointed out, it’s become common and idiomatic to use it that way, because of the way the standard functors work. We’ve named our constructors Error and Success to reflect an intention that we’d use this instance at times when we specifically want to modify errors rather than returning them unchanged and possibly stopping the program.

So, now we can write ourselves a Functor instance for this type.

Any Success _ value gets returned unchanged, just like any Left _ value must in the normal instance for Either. The a -> b function will apply to the Error value. Again, we can compare. For the REPL sessions in this lesson, we’ll turn on a GHCi feature that displays the type of whatever GHCi evaluated last, a value GHCi calls it. We can enable this feature with :set +t.

λ> :set +t 

λ> fmap ("Hello, " ++) (Left "error")
Left "error"
it :: Either [Char] [Char]                   -- 1

λ> fmap ("Hello, " ++) (Right "Julie")
Right "Hello, Julie"
it :: Either a [Char]                        -- 2

λ> fmap ("Game over, " ++) (Error "Julie")
Error "Game over, Julie"
it :: FlippedEither b [Char]                 -- 3

λ> fmap ("Game over, " ++) (Success "Julie")
Success "Julie"
it :: FlippedEither [Char] [Char]            -- 4

Take a look at the inferred types for these expressions. Do you notice a difference? With the Right of Either and the Error of FlippedEither, the leftmost type parameter is fully polymorphic:

it :: Either a [Char]              -- 2
it :: FlippedEither b [Char]       -- 3

GHCi can infer nothing about the types of those parameters from this context. On the other hand, when we are trying to apply the function to the data constructor that contains the leftmost parameter – Left and Success – then the type inferencer can see the type of the data they’re holding. It can furthermore infer the type of the other parameter from the type of the function that we’re trying to apply:

it :: Either [Char] [Char]         -- 1
it :: FlippedEither [Char] [Char]  -- 4

Since the inferencer knows that function can apply to the value in the Right or Error constructors, it can safely infer the type of those arguments. But when we’re applying the function to the value its meant to be applied to, giving it no information about the other parameter, the type of the leftmost parameter remains rigidly unknowable.

Exercise

Write a re-ordered tuple functor, such that fmap (+1) (3, 4) returns (4,4) rather than (3,5).

The function type

In this section, we’ll be discussing what is surely the most common and yet least considered datatype of all: the function type.

The function arrow is a type constructor in Haskell. Like IO, it’s a type that we can’t inspect too much, and it, too, has a Functor instance already in base. The type constructor is the arrow -> we use in type signatures; that constructs the type of a function. The function type is an opaqueYou may hear datatypes that don’t expose their constructors called abstract datatypes. We use ‘opaque’ instead of ‘abstract’ because we find ‘opaque’ to be more relevant to the meaning that we care about here. Also, it leads to people using the initialism ADT to mean both algebraic datatype and also abstract datatype, which seem to us infelicitous. datatype; that is, its data constructors are not exposed. That means you can’t use them for pattern matching or explicitly construct them at term-level. At the term level, while we do not see its data constructors, we can think of it as constructing functions themselves, in the sense that Maybe a at the type level constructs a type when applied and a Just <value> term at the term level.

Unlike most type constructors, the function arrow is an infix operator at the type level. When we query it, we need to wrap it in parentheses, as we do when we query the type of any term-level infix operator.Infix type operators other than (->) aren’t very common, although there are some in the GHC.Generics module. With the TypeOperators language extension, it is possible to define your own custom type-level infix operators, if you wish.

We can query the :info for the function type to see the datatype in GHCi.

λ> :info (->)
data (->) (a :: TYPE q) (b :: TYPE r) 
infixr 0 ->

There’s some unfortunate noise there, and we will simplify it for the purposes of this lesson, although since we’ve talked about kinds, it’s worth talking about what that noise means before we do.

More on kinds

We mentioned previously that GHCi prints the kind of most concrete types as stars * but they are also called Type. So a type constructor with one parameter can be read as having the kind * -> * or Type -> Type. For now, the * is synonymous with Type, but there are plans to remove the use of * from GHC, so it’s best to be familiar with both.

TYPE in all caps, though, is different.

The Haskell Report only gives a name to one kind, and that is *, the kind of concrete types, although it allows for the existence of other kinds as well as for type constructors, which are functions of kind -> kind.See section 4.1.1 of the 2010 Haskell Report for more. Usually the star is enough, because what we are usually concerned with is the arity of the type constructor, but standard GHC has at least one more, called #. Some recent developments in GHC have essentially collapsed the distinction between kinds and types, which means talking about the “number of kinds” is sort of pointless. For example, using the DataKinds extension raises the possibility of having many different kinds of types, because it promotes types to the kind level (and values to the type level), which allows you to do some pretty exciting things. In practice, this is not something you probably need to spend a lot of time worrying about, and for the purposes of this course, what we care about is the arity, which is still represented by being kind-level functions. The hash sign is the kind of all the unlifted, unboxed, primitive machine types. You will rarely need to deal with it, unless you’re doing something very low-level like binding a low-level library or dealing with the compiler directly. However, the presence of this kind means that GHC has introduced the concept of a kind constructor.

Just like there are types like Bool and also type constructors like Maybe a, there are kinds and there are also kind constructors – or, there is one kind constructor: TYPE. We’ve said about type constructors that they construct a type when applied to a type, and this is the same thing just one level up. TYPE constructs a kind – we’ll say for simplicity that it constructs a kind * or # when applied to a constructor such as LiftedRepThe RuntimeRep datatype on Hackage. to construct the kind of a lifted type (which is most of the types and certainly all the ones that matter to us in this course).

Once again, this is the kind signature of (->).

λ> :kind (->)
(->) :: TYPE q -> TYPE r -> *

And what it says is:

  • This type constructor has two parameters;
  • each of those parameters may have different kinds;
  • the result type is always a lifted type, represented by * or Type.

We don’t care about the other possible kinds, so we’ll read the kind signature this way for the purposes of this course:

λ> :kind (->)
(->) :: * -> * -> *

Function map

For the purposes of this course, the kinds of the parameters to the function type need not concern us, so we will read the datatype this way throughout the course:

What we care about is this type having two parameters and, thus, the wrong kindedness for a Functor instance. But we can see when we query its :info that it has a Functor instance:

λ> :info (->)
data (->) (a :: TYPE q) (b :: TYPE r)
infixr 0 ->
instance Applicative ((->) a)
instance Functor ((->) r) 
         ^^^^^^^

So we know we can write one and we know from the kindedness that we will need to do some partial application of this type constructor in order to do it.

As we’ve encouraged you to do in previous lessons, you may at this point go to your REPL and try querying the type of fmap when applied to the function arrow applied to its first argument.

λ> :type fmap @((->) _)
fmap @((->) _) :: (a -> b) -> (w -> a) -> w -> b

Recalling that the underscore represents applying the arrow to its first argument of some type that we don’t care about and that GHCi will represent that “wildcard” with a w, you may have already figured out for yourself what this is, but we’ll walk through it here.

So far, we know three things for sure, keeping GHCi’s usage of w for a wildcard type that we’re not paying attention to:

  • the type of fmap is (a -> b) -> f a -> f b;
  • the f for this instance is ((->) w);
  • and ((->) w) is prefix notation for (w -> ).

We’re going to reason our way through this, rather than try to compile it. First, we write the type of fmap for this instance, translating the f to ((->) w).

Next we rearrange that function arrow from prefix to its usual infix position, to be more readable.

We know that’s equivalent to this.

So we can see that it’s … a series of functions. If we gave it a w, it would produce an a and from that a produce a b, like a path for getting from a value of type w to a value of type b through two other functions. What we want is the b value in the end, since we don’t want to return a function as a final result (and we couldn’t show it if we did). So we can remove those last parentheses.

It may still look unfamiliar to you due to the names. This has the types named like we expect them to be in an fmap implementation. But we could change the names and, as long as we did so consistently, it would be semantically equivalent.

That’s consistent: w became a, a became b, and b became c. And that’s the same type as the type of function composition.

λ> :type (.)
(.) :: (b -> c) -> (a -> b) -> a -> c

So, we can write a Functor instance for a partially-applied function with function composition.

The functor of functions is function composition! At least this is a functor of the function type; we’ll talk about another when we get to Contravariant.

λ> ((+2) . (*10)) 5
52

λ> (fmap (+2) (*10)) 5
52

λ> (take 5 . reverse) "hello julie"
"eiluj"

λ> (fmap (take 5) reverse) "hello julie"
"eiluj"

λ> (take 5 <$> reverse) "hello julie"
"eiluj"

Using fmap with functions like this is not as common as using the normal function composition operator, but since fmap forms the core of all the other functors we’re going to talk about, it’s important to understand this equivalence when we get to some of the function instances that people do use, such as the Monad and Contravariant instances.

What isn’t a functor

Finally, let’s take a moment to talk about some types that you might think are functors but aren’t. There aren’t very many such examples, but the few that exist tell us something important about functors in Haskell. We’ll look at the Set type since it is perhaps the most famous and often-asked-about non-functor.

The Set type lives in the containers package. The Data.Set module of containers on Hackage. If you want to have a REPL open with that package loaded (as opposed to starting a whole project for this) so you can see for yourself, you can do this:

$ stack ghci --package containers

<... stuff ...>

λ> import qualified Data.Set as Set

We recommend you import it qualified like that to prevent getting annoying name clash problems with Prelude functions. You could always unimport the Prelude instead, but it’s not necessary.

Sets are sort of like a lot of other container types that are definitely functors, and the Set type has the right kindedness.

λ> :kind Set.Set
Set.Set :: * -> *

So… what’s up, exactly?

It’s difficult to see immediately, without digging into the source code (and even then it’s a little weird, because Set is another opaque datatype and doesn’t export its constructors), but most operations over Set require that the values they contain be ordered values. Its values are therefore usually constrained by the Ord typeclass.

The constraint is really about ensuring that all values in a Set are unique, because, unlike a list, sets aren’t supposed to contain duplicates. Ensuring uniqueness requires some kind of constraint on the values so that we have operations that can compare them for equality and so forth; at a minimum, this would require that the type of the elements in the set was an instance of Eq.Not all types can be tested for equality, so not all have Eq instances. Most notably, neither IO nor (->) has an Eq instance, and mathematicians are pretty sure it’s impossible to have a general method of testing functions for equality. The most common Set implementation, which is in the containers package, relies on a binary tree implementation for efficiency and so it needs to be ordered for fast searching. A Set implementation in the unordered-containers package relies instead on the Hashable class for efficient comparisons of values.The unordered-containers on Hackage.

At any rate, a Set needs a constraint on its values. They need to be testable for equality, and so a Set cannot contain any type a. But fmap requires that the a in its type refer to all possible types a, every type in the universe of discourse.

λ> :set -fprint-explicit-foralls
λ> :type fmap
fmap :: forall {f :: * -> *} {a} {b}.
     Functor f =>
     (a -> b) -> f a -> f b

You can read this as “for all functors and for all types a and b, if you have a function a -> b, you can get a function f a -> f b”. It can’t be limited to only some functions a -> b, but any such function over a Set will necessarily be limited to just the types a that the Set can contain.

We can state this principle somewhat more mathematically. Functors are mappings of one category to another category, preserving the objects and morphisms and their relationships. Functors in Haskell are endofunctors: they are all functors from a category called Hask to itself. The category Hask contains types and functions between them.

Functions should be total – that is, they should provide a mapping from every element in the set of possible inputs (domain) to something in the set of possible outputs (codomain) – or else they are partial functions, which we should generally eschew. Functors in general need not be total in this way; but instances of the Functor class in Haskell should be. They must map every element in their domain category to some output in their codomain. They need not be surjective, though; neither functions nor functors need to provide a mapping to every possible value in the codomain.

In this case, an endofunctor in the Hask category has Hask as its domain. Endofunctors must map every value in Hask to some value in the codomain, which is also Hask. A functor over Set would be a functor from a subset – only those objects that are also in Ord – to Hask, so they do not map the whole domain of Hask to itself. And so in that sense, Set isn’t a functor in Haskell because it doesn’t do what endofunctors are supposed to do: there is no way to define a sensible fmap :: (a -> b) -> Set a -> Set b if the types a and b aren’t in Ord.

Functors are general containers that can contain any type in Hask and give you a function that maps an a -> b function to the same function but lifted into a new context. If a container can only contain one type (such as a String or a Text) or only some subset of types, like Set, it’s not properly an instance of Functor. Set is a functor from what we might call Ord Hask to Hask and therefore has a perfectly sensible map function defined for it.

Coming up

In the next lesson, we’ll examine the Functor laws and use the property-testing library hedgehog to test some of the instances we’ve written!

Sign up for access to the full page, plus the complete archive and all the latest content.