Python classes are “a means of bundling data and functionality together” (see “Classes” in The Python Tutorial). Data classes are a recent addition to Python (they first appeared in Python 3.7, released in June 2018) that place a heavier emphasis on the data than on the functionality. This invites a pleasant comparison with Haskell datatypes, which exhibit a more distinct separation between data and functionality.
The following example defines a type representing a point in three dimensions as the conjunction of three integers:
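A minimal sketch of such a class, assuming the three fields are named x, y, and z (only x is named elsewhere in the text; y and z are assumptions):

```python
from dataclasses import dataclass


@dataclass
class Point:
    """A point in three dimensions; the field names are assumed."""
    x: int
    y: int
    z: int
```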
This class looks and behaves quite similarly to the following Haskell datatype defined using record syntax:
Data classes can also have all the complexities of regular Python classes, and the
dataclass decorator is a good deal more feature-packed than we discuss here. We will focus on the primary purpose of data classes: the field list and the special methods that are generated based on the field list.
__init__ is a special method (special methods are sometimes called “dunder” methods, for the double underscores in their names) that initializes the instance; it is called when a value is first constructed. The
dataclass decorator in the example above generates the following method for the Point class:
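As a rough sketch, the generated initializer behaves like the handwritten one below. The class name PointByHand and the field names y and z are hypothetical, chosen for illustration; the code the decorator actually generates may differ in detail.

```python
class PointByHand:
    """Hypothetical handwritten counterpart of the data class."""

    def __init__(self, x: int, y: int, z: int) -> None:
        # Store each constructor argument as an instance attribute.
        self.x = x
        self.y = y
        self.z = z
```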
The effect is roughly akin to what the constructor of a Haskell record does. Here’s how we create a new
Point value, in Python and in Haskell:
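On the Python side, construction might look like the following sketch (the field names y and z are assumptions):

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int
    z: int


# Calling the class runs the generated __init__ with these arguments.
p = Point(3, 4, 5)
print(p.x, p.y, p.z)  # 3 4 5
```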
Types vs constructors
In both languages, one may find it slightly unsettling that the word “Point” describes two things:
- The type;
- A function that returns a value of that type.
How can “Point” be overloaded to mean both of those things? The answer varies between the two languages. The Python and Haskell REPL sessions above look more or less identical, but if you’re interested in the subtle distinctions about what’s going on “under the hood”, as they say, read on.
The Python expression
Point, by itself, refers to the class. We can see this by typing
Point into the REPL:
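Outside a REPL, the same observation can be sketched in a script. This assumes the Point class has the fields x, y, and z:

```python
import inspect
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int
    z: int


# The bare name refers to the class object itself, not to an instance.
print(inspect.isclass(Point))  # True
print(type(Point) is type)     # True
print(Point.__name__)          # Point
```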
The reason we can write the expression
Point(3,4,5) is that Python classes are callable; this means they have the special method
__call__. The Python documentation describes how the expression desugars: if this method is defined,
x(arg1, arg2, ...) is a shorthand for
x.__call__(arg1, arg2, ...).
We can verify in the REPL that this method is in fact defined on Point:
And we can also evaluate the desugared form and see that it does produce the same result.
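Both checks can be sketched as follows, again assuming a Point class with the fields x, y, and z:

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int
    z: int


# Classes are callable, so the special method can be looked up on Point.
print(hasattr(Point, "__call__"))  # True
print(callable(Point))             # True

# The desugared call builds a value equal to the one from the ordinary call.
desugared = Point.__call__(3, 4, 5)
print(desugared == Point(3, 4, 5))  # True
```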
In Haskell, the distinction between
Point the type and
Point the function is made by context. We will use the following function definition to illustrate. This function accepts one
Point argument, and returns a new
Point where the
x coordinate is reduced by one. Interpreted according to the convention that the axis labeled X in three-dimensional space is oriented with lesser numbers to the left and greater numbers to the right, this function’s output is one unit to the left of its input, and so we have chosen to name it
“Point” appears multiple times here: in a type context, in a pattern context, and in an expression context.
- What comes after two colons (
::) is a type.
- What comes before the equal symbol (
=) is a pattern. As a pattern, a constructor splits up a value into its constituent parts.
- What comes after the equal symbol (
=) is an expression. As an expression, a constructor builds a value from its constituent parts.
From the contexts in which they appear, then, we can tell:
- “Point” refers to the
- “Point” is a function that constructs a value of the
The other special methods that data classes generate automatically correspond to Haskell typeclass deriving.
- __repr__ – equivalent to deriving Show
- __eq__ – equivalent to deriving Eq
- __lt__, __le__, __gt__, and __ge__, generated if the dataclass function is called with order=True – equivalent to deriving Ord
- __hash__, generated if the dataclass function is called with frozen=True – equivalent to deriving Hashable, which we discuss below.
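The generated methods can be exercised with a short sketch. It assumes a Point class with the fields x, y, and z, and passes both order=True and frozen=True so that every method discussed here is present:

```python
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class Point:
    x: int
    y: int
    z: int


a = Point(1, 2, 3)
b = Point(1, 2, 4)

# __eq__ compares the field values.
print(a == Point(1, 2, 3))  # True

# The ordering methods compare the fields in declaration order, like tuples.
print(a < b)  # True

# __hash__ is derived from the fields, so equal values hash equally.
print(hash(a) == hash(Point(1, 2, 3)))  # True

# __repr__ names the class and each field.
print("x=1" in repr(a))  # True
```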
Replacing field values
dataclasses.replace lets you construct a new instance based on an existing one, with one or more field values modified.
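A brief sketch of how this looks in practice, assuming a Point class with the fields x, y, and z:

```python
from dataclasses import dataclass, replace


@dataclass
class Point:
    x: int
    y: int
    z: int


p = Point(3, 4, 5)

# Build a new Point that differs from p only in its x field.
q = replace(p, x=2)

print(q == Point(2, 4, 5))  # True
print(p == Point(3, 4, 5))  # True, the original is left unchanged
```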
Somewhat uncharacteristically, this is not accomplished with a function in Haskell, but instead is a feature of the built-in record syntax.
Hashing is important in Python because it is the basis of the set and
dict structures. For some variety we’ll switch up the example; this time we’ll define a data class called Color.
Suppose we want to make a
set of the primary colors.
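The attempt can be sketched as follows. The fields red, green, and blue, and the choice of red, green, and blue as the primary colors, are assumptions made for illustration. A data class declared without frozen=True compares by value but is not hashable, so building the set fails:

```python
from dataclasses import dataclass


@dataclass
class Color:
    # Assumed fields: red, green, and blue intensities.
    red: int
    green: int
    blue: int


try:
    primary_colors = {Color(255, 0, 0), Color(0, 255, 0), Color(0, 0, 255)}
except TypeError as error:
    # Set elements must be hashable, and this Color is not.
    primary_colors = None
    print("cannot build the set:", error)
```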
Colors can’t go into a set unless they’re hashable. We can get that by requesting a frozen data class, which prompts Python to generate an implementation of the __hash__ method.
Now we have a hash function for Color.
And thus now we can put
Colors into a set.
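Putting the pieces together, a frozen version might look like this sketch. As before, the field names and the choice of primary colors are assumptions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Color:
    # Assumed fields: red, green, and blue intensities.
    red: int
    green: int
    blue: int


red = Color(255, 0, 0)
green = Color(0, 255, 0)
blue = Color(0, 0, 255)

# Equal values produce equal hashes.
print(hash(red) == hash(Color(255, 0, 0)))  # True

# Frozen instances are hashable, so they can be set elements.
primary_colors = {red, green, blue}
print(len(primary_colors))                  # 3
print(Color(255, 0, 0) in primary_colors)   # True
```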
Haskell doesn’t have hashing as a built-in concept. Instead of hash-based collections, we more often use the
Ord-based collections found in the containers package (see the containers package on Hackage).
But we do have hash-based collections in Haskell as well (
HashSet is defined in the
unordered-containers package). If we want to store a collection of colors in a
HashSet, then the
Color type must have an instance of the
Hashable typeclass. We will need to turn to the
hashable library (see the hashable package on Hackage) and a few language extensions. For more about what’s going on here, see our page on deriving strategies.
Python’s data classes can have default values for their fields, just as Python functions can have defaults for their parameters. Haskell does not have a concept of parameter defaults, so this is something we have to reckon with.
Let’s look at the first example from PEP 557:
The default value of
quantity_on_hand in the Python data class allows us to construct an
InventoryItem with only two parameters instead of three:
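The field list of that example, and a two-argument construction, can be sketched as follows. The argument values "widget" and 3.0 are illustrative and do not come from the PEP:

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0


# Only name and unit_price are supplied; quantity_on_hand uses its default.
item = InventoryItem("widget", 3.0)
print(item.quantity_on_hand)  # 0
```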
Our corresponding Haskell type declaration has no ability to specify default values for its fields. (Note that this example uses floating-point numbers to represent money, which is generally not a good idea because of the possibility of rounding error. Consider using the safe-money package if you deal with money.)
As an analogue in Haskell, we might write a function with two parameters.
Or more tersely using record wildcards:
Now we can use the
inventoryItem function like we used the Python
There is one more piece of the example from PEP 557 that we haven’t mentioned yet: the
total_cost method on the InventoryItem class.
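For reference, here is the class with that method included, following the PEP 557 example. The argument values in the usage lines are illustrative:

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        # Multiply the price of one unit by the number of units on hand.
        return self.unit_price * self.quantity_on_hand


item = InventoryItem("widget", 2.5, 4)
print(item.total_cost())  # 10.0
```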
We do not typically attach functions to data in this way in Haskell. The
total_cost function becomes a top-level function in the Haskell version. We use the
fromIntegral function to convert from an integer type to
Double (or, more generally, from any integer-like type to any other type of number).
Alternatively, using the