An iterator is a sort of sequence that, unlike a list, computes each subsequent value only as it is needed rather than producing the entire list all at once. (Python’s itertools module provides many handy utilities for working with iterators; the next lessons on itertools are dedicated to those functions and their Haskell equivalents.)
For example, from a range, we can produce an iterator of numbers. Here we construct an iterator xs that will produce three values. (We are using Python 3, in which the range function produces a lazy range object rather than a list, and iter turns that object into an iterator. In Python 2, range produced a list, and xrange was its lazy counterpart.)
We can use the next function to demonstrate pulling values from the iterator one at a time. The first three times we apply next, xs will produce a value. The fourth time, it raises StopIteration to indicate that the iterator is exhausted.
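The behavior described above can be sketched as follows. This is a minimal illustration that assumes xs is built with iter(range(1, 4)); the exact construction and the values 1 to 3 are assumptions chosen to match the "three values" mentioned in the text.

```python
# Build an iterator over the numbers 1, 2, 3 (assumed construction).
xs = iter(range(1, 4))

first = next(xs)   # pulls the first value
second = next(xs)  # pulls the second value
third = next(xs)   # pulls the third value

# A fourth call finds nothing left and raises StopIteration.
try:
    next(xs)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)  # 1 2 3 True
```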
We don’t often use the next function directly, though. Most often, we use an iterator via a for loop (and Python uses the iter and next functions under the hood).
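To make the "under the hood" remark concrete, here is a rough sketch of what a for loop does with an iterator. The helper name for_each is hypothetical and exists only for this illustration; it approximates the loop protocol and is not how the interpreter is actually implemented.

```python
def for_each(iterable, action):
    """Roughly what `for x in iterable: action(x)` does behind the scenes."""
    it = iter(iterable)        # ask the iterable for an iterator
    while True:
        try:
            x = next(it)       # pull the next value
        except StopIteration:  # the iterator is exhausted: leave the loop
            break
        action(x)

seen = []
for_each(range(1, 4), seen.append)

# The ordinary for loop produces the same result.
looped = []
for x in range(1, 4):
    looped.append(x)

print(seen, looped)  # [1, 2, 3] [1, 2, 3]
```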
For these examples, the closest practical Haskell analogue is lists. Unlike Python lists, Haskell lists compute each subsequent value only as it is needed rather than producing the entire list all at once – much like Python iterators.
We could also write this example using Haskell’s range notation. A list from 1 to 3 can be expressed as either enumFromTo 1 3 or [1..3].
When parts of a value aren’t computed until they’re needed, we call that laziness – not as an insult, but to say that we’re trying to avoid putting the machine through more work than necessary.
The easiest sure way to tell whether a data structure is lazy is to construct something infinite. For Python iterators and Haskell lists, we can interact with some portion of the beginning of the sequence, as long as we don’t try to use the entire thing.
xs is an iterator that counts upward from 1, on to infinity. If we were to try converting this to a list, the REPL would crash from attempting to contemplate infinity.
But we can take, for example, just the first ten elements from that iterator, using itertools.islice.
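A short sketch of the infinite iterator and the slice taken from it. It assumes the counter is built with itertools.count(1), which matches the description "counts upward from 1" but may not be the exact code the lesson used.

```python
import itertools

# An iterator that counts upward from 1 and never stops.
xs = itertools.count(1)

# islice pulls only as many values as we ask for, so this terminates.
first_ten = list(itertools.islice(xs, 10))
print(first_ten)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# The iterator remembers its position, so the next value pulled is 11.
following = next(xs)
print(following)  # 11
```

Calling list(xs) directly would never finish, because there is always one more value to compute.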
Here’s a Haskell list that counts upward from 1:
Again we could have written this with range notation. A list from 1 to infinity can be expressed as either enumFrom 1 or [1..].
Again, if we try to evaluate
xs, we have a problem.
If you try this, you can see GHCi make an admirable attempt to print the entire list. But it will never finish. (Use ctrl+C to stop it.)
But again, we can take some portion of the beginning of the list, using the take function.
Python provides a convenient way to define an iterator as a function written using the yield keyword. This feature is called generator syntax.
When a Python function
returns, it can only return once and then it’s over. But when a function
yields, it can keep going and continue to yield more things until it returns.
When we apply the function f, it returns an iterator. (More specifically, it returns a generator; a generator is a kind of iterator.)
If we convert that to a list, we can see that the list contains the three values that were yielded by f.
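A minimal sketch of such a generator function. The name f comes from the text; the three yielded values (1, 2, 3) are placeholders chosen for illustration, since the lesson's original values are not shown here.

```python
def f():
    # Each yield hands one value to the caller and then pauses here
    # until the caller asks for the next one.
    yield 1
    yield 2
    yield 3
    # Falling off the end (an implicit return) finishes the generator.

gen = f()
kind = type(gen).__name__
print(kind)  # generator

# Converting to a list runs the generator to completion.
values = list(gen)
print(values)  # [1, 2, 3]
```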
Generator syntax does not have any direct analogue in the Haskell language, but it is strikingly similar to the pipes library, which we will discuss further below.
The fun thing about generators is that they don’t only yield values; they can also have side effects as they run, which makes generator syntax much more than a way to express sequences. It’s a way to write programs that emit values; the iterator represents the stream of values that are emitted as the program runs.
As a simple example, we’ll write a generator that reads three lines from the terminal and yields each one in all caps.
When we apply
f and convert the result to a
list to collect all the results, it looks like this:
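One way this generator might look is sketched below. The read_line parameter is an assumption added for this illustration: it defaults to the built-in input, so f() reads from the terminal as described, but it can be swapped for canned lines so the example runs without a terminal.

```python
def f(read_line=input):
    """Read three lines and yield each one in upper case."""
    for _ in range(3):
        line = read_line()  # side effect: waits until a line is available
        yield line.upper()

# Stand-in for the terminal: hand out canned lines instead of calling input().
canned = iter(["hello", "python", "haskell"])
result = list(f(read_line=lambda: next(canned)))
print(result)  # ['HELLO', 'PYTHON', 'HASKELL']
```

Because the body only runs as values are requested, each line is read just before its upper-cased version is yielded, rather than all three being read up front.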
This is what the for function does in Haskell. (Yes, for is a function, and not a language keyword. This is a recurring theme you’ll notice as you learn Haskell; a lot of things that are built-in language features or macros in other languages are ordinary functions in Haskell.) The underscore signifies the same thing in Haskell as it does in Python: that the argument is not being used.
The for function comes from the Data.Traversable module. It has many uses, and its fully polymorphic type is much more general than the specialized type we show here. In this context, the type of for breaks down as follows:
- The first argument is the list of integers we’re iterating over.
- The second argument is an action, which returns a string, that will be performed for each integer in the list: read a line and capitalize each of the characters.
- The return value is an action which loops over the input and returns all of the results aggregated in a list.
When we run
main in GHCi, we get a behavior quite similar to that of the Python generator.
Now we’re going to venture outside of the Haskell standard library, because when you use the pipes package, writing Producers is strikingly similar to writing Python generators. There is even a function with the same name as the Python keyword yield, and it serves a similar role. (Web servers, lesson 13, is about the pipes package and goes into more depth with a realistic example.)
A pipe can await to receive a value, and it can yield to emit a value. We give a stage of a pipeline a different name depending on whether it yields, awaits, or both: a Producer yields, a Consumer awaits, and a Pipe does both. For our purpose here, to compare pipes to simple Python generators, we will only consider Producers.
Here is a simple Producer:
Among the many useful functions in the Pipes.Prelude module is Pipes.Prelude.toList, which runs a producer and returns a list of its results.
The toList function is a bit like when we had a Python generator function f and we wrote list(f()) to run it and collect its results.
Now let’s look at a producer that also does some I/O. We’ll rewrite
threeUppercase using pipes:
It does the same thing as before. There are some new things in this producer that we haven’t seen before:
- We have to lift getLine into the world of pipes. It may not be clear why we need lift, because we haven’t shown any of the types in this very brief pipes overview. Informally, think of lifting here as converting the type from “I/O” to “a pipe that involves I/O”.
- This time we’re using toListM instead of toList because our producer involves I/O. You’ll see this a lot in Haskell: something with side effects has a different type than something that doesn’t. The ‘M’ in ‘toListM’ stands for ‘monad’. Whereas before toList gave us a plain list, here the result has type IO [String]; the resulting list is within some monad (IO). It isn’t a list of strings; it’s an I/O action that produces a list of strings. This is an important distinction!
In the next lessons, we give a lot more examples on how Python iterator code translates into Haskell.