Lazy Lists
1. Infinite Sequences
Implementing call-by-name
Macro-expansion may be implemented with a double textual substitution (as in the C++ pre-processor) or with a single substitution and dynamic scoping. The result is the evaluation of the entire function body in the caller's environment. But how do we implement call-by-name? That is, how do we evaluate the arguments in the caller's environment but the rest of the body in the callee's environment?
Instead of simply passing a textual representation of the argument, we pass in a parameterless anonymous function that returns the argument. Such an anonymous function is called a thunk.
To understand the difference between passing an argument that is evaluated before the call and passing a thunk, compare 7 with function () { return 7; }. The former, when passed as an argument, is already evaluated, so the function can use that value without doing anything else to it. The latter, when passed as an argument, requires the function to execute the parameterless function in order to "unwrap" the value that it should use in its computation.
Instead of evaluating the argument before calling the function and using that value in the function, every time a parameter is referenced in the function body, the thunk is evaluated to obtain the argument’s value. The evaluation process is often referred to as thawing the thunk.
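The repeated thawing described above can be made visible with a small sketch. The names evaluations, argThunk, and doubleByName are hypothetical and are introduced only for this illustration. The counter records how many times the thunk is thawed.

```javascript
// Hypothetical counter: how many times has the thunk been thawed?
var evaluations = 0;

// A thunk wrapping the argument expression 3 + 4
var argThunk = function () {
    evaluations = evaluations + 1;
    return 3 + 4;
};

// A function written in call-by-name style: every reference to the
// parameter x thaws the thunk again
var doubleByName = function (x) {
    return x() + x();
};

var result = doubleByName(argThunk);
console.log(result);       // 14
console.log(evaluations);  // 2, one thaw per reference to x
```

Because the parameter is referenced twice in the body, the argument expression is evaluated twice.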
Call-by-name lists
To illustrate the use of thunks, we will implement call-by-name lists, which are similar to the way lists are used by default in Haskell or as a programmer-chosen option in Python and Scala. Call-by-name lists essentially give you lazy lists, and we will see that they can also be thought of as "infinite sequences". This perspective offers a very different approach to the way in which one works with such lists.
Below is documentation for some of the functions that we will provide in a JavaScript module for infinite sequences called the is module.
The constructor for sequences (i.e., the cons function) takes two arguments, namely the element we want at the head of the sequence and a thunk that will return the tail of the sequence if we ever need to go beyond the first element. For simplicity, we will only manipulate infinite sequences of integers.
// Construct a new sequence comprised of the given integer and thunk
var cons = function (n,thunk) { ... };
// Get the first integer in the sequence
var hd = function (seq) { ... };
// Get the infinite sequence following the first element. This
// will itself be in the form of an integer followed by a thunk
var tl = function (seq) { ... };
// Return the (finite, non-lazy) list containing the first n
// integers in the given sequence
var take = function (seq,n) { ... };
The following slide-show illustrates how we could use these operations to construct and then expose various parts of an infinite sequence of 1's.
Let's now turn our attention to how these four basic functions in the is module -- cons, hd, tl, and take -- are implemented. The underlying representation of a lazy list is a two-element array seq. seq[0] stores the head of the list, which is already evaluated, and seq[1] stores the thunk that must be evaluated to expose the remainder of the list.
// Construct a new sequence comprised of the given integer and thunk
var cons = function (x, thunk) {
return [x, thunk];
};
// Get the first integer in the sequence
var hd = function (seq) {
return seq[0];
};
// Get the infinite sequence following the first element. This
// will itself be in the form of an integer followed by a thunk
var tl = function (seq) {
return thaw(seq[1]);
};
// thaw is a helper function for tl. It returns the result
// of evaluating the function given as argument
var thaw = function (thunk) { return thunk(); };
// Return the (finite, non-lazy) list containing the first n
// integers in the given sequence
var take = function (seq, n) {
if (n === 0)
return [];
else {
// Get a copy of the result of recursive call with n - 1
var result = take(tl(seq), n - 1).slice(0); // slice(0) gives a copy of the array
// And use JavaScript's unshift to put the hd at the beginning of result
result.unshift(hd(seq));
return result;
}
};
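As a usage sketch, the infinite sequence of 1's can be built with these four functions. The definitions are restated in compact form so that the block runs on its own. The loop-based take is a simplification of the recursive version above, and the variable name ones is chosen for illustration.

```javascript
// Compact restatements of the four basic functions
var cons = function (x, thunk) { return [x, thunk]; };
var hd = function (seq) { return seq[0]; };
var tl = function (seq) { return seq[1](); };
var take = function (seq, n) {
    var result = [];
    while (n > 0) {
        result.push(hd(seq));
        seq = tl(seq);
        n = n - 1;
    }
    return result;
};

// The tail of ones is a thunk that returns ones itself. The thunk
// body is not run until tl is called, so the self-reference is safe.
var ones = cons(1, function () { return ones; });

console.log(hd(ones));          // 1
console.log(hd(tl(tl(ones))));  // 1
console.log(take(ones, 5));     // [ 1, 1, 1, 1, 1 ]
```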
So far the only sequence that we have been able to create has been a boring sequence consisting of all ones. To make it easier to construct more interesting sequences, in addition to cons, hd, tl, and take, the is module has some utility functions that are "infinite analogues" to their counterparts in finite lists (our fp module). All of these utility functions (i.e., from, map, filter, iterates, and drop) are discussed and illustrated below.
- The from operation
- The map operation
- The filter operation
- The drop operation
- The iterates operation
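One possible way to write these five utilities is sketched below. The argument orders and the function bodies are assumptions made for illustration. The is module's actual definitions are not shown in this section and may differ.

```javascript
// Basic operations, restated so the sketch is self-contained
var cons = function (x, thunk) { return [x, thunk]; };
var hd = function (seq) { return seq[0]; };
var tl = function (seq) { return seq[1](); };
var take = function (seq, n) {
    var result = [];
    while (n > 0) { result.push(hd(seq)); seq = tl(seq); n = n - 1; }
    return result;
};

// from(n): the sequence n, n+1, n+2, ... (assumed signature)
var from = function (n) {
    return cons(n, function () { return from(n + 1); });
};

// map(f, seq): apply f to every element (assumed argument order)
var map = function (f, seq) {
    return cons(f(hd(seq)), function () { return map(f, tl(seq)); });
};

// filter(pred, seq): keep only the elements satisfying pred.
// Caution: this never returns if no remaining element satisfies pred.
var filter = function (pred, seq) {
    while (!pred(hd(seq))) { seq = tl(seq); }
    return cons(hd(seq), function () { return filter(pred, tl(seq)); });
};

// drop(seq, n): the sequence without its first n elements
var drop = function (seq, n) {
    while (n > 0) { seq = tl(seq); n = n - 1; }
    return seq;
};

// iterates(f, x): the sequence x, f(x), f(f(x)), ...
var iterates = function (f, x) {
    return cons(x, function () { return iterates(f, f(x)); });
};

var isEven = function (x) { return x % 2 === 0; };
console.log(take(from(5), 3));                                    // [ 5, 6, 7 ]
console.log(take(map(function (x) { return x * x; }, from(1)), 4)); // [ 1, 4, 9, 16 ]
console.log(take(filter(isEven, from(1)), 3));                    // [ 2, 4, 6 ]
console.log(take(drop(from(1), 10), 2));                          // [ 11, 12 ]
console.log(take(iterates(function (x) { return 2 * x; }, 1), 5)); // [ 1, 2, 4, 8, 16 ]
```

In each constructor the recursive call sits inside the thunk, so only one element is computed at a time.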
The Sieve of Eratosthenes -- an example that takes advantage of lazy lists
The need to compute prime numbers arises in a variety of applications, for example, public-key encryption. A long-known technique for computing all the prime numbers up to a limit n with reasonable efficiency is the Sieve of Eratosthenes. The slide show below describes the sieve algorithm in a language with eager (as opposed to lazy) evaluation.
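For reference, an eager sieve can be sketched as follows. This is one common array-based formulation and is not necessarily the version presented in the slide show. The names primesUpTo and isComposite are hypothetical.

```javascript
// Return all primes less than or equal to n, computed eagerly
var primesUpTo = function (n) {
    var isComposite = [];  // isComposite[k] is true once k is crossed out
    var primes = [];
    for (var i = 2; i <= n; i++) {
        if (!isComposite[i]) {
            primes.push(i);
            // Cross out the multiples of i, starting at i * i because
            // smaller multiples were crossed out by smaller primes
            for (var j = i * i; j <= n; j += i) {
                isComposite[j] = true;
            }
        }
    }
    return primes;
};

console.log(primesUpTo(30));  // [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ]
```

Note that the limit n must be known before any work starts.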
There is a problem with this algorithm, however, from the perspective of its utility. Think about how well it can respond to the requests regarding primes that we might want to make of it. While it can handle a request like "Find all primes less than or equal to n", it comes up short on requests like "Find the first 1000 prime numbers" or "Find the first prime number larger than 1 billion". The reason is that the eager algorithm is limited by the finite value n that it is given. With lazy evaluation of lists, on the other hand, we need not be bound by a finite n. Instead we can construct the infinite sequence of primes, relying on repeated applications of a thunk to take us to any point in the sequence that we need to reach. The following slide show indicates how the Sieve of Eratosthenes would be implemented using lazy lists.
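A lazy sieve along these lines might look like the sketch below. The helper definitions (from, filter, and the loop-based take) and the names sieve, primes, and firstAbove are assumptions made for illustration, and the slide show's version may differ.

```javascript
// Basic operations and two utilities, restated so the sketch runs alone
var cons = function (x, thunk) { return [x, thunk]; };
var hd = function (seq) { return seq[0]; };
var tl = function (seq) { return seq[1](); };
var take = function (seq, n) {
    var result = [];
    while (n > 0) { result.push(hd(seq)); seq = tl(seq); n = n - 1; }
    return result;
};
var from = function (n) {
    return cons(n, function () { return from(n + 1); });
};
var filter = function (pred, seq) {
    while (!pred(hd(seq))) { seq = tl(seq); }
    return cons(hd(seq), function () { return filter(pred, tl(seq)); });
};

// sieve(seq): the head p is prime; the tail is the sieve of the
// remaining elements with all multiples of p filtered out
var sieve = function (seq) {
    var p = hd(seq);
    return cons(p, function () {
        return sieve(filter(function (x) { return x % p !== 0; }, tl(seq)));
    });
};

var primes = sieve(from(2));

// Hypothetical helper: first element of seq that is larger than limit
var firstAbove = function (seq, limit) {
    while (hd(seq) <= limit) { seq = tl(seq); }
    return hd(seq);
};

console.log(take(primes, 10));         // [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ]
console.log(firstAbove(primes, 100));  // 101
```

No upper bound is supplied anywhere: each request thaws only as many thunks as it needs. This simple formulation stacks one filter per prime found, so it is meant to show the idea rather than to be fast.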
Call-by-need
What's the difference between our call-by-name implementation of infinite sequences and the way it is done in Haskell? In Haskell, the analogues of the is.tl and is.take functions use call-by-need instead of call-by-name. In call-by-need, the value returned by a thunk is stored (that is, cached) after it is thawed for the first time. This is much more efficient because no thunk is ever thawed more than once.
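The caching behavior can be imitated in JavaScript by wrapping each thunk before storing it. The sketch below is an illustration only: memoize, consNeed, fromName, fromNeed, and the two counters are hypothetical names and are not part of the is module.

```javascript
var hd = function (seq) { return seq[0]; };
var tl = function (seq) { return seq[1](); };

// Call-by-name constructor: the thunk is stored as given
var cons = function (x, thunk) { return [x, thunk]; };

// Wrap a thunk so that it runs at most once and caches its value
var memoize = function (thunk) {
    var cached = false;
    var value;
    return function () {
        if (!cached) {
            value = thunk();
            cached = true;
        }
        return value;
    };
};

// Call-by-need constructor: the thunk is memoized before it is stored
var consNeed = function (x, thunk) { return [x, memoize(thunk)]; };

// Counters recording how often each kind of thunk body actually runs
var nameThaws = 0;
var needThaws = 0;

var fromName = function (n) {
    return cons(n, function () { nameThaws++; return fromName(n + 1); });
};
var fromNeed = function (n) {
    return consNeed(n, function () { needThaws++; return fromNeed(n + 1); });
};

var byName = fromName(1);
var byNeed = fromNeed(1);

// Ask for the tail of each sequence three times
tl(byName); tl(byName); tl(byName);
tl(byNeed); tl(byNeed); tl(byNeed);

console.log(nameThaws);       // 3, the thunk body ran on every call
console.log(needThaws);       // 1, the cached tail was reused
console.log(hd(tl(byNeed)));  // 2
```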
Now it's your chance to get some practice with infinite sequences in the following problems.
This problem will help you better understand code that creates call-by-name infinite sequences.
2. Practice With Infinite Sequences
This problem will help you write recursive code to process infinite sequences. To earn credit for it, you must complete this randomized problem correctly three times in a row.
3. Practice With Infinite Sequences (2)
This problem reviews recursive definitions of sequences. To earn credit for it, you must complete this randomized problem correctly three times in a row.
4. Practice With Infinite Sequences (3)
This problem deals with one more example of a recursive definition of a sequence.