Published on ONLamp.com (http://www.onlamp.com/)

08/02/2007

In Part 1 of this series, I described how programming languages are grouped into two families, mechanical and mathematical. Programmers also fall into two general categories, those who think mechanically about what a computer does as it executes a program and those who think mathematically about what a program means. Some programmers are gifted and hop between these two worlds as needed.

In Part 2, I described how Haskell is a purely functional programming language and how it uses pure functions to express programs. Unfortunately, pure functions are limiting, because they prohibit side effects and basic I/O.

### The Language Divide Revisited

Haskell comes from a long and rich history of research in mathematics and functional programming. Naturally, many mathematically inclined programmers are quite comfortable with Haskell, while more mechanically minded programmers are generally befuddled by the language and the mathematical concepts it draws upon.

No concept is more befuddling than the monad.

Monads come from a branch of mathematics known as category theory (see Wikipedia and Haskell.org for more). Unfortunately, category theory is an obscure field that most people never come across, even when studying the mathematical foundations of computer science.

The news isn't all bad. For example, SQL is built on a mathematical foundation of relational algebra. A programmer may think of a SQL query in terms of what the relational operators mean or in terms of what the database engine does. Whether mathematically or mechanically inclined, a programmer can still formulate SQL queries to examine the data in a database.

Similarly, mechanically inclined programmers do not need to learn category theory or understand what monads mean in order to understand how they work. But before I can explain what monads do and how they work, I need to explain one of Haskell's most innovative features, its type system.

### The Type System

Because Haskell is so deeply tied to mathematics, its designers and practitioners are concerned with understanding what a program means. Haskell compilers attempt to understand what your program means through type analysis and type inferencing.

Languages like C, Java, and C# offer a simple form of static typing. Static typing in these languages involves adding annotations to every variable, function, and method declaration to tell the compiler how to allocate memory and generate code. Unfortunately, this kind of typing is quite verbose and very error prone. In these languages, the type system can also be subverted by type casting and by ignoring type errors from the compiler. As a result, it's not difficult to write code with type mismatch errors that cause a program to crash.

Languages like Perl, Python, and Ruby offer some form of dynamic typing. In these languages, little to no type checking is done when a program is compiled. Instead, type issues are deferred until runtime, and a program is assumed to be meaningful and well typed. This means type annotations are unnecessary, and programs in these languages are generally more concise than their counterparts in C and C-like languages. Type errors, if they do occur, generate exceptions at runtime, and may cause a program to crash; or they may be ignored, leading to odd and unexpected runtime behavior.

Haskell takes a very different approach. It starts by performing type analysis with a type checker to understand your program at compile time. The type checker strictly prohibits type casting and does not allow type errors to be ignored. Because types are checked at compile time, and there is no escape from the type checker, Haskell is frequently described as both statically typed and strongly typed.

Any Haskell program with a type error simply does not compile. Haskell programs that do compile are free from type errors, and this guarantee eliminates a whole class of crashes and unexpected runtime behaviors. (Type errors are merely one kind of problem that creep into a program. There are still many ways for a Haskell program to compile and misbehave at runtime.)

Additionally, Haskell uses type inferencing to deduce the types of functions and values, eliminating the need for verbose type annotations in most cases. These features combine to provide more safety than C-like static typing and code at least as concise as that found in dynamic languages.

For example, here is the definition of `length`, a function that takes any list and returns the number of values contained within it:

```haskell
length [] = 0
length (x:xs) = 1 + length xs
```

Here is a type annotation the compiler could infer for this function:

```haskell
length :: [a] -> Int
```

This annotation can be read as follows:

• The `::` separates the symbol being defined on the left (`length`) from its type signature on the right (`[a] -> Int`).
• `[a] -> Int` describes a function that accepts a value of type `[a]` and returns a value of type `Int`.
• `a` is a type variable that can match any type.
• `[a]` is a list of any kind of value.
• `Int` is a native machine integer.

The compiler can infer this type because:

• The value of the base case is 0, an `Int`.
• The value of the recursive case is 1 + an `Int`, which must also be an `Int`.
• `length` doesn't examine the contents of the list, so the type of the elements in the list doesn't matter.

Although type annotations aren't strictly required, it's a good idea to include them directly above a function definition. This helps the compiler check that your idea of a function's type matches the type it actually infers. More importantly, the type annotation also acts as documentation for others who are reading your code. A better way to define `length` would be:

```haskell
length :: [a] -> Int
length [] = 0
length (x:xs) = 1 + length xs
```

Here is a definition of the higher order function `map`, which applies a function to every element in a list to create a new list:

```haskell
map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = (f x) : (map f xs)
```

The type annotation here means the following:

• `map` is a function that takes two values (`(a -> b)` and `[a]`), and returns a third value (`[b]`).
• The first value is a function that takes a value of an arbitrary type `a`, and returns a value of an arbitrary type `b`.
• The second value is a list of type `a`.
• The result value is a list of type `b`.
• The types `a` and `b` can be any two types, but the type `a` in the first argument must match the type `a` in the second argument, just as the type `b` in the first argument must match the type `b` in the result.
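As a quick illustration (the sample lists and names here are my own), applying `map` specializes the type variables `a` and `b` to concrete types:

```haskell
-- map with (* 2) specializes (a -> b) to (Int -> Int),
-- so both the input and output lists are lists of Int
doubled :: [Int]
doubled = map (* 2) [1, 2, 3]           -- evaluates to [2, 4, 6]

-- map with length specializes (a -> b) to ([Char] -> Int),
-- so a list of strings becomes a list of Ints
sizes :: [Int]
sizes = map length ["foo", "quux"]      -- evaluates to [3, 4]
```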

Here is an example of a function using `map`:

```haskell
f t = map length (lines t)
```

The compiler can infer the type of this function as:

```haskell
f :: String -> [Int]
```

This is because:

• `lines` accepts a string argument and returns a list of strings (i.e., `String -> [String]`).
• `length` accepts a list of any type and returns an `Int` (i.e., `[a] -> Int`).
• In this instance, `map` uses the function `length` to convert a list of strings (a list of lists of characters) into a list of numbers.

Therefore, the type of `f` must be `String -> [Int]` (sometimes phrased as `[Char] -> [Int]`).
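Putting the pieces together, here is the same function with its inferred signature written out, applied to a short sample string (the sample input is my own):

```haskell
-- map the length function over each line of the input text
f :: String -> [Int]
f t = map length (lines t)

-- f "hello\nmonad world" splits the input into two lines
-- and evaluates to [5, 11]
```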

### Type Classes

So far, type inferencing seems useful for deducing the type of a function when either specific types or generic types are used. However, this leaves a huge gap in the middle—functions where any one of a group of related types may be used.

This is where type classes come in. A type class unifies a set of interchangeable, related types. Here is a typical example, the `Num` type class that defines a series of operations that can be used with any kind of number:

```haskell
class Num a where
    (+) :: a -> a -> a
    (-) :: a -> a -> a
    (*) :: a -> a -> a
    ...
```

This class declaration means that any type `a` that is a member of type class `Num` must implement the three operations `+`, `-`, and `*` (and a few others not listed here). Furthermore, the implementation of `+` for any type that implements this interface must take two values and return a value. All three values must also be of the same numeric type.

Type class declarations define the interface a type must implement to be a member of that type class. An `instance` declaration ties together this interface with a specific implementation for a single type. Here are portions of the instance declarations that specify that `Int` and `Double` are both members of type class `Num`:

```haskell
-- low level multiplication of 2 integers
mulInt :: Int -> Int -> Int
mulInt a b = ...

instance Num Int where
    (*) = mulInt
    ...

-- low level multiplication of 2 doubles
mulDouble :: Double -> Double -> Double
mulDouble a b = ...

instance Num Double where
    (*) = mulDouble
    ...
```

The type checker knows how to infer when a type class is needed to describe a function's type. Here is the simple function from the previous article that squares numbers:

```haskell
square x = x * x
```

The type inferencer deduces the following type for this function:

```haskell
square :: (Num a) => a -> a
```

This type signature is a little different, because it includes an added constraint, `(Num a)`, to the left of the `=>` symbol. It means that `square` takes any value of type `a` and returns a value of the same type `a`, so long as that type is also a member of the type class `Num`.

For example, the value of `square 3` is valid, because `3` is of type `Int` and `Int` is a member of type class `Num`. Evaluating this expression causes the Haskell compiler to look under the covers to see how multiplication is defined for `Int` values and replace the invocation of `*` with the actual implementation of multiplication for `Int`, or the function `mulInt`. Evaluating the expression `square 2.5` also works, because the value `2.5` is of type `Double`, and `Double` is a member of type class `Num`. In this case, the compiler invokes the version of `*` for the `Double` type, or the function named `mulDouble` in this example.

The expression `square "this"` is invalid because `square` is defined only over numbers, while the value `"this"` is a string, and strings are not numbers. Attempting to evaluate this expression in the interpreter produces the following error:

```
$ ghci
...
Prelude> let square x = x * x
Prelude> :t square
square :: (Num a) => a -> a
Prelude> square "this"

<interactive>:1:0:
    No instance for (Num [Char])
      arising from use of `square' at <interactive>:1:0-12
    Possible fix: add an instance declaration for (Num [Char])
    In the expression: square "this"
    In the definition of `it': it = square "this"
```

Here is proof that this one definition of `square` works on different kinds of numbers:

```
Prelude> square 3
9
Prelude> square 2.5
6.25
```

Now that you've seen how type classes group together a series of related types, it's time to return to the topic of monads. Type classes are important, because, at the most fundamental level of implementation, monads are merely containers that are instances of the type class `Monad`:

```haskell
class Monad m where
    (>>=)  :: m a -> (a -> m b) -> m b
    (>>)   :: m a -> m b -> m b
    return :: a -> m a
    fail   :: String -> m a
```

Here is what this type class declaration means:

• A type `m` is a monad if it implements four operations: bind (the `>>=` operator), then (the `>>` operator), `return`, and `fail`.
• Monads are a higher order type. That is, a monad m is something that acts as a container over some other type (expressed as `m a` or `m b` in this definition).
• The bind operation takes a container (`m a`), and a function (`a -> m b`) and returns a new container (`m b`). This function must accept the value from the first container and produce the second container.
• The then operation works just like bind, except that the value in the first container is ignored; its second argument is simply another container rather than a function.
• The return operation takes a simple value and turns it into a container.
• The fail operation takes a string and returns a container. (This operation may be used to "fail fast," so once a failure is encountered, all subsequent operations are skipped.)
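To make these operations concrete, here is a sketch of a minimal Maybe-like container made into a monad. The `Perhaps` type and its constructor names are my own, chosen to avoid clashing with the standard `Maybe`; note that recent GHC versions also require `Functor` and `Applicative` instances before a `Monad` instance, which was not the case when this article was written:

```haskell
import Control.Monad (ap, liftM)

-- A hypothetical container holding either no value or one value
data Perhaps a = Nope | Have a deriving (Show, Eq)

-- Recent compilers require these instances; both can be
-- derived mechanically from the monad operations
instance Functor Perhaps where
    fmap = liftM

instance Applicative Perhaps where
    pure  = Have              -- return: inject a plain value
    (<*>) = ap

instance Monad Perhaps where
    Have x >>= f = f x        -- bind: pass the contained value to f
    Nope   >>= _ = Nope       -- fail fast: skip all later operations
```

With this instance, `Have 3 >>= (\x -> Have (x + 1))` yields `Have 4`, while `Nope >>= (\x -> Have (x + 1))` short-circuits to `Nope`.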

These ideas come from category theory, which also defines a few important properties that monadic operators must provide:

• The expression `return a >>= f` is equivalent to `f a`.
• The expression `m >>= return`, is equivalent to `m`. (This means that the `return` function used on its own is a no-op. `return` exists to inject values into a container.)
• The expression `(m >>= f) >>= g`, where `m` is a container and `f` and `g` are functions, is equivalent to the expression `m >>= (\x -> f x >>= g)`. That is, the bind operation is associative: a chain of binds produces the same result regardless of how it is grouped, so it can be read simply left-to-right.
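These laws can be checked directly, because ordinary lists are a standard monad (bind maps the function over the list and concatenates the results). The sample values below are my own:

```haskell
-- a sample container and two sample functions in the list monad
m :: [Int]
m = [1, 2]

f :: Int -> [Int]
f x = [x, x + 1]

g :: Int -> [Int]
g x = [x * 2]

-- each law, written as an equality check
leftIdentity, rightIdentity, associativity :: Bool
leftIdentity  = (return 3 >>= f) == f 3
rightIdentity = (m >>= return) == m
associativity = ((m >>= f) >>= g) == (m >>= (\x -> f x >>= g))
-- all three evaluate to True
```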

These definitions are quite formal and may be somewhat confusing the first few times you read them, but the goal here is to define a series of simple, mechanical operations for connecting containers together. Monads provide a way to chain together a sequence of operations in a manner that works using just pure functions. And because they are defined in terms of pure functions, the Haskell compiler can understand them and handle the busywork for you.

Thinking about these operations in the abstract can be a little difficult. Here is a simple example. These two functions operate within the IO monad, the mechanism Haskell uses to communicate with the outside world.

```haskell
getLine  :: IO String
putStrLn :: String -> IO ()
```

Because all IO operations are monadic, the IO monad acts as a container that isolates external side effects in a Haskell program. The function `getLine` reads a line of input from the console and returns that string inside the IO container. The function `putStrLn` accepts a plain string value, prints that string to the console (with a trailing newline), and returns an empty value, also contained within the IO monad.

Here is how these functions are connected together, using the monad operators:

```haskell
echo :: IO ()
echo = getLine >>= putStrLn
```

In this example, the monadic action `getLine` is evaluated and produces an `IO String` result. The bind function (`>>=`) gets this result, extracts the string from its IO container, and sends it to the monadic action `putStrLn`. Finally, the action `putStrLn` consumes this plain string, sends it to the console, and returns the empty value within the IO container (the `IO ()` type). This final result is returned by the `echo` function, so `echo` must have the type `IO ()` as well.

Using the monad operators in this fashion may be powerful, but it is very cumbersome. Thankfully, Haskell allows for an alternate means to describe monadic operations, the "monadic do" notation. Here is an equivalent way of writing the `echo` function:

```haskell
echo :: IO ()
echo = do str <- getLine
          putStrLn str
```

Here, a monadic function is a series of statements, aligned according to the layout rule. The function `getLine` returns a value, so that value can be captured in the symbol `str` and used later within this function. (Remember, `str` is not a variable, because its value cannot change.) Because this value is extracted from the monadic container, the value of `str` here is of type `String`, not `IO String`. As a result, this value can be passed to the function `putStrLn`, which is expecting a `String` as its first argument.

Because these actions are performed left-to-right (or top-to-bottom in "monadic do" notation), monads enforce an order of execution on their statements. With pure functions, sub-expressions may be evaluated in any order without changing their meaning. With monadic functions, the order of execution is very important. For example, it is meaningless to evaluate `putStrLn` in this function before receiving its input from `getLine`.
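The same notation and the same ordering guarantees apply to every monad, not just IO. As a sketch using the standard `Maybe` monad (the `halve` and `quarter` names are my own), each statement runs in order, and a failure at any step skips the rest:

```haskell
-- halve succeeds only on even numbers
halve :: Int -> Maybe Int
halve n
    | even n    = Just (n `div` 2)
    | otherwise = Nothing

-- two halvings in sequence; a Nothing at either step
-- aborts the whole chain
quarter :: Int -> Maybe Int
quarter n = do
    half <- halve n
    halve half
```

Here `quarter 12` evaluates to `Just 3`, while `quarter 6` fails at the second step and evaluates to `Nothing`.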

### Conclusion

The IO monad is not the only monad in Haskell, but it is one of the most frequently used. Other monads offer different kinds of containers and code structures. For example, lists are simple data structures, but they may also be used as monads. The GHC standard library contains a series of monads that are simple, generic, and useful in a variety of contexts. One of these is Parsec, a very convenient library for writing parsers.
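For example, in the list monad, bind applies a function to every element and concatenates the results, so a do block enumerates combinations. A small sketch (the names are my own):

```haskell
-- drawing one element from each list produces every pair,
-- in left-to-right order
pairs :: [(Int, Char)]
pairs = do
    x <- [1, 2]
    y <- ['a', 'b']
    return (x, y)
-- pairs == [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
```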

A lot of exciting new work in Haskell is being done using monads. For example, transactional memory and nested data parallelism are two useful techniques that help when distributing a Haskell program across multiple CPUs. The portions of the program that use these features are strictly segregated from the portions of a program that are purely functional or deal with IO. This makes it easy for the compiler to isolate the portions of a program that are transactional or parallel from the parts that aren't. Furthermore, the type system enforces this separation and flags bugs in code where monads are used improperly.

All of these structures share one thing in common—they are container types that are declared to be instances of the `Monad` type class. Each monad type defines its version of the basic operations of bind (`>>=`), then (`>>`), `return`, and `fail`. The Haskell compiler can then compose a sequence of monadic actions into a single, larger monadic action. The compiler also replaces these generic operations with the specific implementations needed for a given monad, just as it replaces the generic numeric multiplication operation with the specific operation needed to multiply `Int` or `Double` values. Additionally, because all monads use the same basic operators, complicated monadic actions can be expressed using a simple "monadic do" notation, making these operations easier to understand.