Haskell is Not a Purely Functional Language

Haskell has a strange reputation. Some of the best programmers I've known, coming to Haskell, have made ridiculous assumptions like "It must be pretty hard to write real programs in Haskell, given you can't have side effects." Of course that statement can't be true -- there are many real programs written in Haskell, and a program without a side effect wouldn't be worth running, right?

Everyone describes Haskell as purely-functional. Even the haskell.org front page uses the phrase "purely-functional" within the first ten words. I argue that purely-functional, while in some ways accurate, gives the wrong impression about Haskell.

I propose instead that, for most programmers, it's better to think of Haskell as a language with restricted effects.

Out of the box, Haskell has two effect contexts, one at each extreme: pure functions and IO. Pure functions cannot[1] have side effects at all. Their outputs depend solely on their inputs; thus, they can be evaluated in any order[2] (or multiple times or never at all). IO actions, however, can have arbitrary side effects. Thus, they must be sequenced and run in order.

IO is unrestricted, but usefully restricted subsets of IO can be defined atop it. Consider software transactional memory. One reason STM works so well in Haskell compared to other languages is because it does not need to track reads and writes to all memory: only those inside of the STM execution context. STM is a restricted execution context - if arbitrary side effects were allowed in a transaction, then they might be run zero or multiple times, which would be unsafe. But there's another benefit of this restricted context: it means that transactions only need track reads and writes to STM variables like TVars. Independent memory reads and writes, such as those that occur as part of the evaluation of pure functions, can be completely ignored -- pure functions can have no side effects, so it's fine to reevaluate them as needed. This is why Haskell has some of the best STM support, as explained in Haskell is useless.

Restricted effect contexts are powerful. IMVU uses them to guarantee unit tests are deterministic. They also let you guarantee that, for example, a Redis transaction cannot issue a MySQL query or write to a file, because the Redis execution engine may arbitrarily retry or abort the transaction.

OK, so Haskell has restricted effects. Restricted effects are a useful, though quite uncommon, language feature.

Should it be called a "restricted effects language"? No, I don't think so. It is totally possible to write all of your code in IO with imperative variable updates and mutable data structures, just like you would in Python or C. It's also fine to write all of your code with pure functions invoked from a minimal main function. That is, Haskell supports many styles of programming.

One word or phrase is simply not sufficient to describe anything meaningfully complex, such as a programming language. Languages consist of many features.

In one of my favorite papers, Teaching Programming Languages in a Post-Linnean Age, Shriram Krishnamurthi argues that programming language paradigms are outdated -- languages are designed with interacting sets of features and should be evaluated as such.

Programming language "paradigmsâ€ are a moribund and tedious legacy of a bygone age. Modern language designers pay them no respect, so why do our courses slavishly adhere to them? This paper argues that we should abandon this method of teaching languages, offers an alternative, reconciles an important split in programming language education, and describes a textbook that explores these matters.

The problem with using paradigms to describe a language is that they come across as exclusionary. "This language is functional, not imperative." "This language is object-oriented, not functional." However, almost all languages support a wide range of programming styles.

So let's compare a few languages broken down into a mix of individual language features:

	Java	Haskell	Go	Rust	C	Python
Memory safety	Yes	Yes	Kinda[3]	Yes	No	Yes
Garbage collection	Yes	Yes	Yes	No	No	Yes
Precise control over memory layout	No	No	No	Yes	Yes	No
Concurrency model	1:1	M:N	M:N	1:1	1:1	GIL
Obvious translation to machine code	No	No	Somewhat	Yes	Yes	No
Generic programming[4]	Yes	Yes	Kinda, via interface{}	Yes	No	Yes
Mutable data structures	Yes	Yes	Yes	Yes	Yes	Yes
Higher-order functions	Kinda[5]	Yes	Yes	Yes	No	Yes
Sum types	No	Yes	No	Yes	No	No
Runtime eval	Kinda[6]	No	No	No	No	Yes
Type inference	Limited	Yes	Limited	Limited	No	No
Tail calls	No	Yes	No	No	No	No
Exceptions	Yes	Yes	Kinda[7]	No	No	Yes
"Goroutines" e.g. user-threads	No[8]	Yes	Yes	No	No	Yes
Function overloading	Yes	Yes	No	No	No	No
Costless abstractions	Depends on JIT	Yes	No	Yes	Kinda [Macros]	No

The chart above is pretty handwavy, and we could quibble on the details, but my point is that languages should be compared using feature sets rather than paradigms. With feature sets it's possible to see which languages occupy similar points in the overall language design space. Haskell and Rust have related type systems, and Haskell and Java and Go have similar runtime semantics. It should be clear, then, that there is no taxonomy as simple as "functional", "imperative", or "object-oriented" that accurately classifies the differences between these languages.

(An aside: Whether a language supports a feature is often fuzzy, especially because some features can be broken apart into several aspects. Also, you can sometimes approximate one feature with others. For example, since Go doesn't support sum types, you can use type switches instead, at the cost of exhaustiveness checks and efficiency[9].)

"Purely-functional" is a description so generic that it doesn't fully convey what is interesting about Haskell. While Haskell has great support for functional programming, it has uniquely-powerful support for imperative programming too.

I recommend avoiding using the terms "functional language", "procedural language", "imperative language", "object-oriented language", and whatever other paradigms that have been popularized over the years.

Thanks to Imran Hameed for his feedback and suggestions. Of course, however, any mistakes are mine.

[1] Except via unsafe, user-beware mechanisms like unsafePerformIO and unsafeDupablePerformIO.

[2] I'm hand-waving a bit. If evaluating a value throws an exception, the exception must not be thrown unless the value is needed.

[3] Go is memory-safe except when one thread writes to an interface variable or slice and another reads from it.

[4] By generic programming I mean the ability to conveniently express, in the language, a computation that can operate on many different types of input. Any dynamic language will have this property because all values inhabit the same type. Typed languages with generic type systems also have this property. Go doesn't so you must use interface{} with explicit casts or code generation to implement generic algorithms.

[5] JVM doesn't really support higher-order functions

[6] In the JVM, code can be loaded dynamically with a ClassLoader. But runtime implementations don't generally ship with a Java language compiler.

[7] Go has panic and recover which are roughly equivalent to try...catch but with manual exception filtering and rebubbling.

[8] Java has a user threading mechanism implemented with bytecode translation in a library called Quasar.

[9] A direct encoding for sum types can be unboxed, smaller than a word, and only require one indirection to access the payload. The equivalent Go using type switches and interfaces would use two words for the interface pointer and an additional indirection.

Imported Comments [?]

Franklin Chen on Oct 14, 2015

Great post. I agree that it is best to characterize Haskell as "restricted effects" (and each effect indicated in a type, with the exception of, ahem, exceptions and unsafe operations).

A couple of comments on the matrix: GHC allows some control over memory layout: I have used unboxed types, unpacked constructor fields, and unpointed types may be coming. Also, as for runtime eval, I use the plugins package that of course depends on GHC internals. But you could argue GHC Haskell is not standard "Haskell".

Dropbox Hack Week: GraphQL Server in Haskell – Chad Austin on Feb 2, 2016

[…] like GHC’s upcoming ApplicativeDo extension. There are some things that Haskell’s restricted effects are uniquely good at. […]

Two Kinds of Servers – Chad Austin on Apr 15, 2016

[…] Go does not make a good language for application servers. The lack of immutable data makes it really awkward to enforce invariants. The lack of generics makes it awkward to express IO unless you generate code for each entity type in your database. Complicated arrangements of goroutines and channels are necessary to parallelize backend IO. Data structure traversals must be manually spelled out every time. Haskell, on the other hand is brilliant for application servers (assuming your team is comfortable with the learning curve). Libraries like Haxl allow specification of business rules in natural imperative style, while automatically extracting IO parallelism from the structure of the code. In Go, timeouts must be implemented manually every time, whereas in Haskell there is a single, always-correct timeout function. Go also doesn’t have anything like Haskell’s mapConcurrently, and while it can be done manually, it’s tricky to remain correct in the presence of unhandled errors. (In a high-reliability system, you probably want to transfer a runtime error like division by zero or null pointer dereference to your request handler’s error handler so it can be logged appropriately, rather than shutting down your whole server.) With Haskell, you can enforce 100% reliable unit tests or that transactions are never nested. […]

Why isn’t Crux a pure functional language? – Chad Austin on Sep 9, 2016

[…] of all, the term “pure functional language” isn’t terribly enlightening. It’s certainly not useful for communication, because everyone has or invents their own […]