I'd like to follow up on my previous post with some points that are more nuanced and probably only make sense to people who've written Haskell before.

Syntax

Some syntax elements in Haskell were annoying at first, but I adjusted to them quite easily, and I now believe my annoyance was simply unfamiliarity. Separating arguments by spaces and using the $ operator (that is, a $ b c as shorthand for a (b c)) both come naturally now. The only thing I still get wrong is operator precedence.
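
To make the precedence point concrete: plain function application binds tighter than any operator, and $ binds loosest of all, which is exactly the trap I keep falling into.

main :: IO ()
main = do
  -- Function application binds tightest, so this is print (sum [1, 2, 3]).
  print (sum [1, 2, 3])
  -- $ has the lowest precedence, so it can replace the parentheses.
  print $ sum [1, 2, 3]
  -- The classic trap: this parses as (negate 1) + 2, which is 1,
  -- not negate (1 + 2), which is -3.
  print $ negate 1 + 2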

Terminology

Regarding category theory terminology: when I so much as mention words like functor or monad, I've seen engineers' eyes glaze over. They instantly think "Oh, that's too complicated to understand." I swear, if Functor were named Mappable and Monad were named Chainable or something like that, Haskell would seem much less intimidating to beginners. Explaining monads doesn't require fancy tutorials or stretched analogies. Similarly, explaining functors doesn't require the words "lifting" or "context". It's as simple as: "You know how you can map across a list in every other language? Consider Maybe as a list of zero or one elements. You can map across that too. What about a promise or a future? You can map across that too. Even functions can be mapped, transforming their return values. Anything that can be mapped is a functor."
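
To make that pitch concrete, here's the same fmap working across several of those functors (I've left out promises, since those depend on which async library you use):

import Data.Char (toUpper)

main :: IO ()
main = do
  print (fmap toUpper "hello")  -- "HELLO": a String is a list of Char
  print (fmap (+ 1) (Just 41))  -- Just 42: a list of one element
  print (fmap (+ 1) Nothing)    -- Nothing: a list of zero elements
  print (fmap (* 2) (+ 3) 10)   -- 26: mapping a function transforms its return value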

In general, I do think it's a good idea for programmers to use the same terminology as mathematicians, as they're generally more rigorous and precise, but... I'll just quote An Introduction to Category Theory:

In her charming Italian style she asked me (instructed me) to cut out all the jokes. This was quite difficult since some of the official categorical terminology is a joke, but I have done my best.

Modules

This is a relatively minor nit: Haskell module import syntax is cluttered. I always feel like import qualified Data.Vector as Vector would be better written in Python style: from Data import Vector. This would have the side benefit of mitigating a common, and in my opinion unfortunate, pattern in Haskell code: importing modules by abbreviation, as in import qualified Data.Vector as V. For some common modules, like Data.ByteString as BS and Data.ByteString.Char8 as BSC, the abbreviations are well-known enough that everyone can tell in context which module has been imported. For your own modules, though, you should import with explicit, unabbreviated names, so it's clear in context which module is being referenced.
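
For illustration, here is what that looks like in practice, assuming the bytestring and vector packages:

-- Abbreviations common enough that everyone recognizes them:
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BSC
-- Your own (or less familiar) modules: spell the name out.
import qualified Data.Vector as Vector

firstByte :: BS.ByteString -> Maybe Char
firstByte bs = if BS.null bs then Nothing else Just (BSC.head bs)

firstElement :: Vector.Vector Int -> Maybe Int
firstElement v = v Vector.!? 0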

Laziness

I'm unsure about laziness by default. Laziness can be really great, and it's somewhat cheap in Haskell, but there are many situations where, if you're going to compute anything for a data structure, you might as well compute it all right then, while the caches are still hot. The theoretical benefit of computing only the values you need has a nontrivial cost: lazy, thunked data structures carry branching and dereferencing overhead that strict languages avoid.
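
The standard illustration of that cost is foldl versus foldl':

import Data.List (foldl')

-- Lazy foldl builds a chain of thunks (((0 + 1) + 2) + ...) and only
-- forces it at the end: extra allocation, pointer chasing, and on
-- large inputs possibly a stack overflow when the chain collapses.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- foldl' forces the accumulator at each step, so the running total
-- stays a plain value instead of a growing tree of deferred work.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 10000000])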

Streaming and Chunking

I feel like Haskell streaming and chunking libraries, like Conduit, have the same problem as laziness for most reasonably-sized data structures. If your machine has 32 GiB of RAM and 50 GB/s of memory bandwidth, who cares if you allocate 100 MB of data and dump it all on a socket in one go? Chunking only matters much when your peak memory usage times your concurrency doesn't fit in RAM. I've seen similar performance issues with Python 2's xrange(): the eagerly-allocating range() is frequently faster in practice because it avoids the extra per-iteration overhead.
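
Here's a sketch of the "just dump it" approach, using sendAll from the network package (it loops over partial writes for you):

import qualified Data.ByteString as BS
import Network.Socket (Socket)
import Network.Socket.ByteString (sendAll)

-- If the payload fits comfortably in memory, read it strictly and
-- write it in one go; no streaming pipeline required.
sendWholeFile :: Socket -> FilePath -> IO ()
sendWholeFile sock path = do
  contents <- BS.readFile path  -- one allocation, roughly the file size
  sendAll sock contents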

Partial Functions

Some people have vehemently argued that pattern matching is so superior to conditionals and dereferencing that Java and C++ are fundamentally flawed. That is:

case p of
  Nothing -> bar
  Just v -> foo v

is safer than:

if (ptr) {
  ptr->foo();
} else {
  bar();
}

I have some sympathy for this argument. Pattern matching syntax does mean that the dereferenced value is only visible in the scope where it is valid. However, the real problem with languages like C++ and Java is not the lack of pervasive pattern matching. It's that Java and C++ don't have a way to distinguish in the type system between a boxed value (spelled T*) and a potentially-null boxed value (also spelled T*).

Pattern matching by itself is nice, but the real benefit comes from its combination with sum types.
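
Concretely, Maybe is just an ordinary sum type, and the type system forces every use site to confront the Nothing case. A sketch using Data.Map from the containers package:

import qualified Data.Map as Map

-- Map.lookup returns Maybe Int, so absence is in the type and a
-- bare Int can never be "null".
describeAge :: Map.Map String Int -> String -> String
describeAge ages name =
  case Map.lookup name ages of
    Nothing  -> name ++ ": age unknown"
    Just age -> name ++ " is " ++ show age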

What is the Haskell equivalent of NullPointerException? Partial functions!

Let's imagine I wrote the above code as follows, assuming p is never Nothing:

foo $ fromJust p

If p is Nothing, this code throws an error. fromJust is a partial function: when its input has an unexpected value (here, Nothing), it throws an error. Haskell has many partial functions.
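
The total alternatives are all one import away; a sketch:

import Data.Maybe (fromMaybe)

-- Total: the Nothing case gets a default value.
withDefault :: Maybe Int -> Int
withDefault p = fromMaybe 0 p

-- Total: maybe bar foo p is the exact equivalent of the case
-- expression at the top of this section.
withHandler :: Maybe Int -> Int
withHandler p = maybe 0 (+ 1) p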

In your code, you should strive to write total functions. However, I am of the opinion that Haskell should distinguish, in the type system, between the two kinds of bottom. There are two ways a Haskell function can fail to produce a value: it can throw an exception, or it can enter an infinite loop. Partial functions throw exceptions.
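
You can observe the exception kind of bottom today: the error hides inside a pure thunk and only surfaces when the value is forced, which is why it has to be caught in IO. A small sketch:

import Control.Exception (SomeException, evaluate, try)
import Data.Maybe (fromJust)

main :: IO ()
main = do
  -- evaluate forces the pure thunk; try catches whatever it throws.
  result <- try (evaluate (fromJust (Nothing :: Maybe Int)))
  case result of
    Left e  -> putStrLn ("caught: " ++ show (e :: SomeException))
    Right v -> print v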

In practice, infinite loops are rare (except through accidental recursive references). However, it would be useful to disallow certain classes of pure code from throwing exceptions, which would make it illegal for pure code to call partial functions. Certain types of errors could still occur, notably out-of-memory errors and stack overflows, so the language would have to draw a distinction between synchronous errors, such as partial matches, and asynchronous errors, such as resource exhaustion.

I can understand why the language adopted a simpler model, even though it would be nice to guarantee that some pure functions cannot throw non-resource-exhaustion errors.

Purity is a vague concept. Several useful attributes can be applied to a call graph: "has no observable side effects", "does not allocate heap memory", "fits within a defined resource budget", "does not throw exceptions". I'm not aware of any mainstream language that distinguishes between these notions of purity.