Monad in Haskell

Haskll 的 Monad 其实是针对一个 type constructor 的定义。

所谓 type constructor，就是类型构造器，他接收一个或若干个类型，产生一个新的类型。例如 Maybe a，其中 a 是一个 type，而 Maybe 就是一个 type constructor。又如 [a]，其中的 [] 即为一个 type constructor，传入一个类型 a，会构造一个 a 的列表的 type。

我们可以大致认为一个 type constructor 就是一个盒子，他就是把一个 a 类型的值给装进了一个盒子里。虽然这种比喻有些时候并不完全恰当。

Functor#

Monad 体系内最先被定义的是 Functor。考虑函数 map，它的类型是 map :: (a -> b) -> [a] -> [b]，即给 [a] 内的每个元素都作用上 a -> b 的函数。但是该定义只对 list 这个 type constructor 有效。我们希望将这个函数普适到其他的 type constructor。

而所有能被作用这样类似 map 函数的 type constructor，我们统一定义其为 Functor class。

Functor 这个 class 的定义如下：

class Functor f where
  fmap :: (a -> b) -> f a -> f b
  (<$) :: a -> f b -> f a
  (<$) = fmap . const

其中后面两行其实是对 fmap 进行符号化的定义，本质定义只有第二行，即我们只需要对于这个 type constructor 定义一个 fmap 函数。函数的作用就是给定一个 a -> b 的函数，然后再给一个用 f 包裹住 a 的类型，要将盒子里的每个元素都做用一下 a -> b 的函数，最后得到一个用 f 包裹住 b 的结果。

如果我们把 f 当作是 [] 这个 constructor，那么就很容易这是在干什么了。定义 [] 的 Functor 可以写作：

instance Functor [] where
  fmap = map

因为对于 list 来说，fmap 就是 map。

我们再来看 Maybe 这个 type constructor 的 Functor 应该怎么写：

instance Functor Maybe where
  fmap _ Nothing = Nothing
  fmap f (Just x) = Just $ f x

还有 IO 的 Functor：

instance Functor IO where
  fmap f mx = do
    x <- mx
    return $ f x

然后 Prelude 中还对 Functor 定义了一个运算符：

(<$>) :: Functor f => (a -> b) -> f a -> f b
(<$>) = fmap

就是我们可以用 <$> 这个中缀来代替 fmap，这个的作用在下面会有体现。

Applicative#

在上面 Functor 中，我们只能对一个给定的 a -> b，去 map over 一个 f 包裹的 type。Applicative 其实是对 Functor 的一个扩展，它将定义一个 <*> 函数，可以给定一个 f 包裹的 a -> b，再给定一个 f 包裹的 a，得到一个 f 包裹的 b。具体来说，其定义如下：

class Functor f => Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b

它定义了两个函数：pure 和 <*>。其中 <*> 就是上面所说的，给定一个用 f 包裹的 a -> b 函数，再给定一个用 f 包裹的 a，返回一个 f 包裹的 b。

而 pure 则是一个用于构造单位量的函数，给定一个 a 类型的变量，返回他用 f 包裹后的结果。

我们还是以 Maybe 为例来看一下如何将它变成 Applicative 的实作：

instance Applicative Maybe where
  pure = Just
  Nothing <*> _ = Nothing
  (Just f) <*> mx = g <$> mx

首先考虑 pure，显然就是直接装进 Just 即可。然后是 <*>，传入的第一个值就是一个 Maybe 包裹的 a -> b 的函数，那么如果这个函数是 Nothing，结果必然是 Nothing，否则直接将 a -> b 的函数取出，使用 Functor 中的 fmap 进行作用即可。

再看 list 的 Applicative 是怎么定义的：

instance Applicative [] where
  pure x = [x]
  fs <*> xs = [f x | f <- fs, x <- xs]

先看 pure，单个元素装入 list 就是直接构造只包含一个元素的 list。对于 <*>，则是取出 fs 中的一个函数 f，再取出 xs 中的一个值 x，将所有 f x 拼成一个 b 类型的 list 即可。同时，我们也可以用 fmap 来定义 <*> 运算：

instance Applicative [] where
  pure x = [x]
  fs <*> xs = concat [fmap f xs | f <- fs]

这种写法和上面的没有本质区别。接下来列举 list 类型的一种示例：

ghci> pure (+1) <*> [1,2,3]
[2,3,4]
ghci> [(+3),(*3)] <*> [2,3,4]
[5,6,7,6,9,12]
ghci> pure (*) <*> [1,2,3] <*> [4,5,6]
[4,5,6,8,10,12,12,15,18]

前两个示例很容易理解，第三个示例可以认为是从左往右逐个运算，第一部分是一个 (<*>) :: f (Int -> Int -> Int) -> f Int -> f (Int -> Int)，然后变成了 [(*1),(*2),(*3)]，再作用到最后一个 list 上，得到答案。

然后还有 IO 类型的 Applicative：

instance Applicative IO where
  pure = return
  mg <*> mx = do {g <- mg; x <- mx; return (g x)}

一个示例是：

getChars :: Int -> IO String
getChars 0 = return []
getChars n = pure (:) <*> getChar <*> getChars (n - 1)

这个过程很好理解。然后 Applicative 还有一个经典应用是 sequenceA，他的定义如下：

sequenceA :: Applicative f => [f a] -> f [a]
sequenceA []     =  pure []
sequenceA (x:xs) =  pure (:) <*> x <*> sequenceA xs

他的作用可以根据定义看出来，当然我们可以看他在 Maybe，list 等上的表现：

ghci> sequenceA [Just 1, Just 2, Just 3]
Just [1,2,3]
ghci> sequenceA [Just 1, Nothing, Just 3]
Nothing

也就是说只要 list 中有一个元素是 Nothing，结果就会是 Nothing。

ghci> sequenceA [[1,2,3],[4,5,6],[7,8,9]]
[[1,4,7],[1,4,8],[1,4,9],[1,5,7],[1,5,8],[1,5,9],[1,6,7],[1,6,8],[1,6,9],[2,4,7],[2,4,8],[2,4,9],[2,5,7],[2,5,8],[2,5,9],[2,6,7],[2,6,8],[2,6,9],[3,4,7],[3,4,8],[3,4,9],[3,5,7],[3,5,8],[3,5,9],[3,6,7],[3,6,8],[3,6,9]]

这其实就是每个 list 内选一个元素连起来，形成的所有 list 再组成一个 list 的结果。

而对于 IO 类型，我们也可以把上面写的 getChars 写成 sequenceA 的形式：

getChars :: Int -> IO String
getChars n = sequenceA [replicate n getChar]

Monad#

之前说的 Applicative 是 Functor 的扩展版本，那么再扩展一下就有了 Monad。Monad 主要定义的函数是 (>>=)，他的类型是 (>>=) :: (m a) -> (a -> m b) -> (m b)。这里把 type constructor 从字母 f 变成了 m 也体现了这一 type constructor 的 Monad 属性。也就是我们有一个 a 到 m b 的函数，要把它作用到 m a 内装着的元素里，得到 m b。这么定义有什么作用我们等会儿再来说。

先来看一下 Monad 的形式化定义：

class Applicative m => Monad m where
  return :: a -> m a
  return = pure
  (>>=) :: m a -> (a -> m b) -> m b
  (>>) :: m a -> m b -> m b
  m >> k = m >>= \_ -> k

可以看到定义中有三个函数：return，(>>=) 和 (>>)。首先看 return，它的作用是给定一个元素，返回只包含他的对应 type 的单位量。发现其作用就是 Applicative 中的 pure，而 Monad 就是由 Applicative 拓展而来的，因此直接使用 return = pure 来定义即可。

然后再看 (>>=)，这个函数也是 Monad 的关键，可以读作 bind。它的类型和作用在上面已经介绍了。然后看 (>>)，他给定了一个 m a 和一个 m b，然后只返回 m b，方式是将 m a bind 到一个无论传入何值，都直接返回给定的 m b 的函数上。

注意到 return 是从 Applicative 的 pure 引入的，(>>) 又是根据 (>>=) 定义好的，因此如果我么希望实作一个 type constructor，其实只需要定义他的 (>>=) 函数即可。

还是先来看一下 Maybe 的 Monad 应该如何定义：

instance Monad Maybe where
  Nothing >>= _ = Nothing
  (Just x) >>= f = f x

这个定义是显然的，如果 m a 是 Nothing 那么返回值就是 Nothing，否则把 m a 里面的值取出来，作用一下 f 即可。来看几个例子：

ghci> Just 9 >>= \x -> return (x * 10)
Just 90
ghci> Nothing >>= \x -> return (x * 10)
Nothing

然后来看 list 的 Monad 定义：

instance Monad [] where
  xs >>= f = [y | x <- xs, y <- f x]

其实就是把 m a 里的每个元素取出来，作用一下 f 能够得到一个 m b，然后再把 m b 里的元素全部取出来造成一个 list。因此 list 的 Monad 定义也可以写作：

instance Monad [] where
  xs >>= f = cancat . (map f xs)

同样也可以看一些例子：

ghci> [1,2,3] >>= \x -> return (4 * x)
[4,8,12]
ghci> [1,2,3] >>= \x -> (replicate 4 x)
[1,1,1,1,2,2,2,2,3,3,3,3]
ghci> [1,2,3] >>= \n -> ['a','b','c'] >>= \ch -> return (n, ch)
[(1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(2,'c'),(3,'a'),(3,'b'),(3,'c')]

Monad 的链式结构#

Monad 的函数写作 (>>=) 和 (>>)，这可以在写代码时写出一个链式的结构。比如考虑下面这个问题：

皮尔斯决定要辞掉他的工作改行试着走钢索。他对走钢索蛮在行的，不过仍有个小问题。就是鸟会停在他拿的平衡竿上。他们会飞过来停一小会儿，然后再飞走。这样的情况在两边的鸟的数量一样时并不是个太大的问题。但有时候，所有的鸟都会想要停在同一边，皮尔斯就失去了平衡，就会让他从钢索上掉下去。

我们这边假设两边的鸟差异在三个之内的时候，皮尔斯仍能保持平衡。所以如果是右边有一只，左边有四只的话，那还撑得住。但如果左边有五只，那就会失去平衡。

我们要写个程序来仿真整个情况。我们想看看皮尔斯究竟在好几只鸟来来去去后是否还能撑住。

那么我们先来定义一个 type 表示杆子的状态：

type Birds = Int
type Pole = (Birds, Birds)

然后我们就可以用 Maybe Pole 来表示当前状态：若为 Nothing 则表示已经失去平衡，否则表示两边各有多少鸟。可以写出如下两个函数来表示在杆子上停鸟的过程：

landLeft :: Birds -> Pole -> Maybe Pole
landLeft n (left, right)
  | abs((left + n) - right) < 4 = Just (left + n, right)
  | otherwise                   = Nothing
  
landRight :: Birds -> Pole -> Maybe Pole
landRight n (left, right)
  | abs(left - (right + n)) < 4 = Just (left, right + n)
  | otherwise                   = Nothing

那么如果我们先往 landLeft 里面穿一个参数 n，即 landLeft n，这就是一个 Pole -> Maybe Pole 的函数，符合 Monad 的要求。因此，一个过程就可以被表示为：

ghci> return (0, 0) >>= landRight 2 >>= landLeft 3 >>= landRight 2
Just (4,3)
ghci> return (0, 0) >>= landRight 2 >>= landLeft 6 >>= landRight 2
Nothing

如果中途还有一个状态直接导致不合法，那么就可以用 (>>) 函数来表示：

ghci> return (0, 0) >>= landRight 2 >>= landLeft 3 >> Nothing >>= landRight 2
Nothing

Monad 的 Do 表示法#

有些时候我们定义一个函数，可能需要在 lambda 内套用 (>>=)，比如：

ghci> Just 3 >>= (\x -> Just "!" >>= (\y -> Just (show x ++ y)))  
Just "3!"

这时候写成一层套一层的结构就会很丑，因此 Haskell 还支持了 Do 表达式。我们把上面那句话分行写：

foo :: Maybe String
foo = Just 3   >>= (\x ->
      Just "!" >>= (\y ->
      Just (show x ++ y)))

那么 Do 表达式的用处就是让他变得好看，可以写作：

foo :: Maybe String
foo = do
    x <- Just 3
    y <- Just "!"
    Just (show x ++ y)

同样的，上一段中所写的链式结构：

ghci> return (0, 0) >>= landRight 2 >>= landLeft 3 >>= landRight 2
Just (4,3)

也可以被写作 Do 表达式：

routine :: Maybe Pole
routine = do
        start <- return (0, 0)
        first <- landRight 2 start
        second <- landLeft 3 first
        landRight 2 second

运行就有：

ghci> routine
Just (4,3)

所以 Do 的本质其实是一个语法糖，但是却让 Haskell 这个函数式的编程语言有了命令式的感觉。

Laws#

还有一部分内容是有关 Functor，Applicative 和 Monad 的 Laws，所有定义这三种 type class 的函数都必须满足其对应的 Laws。

这一部分有一些范畴论的内容，下面的解释不一定完全正确。

Functor Laws#

任何一个 Functor 的实例，都必须满足如下两个性质：

fmap id === id
fmap (f . g) === fmap f . fmap g

Applicative Laws#

任何一个 Applicative Functor 的实例，都必须满足如下性质（标注内容是对应的类型分析）：

pure id <*> x === x
- 若 id :: a -> a，则 x :: f a
pure (g x) === pure g <*> pure x
- 若 x :: a，则 g :: a -> b，pure (g x) :: f b
x <*> pure y === pure (\g -> g y) <*> x
- 若 y :: a，则 x :: f (a -> b)，pure (\g g -> y) :: f ((a -> b) -> b)
x <*> (y <*> z) === (pure (.) <*> x <*> y) <*> z
- 若 z :: f a，则 y :: f (a -> b)，x :: f(b -> c)，
- 且 pure (.) <*> x <*> y :: f (a -> c)

其实后两条就是类似交换律和结合律。

Monad Laws#

任何一个 Monad 的实例，都必须满足如下性质：

Left identity（左单位律）

-- :: m a       :: a -> m b
-- ↓↓↓↓↓↓↓↓     ↓
   return x >>= h  ===  h x
--                      ↑↑↑
--                      :: m b

Right identity（右单位律）

-- :: m a :: a -> m a
-- ↓↓     ↓↓↓↓↓↓
   mx >>= return  ===  mx
--                     ↑↑
--                     :: m a

Associativity（结合律）

--  :: m a
--  ┊┊     :: a -> m b
--  ┊┊     ┊      :: b -> m c      :: a :: m c
--  ↓↓     ↓      ↓                ↓    ↓↓↓↓↓↓↓↓↓
   (mx >>= g) >>= h  ===  mx >>= (\x -> g x >>= h)
--  ↑↑↑↑↑↑↑↑                      ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
--  :: m b                        :: a -> m c

至于为什么叫做这三个，可以用如下方式来解释：Haskell 的 Control.Monad 模块中按照如下方式定义了运算符 >==>：

-- The monad-composition operator
-- defined in Control.Monad
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)

--          :: a :: m b  :: b -> m c
--          ↓    ↓↓↓     ↓
f >=> g  = \x -> f x >>= g
--               ↑↑↑↑↑↑↑↑↑
--               :: m c

简单来说就是把两个函数合并到一个函数上。那么我们可以对三个 Law 分别做一些变换：

Left identity

              return x >>= h      ===  h x
<==>         (return x >>= h)     ===  h x
<==>  (\y -> (return y >>= h)) x  ===  h x
<==>  (       return   >=> h ) x  ===  h x
<==>  (       return   >=> h )    ===  h
<==>          return   >=> h      ===  h

其实就是在说 return 是 >==> 的左单位元。

Right identity

            mx  >>= return     ===  mx
<==>        f x >>= return     ===  f x
<==> (\y -> f y >>= return) x  ===  f x
<==> (      f   >=> return) x  ===  f x
<==> (      f   >=> return)    ===  f
<==>        f   >=> return     ===  f

其实就是在说 return 是 >==> 的右单位元。

Associativity

            (      my  >>= g)   >>= h    ===        my  >>= (\y -> g y >>= h)
<==>        (      f x >>= g)   >>= h    ===        f x >>= (\y -> g y >>= h)
<==>        (      f x >>= g)   >>= h    ===        f x >>= (      g   >=> h)
<==>        (\u -> f u >>= g) x >>= h    ===        f x >>= (      g   >=> h)
<==>        (      f   >=> g) x >>= h    ===        f x >>= (      g   >=> h)
<==> (\u -> (      f   >=> g) u >>= h) x === (\u -> f u >>= (      g   >=> h)) x
<==> (\u -> (      f   >=> g) u >>= h)   === (\u -> f u >>= (      g   >=> h))
<==> (      (      f   >=> g)   >=> h)   === (      f   >=> (      g   >=> h))

其实就是在说 >==> 满足结合律。

一些 Monad#

State Monad#

考虑下面这个问题：有一个 Tree a 类型，他按以下方式来定义：

data Tree a = Leaf a | Node (Tree a) (Tree a) deriving Show

也就是只有叶子节点会存放一个 a 类型的值。现在我们希望实现一个函数 relable :: Tree a -> Tree Int，来给每个叶子节点重新标号。如果按照普通函数的方式来写，那么就需要传回一个 pair，一个表示结果，一个表示当前用到的标号：

rlabel :: Tree a -> Int -> Tree Int
rlabel (Leaf _) n = (Leaf n, n + 1)
rlable (Node l r) n = ((Node l' r') n'') where
  (l', n' ) = rlable l n
  (r', n'') = rlable r n'

relable :: Tree a -> Tree Int
relable t = fst (rlable t 0)

如果使用 State Monad，我们就能做到不显示维护这些值。我们先来定义 StateTrans：

type State = Int
newtype StateTrans a = ST (State -> (a, State))

这里 StateTrans 就是对 State 进行某种变化，但是中间可能会产出一个需要的值，类型为 a。在上面的例子中，State 即为我们要存的当前标到的号，而 a 即为产出的结果。

为了下面定义方便，我们再定义一个函数 app：

app :: StateTrans a -> State -> (a, State)
app (ST f) s = f s

其实就是把 StateTrans 里面的函数拿出来作用一下。

先来看一下应该如何把 StateTrans 变为 Functor，Applicative 和 Monad 的实例：

instance Functor StateTrans where
  fmap g st = ST $ \s -> let (x, s') = app st s in (g x, s')
  
instance Applicative StateTrans where
  pure x = ST $ \s -> (x, s)
  stf <*> stx = ST $ \s -> let (f, s' ) = app stf s
                               (x, s'') = app stx s'
                           in (f x, s'')

instance Monad StateTrans where
  return x = ST $ \s -> (x, s)
  st >>= f = ST $ \s -> let (x, s') = app st s
                        in app (f x) s'

三个定义都很自然。那么现在我们就可以尝试用 State Monad 来定义 relable 函数了：

fresh :: StateTrans Int
fresh = ST $n -> (n, n + 1)

mlable :: Tree a -> StateTrans (Tree Int)
mlable (Leaf _) = fresh >>= \n -> return $ Leaf n
mlable (Node l r) = mlable l >>= \l' ->
                    mlable r >>= \r' ->
                    return $ Node l' r'

relable :: Tree a -> Tree Int
relable t = fst $ app (mlable t) 0

当然也可以把 mlable 写成 Do 表达式：

mlable (Leaf _) = do n <- fresh
                     return $ Leaf n
mlabe (Node l r) = do l' <- mlable l
                      r' <- mlable r
                      return $ Node l' r'

`(->) r` Monad#

注意到 a -> b 可以被写作 (->) a b，也就是说 -> 其实也是一个 type constructor，他接收两个 type a 和 b，返回一个 a -> b 的 type。但是我们并不能直接对 -> 来定义 Monad，因为 Monad 只能定义在接收一个 type 的 type constructor 上。

但是发现 Haskell 的函数都是 Curry 化的，因此如果我们给定了 a，他么他就只需要接收一个 type 了。也就是说 (->) r 其实是一个接收一个 type 的 type constructor。

我们来尝试定义 (->) r 的 Functor，Applicative 和 Monad：

instance Functor ((->) r) where
  fmap = (.)

instance Applicative ((->) r) where
  pure = const
  g <*> h = \x -> g x $ h x

instance Monad ((->) r) where
  return = const
  g >>= h = \x -> h (g x) x

此时我们很难把 (->) r 看作一个装东西的盒子，这三个函数的定义也就没有这么显然了。但是我们可以用类型推导的方式来做出上面三个定义，分别来看：

对于 Functor，有：

fmap :: (a -> b) -> f a -> f b
fmap :: (a -> b) -> (r -> a) -> (r -> b)
fmap g h = \x -> g (h x)
fmap = (.)

对于 Applicative，有：

(<*>) :: f (a -> b) -> f a -> f b
(<*>) :: (r -> a -> b) -> (r -> a) -> (r -> b)
g <*> h = \x -> g x $ h x

对于 Monad，有：

(>>=) :: f a -> (a -> f b) -> f b
(>>=) :: (r -> a) -> (a -> r -> b) -> (r -> b)
g >>= h = \x -> h (g x) x

因此类型推导也能帮助我们来定义一些较抽象 type constructor 的 Monad。

Expr Monad#

Expr 是我们自己定义的一个 type，我们也可以对其定义 Monad：

data Expr a
  = Var a
  | Val Int
  | Add (Expr a) (Expr a)
  deriving (Show)

instance Functor Expr where
  fmap f (Var x) = Var $ f x
  fmap _ (Val x) = Val x
  fmap f (Add x y) = Add (fmap f x) (fmap f y)

instance Applicative Expr where
  pure = Var
  (Var f) <*> t = fmap f t
  (Val x) <*> _ = Val x
  (Add x y) <*> t = Add (x <*> t) (y <*> t)

instance Monad Expr where
  return = Var
  (Var x) >>= f = f x
  (Val x) >>= _ = Val x
  (Add x y) >>= f = Add (x >>= f) (y >>= f)

这些函数定义都比较显然，摘自作业。