Advent of Code 2022 -- Day 13

Haskell's type classes group types based on shared behavior. That's what we mean when we talk about types being Eq, Num, Functor, Foldable, Monad or whatever. Type classes require a minimal set of functions a type must support in order to belong to the class, and then provide derived functions with default implementations. Parametric polymorphism allows restrictions to function signatures, so that only types belonging to particular type classes can be passed as arguments.

This challenge is a textbook example of how efficient it is to take advantage of a custom implementation for a type class...

The Problem

We are presented with the notion of «packets». They have the form of nested lists of integers. The sample input looks like this

    [1,1,3,1,1]\n
    [1,1,5,1,1]\n
    \n
    [[1],[2,3,4]]\n
    [[1],4]\n
    \n
    [9]\n
    [[8,7,6]]\n

Each pair of packets must be compared to determine if they are in the correct order. The problem description goes into detail about how to compare two packets in particular to determine their ordering.

Main Program

The main program is pretty much the same as Day 1.

Packets and how to compare them

Haskell's lists are homogeneous, i.e. all elements must be of the exact same type, hence the signature [a]. The packets as described by this problem are heterogeneous lists: there can be plain integers or lists of integers at the same level, because nesting is possible. We'll start by defining a recursive data type to model this heterogeneous container in an homogeneous way

    data Packet = Atom Int
                | List [Packet]
                deriving (Show,Eq)

This type will let us turn the heterogeneous

    [[1],[2,3,4]]

into the homogeneous

    List [ List [ Atom 1 ], List [ Atom 2, Atom 3, Atom 4 ] ]

and even allow mixed-mode nesting, to turn the heterogeneous

    [[1],4]

into the homogeneous

    List [ List [ Atom 1 ], Atom 4 ]

The type definition requested the compiler to derive Show and Eq. The former let's us print (via show) values on screen, the latter allows us to compare (via (==)) values on our code. The compiler is able to generate the implementation for (==) automatically.

We are interested in being able to compare two values of type Packet to decide which one is «smaller». We'd like to use (<=) directly, and that's something that type class Ord provides. The compiler can generate an Ord instance automatically, but the ordering would not match the one required by this problem. Go read the documentation to find out why.

Haskell allows you to implement your own instance for any type class you see fit. For the Ord type class

    λ> :info Ord
    type Ord :: * -> Constraint
    class Eq a => Ord a where
      compare :: a -> a -> Ordering
      (<) :: a -> a -> Bool
      (<=) :: a -> a -> Bool
      (>) :: a -> a -> Bool
      (>=) :: a -> a -> Bool
      max :: a -> a -> a
      min :: a -> a -> a
      {-# MINIMAL compare | (<=) #-}

we see that we either provide the implementation for (<=) or for compare, and in exchange all the other functions will have working default implementations. Not only that, any function having the Ord a restriction can now be used on Packet.

Here's the custom Ord instance for Packet providing (<=) I wrote

    instance Ord Packet where
      Atom left <= Atom right = left             <= right
      List lps  <= List rps   = lps              <= rps
      Atom left <= List rps   = List [Atom left] <= List rps
      List lps  <= Atom right = List lps         <= List [Atom right]

This instance implements the rules laid out by the problem description, and I'm not going to explain them again. There are a few comments related to the polymorphic use of (<=) that are worth mentioning:

All the uses of (<=) on the left side of the equals (=) are the ones we are defining for Packet. That is, they are comparing Packet on the left with Packet on the right.
The use of (<=) on the right side of the equals for the first rule, is comparing two Ints and is already defined. It's comparing the values held by the Atom constructor, which are Int.
The use of (<=) on the right side of the equals for the second rule, is comparing two [a] at the list level. That is, it's using the Ord instance for [a]. This is convenient because it matches exactly what the Packet comparison for lists needs.
The uses of (<=) on the right side of the equals for the third and fourth rules, are comparing two Packet. This is just recursion.

And that's all it takes for all the comparison operators to work over our Packet type.

Parsing packets

A Packet is a recursive structure. Parsing one is too.

    parsePacket :: Parser Packet
    parsePacket = Atom . read <$> many1 digit
               <|> List <$> between (char '[')
                                    (char ']')
                                    (parsePacket `sepBy` char ',')

For the base case, we parse one or more digits, read them into an Int, and lift it into a Packet using the Atom constructor. For the recursive case, we take advantage of the between parser combinator provided by Parsec: find zero or more packets separated by a comma, between squared brackets, and lift the resulting [Packet] into a Packet using the List constructor.

The input file has pairs of packets separated by an empty line. We're asked to compare each pair, so it makes sense to parse two lines into a tuple to keep them together

    pairPackets :: Parser (Packet,Packet)
    pairPackets = (,) <$> parsePacket <* newline
                      <*> parsePacket <* newline

This is a nice example of Parsec's Applicative API. By writing

    parsePacket <* newline

we ask for both parsers to be run in sequence, but to keep the value produced by parsePacket. By writing

    (,) <$> p0 <*> p1

we are asking to apply the tuple constructor to the results of p0 and p1. This effectively parses two consecutive lines and keeps the resulting Packets in a tuple, in the same order they are in the input file.

Reading the whole file as a list of tuples can be written as

    parseInput :: Parser [(Packet,Packet)]
    parseInput = pairPackets `sepBy1` newline

and we finish up with a nice top-level parsing function that actually runs the parser, returning the unadorned list of tuples

    loadPackets :: String -> [(Packet,Packet)]
    loadPackets input = case parse parseInput "loadPackets" input of
                          Right ps -> ps
                          Left  _e -> undefined

Solving Part A

Each pair of packets position is relevant to the problem. The first pair of packets is at position 1. Part A asks to add up the positions for all packets that are in the correct order.

Thanks to the Ord instance for Packet, we can use (<=) to compare two of them. But (<=) is a curried function, as it should

    ghci> :type (<=)
    (<=) :: Ord a => a -> a -> Bool

and we are parsing our input into [(Packet,Packet)]. This is a perfect time to

    ghci> :type uncurry
    uncurry :: (a -> b -> c) -> (a, b) -> c

which turns any curried function into one that takes a tuple as argument. Now we can use uncurry (<=) over each tuple of our list. We reason

    ghci> :type loadPackets
    loadPacket :: String -> [(Packet,Packet)]
    ghci> :type map (uncurry (<=)) . loadPackets
    map (uncurry (<=)) . loadPackets :: [Bool]

and the result would be whether or not the particular packets are in order. Now, to add the position index to every result, we reason

    ghci> :type zip [1..] . map (uncurry (<=)) . loadPackets
    zip [1..] . map (uncurry (<=)) . loadPackets :: (Num a, Enum a) => String -> [(a, Bool)]

where the restrictions state that it's some kind of Enumerable Number type. But we're only interested on those pair where the result was True, so

    ghci> :type filter snd . zip [1..] . map (uncurry (<=)) . loadPackets
    filter snd . zip [1..] . map (uncurry (<=)) . loadPackets :: (Num a, Enum a) => String -> [(a, Bool)]

Now we need to extract the first component from each tuple, and add them up

    partA :: String -> Int
    partA = sum
          . map fst
          . filter snd
          . zip [1..]
          . map (uncurry (<=))
          . loadPackets

Solving Part B

Pairs don't matter any more. We are told to bunch all packets together, add two special marker packets, and put everything in ascending order. Figure out the position where these marker packets end up on the sorted list, and multiply them.

The marker packets are

    marker2 :: Packet
    marker2 = List [ List [ Atom 2 ] ]

    marker6 :: Packet
    marker6 = List [ List [ Atom 6 ] ]

We can use loadPackets to load as [(Packet,Packet)], but since the pairing is irrelevant, we can use

    ghci> :type unzip
    unzip :: [(a, b)] -> ([a], [b])

to end up with a list holding «all the first» and a list holding «all the seconds», in a pair. Now, we can use uncurry (++) over said pair to combine both lists into one, behind the desired markers

   ghci> :type (\p -> [marker2,marker6] ++ uncurry (++) p) . unzip . loadPackets 
   (\p -> [marker2,marker6] ++ uncurry (++) p) . unzip . loadPackets :: String -> Packet

Thanks to parametric polymorphism we can use

   ghci> :type Data.List.sort
   Data.List.sort :: Ord a => [a] -> [a]

to sort our [Packet]. Using the same techniques as seen on partA we write

    partB :: String -> Int
    partB = product
          . map fst
          . filter (\(_,s) -> s `elem` [marker2,marker6])
          . zip [1..]
          . sort
          . (\p -> [marker2,marker6] ++ uncurry (++) p)
          . unzip
          . loadPackets

to complete the solution.

Conclusion

Familiarity with Haskell's type classes is essential to approaching many programming problems in a clean straightforward way. GHC's ability to automatically derive type class instances for user defined types is rooted in Algebraic Type Theory: when you use recursive sum and product types, the compiler can deduce a reasonable instance and generate code for you. This applies to any type-class, be it Haskell standard, provided by a library, or defined on your own.

The more type class behaviors you can recognize and use, the easier will be to design types and write programs to take advantage of them. Knowing when to write your own Show, Ord, Semigroup, Monoid, and possibly Num can lead to general solutions that compose better.

Identifying and using Monoid, Functor, Applicative and Monad is a must. There is a set of standard Haskell type classes described in the Typeclassopedia, with reasonably good descriptions of their usage.

Many libraries provide their own type classes, usually accompanied with implementations for Haskell's standard types. For instance aeson comes with typeclasses FromJSON and ToJSON providing functionality for exactly what you're thinking.

Type classes and parametric polymorphism restrictions are superior to interfaces and even roles, as found in OO Programming Languages, because you don't have to write additional code to fulfill the interface, the compiler can generate default implementations for your types, the code is not encumbered with runtime dispatch costs, and you can have more than one instance for the same base type thanks to newtype wrappers at zero runtime cost.

Classier than your run of the mill classes.