Day 3 is a cute one. Again, simple lazy list processing: taking advantage of the standard library, or writing our own lazy recursive functions. One of those rare cases when using length is not frowned upon.
The Problem
We are given a dataset of unknown length. Lines have different even lengths, and look like a jumble of letters
vJrwpWtwJgWrhcsFMMfFFhFp\n
jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL\n
...
We are told to look at each line separately, considering its first and second half, to see if there's a shared element. For instance
vJrwpWtwJgWrhcsFMMfFFhFp
{ split in half }
vJrwpWtwJgWr hcsFMMfFFhFp
{ notice `p` is the single element in common }
There's a «priority» system: lowercase letters have values between 1 and 26, and uppercase letters between 27 and 52.
Main Program
The main program is pretty much the same as Day 1.
Parse and compute, all over again.
Part A asks to find the common element among both halves of each line, compute its priority, and total them up. So we're looking at this signature again
partA :: String -> Int
and can break its implementation into three parts:

1. Parse the input String into a list of lines, using lines.
2. Have a way to find the sharedElement among the two halves of a line, and compute its priority:

   sharedElement :: String -> Char
   priority :: Char -> Int

3. Add up the list with sum.
This leads to a high-level implementation like so
partA = sum . map (priority . sharedElement) . lines
You'll notice I'm using Int instead of Integer, because I'm going to take advantage of the Enum typeclass for computing the priority.
Haskell's Data.List library, part of GHC's «base», provides a lot of handy functions for dealing with lists beyond those in the standard Prelude. There's a group of functions that allow you to treat lists as if they were sets, so you can compute union, difference, and intersection, the latter being exactly what we need to find out the common element.
We write
sharedElement :: String -> Char
sharedElement r = head $ intersect f s
where
(f,s) = splitAt (length r `div` 2) r
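One subtlety worth knowing about intersect: it keeps every occurrence from its first argument that also appears in the second, duplicates included, so the result can hold more than one element; head just takes the first. A quick check of my own, using the two halves of the sample line:

```haskell
import Data.List (intersect)

-- intersect keeps the elements of the first list that also occur in
-- the second, preserving order and duplicates from the first list.
halvesDemo :: String
halvesDemo = "vJrwpWtwJgWr" `intersect` "hcsFMMfFFhFp"  -- "p"

dupsDemo :: String
dupsDemo = "aabc" `intersect` "ba"  -- "aab": duplicates survive
```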
As promised, this is one of those rare cases where you actually need to use length, because there's no other cheap and clever way to split the list in half preserving positions.
Computing the priority takes advantage of Char being an instance of Enum. The instance follows the reasonable mapping for ASCII (and Unicode) from characters to positive integers. This means 'a' is actually mapped to 97, its ASCII value, so we need to offset accordingly.
priority :: Char -> Int
priority c | isLower c = fromEnum c - fromEnum 'a' + 1
priority c | isUpper c = fromEnum c - fromEnum 'A' + 27
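To convince ourselves the offsets are right, we can spot-check the endpoints of the priority table, plus the 'p' from the worked example (the definition is repeated here so the snippet stands alone):

```haskell
import Data.Char (isLower, isUpper)

priority :: Char -> Int
priority c | isLower c = fromEnum c - fromEnum 'a' + 1
priority c | isUpper c = fromEnum c - fromEnum 'A' + 27

-- Endpoints of the spec, plus 'p' from the sample line:
checks :: [Int]
checks = map priority "azAZp"  -- [1,26,27,52,16]
```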
And that's all it takes to solve Part A.
Just change a tiny bit, again
Part B just changes things a bit. Firstly, we need to consider lines in groups of three at a time. Secondly, we need to find out the single character common to all three lines. We see that it's similar to Part A, so we can reuse what we have:
1. Parse the input String into a list of lines, using lines.
2. Have a way to grab chunksOf a particular size. This function actually exists in a non-base Haskell library, so I chose to implement it myself, and in a generic way, as one should:

   chunksOf :: Int -> [a] -> [[a]]

3. Find the commonElement for every group, and compute its priority:

   commonElement :: [String] -> Char

4. Add up the list with sum.
This leads to a high-level implementation like so

partB = sum . map (priority . commonElement) . chunksOf 3 . lines
In order to have an efficient solution, it becomes important that the «chunking» happens in a lazy fashion. That way, map can start finding the commonElement and computing priorities immediately, without having the whole list in memory at once. This means writing chunksOf in an explicitly recursive way
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)
It doesn't matter for this exercise, but it's important that you realize this implementation can actually handle infinite lists.
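To see that laziness at work, here's a check of my own (the definition is repeated so the snippet compiles on its own):

```haskell
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)

-- Each chunk is produced before the rest of the input is inspected,
-- so even an infinite list poses no problem:
firstTwo :: [[Int]]
firstTwo = take 2 (chunksOf 3 [1 ..])  -- [[1,2,3],[4,5,6]]
```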
In order to find the commonElement of a chunk of any size, we make the following observation: a chunk is nothing more than a list of lists, and the problem specification ensures there will be a single common element, therefore said element must be present in all lists of the chunk.
So, for any non-empty chunk of arbitrary length
[ l1, l2, l3, ..., lN ]
it follows that
l1 `intersect` l2 `intersect` l3 ... `intersect` lN
will find the common element among all the lists, the result being a single element list. This pattern of recursion is the familiar fold over non-empty lists, so we write
commonElement :: [String] -> Char
commonElement = head . foldr1 intersect
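A quick sanity check with a made-up three-line group (not from the puzzle input) where 'b' is the only shared character:

```haskell
import Data.List (intersect)

commonElement :: [String] -> Char
commonElement = head . foldr1 intersect

-- foldr1 threads intersect through the whole group:
--   "abc" `intersect` ("dbe" `intersect` "fbg") == "b"
badge :: Char
badge = commonElement ["abc", "dbe", "fbg"]  -- 'b'
```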
And that's all it takes to solve Part B.
Conclusion
List processing is the core of functional programming, but with a twist that sometimes escapes the beginner: instead of worrying about how long the list is (computing the length) and then using loops with counters, we worry about a way to process groups of elements at a time.

That's why we favor using map and the fold family of functions, as well as writing additional functions in a lazy recursive way. That is, we think of lists as control structures rather than mere data types. That is one of the «aha!» moments that will mark the transition from imperative thinking to declarative thinking. In fact, we think about any recursive data structure as a control structure, but more on that in future AOC problems.
More often than not, length is a terrible code smell in functional programming, especially in a lazy evaluation setting. If you find yourself writing Haskell and thinking «let's get the length and then count/move», you're not there yet: think harder and find a recursive solution.
This problem is cute in that the single use of length is entirely warranted.