Day 3 is a cute one. Again, simple lazy list processing: taking advantage of the standard library, or writing our own lazy recursive functions. One of those rare cases when using length is not frowned upon.
The Problem
We are given a dataset of unknown length. Lines have different even lengths, and look like a jumble of letters
vJrwpWtwJgWrhcsFMMfFFhFp\n
jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL\n
...
We are told to look at each line separately, considering its first and second half, to see if there's a shared element. For instance
vJrwpWtwJgWrhcsFMMfFFhFp
{ split in half }
vJrwpWtwJgWr hcsFMMfFFhFp
{ notice `p` is the single element in common }
There's a «priority» system: lowercase letters have values between 1 and 26, and uppercase letters between 27 and 52.
Main Program
The main program is pretty much the same as Day 1.
Parse and compute, all over again.
Part A asks to find the common element among both halves of each line, compute its priority, and total them up. So we're looking at this signature again
partA :: String -> Int
and can break its implementation into three parts:

1. Parse the input String into a list of lines, using lines.
2. Have a way to find the sharedElement among the two halves of a line, and compute its priority:

   sharedElement :: String -> Char
   priority :: Char -> Int

3. Add up the list with sum.
This leads to a high-level implementation like so
partA = sum . map (priority . sharedElement) . lines
You'll notice I'm using Int instead of Integer, because I'm going to take advantage of the Enum typeclass for computing the priority.
Haskell's Data.List library, part of GHC's «base», provides a lot of handy functions for dealing with lists beyond those in the standard Prelude. There's a group of functions that allow you to treat lists as if they were sets, so you can compute union, difference, and intersection, the latter being exactly what we need to find out the common element.
We write
sharedElement :: String -> Char
sharedElement r = head $ intersect f s
where
(f,s) = splitAt (length r `div` 2) r
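One subtlety worth knowing about intersect: it keeps every occurrence from its first argument that also appears in the second, duplicates included, so the result can hold more than one element; head just takes the first. A quick check of my own, using the two halves of the sample line:

```haskell
import Data.List (intersect)

-- intersect keeps the elements of the first list that also occur in
-- the second, preserving order and duplicates from the first list.
halvesDemo :: String
halvesDemo = "vJrwpWtwJgWr" `intersect` "hcsFMMfFFhFp"  -- "p"

dupsDemo :: String
dupsDemo = "aabc" `intersect` "ba"  -- "aab": duplicates survive
```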
As promised, this is one of those rare cases where you actually need to use length, because there's no other cheap and clever way to split the list in half preserving positions.
Computing the priority takes advantage of Char being an instance of Enum. The instance follows the reasonable mapping for ASCII (and Unicode) from characters to positive integers. This means 'a' is actually mapped to 97, its ASCII value, so we need to offset accordingly.
priority :: Char -> Int
priority c | isLower c = fromEnum c - fromEnum 'a' + 1
priority c | isUpper c = fromEnum c - fromEnum 'A' + 27
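To convince ourselves the offsets are right, we can spot-check the endpoints of the priority table, plus the 'p' from the worked example (the definition is repeated here so the snippet stands alone):

```haskell
import Data.Char (isLower, isUpper)

priority :: Char -> Int
priority c | isLower c = fromEnum c - fromEnum 'a' + 1
priority c | isUpper c = fromEnum c - fromEnum 'A' + 27

-- Endpoints of the spec, plus 'p' from the sample line:
checks :: [Int]
checks = map priority "azAZp"  -- [1,26,27,52,16]
```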
And that's all it takes to solve Part A.
Just change a tiny bit, again
Part B just changes things a bit. Firstly, we need to consider lines in groups of three at a time. Secondly, we need to find out the single character common to all three lines. We see that it's similar to Part A, so we can reuse what we have:
1. Parse the input String into a list of lines, using lines.
2. Have a way to grab chunksOf a particular size. This function actually exists in a non-base Haskell library, so I chose to implement it myself, and in a generic way, as one should:

   chunksOf :: Int -> [a] -> [[a]]

3. Find the commonElement for every group, and compute its priority:

   commonElement :: [String] -> Char

4. Add up the list with sum.
This leads to a high-level implementation like so

partB = sum . map (priority . commonElement) . chunksOf 3 . lines
In order to have an efficient solution, it becomes important that the «chunking» happens in a lazy fashion. That way, map can start finding the commonElement and computing priorities immediately, without having the whole list in memory at once. This means writing chunksOf in an explicitly recursive way
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)
It doesn't matter for this exercise, but it's important that you realize this implementation can actually handle infinite lists.
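To see that laziness at work, here's a check of my own (the definition is repeated so the snippet compiles on its own):

```haskell
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)

-- Each chunk is produced before the rest of the input is inspected,
-- so even an infinite list poses no problem:
firstTwo :: [[Int]]
firstTwo = take 2 (chunksOf 3 [1 ..])  -- [[1,2,3],[4,5,6]]
```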
In order to find the commonElement of a chunk of any size, we make the following observation: a chunk is nothing more than a list of lists, and the problem specification ensures there will be a single common element, therefore said element must be present in all lists of the chunk.
So, for any non-empty chunk of arbitrary length
[ l1, l2, l3, ..., lN ]
it follows that
l1 `intersect` l2 `intersect` l3 ... `intersect` lN
will find the common element among all the lists, the result being a single element list. This pattern of recursion is the familiar fold over non-empty lists, so we write
commonElement :: [String] -> Char
commonElement = head . foldr1 intersect
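A quick sanity check with a made-up three-line group (not from the puzzle input) where 'b' is the only shared character:

```haskell
import Data.List (intersect)

commonElement :: [String] -> Char
commonElement = head . foldr1 intersect

-- foldr1 threads intersect through the whole group:
--   "abc" `intersect` ("dbe" `intersect` "fbg") == "b"
badge :: Char
badge = commonElement ["abc", "dbe", "fbg"]  -- 'b'
```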
And that's all it takes to solve Part B.
Conclusion
List processing is the core of functional programming, but with a twist that sometimes escapes the beginner: instead of worrying about how long the list is (computing the length) and then using loops with counters, we worry about a way to process groups of elements at a time.

That's why we favor using map and the fold family of functions, as well as writing additional functions in a lazy recursive way. That is, we think of lists as control structures rather than mere data types. That is one of the «aha!» moments that will mark the transition from imperative thinking to declarative thinking. In fact, we think about any recursive data structure as a control structure, but more on that in future AOC problems.
More often than not, length is a terrible code smell in functional programming, especially in a lazy evaluation setting. If you find yourself writing Haskell and thinking «let's get the length and then count/move», you're not there yet: think harder and find a recursive solution.
This problem is cute in that the single use of length is entirely warranted.