Advent of Code 2022 -- Day 8

There's not much to this challenge. You get a two dimensional grid, you have to go over it and check things relative to each position. It's meant for imperative arrays, with nested loops, I guess...

At least I got to be clever about parsing an array without knowing it's dimensions in advance, and use the «generative array» Haskell trick. I have that going on for me...

The Problem

We are given a two dimensional array. Each position holds a single integer, to be interpreted as elevation for that position.

The example input looks like this

For each position (row,col), one most consider the other positions sharing the same row or column. A position is visible if the elevation at said position is greater than the elevations of all other positions towards any edge, either row-wise left or right, or column-wise top or bottom. Naturally, this makes edge positions automatically visible.

Quite boring.

Main Program

The main program is pretty much the same as Day 1.

Arrays in a functional setting

The «array mentality» is strongly tied to plain iteration and the need to know «the size of things». It's one of the first things an imperative programmer learns, and one of the hardest things to forget when you change paradigms. Using lists as if they were arrays is a terrible idea.

Haskell has a full set of Arrays. With polymorphic dimensions and a rich API to move around. This rarely ever comes up when teaching introductory courses, because they are not «general purpose». Pure arrays are meant to be written once and read as many times as needed; modifying any position on a pure array means copying the whole thing.

This problem is amenable for solving with pure arrays:

Parse the input in one pass and load it into a pure array.
Go over the positions of the new array and create a new array computing the desired result per position. This would be the boring «nested loop» part.
Collapse the array to compute the solution.

We can do this without knowing the dimensions in advance, and never needing them for the «nested loops». This will hopefully show how Haskell arrays are nicer than your run of the mill imperative arrays.

Parsing the input array

We'll be using Haskell's standard library Data.Array. This library provide «pure arrays», meant to be loaded once and read many times -- O(1) time for reads, if you must know. The basic way to create a pure array is with, well

    array :: Ix i => (i, i) -> [(i, e)] -> Array i e

where i is any type that can be used as an array index. As you can imagine, you can use Int for indexing, so

    array :: (Int,Int) -> [(Int, e)] -> Array Int e

will help you create a one-dimensional array from lower to upper bounds, storing elements of type e. The second argument is supposed to be a list with all the elements to be stored in the array. The order respective to the index doesn't matter, as array will place them correctly before «freezing» the grid.

One can use tuples (Int,Int) to create two-dimensional arrays. The astute reader should've figured by now that arrays indexes can start and end on whatever number you need, or even be Char, or String, or any Enum. Maybe this will come up in future challenges, but for this one, tuples it is.

Going back to how to use array, we will need to provide the dimensions of the array as second argument. And each element of the list must also carry correct row and column information, for each to be placed correctly. As usual, I will not cheat by looking at the actual input and hardcoding dimensions, but try to do it «en passant».

The Array we need

We would like to use array instantiated as

    array :: ((Int,Int),(Int,Int))   -- Two dimensions
          -> [((Int,Int), Int)]      -- Each position holds an Int
          -> Array (Int,Int) Int

which means our top-level parser should produce

    [((Int,Int), Int)]

to feed as second argument to array. As for the first argument, the only thing we know for sure is

    ((0,0),(?,?))

If we build the list with proper row and column position for each element, we can figure the maximum coordinate to provide as second argument. The trick is, precisely, computing proper row and column positions for each element...

Implicit counters

We don't need a stateful parser like on Day 7, but you're free to try. A plain Parser will do.

Parsing a single row is straightforward using Parsec combinators and Applicative:

    parseRow :: Parser [Int]
    parseRow = many1 (digitToInt <$> digit) <* newline

Just grab one digit at a time and collect them as a list. Notice the use of (<*) to run both parsers in sequence, but keeping what the first one returns. Now, we still don't know how many Int's we collected.

We set on to parse the full matrix. Of course it's just a matter of using parseRow as many times as possible, until we hit the end of input (eof). Here's the trick:

While parsing row r, we can assign precisely row r to every element.
We can pair each element with its column position by taking enough elements from the infinite list [0..] by lettting lazy evaluation do its thing, e.g. using a String as the list of elements and assuming we're on row 42, something like
```
  ghci> zip (zip (repeat 42) [0..]) "foo"
  [((42,0),'f'),((42,1),'o'),((42,2),'o')]
```
After parsing row r, we continue parsing a matrix, but starting at row r+1. Recursion!

Build the correct list of elements and their positions for row r, and place them in front of the list of lists for the rest of the matrix.

  parseMatrix :: Int -> Parser [((Int,Int),Int)]
  parseMatrix r = (eof >> pure [])
               <|> do ds <- parseRow
                      rs <- parseMatrix (succ r)
                      pure $ zip (zip (repeat r) [0..]) ds ++ rs

When we start parsing our matrix, we are on row 0, therefore

    parseInput :: Parser [((Int,Int),Int)]
    parseInput = parseMatrix 0

We are now in a position to use array after parsing our input. You can think my use of maximum for figuring out the largest tuple is cheating, but turns out it is as expensive as carrying it in the parser. You can do the math, I won't check it.

    loadForest :: String -> Array (Int,Int) Int
    loadForest input = case parse parseInput "loadForest" input of
                         Left _e -> undefined
                         Right l -> array ((0,0),maximum $ map fst l) l

Yes, we can load a two-dimensional array with proper row and column information without needing to know the length of each line or the input file in advance. As long as all lines have the same length, of course.

Parametric polymorphism is the answer

The Array API provides operator (!) for efficiently reading the element at any particular position. We could use it with lots of recursion to simulate nested loops. That would be Ugly, Sad, and Not Declarative.

Here's the nifty trick, in the form of a polymorphic function:

    mkArray :: Ix i => (i -> e) -> (i,i) -> Array i e
    mkArray f bs = array bs [ (i, f i) | i <- range bs ]

Function f is able to compute the element that should go in a particular position. The second argument provides the bounds for the array. Using Data.Array range we cover all the array space for the particular bounds, feeding f to effectively create a new pure Array.

Now, all we need to do is come up with the proper f to compute the solution we are looking for based on the input array, and mkArray will build the answer array. Since mkArray is polymorphic, the answers could have different contents -- planning ahead for Part B, obviously.

Solving Part A

As discussed before, for each position in the input array, we need to check all positions in the same row to the left, in the same row to the right, in the same column up, and in the same column down.

Solving Part A can be accomplished by creating a new array with the same shape, but holding Bool to state whether or not a position is visible.

Array functions tend to be very busy, on account of having to keep track of dimensions, bounds, and whatnot.

    visibleTrees :: Array (Int,Int) Int -> Array (Int,Int) Bool
    visibleTrees forest = mkArray visible dimensions
      where
        dimensions                          = bounds forest
        ((minR,minC),(maxR,maxC))           = dimensions

        visible (_,c) | c == 0 || c == maxC = True
        visible (r,_) | r == 0 || r == maxR = True
        visible (r,c)                       = 
          or [ and [ forest ! (rb,c) < forest ! (r,c) | rb <- [minR..pred r] ]
             , and [ forest ! (ra,c) < forest ! (r,c) | ra <- [succ r..maxR] ]
             , and [ forest ! (r,bc) < forest ! (r,c) | bc <- [minC..pred c] ]
             , and [ forest ! (r,ac) < forest ! (r,c) | ac <- [succ c..maxC] ]
             ]
                                                 ]

Since we'll be using mkArray to build the solution array, we must compute the input array dimensions using Array API bounds. We use pattern-matching to extract dimensions parts, as we'll need them for the internal lookups.

Finally, we provide function visible that, given a particular position, decides whether or not it is visible over any direction:

Any position sitting on a boundary row or column is visible, no other checks needed.
For position (r,c) to be visible, it must be taller than all neighboring positions per direction (left, right, up, down) on at least one of them:o
- Check all positions on a particular direction. Here's the only place we need to use (!) to look on the input forest and compare with the current position. Notice and is short-circuited: lists will be traversed in full only if reaching the edge is needed.
- Using short-circuited or also helps: worst case scenario is having to check all directions when the position is not visible.

To round up the solution for Part A, we use elems as provided by Array API. It simply lists all elements from any Array. The list will be in outer to inner dimension order. It doesn't matter for this particular problem. We reason as follows:

    ghci> :type loadForest
    loadForest :: String -> Array (Int, Int) Int
    ghci> :type visibleTrees . loadForest
    visibleTrees . loadForest :: String -> Array (Int, Int) Bool
    ghci> :type elems . visibleTrees . loadForest
    elems . visibleTrees . loadForest :: String -> [Bool]

We just need to count how many are True, so we write

    partA :: String -> Int
    partA = length . filter id . elems . visibleTrees . loadForest

We needed the length of things in the end --badum tss--

Solving Part B

The approach for solving Part B is similar. For each position in the input array, we need to «spread out» checking all positions in the same row to the left, in the same row to the right, in the same column up, and in the same column down. On every direction we need to count how far we can go until we find the first element that is greater or equal to the one in the position, or the edge, whatever happens first. The «scenic value» for the position, is the product of how far can we go in each of the four directions.

Solving Part B can be accomplished by creating a new array with the same shape, but holding Int to hold the scenic value. Pretty much the same structure as before, just changing the auxiliary function scenery passed to mkArray.

    scenicScore :: Array (Int,Int) Int -> Array (Int,Int) Int
    scenicScore forest = mkArray scenery dimensions
      where
        dimensions                          = bounds forest
        ((minR,minC),(maxR,maxC))           = dimensions

        scenery (_,c) | c == 0 || c == maxC = 0
        scenery (r,_) | r == 0 || r == maxR = 0
        scenery (r,c)                       =
          product [ length ((r,c) `seesOver` [(r,c') | c' <- reverse [minC..pred c]])
                  , length ((r,c) `seesOver` [(r,c') | c' <- [succ c..maxC]])
                  , length ((r,c) `seesOver` [(r',c) | r' <- reverse [minR..pred r]])
                  , length ((r,c) `seesOver` [(r',c) | r' <- [succ r..maxR]])
                  ]

        seesOver _    []        = []
        seesOver origin (p:ps) = if forest ! origin > forest ! p
                                    then p : seesOver origin ps
                                    else [p]

Again, list comprehensions to construct the list of positions to consider in each direction. This time we need to compute their length in order to count them, and compute their product to figure out the positions scenic value.

Ranges within list comprehensions should be processed from the position outwards, that's why for left and right we need to build them from smaller to larger, and then reverse them. Using Enum API would be an alternative.

Finally, I had to write seesOver because neither dropWhile nor takeWhile were good fits. Can you figure out why?

Solving Part B is just a matter of find out the element having the maximum product.

    partB :: String -> Int
    partB = maximum . elems . scenicScore . loadForest

That's it.

Conclusion

Functional programmers rarely ever use arrays. Arrays aren't as good control structures as Foldables are. The only way to process arrays is with external information (bounds, dimensions) and control (recursion). That doesn't compose.

We reach out to Arrays for a few problems, mostly those that fit dynamic programming or hardware-assisted numerical computation. Array is helpful for dynamic programming, particularly if you need to build each position in the array via a computation that involves previous positions (a technique known as «dovetailing»). It's also helpful when your input data is going to be pretty much immutable, and you're going to derive results by repeated reads, as in this challenge. It is not useful if you're trying to implement algorithms that try to change array positions.

There are just a few array-based algorithms for which an equivalent non-array solution has been found yet. For those, you might want to explore Data.Array.ST and Data.Array.IO. But before trying to replicate your array-based algorithm in Haskell using arrays, do some research to find out if it has already been solved. You'd be surprised.

As for this challenge, the input was an Array, and the intermediate solution was also and Array, built using higher-order functions while looking at arrays as if they were functions from the index domain into the element codomain. Because, they are.