This challenge felt like «busy work» more than «creative work». It boils down to keeping track of a grid's occupied positions, until a particle goes out of bounds or there's no room for more particles under the rules.

As usual for «grid-centric» problem solving in Haskell, is better to use a `Map` to focus on occupied positions. After defining a custom `Ord` instance to sort things the way I needed, it was mostly «busy work».

The simulation turned out adequate to illustrate why `unfoldr` matters, specially when you fuse into an hylomorphism...

## The Problem

We are given the description of a wall grid, in the form of horizontal and vertical segments. The grid is empty, except for the places where rocks form the horizontal and vertical segments. The sample input looks like

``````    498,4 -> 498,6 -> 496,6\n
503,4 -> 502,4 -> 502,9 -> 494,9\n
``````

Each pair of number between `->` describes a segment. The input guarantees that each pair has a common first element thus representing an horizontal segment, or a common second element thus representing a vertical segment.

A «grid unit» of sand is dropped from a particular starting position. It drops until it cannot progress further according to the movement rules (down, down left, down right, or stop). Units are blocked by the walls as defined by the grid, or by previously dropped units. Once a unit stops, a new one is dropped.

## Main Program

The main program is pretty much the same as Day 1.

## The Grid

When modeling «grid» problems, Haskell programmers tend to focus on what's being used in the grid, rather than on empty spaces. Only keeping track of used positions is surely space-efficient when compared with traditional arrays. That's why we choose a map that's optimized for the kind of position index we need to handle. For this problem, positions are two-dimensional, so it makes sense to use the traditional `Data.Map`.

``````    import qualified Data.Map as DM
``````

The `Data.Map` API imposes only one restriction on the type used as key for lookups

``````    DM.lookup :: Ord k => k -> DM.Map k a -> Maybe a
``````

Since most standard data types already have an `Ord` instance, and we can always ask the compiler to derive an automatic `Ord` instance, this is quite convenient: we could use plain tuples `(Int,Int)` as keys, and be done with that.

The default `Ord` instance for tuples follows the lexicographic order. That is, it sorts on the first component and then on the second component. This is fine for most applications, but not for this problem. Part A asks to detect when a unit is «falling into the void». Nothing more than detecting when its Y coordinate is larger than the largest Y coordinate for a block. If we model coordinates as `(x,y)`, finding the «largest» coordinate becomes contrived. If we model coordinates as `(y,x)` then we are surely to run into confusion.

I opted to create a custom type for `Pos`itions, and providing a custom `Ord` instance

``````    data Pos = Pos { x :: Int
, y :: Int
}
deriving (Show,Eq)

instance Ord Pos where
(Pos x0 y0) `compare` (Pos x1 y1) = case y0 `compare` y1 of
EQ -> x0 `compare` x1
c  -> c
``````

The compiler-derived `Eq` instance is fine. The custom `Ord` instance sorts based on the `y` coordinate first, comparing on `x` only to break ties.

As for dropping things into the grid, I separated the actual movement from the keeping track of grid contents. That means providing a simple API for `Pos` to figure out the three possible movements

``````    down :: Pos -> Pos
down (Pos x0 y0) = Pos x0 (succ y0)

downLeft :: Pos -> Pos
downLeft (Pos x0 y0) = Pos (pred x0) (succ y0)

downRight :: Pos -> Pos
downRight (Pos x0 y0) = Pos (succ x0) (succ y0)
``````

and just give a name for the fixed initial position whence units are dropped.

``````    origin :: Pos
origin = Pos { x = 500, y = 0 }
``````

Using `DM.lookup` returns a `Maybe`, so looking for empty positions would result in `Nothing`. I thought it would be a good idea to have a separate type to model what's occupying a grid position,

``````    data Content = Rock | Sand
deriving (Show,Eq)
``````

We need to parse our initial `DM.Map Pos Content` and write the functions to simulate the droppings.

## Building the initial grid while parsing

All the numbers in the input are positive, so we start with the usual

``````    parseInt :: Parser Int
parseInt = read <\$> many1 digit
``````

and then take advantage of `Parser` being an `Applicative` to turn a pair of numbers separated by a comma, into a proper `Pos`. Note the use of `(<*)` to take care of the comma, but returning the first `parseInt` result.

``````    parsePair :: Parser Pos
parsePair = Pos <\$> (parseInt <* char ',') <*> parseInt
``````

Now, each input line is just a collection of `Pos`itions separated by the arrow, therefore

``````    parseLine :: Parser [Pos]
parseLine = parsePair `sepBy1` string " -> "
``````

The problem description asserts that every input line describes a continuous segment. For every pair of `Pos` in the list, there's one common coordinate, so we can expand the whole segment between them, thus computing all grid elements that must be occupied with rocks.

We start by writing

``````    segments :: [Pos] -> [Pos]
segments ps = concat \$ zipWith points ps (tail ps)
where
points :: Pos -> Pos -> [Pos]
points (Pos x0 y0) (Pos x1 y1)
| x0 == x1   = [ Pos x0 y' | y' <- [min y0 y1 .. max y0 y1]]
| y0 == y1   = [ Pos x' y0 | x' <- [min x0 x1 .. max x0 x1]]
``````

Consider a list of `Pos` such as

``````    [p0,p1,p2,...,pN]
``````

we need to consider them two at a time, figure out if they represent a vertical or horizontal line, and generate all the points in between. The auxiliary `points` does just that for two `Pos`. Since the input could define these top-bottom, bottom-up, left-to-right, or right-to-left, we use `min` and `max` so the list comprehension generates all the points for the particular segment.

After writing `points` we should realize that in order to consider all pairs of `Pos` in the list, we can just `zipWith` the list with its own `tail`, e.g.

``````    ghci> let l = [1,2,3,4]
ghci> zip l (tail l)
[(1,2),(2,3),(3,4)]
``````

Using `zipWith points` we will process each pair of `Pos` on the original `[Pos]`, and produce a new `[Pos]` for each pair, resulting in a `[[Pos]]`, hence the `concat` to flatten all segments into a single `[Pos]`.

But we need to load multiple lines from the actual input. Each line can be parsed into a `[Pos]` thanks to `parseLine`, and we can use `sepEndBy` to parse them all into a `[[Pos]]`. We need to apply `segments` to every internal list, and then `concat` their results:

``````   parseInput :: Parser [Pos]
parseInput = concatMap segments <\$> parseLine `sepEndBy1` newline
``````

This will result in a `[Pos]` holding all the occupied positions for the initial grid. We can load them into the initial `DM.Map` with a convenient top-level parsing function

``````    loadGrid :: String -> DM.Map Pos Content
Left  _e -> undefined
where
loadWalls :: [Pos] -> DM.Map Pos Content
loadWalls = foldl' (\m p -> DM.insert p Rock m) DM.empty
``````

Loading a `DM.Map` using `foldl'` over the list of initial elements is a standard idiom: use an `DM.empty` map as accumulator, and perform the appropriate `DM.insert` as we go through each element of the list. In this case, just insert a `Rock` in place.

## Making things fall

The problem description explains that at every given time only one unit is falling until stopping. This lead me to explicitly decouple the «falling down» with the «keeping track». That is, I try and make a unit fall as much as possible one step at a time is separate from actual grid maintenance.

A unit's `fall` depends on the current grid state and the unit's `Pos`ition. It can either fall to a new position, or be stuck in place («landed»).

``````    fall :: DM.Map Pos Content -> Pos -> Pos
fall grid p0 = tryFalling [ down p0, downLeft p0, downRight p0 ]
where
tryFalling :: [Pos] -> Pos
tryFalling []     = p0
tryFalling (p:ps) = case DM.lookup p grid of
Nothing -> p
Just _  -> tryFalling ps
``````

The unit is in `Pos p0` so we `tryFalling` according to the rules. Each try looks into the current map to see if the would be next position is empty, in which case we can move there. If the would be next position is not empty, `tryFalling` in the next direction as per the rules. If there are no more alternatives to fall down, return the same position: the unit has landed. This is just one step where the unit can either fall down to a new position or stay stuck.

Part A asks to detect when a unit is falling into the void. This means we need to stop falling when we detect the lowest wall has been found. Therefore, we need to be able to `dropSandA` with a particular `maxY` bound passed as first argument. Also, this being a simulation, we need to signal whether or not the simulation has ended: my choosing a `Maybe` to wrap the newest resting position and updated state will become obvious later on.

``````    dropSandA :: Int -> DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
dropSandA maxY grid = dropAnother origin
where
dropAnother p
| y p' > maxY = Nothing
| p' == p     = Just (p', DM.insert p' Sand grid)
| otherwise   = dropAnother p'
where
p' = fall grid p
``````

The simulation is performed by `dropAnother` from its current `Pos`ition `p`. It tries to `fall` using the current `grid` onto a new `Pos`ition `p'`. If that new `Pos`ition is greater than the upper bound, the simulation is over. If that new `Pos`ition is the same as the original one, the unit has landed so we return it alongside the updated `grid`. Otherwise, we try falling again. Each unit has to drop from `origin`, so the top-level definition from the function just tries to `dropAnother origin`.

## Part A solution unfolds

The signature for `dropSandA` seems odd at first glance

``````    dropSandA :: Int -> DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
``````

Being a curried function, we can partially apply it to any given `Int` that would be the upper bound. That would result on a signature

``````    dropSandA 42 :: DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
``````

The description for `dropSandA` establishes that it attempts to simulate one dropping until it either succeeds or goes over the bound. In the former case we get a `Just` with the final position placed and the updated grid; the latter just provides a `Nothing`.

A simulation such as this is an instance of an «unbound repetition». We don't know how many units can fall (in fact, that's what we've been asked to figure out!). We know each step takes a current grid (a «seed») and can produce a new position and updated grid (a «result» and «new seed»), or signal the simulation has stopped when going out of bounds.

In the same way that «bound repetition» can be abstracted as a `fold`, «unbound repetition» can be abstracted as an `unfoldr`.

``````    unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
``````

The second argument is the initial «seed». The first argument is a function that's able to perform one step, taking the current seed to produce a result and a new seed, or signal the end of the simulation. Using `unfoldr` will automatically collect the intermediate results, and return them lazily as they are produced.

We can solve Part A with

``````    partA :: String -> Int
partA input = length \$ unfoldr (dropSandA boundary) grid0
where
boundary = y \$ fst \$ DM.findMax grid0
``````

We `loadGrid` our initial seed into `grid0`. Thanks to the custom `Ord` instance for the `Pos`ition type, we can use `DM.findMax` to get the `Pos`ition with the largest Y coordinate in `O(log n)` time, thus setting the `boundary`. We let `unfoldr` do its thing over `dropSandA boundary` working from the starting grid. It will lazily produce a list of «landed» `Pos`itions until the first one goes out of bounds. The `length` of the resulting list is the number of units that settled down before the first one went out of bounds.

## Part B is similar but hard to factor out

After solving Part A, the description for Part B keeps the same core simulation, but adds two new restrictions:

• The ground is assumed to be two levels after the maximum Y wall, spanning as far left and right as needed. This effectively means there's a new bound but for stopping in place, not interrupting the simulation.

• The simulation must run until enough units have landed such that it's not possible to drop another one. That is, everything has piled up to `origin`.

I wasn't able to easily refactor `dropSandA` into something that could work for both simulations (hence the name). I ended up writing `dropSandB` which follows the same strategy

``````    dropSandB :: Int -> DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
dropSandB maxY grid = dropAnother origin
where
originFull = isJust \$ DM.lookup origin grid
dropAnother p
| p' == origin && originFull = Nothing                           -- 1
| p' == p                    = Just (p', DM.insert p' Sand grid) -- 2
| y p' == pred maxY          = Just (p', DM.insert p' Sand grid) -- 3
| otherwise                  = dropAnother p'
where
p'         = fall grid p
``````

There are just two tweaks:

• Units «appear» at the `origin` and try to fall down from there. There will be a point where `origin` is not occupied, a new unit appears but can't progress beyond `origin` staying there -- that would be rule (2). The next unit will appear at the `origin`, will not be able to progress beyond `origin`, but can't stay there on account of the previous one. That's rule (1) signaling the end of the simulation.

• There's always the ground to stop units. They can lay there, and the simulation must continue. The ground is one unit above the bound. That's rule (3).

Rules (2) and (3) can be combined with `(||)` in the guard, I know. It's easier to explain the tweaks if they're separate, that's all.

Other than those tweaks, it's exactly the same approach in order for `dropSandB` to be hoisted by `unfoldr` after figuring out the «ground level» bound

``````    partB :: String -> Int
partB input = length \$ unfoldr (dropSandB boundary) grid0
where
boundary = 2 + y (fst \$ DM.findMax grid0)
``````

## Conclusion

Beginner Haskell programmers tend to write their recursive functions by hand. That's fine. They're learning the ropes. Once you understand the basic forms of recursion, you must stop writing it explicitly and turn to higher-order functions abstracting it.

Haskell programmers use data as control structures. When we want to go over all elements of a recursive data structure, we `fold`. It doesn't matter if it is list-like or tree-like... `Foldable` takes care of moving around and detecting when we're done -- we just need to say what to do at each step. These forms of recursion consume one element at a time and are said to «collapse» the original data structure into a resulting value. This recursion style is a catamorphism.

Conversely (or «dually»), when we want to build a data structure incrementally from an initial seed, we `unfold`. There is no `Unfoldable` type class, because there isn't a generalized API useful enough. We have `Data.List.unfoldr` able to generate a (potentially infinite) list one element at a time. The standard Haskell type `Data.Tree` also provides an `unfoldTree`. These forms of recursion produce one element a time, in a predictable order, and are said to «construct» a new data structure incrementally. This recursion style is an anamorphism.

They are sometimes enough for simple simulations such as this one -- that's why I used it. They can also be used to generate locally-finite globally-infinite data structures for localized traversals.

Learning to write all your recursion as composition of catamorphisms (`fold`) and anamorphisms (`unfoldr`) over the particular data types involved is a crucial skill. It allows you to separate the what to do from the how to move, leading to better decoupling and code reuse. It also improves performance:

• When there are multiple `fold`s in sequence, they can be combined into a single `fold`. The compiler can do it for you! This means less value passing, less function calling, better cache locality.

• Placing a `fold` after and `unfold`, is similar to having a consumer after a producer. The `unfold` will generate one element at a time, and the `fold` can process it immediately. The intermediate structure is never created. This form of recursion is known as an hylomorphism.

The simulation is `unfold`-based (anamorphism), and `length` is `fold`-based (catamorphism). Their composition is fused as a hylomorphism. The assembler looks written by hand, except it was not. High-order recursion wins.