This challenge felt like «busy work» more than «creative work». It boils down to keeping track of a grid's occupied positions, until a particle goes out of bounds or there's no room for more particles under the rules.

As usual for «grid-centric» problem solving in Haskell, is better
to use a `Map`

to focus on *occupied* positions. After defining a
custom `Ord`

instance to sort things the way I needed, it was
mostly «busy work».

The simulation turned out adequate to illustrate why `unfoldr`

matters, specially when you fuse into an *hylomorphism*...

## The Problem

We are given the description of a wall grid, in the form of horizontal and vertical segments. The grid is empty, except for the places where rocks form the horizontal and vertical segments. The sample input looks like

```
498,4 -> 498,6 -> 496,6\n
503,4 -> 502,4 -> 502,9 -> 494,9\n
```

Each pair of number between `->`

describes a segment. The input
guarantees that each pair has a common first element thus representing
an horizontal segment, or a common second element thus representing
a vertical segment.

A «grid unit» of sand is dropped from a particular starting position. It drops until it cannot progress further according to the movement rules (down, down left, down right, or stop). Units are blocked by the walls as defined by the grid, or by previously dropped units. Once a unit stops, a new one is dropped.

## Main Program

The main program is pretty much the same as Day 1.

## The Grid

When modeling «grid» problems, Haskell programmers tend to focus on
what's being *used* in the grid, rather than on empty spaces. Only
keeping track of used positions is surely space-efficient when compared
with traditional arrays. That's why we choose a map that's optimized
for the kind of *position index* we need to handle. For this problem,
positions are two-dimensional, so it makes sense to use the traditional
`Data.Map`

.

```
import qualified Data.Map as DM
```

The `Data.Map`

API imposes only one restriction on the type used as key
for lookups

```
DM.lookup :: Ord k => k -> DM.Map k a -> Maybe a
```

Since most standard data types already have an `Ord`

instance, and we
can always ask the compiler to derive an automatic `Ord`

instance, this
is quite convenient: we could use plain tuples `(Int,Int)`

as keys,
and be done with that.

The default `Ord`

instance for tuples follows the lexicographic order.
That is, it sorts on the *first* component and then on the *second* component.
This is fine for most applications, but not for this problem. Part A
asks to detect when a unit is «falling into the void». Nothing
more than detecting when its Y coordinate is larger than the largest
Y coordinate for a block. If we model coordinates as `(x,y)`

, finding
the «largest» coordinate becomes contrived. If we model coordinates
as `(y,x)`

then we are surely to run into confusion.

I opted to create a custom type for `Pos`

itions, and providing a
custom `Ord`

instance

```
data Pos = Pos { x :: Int
, y :: Int
}
deriving (Show,Eq)
instance Ord Pos where
(Pos x0 y0) `compare` (Pos x1 y1) = case y0 `compare` y1 of
EQ -> x0 `compare` x1
c -> c
```

The compiler-derived `Eq`

instance is fine. The custom `Ord`

instance
sorts based on the `y`

coordinate first, comparing on `x`

only to break
ties.

As for dropping things into the grid, I separated the actual
movement from the keeping track of grid contents. That means providing
a simple API for `Pos`

to figure out the three possible movements

```
down :: Pos -> Pos
down (Pos x0 y0) = Pos x0 (succ y0)
downLeft :: Pos -> Pos
downLeft (Pos x0 y0) = Pos (pred x0) (succ y0)
downRight :: Pos -> Pos
downRight (Pos x0 y0) = Pos (succ x0) (succ y0)
```

and just give a name for the fixed initial position whence units are dropped.

```
origin :: Pos
origin = Pos { x = 500, y = 0 }
```

Using `DM.lookup`

returns a `Maybe`

, so looking for empty positions would
result in `Nothing`

. I thought it would be a good idea to have a separate
type to model what's occupying a grid position,

```
data Content = Rock | Sand
deriving (Show,Eq)
```

We need to parse our initial `DM.Map Pos Content`

and write the functions
to simulate the droppings.

## Building the initial grid while parsing

All the numbers in the input are positive, so we start with the usual

```
parseInt :: Parser Int
parseInt = read <$> many1 digit
```

and then take advantage of `Parser`

being an `Applicative`

to turn a pair
of numbers separated by a comma, into a proper `Pos`

. Note the use
of `(<*)`

to take care of the comma, but returning the first `parseInt`

result.

```
parsePair :: Parser Pos
parsePair = Pos <$> (parseInt <* char ',') <*> parseInt
```

Now, each input line is just a collection of `Pos`

itions separated
by the arrow, therefore

```
parseLine :: Parser [Pos]
parseLine = parsePair `sepBy1` string " -> "
```

The problem description asserts that every input *line* describes
a continuous segment. For every pair of `Pos`

in the list, there's
one common coordinate, so we can *expand* the whole segment between
them, thus computing all grid elements that must be occupied with
rocks.

We start by writing

```
segments :: [Pos] -> [Pos]
segments ps = concat $ zipWith points ps (tail ps)
where
points :: Pos -> Pos -> [Pos]
points (Pos x0 y0) (Pos x1 y1)
| x0 == x1 = [ Pos x0 y' | y' <- [min y0 y1 .. max y0 y1]]
| y0 == y1 = [ Pos x' y0 | x' <- [min x0 x1 .. max x0 x1]]
```

Consider a list of `Pos`

such as

```
[p0,p1,p2,...,pN]
```

we need to consider them two at a time, figure out if they represent a
vertical or horizontal line, and generate all the points in between.
The auxiliary `points`

does just that for two `Pos`

. Since the input
could define these top-bottom, bottom-up, left-to-right, or right-to-left,
we use `min`

and `max`

so the list comprehension generates all the points
for the particular segment.

After writing `points`

we should realize that in order to consider all
pairs of `Pos`

in the list, we can just `zipWith`

the list with its own
`tail`

, e.g.

```
ghci> let l = [1,2,3,4]
ghci> zip l (tail l)
[(1,2),(2,3),(3,4)]
```

Using `zipWith points`

we will process each pair of `Pos`

on the original `[Pos]`

,
and produce a new `[Pos]`

for *each pair*, resulting in a `[[Pos]]`

, hence
the `concat`

to flatten all segments into a single `[Pos]`

.

But we need to load *multiple* lines from the actual input. Each line can
be parsed into a `[Pos]`

thanks to `parseLine`

, and we can use `sepEndBy`

to
parse them all into a `[[Pos]]`

. We need to apply `segments`

to every
internal list, and then `concat`

their results:

```
parseInput :: Parser [Pos]
parseInput = concatMap segments <$> parseLine `sepEndBy1` newline
```

This will result in a `[Pos]`

holding all the occupied positions for
the initial grid. We can load them into the initial `DM.Map`

with a
convenient top-level parsing function

```
loadGrid :: String -> DM.Map Pos Content
loadGrid input = case parse parseInput "loadGrid" input of
Right ws -> loadWalls ws
Left _e -> undefined
where
loadWalls :: [Pos] -> DM.Map Pos Content
loadWalls = foldl' (\m p -> DM.insert p Rock m) DM.empty
```

Loading a `DM.Map`

using `foldl'`

over the list of initial elements is
a standard idiom: use an `DM.empty`

map as accumulator, and perform
the appropriate `DM.insert`

as we go through each element of the list.
In this case, just insert a `Rock`

in place.

## Making things fall

The problem description explains that at every given time only
*one* unit is falling until stopping. This lead me to explicitly
decouple the «falling down» with the «keeping track». That is,
I try and make a unit fall as much as possible *one step at
a time* is separate from actual grid maintenance.

A unit's `fall`

depends on the current grid state and the unit's
`Pos`

ition. It can either fall to a new position, or be stuck
in place («landed»).

```
fall :: DM.Map Pos Content -> Pos -> Pos
fall grid p0 = tryFalling [ down p0, downLeft p0, downRight p0 ]
where
tryFalling :: [Pos] -> Pos
tryFalling [] = p0
tryFalling (p:ps) = case DM.lookup p grid of
Nothing -> p
Just _ -> tryFalling ps
```

The unit is in `Pos p0`

so we `tryFalling`

according to the rules. Each try
looks into the current map to see if the would be next position is
empty, in which case we can move there. If the would be next position
is not empty, `tryFalling`

in the next direction as per the rules. If
there are no more alternatives to fall down, return the same position:
the unit has landed. This is just *one step* where the unit can either
fall down to a new position or stay stuck.

Part A asks to detect when a unit is falling into the void. This means
we need to stop falling when we detect the lowest wall has been found.
Therefore, we need to be able to `dropSandA`

with a particular `maxY`

bound
passed as first argument. Also, this being a simulation, we need to
signal whether or not the simulation has ended: my choosing a
`Maybe`

to wrap the newest resting position and updated state will
become obvious later on.

```
dropSandA :: Int -> DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
dropSandA maxY grid = dropAnother origin
where
dropAnother p
| y p' > maxY = Nothing
| p' == p = Just (p', DM.insert p' Sand grid)
| otherwise = dropAnother p'
where
p' = fall grid p
```

The simulation is performed by `dropAnother`

from its current `Pos`

ition `p`

.
It tries to `fall`

using the current `grid`

onto a new `Pos`

ition `p'`

. If that
new `Pos`

ition is greater than the upper bound, the simulation is over.
If that new `Pos`

ition is the same as the original one, the unit has
landed so we return it alongside the updated `grid`

. Otherwise, we try
falling again. Each unit has to drop from `origin`

, so the top-level
definition from the function just tries to `dropAnother origin`

.

## Part A solution unfolds

The signature for `dropSandA`

seems odd at first glance

```
dropSandA :: Int -> DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
```

Being a curried function, we can partially apply it to any given `Int`

that would be the upper bound. That would result on a signature

```
dropSandA 42 :: DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
```

The description for `dropSandA`

establishes that it attempts to simulate *one*
dropping until it either succeeds or goes over the bound. In the former case
we get a `Just`

with the final position placed *and* the updated grid; the latter
just provides a `Nothing`

.

A simulation such as this is an instance of an «unbound repetition». We don't know how many units can fall (in fact, that's what we've been asked to figure out!). We know each step takes a current grid (a «seed») and can produce a new position and updated grid (a «result» and «new seed»), or signal the simulation has stopped when going out of bounds.

In the same way that «bound repetition» can be abstracted as a `fold`

,
«unbound repetition» can be abstracted as an `unfoldr`

.

```
unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
```

The *second* argument is the initial «seed». The *first* argument is
a function that's able to perform *one step*, taking the current seed
to produce a result and a new seed, or signal the end of the simulation.
Using `unfoldr`

will automatically collect the intermediate results,
and return them *lazily* as they are produced.

We can solve Part A with

```
partA :: String -> Int
partA input = length $ unfoldr (dropSandA boundary) grid0
where
grid0 = loadGrid input
boundary = y $ fst $ DM.findMax grid0
```

We `loadGrid`

our initial seed into `grid0`

. Thanks to the custom `Ord`

instance
for the `Pos`

ition type, we can use `DM.findMax`

to get the `Pos`

ition with
the largest Y coordinate in `O(log n)`

time, thus setting the `boundary`

.
We let `unfoldr`

do its thing over `dropSandA boundary`

working from
the starting grid. It will lazily produce a list of «landed» `Pos`

itions
until the first one goes out of bounds. The `length`

of the resulting
list is the number of units that settled down before the first one
went out of bounds.

## Part B is similar but hard to factor out

After solving Part A, the description for Part B keeps the same core simulation, but adds two new restrictions:

The ground is assumed to be two levels after the maximum Y wall, spanning as far left and right as needed. This effectively means there's a new bound but for

*stopping*in place, not interrupting the simulation.The simulation must run until enough units have landed such that it's not possible to drop another one. That is, everything has piled up to

`origin`

.

I wasn't able to easily refactor `dropSandA`

into something that could
work for both simulations (hence the name). I ended up writing
`dropSandB`

which follows the same strategy

```
dropSandB :: Int -> DM.Map Pos Content -> Maybe (Pos, DM.Map Pos Content)
dropSandB maxY grid = dropAnother origin
where
originFull = isJust $ DM.lookup origin grid
dropAnother p
| p' == origin && originFull = Nothing -- 1
| p' == p = Just (p', DM.insert p' Sand grid) -- 2
| y p' == pred maxY = Just (p', DM.insert p' Sand grid) -- 3
| otherwise = dropAnother p'
where
p' = fall grid p
```

There are just two tweaks:

Units «appear» at the

`origin`

and try to fall down from there. There will be a point where`origin`

is not occupied, a new unit appears but can't progress beyond`origin`

staying there -- that would be rule (2). The*next*unit will appear at the`origin`

, will not be able to progress beyond`origin`

, but can't stay there on account of the previous one. That's rule (1) signaling the end of the simulation.There's always the ground to stop units. They can lay there, and the simulation must continue. The ground is one unit above the bound. That's rule (3).

Rules (2) and (3) can be combined with `(||)`

in the guard, I know.
It's easier to explain the tweaks if they're separate, that's all.

Other than those tweaks, it's exactly the same approach in order
for `dropSandB`

to be hoisted by `unfoldr`

after figuring out the
«ground level» bound

```
partB :: String -> Int
partB input = length $ unfoldr (dropSandB boundary) grid0
where
grid0 = loadGrid input
boundary = 2 + y (fst $ DM.findMax grid0)
```

## Conclusion

Beginner Haskell programmers tend to write their recursive functions
by hand. That's fine. They're learning the ropes. Once you understand
the basic forms of recursion, you *must stop writing it explicitly* and
turn to higher-order functions abstracting it.

Haskell programmers use data as control structures. When we want to
go over all elements of a recursive data structure, we `fold`

. It doesn't
matter if it is list-like or tree-like... `Foldable`

takes care of moving
around and detecting when we're done -- we just need to say what to do
at each step. These forms of recursion *consume* one element at a time
and are said to «collapse» the original data structure into a
resulting value. This recursion style is a *catamorphism*.

Conversely (or «dually»), when we want to build a data structure
incrementally from an initial seed, we `unfold`

. There is no `Unfoldable`

type class, because there isn't a generalized API useful enough.
We have `Data.List.unfoldr`

able to generate a (potentially infinite)
list one element at a time. The standard Haskell type `Data.Tree`

also
provides an `unfoldTree`

. These forms of recursion *produce* one element
a time, in a predictable order, and are said to «construct» a new
data structure incrementally. This recursion style is an *anamorphism*.

They are sometimes enough for simple simulations such as this one -- that's why I used it. They can also be used to generate locally-finite globally-infinite data structures for localized traversals.

Learning to write *all* your recursion as composition of
catamorphisms (`fold`

) and *anamorphisms* (`unfoldr`

) over the
particular data types involved is a crucial skill. It allows
you to separate the *what to do* from the *how to move*, leading
to better decoupling and code reuse. It also improves
performance:

When there are multiple

`fold`

s in sequence, they can be combined into a*single*`fold`

. The compiler can do it for you! This means*less*value passing,*less*function calling,*better*cache locality.Placing a

`fold`

*after*and`unfold`

, is similar to having a*consumer*after a*producer*. The`unfold`

will generate one element at a time, and the`fold`

can process it immediately. The intermediate structure*is never created*. This form of recursion is known as an*hylomorphism*.

Go read

«Functional programming with bananas, lenses, envelopes, and barbed wire» (Meijer, Fokkinga, Paterson)

«A tutorial on the universality and expressiveness of fold» (Hutton)

«Theory and Practice of Fusion» (Hinze, Harper, James)

for the basic mathematical nitty-gritty. These techniques have
been known for decades, and GHC applies them when improving
and generating *machine* code. The *apparent* list processing
becomes tight assembly loops building no lists at all...

The simulation is `unfold`

-based (anamorphism), and
`length`

is `fold`

-based (catamorphism). Their composition is
fused as a hylomorphism. The assembler looks written by
hand, except it was not. High-order recursion wins.