Emulators are fun! Even when they are unidimensional like in this challenge. The CPU has just one register, a cycle counter in lieu of a clock, and just two instructions. No jumps, no RAM, no I/O. Enough to power a pixel display, that also happens to be unidimensional.

But we still need to process the «assembly language» instructions
and keep track of a changing state and side-effects. If pure
functional programming is about *forbidding* side-effects, how
can we manage this?

## The Problem

A simplistic display device being driven by an even simpler CPU are described. The CPU is extremely simple:

It has a constant ticking rate, each step being a

*cycle*.It has a single integer register

`X`

, preloaded with a`1`

.It has a

`noop`

instruction that takes one cycle to complete, doing nothing.It has an

`addx V`

instruction that takes two cycles to complete, increasing the contents of register`X`

by`V`

*at the end*of the second cycle.

The input looks like

```
noop\n
addx 3\n
addx -5\n
```

The general problem is to simulate the behavior of the CPU by «running»
the instructions and keeping track of the cycles and changes to
the `X`

register.

The particular problem for Part A is to compute the product
of the cycle number times the contents of register `X`

on
cycles 20, 60, 100, 140, 180 and 220, and add them up.

## Main Program

The main program is pretty much the same as Day 1.

However, Part B turned out different, because it
returned a multi-line `String`

that needs to be presented
using `putStrLn`

instead of `print`

.

## Loading instructions

Simple lazy list parsing and a data type is all we need.
Let's start with an `INST`

data type to model CPU instructions
and a function to parse the corresponding `String`

into a
value of `INST`

, i.e.

```
data INST = NOOP | ADDX Int
deriving (Show,Eq)
toINST :: String -> INST
toINST "noop" = NOOP
toINST inst = ADDX $ read $ snd $ tail <$> break (==' ') inst
```

As long as the instruction is on a line of its own *without* the
newline, we can use `toINST`

. The `ADDX`

case is mildly interesting:
`break`

ing on the lone space will produce a tuple with both parts,
and we `fmap`

on the second one to remove the leading space, before
extracting and `read`

ing as an `Int`

.

Now, in order to get our arbitrarily long input into a `[INST]`

we use `lines`

to split the input getting rid of the newlines,
and apply `toINST`

to every element of the list.

```
loadProgram :: String -> [INST]
loadProgram = map toINST . lines
```

## Modeling the CPU

The CPU can be modeled with a simple data type

```
data CPU = CPU { x :: Int
, c :: Int
}
deriving (Show)
```

and the initial state as a constant value

```
initialCPU :: CPU
initialCPU = CPU { x = 1
, c = 0
}
```

## Modeling the machine

There are many ways to model an abstract virtual machine.
The one I chose is aligned with what the problem statement
requests, my intuition regarding what *could've been
asked* for Part B, and my experience writing translators,
interpreters, and compilers.

The way I saw it, the machine needed to:

Have a set of read-only «rules» -- a data-driven way to trigger the «probe during this cycle» to compute the required products. Emphasis on

*read-only.*Have a way to collect results -- each probe is going to produce a result, and I want to collect them all while interpreting. I did not want this collection to be «the result» of the emulation because it's not. Emphasis in

*collect as we go*.Have a way to manage the changing CPU state -- interpreting each instructions will require updating the cycle counter, maybe changing the contents of register

`X`

, and inspect them when probing at the required intervals. Emphasis in*mutable state*.

### Enter the Monad

There are multiple definitions for what a `Monad`

is, and that's
not as important as how to *use them*. For this particular problem,
we want a `Monad`

that is able to *implicitly* carry read-only reference
data (an «environment») that we can query, an append-only storage
(a «log») where we can add information as we go along, and a read-write
piece of data (a «state») that we can manipulate. By *implicitly* I mean
the shuffling around happens without needing to express it in code.
That is, if we write

```
do step1
step2
step3
...
```

*each* step *could* access environment, log, or state. Regardless
of what is accessed, the environment, log, and state, are
passed along from `step1`

to `step2`

, and so on and so forth.

That way, we can write our code in «imperative fashion» thinking
there is some «program state» being threaded as we go, when in
reality is just pure values being passed around. This is the
«programmable semicolon» interpretation of a `Monad`

: when passing
from step to step, the implicit semicolon has this «passing around»
meaning. You can think of the triad (environment,log,state) being
conveniently passed «on the side», hidden from view, but available
to be used at any or all steps.

But we will not need I/O. `Monad`

s can be used for I/O but aren't
only for that, and in fact are way more flexible.

### The RWS Monad

Haskell's standard library already provides a `Monad`

with the
behavior we're looking for. As a matter of fact, my modeling of
the machine was intentional to show how this `Monad`

can be useful.

```
import Control.Monad.RWS
```

The `RWS Monad`

is a polymorphic type for a monadic computation
producing results of any desired type. The computation implicitly
carries along a read-only environment, a logging container,
and a type for modeling state. Without going into details outside
the scope of this article, it looks like this

```
type RWS r w s = ...
```

We just need to fill the proper types for `r`

ead-only, `w`

rite-log,
and `s`

tate types. It is customary to simplify complex monadic types
using type aliases, so I wrote

```
type Machine w a = RWS Check w CPU a
```

where:

`Check`

is the type for the read-only environment. I used this for the actual cycles to probe on -- more on it in a bit. When writing monadic code, function`ask`

can be used to access the value stored provided on initial evaluation.The second type has to be a

`Monoid`

. Part A requires adding numbers as we probe. Thinking that Part B would require saving things of a different type, I opted for a polymorphic`w`

so I can choose the proper`Monoid`

as long as my type signatures are correct. When writing monadic code, function`tell`

can be used to*append*(as in`Monoid.(<>)`

) to the running log.The third type,

`CPU`

, is the one used for the mutable state. When writing monadic code, functions`get`

and`put`

can be used to read the current state and overwrite it, respectively.

The API for `RWS`

provides more functions to handle each part
of the implicit data being carried on. I avoided the fancy
stuff sticking to the basic ones.

### Monadic simulation

There are three parts to this simulation:

How to keep track of (or

`tick`

) the cycle counter. For Part A it also means saving the probes at the required intervals. I expected Part B to want something different.How to

`perform`

a single CPU instruction.How to «bootstrap» the simulation -- give initial environment and state, to then «run the assembler».

Part (3) is quite generic: it *does not* need to provide an initial
log (because it is empty) so it can be made polymorphic on its type.

I decided to decouple (1) from (2) in order to make (2) generic
in the same sense that (3) is generic. By *not doing any logging*
during the CPU operations, but shifting that to the `tick`

er. That
way, only `tick`

would need to know about the particulars of logging,
with (2) and (3) not caring. That's how general `Monad`

programming
can and should get -- but you *must* write your function signatures!

I started with `perform`

, to implement the CPU behavior.

```
perform :: Monoid w => Machine w () -> INST -> Machine w ()
perform tick NOOP = tick
perform tick (ADDX v) = do tick
tick
cpu <- get
put $ cpu { x = x cpu + v }
```

As per the definition:

`NOOP`

does nothing and takes one`tick`

.`ADDX v`

takes two`tick`

s and updates register`x`

by adding`v`

to it. Notice the use of`get`

to read the current state as a pure value, and then`put`

to overwrite it with the updated value.

The signature is as interesting as it is important:

```
perform :: Monoid w => Machine w () -> INST -> Machine w ()
```

The restriction in order to use

*any*`Monoid`

.The first argument is a monadic

`Machine`

carrying a log of some type`w`

*we don't care about in this function*, that does not produce a result -- that would be the ticker.The second argument is the single

`INST`

to perform.This in turn is a monadic

`Machine`

carrying a log of the same type`w`

*we don't care about*and producing no result.

The next generic component is the `simulate`

itself. This has to
take care of «bootstrapping» the whole computation providing
the initial read-only environment and state, and them process
a full *assembly program* in order.

We already have `initialCPU`

for our initial state, so we have
to write `initialCheck`

to provide instructions for the probing
points. According to the problem description, the first probing
event has to happen on cycle `20`

, and the rest *every other* `40`

cycles, therefore

```
data Check = Check { first :: Int
, modulus :: Int
}
initialCheck :: Check
initialCheck = Check 20 40
```

Now, for the bootstrapping, we look at `RWS`

API and find two
functions that can help

```
runRWS :: RWS r w s a -> r -> s -> (a, s, w)
evalRWS :: RWS r w s a -> r -> s -> (a, w)
```

In both cases the first argument is the *full monadic computation*
(not «a single step»), the second argument is the initial read-only
environment, and the third argument is the initial state. The returned
values are different: `runRWS`

gives back the computation result, the
final state, and the log; while `evalRWS`

gives back the computation
result, and the log.

We surely need the log. Our computations do not return anything
useful (notice the `()`

on the signatures) so that doesn't make a
difference. I did not know if Part B was going to ask anything about
the final CPU state or not, so I chose `runRWS`

and wrote

```
simulate :: Monoid w => Machine w () -> [INST] -> ((), CPU, w)
simulate tick is = runRWS (mapM_ (perform tick) is)
initialCheck
initialCPU
```

Again, the signature is explicit: first argument is the `tick`

er,
second argument is the `[INST]`

to interpret, and the returning
triplet will have no return value, the final `CPU`

state, and
the log, whatever it holds. And we want to use *any* `Monoid`

that is convenient. Generic as it can be.

Recall the first argument to `runRWS`

must be the «whole computation».
We can use `perform`

to simulate the behavior of *one* instruction. Since
`tick`

is abstracted as an argument, we can use `(perform tick)`

to use
the «appropriate» `tick`

er when interpreting an instruction. We'd like
to use `(perform tick)`

over *every* instruction on our `[INST]`

, keeping
the order from left to right and, more importantly, keeping the
«implicit monadic flow» so environment, log, and state, are passed
around.

And for that we use `mapM_`

```
mapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
```

First, we notice that `[]`

is a `Foldable`

so we can `t ~ [INST]`

to get

```
mapM_ :: (Monad m) => (INST -> m b) -> [INST] -> m ()
```

Next, we notice that according to `simulate`

's signature

```
tick :: Machine w ()
```

therefore

```
perform tick :: INST -> Machine w ()
```

and if we make `b ~ ()`

and `m ~ Machine w`

, we get

```
mapM_ :: (INST -> Machine w ()) -> [INST] -> Machine w ()
```

which means:

We can pass

`(perform tick)`

as first argument to`mapM_`

We can pass our list of instructions as second argument.

It produces a combined monadic computation

`Machine w ()`

suitable for bootstrapping with`runRWS`

And `mapM_`

is useful because it builds a *list of individual monadic
computations*

```
[m0,m1,m2,m3,...,mN]
```

and turns it into

```
m0 >> m1 >> m2 >> m3 >> .. >> mN
```

which is equivalent to having written

```
do m0
m1
m2
...
mN
```

except you didn't write it, but built it dynamically. Monadic actions
are *first-order values*: they can be created, combined, and passed around,
but they don't «happen» until you bootstrap them. That's part of what
we mean by «programmable semicolon».

At this point, we have a *generic simulator* that only needs a particular
`tick`

er to run the assembly program. Here's were use a specific `Monoid`

to built our logs with minimal effort.

## Solving Part A

Going back to the requirements for Part A, we are asked to probe
the CPU status on cycles 20, 60, 100, 140, 180 and 220, multiply
the value of register `X`

and add them up. Notice how the first
value is «odd», and the rest go in increments of 40 -- that's what
`initialCheck`

carries for easy reference.

We need to write a `tick`

er to do the probing and adding as we go.
Turns out there exists a `Monoid`

named `Sum a`

that's able to
*automatically* carry out additions every time we `(<>)`

. Therefore,
we can use `tell`

to provide a `Sum`

, and the addition will be
computed as we go. We write an *explicit* signature mentioning it

```
tickA :: Machine (Sum Int) ()
tickA = do cpu <- get
let c' = succ (c cpu)
chk <- ask
when (check chk c') $ tell $ Sum $ x cpu * c'
put $ cpu { c = c' }
where
check :: Check -> Int -> Bool
check ck cy = cy == first ck
|| (cy - first ck) `mod` modulus ck == 0
```

Now `tickA`

explicitly mentions the use of `Monoid Sum Int`

to *automatically*
add numbers every time the code `tell`

s it to log. At the start of the
cycle, we `get`

the current CPU status and create a local variable to
increment the cycle counter *while we compute*. We immediately `ask`

for the read-only environment to have access to probing instructions.
If the `check`

confirms that we must probe within this cycle, we
simply do the math and `tell`

the `Sum`

`Monoid`

to do its thing.
Finally, we `put`

the CPU state with the updated cycle counter.

We can finally solve `partA`

by running the simulator with this
custom ticker. It will type-check, because we're providing a concrete
`Monoid`

to work with.

```
partA :: String -> Int
partA input = getSum total
where (_,_,total) = simulate tickA (loadProgram input)
```

We ignore the computation result and the final state, grabbing the
final log via pattern matching. The log is a value of type `Sum`

,
and its API provides a function `getSum`

to get the actual value.

## Solving Part B

The description of Part B turned out to be quite confusing at first. The provided example is helpful, but there's room for improvement. I ended up understanding it this way:

The display actually has positions 1 to 240, grouped 40 a row.

The cycles go from 1 to 240.

When working on cycle

`n`

, if`n mod 40`

is in the range`[x-1 .. x+1]`

, we need to output a hash, otherwise we output a period.To mimic the screen's two dimensions, output a newline every 40 cycles.

We just need to provide a `tick`

er for this particular problem. Using
`[String]`

works because `[a]`

is a `Monoid`

. The log is going to be short
enough, so I won't worry about `(++)`

being expensive. We will create
*all* the output on the log, and print it after the simulation has
completed. This is different to doing interleaved I/O, which is
possible but I chose not to use, in order to show a *pure solution*.

```
tickB :: Machine [String] ()
tickB = do cpu <- get
if check (c cpu) (x cpu) then tell ["#"] else tell ["."]
let c' = succ (c cpu)
when (c' `mod` 40 == 0) $ tell ["\n"]
put $ cpu { c = c' }
where
check :: Int -> Int -> Bool
check cy rX = (cy `mod` 40) `elem` [pred rX .. succ rX]
```

Now `tickB`

explicitly mentions the use of `Monoid [String]`

for the log.
This means we need to `tell`

using `[String]`

for them to be appended
as we go. At the start of the cycle we `get`

the state of the CPU,
and check if we need to log a hash or a period. We then increment
the cycle counter in a local variable, and log a newline if we've
reached the right edge of the «screen». Finally, we `put`

the CPU state
with the updated cycle counter.

We can finally solve `partB`

by running the simulator with this
custom ticker. It will type-check, because we're providing a concrete
`Monoid`

to work with.

```
partB :: String -> String
partB input = concat ls
where (_,_,ls) = simulate tickB (loadProgram input)
```

We ignore the computation result and the final state, grabbing the
final log via pattern matching. The log is a value of type `[String]`

so we simply `concat`

into a long `String`

that should be printed
using `putStrLn`

as mentioned when describing the main program.

```
ghci> readFile "input.txt" >>= putStrLn . partB
###..###..###...##..###...##...##..####.
#..#.#..#.#..#.#..#.#..#.#..#.#..#.#....
#..#.###..#..#.#..#.#..#.#..#.#....###..
###..#..#.###..####.###..####.#.##.#....
#.#..#..#.#....#..#.#.#..#..#.#..#.#....
#..#.###..#....#..#.#..#.#..#..###.#....
```

## Conclusion

The `Monad`

abstraction is very powerful. There's more to `Monad`

s
than the simple «oh, they are used for I/O». Since `Monad`

s model
order-dependent computations in a *generic* and *polymorphic* way,
they can be used and combined to build first-order pure
computations that have «side effects» or «implicit effects»,
besides the actual results they produce.

I decided to use `RWS`

to illustrate the ability to separate
«the process» from «the specifics». This is a particularly
useful way of structuring monadic computation that favors
reusability and composability, in ways that are either
impossible or prohibitively expensive in other programming
languages.

Each of the `RWS`

components can be used separately. That is,
there exist separate `Reader Monad`

, `Writer Monad`

, and `State Monad`

that can be used individually when you need only one of the
behaviors. There are many other data types that can act as
`Monad`

, such as `Maybe`

, `Either`

and `[a]`

, to model computations
that can fail, produce an exception, or be predictably
non-deterministic. There are `Monad`

s for parallelism, parsing,
continuation-passing style, Prolog-style backtracking,
and Software Transactional Concurrency.

Every single flow-control strategy can be modeled as a
`Monad`

. Haskell gives you the opportunity to choose which
strategy to use, and to combine more than one using the
Monad Transforming technique. On top of that, you can
write generic code with replaceable parts -- this example
scratches the surface by keeping the `Monoid`

generic,
so I can abstract the particular away from the general.

On the topic of `Monoid`

, knowing about `Sum`

and `Product`

is a matter of consulting the API. I used `[a]`

as a
cheap `Monoid`

for accumulating. For production-grade
code I'd use `Data.Seq`

instead.

As a final comment, I used «log» in the sense of
in-memory record. If you are thinking about doing actual
application logging, the `Writer Monad`

is *not* the way.
Guess what, there's a `LoggingT Monad`

specifically
designed for logging to `stderr`

or `stdout`

, and for
people doing serious `syslog`

logging, check out
monad-logger-syslog.
They abstract the logging *out* of the explicit I/O
operations in a general way.