Emulators are fun! Even when they are unidimensional like in this challenge. The CPU has just one register, a cycle counter in lieu of a clock, and just two instructions. No jumps, no RAM, no I/O. Enough to power a pixel display, that also happens to be unidimensional.

But we still need to process the «assembly language» instructions and keep track of a changing state and side-effects. If pure functional programming is about forbidding side-effects, how can we manage this?

## The Problem

A simplistic display device being driven by an even simpler CPU are described. The CPU is extremely simple:

• It has a constant ticking rate, each step being a cycle.

• It has a single integer register `X`, preloaded with a `1`.

• It has a `noop` instruction that takes one cycle to complete, doing nothing.

• It has an `addx V` instruction that takes two cycles to complete, increasing the contents of register `X` by `V` at the end of the second cycle.

The input looks like

``````    noop\n
``````

The general problem is to simulate the behavior of the CPU by «running» the instructions and keeping track of the cycles and changes to the `X` register.

The particular problem for Part A is to compute the product of the cycle number times the contents of register `X` on cycles 20, 60, 100, 140, 180 and 220, and add them up.

## Main Program

The main program is pretty much the same as Day 1.

However, Part B turned out different, because it returned a multi-line `String` that needs to be presented using `putStrLn` instead of `print`.

Simple lazy list parsing and a data type is all we need. Let's start with an `INST` data type to model CPU instructions and a function to parse the corresponding `String` into a value of `INST`, i.e.

``````    data INST = NOOP | ADDX Int
deriving (Show,Eq)

toINST :: String -> INST
toINST "noop" = NOOP
toINST inst   = ADDX \$ read \$ snd \$ tail <\$> break (==' ') inst
``````

As long as the instruction is on a line of its own without the newline, we can use `toINST`. The `ADDX` case is mildly interesting: `break`ing on the lone space will produce a tuple with both parts, and we `fmap` on the second one to remove the leading space, before extracting and `read`ing as an `Int`.

Now, in order to get our arbitrarily long input into a `[INST]` we use `lines` to split the input getting rid of the newlines, and apply `toINST` to every element of the list.

``````    loadProgram :: String -> [INST]
loadProgram = map toINST . lines
``````

## Modeling the CPU

The CPU can be modeled with a simple data type

``````    data CPU = CPU { x :: Int
, c :: Int
}
deriving (Show)
``````

and the initial state as a constant value

``````    initialCPU :: CPU
initialCPU = CPU { x = 1
, c = 0
}
``````

## Modeling the machine

There are many ways to model an abstract virtual machine. The one I chose is aligned with what the problem statement requests, my intuition regarding what could've been asked for Part B, and my experience writing translators, interpreters, and compilers.

The way I saw it, the machine needed to:

• Have a set of read-only «rules» -- a data-driven way to trigger the «probe during this cycle» to compute the required products. Emphasis on read-only.

• Have a way to collect results -- each probe is going to produce a result, and I want to collect them all while interpreting. I did not want this collection to be «the result» of the emulation because it's not. Emphasis in collect as we go.

• Have a way to manage the changing CPU state -- interpreting each instructions will require updating the cycle counter, maybe changing the contents of register `X`, and inspect them when probing at the required intervals. Emphasis in mutable state.

There are multiple definitions for what a `Monad` is, and that's not as important as how to use them. For this particular problem, we want a `Monad` that is able to implicitly carry read-only reference data (an «environment») that we can query, an append-only storage (a «log») where we can add information as we go along, and a read-write piece of data (a «state») that we can manipulate. By implicitly I mean the shuffling around happens without needing to express it in code. That is, if we write

``````    do step1
step2
step3
...
``````

each step could access environment, log, or state. Regardless of what is accessed, the environment, log, and state, are passed along from `step1` to `step2`, and so on and so forth.

That way, we can write our code in «imperative fashion» thinking there is some «program state» being threaded as we go, when in reality is just pure values being passed around. This is the «programmable semicolon» interpretation of a `Monad`: when passing from step to step, the implicit semicolon has this «passing around» meaning. You can think of the triad (environment,log,state) being conveniently passed «on the side», hidden from view, but available to be used at any or all steps.

But we will not need I/O. `Monad`s can be used for I/O but aren't only for that, and in fact are way more flexible.

Haskell's standard library already provides a `Monad` with the behavior we're looking for. As a matter of fact, my modeling of the machine was intentional to show how this `Monad` can be useful.

``````    import Control.Monad.RWS
``````

The `RWS Monad` is a polymorphic type for a monadic computation producing results of any desired type. The computation implicitly carries along a read-only environment, a logging container, and a type for modeling state. Without going into details outside the scope of this article, it looks like this

``````    type RWS r w s = ...
``````

We just need to fill the proper types for `r`ead-only, `w`rite-log, and `s`tate types. It is customary to simplify complex monadic types using type aliases, so I wrote

``````    type Machine w a = RWS Check w CPU a
``````

where:

• `Check` is the type for the read-only environment. I used this for the actual cycles to probe on -- more on it in a bit. When writing monadic code, function `ask` can be used to access the value stored provided on initial evaluation.

• The second type has to be a `Monoid`. Part A requires adding numbers as we probe. Thinking that Part B would require saving things of a different type, I opted for a polymorphic `w` so I can choose the proper `Monoid` as long as my type signatures are correct. When writing monadic code, function `tell` can be used to append (as in `Monoid.(<>)`) to the running log.

• The third type, `CPU`, is the one used for the mutable state. When writing monadic code, functions `get` and `put` can be used to read the current state and overwrite it, respectively.

The API for `RWS` provides more functions to handle each part of the implicit data being carried on. I avoided the fancy stuff sticking to the basic ones.

There are three parts to this simulation:

1. How to keep track of (or `tick`) the cycle counter. For Part A it also means saving the probes at the required intervals. I expected Part B to want something different.

2. How to `perform` a single CPU instruction.

3. How to «bootstrap» the simulation -- give initial environment and state, to then «run the assembler».

Part (3) is quite generic: it does not need to provide an initial log (because it is empty) so it can be made polymorphic on its type.

I decided to decouple (1) from (2) in order to make (2) generic in the same sense that (3) is generic. By not doing any logging during the CPU operations, but shifting that to the `tick`er. That way, only `tick` would need to know about the particulars of logging, with (2) and (3) not caring. That's how general `Monad` programming can and should get -- but you must write your function signatures!

I started with `perform`, to implement the CPU behavior.

``````    perform :: Monoid w => Machine w () -> INST -> Machine w ()
perform tick NOOP     = tick
perform tick (ADDX v) = do tick
tick
cpu <- get
put \$ cpu { x = x cpu + v }
``````

As per the definition:

• `NOOP` does nothing and takes one `tick`.

• `ADDX v` takes two `tick`s and updates register `x` by adding `v` to it. Notice the use of `get` to read the current state as a pure value, and then `put` to overwrite it with the updated value.

The signature is as interesting as it is important:

``````    perform :: Monoid w => Machine w () -> INST -> Machine w ()
``````
• The restriction in order to use any `Monoid`.

• The first argument is a monadic `Machine` carrying a log of some type `w` we don't care about in this function, that does not produce a result -- that would be the ticker.

• The second argument is the single `INST` to perform.

• This in turn is a monadic `Machine` carrying a log of the same type `w` we don't care about and producing no result.

The next generic component is the `simulate` itself. This has to take care of «bootstrapping» the whole computation providing the initial read-only environment and state, and them process a full assembly program in order.

We already have `initialCPU` for our initial state, so we have to write `initialCheck` to provide instructions for the probing points. According to the problem description, the first probing event has to happen on cycle `20`, and the rest every other `40` cycles, therefore

``````    data Check = Check { first   :: Int
, modulus :: Int
}

initialCheck :: Check
initialCheck = Check 20 40
``````

Now, for the bootstrapping, we look at `RWS` API and find two functions that can help

``````    runRWS  :: RWS r w s a -> r -> s -> (a, s, w)
evalRWS :: RWS r w s a -> r -> s -> (a, w)
``````

In both cases the first argument is the full monadic computation (not «a single step»), the second argument is the initial read-only environment, and the third argument is the initial state. The returned values are different: `runRWS` gives back the computation result, the final state, and the log; while `evalRWS` gives back the computation result, and the log.

We surely need the log. Our computations do not return anything useful (notice the `()` on the signatures) so that doesn't make a difference. I did not know if Part B was going to ask anything about the final CPU state or not, so I chose `runRWS` and wrote

``````    simulate :: Monoid w => Machine w () -> [INST] -> ((), CPU, w)
simulate tick is = runRWS (mapM_ (perform tick) is)
initialCheck
initialCPU
``````

Again, the signature is explicit: first argument is the `tick`er, second argument is the `[INST]` to interpret, and the returning triplet will have no return value, the final `CPU` state, and the log, whatever it holds. And we want to use any `Monoid` that is convenient. Generic as it can be.

Recall the first argument to `runRWS` must be the «whole computation». We can use `perform` to simulate the behavior of one instruction. Since `tick` is abstracted as an argument, we can use `(perform tick)` to use the «appropriate» `tick`er when interpreting an instruction. We'd like to use `(perform tick)` over every instruction on our `[INST]`, keeping the order from left to right and, more importantly, keeping the «implicit monadic flow» so environment, log, and state, are passed around.

And for that we use `mapM_`

``````    mapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
``````

First, we notice that `[]` is a `Foldable` so we can `t ~ [INST]` to get

``````    mapM_ :: (Monad m) => (INST -> m b) -> [INST] -> m ()
``````

Next, we notice that according to `simulate`'s signature

``````    tick :: Machine w ()
``````

therefore

``````    perform tick :: INST -> Machine w ()
``````

and if we make `b ~ ()` and `m ~ Machine w`, we get

``````    mapM_ :: (INST -> Machine w ()) -> [INST] -> Machine w ()
``````

which means:

• We can pass `(perform tick)` as first argument to `mapM_`

• We can pass our list of instructions as second argument.

• It produces a combined monadic computation `Machine w ()` suitable for bootstrapping with `runRWS`

And `mapM_` is useful because it builds a list of individual monadic computations

``````    [m0,m1,m2,m3,...,mN]
``````

and turns it into

``````    m0 >> m1 >> m2 >> m3 >> .. >> mN
``````

which is equivalent to having written

``````    do m0
m1
m2
...
mN
``````

except you didn't write it, but built it dynamically. Monadic actions are first-order values: they can be created, combined, and passed around, but they don't «happen» until you bootstrap them. That's part of what we mean by «programmable semicolon».

At this point, we have a generic simulator that only needs a particular `tick`er to run the assembly program. Here's were use a specific `Monoid` to built our logs with minimal effort.

## Solving Part A

Going back to the requirements for Part A, we are asked to probe the CPU status on cycles 20, 60, 100, 140, 180 and 220, multiply the value of register `X` and add them up. Notice how the first value is «odd», and the rest go in increments of 40 -- that's what `initialCheck` carries for easy reference.

We need to write a `tick`er to do the probing and adding as we go. Turns out there exists a `Monoid` named `Sum a` that's able to automatically carry out additions every time we `(<>)`. Therefore, we can use `tell` to provide a `Sum`, and the addition will be computed as we go. We write an explicit signature mentioning it

``````    tickA :: Machine (Sum Int) ()
tickA = do cpu <- get
let c' = succ (c cpu)
when (check chk c') \$ tell \$ Sum \$ x cpu * c'
put \$ cpu { c = c' }
where
check :: Check -> Int -> Bool
check ck cy =  cy == first ck
|| (cy - first ck) `mod` modulus ck == 0
``````

Now `tickA` explicitly mentions the use of `Monoid Sum Int` to automatically add numbers every time the code `tell`s it to log. At the start of the cycle, we `get` the current CPU status and create a local variable to increment the cycle counter while we compute. We immediately `ask` for the read-only environment to have access to probing instructions. If the `check` confirms that we must probe within this cycle, we simply do the math and `tell` the `Sum` `Monoid` to do its thing. Finally, we `put` the CPU state with the updated cycle counter.

We can finally solve `partA` by running the simulator with this custom ticker. It will type-check, because we're providing a concrete `Monoid` to work with.

``````    partA :: String -> Int
partA input = getSum total
where (_,_,total) = simulate tickA (loadProgram input)
``````

We ignore the computation result and the final state, grabbing the final log via pattern matching. The log is a value of type `Sum`, and its API provides a function `getSum` to get the actual value.

## Solving Part B

The description of Part B turned out to be quite confusing at first. The provided example is helpful, but there's room for improvement. I ended up understanding it this way:

• The display actually has positions 1 to 240, grouped 40 a row.

• The cycles go from 1 to 240.

• When working on cycle `n`, if `n mod 40` is in the range `[x-1 .. x+1]`, we need to output a hash, otherwise we output a period.

• To mimic the screen's two dimensions, output a newline every 40 cycles.

We just need to provide a `tick`er for this particular problem. Using `[String]` works because `[a]` is a `Monoid`. The log is going to be short enough, so I won't worry about `(++)` being expensive. We will create all the output on the log, and print it after the simulation has completed. This is different to doing interleaved I/O, which is possible but I chose not to use, in order to show a pure solution.

``````    tickB :: Machine [String] ()
tickB = do cpu <- get
if check (c cpu) (x cpu) then tell ["#"] else tell ["."]
let c' = succ (c cpu)
when (c' `mod` 40 == 0) \$ tell ["\n"]
put \$ cpu { c = c' }
where
check :: Int -> Int -> Bool
check cy rX = (cy `mod` 40) `elem` [pred rX .. succ rX]
``````

Now `tickB` explicitly mentions the use of `Monoid [String]` for the log. This means we need to `tell` using `[String]` for them to be appended as we go. At the start of the cycle we `get` the state of the CPU, and check if we need to log a hash or a period. We then increment the cycle counter in a local variable, and log a newline if we've reached the right edge of the «screen». Finally, we `put` the CPU state with the updated cycle counter.

We can finally solve `partB` by running the simulator with this custom ticker. It will type-check, because we're providing a concrete `Monoid` to work with.

``````    partB :: String -> String
partB input = concat ls
where (_,_,ls) = simulate tickB (loadProgram input)
``````

We ignore the computation result and the final state, grabbing the final log via pattern matching. The log is a value of type `[String]` so we simply `concat` into a long `String` that should be printed using `putStrLn` as mentioned when describing the main program.

``````    ghci> readFile "input.txt" >>= putStrLn . partB
###..###..###...##..###...##...##..####.
#..#.#..#.#..#.#..#.#..#.#..#.#..#.#....
#..#.###..#..#.#..#.#..#.#..#.#....###..
###..#..#.###..####.###..####.#.##.#....
#.#..#..#.#....#..#.#.#..#..#.#..#.#....
#..#.###..#....#..#.#..#.#..#..###.#....
``````

## Conclusion

The `Monad` abstraction is very powerful. There's more to `Monad`s than the simple «oh, they are used for I/O». Since `Monad`s model order-dependent computations in a generic and polymorphic way, they can be used and combined to build first-order pure computations that have «side effects» or «implicit effects», besides the actual results they produce.

I decided to use `RWS` to illustrate the ability to separate «the process» from «the specifics». This is a particularly useful way of structuring monadic computation that favors reusability and composability, in ways that are either impossible or prohibitively expensive in other programming languages.

Each of the `RWS` components can be used separately. That is, there exist separate `Reader Monad`, `Writer Monad`, and `State Monad` that can be used individually when you need only one of the behaviors. There are many other data types that can act as `Monad`, such as `Maybe`, `Either` and `[a]`, to model computations that can fail, produce an exception, or be predictably non-deterministic. There are `Monad`s for parallelism, parsing, continuation-passing style, Prolog-style backtracking, and Software Transactional Concurrency.

Every single flow-control strategy can be modeled as a `Monad`. Haskell gives you the opportunity to choose which strategy to use, and to combine more than one using the Monad Transforming technique. On top of that, you can write generic code with replaceable parts -- this example scratches the surface by keeping the `Monoid` generic, so I can abstract the particular away from the general.

On the topic of `Monoid`, knowing about `Sum` and `Product` is a matter of consulting the API. I used `[a]` as a cheap `Monoid` for accumulating. For production-grade code I'd use `Data.Seq` instead.

As a final comment, I used «log» in the sense of in-memory record. If you are thinking about doing actual application logging, the `Writer Monad` is not the way. Guess what, there's a `LoggingT Monad` specifically designed for logging to `stderr` or `stdout`, and for people doing serious `syslog` logging, check out monad-logger-syslog. They abstract the logging out of the explicit I/O operations in a general way.