My previous post described a simple, yet sound example (pun intended), relating the use of entropy to generate a reasonable pleasant melody.
I complained about the perceived rigidness in terms of composition (pun twice as intended). Those are the shackles of straightforward imperative code. I posit a data-flow or stream-based mindset, supported by an expressive language with already available abstractions, leads to better composability.
I also mentioned in passing that I would rather have an algebraic representation for music, be it randomnly generated or not. This post will not explore that yet, but will hopefully set an understandable playing ground (thrice the pun!) on top of which to build.
Let’s rewrite while exploiting Haskell’s static type system to ensure explicit low-level value conversion, and a few niceties to make our toy CLI tool amenable.
A Haskell redo
String is the Poor Man’s datatype. – me, mocking the unityped and the barely typed, since 2005
Programmers who rely on strings to model things are doomed to trip,
unless they are extremely disciplined and in good shape. They
rarely are either. The Perl prototype I wrote is string based,
because the language provides no cheap practical alternative.
That’s why I had to open()
the file with a particular flag,
and go from Perl-string to Perl-character, and then to machine
integer, back and forth. And interleaving I/O with transformations
being careful of said intertwining. Pain and suffering.
We can do much better.
Haskell provides
many
«string-like» datatypes. We use [String]
to teach, Text
for human-readable things, and ByteString
for binary data.
Since we are going to read an arbitrarily long stream of
bytes from /dev/urandom
, we will use lazy ByteString
s:
they are implemented in a way that us programmers don’t need
to worry about buffering, instead thinking of «the contents»
as a very long string of Word8
(unsigned bytes) that can
be manipulated with an API that behaves just like String
.
Even though they «look like lists» and their API resembles
list manipulation, they are extremely performant: the compiler
will analyze the way functions are composed, and transform
the code into tight assembler loops, thanks to code
transformations that are possible only over pure code.
I’ve talked about it for
regular lists, it’s called fusion, and makes ByteString
related stuff extremely performant.
This re-write will also make it easier to change the music scale, having a single implementation that can do both 8-bit and 32-bit quantization. I’ll use additional function parameters to «pass down» this information: there is a cleaner equivalent solution we will explore in a future post.
Rewriting the tone generator
We will be processing a lazy ByteString
, each element
being a Word8
value. Using one Word8
value, choose
the particular note from a scale, compute its base amplitude
in relation to A440, and then amplify it. The thing is,
Haskell has a static type system, Word8
and Double
(needed for sin
) don’t mix, forcing us to
write explicit conversions.
type Scale = [Word8]
major :: Scale
= [0,2,4,5,7,9,11,12]
major
aSampleU8 :: Scale -> Word8 -> Double -> Word8
=
aSampleU8 s w8 t round $ volume * sin ( a4
* 2.0 ** ( fromIntegral n / 12.0 )
* t
)where
a4 :: Double
= 440 * pi
a4 n :: Word8
= s !! (fromIntegral w8 `mod` length s) n
I’ve chosen to model Scale
as a simple list of Word8
, with
exactly the same intentions as my previous post. You can guess
there’s another value minor :: Scale
with the corresponding
interval steps.
Note how aSampleU8
’s first argument s
is a Scale
. That
way, we can call it with major
, minor
, or whatever other
weird scale you come up with, and the computation will still
be the same. A common idiom in functional programming,
particularly helpful when the language supports
currying, is to
place less general arguments first, so they can be provided
partially, to complete the function application at a
later place with «hardcoded» context, so to speak.
The second argument is the Word8
value read from the
entropy source. Since we need to use the same seed value
w8
to generate multiple points of the sinusoid, it becomes
less general than the t
ime, the third argument.
The function implementation is basically a re-write of
the Perl version, but using explicit conversions. I’ve written
the literal Double
values so they’re easy to identify.
We need two explicit conversions here, and it’s a great
opportunity to learn about
fromIntegral :: (Integral a, Num b) => a -> b
able to take any Integral
(Int
, Word8
, Integer
,…)
and do an explicit conversion (not a coercion, how uneducated)
into any destination type that is Num
(any numeric type).
The particular fromIntegral
implementation to use
will be inferred by the target type, because it is mandatory
for every Num
type to have
fromInteger :: Integer -> a
and Integer
is Integral
. If the source type is not
Integer
, it has to be Integral
which mandates
toInteger :: a -> Integer
The compiler is able to insert the optimized composition
fromInteger . toInteger
, possibly fusing it at the
assembly level (read as, the right bit-shuffling).
That’s why this line
= s !! (fromIntegral w8 `mod` length s) n
works nicely. We have length
producing an Int
, which forces
> :type mod
ghcimod :: Integral a => a -> a -> a
> :type mod @Int
ghcimod @Int :: Integral Int => Int -> Int -> Int
We can’t use w8 :: Word8
directly as mod
’s first argument, but then
> :type fromIntegral @Word8 @Int
ghcifromIntegral @Word8 @Int
:: (Integral Word8, Num Int) => Word8 -> Int
The resulting Int
matches the type for the second argument
of (!!)
, needed to get the proper Word8
out of the list.
Not to worry, this will turn into extremely efficient machine code,
but the conversions are explicit and proven to be right by the
type system and not my (over)confidence. It’s impossible for
a type conversion bug to happen here, and I don’t need to write
tests to feel warm and fuzzy about it.
Figuring out why and how fromIntegral
works within the
argument for sin
is left as an exercise for the reader.
Also how round
makes our Double
into a Word8
.
Collect ALL the points!
The Perl code resorted to nested loops so that foreach
random sample, a chunk of the sinusoid was generated.
Haskell only has recursion. But we rather use implicit recursion,
because they are fold
s that, in turn, become very efficient
machine code.
So, given a particular Scale
and one Word8
sample,
we can generate a collection of Word8
samples like this
import qualified Data.ByteString.Lazy as B
toSamplesU8 :: Scale -> Word8 -> B.ByteString
= B.pack $ map (aSampleU8 s w8) [0, 0.0001 .. 1] toSamplesU8 s w8
Implicit iteration provided by map
and, as anticipated,
currying aSampleU8
by fixing both the scale and single sample.
Now, the result is a plain [Word8]
that requires pack
ing
to become a proper ByteString
.
Now, picture the arbitrarily long stream of Word8
coming
from entropy. We can take one of those, and turn it into a
new sub-stream of Word8
– the points of the sinusoid.
We want to repeatedly do this over the stream, but create
a combined stream: we want to concatenate the resulting
sub-streams produced by each sample-mapping generation.
This is a frequent operation on regular lists as
well as BytesString
streams you must learn to identify it:
phase :: (B.ByteString -> B.ByteString)
-> Scale
-> B.ByteString -> B.ByteString
= t $ B.concatMap (toSamplesU8 s) r phase t s r
Given the scale s
and the arbitrarily long entropy stream r
,
concatMap
will apply the curried toSamplesU8 s
to each
element of the stream, and combine all the resulting
partial streams into a single output stream. We don’t have to
worry about buffering, we don’t have to worry about «how many»,
we don’t have to worry (in this case) about memory leaks,
because fusion will take care of making this a very efficient loop.
Now, there’s a mistery t
argument that, if you read the
type signature, turns out to be a transformation function
that takes a ByteString
and produces a new one. We’ll use
that to make our program more flexible.
Let’s step back for a second and contemplate how we can use
what we have so far. Having phase
we effectively abstracted the
problem of generating the sound phases given a Scale
and
the stream of entropy.
It should come as no surprise that there’s a way to
read a file and turn it into a lazy ByteString
, and a
way to print a lazy ByteString
. In this context, lazy means
the library will take care of buffering and «reading on demand»,
as well as interleaving the reading with the printing.
«Wait, are you saying the [Word8]
is a lie and there’s
never a list in memory?»… precisely.
So, we can actually have our string of random stuff with a simple
"/dev/urandom" >>= B.putStr . phase id major B.readFile
because the stream generated by B.readFile
is fed
to phase id major
(the major scale and the identity function),
to then get printed. It doesn’t get much cleaner than that,
thanks to the IO
Monad and currying.
Since my sound device accepts unsigned 8-bit integers, I can make this into an executable and it works in the same way the 8-bit Perl script did: except faster and using constant memory.
I find your lack of fusion disturbing
I wrote two Perl scripts, one for unsigned 8-bit output, another
for signed 32-bit little endian integers. The first parameter
for phase
was placed there so that, thanks to currying,
arbitrary conversions can be placed while the stream is
being produced. For unsigned 8-bit nothing needs converting,
so we put id
, and the compiler is smart enough to «do
nothing».
So, how do we go about converting a ByteString
in such
a way that we take four Word8
s at a time, interpret them
as a 32-bit little endian integers, and then tuck them
back as Word8
so they become a new ByteString
?
If you ever need to grab bytes from the wire (or a disk)
and analyze their structure (network package, audio file,
cryptographic material) and them punt them elsewhere, you
are unmarshalling and marshalling.
Haskell’s ecosystem provides a brutally performant solution
to this problem for ByteString
, thanks to the
binary
library.
The library provides two pure monads: Get
to express generic multi-step unmarshalling, and Put
to express generic multi-step marshalling. In our case,
we’ll use Get
to express «turn a ByteString
into a list
of 32-bit unsigned integers», and Put
to express «turn
a list of 32-bit unsigned integers into a ByteString
»,
and them sequence their effects together. I’m going to say
it again, for the haters in the back: thanks to fusion
this will become a machine language tight loop.
Consider
getInt32s :: B.ByteString -> [Int32]
= runGet getInt32le four : getInt32s rest
getInt32s bs where
= B.splitAt 4 bs (four,rest)
receiving an arbitrarily long ByteString
. It grabs the
first four
elements, while keeping the rest
. Then
it «runs» the Get
monad to extract a single 32-bit
little endian integer (Int32
), putting it as the first
element of a plain Haskell list. Then it uses explicit
recursion to continue working. Remember, laziness works
such that the Int32
can be immediately used by
whatever function is consuming it – the list will never
exist. Let me say my line: thanks to fusion.
Now consider
toS32LEs :: [Int32] -> Put
= mapM_ putInt32le toS32LEs
which is a beautiful example of why monads are so expressive.
This function is obviously written to feed from getInt32s
,
but using a «monadic map»: putInt32le
is a monadic action
that works within the Put
monad, by taking a
single Int32
and marshalling it, but only when the Put
monad is effectively performed.
Now mapM_
has a very interesting type
> :type mapM_
ghcimapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
> :type mapM_ putInt32le
ghcimapM_ putInt32le :: Foldable t => t GHC.Int.Int32 -> PutM ()
as it allows to map a monadic action, putInt32le
in this
case, over every value inside a Foldable
(and [Int32]
is
a Foldable
). The result type is a bit puzzling as it
produces an empty value ()
within the PutM
monad.
Turns out, mapM_
is actually building the sequence
of steps, one per each element of the list, and then
sequencing them one after the other ready to go, but it
does not perform them: the list of individual values
becomes a sequence of monadic actions that will parse
them when the PutM
monad is run, i.e. a «program»
for PutM
has been created, but not run yet.
It follows we need to write
asS32LEs :: B.ByteString -> B.ByteString
= runPut . toS32LEs . getInt32s asS32LEs
The argument being implicit («point free style» or
η-conversion),
getInt32s
unmarshalls from Word8
stream into a
[Int32]
fed to a lazy-generated infinitely long sequence
of putInt32le
steps that are performed by runPut
.
In comes an arbitrarily long ByteString
, its elements
unmarshalled four at a time, turned into Int32
, that
then marshalled into a ByteString
. Guess what,
lists are never built, there’s exactly one
toS32LEs
being performed at a time. That’s what
implicit recursion, laziness, and… fusion, get you.
You write at an extremely high level of abstraction,
don’t need to know how many, how to buffer… just
connect the stream.
MVP! MVP! MVP!
We can finally put together a main program that can selectively use the major or minor scale, and spit unsigned bytes or signed 32-bit little endian marshalled as unsigned bytes (they are the same but different). It looks like this
data S = Major | Minor
deriving (Show,Eq)
data F = Low | High
deriving (Show,Eq)
data Opts = Opts { scale :: S
format :: F
,
}deriving SHow
main :: IO ()
= do
main <- execParser options
opts let s = case scale opts of
Major -> major
Minor -> minor
let f = case format opts of
Low -> id
High -> asS32LEs
"/dev/urandom" >>= B.putStr . phase f s B.readFile
We «parse» some options
that will obviously come from the
command line arguments, to get an opts
value. The couple
of let
statements use the fields of the opts
value to
set s
as the desired scale, and f
as the desired
transformation function, and then just the processing
stream we’ve described before. I named my executable
mm-exe
, and run it like this
$ stack exec mm-exe -- --help
M -- random music generator
Usage: mm-exe [-m|--minor] [-h|--high]
Random music generator
Available options:
-m,--minor Minor scale
-h,--high High resolution
-h,--help Show this help text
$ stack exec mm-exe | aplay -c 2 -f U8 -r 8000
Playing raw data 'stdin' : Unsigned 8 bit, Rate 8000 Hz, Stereo
... major scale noises ...)
($ stack exec mm-exe -- --minor| aplay -c 2 -f U8 -r 8000
Playing raw data 'stdin' : Unsigned 8 bit, Rate 8000 Hz, Stereo
... minor scale noises ...)
($ stack exec mm-exe -- --minor --high | aplay -c 2 -f S32_LE -r 8000
Playing raw data 'stdin' : Signed 32 bit Little Endian, Rate 8000 Hz, Stereo
... minor scale higher quality noises ...)
($ stack exec mm-exe -- --wtf
Invalid option `--wtf'
Usage: mm-exe [-m|--minor] [-h|--high]
Random music generator
All this CLI argument functionality comes from the absolutely wonderful optparse-applicative library. Had to write this, hopefully self-explanatory, program and flag definitions
options :: ParserInfo Opts
= info (flags <**> helper)
options <> progDesc "Random music generator"
(fullDesc <> header "M -- random music generator"
)where
= Opts <$> flag Major Minor ( long "minor" <> short 'm' <> help "Minor scale")
flags <*> flag Low High ( long "high" <> short 'h' <> help "High resolution")
and make sure to execParser options
at the top of main
.
Ship it!
Play on
Ignoring extra functionality and type signatures, the basic Haskell version is about as long as the Perl one. However, there’s way better composability here. If we need more output formats, we simply write an additional marshalling part. Refactoring is easier, as there are no nested loop with interleaved side effects: stream processing is 100% pure Haskell, only needing I/O at both ends. The resulting code is a very tight loop interleaving I/O with data transformation but the compiler is going to write it for me.
There’s still one thing bugging me: this program outputs raw bytes so music
is being produced by the hardware thanks to aplay
. The stream produces
a wave corresponding to a tone but not the tones themselves. I would rather
replace phase
with something like
notes :: B.ByteString -> Music
where Music
represents, among other things, pitches and octaves. Then
have said Music
transformed in the way musicians do (transpose, reverse,
arrange in chords), and then have it play with different instruments.
That would make it quite my tempo…