Not so random music flow

Posted on 2025-03-06 by Ernesto Hernández-Novich
Tags: ,

My previous post described a simple, yet sound example (pun intended), relating the use of entropy to generate a reasonable pleasant melody.

I complained about the perceived rigidness in terms of composition (pun twice as intended). Those are the shackles of straightforward imperative code. I posit a data-flow or stream-based mindset, supported by an expressive language with already available abstractions, leads to better composability.

I also mentioned in passing that I would rather have an algebraic representation for music, be it randomnly generated or not. This post will not explore that yet, but will hopefully set an understandable playing ground (thrice the pun!) on top of which to build.

Let’s rewrite while exploiting Haskell’s static type system to ensure explicit low-level value conversion, and a few niceties to make our toy CLI tool amenable.

A Haskell redo

String is the Poor Man’s datatype. – me, mocking the unityped and the barely typed, since 2005

Programmers who rely on strings to model things are doomed to trip, unless they are extremely disciplined and in good shape. They rarely are either. The Perl prototype I wrote is string based, because the language provides no cheap practical alternative. That’s why I had to open() the file with a particular flag, and go from Perl-string to Perl-character, and then to machine integer, back and forth. And interleaving I/O with transformations being careful of said intertwining. Pain and suffering.

We can do much better.

Haskell provides many «string-like» datatypes. We use [String] to teach, Text for human-readable things, and ByteString for binary data. Since we are going to read an arbitrarily long stream of bytes from /dev/urandom, we will use lazy ByteStrings: they are implemented in a way that us programmers don’t need to worry about buffering, instead thinking of «the contents» as a very long string of Word8 (unsigned bytes) that can be manipulated with an API that behaves just like String.

Even though they «look like lists» and their API resembles list manipulation, they are extremely performant: the compiler will analyze the way functions are composed, and transform the code into tight assembler loops, thanks to code transformations that are possible only over pure code. I’ve talked about it for regular lists, it’s called fusion, and makes ByteString related stuff extremely performant.

This re-write will also make it easier to change the music scale, having a single implementation that can do both 8-bit and 32-bit quantization. I’ll use additional function parameters to «pass down» this information: there is a cleaner equivalent solution we will explore in a future post.

Rewriting the tone generator

We will be processing a lazy ByteString, each element being a Word8 value. Using one Word8 value, choose the particular note from a scale, compute its base amplitude in relation to A440, and then amplify it. The thing is, Haskell has a static type system, Word8 and Double (needed for sin) don’t mix, forcing us to write explicit conversions.

type Scale = [Word8]

major :: Scale
major = [0,2,4,5,7,9,11,12]

aSampleU8 :: Scale -> Word8 -> Double -> Word8
aSampleU8 s w8 t =
  round $ volume * sin ( a4
                       * 2.0 ** ( fromIntegral n / 12.0 )
                       * t
                       )
  where
    a4 :: Double
    a4 = 440 * pi
    n :: Word8
    n = s !! (fromIntegral w8 `mod` length s)

I’ve chosen to model Scale as a simple list of Word8, with exactly the same intentions as my previous post. You can guess there’s another value minor :: Scale with the corresponding interval steps.

Note how aSampleU8’s first argument s is a Scale. That way, we can call it with major, minor, or whatever other weird scale you come up with, and the computation will still be the same. A common idiom in functional programming, particularly helpful when the language supports currying, is to place less general arguments first, so they can be provided partially, to complete the function application at a later place with «hardcoded» context, so to speak.

The second argument is the Word8 value read from the entropy source. Since we need to use the same seed value w8 to generate multiple points of the sinusoid, it becomes less general than the time, the third argument.

The function implementation is basically a re-write of the Perl version, but using explicit conversions. I’ve written the literal Double values so they’re easy to identify. We need two explicit conversions here, and it’s a great opportunity to learn about

fromIntegral :: (Integral a, Num b) => a -> b

able to take any Integral (Int, Word8, Integer,…) and do an explicit conversion (not a coercion, how uneducated) into any destination type that is Num (any numeric type). The particular fromIntegral implementation to use will be inferred by the target type, because it is mandatory for every Num type to have

fromInteger :: Integer -> a

and Integer is Integral. If the source type is not Integer, it has to be Integral which mandates

toInteger :: a -> Integer

The compiler is able to insert the optimized composition fromInteger . toInteger, possibly fusing it at the assembly level (read as, the right bit-shuffling).

That’s why this line

    n = s !! (fromIntegral w8 `mod` length s)

works nicely. We have length producing an Int, which forces

ghci> :type mod
mod :: Integral a => a -> a -> a
ghci> :type mod @Int
mod @Int :: Integral Int => Int -> Int -> Int

We can’t use w8 :: Word8 directly as mod’s first argument, but then

ghci> :type fromIntegral @Word8 @Int
fromIntegral @Word8 @Int
  :: (Integral Word8, Num Int) => Word8 -> Int

The resulting Int matches the type for the second argument of (!!), needed to get the proper Word8 out of the list. Not to worry, this will turn into extremely efficient machine code, but the conversions are explicit and proven to be right by the type system and not my (over)confidence. It’s impossible for a type conversion bug to happen here, and I don’t need to write tests to feel warm and fuzzy about it.

Figuring out why and how fromIntegral works within the argument for sin is left as an exercise for the reader. Also how round makes our Double into a Word8.

Collect ALL the points!

The Perl code resorted to nested loops so that foreach random sample, a chunk of the sinusoid was generated. Haskell only has recursion. But we rather use implicit recursion, because they are folds that, in turn, become very efficient machine code.

So, given a particular Scale and one Word8 sample, we can generate a collection of Word8 samples like this

import qualified Data.ByteString.Lazy as B

toSamplesU8 :: Scale -> Word8 -> B.ByteString
toSamplesU8 s w8 = B.pack $ map (aSampleU8 s w8) [0, 0.0001 .. 1]

Implicit iteration provided by map and, as anticipated, currying aSampleU8 by fixing both the scale and single sample. Now, the result is a plain [Word8] that requires packing to become a proper ByteString.

Now, picture the arbitrarily long stream of Word8 coming from entropy. We can take one of those, and turn it into a new sub-stream of Word8 – the points of the sinusoid. We want to repeatedly do this over the stream, but create a combined stream: we want to concatenate the resulting sub-streams produced by each sample-mapping generation. This is a frequent operation on regular lists as well as BytesString streams you must learn to identify it:

phase :: (B.ByteString -> B.ByteString)
      -> Scale
      -> B.ByteString -> B.ByteString
phase t s r = t $ B.concatMap (toSamplesU8 s) r

Given the scale s and the arbitrarily long entropy stream r, concatMap will apply the curried toSamplesU8 s to each element of the stream, and combine all the resulting partial streams into a single output stream. We don’t have to worry about buffering, we don’t have to worry about «how many», we don’t have to worry (in this case) about memory leaks, because fusion will take care of making this a very efficient loop.

Now, there’s a mistery t argument that, if you read the type signature, turns out to be a transformation function that takes a ByteString and produces a new one. We’ll use that to make our program more flexible.

Let’s step back for a second and contemplate how we can use what we have so far. Having phase we effectively abstracted the problem of generating the sound phases given a Scale and the stream of entropy. It should come as no surprise that there’s a way to read a file and turn it into a lazy ByteString, and a way to print a lazy ByteString. In this context, lazy means the library will take care of buffering and «reading on demand», as well as interleaving the reading with the printing. «Wait, are you saying the [Word8] is a lie and there’s never a list in memory?»… precisely.

So, we can actually have our string of random stuff with a simple

B.readFile "/dev/urandom" >>= B.putStr . phase id major

because the stream generated by B.readFile is fed to phase id major (the major scale and the identity function), to then get printed. It doesn’t get much cleaner than that, thanks to the IO Monad and currying.

Since my sound device accepts unsigned 8-bit integers, I can make this into an executable and it works in the same way the 8-bit Perl script did: except faster and using constant memory.

I find your lack of fusion disturbing

I wrote two Perl scripts, one for unsigned 8-bit output, another for signed 32-bit little endian integers. The first parameter for phase was placed there so that, thanks to currying, arbitrary conversions can be placed while the stream is being produced. For unsigned 8-bit nothing needs converting, so we put id, and the compiler is smart enough to «do nothing».

So, how do we go about converting a ByteString in such a way that we take four Word8s at a time, interpret them as a 32-bit little endian integers, and then tuck them back as Word8 so they become a new ByteString?

If you ever need to grab bytes from the wire (or a disk) and analyze their structure (network package, audio file, cryptographic material) and them punt them elsewhere, you are unmarshalling and marshalling. Haskell’s ecosystem provides a brutally performant solution to this problem for ByteString, thanks to the binary library.

The library provides two pure monads: Get to express generic multi-step unmarshalling, and Put to express generic multi-step marshalling. In our case, we’ll use Get to express «turn a ByteString into a list of 32-bit unsigned integers», and Put to express «turn a list of 32-bit unsigned integers into a ByteString», and them sequence their effects together. I’m going to say it again, for the haters in the back: thanks to fusion this will become a machine language tight loop.

Consider

    getInt32s :: B.ByteString -> [Int32]
    getInt32s bs = runGet getInt32le four : getInt32s rest
      where
        (four,rest) = B.splitAt 4 bs

receiving an arbitrarily long ByteString. It grabs the first four elements, while keeping the rest. Then it «runs» the Get monad to extract a single 32-bit little endian integer (Int32), putting it as the first element of a plain Haskell list. Then it uses explicit recursion to continue working. Remember, laziness works such that the Int32 can be immediately used by whatever function is consuming it – the list will never exist. Let me say my line: thanks to fusion.

Now consider

    toS32LEs :: [Int32] -> Put
    toS32LEs = mapM_ putInt32le

which is a beautiful example of why monads are so expressive. This function is obviously written to feed from getInt32s, but using a «monadic map»: putInt32le is a monadic action that works within the Put monad, by taking a single Int32 and marshalling it, but only when the Put monad is effectively performed.

Now mapM_ has a very interesting type

ghci> :type mapM_
mapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
ghci> :type mapM_ putInt32le
mapM_ putInt32le :: Foldable t => t GHC.Int.Int32 -> PutM ()

as it allows to map a monadic action, putInt32le in this case, over every value inside a Foldable (and [Int32] is a Foldable). The result type is a bit puzzling as it produces an empty value () within the PutM monad. Turns out, mapM_ is actually building the sequence of steps, one per each element of the list, and then sequencing them one after the other ready to go, but it does not perform them: the list of individual values becomes a sequence of monadic actions that will parse them when the PutM monad is run, i.e. a «program» for PutM has been created, but not run yet.

It follows we need to write

asS32LEs :: B.ByteString -> B.ByteString
asS32LEs = runPut . toS32LEs . getInt32s

The argument being implicit («point free style» or η-conversion), getInt32s unmarshalls from Word8 stream into a [Int32] fed to a lazy-generated infinitely long sequence of putInt32le steps that are performed by runPut. In comes an arbitrarily long ByteString, its elements unmarshalled four at a time, turned into Int32, that then marshalled into a ByteString. Guess what, lists are never built, there’s exactly one toS32LEs being performed at a time. That’s what implicit recursion, laziness, and… fusion, get you. You write at an extremely high level of abstraction, don’t need to know how many, how to buffer… just connect the stream.

MVP! MVP! MVP!

We can finally put together a main program that can selectively use the major or minor scale, and spit unsigned bytes or signed 32-bit little endian marshalled as unsigned bytes (they are the same but different). It looks like this

data S = Major | Minor
       deriving (Show,Eq)

data F = Low | High
       deriving (Show,Eq)

data Opts = Opts { scale  :: S
                 , format :: F
                 }
          deriving SHow

main :: IO ()
main = do
  opts <- execParser options
  let s = case scale opts of
            Major -> major
            Minor -> minor
  let f = case format opts of
            Low   -> id
            High  -> asS32LEs
  B.readFile "/dev/urandom" >>= B.putStr . phase f s

We «parse» some options that will obviously come from the command line arguments, to get an opts value. The couple of let statements use the fields of the opts value to set s as the desired scale, and f as the desired transformation function, and then just the processing stream we’ve described before. I named my executable mm-exe, and run it like this

$ stack exec mm-exe -- --help
M -- random music generator

Usage: mm-exe [-m|--minor] [-h|--high]

  Random music generator

  Available options:
    -m,--minor               Minor scale
    -h,--high                High resolution
    -h,--help                Show this help text
$ stack exec mm-exe | aplay -c 2 -f U8 -r 8000
Playing raw data 'stdin' : Unsigned 8 bit, Rate 8000 Hz, Stereo
(... major scale noises ...)
$ stack exec mm-exe -- --minor| aplay -c 2 -f U8 -r 8000
Playing raw data 'stdin' : Unsigned 8 bit, Rate 8000 Hz, Stereo
(... minor scale noises ...)
$ stack exec mm-exe -- --minor --high | aplay -c 2 -f S32_LE -r 8000
Playing raw data 'stdin' : Signed 32 bit Little Endian, Rate 8000 Hz, Stereo
(... minor scale higher quality noises ...)
$ stack exec mm-exe -- --wtf
Invalid option `--wtf'

Usage: mm-exe [-m|--minor] [-h|--high]

  Random music generator

All this CLI argument functionality comes from the absolutely wonderful optparse-applicative library. Had to write this, hopefully self-explanatory, program and flag definitions

options :: ParserInfo Opts
options = info (flags <**> helper)
               (fullDesc <> progDesc "Random music generator"
                         <> header "M -- random music generator"
               )
  where
    flags = Opts <$> flag Major Minor ( long "minor" <> short 'm' <> help "Minor scale")
                 <*> flag Low   High  ( long "high"  <> short 'h' <> help "High resolution")

and make sure to execParser options at the top of main.

Ship it!

Play on

Ignoring extra functionality and type signatures, the basic Haskell version is about as long as the Perl one. However, there’s way better composability here. If we need more output formats, we simply write an additional marshalling part. Refactoring is easier, as there are no nested loop with interleaved side effects: stream processing is 100% pure Haskell, only needing I/O at both ends. The resulting code is a very tight loop interleaving I/O with data transformation but the compiler is going to write it for me.

There’s still one thing bugging me: this program outputs raw bytes so music is being produced by the hardware thanks to aplay. The stream produces a wave corresponding to a tone but not the tones themselves. I would rather replace phase with something like

notes :: B.ByteString -> Music

where Music represents, among other things, pitches and octaves. Then have said Music transformed in the way musicians do (transpose, reverse, arrange in chords), and then have it play with different instruments.

That would make it quite my tempo…