Random music, abstracted

Posted on 2025-03-26 by Ernesto Hernández-Novich

My previous post showed a way to leverage Haskell’s static type system to translate the imperative low-level random music generator into a slightly higher-level form. Not only did we make every low-level value conversion explicit while adding a few niceties to our toy CLI tool, but we also showed a «data flow» approach to this kind of programming: separate I/O from transformation, try to make the transformation a stream, and let Haskell’s laziness take care of the details.

Simpler, decoupled, and safer. But there’s still room for improvement. You probably noticed a sort of rhythmic «click» in between each tone. This is not intentional; it’s a side effect of working at such a low level. The code synthesizes a partial sine wave for a given tone, but it doesn’t try to smooth out the transition from one tone to the next: the ends will usually have a ragged edge, or a discontinuity. And then there’s the issue of how «plain» and «synthetic» the sounds are.

To conclude this experiment in random music generation, let’s bring the code to an even higher level. Having our code synthesize sine waves is fine and dandy, but it is extremely low-level. Yes, we could make it sound better with lots (lots!) of signal processing techniques, but that means descending into an even lower-level ditch, requiring a ridiculous amount of work. Instead, let’s use data types to abstract the notion of a note, do all our transformations using music theory, and then use a synthesizer for proper instrument-like sounds.

My muse does all the work

Euterpea is a wonderful Haskell library for music and sound synthesis. It provides a plethora of algebraic data types to represent music and signals, as well as combinators to build and transform them at a very high level. It effectively separates the notion of building music (or sound) from that of performing (playing) it.

My previous code resorted to fancy math to generate notes, using A440 as a base. I had to fix the base frequency, worry about a long enough sinusoidal segment, and compute twelfth roots of two to figure out relative semitones. It’s good to know how to do that, but the more you know, the more you long to work at a higher level, in the same way that it’s great to know assembler(s) but better to program in Haskell.

In what follows, you can assume

import qualified Data.ByteString.Lazy as B
import qualified Euterpea.Music       as M

to help you figure out what comes from each library.

Euterpea provides a data type to represent musical pitches succinctly. The library has been designed so that translating to MIDI is as straightforward as possible. This is not a controversial issue at all. All this is to say that our base note (A440) should match the A (La) note on the fourth octave of a grand piano, so we use the Pitch data type to write

a440 :: M.Pitch
a440 = (M.A, 4)

Now, recall our notion of major and minor scale, as a sequence of semitone steps. We used them to compute relative frequencies from our base frequency. Musicians talk of transposing as «moving up» or «moving down» a number of semitones from a particular base pitch, e.g. transposing A (La) two semitones up becomes a B (Si), and transposing A (La) three semitones down becomes an F# (Fa sostenido). It’s only natural that Euterpea provides a function M.trans to compute transpositions, so the heavy math can be rewritten as

toNote :: Integral a => a -> M.Music M.Pitch
toNote i = -- wait for it
         $ M.trans (major !! (fromIntegral i `mod` 8)) a440
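
For reference, major is the interval array from the previous post. Judging by the mod 8 and the way it feeds M.trans, it holds the cumulative semitone offsets of a major scale, octave included; a sketch of such a list would be

major :: [Int]
major = [0, 2, 4, 5, 7, 9, 11, 12]  -- cumulative semitone offsets, octave included

though the actual definition lives in the previous post.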

We use i, the random number, to select a position from the major scale interval array. Then we use M.trans to transpose a440 by as many semitones as the interval requires. This takes care of selecting the proper pitch. But what about the duration of each pitch when played? The original code did some approximate math to compute a piece of sinusoid lasting long enough to be heard. In music, a note’s duration is a slightly more complex notion, as it is relative to the rhythm, the playing speed, and sometimes a musician’s mood or skill. There’s a standardized way of expressing durations: whole notes, half notes, quarter notes, eighth notes… So, if we have a pitch and we want to make it into a note, we need to specify its relative duration as a modifier. That’s why we write

toNote :: Integral a => a -> M.Music M.Pitch
toNote i = M.note M.en
         $ M.trans (major !! (fromIntegral i `mod` 8)) a440

so that out of a random integral i we build an eighth note (or quaver, if you’re bri’ish) with the desired pitch.

Notice the type of the resulting value. M.Music is a polymorphic ADT to represent music constructions. So far we have written a function that constructs a piece of music comprised of a single note (a pitch with a duration). If we map this function over an infinite list of random numbers, we produce an infinite list of notes. In music, that constitutes a melody, which musicians tend to call a line. Thanks to Euterpea we can write, unsurprisingly

randomMelody :: B.ByteString -> M.Music M.Pitch
randomMelody = M.line . map toNote . B.unpack

That is, given a potentially infinite ByteString, unpack it into a [Word8], use it to generate a list of single-note music constructions, and then combine them into a single melody. You can test this from the REPL, using take to look at a sample of what’s generated

ghci> import qualified Data.ByteString.Lazy as B
ghci> import qualified Euterpea.Music as M
ghci> r <- B.readFile "/dev/urandom" 
ghci> M.line $ take 5 $ map toNote $ B.unpack r
Prim (Note (1 % 8) (E,5)) :+: (Prim (Note (1 % 8) (Gs,5)) :+: (Prim (Note (1 % 8) (A,5)) :+: (Prim (Note (1 % 8) (A,5)) :+: (Prim (Note (1 % 8) (D,5)) :+: Prim (Rest (0 % 1))))))

There are music Primitives, Notes in this case, combined in sequence thanks to the line-building operator :+:. Note durations are fixed at 1 % 8, beautifully expressed thanks to Haskell’s Rational data type which, yes, allows exact math to be performed on durations. Finally, there are exactly five notes, with a zero-duration Rest (silence) tacked on at the end – this is a «marker» used by Euterpea that becomes a no-operation the moment you «perform» the music construction.
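
For reference, Euterpea’s Music type is, roughly, the following algebraic data type, which accounts for every constructor in the value printed above:

data Music a
  = Prim (Primitive a)          -- a single Note or Rest
  | Music a :+: Music a         -- sequential composition
  | Music a :=: Music a         -- parallel composition (more on this below)
  | Modify Control (Music a)    -- annotations: tempo, instrument, ...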

Our original code performed (as in «sounded») reasonably well. The 8-bit sound is funny, and comes as a result of us using poorly sampled sinusoids. But this new code produces something very close to a music score, in that, well, there’s a bunch of eighth notes one after the other. However, if you’ve ever looked at a music score, each line begins with a squiggly sigil, sometimes accompanied by hashes and weird little «b»s: those specify the key (base note) and scale, which are hardcoded in our note generation. But then there’s usually an ominous fraction to be found: it specifies the rhythm by stating how many notes of a given duration fit into every beat. All of our notes’ durations are hardcoded as 1/8, and I want eight of them on every beat. Euterpea has a somewhat weird way of expressing rhythm that I prefer not to explain. Suffice it to say that the Music type allows us to apply a particular tempo to any given piece of Music, so we start with

simple :: B.ByteString -> M.Music M.Pitch
simple = M.tempo M.wn . randomMelody

to say that each beat should take a whole-note duration. This would complete a music score with key, scale, rhythm, and notes. We only need a musician and an instrument to perform it, right?

As mentioned before, Euterpea was designed to follow the MIDI standard. This includes the ability to modify any music construction to be played with any instrument as named by the MIDI standard, i.e.

simple :: B.ByteString -> M.Music M.Pitch
simple = M.instrument M.RhodesPiano . M.tempo M.wn . randomMelody

All we need now is a MIDI sequencer to play this piece of music. We can do it from the REPL like so

ghci> B.readFile "/dev/urandom" >>= Euterpea.IO.MIDI.Play.playDev 2 . simple
(music plays until you interrupt it)

Since MIDI files must be finite in size, I used a combination of take and Euterpea’s writeMidi function to create a smaller file with a sample run.
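
Here’s a minimal sketch of how that could look, assuming Euterpea’s writeMidi; the 64-note cut and the file name are mine, purely for illustration:

import Euterpea (writeMidi)

saveSample :: IO ()
saveSample = do
  bytes <- B.readFile "/dev/urandom"
  -- keep a finite prefix so the resulting MIDI file is finite
  let piece = M.instrument M.RhodesPiano
            . M.tempo M.wn
            . M.line . map toNote . take 64 . B.unpack
            $ bytes
  writeMidi "sample.mid" piece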

But wait, there’s more!

We went from random integers grabbed from Linux’s entropy source, to either an infinite melody played live or a fully functional MIDI file. We used a proper software synthesizer (fluidsynth), but you could send the output to your MIDI keyboard and it would work just as well.

With a half a dozen lines of Haskell.

Now, I am not going to explain all the things Euterpea can do in terms of rhythm, instrumentation, dynamics, or musical concepts. Rather, I would like to emphasize how extremely practical this approach is to manipulating music at the highest possible level of abstraction, with a single data type handling music construction (pitch, duration), music transformation (transposition), music combination, and musical properties (tempo, instrumentation). The library effectively separates building the score from performing the score, in the same way Haskell forces you to separate pure data transformation from I/O.

The Music type and associated combinators are normal Haskell values, thus allowing you to use all of the programming language’s power to manipulate music constructions (laziness included), leaving the rendering to the end. Let’s close with a deceptively simple, but I hope eye-opening, example of what’s possible with a little creativity.

We’ve seen how line actually uses the :+: operator to create melodies: one note after the other. Music also has the concept of harmony: several notes played at the same time. Euterpea provides the :=: operator to combine two or more Music elements so they can be played simultaneously. This means you can create a simple chord by combining three notes, say

c_major :: M.Music M.Pitch
c_major = M.note M.qn (M.C,4)
          M.:=:
          M.note M.qn (M.E,4)
          M.:=:
          M.note M.qn (M.G,4)

that you can then play with whatever instrument you like. But you can also combine entire melodies.

We’ve also seen that Euterpea expresses duration in terms of note fractions (whole, half, quarter), and that these apply to rests (silences) as well. We can do math on durations, as their type has a Num instance. Not only that, Euterpea comes with many utility functions to handle commonly used musical duration notions.
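
Since a duration is just a Rational under the hood, the arithmetic is exact and easy to check from the REPL:

ghci> M.qn + M.en
3 % 8
ghci> 2 * M.wn
2 % 1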

A canon is a compositional technique where a chosen base melody (the dux) is played by one instrument, and then one or more imitations (the comes) are added, played by other instruments, each coming in after a certain delay (measures in silence). If we wanted to create an infinite three-voice canon, the dux playing a Rhodes piano, with comes by a glockenspiel and a synthetic voice coming in after two and four measures respectively, we would write

canon :: B.ByteString -> M.Music M.Pitch
canon bs = M.instrument M.RhodesPiano baseMelody
           M.:=:
           M.instrument M.Glockenspiel (M.offset (2*M.wn) baseMelody)
           M.:=:
           M.instrument M.SynthVoice (M.offset (4*M.wn) baseMelody)
  where baseMelody = randomMelody bs

and then

ghci> B.readFile "/dev/urandom" >>= Euterpea.IO.MIDI.Play.playDev 2 . canon

with the conductor’s «ad nauseam» note being implicit…

You can listen to the major and minor jam demos as saved to disk by Euterpea.

Abstraction leads to flexibility

If you pay attention to the internal structure of the MIDI files, you’ll see that Euterpea builds them as proper separate tracks, each holding a set of events, along with all the annotations that allow both playing them and loading them into a sequencer for further manipulation.

That means you could use Euterpea to algorithmically build fragments of abstract, high-level music constructions that render to properly built MIDI files, and then use them as sources to combine with live recordings or low-level samples in your computer-assisted music composition. And yes, you can read MIDI files into Euterpea, turning them into data types you can transform at will.
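
As a sketch of that last round trip, assuming Euterpea’s fromMidi, transpose, and writeMidi, together with importFile from HCodecs’ Codec.Midi (the transposition interval and file paths are purely illustrative):

import Codec.Midi (importFile)
import Euterpea   (fromMidi, transpose, writeMidi)

-- Read a MIDI file, lift it into the Music type, transpose it up a
-- perfect fourth (5 semitones), and write the result back to disk.
upAFourth :: FilePath -> FilePath -> IO ()
upAFourth src dst = do
  parsed <- importFile src
  case parsed of
    Left err   -> putStrLn ("Could not parse MIDI: " ++ err)
    Right midi -> writeMidi dst (transpose 5 (fromMidi midi))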

A (sane) programmer would rather use high-level programming language constructs to build complex functionality, instead of combining fragments of assembly language. I believe there’s a lot more value in the automatic manipulation of music at the higher level of abstraction of notes, chords, lines, and algebraic transformations, than in using raw sound samples and applying costly low-level transformations.

This does not mean there isn’t value in processing waveform-based sound samples using higher abstractions. As a matter of fact, a part of Euterpea affords this usage, in combination with music manipulation. However, I believe a data-flow approach to music and sound is more effective and expressive than the raw-fragment, copy&paste, low-level-filtering approach most DAW users prefer. In a way, DAWs are quite imperative and procedural, focusing on step-by-step, menial work over almost raw material, instead of being data-flow transformations over high-level abstractions. Sprinkled with the «but this is intuitive» fallacy. Yes, you can do amazing things with them, but they lack the additional flexibility one gains by treating the language of music as the algebraic entity it is, instead of focusing on the final rendered value.

I believe it’s the same difficulty people have understanding why I can produce extremely complex yet visually pleasing documents, diagrams, and presentations without using a WYSIWYG word processor. When you’re guided by the form, you have a hard time understanding the underlying structure, and, given time, manipulating basic and aggregate parts becomes increasingly difficult until you lose the will to do it.

A vector image is more flexible than a PNG. A WAV file is more flexible than an MP3. The flexibility of an algebraic data type that you can round-trip through MIDI files from a high-level programming language is hard to argue against…