Not so random music

Posted on 2025-02-27 by Ernesto Hernández-Novich

I’m interested in music from the formal languages point of view. One of my first experiments was to try and create «random music» in the true sense of the word random. That is, generate absolutely random notes and listen to what happens. It was extremely easy to program, but the results were not pleasant at all.

Western world music has evolved in such a way that our ears rarely ever appreciate music unless it has some internal structure. Eastern world music is a bit more open-minded. But still, if you sit down with a 12-sided die and just play a note depending on what you roll, it will obviously sound chaotic, but also ugly and annoying.

I still think there’s value in generating random music, though. It can show whether or not you understand how unadulterated binary data, note selection, tuning, and digital encoding work. At least, that was the purpose of this exercise I came up with a couple of decades ago, when I was teaching these subjects.

The following experiment assumes a GNU/Linux system. The code is written in Perl; the lines are not in the order they appear in the program, and there are a few blanks you’ll need to fill in. Have fun.

Randomness as a device

The Linux kernel provides a couple of device drivers producing pure unadulterated random data based on environmental entropy.

$ ls -l /dev/*random
crw-rw-rw- 1 root root 1, 8 Feb 27 12:22 /dev/random
crw-rw-rw- 1 root root 1, 9 Feb 27 12:22 /dev/urandom

That is, if you read from those devices you get an arbitrarily long stream of random bytes, one at a time (they are character devices). A firehose of bits that came to be thanks to all the goings on with the network, keyboard, mouse, and the universe. I will use /dev/urandom because it is a pseudorandom number generator seeded with entropy gathered from the environment. This means there will always be data there and reads will never block.

So, we start by

open(my $r,"<:raw","/dev/urandom");
while (read($r,$b,1)) {

to open the random generator in «raw» mode: give me all the bytes, don’t do any Perl-magic interpretation. That stream of bytes will be read one at a time using read(FILEHANDLE,BUFFER,LENGTH) so that $b will hold the next byte read. I’m not going for efficiency, but practicality.
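The two lines above leave out error handling. A minimal sketch of the same read loop with the checks added could look like this; read_random_bytes is a hypothetical helper name of mine and is not part of the original script.

```perl
use strict;
use warnings;

# Read $count raw bytes from /dev/urandom, one at a time, and return them.
# read_random_bytes is a hypothetical name used only for this sketch.
sub read_random_bytes {
    my ($count) = @_;
    open(my $r, '<:raw', '/dev/urandom')
        or die "Cannot open /dev/urandom: $!";
    my @bytes;
    while (@bytes < $count) {
        my $got = read($r, my $byte, 1);
        die "Read failed: $!" unless defined $got;
        last if $got == 0;    # end of file; not expected on this device
        push @bytes, $byte;
    }
    close($r);
    return @bytes;
}

my @sample = read_random_bytes(8);
printf "%d bytes: %s\n", scalar(@sample),
    join(' ', map { sprintf '%02x', ord($_) } @sample);
```

Each element of the returned list is a one-character string holding a single raw byte, just like $b in the loop above.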

This is a raw byte that I can interpret any way I want. I choose to interpret it as a signed character so it will range from -128 to 127 – check your CI3641 notes. Perl is extremely powerful in terms of translating from or to raw data, and for this particular problem

    my $h = unpack('c',$b);

is the incantation for $h to become a signed character. That’s all the operating systems, programming languages, and machine organization knowledge you will need.
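To see what the choice of template does, here is a small sketch that decodes the same raw bytes both ways. In Perl's pack/unpack templates, lowercase c is a signed char and uppercase C is an unsigned char; decode_both is a hypothetical helper name used only here.

```perl
use strict;
use warnings;

# Decode one raw byte as a signed char ('c') and as an unsigned char ('C').
# decode_both is a hypothetical name used only for this sketch.
sub decode_both {
    my ($byte) = @_;
    return (unpack('c', $byte), unpack('C', $byte));
}

for my $byte ("\x00", "\x7f", "\x80", "\xff") {
    my ($signed, $unsigned) = decode_both($byte);
    printf "0x%02x  signed %4d  unsigned %3d\n", ord($byte), $signed, $unsigned;
}
```

The byte 0x80 comes out as -128 when signed and 128 when unsigned, and 0xff as -1 versus 255.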

The vibe and the Fourier

Sounds are vibrations that propagate as acoustic waves. We want to create a sound wave out of these random numbers, and then use the sound card to… make noise. Sound waves can be thought of as increasing (positive) and decreasing (negative) pressure values, which hopefully clarifies why I chose the numbers to be signed.

A combination of musical tones can be built out of this random stream as long as we select the particular frequencies western ears are used to. That is, we need to «tune» (as in tuning a musical instrument) these numbers so they match the traditional music notes (or pitches). Since I am making single notes without a particular timbre or modulation, the math is straightforward.

If you’ve ever enjoyed a live orchestra concert, you’ve certainly noticed that after every musician has sat down and settled, a violin player ceremoniously comes out. They then play a single particular note, and the rest of the orchestra fine-tunes their instruments to that tone. They become in tune to the A440 («concert A» or «La de concierto») and they are ready to go. This is a very intense moment, especially if you know the violin player and can’t help but shed a tear, even after fifteen years of listening to him do his thing…

This A440 has (spoiler alert!) a frequency of 440 Hz. A vibration having double that frequency (880 Hz) would have the same tone but with a higher pitch – a full octave. The western world has agreed to subdivide a full octave into twelve steps according to the equal temperament tuning system, using a logarithmic scale based on the twelfth root of 2. Therefore, the frequency for the n-th semitone of an interval starting at A440 can be computed by

440 × 2^(n/12)

counting from n = 0, because that’s how civilized people count.
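As a quick check of that formula, this sketch prints the thirteen frequencies from A440 up to the A one octave above. semitone_freq is a hypothetical name of mine, not something from the script.

```perl
use strict;
use warnings;

# Frequency in Hz of the n-th semitone above A440 in equal temperament.
# semitone_freq is a hypothetical name used only for this sketch.
sub semitone_freq {
    my ($n) = @_;
    return 440 * 2 ** ($n / 12);
}

printf "n = %2d  %7.2f Hz\n", $_, semitone_freq($_) for 0 .. 12;
```

The first line shows 440.00 Hz and the last one 880.00 Hz, exactly one octave apart.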

Building a sinusoidal wave out of these random numbers means the sin() function is involved, and it typically works on radians. The range of sin() is [-1,1] so scaling will also be needed, to increase the sinusoid’s amplitude (pump up the volume!). Let’s start with something like

    my $n = int( $v * sin( $a4 * 2 ** ( $h / 12 ) ));

where

  • $v is a constant factor for amplitude (volume) arbitrarily set at 100 elsewhere in the script.

  • $a4 is a constant factor of 1382 set elsewhere in the script. This is roughly 440π (440 × 3.1416 ≈ 1382.3), so the value is turned to radians for sin() to work properly.

  • The whole operation is truncated with int() because the hardware wants integers. That’s for later.

Now, the above line as it is would still generate annoying noise because $h is a random number that would jump up and down, without the patterns that our western ears are used to.

Most of the «agreeable» western music is organized using either major or minor scales. That is, you don’t play random notes: you select the semitone progression from the base note, and «stick to it». A cursory review of those two kinds of scale (or you being a musician) should make clear why I have lines like these in the script

my @ma = qw(0 2 4 5 7 9 11 12);
my @mi = qw(0 2 3 5 7 8 10 12);
my @s  = @ma;
my $p  = 8;

so that the default scale (@s) uses the major (@ma) scale progression, having exactly eight steps ($p). When I feel sad or pensive, I switch to the minor scale. These additions lead to improving our computation to

    my $n = int( $v * sin( $a4 * 2 ** ( $s[ $h % $p ] / 12 ) ));

using modular arithmetic so our random number $h falls into the [0..7] range, and then picking the particular position from the default scale. This will select the proper exponent to which to raise 2, so that only tempered tones are generated: a Perl integer corresponding to a pure sinusoid tuned to the corresponding pitch within the scale.
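The indexing works even when $h is negative, because Perl's % operator returns a non-negative result whenever the right operand is positive (as long as use integer is not in effect). A small sketch, with scale_step as a hypothetical helper name:

```perl
use strict;
use warnings;

my @ma = qw(0 2 4 5 7 9 11 12);   # major scale, in semitones from the base note
my $p  = 8;                       # number of steps in the scale

# Semitone offset selected by a (possibly negative) random value $h.
# scale_step is a hypothetical name used only for this sketch.
sub scale_step {
    my ($h) = @_;
    return $ma[ $h % $p ];
}

printf "h = %4d  index %d  semitone %2d\n", $_, $_ % $p, scale_step($_)
    for (-128, -3, 0, 5, 127);
```

For instance, -3 lands on index 5 (nine semitones up) and 127 lands on index 7 (a full octave up).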

The next step is to turn that Perl integer into a single machine byte. If you’re wondering why I will treat those bytes as unsigned later on, it has to do with the hardware I have, you’ll see.

In order to convert the Perl integer into a machine byte, we can write

    my $w = pack('c',$n);
    print "$w";

to spit out bytes. The 'c' template writes a signed char, so a negative $n is stored in two’s complement; the sound card will later read those same eight bits as an unsigned value. It is very possible that a badly chosen $v produces an $n too large to fit in a byte («too loud»), in which case Perl keeps only the low eight bits and the value wraps around instead of clipping – I’m not trying to deal with this, I’m just playing.
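To make the byte encoding concrete, this sketch packs a few in-range sample values with pack('c') and prints each resulting byte in octal, the same notation od -b uses. sample_octal is a hypothetical name of mine.

```perl
use strict;
use warnings;

# Octal rendering of the single byte that pack('c') writes for sample $n.
# sample_octal is a hypothetical name used only for this sketch.
sub sample_octal {
    my ($n) = @_;
    return sprintf '%03o', ord(pack('c', $n));
}

printf "%4d -> %s\n", $_, sample_octal($_) for (0, 17, 99, -16, -99);
```

Positive values keep their plain octal form (99 becomes 143), while negative ones show up in two's complement (-16 becomes 360, -99 becomes 235).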

But we are missing another aspect of music (and parametric simulations such as these): time. The above computation uses a random number to produce a particular note within the interval, but we want them to play for a certain amount of time. Given that a pure note is a sinusoid, consider this perlseudocode.

for (my $t = 0.0; $t < 1; $t += 0.0001 ) {
    my $n = sin( $t * 3.14 );
}

This would compute a pure sinusoid. We need to interleave the construction of a sinusoid with changing the phase at each time step, to get the tone we want. That means changing the code to look like this

  for (my $t = 0.0; $t < 1; $t += 0.0001) {
    ...
    my $n = int( $v * sin( $a4 * 2 ** ( $s[ $h % $p ] / 12 ) * $t ));
    ...
  }
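Without giving the whole script away, here is a sketch of that loop for one fixed note. It assumes the $v and $a4 values quoted earlier; note_samples and $count are hypothetical names, and the integer counter is my own variation to avoid the rounding drift of adding 0.0001 over and over.

```perl
use strict;
use warnings;

my $v  = 100;     # amplitude, as stated in the text
my $a4 = 1382;    # roughly 440 * pi, as stated in the text

# Integer samples for one note, $semitone semitones above the base note.
# note_samples and $count are hypothetical names used only for this sketch.
sub note_samples {
    my ($semitone, $count) = @_;
    my $w = $a4 * 2 ** ($semitone / 12);
    # $_ * 0.0001 plays the role of $t in the loop above.
    return map { int($v * sin($w * $_ * 0.0001)) } 0 .. $count - 1;
}

print join(' ', note_samples(4, 10)), "\n";
```

For the major third (four semitones up) the first ten samples climb from 0 to 99, the rising quarter of one sine period.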

You should be able to figure out how to combine all the pieces and put them into a single Perl script. I will not try to code the part that interacts with the sound card, because it doesn’t add any value and delays the fun we’re after.

Drop them beats

After putting all the code in m8.pl, we can produce all the random numbers we need running something like

$ perl m8.pl | od -b | head -5
0000000 000 021 042 061 100 114 126 135 142 143 142 136 126 114 100 062
0000020 042 022 000 360 337 317 301 265 252 243 236 235 236 242 251 263
0000040 277 315 335 356 377 017 040 060 077 113 125 135 142 143 142 136
0000060 127 115 101 063 044 023 002 361 340 321 302 265 253 243 236 235
0000100 236 242 251 262 276 314 334 354 376 016 037 057 075 112 124 134

Those look like random bytes printed in octal, but in reality they have been carefully computed to match the many sinusoids corresponding to the randomly chosen pitches within the same octave, as a sequence of unsigned 8-bit values.

ALSA is a combination of Linux kernel drivers and CLI utilities to operate sound hardware. The hardware in this particular machine supports playing raw unsigned 8-bit values, so making noise is as simple as running

$ perl m8.pl | aplay -c 2 -f U8 -r 12000

The stream of bytes produced by m8.pl will be fed to aplay. Each byte will be considered an Unsigned 8-bit sample, and they will be used at a sampling rate of 12 kHz. Because of -c 2, aplay reads the stream as interleaved two-channel frames, so consecutive bytes alternate between the left and right channels (a «fake stereo»). And this will work until you interrupt it with Ctrl-C.

You can improve the pipeline by adding something in the middle that only reads a certain amount of bytes and copies them over, such as dd.

You can obviously capture the output of the script into a file, and then feed it to aplay. Everything is a file for the Unix-enlightened. If you change the sampling rate, you can listen to the same random melody with different pitches.

$ perl m8.pl > sample.raw
(... wait as long as you want and hit Ctrl-C ...)
$ ls -l sample.raw
-rw-r--r-- 1 emhn emhn 3293184 Feb 27 17:26 sample.raw
$ aplay -c 2 -f U8 -r 12000 sample.raw 
Playing raw data 'sample.raw' : Unsigned 8 bit, Rate 12000 Hz, Stereo
$ aplay -c 2 -f U8 -r 8000 sample.raw 
Playing raw data 'sample.raw' : Unsigned 8 bit, Rate 8000 Hz, Stereo
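The effect of the sampling rate on duration is simple arithmetic, sketched below for the 3293184-byte file in the listing. It assumes aplay consumes one byte per channel per frame for -c 2 -f U8; playback_seconds is a hypothetical helper name.

```perl
use strict;
use warnings;

# Playback time in seconds for raw PCM data.
# playback_seconds is a hypothetical name used only for this sketch.
sub playback_seconds {
    my ($bytes, $rate, $channels, $bytes_per_sample) = @_;
    return $bytes / ($rate * $channels * $bytes_per_sample);
}

printf "%5d Hz: %.1f s\n", $_, playback_seconds(3293184, $_, 2, 1)
    for (12000, 8000);
```

The same file lasts about 137.2 seconds at 12000 Hz and about 205.8 seconds at 8000 Hz, which is why the slower rate also sounds lower.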

I’ve converted the RAW samples to MP3 files, so you can hear what I heard coming out of my headphones.

The conversion can be done without having to touch the mouse or click a button, by simply using sox and lame. Did I mention you can add sound effects using sox, as in echo, reverbs, as well as do mixing? Go read some manual pages to Get Good.

Random yet reasonably pleasing 8-bit music.

Generating better sounds

The above script generates 8-bit music because, well, we’re using one byte at a time. This gives us 256 different possible values to represent the sinusoid’s amplitude modelling the frequency. Those are very few possible values for such a function, and it accounts for the raspy sound. We say there’s poor quantization because our sample size (8 bits) lacks enough detail to accurately represent the curve.
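To put a number on «poor quantization», this sketch measures the worst rounding error when one sine period is stored with a given sample size. It assumes the full signed range is used for each size, which is not what the script does with $v = 100; max_quantization_error is a hypothetical name.

```perl
use strict;
use warnings;

# Largest error when one sine period is stored as $bits-bit signed samples.
# max_quantization_error is a hypothetical name used only for this sketch.
sub max_quantization_error {
    my ($bits) = @_;
    my $scale = 2 ** ($bits - 1) - 1;    # 127 for 8 bits, 32767 for 16 bits
    my $worst = 0;
    for my $i (0 .. 999) {
        my $exact  = sin(2 * 3.14159265358979 * $i / 1000);
        my $stored = int($exact * $scale) / $scale;    # truncate, like int() in the script
        my $error  = abs($exact - $stored);
        $worst = $error if $error > $worst;
    }
    return $worst;
}

printf "%2d bits: worst error %.10f\n", $_, max_quantization_error($_)
    for (8, 16, 32);
```

The error is bounded by one quantization step, so it shrinks by a factor of roughly 256 for every extra byte in the sample.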

We can increase the sample size to anything our hardware supports. I know for a fact my sound hardware supports signed 32-bit little-endian samples. So, if we grab four bytes at a time, and turn them into a 32-bit signed little-endian machine integer, each sample is going to be extremely high quality.

This requires two changes to the script that are so obnoxious to refactor, that I decided to write another script. We need to group by fours, and every time we have four samples pack them into the desired binary form. You should be able to figure out where these changes fit

  my @w;
  my $i = 0;
  ...
    push @w,$n;
    unless (++$i % 4) {
      my $w = pack('N*',@w);
      print "$w";
      @w = ();
    }

and now

$ perl m32.pl | aplay -c 2 -f S32_LE -r 12000
Playing raw data 'stdin' : Signed 32 bit Little Endian, Rate 12000 Hz, Stereo
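The packing step above uses the 'N' template, which Perl documents as an unsigned 32-bit value in big-endian («network») order. For comparison only, here is a sketch of an explicitly signed little-endian sample using the 'l<' template (Perl 5.10 or later). It assumes the amplitude is scaled to the full 32-bit range, which is my assumption and not what m32.pl does; pack_s32le and $amplitude are hypothetical names.

```perl
use strict;
use warnings;

# Assumption for this sketch: use the full signed 32-bit range.
my $amplitude = 2 ** 31 - 1;

# One signed 32-bit little-endian sample from a sine value in [-1, 1].
# pack_s32le is a hypothetical name used only for this sketch.
sub pack_s32le {
    my ($x) = @_;
    return pack('l<', int($amplitude * $x));
}

for my $x (0, 0.5, 1, -1) {
    my $hex = join(' ', map { sprintf '%02x', ord($_) } split //, pack_s32le($x));
    printf "%5.2f -> %s\n", $x, $hex;
}
```

The least significant byte comes first, so a value of 1 is written as ff ff ff 7f and a value of -1 as 01 00 00 80.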

Much better

Now, raw samples, waveform files, and MP3 files are «final versions», in the same way an executable is to your source code. Wouldn’t it be nice to capture these raw samples and turn them into higher-level data we can manipulate as algebraic things, instead of having to rely on the physics and math of it? I certainly think so, and will have something to say about it Real Soon Now®.