The Psychology of Music and the 'tuneR' Package

Introduction

This semester I’m TA’ing a course on the Psychology of Music taught by Phil Johnson-Laird. It’s been a great course to teach because (i) so much of the material is new to me and (ii) the study of the psychology of music brings together so many of the intellectual tools I enjoy, including music theory, psychophysics and Fourier analysis.

One topic this semester that was completely new to me was the theory of tuning: I had known about the invention of the well-tempered system of tuning, but had never heard of Pythagorean tuning or just tuning – and certainly was not aware that the well-tempered system Bach celebrated was not identical to our current equal-tempered system of tuning.

As a way of consolidating some of the knowledge I’ve gained, I decided I’d write a blog entry after several months of neglecting this blog. (For that neglect, I’ll blame a combination of grant writing, book writing, ongoing research projects and personal life developments.) In what follows, I’ll give a brief overview of the theory of tuning at a level that should be accessible to anyone who’s familiar with the names of intervals and feels comfortable thinking quantitatively.

After surveying the field, I’ll turn to a discussion of some code I’ve written in R that implements these ideas using the ‘tuneR’ package, which is one of my favorite hidden gems from CRAN. Along the way, I’ll introduce some of the simplest tools from the ‘tuneR’ package that can be used for generating computer music.

Tuning Systems: Pythagorean, Just and 12-TET

It’s worth noting right at the start that tuning is a misleading name for the topic we’ll be discussing: we’re not talking about how one tunes a fixed instrument so that it sounds in tune, but rather we’re interested in how one defines the very notes that the instrument should be able to produce when it’s perfectly in tune.

To make that clear, let’s assume that we’ve accepted as a given that a frequency of 440 Hz will be called A. Our problem then becomes one of deciding which of the infinitely many frequencies we could produce deserve the labels A#, B, C, C#, and so on.

Pythagorean Tuning

The simplest solution to this problem I know of is the Pythagorean tuning system. It’s based on constructing all of the possible notes using a series of perfect fifths. If you remember the Circle of Fifths, you’ll remember that you can reach every chromatic note by ascending fifths: if you start at A, you’ll proceed through E, B, F# and so on.

The Pythagorean system implements the Circle of Fifths directly using repeated multiplication of a base frequency. To do this, you first declare that a perfect fifth is at a frequency 3/2 times your base frequency. For example, this definition implies that the perfect fifth above the A at 440 Hz has to be at a frequency of 3/2 * 440 = 660 Hz. Once you do this, you’ve defined the frequency we’ll call E.

Following the same logic, the fifth above E gives you a B at 3/2 * 660 = 990 Hz. Of course, this B lies more than an octave above the base A at 440 Hz, so you transpose it down an octave to produce the B you’ll actually use. To do this, you need to assume that an octave is at a frequency 2 times the base frequency. Since we’ve accepted that 990 Hz is a B, we divide 990 by 2 and conclude that 495 Hz should be B.

With these three notes defined, we have the following table of frequency/note pairs:

Note   Frequency (Hz)   Ratio to 440 Hz
A      440              1
E      660              3/2
B      495              9/8

If we continue with this logic, repeatedly multiplying or dividing by 3/2 and then multiplying or dividing by 2 to bring each result back into the octave between 440 Hz and 880 Hz, we eventually produce a complete table for all of the notes in the chromatic scale:

Note   Frequency (Hz)   Ratio
A      440              1
A#     463.5391         256/243
B      495              9/8
C      521.4815         32/27
C#     556.875          81/64
D      586.6667         4/3
D#     626.4844         729/512
E      660              3/2
F      695.3086         128/81
F#     742.5            27/16
G      782.2222         16/9
G#     835.3125         243/128
A      880              2
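The whole table can be rebuilt in a few lines of base R. This is only a minimal sketch, not the code that produced the table: the helper name fold_into_octave is made up for illustration, and the choice of five descending and six ascending fifths is an assumption I picked because it reproduces the ratios listed above.

```r
# Fold a frequency ratio into the octave [1, 2) by repeated doubling or halving.
# (fold_into_octave is a hypothetical helper name, not part of any package.)
fold_into_octave <- function(ratio) {
    while (ratio >= 2) ratio <- ratio / 2
    while (ratio < 1) ratio <- ratio * 2
    ratio
}

base <- 440

# Steps along the circle of fifths: negative values descend, positive ascend.
# Assumption: 5 descending and 6 ascending fifths, which matches the table.
steps <- -5:6
note_names <- c('A#', 'F', 'C', 'G', 'D', 'A', 'E', 'B', 'F#', 'C#', 'G#', 'D#')

# Each note is a power of 3/2, shifted by octaves until it lands in [1, 2).
ratios <- sapply(steps, function(k) fold_into_octave((3 / 2)^k))

pythagorean <- data.frame(
    note = note_names,
    frequency = round(base * ratios, 4),
    ratio = ratios
)

# Sort by pitch so the rows run up the chromatic scale from A.
pythagorean <- pythagorean[order(pythagorean$ratio), ]
print(pythagorean, row.names = FALSE)
```

The final A at 880 Hz is left out on purpose: it comes from doubling the base frequency, not from stacking fifths.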

One thing about this table might strike you as odd if you’re mathematically savvy: the octave, which we’ve defined by fiat as a ratio of 2:1, could never have been produced by successive multiplication by 3/2, since every power of 3 is odd and so no power of 3/2 can equal a power of 2. This is the one flub in the Pythagorean system: you can’t really produce the entire chromatic scale, octave included, using only multiples of 3/2. Here we’ve solved that problem by replacing the note we would have called A with a true octave generated using multiplication by 2. The note you reach by stacking twelve fifths is slightly sharper than the A you reach by stacking seven octaves, and you may hear people refer to this discrepancy as the Pythagorean comma.
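To put a number on that discrepancy, you can compare twelve stacked fifths against seven stacked octaves, which is the nearest pile of pure octaves. The variable names below are just for illustration.

```r
# Twelve perfect fifths stacked on top of one another...
twelve_fifths <- (3 / 2)^12
# ...compared with seven octaves.
seven_octaves <- 2^7

# The ratio between the two is the Pythagorean comma, a little over 1.
comma <- twelve_fifths / seven_octaves
print(comma)

# The same ratio as an exact fraction: 3^12 / 2^19.
print(c(numerator = 3^12, denominator = 2^19))

# How far, in Hz, an A built purely from fifths would overshoot 440 Hz
# once it is transposed back down by seven octaves.
print(440 * comma - 440)
```

The overshoot works out to roughly 6 Hz at 440 Hz, which is small but audible when the two notes are played together.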

Just Tuning

Given that we had to cheat a bit to create a proper octave using the Pythagorean tuning system based on multiples of 3/2, it makes sense to ask why we shouldn’t just allow ourselves to use other multipliers than 3/2. Looking at the Pythagorean tuning table, we see some pretty ugly fractions like 729/512. What if we forced these fractions to be simpler by employing ratios like 4/3 and 5/4 to build up the whole system?

The result of allowing ourselves several fractions beyond just those derived from 3/2 is called the just tuning system. Here we assume that perfect fifths occur at a frequency ratio of 3/2 and that perfect fourths occur at a frequency ratio of 4/3. Continuing on with this process, we eventually end up with the following tuning table:

Note   Frequency (Hz)   Ratio
A      440              1
A#     469.3333         16/15
B      495              9/8
C      528              6/5
C#     550              5/4
D      586.6667         4/3
D#     625.7778         64/45
E      660              3/2
F      704              8/5
F#     733.3333         5/3
G      782.2222         16/9
G#     825              15/8
A      880              2
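Since the just ratios are simply chosen rather than generated by one rule, the easiest way to reproduce the table is to write the ratios down and multiply. A minimal sketch, with the ratios copied from the table above and variable names of my own choosing:

```r
base <- 440

# Ratios copied from the just tuning table, in chromatic order from A.
just_ratios <- c(
    'A' = 1, 'A#' = 16 / 15, 'B' = 9 / 8, 'C' = 6 / 5,
    'C#' = 5 / 4, 'D' = 4 / 3, 'D#' = 64 / 45, 'E' = 3 / 2,
    'F' = 8 / 5, 'F#' = 5 / 3, 'G' = 16 / 9, 'G#' = 15 / 8
)

# Multiply each ratio by the base frequency to get the note's frequency.
just_frequencies <- round(base * just_ratios, 4)
print(just_frequencies)

# A perfect fourth stacked on a perfect fifth lands on the octave,
# because 3/2 * 4/3 = 2.
print((3 / 2) * (4 / 3))
```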

This is the tuning that early Classical music was written in. Looking at the table you can immediately appreciate the theoretical assertion that the relative dissonance of an interval is determined by the simplicity of the ratio of frequencies between the two notes: perfect fifths are 3/2 and major thirds are 5/4, while minor seconds are 16/15 and major sevenths are 15/8. This is one of the things I most enjoy about the theory of harmony: there’s a match between the aesthetics of fractions and the aesthetics of sounds that, for me, helps to justify my sense that certain fractions are more beautiful than others.

12-TET / Equal Temperament

Now, if you know the history of Bach’s Well-Tempered Clavier, you know that there is a problem with the just tuning system: it sounds great in the key you used as the base (here A), but it sounds a bit out of tune in other keys. The modern 12-TET system is the most recent approach to solving this problem: you assume the ratio between any two notes a semitone apart (e.g. A to A# or A# to B) is always exactly the same. Since you’ll repeat this multiplication 12 times before reaching an octave, which is a ratio of 2, you can conclude that two notes a semitone apart must differ by a factor of the 12th root of 2. Building a tuning system using that ratio alone gives us our modern system of tuning, which is shown in the table below using the decimal expansion of the ratios instead of their representation as powers of the 12th root of 2:

Note   Frequency (Hz)   Ratio
A      440              1.000000
A#     466.1638         1.059463
B      493.8833         1.122462
C      523.2511         1.189207
C#     554.3653         1.259921
D      587.3295         1.334840
D#     622.2540         1.414214
E      659.2551         1.498307
F      698.4565         1.587401
F#     739.9888         1.681793
G      783.9909         1.781797
G#     830.6094         1.887749
A      880              2.000000
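Because every step is the same multiplier, this table takes almost no code to generate. Here is a minimal sketch in base R; the variable names are mine.

```r
base <- 440
note_names <- c('A', 'A#', 'B', 'C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A')

# n semitones above the base is a ratio of 2^(n / 12),
# i.e. the 12th root of 2 multiplied by itself n times.
ratios <- 2^((0:12) / 12)

equal_tempered <- data.frame(
    note = note_names,
    frequency = round(base * ratios, 4),
    ratio = round(ratios, 6)
)
print(equal_tempered, row.names = FALSE)

# The equal-tempered fifth (7 semitones) falls just short of the pure 3/2.
print(c(equal = 2^(7 / 12), pure = 3 / 2))
```

The last line shows the compromise at the heart of the system: no interval other than the octave is a simple fraction any more, but every key is out of tune by exactly the same small amount.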

Listening to the Results

We’ve just described three ways to define the notes used in Western music. But how different do they sound? To answer that, I decided to produce a series of simple sine wave audio samples that were tuned using each of the three tuning systems. To produce those audio samples, I used the ‘tuneR’ package, which I’ll describe now. Before you read on, you should install it from CRAN using the standard install.packages('tuneR') invocation.

A tuneR Tutorial

The tuneR package is an extremely convenient tool for generating audio files from R based on a numeric description of the audio stream. For the purposes of this discussion of tuning systems, we simply need to produce basic sine waves. Thankfully, that’s very easy to do with tuneR. Here’s an example:

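library('tuneR')

# One second of a 440 Hz sine wave at 16-bit resolution.
sound <- sine(440, bit = 16)

# Save the result as a WAV file in the current directory.
writeWave(sound, '440.wav')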

Here we’ve loaded the tuneR package, created a 1-second snippet of sine wave audio at 16-bit resolution using the sine function, and then written out the audio to a WAV file using writeWave. If you look at your current directory and listen to this file, you’ll hear a sine wave at 440 Hz.

If you want to explore the use of sine, you can easily play with the duration of the sound by changing the duration parameter. If you want to, you can also change the sample rate and the bit depth, but I don’t see any reason to do that while exploring ideas about tuning.
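As a quick illustration of the duration parameter, here is a small sketch. It relies on my understanding of sine's arguments: duration is counted in samples unless you pass xunit = 'time', and the default sample rate is 44100 samples per second. It's worth confirming both against ?sine for your installed version of tuneR.

```r
library('tuneR')

# Duration given in samples: 22050 samples at the (assumed) default
# sample rate of 44100 is half a second of audio.
half_second <- sine(440, duration = 22050, bit = 16)

# With xunit = 'time', duration is given in seconds instead.
two_seconds <- sine(440, duration = 2, xunit = 'time', bit = 16)

writeWave(two_seconds, '440_two_seconds.wav')
```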

More important is knowing that you can superimpose two sine waves using the `+` operator and that you can concatenate them using the bind function. To show off producing octaves, for example, you might use the following code to hear an A at 440 Hz, then an A an octave above it, and finally the harmony they produce together:

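library('tuneR')

# Three one-second segments played back to back.
sound <- bind(
    sine(440, bit = 16),
    sine(880, bit = 16),
    # Both notes at once. The sum exceeds the 16-bit range,
    # so this script fails with an error.
    sine(440, bit = 16) + sine(880, bit = 16)
)

writeWave(sound, 'octaves.wav')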

Unfortunately, this sample code produces an error because of the naive addition we’ve implemented using the `+` operator. Adding two full-amplitude sine waves directly together produces samples that overflow the 16-bit range we’re using. To safely perform addition of two sine waves, we need to rescale the results of our summation using the normalize function. This gives us just one more line of code:

library('tuneR')

sound <- bind(
    sine(440, bit = 16),
    sine(880, bit = 16),
    sine(440, bit = 16) + sine(880, bit = 16)
)

# Rescale the samples so they fit in the 16-bit range again.
sound <- normalize(sound, unit = '16')

writeWave(sound, 'octaves.wav')

For reasons that are not clear to me, you have to specify the bit depth to normalize using the unit parameter rather than the bit parameter.
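You can see why the naive sum fails without touching tuneR at all. The sketch below builds the two sine waves by hand in base R and then rescales them. It only illustrates the idea of rescaling to the loudest sample: it is not how normalize is implemented, and normalize has options (such as centering) that this ignores.

```r
samp_rate <- 44100
max_16bit <- 32767

# One second of sample times, in seconds.
time_s <- (0:(samp_rate - 1)) / samp_rate

# Two full-scale 16-bit sine waves, an octave apart.
low <- max_16bit * sin(2 * pi * 440 * time_s)
high <- max_16bit * sin(2 * pi * 880 * time_s)
mixed <- low + high

# Is the loudest sample of the sum beyond what 16 bits can hold?
print(max(abs(mixed)) > max_16bit)

# Rescale so the loudest sample sits exactly at the 16-bit limit.
rescaled <- mixed / max(abs(mixed)) * max_16bit
print(max(abs(rescaled)) <= max_16bit)
```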

Demoing Tuning Systems

Our little octave demo is cute, but we really want to know what more interesting harmonies like major thirds and minor seconds sound like in the various tuning systems we described. To do that, I first wrote a function called interval that spits out the multiplier you need to use to produce a given interval for any of the three tuning systems. That function is in a GitHub repository I’ve set up with code for making these demos. If you download that repository, you could load my interval function using a simple call to source like the one seen below. And using this interval function, we can generate demos of various intervals as follows:

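library('tuneR')
source('interval.R')

# The base note, A at 440 Hz.
base <- 440

# Play the base note together with the note a Pythagorean minor second above it.
sound <- sine(base) +
    sine(interval('minor-second', tuning = 'pythagorean') * base)

# Rescale to the 16-bit range before writing the file.
sound <- normalize(sound, unit = '16')

writeWave(sound, 'minor_second_pythagorean.wav')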
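If you don’t have the repository handy, a lookup like the following would do the same job. To be clear, this is a hypothetical stand-in and not the interval function from the repository: the name interval_sketch, the interval names other than 'minor-second', and the tuning labels other than 'pythagorean' are all assumptions. The ratios are taken from the three tables above.

```r
# Hypothetical stand-in for the interval function described above.
# Returns the frequency multiplier for a named interval in a given tuning.
interval_sketch <- function(name, tuning = c('pythagorean', 'just', 'equal')) {
    tuning <- match.arg(tuning)

    # Interval names in order of size, from 0 to 12 semitones (assumed names).
    names_in_order <- c(
        'unison', 'minor-second', 'major-second', 'minor-third',
        'major-third', 'perfect-fourth', 'tritone', 'perfect-fifth',
        'minor-sixth', 'major-sixth', 'minor-seventh', 'major-seventh',
        'octave'
    )
    semitones <- match(name, names_in_order) - 1
    if (is.na(semitones)) stop('Unknown interval: ', name)

    # One ratio per semitone step, copied from the tables in this post.
    ratios <- switch(
        tuning,
        pythagorean = c(
            1, 256 / 243, 9 / 8, 32 / 27, 81 / 64, 4 / 3, 729 / 512,
            3 / 2, 128 / 81, 27 / 16, 16 / 9, 243 / 128, 2
        ),
        just = c(
            1, 16 / 15, 9 / 8, 6 / 5, 5 / 4, 4 / 3, 64 / 45,
            3 / 2, 8 / 5, 5 / 3, 16 / 9, 15 / 8, 2
        ),
        equal = 2^((0:12) / 12)
    )
    ratios[semitones + 1]
}

# Frequencies of a few intervals above A at 440 Hz.
print(interval_sketch('minor-second', tuning = 'pythagorean') * 440)
print(interval_sketch('major-third', tuning = 'just') * 440)
print(interval_sketch('perfect-fifth', tuning = 'equal') * 440)
```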

On GitHub there’s a file called test_intervals.R that will go through and generate all of the intervals in all three tuning systems. If you run that file, you’ll generate a lot of audio files you can listen to as demos of the three tuning systems we’ve described. For me, these tuning systems all produce intervals that sound surprisingly similar, though at high volumes I find it moderately easy to hear slight differences between the tuning systems. That said, I very much doubt I would pick up on them in a normal musical context.

That’s the end of my little introduction to tuning systems and the use of the tuneR package to explore them. If you’re interested in thinking computationally about music, I highly recommend playing around with tuneR until you feel like you can produce interesting results. I’m already working on trying to build up some interesting timbres to work with.