
The Geometry of Pitch Class Sets

Strange Spaces in Music Theory

Matthew Ward

08 Dec 2019

Last time I talked about how the Yoneda Lemma allows us to think about non-traditional spaces. Today we’ll look at a practical application of this to music theory.

Here’s the motivational problem:

Can we construct a geometric space that has chords as its points and also encodes useful music theory in it somehow?

I know that’s vague, but I’ll become more precise as we go.

Pitch Class Sets

What I’m going to describe here is usually encountered in a class on post-tonal music theory. The book Introduction to Post-Tonal Theory by Joseph N. Straus is excellent and where I originally learned it.

The term “post-tonal” can be misleading here because, for the most part, it is an extremely useful mathematical way of thinking about music theory that doesn’t particularly have to do with atonal music or 12-tone serialism.

The Western 12-tone scale is essentially formed by taking an octave and dividing it up into 12 parts.

Since moving by an octave (12 semitones up or down) gives the same note, we can think of things more clearly by labeling C with 0, C# with 1, D with 2, and so on up to B with 11.

When we get back to C we “wrap around” and call it 0 again. Mathematically, this is just modulo 12 arithmetic.

A great way to visualize this is to draw a 12-sided figure with all sides and angles equal (a regular dodecagon) and label its vertices 0 through 11.

Now if we take a C major chord: 0, 4, 7, then transposing it to a major chord 3 semitones up just amounts to adding 3 to every number to get 3, 7, 10.

In fact, given any set of notes [x₁, …, xₖ], we have the operation of transposition by n:

Tₙ([x₁, …, xₖ]) = [x₁ + n, …, xₖ + n] mod 12

where mod 12 means we add by wrapping around and consider 12=0, 13=1, 14=2, etc. (because they’re the same notes!).
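Here is a minimal Python sketch of transposition, assuming a pitch set is stored as a list of integers from 0 to 11. The function name `transpose` is just an illustrative choice.

```python
def transpose(pitch_set, n):
    """Apply T_n: add n to every pitch class, wrapping around mod 12."""
    return sorted((x + n) % 12 for x in pitch_set)

c_major = [0, 4, 7]
print(transpose(c_major, 3))   # [3, 7, 10], the major chord 3 semitones up
print(transpose(c_major, 7))   # [2, 7, 11], since 14 wraps around to 2
```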

We can also do something called an inversion. The term in music theory means inverting the intervals.

Mathematically, this means negating every single number and then figuring out what this number is mod 12. So the inversion of the C major chord: [0, 4, 7] is [0, -4, -7]=[0, 8, 5] or if we really are considering “chords” then the order doesn’t matter so it is [0,5,8].

But this is just an F minor chord!
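The same computation can be sketched in a couple of lines of Python. The name `invert` is an illustrative choice, and the sorting reflects the fact that order doesn't matter for chords.

```python
def invert(pitch_set):
    """Apply I: negate every pitch class and reduce mod 12."""
    return sorted((-x) % 12 for x in pitch_set)

print(invert([0, 4, 7]))   # [0, 5, 8], i.e. C, F, A-flat: an F minor chord
```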

We call this operation I for “inversion.” It can be visualized as a reflection of the dodecagon across the line through the vertices 0 and 6, which are the only two notes that I leaves in place.

Applying Tₙ for all choices of n to [0,4,7] gives you all 12 major chords, and if you apply both I and Tₙ then you’ll get all 12 minor chords, too.
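A quick way to convince yourself of this is to enumerate the chords. The sketch below stores chords as `frozenset`s so they can be collected into sets; the helper names are illustrative.

```python
def transpose(pitch_set, n):
    """T_n on an unordered pitch set."""
    return frozenset((x + n) % 12 for x in pitch_set)

def invert(pitch_set):
    """I on an unordered pitch set."""
    return frozenset((-x) % 12 for x in pitch_set)

c_major = frozenset({0, 4, 7})
majors = {transpose(c_major, n) for n in range(12)}
minors = {transpose(invert(c_major), n) for n in range(12)}

print(len(majors), len(minors), len(majors | minors))   # 12 12 24
```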

The operations of transposition and inversion generate something called a group. In fact, visualizing with a regular 12-gon immediately tells us that the TI group is what mathematicians call D_12, the dihedral group of symmetries of the dodecagon. It has 24 elements.
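If you'd rather check the count of 24 by machine, here is a rough sketch that treats T₁ and I as permutations of the numbers 0 to 11 and closes them up under composition. The representation (a tuple listing where each number is sent) is an assumption made for this example.

```python
def compose(f, g):
    """Return the permutation 'apply g first, then f'."""
    return tuple(f[g[x]] for x in range(12))

identity = tuple(range(12))
T1 = tuple((x + 1) % 12 for x in range(12))   # transpose up one semitone
I = tuple((-x) % 12 for x in range(12))       # inversion

# Repeatedly multiply by the generators until nothing new appears.
group = {identity}
frontier = [identity]
while frontier:
    g = frontier.pop()
    for gen in (T1, I):
        h = compose(gen, g)
        if h not in group:
            group.add(h)
            frontier.append(h)

print(len(group))   # 24

# The defining dihedral relation: I T_1 I undoes T_1.
T11 = tuple((x - 1) % 12 for x in range(12))
print(compose(I, compose(T1, I)) == T11)   # True
```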

We call an unordered collection of distinct numbers from 0 to 11 a pitch set, and we get that D_12 acts on the set of pitch sets. Let’s examine this in more depth before moving on.

Group Actions on Pitch Sets

We just saw that the orbit of [0,4,7] under this action consists of the collection of all major and minor triads (three-note chords).

This should already be an indication we’re on the right track. Major and minor triads are the foundation of music in the West.

Note that none of the triads are sent to themselves. In other words, given a non-trivial symmetry/combination of transpositions and inversions, we will always get a distinct new triad.

Mathematicians might say this in a fancy way: the set of major and minor triads is a torsor under the TI-action.

It turns out this is a “generic” phenomenon. If you choose a pitch set at random, you are likely (the probability is greater than 50%) to have chosen one that has this property of getting 24 distinct new chords by translating and inverting.
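With only 4096 possible pitch sets, the claim can be checked by brute force. The sketch below counts every subset of the 12 notes (including the empty set and the full chromatic scale, which is a choice made here for simplicity) and asks how many have a full orbit of 24.

```python
from itertools import combinations

# All 24 TI operations, written as functions on a single pitch class.
OPS = []
for n in range(12):
    OPS.append(lambda x, n=n: (x + n) % 12)   # T_n
    OPS.append(lambda x, n=n: (n - x) % 12)   # I followed by T_n

def orbit_size(pitch_set):
    """Number of distinct pitch sets reachable by transposing and inverting."""
    s = frozenset(pitch_set)
    return len({frozenset(op(x) for x in s) for op in OPS})

all_sets = [c for k in range(13) for c in combinations(range(12), k)]
free = sum(1 for s in all_sets if orbit_size(s) == 24)

print(free, len(all_sets), free / len(all_sets))   # 3048 4096 0.744140625
```

So roughly 74% of pitch sets have no TI-symmetry at all.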

We could say that it has the property of having no TI-symmetry. It’s worth thinking about this for a moment. Take [0,6] and invert it to [0, -6]. This is the same as [0,6] mod 12. Inverting does not give us a new pitch set.

So the reason [0,4,7] never had this happen was that it didn’t have some sort of symmetry like that.

We’ll call a k-chord (read: an unordered chord with k notes in it) TI-symmetric if there is some non-trivial combination of transpositions and inversions that sends the chord to itself.

Now, even though these are rarer, it turns out that for any choice of k, there is always a k-chord with this property. These exist for rather silly reasons. For example, the chromatic cluster [0,1,2, … , k-1] is always an example of such a chord (exercise: why?).
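If you want a numerical hint for the exercise (spoiler warning), the sketch below checks one particular candidate symmetry for every k from 1 to 12: invert, then transpose by k-1. The helper name is illustrative.

```python
def invert_then_transpose(pitch_set, n):
    """Apply I, then T_n: every x is sent to n - x mod 12."""
    return frozenset((n - x) % 12 for x in pitch_set)

results = []
for k in range(1, 13):
    cluster = frozenset(range(k))          # the chord [0, 1, ..., k-1]
    results.append(invert_then_transpose(cluster, k - 1) == cluster)

print(all(results))   # True: x -> (k-1) - x just reverses the cluster
```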

For less trivial examples you could take the whole-tone scale [0,2,4,6,8,10]. If you translate by 2, then you certainly get the same thing back again.

Inversion also fixes this 6-chord.

This tells us that there are only 2 distinct whole-tone scales in total: the orbit of [0,2,4,6,8,10] under transposition and inversion has just 2 elements (if you want overkill, the stabilizer generated by T_2 and I has 12 elements, so the Orbit-Stabilizer Theorem gives an orbit of size 24/12 = 2).
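The orbit-stabilizer bookkeeping is easy to sketch in Python. Here each group element is encoded as a pair `(sign, n)` standing for x ↦ sign·x + n mod 12; that encoding is an assumption of this example, not standard notation.

```python
# (1, n) is T_n; (-1, n) is I followed by T_n.
OPS = [(sign, n) for sign in (1, -1) for n in range(12)]

def act(op, pitch_set):
    sign, n = op
    return frozenset((sign * x + n) % 12 for x in pitch_set)

whole_tone = frozenset({0, 2, 4, 6, 8, 10})

stabilizer = [op for op in OPS if act(op, whole_tone) == whole_tone]
orbit = {act(op, whole_tone) for op in OPS}

print(len(stabilizer), len(orbit))            # 12 2
print(len(stabilizer) * len(orbit) == 24)     # True (Orbit-Stabilizer)
```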

Here is an interesting question from pure music theory, and to my knowledge, it’s still open (although I suspect it’s fairly easy to answer).

None of this was specific to dividing up an octave into 12 notes. Suppose you invent a tonal system with n notes instead. Then you’d have an action of D_n on the k-chords.

Is there a simple closed-form formula for the number of k-chords that are TI-symmetric? More importantly, for a given n, which k gives the greatest number of k-chords with TI-symmetry?
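This is not a closed form, but for small n the question can at least be explored by brute force. The sketch below is one way to do it; the function names are hypothetical and the approach is exponential in n, so it is only meant for experimentation.

```python
from itertools import combinations

def is_ti_symmetric(chord, n):
    """True if some non-trivial element of D_n sends the chord to itself."""
    s = frozenset(chord)
    for t in range(n):
        # Non-trivial transposition T_t (t = 0 is the identity, so skip it).
        if t != 0 and frozenset((x + t) % n for x in s) == s:
            return True
        # Inversion followed by T_t.
        if frozenset((t - x) % n for x in s) == s:
            return True
    return False

def count_ti_symmetric(n, k):
    """Number of TI-symmetric k-chords in an n-note system."""
    return sum(is_ti_symmetric(c, n) for c in combinations(range(n), k))

for k in range(13):
    print(k, count_ti_symmetric(12, k))
```

Note that this count includes the “silly” chromatic examples; filtering them out would need an extra condition.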

I should point out that if you rule out the “silly examples” of TI-symmetry given by a strictly chromatic scale, then there is actually utility in figuring this out. TI-symmetry has played a great role in the history of composition.

For example, the augmented triad, the French augmented sixth chord, the diminished seventh, the famous chord from Stravinsky’s Petrushka, the hexatonic scale, the whole tone scale, and the octatonic scale are all examples with TI-symmetry.
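These examples can be checked with a short script. The pitch-class spellings below are my own choices of representative (for instance, the Petrushka chord is taken to be C major plus F# major), so treat them as assumptions.

```python
# (1, n) is T_n; (-1, n) is I followed by T_n.
OPS = [(sign, n) for sign in (1, -1) for n in range(12)]

def stabilizer_size(pitch_set):
    """Number of TI operations that send the pitch set to itself."""
    s = frozenset(pitch_set)
    return sum(
        1 for sign, n in OPS
        if frozenset((sign * x + n) % 12 for x in s) == s
    )

examples = {
    "augmented triad": [0, 4, 8],
    "French sixth": [0, 2, 6, 8],
    "diminished seventh": [0, 3, 6, 9],
    "Petrushka chord": [0, 1, 4, 6, 7, 10],
    "hexatonic scale": [0, 1, 4, 5, 8, 9],
    "whole tone scale": [0, 2, 4, 6, 8, 10],
    "octatonic scale": [0, 1, 3, 4, 6, 7, 9, 10],
}

# A size greater than 1 means the chord is TI-symmetric.
for name, chord in examples.items():
    print(name, stabilizer_size(chord))
```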

So, I think this is more than just a novelty problem.

The Geometry of Pitch Class Sets

I’ll sketch the idea now of forming our space classifying pitch class sets.

I won’t rely on any details from my Yoneda Lemma article. You can just trust that there is some notion of a “generalized space” used to classify objects that retains a lot of information on how these objects are related.

Pitch Class Sets

Recall that a pitch set (or chord) is just a collection of notes converted to numbers: 0 is C, 1 is C#, 2 is D, etc. A given collection of pitches can be expressed in a more useful notation when there isn’t a key we’re working in.

For example, a C major chord is (047).

You may be confused about why I’ve switched to (047) from [0,4,7]. I want a different notation for the whole “class set” (sometimes called an equivalence class) of chords you get by performing translations and inversions to [0,4,7].

A pitch class set is then saying that there are collections of chords we want to consider to be the same.

There are a few music theoretic reasons for this. For one, our choice of 0 is completely arbitrary. We could have made 0 correspond to A, and we should get the same music theory. This amounts to identifying all pitch sets that are the same after translation.

We also want to identify sets that are the same after inversion. In the previous post on this topic, I showed that if we label the vertices of a dodecagon, this amounts to a reflection symmetry.

The reflections together with the translations generate the dihedral group, so we are secretly letting it act on the set of all tuples of numbers 0 to 11, where each number only appears once and without loss of generality we can assume they are in increasing order.

Thus a pitch class set is just an equivalence class of a chord under this group action. It is not the direction I want this post to go, but given such a class, there is always a unique representative that is usually called the “prime form” (basically the most “compact” representative starting with 0).
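For the curious, here is a rough sketch of one common convention for the prime form (usually attributed to Rahn): among all transpositions to 0 of the chord and its inversion, pick the one that is most tightly packed, comparing from the last note backward. Forte's convention differs for a few classes, and the function below assumes a non-empty chord.

```python
def prime_form(pitch_set):
    """Most compact representative starting on 0 (Rahn-style comparison)."""
    s = sorted({x % 12 for x in pitch_set})
    inverted = sorted((-x) % 12 for x in s)
    candidates = []
    for form in (s, inverted):
        for start in form:
            # Transpose so that `start` becomes 0, then list in ascending order.
            candidates.append(tuple(sorted((x - start) % 12 for x in form)))
    # Smallest outer interval first, then the next one in, and so on.
    return min(candidates, key=lambda c: c[::-1])

print(prime_form([0, 4, 7]))    # (0, 3, 7)
print(prime_form([2, 7, 11]))   # (0, 3, 7): G major lands in the same class
print(prime_form([0, 5, 8]))    # (0, 3, 7): so does F minor
```

Under this convention the class of major and minor triads is labeled by (0, 3, 7) rather than by the C major chord itself.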

Check out Straus’s book for more information on that. It is the standard way to talk about post-tonal theory.

The Geometric Space

The set of all “chords” should have some sort of useful topology on it. For example, [0,1,2,3] should be related to [0,1,2,4], because they are the same chord except for one note.

I don’t think doing something obvious like defining a distance based on the coordinates works. If you try to construct the lattice of open sets by hand based on your intuition, a definition might become more obvious. In any case, the topology isn’t important here.

Call this space of chords X.

Now we have a space with a group action on it. One might want to merely form the quotient space X/G.

The quotient map X → X/G will be 24 to 1 at most points, but it will also forget which chords were fixed by elements of the group. Part of the “theory” in music theory is to remember that information.

This is why I propose making the quotient stack [X/G]. This is one of those fancy spaces I wrote about last time. It is the moduli space of pitch class sets. It seems like an overly complicated thing to do, but here’s what you gain.

You now have a “space” whose points are the pitch class sets. If that class contains 24 distinct chords, then the point is an “honest” point with no extra information.

The fiber of the quotient map contains the 24 chords, and you get to each of them by acting by the elements of D_12 (i.e., it is a torsor under the TI group action).

Now consider something like the pitch class set (02468T). The fiber of the quotient map contains only two elements: [0,2,4,6,8,10] and [1,3,5,7,9,11]. The stack will tag this point with D_6, the 12-element subgroup of symmetries that sends this chord to itself.
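As a very crude stand-in for this picture, the sketch below lists the points of the quotient (the orbits) and tags each one with the order of its stabilizer. Recording only the order, rather than the subgroup itself, is a simplification made here; the empty set and the full chromatic scale are also included as pitch sets.

```python
from itertools import combinations
from collections import Counter

# (1, n) is T_n; (-1, n) is I followed by T_n.
OPS = [(sign, n) for sign in (1, -1) for n in range(12)]

def act(op, pitch_set):
    sign, n = op
    return frozenset((sign * x + n) % 12 for x in pitch_set)

# Each point of the quotient is an orbit; tag it with its stabilizer's order.
points = {}
for k in range(13):
    for chord in combinations(range(12), k):
        s = frozenset(chord)
        orbit = frozenset(act(op, s) for op in OPS)
        if orbit not in points:
            points[orbit] = sum(1 for op in OPS if act(op, s) == s)

print(len(points))                 # 224 pitch class sets in total
print(Counter(points.values()))    # how many points carry each tag

whole_tone = frozenset({0, 2, 4, 6, 8, 10})
wt_orbit = frozenset(act(op, whole_tone) for op in OPS)
print(sorted(sorted(c) for c in wt_orbit), points[wt_orbit])
```

Points tagged with 1 are the “honest” points whose fiber is a full set of 24 chords.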

Now that I’ve drawn this, I can see that many of you will be skeptical about its simplicity.

Think of it this way:

The bottom thing is the space I’m describing. Each point in the space is tagged with the prime form representative together with the subgroup of symmetries that preserve the class.

That’s pretty simple.

Yet it remembers all of the complicated music theory of the top thing! If the topology were defined well, then studying this space might even lead to insights on how symmetries of classes are related to each other.