Introduction to Yang-Mills theories

Yang-Mills theories are a class of classical field theories generalizing Maxwell’s equations. When quantized, Yang-Mills theories form the basis for all successful modern quantum field theories, including the standard model of particle physics, and for the grand unified theories (GUTs) that attempt to go beyond the standard model.
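
As a rough orientation, here is a sketch of that generalization in one common convention. The coupling $g$, the structure constants $f^{abc}$ and the index placement are assumptions of this sketch, not notation taken from the notes, and signs differ between texts.

```latex
% Source-free Maxwell equations (gauge group U(1)):
\begin{align}
  F_{\mu\nu} &= \partial_\mu A_\nu - \partial_\nu A_\mu, &
  \partial^\mu F_{\mu\nu} &= 0 .
\end{align}

% Source-free Yang-Mills equations; a, b, c label Lie algebra generators:
\begin{align}
  F^a_{\mu\nu} &= \partial_\mu A^a_\nu - \partial_\nu A^a_\mu
                  + g f^{abc} A^b_\mu A^c_\nu, \\
  (D^\mu F_{\mu\nu})^a &= \partial^\mu F^a_{\mu\nu}
                  + g f^{abc} A^{b\,\mu} F^c_{\mu\nu} = 0 .
\end{align}
```

Setting $f^{abc} = 0$ (an abelian group) removes the nonlinear terms and recovers the Maxwell case. The remaining Maxwell equations, and their Yang-Mills counterpart, follow automatically from the definition of $F$ as a Bianchi identity.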

This post contains working notes (32 pages, pdf only) that I wrote in an attempt to come to a satisfactory personal understanding of Yang-Mills theory. They are part of a larger project of understanding the standard models of particle physics and of cosmology – some related earlier notes are here.

Caveat: The current notes take a geometric approach to Yang-Mills theory, and include quite a bit of background on differential geometry. After completing a first draft, I realized that if I was to write either a pedagogical introduction or a review of Yang-Mills theory, this geometric approach is not the approach I’d prefer to take. Rather, I’d start with a bare statement of the Yang-Mills equations, considered as a generalization of Maxwell’s equations, and then work through a series of examples, only gradually mixing in the geometric approach. This would have the advantage of bringing readers up to speed much more quickly, without needing to absorb reams of differential geometry upfront.

Because of this, I haven’t polished these notes: they remain primarily my personal working notes, with various inaccuracies and shortcomings. I’m content to ignore these (why spend time polishing when you know a better approach is possible?), but would appreciate hearing from you if you spot any serious misconceptions.

Despite these caveats, I believe the notes may be useful to some readers. In particular, if you’d like to understand the approach to Yang-Mills theory from differential geometry, these notes may serve as a useful first step, to be supplemented by additional reading such as the book by Baez and Muniain (“Knots, Gauge Fields and Gravity”, World Scientific 1994) on which the notes are primarily based.


Update: If you’re reading the notes in detail, then you might want to take a look at the comments, especially those by David Speyer and Aaron Bergman, who provide some important corrections and extensions.


  1. Great notes! Thanks very much. Am working through them & will be back with feedback if I have any.

  2. First of all, thank you. This was very clear; I hadn’t even understood before now that there was a “classical” version of Yang-Mills that can be considered independently of the quantum aspects. Also, please note that I am an algebraic geometer who is trying to learn physics, not a differential geometer, so there may be errors below.

    Now, a few questions and comments:

    (1) Exercise 5.12. I think that this is false. I think that the condition that the Christoffel symbols are symmetric is equivalent to saying that the connection has zero torsion. Exercise 4.2 of Lee’s textbook “Riemannian Manifolds” says something like this, but he is making some distinction between “coordinate frames” and “other frames” that I don’t quite follow. It seems to me that your equation (26) implies that the connection is flat (zero curvature). Clearly, not every zero torsion connection is flat.

    (2) Regarding your “Exercise for the author 5.2”, isn’t this immediate from equation (46)?

    (3) My hazy memory of learning about vector potentials in E&M is that one normally adopts some sort of normalizing condition, which I think is something like divergence(A)=d phi/dt. (Where A is the vector potential and phi is the electrostatic potential.) In four-vector formalism, that probably says something like d* \mathcal{A}=0, where \mathcal{A} is the four-vector (phi, A). Is there an analogue for general Yang-Mills?

    (4) This isn’t really something I can blame your notes for, but it is very frustrating not knowing the analogue of the Lorentz forces. I have been told that the nuclear forces decay exponentially because the corresponding gauge groups are noncommutative. I’d hoped to work out the details of this, but I can’t because I have no way of knowing how to relate any of these beautiful field equations to actual particle motion.
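
    On (4), the classical analogue of the Lorentz force law for a point particle carrying a non-abelian charge $Q^a$ is usually given as Wong's equations. The following is only a sketch: the symbols $m$, $g$, $Q^a$ and $f^{abc}$ are assumptions of this sketch rather than notation from the notes, and signs and normalizations depend on conventions.

    ```latex
    \begin{align}
      m \frac{d^2 x^\mu}{d\tau^2}
        &= g\, Q^a F^{a\,\mu\nu} \frac{dx_\nu}{d\tau}, \\
      \frac{dQ^a}{d\tau}
        &= - g f^{abc} \frac{dx^\mu}{d\tau} A^b_\mu Q^c .
    \end{align}
    ```

    For an abelian group ($f^{abc} = 0$) the second equation says the charge is constant and the first reduces to the Lorentz force law. In the non-abelian case the charge itself rotates in the Lie algebra as the particle moves through the field.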

  3. Thanks for the comments, David. Regarding your points:

    (1): perhaps it is obvious that Eq. (26) implies the connection is flat, but it is not obvious to me. I’m probably missing something trivial. In any case, if you’re correct this whole section of my notes needs a rewrite.

    (2) Yes, you’re right.

    (3) Physicists usually refer to this as fixing a gauge; something like div A = d phi/dt is a particular choice of gauge (others can be made). In the case of EM, with such a choice, it can be shown using de Rham theory that A is uniquely determined by the Faraday tensor. My understanding is that the general situation for YM is understood along similar lines, but I haven’t studied the details.

    (4) I expect this would be relatively easy to extract from a suitable source (probably a field theory book), but can’t point you to one off the top of my head. I just moved to Canada from Australia – nearly all my books are on a boat!
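
    To make (3) concrete, here is a sketch of the usual covariant gauge condition, assuming the convention $A^\mu = (\phi/c, \mathbf{A})$. Note that the standard condition (the Lorenz gauge) has the opposite relative sign to the "div A = d phi/dt" recalled above.

    ```latex
    % Lorenz gauge for electromagnetism, in three equivalent forms:
    \begin{align}
      \nabla \cdot \mathbf{A} + \frac{1}{c^2} \frac{\partial \phi}{\partial t} &= 0
      \quad \Longleftrightarrow \quad
      \partial_\mu A^\mu = 0
      \quad \Longleftrightarrow \quad
      d {\star} A = 0 .
    \end{align}

    % The same condition imposed component by component in Yang-Mills theory:
    \begin{align}
      \partial^\mu A^a_\mu &= 0 \quad \text{for each Lie algebra index } a .
    \end{align}
    ```

    In the electromagnetic case this condition is preserved by any further shift $A \to A + d\lambda$ with $\Box \lambda = 0$, so uniqueness of $A$ also relies on suitable boundary conditions. In the non-abelian case such conditions generally do not single out a unique representative globally (the Gribov ambiguity).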

  4. Regarding (1): Maybe I’m confused. I am allowed to compute curvature in any coordinate system. And if I work in the coordinate system for which equation (26) holds, I sure seem to get curvature zero. Does this not work?
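
    Spelled out, under the assumption that equation (26) says the Christoffel symbols vanish in the chosen coordinates (the notes are not quoted here, so this reading is an assumption), the computation uses the standard coordinate expression for the curvature:

    ```latex
    \begin{equation}
      R^\rho{}_{\sigma\mu\nu}
        = \partial_\mu \Gamma^\rho_{\nu\sigma}
        - \partial_\nu \Gamma^\rho_{\mu\sigma}
        + \Gamma^\rho_{\mu\lambda} \Gamma^\lambda_{\nu\sigma}
        - \Gamma^\rho_{\nu\lambda} \Gamma^\lambda_{\mu\sigma} .
    \end{equation}
    ```

    If $\Gamma \equiv 0$ on an open set, every term vanishes and the curvature is zero there. If $\Gamma$ vanishes only at a single point $p$, the quadratic terms drop out at $p$ but the derivative terms need not, so the curvature at $p$ can still be nonzero.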


  5. David — you’re correct. I feel a bit sheepish for not noticing this; I really should rewrite that section from another point of view, at some point.

  6. (Coming in a bit late here, but….)

    I gave the notes a quick skim. Here are a few comments. In footnote 3: One has to specify all the fermions to get the standard model, of course, and the various Yukawa interactions.

    I don’t understand your definitions of G-bundles. The usual thing to do is to define a principal G-bundle, which is (roughly) a bundle with fiber the group G itself. Your definition seems to be of an associated vector bundle. A section of an associated vector bundle does not assign a group element to each point on the manifold, however. A section of the principal G-bundle does, but that’s not the same thing as a gauge transformation. A gauge transformation can be thought of as a change in the local trivialization of the principal bundle (or a section of the bundle Ad(G), which isn’t always the same as G).

    I’m not sure you ever define a connection on a G-bundle, either. There are a couple of equivalent notions — I like the discussion in Nakahara, personally. This is important for your later discussion of gauge symmetry. One nice definition of the connection lives on the total space of the G-bundle. The form that is defined on the base only exists on a local patch and is given by the pullback by a local section. Gauge invariance is simply the statement that nothing physical should depend on the choice of this local section.

    Tying into this, you never seem to discuss that most of the objects in these notes are valued in the Lie algebra of the gauge group. Endomorphisms of an associated bundle naturally include a representation of the Lie algebra, but there’s more than just that in there and it’s probably best to work with the honest Lie algebra rather than just a representation.

    For your problem for the author, the reason for the Hodge star is this: Given a metric on a manifold, there is a natural inner product on forms given by contracting the indices with the metric and integrating with the usual root det g factor. This is a map

    H^i(M) x H^i(M) -> C

    There’s also a natural operation on forms given by wedging and integrating

    H^i(M) x H^{d-i}(M) -> C

    The Hodge star takes one inner product to the other.
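
    In symbols, here is a sketch assuming real-valued i-forms $\alpha$, $\beta$ on a compact oriented d-dimensional manifold with metric $g$ (the names $\alpha$, $\beta$ are introduced only for this sketch):

    ```latex
    \begin{equation}
      \langle \alpha, \beta \rangle
        = \int_M \alpha \wedge {\star}\beta
        = \int_M \frac{1}{i!}\,
          \alpha_{\mu_1 \cdots \mu_i}\, \beta^{\mu_1 \cdots \mu_i}
          \sqrt{|\det g|}\; d^d x .
    \end{equation}
    ```

    Here ${\star}\beta$ is a $(d-i)$-form, so the wedge pairing on the left reproduces the metric inner product on the right. The Yang-Mills action is this same pairing applied to the curvature, proportional to $\int_M \mathrm{tr}(F \wedge {\star}F)$, with the sign and normalization depending on conventions.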

    I’ve never read Baez and Muniain myself, but I highly recommend M. Nakahara “Geometry, Topology and Physics”.

    For David, I don’t think understanding the Lorentz force law is so hard, but it’s getting late. The weak force decays exponentially because the Higgs effect renders the force carriers massive — you won’t just see it from studying the pure YM theory. The strong force, by contrast, actually increases with distance and thus confines. You never see a particle charged under the strong force in nature.
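
    The exponential decay for a massive force carrier can be sketched with the static potentials below. The mass $m$ and range $\lambda$ are symbols introduced for this sketch, and overall constants are omitted.

    ```latex
    \begin{align}
      V_{\text{massless}}(r) &\propto \frac{1}{r}, &
      V_{\text{massive}}(r) &\propto \frac{e^{-r/\lambda}}{r}, &
      \lambda &= \frac{\hbar}{m c} .
    \end{align}
    ```

    For a carrier with the mass of the W boson, about 80 GeV/c^2, and using $\hbar c \approx 197$ MeV fm, this gives a range of roughly $2.5 \times 10^{-18}$ m, which is why the weak force is so short-ranged.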

  7. Thanks for all your comments, Aaron, they’re very helpful. There’s a bunch of stuff you point out that I need to amend in a revised version. In particular, the inconsistency between my definition of a G-bundle and a section of a G-bundle needs revision (I don’t know what I was thinking), and this will have some follow-on effects elsewhere that would clarify things quite a bit.

    I should also bring out the fact that many things are valued in the Lie algebra; I didn’t appreciate this properly until quite late in my reading, and it’s certainly not made nearly evident enough in the notes.

    Thanks for the pointer to Nakahara, I’ve ordered it from

  8. Great notes.

    Just a comment on David Speyer’s comment: Exercise 5.12 is OK. It’s saying that you can always find a coordinate system in which the Christoffel symbols are zero at a point! This doesn’t mean that the manifold is flat. You can always make the Christoffel symbols vanish at a point; what you can’t do, if the curvature is not zero, is make them vanish at a point and throughout its vicinity.
    I hope I made myself clear.

    Best regards.

    Pietro Dall’Olio
