Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Covariant gauge transformation of Dirac equation to get the spin adjusted Klein-Gordon equation.

Posted by peeterjoot on September 3, 2011


Section 36.4 of [1] gives a covariant treatment of the gauge transformed Dirac equation, with the end goal of finding the Klein-Gordon relation with the spin terms required for electromagnetic phenomena.

Two typos to start things off: (36.63-64) should be, respectively,

\begin{aligned}p_\mu &\rightarrow p_\mu - e A_\mu \\ D_\mu &= \partial_\mu + i e A_\mu.\end{aligned} \hspace{\stretch{1}}(1.1)

Other than this there are no typos till the end where a factor of two is lost, and we should have

\begin{aligned}\left( D^\mu D_\mu - \frac{e}{2} \sigma^{\mu \nu} F_{\mu \nu} + m^2 \right) \psi = 0\end{aligned} \hspace{\stretch{1}}(1.3)

It’s slightly tempting to re-derive this, with the inclusion of the \hbar and c factors. However, a bit of play using the Geometric Algebra (GA) operators discussed in [2] proves more productive. This text introduces a purely GA formalism for the Dirac equation which I’ll not use here. Instead I’ll use the more conventional Feynman slash notation so that a four vector with coordinates a^\mu is written with its basis as

\begin{aligned}\not{a} = a^\mu \gamma_\mu = a_\mu \gamma^\mu\end{aligned} \hspace{\stretch{1}}(1.4)

Geometric Algebra notation.

We require the GA dot and wedge operators

\begin{aligned}\not{a} \cdot \not{b} = \frac{1}{{2}}( \not{a} \not{b} + \not{b} \not{a} ) = a^\mu b_\mu\end{aligned} \hspace{\stretch{1}}(2.5)

\begin{aligned}\not{a} \wedge \not{b} = \frac{1}{{2}}( \not{a} \not{b} - \not{b} \not{a} ) = \frac{1}{{2}} a^\mu b^\nu \gamma_{[\mu} \gamma_{\nu]}.\end{aligned} \hspace{\stretch{1}}(2.6)

In contrast to the matrix notation, the product of two identical gamma matrices is written as a unit scalar (a grade zero product), instead of using an explicit four by four identity matrix representation. We similarly label the product of different basis elements, for example \gamma_0 \gamma_1, as a grade two element, or bivector. Thus the dot product is the grade zero term of the multivector product \not{a} \not{b}, and the wedge product is the grade two term of the same. More generally, for a product of multivectors A and B the grade selection operator is defined as

\begin{aligned}{\left\langle{{A B}}\right\rangle}_{{n}}.\end{aligned} \hspace{\stretch{1}}(2.7)

This is an abstract notation encoding the instructions to select just the grade n elements of the multivector product, if they exist. In this notation, the product of vectors splits into scalar and bivector terms, which may also be expressed as the dot and wedge products

\begin{aligned}\not{a} \not{b} = {\left\langle{{\not{a} \not{b}}}\right\rangle}_{{0}} + {\left\langle{{\not{a} \not{b}}}\right\rangle}_{{2}} = \not{a} \cdot \not{b} + \not{a} \wedge \not{b}.\end{aligned} \hspace{\stretch{1}}(2.8)
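
The split 2.8 can be spot checked numerically by falling back to explicit matrices. What follows is only a minimal sketch, assuming the Dirac representation of the gamma matrices and a (+,-,-,-) metric. The helper names (slash, mul, block, close) and the sample vectors are hypothetical choices made for this illustration. In the matrix picture the scalar part of a product shows up as a multiple of the identity, and the wedge part is traceless.

```python
# Pure Python 4x4 matrix helpers (hypothetical names, chosen for this sketch).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I4 = block(I2, Z2, Z2, I2)
# GAMMA[0] is gamma^0, GAMMA[k] is gamma^k (Dirac representation, assumed).
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in PAULI
]

def slash(a):
    """a^mu gamma_mu = a^0 gamma^0 - a^k gamma^k for contravariant a^mu."""
    out = scale(a[0], GAMMA[0])
    for k in (1, 2, 3):
        out = add(out, scale(-a[k], GAMMA[k]))
    return out

a = [2, 1, -3, 5]   # arbitrary sample components a^mu
b = [1, 4, 0, -2]   # arbitrary sample components b^mu

ab = mul(slash(a), slash(b))
ba = mul(slash(b), slash(a))
dot = scale(0.5, add(ab, ba))                 # symmetric part, as in 2.5
wedge = scale(0.5, add(ab, scale(-1, ba)))    # antisymmetric part, as in 2.6
minkowski = a[0] * b[0] - sum(a[k] * b[k] for k in (1, 2, 3))

print("dot is (a^mu b_mu) times identity:", close(dot, scale(minkowski, I4)))
print("wedge is traceless:", abs(sum(wedge[i][i] for i in range(4))) < 1e-12)
print("product = dot + wedge:", close(add(dot, wedge), ab))
```

Integer sample components are used so that the comparison is not muddied by rounding. Any other real components should work equally well.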

Gauge transforming the Dirac equation.

Our electron equation is

\begin{aligned}\left(\not{p} - m c \right) \psi = 0\end{aligned} \hspace{\stretch{1}}(3.9)

After a gauge transformation

\begin{aligned}\not{p} \rightarrow \not{p} - \frac{e}{c} \not{A},\end{aligned} \hspace{\stretch{1}}(3.10)

this is

\begin{aligned}\left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi = 0.\end{aligned} \hspace{\stretch{1}}(3.11)

We left multiply by the conjugate quantity \not{p} - \frac{e}{c}\not{A} + m c, and our task is now to reduce the operator equation

\begin{aligned}\begin{aligned}0 &= \left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi \\ &= {\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{{0}} \\ &\quad +{\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{1} \\ &\quad +{\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{2}.\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.12)

We have two contributions for the scalar parts, one is the product of mc scalars, and the other is the dot product of the vectors

\begin{aligned}{\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{{0}}=\left(\not{p} - \frac{e}{c}\not{A} \right) \cdot \left(\not{p} - \frac{e}{c}\not{A} \right) \psi- (m c)^2 \psi.\end{aligned} \hspace{\stretch{1}}(3.13)

The grade one (four vector) components sum to zero since those are only the scalar times four vector portions and have opposing signs

\begin{aligned}{\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{1} =\left(\not{p} - \frac{e}{c}\not{A} \right) \left( - m c \right) \psi+ \left( m c \right) \left(\not{p} - \frac{e}{c}\not{A} \right) \psi= 0.\end{aligned} \hspace{\stretch{1}}(3.14)
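
The scalar and grade one bookkeeping of 3.13 and 3.14 can be illustrated with matrices. This is a sketch under a strong assumption: the operator \not{p} - \frac{e}{c}\not{A} is replaced by a constant four vector, so nothing is differentiated and the grade two term is identically zero. The Dirac representation, the helper names, and the sample values of the vector and of m are all hypothetical choices for this illustration.

```python
# Pure Python 4x4 matrix helpers (hypothetical names, chosen for this sketch).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I4 = block(I2, Z2, Z2, I2)
ZERO4 = scale(0, I4)
# GAMMA[0] is gamma^0, GAMMA[k] is gamma^k (Dirac representation, assumed).
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in PAULI
]

def slash(a):
    """a^mu gamma_mu = a^0 gamma^0 - a^k gamma^k for contravariant a^mu."""
    out = scale(a[0], GAMMA[0])
    for k in (1, 2, 3):
        out = add(out, scale(-a[k], GAMMA[k]))
    return out

a = [3, 1, 2, -1]   # constant stand-in for p - (e/c) A (assumption)
m = 2               # stand-in for m c (assumption)

v = slash(a)
mI = scale(m, I4)
product = mul(add(v, mI), add(v, scale(-1, mI)))    # (v + m)(v - m)
a_dot_a = a[0] ** 2 - sum(a[k] ** 2 for k in (1, 2, 3))
scalar_part = scale(a_dot_a - m * m, I4)            # the analogue of 3.13
grade_one = add(mul(v, scale(-1, mI)), mul(mI, v))  # the analogue of 3.14

print("grade one terms cancel:", close(grade_one, ZERO4))
print("product is pure scalar:", close(product, scalar_part))
```

With a genuine differential operator in place of the constant vector, the antisymmetric part no longer vanishes, and that is exactly where the field term comes from.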

Only the vector products can contribute to the grade two portion of the multivector product, so we retain only the wedges between those

\begin{aligned}{\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{2}&=\left(\not{p} - \frac{e}{c}\not{A} \right) \wedge \left(\not{p} - \frac{e}{c}\not{A} \right) \psi \\ &=\not{p} \wedge \not{p} \psi - \frac{e}{c} \left( \not{p} \wedge \not{A} \psi + \not{A} \wedge \not{p} \psi \right)+ \frac{e^2}{c^2} \not{A} \wedge \not{A} \psi \\ &=- \frac{e}{c} \left( (\not{p} \wedge \not{A}) \psi + (\not{p} \psi) \wedge \not{A} + \not{A} \wedge \not{p} \psi \right) \\ &=- \frac{e}{c} (\not{p} \wedge \not{A}) \psi.\end{aligned}

Here parentheses have been used to denote the range of operation of our differential operator \not{p}. With \not{\partial} = \gamma^\mu \partial_\mu, our momentum operator is

\begin{aligned}\not{p} = i \hbar \not{\partial},\end{aligned} \hspace{\stretch{1}}(3.15)

allowing us to write

\begin{aligned}\not{p} \wedge \not{A}&=i \hbar \not{\partial} \wedge \not{A} \\ \end{aligned}

For the electromagnetic field bivector let’s write

\begin{aligned}F = \not{\partial} \wedge \not{A} = \frac{1}{{2}} F_{\mu \nu} \gamma^\mu \wedge \gamma^\nu.\end{aligned} \hspace{\stretch{1}}(3.16)

So the grade two terms of 3.12 are

\begin{aligned}{\left\langle{{\left(\not{p} - \frac{e}{c}\not{A} + m c \right) \left(\not{p} - \frac{e}{c}\not{A} - m c \right) \psi}}\right\rangle}_{2}=-\frac{ie \hbar }{c } F \psi.\end{aligned} \hspace{\stretch{1}}(3.17)

Assembling all results we have

\begin{aligned}\left(\left(\not{p} - \frac{e}{c}\not{A} \right) \cdot \left(\not{p} - \frac{e}{c}\not{A} \right) - \frac{ie \hbar }{c } F - (m c)^2 \right)  \psi = 0.\end{aligned} \hspace{\stretch{1}}(3.18)

Some checks

There are two tasks that remain. One is to verify that this reproduces the result expressed in terms of the gauge covariant derivative D_\mu. The other is to verify that this also reproduces the earlier result with an explicit split into energy and momentum operators.

Gauge covariant derivative form

With \not{p} = i \hbar \gamma^\mu \partial_\mu our dot product takes the form

\begin{aligned}\left(\not{p} - \frac{e}{c}\not{A} \right) \cdot \left(\not{p} - \frac{e}{c}\not{A} \right) &=\left(i \hbar \partial^\mu - \frac{e}{c}A^\mu \right) \left(i \hbar \partial_\mu - \frac{e}{c}A_\mu \right)  \\ &=-\hbar^2 \left(\partial^\mu + \frac{i e}{c \hbar}A^\mu \right) \left(\partial_\mu + \frac{i e}{ \hbar c}A_\mu \right)  \\ \end{aligned}

We thus write

\begin{aligned}D_\mu = \partial_\mu + \frac{i e}{c \hbar}A_\mu,\end{aligned} \hspace{\stretch{1}}(4.19)

differing from the text only by the inclusion of \hbar and c factors. Since we have

\begin{aligned}\sigma^{\mu \nu} = \frac{1}{{i}} \gamma^\mu \wedge \gamma^\nu,\end{aligned} \hspace{\stretch{1}}(4.20)

the field bivector 3.16 is F = \frac{i}{2} F_{\mu\nu} \sigma^{\mu\nu}, and 3.18 becomes

\begin{aligned}\left( -\hbar^2 D^\mu D_\mu + \frac{e \hbar}{ 2 c } \sigma^{\mu\nu} F_{\mu\nu} - (m c)^2 \right) \psi  = 0,\end{aligned} \hspace{\stretch{1}}(4.21)

or, after dividing through by -\hbar^2,


\begin{aligned}\left( D^\mu D_\mu - \frac{e }{ 2 c \hbar } \sigma^{\mu\nu} F_{\mu\nu} + \frac{m^2 c^2}{\hbar^2} \right) \psi  = 0.\end{aligned} \hspace{\stretch{1}}(4.22)

With c = \hbar = 1, this reproduces 1.3, the (corrected) result (36.78) from the text.

Verifying the space time split into energy and spatial momentum operators.

Temporarily working with c = \hbar = 1 we have

\begin{aligned}D^\mu D_\mu &= D^0 D_0 + D^k D_k \\ &= (\partial^t + i e \phi)^2 - (\partial^k + i e A^k)^2 \\ \end{aligned}

With E = i \partial_t and (\mathbf{p})_k = -i \partial_k = i \partial^k, this is

\begin{aligned}D^\mu D_\mu &= (-i E + i e \phi)^2 - (-i \mathbf{p} + i e \mathbf{A})^2 \\ &= -(E - e \phi)^2 + (\mathbf{p} - e \mathbf{A})^2.\end{aligned}

Restoring c and \hbar, and multiplying throughout by \hbar^2, 4.22 becomes

\begin{aligned}\left( -\frac{1}{{c^2 }} (E - e \phi)^2 + \left(\mathbf{p} - \frac{e }{c} \mathbf{A} \right)^2- \frac{ e \hbar }{ 2 c } \sigma^{\mu\nu} F_{\mu\nu} + m^2 c^2 \right) \psi  = 0.\end{aligned} \hspace{\stretch{1}}(4.23)

Now let’s expand the \sigma^{\mu \nu} F_{\mu \nu} product in terms of the matrices used in the text previously. That is

\begin{aligned}\sigma^{\mu \nu} F_{\mu \nu}&=\sigma^{0 0} F_{0 0}+\sigma^{0 k} F_{0 k}+\sigma^{j 0} F_{j 0}+\sigma^{j k} F_{j k} \\ &=2 \sigma^{0 k} F_{0 k}+\sigma^{j k} F_{j k} \\ \end{aligned}

For the electric terms we have


\begin{aligned}F_{0 k}&=\partial_0 A_k -\partial_k A_0 \\ &=-\partial_0 A^k -\partial_k A^0 \\ &= (\mathbf{E})_k,\end{aligned}


\begin{aligned}\sigma^{0 k}&=\frac{1}{{2 i}} (\gamma^0 \gamma^k - \gamma^k \gamma^0) \\ &=-i \gamma^0 \gamma^k  \\ &=i \gamma^k \gamma^0  \\ &=i\begin{bmatrix}0 & \sigma_k \\ -\sigma_k & 0\end{bmatrix}\begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix} \\ &= i\begin{bmatrix}0 & -\sigma_k \\ -\sigma_k & 0\end{bmatrix} \\ &=-i \alpha_k\end{aligned}


\begin{aligned}2 \sigma^{0 k} F_{0 k} = -2 i \boldsymbol{\alpha} \cdot \mathbf{E}.\end{aligned} \hspace{\stretch{1}}(4.24)
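
The relation \sigma^{0 k} = -i \alpha_k is easy to confirm with explicit matrices. Here is a minimal sketch, assuming the Dirac representation that appears in the matrices above and the convention \sigma^{\mu\nu} = \frac{1}{2i} [\gamma^\mu, \gamma^\nu] of this post. The helper names and the ALPHA list are hypothetical, introduced only for this check.

```python
# Pure Python 4x4 matrix helpers (hypothetical names, chosen for this sketch).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# GAMMA[0] is gamma^0, GAMMA[k] is gamma^k (Dirac representation, assumed).
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in PAULI
]
# alpha_k has the Pauli matrix sigma_k in both off diagonal blocks.
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]

def sigma(mu, nu):
    """sigma^{mu nu} = (1/(2i)) [gamma^mu, gamma^nu], using 1/(2i) = -i/2."""
    comm = add(mul(GAMMA[mu], GAMMA[nu]), scale(-1, mul(GAMMA[nu], GAMMA[mu])))
    return scale(-0.5j, comm)

# Compare sigma^{0k} against -i alpha_k for k = 1, 2, 3.
checks = [close(sigma(0, k), scale(-1j, ALPHA[k - 1])) for k in (1, 2, 3)]
print("sigma^{0k} == -i alpha_k for k = 1, 2, 3:", checks)
```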

For the magnetic terms we have

\begin{aligned}F_{j k} &= \partial_j A_k - \partial_k A_j \\ &= -\partial_j A^k + \partial_k A^j \\ &= \epsilon_{m k j} B_m.\end{aligned}

For \sigma^{j k}, considering the j \ne k case (since it is zero otherwise), we have

\begin{aligned}\sigma^{j k} &=\frac{1}{{2i}} (\gamma^j \gamma^k - \gamma^k \gamma^j) \\ &=\frac{1}{{i}} \gamma^j \gamma^k  \\ &=- i \begin{bmatrix}0 & \sigma_j \\ -\sigma_j & 0\end{bmatrix}\begin{bmatrix}0 & \sigma_k \\ -\sigma_k & 0\end{bmatrix}\\ &=i\begin{bmatrix}\sigma_j \sigma_k & 0 \\ 0 & \sigma_j \sigma_k \end{bmatrix} \\ &=i^2 \epsilon_{j k m}\begin{bmatrix}\sigma_m & 0 \\ 0 & \sigma_m\end{bmatrix}.\end{aligned}

With {\sigma'}_m denoting the block diagonal matrix that has \sigma_m in both diagonal blocks, this is \sigma^{j k} = -\epsilon_{j k m} {\sigma'}_m, and the contraction with the field is


\begin{aligned}\sigma^{j k} F_{j k} &= - \epsilon_{j k n} {\sigma'}_n \epsilon_{m k j} B_m \\ &= 2 \delta_{m n} {\sigma'}_n B_m \\ &= 2 \boldsymbol{\sigma}' \cdot \mathbf{B}.\end{aligned}

Thus the space time split of the field is

\begin{aligned}\sigma^{\mu \nu} F_{\mu \nu}=-2 i \boldsymbol{\alpha} \cdot \mathbf{E} + 2 \boldsymbol{\sigma}' \cdot \mathbf{B}\end{aligned} \hspace{\stretch{1}}(4.25)

and 4.23 becomes

\begin{aligned}\left( -\frac{1}{{c^2 }} (E - e \phi)^2 + \left(\mathbf{p} - \frac{e }{c} \mathbf{A} \right)^2- \frac{ e \hbar }{ c } \left( -i \boldsymbol{\alpha} \cdot \mathbf{E} + \boldsymbol{\sigma}' \cdot \mathbf{B} \right) + m^2 c^2 \right) \psi  = 0.\end{aligned} \hspace{\stretch{1}}(4.26)

This recovers (36.22) from the text (once one adds back in the \hbar and c and fixes the sign error in the electric field term).
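
The space time split 4.25 can also be checked numerically, which is a useful guard against sign slips in the index gymnastics. The following is a sketch only, assuming the Dirac representation, the convention \sigma^{\mu\nu} = \frac{1}{2i} [\gamma^\mu, \gamma^\nu], and the component identifications F_{0 k} = (\mathbf{E})_k and F_{j k} = \epsilon_{m k j} B_m used above. The helper names and the sample field values are hypothetical.

```python
# Pure Python 4x4 matrix helpers (hypothetical names, chosen for this sketch).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [
    block(Z2, s, scale(-1, s), Z2) for s in PAULI
]
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]          # alpha_k
SIGMA_PRIME = [block(s, Z2, Z2, s) for s in PAULI]    # sigma'_k

def sigma(mu, nu):
    """sigma^{mu nu} = (1/(2i)) [gamma^mu, gamma^nu], using 1/(2i) = -i/2."""
    comm = add(mul(GAMMA[mu], GAMMA[nu]), scale(-1, mul(GAMMA[nu], GAMMA[mu])))
    return scale(-0.5j, comm)

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (i - j) * (j - k) * (k - i) // 2

E = [0, 1, 2, 3]   # E[k] for k = 1..3; slot 0 is unused padding
B = [0, 4, 5, 6]   # B[k] for k = 1..3; slot 0 is unused padding

# Antisymmetric field tensor: F_{0k} = E_k and F_{jk} = eps_{mkj} B_m.
F = [[0] * 4 for _ in range(4)]
for k in (1, 2, 3):
    F[0][k] = E[k]
    F[k][0] = -E[k]
    for j in (1, 2, 3):
        F[j][k] = sum(eps(m, k, j) * B[m] for m in (1, 2, 3))

# Left side of 4.25: the full double sum over mu and nu.
lhs = [[0] * 4 for _ in range(4)]
for mu in range(4):
    for nu in range(4):
        lhs = add(lhs, scale(F[mu][nu], sigma(mu, nu)))

# Right side of 4.25: -2 i alpha . E + 2 sigma' . B
rhs = [[0] * 4 for _ in range(4)]
for k in (1, 2, 3):
    rhs = add(rhs, scale(-2j * E[k], ALPHA[k - 1]))
    rhs = add(rhs, scale(2 * B[k], SIGMA_PRIME[k - 1]))

print("4.25 holds for the sample fields:", close(lhs, rhs))
```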

A last word on covariant gauge transformations.

Usually discussions of gauge invariance involve a Lagrangian. Such a beastie for the Dirac equation has not yet been discussed in the text. Suppose we just apply to the Dirac equation a phase transformation of the same form that we use with the Klein-Gordon Lagrangian

\begin{aligned}\psi \rightarrow \psi e^{i \frac{e }{\hbar c} \theta}.\end{aligned} \hspace{\stretch{1}}(5.27)

Observe that if the dimensions of \partial_\mu \theta are those of the vector potential A_\mu, then e \theta/\hbar c is dimensionless.

Now let’s transform the Dirac equation, utilizing the freedom to adjust the phase arbitrarily

\begin{aligned}(\not{p} - m c) \psi&\rightarrow(\not{p} - m c )\psi e^{i \frac{e}{\hbar c} \theta } \\ &=(i \hbar \gamma^\mu \partial_\mu - m c) \psi e^{i \frac{e}{\hbar c} \theta } \\ &=e^{i \frac{e}{\hbar c} \theta } \left(i \hbar \gamma^\mu \left( \frac{ i e }{\hbar c} \partial_\mu \theta + \partial_\mu \right)  - m c\right) \psi \\ \end{aligned}

With the usual assignment

\begin{aligned}\partial_\mu \theta = A_\mu,\end{aligned} \hspace{\stretch{1}}(5.28)

this becomes

\begin{aligned}(\not{p} - m c) \psi&\rightarrow e^{i \frac{e}{\hbar c} \theta } \left(\not{p} - \frac{e}{c} \not{A} - m c\right) \psi.\end{aligned}
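
The way the phase pulls a - \frac{e}{c} A term out of the momentum operator can be seen numerically in a one dimensional toy version. This is a rough sketch, not the full four dimensional calculation: it drops the gamma matrices and the mass term, and it uses a single coordinate x with p = i \hbar \, d/dx and A = d\theta/dx. The sample functions psi and theta, the constants, and the finite difference step are all hypothetical choices.

```python
import cmath
import math

# Arbitrary sample values, not physical constants (assumptions for this sketch).
HBAR, C, E_CHARGE = 1.3, 0.7, 0.4
KAPPA = E_CHARGE / (HBAR * C)   # the factor e / (hbar c) in the phase of 5.27

def psi(x):
    """Sample smooth complex wave function (hypothetical)."""
    return (1 + 0.5j * x) * math.exp(-x * x)

def dpsi(x):
    """Exact derivative of psi, by the product rule."""
    return (0.5j - 2 * x * (1 + 0.5j * x)) * math.exp(-x * x)

def theta(x):
    """Sample gauge function (hypothetical)."""
    return math.sin(x)

def A(x):
    """One dimensional analogue of the assignment 5.28: A = d(theta)/dx."""
    return math.cos(x)

def transformed(x):
    """psi multiplied by the phase factor of 5.27."""
    return psi(x) * cmath.exp(1j * KAPPA * theta(x))

def p_numeric(f, x, h=1e-5):
    """Apply p = i hbar d/dx using a central difference."""
    return 1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)

x0 = 0.3
# Left side: p acting on the phase transformed wave function.
lhs = p_numeric(transformed, x0)
# Right side: phase factor times (p - (e/c) A) acting on the original psi.
rhs = cmath.exp(1j * KAPPA * theta(x0)) * (
    1j * HBAR * dpsi(x0) - (E_CHARGE / C) * A(x0) * psi(x0)
)
error = abs(lhs - rhs)
print("difference between the two sides:", error)
```

The difference is limited only by the central difference truncation error, so it should shrink as the step h is reduced, until floating point rounding takes over.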

The bigger question of why 5.28 is the usual transformation remains. I don’t know the answer to that one. It is common in Lagrangian transformation discussions for the Klein-Gordon equation, and I’m guessing it’s something better justified in places I’ve not yet read.

So in the limit where \theta \rightarrow 0, the Dirac equation is transformed as

\begin{aligned}(\not{p} - m c) \psi\rightarrow \left(\not{p} - \frac{e}{c} \not{A} - m c\right) \psi,\end{aligned} \hspace{\stretch{1}}(5.29)

and we can say that our momentum operator transforms

\begin{aligned}\not{p} \rightarrow \not{p} - \frac{e}{c} \not{A}.\end{aligned} \hspace{\stretch{1}}(5.30)

The gauge covariant derivative is then just what we get when we factor out an i \hbar \gamma^\mu from the gauge transformed momentum operator

\begin{aligned}\not{p} - \frac{e}{c} \not{A} &= i \hbar \gamma^\mu \left( \partial_\mu + i \frac{e}{ c \hbar } A_\mu \right) \\ &=i \hbar \left( \not{\partial} + i \frac{ e }{ \hbar c} \not{A} \right)\end{aligned}

So with

\begin{aligned}D_\mu = \partial_\mu + i \frac{e}{ c \hbar } A_\mu,\end{aligned} \hspace{\stretch{1}}(5.31)


\begin{aligned}\not{D} =\not{\partial} + i \frac{e}{ c \hbar } \not{A},\end{aligned} \hspace{\stretch{1}}(5.32)

we can write

\begin{aligned}\not{p} - \frac{e}{c} \not{A} = i \hbar \not{D} = i \hbar \gamma^\mu D_\mu\end{aligned} \hspace{\stretch{1}}(5.33)


[1] B.R. Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

[2] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press, Cambridge, UK, 1st edition, 2003.

