Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘pauli matrix’

An updated compilation of notes, for ‘PHY452H1S Basic Statistical Mechanics’, Taught by Prof. Arun Paramekanti

Posted by peeterjoot on March 27, 2013

Here’s my second update of my notes compilation for this course, including all of the following:

March 27, 2013 Fermi gas

March 26, 2013 Fermi gas thermodynamics

March 23, 2013 Relativistic generalization of statistical mechanics

March 21, 2013 Kittel Zipper problem

March 18, 2013 Pathria chapter 4 diatomic molecule problem

March 17, 2013 Gibbs sum for a two level system

March 16, 2013 open system variance of N

March 16, 2013 probability forms of entropy

March 14, 2013 Grand Canonical/Fermion-Bosons

March 13, 2013 Quantum anharmonic oscillator

March 12, 2013 Grand canonical ensemble

March 11, 2013 Heat capacity of perturbed harmonic oscillator

March 10, 2013 Langevin small approximation

March 10, 2013 Addition of two one half spins

March 10, 2013 Midterm II reflection

March 07, 2013 Thermodynamic identities

March 06, 2013 Temperature

March 05, 2013 Interacting spin

plus everything detailed in the description of my first update and before.

Posted in Math and Physics Learning.

Addition of two one half spins

Posted by peeterjoot on March 10, 2013

In class an example of interacting spin was given where the Hamiltonian was the dot product of two spin operators

\begin{aligned}H = \mathbf{S}_1 \cdot \mathbf{S}_2.\end{aligned} \hspace{\stretch{1}}(1.0.1)

The energy eigenvalues for this Hamiltonian were derived using the trick of rewriting it in terms of squared spin operators alone

\begin{aligned}H = \frac{(\mathbf{S}_1 + \mathbf{S}_2)^2 - \mathbf{S}_1^2 - \mathbf{S}_2^2}{2}.\end{aligned} \hspace{\stretch{1}}(1.0.2)

For each of these terms we can calculate the total energy eigenvalues from

\begin{aligned}\mathbf{S}^2 \Psi = \hbar^2 S (S + 1) \Psi,\end{aligned} \hspace{\stretch{1}}(1.0.3)

where S takes on the values of the total spin for the (possibly composite) spin operator. Thinking about the spin operators in their matrix representation, it’s not obvious to me that we can just add the total spins, so that if \mathbf{S}_1 and \mathbf{S}_2 are the spin operators for two respective particles, then the total system has a spin operator \mathbf{S} = \mathbf{S}_1 + \mathbf{S}_2 (really \mathbf{S} = \mathbf{S}_1 \otimes I_2 + I_2 \otimes \mathbf{S}_2, since the respective spin operators only act on their respective particles).

Let’s develop a bit of intuition on this, by calculating the energy eigenvalues of \mathbf{S}_1 \cdot \mathbf{S}_2 using Pauli matrices.

First let’s look at how each of the Pauli matrices operates on the S_z eigenvectors

\begin{aligned}\sigma_x {\left\lvert {+} \right\rangle} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} \begin{bmatrix}1 \\ 0\end{bmatrix}=\begin{bmatrix}0 \\ 1 \end{bmatrix}= {\left\lvert {-} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.4a)

\begin{aligned}\sigma_x {\left\lvert {-} \right\rangle} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} \begin{bmatrix}0 \\ 1\end{bmatrix}=\begin{bmatrix}1 \\ 0 \end{bmatrix}= {\left\lvert {+} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.4b)

\begin{aligned}\sigma_y {\left\lvert {+} \right\rangle} = \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} \begin{bmatrix}1 \\ 0\end{bmatrix}=\begin{bmatrix}0 \\ i \end{bmatrix}= i {\left\lvert {-} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.4c)

\begin{aligned}\sigma_y {\left\lvert {-} \right\rangle} = \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} \begin{bmatrix}0 \\ 1\end{bmatrix}=\begin{bmatrix}-i \\ 0 \end{bmatrix}= -i {\left\lvert {+} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.4d)

\begin{aligned}\sigma_z {\left\lvert {+} \right\rangle} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \begin{bmatrix}1 \\ 0\end{bmatrix}=\begin{bmatrix}1 \\ 0 \end{bmatrix}= {\left\lvert {+} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.4e)

\begin{aligned}\sigma_z {\left\lvert {-} \right\rangle} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \begin{bmatrix}0 \\ 1\end{bmatrix}=-\begin{bmatrix}0 \\ 1 \end{bmatrix}= -{\left\lvert {-} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.4f)

Summarizing, these are

\begin{aligned}\sigma_x {\left\lvert {\pm} \right\rangle} = {\left\lvert {\mp} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.5a)

\begin{aligned}\sigma_y {\left\lvert {\pm} \right\rangle} = \pm i {\left\lvert {\mp} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.5b)

\begin{aligned}\sigma_z {\left\lvert {\pm} \right\rangle} = \pm {\left\lvert {\pm} \right\rangle}\end{aligned} \hspace{\stretch{1}}(1.0.5c)
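The summary (1.0.5) is easy to sanity check numerically. Here is a minimal sketch in plain Python, assuming kets are stored as two-component lists in the S_z basis. The helper names `apply` and `scale` are made up for this illustration.

```python
# Pauli matrices as nested lists; the built-in complex type handles sigma_y.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

PLUS = [1, 0]   # |+>
MINUS = [0, 1]  # |->


def apply(matrix, ket):
    """Multiply a 2x2 matrix into a two-component ket (hypothetical helper)."""
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]


def scale(c, ket):
    """Multiply a ket by the scalar c (hypothetical helper)."""
    return [c * k for k in ket]


# Each entry pairs a computed left-hand side with the claimed right-hand side.
checks = [
    (apply(SIGMA_X, PLUS), MINUS),              # sigma_x |+> = |->
    (apply(SIGMA_X, MINUS), PLUS),              # sigma_x |-> = |+>
    (apply(SIGMA_Y, PLUS), scale(1j, MINUS)),   # sigma_y |+> = i |->
    (apply(SIGMA_Y, MINUS), scale(-1j, PLUS)),  # sigma_y |-> = -i |+>
    (apply(SIGMA_Z, PLUS), PLUS),               # sigma_z |+> = |+>
    (apply(SIGMA_Z, MINUS), scale(-1, MINUS)),  # sigma_z |-> = -|->
]
all_agree = all(lhs == rhs for lhs, rhs in checks)
print(all_agree)
```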

For convenience let’s avoid any sort of direct product notation, with the composite operations defined implicitly by

\begin{aligned}\left( S_{1k} \otimes S_{2k} \right)\left( {\left\lvert {\alpha} \right\rangle} \otimes {\left\lvert {\beta} \right\rangle}  \right)=S_{1k} S_{2k} {\left\lvert {\alpha \beta} \right\rangle}=\left( S_{1k} {\left\lvert {\alpha} \right\rangle}  \right) \otimes\left( S_{2k} {\left\lvert {\beta} \right\rangle}  \right).\end{aligned} \hspace{\stretch{1}}(1.0.6)

Now let’s compute all the various operations

\begin{aligned}\begin{aligned}\sigma_{1x} \sigma_{2x} {\left\lvert {++} \right\rangle} &= {\left\lvert {--} \right\rangle} \\ \sigma_{1x} \sigma_{2x} {\left\lvert {--} \right\rangle} &= {\left\lvert {++} \right\rangle} \\ \sigma_{1x} \sigma_{2x} {\left\lvert {+-} \right\rangle} &= {\left\lvert {-+} \right\rangle} \\ \sigma_{1x} \sigma_{2x} {\left\lvert {-+} \right\rangle} &= {\left\lvert {+-} \right\rangle}\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.0.7a)

\begin{aligned}\begin{aligned}\sigma_{1y} \sigma_{2y} {\left\lvert {++} \right\rangle} &= i^2 {\left\lvert {--} \right\rangle} \\ \sigma_{1y} \sigma_{2y} {\left\lvert {--} \right\rangle} &= (-i)^2 {\left\lvert {++} \right\rangle} \\ \sigma_{1y} \sigma_{2y} {\left\lvert {+-} \right\rangle} &= i (-i) {\left\lvert {-+} \right\rangle} \\ \sigma_{1y} \sigma_{2y} {\left\lvert {-+} \right\rangle} &= (-i) i {\left\lvert {+-} \right\rangle}\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.0.7b)

\begin{aligned}\begin{aligned}\sigma_{1z} \sigma_{2z} {\left\lvert {++} \right\rangle} &= {\left\lvert {++} \right\rangle} \\ \sigma_{1z} \sigma_{2z} {\left\lvert {--} \right\rangle} &= (-1)^2 {\left\lvert {--} \right\rangle} \\ \sigma_{1z} \sigma_{2z} {\left\lvert {+-} \right\rangle} &= -{\left\lvert {+-} \right\rangle} \\ \sigma_{1z} \sigma_{2z} {\left\lvert {-+} \right\rangle} &= -{\left\lvert {-+} \right\rangle}\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.0.7c)

Tabulating first the action of the sum of the x and y operators we have

\begin{aligned}\begin{aligned}\left( \sigma_{1x} \sigma_{2x} + \sigma_{1y} \sigma_{2y}  \right) {\left\lvert {++} \right\rangle} &= 0 \\ \left( \sigma_{1x} \sigma_{2x} + \sigma_{1y} \sigma_{2y}  \right) {\left\lvert {--} \right\rangle} &= 0 \\ \left( \sigma_{1x} \sigma_{2x} + \sigma_{1y} \sigma_{2y}  \right) {\left\lvert {+-} \right\rangle} &= 2 {\left\lvert {-+} \right\rangle} \\ \left( \sigma_{1x} \sigma_{2x} + \sigma_{1y} \sigma_{2y}  \right) {\left\lvert {-+} \right\rangle} &= 2 {\left\lvert {+-} \right\rangle}\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.0.8)

so that, with \mathbf{S}_k = (\hbar/2) \boldsymbol{\sigma}_k for each particle,

\begin{aligned}\begin{aligned}\mathbf{S}_1 \cdot \mathbf{S}_2 {\left\lvert {++} \right\rangle} &= \frac{\hbar^2}{4} {\left\lvert {++} \right\rangle} \\ \mathbf{S}_1 \cdot \mathbf{S}_2 {\left\lvert {--} \right\rangle} &= \frac{\hbar^2}{4} {\left\lvert {--} \right\rangle} \\ \mathbf{S}_1 \cdot \mathbf{S}_2 {\left\lvert {+-} \right\rangle} &= \frac{\hbar^2}{4} \left( 2 {\left\lvert {-+} \right\rangle} - {\left\lvert {+-} \right\rangle} \right) \\ \mathbf{S}_1 \cdot \mathbf{S}_2 {\left\lvert {-+} \right\rangle} &= \frac{\hbar^2}{4} \left( 2 {\left\lvert {+-} \right\rangle} - {\left\lvert {-+} \right\rangle} \right)\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.0.9)

Now we are set to write out the Hamiltonian matrix. Doing this with respect to the basis \beta = \{ {\left\lvert {++} \right\rangle}, {\left\lvert {--} \right\rangle}, {\left\lvert {+-} \right\rangle}, {\left\lvert {-+} \right\rangle} \}, we have

\begin{aligned}H &= \mathbf{S}_1 \cdot \mathbf{S}_2 \\ &= \begin{bmatrix}\left\langle ++ \right\rvert H \left\lvert ++ \right\rangle & \left\langle ++ \right\rvert H \left\lvert -- \right\rangle & \left\langle ++ \right\rvert H \left\lvert +- \right\rangle & \left\langle ++ \right\rvert H \left\lvert -+ \right\rangle \\ \left\langle -- \right\rvert H \left\lvert ++ \right\rangle & \left\langle -- \right\rvert H \left\lvert -- \right\rangle & \left\langle -- \right\rvert H \left\lvert +- \right\rangle & \left\langle -- \right\rvert H \left\lvert -+ \right\rangle \\ \left\langle +- \right\rvert H \left\lvert ++ \right\rangle & \left\langle +- \right\rvert H \left\lvert -- \right\rangle & \left\langle +- \right\rvert H \left\lvert +- \right\rangle & \left\langle +- \right\rvert H \left\lvert -+ \right\rangle \\ \left\langle -+ \right\rvert H \left\lvert ++ \right\rangle & \left\langle -+ \right\rvert H \left\lvert -- \right\rangle & \left\langle -+ \right\rvert H \left\lvert +- \right\rangle & \left\langle -+ \right\rvert H \left\lvert -+ \right\rangle \end{bmatrix} \\ &= \frac{\hbar^2}{4} \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 2 \\ 0 & 0 & 2 & -1 \end{bmatrix} \end{aligned} \hspace{\stretch{1}}(1.0.10)

Two of the eigenvalues we can read off by inspection, and for the other two we need to solve

\begin{aligned}0 =\begin{vmatrix}-\hbar^2/4 - \lambda & \hbar^2/2 \\ \hbar^2/2 & -\hbar^2/4 - \lambda\end{vmatrix}= (\hbar^2/4 + \lambda)^2 - (\hbar^2/2)^2\end{aligned} \hspace{\stretch{1}}(1.0.11)


or

\begin{aligned}\lambda = -\frac{\hbar^2}{4} \pm \frac{\hbar^2}{2} = \frac{\hbar^2}{4}, -\frac{3 \hbar^2}{4}.\end{aligned} \hspace{\stretch{1}}(1.0.12)

These are the last of the triplet energy eigenvalues and the singlet value that we expected from the spin addition method. The eigenvector for the \hbar^2/4 eigenvalue is given by the solution of

\begin{aligned}0 =\frac{\hbar^2}{2}\begin{bmatrix}-1 & 1 \\ 1 & -1\end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(1.0.13)

So the eigenvector is

\begin{aligned}\frac{1}{{\sqrt{2}}} \left( {\left\lvert {+-} \right\rangle} + {\left\lvert {-+} \right\rangle} \right)\end{aligned} \hspace{\stretch{1}}(1.0.14)

For our -3\hbar^2/4 eigenvalue we seek

\begin{aligned}0 =\frac{\hbar^2}{2}\begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(1.0.15)

So the eigenvector is

\begin{aligned}\frac{1}{{\sqrt{2}}} \left( {\left\lvert {+-} \right\rangle} - {\left\lvert {-+} \right\rangle} \right)\end{aligned} \hspace{\stretch{1}}(1.0.16)

An orthonormal basis with respective eigenvalues \hbar^2/4 (\times 3), -3\hbar^2/4 is thus given by

\begin{aligned}\beta' = \left\{{\left\lvert {++} \right\rangle},{\left\lvert {--} \right\rangle},\frac{1}{{\sqrt{2}}} \left( {\left\lvert {+-} \right\rangle} + {\left\lvert {-+} \right\rangle} \right),\frac{1}{{\sqrt{2}}} \left( {\left\lvert {+-} \right\rangle} - {\left\lvert {-+} \right\rangle} \right)\right\}.\end{aligned} \hspace{\stretch{1}}(1.0.17)
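Here is a small numerical cross check of the matrix (1.0.10) and the eigenvectors above, as a sketch in plain Python. Two things differ from the text and are assumptions of this sketch: everything is in units of \hbar^2/4 (so the operator built is \boldsymbol{\sigma}_1 \cdot \boldsymbol{\sigma}_2), and the Kronecker product orders the basis as {|++>, |+->, |-+>, |-->} rather than \beta. The helpers `kron` and `matvec` are hand-rolled for the illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matvec(M, v):
    """Matrix times column vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


PAULI = [
    [[0, 1], [1, 0]],     # sigma_x
    [[0, -1j], [1j, 0]],  # sigma_y
    [[1, 0], [0, -1]],    # sigma_z
]

# sigma_1 . sigma_2 = sum_k sigma_k (x) sigma_k, which is H in units of hbar^2/4.
H = [[0] * 4 for _ in range(4)]
for s in PAULI:
    term = kron(s, s)
    H = [[h + t for h, t in zip(h_row, t_row)] for h_row, t_row in zip(H, term)]

# Unnormalized eigenvectors in the kron ordering, with the claimed eigenvalues.
candidates = [
    ([1, 0, 0, 0], 1),    # |++>
    ([0, 0, 0, 1], 1),    # |-->
    ([0, 1, 1, 0], 1),    # |+-> + |-+>
    ([0, 1, -1, 0], -3),  # |+-> - |-+>
]
eigen_ok = all(matvec(H, v) == [lam * x for x in v] for v, lam in candidates)
print(H)
print(eigen_ok)
```

In this ordering the coupled 2x2 block sits in the middle of the matrix instead of the lower right corner, but it is the same block, with eigenvalues 1 and -3 in these units.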

Confirmation of spin additivity.

Let’s use this to confirm that the eigenvalues of (\mathbf{S}_1 + \mathbf{S}_2)^2 for the two spin 1/2 particles have the form

\begin{aligned}S(S + 1) \hbar^2.\end{aligned} \hspace{\stretch{1}}(1.0.18)


With

\begin{aligned}(\mathbf{S}_1 + \mathbf{S}_2)^2 = \mathbf{S}_1^2 + \mathbf{S}_2^2 + 2 \mathbf{S}_1 \cdot \mathbf{S}_2,\end{aligned} \hspace{\stretch{1}}(1.0.19)

we have for the \hbar^2/4 energy eigenstate of \mathbf{S}_1 \cdot \mathbf{S}_2

\begin{aligned}2 \hbar^2 \frac{1}{{2}} \left( 1 + \frac{1}{{2}}  \right) + 2 \frac{\hbar^2}{4} = 2 \hbar^2,\end{aligned} \hspace{\stretch{1}}(1.0.20)

and for the -3\hbar^2/4 energy eigenstate of \mathbf{S}_1 \cdot \mathbf{S}_2

\begin{aligned}2 \hbar^2 \frac{1}{{2}} \left( 1 + \frac{1}{{2}}  \right) + 2 \left( - \frac{3 \hbar^2}{4}  \right) = 0.\end{aligned} \hspace{\stretch{1}}(1.0.21)

We get the 2 \hbar^2 and 0 eigenvalues respectively as expected, matching S(S + 1) \hbar^2 with S = 1 for the triplet states and S = 0 for the singlet.

Posted in Math and Physics Learning.

Geometric Algebra. The very quickest introduction.

Posted by peeterjoot on March 17, 2012


An attempt to make a relatively concise introduction to Geometric (or Clifford) Algebra. Much more complete introductions to the subject can be found in [1], [2], and [3].


We have a few basic principles upon which the algebra is based

  1. Vectors can be multiplied.
  2. The square of a vector is the (squared) length of that vector (with appropriate generalizations for non-Euclidean metrics).
  3. Vector products are associative (but not necessarily commutative).

That’s really all there is to it, and the rest, paraphrasing Feynman, can be figured out by anybody sufficiently clever.

By example. The 2D case.

Consider a 2D Euclidean space, and the product of two vectors \mathbf{a} and \mathbf{b} in that space. Utilizing a standard orthonormal basis \{\mathbf{e}_1, \mathbf{e}_2\} we can write

\begin{aligned}\mathbf{a} &= \mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 \\ \mathbf{b} &= \mathbf{e}_1 y_1 + \mathbf{e}_2 y_2,\end{aligned} \hspace{\stretch{1}}(3.1)

and let’s write out the product of these two vectors \mathbf{a} \mathbf{b}, not yet knowing what we will end up with. That is

\begin{aligned}\mathbf{a} \mathbf{b} &= (\mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 )( \mathbf{e}_1 y_1 + \mathbf{e}_2 y_2 ) \\ &= \mathbf{e}_1^2 x_1 y_1 + \mathbf{e}_2^2 x_2 y_2+ \mathbf{e}_1 \mathbf{e}_2 x_1 y_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 y_1\end{aligned}

From axiom 2 we have \mathbf{e}_1^2 = \mathbf{e}_2^2 = 1, so we have

\begin{aligned}\mathbf{a} \mathbf{b} = x_1 y_1 + x_2 y_2 + \mathbf{e}_1 \mathbf{e}_2 x_1 y_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 y_1.\end{aligned} \hspace{\stretch{1}}(3.3)

We’ve multiplied two vectors and ended up with a scalar component (and recognize that this part of the vector product is the dot product), and a component that is a “something else”. We’ll call this something else a bivector, and see that it is characterized by a product of non-colinear vectors. These products \mathbf{e}_1 \mathbf{e}_2 and \mathbf{e}_2 \mathbf{e}_1 are in fact related, and we can see that by looking at the case of \mathbf{b} = \mathbf{a}. For that we have

\begin{aligned}\mathbf{a}^2 &=x_1 x_1 + x_2 x_2 + \mathbf{e}_1 \mathbf{e}_2 x_1 x_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 x_1 \\ &={\left\lvert{\mathbf{a}}\right\rvert}^2 +x_1 x_2 ( \mathbf{e}_1 \mathbf{e}_2 + \mathbf{e}_2 \mathbf{e}_1 )\end{aligned}

Since axiom (2) requires the square of a vector to equal its (squared) length, we must then have

\begin{aligned}\mathbf{e}_1 \mathbf{e}_2 + \mathbf{e}_2 \mathbf{e}_1 = 0,\end{aligned} \hspace{\stretch{1}}(3.4)


or

\begin{aligned}\mathbf{e}_2 \mathbf{e}_1 = -\mathbf{e}_1 \mathbf{e}_2.\end{aligned} \hspace{\stretch{1}}(3.5)

We see that Euclidean orthonormal vectors anticommute. What we can see with some additional study is that any colinear vectors commute, and in Euclidean spaces (of any dimension) vectors that are normal to each other anticommute (this can also be taken as a definition of normal).

We can now return to our product of two vectors 3.3 and simplify it slightly

\begin{aligned}\mathbf{a} \mathbf{b} = x_1 y_1 + x_2 y_2 + \mathbf{e}_1 \mathbf{e}_2 (x_1 y_2 - x_2 y_1).\end{aligned} \hspace{\stretch{1}}(3.6)

The product of two vectors in 2D is seen here to have one scalar component, and one bivector component (an irreducible product of two normal vectors). Observe the symmetric and antisymmetric split of the scalar and bivector components above. This symmetry and antisymmetry can be made explicit, introducing dot and wedge product notation respectively

\begin{aligned}\mathbf{a} \cdot \mathbf{b} &= \frac{1}{{2}}( \mathbf{a} \mathbf{b} + \mathbf{b} \mathbf{a}) = x_1 y_1 + x_2 y_2 \\ \mathbf{a} \wedge \mathbf{b} &= \frac{1}{{2}}( \mathbf{a} \mathbf{b} - \mathbf{b} \mathbf{a}) = \mathbf{e}_1 \mathbf{e}_2 (x_1 y_2 - x_2 y_1).\end{aligned} \hspace{\stretch{1}}(3.7)

so that the vector product can be written as

\begin{aligned}\mathbf{a} \mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}.\end{aligned} \hspace{\stretch{1}}(3.9)


In many contexts it is useful to introduce the ordered product of all the unit vectors for the space, which is called the pseudoscalar. In our 2D case this is

\begin{aligned}i = \mathbf{e}_1 \mathbf{e}_2,\end{aligned} \hspace{\stretch{1}}(4.10)

a quantity that we find behaves like the complex imaginary. That can be shown by considering its square

\begin{aligned}(\mathbf{e}_1 \mathbf{e}_2)^2&=(\mathbf{e}_1 \mathbf{e}_2)(\mathbf{e}_1 \mathbf{e}_2) \\ &=\mathbf{e}_1 (\mathbf{e}_2 \mathbf{e}_1) \mathbf{e}_2 \\ &=-\mathbf{e}_1 (\mathbf{e}_1 \mathbf{e}_2) \mathbf{e}_2 \\ &=-(\mathbf{e}_1 \mathbf{e}_1) (\mathbf{e}_2 \mathbf{e}_2) \\ &=-1^2 \\ &= -1\end{aligned}

Here the anticommutation of normal vectors property has been used, as well as (for the first time) the associative multiplication axiom.
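The 2D product rules above are compact enough to code up directly. What follows is only a sketch: multivectors are stored as tuples (scalar, e1, e2, e12), a representation chosen for this illustration, and the function names `gp` and `vector` are made up. The multiplication table in `gp` assumes nothing beyond \mathbf{e}_1^2 = \mathbf{e}_2^2 = 1 and the anticommutation (3.5).

```python
def gp(a, b):
    """Geometric product in 2D Euclidean GA.

    Multivectors are tuples (scalar, e1, e2, e12), with e12 = e1 e2.
    """
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,  # scalar part
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,  # e1 part
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,  # e2 part
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,  # e12 part
    )


def vector(x1, x2):
    """Embed the vector e1 x1 + e2 x2 as a multivector."""
    return (0, x1, x2, 0)


a = vector(1, 2)
b = vector(3, 4)
ab = gp(a, b)
ba = gp(b, a)

# Symmetric and antisymmetric halves: the dot and wedge products.
dot = tuple((p + q) / 2 for p, q in zip(ab, ba))
wedge = tuple((p - q) / 2 for p, q in zip(ab, ba))

e12 = (0, 0, 0, 1)
i_squared = gp(e12, e12)

print(ab)          # scalar part x1 y1 + x2 y2, e12 part x1 y2 - x2 y1
print(dot, wedge)
print(i_squared)   # scalar part is the square of the pseudoscalar
```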

In a 3D context, you’ll see the pseudoscalar in many places (expressing the normals to planes for example). It also shows up in a number of fundamental relationships. For example, if one writes

\begin{aligned}I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3\end{aligned} \hspace{\stretch{1}}(4.11)

for the 3D pseudoscalar, then it’s also possible to show

\begin{aligned}\mathbf{a} \mathbf{b} = \mathbf{a} \cdot \mathbf{b} + I (\mathbf{a} \times \mathbf{b})\end{aligned} \hspace{\stretch{1}}(4.12)

something that will be familiar to the student of QM, where we see this in the context of Pauli matrices. The Pauli matrices also encode a Clifford algebraic structure, but we do not need an explicit matrix representation to do so.
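For the QM reader, here is a quick numerical check of the matrix form of that identity, (\mathbf{a} \cdot \boldsymbol{\sigma})(\mathbf{b} \cdot \boldsymbol{\sigma}) = (\mathbf{a} \cdot \mathbf{b}) + i (\mathbf{a} \times \mathbf{b}) \cdot \boldsymbol{\sigma}. This is a sketch that assumes the usual identification of \mathbf{e}_k with \sigma_k, under which the pseudoscalar I is represented by i times the identity matrix. The helper names and the sample vectors are arbitrary choices for the illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def sigma_dot(v):
    """Return v_x sigma_x + v_y sigma_y + v_z sigma_z as a 2x2 matrix."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


a = (1, 2, 3)
b = (4, -5, 6)

# Left-hand side: the matrix product of the two "vectors".
lhs = matmul(sigma_dot(a), sigma_dot(b))

# Right-hand side: (a . b) times the identity, plus i times sigma . (a x b).
c = sigma_dot(cross(a, b))
rhs = [[dot(a, b) * (1 if i == j else 0) + 1j * c[i][j] for j in range(2)]
       for i in range(2)]

print(lhs == rhs)
```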


Very much like complex numbers we can utilize exponentials to perform rotations. Rotating in a sense from \mathbf{e}_1 to \mathbf{e}_2, can be expressed as

\begin{aligned}\mathbf{a} e^{i \theta}&=(\mathbf{e}_1 x_1 + \mathbf{e}_2 x_2) (\cos\theta + \mathbf{e}_1 \mathbf{e}_2 \sin\theta) \\ &=\mathbf{e}_1 (x_1 \cos\theta - x_2 \sin\theta)+\mathbf{e}_2 (x_2 \cos\theta + x_1 \sin\theta)\end{aligned}
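The expansion above can be checked numerically by multiplying a vector by \cos\theta + \mathbf{e}_1 \mathbf{e}_2 \sin\theta using the 2D geometric product. A minimal sketch, again with the made-up tuple representation (scalar, e1, e2, e12) and illustrative names `gp` and `rotate`:

```python
import math


def gp(a, b):
    """Geometric product in 2D Euclidean GA on tuples (scalar, e1, e2, e12)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )


def rotate(x1, x2, theta):
    """Rotate e1 x1 + e2 x2 by right multiplication with exp(e12 theta)."""
    rotor = (math.cos(theta), 0.0, 0.0, math.sin(theta))
    _, r1, r2, _ = gp((0.0, x1, x2, 0.0), rotor)
    return r1, r2


theta = math.pi / 6
x1, x2 = 2.0, 1.0
rotated = rotate(x1, x2, theta)

# Components read off from the expansion in the text.
expected = (x1 * math.cos(theta) - x2 * math.sin(theta),
            x2 * math.cos(theta) + x1 * math.sin(theta))
print(rotated, expected)

# A quarter turn should take e1 to e2, up to rounding.
print(rotate(1.0, 0.0, math.pi / 2))
```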

More generally, even in N dimensional Euclidean spaces, if \mathbf{a} is a vector in a plane, and \hat{\mathbf{u}} and \hat{\mathbf{v}} are perpendicular unit vectors in that plane, then the rotation through angle \theta is given by

\begin{aligned}\mathbf{a} \rightarrow \mathbf{a} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \theta}.\end{aligned} \hspace{\stretch{1}}(5.13)

This is illustrated in figure (1).

Figure 1: Plane rotation.


Notice that we have expressed the rotation here without utilizing a normal direction for the plane. The sense of the rotation is encoded by the bivector \hat{\mathbf{u}} \hat{\mathbf{v}} that describes the plane and the orientation of the rotation (or by duality the direction of the normal in a 3D space). By avoiding a requirement to encode the rotation using a normal to the plane we have a method of expressing the rotation that works not only in 3D spaces, but also in 2D and greater than 3D spaces, something that isn’t possible when we restrict ourselves to traditional vector algebra (where quantities like the cross product can’t be defined in a 2D or 4D space, despite the fact that the things they may represent, like torque, are planar phenomena that do not have any intrinsic requirement for a normal that falls out of the plane).

When \mathbf{a} does not lie in the plane spanned by the vectors \hat{\mathbf{u}} and \hat{\mathbf{v}} , as in figure (2), we must express the rotations differently. A rotation then takes the form

\begin{aligned}\mathbf{a} \rightarrow e^{-\hat{\mathbf{u}} \hat{\mathbf{v}} \theta/2} \mathbf{a} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \theta/2}.\end{aligned} \hspace{\stretch{1}}(5.14)

Figure 2: 3D rotation.


In the 2D case, and when the vector lies in the plane this reduces to the one sided complex exponential operator used above. We see these types of paired half angle rotations in QM, and they are also used extensively in computer graphics under the guise of quaternions.
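Here is one way to try out the two sided rotation (5.14) numerically. This is only a sketch, and it leans on the Pauli matrix representation mentioned earlier (\mathbf{e}_k represented by \sigma_k, so that \mathbf{e}_1 \mathbf{e}_2 becomes \sigma_x \sigma_y = i \sigma_z) instead of a full geometric algebra implementation. The function names are invented for the illustration, and the rotation plane is fixed to \hat{\mathbf{u}} \hat{\mathbf{v}} = \mathbf{e}_1 \mathbf{e}_2.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def vec_to_matrix(v):
    """Represent x e1 + y e2 + z e3 as x sigma_x + y sigma_y + z sigma_z."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]


def matrix_to_vec(M):
    """Read the vector components back out of the matrix representation."""
    return (M[1][0].real, M[1][0].imag, M[0][0].real)


def exp_e1e2(angle):
    """exp(e1 e2 angle) = cos(angle) + e1 e2 sin(angle), with e1 e2 -> i sigma_z."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c + 1j * s, 0], [0, c - 1j * s]]


def rotate(v, theta):
    """Apply exp(-e1 e2 theta/2) v exp(e1 e2 theta/2)."""
    left = exp_e1e2(-theta / 2)
    right = exp_e1e2(theta / 2)
    return matrix_to_vec(matmul(matmul(left, vec_to_matrix(v)), right))


# A vector with a component out of the e1, e2 plane, rotated by a quarter turn.
rotated = rotate((1.0, 0.0, 1.0), math.pi / 2)
print(rotated)
```

The in-plane part e1 should move to e2 while the e3 component is left alone, which is the behaviour that the one sided form cannot provide for vectors out of the plane.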


[1] L. Dorst, D. Fontijne, and S. Mann. Geometric Algebra for Computer Science. Morgan Kaufmann, San Francisco, 2007.

[2] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[3] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.

Posted in Math and Physics Learning.

PHY456H1F: Quantum Mechanics II. Lecture 16 (Taught by Prof J.E. Sipe). Hydrogen atom with spin, and two spin systems.

Posted by peeterjoot on November 2, 2011


Peeter’s lecture notes from class. May not be entirely coherent.

The hydrogen atom with spin.

READING: what chapter of [1] ?

For a spinless hydrogen atom, the Hilbert space was

\begin{aligned}H = H_{\text{CM}} \otimes H_{\text{rel}}\end{aligned} \hspace{\stretch{1}}(2.1)

where we have independent Hilbert spaces for the motion of the center of mass and the relative motion of the electron to the proton.

The basis kets for these could be designated {\left\lvert {\mathbf{p}_\text{CM}} \right\rangle} and {\left\lvert {\mathbf{p}_\text{rel}} \right\rangle} respectively.

Now we want to augment this, treating

\begin{aligned}H = H_{\text{CM}} \otimes H_{\text{rel}} \otimes H_{\text{s}}\end{aligned} \hspace{\stretch{1}}(2.2)

where H_{\text{s}} is the Hilbert space for the spin of the electron. We are neglecting the spin of the proton, but that could also be included (this turns out to be a lesser effect).

We’ll consider the space that describes the relative motion and the electron spin

\begin{aligned}H_{\text{rel}} \otimes H_{\text{s}}\end{aligned} \hspace{\stretch{1}}(2.3)

To span the Hilbert space for this system we’ll use basis kets

\begin{aligned}{\left\lvert {nlm\pm} \right\rangle}\end{aligned} \hspace{\stretch{1}}(2.4)

\begin{aligned}\begin{aligned}{\left\lvert {nlm+} \right\rangle} &\rightarrow \begin{bmatrix}\left\langle{{\mathbf{r}+}} \vert {{nlm+}}\right\rangle \\ \left\langle{{\mathbf{r}-}} \vert {{nlm+}}\right\rangle \\ \end{bmatrix}=\begin{bmatrix}\Phi_{nlm}(\mathbf{r}) \\ 0\end{bmatrix} \\ {\left\lvert {nlm-} \right\rangle} &\rightarrow \begin{bmatrix}\left\langle{{\mathbf{r}+}} \vert {{nlm-}}\right\rangle \\ \left\langle{{\mathbf{r}-}} \vert {{nlm-}}\right\rangle \\ \end{bmatrix}=\begin{bmatrix}0 \\ \Phi_{nlm}(\mathbf{r}) \end{bmatrix}.\end{aligned}\end{aligned} \hspace{\stretch{1}}(2.5)

Here \mathbf{r} should be understood to really mean \mathbf{r}_\text{rel}. Our full Hamiltonian, after introducing a magnetic perturbation, is

\begin{aligned}H = \frac{P_\text{CM}^2}{2M} + \left(\frac{P_\text{rel}^2}{2\mu}-\frac{e^2}{R_\text{rel}}\right)- \boldsymbol{\mu}_0 \cdot \mathbf{B}- \boldsymbol{\mu}_s \cdot \mathbf{B}\end{aligned} \hspace{\stretch{1}}(2.6)


where

\begin{aligned}M = m_\text{proton} + m_\text{electron},\end{aligned} \hspace{\stretch{1}}(2.7)

and

\begin{aligned}\frac{1}{{\mu}} = \frac{1}{{m_\text{proton}}} + \frac{1}{{m_\text{electron}}}.\end{aligned} \hspace{\stretch{1}}(2.8)

For a uniform magnetic field

\begin{aligned}\boldsymbol{\mu}_0 &= \left( -\frac{e}{2 m c} \right) \mathbf{L} \\ \boldsymbol{\mu}_s &= g \left( -\frac{e}{2 m c} \right) \mathbf{S}\end{aligned} \hspace{\stretch{1}}(2.9)

We also have higher order terms (higher order multipoles) and relativistic corrections (like spin orbit coupling [2]).

Two spins.

READING: section 28 of [1].

Example: Consider two electrons, one in each of two quantum dots.

\begin{aligned}H = H_{1} \otimes H_{2}\end{aligned} \hspace{\stretch{1}}(3.11)

where H_1 and H_2 are the 2D spin Hilbert spaces of the respective electrons. Our complete Hilbert space is thus a 4D space.

We’ll write

\begin{aligned}\begin{aligned}{\left\lvert {+} \right\rangle}_1 \otimes {\left\lvert {+} \right\rangle}_2 &= {\left\lvert {++} \right\rangle} \\ {\left\lvert {+} \right\rangle}_1 \otimes {\left\lvert {-} \right\rangle}_2 &= {\left\lvert {+-} \right\rangle} \\ {\left\lvert {-} \right\rangle}_1 \otimes {\left\lvert {+} \right\rangle}_2 &= {\left\lvert {-+} \right\rangle} \\ {\left\lvert {-} \right\rangle}_1 \otimes {\left\lvert {-} \right\rangle}_2 &= {\left\lvert {--} \right\rangle} \end{aligned}\end{aligned} \hspace{\stretch{1}}(3.12)

We can introduce

\begin{aligned}\mathbf{S}_1 &= \mathbf{S}_1^{(1)} \otimes I^{(2)} \\ \mathbf{S}_2 &= I^{(1)} \otimes \mathbf{S}_2^{(2)}\end{aligned} \hspace{\stretch{1}}(3.13)

Here we “promote” each of the individual spin operators to spin operators in the complete Hilbert space.

We write

\begin{aligned}S_{1z}{\left\lvert {++} \right\rangle} &= \frac{\hbar}{2} {\left\lvert {++} \right\rangle} \\ S_{1z}{\left\lvert {+-} \right\rangle} &= \frac{\hbar}{2} {\left\lvert {+-} \right\rangle}\end{aligned} \hspace{\stretch{1}}(3.15)


and

\begin{aligned}\mathbf{S} = \mathbf{S}_1 + \mathbf{S}_2,\end{aligned} \hspace{\stretch{1}}(3.17)

for the full spin angular momentum operator. The z component of this operator is

\begin{aligned}S_z = S_{1z} + S_{2z}\end{aligned} \hspace{\stretch{1}}(3.18)

\begin{aligned}S_z{\left\lvert {++} \right\rangle} &= (S_{1z} + S_{2z}) {\left\lvert {++} \right\rangle} = \left( \frac{\hbar}{2} +\frac{\hbar}{2} \right) {\left\lvert {++} \right\rangle} = \hbar {\left\lvert {++} \right\rangle} \\  S_z{\left\lvert {+-} \right\rangle} &= (S_{1z} + S_{2z}) {\left\lvert {+-} \right\rangle} = \left( \frac{\hbar}{2} -\frac{\hbar}{2} \right) {\left\lvert {+-} \right\rangle} = 0 \\ S_z{\left\lvert {-+} \right\rangle} &= (S_{1z} + S_{2z}) {\left\lvert {-+} \right\rangle} = \left( -\frac{\hbar}{2} +\frac{\hbar}{2} \right) {\left\lvert {-+} \right\rangle} = 0 \\ S_z{\left\lvert {--} \right\rangle} &= (S_{1z} + S_{2z}) {\left\lvert {--} \right\rangle} = \left( -\frac{\hbar}{2} -\frac{\hbar}{2} \right) {\left\lvert {--} \right\rangle} = -\hbar {\left\lvert {--} \right\rangle} \end{aligned} \hspace{\stretch{1}}(3.19)

So, we find that the product kets {\left\lvert {x x} \right\rangle} (each x standing for either + or -) are all eigenkets of S_z. These will also all be eigenkets of \mathbf{S}_1^2 = S_{1x}^2 +S_{1y}^2 +S_{1z}^2, and likewise of \mathbf{S}_2^2, since we have

\begin{aligned}S_1^2 {\left\lvert {x x} \right\rangle} &= \hbar^2 \left(\frac{1}{{2}}\right) \left(1 + \frac{1}{{2}}\right) {\left\lvert {x x} \right\rangle} = \frac{3}{4} \hbar^2 {\left\lvert {x x} \right\rangle} \\ S_2^2 {\left\lvert {x x} \right\rangle} &= \hbar^2 \left(\frac{1}{{2}}\right) \left(1 + \frac{1}{{2}}\right) {\left\lvert {x x} \right\rangle} = \frac{3}{4} \hbar^2 {\left\lvert {x x} \right\rangle} \end{aligned} \hspace{\stretch{1}}(3.23)

\begin{aligned}\begin{aligned}S^2 &= (\mathbf{S}_1+\mathbf{S}_2) \cdot(\mathbf{S}_1+\mathbf{S}_2)  \\ &= S_1^2 + S_2^2 + 2 \mathbf{S}_1 \cdot \mathbf{S}_2\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.25)

Are all the product kets also eigenkets of S^2? Calculate

\begin{aligned}S^2 {\left\lvert {+-} \right\rangle} &= (S_1^2 + S_2^2 + 2 \mathbf{S}_1 \cdot \mathbf{S}_2) {\left\lvert {+-} \right\rangle} \\ &=\left(\frac{3}{4}\hbar^2+\frac{3}{4}\hbar^2\right){\left\lvert {+-} \right\rangle}+ 2 S_{1x} S_{2x} {\left\lvert {+-} \right\rangle} + 2 S_{1y} S_{2y} {\left\lvert {+-} \right\rangle} + 2 S_{1z} S_{2z} {\left\lvert {+-} \right\rangle} \end{aligned}

For the z mixed terms, we have

\begin{aligned}2 S_{1z} S_{2z} {\left\lvert {+-} \right\rangle}  = 2 \left(\frac{\hbar}{2}\right)\left(-\frac{\hbar}{2}\right){\left\lvert {+-} \right\rangle}\end{aligned} \hspace{\stretch{1}}(3.26)


so that

\begin{aligned}S^2{\left\lvert {+-} \right\rangle} = \hbar^2 {\left\lvert {+-} \right\rangle} + 2 S_{1x} S_{2x} {\left\lvert {+-} \right\rangle} + 2 S_{1y} S_{2y} {\left\lvert {+-} \right\rangle} \end{aligned} \hspace{\stretch{1}}(3.27)

Since we have set our spin direction in the z direction with

\begin{aligned}{\left\lvert {+} \right\rangle} &\rightarrow \begin{bmatrix}1 \\ 0\end{bmatrix} \\ {\left\lvert {-} \right\rangle} &\rightarrow \begin{bmatrix}0 \\ 1 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.28)

We have

\begin{aligned}S_x{\left\lvert {+} \right\rangle} &\rightarrow \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\begin{bmatrix}1 \\ 0\end{bmatrix} =\frac{\hbar}{2}\begin{bmatrix}0 \\ 1 \end{bmatrix}=\frac{\hbar}{2} {\left\lvert {-} \right\rangle} \\ S_x{\left\lvert {-} \right\rangle} &\rightarrow \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\begin{bmatrix}0 \\ 1 \end{bmatrix} =\frac{\hbar}{2}\begin{bmatrix}1  \\ 0 \end{bmatrix}=\frac{\hbar}{2} {\left\lvert {+} \right\rangle} \\ S_y{\left\lvert {+} \right\rangle} &\rightarrow \frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix}\begin{bmatrix}1  \\ 0 \end{bmatrix} =\frac{i\hbar}{2}\begin{bmatrix}0  \\ 1 \end{bmatrix}=\frac{i\hbar}{2} {\left\lvert {-} \right\rangle} \\ S_y{\left\lvert {-} \right\rangle} &\rightarrow \frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix}\begin{bmatrix}0  \\ 1 \end{bmatrix} =\frac{-i\hbar}{2}\begin{bmatrix}1  \\ 0 \end{bmatrix}=-\frac{i\hbar}{2} {\left\lvert {+} \right\rangle} \\ \end{aligned}

And are able to arrive at the action of S^2 on our mixed composite state

\begin{aligned}S^2{\left\lvert {+-} \right\rangle} = \hbar^2 ({\left\lvert {+-} \right\rangle} + {\left\lvert {-+} \right\rangle} ).\end{aligned} \hspace{\stretch{1}}(3.30)

For the action on the {\left\lvert {++} \right\rangle} state we have

\begin{aligned}S^2 {\left\lvert {++} \right\rangle} &=\left(\frac{3}{4}\hbar^2 +\frac{3}{4}\hbar^2\right){\left\lvert {++} \right\rangle} + 2 \frac{\hbar^2}{4} {\left\lvert {--} \right\rangle} + 2 i^2 \frac{\hbar^2}{4} {\left\lvert {--} \right\rangle} +2 \left(\frac{\hbar}{2}\right)\left(\frac{\hbar}{2}\right){\left\lvert {++} \right\rangle} \\ &=2 \hbar^2 {\left\lvert {++} \right\rangle} \\ \end{aligned}

and on the {\left\lvert {--} \right\rangle} state we have

\begin{aligned}S^2 {\left\lvert {--} \right\rangle} &=\left(\frac{3}{4}\hbar^2 +\frac{3}{4}\hbar^2\right){\left\lvert {--} \right\rangle} + 2 \frac{(-\hbar)^2}{4} {\left\lvert {++} \right\rangle} + 2 i^2 \frac{\hbar^2}{4} {\left\lvert {++} \right\rangle} +2 \left(-\frac{\hbar}{2}\right)\left(-\frac{\hbar}{2}\right){\left\lvert {--} \right\rangle} \\ &=2 \hbar^2 {\left\lvert {--} \right\rangle} \end{aligned}

All of this can be assembled into a tidier matrix form

\begin{aligned}S^2\rightarrow \hbar^2\begin{bmatrix}2 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \\ \end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.31)

where the matrix is taken with respect to the (ordered) basis

\begin{aligned}\{{\left\lvert {++} \right\rangle},{\left\lvert {+-} \right\rangle},{\left\lvert {-+} \right\rangle},{\left\lvert {--} \right\rangle}\}.\end{aligned} \hspace{\stretch{1}}(3.32)
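The matrix (3.31) can be cross checked by building the promoted operators of (3.13) explicitly with Kronecker products. A sketch in plain Python follows. It works in units where each total spin component is \sigma_k \otimes I + I \otimes \sigma_k (that is, units of \hbar/2), so the sum of squares comes out as 4 S^2/\hbar^2. The helper functions are hand-rolled for this illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


def add(A, B):
    """Elementwise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


IDENTITY = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],     # sigma_x
    [[0, -1j], [1j, 0]],  # sigma_y
    [[1, 0], [0, -1]],    # sigma_z
]

# Sum over k of (sigma_k (x) I + I (x) sigma_k)^2, which equals 4 S^2 / hbar^2.
four_s2 = [[0] * 4 for _ in range(4)]
for s in PAULI:
    total = add(kron(s, IDENTITY), kron(IDENTITY, s))
    four_s2 = add(four_s2, matmul(total, total))

# Four times the matrix quoted in (3.31), basis order |++>, |+->, |-+>, |-->.
expected = [[8, 0, 0, 0], [0, 4, 4, 0], [0, 4, 4, 0], [0, 0, 0, 8]]
print(four_s2 == expected)
```

Here the Kronecker product ordering coincides with the ordered basis (3.32), so no reshuffling of rows and columns is needed.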


Since we have

\begin{aligned}\left[{S^2},{S_z}\right] &= 0 \\ \left[{S_i},{S_j}\right] &= i \hbar \sum_k \epsilon_{ijk} S_k\end{aligned} \hspace{\stretch{1}}(3.33)

it should be possible to find simultaneous eigenkets of S^2 and S_z

\begin{aligned}S^2 {\left\lvert {s m_s} \right\rangle} &= s(s+1)\hbar^2 {\left\lvert {s m_s} \right\rangle} \\ S_z {\left\lvert {s m_s} \right\rangle} &= \hbar m_s {\left\lvert {s m_s} \right\rangle} \end{aligned} \hspace{\stretch{1}}(3.35)

An orthonormal set of eigenkets of S^2 and S_z is found to be

\begin{aligned}\begin{array}{l l}{\left\lvert {++} \right\rangle} & \mbox{$s = 1$ and $m_s = 1$} \\ \frac{1}{{\sqrt{2}}} \left( {\left\lvert {+-} \right\rangle} + {\left\lvert {-+} \right\rangle} \right) & \mbox{$s = 1$ and $m_s = 0$} \\ {\left\lvert {--} \right\rangle} & \mbox{$s = 1$ and $m_s = -1$} \\ \frac{1}{{\sqrt{2}}} \left( {\left\lvert {+-} \right\rangle} - {\left\lvert {-+} \right\rangle} \right) & \mbox{$s = 0$ and $m_s = 0$}\end{array}\end{aligned} \hspace{\stretch{1}}(3.37)
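The claimed (s, m_s) labels can be checked directly. The following is a sketch only, assuming numpy, \hbar = 1, and the ordered basis {++, +-, -+, --}. The helper names (total, is_eigenket, KETS) are hypothetical.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)


def total(sigma):
    """Total spin component (hbar = 1): S_1k (x) 1 + 1 (x) S_2k."""
    return 0.5 * (np.kron(sigma, I2) + np.kron(I2, sigma))


Sx, Sy, Sz = (total(s) for s in SIGMA)
S2 = Sx @ Sx + Sy @ Sy + Sz @ Sz

# Basis kets |++>, |+->, |-+>, |--> as the rows of the 4x4 identity.
pp, pm, mp, mm = np.eye(4, dtype=complex)

# Candidate eigenkets keyed by their claimed (s, m_s).
KETS = {
    (1, 1): pp,
    (1, 0): (pm + mp) / np.sqrt(2),
    (1, -1): mm,
    (0, 0): (pm - mp) / np.sqrt(2),
}


def is_eigenket(ket, s, m):
    """True if S^2 ket = s(s+1) ket and S_z ket = m ket."""
    return bool(np.allclose(S2 @ ket, s * (s + 1) * ket)
                and np.allclose(Sz @ ket, m * ket))


for (s, m), ket in KETS.items():
    print((s, m), is_eigenket(ket, s, m))
```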

The first three kets here can be grouped into a triplet in a 3D Hilbert space, whereas the last is treated as a singlet in a 1D Hilbert space.

Form a grouping

\begin{aligned}H = H_1 \otimes H_2\end{aligned} \hspace{\stretch{1}}(3.38)

We can write

\begin{aligned}\frac{1}{{2}} \otimes \frac{1}{{2}} = 1 \oplus 0\end{aligned} \hspace{\stretch{1}}(3.39)

where the 1 and 0 here refer to the spin index s.

Other examples

Consider, perhaps, the l=5 state of the hydrogen atom

\begin{aligned}J_1^2 {\left\lvert {j_1 m_1} \right\rangle} &= j_1(j_1+1)\hbar^2 {\left\lvert {j_1 m_1} \right\rangle} \\ J_{1z} {\left\lvert {j_1 m_1} \right\rangle} &= \hbar m_1 {\left\lvert {j_1 m_1} \right\rangle} \end{aligned} \hspace{\stretch{1}}(3.40)

\begin{aligned}J_2^2 {\left\lvert {j_2 m_2} \right\rangle} &= j_2(j_2+1)\hbar^2 {\left\lvert {j_2 m_2} \right\rangle} \\ J_{2z} {\left\lvert {j_2 m_2} \right\rangle} &= \hbar m_2 {\left\lvert {j_2 m_2} \right\rangle} \end{aligned} \hspace{\stretch{1}}(3.42)

Consider the Hilbert space spanned by {\left\lvert {j_1 m_1} \right\rangle} \otimes {\left\lvert {j_2 m_2} \right\rangle}, a (2 j_1 + 1)(2 j_2 + 1) dimensional space. How to find the eigenkets of J^2 and J_z?
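The dimension counting behind such decompositions can be sketched in a few lines. This assumes the standard addition rule, not stated in these notes, that the total j runs from |j_1 - j_2| to j_1 + j_2 in integer steps. The function names are my own, and the l = 5 with spin 1/2 case in the usage is my guess at what the hydrogen example was heading towards.

```python
from fractions import Fraction


def allowed_total_j(j1, j2):
    """List j = |j1 - j2|, |j1 - j2| + 1, ..., j1 + j2 (standard rule, assumed)."""
    j1, j2 = Fraction(j1), Fraction(j2)
    j = abs(j1 - j2)
    values = []
    while j <= j1 + j2:
        values.append(j)
        j += 1
    return values


def dimension(j):
    """Number of m values for a given j, that is 2j + 1."""
    return int(2 * Fraction(j) + 1)


def dimensions_match(j1, j2):
    """Check (2 j1 + 1)(2 j2 + 1) against the sum of 2j + 1 over allowed j."""
    product_space = dimension(j1) * dimension(j2)
    direct_sum = sum(dimension(j) for j in allowed_total_j(j1, j2))
    return product_space == direct_sum


half = Fraction(1, 2)
print(allowed_total_j(half, half), dimensions_match(half, half))
print(allowed_total_j(5, half), dimensions_match(5, half))
```

For two spin one half particles this reproduces the 1 \oplus 0 split: a three dimensional triplet plus a one dimensional singlet in a four dimensional product space.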


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

[2] Wikipedia. Spin–orbit interaction — wikipedia, the free encyclopedia [online]. 2011. [Online; accessed 2-November-2011].\%E2\%80\%93orbit_interaction&oldid=451606718.

Posted in Math and Physics Learning.

PHY456H1F: Quantum Mechanics II. Lecture 15 (Taught by Prof J.E. Sipe). Rotation operator in spin space

Posted by peeterjoot on October 31, 2011



Peeter’s lecture notes from class. May not be entirely coherent.

Rotation operator in spin space.

We can formally expand our rotation operator in Taylor series

\begin{aligned}e^{-i \theta \hat{\mathbf{n}} \cdot \mathbf{S}/\hbar}= I +\left(-i \theta \hat{\mathbf{n}} \cdot \mathbf{S}/\hbar\right)+\frac{1}{{2!}}\left(-i \theta \hat{\mathbf{n}} \cdot \mathbf{S}/\hbar\right)^2+\frac{1}{{3!}}\left(-i \theta \hat{\mathbf{n}} \cdot \mathbf{S}/\hbar\right)^3+ \cdots\end{aligned} \hspace{\stretch{1}}(2.1)


In the spin one half representation, where \mathbf{S} \rightarrow \hbar \boldsymbol{\sigma}/2, this is

\begin{aligned}e^{-i \theta \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}/2}&= I +\left(-i \theta \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}/2\right)+\frac{1}{{2!}}\left(-i \theta \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}/2\right)^2+\frac{1}{{3!}}\left(-i \theta \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}/2\right)^3+ \cdots \\ &=\sigma_0 +\left(\frac{-i \theta}{2}\right) (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})+\frac{1}{{2!}} \left(\frac{-i \theta}{2}\right)^2 (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})^2+\frac{1}{{3!}} \left(\frac{-i \theta}{2}\right)^3 (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})^3+ \cdots \\ &=\sigma_0 +\left(\frac{-i \theta}{2}\right) (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})+\frac{1}{{2!}} \left(\frac{-i \theta}{2}\right)^2 \sigma_0+\frac{1}{{3!}} \left(\frac{-i \theta}{2}\right)^3 (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma}) + \cdots \\ &=\sigma_0 \left( 1 - \frac{1}{{2!}}\left(\frac{\theta}{2}\right)^2 + \cdots \right) - i (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma}) \left( \frac{\theta}{2} - \frac{1}{{3!}}\left(\frac{\theta}{2}\right)^3 + \cdots \right) \\ &=\cos(\theta/2) \sigma_0 - i \sin(\theta/2) (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})\end{aligned}

where we’ve used the fact that (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})^2 = \sigma_0.
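To convince myself of the closed form for this exponential, here is a small numerical sketch, assuming numpy, that sums the Taylor series term by term and compares against \cos(\theta/2) \sigma_0 - i \sin(\theta/2) (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma}). The function names and the 40 term truncation are arbitrary choices of mine.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def n_dot_sigma(n):
    """n.sigma for the direction of n (n is normalized here)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return n[0] * SX + n[1] * SY + n[2] * SZ


def rotation_series(theta, n, terms=40):
    """Truncated Taylor series of exp(-i theta n.sigma / 2)."""
    A = -0.5j * theta * n_dot_sigma(n)
    total = np.zeros((2, 2), dtype=complex)
    term = I2.copy()            # A^0 / 0!
    for k in range(terms):
        total += term
        term = term @ A / (k + 1)   # next term A^(k+1) / (k+1)!
    return total


def rotation_closed_form(theta, n):
    """cos(theta/2) sigma_0 - i sin(theta/2) n.sigma."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * n_dot_sigma(n)


theta, n = 1.3, (1.0, 2.0, 3.0)
print(np.allclose(rotation_series(theta, n), rotation_closed_form(theta, n)))
```

The same two functions evaluated at \theta = 2 \pi give minus the identity, independent of the axis.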

So our representation of the rotation operator is

\begin{aligned}\begin{aligned}e^{-i \theta \hat{\mathbf{n}} \cdot \mathbf{S}/\hbar} &\rightarrow \cos(\theta/2) \sigma_0 - i \sin(\theta/2) (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma}) \\ &=\cos(\theta/2) \sigma_0 - i \sin(\theta/2) \left(n_x \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} + n_y \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} + n_z \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \right) \\ &=\begin{bmatrix}\cos(\theta/2) -i n_z \sin(\theta/2) & -i (n_x -i n_y) \sin(\theta/2) \\ -i (n_x + i n_y) \sin(\theta/2) & \cos(\theta/2) +i n_z \sin(\theta/2) \end{bmatrix}\end{aligned}\end{aligned} \hspace{\stretch{1}}(2.2)

Note that, in particular,

\begin{aligned}e^{-2 \pi i \hat{\mathbf{n}} \cdot \mathbf{S}/\hbar} \rightarrow \cos\pi \sigma_0 = -\sigma_0\end{aligned} \hspace{\stretch{1}}(2.3)

This “rotates” the ket, but introduces a phase factor.

We can do this in general for other values of spin, s = 1/2, 3/2, 5/2, \cdots.

Unfortunate interjection by me

I mentioned the half angle rotation operator that requires a half angle operator sandwich. Prof. Sipe thought I might be talking about a Heisenberg picture representation, where we have something like this in expectation values

\begin{aligned}{\left\lvert {\psi'} \right\rangle} = e^{-i \theta \hat{\mathbf{n}} \cdot \mathbf{J}/\hbar} {\left\lvert {\psi} \right\rangle}\end{aligned} \hspace{\stretch{1}}(2.4)

so that

\begin{aligned}{\left\langle {\psi'} \right\rvert}\mathcal{O}{\left\lvert {\psi'} \right\rangle} = {\left\langle {\psi} \right\rvert} e^{i \theta \hat{\mathbf{n}} \cdot \mathbf{J}/\hbar} \mathcal{O}e^{-i \theta \hat{\mathbf{n}} \cdot \mathbf{J}/\hbar} {\left\lvert {\psi} \right\rangle}\end{aligned} \hspace{\stretch{1}}(2.5)

However, what I was referring to, was that a general rotation of a vector in a Pauli matrix basis

\begin{aligned}R(\sum a_k \sigma_k) = R( \mathbf{a} \cdot \boldsymbol{\sigma})\end{aligned} \hspace{\stretch{1}}(2.6)

can be expressed by sandwiching the Pauli vector representation by two half angle rotation operators like our spin 1/2 operators from class today

\begin{aligned}R( \mathbf{a} \cdot \boldsymbol{\sigma}) = e^{-\theta \hat{\mathbf{u}} \cdot \boldsymbol{\sigma} \hat{\mathbf{v}} \cdot \boldsymbol{\sigma}/2} \mathbf{a} \cdot \boldsymbol{\sigma} e^{\theta \hat{\mathbf{u}} \cdot \boldsymbol{\sigma} \hat{\mathbf{v}} \cdot \boldsymbol{\sigma}/2}\end{aligned} \hspace{\stretch{1}}(2.7)

where \hat{\mathbf{u}} and \hat{\mathbf{v}} are two orthogonal unit vectors that define the oriented plane that we are rotating in.

For example, rotating in the x-y plane, with \hat{\mathbf{u}} = \hat{\mathbf{x}} and \hat{\mathbf{v}} = \hat{\mathbf{y}}, we have

\begin{aligned}R( \mathbf{a} \cdot \boldsymbol{\sigma}) = e^{-\theta \sigma_1 \sigma_2/2} (a_1 \sigma_1 + a_2 \sigma_2 + a_3 \sigma_3) e^{\theta \sigma_1 \sigma_2/2} \end{aligned} \hspace{\stretch{1}}(2.8)

Observe that these exponentials commute with \sigma_3, leaving

\begin{aligned}R( \mathbf{a} \cdot \boldsymbol{\sigma}) &= (a_1 \sigma_1 + a_2 \sigma_2) e^{\theta \sigma_1 \sigma_2} +  a_3 \sigma_3 \\ &= (a_1 \sigma_1 + a_2 \sigma_2) (\cos\theta + \sigma_1 \sigma_2 \sin\theta)+a_3 \sigma_3 \\ &= \sigma_1 (a_1 \cos\theta - a_2 \sin\theta)+ \sigma_2 (a_2 \cos\theta + a_1 \sin\theta)+ \sigma_3 (a_3)\end{aligned}

yielding our usual coordinate rotation matrix. Expressed in terms of a unit normal to that plane, we form the normal by multiplication with the unit spatial volume element I = \sigma_1 \sigma_2 \sigma_3. For example:

\begin{aligned}\sigma_1 \sigma_2 \sigma_3( \sigma_3 )=\sigma_1 \sigma_2 \end{aligned} \hspace{\stretch{1}}(2.9)

and can in general write a spatial rotation in a Pauli basis representation as a sandwich of half angle rotation matrix exponentials

\begin{aligned}R( \mathbf{a} \cdot \boldsymbol{\sigma}) = e^{-I \theta (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})/2} (\mathbf{a} \cdot \boldsymbol{\sigma})e^{I \theta (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})/2} \end{aligned} \hspace{\stretch{1}}(2.10)

When \hat{\mathbf{n}} \cdot \mathbf{a} = 0 we get the complex-number like single sided rotation exponential (since \mathbf{a} \cdot \boldsymbol{\sigma} anticommutes with \hat{\mathbf{n}} \cdot \boldsymbol{\sigma} in that case, so that the left hand exponential can be moved to the right with its sign flipped)

\begin{aligned}R( \mathbf{a} \cdot \boldsymbol{\sigma}) = (\mathbf{a} \cdot \boldsymbol{\sigma} )e^{I \theta (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma})} \end{aligned} \hspace{\stretch{1}}(2.11)
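Here is a numerical sketch of the sandwich and single sided forms, assuming numpy and the matrix representation I = \sigma_1 \sigma_2 \sigma_3 = i \sigma_0, so that e^{I \alpha \hat{\mathbf{n}} \cdot \boldsymbol{\sigma}} = \cos\alpha + i \sin\alpha (\hat{\mathbf{n}} \cdot \boldsymbol{\sigma}) for a unit vector \hat{\mathbf{n}}. All function names are hypothetical, and the coordinates are read back off with a_k = \frac{1}{2} \text{tr}(\sigma_k M).

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
I2 = np.eye(2, dtype=complex)


def pauli_vector(a):
    """a.sigma = sum_k a_k sigma_k."""
    return np.einsum('k,kij->ij', np.asarray(a, dtype=complex), SIGMA)


def components(M):
    """Recover a_k from M = a.sigma via a_k = tr(sigma_k M) / 2."""
    return np.array([0.5 * np.trace(s @ M).real for s in SIGMA])


def exp_I_n_sigma(angle, n):
    """exp(I angle n.sigma) with I = i, valid for a unit vector n."""
    return np.cos(angle) * I2 + 1j * np.sin(angle) * pauli_vector(n)


def rotate_sandwich(a, theta, n):
    """Half angle sandwich: exp(-I theta n.sigma/2) (a.sigma) exp(I theta n.sigma/2)."""
    return exp_I_n_sigma(-theta / 2, n) @ pauli_vector(a) @ exp_I_n_sigma(theta / 2, n)


def rotate_single_sided(a, theta, n):
    """Single sided form (a.sigma) exp(I theta n.sigma), for a perpendicular to n."""
    return pauli_vector(a) @ exp_I_n_sigma(theta, n)


theta, z = 0.7, (0.0, 0.0, 1.0)
print(components(rotate_sandwich((1.0, 2.0, 3.0), theta, z)))
print(components(rotate_single_sided((1.0, 2.0, 0.0), theta, z)))
```

With the normal along z, the first print should reproduce the x-y plane coordinate rotation with the third coordinate untouched, and the second should agree with it in the first two coordinates.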

I believe it was pointed out in one of [1] or [2] that rotations expressed in terms of half angle Pauli matrices have caused some confusion to students of quantum mechanics, because this 2 \pi “rotation” only generates half of the full spatial rotation. It was argued that this sort of confusion can be avoided if one observes that these half angle rotation exponentials are exactly what we require for general spatial rotations, and that a pair of half angle operators is required to produce a full spatial rotation.

The book [1] takes this a lot further, and produces a formulation of spin operators that is devoid of the normal scalar imaginary i (using the Clifford algebra spatial unit volume element instead), and also does not assume a specific matrix representation of the spin operators. They argue that this leads to some subtleties associated with interpretation, but at the time I was attempting to read that text I did not know enough QM to appreciate what they were doing, and haven’t had time to attempt a new study of that content.

Spin dynamics

At least classically, the angular momentum of charged objects is associated with a magnetic moment, as illustrated in the figure.

[Figure: Magnetic moment due to steady state current.]

\begin{aligned}\boldsymbol{\mu} = I A \mathbf{e}_\perp\end{aligned} \hspace{\stretch{1}}(3.12)

In our scheme, following the (cgs?) text conventions of [3], where the \mathbf{E} and \mathbf{B} have the same units, we write

\begin{aligned}\boldsymbol{\mu} = \frac{I A}{c} \mathbf{e}_\perp\end{aligned} \hspace{\stretch{1}}(3.13)

For a charge moving in a circle (see figure) we have

[Figure: Charge moving in circle.]

\begin{aligned}\begin{aligned}I &= \frac{\text{charge}}{\text{time}} \\ &= \frac{\text{distance}}{\text{time}} \frac{\text{charge}}{\text{distance}} \\ &= \frac{q v}{ 2 \pi r}\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.14)

so the magnetic moment is

\begin{aligned}\begin{aligned}\mu &= \frac{q v}{ 2 \pi r} \frac{\pi r^2}{c}  \\ &= \frac{q }{ 2 m c } (m v r) \\ &= \gamma L\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.15)

Here \gamma is the gyromagnetic ratio.

Recall that we have a torque, as shown in the figure,

[Figure: Induced torque in the presence of a magnetic field.]

\begin{aligned}\mathbf{T} = \boldsymbol{\mu} \times \mathbf{B}\end{aligned} \hspace{\stretch{1}}(3.16)

tending to line up \boldsymbol{\mu} with \mathbf{B}. The energy is then

\begin{aligned}-\boldsymbol{\mu} \cdot \mathbf{B}\end{aligned} \hspace{\stretch{1}}(3.17)

Also recall that this torque leads to precession, as shown in the figure,

\begin{aligned}\frac{d{\mathbf{L}}}{dt} = \mathbf{T} = \gamma \mathbf{L} \times \mathbf{B},\end{aligned} \hspace{\stretch{1}}(3.18)

[Figure: Precession due to torque.]

with precession frequency

\begin{aligned}\boldsymbol{\omega} = - \gamma \mathbf{B}.\end{aligned} \hspace{\stretch{1}}(3.19)
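The sign of the precession frequency is easy to get wrong, so here is a rough numerical sketch, assuming numpy, that integrates d\mathbf{L}/dt = \gamma \mathbf{L} \times \mathbf{B} with a fixed step fourth order Runge-Kutta scheme and compares against a rigid rotation about \mathbf{B} with angular velocity -\gamma \mathbf{B}. The integrator, step count and test values are my own choices, in arbitrary units.

```python
import numpy as np


def precess(L0, B, gamma, t_final, steps=2000):
    """Integrate dL/dt = gamma L x B from t = 0 to t_final with fixed step RK4."""
    L = np.asarray(L0, dtype=float)
    B = np.asarray(B, dtype=float)

    def rate(L):
        return gamma * np.cross(L, B)

    dt = t_final / steps
    for _ in range(steps):
        k1 = rate(L)
        k2 = rate(L + 0.5 * dt * k1)
        k3 = rate(L + 0.5 * dt * k2)
        k4 = rate(L + dt * k3)
        L = L + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return L


def rotate_about_z(L0, angle):
    """Rotate L0 counterclockwise about the z axis by the given angle."""
    c, s = np.cos(angle), np.sin(angle)
    x, y, z = L0
    return np.array([c * x - s * y, s * x + c * y, z])


# Electron-like case: gamma < 0, field along z, so omega = -gamma B points along +z.
gamma, Bz, t = -1.0, 2.0, 1.0
L0 = (1.0, 0.0, 1.0)
print(precess(L0, (0.0, 0.0, Bz), gamma, t))
print(rotate_about_z(L0, -gamma * Bz * t))
```

The component of \mathbf{L} along \mathbf{B} and the magnitude of \mathbf{L} are both preserved, as expected for a torque perpendicular to both.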

For a current due to a moving electron

\begin{aligned}\gamma = -\frac{e}{2 m c} < 0\end{aligned} \hspace{\stretch{1}}(3.20)

where we write -e for the charge on the electron.

Question: steady state currents only? Yes, this is only true for steady state currents.

For the translational motion of an electron, even if it is not moving in a steady way, regardless of its dynamics

\begin{aligned}\boldsymbol{\mu}_0 = - \frac{e}{2 m c} \mathbf{L}\end{aligned} \hspace{\stretch{1}}(3.21)

Now, back to quantum mechanics, we turn \boldsymbol{\mu}_0 into a dipole moment operator and \mathbf{L} is “promoted” to an angular momentum operator.

\begin{aligned}H_{\text{int}} = - \boldsymbol{\mu}_0 \cdot \mathbf{B}\end{aligned} \hspace{\stretch{1}}(3.22)

What about the “spin”?


\begin{aligned}\boldsymbol{\mu}_s = \gamma_s \mathbf{S}\end{aligned} \hspace{\stretch{1}}(3.23)

we write this as

\begin{aligned}\boldsymbol{\mu}_s = g \left( -\frac{e}{ 2 m c} \right)\mathbf{S}\end{aligned} \hspace{\stretch{1}}(3.24)

so that

\begin{aligned}\gamma_s = - \frac{g e}{ 2 m c} \end{aligned} \hspace{\stretch{1}}(3.25)

Experimentally, one finds to very good approximation

\begin{aligned}g = 2\end{aligned} \hspace{\stretch{1}}(3.26)

There was a lot of trouble with this in early quantum mechanics where people got things wrong, and canceled the wrong factors of 2.

In fact, Dirac’s relativistic theory for the electron predicts g=2.

When this is measured experimentally, one does not get exactly g=2, and a theory that also incorporates photon creation and destruction, and the interaction of the electron with such (virtual) photons, is required. We get

\begin{aligned}\begin{aligned}g_{\text{theory}} &= 2 \left(1.001159652140 (\pm 28)\right) \\ g_{\text{experimental}} &= 2 \left(1.0011596521884 (\pm 43)\right)\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.27)

Richard Feynman compared the precision of quantum mechanics, referring to this measurement, “to predicting a distance as great as the width of North America to an accuracy of one human hair’s breadth”.


[1] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[2] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.

[3] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

Posted in Math and Physics Learning.

Gauge transformation of the Dirac equation.

Posted by peeterjoot on August 21, 2011



In [1] the gauge transformation of the Dirac equation is covered, producing the non-relativistic equation with the correct spin interaction. There are unfortunately some sign errors, some of which self correct, and some of which don’t impact the end result, but are slightly confusing. There are also some omitted details. I’ll attempt to work through the same calculation with all the signs in the right places and also fill in some of the details I found myself wanting.

A step back. On the gauge transformation.

The gauge transformations utilized are given as

\begin{aligned}\mathcal{E} &\rightarrow \mathcal{E} - e \phi \\ \mathbf{p} &\rightarrow \mathbf{p} - e \mathbf{A}.\end{aligned} \hspace{\stretch{1}}(2.1)

Let’s start off by reminding ourselves where these come from. As outlined in section 12.9 in [2] (with some details pondered in [3]), our relativistic Lagrangian is

\begin{aligned}\mathcal{L} = -m c^2 \sqrt{ 1 - \frac{\mathbf{u}^2}{c^2}} + \frac{e}{c} \mathbf{u} \cdot \mathbf{A} - e \phi.\end{aligned} \hspace{\stretch{1}}(2.3)

The conjugate momentum is

\begin{aligned}\mathbf{P} = \mathbf{e}^i \frac{\partial {\mathcal{L}}}{\partial {u^i}} = \frac{m \mathbf{u}}{\sqrt{1 - \mathbf{u}^2/c^2}} + \frac{e}{c} \mathbf{A},\end{aligned} \hspace{\stretch{1}}(2.4)


or, writing \mathbf{p} = m \mathbf{u}/\sqrt{1 - \mathbf{u}^2/c^2} for the mechanical momentum,

\begin{aligned}\mathbf{P} = \mathbf{p} + \frac{e}{c} \mathbf{A}.\end{aligned} \hspace{\stretch{1}}(2.5)

The Hamiltonian, which must be expressed in terms of this conjugate momentum \mathbf{P}, is found to be

\begin{aligned}\mathcal{E} = \sqrt{ (c \mathbf{P} - e \mathbf{A})^2 + m^2 c^4 } + e \phi.\end{aligned} \hspace{\stretch{1}}(2.6)

With the free particle Lagrangian

\begin{aligned}\mathcal{L} = -m c^2 \sqrt{ 1 - \frac{\mathbf{u}^2}{c^2}} ,\end{aligned} \hspace{\stretch{1}}(2.7)

our conjugate momentum is

\begin{aligned}\mathbf{P} = \frac{m \mathbf{u}}{\sqrt{ 1 - \mathbf{u}^2/c^2} }.\end{aligned} \hspace{\stretch{1}}(2.8)

For this we find that our Hamiltonian \mathcal{E} = \mathbf{P} \cdot \mathbf{u} - \mathcal{L} is

\begin{aligned}\mathcal{E} = \frac{m c^2}{\sqrt{1 - \mathbf{u}^2/c^2}},\end{aligned} \hspace{\stretch{1}}(2.9)

but this has to be expressed in terms of \mathbf{P}. Having found the form of the Hamiltonian for the interaction case, it is easily verified that 2.6 contains the required form once the interaction fields (\phi, \mathbf{A}) are zeroed

\begin{aligned}\mathcal{E} = \sqrt{ (c \mathbf{P})^2 + m^2 c^4 }.\end{aligned} \hspace{\stretch{1}}(2.10)
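The two free particle energy expressions are easy to compare numerically. This is a sketch in one dimension with made up units (m = c = 1 in the usage), evaluating \mathcal{E} = \mathbf{P} \cdot \mathbf{u} - \mathcal{L} and \sqrt{(c \mathbf{P})^2 + m^2 c^4} side by side. The function name is hypothetical.

```python
import math


def free_particle_energy(m, u, c):
    """Return (E from P u - L, E from sqrt((cP)^2 + m^2 c^4)) for speed |u| < c."""
    root = math.sqrt(1 - u * u / (c * c))
    lagrangian = -m * c * c * root          # free particle Lagrangian
    P = m * u / root                        # conjugate momentum
    from_legendre = P * u - lagrangian      # Hamiltonian as a function of u
    from_momentum = math.sqrt((c * P) ** 2 + (m * c * c) ** 2)
    return from_legendre, from_momentum


for u in (0.1, 0.6, 0.99):
    print(u, free_particle_energy(1.0, u, 1.0))
```

Both entries of each pair should agree with m c^2/\sqrt{1 - \mathbf{u}^2/c^2} up to rounding.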

Considering the interaction case, Jackson points out that the energy and momentum terms can be combined as a four momentum

\begin{aligned}p^a = \left( \frac{1}{{c}}(\mathcal{E} - e \phi), \mathbf{P} - \frac{e}{c}\mathbf{A} \right),\end{aligned} \hspace{\stretch{1}}(2.11)

so that the re-arranged and squared Hamiltonian takes the form

\begin{aligned}p^a p_a = (m c)^2.\end{aligned} \hspace{\stretch{1}}(2.12)

From this we see that for the Lorentz force, the interaction can be found, starting with the free particle Hamiltonian 2.10, making the transformation

\begin{aligned}\mathcal{E}   &\rightarrow \mathcal{E} - e\phi \\ \mathbf{P} &\rightarrow \mathbf{P} - \frac{e}{c}\mathbf{A},\end{aligned} \hspace{\stretch{1}}(2.13)

or in covariant form

\begin{aligned}p^\mu \rightarrow p^\mu - \frac{e}{c}A^\mu.\end{aligned} \hspace{\stretch{1}}(2.15)

On the gauge transformation of the Dirac equation.

The task at hand now is to make the transformations of 2.13, applied to the Dirac equation

\begin{aligned}{p} = \gamma_\mu p^\mu = m c.\end{aligned} \hspace{\stretch{1}}(3.16)

The first observation to make is that we appear to have different units in the Desai text. Let’s continue using the units from Jackson, and translate them later if inclined.

Left multiplication of 3.16 by \gamma_0 gives us

\begin{aligned}0 &= \gamma_0 ({p} - m c) \\   &= \gamma_0 \gamma_\mu \left( p^\mu - \frac{e}{c} A^\mu \right)- \gamma_0 m c\\   &=\gamma_0 \gamma_0 \left(\frac{\mathcal{E}}{c} - \frac{e}{c} \phi \right)+\gamma_0 \gamma_a \left(p^a - \frac{e}{c} A^a \right)- \gamma_0 m c \\   &=\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \\ \end{aligned}

With the minor notational freedom of using \gamma_0 instead of \gamma_4, this is our starting point in the Desai text, and we can now left multiply by

\begin{aligned}({p} + m c) \gamma_0 =\frac{1}{{c}} \left( \mathcal{E} - e \phi \right)+\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)+ \gamma_0 m c.\end{aligned} \hspace{\stretch{1}}(3.17)

The motivation for this appears to be that this product of conjugate like quantities

\begin{aligned}\begin{aligned}0 &= ({p} + m c) \gamma_0 \gamma_0 ({p} - m c)  \\ &=({p} + m c) ({p} - m c) \\ &= \frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2 -\left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2 - (m c)^2 + \cdots,\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.18)

produces the Klein-Gordon equation, plus some cross terms to be determined. Those cross terms are the important bits since they contain the spin interaction, even in the non-relativistic limit.

Let’s do the expansion.

\begin{aligned}0&= ({p} + m c) \gamma_0 \gamma_0 ({p} - m c) u \\ &=\left(\frac{1}{{c}} \left( \mathcal{E} - e \phi \right)+\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)+ \gamma_0 m c\right)\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &=\frac{1}{{c}} \left( \mathcal{E} - e \phi \right)\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &\qquad +\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &\qquad + \gamma_0 m c\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left( \boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right) \right)^2- (mc)^2\right) u\\ &\qquad + \frac{1}{{c}} \left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u- m c\left\{{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{ \gamma_0}\right\} u \\ &\qquad + {\gamma_0 m\left(\mathcal{E} - e \phi\right) u}- {\gamma_0 m\left(\mathcal{E} - e \phi\right) u}\\ \end{aligned}

Since \gamma_0 anticommutes with any \boldsymbol{\alpha} \cdot \mathbf{x}, even when \mathbf{x} contains operators, the anticommutator term is killed.
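A quick matrix check of that anticommutation, as a sketch assuming numpy and the standard Dirac representation, \gamma_0 = \text{diag}(1, 1, -1, -1) with \boldsymbol{\alpha} built from off diagonal Pauli blocks. The representation and the names GAMMA0 and ALPHA are my assumptions here. Since the matrix entries commute with the momentum and potential operators, checking the three \alpha_k is enough to cover any \boldsymbol{\alpha} \cdot \mathbf{x}.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Dirac representation (assumed): gamma_0 = diag(1, 1, -1, -1),
# alpha_k = [[0, sigma_k], [sigma_k, 0]].
GAMMA0 = np.block([[I2, Z2], [Z2, -I2]])
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]


def anticommutator(A, B):
    """{A, B} = A B + B A."""
    return A @ B + B @ A


for k, alpha in enumerate(ALPHA, start=1):
    print(k, np.allclose(anticommutator(GAMMA0, alpha), 0))
```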

While done in the text, let’s also do the \boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right) square for completeness. Because this is an operator, we need to treat this as

\begin{aligned}\left( \boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right) \right)^2 u&=\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)\boldsymbol{\alpha} \cdot \left(\mathbf{P} u - \frac{e}{c} \mathbf{A} u \right),\end{aligned}

so want to treat the two vectors as independent, say (\boldsymbol{\alpha} \cdot \mathbf{a})(\boldsymbol{\alpha} \cdot \mathbf{b}). That is

\begin{aligned}(\boldsymbol{\alpha} \cdot \mathbf{a})(\boldsymbol{\alpha} \cdot \mathbf{b})&=\begin{bmatrix}0 & \boldsymbol{\sigma} \cdot \mathbf{a} \\ \boldsymbol{\sigma} \cdot \mathbf{a} & 0\end{bmatrix}\begin{bmatrix}0 & \boldsymbol{\sigma} \cdot \mathbf{b} \\ \boldsymbol{\sigma} \cdot \mathbf{b} & 0\end{bmatrix} \\ &=\begin{bmatrix}(\boldsymbol{\sigma} \cdot \mathbf{a}) (\boldsymbol{\sigma} \cdot \mathbf{b})  & 0 \\ 0 & (\boldsymbol{\sigma} \cdot \mathbf{a}) (\boldsymbol{\sigma} \cdot \mathbf{b})\end{bmatrix}\end{aligned}

The diagonal elements can be expanded by coordinates

\begin{aligned}(\boldsymbol{\sigma} \cdot \mathbf{a}) (\boldsymbol{\sigma} \cdot \mathbf{b})&=\sum_{m,n} \sigma^m a^m \sigma^n b^n \\ &=\sum_m a^m b^m+\sum_{m\ne n} \sigma^m \sigma^n a^m b^n \\ &=\mathbf{a} \cdot \mathbf{b}+i \sum_{m\ne n} \sigma^o \epsilon^{m n o} a^m b^n \\ &=\mathbf{a} \cdot \mathbf{b}+i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b}),\end{aligned}


\begin{aligned}(\boldsymbol{\alpha} \cdot \mathbf{a})(\boldsymbol{\alpha} \cdot \mathbf{b})=\begin{bmatrix}\mathbf{a} \cdot \mathbf{b} + i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b}) & 0 \\ 0 & \mathbf{a} \cdot \mathbf{b} + i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b})\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.19)
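For ordinary commuting vectors this product identity can be spot checked numerically. The sketch below assumes numpy, uses arbitrary test vectors of my own choosing, and builds \boldsymbol{\alpha} \cdot \mathbf{a} from Kronecker products (hypothetical helper names). It says nothing about the operator ordering subtleties that matter once \mathbf{a} and \mathbf{b} contain \mathbf{P}.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def sigma_dot(a):
    """sigma.a for a real 3-vector a."""
    return a[0] * SX + a[1] * SY + a[2] * SZ


def alpha_dot(a):
    """alpha.a = [[0, sigma.a], [sigma.a, 0]] as a 4x4 matrix."""
    return np.kron(SX, sigma_dot(a))


def product_identity(a, b):
    """Right hand side: (a.b) 1 + i sigma.(a x b)."""
    return np.dot(a, b) * I2 + 1j * sigma_dot(np.cross(a, b))


a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 4.0, -1.0])
lhs = sigma_dot(a) @ sigma_dot(b)
print(np.allclose(lhs, product_identity(a, b)))
# The 4x4 product is block diagonal with the same 2x2 block repeated.
print(np.allclose(alpha_dot(a) @ alpha_dot(b), np.kron(I2, lhs)))
```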

Plugging this back in, we now have an extra term in the expansion

\begin{aligned}0&=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- (mc)^2\right) u\\ &\qquad + \frac{1}{{c}} \left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u\\ &\qquad- i \boldsymbol{\sigma}' \cdot\left(\left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right) \times \left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right)\right) u\end{aligned}

Here \boldsymbol{\sigma}' was defined as the direct product of the two by two identity with the abstract matrix \boldsymbol{\sigma} as follows

\begin{aligned}\boldsymbol{\sigma}' =\begin{bmatrix}\boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma}\end{bmatrix}= I \otimes \boldsymbol{\sigma}\end{aligned} \hspace{\stretch{1}}(3.20)

Like the \mathbf{L} \times \mathbf{L} angular momentum operator cross product, this one isn’t zero. Expanding it yields

\begin{aligned}\left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right) \times \left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right) u&=\mathbf{P} \times \mathbf{P} u+ \frac{e^2}{c^2} \mathbf{A} \times \mathbf{A} u- \frac{e}{c} \left( \mathbf{A} \times \mathbf{P} + \mathbf{P} \times \mathbf{A} \right) u \\ &=- \frac{e}{c} \left( \mathbf{A} \times (\mathbf{P} u) + (\mathbf{P} u) \times \mathbf{A} + u (\mathbf{P} \times \mathbf{A}) \right) \\ &=- \frac{e}{c} (-i \hbar \boldsymbol{\nabla} \times \mathbf{A}) u \\ &=\frac{i e \hbar}{c} \mathbf{H} u\end{aligned}

Plugging in again we are getting closer, and now have the magnetic field cross term

\begin{aligned}0&=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- (mc)^2\right) u\\ &\qquad + \frac{1}{{c}}\left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u\\ &\qquad+ \frac{e \hbar}{c} \boldsymbol{\sigma}' \cdot \mathbf{H} u.\end{aligned}

All that remains is evaluation of the commutator term, which should yield the electric field interaction. That commutator is

\begin{aligned}\left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u&={\boldsymbol{\alpha} \cdot \mathbf{P} \mathcal{E} u}- e \boldsymbol{\alpha} \cdot \mathbf{P} \phi u- \frac{e}{c} \boldsymbol{\alpha} \cdot \mathbf{A} \mathcal{E} u+ {\frac{e^2}{c} \boldsymbol{\alpha} \cdot \mathbf{A} \phi u} \\ &- {\mathcal{E} \boldsymbol{\alpha} \cdot \mathbf{P} u}+ e \phi \boldsymbol{\alpha} \cdot \mathbf{P} u+ \frac{e}{c} \mathcal{E} \boldsymbol{\alpha} \cdot \mathbf{A} u- {\frac{e^2}{c} \phi \boldsymbol{\alpha} \cdot \mathbf{A} u} \\ &=\boldsymbol{\alpha} \cdot \left( - e (\mathbf{P} \phi)+ \frac{e}{c} (\mathcal{E} \mathbf{A}) \right) u \\ &=e i \hbar \boldsymbol{\alpha} \cdot \left( \boldsymbol{\nabla} \phi+ \frac{1}{c} \frac{\partial {\mathbf{A}}}{\partial {t}} \right) u \\ &=- e i \hbar \boldsymbol{\alpha} \cdot \mathbf{E} u\end{aligned}

That was the last bit required to fully expand the space time split of our squared momentum equations. We have

\begin{aligned}0=({p} + mc)({p} - mc) u=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- (mc)^2- \frac{i e \hbar}{c} \boldsymbol{\alpha} \cdot \mathbf{E}+ \frac{e \hbar}{c} \boldsymbol{\sigma}' \cdot \mathbf{H}\right) u\end{aligned} \hspace{\stretch{1}}(3.21)

This is the end result of the reduction of the spacetime split gauge transformed Dirac equation. The next step is to obtain the non-relativistic Hamiltonian operator equation (linear in the time derivative operator and quadratic in spatial partials) that has both the electric field and magnetic field terms that we desire to accurately describe spin (actually we need only the magnetic interaction term for non-relativistic spin, but we’ll see that soon).

To obtain the first order time derivatives we can consider an approximation to the (\mathcal{E} - e \phi)^2 terms. We can get that by considering the difference of squares factorization

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi - m c^2) ( \mathcal{E} - e \phi + m c^2) u&=\frac{1}{{c^2}} \left(( \mathcal{E} - e \phi )^2 u - (m c^2)^2 u- {m c^2 \mathcal{E} u}+ {\mathcal{E} m c^2 u} \right) \\ &=\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u\end{aligned}

In the text, this is factored, instead of the factorization verified. I wanted to be careful to ensure that the operators did not have any effect. They don’t, which is clear in retrospect since the \mathcal{E} operator and the scalar mc necessarily commute. With this factorization, some relativistic approximations are possible. Considering the free particle energy, we can separate out the rest energy from the kinetic (which is perversely designated with subscript T for some reason in the text (and others))

\begin{aligned}\mathcal{E}&= \gamma m c^2  \\ &= m c^2 \left( 1 + \frac{1}{{2}} \left(\frac{\mathbf{v}}{c}\right)^2 + \cdots \right) \\ &= m c^2 + \frac{1}{{2}} m \mathbf{v}^2 + \cdots \\ &\equiv m c^2 + \mathcal{E}_{T}\end{aligned}

With this definition, the energy minus mass term in terms of kinetic energy (that we also had in the Klein-Gordon equation) takes the form

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u=\frac{1}{{c^2}} ( \mathcal{E}_{T} - e \phi ) ( \mathcal{E} - e \phi + m c^2) u\end{aligned} \hspace{\stretch{1}}(3.22)

In the second factor, to get a non-relativistic approximation of \mathcal{E} - e \phi, the text states without motivation that e \phi will be considered small compared to m c^2. We can make some sense of this by considering the classical Hamiltonian for a particle in a field

\begin{aligned}\mathcal{E}&= \sqrt{ c^2 \left(\mathbf{P} - \frac{e}{c} \mathbf{A}\right)^2 + (m c^2)^2 } + e \phi \\ &= \sqrt{ c^2 (\gamma m \mathbf{v})^2 + (m c^2)^2 } + e \phi \\ &= m c \sqrt{ (\gamma \mathbf{v})^2 + c^2 } + e \phi \\ &= m c \sqrt{ \frac{ \mathbf{v}^2 + c^2 ( 1 - \mathbf{v}^2/c^2) } { 1 - \mathbf{v}^2/c^2 } } + e \phi \\ &= \gamma m c^2 + e \phi \\ &= m c^2 \left( 1 + \frac{1}{{2}} \frac{\mathbf{v}^2}{c^2} + \cdots \right) + e \phi.\end{aligned}

We find that, in the non-relativistic limit, we have

\begin{aligned}\mathcal{E} - e \phi = m c^2 + \frac{1}{{2}} m \mathbf{v}^2 + \cdots \approx m c^2,\end{aligned} \hspace{\stretch{1}}(3.23)

and obtain the first order approximation of our time derivative operator

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u\approx\frac{1}{{c^2}} ( \mathcal{E}_{T} - e \phi ) 2 m c^2 u,\end{aligned} \hspace{\stretch{1}}(3.24)


or

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u\approx2 m ( \mathcal{E}_{T} - e \phi ) u.\end{aligned} \hspace{\stretch{1}}(3.25)
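To get a feel for how good this approximation is, here is a classical sketch that treats all the quantities as plain numbers (no operators, no u). It assumes the particle-in-a-field energy \mathcal{E} = \gamma m c^2 + e \phi throughout, which is my choice and sidesteps the mixing of free and interacting Hamiltonians. The function names are hypothetical and the units in the usage are m = c = 1.

```python
import math


def compare_energy_terms(m, v, c, e_phi=0.0):
    """Return (exact, approx) for (E - e phi)^2 / c^2 - (m c)^2 vs 2 m (E_T - e phi)."""
    gamma = 1 / math.sqrt(1 - (v / c) ** 2)
    E = gamma * m * c ** 2 + e_phi      # classical energy in a potential (assumed form)
    E_T = E - m * c ** 2                # energy with the rest energy removed
    exact = (E - e_phi) ** 2 / c ** 2 - (m * c) ** 2
    approx = 2 * m * (E_T - e_phi)
    return exact, approx


def relative_error(m, v, c, e_phi=0.0):
    """Fractional difference between the exact and approximate expressions."""
    exact, approx = compare_energy_terms(m, v, c, e_phi)
    return abs(exact - approx) / exact


for v in (0.001, 0.01, 0.1, 0.5):
    print(v, relative_error(1.0, v, 1.0))
```

With these assumptions the ratio of exact to approximate is (\gamma + 1)/2, so the relative error goes like \mathbf{v}^2/(4 c^2) at low speeds.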

It seems slightly underhanded to use the free particle Hamiltonian in one part of the approximation, and the Hamiltonian for a particle in a field for the other part. This is probably why the text just mandates that e\phi be small compared to m c^2.

To summarize once more before the final reduction (where we eliminate the electric field component of the operator equation), we have

\begin{aligned}0=({p} + mc)({p} - mc) u\approx\left(2 m ( \mathcal{E}_{T} - e \phi )- \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- \frac{i e \hbar}{c} \boldsymbol{\alpha} \cdot \mathbf{E}+ \frac{e \hbar}{c} \boldsymbol{\sigma}' \cdot \mathbf{H}\right) u.\end{aligned} \hspace{\stretch{1}}(3.26)

Except for the electric field term, this is the result that is derived in the text. It was argued that this term is not significant compared to e \phi when the particle velocity is restricted to the non-relativistic domain. This is done by computing the expectation of this term relative to e \phi. Consider

\begin{aligned}{\left\lvert{ \left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e \phi } }}\right\rangle }\right\rvert}\end{aligned} \hspace{\stretch{1}}(3.27)

With the velocities low enough so that the time variation of the vector potential does not contribute to the electric field (i.e. the electrostatic case), we have

\begin{aligned}\mathbf{E} = - \boldsymbol{\nabla} \phi = - \hat{\mathbf{r}} \frac{\partial {\phi}}{\partial {r}}.\end{aligned} \hspace{\stretch{1}}(3.28)

The variation in length a that is considered is labeled the characteristic length

\begin{aligned}p a \sim \hbar,\end{aligned} \hspace{\stretch{1}}(3.29)

so that with p = m v we have

\begin{aligned}a \sim \frac{\hbar}{m v}.\end{aligned} \hspace{\stretch{1}}(3.30)

This characteristic length is not elaborated on, but one can observe the similarity to the Compton wavelength

\begin{aligned}L_{\text{Compton}} = \frac{\hbar}{m c},\end{aligned} \hspace{\stretch{1}}(3.31)

the length scale for which quantum field theory must be considered. The characteristic length is larger than this by a factor of c/v for velocities smaller than the speed of light. For example, the Fermi velocity of electrons in copper is \sim 10^{6} \frac{\text{m}}{\text{s}}, which fixes our length scale to a few hundred times the Compton length (\sim 10^{-12} \text{m}). This is still a very small length, but is in the QM domain instead of QED. With such a length scale consider the magnitude of the change in potential produced by the electric field across that length

\begin{aligned}{\left\lvert{\phi}\right\rvert} = {\left\lvert{\mathbf{E}}\right\rvert} \Delta x = {\left\lvert{\mathbf{E}}\right\rvert} a,\end{aligned} \hspace{\stretch{1}}(3.32)

so that

\begin{aligned}\left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e {\left\lvert{\phi}\right\rvert} } }}\right\rangle&=\left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e a {\left\lvert{\mathbf{E}}\right\rvert} } }}\right\rangle \\ &=\left\langle{{ \frac{e \hbar}{m} \frac{1}{ 2 c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e \frac{\hbar }{ m v } {\left\lvert{\mathbf{E}}\right\rvert} } }}\right\rangle \\ &=\frac{1}{{2}} \frac{v}{c} \left\langle{{ \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{ {\left\lvert{\mathbf{E}}\right\rvert} } }}\right\rangle.\end{aligned}
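The length scales and the v/(2c) prefactor above are easy to check numerically. What follows is only a rough sketch: the constant values are standard SI figures that I have typed in by hand, the speed of 10^6 m/s is an assumed order of magnitude for the electron, and the function names are made up for illustration.

```python
# Order-of-magnitude check of the characteristic length a ~ hbar/(m v)
# against the (reduced) Compton length hbar/(m c).
HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg, electron mass
C = 2.99792458e8         # m/s


def characteristic_length(v):
    """a ~ hbar / (m v), from p a ~ hbar with p = m v."""
    return HBAR / (M_E * v)


def compton_length():
    """Reduced Compton length hbar / (m c)."""
    return HBAR / (M_E * C)


v = 1.0e6                        # m/s, assumed electron speed
a = characteristic_length(v)
ratio = a / compton_length()     # this is just c / v
suppression = 0.5 * v / C        # the v/(2c) prefactor from the last line above

print(f"a = {a:.3e} m, a / L_Compton = {ratio:.1f}, v/(2c) = {suppression:.2e}")
```

Since the ratio of the two lengths is just c/v, the characteristic length sits a few hundred Compton lengths out for this assumed speed, and the prefactor multiplying the \boldsymbol{\alpha} expectation is correspondingly small.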

Thus the magnitude of this (vector) expectation is dominated by the expectation of just the \boldsymbol{\alpha}. That has been calculated earlier when Dirac currents were considered, where it was found that

\begin{aligned}\left\langle{{\alpha_i}}\right\rangle = \psi^\dagger \alpha_i \psi = (\mathbf{j})_i.\end{aligned} \hspace{\stretch{1}}(3.33)

Also recall (33.73) that this current was related to momentum by

\begin{aligned}\mathbf{j} = \frac{\mathbf{p}}{m c} = \frac{\mathbf{v}}{c}\end{aligned} \hspace{\stretch{1}}(3.34)

which allows for a final approximation of the magnitude of the electric field term’s expectation value relative to the e\phi term of the Hamiltonian operator. Namely

\begin{aligned}{\left\lvert{ \left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e \phi } }}\right\rangle }\right\rvert}\sim\frac{\mathbf{v}^2}{c^2}.\end{aligned} \hspace{\stretch{1}}(3.35)

With that last approximation made, the gauge transformed Dirac equation, after non-relativistic approximation of the energy and electric field terms, is left as

\begin{aligned}i \hbar \frac{\partial {}}{\partial {t}}=\frac{1}{{2m}} \left(i \hbar \boldsymbol{\nabla} + \frac{e}{c} \mathbf{A} \right)^2- \frac{e \hbar}{2 m c} \boldsymbol{\sigma}' \cdot \mathbf{H}+ e \phi.\end{aligned} \hspace{\stretch{1}}(3.36)

This is still a four dimensional equation, and it is stated in the text that only the large component is relevant (reducing the degrees of spin freedom to two). That argument makes a bit more sense with the matrix form of the gauge reduction which follows in the next section, so understanding that well seems worthwhile, and is the next thing to digest.


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

[2] JD Jackson. Classical Electrodynamics. John Wiley and Sons, 2nd edition, 1975.

[3] Peeter Joot. Misc Physics and Math Play, chapter Hamiltonian notes.

Posted in Math and Physics Learning.

On tensor product generators of the gamma matrices.

Posted by peeterjoot on June 20, 2011



In [1], Zee writes

\begin{aligned}\gamma^0 &=\begin{bmatrix}I & 0 \\ 0 & -I\end{bmatrix}=I \otimes \tau_3 \\ \gamma^i &=\begin{bmatrix}0 & \sigma^i \\ -\sigma^i & 0\end{bmatrix}=\sigma^i \otimes i \tau_2 \\ \gamma^5 &=\begin{bmatrix}0 & I \\ I & 0\end{bmatrix}=I \otimes \tau_1\end{aligned}

The Pauli matrices \sigma^i I had seen, but not the \tau_i matrices, nor the \otimes notation. Strangerep in physicsforums points out that the \otimes is a Kronecker matrix product, a special kind of tensor product [2]. Let’s do the exercise of reverse engineering the \tau matrices as suggested.


Let’s start with \gamma^5. We want

\begin{aligned}\gamma^5 = I \otimes \tau_1 =\begin{bmatrix}I \tau_{11} & I \tau_{12} \\ I \tau_{21} & I \tau_{22} \\ \end{bmatrix}= \begin{bmatrix}0 & I \\ I & 0\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(2.1)

By inspection we must have

\begin{aligned}\tau_1 = \begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}= \sigma^1\end{aligned} \hspace{\stretch{1}}(2.2)

This is just the first Pauli matrix. How about \tau_2? For that matrix we have (with \tau_{ab} now standing for the elements of i \tau_2)

\begin{aligned}\gamma^i = \sigma^i \otimes i \tau_2 =\begin{bmatrix}\sigma^i \tau_{11} & \sigma^i \tau_{12} \\ \sigma^i \tau_{21} & \sigma^i \tau_{22} \\ \end{bmatrix}= \begin{bmatrix}0 & \sigma^i \\ -\sigma^i & 0\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(2.3)

Again by inspection we must have

\begin{aligned}i \tau_2 = \begin{bmatrix}0 & 1 \\ -1 & 0\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.4)

so that

\begin{aligned}\tau_2 = \begin{bmatrix}0 & -i \\ i & 0\end{bmatrix}= \sigma^2.\end{aligned} \hspace{\stretch{1}}(2.5)

This one is also just the Pauli matrix. For the last we have

\begin{aligned}\gamma^0 = I \otimes \tau_3 =\begin{bmatrix}I \tau_{11} & I \tau_{12} \\ I \tau_{21} & I \tau_{22} \\ \end{bmatrix}= \begin{bmatrix}I & 0 \\ 0 & -I\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.6)

Our last tau matrix is thus

\begin{aligned}\tau_3 = \begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix}= \sigma^3.\end{aligned} \hspace{\stretch{1}}(2.7)

Curious that there are two notations used in the same page for exactly the same thing? It appears that I wasn’t the only person confused about this.

The bivector expansion

Zee writes his wedge products with the commutator, adding a complex factor

\begin{aligned}\sigma^{\mu\nu} = \frac{i}{2} \left[{\gamma^\mu},{\gamma^\nu}\right]\end{aligned} \hspace{\stretch{1}}(3.8)

Let’s try the direct product notation to expand \sigma^{0 i} and \sigma^{ij}. That first is

\begin{aligned}\sigma^{0 i} &= \frac{i}{2} \left( \gamma^0 \gamma^i - \gamma^i \gamma^0 \right) \\ &= i \gamma^0 \gamma^i \\ &= i (I \otimes \tau_3)(\sigma^i \otimes i \tau_2) \\ &= i^2 \sigma^i \otimes \tau_3\tau_2 \\ &= - \sigma^i \otimes (-i \tau_1) \\ &= i \sigma^i \otimes \tau_1 \\ &= i \begin{bmatrix}0 & \sigma^i \\ \sigma^i & 0\end{bmatrix},\end{aligned}

which is what was expected. The second bivector, for i=j is zero, and for i\ne j is

\begin{aligned}\sigma^{i j} &= i \gamma^i \gamma^j \\ &= i (\sigma^i \otimes i \tau_2) (\sigma^j \otimes i \tau_2) \\ &= i^3 (\sigma^i \sigma^j) \otimes I \\ &= i^4 (\epsilon_{ijk} \sigma^k) \otimes I \\ &= \epsilon_{ijk} \begin{bmatrix}\sigma^k & 0 \\ 0 & \sigma^k\end{bmatrix}.\end{aligned}
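The block structure and both bivector expansions can be checked numerically. Here is a small NumPy sketch, not anything from Zee; the helper names `otimes`, `block` and `sigma_munu` are my own. The one assumption worth flagging is the ordering convention: the blocks written out in this post are a \otimes b = [a b_{11}, a b_{12}; a b_{21}, a b_{22}], which in NumPy's convention is `np.kron(b, a)`. With i \tau_2 as in (2.4), the lower-left block of \gamma^i comes out as -\sigma^i.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
tau = sigma  # the result of the exercise above: tau_k = sigma^k


def otimes(a, b):
    """Block convention of this post: blocks are a * b[r, c].

    NumPy's kron puts the first argument's elements outermost,
    so the arguments are swapped here.
    """
    return np.kron(b, a)


def block(m11, m12, m21, m22):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return np.block([[m11, m12], [m21, m22]])


gamma0 = otimes(I2, tau[2])
gamma5 = otimes(I2, tau[0])
gammas = [otimes(s, 1j * tau[1]) for s in sigma]


def sigma_munu(gm, gn):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    return 0.5j * (gm @ gn - gn @ gm)


def all_checks_pass():
    ok = np.allclose(gamma0, block(I2, Z2, Z2, -I2))
    ok = ok and np.allclose(gamma5, block(Z2, I2, I2, Z2))
    for g, s in zip(gammas, sigma):
        ok = ok and np.allclose(g, block(Z2, s, -s, Z2))
        # sigma^{0 i} = i [[0, sigma^i], [sigma^i, 0]]
        ok = ok and np.allclose(sigma_munu(gamma0, g), 1j * block(Z2, s, s, Z2))
    # sigma^{12} = epsilon_{123} sigma^3 on the block diagonal
    ok = ok and np.allclose(
        sigma_munu(gammas[0], gammas[1]), block(sigma[2], Z2, Z2, sigma[2])
    )
    return bool(ok)


print(all_checks_pass())
```

Swapping the arguments of `np.kron` back to the "natural" order produces a different (though unitarily equivalent) set of matrices, which is one more way to get confused by the notation.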


[1] A. Zee. Quantum field theory in a nutshell. Universities Press, 2005.

[2] Wikipedia. Tensor product — wikipedia, the free encyclopedia [online]. 2011. [Online; accessed 21-June-2011].

Posted in Math and Physics Learning.

PHY450HS1: Relativistic electrodynamics: some exam reflection.

Posted by peeterjoot on April 28, 2011


Charged particle in a circle.

From the 2008 PHY353 exam: given a particle of charge q moving in a circle of radius a at constant angular frequency \omega,

(a) Find the Lienard-Wiechert potentials for points on the z-axis.
(b) Find the electric and magnetic fields at the center.

When I tried this I did it for points not just on the z-axis. It turns out that we also got this question on the exam (but stated slightly differently). Since I’ll not get to see my exam solution again, let’s work through this at a leisurely rate, and see if things look right. The problem as stated in this old practice exam is easier since it doesn’t say to calculate the fields from the four potentials, so there was nothing preventing one from just grinding away and plugging stuff into the Lienard-Wiechert equations for the fields (as I did when I tried it for practice).

The potentials.

Let’s set up our coordinate system in cylindrical coordinates. For the charged particle and the point at which we measure the field, with i = \mathbf{e}_1 \mathbf{e}_2, we have

\begin{aligned}\mathbf{x}(t) &= a \mathbf{e}_1 e^{i \omega t} \\ \mathbf{r} &= z \mathbf{e}_3 + \rho \mathbf{e}_1 e^{i \phi}\end{aligned} \hspace{\stretch{1}}(1.1)

Here I’m using the geometric product of vectors. If that’s unfamiliar then just substitute

\begin{aligned}\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\} \rightarrow \{\sigma_1, \sigma_2, \sigma_3\}\end{aligned} \hspace{\stretch{1}}(1.3)

We can do that since the Pauli matrices also have the same semantics (with a small difference since the geometric square of a unit vector is defined as the unit scalar, whereas the Pauli matrix square is the identity matrix). The semantics we require of this vector product are just \mathbf{e}_\alpha^2 = 1 and \mathbf{e}_\alpha \mathbf{e}_\beta = - \mathbf{e}_\beta \mathbf{e}_\alpha for any \alpha \ne \beta.

I’ll also be loose with notation and use \text{Real}(X) = \left\langle{{X}}\right\rangle to select the scalar part of a multivector (or with the Pauli matrices, the portion proportional to the identity matrix).

Our task is to compute the Lienard-Wiechert potentials. Those are

\begin{aligned}A^0 &= \frac{q}{R^{*}} \\ \mathbf{A} &= A^0 \frac{\mathbf{v}}{c},\end{aligned} \hspace{\stretch{1}}(1.4)

where

\begin{aligned}\mathbf{R} &= \mathbf{r} - \mathbf{x}(t_r) \\ R = {\left\lvert{\mathbf{R}}\right\rvert} &= c (t - t_r) \\ R^{*} &= R - \frac{\mathbf{v}}{c} \cdot \mathbf{R} \\ \mathbf{v} &= \frac{d\mathbf{x}}{dt_r}.\end{aligned} \hspace{\stretch{1}}(1.6)

We’ll need (eventually)

\begin{aligned}\mathbf{v} &= a \omega \mathbf{e}_2 e^{i \omega t_r} = a \omega ( -\sin \omega t_r, \cos\omega t_r, 0) \\ \dot{\mathbf{v}} &= -a \omega^2 \mathbf{e}_1 e^{i \omega t_r} = -a \omega^2 (\cos\omega t_r, \sin\omega t_r, 0)\end{aligned} \hspace{\stretch{1}}(1.10)

and also need our retarded distance vector

\begin{aligned}\mathbf{R} = z \mathbf{e}_3 + \mathbf{e}_1 (\rho e^{i \phi} - a e^{i \omega t_r} ),\end{aligned} \hspace{\stretch{1}}(1.12)

From this we have

\begin{aligned}R^2 &= z^2 + {\left\lvert{\mathbf{e}_1 (\rho e^{i \phi} - a e^{i \omega t_r} )}\right\rvert}^2 \\ &= z^2 + \rho^2 + a^2 - 2 \rho a (\mathbf{e}_1 e^{i \phi}) \cdot (\mathbf{e}_1 e^{i \omega t_r}) \\ &= z^2 + \rho^2 + a^2 - 2 \rho a \text{Real}( e^{ i(\phi - \omega t_r) } ) \\ &= z^2 + \rho^2 + a^2 - 2 \rho a \cos(\phi - \omega t_r)\end{aligned}

so

\begin{aligned}R = \sqrt{z^2 + \rho^2 + a^2 - 2 \rho a \cos( \phi - \omega t_r ) }.\end{aligned} \hspace{\stretch{1}}(1.13)

Next we need

\begin{aligned}\mathbf{R} \cdot \mathbf{v}/c&= (z \mathbf{e}_3 + \mathbf{e}_1 (\rho e^{i \phi} - a e^{i \omega t_r} )) \cdot  \left(a \frac{\omega}{c} \mathbf{e}_2 e^{i \omega t_r} \right) \\ &=a \frac{\omega }{c}\text{Real}(i (\rho e^{-i \phi} - a e^{-i \omega t_r} ) e^{i \omega t_r} ) \\ &=a \frac{\omega }{c}\rho \text{Real}( i e^{-i \phi + i \omega t_r} ) \\ &=a \frac{\omega }{c}\rho \sin(\phi - \omega t_r)\end{aligned}

So we have

\begin{aligned}R^{*} = \sqrt{z^2 + \rho^2 + a^2 - 2 \rho a \cos( \phi - \omega t_r ) }-a \frac{\omega }{c} \rho \sin(\phi - \omega t_r)\end{aligned} \hspace{\stretch{1}}(1.14)
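Since the geometric algebra manipulations are easy to get a sign wrong in, here is a quick coordinate cross-check of the expressions for R, \mathbf{R} \cdot \mathbf{v}/c and R^{*}. This is only a sketch with NumPy; the function names and the sample values used to exercise it are arbitrary choices of mine.

```python
import numpy as np


def retarded_geometry(rho, phi, z, a, omega, t_r, c=1.0):
    """Coordinate form of R = r - x(t_r) and v/c for the circular orbit."""
    r = np.array([rho * np.cos(phi), rho * np.sin(phi), z])
    x = a * np.array([np.cos(omega * t_r), np.sin(omega * t_r), 0.0])
    v_over_c = (a * omega / c) * np.array(
        [-np.sin(omega * t_r), np.cos(omega * t_r), 0.0]
    )
    return r - x, v_over_c


def closed_forms(rho, phi, z, a, omega, t_r, c=1.0):
    """The closed form expressions for R, R.v/c and R^* derived above."""
    R = np.sqrt(z**2 + rho**2 + a**2 - 2 * rho * a * np.cos(phi - omega * t_r))
    R_dot_v = a * (omega / c) * rho * np.sin(phi - omega * t_r)
    return R, R_dot_v, R - R_dot_v


sample = dict(rho=0.7, phi=0.4, z=1.3, a=0.5, omega=0.9, t_r=2.1)
R_vec, v_over_c = retarded_geometry(**sample)
R, R_dot_v, R_star = closed_forms(**sample)
print(np.isclose(np.linalg.norm(R_vec), R), np.isclose(R_vec @ v_over_c, R_dot_v))
```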

Writing k = \omega/c, and having a peek back at 1.4 (where the denominator is R^{*} and not R), our potentials are now solved for

\begin{aligned}\boxed{\begin{aligned}A^0 &= \frac{q}{\sqrt{z^2 + \rho^2 + a^2 - 2 \rho a \cos( \phi - k c t_r ) } - a k \rho \sin( \phi - k c t_r )} \\ \mathbf{A} &= A^0 a k ( -\sin k c t_r, \cos k c t_r, 0).\end{aligned}}\end{aligned} \hspace{\stretch{1}}(1.15)

The caveat is that t_r is only specified implicitly, according to

\begin{aligned}\boxed{c t_r = c t - \sqrt{z^2 + \rho^2 + a^2 - 2 \rho a \cos( \phi - k c t_r ) }.}\end{aligned} \hspace{\stretch{1}}(1.16)

There doesn’t appear to be much hope of solving for t_r explicitly in closed form.
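Numerically, however, the implicit equation for t_r is not a problem. One simple approach (my choice here, not something from the course) is fixed-point iteration on c t_r = c t - R(t_r). Because R changes with c t_r at a rate of at most v/c = a k, the map is a contraction whenever a k < 1. The function name and default tolerances below are hypothetical.

```python
import math


def retarded_time(t, rho, phi, z, a, k, c=1.0, tol=1e-12, max_iter=10_000):
    """Solve c t_r = c t - R(t_r) for t_r by fixed-point iteration.

    R is the distance from the charge's retarded position to the
    observation point (rho, phi, z).  Requires a k = v/c < 1.
    """
    if not a * k < 1.0:
        raise ValueError("need a k = v/c < 1 for the iteration to contract")
    ctr = c * t  # initial guess: no retardation
    for _ in range(max_iter):
        R = math.sqrt(
            z * z + rho * rho + a * a - 2.0 * rho * a * math.cos(phi - k * ctr)
        )
        nxt = c * t - R
        if abs(nxt - ctr) < tol:
            return nxt / c
        ctr = nxt
    raise RuntimeError("retarded time iteration did not converge")


t_r = retarded_time(t=2.0, rho=0.7, phi=0.3, z=1.2, a=1.0, k=0.5)
print(t_r)
```

On the z-axis (\rho = 0) the distance no longer depends on t_r, and the iteration settles immediately on t_r = t - \sqrt{z^2 + a^2}/c.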

General fields for this system.


With

\begin{aligned}\mathbf{R}^{*} = \mathbf{R} - \frac{\mathbf{v}}{c} R,\end{aligned} \hspace{\stretch{1}}(1.17)

the fields are

\begin{aligned}\boxed{\begin{aligned}\mathbf{E} &= q (1 - \mathbf{v}^2/c^2) \frac{\mathbf{R}^{*}}{{R^{*}}^3} + \frac{q}{{R^{*}}^3} \mathbf{R} \times (\mathbf{R}^{*} \times \dot{\mathbf{v}}/c^2) \\ \mathbf{B} &= \frac{\mathbf{R}}{R} \times \mathbf{E}.\end{aligned}}\end{aligned} \hspace{\stretch{1}}(1.18)

In there we have

\begin{aligned}1 - \mathbf{v}^2/c^2 = 1 - a^2 \frac{\omega^2}{c^2} = 1 - a^2 k^2\end{aligned} \hspace{\stretch{1}}(1.19)

and, since \mathbf{e}_2 = \mathbf{e}_1 i,

\begin{aligned}\mathbf{R}^{*} &= z \mathbf{e}_3 + \mathbf{e}_1 (\rho e^{i \phi} - a e^{i k c t_r} )-a k \mathbf{e}_2 e^{i k c t_r} R \\ &= z \mathbf{e}_3 + \mathbf{e}_1 (\rho e^{i \phi} - a (1 + k R i) e^{i k c t_r} )\end{aligned}

Writing this out in coordinates isn’t particularly illuminating, but can be done for completeness without too much trouble

\begin{aligned}\mathbf{R}^{*} = ( \rho \cos\phi - a \cos k c t_r + a k R \sin k c t_r,  \rho \sin\phi - a \sin k c t_r - a k R \cos k c t_r,  z )\end{aligned} \hspace{\stretch{1}}(1.20)

In one sense the problem could be considered solved, since we have all the pieces of the puzzle. The outstanding question is whether or not the resulting mess can be simplified at all. Let’s see if the cross product reduces. Expanding the triple product gives

\begin{aligned}\mathbf{R} \times (\mathbf{R}^{*} \times \dot{\mathbf{v}}/c^2) =\mathbf{R}^{*} (\mathbf{R} \cdot \dot{\mathbf{v}}/c^2) - \frac{\dot{\mathbf{v}}}{c^2}(\mathbf{R} \cdot \mathbf{R}^{*})\end{aligned} \hspace{\stretch{1}}(1.21)

Perhaps one or more of these dot products can be simplified? One of them does reduce nicely

\begin{aligned}\mathbf{R}^{*} \cdot \mathbf{R} &= ( \mathbf{R} - R \mathbf{v}/c ) \cdot \mathbf{R}  \\ &= R^2 - (\mathbf{R} \cdot \mathbf{v}/c) R \\ &= R^2 - R a k \rho \sin(\phi - k c t_r) \\ &= R(R - a k \rho \sin(\phi - k c t_r))\end{aligned}

The other is

\begin{aligned}\mathbf{R} \cdot \dot{\mathbf{v}}/c^2&=\Bigl(z \mathbf{e}_3 + \mathbf{e}_1 (\rho e^{i \phi} - a e^{i \omega t_r} ) \Bigr) \cdot(-a k^2 \mathbf{e}_1 e^{i \omega t_r} )  \\ &=- a k^2 \left\langle{{\mathbf{e}_1 (\rho e^{i \phi} - a e^{i \omega t_r} ) \mathbf{e}_1 e^{i \omega t_r} }}\right\rangle \\ &=- a k^2 \left\langle{{(\rho e^{i \phi} - a e^{i \omega t_r} ) e^{-i \omega t_r} }}\right\rangle \\ &=- a k^2 \left\langle{{\rho e^{i \phi - i \omega t_r} - a }}\right\rangle \\ &=- a k^2 ( \rho \cos(\phi - k c t_r) - a )\end{aligned}

Putting this cross product back together we have

\begin{aligned}\mathbf{R} \times (\mathbf{R}^{*} \times \dot{\mathbf{v}}/c^2)&=a k^2 ( a -\rho \cos(\phi - k c t_r) ) \mathbf{R}^{*} +a k^2 \mathbf{e}_1 e^{i k c  t_r} R(R - a k \rho \sin(\phi - k c t_r)) \\ &=a k^2 ( a -\rho \cos(\phi - k c t_r) ) \Bigl(z \mathbf{e}_3 + \mathbf{e}_1 (\rho e^{i \phi} - a (1 + k R i) e^{i k c t_r} )\Bigr) \\ &\qquad +a k^2 R \mathbf{e}_1 e^{i k c  t_r} (R - a k \rho \sin(\phi - k c t_r)) \end{aligned}

Writing

\begin{aligned}\phi_r = \phi - k c t_r,\end{aligned} \hspace{\stretch{1}}(1.22)

this can be grouped into similar terms

\begin{aligned}\begin{aligned}\mathbf{R} \times (\mathbf{R}^{*} \times \dot{\mathbf{v}}/c^2)&=a k^2 (a - \rho \cos\phi_r) z \mathbf{e}_3 \\ &+ a k^2 \mathbf{e}_1(a - \rho \cos\phi_r) \rho e^{i\phi} \\ &+ a k^2 \mathbf{e}_1\left(-a (a - \rho \cos\phi_r) (1 + k R i)+ R(R - a k \rho \sin \phi_r)\right) e^{i k c t_r}\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.23)

The electric field pieces can now be collected. Each component of \mathbf{R}^{*} picks up the combined factor (1 - a^2 k^2) + a k^2 (a - \rho \cos\phi_r) = 1 - a \rho k^2 \cos\phi_r. Not expanding out the R^{*} from 1.14, this is

\begin{aligned}\begin{aligned}\mathbf{E} &= \frac{q}{(R^{*})^3} z \mathbf{e}_3\Bigl( 1 - a \rho k^2 \cos\phi_r \Bigr) \\ &+\frac{q}{(R^{*})^3} \rho\mathbf{e}_1 \Bigl(1 - a \rho k^2 \cos\phi_r \Bigr) e^{i\phi} \\ &+\frac{q}{(R^{*})^3} a \mathbf{e}_1\left(-\Bigl( 1 - a \rho k^2 \cos\phi_r \Bigr) (1 + k R i)+ k^2 R(R - a k \rho \sin \phi_r)\right) e^{i k c t_r}\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.24)

Along the z-axis where \rho = 0 what do we have?

\begin{aligned}R = \sqrt{z^2 + a^2 } \end{aligned} \hspace{\stretch{1}}(1.25)

\begin{aligned}A^0 = \frac{q}{R} \end{aligned} \hspace{\stretch{1}}(1.26)

\begin{aligned}\mathbf{A} = A^0 a k \mathbf{e}_2 e^{i k c t_r } \end{aligned} \hspace{\stretch{1}}(1.27)

\begin{aligned}c t_r = c t - \sqrt{z^2 + a^2 } \end{aligned} \hspace{\stretch{1}}(1.28)

\begin{aligned}\begin{aligned}\mathbf{E} &= \frac{q}{R^3} z \mathbf{e}_3 \\ &+\frac{q}{R^3} a \mathbf{e}_1\left(-(1 + k R i)+ k^2 R^2 \right) e^{i k c t_r} \end{aligned}\end{aligned} \hspace{\stretch{1}}(1.29)

\begin{aligned}\mathbf{B} = \frac{ z \mathbf{e}_3 - a \mathbf{e}_1 e^{i k c t_r}}{R} \times \mathbf{E}\end{aligned} \hspace{\stretch{1}}(1.30)

The magnetic term here looks like it can be reduced a bit.

An approximation near the center.

Unlike the old exam I did, where it didn’t specify that the potentials had to be used to calculate the fields, and the problem was reduced to one of algebraic manipulation, our exam explicitly asked for the potentials to be used to calculate the fields.

There was also the restriction to compute them near the center. Setting \rho = 0 so that we are looking only near the z-axis, we have

\begin{aligned}A^0 &= \frac{q}{\sqrt{z^2 + a^2}} \\ \mathbf{A} &= \frac{q a k \mathbf{e}_2 e^{i k c t_r} }{\sqrt{z^2 + a^2}} = \frac{q a k (-\sin k c t_r, \cos k c t_r, 0)}{\sqrt{z^2 + a^2}} \\ t_r &= t - R/c = t - \sqrt{z^2 + a^2}/c\end{aligned} \hspace{\stretch{1}}(1.31)

Now we are set to calculate the electric and magnetic fields directly from these. Observe that \mathbf{A} has a spatial dependence through the t_r quantities, and that will have an effect when we operate with the gradient.

In the exam I’d asked Simon (our TA) if this question was asking for the fields at the origin (ie: in the plane of the charge’s motion in the center) or along the z-axis. He said in the plane. That would simplify things, but perhaps too much since A^0 becomes constant (in my exam attempt I somehow fudged this to get what I wanted for the v = 0 case, but that must have been wrong, and was the result of rushed work).

Let’s now proceed with the field calculation from these potentials

\begin{aligned}\mathbf{E} &= - \boldsymbol{\nabla} A^0 - \frac{1}{{c}} \frac{\partial {\mathbf{A}}}{\partial {t}} \\ \mathbf{B} &= \boldsymbol{\nabla} \times \mathbf{A}.\end{aligned} \hspace{\stretch{1}}(1.34)

For the electric field we need

\begin{aligned}\boldsymbol{\nabla} A^0  &= q \mathbf{e}_3 \partial_z (z^2 + a^2)^{-1/2} \\ &= -q \mathbf{e}_3 \frac{z}{(\sqrt{z^2 + a^2})^3},\end{aligned}

and

\begin{aligned}\frac{1}{{c}} \frac{\partial {\mathbf{A}}}{\partial {t}} =\frac{q a k^2 \mathbf{e}_2 \mathbf{e}_1 \mathbf{e}_2 e^{i k c t_r} }{\sqrt{z^2 + a^2}}.\end{aligned} \hspace{\stretch{1}}(1.36)

Putting these together, our electric field near the z-axis is

\begin{aligned}\mathbf{E} = q \mathbf{e}_3 \frac{z}{(\sqrt{z^2 + a^2})^3}+\frac{q a k^2 \mathbf{e}_1 e^{i k c t_r} }{\sqrt{z^2 + a^2}}.\end{aligned} \hspace{\stretch{1}}(1.37)

(another mistake I made on the exam, since I somehow fooled myself into forcing what I knew had to be in the gradient term, despite having essentially a constant scalar potential (having taken z = 0)).

What do we get for the magnetic field? In that case we have

\begin{aligned}\boldsymbol{\nabla} \times \mathbf{A}(z)&=\mathbf{e}_\alpha \times \partial_\alpha \mathbf{A} \\ &=\mathbf{e}_3 \times \partial_z \frac{q a k \mathbf{e}_2 e^{i k c t_r} }{\sqrt{z^2 + a^2}}  \\ &=\mathbf{e}_3 \times (\mathbf{e}_2 e^{i  k c t_r} ) q a  k \frac{\partial {}}{\partial {z}} \frac{1}{{\sqrt{z^2 + a^2}}} +q a  k \frac{1}{{\sqrt{z^2 + a^2}}} \mathbf{e}_3 \times (\mathbf{e}_2 \partial_z e^{i  k c t_r} ) \\ &=-\mathbf{e}_3 \times (\mathbf{e}_2 e^{i  k c t_r} ) q a  k \frac{z}{(\sqrt{z^2 + a^2})^3} +q a  k \frac{1}{{\sqrt{z^2 + a^2}}} \mathbf{e}_3 \times \left( \mathbf{e}_2 \mathbf{e}_1 \mathbf{e}_2 k c e^{i  k c t_r} \partial_z ( t - \sqrt{z^2 + a^2}/c ) \right) \\ &=-\mathbf{e}_3 \times (\mathbf{e}_2 e^{i  k c t_r} ) q a  k \frac{z}{(\sqrt{z^2 + a^2})^3} +q a  k^2 \frac{z}{z^2 + a^2} \mathbf{e}_3 \times \left( \mathbf{e}_1 e^{i  k c t_r} \right) \\ &=-\frac{q a k z \mathbf{e}_3}{z^2 + a^2} \times \left( \frac{ \mathbf{e}_2 e^{i k c t_r} }{\sqrt{z^2 + a^2}} - k \mathbf{e}_1 e^{i k c t_r} \right)\end{aligned}

In the second term the two minus signs, one from \mathbf{e}_2 \mathbf{e}_1 \mathbf{e}_2 = -\mathbf{e}_1 and one from \partial_z ( t - \sqrt{z^2 + a^2}/c ) = -z/(c \sqrt{z^2 + a^2}), cancel. For the direction vectors in the cross products above we have

\begin{aligned}\mathbf{e}_3 \times (\mathbf{e}_2 e^{i \mu})&=\mathbf{e}_3 \times (\mathbf{e}_2 \cos\mu - \mathbf{e}_1 \sin\mu) \\ &=-\mathbf{e}_1 \cos\mu - \mathbf{e}_2 \sin\mu \\ &=-\mathbf{e}_1 e^{i \mu}\end{aligned}

and

\begin{aligned}\mathbf{e}_3 \times (\mathbf{e}_1 e^{i \mu})&=\mathbf{e}_3 \times (\mathbf{e}_1 \cos\mu + \mathbf{e}_2 \sin\mu) \\ &=\mathbf{e}_2 \cos\mu - \mathbf{e}_1 \sin\mu \\ &=\mathbf{e}_2 e^{i \mu}\end{aligned}

Putting everything together, and summarizing results for the fields, we have

\begin{aligned}\mathbf{E} &= q \mathbf{e}_3 \frac{z}{(\sqrt{z^2 + a^2})^3}+\frac{q a k^2 \mathbf{e}_1 e^{i \omega t_r} }{\sqrt{z^2 + a^2}} \\ \mathbf{B} &= \frac{q a k z}{ z^2 + a^2} \left( \frac{\mathbf{e}_1}{\sqrt{z^2 + a^2}} + k \mathbf{e}_2 \right) e^{i \omega t_r}\end{aligned} \hspace{\stretch{1}}(1.38)

The electric field expression above compares well to 1.29. We have the Coulomb term and the radiation term. It is harder to compare the magnetic field to the exact result 1.30 since I did not expand that out.

FIXME: A question to consider. If all this worked should we not also get

\begin{aligned}\mathbf{B} \stackrel{?}{=}\frac{z \mathbf{e}_3 - \mathbf{e}_1 a e^{i \omega t_r}}{\sqrt{z^2 + a^2}} \times \mathbf{E}.\end{aligned} \hspace{\stretch{1}}(1.40)

However, if I do this check I get

\begin{aligned}\mathbf{B} =\frac{q a z}{z^2 + a^2} \left( \frac{1}{{z^2 + a^2}} + k^2 \right) \mathbf{e}_2 e^{i \omega t_r}.\end{aligned} \hspace{\stretch{1}}(1.41)
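That last check is at least easy to reproduce numerically. The sketch below (NumPy, with a made-up function name and arbitrary sample values) takes the near-axis electric field from 1.38, crosses it with the unit vector of 1.40, and compares against the right hand side of 1.41. It says nothing about which magnetic field expression is right, only that the algebra of the check itself holds.

```python
import numpy as np


def check_b_from_e(q, a, k, z, theta):
    """Compare R_hat x E (near-axis E) with the right hand side of 1.41.

    theta stands for omega t_r.
    """
    R = np.sqrt(z**2 + a**2)
    e3 = np.array([0.0, 0.0, 1.0])
    u = np.array([np.cos(theta), np.sin(theta), 0.0])    # e1 e^{i theta}
    w = np.array([-np.sin(theta), np.cos(theta), 0.0])   # e2 e^{i theta}
    E = q * z * e3 / R**3 + q * a * k**2 * u / R          # near-axis E of 1.38
    R_hat = (z * e3 - a * u) / R
    lhs = np.cross(R_hat, E)
    rhs = q * a * z / R**2 * (1.0 / R**2 + k**2) * w
    return lhs, rhs


lhs, rhs = check_b_from_e(q=1.0, a=0.5, k=0.8, z=1.3, theta=0.6)
print(np.allclose(lhs, rhs))
```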

Collision of photon and electron.

I made a dumb error on the exam on this one. I set up the four momentum conservation statement, but then didn’t multiply out the cross terms properly. This led me to incorrectly assume that I had to try doing this the hard way (something akin to what I did on the midterm). Simon later told us in the tutorial the simple way, and that’s all we needed here too. Here’s the setup.

An electron at rest initially has four momentum

\begin{aligned}(m c, 0)\end{aligned} \hspace{\stretch{1}}(2.42)

where the incoming photon has four momentum

\begin{aligned}\left(\hbar \frac{\omega}{c}, \hbar \mathbf{k} \right)\end{aligned} \hspace{\stretch{1}}(2.43)

After the collision our electron has some velocity so its four momentum becomes (say)

\begin{aligned}\gamma (m c, m \mathbf{v}),\end{aligned} \hspace{\stretch{1}}(2.44)

and our new photon, going off on an angle \theta relative to \mathbf{k} has four momentum

\begin{aligned}\left(\hbar \frac{\omega'}{c}, \hbar \mathbf{k}' \right)\end{aligned} \hspace{\stretch{1}}(2.45)

Our conservation relationship is thus

\begin{aligned}(m c, 0) + \left(\hbar \frac{\omega}{c}, \hbar \mathbf{k} \right)=\gamma (m c, m \mathbf{v})+\left(\hbar \frac{\omega'}{c}, \hbar \mathbf{k}' \right)\end{aligned} \hspace{\stretch{1}}(2.46)

I squared both sides, but dropped my cross terms, which was just plain wrong, and costly for both time and effort on the exam. What I should have done was just

\begin{aligned}\gamma (m c, m \mathbf{v}) =(m c, 0) + \left(\hbar \frac{\omega}{c}, \hbar \mathbf{k} \right)-\left(\hbar \frac{\omega'}{c}, \hbar \mathbf{k}' \right),\end{aligned} \hspace{\stretch{1}}(2.47)

and then square this (really making contractions of the form p_i p^i). That gives (and this time keeping my cross terms)

\begin{aligned}(\gamma (m c, m \mathbf{v}) )^2 &= \gamma^2 m^2 (c^2 - \mathbf{v}^2) \\ &= m^2 c^2 \\ &=m^2 c^2 + 0 + 0+ 2 (m c, 0) \cdot \left(\hbar \frac{\omega}{c}, \hbar \mathbf{k} \right)- 2 (m c, 0) \cdot \left(\hbar \frac{\omega'}{c}, \hbar \mathbf{k}' \right)- 2 \left(\hbar \frac{\omega}{c}, \hbar \mathbf{k} \right)\cdot \left(\hbar \frac{\omega'}{c}, \hbar \mathbf{k}' \right) \\ &=m^2 c^2 + 2 m c \hbar \frac{\omega}{c} - 2 m c \hbar \frac{\omega'}{c}- 2\hbar^2 \left(\frac{\omega}{c} \frac{\omega'}{c}- \mathbf{k} \cdot \mathbf{k}'\right) \\ &=m^2 c^2 + 2 m c \hbar \frac{\omega}{c} - 2 m c \hbar \frac{\omega'}{c}- 2\hbar^2 \frac{\omega}{c} \frac{\omega'}{c} (1 - \cos\theta)\end{aligned}

Rearranging a bit we have

\begin{aligned}\omega' \left( m + \frac{\hbar \omega}{c^2} ( 1 - \cos\theta ) \right) = m \omega,\end{aligned} \hspace{\stretch{1}}(2.48)

or

\begin{aligned}\omega' = \frac{\omega}{1 + \frac{\hbar \omega}{m c^2} ( 1 - \cos\theta ) }\end{aligned} \hspace{\stretch{1}}(2.49)
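A quick numerical sanity check of this result: given \omega' from 2.49, the four vector (electron at rest) + (incoming photon) - (outgoing photon) should square to m^2 c^2. The sketch below works in units with \hbar = c = 1 by default and assumes the incoming photon travels along the x-axis with the outgoing photon in the x-y plane; the function names are just for illustration.

```python
import math


def scattered_frequency(omega, theta, m, hbar=1.0, c=1.0):
    """Scattered photon frequency omega' from 2.49."""
    return omega / (1.0 + (hbar * omega / (m * c**2)) * (1.0 - math.cos(theta)))


def recoil_invariant(omega, theta, m, hbar=1.0, c=1.0):
    """Minkowski square of the recoiling electron's four momentum.

    Built from (m c, 0) + incoming photon - outgoing photon, with the
    incoming photon along x and the outgoing one at angle theta.
    """
    omega_p = scattered_frequency(omega, theta, m, hbar, c)
    e_over_c = m * c + hbar * omega / c - hbar * omega_p / c
    px = hbar * omega / c - hbar * omega_p / c * math.cos(theta)
    py = -hbar * omega_p / c * math.sin(theta)
    return e_over_c**2 - px**2 - py**2


print(recoil_invariant(omega=0.8, theta=1.1, m=1.0))  # should be m^2 c^2 = 1
```

Forward scattering (\theta = 0) leaves the frequency unchanged, and the largest shift occurs for backscatter (\theta = \pi).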

Pion decay.

The problem above is very much like a midterm problem we had, so there was no justifiable excuse for messing up on it. That midterm problem was to consider the split of a pion at rest into a neutrino (massless) and a muon, and to calculate the energy of the muon. That one also follows the same pattern, a calculation of four momentum conservation, say

\begin{aligned}(m_\pi c, 0) = \hbar \frac{\omega}{c}(1, \hat{\mathbf{k}}) + ( \mathcal{E}_\mu/c, \mathbf{p}_\mu ).\end{aligned} \hspace{\stretch{1}}(3.50)

Here \omega is the frequency of the massless neutrino. The massless nature is encoded by a four momentum that squares to zero, which follows from (1, \hat{\mathbf{k}}) \cdot (1, \hat{\mathbf{k}}) = 1^2 - \hat{\mathbf{k}} \cdot \hat{\mathbf{k}} = 0.

When I did this problem on the midterm, I perversely put in a scattering angle, instead of recognizing that the particles must emerge back to back (180 degrees apart) since the spatial momentum components must also be conserved. This, combined with trying to work in spatial quantities, led to a mess and I didn’t get the end result in anything that could be considered tidy.

The simple way to do this is to just rearrange to put the null vector on one side, and then square. This gives us

\begin{aligned}0 &=\left(\hbar \frac{\omega}{c}(1, \hat{\mathbf{k}}) \right) \cdot\left(\hbar \frac{\omega}{c}(1, \hat{\mathbf{k}}) \right) \\ &=\left( (m_\pi c, 0) - ( \mathcal{E}_\mu/c, \mathbf{p}_\mu ) \right) \cdot \left( (m_\pi c, 0) - ( \mathcal{E}_\mu/c, \mathbf{p}_\mu ) \right) \\ &={m_\pi}^2 c^2 + {m_\mu}^2 c^2 - 2 (m_\pi c, 0) \cdot ( \mathcal{E}_\mu/c, \mathbf{p}_\mu ) \\ &={m_\pi}^2 c^2 + {m_\mu}^2 c^2 - 2 m_\pi \mathcal{E}_\mu\end{aligned}

Here m_\mu is the muon mass, entering through the square of the muon four momentum. A final re-arrangement gives us the muon energy

\begin{aligned}\mathcal{E}_\mu = \frac{1}{{2}} \frac{ {m_\pi}^2 + {m_\mu}^2 }{m_\pi} c^2\end{aligned} \hspace{\stretch{1}}(3.51)
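To put a number on this, here is a small sketch that evaluates 3.51, taking the second mass in that formula to be the muon mass, and then confirms that energy balances once the massless neutrino carries off momentum equal and opposite to the muon's. The masses are approximate values in MeV/c^2 that I am quoting from memory, so treat the digits as illustrative; units are chosen with c = 1.

```python
import math

M_PI = 139.57   # MeV/c^2, approximate charged pion mass (assumed value)
M_MU = 105.66   # MeV/c^2, approximate muon mass (assumed value)


def muon_energy(m_pi, m_mu, c=1.0):
    """Muon energy for a pion decaying at rest, from 3.51."""
    return 0.5 * (m_pi**2 + m_mu**2) / m_pi * c**2


E_mu = muon_energy(M_PI, M_MU)
p_mu = math.sqrt(E_mu**2 - M_MU**2)   # muon momentum, with c = 1
E_nu = p_mu                           # massless neutrino, back to back with the muon

print(f"E_mu = {E_mu:.2f} MeV, E_nu = {E_nu:.2f} MeV, total = {E_mu + E_nu:.2f} MeV")
```

The total comes back to the pion rest energy, as it must, with most of the energy going into the muon's rest mass.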

Posted in Math and Physics Learning.

Lorentz transformation of the metric tensors.

Posted by peeterjoot on January 16, 2011


Following up on the previous thought, it is not hard to come up with an example of a symmetric tensor a whole lot simpler than the electrodynamic stress tensor. The metric tensor is probably the simplest symmetric tensor, and we get that by considering the dot product of two vectors. Taking the dot product of vectors a and b for example we have

\begin{aligned}a \cdot b = a^\mu b^\nu \gamma_\mu \cdot \gamma_\nu\end{aligned} \hspace{\stretch{1}}(4.17)

From this, the metric tensors are defined as

\begin{aligned}\eta_{\mu\nu} &= \gamma_\mu \cdot \gamma_\nu \\ \eta^{\mu\nu} &= \gamma^\mu \cdot \gamma^\nu\end{aligned} \hspace{\stretch{1}}(4.18)

These are both symmetric and diagonal, and in fact equal (regardless of whether one picks a +,-,-,- or -,+,+,+ signature for the space).

Let’s look at the transformation of the dot product, utilizing the transformation of the four vectors being dotted to do so. By definition, when both vectors are equal, we have the (squared) spacetime interval, which based on the speed of light being constant, has been found to be an invariant under transformation.

\begin{aligned}a' \cdot b'= a^\mu b^\nu L(\gamma_\mu) \cdot L(\gamma_\nu)\end{aligned} \hspace{\stretch{1}}(4.20)

We note that, like any other vector, the image L(\gamma_\mu) of the Lorentz transform of the vector \gamma_\mu can be written as

\begin{aligned}L(\gamma_\mu) = \left( L(\gamma_\mu) \cdot \gamma^\nu \right) \gamma_\nu\end{aligned} \hspace{\stretch{1}}(4.21)

Similarly we can write any vector in terms of the reciprocal frame

\begin{aligned}\gamma_\nu = (\gamma_\nu \cdot \gamma_\mu) \gamma^\mu.\end{aligned} \hspace{\stretch{1}}(4.22)

The dot product factor is a component of the metric tensor

\begin{aligned}\eta_{\nu \mu} = \gamma_\nu \cdot \gamma_\mu,\end{aligned} \hspace{\stretch{1}}(4.23)

so we see that the dot product transforms as

\begin{aligned}a' \cdot b' = a^\mu b^\nu ( L(\gamma_\mu) \cdot \gamma^\alpha ) ( L(\gamma_\nu) \cdot \gamma^\beta ) \gamma_\alpha\cdot\gamma_\beta= a^\mu b^\nu {L_\mu}^\alpha{L_\nu}^\beta\eta_{\alpha \beta}\end{aligned} \hspace{\stretch{1}}(4.24)

In particular, for a = b where we have the invariant interval defined by the condition a^2 = {a'}^2, we must have

\begin{aligned}a^\mu a^\nu \eta_{\mu \nu}= a^\mu a^\nu {L_\mu}^\alpha{L_\nu}^\beta\eta_{\alpha \beta}\end{aligned} \hspace{\stretch{1}}(4.25)

This implies that the symmetric metric tensor transforms as

\begin{aligned}\eta_{\mu\nu}={L_\mu}^\alpha{L_\nu}^\beta\eta_{\alpha \beta}\end{aligned} \hspace{\stretch{1}}(4.26)

Recall from 3.16 that the coordinate representation of a bivector, an antisymmetric quantity, transformed as

\begin{aligned}T^{\mu \nu} \rightarrow T^{\sigma \pi} {L_\sigma}^\mu {L_\pi}^\nu.\end{aligned} \hspace{\stretch{1}}(4.27)

This is a very similar transformation, but differs from the bivector case where our free indexes were upper indexes. Suppose that we define an alternate set of coordinates for the Lorentz transformation. Let

\begin{aligned}{L^\mu}_\nu = L(\gamma^\mu) \cdot \gamma_\nu.\end{aligned} \hspace{\stretch{1}}(4.28)

This can be related to the previous coordinate matrix by

\begin{aligned}{L^\mu}_\nu = \eta^{\mu \alpha } \eta_{\nu \beta } {L_\alpha}^\beta. \end{aligned} \hspace{\stretch{1}}(4.29)

If we examine how the coordinates of x^2 transform in their lower index representation we find

\begin{aligned}{x'}^2 = x_\mu x_\nu {L^\mu}_\alpha {L^\nu}_\beta \eta^{\alpha \beta} = x^2 = x_\mu x_\nu \eta^{\mu \nu},\end{aligned} \hspace{\stretch{1}}(4.30)

and therefore find that the (upper index) metric tensor transforms as

\begin{aligned}\eta^{\mu \nu} \rightarrow\eta^{\alpha \beta}{L^\mu}_\alpha {L^\nu}_\beta .\end{aligned} \hspace{\stretch{1}}(4.31)

Compared to 4.27 we have almost the same structure of transformation. Are these the same? Does the notation I picked here introduce an apparent difference that does not actually exist? We really want to know if we have the identity

\begin{aligned}L(\gamma_\mu) \cdot \gamma^\nu\stackrel{?}{=}L(\gamma^\nu) \cdot \gamma_\mu,\end{aligned} \hspace{\stretch{1}}(4.32)

which given the notation selected would mean that {L_\mu}^\nu = {L^\nu}_\mu, and justify a notational simplification {L_\mu}^\nu = {L^\nu}_\mu = L^\nu_\mu.

The inverse Lorentz transformation

To answer this question, let’s consider a specific example, an x-axis boost of rapidity \alpha. For that our Lorentz transformation takes the following form

\begin{aligned}L(x) = e^{-\sigma_1 \alpha/2} x e^{\sigma_1 \alpha/2},\end{aligned} \hspace{\stretch{1}}(5.33)

where \sigma_k = \gamma_k \gamma_0. Since \sigma_1 anticommutes with \gamma_0 and \gamma_1, but commutes with \gamma_2 and \gamma_3, we have

\begin{aligned}L(x) = (x^0 \gamma_0 + x^1 \gamma_1) e^{\sigma_1 \alpha} + x^2 \gamma_2 + x^3 \gamma_3,\end{aligned} \hspace{\stretch{1}}(5.34)

and after expansion this is

\begin{aligned}L(x) = \gamma_0 ( x^0 \cosh \alpha - x^1 \sinh \alpha ) +\gamma_1 ( x^1 \cosh \alpha - x^0 \sinh \alpha )+ x^2 \gamma_2+ x^3 \gamma_3\end{aligned} \hspace{\stretch{1}}(5.35)

In particular for the basis vectors themselves we have

\begin{aligned}\begin{bmatrix}L(\gamma_0) \\ L(\gamma_1) \\ L(\gamma_2) \\ L(\gamma_3)\end{bmatrix}=\begin{bmatrix}\gamma_0 \cosh \alpha - \gamma_1 \sinh \alpha \\ -\gamma_0 \sinh \alpha + \gamma_1 \cosh \alpha \\ \gamma_2 \\ \gamma_3\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(5.36)

Forming a matrix with \mu indexing over rows and \nu indexing over columns we have

\begin{aligned}{L_\mu}^\nu =\begin{bmatrix}\cosh \alpha &- \sinh \alpha & 0 & 0 \\ -\sinh \alpha & \cosh \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(5.37)

Performing the same expansion for {L^\nu}_\mu, again with \mu indexing over rows, we have

\begin{aligned}{L^\nu}_\mu =\begin{bmatrix}\cosh \alpha & \sinh \alpha & 0 & 0 \\ \sinh \alpha & \cosh \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(5.38)

This answers the question. We cannot assume that {L_\mu}^\nu = {L^\nu}_\mu. In fact, in this particular case, we have {L^\nu}_\mu = ({L_\mu}^\nu)^{-1}. Is that a general condition? Note that for the general case, we have to consider compounded transformations, where each can be a boost or rotation.
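The inverse relationship for this particular boost can be checked numerically. The following is a minimal sketch, assuming the matrices of 5.37 and 5.38 as written above; the helper names and the test rapidity are my own choices for illustration.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def boost_lower_upper(alpha):
    """{L_mu}^nu of 5.37 for an x-axis boost of rapidity alpha."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def boost_upper_lower(alpha):
    """{L^nu}_mu of 5.38 for the same boost."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

alpha = 0.7  # arbitrary test rapidity (an assumption, any real value works)
product = matmul(boost_lower_upper(alpha), boost_upper_lower(alpha))

# Largest deviation of the product from the 4x4 identity matrix.
max_dev = max(abs(product[i][j] - (1 if i == j else 0))
              for i in range(4) for j in range(4))
print(max_dev)
```

This only confirms the single-boost case; it says nothing about the compounded boosts and rotations that the general question requires.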


[1] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

[2] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

Posted in Math and Physics Learning.

Notes and problems for Desai Chapter V.

Posted by peeterjoot on November 8, 2010

[Click here for a PDF of this post with nicer formatting]


Chapter V notes for [1].



Problem 1.


Obtain S_x, S_y, S_z for spin 1 in the representation in which S_z and S^2 are diagonal.


For spin 1, we have

\begin{aligned}S^2 = 1 (1+1) \hbar^2 \mathbf{1}\end{aligned} \hspace{\stretch{1}}(3.1)

and are interested in the states {\lvert {1,-1} \rangle}, {\lvert {1, 0} \rangle}, and {\lvert {1,1} \rangle}. If, like angular momentum, we assume that we have for m_s = -1,0,1

\begin{aligned}S_z {\lvert {1,m_s} \rangle} = m_s \hbar {\lvert {1, m_s} \rangle}\end{aligned} \hspace{\stretch{1}}(3.2)

and introduce column matrix representations for the kets as follows

\begin{aligned}{\lvert {1,1} \rangle} &=\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix} \\ {\lvert {1,0} \rangle} &=\begin{bmatrix}0 \\ 1 \\ 0\end{bmatrix} \\ {\lvert {1,-1} \rangle} &=\begin{bmatrix}0 \\ 0 \\ 1\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.3)

then we have, by inspection

\begin{aligned}S_z &= \hbar\begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.6)

Note that, like the Pauli matrices, and unlike angular momentum, the spin states {\lvert {-1, m_s} \rangle}, {\lvert {0, m_s} \rangle} have not been considered. Do those have any physical interpretation?

That question aside, we can proceed as in the text, utilizing the ladder operator commutators

\begin{aligned}S_{\pm} &= S_x \pm i S_y,\end{aligned} \hspace{\stretch{1}}(3.7)

to determine the values of S_x and S_y indirectly. We find

\begin{aligned}\left[{S_{+}},{S_{-}}\right] &= 2 \hbar S_z \\ \left[{S_{+}},{S_{z}}\right] &= -\hbar S_{+} \\ \left[{S_{-}},{S_{z}}\right] &= \hbar S_{-}.\end{aligned} \hspace{\stretch{1}}(3.8)

Let


\begin{aligned}S_{+} &=\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.11)

Looking for equality in \left[{S_{z}},{S_{+}}\right]/\hbar = S_{+}, we find

\begin{aligned}\begin{bmatrix}0 & b & 2 c \\ -d & 0 & f \\ -2g & -h & 0\end{bmatrix}&=\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.12)

so we must have

\begin{aligned}S_{+} &=\begin{bmatrix}0 & b & 0 \\ 0 & 0 & f \\ 0 & 0 & 0\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.13)

Furthermore, from \left[{S_{+}},{S_{-}}\right] = 2 \hbar S_z, we find

\begin{aligned}\begin{bmatrix}{\left\lvert{b}\right\rvert}^2 & 0 & 0 \\ 0 & {\left\lvert{f}\right\rvert}^2 - {\left\lvert{b}\right\rvert}^2 & 0 \\ 0 & 0 & -{\left\lvert{f}\right\rvert}^2\end{bmatrix} &= 2 \hbar^2\begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.14)

We must have {\left\lvert{b}\right\rvert}^2 = {\left\lvert{f}\right\rvert}^2 = 2 \hbar^2. We could probably pick any b = \sqrt{2} \hbar e^{i\phi}, and f = \sqrt{2} \hbar e^{i\theta}, but assuming we have no reason for a non-zero phase we try

\begin{aligned}S_{+}&=\sqrt{2} \hbar\begin{bmatrix}0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.15)

Putting all the pieces back together, with S_x = (S_{+} + S_{-})/2, and S_y = (S_{+} - S_{-})/2i, we finally have

\begin{aligned}S_x &=\frac{\hbar}{\sqrt{2}}\begin{bmatrix}0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0\end{bmatrix} \\ S_y &=\frac{\hbar}{\sqrt{2} i}\begin{bmatrix}0 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 0\end{bmatrix} \\ S_z &=\hbar\begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.16)

A quick calculation verifies that we have S_x^2 + S_y^2 + S_z^2 = 2 \hbar^2 \mathbf{1}, as expected.
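That quick calculation can be sketched numerically. This is only an illustrative check, assuming units where \hbar = 1 (the constant HBAR and the helper names are my own), of both the total spin s(s+1)\hbar^2 \mathbf{1} and the commutator \left[{S_x},{S_y}\right] = i \hbar S_z for the matrices of 3.16.

```python
import math

HBAR = 1.0  # assumption: work in units where hbar = 1

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, ca=1, cb=1):
    """Return the elementwise combination ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

r = HBAR / math.sqrt(2)
# Spin 1 matrices of 3.16 (note 1/i = -i in the S_y prefactor).
Sx = [[0, r, 0], [r, 0, r], [0, r, 0]]
Sy = [[0, -1j * r, 0], [1j * r, 0, -1j * r], [0, 1j * r, 0]]
Sz = [[HBAR, 0, 0], [0, 0, 0], [0, 0, -HBAR]]

S2 = combine(combine(matmul(Sx, Sx), matmul(Sy, Sy)), matmul(Sz, Sz))
commutator = combine(matmul(Sx, Sy), matmul(Sy, Sx), 1, -1)

# Deviation of S^2 from 2 hbar^2 times the identity.
casimir_dev = max(abs(S2[i][j] - (2 * HBAR ** 2 if i == j else 0))
                  for i in range(3) for j in range(3))
# Deviation of [S_x, S_y] from i hbar S_z.
commutator_dev = max(abs(commutator[i][j] - 1j * HBAR * Sz[i][j])
                     for i in range(3) for j in range(3))
print(casimir_dev, commutator_dev)
```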

Problem 2.


Obtain eigensolution for operator A = a \sigma_y + b \sigma_z. Call the eigenstates {\lvert {1} \rangle} and {\lvert {2} \rangle}, and determine the probabilities that they will correspond to \sigma_x = +1.


The first part is straightforward, and we have

\begin{aligned}A &= a \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} + b \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \\ &=\begin{bmatrix}b & -i a \\ ia & -b\end{bmatrix}.\end{aligned}

Taking {\left\lvert{A - \lambda I}\right\rvert} = 0 we get

\begin{aligned}\lambda &= \pm \sqrt{a^2 + b^2},\end{aligned} \hspace{\stretch{1}}(3.19)

with eigenvectors proportional to

\begin{aligned}{\lvert {\pm} \rangle} &=\begin{bmatrix}i a \\ b \mp \sqrt{a^2 + b^2}\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.20)

The normalization constant is 1/\sqrt{2 (a^2 + b^2) \mp 2 b \sqrt{a^2 + b^2}}. Now we can call these {\lvert {1} \rangle}, and {\lvert {2} \rangle} but what does the last part of the question mean? What’s meant by \sigma_x = +1?

Asking the prof about this, he says:

“I think it means that the result of a measurement of the x component of spin is +1. This corresponds to the eigenvalue of \sigma_x being +1. The spin operator S_x has eigenvalue +\hbar/2”.

Aside: Question to consider later. Is it significant that {\langle {1} \rvert} \sigma_x {\lvert {1} \rangle} = {\langle {2} \rvert} \sigma_x {\lvert {2} \rangle} = 0?

So, how do we translate this into a mathematical statement?

First let’s recall a couple of details. Recall that the x spin operator has the matrix representation

\begin{aligned}\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.21)

This has eigenvalues \pm 1, with eigenstates (1,\pm 1)/\sqrt{2}. At the point when the x component spin is observed to be +1, the state of the system was then

\begin{aligned}{\lvert {x+} \rangle} =\frac{1}{{\sqrt{2}}}\begin{bmatrix}1 \\ 1\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.22)

Let’s look at the ways that this state can be formed as linear combinations of our states {\lvert {1} \rangle}, and {\lvert {2} \rangle}. That is

\begin{aligned}\frac{1}{{\sqrt{2}}}\begin{bmatrix}1 \\ 1\end{bmatrix}&=\alpha {\lvert {1} \rangle}+ \beta {\lvert {2} \rangle},\end{aligned} \hspace{\stretch{1}}(3.23)

or


\begin{aligned}\begin{bmatrix}1 \\ 1\end{bmatrix}&=\frac{\alpha}{\sqrt{(a^2 + b^2) - b \sqrt{a^2 + b^2}}}\begin{bmatrix}i a \\ b - \sqrt{a^2 + b^2}\end{bmatrix}+\frac{\beta}{\sqrt{(a^2 + b^2) + b \sqrt{a^2 + b^2}}}\begin{bmatrix}i a \\ b + \sqrt{a^2 + b^2}\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.24)

Letting c = \sqrt{a^2 + b^2}, this is

\begin{aligned}\begin{bmatrix}1 \\ 1\end{bmatrix}&=\frac{\alpha}{\sqrt{c^2 - b c}}\begin{bmatrix}i a \\ b - c\end{bmatrix}+\frac{\beta}{\sqrt{c^2 + b c}}\begin{bmatrix}i a \\ b + c\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.25)

We can solve the \alpha and \beta with Cramer’s rule, yielding

\begin{aligned}\begin{vmatrix}1 & i a \\ 1 & b - c\end{vmatrix}&=\frac{\beta}{\sqrt{c^2 + b c}}\begin{vmatrix}i a  & i a \\ b + c & b - c\end{vmatrix} \\ \begin{vmatrix}1 & i a \\ 1 & b + c\end{vmatrix}&=\frac{\alpha}{\sqrt{c^2 - b c}}\begin{vmatrix}i a  & i a \\ b - c & b + c\end{vmatrix},\end{aligned}

so that


\begin{aligned}\alpha &= \frac{(b + c - ia)\sqrt{c^2 - b c}}{2 i a c} \\  \beta &= \frac{(b - c - ia)\sqrt{c^2 + b c}}{-2 i a c} \end{aligned} \hspace{\stretch{1}}(3.26)

It is {\left\lvert{\alpha}\right\rvert}^2 and {\left\lvert{\beta}\right\rvert}^2 that are probabilities, and after a bit of algebra we find that those are

\begin{aligned}{\left\lvert{\alpha}\right\rvert}^2 = {\left\lvert{\beta}\right\rvert}^2 = \frac{1}{{2}},\end{aligned} \hspace{\stretch{1}}(3.28)

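so if the x spin of the system is measured as +1, we have a 50% probability of subsequently measuring either of the states {\lvert {1} \rangle} or {\lvert {2} \rangle}.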

Is that what the question was asking? I think that I’ve actually got it backwards. I think that the question was asking for the probability of finding state {\lvert {x+} \rangle} (measuring a spin 1 value for \sigma_x) given the state {\lvert {1} \rangle} or {\lvert {2} \rangle}.

So, suppose that we have

\begin{aligned}\mu_{+} {\lvert {x+} \rangle} + \nu_{+} {\lvert {x-} \rangle} &= {\lvert {1} \rangle} \\ \mu_{-} {\lvert {x+} \rangle} + \nu_{-} {\lvert {x-} \rangle} &= {\lvert {2} \rangle},\end{aligned} \hspace{\stretch{1}}(3.29)

or (considering both cases simultaneously),

\begin{aligned}\mu_{\pm}\begin{bmatrix}1 \\ 1\end{bmatrix}+ \nu_{\pm}\begin{bmatrix}1 \\ -1\end{bmatrix}&= \frac{1}{{\sqrt{ c^2 \mp b c }}} \begin{bmatrix}i a \\ b \mp c\end{bmatrix} \\ \implies \\ \mu_{\pm}\begin{vmatrix}1 & 1 \\ 1 & -1\end{vmatrix}&= \frac{1}{{\sqrt{ c^2 \mp b c }}} \begin{vmatrix}i a & 1 \\ b \mp c & -1\end{vmatrix},\end{aligned}

so


\begin{aligned}\mu_{\pm} &= \frac{ia + b \mp c}{2 \sqrt{c^2 \mp bc}} .\end{aligned} \hspace{\stretch{1}}(3.31)

Unsurprisingly, this mirrors the previous scenario and we find that we have a probability {\left\lvert{\mu}\right\rvert}^2 = 1/2 of measuring a spin 1 value for \sigma_x when the state of the operator A has been measured as \pm \sqrt{a^2 + b^2} (ie: in the states {\lvert {1} \rangle}, or {\lvert {2} \rangle} respectively).

No measurement of the operator A = a \sigma_y + b\sigma_z gives a biased prediction of the state of \sigma_x. Loosely, this seems to justify calling these operators orthogonal. This is consistent with the geometrical antisymmetric nature of the spin components where we have \sigma_y \sigma_x = -\sigma_x \sigma_y, just like two orthogonal vectors under the Clifford product.
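The 1/2 probabilities can be spot checked numerically. Here is a minimal sketch, assuming real a and b with a \ne 0; the particular values and the helper name eigenstate are hypothetical, chosen only for illustration.

```python
import math

a, b = 1.3, -0.4          # arbitrary real test values (assumption: a != 0)
c = math.hypot(a, b)      # c = sqrt(a^2 + b^2)
A = [[b, -1j * a], [1j * a, -b]]

def eigenstate(sign):
    """Normalized eigenvector of 3.20 for eigenvalue sign * c (sign = +1 or -1)."""
    v = [1j * a, b - sign * c]
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return [v[0] / norm, v[1] / norm]

results = {}
for sign in (+1, -1):
    v = eigenstate(sign)
    # Residual of the eigenvalue equation A v = (sign c) v.
    Av = [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]
    residual = max(abs(Av[k] - sign * c * v[k]) for k in range(2))
    # Probability of finding |x+> = (1, 1)/sqrt(2) in this eigenstate.
    amplitude = (v[0] + v[1]) / math.sqrt(2)
    results[sign] = (residual, abs(amplitude) ** 2)

print(results)
```

Both probabilities come out at one half for any such a and b that I tried, which matches the algebra for {\left\lvert{\mu_\pm}\right\rvert}^2.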

Problem 3.


Obtain the expectation values of S_x, S_y, S_z for the case of a spin 1/2 particle with the spin pointed in the direction of a vector with azimuthal angle \beta and polar angle \alpha.


Let’s work with \sigma_k instead of S_k to eliminate the \hbar/2 factors. Before considering the expectation values in the arbitrary spin orientation, let’s consider just the expectation values for \sigma_k. Introducing a matrix representation (assumed normalized) for a reference state

\begin{aligned}{\lvert {\psi} \rangle} &= \begin{bmatrix}a \\ b\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.32)

we find

\begin{aligned}{\langle {\psi} \rvert} \sigma_x {\lvert {\psi} \rangle}&=\begin{bmatrix}a^{*} & b^{*}\end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}= a^{*} b + b^{*} a\\ {\langle {\psi} \rvert} \sigma_y {\lvert {\psi} \rangle}&=\begin{bmatrix}a^{*} & b^{*}\end{bmatrix}\begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}= - i a^{*} b + i b^{*} a \\ {\langle {\psi} \rvert} \sigma_z {\lvert {\psi} \rangle}&=\begin{bmatrix}a^{*} & b^{*}\end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}= a^{*} a - b^{*} b \end{aligned} \hspace{\stretch{1}}(3.33)

Each of these expectation values is real, as expected due to the Hermitian nature of \sigma_k. We also find that

\begin{aligned}\sum_{k=1}^3 {{\langle {\psi} \rvert} \sigma_k {\lvert {\psi} \rangle}}^2 &= ({\left\lvert{a}\right\rvert}^2 + {\left\lvert{b}\right\rvert}^2)^2 = 1\end{aligned} \hspace{\stretch{1}}(3.36)
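The sum of squares in 3.36 can be spot checked for a sample state. This is a sketch only, with a hypothetical normalized state (a, b) of my own choosing and a helper named pauli_expectations that simply encodes the three results of 3.33.

```python
import cmath

def pauli_expectations(a, b):
    """Return (<sigma_x>, <sigma_y>, <sigma_z>) for the state (a, b), per 3.33."""
    sx = (a.conjugate() * b + b.conjugate() * a).real
    sy = (-1j * a.conjugate() * b + 1j * b.conjugate() * a).real
    sz = abs(a) ** 2 - abs(b) ** 2
    return sx, sy, sz

# Hypothetical normalized state: |a|^2 + |b|^2 = 0.36 + 0.64 = 1.
a = 0.6 * cmath.exp(0.3j)
b = 0.8 * cmath.exp(-1.1j)

sx, sy, sz = pauli_expectations(a, b)
norm_squared = sx ** 2 + sy ** 2 + sz ** 2
print(sx, sy, sz, norm_squared)
```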

So a vector formed with the expectation values as components is a unit vector. This doesn’t seem too unexpected from the section on the projection operators in the text where it was stated that {\langle {\chi} \rvert} \boldsymbol{\sigma} {\lvert {\chi} \rangle} = \mathbf{p}, where \mathbf{p} was a unit vector, and this seems similar. Let’s now consider the arbitrarily oriented spin vector \boldsymbol{\sigma} \cdot \mathbf{n}, and look at its expectation value.

With \mathbf{n} as the rotated image of \hat{\mathbf{z}} by an azimuthal angle \beta, and polar angle \alpha, we have

\begin{aligned}\mathbf{n} = (\sin\alpha \cos\beta,\sin\alpha \sin\beta,\cos\alpha)\end{aligned} \hspace{\stretch{1}}(3.37)

that is

\begin{aligned}\boldsymbol{\sigma} \cdot \mathbf{n} &= \sin\alpha \cos\beta \sigma_x + \sin\alpha \sin\beta \sigma_y + \cos\alpha \sigma_z \end{aligned} \hspace{\stretch{1}}(3.38)

The k = x,y,z projections of this operator

\begin{aligned}\frac{1}{{2}} \text{Tr} \left( \sigma_k (\boldsymbol{\sigma} \cdot \mathbf{n}) \right) \sigma_k\end{aligned} \hspace{\stretch{1}}(3.39)

are just the Pauli matrices scaled by the components of \mathbf{n}

\begin{aligned}\frac{1}{{2}} \text{Tr} \left( \sigma_x (\boldsymbol{\sigma} \cdot \mathbf{n}) \right) \sigma_x &= \sin\alpha \cos\beta \sigma_x  \\ \frac{1}{{2}} \text{Tr} \left( \sigma_y (\boldsymbol{\sigma} \cdot \mathbf{n}) \right) \sigma_y &= \sin\alpha \sin\beta \sigma_y  \\ \frac{1}{{2}} \text{Tr} \left( \sigma_z (\boldsymbol{\sigma} \cdot \mathbf{n}) \right) \sigma_z &= \cos\alpha \sigma_z,\end{aligned} \hspace{\stretch{1}}(3.40)

so our S_k expectation values are by inspection

\begin{aligned}{\langle {\psi} \rvert} S_x {\lvert {\psi} \rangle} &= \frac{\hbar}{2} \sin\alpha \cos\beta ( a^{*} b + b^{*} a ) \\ {\langle {\psi} \rvert} S_y {\lvert {\psi} \rangle} &= \frac{\hbar}{2} \sin\alpha \sin\beta ( - i a^{*} b + i b^{*} a ) \\ {\langle {\psi} \rvert} S_z {\lvert {\psi} \rangle} &= \frac{\hbar}{2} \cos\alpha ( a^{*} a - b^{*} b )\end{aligned} \hspace{\stretch{1}}(3.43)

Is this correct? While (\boldsymbol{\sigma} \cdot \mathbf{n})^2 = \mathbf{n}^2 = I is a unit norm operator, we find that the expectation values of the coordinates of \boldsymbol{\sigma} \cdot \mathbf{n} cannot be viewed as the coordinates of a unit vector. Let’s consider a specific case, with \mathbf{n} = (0,0,1), where the spin is oriented along the z-axis. That gives us

\begin{aligned}\boldsymbol{\sigma} \cdot \mathbf{n} = \sigma_z\end{aligned} \hspace{\stretch{1}}(3.46)

so the expectation values of S_k are

\begin{aligned}\left\langle{{S_x}}\right\rangle &= 0 \\ \left\langle{{S_y}}\right\rangle &= 0 \\ \left\langle{{S_z}}\right\rangle &= \frac{\hbar}{2} ( a^{*} a - b^{*} b )\end{aligned} \hspace{\stretch{1}}(3.47)

Given this it seems reasonable that from 3.43 we find

\begin{aligned}\sum_k {{\langle {\psi} \rvert} S_k {\lvert {\psi} \rangle}}^2 \ne \hbar^2/4,\end{aligned} \hspace{\stretch{1}}(3.50)

(since we don’t have any reason to believe that in general ( a^{*} a - b^{*} b )^2 = 1 is true).

The most general statement we can make about these expectation values (an average observed value for the measurement of the operator) is that

\begin{aligned}{\left\lvert{\left\langle{{S_k}}\right\rangle}\right\rvert} \le \frac{\hbar}{2} \end{aligned} \hspace{\stretch{1}}(3.51)

with equality for specific states and orientations only.

Problem 4.


Take the azimuthal angle, \beta = 0, so that the spin is in the x-z plane at an angle \alpha with respect to the z-axis, and the unit vector is \mathbf{n} = (\sin\alpha, 0, \cos\alpha). Write

\begin{aligned}{\lvert {\chi_{n+}} \rangle} = {\lvert {+\alpha} \rangle}\end{aligned} \hspace{\stretch{1}}(3.52)

for this case. Show that the probability that it is in the spin-up state in the direction \theta with respect to the z-axis is

\begin{aligned}{\left\lvert{ \left\langle{{+\theta}} \vert {{+\alpha}}\right\rangle }\right\rvert}^2 = \cos^2 \frac{\alpha - \theta}{2}\end{aligned} \hspace{\stretch{1}}(3.53)

Also obtain the expectation value of \boldsymbol{\sigma} \cdot \mathbf{n} with respect to the state {\lvert {+\theta} \rangle}.


For this orientation we have

\begin{aligned}\boldsymbol{\sigma} \cdot \mathbf{n}&=\sin\alpha \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} + \cos\alpha \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}=\begin{bmatrix}\cos\alpha & \sin\alpha \\ \sin\alpha & -\cos\alpha\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.54)

Confirmation that our eigenvalues are \pm 1 is simple, and the eigenstate for the +1 eigenvalue is found to be

\begin{aligned}{\lvert {+\alpha} \rangle} \propto \begin{bmatrix}\sin\alpha \\ 1 - \cos\alpha\end{bmatrix}= \begin{bmatrix}2 \sin\alpha/2 \cos\alpha/2 \\ 2 \sin^2 \alpha/2\end{bmatrix}\propto\begin{bmatrix}\cos \alpha/2 \\ \sin\alpha/2 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.55)

This last has unit norm, so we can write

\begin{aligned}{\lvert {+\alpha} \rangle} =\begin{bmatrix}\cos \alpha/2 \\ \sin\alpha/2 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.56)

If the state has been measured to be

\begin{aligned}{\lvert {\phi} \rangle} = 1 {\lvert {+\alpha} \rangle} + 0 {\lvert {-\alpha} \rangle},\end{aligned} \hspace{\stretch{1}}(3.57)

then the probability of a second measurement obtaining {\lvert {+\theta} \rangle} is

\begin{aligned}{\left\lvert{ \left\langle{{+\theta}} \vert {{\phi}}\right\rangle }\right\rvert}^2&={\left\lvert{ \left\langle{{+\theta}} \vert {{+\alpha}}\right\rangle }\right\rvert}^2 .\end{aligned} \hspace{\stretch{1}}(3.58)

Expanding just the inner product first we have

\begin{aligned}\left\langle{{+\theta}} \vert {{+\alpha}}\right\rangle &=\begin{bmatrix}C_{\theta/2} & S_{\theta/2} \end{bmatrix}\begin{bmatrix}C_{\alpha/2} \\  S_{\alpha/2} \end{bmatrix} \\ &=S_{\theta/2} S_{\alpha/2} + C_{\theta/2} C_{\alpha/2}  \\ &= \cos\left( \frac{\theta - \alpha}{2} \right)\end{aligned}

So our probability of measuring spin up state {\lvert {+\theta} \rangle} given the state was known to have been in spin up state {\lvert {+\alpha} \rangle} is

\begin{aligned}{\left\lvert{ \left\langle{{+\theta}} \vert {{+\alpha}}\right\rangle }\right\rvert}^2 = \cos^2\left( \frac{\theta - \alpha}{2} \right)\end{aligned} \hspace{\stretch{1}}(3.59)

Finally, the expectation value for \boldsymbol{\sigma} \cdot \mathbf{n} with respect to {\lvert {+\theta} \rangle} is

\begin{aligned}\begin{bmatrix}C_{\theta/2} & S_{\theta/2} \end{bmatrix}\begin{bmatrix}C_\alpha & S_\alpha \\ S_\alpha & -C_\alpha\end{bmatrix}\begin{bmatrix}C_{\theta/2} \\ S_{\theta/2} \end{bmatrix} &=\begin{bmatrix}C_{\theta/2} & S_{\theta/2} \end{bmatrix}\begin{bmatrix}C_\alpha C_{\theta/2} + S_\alpha S_{\theta/2} \\ S_\alpha C_{\theta/2} - C_\alpha S_{\theta/2} \end{bmatrix} \\ &=C_{\theta/2} C_\alpha C_{\theta/2} + C_{\theta/2} S_\alpha S_{\theta/2} + S_{\theta/2} S_\alpha C_{\theta/2} - S_{\theta/2} C_\alpha S_{\theta/2} \\ &=C_\alpha ( C_{\theta/2}^2 -S_{\theta/2}^2 )+ 2 S_\alpha S_{\theta/2} C_{\theta/2} \\ &= C_\alpha C_\theta+ S_\alpha S_\theta \\ &= \cos( \alpha - \theta )\end{aligned}

Sanity checking this we observe that we have +1 as desired for the \alpha = \theta case.
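A broader sanity check than the \alpha = \theta case is to evaluate both results for arbitrary angles. The following is a sketch under the assumption that the spin-up states are those of 3.56; the function names and test angles are mine.

```python
import math

def spin_up(angle):
    """Spin-up eigenstate of sigma.n for n in the x-z plane at `angle` from z (3.56)."""
    return [math.cos(angle / 2), math.sin(angle / 2)]

def sigma_n(angle):
    """Matrix of sigma.n from 3.54."""
    return [[math.cos(angle), math.sin(angle)],
            [math.sin(angle), -math.cos(angle)]]

alpha, theta = 1.1, 0.4  # arbitrary test angles in radians
up_alpha, up_theta = spin_up(alpha), spin_up(theta)

# |<+theta|+alpha>|^2; the states are real, so no conjugation is needed.
probability = (up_theta[0] * up_alpha[0] + up_theta[1] * up_alpha[1]) ** 2

# <+theta| sigma.n |+theta>
m = sigma_n(alpha)
mv = [m[0][0] * up_theta[0] + m[0][1] * up_theta[1],
      m[1][0] * up_theta[0] + m[1][1] * up_theta[1]]
expectation = up_theta[0] * mv[0] + up_theta[1] * mv[1]

print(probability, math.cos((alpha - theta) / 2) ** 2)
print(expectation, math.cos(alpha - theta))
```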

Problem 5.


Consider an arbitrary density matrix, \rho, for a spin 1/2 system. Express each matrix element in terms of the ensemble averages [S_i] where i = x,y,z.


Let’s omit the spin direction temporarily and write for the density matrix

\begin{aligned}\rho &= w_{+} {\lvert {+} \rangle}{\langle {+} \rvert}+w_{-} {\lvert {-} \rangle}{\langle {-} \rvert} \\ &=w_{+} {\lvert {+} \rangle}{\langle {+} \rvert}+(1 - w_{+}){\lvert {-} \rangle}{\langle {-} \rvert} \\ &={\lvert {-} \rangle}{\langle {-} \rvert} +w_{+} ({\lvert {+} \rangle}{\langle {+} \rvert} -{\lvert {-} \rangle}{\langle {-} \rvert})\end{aligned}

For the ensemble average (no sum over repeated indexes) we have

\begin{aligned}[S] = \left\langle{{S}}\right\rangle_{av} &= w_{+} {\langle {+} \rvert} S {\lvert {+} \rangle} +w_{-} {\langle {-} \rvert} S {\lvert {-} \rangle} \\ &= \frac{\hbar}{2}( w_{+} -w_{-} ) \\ &= \frac{\hbar}{2}( w_{+} -(1 - w_{+}) ) \\ &= \hbar w_{+} - \frac{\hbar}{2}\end{aligned}

This gives us

\begin{aligned}w_{+} = \frac{1}{{\hbar}} [S] + \frac{1}{{2}}\end{aligned}

and our density matrix becomes

\begin{aligned}\rho &=\frac{1}{{2}} ( {\lvert {+} \rangle}{\langle {+} \rvert} +{\lvert {-} \rangle}{\langle {-} \rvert} )+\frac{1}{{\hbar}} [S] ({\lvert {+} \rangle}{\langle {+} \rvert} -{\lvert {-} \rangle}{\langle {-} \rvert}) \\ &=\frac{1}{{2}} I+\frac{1}{{\hbar}} [S] ({\lvert {+} \rangle}{\langle {+} \rvert} -{\lvert {-} \rangle}{\langle {-} \rvert})\end{aligned}

Now picking specific spin directions, the eigenstates along each axis are


\begin{aligned}{\lvert {x+} \rangle} &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 \\ 1\end{bmatrix} \\ {\lvert {x-} \rangle} &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 \\ -1\end{bmatrix} \\ {\lvert {y+} \rangle} &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 \\ i\end{bmatrix} \\ {\lvert {y-} \rangle} &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 \\ -i\end{bmatrix} \\ {\lvert {z+} \rangle} &= \begin{bmatrix}1 \\ 0\end{bmatrix} \\ {\lvert {z-} \rangle} &= \begin{bmatrix}0 \\ 1\end{bmatrix}\end{aligned}

We can easily find

\begin{aligned}{\lvert {x+} \rangle}{\langle {x+} \rvert} -{\lvert {x-} \rangle}{\langle {x-} \rvert} &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} = \sigma_x \\ {\lvert {y+} \rangle}{\langle {y+} \rvert} -{\lvert {y-} \rangle}{\langle {y-} \rvert} &= \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} = \sigma_y \\ {\lvert {z+} \rangle}{\langle {z+} \rvert} -{\lvert {z-} \rangle}{\langle {z-} \rvert} &= \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} = \sigma_z\end{aligned}

So we can write the density matrix in terms of any of the ensemble averages as

\begin{aligned}\rho =\frac{1}{{2}} I+\frac{1}{{\hbar}} [S_i] \sigma_i=\frac{1}{{2}} (I + [\sigma_i] \sigma_i )\end{aligned}

Alternatively, defining \mathbf{P}_i = [\sigma_i] \mathbf{e}_i, for any of the directions i = 1,2,3 we can write

\begin{aligned}\rho = \frac{1}{{2}} (I + \boldsymbol{\sigma} \cdot \mathbf{P}_i )\end{aligned} \hspace{\stretch{1}}(3.60)

In equation (5.109) we had a similar result in terms of the polarization vector \mathbf{P} = {\langle {\alpha} \rvert} \boldsymbol{\sigma} {\lvert {\alpha} \rangle}, and the individual weights w_\alpha, and w_\beta, but we see here that this (w_\alpha - w_\beta)\mathbf{P} factor can be written exclusively in terms of the ensemble average. Actually, this is also a result in the text, down in (5.113), but we see it here in a more concrete form having picked specific spin directions.
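The projector differences and the resulting density matrix form can be checked numerically for each axis. This is a sketch only, taking {\lvert {y\pm} \rangle} \propto (1, \pm i) and a hypothetical weight w_{+} = 0.8; the names outer, eigenstates and paulis are my own.

```python
import math

def outer(v):
    """Projector |v><v| for a two-component ket v."""
    return [[complex(v[i]) * complex(v[j]).conjugate() for j in range(2)]
            for i in range(2)]

r = 1 / math.sqrt(2)
eigenstates = {
    "x": ([r, r], [r, -r]),
    "y": ([r, 1j * r], [r, -1j * r]),
    "z": ([1, 0], [0, 1]),
}
paulis = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

w_plus = 0.8                 # hypothetical weight of the spin-up state
avg_sigma = 2 * w_plus - 1   # ensemble average [sigma_i] = w_+ - w_-

max_dev = 0.0
for axis, (plus, minus) in eigenstates.items():
    p_plus, p_minus = outer(plus), outer(minus)
    sigma = paulis[axis]
    for i in range(2):
        for j in range(2):
            # |+><+| - |-><-| should reproduce the Pauli matrix.
            projector_diff = p_plus[i][j] - p_minus[i][j]
            # The weighted mixture should equal (I + [sigma_i] sigma_i)/2.
            rho = w_plus * p_plus[i][j] + (1 - w_plus) * p_minus[i][j]
            formula = 0.5 * ((1 if i == j else 0) + avg_sigma * sigma[i][j])
            max_dev = max(max_dev,
                          abs(projector_diff - sigma[i][j]),
                          abs(rho - formula))
print(max_dev)
```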

Problem 6.


If a Hamiltonian is given by \boldsymbol{\sigma} \cdot \mathbf{n} where \mathbf{n} = (\sin\alpha\cos\beta, \sin\alpha\sin\beta, \cos\alpha), determine the time evolution operator as a 2 x 2 matrix. If a state at t = 0 is given by

\begin{aligned}{\lvert {\phi(0)} \rangle} = \begin{bmatrix}a \\ b\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.61)

then obtain {\lvert {\phi(t)} \rangle}.


Before diving into the meat of the problem, observe that a tidy factorization of the Hamiltonian is possible as a composition of rotations. That is

\begin{aligned}H &= \boldsymbol{\sigma} \cdot \mathbf{n} \\ &= \sin\alpha \sigma_1 ( \cos\beta + \sigma_1 \sigma_2 \sin\beta ) + \cos\alpha \sigma_3 \\ &= \sigma_3 \left(\cos\alpha + \sin\alpha \sigma_3 \sigma_1 e^{ i \sigma_3 \beta }\right) \\ &= \sigma_3 \exp\left( \alpha i \sigma_2 \exp\left( \beta i \sigma_3 \right)\right)\end{aligned}

So we have for the time evolution operator

\begin{aligned}U(\Delta t) &=\exp( -i \Delta t H /\hbar )= \exp \left(- \frac{\Delta t}{\hbar} i \sigma_3 \exp\Bigl( \alpha i \sigma_2 \exp\left( \beta i \sigma_3 \right)\Bigr)\right).\end{aligned} \hspace{\stretch{1}}(3.62)

Does this really help? I guess not, but it is nice and tidy.

Returning to the specifics of the problem, we note that squaring the Hamiltonian produces the identity matrix

\begin{aligned}(\boldsymbol{\sigma} \cdot \mathbf{n})^2 &= I \mathbf{n}^2 = I.\end{aligned} \hspace{\stretch{1}}(3.63)

This allows us to exponentiate H by inspection utilizing

\begin{aligned}e^{i \mu (\boldsymbol{\sigma} \cdot \mathbf{n}) } = I \cos\mu + i (\boldsymbol{\sigma} \cdot \mathbf{n}) \sin\mu\end{aligned} \hspace{\stretch{1}}(3.64)

Writing \sin\mu = S_\mu, and \cos\mu = C_\mu, we have

\begin{aligned}\boldsymbol{\sigma} \cdot \mathbf{n} &=\begin{bmatrix}C_\alpha & S_\alpha e^{-i\beta} \\ S_\alpha e^{i\beta} & -C_\alpha\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.65)

and thus

\begin{aligned}U(\Delta t) = \exp( -i \Delta t H /\hbar )=\begin{bmatrix}C_{\Delta t/\hbar} -i S_{\Delta t/\hbar} C_\alpha & -i S_{\Delta t/\hbar} S_\alpha e^{-i\beta} \\ -i S_{\Delta t/\hbar} S_\alpha e^{i\beta} & C_{\Delta t/\hbar} + i S_{\Delta t/\hbar} C_\alpha\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.66)

Note that as a sanity check we can calculate that U(\Delta t) U(\Delta t)^\dagger = 1 as expected.
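That unitarity check, together with an independent comparison against a truncated power series for the exponential, can be sketched as follows. The symbol tau stands in for \Delta t/\hbar, and the test values and function names are assumptions of mine, not anything from the text.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sigma_dot_n(alpha, beta):
    """Matrix of 3.65 for n = (sin a cos b, sin a sin b, cos a)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [[ca, sa * cmath.exp(-1j * beta)], [sa * cmath.exp(1j * beta), -ca]]

def evolution_closed_form(tau, alpha, beta):
    """U of 3.66, with tau standing for Delta t / hbar."""
    h = sigma_dot_n(alpha, beta)
    c, s = math.cos(tau), math.sin(tau)
    return [[(c if i == j else 0) - 1j * s * h[i][j] for j in range(2)]
            for i in range(2)]

def evolution_series(tau, alpha, beta, terms=40):
    """Truncated power series for exp(-i tau H), used as an independent check."""
    h = sigma_dot_n(alpha, beta)
    x = [[-1j * tau * h[i][j] for j in range(2)] for i in range(2)]
    total = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, x)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

tau, alpha, beta = 0.9, 1.1, 0.4  # arbitrary test values
U = evolution_closed_form(tau, alpha, beta)
U_dagger = [[U[j][i].conjugate() for j in range(2)] for i in range(2)]
UUd = matmul(U, U_dagger)
unitarity_dev = max(abs(UUd[i][j] - (1 if i == j else 0))
                    for i in range(2) for j in range(2))
series = evolution_series(tau, alpha, beta)
series_dev = max(abs(U[i][j] - series[i][j])
                 for i in range(2) for j in range(2))
print(unitarity_dev, series_dev)
```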

Now for \Delta t = t, we have

\begin{aligned}U(t,0) \begin{bmatrix}a \\ b\end{bmatrix}&=\begin{bmatrix}a C_{t/\hbar} -a i S_{t/\hbar} C_\alpha  - b i S_{t/\hbar} S_\alpha e^{-i\beta} \\ -a i S_{t/\hbar} S_\alpha e^{i\beta} + b C_{t/\hbar} + b i S_{t/\hbar} C_\alpha\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.67)

It doesn’t seem terribly illuminating to multiply this all out, but we can factor the results slightly to tidy it up. That gives us

\begin{aligned}U(t,0) \begin{bmatrix}a \\ b\end{bmatrix}&=\cos(t/\hbar)\begin{bmatrix}a \\ b\end{bmatrix}+ i \sin(t/\hbar) \cos\alpha\begin{bmatrix}-a \\ b\end{bmatrix}- i\sin(t/\hbar) \sin\alpha\begin{bmatrix}b e^{-i\beta} \\ a e^{i \beta}\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.68)

Problem 7.


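Consider a system of spin 1/2 particles in a mixed ensemble containing a mixture of 25% with spin given by {\lvert {z+} \rangle} and 75% with spin given by {\lvert {x-} \rangle}. Determine the density matrix \rho and the ensemble averages \left\langle{{\sigma_i}}\right\rangle_{\text{av}} for i = x, y, z.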


We have

\begin{aligned}\rho &= \frac{1}{4} {\lvert {z+} \rangle}{\langle {z+} \rvert}+\frac{3}{4} {\lvert {x-} \rangle}{\langle {x-} \rvert} \\ &=\frac{1}{{4}} \begin{bmatrix}1 \\ 0\end{bmatrix}\begin{bmatrix}1 & 0\end{bmatrix}+\frac{3}{4} \frac{1}{{2}}\begin{bmatrix}1 \\ -1\end{bmatrix}\begin{bmatrix}1 & -1\end{bmatrix} \\ &=\frac{1}{{4}} \left(\frac{1}{{2}}\begin{bmatrix}2 & 0 \\ 0 & 0\end{bmatrix}+\frac{3}{2}\begin{bmatrix}1 & -1 \\ -1 & 1\end{bmatrix}\right) \\ \end{aligned}

Giving us

\begin{aligned}\rho =\frac{1}{{8}}\begin{bmatrix}5 & -3 \\ -3 & 3\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.69)

Note that we can also factor the identity out of this for

\begin{aligned}\rho &=\frac{1}{{2}}\begin{bmatrix}5/4 & -3/4 \\ -3/4 & 3/4\end{bmatrix}\\ &=\frac{1}{{2}}\left(I +\begin{bmatrix}1/4 & -3/4 \\ -3/4 & -1/4\end{bmatrix}\right)\end{aligned}

which is just:

\begin{aligned}\rho = \frac{1}{{2}} \left( I + \frac{1}{{4}} \sigma_z -\frac{3}{4} \sigma_x \right)\end{aligned} \hspace{\stretch{1}}(3.70)

Recall that the ensemble average is related to the trace of the density and operator product

\begin{aligned}\text{Tr}( \rho A )&=\sum_\beta {\langle {\beta} \rvert} \rho A {\lvert {\beta} \rangle} \\ &=\sum_{\beta} {\langle {\beta} \rvert} \left( \sum_\alpha w_\alpha {\lvert {\alpha} \rangle}{\langle {\alpha} \rvert} \right) A {\lvert {\beta} \rangle} \\ &=\sum_{\alpha, \beta} w_\alpha \left\langle{{\beta}} \vert {{\alpha}}\right\rangle{\langle {\alpha} \rvert} A {\lvert {\beta} \rangle} \\ &=\sum_{\alpha, \beta} w_\alpha {\langle {\alpha} \rvert} A {\lvert {\beta} \rangle} \left\langle{{\beta}} \vert {{\alpha}}\right\rangle\\ &=\sum_{\alpha} w_\alpha {\langle {\alpha} \rvert} A \left( \sum_\beta {\lvert {\beta} \rangle} {\langle {\beta} \rvert} \right) {\lvert {\alpha} \rangle}\\ &=\sum_\alpha w_\alpha {\langle {\alpha} \rvert} A {\lvert {\alpha} \rangle}\end{aligned}

But this, by definition of the ensemble average, is just

\begin{aligned}\text{Tr}( \rho A )&=\left\langle{{A}}\right\rangle_{\text{av}}.\end{aligned} \hspace{\stretch{1}}(3.71)

We can use this to compute the ensemble averages of the Pauli matrices

\begin{aligned}\left\langle{{\sigma_x}}\right\rangle_{\text{av}} &= \text{Tr} \left(\frac{1}{{8}}\begin{bmatrix}5 & -3 \\ -3 & 3\end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}\right) = -\frac{3}{4} \\ \left\langle{{\sigma_y}}\right\rangle_{\text{av}} &= \text{Tr} \left(\frac{1}{{8}}\begin{bmatrix}5 & -3 \\ -3 & 3\end{bmatrix}\begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix}\right) = 0 \\ \left\langle{{\sigma_z}}\right\rangle_{\text{av}} &= \text{Tr} \left(\frac{1}{{8}}\begin{bmatrix}5 & -3 \\ -3 & 3\end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}\right) = \frac{1}{4} \\ \end{aligned}

We can also find these without the explicit matrix multiplication, directly from 3.70

\begin{aligned}\left\langle{{\sigma_x}}\right\rangle_{\text{av}} &= \text{Tr} \frac{1}{{2}}\left(\sigma_x + \frac{1}{{4}} \sigma_z \sigma_x -\frac{3}{4} \sigma_x^2\right) = -\frac{3}{4} \\ \left\langle{{\sigma_y}}\right\rangle_{\text{av}} &= \text{Tr} \frac{1}{{2}}\left(\sigma_y + \frac{1}{{4}} \sigma_z \sigma_y -\frac{3}{4} \sigma_x \sigma_y\right) = 0 \\ \left\langle{{\sigma_z}}\right\rangle_{\text{av}} &= \text{Tr} \frac{1}{{2}}\left(\sigma_z + \frac{1}{{4}} \sigma_z^2 -\frac{3}{4} \sigma_x \sigma_z\right) = \frac{1}{{4}}.\end{aligned}

(where to do so we observe that \text{Tr} \sigma_i \sigma_j = 0 for i\ne j and \text{Tr} \sigma_i = 0, and \text{Tr} \sigma_i^2 = 2.)

We see that the traces of the density operator and Pauli matrix products act very much like dot products extracting out the ensemble averages, which end up very much like the magnitudes of the projections in each of the directions.
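The density matrix 3.69 and the three ensemble averages can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the mixture of 3.69 (weight 1/4 on {\lvert {z+} \rangle}, weight 3/4 on {\lvert {x-} \rangle}); the helper names are hypothetical.

```python
import math

def outer(v):
    """Projector |v><v| for a two-component ket v."""
    return [[complex(v[i]) * complex(v[j]).conjugate() for j in range(2)]
            for i in range(2)]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices stored as nested lists."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

r = 1 / math.sqrt(2)
p_z_plus = outer([1, 0])
p_x_minus = outer([r, -r])

# rho = (1/4)|z+><z+| + (3/4)|x-><x-|
rho = [[0.25 * p_z_plus[i][j] + 0.75 * p_x_minus[i][j] for j in range(2)]
       for i in range(2)]

paulis = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
averages = {axis: trace_of_product(rho, sigma) for axis, sigma in paulis.items()}

print([[8 * entry.real for entry in row] for row in rho])  # compare with 3.69
print(averages)
```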

Problem 8.


Show that the quantity \boldsymbol{\sigma} \cdot \mathbf{p} V(r) \boldsymbol{\sigma} \cdot \mathbf{p}, when simplified, has a term proportional to \mathbf{L} \cdot \boldsymbol{\sigma}.


Consider the operation

\begin{aligned}\boldsymbol{\sigma} \cdot \mathbf{p} V(r) \Psi&=- i \hbar \sigma_k \partial_k V(r) \Psi \\ &=- i \hbar \sigma_k (\partial_k V(r)) \Psi + V(r) (\boldsymbol{\sigma} \cdot \mathbf{p} ) \Psi  \\ \end{aligned}

With r = \sqrt{\sum_j x_j^2}, we have

\begin{aligned}\partial_k V(r) = \frac{1}{{2}}\frac{1}{{r}} 2 x_k \frac{\partial {V(r)}}{\partial {r}},\end{aligned}

which gives us the commutator

\begin{aligned}\left[{ \boldsymbol{\sigma} \cdot \mathbf{p}},{V(r)}\right]&=- \frac{i \hbar}{r} \frac{\partial {V(r)}}{\partial {r}} (\boldsymbol{\sigma} \cdot \mathbf{x}) \end{aligned} \hspace{\stretch{1}}(3.72)

Inserting this into the operator in question, we have

\begin{aligned}\boldsymbol{\sigma} \cdot \mathbf{p} V(r) \boldsymbol{\sigma} \cdot \mathbf{p} =- \frac{i \hbar}{r} \frac{\partial {V(r)}}{\partial {r}} (\boldsymbol{\sigma} \cdot \mathbf{x}) (\boldsymbol{\sigma} \cdot \mathbf{p} ) + V(r) (\boldsymbol{\sigma} \cdot \mathbf{p} )^2\end{aligned} \hspace{\stretch{1}}(3.73)

With a decomposition of (\boldsymbol{\sigma} \cdot \mathbf{x}) (\boldsymbol{\sigma} \cdot \mathbf{p} ) into symmetric and antisymmetric components, we should find our \boldsymbol{\sigma} \cdot \mathbf{L} in the second (antisymmetric) term

\begin{aligned}(\boldsymbol{\sigma} \cdot \mathbf{x}) (\boldsymbol{\sigma} \cdot \mathbf{p} )=\frac{1}{{2}} \left\{{\boldsymbol{\sigma} \cdot \mathbf{x}},{\boldsymbol{\sigma} \cdot \mathbf{p}}\right\}+\frac{1}{{2}} \left[{\boldsymbol{\sigma} \cdot \mathbf{x}},{\boldsymbol{\sigma} \cdot \mathbf{p}}\right]\end{aligned} \hspace{\stretch{1}}(3.74)

where we expect \boldsymbol{\sigma} \cdot \mathbf{L} \propto \left[{\boldsymbol{\sigma} \cdot \mathbf{x}},{\boldsymbol{\sigma} \cdot \mathbf{p}}\right]. Alternately in components

\begin{aligned}(\boldsymbol{\sigma} \cdot \mathbf{x}) (\boldsymbol{\sigma} \cdot \mathbf{p} )&=\sigma_k x_k \sigma_j p_j \\ &=x_k p_k I + \sum_{j\ne k} \sigma_k \sigma_j x_k p_j \\ &=x_k p_k I + i \sum_m \epsilon_{kjm} \sigma_m x_k p_j \\ &=I (\mathbf{x} \cdot \mathbf{p}) + i (\boldsymbol{\sigma} \cdot \mathbf{L})\end{aligned}
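The Pauli algebra part of this identity can be spot checked with ordinary numeric vectors. This sketch assumes commuting components (plain numbers standing in for \mathbf{x} and \mathbf{p}), so it only tests the \sigma_k \sigma_j = \delta_{kj} + i \epsilon_{kjm} \sigma_m step and says nothing about operator ordering; the vectors and helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sigma_dot(v):
    """Matrix of sigma . v for a numeric 3-vector v."""
    return [[v[2], v[0] - 1j * v[1]], [v[0] + 1j * v[1], -v[2]]]

def cross(u, v):
    """Ordinary cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

x = [0.3, -1.2, 0.5]   # stand-in for position (plain numbers)
p = [1.1, 0.7, -0.4]   # stand-in for momentum (plain numbers)

lhs = matmul(sigma_dot(x), sigma_dot(p))
x_dot_p = sum(x[k] * p[k] for k in range(3))
sigma_L = sigma_dot(cross(x, p))
rhs = [[(x_dot_p if i == j else 0) + 1j * sigma_L[i][j] for j in range(2)]
       for i in range(2)]

max_dev = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_dev)
```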

Problem 9.





[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

Posted in Math and Physics Learning.