Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘special relativity’

A final pre-exam update of my notes compilation for ‘PHY452H1S Basic Statistical Mechanics’, Taught by Prof. Arun Paramekanti

Posted by peeterjoot on April 22, 2013

Here’s my third update of my notes compilation for this course, including all of the following:

April 21, 2013 Fermi function expansion for thermodynamic quantities

April 20, 2013 Relativistic Fermi Gas

April 10, 2013 Non integral binomial coefficient

April 10, 2013 energy distribution around mean energy

April 09, 2013 Velocity volume element to momentum volume element

April 04, 2013 Phonon modes

April 03, 2013 BEC and phonons

April 03, 2013 Max entropy, fugacity, and Fermi gas

April 02, 2013 Bosons

April 02, 2013 Relativistic density of states

March 28, 2013 Bosons

plus everything detailed in the description of my previous update and before.

Posted in Math and Physics Learning. | 1 Comment »

Velocity volume element to momentum volume element

Posted by peeterjoot on April 9, 2013

[Click here for a PDF of this post with nicer formatting (especially if my latex to wordpress script has left FORMULA DOES NOT PARSE errors.)]


One of the problems I attempted had integrals over velocity space with volume element d^3\mathbf{u}. Initially I thought that I’d need a change of variables to momentum space, and calculated the corresponding momentum space volume element. Here’s that calculation.


We are working with a Hamiltonian

\begin{aligned}\epsilon = \sqrt{ (p c)^2 + \epsilon_0^2 },\end{aligned} \hspace{\stretch{1}}(1.1)

where the rest energy is

\begin{aligned}\epsilon_0 = m c^2.\end{aligned} \hspace{\stretch{1}}(1.2)

Hamilton’s equations give us

\begin{aligned}u_\alpha = \frac{ p_\alpha c^2 }{\epsilon},\end{aligned} \hspace{\stretch{1}}(1.3)

or

\begin{aligned}p_\alpha = \frac{ m u_\alpha }{\sqrt{1 - \mathbf{u}^2/c^2}}.\end{aligned} \hspace{\stretch{1}}(1.4)

This is enough to calculate the Jacobian for our volume element change of variables

\begin{aligned}du_x \wedge du_y \wedge du_z &= \frac{\partial(u_x, u_y, u_z)}{\partial(p_x, p_y, p_z)}dp_x \wedge dp_y \wedge dp_z \\ &= \frac{1}{{c^6 \left( { m^2 + (\mathbf{p}/c)^2 } \right)^{9/2}}}\begin{vmatrix}m^2 c^2 + p_y^2 + p_z^2 & - p_y p_x & - p_z p_x \\ -p_x p_y & m^2 c^2 + p_x^2 + p_z^2 & - p_z p_y \\ -p_x p_z & -p_y p_z & m^2 c^2 + p_x^2 + p_y^2\end{vmatrix}dp_x \wedge dp_y \wedge dp_z \\ &= m^2 \left( { m^2 + \mathbf{p}^2/c^2 } \right)^{-5/2}dp_x \wedge dp_y \wedge dp_z.\end{aligned} \hspace{\stretch{1}}(1.5)

That final simplification of the determinant was a little hairy, but yielded nicely to Mathematica.

Our final result for the velocity volume element in momentum space, in terms of the particle energy is

\begin{aligned}d^3 \mathbf{u} = \frac{c^6 \epsilon_0^2 } {\epsilon^5} d^3 \mathbf{p}.\end{aligned} \hspace{\stretch{1}}(1.6)
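As a sanity check on eq. 1.6 that does not rely on Mathematica, here is a minimal Python sketch that estimates the Jacobian determinant by central differences and compares it with the closed form. The function names, the step size `h`, and the sample values of `m`, `c` and `p` are illustrative assumptions, not anything from the original calculation.

```python
import math

def velocity(p, m, c):
    """u = p c^2 / eps, with eps = sqrt((p c)^2 + (m c^2)^2)."""
    eps = math.sqrt(sum(x * x for x in p) * c**2 + (m * c**2) ** 2)
    return [x * c**2 / eps for x in p]

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def jacobian_numeric(p, m, c, h=1e-6):
    """Central-difference estimate of det d(u_x,u_y,u_z)/d(p_x,p_y,p_z)."""
    rows = []
    for j in range(3):
        p_plus, p_minus = list(p), list(p)
        p_plus[j] += h
        p_minus[j] -= h
        u_plus = velocity(p_plus, m, c)
        u_minus = velocity(p_minus, m, c)
        # rows[j][i] approximates du_i/dp_j; the transpose has the same determinant
        rows.append([(a - b) / (2 * h) for a, b in zip(u_plus, u_minus)])
    return det3(rows)

def jacobian_closed_form(p, m, c):
    """c^6 eps_0^2 / eps^5, the factor appearing in eq. 1.6."""
    eps0 = m * c**2
    eps = math.sqrt(sum(x * x for x in p) * c**2 + eps0**2)
    return c**6 * eps0**2 / eps**5

if __name__ == "__main__":
    # Arbitrary sample point, with c != 1 so that the powers of c are exercised.
    sample = ([0.3, -0.7, 1.1], 1.5, 2.0)
    print(jacobian_numeric(*sample), jacobian_closed_form(*sample))
```

The two printed numbers should agree to several significant figures for any choice of momentum, which checks both the determinant simplification and the conversion to energy variables.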

Posted in Math and Physics Learning. | Leave a Comment »

An updated compilation of notes, for ‘PHY452H1S Basic Statistical Mechanics’, Taught by Prof. Arun Paramekanti

Posted by peeterjoot on March 27, 2013

Here’s my second update of my notes compilation for this course, including all of the following:

March 27, 2013 Fermi gas

March 26, 2013 Fermi gas thermodynamics

March 26, 2013 Fermi gas thermodynamics

March 23, 2013 Relativistic generalization of statistical mechanics

March 21, 2013 Kittel Zipper problem

March 18, 2013 Pathria chapter 4 diatomic molecule problem

March 17, 2013 Gibbs sum for a two level system

March 16, 2013 open system variance of N

March 16, 2013 probability forms of entropy

March 14, 2013 Grand Canonical/Fermion-Bosons

March 13, 2013 Quantum anharmonic oscillator

March 12, 2013 Grand canonical ensemble

March 11, 2013 Heat capacity of perturbed harmonic oscillator

March 10, 2013 Langevin small approximation

March 10, 2013 Addition of two one half spins

March 10, 2013 Midterm II reflection

March 07, 2013 Thermodynamic identities

March 06, 2013 Temperature

March 05, 2013 Interacting spin

plus everything detailed in the description of my first update and before.

Posted in Math and Physics Learning. | 1 Comment »

Relativistic generalization of statistical mechanics

Posted by peeterjoot on March 22, 2013

[Click here for a PDF of this post with nicer formatting (especially if my latex to wordpress script has left FORMULA DOES NOT PARSE errors.)]


I was wondering how to generalize the arguments of [1] to relativistic systems. Here’s a bit of blundering through the non-relativistic arguments of that text, tweaking them slightly.

I’m sure this has all been done before, but was a useful exercise to understand the non-relativistic arguments of Pathria better.

Generalizing from energy to four momentum

Generalizing the arguments of section 1.1.

Instead of considering that the total energy of the system is fixed, it makes sense that we’d have to instead consider the total four-momentum of the system fixed, so if we have N particles, we have a total four momentum

\begin{aligned}P = \sum_i n_i P_i = \sum_i n_i \left( \epsilon_i/c, \mathbf{p}_i \right),\end{aligned} \hspace{\stretch{1}}(1.2.1)

where n_i is the total number of particles with four momentum P_i. We can probably expect that the n_i’s in this relativistic system will be smaller than those in a non-relativistic system, since there are many more states available once each state is labeled by both a specific energy and a specific momentum, along with the combinatorics of those extra degrees of freedom. However, we’ll still have

\begin{aligned}N = \sum_i n_i.\end{aligned} \hspace{\stretch{1}}(1.2.2)

Only given a specific observer frame can these four-momentum components \left( \epsilon_i/c, \mathbf{p}_i \right) be expressed explicitly, as in

\begin{aligned}\epsilon_i = \gamma_i m_i c^2\end{aligned} \hspace{\stretch{1}}(1.0.3a)

\begin{aligned}\mathbf{p}_i = \gamma_i m_i \mathbf{v}_i\end{aligned} \hspace{\stretch{1}}(1.0.3b)

\begin{aligned}\gamma_i = \frac{1}{{\sqrt{1 - \mathbf{v}_i^2/c^2}}},\end{aligned} \hspace{\stretch{1}}(1.0.3c)

where \mathbf{v}_i is the velocity of the particle in that observer frame.

Generalizing the number of microstates, and notion of thermodynamic equilibrium

Generalizing the arguments of section 1.2.

We can still count the number of all possible microstates, but that number, denoted \Omega(N, V, E), for a given total energy needs to be parameterized differently. First off, any given volume is observer dependent, so we likely need to map

\begin{aligned}V \rightarrow \int d^4 x = \int dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3.\end{aligned} \hspace{\stretch{1}}(1.0.4)

Let’s still call this V, but know that we mean this to be a four-volume element, bounded in both space and time, referred to a fixed observer’s frame. So, let’s write the total number of microstates as

\begin{aligned}\Omega(N, V, P) = \Omega \left( N, \int d^4 x, E/c, P^1, P^2, P^3 \right),\end{aligned} \hspace{\stretch{1}}(1.0.5)

where P = ( E/c, \mathbf{P} ) is the total four momentum of the system. Suppose we have a system subdivided into two systems in contact as in fig. 1.1, where the two systems have total four momentum P_1 and P_2 respectively.

Fig 1.1: Two physical systems in thermal contact


In the text the total energy of both systems was written

\begin{aligned}E^{(0)} = E_1 + E_2,\end{aligned} \hspace{\stretch{1}}(1.0.6)

so we’ll write

\begin{aligned}{P^{(0)}}^\mu = P_1^\mu + P_2^\mu = \text{constant},\end{aligned} \hspace{\stretch{1}}(1.0.7)

so that the total number of microstates of the combined system is now

\begin{aligned}\Omega^{(0)}(P_1, P_2) = \Omega_1(P_1) \Omega_2(P_2).\end{aligned} \hspace{\stretch{1}}(1.0.8)

As before, if \bar{{P}}^\mu_i denotes an equilibrium value of P_i^\mu, then maximizing eq. 1.0.8 requires that all of the following derivatives vanish (no sum over \mu here)

\begin{aligned}\left({\partial {\Omega_1(P_1)}}/{\partial {P^\mu_1}}\right)_{{P_1 = \bar{{P_1}}}}\Omega_2(\bar{{P}}_2)+\Omega_1(\bar{{P}}_1)\left({\partial {\Omega_2(P_2)}}/{\partial {P^\mu_2}}\right)_{{P_2 = \bar{{P_2}}}}\times\frac{\partial {P_2^\mu}}{\partial {P_1^\mu}}= 0.\end{aligned} \hspace{\stretch{1}}(1.0.9)

With each of the components of the total four-momentum P^\mu_1 + P^\mu_2 separately constant, we have {\partial {P_2^\mu}}/{\partial {P_1^\mu}} = -1, so that we have

\begin{aligned}\left({\partial {\ln \Omega_1(P_1)}}/{\partial {P^\mu_1}}\right)_{{P_1 = \bar{{P_1}}}}=\left({\partial {\ln \Omega_2(P_2)}}/{\partial {P^\mu_2}}\right)_{{P_2 = \bar{{P_2}}}},\end{aligned} \hspace{\stretch{1}}(1.0.10)

as before. However, we now have one such identity for each component of the total four momentum P which has been held constant. Let’s now define

\begin{aligned}\beta_\mu \equiv \left({\partial {\ln \Omega(N, V, P)}}/{\partial {P^\mu}}\right)_{{N, V, P = \bar{{P}}}}.\end{aligned} \hspace{\stretch{1}}(1.0.11)

Our old scalar temperature is then

\begin{aligned}\beta_0 = c \left({\partial {\ln \Omega(N, V, P)}}/{\partial {E}}\right)_{{N, V, P = \bar{{P}}}} = c \beta = \frac{c}{k_{\mathrm{B}} T},\end{aligned} \hspace{\stretch{1}}(1.0.12)

but now we have three additional such constants to figure out what to do with. A first start would be figuring out how the Boltzmann probabilities should be generalized.

Equilibrium between a system and a heat reservoir

Generalizing the arguments of section 3.1.

As in the text, let’s consider a very large heat reservoir A' and a subsystem A as in fig. 1.2 that has come to a state of mutual equilibrium. This likely needs to be defined as a state in which the four vector \beta_\mu is common, as opposed to just \beta_0 the temperature field being common.

Fig 1.2: A system A immersed in heat reservoir A’


Let the four momentum of the heat reservoir be P_r' with P_r for the subsystem, so that

\begin{aligned}P_r + P_r' = P^{(0)} = \text{constant}.\end{aligned} \hspace{\stretch{1}}(1.0.13)

As in the text, the probability of finding the subsystem in state r (which the text also writes as P_r, not to be confused with the four momentum here) is proportional to the number of microstates available to the reservoir. Write

\begin{aligned}\Omega'({P^\mu_r}') = \Omega'({P^{(0)}}^\mu - {P^\mu_r}),\end{aligned} \hspace{\stretch{1}}(1.0.14)

for the number of microstates in the reservoir, so that a Taylor expansion of the logarithm around P_r' = P^{(0)} (with sums implied) is

\begin{aligned}\ln \Omega'({P^\mu_r}') \approx \ln \Omega'({P^{(0)}}) +\left({\partial {\ln \Omega'}}/{\partial {{P^\mu}'}}\right)_{{P' = P^{(0)}}} \left( {P^\mu_r}' - {P^{(0)}}^\mu \right)=\text{constant} - \beta_\mu' P^\mu.\end{aligned} \hspace{\stretch{1}}(1.0.15)

Here we’ve inserted the definition of \beta_\mu from eq. 1.0.11, and written P^\mu for the subsystem four momentum P^\mu_r, so that at equilibrium, with \beta_\mu' = \beta_\mu, we obtain

\begin{aligned}\Omega'({P^\mu_r}') \propto \exp\left( - \beta_\mu P^\mu \right)=\exp\left( - \beta E \right)\exp\left( - \beta_1 P^1 \right)\exp\left( - \beta_2 P^2 \right)\exp\left( - \beta_3 P^3 \right).\end{aligned} \hspace{\stretch{1}}(1.0.16)
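To make the contraction in eq. 1.0.16 concrete, here is a small Python sketch that evaluates the weight \exp(-\beta_\mu P^\mu) both as a single four-component sum and in the factored form, using \beta_0 = c \beta and P^0 = E/c. The function names and all numerical values are hypothetical, chosen only to exercise the arithmetic; nothing here says what the spatial \beta_k should be physically.

```python
import math

def four_weight(beta_four, p_four):
    """exp(-beta_mu P^mu) for lower-index beta and upper-index P components."""
    return math.exp(-sum(b * p for b, p in zip(beta_four, p_four)))

def factored_weight(beta, energy, beta_space, p_space):
    """exp(-beta E) exp(-beta_1 P^1) exp(-beta_2 P^2) exp(-beta_3 P^3)."""
    w = math.exp(-beta * energy)
    for b, p in zip(beta_space, p_space):
        w *= math.exp(-b * p)
    return w

if __name__ == "__main__":
    c = 3.0                         # hypothetical units
    beta, energy = 0.5, 2.0         # hypothetical 1/(k_B T) and subsystem energy
    beta_space = (0.1, -0.2, 0.3)   # hypothetical beta_1, beta_2, beta_3
    p_space = (1.0, 2.0, -1.0)      # hypothetical P^1, P^2, P^3
    beta_four = (c * beta,) + beta_space
    p_four = (energy / c,) + p_space
    print(four_weight(beta_four, p_four), factored_weight(beta, energy, beta_space, p_space))
```

The factors of c cancel in the time component, which is why the first factor reduces to the familiar \exp(-\beta E).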

Next steps

This looks consistent with the outline provided by Lubos to the stackexchange “is there a relativistic quantum thermodynamics” question. I’m sure it wouldn’t be too hard to find references that explore this, as well as explain why non-relativistic stat mech can be used for photon problems. Further exploration of this should wait until after the studies for this course are done.


[1] RK Pathria. Statistical mechanics. Butterworth Heinemann, Oxford, UK, 1996.

Posted in Math and Physics Learning. | Leave a Comment »

Gauge transformation of the Dirac equation.

Posted by peeterjoot on August 21, 2011

[Click here for a PDF of this post with nicer formatting (especially if my latex to wordpress script has left FORMULA DOES NOT PARSE errors.)]


In [1] the gauge transformation of the Dirac equation is covered, producing the non-relativistic equation with the correct spin interaction. There are unfortunately some sign errors, some of which self correct, and some of which don’t impact the end result, but are slightly confusing. There are also some omitted details. I’ll attempt to work through the same calculation with all the signs in the right places and also fill in some of the details I found myself wanting.

A step back. On the gauge transformation.

The gauge transformations utilized are given as

\begin{aligned}\mathcal{E} &\rightarrow \mathcal{E} - e \phi \\ \mathbf{p} &\rightarrow \mathbf{p} - e \mathbf{A}.\end{aligned} \hspace{\stretch{1}}(2.1)

Let’s start off by reminding ourselves where these come from. As outlined in section 12.9 in [2] (with some details pondered in [3]), our relativistic Lagrangian is

\begin{aligned}\mathcal{L} = -m c^2 \sqrt{ 1 - \frac{\mathbf{u}^2}{c^2}} + \frac{e}{c} \mathbf{u} \cdot \mathbf{A} - e \phi.\end{aligned} \hspace{\stretch{1}}(2.3)

The conjugate momentum is

\begin{aligned}\mathbf{P} = \mathbf{e}^i \frac{\partial {\mathcal{L}}}{\partial {u^i}} = \frac{m \mathbf{u}}{\sqrt{1 - \mathbf{u}^2/c^2}} + \frac{e}{c} \mathbf{A},\end{aligned} \hspace{\stretch{1}}(2.4)

or

\begin{aligned}\mathbf{P} = \mathbf{p} + \frac{e}{c} \mathbf{A}.\end{aligned} \hspace{\stretch{1}}(2.5)

The Hamiltonian, which must be expressed in terms of this conjugate momentum \mathbf{P}, is found to be

\begin{aligned}\mathcal{E} = \sqrt{ (c \mathbf{P} - e \mathbf{A})^2 + m^2 c^4 } + e \phi.\end{aligned} \hspace{\stretch{1}}(2.6)

With the free particle Lagrangian

\begin{aligned}\mathcal{L} = -m c^2 \sqrt{ 1 - \frac{\mathbf{u}^2}{c^2}} ,\end{aligned} \hspace{\stretch{1}}(2.7)

our conjugate momentum is

\begin{aligned}\mathbf{P} = \frac{m \mathbf{u}}{\sqrt{ 1 - \mathbf{u}^2/c^2} }.\end{aligned} \hspace{\stretch{1}}(2.8)

For this we find that our Hamiltonian \mathcal{E} = \mathbf{P} \cdot \mathbf{u} - \mathcal{L} is

\begin{aligned}\mathcal{E} = \frac{m c^2}{\sqrt{1 - \mathbf{u}^2/c^2}},\end{aligned} \hspace{\stretch{1}}(2.9)

but this has to be expressed in terms of \mathbf{P}. Having found the form of the Hamiltonian for the interaction case, it is easily verified that 2.6 contains the required form once the interaction fields (\phi, \mathbf{A}) are zeroed

\begin{aligned}\mathcal{E} = \sqrt{ (c \mathbf{P})^2 + m^2 c^4 }.\end{aligned} \hspace{\stretch{1}}(2.10)

Considering the interaction case, Jackson points out that the energy and momentum terms can be combined as a four momentum

\begin{aligned}p^a = \left( \frac{1}{{c}}(\mathcal{E} - e \phi), \mathbf{P} - \frac{e}{c}\mathbf{A} \right),\end{aligned} \hspace{\stretch{1}}(2.11)

so that the re-arranged and squared Hamiltonian takes the form

\begin{aligned}p^a p_a = (m c)^2.\end{aligned} \hspace{\stretch{1}}(2.12)

From this we see that for the Lorentz force, the interaction can be found, starting with the free particle Hamiltonian 2.10, making the transformation

\begin{aligned}\mathcal{E}   &\rightarrow \mathcal{E} - e\phi \\ \mathbf{P} &\rightarrow \mathbf{P} - \frac{e}{c}\mathbf{A},\end{aligned} \hspace{\stretch{1}}(2.13)

or in covariant form

\begin{aligned}p^\mu \rightarrow p^\mu - \frac{e}{c}A^\mu.\end{aligned} \hspace{\stretch{1}}(2.15)
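The claim that the interaction Hamiltonian 2.6 rearranges into p^a p_a = (m c)^2 with the four momentum of 2.11 is easy to check numerically. The following Python sketch does that for arbitrary sample fields; the function names and the numbers are assumptions made for illustration, and the metric signature (+, -, -, -) is assumed.

```python
import math

def hamiltonian(P, A, phi, m, e, c):
    """Eq. 2.6: E = sqrt((c P - e A)^2 + m^2 c^4) + e phi."""
    kinetic_sq = sum((c * Pi - e * Ai) ** 2 for Pi, Ai in zip(P, A))
    return math.sqrt(kinetic_sq + m**2 * c**4) + e * phi

def four_momentum(P, A, phi, m, e, c):
    """Eq. 2.11: p^a = ((E - e phi)/c, P - (e/c) A)."""
    E = hamiltonian(P, A, phi, m, e, c)
    return [(E - e * phi) / c] + [Pi - e * Ai / c for Pi, Ai in zip(P, A)]

def minkowski_square(p):
    """p^a p_a with signature (+, -, -, -)."""
    return p[0] ** 2 - sum(x * x for x in p[1:])

if __name__ == "__main__":
    # Hypothetical sample values in arbitrary units.
    P, A, phi = [0.4, -1.2, 0.9], [0.3, 0.1, -0.5], 0.7
    m, e, c = 1.3, 0.8, 2.0
    print(minkowski_square(four_momentum(P, A, phi, m, e, c)), (m * c) ** 2)
```

Setting A and phi to zero in the same functions recovers the free particle relation 2.10.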

On the gauge transformation of the Dirac equation.

The task at hand now is to make the transformations of 2.13, applied to the Dirac equation

\begin{aligned}\not{p} = \gamma_\mu p^\mu = m c.\end{aligned} \hspace{\stretch{1}}(3.16)

The first observation to make is that we appear to have different units in the Desai text. Let’s continue using the units from Jackson, and translate them later if inclined.

Left multiplication of 3.16 by \gamma_0, after making the substitution 2.15, gives us

\begin{aligned}0 &= \gamma_0 (\not{p} - m c) \\   &= \gamma_0 \gamma_\mu \left( p^\mu - \frac{e}{c} A^\mu \right)- \gamma_0 m c\\   &=\gamma_0 \gamma_0 \left(\frac{\mathcal{E}}{c} - \frac{e}{c} \phi \right)+\gamma_0 \gamma_a \left(p^a - \frac{e}{c} A^a \right)- \gamma_0 m c \\   &=\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c.\end{aligned}

With the minor notational freedom of using \gamma_0 instead of \gamma_4, this is our starting point in the Desai text, and we can now left multiply by

\begin{aligned}(\not{p} + m c) \gamma_0 =\frac{1}{{c}} \left( \mathcal{E} - e \phi \right)+\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)+ \gamma_0 m c.\end{aligned} \hspace{\stretch{1}}(3.17)

The motivation for this appears to be that this product of conjugate like quantities

\begin{aligned}0 &= (\not{p} + m c) \gamma_0 \gamma_0 (\not{p} - m c)  \\ &=(\not{p} + m c) (\not{p} - m c) \\ &= \frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2 -\left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2 - (m c)^2 + \cdots,\end{aligned} \hspace{\stretch{1}}(3.18)

produces the Klein-Gordon equation, plus some cross terms to be determined. Those cross terms are the important bits since they contain the spin interaction, even in the non-relativistic limit.

Let’s do the expansion.

\begin{aligned}0&= (\not{p} + m c) \gamma_0 \gamma_0 (\not{p} - m c) u \\ &=\left(\frac{1}{{c}} \left( \mathcal{E} - e \phi \right)+\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)+ \gamma_0 m c\right)\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &=\frac{1}{{c}} \left( \mathcal{E} - e \phi \right)\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &\qquad +\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &\qquad + \gamma_0 m c\left(\frac{1}{{c}} \left( \mathcal{E}- e \phi \right)-\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)- \gamma_0 m c \right) u \\ &=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left( \boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right) \right)^2- (mc)^2\right) u\\ &\qquad + \frac{1}{{c}} \left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u- m c\left\{{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{ \gamma_0}\right\} u \\ &\qquad + {\gamma_0 m\left(\mathcal{E} - e \phi\right) u}- {\gamma_0 m\left(\mathcal{E} - e \phi\right) u}.\end{aligned}

Since \gamma_0 anticommutes with any \boldsymbol{\alpha} \cdot \mathbf{x}, even when \mathbf{x} contains operators, the anticommutator term is killed.

While done in the text, let’s also do the \boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right) square for completeness. Because this is an operator, we need to treat this as

\begin{aligned}\left( \boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right) \right)^2 u&=\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)\boldsymbol{\alpha} \cdot \left(\mathbf{P} u - \frac{e}{c} \mathbf{A} u \right),\end{aligned}

so we want to treat the two vectors as independent, say (\boldsymbol{\alpha} \cdot \mathbf{a})(\boldsymbol{\alpha} \cdot \mathbf{b}). That is

\begin{aligned}(\boldsymbol{\alpha} \cdot \mathbf{a})(\boldsymbol{\alpha} \cdot \mathbf{b})&=\begin{bmatrix}0 & \boldsymbol{\sigma} \cdot \mathbf{a} \\ \boldsymbol{\sigma} \cdot \mathbf{a} & 0\end{bmatrix}\begin{bmatrix}0 & \boldsymbol{\sigma} \cdot \mathbf{b} \\ \boldsymbol{\sigma} \cdot \mathbf{b} & 0\end{bmatrix} \\ &=\begin{bmatrix}(\boldsymbol{\sigma} \cdot \mathbf{a}) (\boldsymbol{\sigma} \cdot \mathbf{b})  & 0 \\ 0 & (\boldsymbol{\sigma} \cdot \mathbf{a}) (\boldsymbol{\sigma} \cdot \mathbf{b})\end{bmatrix}.\end{aligned}

The diagonal elements can be expanded by coordinates

\begin{aligned}(\boldsymbol{\sigma} \cdot \mathbf{a}) (\boldsymbol{\sigma} \cdot \mathbf{b})&=\sum_{m,n} \sigma^m a^m \sigma^n b^n \\ &=\sum_m a^m b^m+\sum_{m\ne n} \sigma^m \sigma^n a^m b^n \\ &=\mathbf{a} \cdot \mathbf{b}+i \sum_{m\ne n} \sum_o \sigma^o \epsilon^{m n o} a^m b^n \\ &=\mathbf{a} \cdot \mathbf{b}+i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b}),\end{aligned}

so that

\begin{aligned}(\boldsymbol{\alpha} \cdot \mathbf{a})(\boldsymbol{\alpha} \cdot \mathbf{b})=\begin{bmatrix}\mathbf{a} \cdot \mathbf{b} + i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b}) & 0 \\ 0 & \mathbf{a} \cdot \mathbf{b} + i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b})\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.19)
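The Pauli matrix identity behind eq. 3.19 can be spot checked numerically for ordinary (commuting) vector components. The sketch below uses plain Python complex arithmetic on 2x2 nested lists; the helper names and sample vectors are illustrative assumptions. It does not capture the operator ordering subtlety that matters when \mathbf{a} and \mathbf{b} contain \mathbf{P}, which is exactly why the cross product term has to be handled separately.

```python
# Standard Pauli matrices as nested lists of complex entries.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """The 2x2 matrix sigma . v for a 3-component numeric vector v."""
    return [[sum(v[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def lhs(a, b):
    """(sigma . a)(sigma . b)"""
    return matmul(sigma_dot(a), sigma_dot(b))

def rhs(a, b):
    """(a . b) I + i sigma . (a x b)"""
    d = dot(a, b)
    s = sigma_dot(cross(a, b))
    return [[(d if i == j else 0) + 1j * s[i][j] for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0], [-0.5, 4.0, 0.25]
    print(lhs(a, b))
    print(rhs(a, b))
```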

Plugging this back in, we now have an extra term in the expansion

\begin{aligned}0&=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- (mc)^2\right) u\\ &\qquad + \frac{1}{{c}} \left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u\\ &\qquad- i \boldsymbol{\sigma}' \cdot\left(\left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right) \times \left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right)\right) u\end{aligned}

Here \boldsymbol{\sigma}' was defined as the direct product of the two by two identity with the abstract matrix \boldsymbol{\sigma} as follows

\begin{aligned}\boldsymbol{\sigma}' =\begin{bmatrix}\boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma}\end{bmatrix}= I \otimes \boldsymbol{\sigma}\end{aligned} \hspace{\stretch{1}}(3.20)

Like the \mathbf{L} \times \mathbf{L} angular momentum operator cross products this one wasn’t zero. Expanding it yields

\begin{aligned}\left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right) \times \left( \mathbf{P} - \frac{e}{c} \mathbf{A} \right) u&=\mathbf{P} \times \mathbf{P} u+ \frac{e^2}{c^2} \mathbf{A} \times \mathbf{A} u- \frac{e}{c} \left( \mathbf{A} \times \mathbf{P} + \mathbf{P} \times \mathbf{A} \right) u \\ &=- \frac{e}{c} \left( \mathbf{A} \times (\mathbf{P} u) + (\mathbf{P} u) \times \mathbf{A} + u (\mathbf{P} \times \mathbf{A}) \right) \\ &=- \frac{e}{c} (-i \hbar \boldsymbol{\nabla} \times \mathbf{A}) u \\ &=\frac{i e \hbar}{c} \mathbf{H} u\end{aligned}

Plugging in again we are getting closer, and now have the magnetic field cross term

\begin{aligned}0&=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- (mc)^2\right) u\\ &\qquad + \frac{1}{{c}}\left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u\\ &\qquad+ \frac{e \hbar}{c} \boldsymbol{\sigma}' \cdot \mathbf{H} u.\end{aligned}

All that remains is evaluation of the commutator term, which should yield the electric field interaction. That commutator is

\begin{aligned}\left[{\boldsymbol{\alpha} \cdot \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)},{\mathcal{E} - e \phi}\right] u&={\boldsymbol{\alpha} \cdot \mathbf{P} \mathcal{E} u}- e \boldsymbol{\alpha} \cdot \mathbf{P} \phi u- \frac{e}{c} \boldsymbol{\alpha} \cdot \mathbf{A} \mathcal{E} u+ {\frac{e^2}{c} \boldsymbol{\alpha} \cdot \mathbf{A} \phi u} \\ &- {\mathcal{E} \boldsymbol{\alpha} \cdot \mathbf{P} u}+ e \phi \boldsymbol{\alpha} \cdot \mathbf{P} u+ \frac{e}{c} \mathcal{E} \boldsymbol{\alpha} \cdot \mathbf{A} u- {\frac{e^2}{c} \phi \boldsymbol{\alpha} \cdot \mathbf{A} u} \\ &=\boldsymbol{\alpha} \cdot \left( - e (\mathbf{P} \phi)+ \frac{e}{c} (\mathcal{E} \mathbf{A}) \right) u \\ &=e i \hbar \boldsymbol{\alpha} \cdot \left( \boldsymbol{\nabla} \phi+ \frac{1}{c} \frac{\partial {\mathbf{A}}}{\partial {t}} \right) u \\ &=- e i \hbar \boldsymbol{\alpha} \cdot \mathbf{E} u\end{aligned}

That was the last bit required to fully expand the space time split of our squared momentum equations. We have

\begin{aligned}0=(\not{p} + mc)(\not{p} - mc) u=\left(\frac{1}{{c^2}} \left( \mathcal{E} - e \phi \right)^2- \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- (mc)^2- \frac{i e \hbar}{c} \boldsymbol{\alpha} \cdot \mathbf{E}+ \frac{e \hbar}{c} \boldsymbol{\sigma}' \cdot \mathbf{H}\right) u\end{aligned} \hspace{\stretch{1}}(3.21)

This is the end result of the reduction of the spacetime split gauge transformed Dirac equation. The next step is to obtain the non-relativistic Hamiltonian operator equation (linear in the time derivative operator and quadratic in spatial partials) that has both the electric field and magnetic field terms that we desire to accurately describe spin (actually we need only the magnetic interaction term for non-relativistic spin, but we’ll see that soon).

To obtain the first order time derivatives we can consider an approximation to the (\mathcal{E} - e \phi)^2 terms. We can get that by considering the difference of squares factorization

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi - m c^2) ( \mathcal{E} - e \phi + m c^2) u&=\frac{1}{{c^2}} \left(( \mathcal{E} - e \phi )^2 u - (m c^2)^2 u- {m c^2 \mathcal{E} u}+ {\mathcal{E} m c^2 u} \right) \\ &=\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u\end{aligned}

In the text, this is factored, instead of the factorization verified. I wanted to be careful to ensure that the operators did not have any effect. They don’t, which is clear in retrospect since the \mathcal{E} operator and the scalar mc necessarily commute. With this factorization, some non-relativistic approximations are possible. Considering the free particle energy, we can separate out the rest energy from the kinetic (which is perversely designated with subscript T for some reason in the text (and others))

\begin{aligned}\mathcal{E}&= \gamma m c^2  \\ &= m c^2 \left( 1 + \frac{1}{{2}} \left(\frac{\mathbf{v}}{c}\right)^2 + \cdots \right) \\ &= m c^2 + \frac{1}{{2}} m \mathbf{v}^2 + \cdots \\ &\equiv m c^2 + \mathcal{E}_{T}\end{aligned}

With this definition, the energy minus mass term in terms of kinetic energy (that we also had in the Klein-Gordon equation) takes the form

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u=\frac{1}{{c^2}} ( \mathcal{E}_{T} - e \phi ) ( \mathcal{E} - e \phi + m c^2) u\end{aligned} \hspace{\stretch{1}}(3.22)

In the second factor, to get a non-relativistic approximation of \mathcal{E} - e \phi, the text states without motivation that e \phi will be considered small compared to m c^2. We can make some sense of this by considering the classical Hamiltonian for a particle in a field

\begin{aligned}\mathcal{E}&= \sqrt{ c^2 \left(\mathbf{P} - \frac{e}{c} \mathbf{A}\right)^2 + (m c^2)^2 } + e \phi \\ &= \sqrt{ c^2 (\gamma m \mathbf{v})^2 + (m c^2)^2 } + e \phi \\ &= m c \sqrt{ (\gamma \mathbf{v})^2 + c^2 } + e \phi \\ &= m c \sqrt{ \frac{ \mathbf{v}^2 + c^2 ( 1 - \mathbf{v}^2/c^2) } { 1 - \mathbf{v}^2/c^2 } } + e \phi \\ &= \gamma m c^2 + e \phi \\ &= m c^2 \left( 1 + \frac{1}{{2}} \frac{\mathbf{v}^2}{c^2} + \cdots \right) + e \phi.\end{aligned}

We find that, in the non-relativistic limit, we have

\begin{aligned}\mathcal{E} - e \phi = m c^2 + \frac{1}{{2}} m \mathbf{v}^2 + \cdots \approx m c^2,\end{aligned} \hspace{\stretch{1}}(3.23)

and obtain the first order approximation of our time derivative operator

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u\approx\frac{1}{{c^2}} ( \mathcal{E}_{T} - e \phi ) 2 m c^2 u,\end{aligned} \hspace{\stretch{1}}(3.24)

or

\begin{aligned}\frac{1}{{c^2}} ( \mathcal{E} - e \phi )^2 u - (m c)^2 u\approx2 m ( \mathcal{E}_{T} - e \phi ) u.\end{aligned} \hspace{\stretch{1}}(3.25)

It seems slightly underhanded to use the free particle Hamiltonian in one part of the approximation, and the Hamiltonian for a particle in a field for the other part. This is probably why the text just mandates that e\phi be small compared to m c^2.
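To get a feel for how good the approximation 3.25 is, here is a small Python sketch treating \mathcal{E} as a classical number rather than an operator. It assumes the particle-in-a-field energy \mathcal{E} = \gamma m c^2 + e \phi together with the definition \mathcal{E}_T = \mathcal{E} - m c^2, and works in units with c = 1. The function names and sample values are hypothetical.

```python
import math

def gamma(v, c):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def exact_term(E, e_phi, m, c):
    """((E - e phi)^2 - (m c^2)^2) / c^2, the left side of eq. 3.25."""
    return ((E - e_phi) ** 2 - (m * c**2) ** 2) / c**2

def approx_term(E, e_phi, m, c):
    """2 m (E_T - e phi), with E_T = E - m c^2, the right side of eq. 3.25."""
    E_T = E - m * c**2
    return 2.0 * m * (E_T - e_phi)

def ratio(v, m=1.0, c=1.0, e_phi=0.25):
    """Exact over approximate, for total energy E = gamma m c^2 + e phi."""
    E = gamma(v, c) * m * c**2 + e_phi
    return exact_term(E, e_phi, m, c) / approx_term(E, e_phi, m, c)

if __name__ == "__main__":
    for v in (0.001, 0.01, 0.1, 0.5):
        print(v, ratio(v))
```

With these assumptions the ratio works out to (\gamma + 1)/2, so the relative error of the approximation is of order \mathbf{v}^2/c^2, independent of the value chosen for e \phi.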

To summarize once more before the final reduction (where we eliminate the electric field component of the operator equation), we have

\begin{aligned}0=(\not{p} + mc)(\not{p} - mc) u\approx\left(2 m ( \mathcal{E}_{T} - e \phi )- \left(\mathbf{P} - \frac{e}{c} \mathbf{A} \right)^2- \frac{i e \hbar}{c} \boldsymbol{\alpha} \cdot \mathbf{E}+ \frac{e \hbar}{c} \boldsymbol{\sigma}' \cdot \mathbf{H}\right) u.\end{aligned} \hspace{\stretch{1}}(3.26)

Except for the electric field term, this is the result that is derived in the text. It was argued that this term is not significant compared to e \phi when the particle velocity is restricted to the non-relativistic domain. This is done by computing the expectation of this term relative to e \phi. Consider

\begin{aligned}{\left\lvert{ \left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e \phi } }}\right\rangle }\right\rvert}\end{aligned} \hspace{\stretch{1}}(3.27)

With the velocities low enough so that the time variation of the vector potential does not contribute to the electric field (i.e. the electrostatic case), we have

\begin{aligned}\mathbf{E} = - \boldsymbol{\nabla} \phi = - \hat{\mathbf{r}} \frac{\partial {\phi}}{\partial {r}}.\end{aligned} \hspace{\stretch{1}}(3.28)

The variation in length a that is considered is labeled the characteristic length

\begin{aligned}p a \sim \hbar,\end{aligned} \hspace{\stretch{1}}(3.29)

so that with p = m v we have

\begin{aligned}a \sim \frac{\hbar}{m v}.\end{aligned} \hspace{\stretch{1}}(3.30)

This characteristic length is not elaborated on, but one can observe the similarity to the Compton wavelength

\begin{aligned}L_{\text{Compton}} = \frac{\hbar}{m c},\end{aligned} \hspace{\stretch{1}}(3.31)

the length scale for which quantum field theory must be considered. The characteristic length is considerably larger than this for velocities much smaller than the speed of light. For example, the Fermi velocity of electrons in copper is \sim 10^{6} \frac{\text{m}}{\text{s}}, which puts our length scale at roughly 300 times the Compton length (\hbar/m c \sim 4 \times 10^{-13} \text{m} for an electron), or about 10^{-10} \text{m}. This is still a very small length, but is in the QM domain instead of QED. With such a length scale consider the magnitude of the potential change produced by the electric field over that length

\begin{aligned}{\left\lvert{\phi}\right\rvert} = {\left\lvert{\mathbf{E}}\right\rvert} \Delta x = {\left\lvert{\mathbf{E}}\right\rvert} a,\end{aligned} \hspace{\stretch{1}}(3.32)

so that

\begin{aligned}\left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e {\left\lvert{\phi}\right\rvert} } }}\right\rangle&=\left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e a {\left\lvert{\mathbf{E}}\right\rvert} } }}\right\rangle \\ &=\left\langle{{ \frac{e \hbar}{m} \frac{1}{ 2 c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e \frac{\hbar }{ m v } {\left\lvert{\mathbf{E}}\right\rvert} } }}\right\rangle \\ &=\frac{1}{{2}} \frac{v}{c} \left\langle{{ \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{ {\left\lvert{\mathbf{E}}\right\rvert} } }}\right\rangle.\end{aligned}

Thus the magnitude of this (vector) expectation is dominated by the expectation of just the \boldsymbol{\alpha}. That has been calculated earlier when Dirac currents were considered, where it was found that

\begin{aligned}\left\langle{{\alpha_i}}\right\rangle = \psi^\dagger \alpha_i \psi = (\mathbf{j})_i.\end{aligned} \hspace{\stretch{1}}(3.33)

Also recall (33.73) that this current was related to momentum with

\begin{aligned}\mathbf{j} = \frac{\mathbf{p}}{m c} = \frac{\mathbf{v}}{c}\end{aligned} \hspace{\stretch{1}}(3.34)

which allows for a final approximation of the magnitude of the electric field term’s expectation value relative to the e\phi term of the Hamiltonian operator. Namely

\begin{aligned}{\left\lvert{ \left\langle{{ \frac{e \hbar}{ 2 m c} \frac{\boldsymbol{\alpha} \cdot \mathbf{E}}{e \phi } }}\right\rangle }\right\rvert}\sim\frac{\mathbf{v}^2}{c^2}.\end{aligned} \hspace{\stretch{1}}(3.35)

With that last approximation made, the gauge transformed Dirac equation, after non-relativistic approximation of the energy and electric field terms, is left as

\begin{aligned}i \hbar \frac{\partial {}}{\partial {t}}=\frac{1}{{2m}} \left(i \hbar \boldsymbol{\nabla} + \frac{e}{c} \mathbf{A} \right)^2- \frac{e \hbar}{2 m c} \boldsymbol{\sigma}' \cdot \mathbf{H}+ e \phi.\end{aligned} \hspace{\stretch{1}}(3.36)

This is still a four dimensional equation, and it is stated in the text that only the large component is relevant (reducing the degrees of spin freedom to two). That argument makes a bit more sense with the matrix form of the gauge reduction which follows in the next section, so understanding that well seems worthwhile, and is the next thing to digest.


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

[2] JD Jackson. Classical Electrodynamics Wiley. John Wiley and Sons, 2nd edition, 1975.

[3] Peeter Joot. Misc Physics and Math Play, chapter Hamiltonian notes.

Posted in Math and Physics Learning.

My first arxiv submission. Change of basis and Gram-Schmidt orthonormalization in special relativity

Posted by peeterjoot on April 29, 2011

Now that I have an academic email address I was able to make an arxiv submission (I’d tried previously and been auto-rejected) :

Change of basis and Gram-Schmidt orthonormalization in special relativity

This is based on a tutorial from our relativistic electrodynamics class, which covered non-inertial relativistic systems. Combining what I learned from that with some concepts I learned from ‘Geometric Algebra for Physicists’ (particularly reciprocal frames) I was able to write up some notes that took those ideas plus basic linear algebra (the Gram-Schmidt procedure) and apply them to relativity and/or non-orthonormal Euclidean bases. How to do projections onto non-orthonormal Euclidean bases isn’t taught in Algebra I, but once you figure that out, the same thing works for SR.

Will anybody read it? I don’t know … but I had fun writing it.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Tutorial 1 (TA: Simon Freedman).

Posted by peeterjoot on January 21, 2011


Worked question.

The TA blasted through a problem from Hartle [1], section 5.17 (all the while apologizing for going so slow). I’m going to have to look these notes over carefully to figure out what on Earth he was doing.

At one point he asked if anybody was completely lost. Nobody said yes, but given the class title, I had the urge to say “No, just relatively lost”.

A source S emits radiation isotropically in its rest frame, with frequency \omega and number flux f(\text{photons}/\text{cm}^2 s). The source moves along the x’-axis with speed V in an observer frame (O). What does the energy flux in O look like?

A brief intro with four vectors

A 3-vector:

\begin{aligned}\mathbf{a} &= (a_x, a_y, a_z) = (a^1, a^2, a^3) \\ \mathbf{b} &= (b_x, b_y, b_z) = (b^1, b^2, b^3)\end{aligned} \hspace{\stretch{1}}(1.1)

For this we have the dot product

\begin{aligned}\mathbf{a} \cdot \mathbf{b} = \sum_{\alpha=1}^3 a^\alpha b^\alpha\end{aligned} \hspace{\stretch{1}}(1.3)

Greek letters in this course (opposite to everybody else in the world, because of Landau and Lifshitz) run from 1 to 3, whereas roman letters run through the set \{0,1,2,3\}.

We want to put space and time on an equal footing and form the composite quantity (four vector)

\begin{aligned}x^i = (ct, \mathbf{r}) = (x^0, x^1, x^2, x^3),\end{aligned} \hspace{\stretch{1}}(1.4)


\begin{aligned}x^0 &= ct \\ x^1 &= x \\ x^2 &= y \\ x^3 &= z.\end{aligned} \hspace{\stretch{1}}(1.5)

It will also be convenient to drop indexes when referring to all the components of a four vector and we will use lower or upper case non-bold letters to represent such four vectors. For example

\begin{aligned}X = (ct, \mathbf{r}),\end{aligned} \hspace{\stretch{1}}(1.9)


\begin{aligned}v = \gamma \left(c, \mathbf{v} \right).\end{aligned} \hspace{\stretch{1}}(1.10)

Three vectors will be represented as letters with over arrows \vec{a} or (in text) bold face \mathbf{a}.

Recall that the squared spacetime interval between two events X_1 and X_2 is defined as

\begin{aligned}{S_{X_1, X_2}}^2 = (ct_1 - c t_2)^2 - (\mathbf{x}_1 - \mathbf{x}_2)^2.\end{aligned} \hspace{\stretch{1}}(1.11)

In particular, with one of these zero, we have an operator which takes a single four vector and spits out a scalar, measuring a “distance” from the origin

\begin{aligned}s^2 = (ct)^2 - \mathbf{r}^2.\end{aligned} \hspace{\stretch{1}}(1.12)

This motivates the introduction of a dot product for our four vector space.

\begin{aligned}X \cdot X = (ct)^2 - \mathbf{r}^2 = (x^0)^2 - \sum_{\alpha=1}^3 (x^\alpha)^2\end{aligned} \hspace{\stretch{1}}(1.13)

Utilizing the spacetime dot product of 1.13 we have for the dot product of the difference between two events

\begin{aligned}(X - Y) \cdot (X - Y)&=(x^0 - y^0)^2 - \sum_{\alpha =1}^3 (x^\alpha - y^\alpha)^2 \\ &=X \cdot X + Y \cdot Y - 2 x^0 y^0 + 2 \sum_{\alpha =1}^3 x^\alpha y^\alpha.\end{aligned}

From this, assuming our dot product 1.13 is both linear and symmetric, we have for any pair of spacetime events

\begin{aligned}X \cdot Y = x^0 y^0 - \sum_{\alpha =1}^3 x^\alpha y^\alpha.\end{aligned} \hspace{\stretch{1}}(1.14)

How do our four vectors transform? This is really just a notational issue, since this has already been discussed. In this new notation we have

\begin{aligned}{x^0}' &= ct' = \gamma ( ct - \beta x) = \gamma ( x^0 - \beta x^1 ) \\ {x^1}' &= x' = \gamma ( x - \beta ct ) = \gamma ( x^1 - \beta x^0 ) \\ {x^2}' &= x^2 \\ {x^3}' &= x^3\end{aligned} \hspace{\stretch{1}}(1.15)

where \beta = V/c, and \gamma^{-2} = 1 - \beta^2.

In order to put some structure to this, it can be helpful to express this dot product as a quadratic form. We write

\begin{aligned}A \cdot B = \begin{bmatrix}a^0 & \mathbf{a}^\text{T} \end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}\begin{bmatrix}b^0 \\ \mathbf{b}\end{bmatrix}= A^\text{T} G B.\end{aligned} \hspace{\stretch{1}}(1.19)

We can write our Lorentz boost as a matrix, which we will call O,

\begin{aligned}\begin{bmatrix}\gamma & -\beta \gamma & 0 & 0 \\ -\beta \gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(1.20)

so that the dot product between two transformed four vectors takes the form

\begin{aligned}A' \cdot B' = A^\text{T} O^\text{T} G O B\end{aligned} \hspace{\stretch{1}}(1.21)
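The reason 1.21 is useful is that the boost leaves the metric alone, O^\text{T} G O = G, so that A' \cdot B' = A \cdot B. Here is a minimal numeric sketch of that check. The helper names (`boost_matrix`, `matmul`, `transpose`, `max_deviation`) are mine, not anything from the tutorial, and the matrices are just 1.19 and 1.20 typed in as nested lists.

```python
import math

# Metric G from 1.19.
G = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def boost_matrix(beta):
    """Boost along x with beta = V/c, laid out as in 1.20."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def max_deviation(beta):
    """Largest entry of |O^T G O - G| for the given beta."""
    O = boost_matrix(beta)
    M = matmul(transpose(O), matmul(G, O))
    return max(abs(M[i][j] - G[i][j]) for i in range(4) for j in range(4))


if __name__ == "__main__":
    for beta in (0.1, 0.6, 0.99):
        print(beta, max_deviation(beta))
```

The deviations should be at the level of floating point roundoff for any |\beta| < 1.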

Back to the problem.

We will work in momentum space, where we have

\begin{aligned}p^i &= (p^0, \mathbf{p}) = \left( \frac{E}{c}, \mathbf{p}\right) \\ p^2 &= \frac{E^2}{c^2} -\mathbf{p}^2 \\ \mathbf{p} &= \hbar \mathbf{k} \\ E &= \hbar \omega \\ p^i &= \hbar k^i \\ k^i &= \left(\frac{\omega}{c}, \mathbf{k}\right)\end{aligned} \hspace{\stretch{1}}(1.22)

Justifying this.

Now, the TA blurted all this out. We know some of it from the QM context, and if we’ve been reading ahead know a bit of this from our text [2] (the energy momentum four vector relationships). Let’s go back to classical electromagnetism and recall what we know about the relation between frequency and wave number for continuous fields. We want solutions to Maxwell’s equations in vacuum, and can show that any such solution also obeys a wave equation

\begin{aligned}\frac{1}{{c^2}} \frac{\partial^2 \Psi}{\partial t^2} - \boldsymbol{\nabla}^2 \Psi = 0,\end{aligned} \hspace{\stretch{1}}(1.28)

where \Psi is one of \mathbf{E} or \mathbf{B}. We have other constraints imposed on the solutions by Maxwell’s equations, but require that they at least obey 1.28 in addition to these constraints.

With application of a spatial Fourier transformation of the wave equation, we find that our solution takes the form

\begin{aligned}\Psi = (2 \pi)^{-3/2} \int \tilde{\Psi}(\mathbf{k}, 0) e^{i (\omega t \pm \mathbf{k} \cdot \mathbf{x}) } d^3 \mathbf{k}.\end{aligned} \hspace{\stretch{1}}(1.29)

If one takes this as a given and applies the wave equation operator to this as a test solution, one finds without doing the Fourier transform work that we also have a constraint. That is

\begin{aligned}\frac{1}{{c^2}} (i \omega)^2 \Psi - (\pm i \mathbf{k})^2 \Psi = 0.\end{aligned} \hspace{\stretch{1}}(1.30)

So even in the continuous field domain, we have a relationship between frequency and wave number. We see that this also happens to have the form of a lightlike spacetime interval

\begin{aligned}\frac{\omega^2}{c^2} - \mathbf{k}^2 = 0.\end{aligned} \hspace{\stretch{1}}(1.31)

Also recall that the photoelectric effect imposes an experimental constraint on photon energy, where we have

\begin{aligned}E = h \nu = \frac{h}{2\pi} 2 \pi \nu = \hbar \omega\end{aligned} \hspace{\stretch{1}}(1.32)

Therefore if we impose a mechanics like P = (E/c, \mathbf{p}) relativistic energy-momentum relationship on light, it then makes sense to form a nilpotent (lightlike) four vector for our photon energy. This combines our special relativistic expectations, with the constraints on the fields imposed by classical electromagnetism. We can then write for the photon four momentum

\begin{aligned}P = \left( \frac{\hbar \omega}{c}, \hbar \mathbf{k} \right)\end{aligned} \hspace{\stretch{1}}(1.33)

Back to the TA’s formula blitz.

Utilizing spherical polar coordinates in momentum (wave number) space, measuring the polar angle from the k^1 (x-like) axis, we can compute this polar angle in both frames,

\begin{aligned} \cos \alpha &= \frac{k^1}{{\left\lvert{\mathbf{k}}\right\rvert}} = \frac{k^1}{\omega/c} \\ \cos \alpha' &= \frac{{k^1}'}{\omega'/c} = \frac{\gamma (k^1 + \beta \omega/c)}{\gamma(\omega/c + \beta k^1)}\end{aligned} \hspace{\stretch{1}}(1.34)

Note that this requires us to assume that wave number four vectors transform in the same fashion as regular mechanical position and momentum four vectors. Also note that we have the primed frame moving negatively along the x-axis, instead of the usual positive origin shift. The question is vague enough to allow this since it only requires motion.

Check 1.

As \beta \rightarrow 1 (i.e. our primed frame velocity approaches the speed of light relative to the rest frame), \cos \alpha' \rightarrow 1, so \alpha' \rightarrow 0. The surface gets more and more compressed.

In the original reference frame the radiation was isotropic. In the new frame how does it change with respect to the angle? This is really a question to find this number flux rate

\begin{aligned}f'(\alpha') = ?\end{aligned} \hspace{\stretch{1}}(1.36)

In our rest frame the total number of photons traveling through the surface in a given interval of time is

\begin{aligned}N &= \int d\Omega dt f(\alpha) = \int d \phi \sin \alpha d\alpha dt f(\alpha) = -2 \pi \int d(\cos\alpha) dt f(\alpha)\end{aligned} \hspace{\stretch{1}}(1.37)

Here we utilize the spherical solid angle d\Omega = \sin \alpha d\alpha d\phi = - d(\cos\alpha) d\phi, and integrate \phi over the [0, 2\pi] interval. We also have to assume that our number flux density is not a function of horizontal angle \phi in the rest frame.

In the moving frame we similarly have

\begin{aligned}N' &= -2 \pi \int d(\cos\alpha') dt' f'(\alpha'),\end{aligned} \hspace{\stretch{1}}(1.39)

and we again have had to assume that our transformed number flux density is not a function of the horizontal angle \phi. This seems like a reasonable move since {k^2}' = k^2 and {k^3}' = k^3 as they are perpendicular to the boost direction.

Now, utilizing a conservation of particle number argument, we can argue that N = N'. Regardless of the motion of the frame, the same number of photons move through the surface. Taking ratios, and examining an infinitesimal time interval, and the associated flux through a small patch, we have

\begin{aligned}f'(\alpha') = \frac{d(\cos\alpha)}{d(\cos\alpha')} \left( \frac{dt}{dt'} \right) f(\alpha)\end{aligned} \hspace{\stretch{1}}(1.40)

where

\begin{aligned}\left( \frac{d(\cos\alpha)}{d(\cos\alpha')} \right) = \left( \frac{d(\cos\alpha')}{d(\cos\alpha)} \right)^{-1} = \gamma^2 ( 1 + \beta \cos\alpha)^2\end{aligned} \hspace{\stretch{1}}(1.41)

Part of the statement above was a do-it-yourself. First recall that c t' = \gamma ( c t + \beta x ), so dt/dt' evaluated at x=0 is 1/\gamma.

The rest is messier. We can calculate the d(\cos) values in the ratio above using 1.34. For example, for d(\cos(\alpha)) we have

\begin{aligned}d(\cos\alpha) &= d \left( \frac{k^1}{\omega/c} \right) \\ &= dk^1 \frac{1}{{\omega/c}} - c \frac{k^1}{{\omega^2}} d\omega.\end{aligned}

If one does the same thing for d(\cos\alpha'), after a whole whack of messy algebra one finds that the differential terms and a whole lot more mystically cancel, leaving just

\begin{aligned}\frac{d\cos\alpha'}{d\cos\alpha} = \frac{\omega^2/c^2}{(\omega/c + \beta k^1)^2} (1 - \beta^2)\end{aligned} \hspace{\stretch{1}}(1.42)

A bit more reduction with reference back to 1.34 verifies 1.41.
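Spelling out that reduction as a sketch, using only \cos\alpha = k^1/(\omega/c) from 1.34 and \gamma^{-2} = 1 - \beta^2: dividing the numerator and denominator of 1.42 by \omega^2/c^2 gives

\begin{aligned}\frac{d\cos\alpha'}{d\cos\alpha} = \frac{1 - \beta^2}{(1 + \beta \cos\alpha)^2} = \frac{1}{{\gamma^2 (1 + \beta \cos\alpha)^2}},\end{aligned}

and inverting this recovers the \gamma^2 (1 + \beta \cos\alpha)^2 of 1.41.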

Also note that again from 1.34 we have

\begin{aligned}\cos\alpha' = \frac{\cos\alpha + \beta}{1 + \beta \cos\alpha}\end{aligned} \hspace{\stretch{1}}(1.43)

and rearranging this for \cos\alpha gives us

\begin{aligned}\cos\alpha = \frac{\cos\alpha' - \beta}{1 - \beta \cos\alpha'},\end{aligned} \hspace{\stretch{1}}(1.44)

from which we find that

\begin{aligned}1 + \beta \cos\alpha = \frac{1}{{\gamma^2 (1 - \beta \cos \alpha') }},\end{aligned} \hspace{\stretch{1}}(1.45)

so putting all the pieces together we have

\begin{aligned}f'(\alpha') = \frac{1}{{\gamma}} \frac{f(\alpha)}{(\gamma (1-\beta \cos\alpha'))^2}\end{aligned} \hspace{\stretch{1}}(1.46)

The question asks for the energy flux density. We get this by multiplying the number flux density by the photon energy \hbar \omega of the light in question. This is, as a function of the polar angle, in each of the frames

\begin{aligned}L(\alpha) &= \hbar \omega(\alpha) f(\alpha) = \hbar \omega f \\ L'(\alpha') &= \hbar \omega'(\alpha') f'(\alpha') = \hbar \omega' f'\end{aligned} \hspace{\stretch{1}}(1.47)

But we have

\begin{aligned}\omega'(\alpha')/c = \gamma( \omega/c + \beta k^1 ) = \gamma \omega/c ( 1 + \beta \cos\alpha )\end{aligned} \hspace{\stretch{1}}(1.49)

Aside, for \beta \ll 1,

\begin{aligned}\omega' = \omega ( 1 + \beta \cos\alpha) + O(\beta^2) = \omega + \delta \omega\end{aligned} \hspace{\stretch{1}}(1.50)

\begin{aligned}\delta \omega &= \omega \beta, \alpha = 0 \qquad \text{blue shift} \\ \delta \omega &= -\omega \beta, \alpha = \pi \qquad \text{red shift}\end{aligned} \hspace{\stretch{1}}(1.51)

The TA then writes

\begin{aligned}L'(\alpha') = \frac{L/\gamma}{(\gamma (1 - \beta \cos\alpha'))^3}\end{aligned} \hspace{\stretch{1}}(1.53)

although, I calculate

\begin{aligned}L'(\alpha') = \frac{L}{\gamma^4 (\gamma (1 - \beta \cos\alpha'))^4}\end{aligned} \hspace{\stretch{1}}(1.54)

He then says, the forward backward ratio is

\begin{aligned}L'(0)/L'(\pi) = {\left( \frac{ 1 + \beta }{1-\beta} \right)}^3\end{aligned} \hspace{\stretch{1}}(1.55)

The forward radiation is much bigger than the backwards radiation.

For this I get:

\begin{aligned}L'(0)/L'(\pi) = {\left( \frac{ 1 + \beta }{1-\beta} \right)}^4\end{aligned} \hspace{\stretch{1}}(1.56)

It is still bigger for \beta positive, which I think is the point.
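One way to arbitrate between the cubic and quartic powers without redoing the algebra is a numeric cross-check. The sketch below assumes only 1.40 (with dt/dt' = 1/\gamma), the aberration relation 1.43, and the Doppler factor 1.49 as written above, and estimates d(\cos\alpha')/d(\cos\alpha) by a central difference. The function names are my own. Under those assumptions the result tracks the cubic form 1.53, which suggests the quartic in 1.54 may come from an algebra slip rather than from the sign convention.

```python
import math


def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)


def cos_alpha_prime(cos_a, beta):
    """Aberration relation 1.43."""
    return (cos_a + beta) / (1.0 + beta * cos_a)


def energy_flux_ratio(cos_a, beta, h=1e-6):
    """L'/L = (omega'/omega) * (dcos(alpha)/dcos(alpha')) * (dt/dt')."""
    g = gamma(beta)
    doppler = g * (1.0 + beta * cos_a)  # omega'/omega from 1.49
    # Central difference estimate of d(cos alpha')/d(cos alpha).
    slope = (cos_alpha_prime(cos_a + h, beta)
             - cos_alpha_prime(cos_a - h, beta)) / (2.0 * h)
    return doppler * (1.0 / slope) * (1.0 / g)


def cubic_form(cos_a, beta):
    """Right hand side of 1.53, divided by L."""
    g = gamma(beta)
    cos_ap = cos_alpha_prime(cos_a, beta)
    return (1.0 / g) / (g * (1.0 - beta * cos_ap)) ** 3


if __name__ == "__main__":
    beta = 0.5
    for cos_a in (-1.0, -0.3, 0.5, 1.0):
        print(cos_a, energy_flux_ratio(cos_a, beta), cubic_form(cos_a, beta))
    forward_backward = (energy_flux_ratio(1.0, beta)
                        / energy_flux_ratio(-1.0, beta))
    print(forward_backward, ((1 + beta) / (1 - beta)) ** 3)
```

Note that \alpha = 0 and \alpha = \pi map to \alpha' = 0 and \alpha' = \pi respectively, so the last two numbers printed compare directly against the forward backward ratio 1.55.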

If I can somehow manage to keep my signs right as I do this course I may survive. Why did he pick a positive sign way back in 1.34?


[1] J.B. Hartle and T. Dray. Gravity: an introduction to Einsteins general relativity, volume 71. 2003.

[2] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Lecture 4 (Taught by Prof. Erich Poppitz). Spacetime geometry, Lorentz transformations, Minkowski diagrams.

Posted by peeterjoot on January 18, 2011



Still covering chapter 1 material from the text [1].

Finished covering Professor Poppitz’s lecture notes: invariance of finite intervals (25-26).

Started covering Professor Poppitz’s lecture notes: analogy with rotations and derivation of Lorentz transformations (27-32); Minkowski space diagram of boosted frame (32.1); using the diagram to find length contraction (32.2) ; nonrelativistic limit of boosts (33).

More spacetime geometry.

PICTURE: ct,x curvy worldline with tangent vector \mathbf{v}.

In an inertial frame moving with \mathbf{v}, whose origin coincides with the momentary position of this moving observer, we have ds^2 = c^2 {dt'}^2 = c^2 dt^2 - d\mathbf{r}^2

“proper time” is

\begin{aligned}dt' = dt \sqrt{ 1 - \frac{1}{{c^2}} \left( \frac{d\mathbf{r}}{dt} \right)^2 } = dt \sqrt{ 1 - \frac{\mathbf{v}^2}{c^2}} \end{aligned} \hspace{\stretch{1}}(2.1)

We see that dt' < dt whenever \mathbf{v}^2 > 0, since then \sqrt{1-\mathbf{v}^2/c^2} < 1.

In a manifestly invariant way we define the proper time as

\begin{aligned}d\tau \equiv \frac{ds}{c}\end{aligned} \hspace{\stretch{1}}(2.2)

So that between worldpoints a and b the proper time is a line integral over the worldline

\begin{aligned}\tau = \frac{1}{{c}} \int_a^b ds.\end{aligned} \hspace{\stretch{1}}(2.3)

PICTURE: We are splitting up the worldline into many small pieces and summing them up.
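That "many small pieces" picture translates directly into a sum. Here is a rough sketch that adds up dt \sqrt{1 - \mathbf{v}^2/c^2} from 2.1 along a worldline with a midpoint rule. The function name `proper_time`, the step count, and the two example worldlines (sitting still, and an out-and-back trip at 0.8 c) are all my own choices for illustration, not anything from the lecture.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def proper_time(velocity, t_start, t_end, steps=10_000):
    """Approximate tau = integral of dt * sqrt(1 - v(t)^2/c^2), midpoint rule.

    velocity is a function of coordinate time t returning a speed in m/s.
    """
    dt = (t_end - t_start) / steps
    total = 0.0
    for n in range(steps):
        t_mid = t_start + (n + 0.5) * dt
        v = velocity(t_mid)
        total += math.sqrt(1.0 - (v / C) ** 2) * dt
    return total


# Straight worldline: at rest for 10 seconds of coordinate time.
at_rest = proper_time(lambda t: 0.0, 0.0, 10.0)

# Bent worldline: out at 0.8 c for 5 seconds, then back at 0.8 c.
out_and_back = proper_time(lambda t: 0.8 * C if t < 5.0 else -0.8 * C,
                           0.0, 10.0)

if __name__ == "__main__":
    print(at_rest, out_and_back)
```

With these inputs the moving worldline accumulates less proper time than the stationary one between the same two worldpoints, since every piece is weighted by a factor no larger than one.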

HOLE IN LECTURE NOTES: ON PROPER TIME for “length” of straight vs. curved worldlines: TO BE REVISITED. Prof. Poppitz promised to revisit this again next time … his notes are confusing him, and he’d like to move on.

Finite interval invariance.

Tomorrow we are going to complete the proof about invariance. We’ve shown that light like intervals are invariant, and that infinitesimal intervals are invariant. We need to put these pieces together for finite intervals.

Deriving the Lorentz transformation.

Let’s find the coordinate transforms that leave s_{12}^2 invariant. This generalizes Galileo’s transformations.

We’d like to generalize rotations, which leave spatial distance invariant. Such a transformation also leaves the spacetime interval invariant.

In Euclidean space we can generate an arbitrary rotation by composition of rotations in the xy, yz, and zx planes.

For 4D Euclidean space we would form any rotation by composition of the 6 independent rotations for the 6 available planes. For example with x,y,z,w axes we can rotate in any of the xy, xz, xw, yz, yw, zw planes.

For spacetime we can “rotate” in x,t, y,t, z,t “planes”. Physically this is motion space (boosting a position).

Consider a x,t transformation.

The trick (that is in the notes) is to analytically continue the time coordinate. Start with the interval

\begin{aligned}ds^2 = c^2 dt^2 - dx^2\end{aligned} \hspace{\stretch{1}}(4.4)

and write

\begin{aligned}t \rightarrow i \tau,\end{aligned} \hspace{\stretch{1}}(4.5)

so that the interval becomes

\begin{aligned}ds^2 = - (c^2 d\tau^2 + dx^2)\end{aligned} \hspace{\stretch{1}}(4.6)

Now we have a structure that is familiar, and we can rotate as we normally do. Prof does not want to go through the details of this “trickery” in class, but says to see the notes. The end result is that we can transform as follows

\begin{aligned}x' &= x \cosh \psi + ct \sinh \psi \\ ct' &= x \sinh \psi + ct \cosh \psi \end{aligned} \hspace{\stretch{1}}(4.7)

which is analogous to a spatial rotation

\begin{aligned}x' &= x \cos \alpha + y \sin \alpha \\ y' &= -x \sin \alpha + y \cos \alpha \end{aligned} \hspace{\stretch{1}}(4.9)

There are some differences in sign as well, but the important feature to recall is that \cosh^2 x - \sinh^2 x = (1/4)( e^{2x} + e^{-2x} + 2 - e^{2x} - e^{-2x} + 2 ) = 1. We call these hyperbolic rotations, something that is simply a mathematical transformation. Now we want to relate this to something physical.

Q: What is \psi?

The origin of O has coordinates (t, \mathbf{0}) in the O frame.

PICTURE (pg 32): O' frame translating along x axis with speed v_x. We have

\begin{aligned}\frac{x'}{c t'} = \frac{v_x}{c}\end{aligned} \hspace{\stretch{1}}(4.11)

However, using 4.7 we have for the origin

\begin{aligned}x' &= ct \sinh \psi \\ ct' &= ct \cosh \psi\end{aligned} \hspace{\stretch{1}}(4.12)

so that

\begin{aligned}\frac{x'}{c t'} = \tanh \psi = \frac{v_x}{c}\end{aligned} \hspace{\stretch{1}}(4.14)


\begin{aligned}\cosh \psi &= \frac{1}{{\sqrt{1 - \tanh^2 \psi}}} \\ \sinh \psi &= \frac{\tanh \psi}{\sqrt{1 - \tanh^2 \psi}}\end{aligned} \hspace{\stretch{1}}(4.15)

Performing all the gory substitutions one gets

\begin{aligned}x' &= \frac{1}{{\sqrt{1 - v_x^2/c^2}}} x+\frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} c t \\ y' &= y \\ z' &= z \\ ct' &= \frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} x+\frac{1}{{\sqrt{1 - v_x^2/c^2}}} c t\end{aligned} \hspace{\stretch{1}}(4.17)
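The "gory substitutions" can be sanity checked numerically. This sketch applies the hyperbolic rotation 4.7 with \psi = \tanh^{-1}(v_x/c) from 4.14, and compares against the square root form 4.17. The function names are mine, and I work with \beta = v_x/c and with ct rather than t so that no value of c is needed.

```python
import math


def hyperbolic_boost(x, ct, beta):
    """Apply 4.7 with tanh(psi) = beta = v_x / c (4.14)."""
    psi = math.atanh(beta)
    x_new = x * math.cosh(psi) + ct * math.sinh(psi)
    ct_new = x * math.sinh(psi) + ct * math.cosh(psi)
    return x_new, ct_new


def sqrt_form_boost(x, ct, beta):
    """Apply 4.17, written with the explicit 1/sqrt(1 - beta^2) factors."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * x + g * beta * ct, g * beta * x + g * ct


def interval(x, ct):
    """(ct)^2 - x^2, the quantity the transformation should preserve."""
    return ct * ct - x * x


if __name__ == "__main__":
    x, ct, beta = 1.5, 2.0, 0.6
    print(hyperbolic_boost(x, ct, beta))
    print(sqrt_form_boost(x, ct, beta))
    print(interval(x, ct), interval(*hyperbolic_boost(x, ct, beta)))
```

Both forms should agree to roundoff, and the interval should come out the same before and after, which is really just \cosh^2 \psi - \sinh^2 \psi = 1 at work.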

PICTURE: Let us go to the more conventional case, where O is at rest and O' is moving with velocity v_x.

We achieve this by simply changing the sign of v_x in 4.17 above. This gives us

\begin{aligned}x' &= \frac{1}{{\sqrt{1 - v_x^2/c^2}}} x-\frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} c t \\ y' &= y \\ z' &= z \\ ct' &= -\frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} x+\frac{1}{{\sqrt{1 - v_x^2/c^2}}} c t\end{aligned} \hspace{\stretch{1}}(4.21)

We want some shorthand to make this easier to write and introduce

\begin{aligned}\gamma = \frac{1}{{\sqrt{1 - v_x^2/c^2}}},\end{aligned} \hspace{\stretch{1}}(4.25)

so that 4.21 becomes

\begin{aligned}x' &=  \gamma \left( x - \frac{v_x}{c} ct \right) \\ ct' &=  \gamma \left( ct - \frac{v_x}{c} x \right)\end{aligned} \hspace{\stretch{1}}(4.26)

We started the class by saying these would generalize the Galilean transformations. Observe that if we take c \rightarrow \infty, we have \gamma \rightarrow 1 and

\begin{aligned}x' &= x - v_x t + O((v_x/c)^2) \\ t' &= t  + O(v_x/c)\end{aligned} \hspace{\stretch{1}}(4.28)

This is how to remember the signs. We want things to match up with the non-relativistic limit.
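To see that limit happen with numbers, here is a small sketch comparing 4.26 (rewritten in terms of t instead of ct) against the Galilean x' = x - v t, t' = t, for increasingly large values of c. The sample values of x, t, v and c are arbitrary choices of mine, in arbitrary units.

```python
import math


def lorentz_boost(x, t, v, c):
    """4.26 expressed with t rather than ct."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)


def galilean_boost(x, t, v):
    """Non-relativistic transformation."""
    return x - v * t, t


def boost_error(x, t, v, c):
    """Absolute differences (in x', in t') between the two transformations."""
    xl, tl = lorentz_boost(x, t, v, c)
    xg, tg = galilean_boost(x, t, v)
    return abs(xl - xg), abs(tl - tg)


if __name__ == "__main__":
    for c in (1e2, 1e4, 1e6):
        print(c, boost_error(1.0, 2.0, 3.0, c))
```

The differences shrink as c grows, which is the matching with the non-relativistic limit that fixes the signs.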

Q: What do lines of constant x' and ct' look like on the x,ct spacetime diagram?

Our starting point (again) is

\begin{aligned}x' &=  \gamma \left( x - \frac{v_x}{c} ct \right) \\ ct' &=  \gamma \left( ct - \frac{v_x}{c} x \right).\end{aligned} \hspace{\stretch{1}}(4.29)

What are the points with x' = 0? Those are the points where x = (v_x/c) c t. This is the ct' axis, the straight worldline of the O' origin.

PICTURE: worldline of O' origin.

What are the points with ct' = 0? Those are the points where c t = x v_x/c. This is the x' axis.

Lines that are parallel to the x' axis are lines of constant t', and lines parallel to the ct' axis are lines of constant x', but the light cone is the same for both.

What is this good for?

We have time to pick from either length contraction or non-causality (how to kill your grandfather). How about length contraction. We can use the diagram to read the x or ct coordinates, or examine causality, but it is hard to read off t' or x' coordinates.


[1] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Lecture 2 (Taught by Prof. Erich Poppitz). Spacetime, events, worldlines, proper time, invariance.

Posted by peeterjoot on January 12, 2011



No reading from [1] appears to have been assigned, but relevant stuff can be found in chapter 1.

From Professor Poppitz’s lecture notes, we have reading: pp.12-26: spacetime, spacetime points, worldlines, interval (12-14); invariance of infinitesimal intervals (15-17); geometry of spacetime, lightlike, spacelike, timelike intervals, and worldlines (18-22); proper time (23-24); invariance of finite intervals (25-26).

Followup for questions from last lecture.

Yes, the speed of light is different in media. For example, the speed of light in water is about 3/4 of the vacuum speed, because of the index of refraction of water. Also note that we can have effects like an electron moving through water faster than light travels in water, which then continuously emits light. This is called Cerenkov radiation.

Einstein’s relativity principle


1. Replace Galilean transformations between coordinates in different inertial frames with Lorentz transformations between (\mathbf{x}, t). Postulate that these constitute the symmetries of physics. Recall that Galilean transformations are symmetries of the laws of non-relativistic physics.

A comment was made that the symmetries impose the dynamics, and that the symmetries provided the form of the Lagrangian in classical physics. Go back and revisit this.

2. Speed of light c is the same in all inertial frames. Phrased in this form, relativity leads to “relativity of simultaneity”.

PICTURE: Three people on a platform, at positions 1,3,2, all with equidistant separation. This stationary frame is labeled O. 1 and 2 flash light signals at the same time and in frame O the reception of the light signal by 3 is observed as arriving at 3 simultaneously.

Now introduce a moving frame with origin O' moving along the positive x axis. To a stationary observer in O' the three guys are seen to be moving in the -x direction. The middle guy (3) is eventually going to be seen to receive the light signal by this O' observer, but less time is required for the light to get from 1 to 3 (3 is moving towards that light), and more time is required for the light to get from 2 to 3 (3 is moving away from the light according to the O' observer). Because the speed of light is the same for all observers, and both signals still reach 3 together, the O' observer must conclude that the two signals were not sent at the same time.

This is very non-intuitive since we are implicitly trained by our surroundings that Galilean transformations govern mechanical behavior.

In O, 1 and 2 send light signals simultaneously while in O' 1 sends light later than 2. The conclusion, rather surprisingly compared to intuition, is that simultaneity is relative.
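Jumping ahead of the lecture a bit, this can be checked with numbers if we assume the Lorentz transformation t' = \gamma (t - \beta x/c) for a frame O' moving along +x (that transformation has not been derived at this point in the course, so treat it as borrowed). The sketch also assumes that 1 sits at x = -d and 2 at x = +d with 3 at the origin, and uses units with c = 1. The names are mine.

```python
import math


def primed_time(t, x, beta, c=1.0):
    """t' = gamma * (t - beta * x / c) for O' moving along +x at speed beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (t - beta * x / c)


d = 1.0      # assumed distance from 3 to each of 1 and 2
beta = 0.5   # assumed speed of O' relative to O, as a fraction of c

# Both flashes happen at t = 0 in O.
t1_prime = primed_time(0.0, -d, beta)  # flash from 1, at x = -d
t2_prime = primed_time(0.0, +d, beta)  # flash from 2, at x = +d

if __name__ == "__main__":
    print(t1_prime, t2_prime)
```

With these assumptions t1_prime comes out positive and t2_prime negative, so in O' the flash from 1 happens after the flash from 2, in agreement with the statement above.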



We will need to develop some tools to work with these concepts in a concrete fashion. It is convenient to combine space \mathbb{R}^{3} and time \mathbb{R}^{1} into a 4d “spacetime”. In [1] this is called fictitious spacetime for reasons that are not clear. Points in this space are also called “events”, or “spacetime points”, or “world point”. The “world line” is the trajectory for a particle in spacetime.

PICTURE: \mathbb{R}^{3} represented as a plane, and t up. For every point we can plot an \mathbf{x}(t) in this combined space.

Spacetime intervals for light like behaviour.

Consider two frames, one moving along the x-axis at a (constant) rate not yet specified.

“events” have coordinates (t, \mathbf{x}) in O and (t', \mathbf{x}') in O'. Because we now have to model the mathematics without a notion of simultaneity, we must now also introduce different time coordinates t, and t' in the two frames.

Let’s imagine that at time t_1 light is emitted at \mathbf{x}_1, and at time t_2 this light is absorbed at \mathbf{x}_2. Our space time events are then (t_1, \mathbf{x}_1) and (t_2, \mathbf{x}_2). In the O frame, the light will go a distance c(t_2 - t_1). This same distance can also be expressed as

\begin{aligned}\sqrt{ (\mathbf{x}_1 - \mathbf{x}_2)^2}.\end{aligned} \hspace{\stretch{1}}(5.1)

These are equal. It is convenient to work without the square roots, so we write

\begin{aligned}(\mathbf{x}_1 - \mathbf{x}_2)^2 = c^2 (t_2 - t_1)^2\end{aligned} \hspace{\stretch{1}}(5.2)


\begin{aligned}c^2 (t_2 - t_1)^2 - (\mathbf{x}_1 - \mathbf{x}_2)^2 =c^2 (t_2 - t_1)^2 - (x_1 - x_2)^2- (y_1 - y_2)^2- (z_1 - z_2)^2 = 0.\end{aligned} \hspace{\stretch{1}}(5.3)

We can repeat the same argument for the primed frame. In this frame, at time t_1' light is emitted at \mathbf{x}_1', and at time t_2' this light is absorbed. Our space time events in this frame are then (t_1', \mathbf{x}_1') and (t_2', \mathbf{x}_2'). As above, in this O' frame, the light will go a distance c(t_2' - t_1'), with a similar Euclidean distance involving \mathbf{x}_1' and \mathbf{x}_2'. That is

\begin{aligned}c^2 (t_2' - t_1')^2 - (\mathbf{x}_1' - \mathbf{x}_2')^2 =c^2 (t_2' - t_1')^2 - (x_1' - x_2')^2- (y_1' - y_2')^2- (z_1' - z_2')^2 = 0.\end{aligned} \hspace{\stretch{1}}(5.4)

We get zero for this quantity in any inertial frame. This quantity is found to be very important, and we want to give it a label. We call this the “interval”, or the “spacetime interval”, and write this as follows:

\begin{aligned}s_{12}^2 = c^2 (t_2 - t_1)^2 - (\mathbf{r}_2 - \mathbf{r}_1)^2\end{aligned} \hspace{\stretch{1}}(5.5)

This is a quantity calculated between any two spacetime points with coordinates (t_2, \mathbf{r}_2) and (t_1, \mathbf{r}_1) in some frame.

So far we have argued that c being the same in any two frames implies that spacetime events “separated by a zero interval” in one frame are “separated by a zero interval” in any other frame.
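As a concrete instance, here is a sketch that evaluates 5.5 for a light signal in one frame, and again after a boost along x. The boost formula is an assumption borrowed from later in the course (it has not been derived yet at this point), units are chosen so that c = 1 by default, and the function names and sample events are mine.

```python
import math


def interval_squared(event_1, event_2, c=1.0):
    """s_12^2 from 5.5. Events are (t, x, y, z) tuples."""
    t1, *r1 = event_1
    t2, *r2 = event_2
    spatial = sum((b - a) ** 2 for a, b in zip(r1, r2))
    return c ** 2 * (t2 - t1) ** 2 - spatial


def boost_x(event, beta, c=1.0):
    """Assumed Lorentz boost along x with speed beta * c."""
    t, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (t - beta * x / c), gamma * (x - beta * c * t), y, z)


# Light emitted at the origin and absorbed a distance 5 away, 5 time units later.
emission = (0.0, 0.0, 0.0, 0.0)
absorption = (5.0, 3.0, 4.0, 0.0)

s2_unprimed = interval_squared(emission, absorption)
s2_primed = interval_squared(boost_x(emission, 0.6), boost_x(absorption, 0.6))

if __name__ == "__main__":
    print(s2_unprimed, s2_primed)
```

Both intervals should be zero up to roundoff, even though the individual time and position differences are not the same in the two frames.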

Invariance of infinitesimal intervals.

For events that are infinitesimally close to each other, i.e. t_2 - t_1 and \mathbf{r}_2 -\mathbf{r}_1 are small (infinitesimal), it is convenient to denote t_2 - t_1 and \mathbf{r}_2 - \mathbf{r}_1 by dt and d\mathbf{r} respectively. We can then define

\begin{aligned}ds_{12}^2 = c^2 dt^2 - d\mathbf{r}^2,\end{aligned} \hspace{\stretch{1}}(6.6)


\begin{aligned}ds= \sqrt{c^2 dt^2 - d\mathbf{r}^2}.\end{aligned} \hspace{\stretch{1}}(6.7)

We will use this a lot.

We have learned that if s_{12} = 0 in one frame, then s_{12}' = 0 in any other frame. We generally expect that there is a relation s_{12}' = F(s_{12}) between the intervals in two frames. So far we have learned that F(0) = 0.

Let’s now consider the case where both of these intervals are infinitesimal. Then we can write

\begin{aligned}ds_{12}' = F(ds_{12}) = F(0) + F'(0) ds_{12} + \cdots = F'(0) ds_{12} + \cdots.\end{aligned} \hspace{\stretch{1}}(6.8)

We will neglect terms O(ds_{12})^2 and higher. Thus equality of zero intervals between two frames implies that

\begin{aligned}ds_{12}' \propto ds_{12}.\end{aligned} \hspace{\stretch{1}}(6.9)

Now we must invoke an assumption (principle) of homogeneity of time and space and isotropy of space. This interval should not depend on where these events take place, or on the time that the measurements were performed. If this is the case then we conclude that the proportionality constant relating the two intervals is not a function of position or time. We argue that this proportionality can then only be a function of the (absolute) relative speed between the frames.

We write this as

\begin{aligned}ds_{12}' = F(v_{12}) ds_{12}\end{aligned} \hspace{\stretch{1}}(6.10)

This argument can be turned around and we say that ds_{12} = \tilde{F}(v_{12}) ds_{12}'. Thus \tilde{F} = F, because there is no distinction between O and O'. We want to conclude that

\begin{aligned}ds_{12} = \tilde{F}(v_{12}) ds_{12}' = \tilde{F}(v_{12}) F(v_{12}) ds_{12}\end{aligned} \hspace{\stretch{1}}(6.11)

and then conclude that F = \tilde{F} = 1. This argument is to be continued. To complete this conclusion we will need to perform some additional math, once we cover finite intervals.


[1] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.