Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘lorentz boost’

My first arxiv submission. Change of basis and Gram-Schmidt orthonormalization in special relativity

Posted by peeterjoot on April 29, 2011

Now that I have an academic email address I was able to make an arxiv submission (I’d tried previously and been auto-rejected) :

Change of basis and Gram-Schmidt orthonormalization in special relativity

This is based on a tutorial from our relativistic electrodynamics class, which covered non-inertial relativistic systems. Combining what I learned from that with some concepts I learned from ‘Geometric Algebra for Physicists’ (particularly reciprocal frames) I was able to write up some notes that took those ideas plus basic linear algebra (the Gram-Schmidt procedure) and apply them to relativity and/or non-orthonormal Euclidean bases. How to do projections onto non-orthonormal Euclidean bases isn’t taught in Algebra I, but once you figure that out, the same thing works for SR.

Will anybody read it? I don’t know … but I had fun writing it.

Posted in Math and Physics Learning.

Playing with complex notation for relativistic applications in a plane

Posted by peeterjoot on April 19, 2011


In the electrodynamics midterm we had a question on circular motion. This screamed for use of complex numbers to describe the spatial parts of the spacetime trajectories.

Let’s play with this a bit.

Our invariant.

Suppose we describe our spacetime point as a paired time and complex number

\begin{aligned}X = (ct, z).\end{aligned} \hspace{\stretch{1}}(2.1)

Our spacetime invariant interval in this form is thus

\begin{aligned}X^2 \equiv (ct)^2 - {\left\lvert{z}\right\rvert}^2.\end{aligned} \hspace{\stretch{1}}(2.2)

Not much different than the usual coordinate representation of the spatial coordinates, except that we have a {\left\lvert{z}\right\rvert}^2 replacing the usual \mathbf{x}^2.

Taking the spacetime distance between X and another point, say \tilde{X} = ( c \tilde{t}, \tilde{z}) motivates the inner product between two points in this representation

\begin{aligned}(X - \tilde{X})^2 &= (ct - c \tilde{t})^2 - {\left\lvert{z - \tilde{z}}\right\rvert}^2 \\ &= (ct - c \tilde{t})^2 - (z - \tilde{z})(z^{*} - \tilde{z}^{*}) \\ &= (ct)^2 - 2 (ct) (c \tilde{t}) + (c \tilde{t})^2 - {\left\lvert{z}\right\rvert}^2 - {\left\lvert{\tilde{z}}\right\rvert}^2 + (z \tilde{z}^{*} + z^{*} \tilde{z}) \\ &= X^2 + \tilde{X}^2 - 2 \left( (ct) (c \tilde{t}) - \frac{1}{{2}}(z \tilde{z}^{*} + z^{*} \tilde{z}) \right) \\ \end{aligned}

It’s clear that it makes sense to define

\begin{aligned}X \cdot \tilde{X} = (ct) (c \tilde{t}) - \text{Real} (z \tilde{z}^{*}),\end{aligned} \hspace{\stretch{1}}(2.3)

consistent with our original starting point

\begin{aligned}X^2 = X \cdot X.\end{aligned} \hspace{\stretch{1}}(2.4)

Let’s also introduce a complex inner product

\begin{aligned}{\langle{{z}}, {{\tilde{z}}}\rangle} \equiv \frac{1}{{2}} \left( z \tilde{z}^{*} + z^{*} \tilde{z} \right) = \text{Real} (z \tilde{z}^{*}).\end{aligned} \hspace{\stretch{1}}(2.5)

Our dot product can now be written

\begin{aligned}X \cdot \tilde{X} = (ct) (c \tilde{t}) - {\langle{{z}}, {{\tilde{z}}}\rangle}.\end{aligned} \hspace{\stretch{1}}(2.6)
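As a quick sanity check of 2.2 through 2.6, here is a minimal Python sketch using the built-in complex type. The function names `dot` and `interval`, and the sample points, are my own choices for illustration, not anything from the midterm.

```python
def dot(X, Y):
    """Dot product of 2.6 for points X = (ct, z) with complex spatial part z."""
    ct, z = X
    ct_tilde, z_tilde = Y
    return ct * ct_tilde - (z * z_tilde.conjugate()).real

def interval(X):
    """Invariant interval of 2.2: (ct)^2 - |z|^2."""
    ct, z = X
    return ct ** 2 - abs(z) ** 2

# Arbitrary sample points (assumed values).
X = (5.0, 3 + 4j)
Y = (2.0, 1 - 2j)
D = (X[0] - Y[0], X[1] - Y[1])  # the difference X - Y

print(interval(X), dot(X, X))
print(interval(D), interval(X) + interval(Y) - 2 * dot(X, Y))
```

Each printed pair should agree up to floating point rounding: the first is 2.4, the second is the expansion of (X - \tilde{X})^2 that motivated the dot product.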

Change of basis.

Our standard basis for our spatial components is \{1, i\}, but we are free to pick any other basis should we choose. In particular, if we rotate our basis counterclockwise by \phi, our new basis, still orthonormal, is \{ e^{i\phi}, i e^{i\phi} \}.

In any orthonormal basis the coordinates of a point with respect to that basis are real, so just as we can write

\begin{aligned}z = {\langle{{1}}, {{z}}\rangle} + i {\langle{{i}}, {{z}}\rangle},\end{aligned} \hspace{\stretch{1}}(3.7)

we can extract the coordinates in the rotated frame, also simply by taking inner products

\begin{aligned}z = e^{i \phi} {\langle{{e^{i \phi}}}, {{z}}\rangle} + i e^{i\phi} {\langle{{i e^{i\phi} }}, {{z}}\rangle}.\end{aligned} \hspace{\stretch{1}}(3.8)

The values {\langle{{e^{i \phi}}}, {{z}}\rangle}, and {\langle{{i e^{i\phi} }}, {{z}}\rangle} are the (real) coordinates of the point z in this rotated basis.
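The coordinate extraction of 3.8 is easy to try numerically. This is only a sketch: `inner` and `rotated_coordinates` are hypothetical helper names, and the point and angle are arbitrary.

```python
import math

def inner(a, b):
    """Real inner product <a, b> = Real(a b*) of 2.5."""
    return (a * b.conjugate()).real

def rotated_coordinates(z, phi):
    """Real coordinates of z in the rotated basis {e^{i phi}, i e^{i phi}}."""
    e1 = complex(math.cos(phi), math.sin(phi))
    e2 = 1j * e1
    return inner(e1, z), inner(e2, z)

z = 2 + 1j   # arbitrary point (assumed value)
phi = 0.3    # arbitrary rotation angle (assumed value)
u, v = rotated_coordinates(z, phi)

# Rebuild z from its rotated coordinates, as in 3.8.
e1 = complex(math.cos(phi), math.sin(phi))
z_rebuilt = e1 * u + 1j * e1 * v
print(abs(z_rebuilt - z))
```

The printed difference should be zero to within rounding, and `u` should equal x \cos\phi + y \sin\phi.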

This is enough that we can write the Lorentz boost immediately for a velocity \vec{v} = c \beta e^{i\phi} at an arbitrary angle \phi in the plane

\begin{aligned}\begin{bmatrix}ct' \\ {\langle{{e^{i\phi}}}, {{z'}}\rangle} \\ {\langle{{i e^{i\phi}}}, {{z'}}\rangle} \end{bmatrix}=\begin{bmatrix}\gamma & -\gamma \beta & 0 \\ -\gamma \beta & \gamma & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}ct \\ {\langle{{e^{i\phi}}}, {{z}}\rangle} \\ {\langle{{i e^{i\phi}}}, {{z}}\rangle} \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.9)

Let’s translate this to ct, x, y coordinates as a check. For the spatial component parallel to the boost direction we have

\begin{aligned}{\langle{{e^{i\phi}}}, {{x + iy}}\rangle} &= \text{Real} ( e^{-i\phi} (x + i y) ) \\ &= \text{Real} ( (\cos\phi - i \sin\phi)(x + i y) ) \\ &= x \cos\phi + y \sin\phi,\end{aligned}

and the perpendicular component is

\begin{aligned}{\langle{{ i e^{i\phi}}}, {{x + iy}}\rangle} &= \text{Real} ( -i e^{-i\phi} (x + i y) ) \\ &= \text{Real} ( (-i \cos\phi - \sin\phi)(x + i y) ) \\ &= -x \sin\phi + y \cos\phi.\end{aligned}

Grouping the two gives

\begin{aligned}\begin{bmatrix}{\langle{{e^{i\phi}}}, {{x + iy}}\rangle}  \\ {\langle{{i e^{i\phi}}}, {{x + iy}}\rangle} \end{bmatrix}=\begin{bmatrix}\cos\phi & \sin\phi \\ -\sin\phi & \cos\phi\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix}= R_{-\phi}\begin{bmatrix}x \\ y\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.10)

The boost equation in terms of the cartesian coordinates is thus

\begin{aligned}\begin{bmatrix}1 & 0 \\ 0 & R_{-\phi}\end{bmatrix}\begin{bmatrix}c t' \\ x' \\ y'\end{bmatrix}=\begin{bmatrix}\gamma & -\gamma \beta & 0 \\ -\gamma \beta & \gamma & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1 & 0 \\ 0 & R_{-\phi}\end{bmatrix}\begin{bmatrix}c t \\ x \\ y\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.11)


Writing

\begin{aligned}\begin{bmatrix}c t' \\ x' \\ y'\end{bmatrix}={\left\lVert{{\Lambda^\mu}_\nu}\right\rVert} \begin{bmatrix}c t \\ x \\ y\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(3.12)

the boost matrix {\left\lVert{{\Lambda^\mu}_\nu}\right\rVert} is found to be (after a bit of work)

\begin{aligned}{\left\lVert{{\Lambda^\mu}_\nu}\right\rVert} &=\begin{bmatrix}1 & 0 \\ 0 & R_{\phi}\end{bmatrix}\begin{bmatrix}\gamma & -\gamma \beta & 0 \\ -\gamma \beta & \gamma & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1 & 0 \\ 0 & R_{-\phi}\end{bmatrix} \\ &=\begin{bmatrix}\gamma & - \gamma \beta \cos\phi & -\gamma \beta \sin\phi \\ -\gamma \beta \cos\phi & \gamma \cos^2\phi + \sin^2 \phi & (\gamma -1) \sin\phi \cos\phi \\ -\gamma \beta \sin\phi & (\gamma -1) \sin\phi \cos\phi & \gamma \sin^2\phi + \cos^2\phi \\ \end{bmatrix}\end{aligned}

A final bit of regrouping gives

\begin{aligned}{\left\lVert{{\Lambda^\mu}_\nu}\right\rVert} =\begin{bmatrix}\gamma & - \gamma \beta \cos\phi & -\gamma \beta \sin\phi \\ -\gamma \beta \cos\phi & 1 + ( \gamma -1) \cos^2\phi & (\gamma -1) \sin\phi \cos\phi \\ -\gamma \beta \sin\phi & (\gamma -1) \sin\phi \cos\phi & 1 + (\gamma -1) \sin^2\phi \\ \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.13)

This is consistent with the result stated in [1], finishing the game for the day.
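For anybody who wants to play with 3.13, here is a small Python sketch that builds the matrix and checks that it preserves the interval. The names `boost_matrix`, `apply` and `interval3`, and the sample event, are assumptions of mine.

```python
import math

def boost_matrix(beta, phi):
    """Boost of 3.13 acting on (ct, x, y), for velocity c*beta at angle phi."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    c, s = math.cos(phi), math.sin(phi)
    return [
        [g, -g * beta * c, -g * beta * s],
        [-g * beta * c, 1 + (g - 1) * c * c, (g - 1) * s * c],
        [-g * beta * s, (g - 1) * s * c, 1 + (g - 1) * s * s],
    ]

def apply(M, X):
    """Matrix times column vector."""
    return [sum(M[i][j] * X[j] for j in range(3)) for i in range(3)]

def interval3(X):
    """(ct)^2 - x^2 - y^2."""
    ct, x, y = X
    return ct ** 2 - x ** 2 - y ** 2

X = [2.0, 0.5, -1.5]  # arbitrary event (assumed values)
X_prime = apply(boost_matrix(0.6, math.pi / 3), X)
print(interval3(X), interval3(X_prime))
```

The two printed intervals should match to rounding error, and with \phi = 0 the matrix reduces to the familiar boost along x.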


[1] Wikipedia. Lorentz transformation — wikipedia, the free encyclopedia [online]. 2011. [Online; accessed 20-April-2011].

Posted in Math and Physics Learning.

PHY450H1S (relativistic electrodynamics) Problem Set 3.

Posted by peeterjoot on March 2, 2011


This problem set is as yet ungraded (although only the second question will be graded).

Problem 1. Fun with \epsilon_{\alpha\beta\gamma}, \epsilon^{ijkl}, F_{ij}, and the duality of Maxwell’s equations in vacuum.

1. Statement. rank 3 spatial antisymmetric tensor identities.

Prove that

\begin{aligned}\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \nu \gamma}=\delta_{\alpha\mu} \delta_{\beta\nu}-\delta_{\alpha\nu} \delta_{\beta\mu}\end{aligned} \hspace{\stretch{1}}(2.1)

and use it to find the familiar relation for

\begin{aligned}(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D})\end{aligned} \hspace{\stretch{1}}(2.2)

Also show that

\begin{aligned}\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \beta \gamma}=2 \delta_{\alpha\mu}.\end{aligned} \hspace{\stretch{1}}(2.3)

(Einstein summation implied all throughout this problem).

1. Solution

We can explicitly expand the (implied) sum over the index \gamma. This is

\begin{aligned}\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \nu \gamma}=\epsilon_{\alpha \beta 1} \epsilon_{\mu \nu 1}+\epsilon_{\alpha \beta 2} \epsilon_{\mu \nu 2}+\epsilon_{\alpha \beta 3} \epsilon_{\mu \nu 3}\end{aligned} \hspace{\stretch{1}}(2.4)

For any \alpha \ne \beta only one term is non-zero. For example with \alpha,\beta = 2,3, we have just a contribution from the \gamma = 1 part of the sum

\begin{aligned}\epsilon_{2 3 1} \epsilon_{\mu \nu 1}.\end{aligned} \hspace{\stretch{1}}(2.5)

The value of this for (\mu,\nu) = (\alpha,\beta) is

\begin{aligned}(\epsilon_{2 3 1})^2\end{aligned} \hspace{\stretch{1}}(2.6)

whereas for (\mu,\nu) = (\beta,\alpha) we have

\begin{aligned}-(\epsilon_{2 3 1})^2\end{aligned} \hspace{\stretch{1}}(2.7)

Our sum has value one when (\alpha, \beta) matches (\mu, \nu), and value minus one for when (\mu, \nu) are permuted. We can summarize this, by saying that when \alpha \ne \beta we have

\begin{aligned}\boxed{\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \nu \gamma}=\delta_{\alpha\mu} \delta_{\beta\nu}-\delta_{\alpha\nu} \delta_{\beta\mu}.}\end{aligned} \hspace{\stretch{1}}(2.8)

However, observe that when \alpha = \beta the RHS is

\begin{aligned}\delta_{\alpha\mu} \delta_{\alpha\nu}-\delta_{\alpha\nu} \delta_{\alpha\mu} = 0,\end{aligned} \hspace{\stretch{1}}(2.9)

as desired, so this form works in general without any \alpha \ne \beta qualifier, completing this part of the problem.

For the dot product of the two cross products, expanding in coordinates and applying this identity gives

\begin{aligned}(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D})&=(\epsilon_{\alpha \beta \gamma} \mathbf{e}^\alpha A^\beta B^\gamma ) \cdot(\epsilon_{\mu \nu \sigma} \mathbf{e}^\mu C^\nu D^\sigma ) \\ &=\epsilon_{\alpha \beta \gamma} A^\beta B^\gamma \epsilon_{\alpha \nu \sigma} C^\nu D^\sigma \\ &=(\delta_{\beta \nu} \delta_{\gamma\sigma}-\delta_{\beta \sigma} \delta_{\gamma\nu} )A^\beta B^\gamma C^\nu D^\sigma \\ &=A^\nu B^\sigma C^\nu D^\sigma-A^\sigma B^\nu C^\nu D^\sigma.\end{aligned}

This gives us

\begin{aligned}\boxed{(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D})=(\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D})-(\mathbf{A} \cdot \mathbf{D})(\mathbf{B} \cdot \mathbf{C}).}\end{aligned} \hspace{\stretch{1}}(2.10)

We have one more identity to deal with.

\begin{aligned}\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \beta \gamma}\end{aligned} \hspace{\stretch{1}}(2.11)

We can expand out this (implied) sum slow and dumb as well

\begin{aligned}\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \beta \gamma}&=\epsilon_{\alpha 1 2} \epsilon_{\mu 1 2}+\epsilon_{\alpha 2 1} \epsilon_{\mu 2 1} \\ &+\epsilon_{\alpha 1 3} \epsilon_{\mu 1 3}+\epsilon_{\alpha 3 1} \epsilon_{\mu 3 1} \\ &+\epsilon_{\alpha 2 3} \epsilon_{\mu 2 3}+\epsilon_{\alpha 3 2} \epsilon_{\mu 3 2} \\ &=2 \epsilon_{\alpha 1 2} \epsilon_{\mu 1 2}+ 2 \epsilon_{\alpha 1 3} \epsilon_{\mu 1 3}+ 2 \epsilon_{\alpha 2 3} \epsilon_{\mu 2 3}\end{aligned}

Now, observe that for any \alpha \in (1,2,3) only one term of this sum is picked up. For example, with no loss of generality, pick \alpha = 1. We are left with only

\begin{aligned}2 \epsilon_{1 2 3} \epsilon_{\mu 2 3}\end{aligned} \hspace{\stretch{1}}(2.12)

This has the value

\begin{aligned}2 (\epsilon_{1 2 3})^2 = 2\end{aligned} \hspace{\stretch{1}}(2.13)

when \mu = \alpha and is zero otherwise. We can therefore summarize the evaluation of this sum as

\begin{aligned}\boxed{\epsilon_{\alpha \beta \gamma}\epsilon_{\mu \beta \gamma}=  2\delta_{\alpha\mu},}\end{aligned} \hspace{\stretch{1}}(2.14)

completing this problem.
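These identities are small enough to verify by brute force. The following Python sketch checks 2.1, 2.3 and 2.10 over all index values; the closed form used in `eps` and the sample vectors are my own choices, not part of the problem set.

```python
from itertools import product

def eps(i, j, k):
    """Rank 3 antisymmetric symbol; only index differences matter."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

R = range(1, 4)  # indexes 1, 2, 3 as in the text

identity_2_1 = all(
    sum(eps(a, b, g) * eps(m, n, g) for g in R)
    == delta(a, m) * delta(b, n) - delta(a, n) * delta(b, m)
    for a, b, m, n in product(R, repeat=4)
)
identity_2_3 = all(
    sum(eps(a, b, g) * eps(m, b, g) for b in R for g in R) == 2 * delta(a, m)
    for a, m in product(R, repeat=2)
)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def dot3(u, v):
    return sum(x * y for x, y in zip(u, v))

# Arbitrary integer vectors (assumed values) so the comparison is exact.
A, B, C, D = (1, 2, 3), (-1, 0, 4), (2, 5, -2), (0, 3, 1)
lhs = dot3(cross(A, B), cross(C, D))
rhs = dot3(A, C) * dot3(B, D) - dot3(A, D) * dot3(B, C)
print(identity_2_1, identity_2_3, lhs == rhs)
```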

2. Statement. Determinant of three by three matrix.

Prove that for any 3 \times 3 matrix {\left\lVert{A_{\alpha\beta}}\right\rVert}: \epsilon_{\mu\nu\lambda} A_{\alpha \mu} A_{\beta\nu} A_{\gamma\lambda} = \epsilon_{\alpha \beta \gamma} \text{Det} A and that \epsilon_{\alpha\beta\gamma} \epsilon_{\mu\nu\lambda} A_{\alpha \mu} A_{\beta\nu} A_{\gamma\lambda} = 6 \text{Det} A.

2. Solution

In class Simon showed us how the first identity can be arrived at using the triple product \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \text{Det}(\mathbf{a} \mathbf{b} \mathbf{c}). It occurred to me later that I’d seen the identity to be proven in the context of Geometric Algebra, but hadn’t recognized it in this tensor form. Basically, a wedge product can be expanded in sums of determinants, and when the number of vectors in the wedge product equals the dimension of the space, we have a pseudoscalar times the determinant of the components.

For example, in \mathbb{R}^{2}, let’s take the wedge product of a pair of vectors. As preparation for the relativistic \mathbb{R}^{4} case we won’t require an orthonormal basis, but express the vector in terms of a reciprocal frame and the associated components

\begin{aligned}a = a^i e_i = a_j e^j\end{aligned} \hspace{\stretch{1}}(2.15)


where

\begin{aligned}e^i \cdot e_j = {\delta^i}_j.\end{aligned} \hspace{\stretch{1}}(2.16)

When we get to the relativistic case, we can pick (but don’t have to) the standard basis

\begin{aligned}e_0 &= (1, 0, 0, 0) \\ e_1 &= (0, 1, 0, 0) \\ e_2 &= (0, 0, 1, 0) \\ e_3 &= (0, 0, 0, 1),\end{aligned} \hspace{\stretch{1}}(2.17)

for which our reciprocal frame is implicitly defined by the metric

\begin{aligned}e^0 &= (1, 0, 0, 0) \\ e^1 &= (0, -1, 0, 0) \\ e^2 &= (0, 0, -1, 0) \\ e^3 &= (0, 0, 0, -1).\end{aligned} \hspace{\stretch{1}}(2.21)

Anyways. Back to the problem. Let’s examine the \mathbb{R}^{2} case. Our wedge product in coordinates is

\begin{aligned}a \wedge b=a^i b^j (e_i \wedge e_j)\end{aligned} \hspace{\stretch{1}}(2.25)

Since there are only two basis vectors we have

\begin{aligned}a \wedge b=(a^1 b^2 - a^2 b^1) e_1 \wedge e_2 = \text{Det} {\left\lVert{a^i b^j}\right\rVert} (e_1 \wedge e_2).\end{aligned} \hspace{\stretch{1}}(2.26)

Our wedge product is a product of the determinant of the vector coordinates, times the \mathbb{R}^{2} pseudoscalar e_1 \wedge e_2.

This doesn’t look quite like the \mathbb{R}^{3} relation that we want to prove, which had an antisymmetric tensor factor for the determinant. Observe that we get the determinant by picking off the e_1 \wedge e_2 component of the bivector result (the only component in this case), and we can do that by dotting with e^2 \wedge e^1. To get an antisymmetric tensor times the determinant, we have only to dot with a different pseudoscalar (one that differs by a possible sign due to permutation of the indexes). That is

\begin{aligned}(e^t \wedge e^s) \cdot (a \wedge b)&=a^i b^j (e^t \wedge e^s) \cdot (e_i \wedge e_j) \\ &=a^i b^j\left( {\delta^{s}}_i {\delta^{t}}_j-{\delta^{t}}_i {\delta^{s}}_j  \right) \\ &=a^i b^j{\delta^{[t}}_j {\delta^{s]}}_i \\ &=a^i b^j{\delta^{t}}_{[j} {\delta^{s}}_{i]} \\ &=a^{[i} b^{j]}{\delta^{t}}_{j} {\delta^{s}}_{i} \\ &=a^{[s} b^{t]}\end{aligned}

Now, if we write a^i = A^{1 i} and b^j = A^{2 j} we have

\begin{aligned}(e^t \wedge e^s) \cdot (a \wedge b)=A^{1 s} A^{2 t} -A^{1 t} A^{2 s}\end{aligned} \hspace{\stretch{1}}(2.27)

We can write this in two different ways. One of which is

\begin{aligned}A^{1 s} A^{2 t} -A^{1 t} A^{2 s} =\epsilon^{s t} \text{Det} {\left\lVert{A^{ij}}\right\rVert}\end{aligned} \hspace{\stretch{1}}(2.28)

and the other of which is by introducing dummy indexes in place of 1 and 2, and summing antisymmetrically over these. That is

\begin{aligned}A^{1 s} A^{2 t} -A^{1 t} A^{2 s}=A^{a s} A^{b t} \epsilon_{a b}\end{aligned} \hspace{\stretch{1}}(2.29)

So, we have

\begin{aligned}\boxed{A^{a s} A^{b t} \epsilon_{a b} =A^{1 i} A^{2 j} {\delta^{[t}}_j {\delta^{s]}}_i =\epsilon^{s t} \text{Det} {\left\lVert{A^{ij}}\right\rVert},}\end{aligned} \hspace{\stretch{1}}(2.30)

This result holds regardless of the metric for the space, and does not require that we were using an orthonormal basis. When the metric is Euclidean and we have an orthonormal basis, then all the indexes can be dropped.

The \mathbb{R}^{3} and \mathbb{R}^{4} cases follow in exactly the same way, we just need more vectors in the wedge products.

For the \mathbb{R}^{3} case we have

\begin{aligned}(e^u \wedge e^t \wedge e^s) \cdot ( a \wedge b \wedge c)&=a^i b^j c^k(e^u \wedge e^t \wedge e^s) \cdot (e_i \wedge e_j \wedge e_k) \\ &=a^i b^j c^k{\delta^{[u}}_k{\delta^{t}}_j{\delta^{s]}}_i \\ &=a^{[s} b^t c^{u]}\end{aligned}

Again, with a^i = A^{1 i} and b^j = A^{2 j}, and c^k = A^{3 k} we have

\begin{aligned}(e^u \wedge e^t \wedge e^s) \cdot ( a \wedge b \wedge c)=A^{1 i} A^{2 j} A^{3 k}{\delta^{[u}}_k{\delta^{t}}_j{\delta^{s]}}_i\end{aligned} \hspace{\stretch{1}}(2.31)

and we can choose to write this in either form, resulting in the identity

\begin{aligned}\boxed{\epsilon^{s t u} \text{Det} {\left\lVert{A^{ij}}\right\rVert}=A^{1 i} A^{2 j} A^{3 k}{\delta^{[u}}_k{\delta^{t}}_j{\delta^{s]}}_i=\epsilon_{a b c} A^{a s} A^{b t} A^{c u}.}\end{aligned} \hspace{\stretch{1}}(2.32)

The \mathbb{R}^{4} case follows exactly the same way, and we have

\begin{aligned}(e^v \wedge e^u \wedge e^t \wedge e^s) \cdot ( a \wedge b \wedge c \wedge d)&=a^i b^j c^k d^l(e^v \wedge e^u \wedge e^t \wedge e^s) \cdot (e_i \wedge e_j \wedge e_k \wedge e_l) \\ &=a^i b^j c^k d^l{\delta^{[v}}_l{\delta^{u}}_k{\delta^{t}}_j{\delta^{s]}}_i \\ &=a^{[s} b^t c^{u} d^{v]}.\end{aligned}

This time with a^i = A^{0 i} and b^j = A^{1 j}, and c^k = A^{2 k}, and d^l = A^{3 l} we have

\begin{aligned}\boxed{\epsilon^{s t u v} \text{Det} {\left\lVert{A^{ij}}\right\rVert}=A^{0 i} A^{1 j} A^{2 k} A^{3 l}{\delta^{[v}}_l{\delta^{u}}_k{\delta^{t}}_j{\delta^{s]}}_i=\epsilon_{a b c d} A^{a s} A^{b t} A^{c u} A^{d v}.}\end{aligned} \hspace{\stretch{1}}(2.33)

This one is almost the identity to be established later in problem 1.4. We have only to raise and lower some indexes to get that one. Note that in the Minkowski standard basis above, because s, t, u, v must be a permutation of 0,1,2,3 for a non-zero result, we must have

\begin{aligned}\epsilon^{s t u v} = (-1)^3 (+1) \epsilon_{s t u v}.\end{aligned} \hspace{\stretch{1}}(2.34)

So raising and lowering the identity above gives us

\begin{aligned}-\epsilon_{s t u v} \text{Det} {\left\lVert{A_{ij}}\right\rVert}=\epsilon^{a b c d} A_{a s} A_{b t} A_{c u} A_{d v}.\end{aligned} \hspace{\stretch{1}}(2.35)

No sign changes were required for the indexes a, b, c, d, since they are paired.

Until we did the raising and lowering operations here, there was no specific metric required, so our first result 2.33 is the more general one.

There’s one more part to this problem, doing the antisymmetric sums over the indexes s, t, \cdots. For the \mathbb{R}^{2} case we have

\begin{aligned}\epsilon_{s t} \epsilon_{a b} A^{a s} A^{b t}&=\epsilon_{s t} \epsilon^{s t} \text{Det} {\left\lVert{A^{ij}}\right\rVert} \\ &=\left( \epsilon_{1 2} \epsilon^{1 2} +\epsilon_{2 1} \epsilon^{2 1} \right)\text{Det} {\left\lVert{A^{ij}}\right\rVert} \\ &=\left( 1^2 + (-1)^2\right)\text{Det} {\left\lVert{A^{ij}}\right\rVert}\end{aligned}

We conclude that

\begin{aligned}\boxed{\epsilon_{s t} \epsilon_{a b} A^{a s} A^{b t} = 2! \text{Det} {\left\lVert{A^{ij}}\right\rVert}.}\end{aligned} \hspace{\stretch{1}}(2.36)

For the \mathbb{R}^{3} case we have the same operation

\begin{aligned}\epsilon_{s t u} \epsilon_{a b c} A^{a s} A^{b t} A^{c u}&=\epsilon_{s t u} \epsilon^{s t u} \text{Det} {\left\lVert{A^{ij}}\right\rVert} \\ &=\left( \epsilon_{1 2 3} \epsilon^{1 2 3} +\epsilon_{1 3 2} \epsilon^{1 3 2} + \cdots\right)\text{Det} {\left\lVert{A^{ij}}\right\rVert} \\ &=(\pm 1)^2 (3!)\text{Det} {\left\lVert{A^{ij}}\right\rVert}.\end{aligned}

So we conclude

\begin{aligned}\boxed{\epsilon_{s t u} \epsilon_{a b c} A^{a s} A^{b t} A^{c u}= 3! \text{Det} {\left\lVert{A^{ij}}\right\rVert}.}\end{aligned} \hspace{\stretch{1}}(2.37)

It’s clear what the pattern is, and if we evaluate the sum of the antisymmetric tensor squares in \mathbb{R}^{4} we have

\begin{aligned}\epsilon_{s t u v} \epsilon_{s t u v}&=\epsilon_{0 1 2 3} \epsilon_{0 1 2 3}+\epsilon_{0 1 3 2} \epsilon_{0 1 3 2}+\epsilon_{0 2 1 3} \epsilon_{0 2 1 3}+ \cdots \\ &= (\pm 1)^2 (4!),\end{aligned}

So, for our SR case we have

\begin{aligned}\boxed{\epsilon_{s t u v} \epsilon_{a b c d} A^{a s} A^{b t} A^{c u} A^{d v}= 4! \text{Det} {\left\lVert{A^{ij}}\right\rVert}.}\end{aligned} \hspace{\stretch{1}}(2.38)

This was part of question 1.4, albeit in lower index form. Here since all indexes are matched, we have the same result without major change

\begin{aligned}\boxed{\epsilon^{s t u v} \epsilon^{a b c d} A_{a s} A_{b t} A_{c u} A_{d v}= 4! \text{Det} {\left\lVert{A_{ij}}\right\rVert}.}\end{aligned} \hspace{\stretch{1}}(2.39)

The main difference is that we are now taking the determinant of a lower index tensor.
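The metric independent forms of these identities (2.33 and 2.38, with both \epsilon factors treated as plain permutation symbols) can be checked by brute force. This is a minimal Python sketch with names (`sign`, `eps`, `contracted`) and a test matrix that I picked arbitrarily; indexes run over 0..3 and the first index of A labels the vector.

```python
from itertools import permutations, product
from math import factorial

def sign(p):
    """Sign of a permutation of 0..n-1, counted by inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def eps(idx):
    """Permutation symbol: zero when any index repeats."""
    return sign(idx) if len(set(idx)) == len(idx) else 0

def contracted(A, cols):
    """eps_{a b ...} A^{a s} A^{b t} ... for column indexes cols = (s, t, ...)."""
    total = 0
    for rows in permutations(range(len(A))):
        term = sign(rows)
        for r, c in zip(rows, cols):
            term *= A[r][c]
        total += term
    return total

# Arbitrary integer matrix (assumed values) so all comparisons are exact.
A = [[2, 0, 1, 3], [1, -1, 0, 2], [0, 4, 1, 1], [3, 1, -2, 0]]
n = len(A)
det_A = contracted(A, tuple(range(n)))

# 2.33: the single contraction is eps^{s t u v} times the determinant.
single = all(
    contracted(A, cols) == eps(cols) * det_A
    for cols in product(range(n), repeat=n)
)
# 2.38: contracting once more with eps gives n! times the determinant.
double = sum(eps(cols) * contracted(A, cols) for cols in product(range(n), repeat=n))
print(single, double == factorial(n) * det_A)
```

Swapping in a 2 by 2 or 3 by 3 matrix exercises 2.30, 2.32, 2.36 and 2.37 with the same code.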

3. Statement. Rotational invariance of 3D antisymmetric tensor

Use the previous results to show that \epsilon_{\mu\nu\lambda} is invariant under rotations.

3. Solution

We apply transformations to coordinates (and thus indexes) of the form

\begin{aligned}x_\mu \rightarrow O_{\mu\nu} x_\nu\end{aligned} \hspace{\stretch{1}}(2.40)

With our tensor transforming as its indexes, we have

\begin{aligned}\epsilon_{\mu\nu\lambda} \rightarrow \epsilon_{\alpha\beta\sigma} O_{\mu\alpha} O_{\nu\beta} O_{\lambda\sigma}.\end{aligned} \hspace{\stretch{1}}(2.41)

We’ve got 2.32, which after dropping the distinction between upper and lower indexes (we are in a Euclidean space) reads

\begin{aligned}\epsilon_{\mu \nu \lambda} \text{Det} {\left\lVert{A_{ij}}\right\rVert} = \epsilon_{\alpha \beta \sigma} A_{\alpha \mu} A_{\beta \nu} A_{\sigma \lambda}.\end{aligned} \hspace{\stretch{1}}(2.42)

Let A_{i j} = O_{j i}, which gives us

\begin{aligned}\epsilon_{\mu\nu\lambda} \rightarrow \epsilon_{\mu\nu\lambda} \text{Det} A^\text{T}\end{aligned} \hspace{\stretch{1}}(2.43)

but since \text{Det} A^\text{T} = \text{Det} O = 1 for a rotation, we have shown that \epsilon_{\mu\nu\lambda} is invariant under rotation.
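A numerical version of this argument is below, as a hedged sketch only. The rotation is built from two arbitrary angles of my choosing, and `transformed` is a hypothetical helper implementing 2.41.

```python
import math
from itertools import product

def eps(i, j, k):
    """Rank 3 antisymmetric symbol for indexes 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def transformed(O, m, n, l):
    """Right hand side of 2.41 for the matrix O."""
    return sum(
        eps(a, b, s) * O[m][a] * O[n][b] * O[l][s]
        for a, b, s in product(range(3), repeat=3)
    )

O = matmul(rot_z(0.7), rot_x(-1.1))  # arbitrary rotation (assumed angles)
max_err = max(
    abs(transformed(O, m, n, l) - eps(m, n, l))
    for m, n, l in product(range(3), repeat=3)
)
print(max_err)
```

The printed error should be at the level of floating point rounding. Replacing O with a reflection such as diag(-1, 1, 1) flips the sign of every component instead, which is the \text{Det} O = -1 case.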

4. Statement. Rotational invariance of 4D antisymmetric tensor

Use the previous results to show that \epsilon_{i j k l} is invariant under Lorentz transformations.

4. Solution

This follows the same way. We assume a transformation of coordinates of the following form

\begin{aligned}(x')^i &= {O^i}_j x^j \\ (x')_i &= {O_i}^j x_j,\end{aligned} \hspace{\stretch{1}}(2.44)

where the determinant of {O^i}_j = 1 (sanity check of sign: {O^i}_j = {\delta^i}_j).

Our antisymmetric tensor transforms as its coordinates individually

\begin{aligned}\epsilon_{i j k l} &\rightarrow \epsilon_{a b c d} {O_i}^a{O_j}^b{O_k}^c{O_l}^d \\ &= \epsilon^{a b c d} O_{i a}O_{j b}O_{k c}O_{l d} \\ \end{aligned}

Let P_{ij} = O_{ji}, and raise and lower all the indexes in 2.33 (as was done for 2.35) for

\begin{aligned}-\epsilon_{s t u v} \text{Det} {\left\lVert{P_{ij}}\right\rVert}=\epsilon^{a b c d} P_{a s} P_{b t} P_{c u} P_{d v}.\end{aligned} \hspace{\stretch{1}}(2.46)

We have

\begin{aligned}\epsilon_{i j k l} &\rightarrow \epsilon^{a b c d} P_{a i}P_{b j}P_{c k}P_{d l} \\ &=-\epsilon_{i j k l} \text{Det} {\left\lVert{P_{ij}}\right\rVert} \\ &=-\epsilon_{i j k l} \text{Det} {\left\lVert{O_{ij}}\right\rVert} \\ &=-\epsilon_{i j k l} \text{Det} {\left\lVert{g_{im} {O^m}_j }\right\rVert} \\ &=-\epsilon_{i j k l} (-1)(1) \\ &=\epsilon_{i j k l}\end{aligned}

Since \epsilon_{i j k l} = -\epsilon^{i j k l} both are therefore invariant under Lorentz transformation.
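The same kind of numerical check works in four dimensions. This sketch assumes the metric diag(1, -1, -1, -1), treats \epsilon_{a b c d} as the plain permutation symbol, and uses a boost along x composed with a rotation about z; all function names and parameter values are my own.

```python
import math
from itertools import permutations, product

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def sign(p):
    """Sign of a permutation of 0..3, counted by inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def eps4(idx):
    return sign(idx) if len(set(idx)) == 4 else 0

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

# {O^i}_j, and the lower index form {O_i}^j = g_{im} {O^m}_n g^{nj}.
O = matmul(rot_z(0.4), boost_x(0.6))
O_low = [[G[i][i] * O[i][j] * G[j][j] for j in range(4)] for i in range(4)]

def transformed(i, j, k, l):
    """eps_{a b c d} {O_i}^a {O_j}^b {O_k}^c {O_l}^d, summed over permutations."""
    return sum(
        sign(p) * O_low[i][p[0]] * O_low[j][p[1]] * O_low[k][p[2]] * O_low[l][p[3]]
        for p in permutations(range(4))
    )

max_err = max(abs(transformed(*idx) - eps4(idx)) for idx in product(range(4), repeat=4))
print(max_err)
```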

5. Statement. Sum of contracting symmetric and antisymmetric rank 2 tensors

Show that A^{ij} B_{ij} = 0 if A is symmetric and B is antisymmetric.

5. Solution

We swap indexes in B, switch dummy indexes, then swap indexes in A

\begin{aligned}A^{i j} B_{i j} &= -A^{i j} B_{j i} \\ &= -A^{j i} B_{i j} \\ &= -A^{i j} B_{i j} \\ \end{aligned}

Our result is the negative of itself, so must be zero.
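A tiny numerical illustration, using component arrays with arbitrary integer entries (assumed values) for a symmetric A and an antisymmetric B:

```python
# Symmetric A^{ij} and antisymmetric B_{ij}, stored as plain component arrays.
A = [[1, 2, 3, 4], [2, 5, 6, 7], [3, 6, 8, 9], [4, 7, 9, 10]]
B = [[0, 1, -2, 3], [-1, 0, 4, -5], [2, -4, 0, 6], [-3, 5, -6, 0]]

# Full contraction A^{ij} B_{ij}: each (i, j) term is cancelled by its (j, i) partner.
contraction = sum(A[i][j] * B[i][j] for i in range(4) for j in range(4))
print(contraction)
```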

6. Statement. Characteristic equation for the electromagnetic strength tensor

Show that P(\lambda) = \text{Det} {\left\lVert{F_{i j} - \lambda g_{i j}}\right\rVert} is invariant under Lorentz transformations. Consider the polynomial of P(\lambda), also called the characteristic polynomial of the matrix {\left\lVert{F_{i j}}\right\rVert}. Find the coefficients of the expansion of P(\lambda) in powers of \lambda in terms of the components of {\left\lVert{F_{i j}}\right\rVert}. Use the result to argue that \mathbf{E} \cdot \mathbf{B} and \mathbf{E}^2 - \mathbf{B}^2 are Lorentz invariant.

6. Solution

The invariance of the determinant

Let’s consider how any lower index rank 2 tensor transforms. Given a transformation of coordinates

\begin{aligned}(x^i)' &= {O^i}_j x^j \\ (x_i)' &= {O_i}^j x_j ,\end{aligned} \hspace{\stretch{1}}(2.47)

where \text{Det} {\left\lVert{ {O^i}_j }\right\rVert} = 1, and {O_i}^j = {O^m}_n g_{i m} g^{j n}. Let’s reflect briefly on why this determinant is unit valued. We have

\begin{aligned}(x^i)' (x_i)'= {O_i}^a x_a {O^i}_b x^b = x^b x_b,\end{aligned} \hspace{\stretch{1}}(2.49)

which implies that the transformation product is

\begin{aligned}{O_i}^a {O^i}_b = {\delta^a}_b,\end{aligned} \hspace{\stretch{1}}(2.50)

the identity matrix. The identity matrix has unit determinant, so we must have

\begin{aligned}1 = (\text{Det} \hat{G})^2 (\text{Det} {\left\lVert{ {O^i}_j }\right\rVert})^2.\end{aligned} \hspace{\stretch{1}}(2.51)

Since \text{Det} \hat{G} = -1 we have

\begin{aligned}\text{Det} {\left\lVert{ {O^i}_j }\right\rVert} = \pm 1,\end{aligned} \hspace{\stretch{1}}(2.52)

which is all that we can say about the determinant of this class of transformations by considering just invariance. If we restrict the transformations of coordinates to those of the same determinant sign as the identity matrix, we rule out reflections in time or space. This seems to be the essence of the SO(1,3) labeling.
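To make the \pm 1 statement concrete, here is a Python sketch, assuming the metric diag(1, -1, -1, -1). It checks that both a boost and a time reversal preserve the metric (written in matrix form as O^T G O = G), while only the boost has determinant +1. The helper names and the boost speed are my own choices.

```python
import math
from itertools import permutations

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def det(M):
    """Determinant by summing signed permutations (fine for 4x4)."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def preserves_metric(O, tol=1e-12):
    """True when O^T G O = G, the matrix form of interval invariance."""
    M = matmul(matmul(transpose(O), G), O)
    return all(abs(M[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

boost = boost_x(0.6)
time_reversal = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(preserves_metric(boost), det(boost))
print(preserves_metric(time_reversal), det(time_reversal))
```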

Why dwell on this? Well, I wanted to be clear on the conventions I’d chosen, since parts of the course notes used \hat{O} = {\left\lVert{O^{i j}}\right\rVert}, and X' = \hat{O} X, and gave that matrix unit determinant. That O^{i j} looks like it is equivalent to my {O^i}_j, except that the one in the course notes is loose when it comes to lower and upper indexes since it gives (x')^i = O^{i j} x^j.

I’ll write

\begin{aligned}\hat{O} = {\left\lVert{{O^i}_j}\right\rVert},\end{aligned} \hspace{\stretch{1}}(2.53)

and require this (not {\left\lVert{O^{i j}}\right\rVert}) to be the matrix with unit determinant. Having cleared the index upper and lower confusion I had trying to reconcile the class notes with the rules for index manipulation, let’s now consider the Lorentz transformation of a lower index rank 2 tensor (not necessarily antisymmetric or symmetric).

We have, transforming in the same fashion as a lower index coordinate four vector (but twice, once for each index)

\begin{aligned}A_{i j} \rightarrow A_{k m} {O_i}^k{O_j}^m.\end{aligned} \hspace{\stretch{1}}(2.54)

The determinant of the transformation tensor {O_i}^j is

\begin{aligned}\text{Det} {\left\lVert{ {O_i}^j }\right\rVert} = \text{Det} {\left\lVert{ g_{i m} {O^m}_n g^{n j} }\right\rVert} = (\text{Det} \hat{G}) (1) (\text{Det} \hat{G} ) = (-1)^2 (1) = 1.\end{aligned} \hspace{\stretch{1}}(2.55)

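The transformed matrix is the product of {\left\lVert{A_{k m}}\right\rVert} with two such unit determinant factors, so the determinant of a lower index rank 2 tensor is invariant under Lorentz transformation. Since g_{i j} is itself a lower index rank 2 tensor, this includes F_{i j} - \lambda g_{i j}, and thus our characteristic polynomial P(\lambda).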

Expanding the determinant.

Utilizing 2.39 we can now calculate the characteristic polynomial. This is

\begin{aligned}\text{Det} {\left\lVert{F_{ij} - \lambda g_{ij} }\right\rVert}&= \frac{1}{{4!}}\epsilon^{s t u v} \epsilon^{a b c d} (F_{ a s } - \lambda g_{a s}) (F_{ b t } - \lambda g_{b t}) (F_{ c u } - \lambda g_{c u}) (F_{ d v } - \lambda g_{d v}) \\ &=\frac{1}{{24}}\epsilon^{s t u v} \epsilon_{a b c d} ({F^a}_s - \lambda {g^a}_s) ({F^b}_t - \lambda {g^b}_t) ({F^c}_u - \lambda {g^c}_u) ({F^d}_v - \lambda {g^d}_v) \\ \end{aligned}

However, {g^a}_b = g_{b c} g^{a c}, or {\left\lVert{{g^a}_b}\right\rVert} = \hat{G}^2 = I. This means we have

\begin{aligned}{g^a}_b = {\delta^a}_b,\end{aligned} \hspace{\stretch{1}}(2.56)

and our determinant is reduced to

\begin{aligned}\begin{aligned}P(\lambda) &=\frac{1}{{24}}\epsilon^{s t u v} \epsilon_{a b c d} \Bigl({F^a}_s {F^b}_t - \lambda( {\delta^a}_s {F^b}_t + {\delta^b}_t {F^a}_s ) + \lambda^2 {\delta^a}_s {\delta^b}_t \Bigr) \\ &\times \qquad \qquad \Bigl({F^c}_u {F^d}_v - \lambda( {\delta^c}_u {F^d}_v + {\delta^d}_v {F^c}_u ) + \lambda^2 {\delta^c}_u {\delta^d}_v \Bigr) \end{aligned}\end{aligned} \hspace{\stretch{1}}(2.57)

Expanding this out, the coefficients of the powers of \lambda are

\begin{aligned}\lambda^0 &:\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} {F^a}_s {F^b}_t {F^c}_u {F^d}_v \\ \lambda^1 &:\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} \Bigl(- ({\delta^c}_u {F^d}_v + {\delta^d}_v {F^c}_u ) {F^a}_s {F^b}_t - ({\delta^a}_s {F^b}_t + {\delta^b}_t {F^a}_s ) {F^c}_u {F^d}_v \Bigr) \\ \lambda^2 &:\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} \Bigl({\delta^c}_u {\delta^d}_v {F^a}_s {F^b}_t +( {\delta^a}_s {F^b}_t + {\delta^b}_t {F^a}_s ) ( {\delta^c}_u {F^d}_v + {\delta^d}_v {F^c}_u ) + {\delta^a}_s {\delta^b}_t  {F^c}_u {F^d}_v \Bigr) \\ \lambda^3 &:\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} \Bigl(- ( {\delta^a}_s {F^b}_t + {\delta^b}_t {F^a}_s ) {\delta^c}_u {\delta^d}_v - {\delta^a}_s {\delta^b}_t  ( {\delta^c}_u {F^d}_v + {\delta^d}_v {F^c}_u ) \Bigr) \\ \lambda^4 &:\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} \Bigl({\delta^a}_s {\delta^b}_t {\delta^c}_u {\delta^d}_v \Bigr) \\ \end{aligned}

By 2.39 the \lambda^0 coefficient is just \text{Det} {\left\lVert{F_{i j}}\right\rVert}.

The \lambda^3 terms can be seen to be zero. For example, the first one is

\begin{aligned}-\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} {\delta^a}_s {F^b}_t {\delta^c}_u {\delta^d}_v &=-\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{s b u v} {F^b}_t \\ &=-\frac{3!}{{24}} \delta^{t}_b {F^b}_t \\ &=-\frac{1}{{4}} {F^b}_b \\ &=-\frac{1}{{4}} F^{bu} g_{ub} \\ &= 0,\end{aligned}

where the final equality to zero comes from summing a symmetric and antisymmetric product.

Similarly the \lambda coefficients can be shown to be zero. Again the first as a sample is

\begin{aligned}-\frac{1}{{24}} \epsilon^{s t u v} \epsilon_{a b c d} {\delta^c}_u {F^d}_v {F^a}_s {F^b}_t &=-\frac{1}{{24}} \epsilon^{u s t v} \epsilon_{u a b d} {F^d}_v {F^a}_s {F^b}_t  \\ &=-\frac{1}{{24}} \delta^{[s}_a\delta^{t}_b\delta^{v]}_d{F^d}_v {F^a}_s {F^b}_t  \\ &=-\frac{1}{{24}} {F^a}_{[a}{F^b}_{b}{F^d}_{d]}\end{aligned}

Disregarding the -1/24 factor, let’s just expand this antisymmetric sum

\begin{aligned}{F^a}_{[a}{F^b}_{b}{F^d}_{d]}&={F^a}_{a}{F^b}_{b}{F^d}_{d}+{F^a}_{d}{F^b}_{a}{F^d}_{b}+{F^a}_{b}{F^b}_{d}{F^d}_{a}-{F^a}_{a}{F^b}_{d}{F^d}_{b}-{F^a}_{d}{F^b}_{b}{F^d}_{a}-{F^a}_{b}{F^b}_{a}{F^d}_{d} \\ &={F^a}_{d}{F^b}_{a}{F^d}_{b}+{F^a}_{b}{F^b}_{d}{F^d}_{a} \\ \end{aligned}

The two terms retained above are the only ones without a zero {F^i}_i factor. Consider the first of these. Employing the metric tensor to raise indexes so that the antisymmetry of F^{ij} can be utilized, and then relabeling all the dummy indexes, we have

\begin{aligned}{F^a}_{d}{F^b}_{a}{F^d}_{b}&=F^{a u}F^{b v}F^{d w}g_{d u}g_{a v}g_{b w} \\ &=(-1)^3F^{u a}F^{v b}F^{w d}g_{d u}g_{a v}g_{b w} \\ &=-(F^{u a}g_{a v})(F^{v b}g_{b w} )(F^{w d}g_{d u})\\ &=-{F^u}_v{F^v}_w{F^w}_u\\ &=-{F^a}_b{F^b}_d{F^d}_a\\ \end{aligned}

This is just the negative of the second term in the sum, leaving us with zero.
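In matrix language the sums above are traces of powers of the mixed tensor {F^i}_j, so the cancellation can be spot checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes the F^{ij} matrix layout used in this post and a diagonal metric with signature (+,-,-,-), and the helper names (`field_tensor`, `mixed_tensor`, `matmul`, `trace`) and the sample field values are made up for illustration.

```python
def field_tensor(E, B):
    """Contravariant F^{ij}, using the matrix layout quoted in this post."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]

# Assumed metric: diagonal of g_{ij} with signature (+,-,-,-).
METRIC = (1.0, -1.0, -1.0, -1.0)

def mixed_tensor(E, B):
    """{F^i}_j = F^{ik} g_{kj}; a diagonal metric just rescales column j."""
    F = field_tensor(E, B)
    return [[F[i][j] * METRIC[j] for j in range(4)] for i in range(4)]

def matmul(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def trace(A):
    return sum(A[i][i] for i in range(4))

# Arbitrary sample fields (hypothetical values).
E = (0.3, -1.2, 0.7)
B = (1.1, 0.4, -0.5)
E2 = sum(x * x for x in E)
B2 = sum(x * x for x in B)

M = mixed_tensor(E, B)
M2 = matmul(M, M)
M3 = matmul(M2, M)

print("tr M                  =", trace(M))    # {F^a}_a
print("tr M^3                =", trace(M3))   # {F^a}_d {F^d}_b {F^b}_a
print("tr M^2 - 2(E^2 - B^2) =", trace(M2) - 2.0 * (E2 - B2))
```

All three printed numbers should vanish up to rounding. The last line is a check of {F^a}_b {F^b}_a against the field expression, which is the quantity that survives in the \lambda^2 coefficient.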

Finally, we have for the \lambda^2 coefficient (\times 24)

\begin{aligned}&\epsilon^{s t u v} \epsilon_{a b c d} \Bigl({\delta^c}_u {\delta^d}_v {F^a}_s {F^b}_t +{\delta^a}_s {F^b}_t {\delta^c}_u {F^d}_v +{\delta^b}_t {F^a}_s {\delta^d}_v {F^c}_u  \\ &\qquad +{\delta^b}_t {F^a}_s {\delta^c}_u {F^d}_v +{\delta^a}_s {F^b}_t {\delta^d}_v {F^c}_u + {\delta^a}_s {\delta^b}_t  {F^c}_u {F^d}_v \Bigr) \\ &=\epsilon^{s t u v} \epsilon_{a b u v}   {F^a}_s {F^b}_t +\epsilon^{s t u v} \epsilon_{s b u d}  {F^b}_t  {F^d}_v +\epsilon^{s t u v} \epsilon_{a t c v}  {F^a}_s  {F^c}_u  \\ &\qquad +\epsilon^{s t u v} \epsilon_{a t u d}  {F^a}_s  {F^d}_v +\epsilon^{s t u v} \epsilon_{s b c v}  {F^b}_t  {F^c}_u + \epsilon^{s t u v} \epsilon_{s t c d}    {F^c}_u {F^d}_v \\ &=\epsilon^{s t u v} \epsilon_{a b u v}   {F^a}_s {F^b}_t +\epsilon^{t v s u } \epsilon_{b d s u}  {F^b}_t  {F^d}_v +\epsilon^{s u t v} \epsilon_{a c t v}  {F^a}_s  {F^c}_u  \\ &\qquad +\epsilon^{s v t u} \epsilon_{a d t u}  {F^a}_s  {F^d}_v +\epsilon^{t u s v} \epsilon_{b c s v}  {F^b}_t  {F^c}_u + \epsilon^{u v s t} \epsilon_{c d s t}    {F^c}_u {F^d}_v \\ &=6\epsilon^{s t u v} \epsilon_{a b u v} {F^a}_s {F^b}_t  \\ &=6 (2){\delta^{[s}}_a{\delta^{t]}}_b{F^a}_s {F^b}_t  \\ &=12{F^a}_{[a} {F^b}_{b]}  \\ &=12( {F^a}_{a} {F^b}_{b} - {F^a}_{b} {F^b}_{a} ) \\ &=-12 {F^a}_{b} {F^b}_{a} \\ &=-12 F^{a b} F_{b a} \\ &=12 F^{a b} F_{a b}\end{aligned}

Therefore, our characteristic polynomial is

\begin{aligned}\boxed{P(\lambda) = \text{Det} {\left\lVert{F_{i j}}\right\rVert} + \frac{\lambda^2}{2} F^{a b} F_{a b} + \lambda^4.}\end{aligned} \hspace{\stretch{1}}(2.58)

Observe that in matrix form our strength tensors are

\begin{aligned}{\left\lVert{ F^{ij} }\right\rVert} &= \begin{bmatrix}0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0\end{bmatrix} \\ {\left\lVert{ F_{ij} }\right\rVert} &= \begin{bmatrix}0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.59)

From these we can compute F^{a b} F_{a b} easily by inspection

\begin{aligned}F^{a b} F_{a b} = 2 (\mathbf{B}^2 - \mathbf{E}^2).\end{aligned} \hspace{\stretch{1}}(2.61)

Computing the determinant is not so easy. The dumb and simple way of expanding by cofactors takes two pages, and yields eventually

\begin{aligned}\text{Det} {\left\lVert{ F^{i j} }\right\rVert} = (\mathbf{E} \cdot \mathbf{B})^2.\end{aligned} \hspace{\stretch{1}}(2.62)
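Rather than trusting two pages of cofactor algebra, both 2.61 and 2.62 can be spot checked numerically. This is a small sketch under the same assumptions as the matrices in 2.59 (metric signature (+,-,-,-)); the function names and the sample field values are hypothetical, and the determinant routine is a naive recursive cofactor expansion, which is fine for a 4x4 matrix.

```python
def field_tensor(E, B):
    """Contravariant F^{ij} as laid out in 2.59."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]

METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed diagonal of g_{ij}

def lower_indices(F_up):
    """F_{ij} = g_{ia} F^{ab} g_{bj} for a diagonal metric."""
    return [[METRIC[i] * F_up[i][j] * METRIC[j] for j in range(4)]
            for i in range(4)]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

E = (0.3, -1.2, 0.7)   # sample values only
B = (1.1, 0.4, -0.5)

F_up = field_tensor(E, B)
F_down = lower_indices(F_up)

# Full contraction F^{ab} F_{ab}, to compare with 2 (B^2 - E^2).
contraction = sum(F_up[a][b] * F_down[a][b] for a in range(4) for b in range(4))
print(contraction, 2.0 * (dot(B, B) - dot(E, E)))

# Determinant, to compare with (E . B)^2.
print(det(F_up), dot(E, B) ** 2)
```

Each printed pair should agree to rounding error. Since \hat{F} is antisymmetric, its determinant is a perfect square, which is consistent with the (\mathbf{E} \cdot \mathbf{B})^2 form.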

That supplies us with a relation for the characteristic polynomial in \mathbf{E} and \mathbf{B}

\begin{aligned}\boxed{P(\lambda) = (\mathbf{E} \cdot \mathbf{B})^2 + \lambda^2 (\mathbf{B}^2 - \mathbf{E}^2) + \lambda^4.}\end{aligned} \hspace{\stretch{1}}(2.63)

We found this in homework 2 for the special case where \mathbf{E} and \mathbf{B} were perpendicular. With that perpendicularity we can solve for the eigenvalues by inspection

\begin{aligned}\lambda \in \{ 0, 0, \pm \sqrt{ \mathbf{E}^2 - \mathbf{B}^2 } \},\end{aligned} \hspace{\stretch{1}}(2.64)

and were able to diagonalize the matrix {F^{i}}_j to solve the Lorentz force equation in parametric form. When {\left\lvert{\mathbf{E}}\right\rvert} > {\left\lvert{\mathbf{B}}\right\rvert} we had real eigenvalues, and an orthogonal diagonalization when \mathbf{B} = 0. For {\left\lvert{\mathbf{B}}\right\rvert} > {\left\lvert{\mathbf{E}}\right\rvert} we had two purely imaginary eigenvalues, and when \mathbf{E} = 0 this was a Hermitian diagonalization. For the general case, when neither \mathbf{E} nor \mathbf{B} was zero, things didn’t have the same nice closed form solution.

In general our eigenvalues are

\begin{aligned}\lambda = \pm \frac{1}{{\sqrt{2}}} \sqrt{ \mathbf{E}^2 - \mathbf{B}^2 \pm \sqrt{ (\mathbf{E}^2 - \mathbf{B}^2)^2 - 4 (\mathbf{E} \cdot \mathbf{B})^2 }}.\end{aligned} \hspace{\stretch{1}}(2.65)

For the purposes of this problem we really only wish to show that \mathbf{E} \cdot \mathbf{B} and \mathbf{E}^2 - \mathbf{B}^2 are Lorentz invariants. When \lambda = 0 we have P(\lambda) = (\mathbf{E} \cdot \mathbf{B})^2, a Lorentz invariant. This must mean that \mathbf{E} \cdot \mathbf{B} is itself a Lorentz invariant. Since that is invariant, and we require P(\lambda) to be invariant for any other possible values of \lambda, the difference \mathbf{E}^2 - \mathbf{B}^2 must also be Lorentz invariant.

7. Statement. Show that the pseudoscalar invariant has only boundary effects.

Use integration by parts to show that \int d^4 x \epsilon^{i j k l} F_{ i j } F_{ k l } only depends on the values of A^i(x) at the “boundary” of spacetime (e.g. the “surface” depicted on page 105 of the notes) and hence does not affect the equations of motion for the electromagnetic field.

7. Solution

This proceeds in a fairly straightforward fashion

\begin{aligned}\int d^4 x \epsilon^{i j k l} F_{ i j } F_{ k l }&=\int d^4 x \epsilon^{i j k l} (\partial_i A_j - \partial_j A_i) F_{ k l } \\ &=\int d^4 x \left( \epsilon^{i j k l} (\partial_i A_j) F_{ k l } -\epsilon^{j i k l} (\partial_i A_j) F_{ k l } \right) \\ &=2 \int d^4 x \epsilon^{i j k l} (\partial_i A_j) F_{ k l } \\ &=2 \int d^4 x \epsilon^{i j k l} \left( \frac{\partial {}}{\partial {x^i}}(A_j F_{ k l })-A_j \frac{\partial { F_{ k l } }}{\partial {x^i}}\right)\\ \end{aligned}

Now, observe that by the Bianchi identity, this second term is zero

\begin{aligned}\epsilon^{i j k l} \frac{\partial { F_{ k l } }}{\partial {x^i}}=-\epsilon^{j i k l} \partial_i F_{ k l } = 0\end{aligned} \hspace{\stretch{1}}(2.66)

Now we have a set of perfect differentials, and can integrate

\begin{aligned}\int d^4 x \epsilon^{i j k l} F_{ i j } F_{ k l }&= 2 \int d^4 x \epsilon^{i j k l} \frac{\partial {}}{\partial {x^i}}(A_j F_{ k l })\\ &= 2 \int dx^j dx^k dx^l\epsilon^{i j k l} {\left.{{(A_j F_{ k l })}}\right\vert}_{{\Delta x^i}}\\ \end{aligned}

We are left with only boundary contributions to the integral: terms evaluated on the three-volumes that bound the four-volume of the original integral.

8. Statement. Electromagnetic duality transformations.

Show that the Maxwell equations in vacuum are invariant under the transformation: F_{i j} \rightarrow \tilde{F}_{i j}, where \tilde{F}_{i j} = \frac{1}{{2}} \epsilon_{i j k l} F^{k l} is the dual electromagnetic stress tensor. Replacing F with \tilde{F} is known as “electric-magnetic duality”. Explain this name by considering the transformation in terms of \mathbf{E} and \mathbf{B}. Are the Maxwell equations with sources invariant under electric-magnetic duality transformations?

8. Solution

Let’s first consider the explanation of the name. First recall what the expansions are of F_{i j} and F^{i j} in terms of \mathbf{E} and \mathbf{B}. These are

\begin{aligned}F_{0 \alpha} &= \partial_0 A_\alpha - \partial_\alpha A_0 \\ &= -\frac{1}{{c}} \frac{\partial {A^\alpha}}{\partial {t}} - \frac{\partial {\phi}}{\partial {x^\alpha}} \\ &= E_\alpha\end{aligned}

with F^{0 \alpha} = -E^\alpha, and E^\alpha = E_\alpha.

The magnetic field components are

\begin{aligned}F_{\beta \alpha} &= \partial_\beta A_\alpha - \partial_\alpha A_\beta \\ &= -\partial_\beta A^\alpha + \partial_\alpha A^\beta \\ &= \epsilon_{\alpha \beta \sigma} B^\sigma\end{aligned}

with F^{\beta \alpha} = \epsilon^{\alpha \beta \sigma} B_\sigma and B_\sigma = B^\sigma.

Now let’s expand the dual tensors. These are

\begin{aligned}\tilde{F}_{0 \alpha} &=\frac{1}{{2}} \epsilon_{0 \alpha i j} F^{i j} \\ &=\frac{1}{{2}} \epsilon_{0 \alpha \beta \sigma} F^{\beta \sigma} \\ &=\frac{1}{{2}} \epsilon_{0 \alpha \beta \sigma} \epsilon^{\sigma \beta \mu} B_\mu \\ &=-\frac{1}{{2}} \epsilon_{0 \alpha \beta \sigma} \epsilon^{\mu \beta \sigma} B_\mu \\ &=-\frac{1}{{2}} (2!) {\delta_\alpha}^\mu B_\mu \\ &=- B_\alpha \\ \end{aligned}


\begin{aligned}\tilde{F}_{\beta \alpha} &=\frac{1}{{2}} \epsilon_{\beta \alpha i j} F^{i j} \\ &=\frac{1}{{2}} \left(\epsilon_{\beta \alpha 0 \sigma} F^{0 \sigma} +\epsilon_{\beta \alpha \sigma 0} F^{\sigma 0} \right) \\ &=\epsilon_{0 \beta \alpha \sigma} (-E^\sigma) \\ &=\epsilon_{\alpha \beta \sigma} E^\sigma\end{aligned}

Summarizing we have

\begin{aligned}F_{0 \alpha} &= E^\alpha \\ F^{0 \alpha} &= -E^\alpha \\ F^{\beta \alpha} &= F_{\beta \alpha} = \epsilon_{\alpha \beta \sigma} B^\sigma \\ \tilde{F}_{0 \alpha} &= - B_\alpha \\ \tilde{F}^{0 \alpha} &= B_\alpha \\ \tilde{F}_{\beta \alpha} &= \tilde{F}^{\beta \alpha} = \epsilon_{\alpha \beta \sigma} E^\sigma\end{aligned} \hspace{\stretch{1}}(2.67)

Is there a sign error in the \tilde{F}_{0 \alpha} = - B_\alpha result? Other than that we have the same sort of structure for the tensor with E and B switched around.

Let’s write these in matrix form, to compare

\begin{aligned}\begin{array}{l l l l}{\left\lVert{ \tilde{F}_{i j} }\right\rVert} &= \begin{bmatrix}0 & -B_x & -B_y & -B_z \\ B_x & 0 & -E_z & E_y \\ B_y & E_z & 0 & -E_x \\ B_z & -E_y & E_x & 0 \end{bmatrix} & {\left\lVert{ \tilde{F}^{i j} }\right\rVert} &= \begin{bmatrix}0 & B_x & B_y & B_z \\ -B_x & 0 & -E_z & E_y \\ -B_y & E_z & 0 & -E_x \\ -B_z & -E_y & E_x & 0 \end{bmatrix} \\ {\left\lVert{ F^{ij} }\right\rVert} &= \begin{bmatrix}0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0\end{bmatrix} & {\left\lVert{ F_{ij} }\right\rVert} &= \begin{bmatrix}0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0\end{bmatrix}.\end{array}\end{aligned} \hspace{\stretch{1}}(2.73)

From these we can see by inspection that we have

\begin{aligned}\tilde{F}^{i j} F_{ij} = \tilde{F}_{i j} F^{ij} = 4 (\mathbf{E} \cdot \mathbf{B})\end{aligned} \hspace{\stretch{1}}(2.74)
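The dual can also be built mechanically from its definition, which gives an independent check on the signs. The sketch below is an illustration only. It assumes \epsilon_{0123} = +1 for the lower-index symbol (the choice that reproduces \tilde{F}_{0 \alpha} = -B_\alpha), and the helper names `parity`, `dual_lower` and `field_tensor`, along with the sample field values, are hypothetical.

```python
from itertools import permutations

def field_tensor(E, B):
    """Contravariant F^{ij} in the layout used throughout this post."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]

def parity(perm):
    """Sign of a permutation of 0..n-1, found by sorting it with swaps."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def dual_lower(F_up):
    """dual_{ij} = (1/2) eps_{ijkl} F^{kl}, assuming eps_{0123} = +1."""
    dual = [[0.0] * 4 for _ in range(4)]
    for perm in permutations(range(4)):
        i, j, k, l = perm
        dual[i][j] += 0.5 * parity(perm) * F_up[k][l]
    return dual

E = (0.3, -1.2, 0.7)   # sample values only
B = (1.1, 0.4, -0.5)

F_up = field_tensor(E, B)
dual = dual_lower(F_up)

# The dual has the structure of F^{ij} with the roles of E and B exchanged.
print(dual == field_tensor(B, E))

# Full contraction of the dual with F^{ij}, to compare with 4 (E . B).
contraction = sum(dual[i][j] * F_up[i][j] for i in range(4) for j in range(4))
print(contraction, 4.0 * sum(e * b for e, b in zip(E, B)))
```

The first comparison shows the sense in which the duality swaps the electric and magnetic fields. With the opposite sign convention for \epsilon_{0123}, every entry of the dual changes sign.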

This is consistent with the stated result in [1] (except for a factor of c due to units differences), so it appears the signs above are all kosher.

Now, let’s see if the dual tensor satisfies the vacuum equations.

\begin{aligned}\partial_j \tilde{F}^{i j}&=\partial_j \frac{1}{{2}} \epsilon^{i j k l} F_{k l} \\ &=\frac{1}{{2}} \epsilon^{i j k l} \partial_j (\partial_k A_l - \partial_l A_k) \\ &=\frac{1}{{2}} \epsilon^{i j k l} \partial_j \partial_k A_l - \frac{1}{{2}} \epsilon^{i j l k} \partial_j \partial_k A_l \\ &=\frac{1}{{2}} (\epsilon^{i j k l} - \epsilon^{i j l k}) \partial_j \partial_k A_l \\ &=\epsilon^{i j k l} \partial_j \partial_k A_l \\ &= 0 \qquad\square\end{aligned}

So the first checks out, provided we have no sources. If we have sources, then we see here that Maxwell’s equations do not hold since this would imply that the four current density must be zero.

How about the Bianchi identity? That gives us

\begin{aligned}\epsilon^{i j k l} \partial_j \tilde{F}_{k l} &=\epsilon^{i j k l} \partial_j \frac{1}{{2}} \epsilon_{k l a b} F^{a b} \\ &=\frac{1}{{2}} \epsilon^{k l i j} \epsilon_{k l a b} \partial_j F^{a b} \\ &=\frac{1}{{2}} (2!) {\delta^i}_{[a} {\delta^j}_{b]} \partial_j F^{a b} \\ &=\partial_j (F^{i j} - F^{j i} ) \\ &=2 \partial_j F^{i j} .\end{aligned}

The factor of two is slightly curious. Is there a mistake above? If there is a mistake, it doesn’t change the fact that Maxwell’s equation

\begin{aligned}\partial_k F^{k i} = \frac{4 \pi}{c} j^i\end{aligned} \hspace{\stretch{1}}(2.75)

gives us zero for the Bianchi identity under the source free condition j^i = 0.

Problem 2. Transformation properties of \mathbf{E} and \mathbf{B}, again.

1. Statement

Use the form of F^{i j} from page 82 in the class notes, the transformation law for {\left\lVert{ F^{i j} }\right\rVert} given further down that same page, and the explicit form of the SO(1,3) matrix \hat{O} (say, corresponding to motion in the positive x_1 direction with speed v) to derive the transformation law of the fields \mathbf{E} and \mathbf{B}. Use the transformation law to find the electromagnetic field of a charged particle moving with constant speed v in the positive x_1 direction and check that the result agrees with the one that you obtained in Homework 2.

1. Solution

Given a transformation of coordinates

\begin{aligned}{x'}^i \rightarrow {O^i}_j x^j\end{aligned} \hspace{\stretch{1}}(3.76)

our rank 2 tensor F^{i j} transforms as

\begin{aligned}F^{i j} \rightarrow {O^i}_aF^{a b}{O^j}_b.\end{aligned} \hspace{\stretch{1}}(3.77)

Introducing matrices

\begin{aligned}\hat{O} &= {\left\lVert{{O^i}_j}\right\rVert} \\ \hat{F} &= {\left\lVert{F^{ij}}\right\rVert} = \begin{bmatrix}0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0\end{bmatrix} \end{aligned} \hspace{\stretch{1}}(3.78)

and noting that \hat{O}^\text{T} = {\left\lVert{{O^j}_i}\right\rVert}, we can express the electromagnetic strength tensor transformation as

\begin{aligned}\hat{F} \rightarrow \hat{O} \hat{F} \hat{O}^\text{T}.\end{aligned} \hspace{\stretch{1}}(3.80)

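The class notes use {x'}^i \rightarrow O^{ij} x^j, which violates our conventions on mixed upper and lower indexes, but the end result 3.80 is the same.

For a boost along the x_1 axis with rapidity \alpha, where \tanh\alpha = \beta = v/c, this matrix is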

\begin{aligned}{\left\lVert{{O^i}_j}\right\rVert} =\begin{bmatrix}\cosh\alpha & -\sinh\alpha & 0 & 0 \\ -\sinh\alpha & \cosh\alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.81)


With

\begin{aligned}C &= \cosh\alpha = \gamma \\ S &= -\sinh\alpha = -\gamma \beta,\end{aligned} \hspace{\stretch{1}}(3.82)

we can compute the transformed field strength tensor

\begin{aligned}\hat{F}' &=\begin{bmatrix}C & S & 0 & 0 \\ S & C & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}\begin{bmatrix}0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0\end{bmatrix} \begin{bmatrix}C & S & 0 & 0 \\ S & C & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix} \\ &=\begin{bmatrix}C & S & 0 & 0 \\ S & C & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}\begin{bmatrix}- S E_x        & -C E_x        & -E_y  & -E_z \\ C E_x          & S E_x         & -B_z  & B_y \\ C E_y + S B_z  & S E_y + C B_z & 0     & -B_x \\ C E_z - S B_y  & S E_z - C B_y & B_x   & 0 \end{bmatrix} \\ &=\begin{bmatrix}0 & -E_x & -C E_y - S B_z & - C E_z + S B_y \\ E_x & 0 & -S E_y - C B_z & - S E_z + C B_y \\ C E_y + S B_z & S E_y + C B_z & 0 & -B_x \\ C E_z - S B_y & S E_z - C B_y & B_x & 0\end{bmatrix} \\ &=\begin{bmatrix}0 & -E_x & -\gamma(E_y - \beta B_z) & - \gamma(E_z + \beta B_y) \\ E_x & 0 & - \gamma (-\beta E_y + B_z) & \gamma( \beta E_z + B_y) \\ \gamma (E_y - \beta B_z) & \gamma(-\beta E_y + B_z) & 0 & -B_x \\ \gamma (E_z + \beta B_y) & -\gamma(\beta E_z + B_y) & B_x & 0\end{bmatrix}.\end{aligned}

As a check we have the antisymmetry that is expected. There is also a regularity to the end result that is aesthetically pleasing, hinting that things are hopefully error free. In coordinates for \mathbf{E} and \mathbf{B} this is

\begin{aligned}E_x &\rightarrow E_x \\ E_y &\rightarrow \gamma ( E_y - \beta B_z ) \\ E_z &\rightarrow \gamma ( E_z + \beta B_y ) \\ B_x &\rightarrow B_x \\ B_y &\rightarrow \gamma ( B_y + \beta E_z ) \\ B_z &\rightarrow \gamma ( B_z - \beta E_y ) \end{aligned} \hspace{\stretch{1}}(3.84)

Writing \boldsymbol{\beta} = \mathbf{e}_1 \beta, we have

\begin{aligned}\boldsymbol{\beta} \times \mathbf{B} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \beta & 0 & 0 \\ B_x & B_y & B_z\end{vmatrix} = \mathbf{e}_2 (-\beta B_z) + \mathbf{e}_3( \beta B_y ),\end{aligned} \hspace{\stretch{1}}(3.90)

which puts us en route to a tidier vector form

\begin{aligned}E_x &\rightarrow E_x \\ E_y &\rightarrow \gamma ( E_y + (\boldsymbol{\beta} \times \mathbf{B})_y ) \\ E_z &\rightarrow \gamma ( E_z + (\boldsymbol{\beta} \times \mathbf{B})_z ) \\ B_x &\rightarrow B_x \\ B_y &\rightarrow \gamma ( B_y - (\boldsymbol{\beta} \times \mathbf{E})_y ) \\ B_z &\rightarrow \gamma ( B_z - (\boldsymbol{\beta} \times \mathbf{E})_z ).\end{aligned} \hspace{\stretch{1}}(3.91)

For a vector \mathbf{A}, write \mathbf{A}_\parallel = (\mathbf{A} \cdot \hat{\mathbf{v}})\hat{\mathbf{v}}, \mathbf{A}_\perp = \mathbf{A} - \mathbf{A}_\parallel, allowing a compact description of the field transformation

\begin{aligned}\mathbf{E} &\rightarrow \mathbf{E}_\parallel + \gamma \mathbf{E}_\perp + \gamma (\boldsymbol{\beta} \times \mathbf{B})_\perp \\ \mathbf{B} &\rightarrow \mathbf{B}_\parallel + \gamma \mathbf{B}_\perp - \gamma (\boldsymbol{\beta} \times \mathbf{E})_\perp.\end{aligned} \hspace{\stretch{1}}(3.97)
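The matrix product \hat{O} \hat{F} \hat{O}^\text{T} is easy to get wrong by hand, so here is a small numerical cross check of the component transformation against the direct tensor transformation, together with a check that \mathbf{E} \cdot \mathbf{B} and \mathbf{E}^2 - \mathbf{B}^2 are unchanged by the boost. It is only a sketch: the function names and the sample values of the fields and of \beta are made up for illustration.

```python
import math

def field_tensor(E, B):
    """Contravariant F^{ij} as in 3.78."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, -Bz, By],
        [Ey, Bz, 0.0, -Bx],
        [Ez, -By, Bx, 0.0],
    ]

def boost_matrix(beta):
    """Boost along x_1 as in 3.81: cosh(alpha) = gamma, sinh(alpha) = gamma beta."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def boost_via_tensor(E, B, beta):
    """Compute O F O^T and read the fields back out of the result."""
    O = boost_matrix(beta)
    F = field_tensor(E, B)
    Fp = [[sum(O[i][a] * F[a][b] * O[j][b] for a in range(4) for b in range(4))
           for j in range(4)] for i in range(4)]
    return (Fp[1][0], Fp[2][0], Fp[3][0]), (Fp[3][2], Fp[1][3], Fp[2][1])

def boost_via_components(E, B, beta):
    """The coordinate form of the transformation, for a boost along x_1."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, g * (Ey - beta * Bz), g * (Ez + beta * By))
    B_new = (Bx, g * (By + beta * Ez), g * (Bz - beta * Ey))
    return E_new, B_new

def invariants(E, B):
    """Return (E . B, E^2 - B^2)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(E, B), dot(E, E) - dot(B, B)

E = (0.3, -1.2, 0.7)   # sample values only
B = (1.1, 0.4, -0.5)
beta = 0.6

E_t, B_t = boost_via_tensor(E, B, beta)
E_c, B_c = boost_via_components(E, B, beta)
print(E_t, E_c)
print(B_t, B_c)
print(invariants(E, B), invariants(E_c, B_c))
```

The two routes should agree component by component, and the two invariants should print the same values before and after the boost.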

Now, we want to consider the field of a moving particle. In the particle’s (unprimed) rest frame the field due to its potential \phi = q/r is

\begin{aligned}\mathbf{E} &= \frac{q}{r^2} \hat{\mathbf{r}} \\ \mathbf{B} &= 0.\end{aligned} \hspace{\stretch{1}}(3.99)

Coordinates for a “stationary” observer, who sees this particle moving along the x-axis at speed v are related by a boost in the -v direction

\begin{aligned}\begin{bmatrix}ct' \\ x' \\ y' \\ z'\end{bmatrix}=\begin{bmatrix}\gamma & \gamma (v/c) & 0 & 0 \\ \gamma (v/c) & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}\begin{bmatrix}ct \\ x \\ y \\ z\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.101)

Therefore the fields in the observer frame will be

\begin{aligned}\mathbf{E}' &= \mathbf{E}_\parallel + \gamma \mathbf{E}_\perp - \gamma \frac{v}{c}(\mathbf{e}_1 \times \mathbf{B})_\perp = \mathbf{E}_\parallel + \gamma \mathbf{E}_\perp \\ \mathbf{B}' &= \mathbf{B}_\parallel + \gamma \mathbf{B}_\perp + \gamma \frac{v}{c}(\mathbf{e}_1 \times \mathbf{E})_\perp = \gamma \frac{v}{c}(\mathbf{e}_1 \times \mathbf{E})_\perp \end{aligned} \hspace{\stretch{1}}(3.102)

More explicitly with \mathbf{E} = \frac{q}{r^3}(x, y, z) this is

\begin{aligned}\mathbf{E}' &= \frac{q}{r^3}(x, \gamma y, \gamma z) \\ \mathbf{B}' &= \gamma \frac{q v}{c r^3} ( 0, -z, y )\end{aligned} \hspace{\stretch{1}}(3.104)
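A quick numerical sanity check of 3.104 is sketched below, including the v = 0 limit that would have caught a sign error. The function name `moving_charge_fields` and the sample numbers are hypothetical, and the field point is given in the rest frame (unprimed) coordinates, as in 3.104.

```python
import math

def moving_charge_fields(q, position, beta):
    """Observer frame fields of a charge moving along +x with speed beta c.

    `position` is the field point (x, y, z) in the charge's rest frame.
    """
    x, y, z = position
    r3 = (x * x + y * y + z * z) ** 1.5
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    # Rest frame Coulomb field; the rest frame magnetic field is zero.
    Ex, Ey, Ez = q * x / r3, q * y / r3, q * z / r3
    # Boost with velocity -beta e_1, so the cross product terms flip sign.
    E_obs = (Ex, gamma * Ey, gamma * Ez)
    B_obs = (0.0, -gamma * beta * Ez, gamma * beta * Ey)
    return E_obs, B_obs

q, position, beta = 1.0, (0.5, -0.3, 0.8), 0.6   # sample values only
E_obs, B_obs = moving_charge_fields(q, position, beta)

# The magnetic field should equal beta e_1 x E' for a uniformly moving charge.
cross = (0.0, -beta * E_obs[2], beta * E_obs[1])
print(B_obs, cross)

# In the v = 0 limit we should recover the plain Coulomb field, not its negative.
E_rest, B_rest = moving_charge_fields(q, position, 0.0)
print(E_rest, B_rest)
```

The v = 0 output has the same signs as the position vector, which is the check mentioned below for spotting an inverted field.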

Comparing to Problem 3 in Problem set 2, I see that this matches the result obtained by separately transforming the gradient, the time partial, and the scalar potential. Actually, if I am being honest, I see that I made a sign error in all the coordinates of \mathbf{E}' when I initially did this (ungraded) problem in problem set 2. That sign error should have been obvious by considering the v=0 case, which would have mysteriously resulted in inversion of all the coordinates of the observed electric field.

2. Statement

A particle is moving with velocity \mathbf{v} in perpendicular \mathbf{E} and \mathbf{B} fields, all given in some particular “stationary” frame of reference.

1. Show that there exists a frame where the problem of finding the particle trajectory can be reduced to having either only an electric or only a magnetic field.
2. Explain what determines which case takes place.
3. Find the velocity \mathbf{v}_0 of that frame relative to the “stationary” frame.

2. Solution

Part 1 and 2: Existence of the transformation.

In the single particle Lorentz trajectory problem we wish to solve

\begin{aligned}m c \frac{du^i}{ds} = \frac{e}{c} F^{i j} u_j,\end{aligned} \hspace{\stretch{1}}(3.106)

which in matrix form we can write as

\begin{aligned}\frac{d U}{ds} = \frac{e}{m c^2} \hat{F} \hat{G} U.\end{aligned} \hspace{\stretch{1}}(3.107)

where we write our column vector proper velocity as U = {\left\lVert{u^i}\right\rVert}. Under transformation of coordinates {u'}^i = {O^i}_j u^j, with \hat{O} = {\left\lVert{{O^i}_j}\right\rVert}, this becomes

\begin{aligned}\hat{O} \frac{d U}{ds} = \frac{e}{m c^2} \hat{O} \hat{F} \hat{O}^\text{T} \hat{G} \hat{O} U.\end{aligned} \hspace{\stretch{1}}(3.108)

Suppose we can find eigenvectors for the matrix \hat{O} \hat{F} \hat{O}^\text{T} \hat{G}. That is for some eigenvalue \lambda, we can find an eigenvector \Sigma

\begin{aligned}\hat{O} \hat{F} \hat{O}^\text{T} \hat{G} \Sigma = \lambda \Sigma.\end{aligned} \hspace{\stretch{1}}(3.109)

Rearranging we have

\begin{aligned}(\hat{O} \hat{F} \hat{O}^\text{T} \hat{G} - \lambda I) \Sigma = 0\end{aligned} \hspace{\stretch{1}}(3.110)

and conclude that \Sigma lies in the null space of the matrix \hat{O} \hat{F} \hat{O}^\text{T} \hat{G} - \lambda I and that this difference of matrices must have a zero determinant

\begin{aligned}\text{Det} (\hat{O} \hat{F} \hat{O}^\text{T} \hat{G} - \lambda I) = -\text{Det} (\hat{O} \hat{F} \hat{O}^\text{T} - \lambda \hat{G}) = 0.\end{aligned} \hspace{\stretch{1}}(3.111)

Since \hat{G} = \hat{O} \hat{G} \hat{O}^\text{T} for any Lorentz transformation \hat{O} in SO(1,3), and \text{Det} ABC = \text{Det} A \text{Det} B \text{Det} C we have

\begin{aligned}\text{Det} (\hat{O} \hat{F} \hat{O}^\text{T} - \lambda \hat{G})= \text{Det} (\hat{F} - \lambda \hat{G}).\end{aligned} \hspace{\stretch{1}}(3.112)

In problem 1.6, we called this our characteristic equation P(\lambda) = \text{Det} (\hat{F} - \lambda \hat{G}). Observe that the characteristic equation is Lorentz invariant for any \lambda, which requires that the eigenvalues \lambda are also Lorentz invariants.

In problem 1.6 of this problem set we computed that this characteristic equation expands to

\begin{aligned}P(\lambda) = \text{Det} (\hat{F} - \lambda \hat{G}) = (\mathbf{E} \cdot \mathbf{B})^2 + \lambda^2 (\mathbf{B}^2 - \mathbf{E}^2) + \lambda^4.\end{aligned} \hspace{\stretch{1}}(3.113)

The eigenvalues for the system, also each necessarily Lorentz invariants, are

\begin{aligned}\lambda = \pm \frac{1}{{\sqrt{2}}} \sqrt{ \mathbf{E}^2 - \mathbf{B}^2 \pm \sqrt{ (\mathbf{E}^2 - \mathbf{B}^2)^2 - 4 (\mathbf{E} \cdot \mathbf{B})^2 }}.\end{aligned} \hspace{\stretch{1}}(3.114)

Observe that in the specific case where \mathbf{E} \cdot \mathbf{B} = 0, as in this problem, we must have \mathbf{E}' \cdot \mathbf{B}' = 0 in all frames, and the two non-zero eigenvalues of our characteristic polynomial are simply

\begin{aligned}\lambda = \pm \sqrt{\mathbf{E}^2 - \mathbf{B}^2}.\end{aligned} \hspace{\stretch{1}}(3.115)

These and \mathbf{E} \cdot \mathbf{B} = 0 are the invariants for this system. If we have \mathbf{E}^2 > \mathbf{B}^2 in one frame, we must also have {\mathbf{E}'}^2 > {\mathbf{B}'}^2 in another frame, still maintaining perpendicular fields. In particular if \mathbf{B}' = 0 we maintain real eigenvalues. Similarly if \mathbf{B}^2 > \mathbf{E}^2 in some frame, we must always have imaginary eigenvalues, and this is also true in the \mathbf{E}' = 0 case.

While the problem can be posed as a pure diagonalization problem (and even solved numerically this way for the general constant fields case), we can also work symbolically, thinking of the trajectories problem as simply seeking a transformation of frames that reduce the scope of the problem to one that is more tractable. That does not have to be the linear transformation that diagonalizes the system. Instead we are free to transform to a frame where one of the two fields \mathbf{E}' or \mathbf{B}' is zero, provided the invariants discussed are maintained.

Part 3: Finding the boost velocity that wipes out one of the fields.

Let’s now consider a Lorentz boost \hat{O}, and seek to solve for the boost velocity that wipes out one of the fields, given the invariants that must be maintained for the system.

To make things concrete, suppose that our perpendicular fields are given by \mathbf{E} = E \mathbf{e}_2 and \mathbf{B} = B \mathbf{e}_3.

Let’s also assume that we can find the velocity \mathbf{v}_0 for which one or more of the transformed fields is zero. Suppose that velocity is

\begin{aligned}\mathbf{v}_0 = v_0 (\alpha_1, \alpha_2, \alpha_3) = v_0 \hat{\mathbf{v}}_0,\end{aligned} \hspace{\stretch{1}}(3.116)

where \alpha_i are the direction cosines of \mathbf{v}_0 so that \sum_i \alpha_i^2 = 1. We will want to compute the components of \mathbf{E} and \mathbf{B} parallel and perpendicular to this velocity.

Those are

\begin{aligned}\mathbf{E}_\parallel &= E \mathbf{e}_2 \cdot (\alpha_1, \alpha_2, \alpha_3) (\alpha_1, \alpha_2, \alpha_3) \\ &= E \alpha_2 (\alpha_1, \alpha_2, \alpha_3) \\ \end{aligned}

\begin{aligned}\mathbf{E}_\perp &= E \mathbf{e}_2 - \mathbf{E}_\parallel \\ &= E (-\alpha_1 \alpha_2, 1 - \alpha_2^2, -\alpha_2 \alpha_3) \\ &= E (-\alpha_1 \alpha_2, \alpha_1^2 + \alpha_3^2, -\alpha_2 \alpha_3) \\ \end{aligned}

For the magnetic field we have

\begin{aligned}\mathbf{B}_\parallel &= B \alpha_3 (\alpha_1, \alpha_2, \alpha_3),\end{aligned}


\begin{aligned}\mathbf{B}_\perp &= B \mathbf{e}_3 - \mathbf{B}_\parallel \\ &= B (-\alpha_1 \alpha_3, -\alpha_2 \alpha_3, \alpha_1^2 + \alpha_2^2)  \\ \end{aligned}

Now, observe that (\boldsymbol{\beta} \times \mathbf{B})_\parallel \propto ((\mathbf{v}_0 \times \mathbf{B}) \cdot \mathbf{v}_0) \mathbf{v}_0, but this is just zero. So we have (\boldsymbol{\beta} \times \mathbf{B})_\perp = \boldsymbol{\beta} \times \mathbf{B}. So our cross product terms are just

\begin{aligned}\hat{\mathbf{v}}_0 \times \mathbf{B} &=         \begin{vmatrix}         \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\         \alpha_1 & \alpha_2 & \alpha_3 \\         0 & 0 & B         \end{vmatrix} = B (\alpha_2, -\alpha_1, 0) \\ \hat{\mathbf{v}}_0 \times \mathbf{E} &=         \begin{vmatrix}         \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\         \alpha_1 & \alpha_2 & \alpha_3 \\         0 & E & 0         \end{vmatrix} = E (-\alpha_3, 0, \alpha_1)\end{aligned}

We can now express how the fields transform, given this arbitrary boost velocity. From 3.97, this is

\begin{aligned}\mathbf{E} &\rightarrow E \alpha_2 (\alpha_1, \alpha_2, \alpha_3) + \gamma E (-\alpha_1 \alpha_2, \alpha_1^2 + \alpha_3^2, -\alpha_2 \alpha_3) + \gamma \frac{v_0}{c} B (\alpha_2, -\alpha_1, 0) \\ \mathbf{B} &\rightarrow B \alpha_3 (\alpha_1, \alpha_2, \alpha_3)+ \gamma B (-\alpha_1 \alpha_3, -\alpha_2 \alpha_3, \alpha_1^2 + \alpha_2^2)  - \gamma \frac{v_0}{c} E (-\alpha_3, 0, \alpha_1),\end{aligned} \hspace{\stretch{1}}(3.117)

where the cross product terms carry a single factor of v_0/c, since \boldsymbol{\beta} = (v_0/c) \hat{\mathbf{v}}_0.

Zero Electric field case.

Let’s tackle the two cases separately. First when {\left\lvert{\mathbf{B}}\right\rvert} > {\left\lvert{\mathbf{E}}\right\rvert}, we can transform to a frame where \mathbf{E}'=0. In coordinates from 3.117 this supplies us three equations. These are

\begin{aligned}0 &= E \alpha_2 \alpha_1 (1 - \gamma) + \gamma \frac{v_0}{c} B \alpha_2  \\ 0 &= E \alpha_2^2 + \gamma E (\alpha_1^2 + \alpha_3^2) - \gamma \frac{v_0}{c} B \alpha_1  \\ 0 &= E \alpha_2 \alpha_3 (1 - \gamma).\end{aligned} \hspace{\stretch{1}}(3.119)

With an assumed solution the \mathbf{e}_3 coordinate equation implies that one of \alpha_2 or \alpha_3 is zero. Perhaps there are solutions with \alpha_3 = 0 too, but inspection shows that \alpha_2 = 0 nicely kills off the first equation. Since \alpha_1^2 + \alpha_2^2 + \alpha_3^2 = 1, the second equation, after dividing out \gamma, leaves us with

\begin{aligned}0 = E - \frac{v_0}{c} B \alpha_1 \end{aligned} \hspace{\stretch{1}}(3.122)

or

\begin{aligned}\alpha_1 &= \frac{E}{B} \frac{c}{v_0} \\ \alpha_2 &= 0 \\ \alpha_3 &= \sqrt{1 - \frac{E^2}{B^2} \frac{c^2}{v_0^2} }\end{aligned} \hspace{\stretch{1}}(3.123)

Our velocity was \mathbf{v}_0 = v_0 (\alpha_1, \alpha_2, \alpha_3), solving the problem for the {\left\lvert{\mathbf{B}}\right\rvert}^2 > {\left\lvert{\mathbf{E}}\right\rvert}^2 case up to an adjustable constant v_0. That constant comes with constraints however, since we must also have our cosine satisfy {\left\lvert{\alpha_1}\right\rvert} \le 1. Expressed another way, the magnitude of the boost velocity is constrained by the relation

\begin{aligned}\frac{v_0}{c} \ge {\left\lvert{\frac{E}{B}}\right\rvert}.\end{aligned} \hspace{\stretch{1}}(3.126)

It appears we may also pick the equality case, so one velocity (not unique) that should transform away the electric field is

\begin{aligned}\boxed{\mathbf{v}_0 = c \frac{E}{B} \mathbf{e}_1 = c \frac{\mathbf{E} \times \mathbf{B}}{\mathbf{B}^2}.}\end{aligned} \hspace{\stretch{1}}(3.127)

This particular boost direction is perpendicular to both fields. Observe that this highlights the invariance condition {\left\lvert{\frac{E}{B}}\right\rvert} < 1 since we see this is required for a physically realizable velocity. Boosting in this direction will reduce our problem to one that has only the magnetic field component.

Zero Magnetic field case.

Now, let’s consider the case where we transform the magnetic field away, the case when our characteristic polynomial has strictly real eigenvalues \lambda = \pm \sqrt{\mathbf{E}^2 - \mathbf{B}^2}. In this case, if we write out our equations for the transformed magnetic field and require these to separately equal zero, we have

\begin{aligned}0 &= B \alpha_3 \alpha_1 ( 1 - \gamma ) + \gamma \frac{v_0}{c} E \alpha_3 \\ 0 &= B \alpha_2 \alpha_3 ( 1 - \gamma ) \\ 0 &= B (\alpha_3^2 + \gamma (\alpha_1^2 + \alpha_2^2)) - \gamma \frac{v_0}{c} E \alpha_1.\end{aligned} \hspace{\stretch{1}}(3.128)

Similar to before we see that \alpha_3 = 0 kills off the first and second equations, leaving just

\begin{aligned}0 = B - \frac{v_0}{c} E \alpha_1.\end{aligned} \hspace{\stretch{1}}(3.131)

We now have a solution for the family of direction vectors that kill the magnetic field off

\begin{aligned}\alpha_1 &= \frac{B}{E} \frac{c}{v_0} \\ \alpha_2 &= \sqrt{ 1 - \frac{B^2}{E^2} \frac{c^2}{v_0^2} } \\ \alpha_3 &= 0.\end{aligned} \hspace{\stretch{1}}(3.132)

In addition to the initial constraint that {\left\lvert{\frac{B}{E}}\right\rvert} < 1, we have as before, constraints on the allowable values of v_0

\begin{aligned}\frac{v_0}{c} \ge {\left\lvert{\frac{B}{E}}\right\rvert}.\end{aligned} \hspace{\stretch{1}}(3.135)

Like before we can pick the equality \alpha_1^2 = 1, yielding a boost velocity of

\begin{aligned}\boxed{\mathbf{v}_0 = c \frac{B}{E} \mathbf{e}_1 = c \frac{\mathbf{E} \times \mathbf{B}}{\mathbf{E}^2}.}\end{aligned} \hspace{\stretch{1}}(3.136)

Again, we see that the invariance condition {\left\lvert{\mathbf{B}}\right\rvert} < {\left\lvert{\mathbf{E}}\right\rvert} is required for a physically realizable velocity if that velocity is entirely perpendicular to the fields.
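As a numerical sanity check on the boost velocities, the sketch below applies the general transformation 3.97 with \mathbf{E} = E \mathbf{e}_2 and \mathbf{B} = B \mathbf{e}_3, and tries boosts of \boldsymbol{\beta} = \mathbf{E} \times \mathbf{B}/\mathbf{B}^2 (for {\left\lvert{\mathbf{B}}\right\rvert} > {\left\lvert{\mathbf{E}}\right\rvert}) and \boldsymbol{\beta} = \mathbf{E} \times \mathbf{B}/\mathbf{E}^2 (for {\left\lvert{\mathbf{E}}\right\rvert} > {\left\lvert{\mathbf{B}}\right\rvert}). These are the standard drift velocity forms, stated here as the assumption being tested. The helper names and the field magnitudes are made up for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boost_fields(E, B, beta):
    """Apply 3.97 for a boost velocity beta = v/c in an arbitrary direction."""
    b2 = dot(beta, beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    n = tuple(x / math.sqrt(b2) for x in beta)          # unit vector along v
    E_par = tuple(dot(E, n) * x for x in n)
    B_par = tuple(dot(B, n) * x for x in n)
    bxB = cross(beta, B)                                 # already perpendicular to v
    bxE = cross(beta, E)
    E_new = tuple(E_par[i] + gamma * (E[i] - E_par[i] + bxB[i]) for i in range(3))
    B_new = tuple(B_par[i] + gamma * (B[i] - B_par[i] - bxE[i]) for i in range(3))
    return E_new, B_new

# Case |B| > |E|: try to transform away the electric field.
E, B = (0.0, 0.6, 0.0), (0.0, 0.0, 1.0)
beta = tuple(x / dot(B, B) for x in cross(E, B))        # (E/B) e_1
E1, B1 = boost_fields(E, B, beta)
print(E1, B1)

# Not unique: adding a velocity component along B also works (alpha_3 != 0).
E2, B2 = boost_fields(E, B, (0.6, 0.0, 0.5))
print(E2)

# Case |E| > |B|: try to transform away the magnetic field.
E, B = (0.0, 1.0, 0.0), (0.0, 0.0, 0.6)
beta = tuple(x / dot(E, E) for x in cross(E, B))        # (B/E) e_1
E3, B3 = boost_fields(E, B, beta)
print(E3, B3)
```

In the first case the remaining magnetic field satisfies \mathbf{B}'^2 = \mathbf{B}^2 - \mathbf{E}^2, and in the last case \mathbf{E}'^2 = \mathbf{E}^2 - \mathbf{B}^2, as the invariants require.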

Problem 3. Continuity equation for delta function current distributions.


Show explicitly that the electromagnetic 4-current j^i for a particle moving with constant velocity (considered in class, p. 100-101 of notes) is conserved \partial_i j^i = 0. Give a physical interpretation of this conservation law, for example by integrating \partial_i j^i over some spacetime region and giving an integral form to the conservation law (\partial_i j^i = 0 is known as the “continuity equation”).


First let’s review. Our four current was defined as

\begin{aligned}j^i(x) = \sum_A c e_A \int_{x(\tau)} dx_A^i(\tau) \delta^4(x - x_A(\tau)).\end{aligned} \hspace{\stretch{1}}(4.137)

If each of the trajectories x_A(\tau) represents constant motion we have

\begin{aligned}x_A(\tau) = x_A(0) + \gamma_A \tau ( c, \mathbf{v}_A ).\end{aligned} \hspace{\stretch{1}}(4.138)

The spacetime split of this four vector is

\begin{aligned}x_A^0(\tau) &= x_A^0(0) + \gamma_A \tau c \\ \mathbf{x}_A(\tau) &= \mathbf{x}_A(0) + \gamma_A \tau \mathbf{v}_A,\end{aligned} \hspace{\stretch{1}}(4.139)

with differentials

\begin{aligned}dx_A^0(\tau) &= \gamma_A d\tau c \\ d\mathbf{x}_A(\tau) &= \gamma_A d\tau \mathbf{v}_A.\end{aligned} \hspace{\stretch{1}}(4.141)

Writing out the delta functions explicitly we have

\begin{aligned}\begin{aligned}j^i(x) = \sum_A &c e_A \int_{x(\tau)} dx_A^i(\tau) \delta(x^0 - x_A^0(0) - \gamma_A c \tau) \delta(x^1 - x_A^1(0) - \gamma_A v_A^1 \tau) \\ &\delta(x^2 - x_A^2(0) - \gamma_A v_A^2 \tau) \delta(x^3 - x_A^3(0) - \gamma_A v_A^3 \tau)\end{aligned}\end{aligned} \hspace{\stretch{1}}(4.143)

So our time and space components of the current can be written

\begin{aligned}j^0(x) &= \sum_A c^2 e_A \gamma_A \int_{x(\tau)} d\tau\delta(x^0 - x_A^0(0) - \gamma_A c \tau)\delta^3(\mathbf{x} - \mathbf{x}_A(0) - \gamma_A \mathbf{v}_A \tau) \\ \mathbf{j}(x) &= \sum_A c e_A \mathbf{v}_A \gamma_A \int_{x(\tau)} d\tau\delta(x^0 - x_A^0(0) - \gamma_A c \tau)\delta^3(\mathbf{x} - \mathbf{x}_A(0) - \gamma_A \mathbf{v}_A \tau).\end{aligned} \hspace{\stretch{1}}(4.144)

Each of these integrals can be evaluated with respect to the time coordinate delta function leaving the distribution

\begin{aligned}j^0(x) &= \sum_A c e_A \delta^3(\mathbf{x} - \mathbf{x}_A(0) - \frac{\mathbf{v}_A}{c} (x^0 - x_A^0(0))) \\ \mathbf{j}(x) &= \sum_A e_A \mathbf{v}_A \delta^3(\mathbf{x} - \mathbf{x}_A(0) - \frac{\mathbf{v}_A}{c} (x^0 - x_A^0(0)))\end{aligned} \hspace{\stretch{1}}(4.146)

With this more general (multi-particle) expression it should be possible to show that the four divergence is zero. However, the problem only asks for one particle. For the one particle case, we can make things really easy by taking the initial point in space and time as the origin, and aligning our velocity with one of the coordinates (say x).

Doing so we have the result derived in class

\begin{aligned}j = e \begin{bmatrix}c \\ v \\ 0 \\ 0 \end{bmatrix}\delta(x - v x^0/c)\delta(y)\delta(z).\end{aligned} \hspace{\stretch{1}}(4.148)

Our divergence then has only two portions

\begin{aligned}\frac{\partial {j^0}}{\partial {x^0}} &= e c (-v/c) \delta'(x - v x^0/c) \delta(y) \delta(z) \\ \frac{\partial {j^1}}{\partial {x}} &= e v \delta'(x - v x^0/c) \delta(y) \delta(z).\end{aligned} \hspace{\stretch{1}}(4.149)

and these cancel out when summed. Note that this requires us to be loose with our delta functions, treating them like regular functions that are differentiable.
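The cancellation can also be checked numerically without being loose about derivatives of delta functions. This is only a sketch: it swaps the delta function for a narrow normalized Gaussian (an assumption made here so that the derivatives exist), suppresses the \delta(y)\delta(z) factors, and uses arbitrary values for e, c, and v.

```python
import math

E_CHARGE = 1.0  # charge e, arbitrary units (assumed value)
C = 1.0         # speed of light, units with c = 1 (assumed)
V = 0.6         # particle speed along x (assumed value)
SIGMA = 0.1     # width of the Gaussian standing in for the delta function


def smooth_delta(u):
    """Normalized Gaussian used as a differentiable stand-in for delta(u)."""
    return math.exp(-u * u / (2.0 * SIGMA ** 2)) / (SIGMA * math.sqrt(2.0 * math.pi))


def j0(x0, x):
    """Time component e c delta(x - v x0 / c), with y and z suppressed."""
    return E_CHARGE * C * smooth_delta(x - V * x0 / C)


def j1(x0, x):
    """x component e v delta(x - v x0 / c), with y and z suppressed."""
    return E_CHARGE * V * smooth_delta(x - V * x0 / C)


def divergence_terms(x0, x, h=1e-5):
    """Central-difference estimates of (d j0 / d x0, d j1 / d x)."""
    d0 = (j0(x0 + h, x) - j0(x0 - h, x)) / (2.0 * h)
    d1 = (j1(x0, x + h) - j1(x0, x - h)) / (2.0 * h)
    return d0, d1


d0, d1 = divergence_terms(x0=0.3, x=0.25)
print(d0, d1, d0 + d1)  # two terms of opposite sign whose sum is near zero
```

Each term is individually large near the particle, and only their sum vanishes, which is the content of the continuity equation for this current.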

For the more general multiparticle case, we can treat the sum one particle at a time, and in each case rotate coordinates so that the velocity lies along a single axis and the four divergence picks up only one spatial term.

As for physical interpretation via integral, we have using the four dimensional divergence theorem

\begin{aligned}\int d^4 x \partial_i j^i = \int j^i dS_i\end{aligned} \hspace{\stretch{1}}(4.151)

where dS_i is the three-volume element perpendicular to a x^i = \text{constant} plane. These volume elements are detailed generally in the text [2]. One special case noted there is dS_0 = dx dy dz, the element of the three-dimensional (spatial) volume “normal” to hyperplanes ct = \text{constant}.

Without actually computing the determinants, we have something that is roughly of the form

\begin{aligned}0 = \int j^i dS_i=\int c \rho dx dy dz+\int \mathbf{j} \cdot (\mathbf{n}_x c dt dy dz + \mathbf{n}_y c dt dx dz + \mathbf{n}_z c dt dx dy).\end{aligned} \hspace{\stretch{1}}(4.152)

This is cheating a bit to just write \mathbf{n}_x, \mathbf{n}_y, \mathbf{n}_z. Are there specific orientations required by the metric? To be precise we’d have to calculate the determinants detailed in the text, and then do the duality transformations.

Per unit time, we can write instead

\begin{aligned}\frac{\partial {}}{\partial {t}} \int \rho dV= -\int \mathbf{j} \cdot (\mathbf{n}_x dy dz + \mathbf{n}_y dx dz + \mathbf{n}_z dx dy)\end{aligned} \hspace{\stretch{1}}(4.153)

Loosely, this says that the rate of change of the charge in a volume must be matched by the flow of current through the surface bounding that volume.
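A one-dimensional analogue of 4.153 can be tested numerically. The sketch below assumes a Gaussian blob of charge moving rigidly with speed V (all numeric values are made up for illustration), so that the current is just \rho v, and compares the rate of change of the charge in a box with the net current leaving through its two ends.

```python
import math

E_CHARGE, V, SIGMA = 1.0, 0.5, 0.2  # assumed illustrative values


def rho(x, t):
    """Gaussian charge density moving rigidly with speed V."""
    u = x - V * t
    return E_CHARGE * math.exp(-u * u / (2.0 * SIGMA ** 2)) / (SIGMA * math.sqrt(2.0 * math.pi))


def current(x, t):
    """Current density j = rho v for the rigidly moving blob."""
    return V * rho(x, t)


def charge_in_box(a, b, t, n=2000):
    """Trapezoid-rule estimate of the charge between x = a and x = b."""
    dx = (b - a) / n
    total = 0.5 * (rho(a, t) + rho(b, t))
    total += sum(rho(a + i * dx, t) for i in range(1, n))
    return total * dx


def balance(a, b, t, h=1e-4):
    """Return (dQ/dt, net outward flux); continuity says they sum to zero."""
    dq_dt = (charge_in_box(a, b, t + h) - charge_in_box(a, b, t - h)) / (2.0 * h)
    outflow = current(b, t) - current(a, t)
    return dq_dt, outflow


dq_dt, outflow = balance(a=0.0, b=1.0, t=1.6)
print(dq_dt, outflow)  # equal magnitude, opposite sign (to numerical accuracy)
```

At the chosen time the blob is partway out of the right end of the box, so the charge inside is decreasing at the rate at which current crosses that end.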


[1] Wikipedia. Electromagnetic tensor — wikipedia, the free encyclopedia [online]. 2011. [Online; accessed 27-February-2011].

[2] L.D. Landau and E.M. Lifshitz. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Lecture 6 (Taught by Prof. Erich Poppitz). Four vectors and tensors.

Posted by peeterjoot on January 25, 2011



Still covering chapter 1 material from the text [1].

Covering Professor Poppitz’s lecture notes: nonrelativistic limit of boosts (33); number of parameters of Lorentz transformations (34-35); introducing four-vectors, the metric tensor, the invariant “dot-product” and SO(1,3) (36-40); the Poincare group (41); the convenience of “upper” and “lower” indices (42-43); tensors (44).

The Special Orthogonal group (for Euclidean space).

Lorentz transformations are like “rotations” for (t, x, y, z) that preserve (ct)^2 - x^2 - y^2 - z^2. There are 6 continuous parameters:

\item 3 rotations in x,y,z space
\item 3 “boosts” in x or y or z.

For rotations of space we talk about a group of transformations of 3D Euclidean space, and call this the SO(3) group. Here S is for Special, O for Orthogonal, and 3 for the dimensions.

For a transformed vector in 3D space we write

\begin{aligned}\begin{bmatrix}x \\ y \\ z\end{bmatrix} \rightarrow \begin{bmatrix}x \\ y \\ z\end{bmatrix}' = O \begin{bmatrix}x \\ y \\ z\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.1)

Here O is an orthogonal 3 \times 3 matrix, and has the property

\begin{aligned}O^T O = \mathbf{1}.\end{aligned} \hspace{\stretch{1}}(2.2)

Taking determinants, we have

\begin{aligned}\det{ O^T } \det{ O} = 1,\end{aligned} \hspace{\stretch{1}}(2.3)

and since \det{O^\text{T}} = \det{ O }, we have

\begin{aligned}(\det{O})^2 = 1,\end{aligned} \hspace{\stretch{1}}(2.4)

so our determinant must be

\begin{aligned}\det O = \pm 1.\end{aligned} \hspace{\stretch{1}}(2.5)

We work with the positive case only, avoiding the transformations that include reflections.

The orthogonality condition O^\text{T} O = 1 is an indication that the inner product is preserved. Observe that in matrix form we can write the inner product

\begin{aligned}\mathbf{r}_1 \cdot \mathbf{r}_2 = \begin{bmatrix}x_1 & y_1 & z_1\end{bmatrix}\begin{bmatrix}x_2 \\ y_2 \\ z_2 \\ \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.6)

For a transformed vector X' = O X, we have {X'}^\text{T} = X^\text{T} O^\text{T}, and

\begin{aligned}X' \cdot X' = (X^\text{T} O^\text{T}) (O X) = X^\text{T} (O^\text{T} O) X = X^T X = X \cdot X\end{aligned} \hspace{\stretch{1}}(2.7)
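As a concrete instance of 2.7, here is a small numeric check for one particular orthogonal matrix, a rotation about the z axis. The angle and the two vectors are arbitrary choices made for illustration.

```python
import math


def rotation_z(theta):
    """Orthogonal 3x3 matrix for a rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matvec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))


O = rotation_z(0.7)
r1, r2 = [1.0, 2.0, 3.0], [-0.5, 0.25, 4.0]
before = dot(r1, r2)
after = dot(matvec(O, r1), matvec(O, r2))
print(before, after)  # the two agree up to rounding
```

The same check with r1 = r2 is the statement that lengths are preserved.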

The Special Orthogonal group (for spacetime).

This generalizes to Lorentz boosts! There are two differences

\item Lorentz transforms should be 4 \times 4 not 3 \times 3 and act in (ct, x, y, z), and NOT (x,y,z).
\item They should leave invariant NOT \mathbf{r}_1 \cdot \mathbf{r}_2, but c^2 t_2 t_1 - \mathbf{r}_2 \cdot \mathbf{r}_1.

Don’t get confused that I demanded c^2 t_2 t_1 - \mathbf{r}_2 \cdot \mathbf{r}_1 = \text{invariant} rather than c^2 (t_2 - t_1)^2 - (\mathbf{r}_2 - \mathbf{r}_1)^2 = \text{invariant}. Expansion of this (squared) interval, provides just this four vector dot product and its invariance condition

\begin{aligned}\text{invariant} &=c^2 (t_2 - t_1)^2 - (\mathbf{r}_2 - \mathbf{r}_1)^2 \\ &=(c^2 t_2^2 - \mathbf{r}_2^2) + (c^2 t_1^2 - \mathbf{r}_1^2)- 2 c^2 t_2 t_1 + 2 \mathbf{r}_1 \cdot \mathbf{r}_2.\end{aligned}

Observe that we have the sum of two invariants plus our new cross term, so this cross term, (-2 times our dot product to be defined), must also be an invariant.
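The argument above can be tried out on numbers. The sketch below assumes the standard x-direction boost (written in terms of \beta = v/c, and with events stored as (ct, x, y, z)); the two events and the boost speed are arbitrary.

```python
import math


def boost_x(event, beta):
    """Boost (ct, x, y, z) along x with velocity beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = event
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)


def minkowski_dot(a, b):
    """c^2 t_a t_b - r_a . r_b for events given as (ct, x, y, z)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def interval_squared(a, b):
    """Squared interval between two events."""
    diff = tuple(p - q for p, q in zip(a, b))
    return minkowski_dot(diff, diff)


e1 = (2.0, 0.5, -1.0, 0.3)
e2 = (5.0, 1.5, 0.2, -0.7)
b1, b2 = boost_x(e1, 0.8), boost_x(e2, 0.8)

print(minkowski_dot(e1, e2), minkowski_dot(b1, b2))        # cross term
print(interval_squared(e1, e2), interval_squared(b1, b2))  # interval
```

Both the interval and the cross term come out the same before and after the boost, as the expansion says they must.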

Introduce the four vector

\begin{aligned}x^0 &= ct \\ x^1 &= x \\ x^2 &= y \\ x^3 &= z \end{aligned}

Or (x^0, x^1, x^2, x^3) = \{ x^i, i = 0,1,2,3 \}.

We will also write

\begin{aligned}x^i &= (ct, \mathbf{r}) \\ \tilde{x}^i &= (c\tilde{t}, \tilde{\mathbf{r}})\end{aligned}

Our inner product is

\begin{aligned}c^2 t \tilde{t} - \mathbf{r} \cdot \tilde{\mathbf{r}}\end{aligned} \hspace{\stretch{1}}(3.8)

Introduce the 4 \times 4 matrix

\begin{aligned} \left\lVert{g_{ij}}\right\rVert = \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.9)

This is called the Minkowski spacetime metric.


Our inner product is then

\begin{aligned}c^2 t \tilde{t} - \mathbf{r} \cdot \tilde{\mathbf{r}}&\equiv \sum_{i, j = 0}^3 \tilde{x}^i g_{ij} x^j \\ &= \tilde{x}^0 x^0 -\tilde{x}^1 x^1 -\tilde{x}^2 x^2 -\tilde{x}^3 x^3 \end{aligned}

Einstein summation convention. Whenever indexes are repeated they are assumed to be summed over.

We also write

\begin{aligned}X = \begin{bmatrix}x^0 \\ x^1 \\ x^2 \\ x^3 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.10)

\begin{aligned}\tilde{X} = \begin{bmatrix}\tilde{x}^0 \\ \tilde{x}^1 \\ \tilde{x}^2 \\ \tilde{x}^3 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.11)

\begin{aligned}G = \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.12)

Our inner product

\begin{aligned}c^2 t \tilde{t} - \tilde{\mathbf{r}} \cdot \mathbf{r} = \tilde{X}^\text{T} G X &=\begin{bmatrix}\tilde{x}^0 & \tilde{x}^1 & \tilde{x}^2 & \tilde{x}^3 \end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix}\begin{bmatrix}x^0 \\ x^1 \\ x^2 \\ x^3 \\ \end{bmatrix}\end{aligned}

Under Lorentz boosts, we have

\begin{aligned}X = \hat{O} X',\end{aligned} \hspace{\stretch{1}}(3.13)

where


\begin{aligned}\hat{O} =\begin{bmatrix}\gamma & - \gamma v_x/c  & 0 & 0 \\ - \gamma v_x/c & \gamma  & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.14)

(for x-direction boosts)

\begin{aligned}\tilde{X} &= \hat{O} \tilde{X}' \\ \tilde{X}^\text{T} &= \tilde{X'}^\text{T} \hat{O}^\text{T}\end{aligned} \hspace{\stretch{1}}(3.15)

But \hat{O} must be such that \tilde{X}^\text{T} G X is invariant. i.e.

\begin{aligned}\tilde{X}^\text{T} G X = {\tilde{X'}}^\text{T} (\hat{O}^\text{T} G \hat{O}) X' = {\tilde{X'}}^\text{T} G X' \qquad \forall X' \text{ and } \tilde{X}' \end{aligned} \hspace{\stretch{1}}(3.16)

This implies

\begin{aligned}\boxed{\hat{O}^\text{T} G \hat{O} = G}\end{aligned} \hspace{\stretch{1}}(3.17)

Such \hat{O}‘s are called “pseudo-orthogonal”.

Lorentz transformations are represented by the set of all 4 \times 4 pseudo-orthogonal matrices.

In symbols

\begin{aligned}\hat{O}^T G \hat{O} = G\end{aligned} \hspace{\stretch{1}}(3.18)
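For the x-direction boost of 3.14 this condition can be checked by hand. A sketch, restricted to the (ct, x) block (the y, z block is the identity and works trivially), and writing \beta = v_x/c as a shorthand that is not used in the lecture notes above:

\begin{aligned}\begin{bmatrix}\gamma & -\gamma\beta \\ -\gamma\beta & \gamma\end{bmatrix}\begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix}\begin{bmatrix}\gamma & -\gamma\beta \\ -\gamma\beta & \gamma\end{bmatrix}&=\begin{bmatrix}\gamma & \gamma\beta \\ -\gamma\beta & -\gamma\end{bmatrix}\begin{bmatrix}\gamma & -\gamma\beta \\ -\gamma\beta & \gamma\end{bmatrix} \\ &=\begin{bmatrix}\gamma^2(1 - \beta^2) & 0 \\ 0 & -\gamma^2(1 - \beta^2)\end{bmatrix} \\ &=\begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix}\end{aligned}

The boost matrix is symmetric, so the transpose on the left factor makes no difference here, and the last step uses \gamma^2 (1 - \beta^2) = 1.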

Just as before we can take the determinant of both sides. Doing so we have

\begin{aligned}\det(\hat{O}^T G \hat{O}) = \det(\hat{O}^T) \det(G) \det(\hat{O}) = \det(G)\end{aligned} \hspace{\stretch{1}}(3.19)

The \det(G) terms cancel, and since \det(\hat{O}^T) = \det(\hat{O}), this leaves us with (\det(\hat{O}))^2 = 1, or

\begin{aligned}\det(\hat{O}) = \pm 1\end{aligned} \hspace{\stretch{1}}(3.20)

We take the \det \hat{O} = +1 case only, so that the transformations do not change orientation (no reflection in space or time). This set of transformations forms the group

\begin{aligned}SO(1,3)\end{aligned}

Special orthogonal, one time, 3 space dimensions.

Einstein relativity can be defined as the “laws of physics that leave four vectors invariant in the

\begin{aligned}SO(1,3) \times T^4\end{aligned}

symmetry group”.

Here T^4 is the group of translations in spacetime with 4 continuous parameters. The complete group of transformations that form the group of relativistic physics has 10 = 3 + 3 + 4 continuous parameters.

This group is called the Poincare group of symmetry transforms.

More notation

Our inner product is written

\begin{aligned}\tilde{x}^i g_{ij} x^j\end{aligned} \hspace{\stretch{1}}(4.21)

but this is very cumbersome. The convenient way to write this is instead

\begin{aligned}\tilde{x}^i g_{ij} x^j = \tilde{x}_j x^j = \tilde{x}^i x_i\end{aligned} \hspace{\stretch{1}}(4.22)

where


\begin{aligned}x_i = g_{ij} x^j = g_{ji} x^j\end{aligned} \hspace{\stretch{1}}(4.23)

Note: A check that we should always be able to make. Indexes that are not summed over should be conserved. So in the above we have a free i on the LHS, and should have a non-summed i index on the RHS too (also lower matching lower, or upper matching upper).

Non-matched indexes are bad in the same sort of sense that an expression like

\begin{aligned}\mathbf{r} = 1\end{aligned} \hspace{\stretch{1}}(4.24)

isn’t well defined (assuming a vector space \mathbf{r} and not a multivector Clifford algebra that is;)

Example explicitly:

\begin{aligned}x_0 &= g_{0 0} x^0 = ct  \\ x_1 &= g_{1 j} x^j = g_{11} x^1 = -x^1 \\ x_2 &= g_{2 j} x^j = g_{22} x^2 = -x^2 \\ x_3 &= g_{3 j} x^j = g_{33} x^3 = -x^3\end{aligned}
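The index lowering above is mechanical enough to write as a few lines of code. This is only an illustrative sketch: the names `METRIC`, `lower_index`, and `contract` are made up here, and the sample components are arbitrary.

```python
METRIC = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def lower_index(x_upper):
    """x_i = g_ij x^j, with the sum over j written out explicitly."""
    return [sum(METRIC[i][j] * x_upper[j] for j in range(4)) for i in range(4)]


def contract(a_upper, b_lower):
    """a^i b_i, summing over the repeated index."""
    return sum(a * b for a, b in zip(a_upper, b_lower))


x_upper = [2.0, 1.0, -3.0, 0.5]    # (ct, x, y, z), arbitrary sample values
x_lower = lower_index(x_upper)
print(x_lower)                     # spatial components change sign
print(contract(x_upper, x_lower))  # (ct)^2 - r^2
```

Only the contraction of an upper index with a lower index produces the (ct)^2 - \mathbf{r}^2 combination.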

We would not have objects of the form

\begin{aligned}x^i x^i = (ct)^2 + \mathbf{r}^2\end{aligned} \hspace{\stretch{1}}(4.25)

for example. This is not a Lorentz invariant quantity.

Lorentz scalar example: \tilde{x}^i x_i

Lorentz vector example: x^i

This last is also called a rank-1 tensor.

Lorentz rank-2 tensors: ex: g_{ij}

or other 2-index objects.

Why in the world would we ever want to consider two index objects? We aren’t just trying to be hard on ourselves. Recall from classical mechanics that we have a two index object, the inertia tensor.

In mechanics, for a rigid body we had the rotational kinetic energy

\begin{aligned}T = \frac{1}{2} \sum_{i,j = 1}^3 \Omega_i I_{ij} \Omega_j\end{aligned} \hspace{\stretch{1}}(4.26)

The inertia tensor was this object

\begin{aligned}I_{ij} = \sum_{a = 1}^N m_a \left(\delta_{ij} \mathbf{r}_a^2 - r_{a_i} r_{a_j} \right)\end{aligned} \hspace{\stretch{1}}(4.27)

or for a continuous body

\begin{aligned}I_{ij} = \int \rho(\mathbf{r}) \left(\delta_{ij} \mathbf{r}^2 - r_{i} r_{j} \right) d^3 \mathbf{r}\end{aligned} \hspace{\stretch{1}}(4.28)

In electrostatics we have the quadrupole tensor, … and we have other such objects all over physics.

Note that the energy T of the body above cannot depend on the coordinate system in use. This is a general property of scalars formed by contracting tensors. Tensors are objects that transform as products of vectors, as I_{ij} does.

We call I_{ij} a rank-2 3-tensor: rank-2 because there are two indexes, and 3 because the indexes range from 1 to 3.

The point is that under a rotation O the tensor components transform as

\begin{aligned}I_{ij}' = \sum_{l, m = 1,2,3} O_{il} O_{jm} I_{lm}\end{aligned} \hspace{\stretch{1}}(4.29)
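The transformation law 4.29 can be tested against the definition 4.27. The sketch below builds the inertia tensor of a few point masses (arbitrary masses and positions, chosen only for illustration), then compares two routes: transforming the tensor with two factors of O, and recomputing the tensor from the rotated positions.

```python
import math


def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - r_ai r_aj) for point masses."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r2 - r[i] * r[j])
    return tensor


def rotation_z(theta):
    """Orthogonal matrix for a rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate_vector(O, r):
    """r'_i = O_il r_l."""
    return [sum(O[i][l] * r[l] for l in range(3)) for i in range(3)]


def rotate_tensor(O, tensor):
    """I'_ij = O_il O_jm I_lm, summed over l and m."""
    return [[sum(O[i][l] * O[j][m] * tensor[l][m]
                 for l in range(3) for m in range(3))
             for j in range(3)] for i in range(3)]


masses = [1.0, 2.0, 0.5]
positions = [[1.0, 0.0, 0.5], [-0.5, 2.0, 1.0], [0.3, -1.2, 2.0]]
O = rotation_z(0.4)

transformed = rotate_tensor(O, inertia_tensor(masses, positions))
recomputed = inertia_tensor(masses, [rotate_vector(O, r) for r in positions])
max_diff = max(abs(transformed[i][j] - recomputed[i][j])
               for i in range(3) for j in range(3))
print(max_diff)  # small: both routes give the same tensor
```

The agreement is what "transforms as a product of vectors" means in practice: each index picks up its own factor of O.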

Another example: the completely antisymmetric rank 3, 3-tensor

\begin{aligned}\epsilon_{ijk}\end{aligned} \hspace{\stretch{1}}(4.30)


In Newtonian dynamics we have

\begin{aligned}m \ddot{\mathbf{r}} = \mathbf{f}\end{aligned} \hspace{\stretch{1}}(5.31)

An equation of motion should be expressed in terms of vectors. This equation is written in a way that shows that the law of physics is independent of the choice of coordinates. We can do this in the context of tensor algebra as well. Ironically, this will require us to explicitly work with the coordinate representation, but this work will be augmented by the fact that we require our tensors to transform in specific ways.

In Newtonian mechanics we can look to symmetries and the invariance of the action with respect to those symmetries to express the equations of motion. Our symmetries in Newtonian mechanics leave the action invariant with respect to spatial translation and with respect to rotation.

We want to express relativistic dynamics in a similar way, and will have to express the action as a Lorentz scalar. We are going to impose the symmetries of the Poincare group to determine the relativistic laws of dynamics, and the next task will be to consider the possibilities for our relativistic action, and see what that action implies for dynamics in a relativistic context.


[1] L.D. Landau and E.M. Lifshitz. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Tutorial 1 (TA: Simon Freedman).

Posted by peeterjoot on January 21, 2011


Worked question.

The TA blasted through a problem from Hartle [1], section 5.17 (all the while apologizing for going so slow). I’m going to have to look these notes over carefully to figure out what on Earth he was doing.

At one point he asked if anybody was completely lost. Nobody said yes, but given the class title, I had the urge to say “No, just relatively lost”.

A source S emits radiation isotropically in its rest frame, with frequency \omega and number flux f (\text{photons}/\text{cm}^2 \text{s}). The source moves along the x'-axis with speed V in an observer frame O. What does the energy flux in O look like?

A brief intro with four vectors

A 3-vector:

\begin{aligned}\mathbf{a} &= (a_x, a_y, a_z) = (a^1, a^2, a^3) \\ \mathbf{b} &= (b_x, b_y, b_z) = (b^1, b^2, b^3)\end{aligned} \hspace{\stretch{1}}(1.1)

For this we have the dot product

\begin{aligned}\mathbf{a} \cdot \mathbf{b} = \sum_{\alpha=1}^3 a^\alpha b^\alpha\end{aligned} \hspace{\stretch{1}}(1.3)

Greek letters in this course (opposite to everybody else in the world, because of Landau and Lifshitz) run from 1 to 3, whereas roman letters run through the set \{0,1,2,3\}.

We want to put space and time on an equal footing and form the composite quantity (four vector)

\begin{aligned}x^i = (ct, \mathbf{r}) = (x^0, x^1, x^2, x^3),\end{aligned} \hspace{\stretch{1}}(1.4)

where


\begin{aligned}x^0 &= ct \\ x^1 &= x \\ x^2 &= y \\ x^3 &= z.\end{aligned} \hspace{\stretch{1}}(1.5)

It will also be convenient to drop indexes when referring to all the components of a four vector and we will use lower or upper case non-bold letters to represent such four vectors. For example

\begin{aligned}X = (ct, \mathbf{r}),\end{aligned} \hspace{\stretch{1}}(1.9)

and


\begin{aligned}v = \gamma \left(c, \mathbf{v} \right).\end{aligned} \hspace{\stretch{1}}(1.10)

Three vectors will be represented as letters with over arrows \vec{a} or (in text) bold face \mathbf{a}.

Recall that the squared spacetime interval between two events X_1 and X_2 is defined as

\begin{aligned}{S_{X_1, X_2}}^2 = (ct_1 - c t_2)^2 - (\mathbf{x}_1 - \mathbf{x}_2)^2.\end{aligned} \hspace{\stretch{1}}(1.11)

In particular, with one of these events at the origin, we have an operator which takes a single four vector and spits out a scalar, measuring a “distance” from the origin

\begin{aligned}s^2 = (ct)^2 - \mathbf{r}^2.\end{aligned} \hspace{\stretch{1}}(1.12)

This motivates the introduction of a dot product for our four vector space.

\begin{aligned}X \cdot X = (ct)^2 - \mathbf{r}^2 = (x^0)^2 - \sum_{\alpha=1}^3 (x^\alpha)^2\end{aligned} \hspace{\stretch{1}}(1.13)

Utilizing the spacetime dot product of 1.13 we have for the dot product of the difference between two events

\begin{aligned}(X - Y) \cdot (X - Y)&=(x^0 - y^0)^2 - \sum_{\alpha =1}^3 (x^\alpha - y^\alpha)^2 \\ &=X \cdot X + Y \cdot Y - 2 x^0 y^0 + 2 \sum_{\alpha =1}^3 x^\alpha y^\alpha.\end{aligned}

From this, assuming our dot product 1.13 is both linear and symmetric, we have for any pair of spacetime events

\begin{aligned}X \cdot Y = x^0 y^0 - \sum_{\alpha =1}^3 x^\alpha y^\alpha.\end{aligned} \hspace{\stretch{1}}(1.14)

How do our four vectors transform? This is really just a notational issue, since this has already been discussed. In this new notation we have

\begin{aligned}{x^0}' &= ct' = \gamma ( ct - \beta x) = \gamma ( x^0 - \beta x^1 ) \\ {x^1}' &= x' = \gamma ( x - \beta ct ) = \gamma ( x^1 - \beta x^0 ) \\ {x^2}' &= x^2 \\ {x^3}' &= x^3\end{aligned} \hspace{\stretch{1}}(1.15)

where \beta = V/c, and \gamma^{-2} = 1 - \beta^2.

In order to put some structure to this, it can be helpful to express this dot product as a quadratic form. We write

\begin{aligned}A \cdot B = \begin{bmatrix}a^0 & \mathbf{a}^\text{T} \end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}\begin{bmatrix}b^0 \\ \mathbf{b}\end{bmatrix}= A^\text{T} G B.\end{aligned} \hspace{\stretch{1}}(1.19)

We can write our Lorentz boost as a matrix

\begin{aligned}O = \begin{bmatrix}\gamma & -\beta \gamma & 0 & 0 \\ -\beta \gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(1.20)

so that the dot product between two transformed four vectors takes the form

\begin{aligned}A' \cdot B' = A^\text{T} O^\text{T} G O B\end{aligned} \hspace{\stretch{1}}(1.21)

Back to the problem.

We will work in momentum space, where we have

\begin{aligned}p^i &= (p^0, \mathbf{p}) = \left( \frac{E}{c}, \mathbf{p}\right) \\ p^2 &= \frac{E^2}{c^2} -\mathbf{p}^2 \\ \mathbf{p} &= \hbar \mathbf{k} \\ E &= \hbar \omega \\ p^i &= \hbar k^i \\ k^i &= \left(\frac{\omega}{c}, \mathbf{k}\right)\end{aligned} \hspace{\stretch{1}}(1.22)

Justifying this.

Now, the TA blurted all this out. We know some of it from the QM context, and if we’ve been reading ahead know a bit of this from our text [2] (the energy momentum four vector relationships). Let’s go back to classical electromagnetism and recall what we know about the relation of frequency and wave numbers for continuous fields. We want solutions to Maxwell’s equations in vacuum, and can show that such solutions also obey a wave equation

\begin{aligned}\frac{1}{{c^2}} \frac{\partial^2 \Psi}{\partial t^2} - \boldsymbol{\nabla}^2 \Psi = 0,\end{aligned} \hspace{\stretch{1}}(1.28)

where \Psi is one of \mathbf{E} or \mathbf{B}. We have other constraints imposed on the solutions by Maxwell’s equations, but require that they at least obey 1.28 in addition to these constraints.

With application of a spatial Fourier transformation of the wave equation, we find that our solution takes the form

\begin{aligned}\Psi = (2 \pi)^{-3/2} \int \tilde{\Psi}(\mathbf{k}, 0) e^{i (\omega t \pm \mathbf{k} \cdot \mathbf{x}) } d^3 \mathbf{k}.\end{aligned} \hspace{\stretch{1}}(1.29)

If one takes this as a given and applies the wave equation operator to this as a test solution, one finds without doing the Fourier transform work that we also have a constraint. That is

\begin{aligned}\frac{1}{{c^2}} (i \omega)^2 \Psi - (\pm i \mathbf{k})^2 \Psi = 0.\end{aligned} \hspace{\stretch{1}}(1.30)

So even in the continuous field domain, we have a relationship between frequency and wave number. We see that this also happens to have the form of a lightlike spacetime interval

\begin{aligned}\frac{\omega^2}{c^2} - \mathbf{k}^2 = 0.\end{aligned} \hspace{\stretch{1}}(1.31)
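The constraint can be seen numerically by applying a finite-difference version of the wave operator of 1.28 to a single real plane wave. This is a rough sketch with an arbitrary wave speed and sample point: the residual is close to zero only when \omega = c k.

```python
import math

C = 3.0  # wave speed in arbitrary units (assumed value)


def wave_residual(omega, k, t=0.3, x=0.7, h=1e-3):
    """Finite-difference estimate of (1/c^2) d2/dt2 - d2/dx2 acting on
    cos(omega t - k x), evaluated at the point (t, x)."""
    def psi(t_, x_):
        return math.cos(omega * t_ - k * x_)

    d2t = (psi(t + h, x) - 2.0 * psi(t, x) + psi(t - h, x)) / h ** 2
    d2x = (psi(t, x + h) - 2.0 * psi(t, x) + psi(t, x - h)) / h ** 2
    return d2t / C ** 2 - d2x


print(wave_residual(omega=6.0, k=2.0))  # omega = c k: residual near zero
print(wave_residual(omega=7.0, k=2.0))  # omega != c k: residual of order one
```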

Also recall that the photoelectric effect imposes an experimental constraint on photon energy, where we have

\begin{aligned}E = h \nu = \frac{h}{2\pi} 2 \pi \nu = \hbar \omega\end{aligned} \hspace{\stretch{1}}(1.32)

Therefore if we impose a mechanics like P = (E/c, \mathbf{p}) relativistic energy-momentum relationship on light, it then makes sense to form a nilpotent (lightlike) four vector for our photon energy. This combines our special relativistic expectations, with the constraints on the fields imposed by classical electromagnetism. We can then write for the photon four momentum

\begin{aligned}P = \left( \frac{\hbar \omega}{c}, \hbar \mathbf{k} \right)\end{aligned} \hspace{\stretch{1}}(1.33)

Back to the TA’s formula blitz.

Utilizing spherical polar coordinates in momentum (wave number) space, measuring the polar angle from the k^1 (x-like) axis, we can compute this polar angle in both frames,

\begin{aligned} \cos \alpha &= \frac{k^1}{{\left\lvert{\mathbf{k}}\right\rvert}} = \frac{k^1}{\omega/c} \\ \cos \alpha' &= \frac{{k^1}'}{\omega'/c} = \frac{\gamma (k^1 + \beta \omega/c)}{\gamma(\omega/c + \beta k^1)}\end{aligned} \hspace{\stretch{1}}(1.34)

Note that this requires us to assume that wave number four vectors transform in the same fashion as regular mechanical position and momentum four vectors. Also note that we have the primed frame moving negatively along the x-axis, instead of the usual positive origin shift. The question is vague enough to allow this since it only requires motion.

Check 1.

As \beta \rightarrow 1 (i.e. our primed frame velocity approaches the speed of light relative to the rest frame), \cos \alpha' \rightarrow 1, so \alpha' \rightarrow 0. The radiation pattern gets more and more compressed toward the forward direction.
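To see this limit in numbers, divide the numerator and denominator of 1.34 by \omega/c, which gives \cos\alpha' = (\cos\alpha + \beta)/(1 + \beta \cos\alpha). A small sketch (the function name is made up for illustration) evaluates this for a photon emitted sideways and one emitted partly backwards in the rest frame:

```python
def aberrated_cosine(cos_alpha, beta):
    """cos(alpha') from cos(alpha), with the sign convention of 1.34."""
    return (cos_alpha + beta) / (1.0 + beta * cos_alpha)


for beta in (0.0, 0.5, 0.9, 0.999):
    # cos(alpha) = 0 is sideways emission, cos(alpha) = -0.5 leans backwards.
    print(beta, aberrated_cosine(0.0, beta), aberrated_cosine(-0.5, beta))
```

Even the backwards-leaning photon ends up nearly along the forward axis once \beta is close to 1.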

In the original reference frame the radiation was isotropic. In the new frame how does it change with respect to the angle? This is really a question to find this number flux rate

\begin{aligned}f'(\alpha') = ?\end{aligned} \hspace{\stretch{1}}(1.36)

In our rest frame the total number of photons traveling through the surface in a given interval of time is

\begin{aligned}N &= \int d\Omega dt f(\alpha) = \int d \phi \sin \alpha d\alpha dt f(\alpha) = -2 \pi \int d(\cos\alpha) dt f(\alpha)\end{aligned} \hspace{\stretch{1}}(1.37)

Here we utilize the spherical solid angle d\Omega = \sin \alpha d\alpha d\phi = - d(\cos\alpha) d\phi, and integrate \phi over the [0, 2\pi] interval. We also have to assume that our number flux density is not a function of horizontal angle \phi in the rest frame.

In the moving frame we similarly have

\begin{aligned}N' &= -2 \pi \int d(\cos\alpha') dt' f'(\alpha'),\end{aligned} \hspace{\stretch{1}}(1.39)

and we again have had to assume that our transformed number flux density is not a function of the horizontal angle \phi. This seems like a reasonable move since {k^2}' = k^2 and {k^3}' = k^3 as they are perpendicular to the boost direction.

Now, utilizing a conservation of particle number argument, we can argue that N = N'. Regardless of the motion of the frame, the same number of photons moves through the surface. Taking ratios, and examining an infinitesimal time interval, and the associated flux through a small patch, we have

\begin{aligned}f'(\alpha') = \frac{d(\cos\alpha)}{d(\cos\alpha')} \left( \frac{dt}{dt'} \right) f(\alpha)\end{aligned} \hspace{\stretch{1}}(1.40)

where

\begin{aligned}\left( \frac{d(\cos\alpha)}{d(\cos\alpha')} \right) = \left( \frac{d(\cos\alpha')}{d(\cos\alpha)} \right)^{-1} = \gamma^2 ( 1 + \beta \cos\alpha)^2\end{aligned} \hspace{\stretch{1}}(1.41)

Part of the statement above was a do-it-yourself. First recall that c t' = \gamma ( c t + \beta x ), so dt/dt' evaluated at x=0 is 1/\gamma.

The rest is messier. We can calculate the d(\cos) values in the ratio above using 1.34. For example, for d(\cos(\alpha)) we have

\begin{aligned}d(\cos\alpha) &= d \left( \frac{k^1}{\omega/c} \right) \\ &= dk^1 \frac{1}{{\omega/c}} - c k^1 \frac{1}{{\omega^2}} d\omega.\end{aligned}

If one does the same thing for d(\cos\alpha'), after a whole whack of messy algebra one finds that the differential terms and a whole lot more mystically cancel, leaving just

\begin{aligned}\frac{d\cos\alpha'}{d\cos\alpha} = \frac{\omega^2/c^2}{(\omega/c + \beta k^1)^2} (1 - \beta^2)\end{aligned} \hspace{\stretch{1}}(1.42)

A bit more reduction with reference back to 1.34 verifies 1.41.

Also note that again from 1.34 we have

\begin{aligned}\cos\alpha' = \frac{\cos\alpha + \beta}{1 + \beta \cos\alpha}\end{aligned} \hspace{\stretch{1}}(1.43)

and rearranging this for \cos\alpha gives us

\begin{aligned}\cos\alpha = \frac{\cos\alpha' - \beta}{1 - \beta \cos\alpha'},\end{aligned} \hspace{\stretch{1}}(1.44)

from which we find that

\begin{aligned}1 + \beta \cos\alpha = \frac{1}{{\gamma^2 (1 - \beta \cos \alpha') }},\end{aligned} \hspace{\stretch{1}}(1.45)

so putting all the pieces together we have

\begin{aligned}f'(\alpha') = \frac{1}{{\gamma}} \frac{f(\alpha)}{(\gamma (1-\beta \cos\alpha'))^2}\end{aligned} \hspace{\stretch{1}}(1.46)
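The angular factor in 1.46 can be cross-checked without redoing the algebra. The sketch below takes the inverse aberration map 1.44 as given, differentiates it numerically with respect to \cos\alpha', and compares against 1/(\gamma (1 - \beta \cos\alpha'))^2. The function names and the sample value of \beta are arbitrary.

```python
import math


def rest_frame_cosine(cos_alpha_prime, beta):
    """Inverse aberration map 1.44: cos(alpha) in terms of cos(alpha')."""
    return (cos_alpha_prime - beta) / (1.0 - beta * cos_alpha_prime)


def jacobian_numeric(cos_alpha_prime, beta, h=1e-6):
    """Central-difference estimate of d(cos alpha) / d(cos alpha')."""
    return (rest_frame_cosine(cos_alpha_prime + h, beta)
            - rest_frame_cosine(cos_alpha_prime - h, beta)) / (2.0 * h)


def jacobian_closed_form(cos_alpha_prime, beta):
    """1 / (gamma (1 - beta cos alpha'))^2, the factor appearing in 1.46."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return 1.0 / (gamma * (1.0 - beta * cos_alpha_prime)) ** 2


beta = 0.6
for cos_ap in (-0.9, 0.0, 0.5, 0.9):
    print(cos_ap, jacobian_numeric(cos_ap, beta), jacobian_closed_form(cos_ap, beta))
```

The two columns agree, so multiplying by dt/dt' = 1/\gamma gives the form of f'(\alpha') quoted in 1.46.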

The question asks for the energy flux density. We get this by multiplying the number flux density by the photon energy \hbar \omega. As a function of the polar angle, in each of the frames, this is

\begin{aligned}L(\alpha) &= \hbar \omega(\alpha) f(\alpha) = \hbar \omega f \\ L'(\alpha') &= \hbar \omega'(\alpha') f'(\alpha') = \hbar \omega' f'\end{aligned} \hspace{\stretch{1}}(1.47)

But we have

\begin{aligned}\omega'(\alpha')/c = \gamma( \omega/c + \beta k^1 ) = \gamma \omega/c ( 1 + \beta \cos\alpha )\end{aligned} \hspace{\stretch{1}}(1.49)

Aside: for \beta \ll 1,

\begin{aligned}\omega' = \omega ( 1 + \beta \cos\alpha) + O(\beta^2) = \omega + \delta \omega\end{aligned} \hspace{\stretch{1}}(1.50)

where

\begin{aligned}\delta \omega &= \beta \omega, \quad \alpha = 0 \qquad \text{blue shift} \\ \delta \omega &= -\beta \omega, \quad \alpha = \pi \qquad \text{red shift}\end{aligned} \hspace{\stretch{1}}(1.51)

The TA then writes

\begin{aligned}L'(\alpha') = \frac{L/\gamma}{(\gamma (1 - \beta \cos\alpha'))^3}\end{aligned} \hspace{\stretch{1}}(1.53)

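although I originally calculated

\begin{aligned}L'(\alpha') = \frac{L}{\gamma^4 (\gamma (1 - \beta \cos\alpha'))^4}\end{aligned} \hspace{\stretch{1}}(1.54)

The extra power appears to come from the step relating 1 + \beta \cos\alpha to 1 - \beta \cos\alpha'. Substituting 1.44 gives 1 + \beta \cos\alpha = 1/(\gamma^2 (1 - \beta \cos\alpha')), with only a single power of (1 - \beta \cos\alpha') in the denominator, and with that relation the product \hbar \omega' f' of 1.49 and 1.46 reproduces the TA’s 1.53.

He then says, the forward backward ratio is

\begin{aligned}L'(0)/L'(\pi) = {\left( \frac{ 1 + \beta }{1-\beta} \right)}^3\end{aligned} \hspace{\stretch{1}}(1.55)

The forward radiation is much bigger than the backwards radiation.

My version above would instead give

\begin{aligned}L'(0)/L'(\pi) = {\left( \frac{ 1 + \beta }{1-\beta} \right)}^4\end{aligned} \hspace{\stretch{1}}(1.56)

Either way it is bigger for \beta positive, which I think is the point.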
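A numeric cross-check can help decide between the two exponents. The sketch below takes 1.44, 1.46 and 1.49 as given, forms the product \omega' f' at \alpha' = 0 and \alpha' = \pi, and prints the ratio next to the cube and the fourth power of (1 + \beta)/(1 - \beta). The function name and the choice \beta = 0.5 are arbitrary.

```python
import math


def forward_backward_ratio(beta):
    """L'(0) / L'(pi), built from 1.44, 1.46 and 1.49 with L = hbar omega f."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)

    def relative_flux(cos_alpha_prime):
        # 1.44: rest frame angle for a given observer frame angle.
        cos_alpha = (cos_alpha_prime - beta) / (1.0 - beta * cos_alpha_prime)
        # 1.49: omega' / omega.
        omega_ratio = gamma * (1.0 + beta * cos_alpha)
        # 1.46: f' / f.
        f_ratio = 1.0 / (gamma * (gamma * (1.0 - beta * cos_alpha_prime)) ** 2)
        return omega_ratio * f_ratio

    return relative_flux(1.0) / relative_flux(-1.0)


beta = 0.5
base = (1.0 + beta) / (1.0 - beta)
print(forward_backward_ratio(beta), base ** 3, base ** 4)
```

With those three equations as inputs the ratio tracks the cube, one power of (1 + \beta)/(1 - \beta) from the Doppler shift and two from the angular compression.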

If I can somehow manage to keep my signs right as I do this course I may survive. Why did he pick a positive sign way back in 1.34?


[1] J.B. Hartle and T. Dray. Gravity: an introduction to Einstein’s general relativity, volume 71. 2003.

[2] L.D. Landau and E.M. Lifshitz. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Lecture 4 (Taught by Prof. Erich Poppitz). Spacetime geometry, Lorentz transformations, Minkowski diagrams.

Posted by peeterjoot on January 18, 2011



Still covering chapter 1 material from the text [1].

Finished covering Professor Poppitz’s lecture notes: invariance of finite intervals (25-26).

Started covering Professor Poppitz’s lecture notes: analogy with rotations and derivation of Lorentz transformations (27-32); Minkowski space diagram of boosted frame (32.1); using the diagram to find length contraction (32.2) ; nonrelativistic limit of boosts (33).

More spacetime geometry.

PICTURE: ct,x curvy worldline with tangent vector \mathbf{v}.

In an inertial frame moving with velocity \mathbf{v}, whose origin coincides with the momentary position of this moving observer, we have ds^2 = c^2 {dt'}^2 = c^2 dt^2 - d\mathbf{r}^2.

The “proper time” is

\begin{aligned}dt' = dt \sqrt{ 1 - \frac{1}{{c^2}} \left( \frac{d\mathbf{r}}{dt} \right)^2 } = dt \sqrt{ 1 - \frac{\mathbf{v}^2}{c^2}} \end{aligned} \hspace{\stretch{1}}(2.1)

We see that dt' < dt for \mathbf{v}^2 > 0, since then \sqrt{1-\mathbf{v}^2/c^2} < 1.

In a manifestly invariant way we define the proper time as

\begin{aligned}d\tau \equiv \frac{ds}{c}\end{aligned} \hspace{\stretch{1}}(2.2)

So that between worldpoints a and b the proper time is a line integral over the worldline

\begin{aligned}\Delta \tau \equiv \frac{1}{{c}} \int_a^b ds.\end{aligned} \hspace{\stretch{1}}(2.3)
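The line integral of 2.3 is easy to approximate by doing exactly what the picture suggests, splitting the worldline into small pieces and summing. A sketch in units with c = 1, for motion along one axis with a velocity function v(t) supplied by the caller (the function name and the sample velocities are made up for illustration):

```python
import math

C = 1.0  # units with c = 1 (assumed)


def proper_time(velocity, t_start, t_end, steps=10000):
    """Midpoint-rule estimate of (1/c) * integral of ds along a worldline,
    using ds = c dt sqrt(1 - v^2/c^2) with v = velocity(t)."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for n in range(steps):
        v = velocity(t_start + (n + 0.5) * dt)
        total += math.sqrt(1.0 - (v / C) ** 2) * dt
    return total


# Observer at rest, uniformly moving observer, and an oscillating one.
print(proper_time(lambda t: 0.0, 0.0, 10.0))
print(proper_time(lambda t: 0.6, 0.0, 10.0))
print(proper_time(lambda t: 0.8 * math.sin(t), 0.0, 10.0))
```

For the uniformly moving observer the sum reduces to the elapsed coordinate time multiplied by \sqrt{1 - \mathbf{v}^2/c^2}, as in 2.1.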

PICTURE: We are splitting up the worldline into many small pieces and summing them up.

HOLE IN LECTURE NOTES: ON PROPER TIME for “length” of straight vs. curved worldlines: TO BE REVISITED. Prof. Poppitz promised to revisit this again next time … his notes are confusing him, and he’d like to move on.

Finite interval invariance.

Tomorrow we are going to complete the proof about invariance. We’ve shown that light like intervals are invariant, and that infinitesimal intervals are invariant. We need to put these pieces together for finite intervals.

Deriving the Lorentz transformation.

Let’s find the coordinate transforms that leave s_{12}^2 invariant. This generalizes Galileo’s transformations.

We’d like to generalize rotations, which leave spatial distance invariant. Such a transformation also leaves the spacetime interval invariant.

In Euclidean space we can generate an arbitrary rotation by composition of rotations in any of the xy, yz, zx planes.

For 4D Euclidean space we would form any rotation by composition of any of the 6 independent rotations for the 6 available planes. For example with x,y,z,w axes we can rotate in any of the xy, xz, xw, yz, yw, zw planes.

For spacetime we can also “rotate” in the x,t, y,t, z,t “planes”. Physically this is motion in space (a boost).

Consider a x,t transformation.

The trick (that is in the notes) is to analytically continue the time coordinate. Start from the interval

\begin{aligned}ds^2 = c^2 dt^2 - dx^2\end{aligned} \hspace{\stretch{1}}(4.4)

and write

\begin{aligned}t \rightarrow i \tau,\end{aligned} \hspace{\stretch{1}}(4.5)

so that the interval becomes

\begin{aligned}ds^2 = - (c^2 d\tau^2 + dx^2)\end{aligned} \hspace{\stretch{1}}(4.6)

Now we have a structure that is familiar, and we can rotate as we normally do. Prof does not want to go through the details of this “trickery” in class, but says to see the notes. The end result is that we can transform as follows

\begin{aligned}x' &= x \cosh \psi + ct \sinh \psi \\ ct' &= x \sinh \psi + ct \cosh \psi \end{aligned} \hspace{\stretch{1}}(4.7)

which is analogous to a spatial rotation

\begin{aligned}x' &= x \cos \alpha + y \sin \alpha \\ y' &= -x \sin \alpha + y \cos \alpha \end{aligned} \hspace{\stretch{1}}(4.9)

There are some differences in sign as well, but the important feature to recall is that \cosh^2 x - \sinh^2 x = (1/4)( e^{2x} + e^{-2x} + 2 - e^{2x} - e^{-2x} + 2 ) = 1. We call these hyperbolic rotations, something that is simply a mathematical transformation. Now we want to relate this to something physical.
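As a quick numerical sanity check of the hyperbolic rotation 4.7, here is a minimal sketch in Python. The function names and the sample values are my own illustrative choices, not anything from the lecture. It applies the transformation to one event and compares the squared interval before and after.

```python
import math

def hyperbolic_rotation(x, ct, psi):
    """Apply the hyperbolic rotation of (4.7) to the pair (x, ct)."""
    x_new = x * math.cosh(psi) + ct * math.sinh(psi)
    ct_new = x * math.sinh(psi) + ct * math.cosh(psi)
    return x_new, ct_new

def interval(x, ct):
    """Squared interval (ct)^2 - x^2 for an event in one space dimension."""
    return ct * ct - x * x

# Arbitrary sample event and rapidity (illustrative values only).
x, ct, psi = 1.5, 2.0, 0.7
x_p, ct_p = hyperbolic_rotation(x, ct, psi)

# The two printed values should agree up to floating point rounding.
print(interval(x, ct), interval(x_p, ct_p))
```

The invariance here is exactly the identity \cosh^2 \psi - \sinh^2 \psi = 1 at work, in the same way that \cos^2 \alpha + \sin^2 \alpha = 1 preserves lengths under a spatial rotation.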

\paragraph{Q: What is \psi?}

The origin of O has coordinates (t, \mathbf{0}) in the O frame.

PICTURE (pg 32): O' frame translating along x axis with speed v_x. We have

\begin{aligned}\frac{x'}{c t'} = \frac{v_x}{c}\end{aligned} \hspace{\stretch{1}}(4.11)

However, using 4.7 we have for the origin

\begin{aligned}x' &= ct \sinh \psi \\ ct' &= ct \cosh \psi\end{aligned} \hspace{\stretch{1}}(4.12)

so that

\begin{aligned}\frac{x'}{c t'} = \tanh \psi = \frac{v_x}{c}\end{aligned} \hspace{\stretch{1}}(4.14)


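Using \cosh^2 \psi - \sinh^2 \psi = 1, the hyperbolic functions can be expressed in terms of \tanh \psi = v_x/c

\begin{aligned}\cosh \psi &= \frac{1}{{\sqrt{1 - \tanh^2 \psi}}} \\ \sinh \psi &= \frac{\tanh \psi}{\sqrt{1 - \tanh^2 \psi}}\end{aligned} \hspace{\stretch{1}}(4.15)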

Performing all the gory substitutions one gets

\begin{aligned}x' &= \frac{1}{{\sqrt{1 - v_x^2/c^2}}} x+\frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} c t \\ y' &= y \\ z' &= z \\ ct' &= \frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} x+\frac{1}{{\sqrt{1 - v_x^2/c^2}}} c t\end{aligned} \hspace{\stretch{1}}(4.17)

PICTURE: Let us go to the more conventional case, where O is at rest and O' is moving with velocity v_x.

We achieve this by simply changing the sign of v_x in 4.17 above. This gives us

\begin{aligned}x' &= \frac{1}{{\sqrt{1 - v_x^2/c^2}}} x-\frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} c t \\ y' &= y \\ z' &= z \\ ct' &= -\frac{v_x/c}{\sqrt{1 - v_x^2/c^2}} x+\frac{1}{{\sqrt{1 - v_x^2/c^2}}} c t\end{aligned} \hspace{\stretch{1}}(4.21)

We want some shorthand to make this easier to write and introduce

\begin{aligned}\gamma = \frac{1}{{\sqrt{1 - v_x^2/c^2}}},\end{aligned} \hspace{\stretch{1}}(4.25)

so that 4.21 becomes

\begin{aligned}x' &=  \gamma \left( x - \frac{v_x}{c} ct \right) \\ ct' &=  \gamma \left( ct - \frac{v_x}{c} x \right)\end{aligned} \hspace{\stretch{1}}(4.26)

We started the class by saying these would generalize the Galilean transformations. Observe that if we take c \rightarrow \infty, we have \gamma \rightarrow 1 and

\begin{aligned}x' &= x - v_x t + O((v_x/c)^2) \\ t' &= t  + O(v_x/c)\end{aligned} \hspace{\stretch{1}}(4.28)

This is how to remember the signs. We want things to match up with the non-relativistic limit.
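The relation between rapidity and velocity, and the non-relativistic limit, can both be checked numerically. The following is a rough sketch, with helper names (`gamma`, `boost`) and sample numbers that are my own assumptions for illustration. It confirms that \cosh \psi = \gamma when \tanh \psi = v_x/c, and that at an everyday speed the boost 4.26 is indistinguishable from the Galilean transformation.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor of (4.25)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def boost(x, t, v):
    """Boost (4.26) for a frame O' moving with velocity +v along x.

    Returns (x', t') with x' in metres and t' in seconds.
    """
    g = gamma(v)
    beta = v / C
    x_prime = g * (x - beta * C * t)
    ct_prime = g * (C * t - beta * x)
    return x_prime, ct_prime / C

# Rapidity for v = 0.6 c, from tanh(psi) = v/c.
v_fast = 0.6 * C
psi = math.atanh(v_fast / C)
print(math.cosh(psi), gamma(v_fast))

# Everyday speed: compare with the Galilean x' = x - v t, t' = t.
x, t, v_slow = 1000.0, 2.0, 30.0
x_prime, t_prime = boost(x, t, v_slow)
print(x_prime - (x - v_slow * t), t_prime - t)
```

The differences printed on the last line are tiny compared to the coordinates themselves, which is the numerical face of the c \rightarrow \infty limit.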

\paragraph{Q: What do lines of constant x' and ct' look like on the x,ct spacetime diagram?}

Our starting point (again) is

\begin{aligned}x' &=  \gamma \left( x - \frac{v_x}{c} ct \right) \\ ct' &=  \gamma \left( ct - \frac{v_x}{c} x \right).\end{aligned} \hspace{\stretch{1}}(4.29)

What are the points with x' = 0? Those are the points where x = (v_x/c) c t. This is the ct' axis, which is the straight worldline of the O' origin.

PICTURE: worldline of O' origin.

What are the points with ct' = 0? Those are the points where c t = x v_x/c. This is the x' axis.

Lines that are parallel to the x' axis are lines of constant t', and lines parallel to the ct' axis are lines of constant x', but the light cone is the same for both frames.
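To make the two axes concrete, here is a small sketch in units where c = 1 (so `beta` plays the role of v_x/c). The `boost` helper and the sample points are illustrative assumptions of mine. Points picked on the line x = \beta ct should land on x' = 0, and points on the line ct = \beta x should land on ct' = 0.

```python
import math

def boost(x, ct, beta):
    """Boost (4.29) with beta = v_x/c; returns (x', ct')."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (x - beta * ct), g * (ct - beta * x)

beta = 0.5

# Points on the ct' axis satisfy x = beta * ct in the unprimed frame.
on_ct_axis = [boost(beta * ct, ct, beta) for ct in (1.0, 2.0, 3.0)]

# Points on the x' axis satisfy ct = beta * x in the unprimed frame.
on_x_axis = [boost(x, beta * x, beta) for x in (1.0, 2.0, 3.0)]

print(on_ct_axis)  # first entry of each pair (x') should vanish
print(on_x_axis)   # second entry of each pair (ct') should vanish
```

Both axes tilt toward the light cone by the same slope \beta, one measured from the ct axis and one from the x axis.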

\paragraph{What is this good for?}

We have time to pick either length contraction or non-causality (how to kill your grandfather). How about length contraction? We can use the diagram to read off the x or ct coordinates, or to examine causality, but it is hard to read off t' or x' coordinates.


[1] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

Lorentz transformation of the metric tensors.

Posted by peeterjoot on January 16, 2011


Following up on the previous thought, it is not hard to come up with an example of a symmetric tensor a whole lot simpler than the electrodynamic stress tensor. The metric tensor is probably the simplest symmetric tensor, and we get that by considering the dot product of two vectors. Taking the dot product of vectors a and b for example we have

\begin{aligned}a \cdot b = a^\mu b^\nu \gamma_\mu \cdot \gamma_\nu\end{aligned} \hspace{\stretch{1}}(4.17)

From this, the metric tensors are defined as

\begin{aligned}\eta_{\mu\nu} &= \gamma_\mu \cdot \gamma_\nu \\ \eta^{\mu\nu} &= \gamma^\mu \cdot \gamma^\nu\end{aligned} \hspace{\stretch{1}}(4.18)

These are both symmetric and diagonal, and in fact equal (regardless of whether one picks a +,-,-,- or -,+,+,+ signature for the space).

Let’s look at the transformation of the dot product, utilizing the transformation of the four vectors being dotted to do so. By definition, when both vectors are equal, we have the (squared) spacetime interval, which based on the speed of light being constant, has been found to be an invariant under transformation.

\begin{aligned}a' \cdot b'= a^\mu b^\nu L(\gamma_\mu) \cdot L(\gamma_\nu)\end{aligned} \hspace{\stretch{1}}(4.20)

We note that, like any other vector, the image L(\gamma_\mu) of the Lorentz transform of the vector \gamma_\mu can be written as

\begin{aligned}L(\gamma_\mu) = \left( L(\gamma_\mu) \cdot \gamma^\nu \right) \gamma_\nu\end{aligned} \hspace{\stretch{1}}(4.21)

Similarly we can write any vector in terms of the reciprocal frame

\begin{aligned}\gamma_\nu = (\gamma_\nu \cdot \gamma_\mu) \gamma^\mu.\end{aligned} \hspace{\stretch{1}}(4.22)

The dot product factor is a component of the metric tensor

\begin{aligned}\eta_{\nu \mu} = \gamma_\nu \cdot \gamma_\mu,\end{aligned} \hspace{\stretch{1}}(4.23)

so we see that the dot product transforms as

\begin{aligned}a' \cdot b' = a^\mu b^\nu ( L(\gamma_\mu) \cdot \gamma^\alpha ) ( L(\gamma_\nu) \cdot \gamma^\beta ) \gamma_\alpha\cdot\gamma_\beta= a^\mu b^\nu {L_\mu}^\alpha{L_\nu}^\beta\eta_{\alpha \beta}\end{aligned} \hspace{\stretch{1}}(4.24)

In particular, for a = b where we have the invariant interval defined by the condition a^2 = {a'}^2, we must have

\begin{aligned}a^\mu a^\nu \eta_{\mu \nu}= a^\mu a^\nu {L_\mu}^\alpha{L_\nu}^\beta\eta_{\alpha \beta}\end{aligned} \hspace{\stretch{1}}(4.25)

This implies that the symmetric metric tensor transforms as

\begin{aligned}\eta_{\mu\nu}={L_\mu}^\alpha{L_\nu}^\beta\eta_{\alpha \beta}\end{aligned} \hspace{\stretch{1}}(4.26)

Recall from 3.16 that the coordinate representation of a bivector, an antisymmetric quantity, transforms as

\begin{aligned}T^{\mu \nu} \rightarrow T^{\sigma \pi} {L_\sigma}^\mu {L_\pi}^\nu.\end{aligned} \hspace{\stretch{1}}(4.27)

This is a very similar transformation, but differs from the bivector case where our free indexes were upper indexes. Suppose that we define an alternate set of coordinates for the Lorentz transformation. Let

\begin{aligned}{L^\mu}_\nu = L(\gamma^\mu) \cdot \gamma_\nu.\end{aligned} \hspace{\stretch{1}}(4.28)

This can be related to the previous coordinate matrix by

\begin{aligned}{L^\mu}_\nu = \eta^{\mu \alpha } \eta_{\nu \beta } {L_\alpha}^\beta. \end{aligned} \hspace{\stretch{1}}(4.29)

If we examine how the coordinates of x^2 transform in their lower index representation we find

\begin{aligned}{x'}^2 = x_\mu x_\nu {L^\mu}_\alpha {L^\nu}_\beta \eta^{\alpha \beta} = x^2 = x_\mu x_\nu \eta^{\mu \nu},\end{aligned} \hspace{\stretch{1}}(4.30)

and therefore find that the (upper index) metric tensor transforms as

\begin{aligned}\eta^{\mu \nu} \rightarrow\eta^{\alpha \beta}{L^\mu}_\alpha {L^\nu}_\beta .\end{aligned} \hspace{\stretch{1}}(4.31)

Compared to 4.27 we have almost the same structure of transformation. Are these the same? Does the notation I picked here introduce an apparent difference that does not actually exist? We really want to know if we have the identity

\begin{aligned}L(\gamma_\mu) \cdot \gamma^\nu\stackrel{?}{=}L(\gamma^\nu) \cdot \gamma_\mu,\end{aligned} \hspace{\stretch{1}}(4.32)

which given the notation selected would mean that {L_\mu}^\nu = {L^\nu}_\mu, and justify a notational simplification {L_\mu}^\nu = {L^\nu}_\mu = L^\nu_\mu.

The inverse Lorentz transformation

To answer this question, let’s consider a specific example, an x-axis boost of rapidity \alpha. For that our Lorentz transformation takes the following form

\begin{aligned}L(x) = e^{-\sigma_1 \alpha/2} x e^{\sigma_1 \alpha/2},\end{aligned} \hspace{\stretch{1}}(5.33)

where \sigma_k = \gamma_k \gamma_0. Since \sigma_1 anticommutes with \gamma_0 and \gamma_1, but commutes with \gamma_2 and \gamma_3, we have

\begin{aligned}L(x) = (x^0 \gamma_0 + x^1 \gamma_1) e^{\sigma_1 \alpha} + x^2 \gamma_2 + x^3 \gamma_3,\end{aligned} \hspace{\stretch{1}}(5.34)

and after expansion this is

\begin{aligned}L(x) = \gamma_0 ( x^0 \cosh \alpha - x^1 \sinh \alpha ) +\gamma_1 ( x^1 \cosh \alpha - x^0 \sinh \alpha )+ x^2 \gamma_2+ x^3 \gamma_3\end{aligned} \hspace{\stretch{1}}(5.35)

In particular for the basis vectors themselves we have

\begin{aligned}\begin{bmatrix}L(\gamma_0) \\ L(\gamma_1) \\ L(\gamma_2) \\ L(\gamma_3)\end{bmatrix}=\begin{bmatrix}\gamma_0 \cosh \alpha - \gamma_1 \sinh \alpha \\ -\gamma_0 \sinh \alpha + \gamma_1 \cosh \alpha \\ \gamma_2 \\ \gamma_3\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(5.36)

Forming a matrix with \mu indexing over rows and \nu indexing over columns we have

\begin{aligned}{L_\mu}^\nu =\begin{bmatrix}\cosh \alpha &- \sinh \alpha & 0 & 0 \\ -\sinh \alpha & \cosh \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(5.37)

Performing the same expansion for {L^\nu}_\mu, again with \mu indexing over rows, we have

\begin{aligned}{L^\nu}_\mu =\begin{bmatrix}\cosh \alpha & \sinh \alpha & 0 & 0 \\ \sinh \alpha & \cosh \alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(5.38)

This answers the question. We cannot assume that {L_\mu}^\nu = {L^\nu}_\mu. In fact, in this particular case, we have {L^\nu}_\mu = ({L_\mu}^\nu)^{-1}. Is that a general condition? Note that for the general case, we have to consider compounded transformations, where each can be a boost or rotation.
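For this x-axis boost, both the inverse relationship and the metric invariance 4.26 are easy to confirm numerically. Here is a minimal sketch using plain nested lists. The helper names and the sample rapidity are my own choices, and the row/column convention follows the one stated above (\mu indexing rows).

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def boost_matrix(alpha, sign):
    """4x4 x-axis boost matrix; sign=-1 gives (5.37), sign=+1 gives (5.38)."""
    ch, sh = math.cosh(alpha), sign * math.sinh(alpha)
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

alpha = 0.9  # arbitrary sample rapidity
L_lower_upper = boost_matrix(alpha, -1)  # {L_mu}^nu
L_upper_lower = boost_matrix(alpha, +1)  # {L^nu}_mu

# The two coordinate matrices multiply to the identity.
product = matmul(L_lower_upper, L_upper_lower)
print(max_abs_diff(product, IDENTITY))

# Metric invariance (4.26): eta_{mu nu} = {L_mu}^alpha {L_nu}^beta eta_{alpha beta}.
transformed_metric = matmul(matmul(L_lower_upper, ETA), transpose(L_lower_upper))
print(max_abs_diff(transformed_metric, ETA))
```

This only tests the single boost considered here. It says nothing about whether the inverse relationship holds for compounded boosts and rotations, which is the open question above.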


[1] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

[2] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

Posted in Math and Physics Learning.

Multivector commutators and Lorentz boosts.

Posted by peeterjoot on October 31, 2010



In some reading I found that the electrodynamic field components transform in a reversed sense to that of vectors. Instead of the components perpendicular to the boost direction remaining unaffected, those are the parts that are altered.

To explore this, look at the Lorentz boost action on a multivector, utilizing symmetric and antisymmetric products to split that multivector into portions affected and unaffected by the boost. For the bivector (electrodynamic case) and the four vector case, examine how these map to dot and wedge (or cross) products.

The underlying motivator for this boost consideration is an attempt to see where equation (6.70) of [1] comes from. We get to this by the very end.


Structure of the bivector boost.

Recall that we can write our Lorentz boost in exponential form with

\begin{aligned}L &= e^{\alpha \boldsymbol{\sigma}/2} \\ X' &= L^\dagger X L,\end{aligned} \hspace{\stretch{1}}(2.1)

where \boldsymbol{\sigma} is a spatial vector. This works for our bivector field too, assuming the composite transformation is an outermorphism of the transformed four vectors. Applying the boost to both the gradient and the potential our transformed field is then

\begin{aligned}F' &= \nabla' \wedge A' \\ &= (L^\dagger \nabla L) \wedge (L^\dagger A L) \\ &= \frac{1}{{2}} \left((L^\dagger \stackrel{ \rightarrow }{\nabla} L) (L^\dagger A L) -(L^\dagger A L) (L^\dagger \stackrel{ \leftarrow }{\nabla} L)\right) \\ &= \frac{1}{{2}} L^\dagger \left( \stackrel{ \rightarrow }{\nabla} A - A \stackrel{ \leftarrow }{\nabla} \right) L  \\ &= L^\dagger (\nabla \wedge A) L.\end{aligned}

Note that arrows were used briefly to indicate that the partials of the gradient are still acting on A despite their vector components being to one side. We are left with the very simple transformation rule

\begin{aligned}F' = L^\dagger F L,\end{aligned} \hspace{\stretch{1}}(2.3)

which has exactly the same structure as the four vector boost.

Employing the commutator and anticommutator to find the parallel and perpendicular components.

If we apply the boost to a four vector, those components of the four vector that commute with the spatial direction \boldsymbol{\sigma} are unaffected. As an example, which also serves to ensure we have the sign of the rapidity angle \alpha correct, consider \boldsymbol{\sigma} = \boldsymbol{\sigma}_1. We have

\begin{aligned}X' = e^{-\alpha \boldsymbol{\sigma}/2} ( x^0 \gamma_0 + x^1 \gamma_1 + x^2 \gamma_2 + x^3 \gamma_3 ) (\cosh \alpha/2 + \gamma_1 \gamma_0 \sinh \alpha/2 )\end{aligned} \hspace{\stretch{1}}(2.4)

We observe that the scalar and \boldsymbol{\sigma}_1 = \gamma_1 \gamma_0 components of the exponential commute with \gamma_2 and \gamma_3 since there is no vector in common, but that \boldsymbol{\sigma}_1 anticommutes with \gamma_0 and \gamma_1. We can therefore write

\begin{aligned}X' &= x^2 \gamma_2 + x^3 \gamma_3 +( x^0 \gamma_0 + x^1 \gamma_1 ) (\cosh \alpha + \gamma_1 \gamma_0 \sinh \alpha ) \\ &= x^2 \gamma_2 + x^3 \gamma_3 +\gamma_0 ( x^0 \cosh\alpha - x^1 \sinh \alpha )+ \gamma_1 ( x^1 \cosh\alpha - x^0 \sinh \alpha )\end{aligned}

reproducing the familiar matrix result should we choose to write it out. How can we express the commutation property without resorting to components? We could write the four vector as a spatial and timelike component, as in

\begin{aligned}X = x^0 \gamma_0 + \mathbf{x} \gamma_0,\end{aligned} \hspace{\stretch{1}}(2.5)

and further separate that into components parallel and perpendicular to the spatial unit vector \boldsymbol{\sigma} as

\begin{aligned}X = x^0 \gamma_0 + (\mathbf{x} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma} \gamma_0 + (\mathbf{x} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma} \gamma_0.\end{aligned} \hspace{\stretch{1}}(2.6)

However, it would be nicer to group the first two terms together, since they are the ones that are affected by the transformation. It would also be nice to not have to resort to spatial dot and wedge products, since we get into trouble too easily if we try to mix dot and wedge products of four vector and spatial vector components.

What we can do is employ symmetric and antisymmetric products (the anticommutator and commutator respectively). Recall that we can write any multivector product this way, and in particular

\begin{aligned}M \boldsymbol{\sigma} = \frac{1}{{2}} (M \boldsymbol{\sigma}  + \boldsymbol{\sigma} M) + \frac{1}{{2}} (M \boldsymbol{\sigma} - \boldsymbol{\sigma} M).\end{aligned} \hspace{\stretch{1}}(2.7)

Right multiplying by the unit spatial vector \boldsymbol{\sigma}, and using \boldsymbol{\sigma}^2 = 1, we have

\begin{aligned}M = \frac{1}{{2}} (M + \boldsymbol{\sigma} M \boldsymbol{\sigma}) + \frac{1}{{2}} (M - \boldsymbol{\sigma} M \boldsymbol{\sigma}) = \frac{1}{{2}} \left\{{M},{\boldsymbol{\sigma}}\right\} \boldsymbol{\sigma} + \frac{1}{{2}} \left[{M},{\boldsymbol{\sigma}}\right] \boldsymbol{\sigma}.\end{aligned} \hspace{\stretch{1}}(2.8)

When M = \mathbf{a} is a spatial vector this is our familiar split into parallel and perpendicular components with the respective projection and rejection operators

\begin{aligned}\mathbf{a} = \frac{1}{{2}} \left\{\mathbf{a},{\boldsymbol{\sigma}}\right\} \boldsymbol{\sigma} + \frac{1}{{2}} \left[{\mathbf{a}},{\boldsymbol{\sigma}}\right] \boldsymbol{\sigma} = (\mathbf{a} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma} + (\mathbf{a} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma}.\end{aligned} \hspace{\stretch{1}}(2.9)

However, the more general split employing symmetric and antisymmetric products in 2.8, is something we can use for our four vector and bivector objects too.

Observe that we have the commutation and anti-commutation relationships

\begin{aligned}\left( \frac{1}{{2}} \left\{{M},{\boldsymbol{\sigma}}\right\} \boldsymbol{\sigma} \right) \boldsymbol{\sigma} &= \boldsymbol{\sigma} \left( \frac{1}{{2}} \left\{{M},{\boldsymbol{\sigma}}\right\} \boldsymbol{\sigma} \right) \\ \left( \frac{1}{{2}} \left[{M},{\boldsymbol{\sigma}}\right] \boldsymbol{\sigma} \right) \boldsymbol{\sigma} &= -\boldsymbol{\sigma} \left( \frac{1}{{2}} \left[{M},{\boldsymbol{\sigma}}\right] \boldsymbol{\sigma} \right).\end{aligned} \hspace{\stretch{1}}(2.10)

This split therefore serves to separate the multivector object in question nicely into the portions that are acted on by the Lorentz boost, or left unaffected.

Application of the symmetric and antisymmetric split to the bivector field.

Let’s apply 2.8 to the spacetime event X again with an x-axis boost \boldsymbol{\sigma} = \boldsymbol{\sigma}_1. The anticommutator portion of X in this boost direction is

\begin{aligned}\frac{1}{{2}} \left\{{X},{\boldsymbol{\sigma}_1}\right\} \boldsymbol{\sigma}_1&=\frac{1}{{2}} \left(\left( x^0 \gamma_0 + x^1 \gamma_1 + x^2 \gamma_2 + x^3 \gamma_3 \right)+\gamma_1 \gamma_0\left( x^0 \gamma_0 + x^1 \gamma_1 + x^2 \gamma_2 + x^3 \gamma_3 \right) \gamma_1 \gamma_0\right) \\ &=x^2 \gamma_2 + x^3 \gamma_3,\end{aligned}

whereas the commutator portion gives us

\begin{aligned}\frac{1}{{2}} \left[{X},{\boldsymbol{\sigma}_1}\right] \boldsymbol{\sigma}_1&=\frac{1}{{2}} \left(\left( x^0 \gamma_0 + x^1 \gamma_1 + x^2 \gamma_2 + x^3 \gamma_3 \right)-\gamma_1 \gamma_0\left( x^0 \gamma_0 + x^1 \gamma_1 + x^2 \gamma_2 + x^3 \gamma_3 \right) \gamma_1 \gamma_0\right) \\ &=x^0 \gamma_0 + x^1 \gamma_1.\end{aligned}

We’ve seen that only these commutator portions are acted on by the boost. We have therefore found the desired logical grouping of the four vector X into portions that are left unchanged by the boost and those that are affected. That is

\begin{aligned}\frac{1}{{2}} \left[{X},{\boldsymbol{\sigma}}\right] \boldsymbol{\sigma} &= x^0 \gamma_0 + (\mathbf{x} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma} \gamma_0  \\ \frac{1}{{2}} \left\{{X},{\boldsymbol{\sigma}}\right\} \boldsymbol{\sigma} &= (\mathbf{x} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma} \gamma_0 \end{aligned} \hspace{\stretch{1}}(2.12)
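The commutator and anticommutator split of the four vector can also be checked in a matrix representation. The sketch below is my own illustration and makes one assumption that is not part of the post: it uses the Dirac representation of the gamma matrices as a stand-in for the basis vectors \gamma_\mu (any matrices with \gamma_0^2 = 1, \gamma_k^2 = -1 and mutual anticommutation would do). The sample coordinates are arbitrary.

```python
def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(terms):
    """Linear combination sum(c * m) of 4x4 matrices, given (c, m) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]

def max_abs_diff(a, b):
    """Largest absolute entrywise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Dirac representation (an assumed concrete choice for the basis vectors).
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]

# Sample event X = x^mu gamma_mu and boost direction sigma_1 = gamma_1 gamma_0.
x0, x1, x2, x3 = 2.0, 3.0, 5.0, 7.0
X = combine([(x0, G0), (x1, G1), (x2, G2), (x3, G3)])
SIGMA1 = matmul(G1, G0)
sXs = matmul(matmul(SIGMA1, X), SIGMA1)

# (1/2){X, sigma} sigma = (X + sigma X sigma)/2, and similarly for [X, sigma].
anticommutator_part = combine([(0.5, X), (0.5, sXs)])
commutator_part = combine([(0.5, X), (-0.5, sXs)])

# Both differences should print as zero.
print(max_abs_diff(anticommutator_part, combine([(x2, G2), (x3, G3)])))
print(max_abs_diff(commutator_part, combine([(x0, G0), (x1, G1)])))
```

The commutator part picks out exactly the x^0 and x^1 terms that the boost mixes, while the anticommutator part is the untouched x^2, x^3 remainder.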

Let’s now return to the bivector field F = \nabla \wedge A = \mathbf{E} + I c \mathbf{B}, and split that multivector into boostable and unboostable portions with the commutator and anticommutator respectively.

Observing that our pseudoscalar I commutes with all spatial vectors we have for the anticommutator parts that will not be affected by the boost

\begin{aligned}\frac{1}{{2}} \left\{{\mathbf{E} + I c \mathbf{B}},{\boldsymbol{\sigma}}\right\} \boldsymbol{\sigma} &= (\mathbf{E} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma} + I c (\mathbf{B} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma},\end{aligned} \hspace{\stretch{1}}(2.14)

and for the components that will be boosted we have

\begin{aligned}\frac{1}{{2}} \left[{\mathbf{E} + I c \mathbf{B}},{\boldsymbol{\sigma}}\right] \boldsymbol{\sigma} &= (\mathbf{E} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma} + I c (\mathbf{B} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma}.\end{aligned} \hspace{\stretch{1}}(2.15)

For the four vector case we saw that the components that lay “perpendicular” to the boost direction, were unaffected by the boost. For the field we see the opposite, and the components of the individual electric and magnetic fields that are parallel to the boost direction are unaffected.

Our boosted field is therefore

\begin{aligned}F' = (\mathbf{E} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma} + I c (\mathbf{B} \cdot \boldsymbol{\sigma}) \boldsymbol{\sigma}+ \left( (\mathbf{E} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma} + I c (\mathbf{B} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma}\right) \left( \cosh \alpha + \boldsymbol{\sigma} \sinh \alpha \right)\end{aligned} \hspace{\stretch{1}}(2.16)

Focusing on just the non-parallel terms we have

\begin{aligned}\left( (\mathbf{E} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma} + I c (\mathbf{B} \wedge \boldsymbol{\sigma}) \boldsymbol{\sigma}\right) \left( \cosh \alpha + \boldsymbol{\sigma} \sinh \alpha \right)&=(\mathbf{E}_\perp + I c \mathbf{B}_\perp ) \cosh\alpha+(I \mathbf{E} \times \boldsymbol{\sigma} - c \mathbf{B} \times \boldsymbol{\sigma} ) \sinh\alpha \\ &=\mathbf{E}_\perp \cosh\alpha - c (\mathbf{B} \times \boldsymbol{\sigma} ) \sinh\alpha + I ( c \mathbf{B}_\perp \cosh\alpha + (\mathbf{E} \times \boldsymbol{\sigma}) \sinh\alpha ) \\ &=\gamma \left(\mathbf{E}_\perp - c (\mathbf{B} \times \boldsymbol{\sigma} ) {\left\lvert{\mathbf{v}}\right\rvert}/c+ I ( c \mathbf{B}_\perp + (\mathbf{E} \times \boldsymbol{\sigma}) {\left\lvert{\mathbf{v}}\right\rvert}/c) \right)\end{aligned}

A final regrouping gives us

\begin{aligned}F'&=\mathbf{E}_\parallel + \gamma \left( \mathbf{E}_\perp - \mathbf{B} \times \mathbf{v} \right)+I c \left( \mathbf{B}_\parallel + \gamma \left( \mathbf{B}_\perp + \mathbf{E} \times \mathbf{v}/c^2 \right) \right)\end{aligned} \hspace{\stretch{1}}(2.17)
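A numerical check of 2.17 is possible without any geometric algebra, reading off the \mathbf{E}' and \mathbf{B}' parts as ordinary three vectors. The sketch below is only an illustration under my own assumptions: SI units, helper names of my choosing, and arbitrary sample field values. It verifies that the two standard field invariants \mathbf{E} \cdot \mathbf{B} and \mathbf{E}^2 - c^2 \mathbf{B}^2 survive the boost, and evaluates the pure electric field case.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def split(a, n):
    """Components of a parallel and perpendicular to the unit vector n."""
    par = tuple(dot(a, n) * c for c in n)
    perp = tuple(x - p for x, p in zip(a, par))
    return par, perp

def boost_fields(E, B, v):
    """Apply (2.17); returns (E', B') for a nonzero velocity v with |v| < c."""
    speed = math.sqrt(dot(v, v))
    g = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    n = tuple(c / speed for c in v)
    E_par, E_perp = split(E, n)
    B_par, B_perp = split(B, n)
    BxV, ExV = cross(B, v), cross(E, v)
    E_new = tuple(p + g * (t - bv) for p, t, bv in zip(E_par, E_perp, BxV))
    B_new = tuple(p + g * (t + ev / C ** 2) for p, t, ev in zip(B_par, B_perp, ExV))
    return E_new, B_new

def invariants(E, B):
    return dot(E, B), dot(E, E) - C ** 2 * dot(B, B)

# Arbitrary sample fields (V/m and T) and a boost at half the speed of light.
E = (1.0e3, 2.0e3, -5.0e2)
B = (1.0e-6, -2.0e-6, 3.0e-6)
v = (0.3 * C, 0.0, 0.4 * C)
E_new, B_new = boost_fields(E, B, v)
print(invariants(E, B))
print(invariants(E_new, B_new))

# Pure electric field perpendicular to the boost: B' = -gamma v x E / c^2.
E0, v0 = (0.0, 1.0e3, 0.0), (0.1 * C, 0.0, 0.0)
_, B0_new = boost_fields(E0, (0.0, 0.0, 0.0), v0)
print(B0_new)
```

The sign convention for the boost direction here simply follows 2.17 as written, so whether \mathbf{v} is the velocity of the frame or of the observer is inherited from the derivation above.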

In particular when we consider the proton, electron system as in equation (6.70) of [1] where it is stated that the electron will feel a magnetic field given by

\begin{aligned}\mathbf{B} = - \frac{\mathbf{v}}{c} \times \mathbf{E}\end{aligned} \hspace{\stretch{1}}(2.18)

we can see where this comes from. If F = \mathbf{E} + I c (0) is the field acting on the electron, then applying a \mathbf{v} boost perpendicular to the field (i.e. perpendicular to the radial direction), 2.17 gives

\begin{aligned}F' = \gamma \mathbf{E} + I c \gamma \mathbf{E} \times \mathbf{v}/c^2 = \gamma \mathbf{E} -I c \gamma \frac{\mathbf{v}}{c^2} \times \mathbf{E}\end{aligned} \hspace{\stretch{1}}(2.19)

The I c term is the magnetic field seen by the electron.

We also have an additional 1/c factor in our result, but that’s a consequence of the choice of units where the dimensions of \mathbf{E} match c \mathbf{B}, whereas in the text we have \mathbf{E} and \mathbf{B} in the same units. We also have an additional \gamma factor, so we must presume that {\left\lvert{\mathbf{v}}\right\rvert} \ll c in this portion of the text. That is actually a requirement here, for if the electron were already in motion, we'd have to boost a field that also included a magnetic component. A consequence of this is that the final interaction Hamiltonian of (6.75) is necessarily non-relativistic.


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

Posted in Math and Physics Learning.

Rotations using matrix exponentials

Posted by peeterjoot on July 27, 2010



In [1] it is noted in problem 1.3 that any Unitary operator can be expressed in exponential form

\begin{aligned}U = e^{iC},\end{aligned} \hspace{\stretch{1}}(1.1)

where C is Hermitian. This is a powerful result hiding away in this problem. I haven’t actually managed to prove this yet to my satisfaction, but working through some examples is highly worthwhile. In particular it is interesting to compute the matrix C for a rotation matrix. One finds that the matrix for such a rotation operator is in fact one of the Pauli spin matrices, and I found it interesting that this falls out so naturally. Additionally, it is rather slick that one is able to so concisely express the rotation in exponential form, something that is natural and powerful in complex variable algebra, and also possible using Geometric Algebra using exponentials of bivectors. Here we can do it after all with nothing more than the plain old matrix algebra that everybody is already comfortable with.

The logarithm of the Unitary matrix.

By inspection we can invert 1.1 for C, by taking the logarithm

\begin{aligned}C = -i \ln U.\end{aligned} \hspace{\stretch{1}}(2.2)

The problem becomes one of evaluating the logarithm, or even giving meaning to it. I’ll assume that the functions of matrices that we are interested in are all polynomial in powers of the matrix, as in

\begin{aligned}f(U) = \sum_k \alpha_k U^k,\end{aligned} \hspace{\stretch{1}}(2.3)

and that such series are convergent. Then using a spectral decomposition, possible since Unitary matrices are normal, we can write for diagonal \Sigma = {\begin{bmatrix} \lambda_i \end{bmatrix}}_i

\begin{aligned}U = V \Sigma V^\dagger,\end{aligned} \hspace{\stretch{1}}(2.4)


\begin{aligned}f(U) = V \left( \sum_k \alpha_k \Sigma^k \right) V^\dagger = V {\begin{bmatrix} f(\lambda_i) \end{bmatrix}}_i V^\dagger.\end{aligned} \hspace{\stretch{1}}(2.5)

Provided the logarithm has a convergent power series representation for U, we then have for our Hermitian matrix C

\begin{aligned}C = -i V (\ln \Sigma) V^\dagger\end{aligned} \hspace{\stretch{1}}(2.6)

Evaluate this logarithm for an x,y plane rotation.

Given the rotation matrix

\begin{aligned}U =\begin{bmatrix}\cos\theta & \sin\theta \\ -\sin\theta & \cos\theta\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.7)

We find that the eigenvalues are e^{\pm i\theta}, with eigenvectors proportional to (1, \pm i) respectively. Our decomposition for U is then given by 2.4, and

\begin{aligned}V &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 & 1 \\ i & -i\end{bmatrix} \\ \Sigma &=\begin{bmatrix}e^{i\theta} & 0 \\ 0 & e^{-i\theta}\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.8)

Taking logs we have

\begin{aligned}C&=\frac{1}{2}\begin{bmatrix}1 & 1 \\ i & -i\end{bmatrix}\begin{bmatrix}\theta & 0 \\ 0 & -\theta\end{bmatrix} \begin{bmatrix}1 & -i \\ 1 & i\end{bmatrix} \\ &=\frac{1}{2}\begin{bmatrix}1 & 1 \\ i & -i\end{bmatrix}\begin{bmatrix}\theta  & -i\theta \\ -\theta & -i\theta\end{bmatrix}  \\ &=\begin{bmatrix}0 & -i\theta \\ i\theta & 0\end{bmatrix}.\end{aligned}

With the Pauli matrix

\begin{aligned}\sigma_2 =\begin{bmatrix}0 & -i \\ i & 0\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.10)

we then have for an x,y plane rotation matrix just:

\begin{aligned}C = \theta \sigma_2\end{aligned} \hspace{\stretch{1}}(2.11)


\begin{aligned}U = e^{i \theta \sigma_2}.\end{aligned} \hspace{\stretch{1}}(2.12)

Immediately, since \sigma_2^2 = I, this also provides us with a trigonometric expansion

\begin{aligned}U = I \cos\theta + i \sigma_2 \sin\theta.\end{aligned} \hspace{\stretch{1}}(2.13)

By inspection one can see that this takes us full circle back to the original matrix form 2.7 of the rotation. The exponential form of 2.12 has a beauty that is however far superior to the plain old trigonometric matrix that we are comfortable with. All without any geometric algebra or bivector exponentials.
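The exponential form 2.12 can be tested directly by summing the power series for the matrix exponential. This is a rough sketch, assuming a truncated series is good enough for a small angle. The `expm_series` helper, the number of terms, and the sample angle are my own choices.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(m, terms=30):
    """Matrix exponential of a 2x2 matrix by a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term holds m^k / k! after this update.
        term = [[entry / k for entry in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

SIGMA2 = [[0, -1j], [1j, 0]]
theta = 0.4  # arbitrary sample angle in radians

# U = exp(i theta sigma_2), to be compared with the rotation matrix (2.7).
exponent = [[1j * theta * s for s in row] for row in SIGMA2]
U = expm_series(exponent)
expected = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

error = max(abs(U[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(error)
```

Note that the exponent i \theta \sigma_2 is a purely real antisymmetric matrix, which is why a real rotation matrix comes out of what looks like a complex exponential.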

Three dimensional exponential rotation matrices.

By inspection, we can augment the exponent iC for a three dimensional rotation in the x,y plane, or a y,z rotation, or a x,z rotation. The row and column for the unrotated axis are zero, so that the exponential leaves that coordinate unchanged. Those are, respectively

\begin{aligned}U_{x,y}&=\exp\begin{bmatrix}0 & \theta & 0 \\ -\theta & 0 & 0 \\ 0 & 0 & 0\end{bmatrix} \\ U_{y,z}&=\exp\begin{bmatrix}0 & 0 & 0 \\ 0 & 0 & \theta \\ 0 & -\theta & 0 \\ \end{bmatrix} \\ U_{x,z}&=\exp\begin{bmatrix}0 & 0 & \theta \\ 0 & 0 & 0 \\ -\theta & 0 & 0 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(2.14)

Each of these matrices can be related to each other by similarity transformation using the permutation matrices

\begin{aligned}\begin{bmatrix}0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ \end{bmatrix},\end{aligned}


\begin{aligned}\begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ \end{bmatrix}.\end{aligned}

Exponential matrix form for a Lorentz boost.

The next obvious thing to try with this matrix representation is a Lorentz boost.

\begin{aligned}L =\begin{bmatrix}\cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & \cosh\alpha\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.17)

where \cosh\alpha = \gamma, and \tanh\alpha = \beta.

This matrix has a spectral decomposition given by

\begin{aligned}V &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 & 1 \\ -1 & 1\end{bmatrix} \\ \Sigma &=\begin{bmatrix}e^\alpha & 0 \\ 0 & e^{-\alpha}\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.18)

Taking logs and computing C we have

\begin{aligned}C&=-\frac{i}{2}\begin{bmatrix}1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix}\alpha & 0 \\ 0 & -\alpha\end{bmatrix} \begin{bmatrix}1 & -1 \\ 1 & 1\end{bmatrix} \\ &=-\frac{i}{2}\begin{bmatrix}1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix}\alpha & -\alpha \\ -\alpha & -\alpha\end{bmatrix} \\ &=i \alpha\begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix}.\end{aligned}

Again we have one of the Pauli spin matrices. This time it is

\begin{aligned}\sigma_1 =\begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.20)

So we can write our Lorentz boost 2.17 as just

\begin{aligned}L = e^{-\alpha \sigma_1} = I \cosh\alpha - \sigma_1 \sinh\alpha.\end{aligned} \hspace{\stretch{1}}(2.21)

Again by inspection, we can come full circle from this last hyperbolic representation back to the original explicit matrix representation. Quite nifty!
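The same power series check works for the boost. Here is a minimal sketch, with the `expm_series` helper, the truncation length, and the sample rapidity all being my own illustrative assumptions. It sums the series for e^{-\alpha \sigma_1}, compares against 2.17, and also computes L L^\mathrm{T}, which would be the identity if L were orthogonal (unitary).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(m, terms=40):
    """Matrix exponential of a 2x2 matrix by a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term holds m^k / k! after this update.
        term = [[entry / k for entry in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

SIGMA1 = [[0.0, 1.0], [1.0, 0.0]]
alpha = 0.8  # arbitrary sample rapidity

# L = exp(-alpha sigma_1), to be compared with the boost matrix (2.17).
L = expm_series([[-alpha * s for s in row] for row in SIGMA1])
expected = [[math.cosh(alpha), -math.sinh(alpha)],
            [-math.sinh(alpha), math.cosh(alpha)]]
error = max(abs(L[i][j] - expected[i][j]) for i in range(2) for j in range(2))
print(error)

# L is real and symmetric, so unitarity would require L L^T = I. It is not.
L_Lt = matmul(L, [list(row) for row in zip(*L)])
print(L_Lt)
```

The product L L^\mathrm{T} comes out as a boost of rapidity 2\alpha rather than the identity, which is a concrete way of seeing the non-unitarity.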

It occurred to me after the fact that the Lorentz boost is not Unitary, and correspondingly the matrix C = i \alpha \sigma_1 found above is not Hermitian. The fact that the eigenvalues of the boost are not pure complex phases, like those of the rotation, is actually a good hint: characterizing the eigenvalues of a unitary matrix as phases can be used to show that the matrix C = -i V \ln \Sigma V^\dagger is Hermitian in the unitary case.


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

Posted in Math and Physics Learning.

Relativistic Doppler formula.

Posted by peeterjoot on July 6, 2009


Transform of angular velocity four vector.

It was possible to derive the Lorentz boost matrix by requiring that the wave equation operator

\begin{aligned}\nabla^2 = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \boldsymbol{\nabla}^2 \end{aligned}

retain its form under linear transformation ([1]). Applying spatial Fourier transforms ([2]), one finds that solutions to the wave equation

\begin{aligned}\nabla^2 \psi(t,\mathbf{x}) = 0 \end{aligned}

have the form

\begin{aligned}\psi(t, \mathbf{x}) = \int A(\mathbf{k}) e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t)} d^3 k \end{aligned}

provided that \omega = \pm c {\left\lvert{\mathbf{k}}\right\rvert}. Wave equation solutions can therefore be thought of as continuously weighted superpositions of constrained fundamental solutions

\begin{aligned}\psi &= e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t)} \\ c^2 \mathbf{k}^2 &= \omega^2 \end{aligned}

The constraint on frequency and wave number has the look of a Lorentz square

\begin{aligned}\omega^2 - c^2 \mathbf{k}^2 = 0 \end{aligned}

which suggests that in addition to the spacetime vector

\begin{aligned}X = (ct, \mathbf{x}) = x^\mu \gamma_\mu \end{aligned}

evident in the wave equation fundamental solution, we also have a frequency-wavenumber four vector

\begin{aligned}K = (\omega/c, \mathbf{k}) = k^\mu \gamma_\mu \end{aligned}

The pair of four vectors above allow the fundamental solutions to be put explicitly into covariant form

\begin{aligned}K \cdot X = \omega t - \mathbf{k} \cdot \mathbf{x} = k_\mu x^\mu \end{aligned}

\begin{aligned}\psi = e^{-i K \cdot X} \end{aligned}

Let’s also examine the transformation properties of this fundamental solution, and see as a side effect that K transforms appropriately as a four vector.

\begin{aligned}0 &= \nabla^2 \psi(t,\mathbf{x}) \\ &= {\nabla'}^2 \psi(t',\mathbf{x}') \\ &= {\nabla'}^2 e^{i(\mathbf{x}' \cdot \mathbf{k}' - \omega' t')} \\ &= -\left(\frac{{\omega'}^2}{c^2} - {\mathbf{k}'}^2 \right) e^{i(\mathbf{x}' \cdot \mathbf{k}' - \omega' t')} \\  \end{aligned}

We therefore have the same form of frequency wave number constraint in the transformed frame (if we require that
the wave function for light is unchanged under transformation)

\begin{aligned}{\omega'}^2 = c^2 {\mathbf{k}'}^2  \end{aligned}

Writing this as

\begin{aligned}0 = {\omega}^2 - c^2 {\mathbf{k}}^2 = {\omega'}^2 - c^2 {\mathbf{k}'}^2  \end{aligned}

singles out the Lorentz invariant nature of the (\omega, \mathbf{k}) pairing, and we conclude that this pairing
does indeed transform as a four vector.

Application of one dimensional boost.

Having attempted to justify the four vector nature of the wave number vector K, we now apply a boost along the x-axis to this vector.

\begin{aligned}\begin{bmatrix}\omega' \\ c k' \\ \end{bmatrix}&=\gamma\begin{bmatrix}1 & -\beta \\ -\beta& 1 \\ \end{bmatrix}\begin{bmatrix}\omega \\ c k \\ \end{bmatrix} \\ &=\gamma\begin{bmatrix}\omega - v k \\ c k - \beta \omega\end{bmatrix}  \end{aligned}
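The boost is easy to try out numerically. The following is a small sketch, with helper names (`boost`, `invariant`) of my own choosing, that applies the matrix above to the pair (\omega, c k) and checks that \omega^2 - c^2 k^2 is unchanged, both for a light-like pair and for a pair that does not satisfy the light constraint.

```python
import math

def boost(omega, ck, beta):
    """Apply gamma * [[1, -beta], [-beta, 1]] to the column (omega, c k)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # First component is gamma (omega - beta c k) = gamma (omega - v k).
    return gamma * (omega - beta * ck), gamma * (ck - beta * omega)

def invariant(omega, ck):
    """Lorentz square omega^2 - (c k)^2 of the frequency-wavenumber pair."""
    return omega**2 - ck**2

beta = 0.6

# Light-like pair (omega = c k): the invariant is zero before and after.
light = (3.0, 3.0)
light_boosted = boost(*light, beta)

# Generic pair, only to show that the boost preserves the square in general.
generic = (5.0, 3.0)
generic_boosted = boost(*generic, beta)

print(invariant(*light), invariant(*light_boosted))
print(invariant(*generic), invariant(*generic_boosted))
```

Both printed pairs should agree up to floating point error, which is the statement that the boost preserves the Lorentz square.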

We can take ratios of the frequencies if we make use of the dependency between \omega and k, namely \omega = \pm c k. We then have

\begin{aligned}\frac{\omega'}{\omega}&= \gamma(1 \mp \beta) \\ &= \frac{1 \mp \beta}{\sqrt{1 - \beta^2}} \\ &= \frac{1 \mp \beta}{\sqrt{1 - \beta}\sqrt{1 + \beta}} \\  \end{aligned}

For the positive angular frequency this is

\begin{aligned}\frac{\omega'}{\omega}&= \frac{\sqrt{1 - \beta}}{\sqrt{1 + \beta}} \\  \end{aligned}

and for the negative frequency the reciprocal.
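To see the two branches side by side, here is a short sketch (the function names are my own, and \beta = 0.6 is just a convenient test value) that compares \gamma(1 \mp \beta) against the square root form.

```python
import math

def doppler_ratio_from_boost(beta, sign=+1):
    """omega'/omega = gamma (1 - sign * beta), with sign = +1 for omega = +c k."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (1.0 - sign * beta)

def doppler_ratio_closed_form(beta):
    """Square root form for the positive frequency branch."""
    return math.sqrt(1.0 - beta) / math.sqrt(1.0 + beta)

beta = 0.6
positive = doppler_ratio_from_boost(beta, sign=+1)
negative = doppler_ratio_from_boost(beta, sign=-1)

print(positive, doppler_ratio_closed_form(beta))        # the two should agree
print(negative, 1.0 / doppler_ratio_closed_form(beta))  # reciprocal branch
```

For \beta = 0.6 we have \gamma = 1.25, so the positive branch comes out near 1/2 and the negative branch near 2.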

Deriving this with a Lorentz boost is much simpler than the time dilation argument in the Wikipedia Doppler article ([3]). EDIT: Later found exactly the above boost argument in the Wikipedia wave vector article ([4]).

What’s missing here is putting this in a proper physical context, with the source and receiver frequencies spelled out. That would make this more than just math.


[1] Peeter Joot. Wave equation based Lorentz transformation derivation [online].

[2] Peeter Joot. Fourier transform solutions to the wave equation [online].

[3] Wikipedia. Relativistic doppler effect — wikipedia, the free encyclopedia [online]. 2009. [Online; accessed 26-June-2009].

[4] Wikipedia. Wave vector — wikipedia, the free encyclopedia [online]. 2009. [Online; accessed 30-June-2009].

Posted in Math and Physics Learning.