Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘rotation’

Geometry of general Jones vector (problem 2.8)

Posted by peeterjoot on August 9, 2012


Another problem from [1].


The general case is represented by the Jones vector

\begin{aligned}\begin{bmatrix}A \\ B e^{i\Delta}\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(1.1.1)

Show that this represents elliptically polarized light in which the major axis of the ellipse makes an angle

\begin{aligned}\frac{1}{{2}} \tan^{-1} \left( \frac{2 A B \cos \Delta }{A^2 - B^2} \right),\end{aligned} \hspace{\stretch{1}}(1.1.2)

with the x axis.


Prior to attempting the problem as stated, let’s explore the algebra of a parametric representation of an ellipse, rotated at an angle \theta as in figure (1). The equation of the ellipse in the rotated coordinates is

Figure 1: Rotated ellipse


\begin{aligned}\begin{bmatrix}x' \\ y'\end{bmatrix}=\begin{bmatrix}a \cos u \\ b \sin u\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(1.2.3)

which is easily seen to have the required form

\begin{aligned}\left( \frac{x'}{a} \right)^2+\left( \frac{y'}{b} \right)^2 = 1.\end{aligned} \hspace{\stretch{1}}(1.2.4)

We’d like to express x' and y' in the “fixed” frame. Consider figure (2) where our coordinate conventions are illustrated. With

Figure 2: 2d rotation of frame


\begin{aligned}\begin{bmatrix}\hat{\mathbf{x}}' \\ \hat{\mathbf{y}}'\end{bmatrix}=\begin{bmatrix}\hat{\mathbf{x}} e^{\hat{\mathbf{x}} \hat{\mathbf{y}} \theta} \\ \hat{\mathbf{y}} e^{\hat{\mathbf{x}} \hat{\mathbf{y}} \theta}\end{bmatrix}=\begin{bmatrix}\hat{\mathbf{x}} \cos \theta + \hat{\mathbf{y}} \sin\theta \\ \hat{\mathbf{y}} \cos \theta - \hat{\mathbf{x}} \sin\theta\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(1.2.5)

and x \hat{\mathbf{x}} + y\hat{\mathbf{y}} = x' \hat{\mathbf{x}}' + y' \hat{\mathbf{y}}' we find

\begin{aligned}\begin{bmatrix}x' \\ y'\end{bmatrix}=\begin{bmatrix}\cos \theta & \sin\theta \\ -\sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(1.2.6)

so that the equation of the ellipse can be stated as

\begin{aligned}\begin{bmatrix}\cos \theta & \sin\theta \\ -\sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}x \\ y\end{bmatrix}=\begin{bmatrix}a \cos u \\ b \sin u\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(1.2.7)

or, inverting the rotation,

\begin{aligned}\begin{bmatrix}x \\ y\end{bmatrix}=\begin{bmatrix}\cos \theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}a \cos u \\ b \sin u\end{bmatrix}=\begin{bmatrix}a \cos \theta \cos u - b \sin \theta \sin u \\ a \sin \theta \cos u + b \cos \theta \sin u\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(1.2.8)

Observing that

\begin{aligned}\cos u + \alpha \sin u = \text{Real}\left( (1 + i \alpha) e^{-i u} \right)\end{aligned} \hspace{\stretch{1}}(1.2.9)

we have, writing \text{atan2}(x, y) for the argument of x + i y, a Jones vector representation of our rotated ellipse

\begin{aligned}\begin{bmatrix}x \\ y\end{bmatrix}=\text{Real}\begin{bmatrix}( a \cos \theta - i b \sin\theta ) e^{-iu} \\ ( a \sin \theta + i b \cos\theta ) e^{-iu}\end{bmatrix}=\text{Real}\begin{bmatrix}\sqrt{ a^2 \cos^2 \theta + b^2 \sin^2 \theta } e^{i \text{atan2}(a \cos\theta, -b\sin\theta) - i u} \\ \sqrt{ a^2 \sin^2 \theta + b^2 \cos^2 \theta } e^{i \text{atan2}(a \sin\theta, b\cos\theta) - i u}\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(1.2.10)

Since we can absorb a constant phase factor into our -iu argument, we can write this as

\begin{aligned}\begin{bmatrix}x \\ y\end{bmatrix}=\text{Real}\left(\begin{bmatrix}\sqrt{ a^2 \cos^2 \theta + b^2 \sin^2 \theta } \\ \sqrt{ a^2 \sin^2 \theta + b^2 \cos^2 \theta } e^{i \text{atan2}(a \sin\theta, b\cos\theta) -i \text{atan2}(a \cos\theta, -b\sin\theta)} \end{bmatrix} e^{-i u'}\right).\end{aligned} \hspace{\stretch{1}}(1.2.11)

This has the required form once we make the identifications

\begin{aligned}A = \sqrt{ a^2 \cos^2 \theta + b^2 \sin^2 \theta }\end{aligned} \hspace{\stretch{1}}(1.2.12)

\begin{aligned}B = \sqrt{ a^2 \sin^2 \theta + b^2 \cos^2 \theta } \end{aligned} \hspace{\stretch{1}}(1.2.13)

\begin{aligned}\Delta =\text{atan2}(a \sin\theta, b\cos\theta) - \text{atan2}(a \cos\theta, -b\sin\theta).\end{aligned} \hspace{\stretch{1}}(1.2.14)
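These identifications can be sanity checked numerically. Here's a minimal Python sketch (not part of the original derivation): the function name jones_from_ellipse and the sample values are my own choices for illustration. I use cmath.phase(x + iy) in place of the \text{atan2}(x, y) above, which sidesteps the argument order difference with Python's math.atan2(y, x).

```python
import cmath
import math

def jones_from_ellipse(a, b, theta):
    """Return (A, B, Delta) per 1.2.12 to 1.2.14 for semi-axes a, b and tilt theta."""
    c1 = complex(a * math.cos(theta), -b * math.sin(theta))  # complex x amplitude
    c2 = complex(a * math.sin(theta), b * math.cos(theta))   # complex y amplitude
    return abs(c1), abs(c2), cmath.phase(c2) - cmath.phase(c1)

# Sample values (assumed, with a > b so that theta is the major axis tilt).
A, B, Delta = jones_from_ellipse(a=3.0, b=1.0, theta=0.4)

# Feed (A, B, Delta) back into the tilt formula 1.1.2, using atan2 to pick the branch.
recovered = 0.5 * math.atan2(2 * A * B * math.cos(Delta), A**2 - B**2)
print(A, B, Delta, recovered)
```

For a > b the recovered angle should agree with the \theta that went in, which is at least consistent with 1.1.2.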

What isn’t obvious is that we can do this for any A, B, and \Delta. I tried portions of this problem in Mathematica, starting from the elliptic equation derived in section 8.1.3 of [2]. I’d used Mathematica since on paper I found the rotation angle that eliminated the cross terms to always be 45 degrees, but this turns out to have been because I’d first used a change of variables that scaled the equation. Here’s the whole procedure, without any such scaling, to arrive at the desired result for this problem. Our starting point is the field specified by the Jones vector, where as above I’m using -iu = i (k z - \omega t)

\begin{aligned}\mathbf{E} = \text{Real}\left( \begin{bmatrix}A \\ B e^{i \Delta}\end{bmatrix}e^{-i u}\right)=\begin{bmatrix}A \cos u \\ B \cos ( \Delta - u )\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(1.2.15)

We need our cosine angle addition formula

\begin{aligned}\cos( a + b ) = \text{Real} \left( (\cos a + i \sin a)(\cos b + i \sin b)\right) =\cos a \cos b - \sin a \sin b.\end{aligned} \hspace{\stretch{1}}(1.2.16)

Using this and writing \mathbf{E} = (x, y) we have

\begin{aligned}x = A \cos u\end{aligned} \hspace{\stretch{1}}(1.2.17)

\begin{aligned}y = B ( \cos \Delta \cos u + \sin \Delta \sin u ).\end{aligned} \hspace{\stretch{1}}(1.2.18)

Subtracting x \cos \Delta/A from y/B we have

\begin{aligned}\frac{y}{B} - \frac{x}{A} \cos \Delta = \sin \Delta \sin u.\end{aligned} \hspace{\stretch{1}}(1.2.19)

Squaring this and using \sin^2 u = 1 - \cos^2 u, and 1.2.17 we have

\begin{aligned}\left( \frac{y}{B} - \frac{x}{A} \cos \Delta \right)^2 = \sin^2 \Delta \left( 1 - \frac{x^2}{A^2} \right),\end{aligned} \hspace{\stretch{1}}(1.2.20)

which expands and simplifies to

\begin{aligned}\left( \frac{x}{A} \right)^2 +\left( \frac{y}{B} \right)^2 - 2 \left( \frac{x}{A} \right)\left( \frac{y}{B} \right)\cos \Delta = \sin^2 \Delta,\end{aligned} \hspace{\stretch{1}}(1.2.21)
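This implicit equation is easy to spot check numerically. A small Python sketch, where the function name and the sample values of A, B, and \Delta are arbitrary choices of mine and not from the problem:

```python
import math

def ellipse_residual(A, B, delta, u):
    """Left side minus right side of the implicit ellipse equation at parameter u."""
    x = A * math.cos(u)            # 1.2.17
    y = B * math.cos(delta - u)    # 1.2.18, before expanding the cosine
    lhs = (x / A) ** 2 + (y / B) ** 2 - 2 * (x / A) * (y / B) * math.cos(delta)
    return lhs - math.sin(delta) ** 2

# Walk u around one full period in steps of 0.01 and record the worst mismatch.
worst = max(abs(ellipse_residual(2.0, 1.5, 0.9, 0.01 * k)) for k in range(629))
print(worst)
```

The residual should be zero up to floating point roundoff for every u, which is all this check is claiming.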

which is an equation of a rotated ellipse as desired. Let’s figure out the angle of rotation required to kill the cross terms. Writing a = 1/A, b = 1/B (these are no longer the semi-axes a, b used above) and rotating our primed coordinate frame by an angle \theta

\begin{aligned}\begin{bmatrix}x \\ y\end{bmatrix}=\begin{bmatrix}\cos \theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}x' \\ y'\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(1.2.22)

we have

\begin{aligned}\begin{aligned}\sin^2 \Delta &=a^2 (x' \cos \theta - y'\sin\theta)^2+b^2 ( x' \sin\theta + y' \cos\theta)^2 \\ &- 2 a b (x' \cos \theta - y'\sin\theta)( x'\sin\theta + y'\cos\theta)\cos \Delta \\ &=(x')^2 ( a^2 \cos^2 \theta + b^2 \sin^2 \theta - 2 a b \cos \theta \sin \theta \cos \Delta ) \\ &+(y')^2 ( a^2 \sin^2 \theta + b^2 \cos^2 \theta + 2 a b \cos \theta \sin \theta \cos \Delta ) \\ &+ 2 x' y' ( (b^2 -a^2) \cos \theta \sin\theta + a b (\sin^2 \theta - \cos^2 \theta) \cos \Delta ).\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.2.23)

To kill off the cross term we require

\begin{aligned}\begin{aligned}0 &= (b^2 -a^2) \cos \theta \sin\theta + a b (\sin^2 \theta - \cos^2 \theta) \cos \Delta \\ &= \frac{1}{{2}} (b^2 -a^2) \sin (2 \theta) - a b \cos (2 \theta) \cos \Delta,\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.2.24)

or

\begin{aligned}\tan (2 \theta) = \frac{2 a b \cos \Delta}{b^2 - a^2} = \frac{2 A B \cos \Delta}{A^2 - B^2}.\end{aligned} \hspace{\stretch{1}}(1.2.25)

This yields 1.1.2 as desired. For \sin \Delta \ne 0 we also end up with expressions for the lengths of the two semi-axes, along the x' and y' directions respectively

\begin{aligned}\left\lvert \sin\Delta \right\rvert / \sqrt{ b^2 + (a^2 - b^2) \cos^2 \theta - a b \sin (2 \theta) \cos \Delta }\end{aligned} \hspace{\stretch{1}}(1.2.26)

\begin{aligned}\left\lvert \sin\Delta \right\rvert /\sqrt{ b^2 + (a^2 - b^2)\sin^2 \theta + a b \sin (2 \theta) \cos \Delta },\end{aligned} \hspace{\stretch{1}}(1.2.27)

which completes the task of determining the geometry of the ellipse described by the general Jones vector.
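As a rough cross check of both the tilt angle and the axis lengths, here's a Python sketch that compares the closed form results with a brute force scan of the curve. The function names, the sample values, and the use of atan2 to select the branch of the inverse tangent are my own assumptions, not something taken from the problem statement.

```python
import math

def tilt_and_semi_axes(A, B, delta):
    """Tilt from 1.1.2 (atan2 branch) and the semi-axis lengths along x' and y'."""
    theta = 0.5 * math.atan2(2 * A * B * math.cos(delta), A * A - B * B)
    a, b = 1.0 / A, 1.0 / B  # the shorthand a = 1/A, b = 1/B from the text
    cross = a * b * math.sin(2 * theta) * math.cos(delta)
    p = b * b + (a * a - b * b) * math.cos(theta) ** 2 - cross  # (x')^2 coefficient
    q = b * b + (a * a - b * b) * math.sin(theta) ** 2 + cross  # (y')^2 coefficient
    s = abs(math.sin(delta))
    return theta, s / math.sqrt(p), s / math.sqrt(q)

def brute_force(A, B, delta, n=100000):
    """Direction and length of the longest radius of the sampled curve."""
    pts = ((A * math.cos(u), B * math.cos(delta - u))
           for u in (2 * math.pi * k / n for k in range(n)))
    x, y = max(pts, key=lambda pt: pt[0] ** 2 + pt[1] ** 2)
    return math.atan2(y, x), math.hypot(x, y)

theta, semi_x, semi_y = tilt_and_semi_axes(1.0, 2.0, 0.7)
angle, radius = brute_force(1.0, 2.0, 0.7)
print(theta, semi_x, semi_y)
print(angle, radius)
```

The two angles should agree modulo \pi (an axis has no preferred sense), and with the atan2 branch the x' direction comes out as the longer axis for these sample values.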


[1] G.R. Fowles. Introduction to modern optics. Dover Pubns, 1989.

[2] E. Hecht. Optics. 1998.


Geometric Algebra. The very quickest introduction.

Posted by peeterjoot on March 17, 2012



An attempt to make a relatively concise introduction to Geometric (or Clifford) Algebra. Much more complete introductions to the subject can be found in [1], [2], and [3].


We have a few basic principles upon which the algebra is based:

  1. Vectors can be multiplied.
  2. The square of a vector is the (squared) length of that vector (with appropriate generalizations for non-Euclidean metrics).
  3. Vector products are associative (but not necessarily commutative).

That’s really all there is to it, and the rest, paraphrasing Feynman, can be figured out by anybody sufficiently clever.

By example. The 2D case.

Consider a 2D Euclidean space, and the product of two vectors \mathbf{a} and \mathbf{b} in that space. Utilizing a standard orthonormal basis \{\mathbf{e}_1, \mathbf{e}_2\} we can write

\begin{aligned}\mathbf{a} &= \mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 \\ \mathbf{b} &= \mathbf{e}_1 y_1 + \mathbf{e}_2 y_2,\end{aligned} \hspace{\stretch{1}}(3.1)

and let’s write out the product of these two vectors \mathbf{a} \mathbf{b}, not yet knowing what we will end up with. That is

\begin{aligned}\mathbf{a} \mathbf{b} &= (\mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 )( \mathbf{e}_1 y_1 + \mathbf{e}_2 y_2 ) \\ &= \mathbf{e}_1^2 x_1 y_1 + \mathbf{e}_2^2 x_2 y_2+ \mathbf{e}_1 \mathbf{e}_2 x_1 y_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 y_1\end{aligned}

From axiom 2 we have \mathbf{e}_1^2 = \mathbf{e}_2^2 = 1, so we have

\begin{aligned}\mathbf{a} \mathbf{b} = x_1 y_1 + x_2 y_2 + \mathbf{e}_1 \mathbf{e}_2 x_1 y_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 y_1.\end{aligned} \hspace{\stretch{1}}(3.3)

We’ve multiplied two vectors and ended up with a scalar component (and recognize that this part of the vector product is the dot product), and a component that is a “something else”. We’ll call this something else a bivector, and see that it is characterized by a product of non-colinear vectors. These products \mathbf{e}_1 \mathbf{e}_2 and \mathbf{e}_2 \mathbf{e}_1 are in fact related, and we can see that by looking at the case of \mathbf{b} = \mathbf{a}. For that we have

\begin{aligned}\mathbf{a}^2 &=x_1 x_1 + x_2 x_2 + \mathbf{e}_1 \mathbf{e}_2 x_1 x_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 x_1 \\ &={\left\lvert{\mathbf{a}}\right\rvert}^2 +x_1 x_2 ( \mathbf{e}_1 \mathbf{e}_2 + \mathbf{e}_2 \mathbf{e}_1 )\end{aligned}

Since axiom (2) requires that the square of a vector equal its (squared) length, we must then have

\begin{aligned}\mathbf{e}_1 \mathbf{e}_2 + \mathbf{e}_2 \mathbf{e}_1 = 0,\end{aligned} \hspace{\stretch{1}}(3.4)

or

\begin{aligned}\mathbf{e}_2 \mathbf{e}_1 = -\mathbf{e}_1 \mathbf{e}_2.\end{aligned} \hspace{\stretch{1}}(3.5)

We see that Euclidean orthonormal vectors anticommute. What we can see with some additional study is that any colinear vectors commute, and in Euclidean spaces (of any dimension) vectors that are normal to each other anticommute (this can also be taken as a definition of normal).

We can now return to our product of two vectors 3.3 and simplify it slightly

\begin{aligned}\mathbf{a} \mathbf{b} = x_1 y_1 + x_2 y_2 + \mathbf{e}_1 \mathbf{e}_2 (x_1 y_2 - x_2 y_1).\end{aligned} \hspace{\stretch{1}}(3.6)

The product of two vectors in 2D is seen here to have one scalar component, and one bivector component (an irreducible product of two normal vectors). Observe the symmetric and antisymmetric split of the scalar and bivector components above. This symmetry and antisymmetry can be made explicit, introducing dot and wedge product notation respectively

\begin{aligned}\mathbf{a} \cdot \mathbf{b} &= \frac{1}{{2}}( \mathbf{a} \mathbf{b} + \mathbf{b} \mathbf{a}) = x_1 y_1 + x_2 y_2 \\ \mathbf{a} \wedge \mathbf{b} &= \frac{1}{{2}}( \mathbf{a} \mathbf{b} - \mathbf{b} \mathbf{a}) = \mathbf{e}_1 \mathbf{e}_2 (x_1 y_2 - x_2 y_1).\end{aligned} \hspace{\stretch{1}}(3.7)

so that the vector product can be written as

\begin{aligned}\mathbf{a} \mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}.\end{aligned} \hspace{\stretch{1}}(3.9)


In many contexts it is useful to introduce an ordered product of all the unit vectors for the space, called the pseudoscalar. In our 2D case this is

\begin{aligned}i = \mathbf{e}_1 \mathbf{e}_2,\end{aligned} \hspace{\stretch{1}}(4.10)

a quantity that we find behaves like the complex imaginary. That can be shown by considering its square

\begin{aligned}(\mathbf{e}_1 \mathbf{e}_2)^2&=(\mathbf{e}_1 \mathbf{e}_2)(\mathbf{e}_1 \mathbf{e}_2) \\ &=\mathbf{e}_1 (\mathbf{e}_2 \mathbf{e}_1) \mathbf{e}_2 \\ &=-\mathbf{e}_1 (\mathbf{e}_1 \mathbf{e}_2) \mathbf{e}_2 \\ &=-(\mathbf{e}_1 \mathbf{e}_1) (\mathbf{e}_2 \mathbf{e}_2) \\ &=-1^2 \\ &= -1\end{aligned}

Here the anticommutation of normal vectors property has been used, as well as (for the first time) the associative multiplication axiom.
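To make the 2D product concrete, here's a small Python sketch that hard codes the multiplication rules found above (\mathbf{e}_1^2 = \mathbf{e}_2^2 = 1 and \mathbf{e}_2 \mathbf{e}_1 = -\mathbf{e}_1 \mathbf{e}_2). Storing a multivector as a tuple (scalar, e1, e2, e1e2) and the function name gp are my own choices for illustration.

```python
def gp(p, q):
    """Geometric product in 2D; multivectors are (scalar, e1, e2, e1e2) tuples."""
    s1, a1, a2, b1 = p
    s2, c1, c2, b2 = q
    return (
        s1 * s2 + a1 * c1 + a2 * c2 - b1 * b2,  # scalar part, using (e1e2)^2 = -1
        s1 * c1 + a1 * s2 - a2 * b2 + b1 * c2,  # e1 part
        s1 * c2 + a2 * s2 + a1 * b2 - b1 * c1,  # e2 part
        s1 * b2 + b1 * s2 + a1 * c2 - a2 * c1,  # e1e2 (bivector) part
    )

a = (0, 3, 4, 0)   # a = 3 e1 + 4 e2
b = (0, 1, 2, 0)   # b = e1 + 2 e2
i = (0, 0, 0, 1)   # the pseudoscalar e1 e2

print(gp(a, b))    # (11, 0, 0, 2): dot product 11, wedge product 2 e1e2
print(gp(a, a))    # (25, 0, 0, 0): squared length
print(gp(i, i))    # (-1, 0, 0, 0)
```

The scalar and bivector slots of gp(a, b) are exactly the symmetric and antisymmetric parts in 3.7.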

In a 3D context, you’ll see the pseudoscalar in many places (expressing the normals to planes for example). It also shows up in a number of fundamental relationships. For example, if one writes

\begin{aligned}I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3\end{aligned} \hspace{\stretch{1}}(4.11)

for the 3D pseudoscalar, then it’s also possible to show

\begin{aligned}\mathbf{a} \mathbf{b} = \mathbf{a} \cdot \mathbf{b} + I (\mathbf{a} \times \mathbf{b})\end{aligned} \hspace{\stretch{1}}(4.12)

something that will be familiar to the student of QM, where we see this in the context of Pauli matrices. The Pauli matrices also encode a Clifford algebraic structure, but we do not need an explicit matrix representation to do so.


Very much like complex numbers we can utilize exponentials to perform rotations. Rotating in a sense from \mathbf{e}_1 to \mathbf{e}_2, can be expressed as

\begin{aligned}\mathbf{a} e^{i \theta}&=(\mathbf{e}_1 x_1 + \mathbf{e}_2 x_2) (\cos\theta + \mathbf{e}_1 \mathbf{e}_2 \sin\theta) \\ &=\mathbf{e}_1 (x_1 \cos\theta - x_2 \sin\theta)+\mathbf{e}_2 (x_2 \cos\theta + x_1 \sin\theta)\end{aligned}

More generally, even in N dimensional Euclidean spaces, if \mathbf{a} is a vector in a plane, and \hat{\mathbf{u}} and \hat{\mathbf{v}} are perpendicular unit vectors in that plane, then the rotation through angle \theta is given by

\begin{aligned}\mathbf{a} \rightarrow \mathbf{a} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \theta}.\end{aligned} \hspace{\stretch{1}}(5.13)

This is illustrated in figure (1).

Figure 1: Plane rotation.


Notice that we have expressed the rotation here without utilizing a normal direction for the plane. The sense of the rotation is encoded by the bivector \hat{\mathbf{u}} \hat{\mathbf{v}} that describes the plane and the orientation of the rotation (or by duality the direction of the normal in a 3D space). By avoiding a requirement to encode the rotation using a normal to the plane we have a method of expressing the rotation that works not only in 3D spaces, but also in 2D and greater than 3D spaces. That isn’t possible when we restrict ourselves to traditional vector algebra, where quantities like the cross product can’t be defined in a 2D or 4D space, despite the fact that the things they may represent, like torque, are planar phenomena that have no intrinsic requirement for a normal that falls out of the plane.

When \mathbf{a} does not lie in the plane spanned by the vectors \hat{\mathbf{u}} and \hat{\mathbf{v}} , as in figure (2), we must express the rotations differently. A rotation then takes the form

\begin{aligned}\mathbf{a} \rightarrow e^{-\hat{\mathbf{u}} \hat{\mathbf{v}} \theta/2} \mathbf{a} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \theta/2}.\end{aligned} \hspace{\stretch{1}}(5.14)

Figure 2: 3D rotation.


In the 2D case, and when the vector lies in the plane this reduces to the one sided complex exponential operator used above. We see these types of paired half angle rotations in QM, and they are also used extensively in computer graphics under the guise of quaternions.
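The two sided rotation 5.14 can be tried out with a tiny, generic Euclidean geometric product. The sketch below is my own illustration and not anything from the references: basis blades are stored as bitmasks (e1 = 1, e2 = 2, e3 = 4, so e1e2 = 3), and the sign of each blade product is found by counting the swaps needed to sort the basis vectors. All of the names are hypothetical.

```python
import math

def blade_sign(a, b):
    """Sign picked up when the product of basis blades a and b is put in canonical order."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps & 1 else 1.0

def gp(p, q):
    """Euclidean geometric product; multivectors are dicts {blade bitmask: coefficient}."""
    out = {}
    for ba, ca in p.items():
        for bb, cb in q.items():
            out[ba ^ bb] = out.get(ba ^ bb, 0.0) + blade_sign(ba, bb) * ca * cb
    return out

def rotate(a, plane_blade, theta):
    """Apply exp(-B theta/2) a exp(B theta/2) for a unit basis bivector B."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    left = {0: c, plane_blade: -s}
    right = {0: c, plane_blade: s}
    return gp(gp(left, a), right)

E1, E2, E3 = 1, 2, 4
a = {E1: 1.0, E3: 2.0}                      # a = e1 + 2 e3, not in the e1, e2 plane
rotated = rotate(a, E1 | E2, math.pi / 2)   # quarter turn in the sense e1 -> e2
print(rotated)
```

Up to roundoff, the e1 component should move to e2 while the e3 component (normal to the plane) is left alone, and no trivector part should survive.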


[1] L. Dorst, D. Fontijne, and S. Mann. Geometric Algebra for Computer Science. Morgan Kaufmann, San Francisco, 2007.

[2] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[3] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.


Infinitesimal rotations

Posted by peeterjoot on January 27, 2012



In a classical mechanics lecture (which I audited) Prof. Poppitz made the claim that an infinitesimal rotation in direction \hat{\mathbf{n}} of magnitude \delta \phi has the form

\begin{aligned}\mathbf{x} \rightarrow \mathbf{x} + \delta \boldsymbol{\phi} \times \mathbf{x},\end{aligned} \hspace{\stretch{1}}(1.1)

where

\begin{aligned}\delta \boldsymbol{\phi} = \hat{\mathbf{n}} \delta \phi.\end{aligned} \hspace{\stretch{1}}(1.2)

I believe he expressed things in terms of the differential displacement

\begin{aligned}\delta \mathbf{x} = \delta \boldsymbol{\phi} \times \mathbf{x}\end{aligned} \hspace{\stretch{1}}(1.3)

This was verified for the special case \hat{\mathbf{n}} = \hat{\mathbf{z}} and \mathbf{x} = x \hat{\mathbf{x}}. Let’s derive this in the general case too.

With geometric algebra.

Let’s temporarily dispense with the normal notation and introduce two perpendicular unit vectors \hat{\mathbf{u}}, and \hat{\mathbf{v}} in the plane of the rotation. Relate these to the unit normal with

\begin{aligned}\hat{\mathbf{n}} = \hat{\mathbf{u}} \times \hat{\mathbf{v}}.\end{aligned} \hspace{\stretch{1}}(2.4)

A rotation through an angle \phi (infinitesimal or otherwise) is then

\begin{aligned}\mathbf{x} \rightarrow e^{-\hat{\mathbf{u}} \hat{\mathbf{v}} \phi/2} \mathbf{x} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \phi/2}.\end{aligned} \hspace{\stretch{1}}(2.5)

Suppose that we decompose \mathbf{x} into components in the plane and in the direction of the normal \hat{\mathbf{n}}. We have

\begin{aligned}\mathbf{x} = x_u \hat{\mathbf{u}} + x_v \hat{\mathbf{v}} + x_n \hat{\mathbf{n}}.\end{aligned} \hspace{\stretch{1}}(2.6)

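The bivector \hat{\mathbf{u}} \hat{\mathbf{v}} commutes with \hat{\mathbf{n}} and anticommutes with \hat{\mathbf{u}} and \hat{\mathbf{v}}, so the exponentials cancel on the \hat{\mathbf{n}} component and combine on the in-plane components, leaving us with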

\begin{aligned}\mathbf{x} &\rightarrow x_n \hat{\mathbf{n}} + (x_u \hat{\mathbf{u}} + x_v \hat{\mathbf{v}}) e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \phi} \\ &=x_n \hat{\mathbf{n}} + (x_u \hat{\mathbf{u}} + x_v \hat{\mathbf{v}}) (\cos\phi + \hat{\mathbf{u}} \hat{\mathbf{v}} \sin\phi) \\ &=x_n \hat{\mathbf{n}} + \hat{\mathbf{u}} (x_u \cos\phi - x_v \sin\phi) +\hat{\mathbf{v}} (x_v \cos\phi + x_u \sin\phi).\end{aligned}

In the last line we use \hat{\mathbf{u}}^2 = 1 and \hat{\mathbf{u}} \hat{\mathbf{v}} = - \hat{\mathbf{v}} \hat{\mathbf{u}}. Making the angle infinitesimal \phi \rightarrow \delta \phi we have

\begin{aligned}\mathbf{x} &\rightarrow x_n \hat{\mathbf{n}} + \hat{\mathbf{u}} (x_u - x_v \delta\phi) +\hat{\mathbf{v}} (x_v + x_u \delta\phi)  \\ &=\mathbf{x} + \delta\phi( x_u \hat{\mathbf{v}} - x_v \hat{\mathbf{u}})\end{aligned}

We have only to confirm that this matches the assumed cross product representation

\begin{aligned}\hat{\mathbf{n}} \times \mathbf{x}&=\begin{vmatrix}\hat{\mathbf{u}} & \hat{\mathbf{v}} & \hat{\mathbf{n}} \\ 0 & 0 & 1 \\ x_u & x_v & x_n\end{vmatrix} \\ &=-\hat{\mathbf{u}} x_v + \hat{\mathbf{v}} x_u\end{aligned}

Taking the two last computations we find

\begin{aligned}\delta \mathbf{x} = \delta \phi \hat{\mathbf{n}} \times \mathbf{x} = \delta \boldsymbol{\phi} \times \mathbf{x},\end{aligned} \hspace{\stretch{1}}(2.7)

as desired.

Without geometric algebra.

We’ve also done the setup above to verify this result without GA. Here we wish to apply the rotation to the coordinate vector of \mathbf{x} in the \{\hat{\mathbf{u}}, \hat{\mathbf{v}}, \hat{\mathbf{n}}\} basis which gives us

\begin{aligned}\begin{bmatrix}x_u \\ x_v \\ x_n \end{bmatrix}&\rightarrow \begin{bmatrix}\cos\delta\phi & -\sin\delta\phi & 0 \\ \sin\delta\phi & \cos\delta\phi & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x_u \\ x_v \\ x_n \end{bmatrix} \\ &\approx\begin{bmatrix}1 & -\delta\phi & 0 \\ \delta\phi & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x_u \\ x_v \\ x_n \end{bmatrix} \\ &=\begin{bmatrix}x_u \\ x_v \\ x_n \end{bmatrix} +\begin{bmatrix}0 & -\delta\phi & 0 \\ \delta\phi & 0 & 0 \\ 0 & 0 & 0\end{bmatrix}\begin{bmatrix}x_u \\ x_v \\ x_n \end{bmatrix} \\ &=\begin{bmatrix}x_u \\ x_v \\ x_n \end{bmatrix} +\delta\phi\begin{bmatrix}-x_v \\ x_u \\ 0\end{bmatrix} \end{aligned}

But as we’ve shown, this last coordinate vector is just \hat{\mathbf{n}} \times \mathbf{x}, and we get our desired result using plain old fashioned matrix algebra as well.

Really the only difference between this and what was done in class is that there’s no assumption here that \mathbf{x} = x \hat{\mathbf{x}}.
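A numerical comparison makes the first order nature of this result visible. The sketch below uses Rodrigues' rotation formula for the exact finite rotation (a standard formula that I'm swapping in here, it is not what was derived above), and compares it against \mathbf{x} + \delta\phi \hat{\mathbf{n}} \times \mathbf{x}. The vectors and function names are arbitrary choices of mine.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(x, n, phi):
    """Rodrigues' formula: rotate x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    n_dot_x = sum(ni * xi for ni, xi in zip(n, x))
    return tuple(c * xi + s * wi + (1 - c) * n_dot_x * ni
                 for xi, wi, ni in zip(x, cross(n, x), n))

def first_order(x, n, dphi):
    """The infinitesimal form x + dphi (n cross x)."""
    return tuple(xi + dphi * wi for xi, wi in zip(x, cross(n, x)))

def error(x, n, dphi):
    """Distance between the exact rotation and the first order approximation."""
    return math.dist(rotate(x, n, dphi), first_order(x, n, dphi))

n = (1 / 3, 2 / 3, 2 / 3)   # a unit normal, chosen arbitrarily
x = (1.0, -2.0, 0.5)        # a general vector, not restricted to the plane
print(error(x, n, 1e-2), error(x, n, 1e-3))
```

Shrinking \delta\phi by a factor of 10 should shrink the error by roughly a factor of 100, as expected if the discrepancy is second order in \delta\phi.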


Strain tensor in spherical coordinates

Posted by peeterjoot on January 23, 2012


Spherical tensor.

To perform the derivation in spherical coordinates we have some setup to do first, since we need explicit representations of all three unit vectors. The radial vector we can get easily by geometry and find the usual

\begin{aligned}\hat{\mathbf{r}} =\begin{bmatrix}\sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.61)

We can get \hat{\boldsymbol{\phi}} by geometrical intuition since it is the unit vector in the x,y plane at angle \phi, rotated by \pi/2. That is

\begin{aligned}\hat{\boldsymbol{\phi}} =\begin{bmatrix}-\sin\phi \\ \cos\phi \\ 0\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.62)

We can get \hat{\boldsymbol{\theta}} by utilizing the right handedness of the coordinates since

\begin{aligned}\hat{\boldsymbol{\phi}} \times \hat{\mathbf{r}} = \hat{\boldsymbol{\theta}}\end{aligned} \hspace{\stretch{1}}(3.63)

and find

\begin{aligned}\hat{\boldsymbol{\theta}} =\begin{bmatrix}\cos\theta \cos\phi \\ \cos\theta \sin\phi \\ -\sin\theta\end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.64)
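The cross product 3.63 is quick to confirm numerically. A minimal Python sketch, with arbitrarily chosen sample angles and helper names of my own:

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

theta, phi = 0.7, 1.9  # arbitrary sample angles
r_hat = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
phi_hat = (-math.sin(phi), math.cos(phi), 0.0)

# 3.63: theta_hat = phi_hat cross r_hat
theta_hat = cross(phi_hat, r_hat)
expected = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
print(theta_hat)
print(expected)
```

The two printed triples should match to roundoff, in agreement with 3.64.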

That and some Mathematica brute force can be used to calculate the differential strain element, and we find

\begin{aligned}\begin{aligned}&d\mathbf{l}'^2 - d\mathbf{x}^2 \\ &=2 (dr)^2 \biggl(\frac{\partial u_r}{\partial r}+ \frac{1}{{2}}\frac{\partial u_m}{\partial r} \frac{\partial u_m}{\partial r}\biggr) \\ & + 2 r^2 (d\theta )^2 \biggl(\frac{1}{{r}} u_r + \frac{1}{{2r^2}}(u_r^2 + u_{\theta }^2) - \frac{1}{{r^2}} u_{\theta } \frac{\partial u_r}{\partial \theta }+ \left(\frac{1}{{r}} + \frac{1}{{r^2}}u_r\right) \frac{\partial u_{\theta }}{\partial \theta }+ \frac{1}{{2 r^2}} \frac{\partial u_m}{\partial \theta } \frac{\partial u_m}{\partial \theta }\biggr) \\ &+ 2 r^2 \sin^2\theta (d\phi )^2 \biggl(  \frac{1}{{2 r^2 \sin^2\theta}} u_\phi^2+ \frac{1}{{2 r^2 }} u_{\theta }^2 \cot^2\theta+ \frac{1}{{r}} u_r+ \frac{1}{{2 r^2}} u_r^2+ \left(\frac{1}{{r}} + \frac{1}{{r^2}}u_r\right) u_{\theta } \cot\theta  \\ &\qquad- \frac{1}{{r^2 \sin\theta}} u_{\phi } \frac{\partial u_r}{\partial \phi }- \frac{1}{{r^2 }} u_{\phi } \frac{\cos\theta}{\sin^2\theta} \frac{\partial u_{\theta }}{\partial \phi }+ \frac{1}{{r^2 }} \frac{\partial u_{\phi }}{\partial \phi } \left(u_{\theta } \frac{\cos\theta}{\sin^2\theta} + \left(r + u_r\right) \frac{1}{{\sin\theta}} \right)+ \frac{1}{{2 r^2 \sin^2\theta}} \frac{\partial u_m}{\partial \phi } \frac{\partial u_m}{\partial \phi }\biggr) \\ & + 2 dr r d\theta \biggl(- \frac{1}{{r}} u_{\theta }+ \frac{1}{{r}} \frac{\partial u_r}{\partial \theta }- \frac{1}{{r}} u_{\theta } \frac{\partial u_r}{\partial r}+ \frac{\partial u_{\theta }}{\partial r} \left(1 + \frac{u_r}{r} \right)+ \frac{1}{{r}} \frac{\partial u_m}{\partial r} \frac{\partial u_m}{\partial \theta }\biggr) \\ & + 2 r^2 \sin\theta d\theta  d\phi  \biggl(\frac{1}{{r^2 }} u_{\theta } u_{\phi }- \frac{1}{{r^2 \sin\theta}} u_{\theta } \frac{\partial u_r}{\partial \phi }- \frac{1}{{r^2 }} u_{\phi } \frac{\partial u_r}{\partial \theta }- \frac{1}{{r^2 }} u_{\phi } \cot\theta \left(r + u_r + \frac{\partial u_{\theta }}{\partial \theta }\right)  \\ &\qquad+ \frac{1}{{r^2 \sin\theta}} \left(r + u_r 
\right) \frac{\partial u_{\theta }}{\partial \phi }+ \frac{\partial u_{\phi }}{\partial \theta } \left(\frac{u_{\theta }}{r^2} \cot\theta + \frac{1}{{r}} + \frac{u_r}{r^2} \right)+ \frac{1}{{r^2 \sin\theta}} \frac{\partial u_m}{\partial \theta } \frac{\partial u_m}{\partial \phi }\biggr) \\ & + 2 r \sin\theta d\phi dr \biggl(- \frac{1}{{r }} u_{\phi }+ \frac{1}{{r \sin\theta}} \frac{\partial u_r}{\partial \phi }- u_{\phi } \frac{1}{{r }} \frac{\partial u_r}{\partial r}- u_{\phi } \cot\theta \frac{1}{{r }} \frac{\partial u_{\theta }}{\partial r}+ \frac{1}{{r }} \frac{\partial u_{\phi }}{\partial r} \left( u_{\theta } \cot\theta + r + u_r \right)+ \frac{1}{{r \sin\theta}} \frac{\partial u_m}{\partial \phi } \frac{\partial u_m}{\partial r}\biggr)\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.65)

A manual derivation.

Doing the calculation pretty much completely with Mathematica is rather unsatisfying. To set up for it let’s first compute the unit vectors from scratch. I’ll use geometric algebra to do this calculation. Consider the following figure.

Figure: Composite rotations for spherical polar unit vectors.

We have two sets of rotations, the first is a rotation about the z axis by \phi. Writing i = \mathbf{e}_1 \mathbf{e}_2 for the unit bivector in the x,y plane, we rotate

\begin{aligned}\mathbf{e}_1' &= \mathbf{e}_1 e^{i\phi} = \mathbf{e}_1 \cos\phi + \mathbf{e}_2 \sin\phi \\ \mathbf{e}_2' &= \mathbf{e}_2 e^{i\phi} = \mathbf{e}_2 \cos\phi - \mathbf{e}_1 \sin\phi \\ \mathbf{e}_3' &= \mathbf{e}_3\end{aligned} \hspace{\stretch{1}}(3.66)

Now we rotate in the plane spanned by \mathbf{e}_3 and \mathbf{e}_1' by \theta. With j = \mathbf{e}_3 \mathbf{e}_1', our vectors in the plane rotate as

\begin{aligned}\mathbf{e}_1'' &= \mathbf{e}_1' e^{j\theta} = \mathbf{e}_1 e^{i\phi} e^{j\theta}  \\ \mathbf{e}_3'' &= \mathbf{e}_3' e^{j\theta} = \mathbf{e}_3 e^{j\theta},\end{aligned} \hspace{\stretch{1}}(3.69)

(with \mathbf{e}_2'' = \mathbf{e}_2' since \mathbf{e}_2' \cdot j = 0).

\begin{aligned}\hat{\boldsymbol{\theta}} = \mathbf{e}_1''&= \mathbf{e}_1 e^{i\phi} e^{j\theta} \\ &= \mathbf{e}_1 e^{i\phi} (\cos\theta + \mathbf{e}_3 \mathbf{e}_1 e^{i\phi} \sin\theta) \\ &= \mathbf{e}_1 e^{i\phi} \cos\theta -\mathbf{e}_3 \sin\theta \\ &= (\mathbf{e}_1 \cos\phi + \mathbf{e}_2 \sin\phi) \cos\theta -\mathbf{e}_3 \sin\theta \\ \end{aligned}

\begin{aligned}\hat{\mathbf{r}} = \mathbf{e}_3''&= \mathbf{e}_3 e^{j\theta} \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \mathbf{e}_1 e^{i\phi} \sin\theta) \\ &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 e^{i\phi} \sin\theta \\ &= \mathbf{e}_3 \cos\theta + (\mathbf{e}_1 \cos\phi + \mathbf{e}_2 \sin\phi) \sin\theta \\ \end{aligned}

Now, these are all the same relations that we could find with coordinate algebra

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_1 \cos\phi \sin\theta +\mathbf{e}_2 \sin\phi \sin\theta +\mathbf{e}_3 \cos\theta  \\ \hat{\boldsymbol{\theta}} &= \mathbf{e}_1 \cos\phi \cos\theta +\mathbf{e}_2 \sin\phi \cos\theta -\mathbf{e}_3 \sin\theta  \\ \hat{\boldsymbol{\phi}} &= -\mathbf{e}_1 \sin\phi + \mathbf{e}_2 \cos\phi\end{aligned} \hspace{\stretch{1}}(3.71)

There’s nothing special in this approach if that is as far as we go, but we can put things in a nice tidy form for computation of the differentials of the unit vectors. Introducing the unit pseudoscalar I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 we can write these in a compact exponential form.

\begin{aligned}\hat{\mathbf{r}}&= (\mathbf{e}_1 \cos\phi +\mathbf{e}_2 \sin\phi ) \sin\theta +\mathbf{e}_3 \cos\theta  \\ &= \mathbf{e}_1 e^{i\phi} \sin\theta +\mathbf{e}_3 \cos\theta  \\ &= \mathbf{e}_3 ( \cos\theta + \mathbf{e}_3 \mathbf{e}_1 e^{i\phi} \sin\theta ) \\ &= \mathbf{e}_3 ( \cos\theta + \mathbf{e}_3 \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_2 e^{i\phi} \sin\theta ) \\ &= \mathbf{e}_3 ( \cos\theta + I \hat{\boldsymbol{\phi}} \sin\theta ) \\ &= \mathbf{e}_3 e^{ I \hat{\boldsymbol{\phi}} \theta }\end{aligned}

\begin{aligned}\hat{\boldsymbol{\theta}}&=\mathbf{e}_1 \cos\phi \cos\theta +\mathbf{e}_2 \sin\phi \cos\theta -\mathbf{e}_3 \sin\theta  \\ &=(\mathbf{e}_1 \cos\phi +\mathbf{e}_2 \sin\phi ) \cos\theta -\mathbf{e}_3 \sin\theta  \\ &=\mathbf{e}_1 e^{i\phi} \cos\theta -\mathbf{e}_3 \sin\theta  \\ &=\mathbf{e}_1 e^{i\phi} ( \cos\theta - e^{-i\phi} \mathbf{e}_1 \mathbf{e}_3 \sin\theta ) \\ &=\mathbf{e}_1 e^{i\phi} ( \cos\theta - \mathbf{e}_1 \mathbf{e}_3 e^{i\phi} \sin\theta ) \\ &=\mathbf{e}_1 e^{i\phi} ( \cos\theta - \mathbf{e}_1 \mathbf{e}_3 \mathbf{e}_2 \mathbf{e}_2 e^{i\phi} \sin\theta ) \\ &=\mathbf{e}_1 e^{i\phi} ( \cos\theta + I \hat{\boldsymbol{\phi}} \sin\theta ) \\ &=\mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_2 e^{i\phi} ( \cos\theta + I \hat{\boldsymbol{\phi}} \sin\theta ) \\ &=i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned}

To summarize we have

\begin{aligned}\hat{\boldsymbol{\phi}} &= \mathbf{e}_2 e^{i\phi} \\ \hat{\mathbf{r}} &= \mathbf{e}_3 e^{I\hat{\boldsymbol{\phi}} \theta} \\ \hat{\boldsymbol{\theta}} &= i \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta}.\end{aligned} \hspace{\stretch{1}}(3.74)

Taking differentials we find first

\begin{aligned}d\hat{\boldsymbol{\phi}} = \mathbf{e}_2 e^{i\phi} i d\phi = \hat{\boldsymbol{\phi}} i d\phi\end{aligned}

\begin{aligned}d\hat{\boldsymbol{\theta}}&= d \left( i \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} \right) \\ &= i d \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} + i \hat{\boldsymbol{\phi}} d \left( \cos\theta + I \hat{\boldsymbol{\phi}} \sin\theta \right) \\ &= i d \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta}+ i \hat{\boldsymbol{\phi}} I (d \hat{\boldsymbol{\phi}}) \sin\theta+ i \hat{\boldsymbol{\phi}} I \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \\ &= i \hat{\boldsymbol{\phi}} i e^{I\hat{\boldsymbol{\phi}} \theta} d\phi+ i \hat{\boldsymbol{\phi}} I \hat{\boldsymbol{\phi}} i \sin\theta d\phi+ i \hat{\boldsymbol{\phi}} I \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \\ &= \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} d\phi- I \sin\theta d\phi- \mathbf{e}_3 e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \\ &= \hat{\boldsymbol{\phi}} (\cos\theta + I \hat{\boldsymbol{\phi}} \sin\theta) d\phi- I \sin\theta d\phi- \mathbf{e}_3 e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \\ &= \hat{\boldsymbol{\phi}} \cos\theta d\phi - \hat{\mathbf{r}} d\theta\end{aligned}

\begin{aligned}d \hat{\mathbf{r}}&=\mathbf{e}_3 d \left( e^{I\hat{\boldsymbol{\phi}} \theta} \right) \\ &=\mathbf{e}_3 d \left( \cos\theta + I \hat{\boldsymbol{\phi}} \sin\theta \right) \\ &=\mathbf{e}_3 \left( I (d \hat{\boldsymbol{\phi}}) \sin\theta + I \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \right) \\ &=\mathbf{e}_3 \left( I \hat{\boldsymbol{\phi}} i \sin\theta d\phi + I \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \right) \\ &=i \hat{\boldsymbol{\phi}} i \sin\theta d\phi + i \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} d\theta \\ &=\hat{\boldsymbol{\phi}} \sin\theta d\phi + \hat{\boldsymbol{\theta}} d\theta\end{aligned}

Summarizing these differentials we have

\begin{aligned}d\hat{\mathbf{r}} &= \hat{\boldsymbol{\phi}} \sin\theta d\phi + \hat{\boldsymbol{\theta}} d\theta \\ d\hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\phi}} \cos\theta d\phi - \hat{\mathbf{r}} d\theta \\ d\hat{\boldsymbol{\phi}} &= \hat{\boldsymbol{\phi}} i d\phi\end{aligned} \hspace{\stretch{1}}(3.77)

A final cleanup is required. While \hat{\boldsymbol{\phi}} i is a vector and has a nicely compact form, we need to decompose this into components in the \hat{\mathbf{r}}, \hat{\boldsymbol{\theta}} and \hat{\boldsymbol{\phi}} directions. Taking scalar products we have

\begin{aligned}\hat{\boldsymbol{\phi}} \cdot (\hat{\boldsymbol{\phi}} i) = 0\end{aligned}

\begin{aligned}\hat{\mathbf{r}} \cdot (\hat{\boldsymbol{\phi}} i)&=\left\langle{{ \hat{\mathbf{r}} \hat{\boldsymbol{\phi}} i}}\right\rangle \\ &=\left\langle{{ \mathbf{e}_3 e^{I\hat{\boldsymbol{\phi}} \theta} \mathbf{e}_2 e^{i\phi} i}}\right\rangle \\ &=\left\langle{{ \mathbf{e}_3 (\cos\theta + I \mathbf{e}_2 e^{i\phi} \sin\theta) \mathbf{e}_2 e^{i\phi} i}}\right\rangle \\ &=\left\langle{{ I (\cos\theta e^{-i\phi} + I \mathbf{e}_2 \sin\theta) \mathbf{e}_2 }}\right\rangle \\ &=-\sin\theta\end{aligned}

\begin{aligned}\hat{\boldsymbol{\theta}} \cdot (\hat{\boldsymbol{\phi}} i)&=\left\langle{{ \hat{\boldsymbol{\theta}} \hat{\boldsymbol{\phi}} i }}\right\rangle \\ &=\left\langle{{ i \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} \hat{\boldsymbol{\phi}} i }}\right\rangle \\ &=-\left\langle{{ \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}} \theta} \hat{\boldsymbol{\phi}} }}\right\rangle \\ &=-\left\langle{{ e^{I\hat{\boldsymbol{\phi}} \theta} }}\right\rangle \\ &=- \cos\theta.\end{aligned}

Summarizing once again, but this time in terms of \hat{\mathbf{r}}, \hat{\boldsymbol{\theta}} and \hat{\boldsymbol{\phi}} we have

\begin{aligned}d\hat{\mathbf{r}} &= \hat{\boldsymbol{\phi}} \sin\theta d\phi + \hat{\boldsymbol{\theta}} d\theta \\ d\hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\phi}} \cos\theta d\phi - \hat{\mathbf{r}} d\theta \\ d\hat{\boldsymbol{\phi}} &= -(\hat{\mathbf{r}} \sin\theta + \hat{\boldsymbol{\theta}} \cos\theta) d\phi\end{aligned} \hspace{\stretch{1}}(3.80)
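The least obvious of these, the d\hat{\boldsymbol{\phi}} relation, can be sanity checked numerically. What follows is only a sketch, assuming the usual Cartesian components of the spherical unit vectors (\theta polar, \phi azimuthal); the helper names `basis` and `dphi_hat_dphi` are made up for this check.

```python
import math

def basis(theta, phi):
    """Spherical unit vectors (r_hat, theta_hat, phi_hat) as Cartesian triples."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    r_hat = (st * cp, st * sp, ct)
    theta_hat = (ct * cp, ct * sp, -st)
    phi_hat = (-sp, cp, 0.0)
    return r_hat, theta_hat, phi_hat

def dphi_hat_dphi(theta, phi, h=1e-6):
    """Central-difference estimate of d(phi_hat)/d(phi)."""
    _, _, plus = basis(theta, phi + h)
    _, _, minus = basis(theta, phi - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

theta, phi = 0.7, 1.9  # arbitrary sample angles
r_hat, theta_hat, _ = basis(theta, phi)
numeric = dphi_hat_dphi(theta, phi)
# Prediction from the summary: -(r_hat sin(theta) + theta_hat cos(theta))
predicted = tuple(-(r * math.sin(theta) + t * math.cos(theta))
                  for r, t in zip(r_hat, theta_hat))
max_err = max(abs(n - p) for n, p in zip(numeric, predicted))
```

The same pattern (difference the basis vector, compare against the claimed combination) should work for the d\hat{\mathbf{r}} and d\hat{\boldsymbol{\theta}} relations too.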

Now we are set to take differentials. With

\begin{aligned}\mathbf{x} = r \hat{\mathbf{r}},\end{aligned} \hspace{\stretch{1}}(3.83)

we have

\begin{aligned}d\mathbf{x} =dr \hat{\mathbf{r}}+ r d\hat{\mathbf{r}}=dr \hat{\mathbf{r}} + \hat{\boldsymbol{\phi}} r \sin\theta d\phi + r \hat{\boldsymbol{\theta}} d\theta.\end{aligned} \hspace{\stretch{1}}(3.84)

Squaring this we get the usual spherical polar scalar line element

\begin{aligned}d\mathbf{x}^2 = dr^2 + r^2 \sin^2\theta d\phi^2 + r^2 d\theta^2.\end{aligned} \hspace{\stretch{1}}(3.85)


\begin{aligned}\mathbf{u} = u_r \hat{\mathbf{r}} + u_\theta \hat{\boldsymbol{\theta}} + u_\phi \hat{\boldsymbol{\phi}},\end{aligned} \hspace{\stretch{1}}(3.86)

With \mathbf{u} expressed in the spherical basis as above, our differential is

\begin{aligned}d\mathbf{u}&=du_r \hat{\mathbf{r}} + du_\theta \hat{\boldsymbol{\theta}} + du_\phi \hat{\boldsymbol{\phi}}+ u_r d\hat{\mathbf{r}} + u_\theta d\hat{\boldsymbol{\theta}} + u_\phi d \hat{\boldsymbol{\phi}} \\ &=du_r \hat{\mathbf{r}} + du_\theta \hat{\boldsymbol{\theta}} + du_\phi \hat{\boldsymbol{\phi}}+ u_r \left(\hat{\boldsymbol{\phi}} \sin\theta d\phi + \hat{\boldsymbol{\theta}} d\theta \right)+ u_\theta \left( \hat{\boldsymbol{\phi}} \cos\theta d\phi - \hat{\mathbf{r}} d\theta \right)- u_\phi (\hat{\mathbf{r}} \sin\theta + \hat{\boldsymbol{\theta}} \cos\theta) d\phi\\ &=\hat{\mathbf{r}} \left( du_r - u_\theta d\theta - u_\phi \sin\theta d\phi \right) \\ &+\hat{\boldsymbol{\theta}} \left( du_\theta + u_r d\theta - u_\phi \cos\theta d\phi \right) \\ &+\hat{\boldsymbol{\phi}} \left( du_\phi + u_r \sin\theta d\phi + u_\theta \cos\theta d\phi \right).\end{aligned}

We can add d\mathbf{x} to this and take differences

\begin{aligned}\begin{aligned}(d\mathbf{u} + d\mathbf{x})^2 - d\mathbf{x}^2&=\left( du_r - u_\theta d\theta - u_\phi \sin\theta d\phi + dr \right)^2 \\ &+\left( du_\theta + u_r d\theta - u_\phi \cos\theta d\phi + r d\theta \right)^2 \\ &+\left( du_\phi + u_r \sin\theta d\phi + u_\theta \cos\theta d\phi + r \sin\theta d\phi \right)^2 \\ &- dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.87)

For each m = r,\theta,\phi we have

\begin{aligned}du_m=\frac{\partial {u_m}}{\partial {r}} dr +\frac{\partial {u_m}}{\partial {\theta}} d\theta +\frac{\partial {u_m}}{\partial {\phi}} d\phi,\end{aligned} \hspace{\stretch{1}}(3.88)

and plugging through that calculation is really all it takes to derive the textbook result. To do this to first order in u_m, we find

\begin{aligned}\frac{1}{{2}} \left((d\mathbf{u} + d\mathbf{x})^2 - d\mathbf{x}^2\right)&=du_r dr- u_\theta d\theta dr- u_\phi \sin\theta d\phi dr  \\ &+ du_\theta r d\theta+ u_r r d\theta^2- u_\phi r \cos\theta d\phi d\theta \\ &+ r \sin\theta du_\phi d\phi+ r \sin^2\theta u_r d\phi^2+ r \sin\theta \cos\theta u_\theta d\phi^2 \\ &=\left( \frac{\partial {u_r}}{\partial {r}} dr + \frac{\partial {u_r}}{\partial {\theta}} d\theta + \frac{\partial {u_r}}{\partial {\phi}} d\phi \right)dr- u_\theta d\theta dr- u_\phi \sin\theta d\phi dr  \\ &+\left( \frac{\partial {u_\theta}}{\partial {r}} dr + \frac{\partial {u_\theta}}{\partial {\theta}} d\theta + \frac{\partial {u_\theta}}{\partial {\phi}} d\phi \right) r d\theta+ u_r r d\theta^2- u_\phi r \cos\theta d\phi d\theta \\ &+\left( \frac{\partial {u_\phi}}{\partial {r}} dr + \frac{\partial {u_\phi}}{\partial {\theta}} d\theta + \frac{\partial {u_\phi}}{\partial {\phi}} d\phi \right)r \sin\theta d\phi+ r \sin^2\theta u_r d\phi^2+ r \sin\theta \cos\theta u_\theta d\phi^2\end{aligned}

Collecting terms we have the result of the text in the braces

\begin{aligned}\begin{aligned}\left((d\mathbf{u} + d\mathbf{x})^2 - d\mathbf{x}^2\right)&=2 dr^2 \left(\frac{\partial {u_r}}{\partial {r}}\right) \\ &+2 r^2 d\theta^2 \left(\frac{1}{{r}} \frac{\partial {u_\theta}}{\partial {\theta}} + u_r \frac{1}{{r}}\right) \\ &+2 r^2 \sin^2\theta d\phi^2 \left(\frac{\partial {u_\phi}}{\partial {\phi}} \frac{1}{{r \sin\theta}} + \frac{1}{{r}} u_r + \frac{1}{{r}} \cot\theta u_\theta\right) \\ &+2 dr r d\theta \left(\frac{1}{{r}} \frac{\partial {u_r}}{\partial {\theta}} - \frac{1}{{r}} u_\theta +\frac{\partial {u_\theta}}{\partial {r}}\right) \\ &+2 r^2 \sin\theta d\theta d\phi \left(\frac{\partial {u_\theta}}{\partial {\phi}} \frac{1}{{r \sin\theta}} - \frac{1}{{r}} u_\phi \cot\theta +\frac{1}{{r}} \frac{\partial {u_\phi}}{\partial {\theta}}\right) \\ &+2 r \sin\theta d\phi dr \left(\frac{1}{{r \sin\theta}} \frac{\partial {u_r}}{\partial {\phi}} - \frac{1}{{r}} u_\phi + \frac{\partial {u_\phi}}{\partial {r}}\right)\end{aligned}\end{aligned} \hspace{\stretch{1}}(3.89)

It should be possible to do the calculation to second order too, but including all the quadratic terms in u_m is again really messy. Trying that with Mathematica gives the same results as above using the strictly coordinate algebra approach.


[1] L.D. Landau, EM Lifshitz, JB Sykes, WH Reid, and E.H. Dill. Theory of elasticity: Vol. 7 of course of theoretical physics. 1960.

Posted in Math and Physics Learning.

PHY456H1F: Quantum Mechanics II. Lecture 19 (Taught by Prof J.E. Sipe). Rotations of operators.

Posted by peeterjoot on November 16, 2011



Peeter’s lecture notes from class. May not be entirely coherent.

Rotations of operators.

READING: section 28 [1].

Rotating with U[M] as in figure (1)

Figure 1: Rotating a state centered at F

\begin{aligned}\tilde{r}_i = \sum_j M_{ij} \bar{r}_j\end{aligned} \hspace{\stretch{1}}(2.1)

\begin{aligned}{\left\langle {\psi} \right\rvert} R_i {\left\lvert {\psi} \right\rangle} = \bar{r}_i\end{aligned} \hspace{\stretch{1}}(2.2)

\begin{aligned}{\left\langle {\psi} \right\rvert} U^\dagger[M] R_i U[M] {\left\lvert {\psi} \right\rangle}&= \tilde{r}_i = \sum_j M_{ij} \bar{r}_j \\ &={\left\langle {\psi} \right\rvert} \Bigl( U^\dagger[M] R_i U[M] \Bigr) {\left\lvert {\psi} \right\rangle}\end{aligned}


\begin{aligned}U^\dagger[M] R_i U[M] = \sum_j M_{ij} R_j\end{aligned} \hspace{\stretch{1}}(2.3)

Any three operators V_x, V_y, V_z that transform according to

\begin{aligned}U^\dagger[M] V_i U[M] = \sum_j M_{ij} V_j\end{aligned} \hspace{\stretch{1}}(2.4)

form the components of a vector operator.

Infinitesimal rotations

Consider infinitesimal rotations, where we can show that

\begin{aligned}\left[{V_i},{J_j}\right] = i \hbar \sum_k \epsilon_{ijk} V_k\end{aligned} \hspace{\stretch{1}}(2.5)

Note that for V_i = J_i we recover the familiar commutator rules for angular momentum, but this also holds for operators \mathbf{R}, \mathbf{P}, \mathbf{J}, …
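For the V_i = J_i case, the commutator rule can be checked with a concrete set of matrices. This is a sketch that assumes the 3 \times 3 generators (J_k)_{ab} = -i \hbar \epsilon_{kab} as a representation (not something derived in the lecture), with \hbar = 1; the names `generator` and `check` are made up.

```python
HBAR = 1.0  # assumed units with hbar = 1

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def generator(k):
    """Assumed representation: (J_k)_{ab} = -i hbar eps_{kab}."""
    return [[-1j * HBAR * eps(k, a, b) for b in range(3)] for a in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

J = [generator(k) for k in range(3)]

def check(i, j):
    """Largest deviation of [J_i, J_j] from i hbar sum_k eps_ijk J_k."""
    lhs = commutator(J[i], J[j])
    rhs = [[sum(1j * HBAR * eps(i, j, k) * J[k][a][b] for k in range(3))
            for b in range(3)] for a in range(3)]
    return max(abs(lhs[a][b] - rhs[a][b]) for a in range(3) for b in range(3))

worst = max(check(i, j) for i in range(3) for j in range(3))
```

Here `worst` collects the largest deviation over all nine index pairs, so a value at rounding-error level means every commutator matches.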

Note that

\begin{aligned}U^\dagger[M] = U[M^{-1}] = U[M^\text{T}],\end{aligned} \hspace{\stretch{1}}(2.6)


\begin{aligned}U[M] V_i U^\dagger[M] = U^\dagger[M^\text{T}] V_i U[M^\text{T}] = \sum_j M_{ji} V_j\end{aligned} \hspace{\stretch{1}}(2.7)


\begin{aligned}{\left\langle {\psi} \right\rvert} V_i {\left\lvert {\psi} \right\rangle}={\left\langle {\psi} \right\rvert}U^\dagger[M] \Bigl( U[M] V_i U^\dagger[M] \Bigr) U[M]{\left\lvert {\psi} \right\rangle}\end{aligned} \hspace{\stretch{1}}(2.8)

In the same way, suppose we have nine operators

\begin{aligned}\tau_{ij}, \qquad i, j = x, y, z\end{aligned} \hspace{\stretch{1}}(2.9)

that transform according to

\begin{aligned}U[M] \tau_{ij} U^\dagger[M] = \sum_{lm} M_{li} M_{mj} \tau_{lm}\end{aligned} \hspace{\stretch{1}}(2.10)

then we will call these the components of a (Cartesian) second rank tensor operator. Suppose that we have an operator S that transforms

\begin{aligned}U[M] S U^\dagger[M] = S\end{aligned} \hspace{\stretch{1}}(2.11)

Then we will call S a scalar operator.

A problem.

This all looks good, but it is really not satisfactory. There is a problem.

Suppose that we have a Cartesian tensor operator like this. Let’s look at the quantity

\begin{aligned}U[M] \Bigl( \sum_i \tau_{ii} \Bigr) U^\dagger[M]&=\sum_iU[M] \tau_{ii} U^\dagger[M]  \\ &= \sum_i\sum_{lm} M_{li} M_{mi} \tau_{lm} \\ &= \sum_i\sum_{lm} M_{li} M_{im}^\text{T} \tau_{lm} \\ &= \sum_{lm} \delta_{lm} \tau_{lm} \\ &= \sum_{l} \tau_{ll} \end{aligned}
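The index gymnastics behind this trace invariance can be tried out with plain numbers standing in for the operators. This is only an illustration under that assumption: `tau` is a made-up 3 \times 3 array, and `transform` applies the rule (2.10) to it.

```python
import math

def rot_z(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform(M, tau):
    """tau'_{ij} = sum_{lm} M_{li} M_{mj} tau_{lm}, with numbers in place of operators."""
    return [[sum(M[l][i] * M[m][j] * tau[l][m]
                 for l in range(3) for m in range(3))
             for j in range(3)] for i in range(3)]

def trace(t):
    return sum(t[i][i] for i in range(3))

M = matmul(rot_z(0.4), rot_x(1.1))  # some rotation matrix
tau = [[1.0, 2.0, -3.0], [0.5, -4.0, 6.0], [7.0, 8.0, 2.5]]  # made-up components
tau_rotated = transform(M, tau)
```

The individual components of `tau_rotated` differ from those of `tau`, while the two traces agree to rounding error, which is the buried simplicity in question.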

We see buried inside these Cartesian tensors of higher rank there is some simplicity embedded (in this case trace invariance). Who knows what other relationships are also there? We want to work with and extract the buried simplicities, and we will find that the Cartesian way of expressing these tensors is horribly inefficient. What is a representation that doesn’t have any excess information, and is in some sense minimal?

How do we extract these buried simplicities?


\begin{aligned}U[M] {\left\lvert {j m''} \right\rangle} \end{aligned} \hspace{\stretch{1}}(2.12)

gives a linear combination of the {\left\lvert {j m'} \right\rangle}.

\begin{aligned}U[M] {\left\lvert {j m''} \right\rangle} &=\sum_{m'} {\left\lvert {j m'} \right\rangle} {\left\langle {j m'} \right\rvert} U[M] {\left\lvert {j m''} \right\rangle}  \\ &=\sum_{m'} {\left\lvert {j m'} \right\rangle} D^{(j)}_{m' m''}[M] \\ \end{aligned}

We’ve talked before about how these D^{(j)}_{m' m''}[M] form a representation of the rotation group. They in fact form (not proved here) an irreducible representation.

Look at each element of D^{(j)}_{m' m''}[M]. These are matrices and will be different according to which rotation M is chosen. There is some M for which this element is nonzero. There is no element of this matrix that is zero for all possible M. There are more formal ways to think about this in a group theory context, but this is a physical way to think about this.

Think of these as the basis vectors for some eigenket of J^2.

\begin{aligned}{\left\lvert {\psi} \right\rangle} &= \sum_{m''} {\left\lvert {j m''} \right\rangle} \left\langle{{j m''}} \vert {{\psi}}\right\rangle \\ &= \sum_{m''} \bar{a}_{m''} {\left\lvert {j m''} \right\rangle}\end{aligned}


\begin{aligned}\bar{a}_{m''} = \left\langle{{j m''}} \vert {{\psi}}\right\rangle \end{aligned} \hspace{\stretch{1}}(2.13)


\begin{aligned}U[M] {\left\lvert {\psi} \right\rangle} &= \sum_{m'} U[M] {\left\lvert {j m'} \right\rangle} \left\langle{{j m'}} \vert {{\psi}}\right\rangle \\ &= \sum_{m'} U[M] {\left\lvert {j m'} \right\rangle} \bar{a}_{m'} \\ &= \sum_{m', m''} {\left\lvert {j m''} \right\rangle} {\left\langle {j m''} \right\rvert}U[M] {\left\lvert {j m'} \right\rangle} \bar{a}_{m'} \\ &= \sum_{m', m''} {\left\lvert {j m''} \right\rangle} D^{(j)}_{m'', m'}\bar{a}_{m'} \\ &= \sum_{m''} \tilde{a}_{m''} {\left\lvert {j m''} \right\rangle} \end{aligned}


\begin{aligned}\tilde{a}_{m''} = \sum_{m'} D^{(j)}_{m'', m'} \bar{a}_{m'} \\ \end{aligned} \hspace{\stretch{1}}(2.14)

Recall that

\begin{aligned}\tilde{r}_i = \sum_j M_{ij} \bar{r}_j\end{aligned} \hspace{\stretch{1}}(2.15)

Define (2k + 1) operators {T_k}^q, q = k, k-1, \cdots, -k as the elements of a spherical tensor of rank k if

\begin{aligned}U[M] {T_k}^q U^\dagger[M] = \sum_{q'} D^{(k)}_{q' q} {T_k}^{q'}\end{aligned} \hspace{\stretch{1}}(2.16)

Here we are looking for a better way to organize things, and it will turn out (not to be proved) that this will be an irreducible way to represent things.


We want to work through some examples of spherical tensors, and how they relate to Cartesian tensors. To do this, a motivating story needs to be told.

Let’s suppose that {\left\lvert {\psi} \right\rangle} is a ket for a single particle. Perhaps we are talking about an electron without spin, and write

\begin{aligned}\left\langle{\mathbf{r}} \vert {{\psi}}\right\rangle &= Y_{lm}(\theta, \phi) f(r) \\ &= \sum_{m''} \bar{a}_{m''} Y_{l m''}(\theta, \phi) \end{aligned}

for \bar{a}_{m''} = \delta_{m'' m} and after dropping f(r). So

\begin{aligned}{\left\langle {\mathbf{r}} \right\rvert} U[M] {\left\lvert {\psi} \right\rangle} =\sum_{m''} \sum_{m'} D^{(j)}_{m'' m'} \bar{a}_{m'} Y_{l m''}(\theta, \phi) \end{aligned} \hspace{\stretch{1}}(2.17)

We are writing this in this particular way to make a point. Now also assume that

\begin{aligned}\left\langle{\mathbf{r}} \vert {{\psi}}\right\rangle = Y_{lm}(\theta, \phi)\end{aligned} \hspace{\stretch{1}}(2.18)

so we find

\begin{aligned}{\left\langle {\mathbf{r}} \right\rvert} U[M] {\left\lvert {\psi} \right\rangle} &=\sum_{m''} Y_{l m''}(\theta, \phi) D^{(j)}_{m'' m} \\ &=Y'_{l m}(\theta, \phi) \end{aligned}

\begin{aligned}Y_{l m}(\theta, \phi)  = Y_{lm}(x, y, z)\end{aligned} \hspace{\stretch{1}}(2.19)


\begin{aligned}Y'_{l m}(x, y, z)= \sum_{m''} Y_{l m''}(x, y, z)D^{(j)}_{m'' m} \end{aligned} \hspace{\stretch{1}}(2.20)

Now consider the spherical harmonic as an operator Y_{l m}(X, Y, Z)

\begin{aligned}U[M] Y_{lm}(X, Y, Z) U^\dagger[M] =\sum_{m''} Y_{l m''}(X, Y, Z)D^{(j)}_{m'' m} \end{aligned} \hspace{\stretch{1}}(2.21)

So this is a way to generate spherical tensor operators of rank 0, 1, 2, \cdots.


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.

Posted in Math and Physics Learning.

PHY450H1S. Relativistic Electrodynamics Lecture 6 (Taught by Prof. Erich Poppitz). Four vectors and tensors.

Posted by peeterjoot on January 25, 2011



Still covering chapter 1 material from the text [1].

Covering Professor Poppitz’s lecture notes: nonrelativistic limit of boosts (33); number of parameters of Lorentz transformations (34-35); introducing four-vectors, the metric tensor, the invariant “dot-product” and SO(1,3) (36-40); the Poincare group (41); the convenience of “upper” and “lower” indices (42-43); tensors (44)

The Special Orthogonal group (for Euclidean space).

Lorentz transformations are like “rotations” for (t, x, y, z) that preserve (ct)^2 - x^2 - y^2 - z^2. There are 6 continuous parameters: 3 rotations in x,y,z space, and 3 “boosts” in x or y or z.

For rotations of space we talk about a group of transformations of 3D Euclidean space, and call this the SO(3) group. Here S is for Special, O for Orthogonal, and 3 for the dimensions.

For a transformed vector in 3D space we write

\begin{aligned}\begin{bmatrix}x \\ y \\ z\end{bmatrix} \rightarrow \begin{bmatrix}x \\ y \\ z\end{bmatrix}' = O \begin{bmatrix}x \\ y \\ z\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.1)

Here O is an orthogonal 3 \times 3 matrix, and has the property

\begin{aligned}O^T O = \mathbf{1}.\end{aligned} \hspace{\stretch{1}}(2.2)

Taking determinants, we have

\begin{aligned}\det{ O^T } \det{ O} = 1,\end{aligned} \hspace{\stretch{1}}(2.3)

and since \det{O^\text{T}} = \det{ O }, we have

\begin{aligned}(\det{O})^2 = 1,\end{aligned} \hspace{\stretch{1}}(2.4)

so our determinant must be

\begin{aligned}\det O = \pm 1.\end{aligned} \hspace{\stretch{1}}(2.5)

We work with the positive case only, avoiding the transformations that include reflections.

The orthogonality condition O^\text{T} O = 1 is an indication that the inner product is preserved. Observe that in matrix form we can write the inner product

\begin{aligned}\mathbf{r}_1 \cdot \mathbf{r}_2 = \begin{bmatrix}x_1 & y_1 & z_1\end{bmatrix}\begin{bmatrix}x_2 \\ y_2 \\ z_2 \\ \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.6)

For a transformed vector X' = O X, we have {X'}^\text{T} = X^\text{T} O^\text{T}, and

\begin{aligned}X' \cdot X' = (X^\text{T} O^\text{T}) (O X) = X^\text{T} (O^\text{T} O) X = X^T X = X \cdot X\end{aligned} \hspace{\stretch{1}}(2.7)
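Both statements (inner product preserved, determinant \pm 1) are easy to try out numerically. A small sketch, with made-up sample vectors and a z-axis rotation and a z-reflection picked as representative orthogonal matrices:

```python
import math

def matvec(O, v):
    return [sum(O[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

phi = 0.3
c, s = math.cos(phi), math.sin(phi)
rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # z -> -z

r1, r2 = [1.0, 2.0, 3.0], [-0.5, 4.0, 1.5]  # arbitrary sample vectors

# Both matrices are orthogonal, so both leave the dot product alone.
dot_before = dot(r1, r2)
dot_rotated = dot(matvec(rotation, r1), matvec(rotation, r2))
dot_reflected = dot(matvec(reflection, r1), matvec(reflection, r2))
```

Both matrices preserve the dot product, but only the rotation has determinant +1, which is why the determinant condition is what separates out the reflections.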

The Special Orthogonal group (for spacetime).

This generalizes to Lorentz boosts! There are two differences:

First, Lorentz transforms should be 4 \times 4 not 3 \times 3, and act on (ct, x, y, z), and NOT (x,y,z).

Second, they should leave invariant NOT \mathbf{r}_1 \cdot \mathbf{r}_2, but c^2 t_2 t_1 - \mathbf{r}_2 \cdot \mathbf{r}_1.

Don’t get confused that I demanded c^2 t_2 t_1 - \mathbf{r}_2 \cdot \mathbf{r}_1 = \text{invariant} rather than c^2 (t_2 - t_1)^2 - (\mathbf{r}_2 - \mathbf{r}_1)^2 = \text{invariant}. Expansion of this (squared) interval provides just this four vector dot product and its invariance condition

\begin{aligned}\text{invariant} &=c^2 (t_2 - t_1)^2 - (\mathbf{r}_2 - \mathbf{r}_1)^2 \\ &=(c^2 t_2^2 - \mathbf{r}_2^2) + (c^2 t_1^2 - \mathbf{r}_1^2)- 2 c^2 t_2 t_1 + 2 \mathbf{r}_1 \cdot \mathbf{r}_2.\end{aligned}

Observe that we have the sum of two invariants plus our new cross term, so this cross term, (-2 times our dot product to be defined), must also be an invariant.

Introduce the four vector

\begin{aligned}x^0 &= ct \\ x^1 &= x \\ x^2 &= y \\ x^3 &= z \end{aligned}

Or (x^0, x^1, x^2, x^3) = \{ x^i, i = 0,1,2,3 \}.

We will also write

\begin{aligned}x^i &= (ct, \mathbf{r}) \\ \tilde{x}^i &= (c\tilde{t}, \tilde{\mathbf{r}})\end{aligned}

Our inner product is

\begin{aligned}c^2 t \tilde{t} - \mathbf{r} \cdot \tilde{\mathbf{r}}\end{aligned} \hspace{\stretch{1}}(3.8)

Introduce the 4 \times 4 matrix

\begin{aligned} \left\lVert{g_{ij}}\right\rVert = \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.9)

This is called the Minkowski spacetime metric.


\begin{aligned}c^2 t \tilde{t} - \mathbf{r} \cdot \tilde{\mathbf{r}}&\equiv \sum_{i, j = 0}^3 \tilde{x}^i g_{ij} x^j \\ &= \tilde{x}^0 x^0 -\tilde{x}^1 x^1 -\tilde{x}^2 x^2 -\tilde{x}^3 x^3 \end{aligned}

Einstein summation convention. Whenever indexes are repeated, they are assumed to be summed over.

We also write

\begin{aligned}X = \begin{bmatrix}x^0 \\ x^1 \\ x^2 \\ x^3 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.10)

\begin{aligned}\tilde{X} = \begin{bmatrix}\tilde{x}^0 \\ \tilde{x}^1 \\ \tilde{x}^2 \\ \tilde{x}^3 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.11)

\begin{aligned}G = \begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.12)

Our inner product is

\begin{aligned}c^2 t \tilde{t} - \tilde{\mathbf{r}} \cdot \mathbf{r} = \tilde{X}^\text{T} G X =\begin{bmatrix}\tilde{x}^0 & \tilde{x}^1 & \tilde{x}^2 & \tilde{x}^3 \end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ \end{bmatrix}\begin{bmatrix}x^0 \\ x^1 \\ x^2 \\ x^3 \\ \end{bmatrix}\end{aligned}

Under Lorentz boosts, we have

\begin{aligned}X = \hat{O} X',\end{aligned} \hspace{\stretch{1}}(3.13)


\begin{aligned}\hat{O} =\begin{bmatrix}\gamma & - \gamma v_x/c  & 0 & 0 \\ - \gamma v_x/c & \gamma  & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(3.14)

(for x-direction boosts)

\begin{aligned}\tilde{X} &= \hat{O} \tilde{X}' \\ \tilde{X}^\text{T} &= \tilde{X'}^\text{T} \hat{O}^\text{T}\end{aligned} \hspace{\stretch{1}}(3.15)

But \hat{O} must be such that \tilde{X}^\text{T} G X is invariant. i.e.

\begin{aligned}\tilde{X}^\text{T} G X = {\tilde{X'}}^\text{T} (\hat{O}^\text{T} G \hat{O}) X' = {\tilde{X'}}^\text{T} (G) X' \qquad \forall X' \text{ and } \tilde{X}' \end{aligned} \hspace{\stretch{1}}(3.16)

This implies

\begin{aligned}\boxed{\hat{O}^\text{T} G \hat{O} = G}\end{aligned} \hspace{\stretch{1}}(3.17)

Such \hat{O}‘s are called “pseudo-orthogonal”.

Lorentz transformations are represented by the set of all 4 \times 4 pseudo-orthogonal matrices.

In symbols

\begin{aligned}\hat{O}^T G \hat{O} = G\end{aligned} \hspace{\stretch{1}}(3.18)
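The pseudo-orthogonality condition can be verified numerically for the x-direction boost matrix (3.14). This is a sketch with an arbitrarily chosen \beta = v_x/c = 0.6 and made-up sample four-vectors.

```python
import math

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def boost_x(beta):
    """x-direction boost matrix of (3.14), with beta = v_x / c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def minkowski(xt, x):
    """xt^T G x for four-vectors given as lists (ct, x, y, z)."""
    return xt[0] * x[0] - xt[1] * x[1] - xt[2] * x[2] - xt[3] * x[3]

O = boost_x(0.6)
OtGO = matmul(transpose(O), matmul(G, O))  # should reproduce G

x = [5.0, 1.0, 2.0, 3.0]    # made-up four-vectors
xt = [2.0, -1.0, 0.0, 4.0]
before = minkowski(xt, x)
after = minkowski(matvec(O, xt), matvec(O, x))
```

Up to rounding, `OtGO` reproduces `G` and the inner product is the same before and after the boost is applied to both vectors.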

Just as before we can take the determinant of both sides. Doing so we have

\begin{aligned}\det(\hat{O}^T G \hat{O}) = \det(\hat{O}^T) \det(G) \det(\hat{O}) = \det(G)\end{aligned} \hspace{\stretch{1}}(3.19)

The \det(G) terms cancel, and since \det(\hat{O}^T) = \det(\hat{O}), this leaves us with (\det(\hat{O}))^2 = 1, or

\begin{aligned}\det(\hat{O}) = \pm 1\end{aligned} \hspace{\stretch{1}}(3.20)

We take the \det \hat{O} = +1 case only, so that the transformations do not change orientation (no reflection in space or time). This set of transformations forms the group SO(1,3):


Special orthogonal, one time, 3 space dimensions.

Einstein relativity can be defined as the “laws of physics that leave four vectors invariant” in the

\begin{aligned}SO(1,3) \times T^4\end{aligned}

symmetry group.

Here T^4 is the group of translations in spacetime with 4 continuous parameters. The complete group of transformations that form the group of relativistic physics has 10 = 3 + 3 + 4 continuous parameters.

This group is called the Poincare group of symmetry transforms.

More notation

Our inner product is written

\begin{aligned}\tilde{x}^i g_{ij} x^j\end{aligned} \hspace{\stretch{1}}(4.21)

but this is very cumbersome. The convenient way to write this is instead

\begin{aligned}\tilde{x}^i g_{ij} x^j = \tilde{x}_j x^j = \tilde{x}^i x_i\end{aligned} \hspace{\stretch{1}}(4.22)


\begin{aligned}x_i = g_{ij} x^j = g_{ji} x^j\end{aligned} \hspace{\stretch{1}}(4.23)

Note: A check that we should always be able to make. Indexes that are not summed over should be conserved. So in the above we have a free i on the LHS, and should have a non-summed i index on the RHS too (also lower matching lower, or upper matching upper).

Non-matched indexes are bad in the same sort of sense that an expression like

\begin{aligned}\mathbf{r} = 1\end{aligned} \hspace{\stretch{1}}(4.24)

isn’t well defined (assuming a vector space \mathbf{r} and not a multivector Clifford algebra that is;)

Example explicitly:

\begin{aligned}x_0 &= g_{0 0} x^0 = ct  \\ x_1 &= g_{1 j} x^j = g_{11} x^1 = -x^1 \\ x_2 &= g_{2 j} x^j = g_{22} x^2 = -x^2 \\ x_3 &= g_{3 j} x^j = g_{33} x^3 = -x^3\end{aligned}
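The index lowering above is mechanical enough to spell out in a few lines of code. A minimal sketch, with made-up integer components and hypothetical helper names `lower` and `inner`:

```python
G = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]

def lower(x_up):
    """x_i = g_{ij} x^j, with the repeated index j summed."""
    return [sum(G[i][j] * x_up[j] for j in range(4)) for i in range(4)]

def inner(xt_up, x_up):
    """Invariant product xt^i g_{ij} x^j = xt^i x_i."""
    x_down = lower(x_up)
    return sum(a * b for a, b in zip(xt_up, x_down))

x_up = [5, 1, 2, 3]     # (ct, x, y, z), made-up numbers
xt_up = [2, -1, 0, 4]

x_down = lower(x_up)          # [5, -1, -2, -3]
product = inner(xt_up, x_up)  # 2*5 - (-1)*1 - 0*2 - 4*3 = -1
```

Lowering only flips the sign of the spatial components, which is all that the diagonal metric does here.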

We would not have objects of the form

\begin{aligned}x^i x^i = (ct)^2 + \mathbf{r}^2\end{aligned} \hspace{\stretch{1}}(4.25)

for example. This is not a Lorentz invariant quantity.

Lorentz scalar example: \tilde{x}^i x_i

Lorentz vector example: x^i

This last is also called a rank-1 tensor.

Lorentz rank-2 tensors: ex: g_{ij}

or other 2-index objects.

Why in the world would we ever want to consider two index objects? We aren’t just trying to be hard on ourselves. Recall from classical mechanics that we have a two index object, the inertia tensor.

In mechanics, for a rigid body we had the energy

\begin{aligned}T = \frac{1}{{2}} \sum_{i,j = 1}^3 \Omega_i I_{ij} \Omega_j\end{aligned} \hspace{\stretch{1}}(4.26)

The inertia tensor was this object

\begin{aligned}I_{ij} = \sum_{a = 1}^N m_a \left(\delta_{ij} \mathbf{r}_a^2 - r_{a_i} r_{a_j} \right)\end{aligned} \hspace{\stretch{1}}(4.27)

or for a continuous body

\begin{aligned}I_{ij} = \int \rho(\mathbf{r}) \left(\delta_{ij} \mathbf{r}^2 - r_{i} r_{j} \right) d^3 \mathbf{r}\end{aligned} \hspace{\stretch{1}}(4.28)

In electrostatics we have the quadrupole tensor, … and we have other such objects all over physics.

Note that the energy T of the body above cannot depend on the coordinate system in use. This is a general property of tensors. These are objects that transform as products of vectors, as I_{ij} does.

We call I_{ij} a rank-2 3-tensor. rank-2 because there are two indexes, and 3 because the indexes range from 1 to 3.

The point is that the components of a tensor transform as

\begin{aligned}I_{ij}' = \sum_{l, m = 1,2,3} O_{il} O_{jm} I_{lm}\end{aligned} \hspace{\stretch{1}}(4.29)
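This transformation rule can be checked against the discrete definition (4.27): rotating every mass position and recomputing I_{ij} should agree with transforming the original I_{ij} directly. The sketch below uses made-up masses and positions, and also checks that the quadratic form \Omega_i I_{ij} \Omega_j, from which the energy is built, is unchanged.

```python
import math

def inertia(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - r_ai r_aj), as in (4.27)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I

def matvec(O, v):
    return [sum(O[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform(O, I):
    """I'_ij = sum_{lm} O_il O_jm I_lm, as in (4.29)."""
    return [[sum(O[i][l] * O[j][m] * I[l][m]
                 for l in range(3) for m in range(3))
             for j in range(3)] for i in range(3)]

def quad(w, I):
    """Quadratic form sum_ij w_i I_ij w_j (no overall factor applied)."""
    return sum(w[i] * I[i][j] * w[j] for i in range(3) for j in range(3))

c, s = math.cos(0.8), math.sin(0.8)
O = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
masses = [1.0, 2.0, 0.5]                                          # made up
positions = [[1.0, 0.0, 0.0], [0.0, 2.0, 1.0], [-1.0, 1.0, 3.0]]  # made up
omega = [0.3, -0.2, 1.0]

I = inertia(masses, positions)
I_from_rule = transform(O, I)
I_recomputed = inertia(masses, [matvec(O, r) for r in positions])
```

Agreement between `I_from_rule` and `I_recomputed` is what "transforms as a product of vectors" amounts to in this example.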

Another example: the completely antisymmetric rank 3, 3-tensor

\begin{aligned}\epsilon_{ijk}\end{aligned} \hspace{\stretch{1}}(4.30)


In Newtonian dynamics we have

\begin{aligned}m \ddot{\mathbf{r}} = \mathbf{f}\end{aligned} \hspace{\stretch{1}}(5.31)

An equation of motion should be expressed in terms of vectors. This equation is written in a way that shows that the law of physics is independent of the choice of coordinates. We can do this in the context of tensor algebra as well. Ironically, this will require us to explicitly work with the coordinate representation, but this work will be augmented by the fact that we require our tensors to transform in specific ways.

In Newtonian mechanics we can look to symmetries and the invariance of the action with respect to those symmetries to express the equations of motion. Our symmetries in Newtonian mechanics leave the action invariant with respect to spatial translation and with respect to rotation.

We want to express relativistic dynamics in a similar way, and will have to express the action as a Lorentz scalar. We are going to impose the symmetries of the Poincare group to determine the relativistic laws of dynamics, and the next task will be to consider the possibilities for our relativistic action, and see what that action implies for dynamics in a relativistic context.


[1] L.D. Landau and E.M. Lifshits. The classical theory of fields. Butterworth-Heinemann, 1980.

Posted in Math and Physics Learning.

PHY356F: Quantum Mechanics I. Dec 7 2010 ; Lecture 11 notes. Rotations and Angular momentum.

Posted by peeterjoot on December 7, 2010


This time. Rotations (chapter 26).

Why are we doing the math? Because it applies to physical systems. Slides of IBM’s SEM quantum coral and others shown and discussed.

PICTURE: Standard right handed coordinate system with point (x,y,z). We’d like to discuss how to represent this point in other coordinate systems, such as one with the x,y axes rotated to x',y' through an angle \phi.

Our problem is to find the representation of this point in the rotated coordinate system, going from (x,y,z) to (x', y', z').

There’s clearly a relationship between the representations. That relationship between x', y', z' and x,y,z for a counter-clockwise rotation about the z axis is

\begin{aligned}x' &= x \cos \phi - y \sin\phi \\ y' &= x \sin \phi + y \cos\phi \\ z' &= z\end{aligned} \hspace{\stretch{1}}(13.214)

Treat (x,y,z) and (x',y',z') like vectors and write

\begin{aligned}\begin{bmatrix}x'  \\ y' \\ z' \end{bmatrix}=\begin{bmatrix}\cos \phi &- \sin\phi & 0 \\ \sin \phi & \cos\phi & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x  \\ y \\ z \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(13.217)


\begin{aligned}\begin{bmatrix}x'  \\ y' \\ z' \end{bmatrix}=R_z(\phi)\begin{bmatrix}x  \\ y \\ z \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(13.218)

Q: Is R_z(\phi) a unitary operator?

Definition: U is unitary if U^\dagger U = \mathbf{1}, where \mathbf{1} is the identity operator. We take Hermitian conjugates, which in this case is just the transpose since all elements of the matrix are real, and multiply

\begin{aligned}(R_z(\phi))^\dagger R_z(\phi) &=\begin{bmatrix}\cos \phi & \sin\phi & 0 \\ -\sin \phi & \cos\phi & 0 \\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}\cos \phi &- \sin\phi & 0 \\ \sin \phi & \cos\phi & 0 \\ 0 & 0 & 1\end{bmatrix} \\ &=\begin{bmatrix}\cos^2 \phi + \sin^2\phi  & -\sin\phi \cos\phi  + \sin\phi \cos\phi  & 0 \\ -\cos\phi \sin\phi  + \cos\phi \sin\phi  & \cos^2\phi  + \sin^2 \phi & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \\ &=\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \\ &= \mathbf{1}\end{aligned}

Apply the above to a vector \mathbf{v} = (v_x, v_y, v_z) and write \mathbf{v}' = (v_x', v_y', v_z'). These are related as

\begin{aligned}\mathbf{v}' = R_z(\phi) \mathbf{v}\end{aligned} \hspace{\stretch{1}}(13.219)

Now we want to consider the infinitesimal case where we allow the rotation angle to get arbitrarily small. Consider this specific z axis rotation case, and assume that \phi is very small. Let \phi = \epsilon and write

\begin{aligned}\mathbf{v}' &=\begin{bmatrix}v_x'  \\ v_y' \\ v_z' \end{bmatrix}=R_z(\phi)\begin{bmatrix}v_x \\ v_y \\ v_z \end{bmatrix}=\begin{bmatrix}\cos \epsilon &- \sin\epsilon & 0 \\ \sin \epsilon & \cos\epsilon & 0 \\ 0 & 0 & 1\end{bmatrix} \mathbf{v} \\ &\approx\begin{bmatrix}1 &- \epsilon & 0 \\ \epsilon & 1 & 0 \\ 0 & 0 & 1\end{bmatrix} \mathbf{v} =\left(\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix} +\begin{bmatrix}0 &- \epsilon & 0 \\ \epsilon & 0 & 0 \\ 0 & 0 & 0\end{bmatrix} \right)\mathbf{v} \end{aligned} \hspace{\stretch{1}}(13.220)


\begin{aligned}S_z = i \hbar\begin{bmatrix}0 &- 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0\end{bmatrix} \end{aligned} \hspace{\stretch{1}}(13.222)

which is the generator of infinitesimal rotations about the z axis.

Our rotated coordinate vector becomes

\begin{aligned}\mathbf{v}' &= \left(\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix} +\frac{i \hbar \epsilon}{i\hbar}\begin{bmatrix}0 &- 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0\end{bmatrix} \right)\mathbf{v} \\ &=\left(\mathbf{1} + \frac{\epsilon}{i \hbar} S_z\right)\mathbf{v}\end{aligned}


\begin{aligned}\mathbf{v}'=\left(\mathbf{1} - \frac{i \epsilon}{\hbar} S_z\right)\mathbf{v}\end{aligned} \hspace{\stretch{1}}(13.223)

Many infinitesimal rotations can be combined to create a finite rotation via

\begin{aligned}\lim_{N \rightarrow \infty} \left( 1 + \frac{\alpha}{N} \right)^N = e^\alpha\end{aligned} \hspace{\stretch{1}}(13.224)

\begin{aligned}\alpha = -i \phi S_z/\hbar\end{aligned} \hspace{\stretch{1}}(13.225)

For a finite rotation

\begin{aligned}\mathbf{v}'=e^{ -i \frac{\phi S_z}{\hbar} }\mathbf{v}\end{aligned} \hspace{\stretch{1}}(13.226)
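The claim that many infinitesimal rotations build up a finite one can be watched happening numerically. This sketch multiplies N copies of the first-order matrix for angle \phi/N and compares the product with R_z(\phi); the function names are made up.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def infinitesimal(eps):
    """First-order rotation about z: identity plus eps times the antisymmetric block."""
    return [[1.0, -eps, 0.0], [eps, 1.0, 0.0], [0.0, 0.0, 1.0]]

def compose(phi, n):
    """Product of n infinitesimal rotations, each through phi / n."""
    step = infinitesimal(phi / n)
    result = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, step)
    return result

def rot_z(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_error(phi, n):
    """Largest element-wise difference between the product and R_z(phi)."""
    approx, exact = compose(phi, n), rot_z(phi)
    return max(abs(approx[i][j] - exact[i][j])
               for i in range(3) for j in range(3))
```

Calling `max_error(1.0, n)` for increasing `n` should show the discrepancy shrinking roughly like 1/n, consistent with the limit (13.224).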

Now think about transforming g(x,y,z), an arbitrary function. Take \epsilon to be very small so that

\begin{aligned}x' &= x \cos \phi - y \sin\phi = x \cos \epsilon - y \sin\epsilon \approx x - y \epsilon \\ y' &= x \sin \phi + y \cos\phi = x \sin \epsilon + y \cos\epsilon \approx x \epsilon + y \\ z' &= z\end{aligned} \hspace{\stretch{1}}(13.227)

Question: Why can we assume that \epsilon is small?

Answer: We declare it to be small because it is simpler, and eventually build up to the general case where it is larger. We want to master the easy task before moving on to the more difficult ones.

Our function is now transformed

\begin{aligned}g(x', y', z') &\approx g( x - y \epsilon, y + x \epsilon, z) \\ &= g( x , y , z) - \epsilon y \frac{\partial {g}}{\partial {x}} + \epsilon x \frac{\partial {g}}{\partial {y}} + \cdots \\ &=\left( \mathbf{1} - \epsilon y \frac{\partial {}}{\partial {x}}+ \epsilon x \frac{\partial {}}{\partial {y}}\right)g( x, y ,z )\end{aligned}
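As a sanity check on this first order expansion, here is a small numerical sketch. The test function g and the sample point are arbitrary choices of mine, made only for illustration. If the expansion is right, the error should fall off like \epsilon^2.

```python
import math

def g(x, y, z):
    """Arbitrary smooth test function (an illustrative assumption)."""
    return math.sin(x) * math.exp(0.5 * y) + x * y * z

def dg_dx(x, y, z):
    """Analytic partial derivative of g with respect to x."""
    return math.cos(x) * math.exp(0.5 * y) + y * z

def dg_dy(x, y, z):
    """Analytic partial derivative of g with respect to y."""
    return 0.5 * math.sin(x) * math.exp(0.5 * y) + x * z

def rotated_exact(x, y, z, eps):
    """g evaluated at the exactly rotated point (x', y', z')."""
    xp = x * math.cos(eps) - y * math.sin(eps)
    yp = x * math.sin(eps) + y * math.cos(eps)
    return g(xp, yp, z)

def rotated_first_order(x, y, z, eps):
    """(1 - eps y d/dx + eps x d/dy) applied to g at (x, y, z)."""
    return g(x, y, z) - eps * y * dg_dx(x, y, z) + eps * x * dg_dy(x, y, z)

def first_order_error(eps, point=(0.3, -1.2, 0.8)):
    """Absolute difference between the exact and first order values."""
    x, y, z = point
    return abs(rotated_exact(x, y, z, eps) - rotated_first_order(x, y, z, eps))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, first_order_error(eps))
```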

Recall that the coordinate definition of the angular momentum operator is

\begin{aligned}L_z = -i \hbar \left( x \frac{\partial {}}{\partial {y}} - y \frac{\partial {}}{\partial {x}} \right) = x p_y - y p_x\end{aligned} \hspace{\stretch{1}}(13.230)

We can now write

\begin{aligned}g(x', y', z') &=\left( \mathbf{1} +\frac{-i \hbar \epsilon}{-i\hbar} \left(x \frac{\partial {}}{\partial {y}}- y \frac{\partial {}}{\partial {x}}\right)\right)g( x, y ,z ) \\ &=\left( \mathbf{1} +\frac{i \epsilon}{\hbar} L_z\right)g( x, y ,z )\end{aligned}

For a finite rotation with angle \phi we have

\begin{aligned}g(x', y', z') =e^{i \frac{\phi L_z}{\hbar}}g( x, y ,z )\end{aligned} \hspace{\stretch{1}}(13.231)

Question: Somebody says that the rotation is clockwise, not counterclockwise.

I didn’t follow the reasoning briefly mentioned on the board since it looks right to me. Perhaps this is the age old mixup between rotating the coordinates and the basis vectors. Review what’s in the text carefully. We can also check by rotating a ket and examining how the state representation of that ket changes under rotation. We have

\begin{aligned}{\lvert {x', y', z'} \rangle} = {\lvert {x - \epsilon y, y + \epsilon x, z} \rangle}\end{aligned} \hspace{\stretch{1}}(13.232)


\begin{aligned}\left\langle{{\Psi}} \vert {{x', y', z'}}\right\rangle &= \Psi^{*}(x', y', z') \\ &=\Psi^{*}(x - \epsilon y, y + \epsilon x, z) \\ &=\Psi^{*}(x , y , z) - \epsilon y \frac{\partial {\Psi^{*}}}{\partial {x}}+ \epsilon x \frac{\partial {\Psi^{*}}}{\partial {y}} \\ &=\left( \mathbf{1} + \frac{i \epsilon} {\hbar} L_z \right) \Psi^{*}(x , y , z) \end{aligned}

Taking the complex conjugate we have

\begin{aligned}\Psi(x', y', z') = \left( \mathbf{1} - \frac{i \epsilon} {\hbar} L_z \right) \Psi(x , y , z) \end{aligned} \hspace{\stretch{1}}(13.233)

For infinitesimal rotations about the z axis we have for functions

\begin{aligned}\Psi(x', y', z') =e^{ - \frac{i \epsilon} {\hbar} L_z } \Psi(x , y , z) \end{aligned} \hspace{\stretch{1}}(13.234)

For finite rotations of a vector about the z axis we have

\begin{aligned}\mathbf{v}'=e^{ - \frac{i \phi S_z} {\hbar} } \mathbf{v}\end{aligned} \hspace{\stretch{1}}(13.235)

and for functions

\begin{aligned}\Psi(x', y', z') =e^{ - \frac{i \phi L_z} {\hbar} } \Psi(x , y , z) \end{aligned} \hspace{\stretch{1}}(13.236)

Vatche has mentioned some devices being researched right now where there is an attempt to isolate the spin orientation so that, say, only spin up or spin down electrons are allowed to flow. There are some possibly interesting applications here to quantum computation. Can we make a quantum computing device that is actually usable? We can make NAND devices as mentioned in the article above. Can this be scaled? We don’t know how to do this yet.

Recall that one description of a “particle” that has both a position and spin representation is

\begin{aligned}{\lvert {\Psi} \rangle} = {\lvert {u} \rangle} \otimes {\lvert {s m} \rangle}\end{aligned} \hspace{\stretch{1}}(13.237)

where we have a tensor product of kets. One usually just writes the simpler

\begin{aligned}{\lvert {u} \rangle} \otimes {\lvert {s m} \rangle} \equiv {\lvert {u} \rangle} {\lvert {s m} \rangle} \end{aligned} \hspace{\stretch{1}}(13.238)

An example of the above is

\begin{aligned}\begin{bmatrix}u_1(\mathbf{r}) \\ u_2(\mathbf{r}) \\ u_3(\mathbf{r}) \\ \end{bmatrix}= \Bigl( {\langle {\mathbf{r}} \rvert} {\langle { s m} \rvert} \Bigr) {\lvert {\Psi} \rangle}\end{aligned} \hspace{\stretch{1}}(13.239)

where u_1 is spin component one. For s=1 this would be m=-1, 0, 1.

Here we have also used

\begin{aligned}{\lvert {\mathbf{r}} \rangle}&= {\lvert {x} \rangle}\otimes{\lvert {y} \rangle}\otimes{\lvert {z} \rangle} \\ &={\lvert {x} \rangle}{\lvert {y} \rangle}{\lvert {z} \rangle} \\ &={\lvert {x y z} \rangle}\end{aligned}

We can now ask the question of how this thing transforms. We transform each component of this as a vector. The transformation of

\begin{aligned}\begin{bmatrix}u_1(\mathbf{r}) \\ u_2(\mathbf{r}) \\ u_3(\mathbf{r}) \end{bmatrix}\end{aligned}

results in

\begin{aligned}{\begin{bmatrix}u_1(\mathbf{r}) \\ u_2(\mathbf{r}) \\ u_3(\mathbf{r}) \end{bmatrix}}'=e^{ -i \phi (S_z + L_z)/\hbar }\begin{bmatrix}u_1(\mathbf{r}) \\ u_2(\mathbf{r}) \\ u_3(\mathbf{r}) \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(13.240)

Or with J_z = S_z + L_z

\begin{aligned}{\lvert {\Psi'} \rangle} = e^{-i \phi J_z/\hbar } {\lvert {\Psi} \rangle} \end{aligned} \hspace{\stretch{1}}(13.241)

Observe that this separates out nicely with the S_z operation acting on the vector parts, and the L_z operator acting on the functional dependence.


Derivation of the spherical polar Laplacian

Posted by peeterjoot on October 9, 2010



In [1] was a Geometric Algebra derivation of the 2D polar Laplacian by squaring the gradient. In [2] was a factorization of the spherical polar unit vectors in a tidy compact form. Here both these ideas are utilized to derive the spherical polar form for the Laplacian, an operation that is strictly algebraic (squaring the gradient) provided we operate on the unit vectors correctly.

Our rotation multivector.

Our starting point is a pair of rotations. We rotate first in the x,y plane by \phi

\begin{aligned}\mathbf{x} &\rightarrow \mathbf{x}' = \tilde{R_\phi} \mathbf{x} R_\phi \\ i &\equiv \mathbf{e}_1 \mathbf{e}_2 \\ R_\phi &= e^{i \phi/2}\end{aligned} \hspace{\stretch{1}}(2.1)

Then apply a rotation in the \mathbf{e}_3 \wedge (\tilde{R_\phi} \mathbf{e}_1 R_\phi) = \tilde{R_\phi} \mathbf{e}_3 \mathbf{e}_1 R_\phi plane

\begin{aligned}\mathbf{x}' &\rightarrow \mathbf{x}'' = \tilde{R_\theta} \mathbf{x}' R_\theta \\ R_\theta &= e^{ \tilde{R_\phi} \mathbf{e}_3 \mathbf{e}_1 R_\phi \theta/2 } = \tilde{R_\phi} e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } R_\phi\end{aligned} \hspace{\stretch{1}}(2.4)

The composition of rotations now gives us

\begin{aligned}\mathbf{x}&\rightarrow \mathbf{x}'' = \tilde{R_\theta} \tilde{R_\phi} \mathbf{x} R_\phi R_\theta = \tilde{R} \mathbf{x} R \\ R &= R_\phi R_\theta = e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 }.\end{aligned}

Expressions for the unit vectors.

The unit vectors in the rotated frame can now be calculated. With I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 we can calculate

\begin{aligned}\hat{\boldsymbol{\phi}} &= \tilde{R} \mathbf{e}_2 R  \\ \hat{\mathbf{r}} &= \tilde{R} \mathbf{e}_3 R  \\ \hat{\boldsymbol{\theta}} &= \tilde{R} \mathbf{e}_1 R\end{aligned} \hspace{\stretch{1}}(3.6)

Performing these we get

\begin{aligned}\hat{\boldsymbol{\phi}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_2 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_2 e^{ i \phi },\end{aligned}


\begin{aligned}\hat{\mathbf{r}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_3 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } (\mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta ) e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_3 \cos\theta +\mathbf{e}_1 \sin\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \mathbf{e}_1 \sin\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } ) \\ &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta},\end{aligned}


\begin{aligned}\hat{\boldsymbol{\theta}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_1 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } ( \mathbf{e}_1 \cos\theta - \mathbf{e}_3 \sin\theta ) e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_1 \cos\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} \cos\theta - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} (\cos\theta + \hat{\boldsymbol{\phi}} i \mathbf{e}_3 \sin\theta ) \\ &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned}

Summarizing, these are

\begin{aligned}\hat{\boldsymbol{\phi}} &= \mathbf{e}_2 e^{ i \phi } \\ \hat{\mathbf{r}} &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta} \\ \hat{\boldsymbol{\theta}} &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned} \hspace{\stretch{1}}(3.9)

Derivatives of the unit vectors.

We’ll need the partials. Most of these can be computed from 3.9 by inspection, and are

\begin{aligned}\partial_r \hat{\boldsymbol{\phi}} &= 0 \\ \partial_r \hat{\mathbf{r}} &= 0 \\ \partial_r \hat{\boldsymbol{\theta}} &= 0 \\ \partial_\theta \hat{\boldsymbol{\phi}} &= 0 \\ \partial_\theta \hat{\mathbf{r}} &= \hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \\ \partial_\theta \hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} \\ \partial_\phi \hat{\boldsymbol{\phi}} &= \hat{\boldsymbol{\phi}} i \\ \partial_\phi \hat{\mathbf{r}} &= \hat{\boldsymbol{\phi}} \sin\theta \\ \partial_\phi \hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\phi}} \cos\theta\end{aligned} \hspace{\stretch{1}}(4.12)
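These partials can be spot checked in plain coordinates. The sketch below assumes the usual Cartesian components of the spherical unit vectors, which is what 3.9 expands to, and uses central finite differences. It also assumes that the multivector products \hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} and \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} reduce to \hat{\boldsymbol{\theta}} and -\hat{\mathbf{r}} respectively. All function names are mine.

```python
import math

def r_hat(theta, phi):
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def theta_hat(theta, phi):
    return (math.cos(theta) * math.cos(phi),
            math.cos(theta) * math.sin(phi),
            -math.sin(theta))

def phi_hat(theta, phi):
    return (-math.sin(phi), math.cos(phi), 0.0)

def central_diff(f, theta, phi, wrt, h=1e-6):
    """Central finite difference of a vector function in theta or phi."""
    if wrt == "theta":
        hi, lo = f(theta + h, phi), f(theta - h, phi)
    else:
        hi, lo = f(theta, phi + h), f(theta, phi - h)
    return tuple((p - q) / (2 * h) for p, q in zip(hi, lo))

def scaled(v, s):
    return tuple(s * c for c in v)

def close(u, v, tol=1e-8):
    return all(abs(p - q) < tol for p, q in zip(u, v))

theta, phi = 0.8, 1.9  # arbitrary sample angles
checks = {
    "d_theta r_hat = theta_hat":
        close(central_diff(r_hat, theta, phi, "theta"), theta_hat(theta, phi)),
    "d_theta theta_hat = -r_hat":
        close(central_diff(theta_hat, theta, phi, "theta"),
              scaled(r_hat(theta, phi), -1.0)),
    "d_theta phi_hat = 0":
        close(central_diff(phi_hat, theta, phi, "theta"), (0.0, 0.0, 0.0)),
    "d_phi r_hat = phi_hat sin(theta)":
        close(central_diff(r_hat, theta, phi, "phi"),
              scaled(phi_hat(theta, phi), math.sin(theta))),
    "d_phi theta_hat = phi_hat cos(theta)":
        close(central_diff(theta_hat, theta, phi, "phi"),
              scaled(phi_hat(theta, phi), math.cos(theta))),
}
for name, ok in checks.items():
    print(name, ok)
```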

Expanding the Laplacian.

We note that the line element is d\mathbf{r} = \hat{\mathbf{r}} dr + \hat{\boldsymbol{\theta}} r d\theta + \hat{\boldsymbol{\phi}} r\sin\theta d\phi, so our gradient in spherical coordinates is

\begin{aligned}\boldsymbol{\nabla} &= \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi.\end{aligned} \hspace{\stretch{1}}(5.21)

We can now evaluate the Laplacian

\begin{aligned}\boldsymbol{\nabla}^2 &=\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) \cdot\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right).\end{aligned} \hspace{\stretch{1}}(5.22)

Evaluating these one set at a time we have

\begin{aligned}\hat{\mathbf{r}} \partial_r \cdot \left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) &= \partial_{rr},\end{aligned}


\begin{aligned}\frac{1}{{r}} \hat{\boldsymbol{\theta}} \partial_\theta \cdot \left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right)&=\frac{1}{{r}} \left\langle{{\hat{\boldsymbol{\theta}} \left(\hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \partial_r + \hat{\mathbf{r}} \partial_{\theta r}+ \frac{\hat{\boldsymbol{\theta}}}{r} \partial_{\theta\theta} + \frac{1}{{r}} \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} \partial_\theta+ \hat{\boldsymbol{\phi}} \partial_\theta \frac{1}{{r\sin\theta}} \partial_\phi\right)}}\right\rangle \\ &= \frac{1}{{r}} \partial_r+\frac{1}{{r^2}} \partial_{\theta\theta},\end{aligned}


\begin{aligned}\frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi &\cdot\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) \\ &=\frac{1}{r\sin\theta} \left\langle{{\hat{\boldsymbol{\phi}}\left(\hat{\boldsymbol{\phi}} \sin\theta \partial_r + \hat{\mathbf{r}} \partial_{\phi r} + \hat{\boldsymbol{\phi}} \cos\theta \frac{1}{r} \partial_\theta + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_{\phi \theta }+ \hat{\boldsymbol{\phi}} i \frac{1}{r\sin\theta} \partial_\phi + \hat{\boldsymbol{\phi}} \frac{1}{r\sin\theta} \partial_{\phi \phi }\right)}}\right\rangle \\ &=\frac{1}{{r}} \partial_r+ \frac{\cot\theta}{r^2}\partial_\theta+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi}\end{aligned}

Summing these we have

\begin{aligned}\boldsymbol{\nabla}^2 &=\partial_{rr}+ \frac{2}{r} \partial_r+\frac{1}{{r^2}} \partial_{\theta\theta}+ \frac{\cot\theta}{r^2}\partial_\theta+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi}\end{aligned} \hspace{\stretch{1}}(5.23)

This is often written with a chain rule trick to consolidate the r and \theta partials

\begin{aligned}\boldsymbol{\nabla}^2 \Psi &=\frac{1}{{r}} \partial_{rr} (r \Psi)+ \frac{1}{{r^2 \sin\theta}} \partial_\theta \left( \sin\theta \partial_\theta \Psi \right)+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi} \Psi\end{aligned} \hspace{\stretch{1}}(5.24)

It’s simple to verify that this is identical to 5.23.
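For completeness, a sketch of that verification. Expanding the consolidated r and \theta terms with the product rule gives

\begin{aligned}\frac{1}{{r}} \partial_{rr} (r \Psi) &= \frac{1}{{r}} \partial_r \left( \Psi + r \partial_r \Psi \right) = \partial_{rr} \Psi + \frac{2}{r} \partial_r \Psi \\ \frac{1}{{r^2 \sin\theta}} \partial_\theta \left( \sin\theta \partial_\theta \Psi \right) &= \frac{1}{{r^2}} \partial_{\theta\theta} \Psi + \frac{\cot\theta}{r^2} \partial_\theta \Psi\end{aligned}

which are the first four terms of the operator in 5.23 applied to \Psi. The \phi term is the same in both forms.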


[1] Peeter Joot. Polar form for the gradient and Laplacian. [online].

[2] Peeter Joot. Spherical Polar unit vectors in exponential form. [online].


Rotations using matrix exponentials

Posted by peeterjoot on July 27, 2010



In [1] it is noted in problem 1.3 that any Unitary operator can be expressed in exponential form

\begin{aligned}U = e^{iC},\end{aligned} \hspace{\stretch{1}}(1.1)

where C is Hermitian. This is a powerful result hiding away in this problem. I haven’t actually managed to prove this yet to my satisfaction, but working through some examples is highly worthwhile. In particular it is interesting to compute the matrix C for a rotation matrix. One finds that the matrix for such a rotation operator is in fact one of the Pauli spin matrices, and I found it interesting that this falls out so naturally. Additionally, it is rather slick that one is able to so concisely express the rotation in exponential form, something that is natural and powerful in complex variable algebra, and also possible using Geometric Algebra using exponentials of bivectors. Here we can do it after all with nothing more than the plain old matrix algebra that everybody is already comfortable with.

The logarithm of the Unitary matrix.

By inspection we can invert 1.1 for C, by taking the logarithm

\begin{aligned}C = -i \ln U.\end{aligned} \hspace{\stretch{1}}(2.2)

The problem becomes one of evaluating the logarithm, or even giving meaning to it. I’ll assume that the functions of matrices that we are interested in are all polynomial in powers of the matrix, as in

\begin{aligned}f(U) = \sum_k \alpha_k U^k,\end{aligned} \hspace{\stretch{1}}(2.3)

and that such series are convergent. Then using a spectral decomposition, possible since Unitary matrices are normal, we can write for diagonal \Sigma = {\begin{bmatrix} \lambda_i \end{bmatrix}}_i

\begin{aligned}U = V \Sigma V^\dagger,\end{aligned} \hspace{\stretch{1}}(2.4)


\begin{aligned}f(U) = V \left( \sum_k \alpha_k \Sigma^k \right) V^\dagger = V {\begin{bmatrix} f(\lambda_i) \end{bmatrix}}_i V^\dagger.\end{aligned} \hspace{\stretch{1}}(2.5)

Provided the logarithm has a convergent power series representation for U, we then have for our Hermitian matrix C

\begin{aligned}C = -i V (\ln \Sigma) V^\dagger\end{aligned} \hspace{\stretch{1}}(2.6)

Evaluate this logarithm for an x,y plane rotation.

Given the rotation matrix

\begin{aligned}U =\begin{bmatrix}\cos\theta & \sin\theta \\ -\sin\theta & \cos\theta\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.7)

we find that the eigenvalues are e^{\pm i\theta}, with eigenvectors proportional to (1, \pm i) respectively. Our decomposition for U is then given by 2.4, and

\begin{aligned}V &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 & 1 \\ i & -i\end{bmatrix} \\ \Sigma &=\begin{bmatrix}e^{i\theta} & 0 \\ 0 & e^{-i\theta}\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.8)

Taking logs we have

\begin{aligned}C&=\frac{1}{2}\begin{bmatrix}1 & 1 \\ i & -i\end{bmatrix}\begin{bmatrix}\theta & 0 \\ 0 & -\theta\end{bmatrix} \begin{bmatrix}1 & -i \\ 1 & i\end{bmatrix} \\ &=\frac{1}{2}\begin{bmatrix}1 & 1 \\ i & -i\end{bmatrix}\begin{bmatrix}\theta  & -i\theta \\ -\theta & -i\theta\end{bmatrix}  \\ &=\begin{bmatrix}0 & -i\theta \\ i\theta & 0\end{bmatrix}.\end{aligned}

With the Pauli matrix

\begin{aligned}\sigma_2 =\begin{bmatrix}0 & -i \\ i & 0\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.10)

we then have for an x,y plane rotation matrix just:

\begin{aligned}C = \theta \sigma_2\end{aligned} \hspace{\stretch{1}}(2.11)


\begin{aligned}U = e^{i \theta \sigma_2}.\end{aligned} \hspace{\stretch{1}}(2.12)

Immediately, since \sigma_2^2 = I, this also provides us with a trigonometric expansion

\begin{aligned}U = I \cos\theta + i \sigma_2 \sin\theta.\end{aligned} \hspace{\stretch{1}}(2.13)
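The whole chain can be confirmed numerically. The following is a minimal sketch using only Python’s built in complex numbers: it rebuilds C from the spectral decomposition 2.8, compares it with \theta \sigma_2, and then sums the power series for e^{iC} to recover the rotation matrix 2.7. The truncated series in `mat_exp` is my own helper, adequate only for small matrices with modest norm.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated power series sum_k a^k / k! (a sketch, not robust)."""
    n = len(a)
    result = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

theta = 0.9  # arbitrary sample angle
s = 1 / math.sqrt(2)
V = [[s, s], [1j * s, -1j * s]]
# -i ln(Sigma) = diag(theta, -theta), since Sigma = diag(e^{i theta}, e^{-i theta})
minus_i_log_sigma = [[theta, 0.0], [0.0, -theta]]
C = matmul(matmul(V, minus_i_log_sigma), dagger(V))

sigma2 = [[0j, -1j], [1j, 0j]]
C_expected = [[theta * x for x in row] for row in sigma2]

U_series = mat_exp([[1j * x for x in row] for row in C])
U_expected = [[math.cos(theta), math.sin(theta)],
              [-math.sin(theta), math.cos(theta)]]

print(max_abs_diff(C, C_expected))
print(max_abs_diff(U_series, U_expected))
```

Both printed differences should be at the level of floating point roundoff.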

By inspection one can see that this takes us full circle back to the original matrix form 2.7 of the rotation. The exponential form of 2.12 has a beauty that is however far superior to the plain old trigonometric matrix that we are comfortable with. All without any geometric algebra or bivector exponentials.

Three dimensional exponential rotation matrices.

By inspection, we can augment our matrix C for a three dimensional rotation in the x,y plane, or a y,z rotation, or a x,z rotation. Those are, respectively

\begin{aligned}U_{x,y}&=\exp\begin{bmatrix}0 & \theta & 0 \\ -\theta & 0 & 0 \\ 0 & 0 & 0\end{bmatrix} \\ U_{y,z}&=\exp\begin{bmatrix}0 & 0 & 0 \\ 0 & 0 & \theta \\ 0 & -\theta & 0 \\ \end{bmatrix} \\ U_{x,z}&=\exp\begin{bmatrix}0 & 0 & \theta \\ 0 & 0 & 0 \\ -\theta & 0 & 0 \\ \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(2.14)

Each of these matrices can be related to each other by similarity transformation using the permutation matrices

\begin{aligned}\begin{bmatrix}0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ \end{bmatrix},\end{aligned}


\begin{aligned}\begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ \end{bmatrix}.\end{aligned}

Exponential matrix form for a Lorentz boost.

The next obvious thing to try with this matrix representation is a Lorentz boost.

\begin{aligned}L =\begin{bmatrix}\cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & \cosh\alpha\end{bmatrix},\end{aligned} \hspace{\stretch{1}}(2.17)

where \cosh\alpha = \gamma, and \tanh\alpha = \beta.

This matrix has a spectral decomposition given by

\begin{aligned}V &= \frac{1}{{\sqrt{2}}}\begin{bmatrix}1 & 1 \\ -1 & 1\end{bmatrix} \\ \Sigma &=\begin{bmatrix}e^\alpha & 0 \\ 0 & e^{-\alpha}\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.18)

Taking logs and computing C we have

\begin{aligned}C&=-\frac{i}{2}\begin{bmatrix}1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix}\alpha & 0 \\ 0 & -\alpha\end{bmatrix} \begin{bmatrix}1 & -1 \\ 1 & 1\end{bmatrix} \\ &=-\frac{i}{2}\begin{bmatrix}1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix}\alpha & -\alpha \\ -\alpha & -\alpha\end{bmatrix} \\ &=i \alpha\begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix}.\end{aligned}

Again we have one of the Pauli spin matrices. This time it is

\begin{aligned}\sigma_1 =\begin{bmatrix}0 & 1 \\ 1 & 0 \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(2.20)

So we can write our Lorentz boost 2.17 as just

\begin{aligned}L = e^{-\alpha \sigma_1} = I \cosh\alpha - \sigma_1 \sinh\alpha.\end{aligned} \hspace{\stretch{1}}(2.21)
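The same kind of numerical check works for the boost. This sketch sums the power series for e^{-\alpha \sigma_1} and compares it with the explicit hyperbolic matrix 2.17. The series helper is again my own, and the rapidity value is an arbitrary sample.

```python
import math

def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated power series sum_k a^k / k! (a sketch, not robust)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def boost(alpha):
    """Explicit boost matrix of 2.17 for rapidity alpha."""
    c, s = math.cosh(alpha), math.sinh(alpha)
    return [[c, -s], [-s, c]]

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

alpha = 0.8  # arbitrary sample rapidity
sigma1 = [[0.0, 1.0], [1.0, 0.0]]
L_series = mat_exp([[-alpha * x for x in row] for row in sigma1])
print(max_abs_diff(L_series, boost(alpha)))

# gamma = cosh(alpha) and beta = tanh(alpha) should satisfy
# gamma = 1 / sqrt(1 - beta^2).
beta, gamma = math.tanh(alpha), math.cosh(alpha)
print(abs(gamma - 1 / math.sqrt(1 - beta ** 2)))
```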

Again by inspection, we can come full circle from this last hyperbolic representation back to the original explicit matrix representation. Quite nifty!

It occurred to me after the fact that the Lorentz boost is not Unitary. Its eigenvalues are not pure complex phases like those of the rotation. That is actually a good hint: characterizing the eigenvalues of a unitary matrix (all of unit modulus, so that their logarithms are purely imaginary) can be used to show that the matrix C = -i V \ln \Sigma V^\dagger is Hermitian.


[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.


On Professor Dmitrevsky’s “the only valid Laplacian definition is the divergence of gradient”.

Posted by peeterjoot on December 2, 2009



To all tyrannical old Professors driven to cruelty by an unending barrage of increasingly ill prepared students.


The text [1] has an excellent general derivation of a number of forms of the gradient, divergence, curl and Laplacian.

This is actually done, not starting with the usual Cartesian forms, but more general definitions.

\begin{aligned}(\text{grad}\  \phi)_i &= \lim_{ds_i \rightarrow 0} \frac{\phi(q_i + dq_i) - \phi(q_i)}{ds_i} \\ \text{div}\  \mathbf{V} &= \lim_{\Delta \tau \rightarrow 0} \frac{1}{{\Delta \tau}} \int_\sigma \mathbf{V} \cdot d\boldsymbol{\sigma} \\ (\text{curl}\  \mathbf{V}) \cdot \mathbf{n} &= \lim_{\Delta \sigma \rightarrow 0} \frac{1}{{\Delta \sigma}} \oint_\lambda \mathbf{V} \cdot d\boldsymbol{\lambda} \\ \text{Laplacian}\  \phi &= \text{div} (\text{grad}\ \phi).\end{aligned} \quad\quad\quad(1)

These are then shown to imply the usual Cartesian definitions, plus provide the means to calculate the general relationships in whatever coordinate system you like. All in all one can’t beat this approach, and I’m not going to try to replicate it, because I can’t improve it in any way by doing so.

Given that, what do I have to say on this topic? Well, way way back in first year electricity and magnetism, my dictator of a prof, the intimidating but diminutive Dmitrevsky, yelled at us repeatedly that one cannot just dot the gradient to form the Laplacian. As far as he was concerned one can only say

\begin{aligned}\text{Laplacian}\  \phi &= \text{div} (\text{grad}\ \phi),\end{aligned} \quad\quad\quad(5)

and never never never, the busted way

\begin{aligned}\text{Laplacian}\  \phi &= (\boldsymbol{\nabla} \cdot \boldsymbol{\nabla}) \phi.\end{aligned} \quad\quad\quad(6)

Because “this only works in Cartesian coordinates”. He probably backed up this assertion with a heartwarming and encouraging statement like “back in the days when University of Toronto was a real school you would have learned this in kindergarten”.

This detail is actually something that has bugged me ever since, because my assumption was that, provided one was careful, why would a change to an alternate coordinate system matter? The gradient is still the gradient, so it seems to me that this ought to be a general way to calculate things.

Here we explore the validity of the dictatorial comments of Prof Dmitrevsky. The key to reconciling intuition and his statement turns out to lie with the fact that one has to let the gradient operate on the unit vectors in the non Cartesian representation as well as the partials, something that wasn’t clear as a first year student. Provided that this is done, the plain old dot product procedure yields the expected results.

This exploration will utilize a two dimensional space as a starting point, transforming from Cartesian to polar form representation. I’ll also utilize a geometric algebra representation of the polar unit vectors.

The gradient in polar form.

Let’s start off with a calculation of the gradient in polar form starting with the Cartesian form. Writing \partial_x = {\partial {}}/{\partial {x}}, \partial_y = {\partial {}}/{\partial {y}}, \partial_r = {\partial {}}/{\partial {r}}, and \partial_\theta = {\partial {}}/{\partial {\theta}}, we want to map

\begin{aligned}\boldsymbol{\nabla} = \mathbf{e}_1 \partial_x + \mathbf{e}_2 \partial_y= \begin{bmatrix}\mathbf{e}_1 & \mathbf{e}_2 \end{bmatrix}\begin{bmatrix}\partial_x \\ \partial_y \end{bmatrix},\end{aligned} \quad\quad\quad(7)

into the same form using \hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \partial_r, and \partial_\theta. With i = \mathbf{e}_1 \mathbf{e}_2 we have

\begin{aligned}\begin{bmatrix}\mathbf{e}_1 \\ \mathbf{e}_2\end{bmatrix}=e^{i\theta}\begin{bmatrix}\hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}}\end{bmatrix}.\end{aligned} \quad\quad\quad(8)

Next we need to do a chain rule expansion of the partial operators to change variables. In matrix form that is

\begin{aligned}\begin{bmatrix}\frac{\partial {}}{\partial {x}} \\ \frac{\partial {}}{\partial {y}} \end{bmatrix}= \begin{bmatrix}\frac{\partial {r}}{\partial {x}} &          \frac{\partial {\theta}}{\partial {x}} \\ \frac{\partial {r}}{\partial {y}} &          \frac{\partial {\theta}}{\partial {y}} \end{bmatrix}\begin{bmatrix}\frac{\partial {}}{\partial {r}} \\ \frac{\partial {}}{\partial {\theta}} \end{bmatrix}.\end{aligned} \quad\quad\quad(9)

To calculate these partials we drop back to coordinates

\begin{aligned}x^2 + y^2 &= r^2 \\ \frac{y}{x} &= \tan\theta \\ \frac{x}{y} &= \cot\theta.\end{aligned} \quad\quad\quad(10)

From this we calculate

\begin{aligned}\frac{\partial {r}}{\partial {x}} &= \cos\theta \\ \frac{\partial {r}}{\partial {y}} &= \sin\theta \\  \frac{1}{{r\cos\theta}} &= \frac{\partial {\theta}}{\partial {y}} \frac{1}{{\cos^2\theta}} \\ \frac{1}{{r\sin\theta}} &= -\frac{\partial {\theta}}{\partial {x}} \frac{1}{{\sin^2\theta}},\end{aligned} \quad\quad\quad(13)


\begin{aligned}\begin{bmatrix}\frac{\partial {}}{\partial {x}} \\ \frac{\partial {}}{\partial {y}} \end{bmatrix}= \begin{bmatrix}\cos\theta & -\sin\theta/r \\ \sin\theta & \cos\theta/r\end{bmatrix}\begin{bmatrix}\frac{\partial {}}{\partial {r}} \\ \frac{\partial {}}{\partial {\theta}} \end{bmatrix}.\end{aligned} \quad\quad\quad(17)
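The entries of this matrix are easy to get wrong by a sign or a factor of r, so here is a quick finite difference check. It is a sketch under the assumption that \theta is the usual `atan2(y, x)` angle, evaluated at an arbitrary point away from the branch cut. Function names are mine.

```python
import math

def polar(x, y):
    """Return (r, theta) for the Cartesian point (x, y)."""
    return math.hypot(x, y), math.atan2(y, x)

def jacobian_fd(x, y, h=1e-6):
    """Central difference [[dr/dx, dtheta/dx], [dr/dy, dtheta/dy]]."""
    r_xp, t_xp = polar(x + h, y)
    r_xm, t_xm = polar(x - h, y)
    r_yp, t_yp = polar(x, y + h)
    r_ym, t_ym = polar(x, y - h)
    return [[(r_xp - r_xm) / (2 * h), (t_xp - t_xm) / (2 * h)],
            [(r_yp - r_ym) / (2 * h), (t_yp - t_ym) / (2 * h)]]

def jacobian_formula(x, y):
    """The matrix of equation (17), evaluated at (x, y)."""
    r, theta = polar(x, y)
    return [[math.cos(theta), -math.sin(theta) / r],
            [math.sin(theta), math.cos(theta) / r]]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

x, y = 1.3, 0.7  # arbitrary sample point
print(max_abs_diff(jacobian_fd(x, y), jacobian_formula(x, y)))
```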

We can now write down the gradient in polar form, prior to final simplification

\begin{aligned}\boldsymbol{\nabla} = e^{i\theta}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}\begin{bmatrix}\cos\theta & -\sin\theta/r \\ \sin\theta & \cos\theta/r\end{bmatrix}\begin{bmatrix}\frac{\partial {}}{\partial {r}} \\ \frac{\partial {}}{\partial {\theta}} \end{bmatrix}.\end{aligned} \quad\quad\quad(18)

Observe that we can factor a unit vector

\begin{aligned}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}=\hat{\mathbf{r}}\begin{bmatrix}1 & i\end{bmatrix}=\begin{bmatrix}i & 1\end{bmatrix}\hat{\boldsymbol{\theta}}\end{aligned} \quad\quad\quad(19)

so the 1,1 element of the matrix product in the interior is

\begin{aligned}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}\begin{bmatrix}\cos\theta \\ \sin\theta \end{bmatrix}=\hat{\mathbf{r}} e^{i\theta} = e^{-i\theta}\hat{\mathbf{r}}.\end{aligned} \quad\quad\quad(20)

Similarly, the 1,2 element of the matrix product in the interior is

\begin{aligned}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}\begin{bmatrix}-\sin\theta/r \\ \cos\theta/r\end{bmatrix}=\frac{1}{{r}} e^{-i\theta} \hat{\boldsymbol{\theta}}.\end{aligned} \quad\quad\quad(21)

The exponentials cancel nicely, leaving, after a final multiplication, the polar form for the gradient

\begin{aligned}\boldsymbol{\nabla} = \hat{\mathbf{r}} \partial_r + \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta\end{aligned} \quad\quad\quad(22)

That was a fun way to get the result, although we could have just looked it up. We want to use this now to calculate the Laplacian.

Polar form Laplacian for the plane.

We are now ready to look at the Laplacian. First let’s do it the first year electricity and magnetism course way. We look up the formula for polar form divergence, the one we were supposed to have memorized in kindergarten, and find it to be

\begin{aligned}\text{div}\ \mathbf{A} = \partial_r A_r + \frac{1}{{r}} A_r + \frac{1}{{r}} \partial_\theta A_\theta\end{aligned} \quad\quad\quad(23)

We can now apply this to the gradient vector in polar form which has components \boldsymbol{\nabla}_r = \partial_r, and \boldsymbol{\nabla}_\theta = (1/r)\partial_\theta, and get

\begin{aligned}\text{div}\ \text{grad} = \partial_{rr} + \frac{1}{{r}} \partial_r + \frac{1}{{r^2}} \partial_{\theta\theta}\end{aligned} \quad\quad\quad(24)

This is the expected result, and what we should get by performing \boldsymbol{\nabla} \cdot \boldsymbol{\nabla} in polar form. Now, let’s do it the wrong way, dotting our gradient with itself.

\begin{aligned}\boldsymbol{\nabla} \cdot \boldsymbol{\nabla} &= \left(\partial_r, \frac{1}{{r}} \partial_\theta\right) \cdot \left(\partial_r, \frac{1}{{r}} \partial_\theta\right) \\ &= \partial_{rr} + \frac{1}{{r}} \partial_\theta \left(\frac{1}{{r}} \partial_\theta\right) \\ &= \partial_{rr} + \frac{1}{{r^2}} \partial_{\theta\theta}\end{aligned}

This is wrong! So is Dmitrevsky right that this procedure is flawed, or do you spot the mistake? I have also cruelly written this out in a way that obscures the error and highlights the source of the confusion.

The problem is that our unit vectors are functions, and they must also be included in the application of our partials. Using the coordinate polar form without explicitly putting in the unit vectors is how we go wrong. Here’s the right way

\begin{aligned}\boldsymbol{\nabla} \cdot \boldsymbol{\nabla} &=\left( \hat{\mathbf{r}} \partial_r + \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) \cdot \left( \hat{\mathbf{r}} \partial_r + \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) \\ &=\hat{\mathbf{r}} \cdot \partial_r \left(\hat{\mathbf{r}} \partial_r \right)+\hat{\mathbf{r}} \cdot \partial_r \left( \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right)+\hat{\boldsymbol{\theta}} \cdot \frac{1}{{r}} \partial_\theta \left( \hat{\mathbf{r}} \partial_r \right)+\hat{\boldsymbol{\theta}} \cdot \frac{1}{{r}} \partial_\theta \left( \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) \\ \end{aligned}

Now we need the derivatives of our unit vectors. The \partial_r derivatives are zero since these have no radial dependence, but we do have \theta partials

\begin{aligned}\partial_\theta \hat{\mathbf{r}} &=\partial_\theta \left( \mathbf{e}_1 e^{i\theta} \right) \\ &=\mathbf{e}_1 \mathbf{e}_1 \mathbf{e}_2 e^{i\theta} \\ &=\mathbf{e}_2 e^{i\theta} \\ &=\hat{\boldsymbol{\theta}},\end{aligned}


\begin{aligned}\partial_\theta \hat{\boldsymbol{\theta}} &=\partial_\theta \left( \mathbf{e}_2 e^{i\theta} \right) \\ &=\mathbf{e}_2 \mathbf{e}_1 \mathbf{e}_2 e^{i\theta} \\ &=-\mathbf{e}_1 e^{i\theta} \\ &=-\hat{\mathbf{r}}.\end{aligned}

(One should be able to get the same results if these unit vectors were written out in full as \hat{\mathbf{r}} = \mathbf{e}_1 \cos\theta + \mathbf{e}_2 \sin\theta, and \hat{\boldsymbol{\theta}} = \mathbf{e}_2 \cos\theta - \mathbf{e}_1 \sin\theta, instead of using the obscure geometric algebra quaternionic rotation exponential operators.)
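Substituting these into the four terms of the expansion, and using \hat{\mathbf{r}} \cdot \hat{\boldsymbol{\theta}} = 0, the terms evaluate one at a time as

\begin{aligned}\hat{\mathbf{r}} \cdot \partial_r \left(\hat{\mathbf{r}} \partial_r \right) &= \partial_{rr} \\ \hat{\mathbf{r}} \cdot \partial_r \left( \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) &= (\hat{\mathbf{r}} \cdot \hat{\boldsymbol{\theta}}) \partial_r \left( \frac{1}{{r}} \partial_\theta \right) = 0 \\ \hat{\boldsymbol{\theta}} \cdot \frac{1}{{r}} \partial_\theta \left( \hat{\mathbf{r}} \partial_r \right) &= \frac{1}{{r}} \hat{\boldsymbol{\theta}} \cdot \left( \hat{\boldsymbol{\theta}} \partial_r + \hat{\mathbf{r}} \partial_{\theta r} \right) = \frac{1}{{r}} \partial_r \\ \hat{\boldsymbol{\theta}} \cdot \frac{1}{{r}} \partial_\theta \left( \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) &= \frac{1}{{r^2}} \hat{\boldsymbol{\theta}} \cdot \left( -\hat{\mathbf{r}} \partial_\theta + \hat{\boldsymbol{\theta}} \partial_{\theta\theta} \right) = \frac{1}{{r^2}} \partial_{\theta\theta}\end{aligned}

The third line is where the extra (1/r) \partial_r term comes from. It exists only because \partial_\theta acts on \hat{\mathbf{r}}.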

Having calculated these partials we now have

\begin{aligned}(\boldsymbol{\nabla} \cdot \boldsymbol{\nabla}) =\partial_{rr} +\frac{1}{{r}} \partial_r +\frac{1}{{r^2}} \partial_{\theta\theta} \end{aligned} \quad\quad\quad(25)

Exactly what it should be, and what we got with the coordinate form of the divergence operator when applying the “Laplacian equals the divergence of the gradient” rule blindly. We see that the expectation that \boldsymbol{\nabla} \cdot \boldsymbol{\nabla} is the Laplacian in more than the Cartesian coordinate system is not invalid, but that care is required to apply the chain rule to all functions. We also see that expressing a vector in coordinate form when the basis vectors are position dependent is also a path to danger.
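To make the difference concrete, here is a small numerical sketch comparing the naive coordinate only dot product with the full result, for the test function f = x^2 y = r^3 \cos^2\theta \sin\theta, whose Cartesian Laplacian is 2y. The test function, sample point, and finite difference step are my own choices for illustration.

```python
import math

def f(r, theta):
    """Test function x^2 y expressed in polar coordinates."""
    return r ** 3 * math.cos(theta) ** 2 * math.sin(theta)

def d_r(r, theta, h=1e-4):
    """Central difference approximation of the r partial."""
    return (f(r + h, theta) - f(r - h, theta)) / (2 * h)

def d_rr(r, theta, h=1e-4):
    """Central difference approximation of the second r partial."""
    return (f(r + h, theta) - 2 * f(r, theta) + f(r - h, theta)) / h ** 2

def d_tt(r, theta, h=1e-4):
    """Central difference approximation of the second theta partial."""
    return (f(r, theta + h) - 2 * f(r, theta) + f(r, theta - h)) / h ** 2

def naive_laplacian(r, theta):
    """Dotting the coordinate tuples, ignoring unit vector derivatives."""
    return d_rr(r, theta) + d_tt(r, theta) / r ** 2

def full_laplacian(r, theta):
    """Equation (25), including the (1/r) d/dr term."""
    return naive_laplacian(r, theta) + d_r(r, theta) / r

def exact_laplacian(r, theta):
    """Cartesian Laplacian of x^2 y is 2 y = 2 r sin(theta)."""
    return 2 * r * math.sin(theta)

r, theta = 1.5, 0.6  # arbitrary sample point
print("naive:", naive_laplacian(r, theta))
print("full: ", full_laplacian(r, theta))
print("exact:", exact_laplacian(r, theta))
```

The full form should match the exact value to within finite difference error, while the naive form is off by exactly the missing (1/r) \partial_r f.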

Is this anything that our electricity and magnetism prof didn’t know? Unlikely. Is this something that our prof felt could not be explained to a mob of first year students? Probably.


[1] F.W. Byron and R.W. Fuller. Mathematics of Classical and Quantum Physics. Dover Publications, 1992.
