Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘quaternion’

Geometric Algebra. The very quickest introduction.

Posted by peeterjoot on March 17, 2012

[Click here for a PDF of this post with nicer formatting.]

Motivation.

An attempt to make a relatively concise introduction to Geometric (or Clifford) Algebra. Much more complete introductions to the subject can be found in [1], [2], and [3].

Axioms

We have a couple basic principles upon which the algebra is based

  1. Vectors can be multiplied.
  2. The square of a vector is the (squared) length of that vector (with appropriate generalizations for non-Euclidean metrics).
  3. Vector products are associative (but not necessarily commutative).

That’s really all there is to it, and the rest, paraphrasing Feynman, can be figured out by anybody sufficiently clever.

By example. The 2D case.

Consider a 2D Euclidean space, and the product of two vectors \mathbf{a} and \mathbf{b} in that space. Utilizing a standard orthonormal basis \{\mathbf{e}_1, \mathbf{e}_2\} we can write

\begin{aligned}\mathbf{a} &= \mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 \\ \mathbf{b} &= \mathbf{e}_1 y_1 + \mathbf{e}_2 y_2,\end{aligned} \hspace{\stretch{1}}(3.1)

and let’s write out the product of these two vectors \mathbf{a} \mathbf{b}, not yet knowing what we will end up with. That is

\begin{aligned}\mathbf{a} \mathbf{b} &= (\mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 )( \mathbf{e}_1 y_1 + \mathbf{e}_2 y_2 ) \\ &= \mathbf{e}_1^2 x_1 y_1 + \mathbf{e}_2^2 x_2 y_2+ \mathbf{e}_1 \mathbf{e}_2 x_1 y_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 y_1\end{aligned}

From axiom 2 we have \mathbf{e}_1^2 = \mathbf{e}_2^2 = 1, so we have

\begin{aligned}\mathbf{a} \mathbf{b} = x_1 y_1 + x_2 y_2 + \mathbf{e}_1 \mathbf{e}_2 x_1 y_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 y_1.\end{aligned} \hspace{\stretch{1}}(3.3)

We’ve multiplied two vectors and ended up with a scalar component (and recognize that this part of the vector product is the dot product), and a component that is a “something else”. We’ll call this something else a bivector, and see that it is characterized by a product of non-colinear vectors. These products \mathbf{e}_1 \mathbf{e}_2 and \mathbf{e}_2 \mathbf{e}_1 are in fact related, and we can see that by looking at the case of \mathbf{b} = \mathbf{a}. For that we have

\begin{aligned}\mathbf{a}^2 &=x_1 x_1 + x_2 x_2 + \mathbf{e}_1 \mathbf{e}_2 x_1 x_2 + \mathbf{e}_2 \mathbf{e}_1 x_2 x_1 \\ &={\left\lvert{\mathbf{a}}\right\rvert}^2 +x_1 x_2 ( \mathbf{e}_1 \mathbf{e}_2 + \mathbf{e}_2 \mathbf{e}_1 )\end{aligned}

Since axiom (2) requires our vectors square to equal its (squared) length, we must then have

\begin{aligned}\mathbf{e}_1 \mathbf{e}_2 + \mathbf{e}_2 \mathbf{e}_1 = 0,\end{aligned} \hspace{\stretch{1}}(3.4)

or

\begin{aligned}\mathbf{e}_2 \mathbf{e}_1 = -\mathbf{e}_1 \mathbf{e}_2.\end{aligned} \hspace{\stretch{1}}(3.5)

We see that Euclidean orthonormal vectors anticommute. What we can see with some additional study is that any colinear vectors commute, and in Euclidean spaces (of any dimension) vectors that are normal to each other anticommute (this can also be taken as a definition of normal).

We can now return to our product of two vectors 3.3 and simplify it slightly

\begin{aligned}\mathbf{a} \mathbf{b} = x_1 y_1 + x_2 y_2 + \mathbf{e}_1 \mathbf{e}_2 (x_1 y_2 - x_2 y_1).\end{aligned} \hspace{\stretch{1}}(3.6)

The product of two vectors in 2D is seen here to have one scalar component, and one bivector component (an irreducible product of two normal vectors). Observe the symmetric and antisymmetric split of the scalar and bivector components above. This symmetry and antisymmetry can be made explicit, introducing dot and wedge product notation respectively

\begin{aligned}\mathbf{a} \cdot \mathbf{b} &= \frac{1}{{2}}( \mathbf{a} \mathbf{b} + \mathbf{b} \mathbf{a}) = x_1 y_1 + x_2 y_2 \\ \mathbf{a} \wedge \mathbf{b} &= \frac{1}{{2}}( \mathbf{a} \mathbf{b} - \mathbf{b} \mathbf{a}) = \mathbf{e}_1 \mathbf{e}_2 (x_1 y_y - x_2 y_1).\end{aligned} \hspace{\stretch{1}}(3.7)

so that the vector product can be written as

\begin{aligned}\mathbf{a} \mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}.\end{aligned} \hspace{\stretch{1}}(3.9)

Pseudoscalar

In many contexts it is useful to introduce an ordered product of all the unit vectors for the space is called the pseudoscalar. In our 2D case this is

\begin{aligned}i = \mathbf{e}_1 \mathbf{e}_2,\end{aligned} \hspace{\stretch{1}}(4.10)

a quantity that we find behaves like the complex imaginary. That can be shown by considering its square

\begin{aligned}(\mathbf{e}_1 \mathbf{e}_2)^2&=(\mathbf{e}_1 \mathbf{e}_2)(\mathbf{e}_1 \mathbf{e}_2) \\ &=\mathbf{e}_1 (\mathbf{e}_2 \mathbf{e}_1) \mathbf{e}_2 \\ &=-\mathbf{e}_1 (\mathbf{e}_1 \mathbf{e}_2) \mathbf{e}_2 \\ &=-(\mathbf{e}_1 \mathbf{e}_1) (\mathbf{e}_2 \mathbf{e}_2) \\ &=-1^2 \\ &= -1\end{aligned}

Here the anticommutation of normal vectors property has been used, as well as (for the first time) the associative multiplication axiom.

In a 3D context, you’ll see the pseudoscalar in many places (expressing the normals to planes for example). It also shows up in a number of fundamental relationships. For example, if one writes

\begin{aligned}I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3\end{aligned} \hspace{\stretch{1}}(4.11)

for the 3D pseudoscalar, then it’s also possible to show

\begin{aligned}\mathbf{a} \mathbf{b} = \mathbf{a} \cdot \mathbf{b} + I (\mathbf{a} \times \mathbf{b})\end{aligned} \hspace{\stretch{1}}(4.12)

something that will be familiar to the student of QM, where we see this in the context of Pauli matrices. The Pauli matrices also encode a Clifford algebraic structure, but we do not need an explicit matrix representation to do so.

Rotations

Very much like complex numbers we can utilize exponentials to perform rotations. Rotating in a sense from \mathbf{e}_1 to \mathbf{e}_2, can be expressed as

\begin{aligned}\mathbf{a} e^{i \theta}&=(\mathbf{e}_1 x_1 + \mathbf{e}_2 x_2) (\cos\theta + \mathbf{e}_1 \mathbf{e}_2 \sin\theta) \\ &=\mathbf{e}_1 (x_1 \cos\theta - x_2 \sin\theta)+\mathbf{e}_2 (x_2 \cos\theta + x_1 \sin\theta)\end{aligned}

More generally, even in N dimensional Euclidean spaces, if \mathbf{a} is a vector in a plane, and \hat{\mathbf{u}} and \hat{\mathbf{v}} are perpendicular unit vectors in that plane, then the rotation through angle \theta is given by

\begin{aligned}\mathbf{a} \rightarrow \mathbf{a} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \theta}.\end{aligned} \hspace{\stretch{1}}(5.13)

This is illustrated in figure (1).

Plane rotation.

 

Notice that we have expressed the rotation here without utilizing a normal direction for the plane. The sense of the rotation is encoded by the bivector \hat{\mathbf{u}} \hat{\mathbf{v}} that describes the plane and the orientation of the rotation (or by duality the direction of the normal in a 3D space). By avoiding a requirement to encode the rotation using a normal to the plane we have an method of expressing the rotation that works not only in 3D spaces, but also in 2D and greater than 3D spaces, something that isn’t possible when we restrict ourselves to traditional vector algebra (where quantities like the cross product can’t be defined in a 2D or 4D space, despite the fact that things they may represent, like torque are planar phenomena that do not have any intrinsic requirement for a normal that falls out of the plane.).

When \mathbf{a} does not lie in the plane spanned by the vectors \hat{\mathbf{u}} and \hat{\mathbf{v}} , as in figure (2), we must express the rotations differently. A rotation then takes the form

\begin{aligned}\mathbf{a} \rightarrow e^{-\hat{\mathbf{u}} \hat{\mathbf{v}} \theta/2} \mathbf{a} e^{\hat{\mathbf{u}} \hat{\mathbf{v}} \theta/2}.\end{aligned} \hspace{\stretch{1}}(5.14)

3D rotation.

 

In the 2D case, and when the vector lies in the plane this reduces to the one sided complex exponential operator used above. We see these types of paired half angle rotations in QM, and they are also used extensively in computer graphics under the guise of quaternions.

References

[1] L. Dorst, D. Fontijne, and S. Mann. Geometric Algebra for Computer Science. Morgan Kaufmann, San Francisco, 2007.

[2] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[3] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.

Advertisements

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , , , , , , , , , | 2 Comments »

Derivation of the spherical polar Laplacian

Posted by peeterjoot on October 9, 2010

[Click here for a PDF of this post with nicer formatting]

Motivation.

In [1] was a Geometric Algebra derivation of the 2D polar Laplacian by squaring the quadient. In [2] was a factorization of the spherical polar unit vectors in a tidy compact form. Here both these ideas are utilized to derive the spherical polar form for the Laplacian, an operation that is strictly algebraic (squaring the gradient) provided we operate on the unit vectors correctly.

Our rotation multivector.

Our starting point is a pair of rotations. We rotate first in the x,y plane by \phi

\begin{aligned}\mathbf{x} &\rightarrow \mathbf{x}' = \tilde{R_\phi} \mathbf{x} R_\phi \\ i &\equiv \mathbf{e}_1 \mathbf{e}_2 \\ R_\phi &= e^{i \phi/2}\end{aligned} \hspace{\stretch{1}}(2.1)

Then apply a rotation in the \mathbf{e}_3 \wedge (\tilde{R_\phi} \mathbf{e}_1 R_\phi) = \tilde{R_\phi} \mathbf{e}_3 \mathbf{e}_1 R_\phi plane

\begin{aligned}\mathbf{x}' &\rightarrow \mathbf{x}'' = \tilde{R_\theta} \mathbf{x}' R_\theta \\ R_\theta &= e^{ \tilde{R_\phi} \mathbf{e}_3 \mathbf{e}_1 R_\phi \theta/2 } = \tilde{R_\phi} e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } R_\phi\end{aligned} \hspace{\stretch{1}}(2.4)

The composition of rotations now gives us

\begin{aligned}\mathbf{x}&\rightarrow \mathbf{x}'' = \tilde{R_\theta} \tilde{R_\phi} \mathbf{x} R_\phi R_\theta = \tilde{R} \mathbf{x} R \\ R &= R_\phi R_\theta = e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 }.\end{aligned}

Expressions for the unit vectors.

The unit vectors in the rotated frame can now be calculated. With I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 we can calculate

\begin{aligned}\hat{\boldsymbol{\phi}} &= \tilde{R} \mathbf{e}_2 R  \\ \hat{\mathbf{r}} &= \tilde{R} \mathbf{e}_3 R  \\ \hat{\boldsymbol{\theta}} &= \tilde{R} \mathbf{e}_1 R\end{aligned} \hspace{\stretch{1}}(3.6)

Performing these we get

\begin{aligned}\hat{\boldsymbol{\phi}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_2 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_2 e^{ i \phi },\end{aligned}

and

\begin{aligned}\hat{\mathbf{r}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_3 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } (\mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta ) e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_3 \cos\theta +\mathbf{e}_1 \sin\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \mathbf{e}_1 \sin\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } ) \\ &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta},\end{aligned}

and

\begin{aligned}\hat{\boldsymbol{\theta}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_1 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } ( \mathbf{e}_1 \cos\theta - \mathbf{e}_3 \sin\theta ) e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_1 \cos\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} \cos\theta - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} (\cos\theta + \hat{\boldsymbol{\phi}} i \mathbf{e}_3 \sin\theta ) \\ &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned}

Summarizing these are

\begin{aligned}\hat{\boldsymbol{\phi}} &= \mathbf{e}_2 e^{ i \phi } \\ \hat{\mathbf{r}} &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta} \\ \hat{\boldsymbol{\theta}} &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned} \hspace{\stretch{1}}(3.9)

Derivatives of the unit vectors.

We’ll need the partials. Most of these can be computed from 3.9 by inspection, and are

\begin{aligned}\partial_r \hat{\boldsymbol{\phi}} &= 0 \\ \partial_r \hat{\mathbf{r}} &= 0 \\ \partial_r \hat{\boldsymbol{\theta}} &= 0 \\ \partial_\theta \hat{\boldsymbol{\phi}} &= 0 \\ \partial_\theta \hat{\mathbf{r}} &= \hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \\ \partial_\theta \hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} \\ \partial_\phi \hat{\boldsymbol{\phi}} &= \hat{\boldsymbol{\phi}} i \\ \partial_\phi \hat{\mathbf{r}} &= \hat{\boldsymbol{\phi}} \sin\theta \\ \partial_\phi \hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\phi}} \cos\theta\end{aligned} \hspace{\stretch{1}}(4.12)

Expanding the Laplacian.

We note that the line element is ds = dr + r d\theta + r\sin\theta d\phi, so our gradient in spherical coordinates is

\begin{aligned}\boldsymbol{\nabla} &= \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi.\end{aligned} \hspace{\stretch{1}}(5.21)

We can now evaluate the Laplacian

\begin{aligned}\boldsymbol{\nabla}^2 &=\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) \cdot\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right).\end{aligned} \hspace{\stretch{1}}(5.22)

Evaluating these one set at a time we have

\begin{aligned}\hat{\mathbf{r}} \partial_r \cdot \left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) &= \partial_{rr},\end{aligned}

and

\begin{aligned}\frac{1}{{r}} \hat{\boldsymbol{\theta}} \partial_\theta \cdot \left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right)&=\frac{1}{{r}} \left\langle{{\hat{\boldsymbol{\theta}} \left(\hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \partial_r + \hat{\mathbf{r}} \partial_{\theta r}+ \frac{\hat{\boldsymbol{\theta}}}{r} \partial_{\theta\theta} + \frac{1}{{r}} \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} \partial_\theta+ \hat{\boldsymbol{\phi}} \partial_\theta \frac{1}{{r\sin\theta}} \partial_\phi\right)}}\right\rangle \\ &= \frac{1}{{r}} \partial_r+\frac{1}{{r^2}} \partial_{\theta\theta},\end{aligned}

and

\begin{aligned}\frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi &\cdot\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) \\ &=\frac{1}{r\sin\theta} \left\langle{{\hat{\boldsymbol{\phi}}\left(\hat{\boldsymbol{\phi}} \sin\theta \partial_r + \hat{\mathbf{r}} \partial_{\phi r} + \hat{\boldsymbol{\phi}} \cos\theta \frac{1}{r} \partial_\theta + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_{\phi \theta }+ \hat{\boldsymbol{\phi}} i \frac{1}{r\sin\theta} \partial_\phi + \hat{\boldsymbol{\phi}} \frac{1}{r\sin\theta} \partial_{\phi \phi }\right)}}\right\rangle \\ &=\frac{1}{{r}} \partial_r+ \frac{\cot\theta}{r^2}\partial_\theta+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi}\end{aligned}

Summing these we have

\begin{aligned}\boldsymbol{\nabla}^2 &=\partial_{rr}+ \frac{2}{r} \partial_r+\frac{1}{{r^2}} \partial_{\theta\theta}+ \frac{\cot\theta}{r^2}\partial_\theta+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi}\end{aligned} \hspace{\stretch{1}}(5.23)

This is often written with a chain rule trick to considate the r and \theta partials

\begin{aligned}\boldsymbol{\nabla}^2 \Psi &=\frac{1}{{r}} \partial_{rr} (r \Psi)+ \frac{1}{{r^2 \sin\theta}} \partial_\theta \left( \sin\theta \partial_\theta \Psi \right)+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\psi\psi} \Psi\end{aligned} \hspace{\stretch{1}}(5.24)

It’s simple to verify that this is identical to 5.23.

References

[1] Peeter Joot. Polar form for the gradient and Laplacian. [online]. http://sites.google.com/site/peeterjoot/math2009/polarGradAndLaplacian.pdf.

[2] Peeter Joot. Spherical Polar unit vectors in exponential form. [online]. http://sites.google.com/site/peeterjoot/math2009/sphericalPolarUnit.pdf .

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , | Leave a Comment »

On Professor Dmitrevsky’s “the only valid Laplacian definition is the divergence of gradient”.

Posted by peeterjoot on December 2, 2009

[Click here for a PDF of this post with nicer formatting]

Dedication.

To all tyrannical old Professors driven to cruelty by an unending barrage of increasingly ill prepared students.

Motivation.

The text [1] has an excellent general derivation of a number of forms of the gradient, divergence, curl and Laplacian.

This is actually done, not starting with the usual Cartesian forms, but more general definitions.

\begin{aligned}(\text{grad}\  \phi)_i &= \lim_{ds_i \rightarrow 0} \frac{\phi(q_i + dq_i) - \phi(q_i)}{ds_i} \\ \text{div}\  \mathbf{V} &= \lim_{\Delta \tau \rightarrow 0} \frac{1}{{\Delta \tau}} \int_\sigma \mathbf{V} \cdot d\boldsymbol{\sigma} \\ (\text{curl}\  \mathbf{V}) \cdot \mathbf{n} &= \lim_{\Delta \sigma \rightarrow 0} \frac{1}{{\Delta \sigma}} \oint_\lambda \mathbf{V} \cdot d\boldsymbol{\lambda} \\ \text{Laplacian}\  \phi &= \text{div} (\text{grad}\ \phi).\end{aligned} \quad\quad\quad(1)

These are then shown to imply the usual Cartesian definitions, plus provide the means to calculate the general relationships in whatever coordinate system you like. All in all one can’t beat this approach, and I’m not going to try to replicate it, because I can’t improve it in any way by doing so.

Given that, what do I have to say on this topic? Well, way way back in first year electricity and magnetism, my dictator of a prof, the intimidating but diminutive Dmitrevsky, yelled at us repeatedly that one cannot just dot the gradient to form the Laplacian. As far as he was concerned one can only say

\begin{aligned}\text{Laplacian}\  \phi &= \text{div} (\text{grad}\ \phi),\end{aligned} \quad\quad\quad(5)

and never never never, the busted way

\begin{aligned}\text{Laplacian}\  \phi &= (\boldsymbol{\nabla} \cdot \boldsymbol{\nabla}) \phi.\end{aligned} \quad\quad\quad(6)

Because “this only works in Cartesian coordinates”. He probably backed up this assertion with a heartwarming and encouraging statement like “back in the days when University of Toronto was a real school you would have learned this in kindergarten”.

This detail is actually something that has bugged me ever since, because my assumption was that, provided one was careful, why would a change to an alternate coordinate system matter? The gradient is still the gradient, so it seems to me that this ought to be a general way to calculate things.

Here we explore the validity of the dictatorial comments of Prof Dmitrevsky. The key to reconciling intuition and his statement turns out to lie with the fact that one has to let the gradient operate on the unit vectors in the non Cartesian representation as well as the partials, something that wasn’t clear as a first year student. Provided that this is done, the plain old dot product procedure yields the expected results.

This exploration will utilize a two dimensional space as a starting point, transforming from Cartesian to polar form representation. I’ll also utilize a geometric algebra representation of the polar unit vectors.

The gradient in polar form.

Lets start off with a calculation of the gradient in polar form starting with the Cartesian form. Writing \partial_x = {\partial {}}/{\partial {x}}, \partial_y = {\partial {}}/{\partial {y}}, \partial_r = {\partial {}}/{\partial {r}}, and \partial_\theta = {\partial {}}/{\partial {\theta}}, we want to map

\begin{aligned}\boldsymbol{\nabla} = \mathbf{e}_1 \partial_1 + \mathbf{e}_2 \partial_2= \begin{bmatrix}\mathbf{e}_1 & \mathbf{e}_2 \end{bmatrix}\begin{bmatrix}\partial_1 \\ \partial_2 \end{bmatrix},\end{aligned} \quad\quad\quad(7)

into the same form using \hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \partial_r, and \partial_\theta. With i = \mathbf{e}_1 \mathbf{e}_2 we have

\begin{aligned}\begin{bmatrix}\mathbf{e}_1 \\ \mathbf{e}_2\end{bmatrix}=e^{i\theta}\begin{bmatrix}\hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}}\end{bmatrix}.\end{aligned} \quad\quad\quad(8)

Next we need to do a chain rule expansion of the partial operators to change variables. In matrix form that is

\begin{aligned}\begin{bmatrix}\frac{\partial {}}{\partial {x}} \\ \frac{\partial {}}{\partial {y}} \end{bmatrix}= \begin{bmatrix}\frac{\partial {r}}{\partial {x}} &          \frac{\partial {\theta}}{\partial {x}} \\ \frac{\partial {r}}{\partial {y}} &          \frac{\partial {\theta}}{\partial {y}} \end{bmatrix}\begin{bmatrix}\frac{\partial {}}{\partial {r}} \\ \frac{\partial {}}{\partial {\theta}} \end{bmatrix}.\end{aligned} \quad\quad\quad(9)

To calculate these partials we drop back to coordinates

\begin{aligned}x^2 + y^2 &= r^2 \\ \frac{y}{x} &= \tan\theta \\ \frac{x}{y} &= \cot\theta.\end{aligned} \quad\quad\quad(10)

From this we calculate

\begin{aligned}\frac{\partial {r}}{\partial {x}} &= \cos\theta \\ \frac{\partial {r}}{\partial {y}} &= \sin\theta \\  \frac{1}{{r\cos\theta}} &= \frac{\partial {\theta}}{\partial {y}} \frac{1}{{\cos^2\theta}} \\ \frac{1}{{r\sin\theta}} &= -\frac{\partial {\theta}}{\partial {x}} \frac{1}{{\sin^2\theta}},\end{aligned} \quad\quad\quad(13)

for

\begin{aligned}\begin{bmatrix}\frac{\partial {}}{\partial {x}} \\ \frac{\partial {}}{\partial {y}} \end{bmatrix}= \begin{bmatrix}\cos\theta & -\sin\theta/r \\ \sin\theta & \cos\theta/r\end{bmatrix}\begin{bmatrix}\frac{\partial {}}{\partial {r}} \\ \frac{\partial {}}{\partial {\theta}} \end{bmatrix}.\end{aligned} \quad\quad\quad(17)

We can now write down the gradient in polar form, prior to final simplification

\begin{aligned}\boldsymbol{\nabla} = e^{i\theta}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}\begin{bmatrix}\cos\theta & -\sin\theta/r \\ \sin\theta & \cos\theta/r\end{bmatrix}\begin{bmatrix}\frac{\partial {}}{\partial {r}} \\ \frac{\partial {}}{\partial {\theta}} \end{bmatrix}.\end{aligned} \quad\quad\quad(18)

Observe that we can factor a unit vector

\begin{aligned}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}=\hat{\mathbf{r}}\begin{bmatrix}1 & i\end{bmatrix}=\begin{bmatrix}i & 1\end{bmatrix}\hat{\boldsymbol{\theta}}\end{aligned} \quad\quad\quad(19)

so the 1,1 element of the matrix product in the interior is

\begin{aligned}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}\begin{bmatrix}\cos\theta \\ \sin\theta \end{bmatrix}=\hat{\mathbf{r}} e^{i\theta} = e^{-i\theta}\hat{\mathbf{r}}.\end{aligned} \quad\quad\quad(20)

Similarly, the 1,2 element of the matrix product in the interior is

\begin{aligned}\begin{bmatrix}\hat{\mathbf{r}} & \hat{\boldsymbol{\theta}}\end{bmatrix}\begin{bmatrix}-\sin\theta/r \\ \cos\theta/r\end{bmatrix}=\frac{1}{{r}} e^{-i\theta} \hat{\boldsymbol{\theta}}.\end{aligned} \quad\quad\quad(21)

The exponentials cancel nicely, leaving after a final multiplication with the polar form for the gradient

\begin{aligned}\boldsymbol{\nabla} = \hat{\mathbf{r}} \partial_r + \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta\end{aligned} \quad\quad\quad(22)

That was a fun way to get the result, although we could have just looked it up. We want to use this now to calculate the Laplacian.

Polar form Laplacian for the plane.

We are now ready to look at the Laplacian. First let’s do it the first year electricity and magnetism course way. We look up the formula for polar form divergence, the one we were supposed to have memorized in kindergarten, and find it to be

\begin{aligned}\text{div}\ \mathbf{A} = \partial_r A_r + \frac{1}{{r}} A_r + \frac{1}{{r}} \partial_\theta A_\theta\end{aligned} \quad\quad\quad(23)

We can now apply this to the gradient vector in polar form which has components \boldsymbol{\nabla}_r = \partial_r, and \boldsymbol{\nabla}_\theta = (1/r)\partial_\theta, and get

\begin{aligned}\text{div}\ \text{grad} = \partial_{rr} + \frac{1}{{r}} \partial_r + \frac{1}{{r}} \partial_{\theta\theta}\end{aligned} \quad\quad\quad(24)

This is the expected result, and what we should get by performing \boldsymbol{\nabla} \cdot \boldsymbol{\nabla} in polar form. Now, let’s do it the wrong way, dotting our gradient with itself.

\begin{aligned}\boldsymbol{\nabla} \cdot \boldsymbol{\nabla} &= \left(\partial_r, \frac{1}{{r}} \partial_\theta\right) \cdot \left(\partial_r, \frac{1}{{r}} \partial_\theta\right) \\ &= \partial_{rr} + \frac{1}{{r}} \partial_\theta \left(\frac{1}{{r}} \partial_\theta\right) \\ &= \partial_{rr} + \frac{1}{{r^2}} \partial_{\theta\theta}\end{aligned}

This is wrong! So is Dmitrevsky right that this procedure is flawed, or do you spot the mistake? I have also cruelly written this out in a way that obscures the error and highlights the source of the confusion.

The problem is that our unit vectors are functions, and they must also be included in the application of our partials. Using the coordinate polar form without explicitly putting in the unit vectors is how we go wrong. Here’s the right way

\begin{aligned}\boldsymbol{\nabla} \cdot \boldsymbol{\nabla} &=\left( \hat{\mathbf{r}} \partial_r + \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) \cdot \left( \hat{\mathbf{r}} \partial_r + \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) \\ &=\hat{\mathbf{r}} \cdot \partial_r \left(\hat{\mathbf{r}} \partial_r \right)+\hat{\mathbf{r}} \cdot \partial_r \left( \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right)+\hat{\boldsymbol{\theta}} \cdot \frac{1}{{r}} \partial_\theta \left( \hat{\mathbf{r}} \partial_r \right)+\hat{\boldsymbol{\theta}} \cdot \frac{1}{{r}} \partial_\theta \left( \hat{\boldsymbol{\theta}} \frac{1}{{r}} \partial_\theta \right) \\ \end{aligned}

Now we need the derivatives of our unit vectors. The \partial_r derivatives are zero since these have no radial dependence, but we do have \theta partials

\begin{aligned}\partial_\theta \hat{\mathbf{r}} &=\partial_\theta \left( \mathbf{e}_1 e^{i\theta} \right) \\ &=\mathbf{e}_1 \mathbf{e}_1 \mathbf{e}_2 e^{i\theta} \\ &=\mathbf{e}_2 e^{i\theta} \\ &=\hat{\boldsymbol{\theta}},\end{aligned}

and

\begin{aligned}\partial_\theta \hat{\boldsymbol{\theta}} &=\partial_\theta \left( \mathbf{e}_2 e^{i\theta} \right) \\ &=\mathbf{e}_2 \mathbf{e}_1 \mathbf{e}_2 e^{i\theta} \\ &=-\mathbf{e}_1 e^{i\theta} \\ &=-\hat{\mathbf{r}}.\end{aligned}

(One should be able to get the same results if these unit vectors were written out in full as \hat{\mathbf{r}} = \mathbf{e}_1 \cos\theta + \mathbf{e}_2 \sin\theta, and \hat{\boldsymbol{\theta}} = \mathbf{e}_2 \cos\theta - \mathbf{e}_1 \sin\theta, instead of using the obscure geometric algebra quaterionic rotation exponential operators.)

Having calculated these partials we now have

\begin{aligned}(\boldsymbol{\nabla} \cdot \boldsymbol{\nabla}) =\partial_{rr} +\frac{1}{{r}} \partial_r +\frac{1}{{r^2}} \partial_{\theta\theta} \end{aligned} \quad\quad\quad(25)

Exactly what it should be, and what we got with the coordinate form of the divergence operator when applying the “Laplacian equals the divergence of the gradient” rule blindly. We see that the expectation that \boldsymbol{\nabla} \cdot \boldsymbol{\nabla} is the Laplacian in more than the Cartesian coordinate system is not invalid, but that care is required to apply the chain rule to all functions. We also see that expressing a vector in coordinate form when the basis vectors are position dependent is also a path to danger.

Is this anything that our electricity and magnetism prof didn’t know? Unlikely. Is this something that our prof felt that could not be explained to a mob of first year students? Probably.

References

[1] F.W. Byron and R.W. Fuller. Mathematics of Classical and Quantum Physics. Dover Publications, 1992.

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , | 1 Comment »

Spherical polar pendulum for one and multiple masses (Take II)

Posted by peeterjoot on November 10, 2009

[Click here for a PDF of this post with nicer formatting]

Motivation

Attempting the multiple spherical pendulum problem with a bivector parameterized Lagrangian has just been attempted ([1]), but did not turn out to be an effective approach. Here a variation is used, employing regular plain old scalar spherical angle parameterized Kinetic energy, but still employing Geometric Algebra to express the Hermitian quadratic form associated with this energy term.

The same set of simplifying assumptions will be made. These are point masses, zero friction at the pivots and rigid nonspringy massless connecting rods between the masses.

The Lagrangian.

A two particle spherical pendulum is depicted in figure (\ref{fig:sPolarMultiPendulum:pendulumDouble})

\begin{figure}[htp]
\centering
\includegraphics[totalheight=0.4\textheight]{pendulumDouble}
\caption{Double spherical pendulum.}
\end{figure}

The position vector for each particle can be expressed relative to the mass it is connected to (or the origin for the first particle), as in

\begin{aligned}z_k &= z_{k-1} + \mathbf{e}_3 l_k e^{j_k \theta_k} \\ j_k &= \mathbf{e}_3 \wedge \left( \mathbf{e}_1 e^{i \phi_k} \right) \\ i &= \mathbf{e}_1 \wedge \mathbf{e}_2\end{aligned} \quad\quad\quad(1)

To express the Kinetic energy for any of the masses m_k, we need the derivative of the incremental difference in position

\begin{aligned}\frac{d}{dt} \left( \mathbf{e}_3 e^{j_k \theta_k} \right)&=\mathbf{e}_3 \left( j_k \dot{\theta}_k e^{j_k \theta_k} + \frac{d j_k }{dt} \sin\theta_k \right)  \\ &=\mathbf{e}_3 \left( j_k \dot{\theta}_k e^{j_k \theta_k} + \mathbf{e}_3 \mathbf{e}_2 \dot{\phi}_k e^{i \phi_k} \sin\theta_k \right)  \\ &=\left( \frac{d}{dt}\begin{bmatrix}\theta_k & \phi_k\end{bmatrix} \right)\begin{bmatrix}\mathbf{e}_1 e^{i \phi_k} e^{j_k \theta_k} \\ \mathbf{e}_2 e^{i \phi_k} \sin\theta_k\end{bmatrix}\end{aligned}

Introducing a Hermitian conjugation A^\dagger = \tilde{A}^\text{T}, reversing and transposing the matrix, and writing

\begin{aligned}A_k &=\begin{bmatrix}\mathbf{e}_1 e^{i \phi_k} e^{j_k \theta_k} \\ \mathbf{e}_2 e^{i \phi_k} \sin\theta_k\end{bmatrix} \\ \boldsymbol{\Theta}_k &=\begin{bmatrix}\theta_k \\ \phi_k\end{bmatrix}\end{aligned} \quad\quad\quad(4)

We can now write the relative velocity differential as

\begin{aligned}(\dot{z}_k - \dot{z}_{k-1})^2 = l_k^2 {\dot{\boldsymbol{\Theta}}_k}^\dagger A_k A_k^\dagger \dot{\boldsymbol{\Theta}}_k\end{aligned} \quad\quad\quad(6)

Observe that the inner product is Hermitian under this definition since (A_k A_k^\dagger)^\dagger = A_k A_k^\dagger. \footnote{Realized later, and being too lazy to adjust everything in these notes, the use of reversion here is not neccessary. Since the generalized coordinates are scalars we could use transposition instead of Hermitian conjugation. All the matrix elements are vectors so reversal doesn’t change anything.}

The total (squared) velocity of the kth particle is then

\begin{aligned}\boldsymbol{\Theta} &=\begin{bmatrix}\boldsymbol{\Theta}_1 \\ \boldsymbol{\Theta}_2 \\ \vdots \\ \boldsymbol{\Theta}_N \\ \end{bmatrix} \\ B_k &=\begin{bmatrix}l_1 A_1 \\ l_2 A_2 \\ \vdots \\ l_k A_k \\ 0 \\ \end{bmatrix} \\ (\dot{z}_k)^2 &=\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(7)

(where the zero matrix in B_k is a N-k by one zero). Summing over all masses and adding in the potential energy we have for the Lagrangian of the system

\begin{aligned}K &=\frac{1}{{2}} \sum_{k=1}^N m_k \dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}} \\ \mu_k &= \sum_{j=k}^N m_j \\ \Phi &=g \sum_{k=1}^N \mu_k l_k \cos\theta_k \\ \mathcal{L} &= K - \Phi\end{aligned} \quad\quad\quad(10)

There’s a few layers of equations involved and we still have an unholy mess of matrix and geometric algebra in the kernel of the kinetic energy quadratic form, but at least this time all the generalized coordinates of the system are scalars.

Some tidy up.

Before continuing with evaluation of the Euler-Lagrange equations it is helpful to make a couple of observations about the structure of the matrix products that make up our velocity quadratic forms

\begin{aligned}\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}&=\dot{\boldsymbol{\Theta}}^\dagger \begin{bmatrix}\begin{bmatrix}l_1^2 A_1 A_1^\dagger & l_1 l_2 A_1 A_2^\dagger & \hdots & l_1 l_k A_1 A_k^\dagger \\ l_2 l_1 A_2 A_1^\dagger & l_2^2 A_2 A_2^\dagger & \hdots & l_2 l_k A_2 A_k^\dagger \\ \vdots \\ l_k l_1 A_k A_1^\dagger & l_k l_2 A_k A_2^\dagger & \hdots & l_k^2 A_k A_k^\dagger \end{bmatrix} & 0 \\ 0 & 0\end{bmatrix}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(14)

Specifically, consider the A_a A_b^\dagger products that make up the elements of the matrices Q_k = B_k B_k^\dagger. Without knowing anything about the grades that make up the elements of Q_k, since it is Hermitian (by this definition of Hermitian) there can be no elements of grade order two or three in the final matrix. This is because reversion of such grades inverts the sign, and the matrix elements in Q_k all equal their reverse. Additionally, the elements of the multivector column matrices A_k are vectors, so in the product A_a A_b^\dagger we can only have scalar and bivector (grade two) elements. The resulting one by one scalar matrix is a sum over all the mixed angular velocities \dot{\theta}_a \dot{\theta}_b, \dot{\theta}_a \dot{\phi}_b, and \dot{\phi}_a \dot{\phi}_b, so once this summation is complete any bivector grades of A_a A_b^\dagger must cancel out. This is consistent with the expectation that we have a one by one scalar matrix result out of this in the end (i.e. a number). The end result is a freedom to exploit the convienence of explicitly using a scalar selection operator that filters out any vector, bivector, and trivector grades in the products A_a A_b^\dagger. We will get the same result if we write

\begin{aligned}\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}&=\dot{\boldsymbol{\Theta}}^\dagger \begin{bmatrix}\begin{bmatrix}l_1^2 \left\langle{{A_1 A_1^\dagger}}\right\rangle & l_1 l_2 \left\langle{{A_1 A_2^\dagger}}\right\rangle & \hdots & l_1 l_k \left\langle{{A_1 A_k^\dagger}}\right\rangle \\ l_2 l_1 \left\langle{{A_2 A_1^\dagger}}\right\rangle & l_2^2 \left\langle{{A_2 A_2^\dagger}}\right\rangle & \hdots & l_2 l_k \left\langle{{A_2 A_k^\dagger}}\right\rangle \\ \vdots \\ l_k l_1 \left\langle{{A_k A_1^\dagger}}\right\rangle & l_k l_2 \left\langle{{A_k A_2^\dagger}}\right\rangle & \hdots & l_k^2 \left\langle{{A_k A_k^\dagger}}\right\rangle\end{bmatrix} & 0 \\ 0 & 0\end{bmatrix}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(15)

Pulling in the summation over m_k we have

\begin{aligned}\sum_k m_k\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}&=\dot{\boldsymbol{\Theta}}^\dagger {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{A_r A_c^\dagger}}\right\rangle\end{bmatrix}}_{rc}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(16)

It appears justifiable to lable the \mu_{\max(r,c)} l_r l_c factors of the angular velocity matrices as moments of inertia in a generalized sense. Using this block matrix form, and scalar selection, we can now write the Lagrangian in a slightly tidier form

\begin{aligned}\mu_k &= \sum_{j=k}^N m_j \\ Q &= {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c A_r A_c^\dagger \end{bmatrix}}_{rc} \\ K &=\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q\dot{\boldsymbol{\Theta}} =\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} \left\langle{{Q}}\right\rangle\dot{\boldsymbol{\Theta}} \\ \Phi &=g \sum_{k=1}^N \mu_k l_k \cos\theta_k \\ \mathcal{L} &= K - \Phi\end{aligned} \quad\quad\quad(17)

After some expansion, writing S_\theta = \sin\theta, C_\phi = \cos\phi and so forth, one can find that the scalar parts of the block matrixes A_r A_c^\dagger contained in Q are

\begin{aligned}\left\langle{{A_r A_c^\dagger}}\right\rangle=\begin{bmatrix}C_{\phi_c - \phi_r} C_{\theta_r}C_{\theta_c}+S_{\theta_r}S_{\theta_c} &-S_{\phi_c - \phi_r} C_{\theta_r} S_{\theta_c} \\ S_{\phi_c - \phi_r} C_{\theta_c} S_{\theta_r} &C_{\phi_c - \phi_r} S_{\theta_r} S_{\theta_c}\end{bmatrix}\end{aligned} \quad\quad\quad(22)

The diagonal blocks are particularily simple and have no \phi dependence

\begin{aligned}\left\langle{{A_r A_r^\dagger}}\right\rangle=\begin{bmatrix}1 & 0 \\ 0 & \sin^2 \theta_r\end{bmatrix}\end{aligned} \quad\quad\quad(23)

Observe also that \left\langle{{A_r A_c^\dagger}}\right\rangle^T = \left\langle{{A_c A_r^\dagger}}\right\rangle, so the scalar matrix

\begin{aligned}\left\langle{{Q}}\right\rangle = {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{ A_r A_c^\dagger }}\right\rangle\end{bmatrix}}_{rc}\end{aligned} \quad\quad\quad(24)

is a real symmetric matrix. We have the option of using this explicit scalar expansion if desired for further computations associated with this problem. That completely eliminates the Geometric algebra from the problem, and is probably a logical way to formulate things for numerical work since one can then exploit any pre existing matrix algebra system without having to create one that understands non-commuting variables and vector products.

Evaluating the Euler-Lagrange equations.

For the acceleration terms of the Euler-Lagrange equations our computation reduces nicely to a function of only \left\langle{{Q}}\right\rangle

\begin{aligned}\frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {\dot{\theta}_a}}&=\frac{1}{{2}} \frac{d}{dt} \left(\frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}}^\text{T}\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}+\dot{\boldsymbol{\Theta}}^\text{T}\left\langle{{Q}}\right\rangle \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}}\right)  \\ &=\frac{d}{dt} \left({\begin{bmatrix}\delta_{ac}\begin{bmatrix}1 & 0\end{bmatrix}\end{bmatrix}}_c\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\right) \end{aligned}

and

\begin{aligned}\frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {\dot{\phi}_a}}&=\frac{1}{{2}} \frac{d}{dt} \left(\frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\phi}_a}}^\text{T}\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}+\dot{\boldsymbol{\Theta}}^\text{T}\left\langle{{Q}}\right\rangle \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\phi}_a}}\right)  \\ &=\frac{d}{dt} \left({\begin{bmatrix}\delta_{ac}\begin{bmatrix}0 & 1\end{bmatrix}\end{bmatrix}}_c\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\right) \end{aligned}

The last groupings above made use of \left\langle{{Q}}\right\rangle = \left\langle{{Q}}\right\rangle^\text{T}, and in particular (\left\langle{{Q}}\right\rangle + \left\langle{{Q}}\right\rangle^\text{T})/2 = \left\langle{{Q}}\right\rangle. We can now form a column matrix putting all the angular velocity gradient in a tidy block matrix representation

\begin{aligned}\nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} = {\begin{bmatrix}\begin{bmatrix}\frac{\partial {\mathcal{L}}}{\partial {\dot{\theta}_r}} \\ \frac{\partial {\mathcal{L}}}{\partial {\dot{\phi}_r}} \\ \end{bmatrix}\end{bmatrix}}_r = \left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(25)

A small aside on Hamiltonian form. This velocity gradient is also the conjugate momentum of the Hamiltonian, so if we wish to express the Hamiltonian in terms of conjugate momenta, we require invertability of \left\langle{{Q}}\right\rangle at the point in time that we evaluate things. Writing

\begin{aligned}P_{\boldsymbol{\Theta}} = \nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} \end{aligned} \quad\quad\quad(26)

and noting that (\left\langle{{Q}}\right\rangle^{-1})^\text{T} = \left\langle{{Q}}\right\rangle^{-1}, we get for the kinetic energy portion of the Hamiltonian

\begin{aligned}K = \frac{1}{{2}} {P_{\boldsymbol{\Theta}}}^\text{T} \left\langle{{Q}}\right\rangle^{-1} P_{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(27)

Now, the invertability of \left\langle{{Q}}\right\rangle cannot be taken for granted. Even in the single particle case we do not have invertability. For the single particle case we have

\begin{aligned}\left\langle{{Q}}\right\rangle =m l^2 \begin{bmatrix}1 & 0 \\ 0 & \sin^2 \theta\end{bmatrix}\end{aligned} \quad\quad\quad(28)

so at \theta = \pm \pi/2 this quadratic form is singlular, and the planar angular momentum becomes a constant of motion.

Returning to the evaluation of the Euler-Lagrange equations, the problem is now reduced to calculating the right hand side of the following system

\begin{aligned}\frac{d}{dt} \left( \left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}} \right) ={\begin{bmatrix}\begin{bmatrix}\frac{\partial {\mathcal{L}}}{\partial {\theta_r}} \\ \frac{\partial {\mathcal{L}}}{\partial {\phi_r}} \\ \end{bmatrix}\end{bmatrix}}_r\end{aligned} \quad\quad\quad(29)

With back substituition of 22, and 24 we have a complete non-multivector expansion of the left hand side. For the right hand side taking the \theta_a and \phi_a derivatives respectively we get

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\theta_a}}=\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{\frac{\partial {A_r}}{\partial {\theta_a}} A_c^\dagger + A_r \frac{\partial {A_c}}{\partial {\theta_a}}^\dagger}}\right\rangle\end{bmatrix}}_{rc} \dot{\boldsymbol{\Theta}}-g \mu_a l_a \sin\theta_a \end{aligned} \quad\quad\quad(30)

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\phi_a}}=\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{\frac{\partial {A_r}}{\partial {\phi_a}} A_c^\dagger + A_r \frac{\partial {A_c}}{\partial {\phi_a}}^\dagger}}\right\rangle\end{bmatrix}}_{rc} \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(31)

So to procede we must consider the \left\langle{{A_r A_c^\dagger}}\right\rangle partials. A bit of thought shows that the matrices of partials above are mostly zeros. Illustrating by example, consider {\partial {\left\langle{{Q}}\right\rangle}}/{\partial {\theta_2}}, which in block matrix form is

\begin{aligned}\frac{\partial {\left\langle{{Q}}\right\rangle}}{\partial {\theta_2}}=\begin{bmatrix}0 & \frac{1}{{2}} \mu_2 l_1 l_2 \left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle & 0 & \hdots & 0 \\ \frac{1}{{2}} \mu_2 l_2 l_1 \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_1^\dagger}}\right\rangle &\frac{1}{{2}} \mu_2 l_2 l_2 \left\langle{{A_2 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger + \frac{\partial {A_2}}{\partial {\theta_2}} A_2^\dagger}}\right\rangle &\frac{1}{{2}} \mu_3 l_2 l_3 \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_3^\dagger}}\right\rangle & \hdots &\frac{1}{{2}} \mu_N l_2 l_N \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_N^\dagger}}\right\rangle \\ 0 & \frac{1}{{2}} \mu_3 l_3 l_2 \left\langle{{A_3 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle & 0 & \hdots & 0 \\ 0 & \vdots & 0 & \hdots & 0 \\ 0 & \frac{1}{{2}} \mu_N l_N l_2 \left\langle{{A_N \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle & 0 & \hdots & 0 \\ \end{bmatrix}\end{aligned} \quad\quad\quad(32)

Observe that the diagonal term has a scalar plus its reverse, so we can drop the one half factor and one of the summands for a total contribution to {\partial {\mathcal{L}}}/{\partial {\theta_2}} of just

\begin{aligned}\mu_2 {l_2}^2 {\dot{\boldsymbol{\Theta}}_2}^\text{T} \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_2^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2\end{aligned}

Now consider one of the pairs of off diagonal terms. Adding these we contributions to {\partial {\mathcal{L}}}/{\partial {\theta_2}} of

\begin{aligned}\frac{1}{{2}} \mu_2 l_1 l_2 {\dot{\boldsymbol{\Theta}}_1}^\text{T}\left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2+\frac{1}{{2}} \mu_2 l_2 l_1 {\dot{\boldsymbol{\Theta}}_2}^\text{T}\left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_1^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_1&=\frac{1}{{2}} \mu_2 l_1 l_2 {\dot{\boldsymbol{\Theta}}_1}^\text{T}\left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger + A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2 \\ &=\mu_2 l_1 l_2 {\dot{\boldsymbol{\Theta}}_1}^\text{T}\left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2 \\ \end{aligned}

This has exactly the same form as the diagonal term, so summing over all terms we get for the position gradient components of the Euler-Lagrange equation just

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\theta_a}}&=\sum_{k}\mu_{\max(k,a)} l_k l_a {\dot{\boldsymbol{\Theta}}_k}^\text{T}\left\langle{{A_k \frac{\partial {A_a}}{\partial {\theta_a}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_a -g \mu_a l_a \sin\theta_a \end{aligned} \quad\quad\quad(33)

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\phi_a}}&=\sum_{k}\mu_{\max(k,a)} l_k l_a {\dot{\boldsymbol{\Theta}}_k}^\text{T}\left\langle{{A_k \frac{\partial {A_a}}{\partial {\phi_a}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_a \end{aligned} \quad\quad\quad(34)

The only thing that remains to do is evaluate the \left\langle{{A_k {\partial {A_a}}/{\partial {\phi_a}}^\dagger}}\right\rangle matrixes.

It should be possible but it is tedious to calculate the block matrix derivative terms from the A_a partials using

\begin{aligned}\frac{\partial {A_a}}{\partial {\theta_a}} &=\begin{bmatrix}-\mathbf{e}_3 e^{j_a \theta_a} \\ \mathbf{e}_2 e^{i \phi_a} C_{\theta_a}\end{bmatrix}\end{aligned} \quad\quad\quad(35)

\begin{aligned}\frac{\partial {A_a}}{\partial {\phi_a}}&=\begin{bmatrix}\mathbf{e}_2 e^{i \phi_a} C_{\theta_a} \\ -\mathbf{e}_1 e^{i \phi_a} S_{\theta_a}\end{bmatrix}\end{aligned} \quad\quad\quad(36)

However multiplying this out and reducing is a bit tedious and would be a better job for a symbolic algebra package. With 22 available to use, one gets easily

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_c}}{\partial {\theta_c}}^\dagger }}\right\rangle&=\begin{bmatrix}-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} + S_{\theta_k} C_{\theta_a} &-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} \\ -S_{\phi_a - \phi_k} S_{\theta_a} S_{\theta_k} &C_{\phi_a - \phi_k} (1 + \delta_{k a}) S_{\theta_k} C_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(37)

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_a}}{\partial {\phi_a}}^\dagger }}\right\rangle&=\begin{bmatrix}-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} + S_{\theta_k} S_{\theta_a} &-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} \\ C_{\phi_a - \phi_k} C_{\theta_a} S_{\theta_k} &-S_{\phi_a - \phi_k} S_{\theta_k} S_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(38)

The right hand side of the Euler-Lagrange equations now becomes

\begin{aligned}\nabla_{\boldsymbol{\Theta}} \mathcal{L} =\sum_k{\begin{bmatrix}\begin{bmatrix}\mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\theta_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \\ \mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\phi_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \end{bmatrix}\end{bmatrix}}_r- g{\begin{bmatrix}\mu_r l_r \sin\theta_r \begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_r\end{aligned} \quad\quad\quad(39)

Can the \dot{\boldsymbol{\Theta}}_a matrices be factored out, perhaps allowing for expression as a function of \dot{\boldsymbol{\Theta}}? How to do that if it is possible is not obvious. The driving reason to do so would be to put things into a tidy form where things are a function of the system angular velocity vector \boldsymbol{\Theta}, but this is not possible anyways since the gradient is non-linear.

Hamiltonian form and linearization.

Having calculated the Hamiltonian equations for the multiple mass planar pendulum in [2], doing so for the spherical pendulum can now be done by inspection. With the introduction of a phase space vector for the system using the conjugate momenta (for angles where these conjugate momenta are non-singlular)

\begin{aligned}\mathbf{z} = \begin{bmatrix}P_{\boldsymbol{\Theta}} \\ \boldsymbol{\Theta}\end{bmatrix}\end{aligned} \quad\quad\quad(40)

we can write the Hamiltonian equations

\begin{aligned}\frac{d\mathbf{z}}{dt} = \begin{bmatrix}\nabla_{\boldsymbol{\Theta}} \mathcal{L} \\ \left\langle{{Q}}\right\rangle^{-1} P_{\boldsymbol{\Theta}}\end{bmatrix}\end{aligned} \quad\quad\quad(41)

The position gradient is given explicitly in 39, and that can be substituted here. That gradient is expressed in terms of \dot{\boldsymbol{\Theta}}_k and not the conjugate momenta, but the mapping required to express the whole system in terms of the conjugate momenta is simple enough

\begin{aligned}\dot{\boldsymbol{\Theta}}_k = {\begin{bmatrix}\delta_{kc} I_{22}\end{bmatrix}}_c \left\langle{{Q}}\right\rangle^{-1} P_{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(42)

It is apparent that for any sort of numerical treatment use of a angular momentum and angular position phase space vector is not prudent. If the aim is nothing more than working with a first order system instead of second order, then we are probably better off with an angular velocity plus angular position phase space system.

\begin{aligned}\frac{d}{dt}\begin{bmatrix}\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}} \\ \boldsymbol{\Theta}\end{bmatrix}=\begin{bmatrix}\nabla_{\boldsymbol{\Theta}} \mathcal{L} \\ \dot{\boldsymbol{\Theta}}\end{bmatrix}\end{aligned} \quad\quad\quad(43)

This eliminates the requirement for inverting the sometimes singular matrix \left\langle{{Q}}\right\rangle, but one is still left with something that is perhaps tricky to work with since we have the possibility of zeros on the left hand side. The resulting equation is of the form

\begin{aligned}M \mathbf{x}' = f(\mathbf{x})\end{aligned} \quad\quad\quad(44)

where M = \left[\begin{smallmatrix}\left\langle{{Q}}\right\rangle & 0 \\  0 & I\end{smallmatrix}\right] is a possibly singular matrix, and f is a non-linear function of the components of \boldsymbol{\Theta}, and \dot{\boldsymbol{\Theta}}. This is concievably linearizable in the neighbourhood of a particular phase space point \mathbf{x}_0. If that is done, resulting in an equation of the form

\begin{aligned}M \mathbf{y}' = f(\mathbf{x}_0) + B \mathbf{y} \end{aligned} \quad\quad\quad(45)

where \mathbf{x} = \mathbf{y} + \mathbf{x}_0 and B is an appropriate matrix of partials (the specifics of which don’t really have to be spelled out here). Because of the possible singularities of M the exponentiation techniques applied to the linearized planar pendulum may not be possible with such a linearization. Study of this less well formed system of LDEs probably has interesting aspects, but is also likely best tackled independently of the specifics of the spherical pendulum problem.

Thoughts about the Hamiltonian singularity.

The fact that the Hamiltonian goes singular on the horizontal in this spherical polar representation is actually what I think is the most interesting bit in the problem (the rest being a lot mechanical details). On the horizontal \phi=0 or \dot{\phi} = 37000 radians/sec makes no difference to the dynamics. All you can say is that the horizontal plane angular momentum is a constant of the system. It seems very much like the increasing uncertaintly that you get in the corresponding radial QM equation. Once you start pinning down the \theta angle, you loose the ability to say much about \phi.

It is also kind of curious how the energy of the system is never ill defined but a choice of a particular orientation to use as a reference for observations of the momenta introduces the singularity as the system approaches the horizontal in that reference frame.

Perhaps there are some deeper connections relating these classical and QM similarity. Would learning about symplectic flows and phase space volume invariance shed some light on this?

A summary.

A fair amount of notation was introduced along the way in the process of formulating the spherical pendulum equations. It is worthwhile to do a final consise summary of notation and results before moving on for future reference.

The positions of the masses are given by

\begin{aligned}z_k &= z_{k-1} + \mathbf{e}_3 l_k e^{j_k \theta_k} \\ j_k &= \mathbf{e}_3 \wedge \left( \mathbf{e}_1 e^{i \phi_k} \right) \\ i &= \mathbf{e}_1 \wedge \mathbf{e}_2\end{aligned} \quad\quad\quad(46)

With the introduction of a column vector of vectors (where we multiply matrices using the Geometric vector product),

\begin{aligned}\boldsymbol{\Theta}_k &=\begin{bmatrix}\theta_k \\ \phi_k\end{bmatrix}\end{aligned} \quad\quad\quad(49)

\begin{aligned}\boldsymbol{\Theta} &={\begin{bmatrix}\boldsymbol{\Theta}_1 &\boldsymbol{\Theta}_2 &\hdots &\boldsymbol{\Theta}_N \end{bmatrix}}^\text{T}\end{aligned} \quad\quad\quad(50)

and a matrix of velocity components (with matrix multiplication of the vector elements using the Geometric vector product), we can form the Lagrangian

\begin{aligned}A_k &=\begin{bmatrix}\mathbf{e}_1 e^{i \phi_k} e^{j_k \theta_k} \\ \mathbf{e}_2 e^{i \phi_k} S_{\theta_k}\end{bmatrix} \end{aligned} \quad\quad\quad(51)

\begin{aligned}\mu_k &= \sum_{j=k}^N m_j \\ \left\langle{{Q}}\right\rangle &= {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{A_r A_c^\text{T}}}\right\rangle\end{bmatrix}}_{rc} \\ K &= \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} \left\langle{{Q}}\right\rangle\dot{\boldsymbol{\Theta}} \\ \Phi &=g \sum_{k=1}^N \mu_k l_k C_{\theta_k} \\ \mathcal{L} &= K - \Phi\end{aligned} \quad\quad\quad(52)

An explicit scalar matrix evaluation of the (symmetric) block matrix components of \left\langle{{Q}}\right\rangle was evaluated and found to be

\begin{aligned}\left\langle{{A_r A_c^\text{T}}}\right\rangle=\begin{bmatrix}C_{\phi_c - \phi_r} C_{\theta_r}C_{\theta_c}+S_{\theta_r}S_{\theta_c} &-S_{\phi_c - \phi_r} C_{\theta_r} S_{\theta_c} \\ S_{\phi_c - \phi_r} C_{\theta_c} S_{\theta_r} &C_{\phi_c - \phi_r} S_{\theta_r} S_{\theta_c}\end{bmatrix}\end{aligned} \quad\quad\quad(57)

These can be used if explicit evaluation of the Kinetic energy is desired, avoiding redundant summation over the pairs of skew entries in the quadratic form matrix \left\langle{{Q}}\right\rangle

\begin{aligned}K = \frac{1}{{2}} \sum_k \mu_k l_k^2 {\dot{\boldsymbol{\Theta}}_k}^T \left\langle{{A_k A_k^\text{T}}}\right\rangle \dot{\boldsymbol{\Theta}}_k+ \sum_{r<c} \mu_{\max(r,c)} l_r l_c {\dot{\boldsymbol{\Theta}}_r}^\text{T}\left\langle{{A_r A_c^\text{T}}}\right\rangle\dot{\boldsymbol{\Theta}}_c\end{aligned} \quad\quad\quad(58)

We utilize angular position and velocity gradients

\begin{aligned}\nabla_{\boldsymbol{\Theta}_k} &= \begin{bmatrix}\frac{\partial}{\partial \theta_k} \\ \frac{\partial}{\partial \phi_k} \end{bmatrix} \\ \nabla_{\dot{\boldsymbol{\Theta}}_k} &= \begin{bmatrix}\frac{\partial }{\partial \dot{\theta}_k} \\ \frac{\partial }{\partial \dot{\phi}_k} \end{bmatrix} \\ \nabla_{\boldsymbol{\Theta}} &= {\begin{bmatrix}{\nabla_{\boldsymbol{\Theta}_1}}^\text{T} & {\nabla_{\boldsymbol{\Theta}_2}}^\text{T} & \hdots  & {\nabla_{\boldsymbol{\Theta}_N}}^\text{T} \end{bmatrix}}^\text{T} \\ \nabla_{\boldsymbol{\dot{\Theta}}} &= {\begin{bmatrix}{\nabla_{\boldsymbol{\dot{\Theta}}_1}}^\text{T} & {\nabla_{\boldsymbol{\dot{\Theta}}_2}}^\text{T} & \hdots  & {\nabla_{\boldsymbol{\dot{\Theta}}_N}}^\text{T} \end{bmatrix}}^\text{T} \end{aligned} \quad\quad\quad(59)

and use these to form the Euler-Lagrange equations for the system in column vector form

\begin{aligned}\frac{d}{dt} \nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} = \nabla_{\boldsymbol{\Theta}} \mathcal{L}\end{aligned} \quad\quad\quad(63)

For the canonical momenta we found the simple result

\begin{aligned}\nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} = \left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(64)

For the position gradient portion of the Euler-Lagrange equations 63 we found in block matrix form

\begin{aligned}\nabla_{\boldsymbol{\Theta}} \mathcal{L} =\sum_k{\begin{bmatrix}\begin{bmatrix}\mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\theta_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \\ \mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\phi_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \end{bmatrix}\end{bmatrix}}_r- g{\begin{bmatrix}\mu_r l_r S_{\theta_r}\begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_r\end{aligned} \quad\quad\quad(65)

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_c}}{\partial {\theta_c}}^\dagger }}\right\rangle&=\begin{bmatrix}-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} + S_{\theta_k} C_{\theta_a} &-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} \\ -S_{\phi_a - \phi_k} S_{\theta_a} S_{\theta_k} &C_{\phi_a - \phi_k} (1 + \delta_{k a}) S_{\theta_k} C_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(66)

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_a}}{\partial {\phi_a}}^\dagger }}\right\rangle&=\begin{bmatrix}-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} + S_{\theta_k} S_{\theta_a} &-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} \\ C_{\phi_a - \phi_k} C_{\theta_a} S_{\theta_k} &-S_{\phi_a - \phi_k} S_{\theta_k} S_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(67)

A set of Hamiltonian equations for the system could also be formed. However, this requires that one somehow restrict attention to the subset of phase space where the canonical momenta matrix \left\langle{{Q}}\right\rangle is non-singular, something not generally possible.

References

[1] Peeter Joot. {Spherical polar pendulum for one and multiple masses, and multivector Euler-Lagrange formulation.} [online]. http://sites.google.com/site/peeterjoot/math2009/sPolarMultiPendulum.pdf.

[2] Peeter Joot. Hamiltonian notes. [online]. http://sites.google.com/site/peeterjoot/math2009/hamiltonian.pdf.

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , , | Leave a Comment »

Spherical polar pendulum for one and multiple masses, and multivector Euler-Lagrange formulation.

Posted by peeterjoot on November 4, 2009

[Click here for a PDF of this post with nicer formatting]

Motivation

The planar multiple pendulum problem proved somewhat tractable in the Hamiltonian formulation. Generalizing this to allow for three dimensional motion is a logical next step. Here this is attempted, using a Geometric Algebra scalar plus bivector parametrization of the spherical position of each dangling mass relative to the position of what it is attached to, as in

\begin{aligned}\mathbf{z} = l \mathbf{e}_3 e^{j\theta}\end{aligned} \quad\quad\quad(1)

The exponential is essentially a unit quaternion, rotating the vector l \mathbf{e}_3 from the polar axis to its \theta,\phi angle dependent position. Two sided rotation operators are avoided here by requiring of the unit bivector j = \mathbf{e}_3 \wedge \mathbf{m}, where \mathbf{m} is a vector in the plane of rotation passing through the great circle from \mathbf{e}_3 through \mathbf{z}. Note that we are free to pick \mathbf{m} = \mathbf{e}_1 e^{\mathbf{e}_1 \mathbf{e}_2 \phi}, the unit vector in the x,y plane at angle \phi from the x-axis. When that is done j = \mathbf{e}_3 \mathbf{m} since these are perpendicular. Setting up the Lagrangian in terms of the bivector j instead of the scalar angle \phi will be attempted, since this is expected to have some elegance and will be a fun way to try the problem. This should also provide a concrete example of a multivector Lagrangian in a context much simpler than electromagnetic fields or quantum mechanics.

Note finally that a number of simplifying assumptions will be made. These include use of point masses, zero friction at the pivots and rigid nonspringy massless connecting rods between the masses.

Kinetic energy for the single pendulum case.

Let’s compute derivatives of the unit vector

\begin{aligned}\hat{\mathbf{z}} = \mathbf{e}_3 e^{j\theta} = e^{-j\theta} \mathbf{e}_3\end{aligned} \quad\quad\quad(2)

This can be done with both the left and right factorization of \mathbf{e}_3, and are respectively

\begin{aligned}\dot{\hat{\mathbf{z}}} &= \mathbf{e}_3 \left( j \dot{\theta} e^{j\theta} + \frac{dj}{dt} \sin\theta \right) \\ &=\mathbf{e}_3\begin{bmatrix}j e^{j\theta} & \sin\theta\end{bmatrix}\frac{d}{dt} \begin{bmatrix}\theta \\ j\end{bmatrix}\end{aligned} \quad\quad\quad(3)

\begin{aligned}\dot{\hat{\mathbf{z}}} &= \left( -j \dot{\theta} e^{-j\theta} - \frac{dj}{dt} \sin\theta \right) \mathbf{e}_3 \\ &=\left(\frac{d}{dt} \begin{bmatrix}\theta & -j\end{bmatrix}\right)\begin{bmatrix}-j e^{-j\theta} \\ \sin\theta\end{bmatrix}\mathbf{e}_3\end{aligned} \quad\quad\quad(5)

These derivatives have been grouped into a matrix factors that allow a natural seeming conjugate operation to be defined. That is for a matrix of multivector elements a_{ij}

\begin{aligned}A =\begin{bmatrix}a_{ij}\end{bmatrix}\end{aligned} \quad\quad\quad(7)

define a conjugate matrix, as the transpose of the reversed elements

\begin{aligned}A^\dagger \equiv\begin{bmatrix}\tilde{a}_{ji}\end{bmatrix}\end{aligned} \quad\quad\quad(8)

With this definition, plus two helpers

\begin{aligned}\boldsymbol{\Theta} &\equiv\begin{bmatrix}\theta \\  j\end{bmatrix} \\ R &= \begin{bmatrix}j e^{j\theta} & \sin\theta\end{bmatrix}\end{aligned} \quad\quad\quad(9)

Our velocity becomes

\begin{aligned}{\dot{\hat{\mathbf{z}}}}^2 &= {\dot{\boldsymbol{\Theta}}}^\dagger R^\dagger R \dot{\boldsymbol{\Theta}} \\ \end{aligned} \quad\quad\quad(11)

Explicitly, expanding the inner matrix product we can write

\begin{aligned}Q &\equiv R^\dagger R \\ &=\begin{bmatrix}1 & -j e^{-j\theta} \sin\theta \\ j e^{j\theta} \sin\theta & \sin^2 \theta\end{bmatrix} \\ {\dot{\hat{\mathbf{z}}}}^2 &= {\dot{\boldsymbol{\Theta}}}^\dagger Q \dot{\boldsymbol{\Theta}} \\ \end{aligned} \quad\quad\quad(13)

This is a slightly unholy mix of geometric and matrix algebra, but it works to compactly express the velocity dependence. Observe that this inner matrix Q = Q^\dagger, so it is Hermitian with this definition of conjugation.

Our Lagrangian for the one particle pendulum, measuring potential energy from the horizontal, is then

\begin{aligned}\mathcal{L} = {\dot{\boldsymbol{\Theta}}}^\dagger \frac{1}{{2}} m l^2 Q \dot{\boldsymbol{\Theta}} - m g l \cos\theta\end{aligned} \quad\quad\quad(17)

We also have a mechanism that should generalize fairly easily to the two or many pendulum cases too.

Before continuing, it should be noted that there were assumptions made in this energy expression derivation that are not reflected in the Lagrangian above. One of these was the unit bivector assumption for j, as well as a \mathbf{e}_3 containment assumption for the plane this represents (\mathbf{e}_3 \wedge j = 0). So for completeness we should probably add to the Lagrangian above some Lagrange multiplier enforced constraints

\begin{aligned}\lambda ( {j}^2 + 1 ) + \alpha \dot{c} ( \mathbf{e}_3 \wedge j )\end{aligned} \quad\quad\quad(18)

Here \alpha has got to be a trivector multiplier for the Lagrangian to be a scalar. Can we get away with omitting these constraints?

Two and multi particle case.

Having constructed a way that can express the velocity of a single spherical pendulum in a tidy way, we can move on to consider the multiple pendulum case as shown in figure (\ref{fig:sPolarMultiPendulum:pendulumDouble})

\begin{figure}[htp]
\centering
\includegraphics[totalheight=0.4\textheight]{pendulumDouble}
\caption{Double spherical pendulum.}
\end{figure}

There are two bivectors depicted, j_1 and j_2 representing oriented planes passing through great circles from a local polar axis (in direction \mathbf{e}_3). Let the positions of the respective masses be z_1 and z_2, where each mass is connected by a rigid massless rod of length l_1 and l_2 respectively. The masses are rotated by angles \theta_1 and \theta_2 in the planes j_1 and j_2 from an initial direction of \mathbf{e}_3. We can express the position of the second mass as

\begin{aligned}\mathbf{z}_2 = \mathbf{z}_1 + \mathbf{e}_3 e^{j_2 \theta_2}\end{aligned} \quad\quad\quad(19)

We can use the same factorization as previously used for the single mass case and write for our collection of angular velocities

\begin{aligned}\boldsymbol{\Theta} &\equiv\begin{bmatrix}\theta_1 \\  j_1 \\ \theta_2 \\  j_2 \\ \end{bmatrix} \end{aligned} \quad\quad\quad(20)

Using this the total Kinetic energy is

\begin{aligned}K &= {\dot{\boldsymbol{\Theta}}}^\dagger \frac{1}{{2}} Q \dot{\boldsymbol{\Theta}} \\ R_1 &=\begin{bmatrix}l_1 j_1 e^{j_1\theta_1} & l_1 \sin\theta_1 &0 &0\end{bmatrix} \\ R_2 &=\begin{bmatrix} l_1 j_1 e^{j_1\theta_1} &  l_1 \sin\theta_1 & l_2 j_2 e^{j_2\theta_2} &  l_2 \sin\theta_2 \end{bmatrix} \\ Q &=m_1 {R_1}^\dagger R_1+m_2 {R_2}^\dagger R_2\end{aligned} \quad\quad\quad(21)

Notation has been switched slightly from the single mass case, and the m l^2 factor is now incorporated directly into Q for convenience.

An expansion of Q is essentially one of block matrix multiplication (where we already have to be careful with order of operations as we do for the geometric product elements themselves). We have something like

\begin{aligned}R_1 &= \begin{bmatrix}A_1 & 0\end{bmatrix} \\ {R_1}^\dagger &= \begin{bmatrix}{A_1}^\dagger \\ 0\end{bmatrix} \\ R_2 &= \begin{bmatrix}A_1 & A_2\end{bmatrix} \\ {R_2}^\dagger &= \begin{bmatrix}{A_1}^\dagger \\ {A_2}^\dagger \\ \end{bmatrix} \end{aligned} \quad\quad\quad(25)

We have for the products

\begin{aligned}{R_1}^\dagger R_1 &=\begin{bmatrix}{A_1}^\dagger A_1 & 0 \\ 0 & 0\end{bmatrix} \\ {R_2}^\dagger R_2 &=\begin{bmatrix}{A_1}^\dagger A_1 ^\dagger A_2 \\ {A_2}^\dagger A_1 ^\dagger A_2 \\ \end{bmatrix} \end{aligned} \quad\quad\quad(29)

So our quadratic form matrix is

\begin{aligned}Q =\begin{bmatrix}(m_1 + m_2) {A_1}^\dagger A_1 & m_2 {A_1}^\dagger A_2 \\ m_2 {A_2}^\dagger A_1 & m_2 {A_2}^\dagger A_2 \\ \end{bmatrix} \end{aligned} \quad\quad\quad(31)

In general for the multiple particle case this is

\begin{aligned}Q &={\begin{bmatrix}\left(\sum_{k=\max(r,c)}^N m_k \right){A_r}^\dagger A_c\end{bmatrix}}_{rc} \\ A_k &= l_k \begin{bmatrix}j_k e^{j_k \theta_k} & \sin\theta_k\end{bmatrix}\end{aligned} \quad\quad\quad(32)

Expanded explicitly this is

\begin{aligned}Q &={\begin{bmatrix}\left(\sum_{k=\max(r,c)}^N m_k \right) l_r l_c\begin{bmatrix}- j_r e^{-j_r \theta_r} j_c e^{j_c\theta_c} & - j_r e^{-j_r \theta_r} \sin\theta_c \\ j_c e^{j_c \theta_c} \sin\theta_r & \sin\theta_r \sin\theta_c\end{bmatrix}\end{bmatrix}}_{rc}\end{aligned} \quad\quad\quad(34)

Observe that the order of products in this expansion is specifically ordered, since the j_c and j_r bivectors do not necessarily commute.

The potential in the multiple particle case is also fairly straightforward to compute. Consider the two particle case to illustrate the pattern. Using the lowest point as the potential reference we have

\begin{aligned}\phi' = g \sum m_i h_i= m_1 r_1 (1 + \cos\theta_1) + m_2 \left( r_1(1 + \cos\theta_1) + r_2( 1 + \cos\theta_2) \right)\end{aligned} \quad\quad\quad(35)

Alternately, dropping all the constant terms (using the horizon as the potential reference) we have for the general case

\begin{aligned}\phi = g \sum_i \left( \sum_{k=i}^N m_k \right) r_i \cos\theta_i\end{aligned} \quad\quad\quad(36)

Lets collect all the bits and pieces now for the multiple pendulum Lagrangian now, repeating for coherency, and introducing a tiny bit more notation (mass sums and block angular velocity matrices) for convenience

\begin{aligned}\mathcal{L} &= K - \phi + \sum_k \lambda_k ({j_k}^2 + 1 ) \\ \boldsymbol{\Theta}_i &=\begin{bmatrix}\theta_i \\ j_i\end{bmatrix} \\ \boldsymbol{\Theta} &={\begin{bmatrix}\boldsymbol{\Theta}_r\end{bmatrix}}_{r} \\ \mu_i &=\sum_{k=i}^N m_k \\ Q &={\begin{bmatrix}\mu_{\max(r,c)}l_r l_c\begin{bmatrix}- j_r e^{-j_r \theta_r} j_c e^{j_c\theta_c} & - j_r e^{-j_r \theta_r} \sin\theta_c \\ j_c e^{j_c \theta_c} \sin\theta_r & \sin\theta_r \sin\theta_c\end{bmatrix}\end{bmatrix}}_{rc} \\ K &=\frac{1}{{2}} {\dot{\boldsymbol{\Theta}}}^\dagger Q \dot{\boldsymbol{\Theta}} \\ \phi &=g \sum_{i=1}^N \mu_i r_i \cos\theta_i\end{aligned} \quad\quad\quad(37)

Building up to the Multivector Euler-Lagrange equations.

Rather than diving right into any abstract math, lets consider a few specific examples of multivector Lagrangians to develop some comfort with non-scalar Lagrangian functions. The generalized “coordinates” in the Lagrangian for the spherical pendulum problem being considered include include bivectors, and we don’t know how to evaluate the Euler Lagrange equations for anything but scalar generalized coordinates.

The goal is to develop Euler-Lagrange equations that can handle a Lagrangian for these more general functions, but until we figure out how to do that, we can at least tackle the problem using variation of the action around a stationary solution.

A first example to build intuition.

To help understand what we have to do, lets consider the very simplest bivector parametrized Lagrangian, that of a spherical pendulum constrained (perhaps by a track or a surface) of moving only in a ring. This is shown pictorially in figure (\ref{fig:sPolarMultiPendulum:pendulumPolarCircular})

\begin{figure}[htp]
\centering
\includegraphics[totalheight=0.4\textheight]{pendulumPolarCircular}
\caption{Circularly constrained spherical pendulum.}
\end{figure}

The potential energy is fixed on this surface, so our Lagrangian is purely kinetic

\begin{aligned}\mathcal{L} = -\frac{1}{{2}} m l^2 \sin^2 \theta_0 {j'}^2\end{aligned} \quad\quad\quad(44)

We’d like to directly vary the action for the Lagrangian around a stationary solution

\begin{aligned}S = \int \mathcal{L} dt\end{aligned} \quad\quad\quad(45)

Introducing a bivector variation j = \bar{j} + \epsilon, and writing I = m l^2 \sin^2\theta_0 we have

\begin{aligned}\bar{S} + \delta S &= -\frac{1}{{2}} I \int (\bar{j}' + \epsilon')^2 dt \\ &= -\frac{1}{{2}} I \int (\bar{j}')^2 dt-\frac{1}{{2}} I \int \left( {\bar{j}}' \epsilon' + \epsilon' {\bar{j}}' \right) dt-\frac{1}{{2}} I \int (\epsilon')^2 dt\end{aligned}

The first term is just \bar{S}. Setting the variation \delta S = 0 (neglecting the quadratic \epsilon' term) and integrating by parts we have

\begin{aligned}0 &= \delta S \\ &= \int dt \left( \frac{d}{dt}\left( \frac{1}{{2}} I {\bar{j}}' \right) \epsilon + \epsilon \frac{d}{dt}\left( \frac{1}{{2}} I {\bar{j}}' \right) \right) \\ &= \int \left( \frac{d}{dt} I {\bar{j}}' \right) \dot{c} \epsilon dt \end{aligned}

With \epsilon arbitrary it appears that the solutions of the variation problem are given by

\begin{aligned}\frac{d}{dt}\left( I j' \right) = 0\end{aligned} \quad\quad\quad(46)

Has anything been lost by requiring that this is zero identically when all we had originally was the dot product of this with the variation bivector was zero? If this zero describes the solution set, then we should be able to integrate this, yielding a constant. However, following this to its logical conclusion leads to inconsistency. Integrating 46, producing a bivector constant \kappa we have

\begin{aligned}I j' = \kappa\end{aligned} \quad\quad\quad(47)

The original constraint on j was the bivector spanning the plane from the polar axis to the azimuthal unit vector in the x,y plane at angle a \phi. The \phi dependence wasn’t specified, and left encoded in the bivector representation. Without words that was

\begin{aligned}j = \mathbf{e}_3 \mathbf{e}_1 e^{\mathbf{e}_{12} \phi}\end{aligned} \quad\quad\quad(48)

Inserting back into 47 this gives

\begin{aligned}\frac{d}{dt} e^{\mathbf{e}_{12} \phi} = \frac{\mathbf{e}_1 \mathbf{e}_3 \kappa}{I}\end{aligned} \quad\quad\quad(49)

One more integration is trivially possible yielding

\begin{aligned}e^{\mathbf{e}_{12} \phi(t)} = e^{\mathbf{e}_{12} \phi_0} + \frac{\mathbf{e}_1 \mathbf{e}_3 \kappa t}{I} \end{aligned} \quad\quad\quad(50)

There are two possibilities for the grades of \kappa, one is \kappa \propto \mathbf{e}_3 \mathbf{e}_1, so that the time dependence is a scalar, and the other is \kappa \propto \mathbf{e}_3 \mathbf{e}_2 so that we have an x,y plane bivector component. Allowing for both, and separating into real and imaginary parts we have

\begin{aligned}\cos\phi - \cos\phi_0 &= \frac{1}{{I}} \kappa_r t \\ \sin\phi - \sin\phi_0 &= \frac{1}{{I}} \kappa_i t\end{aligned} \quad\quad\quad(51)

This isn’t anything close to the solution that is expected if we were to start with the scalar Lagrangian for the same problem. That Lagrangian is

\begin{aligned}\mathcal{L} = \frac{1}{{2}} I \dot{\phi}^2\end{aligned} \quad\quad\quad(53)

and the Euler Lagrange equations give us, for a scalar constant \mu

\begin{aligned}I \dot{\phi} = \mu\end{aligned} \quad\quad\quad(54)

So the solution for \phi is just

\begin{aligned}\phi - \phi_0 = \frac{\mu t}{I}\end{aligned} \quad\quad\quad(55)

In the absence of friction this makes sense. Our angle increases monotonically, and we have circular motion with constant angular velocity. Compare this to the messier 51 derived from the bivector “solution” to the variational problem. There is definitely something wrong with the variational approach (or conclusion) when the variable is a bivector.

Does it help to include the constraints explicitly? The bivector parametrized Lagrangian with the unit bivector multiplier constraint for this system is

\begin{aligned}\mathcal{L} = -\frac{1}{{2}} I (j')^2 + \lambda (j^2 + 1)\end{aligned} \quad\quad\quad(56)

Doing the variation we get a set of two equations

\begin{aligned}- I j'' &= 2 \lambda j \\ j^2 &= -1\end{aligned} \quad\quad\quad(57)

Once again re-inserting j = \mathbf{e}_{31} e^{\mathbf{e}_{12} \phi} one gets

\begin{aligned}e^{i\phi} = A \sin(\sqrt{I/2\lambda} t) + B \cos(\sqrt{I/2\lambda} t)\end{aligned} \quad\quad\quad(59)

Setting \lambda = 1, A = i, B = 1, we have

\begin{aligned}\phi \propto t\end{aligned} \quad\quad\quad(60)

This is now consistent with the scalar Lagrangian treatment. In both cases we have the angle linearly proportional to time, and have a single parameter to adjust the angular velocity (used \mu in the scalar treatment above and have \lambda this time (although it was set to one here for convenience). The conclusion has to be that the requirement to include the multipliers for the constraints is absolutely necessary to get the right physics out of this bivector parametrized Lagrangian. The good news is that we can get the right physics out of a non-scalar treatment. What is a bit disturbing is that it was fairly difficult in the end to get from the results of the variation to a solution, and that this will not likely get any easier with a more complex system.

A second example.

The scalar expansion 82 of the kinetic term in our spherical polar Lagrangian shows a couple of other specific multivector functions we can consider the variation of.

We have considered an specific example of a Lagrangian function \mathcal{L} = f(j' \dot{c} j') without (yet) utilizing or deriving a multivector form of the Euler-Lagrange equations. Let’s consider a few more specific simple examples motivated by the expansion of the kinetic energy in 82. Lets start with

\begin{aligned}\mathcal{L} = \theta' a \dot{c} j'\end{aligned} \quad\quad\quad(61)

where a is a bivector constant, \theta a scalar, and j a bivector variable for the system. Expanding this around stationary points j = \bar{j} + \epsilon, and \theta = \bar{\theta} + \phi we have to first order

\begin{aligned}\int \delta \mathcal{L} dt &\approx \int \left( \phi' a \dot{c} \bar{j}' + \bar{\theta}' a \dot{c} \epsilon' \right) dt \\ &= \int \left( \phi \frac{d}{dt}( - a \dot{c} \bar{j}' ) + \frac{d}{dt}( - \bar{\theta}' a ) \dot{c} \epsilon \right) dt \\ \end{aligned}

In this simple system where both scalar variable \theta and bivector variable j are cyclic “coordinates”, we have solutions to the variational problem given by the pair of equations

\begin{aligned}- a \dot{c} j' &= \text{constant} \\ - \theta' a &= \text{constant}\end{aligned} \quad\quad\quad(62)

As to what, if anything, this particular Lagrangian (a fragment picked out of a real Kinetic energy function) represents physically that doesn’t matter so much, since the point of this example was to build up to treating the more general case where we are representing something physical.

A third example.

If we throw a small additional wrench into the problem above and allow a to be one of the variables our system is dependent on.

\begin{aligned}\mathcal{L}(\theta, \theta', a, a', j, j') = \theta' a \dot{c} j'\end{aligned} \quad\quad\quad(64)

It is less obvious how to do a first order Taylor expansion of this Lagrangian required for the variation around the stationary solution. If all the coordinates in the Lagrangian were scalars as in

\begin{aligned}\mathcal{L}(\theta, \theta', a, a', j, j') = \theta' a j'\end{aligned} \quad\quad\quad(65)

(dropping dot products now that these are all scalar variables), then the variation requires nothing abnormal. Suppose our stationary point has coordinates \bar{\theta}, \bar{a}, \bar{j}, with variations \alpha, \beta, \gamma that vanish at the extremes of the integral as usual.

With scalar variables our path is clear, and we just form

\begin{aligned}\delta S &= \int \delta \mathcal{L} \\ &= \int \left( \alpha \frac{\partial {\mathcal{L}}}{\partial {\theta}}+\alpha' \frac{\partial {\mathcal{L}}}{\partial {\theta'}}+\beta \frac{\partial {\mathcal{L}}}{\partial {a}}+\beta' \frac{\partial {\mathcal{L}}}{\partial {a'}}+\gamma \frac{\partial {\mathcal{L}}}{\partial {j}}+\gamma' \frac{\partial {\mathcal{L}}}{\partial {j'}}\right) \\ \end{aligned}

Here is it implied that all the partials are evaluated at the stationary points \bar{\theta}, \bar{a}, \bar{j}. Doing the integration by parts we have

\begin{aligned}\delta S&=\int \left( \alpha \left( \frac{\partial {\mathcal{L}}}{\partial {\theta}} - \frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {\theta'}} \right) +\beta \left( \frac{\partial {\mathcal{L}}}{\partial {a}} - \frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {a'}} \right) +\gamma \left( \frac{\partial {\mathcal{L}}}{\partial {j}} - \frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {j'}} \right) \right)\end{aligned}

Setting \delta S = 0 this produces the Euler-Lagrange equations for the system. For our specific Lagrangian this procedure gives

\begin{aligned}\int \delta \mathcal{L} &=\int \alpha' \bar{a} \bar{j}' + \bar{\theta}' \beta \bar{j}' + \bar{\theta}' \bar{a} \gamma' \\ &=\int \alpha \frac{d}{dt}\left( - \bar{a} \bar{j}' \right) + \bar{\theta}' \beta \bar{j}' + \frac{d}{dt}\left( -\bar{\theta}' \bar{a}\right) \gamma \\ \end{aligned}

Each time we do the first order expansion we are varying just one of the coordinates. It seems likely that the correct answer for the multivariable case will be

\begin{aligned}\int \delta \mathcal{L} &=\int \alpha \frac{d}{dt}\left( - \bar{a} \dot{c} \bar{j}' \right) + \beta \dot{c} (\bar{\theta}' \bar{j}') + \frac{d}{dt}\left( -\bar{\theta}' \bar{a}\right) \dot{c} \gamma \\ \end{aligned}

Thus the variational problem produces solutions (the coupled equations we still have to solve) of

\begin{aligned}- a \dot{c} j' &= \text{constant} \\ \theta' j' &= 0 \\ -\theta' a &= \text{constant}\end{aligned} \quad\quad\quad(66)

Multivector Euler-Lagrange equations.

Derivation.

Having considered a few specific Lagrangians dependent on multivector generalized “coordinates”, some basic comfort that it is at least possible has been gained. Let’s now move on to the general case. It is sufficient to consider Lagrangian functions that only depend on blades, since we can write any more general multivector as a sum of such blades

Write

\begin{aligned}\mathcal{L} = \mathcal{L}(X_k, X_k')\end{aligned} \quad\quad\quad(69)

where X_k \in \bigwedge^{m_k} is a blade of grade m_k. We can do a regular old chain rule expansion of this Lagrangian if coordinates are (temporarily) introduced for each of the blades X_k. For example, if X is a bivector in \mathbb{R}^{3}, we can write

\begin{aligned}X = \mathbf{e}_{12} a^{12} +\mathbf{e}_{13} a^{13} +\mathbf{e}_{23} a^{23} \end{aligned}

and then we can do a first order Taylor series expansion of \mathcal{L}(\bar{X} + \mathbf{\epsilon}) in terms of these coordinates. With \mathbf{\epsilon} = \mathbf{e}_{12}\epsilon^{12} + \mathbf{e}_{13}\epsilon^{13} + \mathbf{e}_{23}\epsilon^{23}, for this specific single variable Lagrangian we have

\begin{aligned}\delta \mathcal{L} &= \mathcal{L}(\bar{X} + \mathbf{\epsilon}) - \mathcal{L}(\bar{X}) \\ & \approx \epsilon^{12} {\left.{{\frac{\partial {\mathcal{L}}}{\partial {a^{12}}}}}\right\vert}_{{\bar{X}}}+ \epsilon^{23} {\left.{{\frac{\partial {\mathcal{L}}}{\partial {a^{23}}}}}\right\vert}_{{\bar{X}}}+ \epsilon^{13} {\left.{{\frac{\partial {\mathcal{L}}}{\partial {a^{13}}}}}\right\vert}_{{\bar{X}}}\end{aligned}

If we write the coordinate expansion of our blades X_k as

\begin{aligned}X_k = \sum_{b_k} {x_k}^{b_k} \sigma_{b_k}\end{aligned}

where b_k is some set of indexes like \{12, 23, 13\} from the \mathbb{R}^{3} bivector example, then our chain rule expansion of the Lagrangian to first order about a stationary point becomes

\begin{aligned}\delta \mathcal{L} &=\sum_{b_k} {\epsilon_k}^{b_k} {\left.{{ \frac{\partial {\mathcal{L}}}{\partial {{x_k}^{b_k}}} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}}+\sum_{b_k} ({\epsilon_k}^{b_k})' {\left.{{ \frac{\partial {\mathcal{L}}}{\partial {({x_k}^{b_k})'}} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}}\end{aligned} \quad\quad\quad(70)

Trying to write this beastie down with abstract indexing variables is a bit ugly. There’s something to be said for not trying to be general since writing this for the single variable bivector example was much clearer. However, having written down the ugly beastie, it can now be cleaned up nicely by introducing position and velocity gradient operators for each of the grades

\begin{aligned}\nabla_{X_k} &\equiv \sigma^{b_k} \frac{\partial }{\partial {{x_k}^{b_k}}} \\ \nabla_{X_k'} &\equiv \sigma^{b_k} \frac{\partial }{\partial {({x_k}^{b_k})'}}\end{aligned} \quad\quad\quad(71)

Utilizing this (assumed) orthonormal basis pair \sigma^{b_k} \dot{c} \sigma_{b_j} = {\delta^k}_j we have

\begin{aligned}\delta S &= \int \delta \mathcal{L} \\ &=\int \sum_k\mathbf{\epsilon}_k \dot{c} {\left.{{ \nabla_{X_k} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}}+{\mathbf{\epsilon}_k}' \dot{c} {\left.{{ \nabla_{X_k'} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}} \\ &=\int \sum_k\mathbf{\epsilon}_k \dot{c} \left( {\left.{{ \nabla_{X_k} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}} - \frac{d}{dt} {\left.{{ \nabla_{X_k'} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}} \right) \\ \end{aligned}

Setting \delta S = 0 we have a set of equations for each of the blade variables, and the Euler-Lagrange equations take the form

\begin{aligned}\nabla_{X_k} \mathcal{L} = \frac{d}{dt} \nabla_{X_k'} \mathcal{L} \end{aligned} \quad\quad\quad(73)

Work some examples.

For the pendulum problem we are really only interested in applying 73 for scalar and bivector variables. Let’s revisit some of the example Lagrangians, functions of bivectors, already considered and see how we can evaluate the position and velocity gradients. This generalized Euler-Lagrange formulation doesn’t do much good if we can’t work with it.

Let’s take the bivector gradient of a couple bivector functions (all the ones considered previously leading up to the multivector Euler-Lagrange equations).

\begin{aligned}f(B) &= A \dot{c} B \\ g(B) &= B^2 \end{aligned}

It is actually sufficient (in a less than 4D space) to consider only the first,

\begin{aligned}\nabla_B f &= \nabla_B \left\langle{{ A B }}\right\rangle \\ &= \sum_{a<b} \sigma^{ab} \frac{\partial }{\partial {b^{ab}}} \left\langle{{ A b^{a' b'} \sigma_{a' b'} }}\right\rangle \\ &= \sum_{a<b} \sigma^{ab} \left\langle{{ A \sigma_{ab} }}\right\rangle \\ &= \sum_{a<b} \sigma^{ab} A \dot{c} \sigma_{ab} \\ &= A\end{aligned}

For g we then have

\begin{aligned}\nabla_B g &= \nabla_B B B \\ &= \nabla_B \left\langle{{ B B}}\right\rangle \\ &= 2 B\end{aligned}

This is now enough to evaluate our bivector parameterized Lagrangian from the first example, 56, reproducing the result obtained by direct variation (as in Feynman’s lectures)

\begin{aligned}\frac{d}{dt} \nabla_{j'} \mathcal{L} &= \nabla_{j} \mathcal{L} \\ - I j'' &= \\         &= 2 \lambda j\end{aligned}

By inspection we can see that this works for the remaining two motivational examples too.

Evaluating the pendulum Euler-Lagrange equations (scalar, bivector parameterized KE).

We now have the tools to evaluate the Euler-Lagrange equations for the pendulum problem 37. Since all the d{\theta}_a/dt and dj_a/dt dependence is in the kinetic energy we can start with that. For

\begin{aligned}K = \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(74)

We want each of \frac{\partial {K}}{\partial {\dot{\theta}_a}} and \nabla_{j_a'} K. Of these the \dot{\theta} derivative is easier, so lets start with that

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}}&=\frac{1}{{2}} \left( \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}}^\dagger \right) Q \dot{\boldsymbol{\Theta}}+\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q \left( \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}} \right)  \\ \end{aligned}

Each of these are scalars and thus equal their Hermitian conjuagate. This leaves us with just one term doubled

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}}&=\dot{\boldsymbol{\Theta}}^\dagger Q \left( \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}} \right) \\ \end{aligned}

A final evaluation of the derivatives, in block matrix form over rows r, gives us

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}}=\dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_{r}\end{aligned} \quad\quad\quad(75)

For the bivector velocity gradients, we can do the same,

\begin{aligned}\nabla_{j_a'} K&=\frac{1}{{2}} \left( \stackrel{ \rightarrow }\nabla_{j_a'} \dot{\boldsymbol{\Theta}}^\dagger \right) Q \dot{\boldsymbol{\Theta}}+\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q \left( \dot{\boldsymbol{\Theta}} \stackrel{ \leftarrow }\nabla_{j_a'} \right)  \\ \end{aligned}

Only the j_a' parts of \dot{\boldsymbol{\Theta}} contribute to the velocity gradients, so we need to know how to evaluate a bivector gradient (as opposed to the square which was done earlier). Expanding in coordinates

\begin{aligned}\nabla_j j &= \frac{1}{{2}} \sum_{a b} \sigma^{ab} \frac{\partial }{\partial {j^{ab}}} \frac{1}{{2}} \sum_{a'b'} \sigma_{a' b'} j^{a' b'} \\ &= \frac{1}{{2}} \sum_{a b} \sigma^{ab} \sigma_{a b} \\ &= 3\end{aligned}

(three here because we are in \mathbb{R}^{3}, and our bivectors have three independent coordinates).

\begin{aligned}\nabla_{j_a'} K&=\frac{3}{2} {\begin{bmatrix}\delta_{ac}\begin{bmatrix}0 & -1 \end{bmatrix}\end{bmatrix}}_{c}Q \dot{\boldsymbol{\Theta}}+\frac{3}{2} \dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}0 \\ 1 \end{bmatrix}\end{bmatrix}}_{r}\end{aligned}

These terms are both one-by-one bivector matrixes, and negate with Hermitian conjugation, so we can again double up and eliminate one of the two terms, producing

\begin{aligned}\nabla_{j_a'} K&= 3 \dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}0 \\ 1 \end{bmatrix}\end{bmatrix}}_{r}\end{aligned} \quad\quad\quad(76)

Completing the Euler-Lagrange equation evaluation we have for the \theta_a coordinates

\begin{aligned}\frac{d}{dt} \left(\dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_{r}\right)= \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger \frac{\partial {Q}}{\partial {\theta_a}} \dot{\boldsymbol{\Theta}} - \frac{\partial {\phi}}{\partial {\theta_a}}\end{aligned} \quad\quad\quad(77)

plus one equation for each of the bivectors j_a

\begin{aligned} 3 \frac{d}{dt} \left(\dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}0 \\ 1 \end{bmatrix}\end{bmatrix}}_{r}\right)=\frac{1}{{2}} \sum_{e < f} \mathbf{e}_f \mathbf{e}_e \dot{\boldsymbol{\Theta}}^\dagger \frac{\partial {Q}}{\partial {{j_a}^{e f}}} \dot{\boldsymbol{\Theta}} + 2 \lambda_a j_a\end{aligned} \quad\quad\quad(78)

Because the bivector \mathbf{e}_f \mathbf{e}_e does not (nessessarily) commute with bivectors j_a' that are part of \dot{\boldsymbol{\Theta}} there doesn’t look like there’s much hope of assembling the left hand j_a gradients into a nice non-coordinate form. Additionally, looking back at this approach, and the troubles with producing meaningful equations of motion even in the constrained single pendulum case, it appears that a bivector parameterization of the kinetic energy is genally not a good approach. This is an unfortunate conclusion after going through all the effort to develop intuition that led to the multivector Euler-Lagrange formulation for this problem.

Oh well.

The Hermitian formulation used here should still provide a nice compact way of expressing the kinetic energy, even if we work with plain old scalar spherical polar angles \theta, and \phi. Will try this another day, since adding that to these notes will only make them that much more intractable.

Appendix calculation. A verification that the Kinetic matrix product is a real scalar.

In the kinetic term of the rather scary looking Lagrangian of 37 we have what should be a real scalar, but it is not obvious that this is the case. As a validation that nothing very bad went wrong, it seems worthwhile to do a check that this is in fact the case, expanding this out explicitly in gory detail.

One way to try this expansion is utilizing a block matrix summing over the diagonal and paired skew terms separately. That is

\begin{aligned}K &=\frac{1}{{2}}\sum_{k=1}^N\mu_k{l_k}^2{\dot{\boldsymbol{\Theta}}_k}^\dagger\begin{bmatrix}1 & - j_k e^{-j_k \theta_k} \sin\theta_k \\ j_k e^{j_k \theta_k} \sin\theta_k & \sin^2\theta_k \end{bmatrix}\dot{\boldsymbol{\Theta}}_k \\ &+\frac{1}{{2}}\sum_{a<b}\mu_bl_a l_b\left({\dot{\boldsymbol{\Theta}}_a}^\dagger\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} & - j_a e^{-j_a \theta_a} \sin\theta_b \\ j_b e^{j_b \theta_b} \sin\theta_a & \sin\theta_a \sin\theta_b\end{bmatrix}\dot{\boldsymbol{\Theta}}_b +{\dot{\boldsymbol{\Theta}}_b}^\dagger\begin{bmatrix}- j_b e^{-j_b \theta_b} j_a e^{j_a\theta_a} & - j_b e^{-j_b \theta_b} \sin\theta_a \\ j_a e^{j_a \theta_a} \sin\theta_b & \sin\theta_b \sin\theta_a\end{bmatrix}\dot{\boldsymbol{\Theta}}_a\right)\end{aligned}

Examining the diagonal matrix products and expanding one of these (dropping the k suffix for tidiness), we have

\begin{aligned}{\dot{\boldsymbol{\Theta}}}^\dagger\begin{bmatrix}1 & - j e^{-j \theta} \sin\theta \\ j e^{j \theta} \sin\theta & \sin^2\theta \end{bmatrix}\dot{\boldsymbol{\Theta}} &=\dot{\theta}^2 -\sin^2\theta \left(\frac{dj}{dt}\right)^2 - \dot{\theta} \sin\theta\cos\theta \left( j \frac{dj}{dt} + \frac{dj}{dt} j \right)\end{aligned} \quad\quad\quad(79)

Since we are working in 3D this symmetric sum is twice the dot product of the bivector j with its derivative, which means that it is a scalar. We expect this to be zero though, and can observe that this is the case since j was by definition a unit bivector

\begin{aligned}j \frac{dj}{dt} + \frac{dj}{dt} j = \frac{d j^2}{dt} = \frac{d (-1)}{dt} = 0\end{aligned} \quad\quad\quad(80)

(thus j and its derivative represent orthogonal oriented planes rather like \hat{\mathbf{r}} and its derivative are orthogonal on a circle or sphere). The implication is that the diagonal subset of the kinetic energy expansion contains just

\begin{aligned}\frac{1}{{2}}\sum_{k=1}^N\mu_k{l_k}^2\left(\left(\frac{d\theta_k}{dt}\right)^2 -\sin^2\theta_k \left(\frac{d j_k}{dt}\right)^2  \right)\end{aligned} \quad\quad\quad(81)

If we are going to have any complex interaction terms then they will have to come from the off diagonal products. Expanding the first of these

\begin{aligned}{\dot{\boldsymbol{\Theta}}_a}^\dagger&\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} & - j_a e^{-j_a \theta_a} \sin\theta_b \\ j_b e^{j_b \theta_b} \sin\theta_a & \sin\theta_a \sin\theta_b\end{bmatrix}\dot{\boldsymbol{\Theta}}_b \\ &=\begin{bmatrix}\theta_a' & -j_a'\end{bmatrix}\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} & - j_a e^{-j_a \theta_a} \sin\theta_b \\ j_b e^{j_b \theta_b} \sin\theta_a & \sin\theta_a \sin\theta_b\end{bmatrix}\begin{bmatrix}\theta_b' \\ j_b'\end{bmatrix} \\ &=\begin{bmatrix}\theta_a' & -j_a'\end{bmatrix}\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} \theta_b' - j_a e^{-j_a \theta_a} \sin\theta_b j_b' \\ j_b e^{j_b \theta_b} \sin\theta_a \theta_b' + \sin\theta_a \sin\theta_b j_b'\end{bmatrix} \\ &=- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} \theta_a' \theta_b' - j_a e^{-j_a \theta_a} \sin\theta_b \theta_a' j_b' -j_a' j_b e^{j_b \theta_b} \sin\theta_a \theta_b' - \sin\theta_a \sin\theta_b j_a' j_b'\end{aligned}

Adding to this the a \leftrightarrow b exchanged product and rearranging yields

\begin{aligned}- \theta_a' \theta_b' (j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} + j_b e^{-j_b \theta_b} j_a e^{j_a\theta_a} ) - \sin\theta_a \sin\theta_b (j_a' j_b' + j_b' j_a') \\ - \sin\theta_b \theta_a' ( j_a e^{-j_a \theta_a} j_b' + j_b' j_a e^{j_a \theta_a} )- \sin\theta_a \theta_b' ( j_b e^{-j_b \theta_b} j_a' + j_a' j_b e^{j_b \theta_b} )\end{aligned}

Each of these multivector sums within the brackets is of the form A + \tilde{A}, a multivector plus its reverse. There can therefore be no bivector or trivector terms since they negate on reversal, and the resulting sum can have only scalar and vector grades. Of these the second term, j_a' j_b' + j_b' j_a' = 2 j_a' \dot{c} j_b' so it is unarguably a scalar as expected, but additional arguments are required to show this of the other three terms. Of these remaining three, the last two have the same form. Examining the first of these two

\begin{aligned}j_b e^{-j_b \theta_b} j_a' + j_a' j_b e^{j_b \theta_b} &=(j_b \cos\theta_b + \sin\theta_b) j_a' + j_a' (j_b \cos\theta_b - \sin\theta_b) \\ &=\cos\theta_b (j_b j_a' + j_a' j_b ) \\ &=2 \cos\theta_b (j_b \dot{c} j_a')\end{aligned}

The first term actually expands in a similarly straightforward way. The vector terms all cancel, and one is left with just

\begin{aligned}j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} + j_b e^{-j_b \theta_b} j_a e^{j_a\theta_a} = 2 \cos\theta_a\cos\theta_b j_a \dot{c} j_b - 2 \sin\theta_a \sin\theta_b \end{aligned}

Writing S_{\theta_k} = \sin\theta_k and C_{\theta_k} = \cos\theta_k (for compactness to fit things all in since the expanded result is messy), all of this KE terms can be assembled into the following explicit scalar expansion

\begin{aligned}\begin{array}{l l}K &=\frac{1}{{2}}\sum_{k=1}^N\mu_k{l_k}^2\left( (\theta_k')^2 - (S_{\theta_k} j_k')^2 \right) \\ &-\sum_{a<b}\mu_bl_a l_b\left(\theta_a'  \theta_b'  (C_{\theta_a}C_{\theta_b} (j_a \dot{c} j_b) - S_{\theta_a} S_{\theta_b} )+S_{\theta_a} S_{\theta_b} (j_a'  \dot{c} j_b' )+S_{\theta_b} C_{\theta_b} \theta_a'  (j_b \dot{c} j_a' )+S_{\theta_a} C_{\theta_a} \theta_b'  (j_a \dot{c} j_b' )\right)\end{array}\end{aligned} \quad\quad\quad(82)

Noting that all the bivector, bivector dot products are scalars really completes the desired verification. We can however, be more explicit using j_a = \mathbf{e}_{31} e^{\mathbf{e}_{12}\phi_a}, which gives after a bit of manipulation

\begin{aligned}j_a \dot{c} j_b &= - \cos(\phi_a - \phi_b) \\ (j_a')^2 &= -(\phi_a')^2 \\ j_a \dot{c} j_b' &= \phi_b' \sin(\phi_a - \phi_b) \\ j_a' \dot{c} j_b' &= -\phi_a' \phi_b' \cos(\phi_a - \phi_b)\end{aligned} \quad\quad\quad(83)

These can then be inserted back into 82 in a straightforward fashion, but it is not any more illuminating to do so.

Posted in Math and Physics Learning. | Tagged: , , , , , , , , | Leave a Comment »

spherical polar line element (continued).

Posted by peeterjoot on October 25, 2009

[Click here for a PDF of this sequence of posts with nicer formatting]

REPOSTING WITH ADDITIONAL NOTES.

Line element.

The line element for the particle moving on a spherical surface can be calculated by calculating the derivative of the spherical polar unit vector \hat{\mathbf{r}}

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \frac{d\phi}{dt} +\frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \frac{d\theta}{dt}\end{aligned} \quad\quad\quad(17)

than taking the magnitude of this vector. We can start either in coordinate form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta \cos\phi + \mathbf{e}_2 \sin\theta \sin\phi\end{aligned} \quad\quad\quad(18)

or, instead do it the fun way, first grouping this into a complex exponential form. This factorization was done above, but starting over allows this to be done a bit more effectively for this particular problem. As above, writing i = \mathbf{e}_1 \mathbf{e}_2, the first factorization is

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \end{aligned} \quad\quad\quad(19)

The unit vector \boldsymbol{\rho} = \mathbf{e}_1 e^{i\phi} lies in the x,y plane perpendicular to \mathbf{e}_3, so we can form the unit bivector \mathbf{e}_3\boldsymbol{\rho} and further factor the unit vector terms into a \cos + i \sin form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \boldsymbol{\rho} \sin\theta) \\ \end{aligned}

This allows the spherical polar unit vector to be expressed in complex exponential form (really a vector-quaternion product)

\begin{aligned}\hat{\mathbf{r}} = \mathbf{e}_3 e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} = e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \mathbf{e}_3\end{aligned} \quad\quad\quad(20)

Now, calculating the unit vector velocity, we get

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} &= \mathbf{e}_3 \mathbf{e}_3 \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \dot{\theta} + \mathbf{e}_1 \mathbf{e}_1 \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \\ &= \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) \\ &= \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho}\end{aligned}

The last two lines above factor the \boldsymbol{\rho} vector and the e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} quaternion to the left and to the right in preparation for squaring this to calculate the magnitude.

\begin{aligned}\left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2&=\left\langle{{ \left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2 }}\right\rangle \\ &=\left\langle{{ \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) }}\right\rangle \\ &=\dot{\theta}^2 + \sin^2\theta \dot{\phi}^2+ \sin\theta \dot{\phi} \dot{\theta}\left\langle{{ \mathbf{e}_2 e^{i\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \rho \theta}+e^{-\mathbf{e}_3 \rho \theta} \boldsymbol{\rho} e^{-i\phi} \mathbf{e}_2}}\right\rangle \\ \end{aligned}

This last term (\in \text{span} \{\rho\mathbf{e}_1, \rho\mathbf{e}_2, \mathbf{e}_1\mathbf{e}_3, \mathbf{e}_2\mathbf{e}_3\}) has only grade two components, so the scalar part is zero. We are left with the line element

\begin{aligned}\left(\frac{d (r\hat{\mathbf{r}})}{dt}\right)^2 = r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(21)

In retrospect, at least once one sees the answer, it seems obvious. Keeping \theta constant the length increment moving in the plane is ds = \sin\theta d\phi, and keeping \phi constant, we have ds = d\theta. Since these are perpendicular directions we can add the lengths using the Pythagorean theorem.

Line element using an angle and unit bivector parameterization.

Parameterizing using scalar angles is not the only approach that we can take to calculate the line element on the unit sphere. Proceding directly with a alternate polar representation, utilizing a unit bivector j, and scalar angle \theta is

\begin{aligned}\mathbf{x} = r \mathbf{e}_3 e^{j\theta}\end{aligned} \quad\quad\quad(22)

For this product to be a vector j must contain \mathbf{e}_3 as a factor (j = \mathbf{e}_3 \wedge m for some vector m.) Setting r = 1 for now, the deriviate of \mathbf{x} is

\begin{aligned}\dot{\mathbf{x}} &= \mathbf{e}_3 \frac{d}{dt} \left( \cos\theta + j \sin\theta \right) \\ &= \mathbf{e}_3 \dot{\theta} \left( -\sin\theta + j \cos\theta \right) + \mathbf{e}_3 \frac{d j}{dt} \sin\theta  \\ &= \mathbf{e}_3 \dot{\theta} j \left( j \sin\theta + \cos\theta \right) + \mathbf{e}_3 \frac{d j}{dt} \sin\theta  \\ \end{aligned}

This is

\begin{aligned}\dot{\mathbf{x}} = \mathbf{e}_3 \left( \frac{d\theta}{dt} j e^{j\theta} + \frac{d j}{dt} \sin\theta \right)\end{aligned} \quad\quad\quad(23)

Alternately, we can take derivatives of \mathbf{x} = r e^{-j\theta} \mathbf{e}_3, for

Or

\begin{aligned}\dot{\mathbf{x}} = -\left( \frac{d\theta}{dt} j e^{-j\theta} + \frac{d j}{dt} \sin\theta \right) \mathbf{e}_3\end{aligned} \quad\quad\quad(24)

Together with 23, the line element for position change on the unit sphere is then

\begin{aligned}\dot{\mathbf{x}}^2 &= \left\langle{{-\left( \frac{d\theta}{dt} j e^{-j\theta} + \frac{d j}{dt} \sin\theta \right) \mathbf{e}_3 \mathbf{e}_3 \left( \frac{d\theta}{dt} j e^{j\theta} + \frac{d j}{dt} \sin\theta \right) }}\right\rangle \\ &= \left\langle{{-\left( \frac{d\theta}{dt} j e^{-j\theta} + \frac{d j}{dt} \sin\theta \right) \left( \frac{d\theta}{dt} j e^{j\theta} + \frac{d j}{dt} \sin\theta \right) }}\right\rangle \\ &=\left(\frac{d\theta}{dt}\right)^2 - \left(\frac{d j}{dt}\right)^2 \sin^2\theta - \frac{d\theta}{dt} \sin\theta \left\langle{{ \frac{dj}{dt} j e^{j\theta} + j e^{-j\theta} \frac{dj}{dt} }}\right\rangle \\ \end{aligned}

Starting with cyclic reordering of the last term, we get zero

\begin{aligned}\left\langle{{ \frac{dj}{dt} j e^{j\theta} + j e^{-j\theta} \frac{dj}{dt} }}\right\rangle &=\left\langle{{ \frac{dj}{dt} j \left( e^{j\theta} + e^{-j\theta} \right) }}\right\rangle  \\ &=\left\langle{{ \frac{dj}{dt} j 2 j \sin\theta }}\right\rangle  \\ &=- 2 \sin\theta \frac{d}{dt} \underbrace{\left\langle{{ j }}\right\rangle}_{=0}  \\ \end{aligned}

The line element (for constant r) is therefore

\begin{aligned}\dot{\mathbf{x}}^2 =r^2 \left( \dot{\theta}^2 - \left(\frac{dj}{dt}\right)^2 \sin^2 \theta \right)\end{aligned} \quad\quad\quad(25)

This is essentially the same result as we got starting with an explicit r, \theta, \phi. Repeating for comparision that was

\begin{aligned}\dot{\mathbf{x}}^2 = r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(26)

The bivector that we have used this time encodes the orientation of the plane of rotation from the polar axis down to the position on the sphere corresponds to the angle \phi in the
scalar parameterization. The negation in sign is expected due to the negative bivector square.

Also comparing to previous results it is notable that we can explicitly express this bivector in terms of the scalar angle if desired as

\begin{aligned}\boldsymbol{\rho} &= \mathbf{e}_1 e^{\mathbf{e}_1 \mathbf{e}_2 \phi} = \mathbf{e}_1 \cos\phi + \mathbf{e}_2 \sin\phi \\ j &= \mathbf{e}_3 \wedge \boldsymbol{\rho} = \mathbf{e}_3 \boldsymbol{\rho}\end{aligned} \quad\quad\quad(27)

The inverse mapping, expressing the scalar angle using the bivector representation is also possible, but not unique. The principle angle for that inverse mapping is

\begin{aligned}\phi &= -\mathbf{e}_1 \mathbf{e}_2 \ln(\mathbf{e}_1 \mathbf{e}_3 j) \end{aligned} \quad\quad\quad(29)

Allowing the magnitude to vary.

Writing a vector in polar form

\begin{aligned}\mathbf{x} = r \hat{\mathbf{r}}\end{aligned} \quad\quad\quad(30)

and also allowing r to vary, we have

\begin{aligned}\left(\frac{d\mathbf{x}}{dt}\right)^2 &= \left( \frac{dr}{dt} \hat{\mathbf{r}} + r \frac{d\hat{\mathbf{r}}}{dt} \right)^2 \\ &= \left( \frac{dr}{dt} \right)^2 + r^2 \left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2 + 2 r \frac{dr}{dt} \hat{\mathbf{r}} \dot{c} \frac{d\hat{\mathbf{r}}}{dt} \end{aligned}

The squared unit vector derivative was previously calculated to be

\begin{aligned}\left(\frac{d \hat{\mathbf{r}}}{dt}\right)^2 = \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \end{aligned} \quad\quad\quad(31)

Picturing the geometry is enough to know that \dot{\hat{\mathbf{r}}} \dot{c} \hat{\mathbf{r}} = 0 since \dot{\hat{\mathbf{r}}} is always tangential to the sphere. It should also be possible to algebraically show this, but without going through the effort we at least now know the general line element

\begin{aligned}\dot{\mathbf{x}}^2 = {\dot{r}}^2 + r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(32)

Posted in Math and Physics Learning. | Tagged: , , , | Leave a Comment »

Line element for Spherical Polar unit vectors using Geometric Algebra.

Posted by peeterjoot on October 9, 2009

[Click here for a PDF incorporating this post with nicer formatting]

The line element for the particle moving on a spherical surface can be calculated by calculating the derivative of the spherical polar unit vector \hat{\mathbf{r}}

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \frac{d\phi}{dt} +\frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \frac{d\theta}{dt}\end{aligned} \quad\quad\quad(17)

than taking the magnitude of this vector. We can start either in coordinate form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta \cos\phi + \mathbf{e}_2 \sin\theta \sin\phi\end{aligned} \quad\quad\quad(18)

or, instead do it the fun way, first grouping this into a complex exponential form. Writing i = \mathbf{e}_1 \mathbf{e}_2, the first factorization is

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \end{aligned} \quad\quad\quad(19)

The unit vector \boldsymbol{\rho} = \mathbf{e}_1 e^{i\phi} lies in the x,y plane perpendicular to \mathbf{e}_3, so we can form the unit bivector \mathbf{e}_3\boldsymbol{\rho} and further factor the unit vector terms into a \cos + i \sin form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \boldsymbol{\rho} \sin\theta) \\ \end{aligned}

This allows the spherical polar unit vector to be expressed in complex exponential form (really a vector-quaternion product)

\begin{aligned}\hat{\mathbf{r}} = \mathbf{e}_3 e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} = e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \mathbf{e}_3\end{aligned} \quad\quad\quad(20)

Now, calcuating the unit vector velocity, we get

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} &= \mathbf{e}_3 \mathbf{e}_3 \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \dot{\theta} + \mathbf{e}_1 \mathbf{e}_1 \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \\ &= \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) \\ &= \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho}\end{aligned}

The last two lines above factor the \boldsymbol{\rho} vector and the e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} quaternion to the left and to the right in preparation for squaring this to calculate the magnitude.

\begin{aligned}\left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2&=\left\langle{{ \left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2 }}\right\rangle \\ &=\left\langle{{ \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) }}\right\rangle \\ &=\dot{\theta}^2 + \sin^2\theta \dot{\phi}^2+ \sin\theta \dot{\phi} \dot{\theta}\left\langle{{ \mathbf{e}_2 e^{i\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \rho \theta}+e^{-\mathbf{e}_3 \rho \theta} \boldsymbol{\rho} e^{-i\phi} \mathbf{e}_2}}\right\rangle \\ \end{aligned}

This last term (\in \text{span} \{\rho\mathbf{e}_1, \rho\mathbf{e}_2, \mathbf{e}_1\mathbf{e}_3, \mathbf{e}_2\mathbf{e}_3\}) has only grade two components, so the scalar part is zero. We are left with the line element

\begin{aligned}\left(\frac{d (r\hat{\mathbf{r}})}{dt}\right)^2 = r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(21)

In retrospect, at least once one sees the answer, it seems obvious. Keeping \theta constant the length increment moving in the plane is ds = \sin\theta d\phi, and keeping \phi constant, we have ds = d\theta. Since these are perpendicular directions we can add the lengths using the Pythagorean theorem.

Posted in Math and Physics Learning. | Tagged: , , , | Leave a Comment »

Spherical Polar unit vectors in exponential form.

Posted by peeterjoot on September 20, 2009

[Click here for a PDF of this post with nicer formatting]

Motivation

In [1] I blundered on a particularly concise exponential non-coordinate form for the unit vectors in a spherical polar coordinate system. For future reference outside of a quantum mechanical context here is a separate and more concise iteration of these results.

The rotation and notation.

The spherical polar rotor is a composition of rotations, expressed as half angle exponentials. Following the normal physics conventions we first apply a z,x plane rotation by angle theta, then an x,y plane rotation by angle \phi. This produces the rotor

\begin{aligned}R = e^{\mathbf{e}_{31}\theta/2} e^{\mathbf{e}_{12}\phi/2}\end{aligned} \quad\quad\quad(1)

Our triplet of Cartesian unit vectors is therefore rotated as

\begin{aligned}\begin{pmatrix}\hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \hat{\boldsymbol{\phi}} \\ \end{pmatrix}&=\tilde{R}\begin{pmatrix}\mathbf{e}_3 \\ \mathbf{e}_1 \\ \mathbf{e}_2 \\ \end{pmatrix}R\end{aligned} \quad\quad\quad(2)

In the quantum mechanical context it was convenient to denote the x,y plane unit bivector with the imaginary symbol

\begin{aligned}i = \mathbf{e}_1 \mathbf{e}_2\end{aligned} \quad\quad\quad(3)

reserving for the spatial pseudoscalar the capital

\begin{aligned}I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 = \hat{\mathbf{r}} \hat{\boldsymbol{\theta}} \hat{\boldsymbol{\phi}} = i \mathbf{e}_3\end{aligned} \quad\quad\quad(4)

Note the characteristic differences between these two “imaginaries”. The planar quantity i = \mathbf{e}_1 \mathbf{e}_2 commutes with \mathbf{e}_3, but anticommutes with either \mathbf{e}_1 or \mathbf{e}_2. On the other hand the spatial pseudoscalar I commutes with any vector, bivector or trivector in the algebra.

Application of the rotor. The spherical polar unit vectors.

Having fixed notation, lets apply the rotation to each of the unit vectors in sequence, starting with the calculation for \hat{\boldsymbol{\phi}}. This is

\begin{aligned}\hat{\boldsymbol{\phi}} &= e^{-i \phi/2} e^{-\mathbf{e}_{31}\theta/2} (\mathbf{e}_2) e^{\mathbf{e}_{31}\theta/2} e^{i\phi/2} \\ &= \mathbf{e}_2 e^{i\phi} \end{aligned}

Here, since \mathbf{e}_2 commutes with the rotor bivector \mathbf{e}_3 \mathbf{e}_1 the innermost exponentials cancel, leaving just the i\phi rotation. For \hat{\mathbf{r}} it is a bit messier, and we have

\begin{aligned}\hat{\mathbf{r}} &= e^{-i \phi/2} e^{-\mathbf{e}_{31}\theta/2} (\mathbf{e}_3) e^{\mathbf{e}_{31}\theta/2} e^{i\phi/2} \\ &= e^{-i \phi/2} \mathbf{e}_3 e^{\mathbf{e}_{31}\theta} e^{i\phi/2} \\ &= e^{-i \phi/2} (\mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta) e^{i\phi/2} \\ &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \\ &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \mathbf{e}_2 \sin\theta \mathbf{e}_2 e^{i\phi} \\ &= \mathbf{e}_3 \cos\theta + i \sin\theta \hat{\boldsymbol{\phi}} \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 i \sin\theta \hat{\boldsymbol{\phi}}) \\ &= \mathbf{e}_3 e^{I\hat{\boldsymbol{\phi}}\theta} \end{aligned}

Finally for \hat{\boldsymbol{\theta}}, we have a similar messy expansion

\begin{aligned}\hat{\boldsymbol{\theta}} &= e^{-i \phi/2} e^{-\mathbf{e}_{31}\theta/2} (\mathbf{e}_1) e^{\mathbf{e}_{31}\theta/2} e^{i\phi/2} \\ &= e^{-i \phi/2} \mathbf{e}_1 e^{\mathbf{e}_{31}\theta} e^{i\phi/2} \\ &= e^{-i \phi/2} (\mathbf{e}_1 \cos\theta - \mathbf{e}_3 \sin\theta) e^{i\phi/2} \\ &= \mathbf{e}_1 \cos\theta e^{i\phi} - \mathbf{e}_3 \sin\theta \\ &= i \cos\theta \mathbf{e}_2 e^{i\phi} - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} \cos\theta - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} (\cos\theta + \hat{\boldsymbol{\phi}} i \mathbf{e}_3 \sin\theta) \\ &= i \hat{\boldsymbol{\phi}} e^{I\hat{\boldsymbol{\phi}}\theta}\end{aligned}

Summarizing the three of these relations we have for the rotated unit vectors

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta} \\ \hat{\boldsymbol{\theta}} &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta} \\ \hat{\boldsymbol{\phi}} &= \mathbf{e}_2 e^{i\phi} \end{aligned} \quad\quad\quad(5)

and in particular for the radial position vector from the origin, rotating from the polar axis, we have

\begin{aligned}\mathbf{x} &= r \hat{\mathbf{r}} = r \mathbf{e}_3 e^{I\hat{\boldsymbol{\phi}} \theta}\end{aligned} \quad\quad\quad(8)

Compare this to the coordinate representation

\begin{aligned}\mathbf{x} = r(\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta)\end{aligned} \quad\quad\quad(9)

it is not initially obvious that these \theta and \phi rotations admit such a tidy factorization.

A small example application.

Let’s use these results to compute the spherical polar volume element. Pictorially this can be read off simply from a diagram. If one is less trusting of pictorial means (or want a method more generally applicable), we can also do this particular calculation algebraically, expanding the determinant of partials

\begin{aligned}\begin{vmatrix}\frac{\partial \mathbf{x}}{\partial r} & \frac{\partial \mathbf{x}}{\partial \theta} & \frac{\partial \mathbf{x}}{\partial \phi} \\ \end{vmatrix} dr d\theta d\phi&=\begin{vmatrix}\sin\theta \cos\phi & \cos\theta \cos\phi & -\sin\theta \sin\phi \\ \sin\theta \sin\phi & \cos\theta \sin\phi & \sin\theta \cos\phi \\ \cos\theta          & -\sin\theta         & 0                   \\ \end{vmatrix} r^2 dr d\theta d\phi\end{aligned} \quad\quad\quad(10)

One can chug through the trig reduction for this determinant with not too much trouble, but it isn’t particularly fun.

Now compare to the same calculation proceeding directly with the exponential form. We do still need to compute the partials

\begin{aligned}\frac{\partial \mathbf{x}}{\partial r} = \hat{\mathbf{r}}\end{aligned}

\begin{aligned}\frac{\partial \mathbf{x}}{\partial \theta} &= r \mathbf{e}_3 \frac{\partial }{\partial \theta} e^{I\hat{\boldsymbol{\phi}} \theta} \\ &= r \hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \\ &= r \hat{\mathbf{r}} (\hat{\mathbf{r}} \hat{\boldsymbol{\theta}} \hat{\boldsymbol{\phi}}) \hat{\boldsymbol{\phi}} \\ &= r \hat{\boldsymbol{\theta}}\end{aligned}

\begin{aligned}\frac{\partial \mathbf{x}}{\partial \phi} &= r \mathbf{e}_3 \frac{\partial }{\partial \phi} (\cos\theta + I\hat{\boldsymbol{\phi}} \sin\theta) \\ &= -r \mathbf{e}_3 I i \hat{\boldsymbol{\phi}} \sin\theta \\ &= r \hat{\boldsymbol{\phi}} \sin\theta \end{aligned}

So the area element, the oriented area of the parallelogram between the two vectors d\theta \partial \mathbf{x}/\partial \theta, and d\phi \partial \mathbf{x}/\partial \phi on the spherical surface at radius r is

\begin{aligned}d\mathbf{S} = \left(d\theta \frac{\partial \mathbf{x}}{\partial \theta}\right) \wedge \left( d\phi \frac{\partial \mathbf{x}}{\partial \phi} \right) = r^2 \hat{\boldsymbol{\theta}} \hat{\boldsymbol{\phi}} \sin\theta d\theta d\phi\end{aligned} \quad\quad\quad(11)

and the volume element in trivector form is just the product

\begin{aligned}d\mathbf{V} = \left(dr\frac{\partial \mathbf{x}}{\partial r}\right) \wedge d\mathbf{S}= r^2 \sin\theta I dr d\theta d\phi\end{aligned} \quad\quad\quad(12)

References

[1] Peeter Joot. Bivector form of quantum angular momentum operator [online]. http://sites.google.com/site/peeterjoot/math2009/qmAngularMom.pdf.

Posted in Math and Physics Learning. | Tagged: , , , , , | Leave a Comment »

A summary: Bivector form of quantum angular momentum operator

Posted by peeterjoot on September 6, 2009

[Click here for a PDF of this sequence of posts with nicer formatting]

Having covered a fairly wide range in the previous Geometric Algebra exploration of the angular momentum operator, it seems worthwhile to attempt to summarize what was learned.

The exploration started with a simple observation that the use of the spatial pseudoscalar for the imaginary of the angular momentum operator in its coordinate form

\begin{aligned}L_1 &= -i \hbar( x_2 \partial_3 - x_3 \partial_2 ) \\ L_2 &= -i \hbar( x_3 \partial_1 - x_1 \partial_3 ) \\ L_3 &= -i \hbar( x_1 \partial_2 - x_2 \partial_1 )  \end{aligned} \quad\quad\quad(69)

allowed for expressing the angular momentum operator in its entirety as a bivector valued operator

\begin{aligned}\mathbf{L} &= - \hbar \mathbf{x} \wedge \boldsymbol{\nabla} \end{aligned} \quad\quad\quad(72)

The bivector representation has an intrinsic complex behavior, eliminating the requirement for an explicitly imaginary i as is the case in the coordinate representation.

It was then assumed that the Laplacian can be expressed directly in terms of \mathbf{x} \wedge \boldsymbol{\nabla}. This isn’t an unreasonable thought since we can factor the gradient with components projected onto and perpendicular to a constant reference vector \hat{\mathbf{a}} as

\begin{aligned}\boldsymbol{\nabla} = \hat{\mathbf{a}} (\hat{\mathbf{a}} \cdot \boldsymbol{\nabla}) + \hat{\mathbf{a}} (\hat{\mathbf{a}} \wedge \boldsymbol{\nabla}) \end{aligned} \quad\quad\quad(73)

and this squares to a Laplacian expressed in terms of these constant reference directions

\begin{aligned}\boldsymbol{\nabla}^2 = (\hat{\mathbf{a}} \cdot \boldsymbol{\nabla})^2 - (\hat{\mathbf{a}} \cdot \boldsymbol{\nabla})^2 \end{aligned} \quad\quad\quad(74)

a quantity that has an angular momentum like operator with respect to a constant direction. It was then assumed that we could find an operator representation of the form

\begin{aligned}\boldsymbol{\nabla}^2 = \frac{1}{{\mathbf{x}^2}} \left( (\mathbf{x} \cdot \boldsymbol{\nabla})^2 - \left\langle{{(\mathbf{x} \cdot \boldsymbol{\nabla})^2}}\right\rangle + f(\mathbf{x}, \boldsymbol{\nabla}) \right) \end{aligned} \quad\quad\quad(75)

Where f(\mathbf{x}, \boldsymbol{\nabla}) was to be determined, and was found by subtraction. Thinking ahead to relativistic applications this result was obtained for the n-dimensional Laplacian and was found to be

\begin{aligned}\nabla^2 &= \frac{1}{{x^2}} \left( (n-2 + x \cdot \nabla) (x \cdot \nabla) - \left\langle{{(x \wedge \nabla)^2}}\right\rangle \right) \end{aligned} \quad\quad\quad(76)

For the 3D case specifically this is

\begin{aligned}\boldsymbol{\nabla}^2 &= \frac{1}{{\mathbf{x}^2}} \left( (1 + \mathbf{x} \cdot \boldsymbol{\nabla}) (\mathbf{x} \cdot \boldsymbol{\nabla}) - \left\langle{{(\mathbf{x} \wedge \boldsymbol{\nabla})^2}}\right\rangle \right) \end{aligned} \quad\quad\quad(77)

While the scalar selection above is good for some purposes, it interferes with observations about simultaneous eigenfunctions for the angular momentum operator and the scalar part of its square as seen in the Laplacian. With some difficulty and tedium, by subtracting the bivector and quadvector grades from the squared angular momentum operator (x \wedge \nabla)^2 it was eventually found that (76) can be written as

\begin{aligned}\nabla^2 &= \frac{1}{{x^2}} \left(   (n-2 + x \cdot \nabla) (x \cdot \nabla) + (n-2 - x \wedge \nabla) (x \wedge \nabla) + (x \wedge \nabla) \wedge (x \wedge \nabla) \right) \end{aligned} \quad\quad\quad(78)

In the 3D case the quadvector vanishes and (77) with the scalar selection removed is reduced to

\begin{aligned}\boldsymbol{\nabla}^2 &= \frac{1}{{\mathbf{x}^2}} \left( (1 + \mathbf{x} \cdot \boldsymbol{\nabla}) (\mathbf{x} \cdot \boldsymbol{\nabla}) + (1 - \mathbf{x} \wedge \boldsymbol{\nabla}) (\mathbf{x} \wedge \boldsymbol{\nabla}) \right) \end{aligned} \quad\quad\quad(79)

In 3D we also have the option of using the duality relation between the cross and the wedge \mathbf{a} \wedge \mathbf{b} = i (\mathbf{a} \times \mathbf{b}) to express the Laplacian

\begin{aligned}\boldsymbol{\nabla}^2 &= \frac{1}{{\mathbf{x}^2}} \left( (1 + \mathbf{x} \cdot \boldsymbol{\nabla}) (\mathbf{x} \cdot \boldsymbol{\nabla}) + (1 - i (\mathbf{x} \times \boldsymbol{\nabla})) i(\mathbf{x} \times \boldsymbol{\nabla}) \right) \end{aligned} \quad\quad\quad(80)

Since it is customary to express angular momentum as \mathbf{L} = -i \hbar (\mathbf{x} \times \boldsymbol{\nabla}), we see here that the imaginary in this context should perhaps necessarily be viewed as the spatial pseudoscalar. It was that guess that led down this path, and we come full circle back to this considering how to factor the Laplacian in vector quantities. Curiously this factorization is in no way specific to Quantum Theory.

A few verifications of the Laplacian in (80) were made. First it was shown that the directional derivative terms containing \mathbf{x} \cdot \boldsymbol{\nabla}, are equivalent to the radial terms of the Laplacian in spherical polar coordinates. That is

\begin{aligned}\frac{1}{{\mathbf{x}^2}} (1 + \mathbf{x} \cdot \boldsymbol{\nabla}) (\mathbf{x} \cdot \boldsymbol{\nabla}) \psi &= \frac{1}{{r}} \frac{\partial^2}{\partial r^2} (r\psi)  \end{aligned} \quad\quad\quad(81)

Employing the quaternion operator for the spherical polar rotation

\begin{aligned}R &= e^{\mathbf{e}_{31}\theta/2} e^{\mathbf{e}_{12}\phi/2} \\ \mathbf{x} &= r \tilde{R} \mathbf{e}_3 R  \end{aligned} \quad\quad\quad(82)

it was also shown that there was explicitly no radial dependence in the angular momentum operator which takes the form

\begin{aligned}\mathbf{x} \wedge \boldsymbol{\nabla} &= \tilde{R} \left( \mathbf{e}_3 \mathbf{e}_1 R \partial_\theta + \mathbf{e}_3 \mathbf{e}_2 R \frac{1}{{\sin\theta}} \partial_\phi \right) \\ &= \hat{\mathbf{r}} \left( \hat{\boldsymbol{\theta}} \partial_\theta + \hat{\boldsymbol{\phi}} \frac{1}{{\sin\theta}} \partial_\phi \right)  \end{aligned} \quad\quad\quad(84)

Because there is a \theta, and \phi dependence in the unit vectors \hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, and \hat{\boldsymbol{\phi}}, squaring the angular momentum operator in this form means that the unit vectors are also operated on. Those vectors were given by the triplet

\begin{aligned}\begin{pmatrix}\hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \hat{\boldsymbol{\phi}} \\ \end{pmatrix}&=\tilde{R}\begin{pmatrix}\mathbf{e}_3 \\ \mathbf{e}_1 \\ \mathbf{e}_2 \\ \end{pmatrix}R \end{aligned} \quad\quad\quad(86)

Using I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 for the spatial pseudoscalar, and i = \mathbf{e}_1 \mathbf{e}_2 (a possibly confusing switch of notation) for the bivector of the x-y plane we can write the spherical polar unit vectors in exponential form as

\begin{aligned}\begin{pmatrix}\hat{\boldsymbol{\phi}} \\ \hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \end{pmatrix}&=\begin{pmatrix}\mathbf{e}_2 e^{i\phi} \\ \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta} \\ i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta} \\ \end{pmatrix} \end{aligned} \quad\quad\quad(87)

These or related expansions were used to verify (with some difficulty) that the scalar squared bivector operator is identical to the expected scalar spherical polar coordinates parts of the Laplacian

\begin{aligned}-\left\langle{{ (\mathbf{x} \wedge \boldsymbol{\nabla})^2 }}\right\rangle =\frac{1}{{\sin\theta}} \frac{\partial {}}{\partial {\theta}} \sin\theta \frac{\partial {}}{\partial {\theta}} + \frac{1}{{\sin^2\theta}} \frac{\partial^2}{\partial \phi^2} \end{aligned} \quad\quad\quad(88)

Additionally, by left or right dividing a unit bivector from the angular momentum operator, we are able to find that the raising and lowering operators are left as one of the factors

\begin{aligned}\mathbf{x} \wedge \boldsymbol{\nabla} &= \mathbf{e}_{31} \left( e^{i\phi} (\partial_\theta + i \cot\theta \partial_\phi) + \mathbf{e}_{13} \frac{1}{{i}} \partial_\phi \right) \\ \mathbf{x} \wedge \boldsymbol{\nabla} &= \left( e^{-i\phi} (\partial_\theta - i \cot\theta \partial_\phi) - \mathbf{e}_{13} \frac{1}{{i}} \partial_\phi \right) \mathbf{e}_{31} \end{aligned} \quad\quad\quad(89)

Both of these use i = \mathbf{e}_1 \mathbf{e}_2, the bivector for the plane, and not the spatial pseudoscalar. We are then able to see that in the context of the raising and lowering operator for the radial equation the interpretation of the imaginary should be one of a plane.

Using the raising operator factorization, it was calculated that (\sin\theta)^\lambda e^{i\lambda \phi} was an eigenfunction of the bivector operator \mathbf{x} \wedge \boldsymbol{\nabla} with eigenvalue -\lambda. This results in the simultaneous eigenvalue of \lambda(\lambda + 1) for this eigenfunction with the scalar squared angular momentum operator.

There are a few things here that have not been explored to their logical conclusion.

The bivector Fourier projections I \mathbf{e}_k (\mathbf{x} \wedge \boldsymbol{\nabla} ) \cdot (-I \mathbf{e}_k) do not obey the commutation relations of the scalar angular momentum components, so an attempt to directly use these to construct raising and lowering operators does not produce anything useful. The raising and lowering operators in a form that could be used to find eigensolutions were found by factoring out \mathbf{e}_{13} from the bivector operator. Making this particular factorization was a fluke and only because it was desirable to express the bivector operator entirely in spherical polar form. It is curious that this results in raising and lowering operators for the x,y plane, and understanding this further would be nice.

In the eigen solutions for the bivector operator, no quantization condition was imposed. I don’t understand the argument that Bohm used to do so in the traditional treatment, and revisiting this once that is done is in order.

I am also unsure exactly how Bohm knows that the inner product for the eigenfunctions should be a surface integral. This choice works, but what drives it. Can that be related to anything here?

Posted in Math and Physics Learning. | Tagged: , , , , , , , | Leave a Comment »