Peeter Joot's (OLD) Blog.

Math, physics, perl, and programming obscurity.

Posts Tagged ‘spherical polar coordinates’

Some worked problems from old PHY356 exams.

Posted by peeterjoot on January 9, 2011

[Click here for a PDF of this post with nicer formatting]


Some of the old exam questions that I did for preparation for the exam I liked, and thought I’d write up some of them for potential future reference.

Questions from the Dec 2007 PHY355H1F exam.

1b. Parity operator.

\paragraph{Q:} If \Pi is the parity operator, defined by \Pi {\lvert {x} \rangle} = {\lvert {-x} \rangle}, where {\lvert {x} \rangle} is the eigenket of the position operator X with eigenvalue x), and P is the momentum operator conjugate to X, show (carefully) that \Pi P \Pi = -P.


Consider the matrix element {\langle {-x'} \rvert} \left[{\Pi},{P}\right] {\lvert {x} \rangle}. This is

\begin{aligned}{\langle {-x'} \rvert} \left[{\Pi},{P}\right] {\lvert {x} \rangle}&={\langle {-x'} \rvert} \Pi P - P \Pi {\lvert {x} \rangle} \\ &={\langle {-x'} \rvert} \Pi P {\lvert {x} \rangle} - {\langle {-x} \rvert} P \Pi {\lvert {x} \rangle} \\ &={\langle {x'} \rvert} P {\lvert {x} \rangle} - {\langle {-x} \rvert} P {\lvert {-x} \rangle} \\ &=- i \hbar \left(\delta(x'-x) \frac{\partial {}}{\partial {x}}-\underbrace{\delta(-x -(-x'))}_{= \delta(x'-x) = \delta(x-x')} \frac{\partial {}}{\partial {-x}}\right) \\ &=- 2 i \hbar \delta(x'-x) \frac{\partial {}}{\partial {x}} \\ &=2 {\langle {x'} \rvert} P {\lvert {x} \rangle} \\ &=2 {\langle {-x'} \rvert} \Pi P {\lvert {x} \rangle} \\ \end{aligned}

We’ve taken advantage of the Hermitian property of P and \Pi here, and can rearrange for

\begin{aligned}{\langle {-x'} \rvert} \Pi P - P \Pi - 2 \Pi P {\lvert {x} \rangle} = 0\end{aligned} \hspace{\stretch{1}}(2.1)

Since this is true for all {\langle {-x} \rvert} and {\lvert {x} \rangle} we have

\begin{aligned}\Pi P + P \Pi = 0.\end{aligned} \hspace{\stretch{1}}(2.2)

Right multiplication by \Pi and rearranging we have

\begin{aligned}\Pi P \Pi = - P \Pi \Pi = - P.\end{aligned} \hspace{\stretch{1}}(2.3)

1f. Free particle propagator.

\paragraph{Q:} For a free particle moving in one-dimension, the propagator (i.e. the coordinate representation of the evolution operator),

\begin{aligned}G(x,x';t) = {\langle {x} \rvert} U(t) {\lvert {x'} \rangle}\end{aligned} \hspace{\stretch{1}}(2.4)

is given by

\begin{aligned}G(x,x';t) = \sqrt{\frac{m}{2 \pi i \hbar t}} e^{i m (x-x')^2/ (2 \hbar t)}.\end{aligned} \hspace{\stretch{1}}(2.5)


This problem is actually fairly straightforward, but it is nice to work it having had a similar problem set question where we were asked about this time evolution operator matrix element (ie: what it’s physical meaning is). Here we have a concrete example of the form of this matrix operator.

Proceeding directly, we have

\begin{aligned}{\langle {x} \rvert} U {\lvert {x'} \rangle}&=\int \left\langle{x} \vert {p'}\right\rangle {\langle {p'} \rvert} U {\lvert {p} \rangle} \left\langle{p} \vert {x'}\right\rangle dp dp' \\ &=\int u_{p'}(x) {\langle {p'} \rvert} e^{-i P^2 t/(2 m \hbar)} {\lvert {p} \rangle} u_p^{*}(x') dp dp' \\ &=\int u_{p'}(x) e^{-i p^2 t/(2 m \hbar)} \delta(p-p') u_p^{*}(x') dp dp' \\ &=\int u_{p}(x) e^{-i p^2 t/(2 m \hbar)} u_p^{*}(x') dp \\ &=\frac{1}{(\sqrt{2 \pi \hbar})^2} \int e^{i p (x-x')/\hbar} e^{-i p^2 t/(2 m \hbar)} dp \\ &=\frac{1}{2 \pi \hbar} \int e^{i p (x-x')/\hbar} e^{-i p^2 t/(2 m \hbar)} dp \\ &=\frac{1}{2 \pi} \int e^{i k (x-x')} e^{-i \hbar k^2 t/(2 m)} dk \\ &=\frac{1}{2 \pi} \int dk e^{- \left(k^2 \frac{ i \hbar t}{2m} - i k (x-x')\right)} \\ &=\frac{1}{2 \pi} \int dk e^{- \frac{ i \hbar t}{2m}\left(k - i \frac{2m}{i \hbar t}\frac{(x-x')}{2} \right)^2- \frac{i^2 2 m (x-x')^2}{4 i \hbar t} } \\ &=\frac{1}{2 \pi}  \sqrt{\pi} \sqrt{\frac{2m}{i \hbar t}}e^{\frac{ i m (x-x')^2}{2 \hbar t}},\end{aligned}

which is the desired result. Now, let’s look at how this would be used. We can express our time evolved state using this matrix element by introducing an identity

\begin{aligned}\left\langle{{x}} \vert {{\psi(t)}}\right\rangle &={\langle {x} \rvert} U {\lvert {\psi(0)} \rangle} \\ &=\int dx' {\langle {x} \rvert} U {\lvert {x'} \rangle} \left\langle{{x'}} \vert {{\psi(0)}}\right\rangle \\ &=\sqrt{\frac{m}{2 \pi i \hbar t}} \int dx' e^{i m (x-x')^2/ (2 \hbar t)}\left\langle{{x'}} \vert {{\psi(0)}}\right\rangle \\ \end{aligned}

This gives us

\begin{aligned}\psi(x, t)=\sqrt{\frac{m}{2 \pi i \hbar t}} \int dx' e^{i m (x-x')^2/ (2 \hbar t)} \psi(x', 0)\end{aligned} \hspace{\stretch{1}}(2.6)

However, note that our free particle wave function at time zero is

\begin{aligned}\psi(x, 0) = \frac{e^{i p x/\hbar}}{\sqrt{2 \pi \hbar}}\end{aligned} \hspace{\stretch{1}}(2.7)

So the convolution integral 2.6 does not exist. We likely have to require that the solution be not a pure state, but instead a superposition of a set of continuous states (a wave packet in position or momentum space related by Fourier transforms). That is

\begin{aligned}\psi(x, 0) &= \frac{1}{{\sqrt{2 \pi \hbar}}} \int \hat{\psi}(p, 0) e^{i p x/\hbar} dp \\ \hat{\psi}(p, 0) &= \frac{1}{{\sqrt{2 \pi \hbar}}} \int \psi(x'', 0) e^{-i p x''/\hbar} dx''\end{aligned} \hspace{\stretch{1}}(2.8)

The time evolution of this wave packet is then determined by the propagator, and is

\begin{aligned}\psi(x,t) =\sqrt{\frac{m}{2 \pi i \hbar t}} \frac{1}{{\sqrt{2 \pi \hbar}}} \int dx' dpe^{i m (x-x')^2/ (2 \hbar t)}\hat{\psi}(p, 0) e^{i p x'/\hbar} ,\end{aligned} \hspace{\stretch{1}}(2.10)

or in terms of the position space wave packet evaluated at time zero

\begin{aligned}\psi(x,t) =\sqrt{\frac{m}{2 \pi i \hbar t}}\frac{1}{{2 \pi}}\int dx' dx'' dke^{i m (x-x')^2/ (2 \hbar t)}e^{i k (x' - x'')} \psi(x'', 0)\end{aligned} \hspace{\stretch{1}}(2.11)

We see that the propagator also ends up with a Fourier transform structure, and we have

\begin{aligned}\psi(x,t) &= \int dx' U(x, x' ; t) \psi(x', 0) \\ U(x, x' ; t) &=\sqrt{\frac{m}{2 \pi i \hbar t}}\frac{1}{{2 \pi}}\int du dke^{i m (x - x' - u)^2/ (2 \hbar t)}e^{i k u }\end{aligned} \hspace{\stretch{1}}(2.12)

Does that Fourier transform exist? I’d not be surprised if it ended up with a delta function representation. I’ll hold off attempting to evaluate and reduce it until another day.

4. Hydrogen atom.

This problem deals with the hydrogen atom, with an initial ket

\begin{aligned}{\lvert {\psi(0)} \rangle} = \frac{1}{{\sqrt{3}}} {\lvert {100} \rangle}+\frac{1}{{\sqrt{3}}} {\lvert {210} \rangle}+\frac{1}{{\sqrt{3}}} {\lvert {211} \rangle},\end{aligned} \hspace{\stretch{1}}(2.14)


\begin{aligned}\left\langle{\mathbf{r}} \vert {{100}}\right\rangle = \Phi_{100}(\mathbf{r}),\end{aligned} \hspace{\stretch{1}}(2.15)


\paragraph{Q: (a)}

If no measurement is made until time t = t_0,

\begin{aligned}t_0 = \frac{\pi \hbar}{ \frac{3}{4} (13.6 \text{eV}) } = \frac{ 4 \pi \hbar }{ 3 E_I},\end{aligned} \hspace{\stretch{1}}(2.16)

what is the ket {\lvert {\psi(t)} \rangle} just before the measurement is made?


Our time evolved state is

\begin{aligned}{\lvert {\psi{t_0}} \rangle} = \frac{1}{{\sqrt{3}}} e^{-i E_1 t_0 /\hbar } {\lvert {100} \rangle}+\frac{1}{{\sqrt{3}}} e^{- i E_2 t_0/\hbar } ({\lvert {210} \rangle} + {\lvert {211} \rangle}).\end{aligned} \hspace{\stretch{1}}(2.17)

Also observe that this initial time was picked to make the exponential values come out nicely, and we have

\begin{aligned}\frac{E_n t_0 }{\hbar} &= - \frac{E_I \pi \hbar }{\frac{3}{4} E_I n^2 \hbar} \\ &= - \frac{4 \pi }{ 3 n^2 },\end{aligned}

so our time evolved state is just

\begin{aligned}{\lvert {\psi(t_0)} \rangle} = \frac{1}{{\sqrt{3}}} e^{-i 4 \pi / 3} {\lvert {100} \rangle}+\frac{1}{{\sqrt{3}}} e^{- i \pi / 3 } ({\lvert {210} \rangle} + {\lvert {211} \rangle}).\end{aligned} \hspace{\stretch{1}}(2.18)

\paragraph{Q: (b)}

Suppose that at time t_0 an L_z measurement is made, and the outcome 0 is recorded. What is the appropriate ket \psi_{\text{after}}(t_0) right after the measurement?


A measurement with outcome 0, means that the L_z operator measurement found the state at that point to be the eigenstate for L_z eigenvalue 0. Recall that if {\lvert {\phi} \rangle} is an eigenstate of L_z we have

\begin{aligned}L_z {\lvert {\phi} \rangle} = m \hbar {\lvert {\phi} \rangle},\end{aligned} \hspace{\stretch{1}}(2.19)

so a measurement of L_z with outcome zero means that we have m=0. Our measurement of L_z at time t_0 therefore filters out all but the m=0 states and our new state is proportional to the projection over all m=0 states as follows

\begin{aligned}{\lvert {\psi_{\text{after}}(t_0)} \rangle}&\propto \left( \sum_{n l} {\lvert {n l 0} \rangle}{\langle {n l 0} \rvert} \right) {\lvert {\psi(t_0)} \rangle}  \\ &\propto \left( {\lvert {1 0 0} \rangle}{\langle {1 0 0} \rvert} +{\lvert {2 1 0} \rangle}{\langle {2 1 0} \rvert} \right) {\lvert {\psi(t_0)} \rangle}  \\ &= \frac{1}{{\sqrt{3}}} e^{-i 4 \pi / 3} {\lvert {100} \rangle}+\frac{1}{{\sqrt{3}}} e^{- i \pi / 3 } {\lvert {210} \rangle} \end{aligned}

A final normalization yields

\begin{aligned}{\lvert {\psi_{\text{after}}(t_0)} \rangle}= \frac{1}{{\sqrt{2}}} ({\lvert {210} \rangle} - {\lvert {100} \rangle})\end{aligned} \hspace{\stretch{1}}(2.20)

\paragraph{Q: (c)}

Right after this L_z measurement, what is {\left\lvert{\psi_{\text{after}}(t_0)}\right\rvert}^2?


Our amplitude is

\begin{aligned}\left\langle{\mathbf{r}} \vert {{\psi_{\text{after}}(t_0)}}\right\rangle&= \frac{1}{{\sqrt{2}}} (\left\langle{\mathbf{r}} \vert {{210}}\right\rangle - \left\langle{\mathbf{r}} \vert {{100}}\right\rangle) \\ &= \frac{1}{{\sqrt{2 \pi a_0^3}}}\left(\frac{r}{4\sqrt{2} a_0} e^{-r/2a_0} \cos\theta-e^{-r/a_0}\right) \\ &= \frac{1}{{\sqrt{2 \pi a_0^3}}}e^{-r/2 a_0} \left(\frac{r}{4\sqrt{2} a_0} \cos\theta-e^{-r/2 a_0}\right),\end{aligned}

so the probability density is

\begin{aligned}{\left\lvert{\left\langle{\mathbf{r}} \vert {{\psi_{\text{after}}(t_0)}}\right\rangle}\right\rvert}^2= \frac{1}{{2 \pi a_0^3}}e^{-r/a_0} \left(\frac{r}{4\sqrt{2} a_0} \cos\theta-e^{-r/2 a_0}\right)^2 \end{aligned} \hspace{\stretch{1}}(2.21)

\paragraph{Q: (d)}

If then a position measurement is made immediately, which if any components of the expectation value of \mathbf{R} will be nonvanishing? Justify your answer.


The expectation value of this vector valued operator with respect to a radial state {\lvert {\psi} \rangle} = \sum_{nlm} a_{nlm} {\lvert {nlm} \rangle} can be expressed as

\begin{aligned}\left\langle{\mathbf{R}}\right\rangle = \sum_{i=1}^3 \mathbf{e}_i \sum_{nlm, n'l'm'} a_{nlm}^{*} a_{n'l'm'} {\langle {nlm} \rvert} X_i{\lvert {n'l'm'} \rangle},\end{aligned} \hspace{\stretch{1}}(2.22)

where X_1 = X = R \sin\Theta \cos\Phi, X_2 = Y = R \sin\Theta \sin\Phi, X_3 = Z = R \cos\Phi.

Consider one of the matrix elements, and expand this by introducing an identity twice

\begin{aligned}{\langle {nlm} \rvert} X_i {\lvert {n'l'm'} \rangle}&=\int r^2 \sin\theta dr d\theta d\phi{r'}^2 \sin\theta' dr' d\theta' d\phi'\left\langle{{nlm}} \vert {{r \theta \phi}}\right\rangle {\langle {r \theta \phi} \rvert} X_i {\lvert {r' \theta' \phi' } \rangle}\left\langle{{r' \theta' \phi'}} \vert {{n'l'm'}}\right\rangle \\ &=\int r^2 \sin\theta dr d\theta d\phi{r'}^2 \sin\theta' dr' d\theta' d\phi'R_{nl}(r) Y_{lm}^{*}(\theta,\phi)\delta^3(\mathbf{x} - \mathbf{x}') x_iR_{n'l'}(r') Y_{l'm'}(\theta',\phi')\\ &=\int r^2 \sin\theta dr d\theta d\phi{r'}^2 \sin\theta' dr' d\theta' d\phi'R_{nl}(r) Y_{lm}^{*}(\theta,\phi) \\ &\qquad{r'}^2 \sin\theta' \delta(r-r') \delta(\theta - \theta') \delta(\phi-\phi')x_iR_{n'l'}(r') Y_{l'm'}(\theta',\phi')\\ &=\int r^2 \sin\theta dr d\theta d\phidr' d\theta' d\phi'R_{nl}(r) Y_{lm}^{*}(\theta,\phi) \delta(r-r') \delta(\theta - \theta') \delta(\phi-\phi')x_iR_{n'l'}(r') Y_{l'm'}(\theta',\phi')\\ &=\int r^2 \sin\theta dr d\theta d\phiR_{nl}(r) R_{n'l'}(r) Y_{lm}^{*}(\theta,\phi) Y_{l'm'}(\theta,\phi)x_i\\ \end{aligned}

Because our state has only m=0 contributions, the only \phi dependence for the X and Y components of \mathbf{R} come from those components themselves. For X, we therefore integrate \int_0^{2\pi} \cos\phi d\phi = 0, and for Y we integrate \int_0^{2\pi} \sin\phi d\phi = 0, and these terms vanish. Our expectation value for \mathbf{R} for this state, therefore lies completely on the z axis.

Questions from the Dec 2008 PHY355H1F exam.

1b. Trace invariance for unitary transformation.

\paragraph{Q:} Show that the trace of an operator is invariant under unitary transforms, i.e. if A' = U^\dagger A U, where U is a unitary operator, prove \text{Tr}(A') = \text{Tr}(A).


The bulk of this question is really to show that commutation of operators leaves the trace invariant (unless this is assumed). To show that we start with the definition of the trace

\begin{aligned}\text{Tr}(AB) &= \sum_n {\langle {n} \rvert} A B {\lvert {n} \rangle} \\ &= \sum_{n m} {\langle {n} \rvert} A {\lvert {m} \rangle} {\langle {m} \rvert} B {\lvert {n} \rangle} \\ &= \sum_{n m} {\langle {m} \rvert} B {\lvert {n} \rangle} {\langle {n} \rvert} A {\lvert {m} \rangle} \\ &= \sum_{m} {\langle {m} \rvert} B A {\lvert {m} \rangle}.\end{aligned}

Thus we have

\begin{aligned}\text{Tr}(A B) = \text{Tr}( B A ).\end{aligned} \hspace{\stretch{1}}(3.23)

For the unitarily transformed operator we have

\begin{aligned}\text{Tr}(A') &= \text{Tr}( U^\dagger A U ) \\ &= \text{Tr}( U^\dagger (A U) ) \\ &= \text{Tr}( (A U) U^\dagger ) \\ &= \text{Tr}( A (U U^\dagger) ) \\ &= \text{Tr}( A ) \qquad \square\end{aligned}

1d. Determinant of an exponential operator in terms of trace.

\paragraph{Q:} If A is an Hermitian operator, show that

\begin{aligned}\text{Det}( \exp A ) = \exp ( \text{Tr}(A) )\end{aligned} \hspace{\stretch{1}}(3.24)

where the Determinant (\text{Det}) of an operator is the product of all its eigenvectors.


The eigenvalues clue in the question provides the starting point. We write the exponential in its series form

\begin{aligned}e^A = 1 + \sum_{k=1}^\infty \frac{1}{{k!}} A^k\end{aligned} \hspace{\stretch{1}}(3.25)

Now, suppose that we have the following eigenvalue relationships for A

\begin{aligned}A {\lvert {n} \rangle} = \lambda_n {\lvert {n} \rangle}.\end{aligned} \hspace{\stretch{1}}(3.26)

From this the exponential is

\begin{aligned}e^A {\lvert {n} \rangle} &= {\lvert {n} \rangle} + \sum_{k=1}^\infty \frac{1}{{k!}} A^k {\lvert {n} \rangle} \\ &= {\lvert {n} \rangle} + \sum_{k=1}^\infty \frac{1}{{k!}} (\lambda_n)^k {\lvert {n} \rangle} \\ &= e^{\lambda_n} {\lvert {n} \rangle}.\end{aligned}

We see that the eigenstates of e^A are those of A, with eigenvalues e^{\lambda_n}.

By the definition of the determinant given we have

\begin{aligned}\text{Det}( e^A ) &= \Pi_n e^{\lambda_n} \\ &= e^{\sum_n \lambda_n} \\ &= e^{\text{Tr}ace(A)} \qquad \square\end{aligned}

1e. Eigenvectors of the Harmonic oscillator creation operator.

\paragraph{Q:} Prove that the only eigenvector of the Harmonic oscillator creation operator is {\lvert {\text{null}} \rangle}.


Recall that the creation (raising) operator was given by

\begin{aligned}a^\dagger = \sqrt{\frac{m \omega}{2 \hbar}} X - \frac{ i }{\sqrt{2 m \omega \hbar} } P= \frac{1}{{ \alpha \sqrt{2} }} X - \frac{ i \alpha }{\sqrt{2} \hbar } P,\end{aligned} \hspace{\stretch{1}}(3.27)

where \alpha = \sqrt{\hbar/m \omega}. Now assume that a^\dagger {\lvert {\phi} \rangle} = \lambda {\lvert {\phi} \rangle} so that

\begin{aligned}{\langle {x} \rvert} a^\dagger {\lvert {\phi} \rangle} = {\langle {x} \rvert} \lambda {\lvert {\phi} \rangle}.\end{aligned} \hspace{\stretch{1}}(3.28)

Write \left\langle{{x}} \vert {{\phi}}\right\rangle = \phi(x), and expand the LHS using 3.27 for

\begin{aligned}\lambda \phi(x) &= {\langle {x} \rvert} a^\dagger {\lvert {\phi} \rangle}  \\ &= {\langle {x} \rvert} \left( \frac{1}{{ \alpha \sqrt{2} }} X - \frac{ i \alpha }{\sqrt{2} \hbar } P \right) {\lvert {\phi} \rangle} \\ &= \frac{x \phi(x)}{ \alpha \sqrt{2} } - \frac{ i \alpha }{\sqrt{2} \hbar } (-i\hbar)\frac{\partial {}}{\partial {x}} \phi(x) \\ &= \frac{x \phi(x)}{ \alpha \sqrt{2} } - \frac{ \alpha }{\sqrt{2} } \frac{\partial {\phi(x)}}{\partial {x}}.\end{aligned}

As usual write \xi = x/\alpha, and rearrange. This gives us

\begin{aligned}\frac{\partial {\phi}}{\partial {\xi}} +\sqrt{2} \lambda \phi - \xi \phi = 0.\end{aligned} \hspace{\stretch{1}}(3.29)

Observe that this can be viewed as a homogeneous LDE of the form

\begin{aligned}\frac{\partial {\phi}}{\partial {\xi}} - \xi \phi = 0,\end{aligned} \hspace{\stretch{1}}(3.30)

augmented by a forcing term \sqrt{2}\lambda \phi. The homogeneous equation has the solution \phi = A e^{\xi^2/2}, so for the complete equation we assume a solution

\begin{aligned}\phi(\xi) = A(\xi) e^{\xi^2/2}.\end{aligned} \hspace{\stretch{1}}(3.31)

Since \phi' = (A' + A \xi) e^{\xi^2/2}, we produce a LDE of

\begin{aligned}0 &= (A' + A \xi -\xi A + \sqrt{2} \lambda A ) e^{\xi^2/2} \\ &= (A' + \sqrt{2} \lambda A ) e^{\xi^2/2},\end{aligned}


\begin{aligned}0 = A' + \sqrt{2} \lambda A.\end{aligned} \hspace{\stretch{1}}(3.32)

This has solution A = B e^{-\sqrt{2} \lambda \xi}, so our solution for 3.29 is

\begin{aligned}\phi(\xi) = B e^{\xi^2/2 - \sqrt{2} \lambda \xi} = B' e^{ (\xi - \lambda \sqrt{2} )^2/2}.\end{aligned} \hspace{\stretch{1}}(3.33)

This wave function is an imaginary Gaussian with minimum at \xi = \lambda\sqrt{2}. It is also unnormalizable since we require B' = 0 for any \lambda if \int {\left\lvert{\phi}\right\rvert}^2 < \infty. Since \left\langle{{\xi}} \vert {{\phi}}\right\rangle = \phi(\xi) = 0, we must also have {\lvert {\phi} \rangle} = 0, completing the exercise.

2. Two level quantum system.

Consider a two-level quantum system, with basis states \{{\lvert {a} \rangle}, {\lvert {b} \rangle}\}. Suppose that the Hamiltonian for this system is given by

\begin{aligned}H = \frac{\hbar \Delta}{2} ( {\lvert {b} \rangle}{\langle {b} \rvert}- {\lvert {a} \rangle}{\langle {a} \rvert})+ i \frac{\hbar \Omega}{2} ( {\lvert {a} \rangle}{\langle {b} \rvert}- {\lvert {b} \rangle}{\langle {a} \rvert})\end{aligned} \hspace{\stretch{1}}(3.34)

where \Delta and \Omega are real positive constants.

\paragraph{Q: (a)} Find the energy eigenvalues and the normalized energy eigenvectors (expressed in terms of the \{{\lvert {a} \rangle}, {\lvert {b} \rangle}\} basis). Write the time evolution operator U(t) = e^{-i H t/\hbar} using these eigenvectors.


The eigenvalue part of this problem is probably easier to do in matrix form. Let

\begin{aligned}{\lvert {a} \rangle} &= \begin{bmatrix}1 \\ 0\end{bmatrix} \\ {\lvert {b} \rangle} &= \begin{bmatrix}0 \\ 1\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.35)

Our Hamiltonian is then

\begin{aligned}H = \frac{\hbar}{2} \begin{bmatrix}-\Delta & i \Omega \\ -i \Omega & \Delta\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.37)

Computing \det{H - \lambda I} = 0, we get

\begin{aligned}\lambda = \pm \frac{\hbar}{2} \sqrt{ \Delta^2 + \Omega^2 }.\end{aligned} \hspace{\stretch{1}}(3.38)

Let \delta = \sqrt{ \Delta^2 + \Omega^2 }. Our normalized eigenvectors are found to be

\begin{aligned}{\lvert {\pm} \rangle} = \frac{1}{{\sqrt{ 2 \delta (\delta \pm \Delta)} }}\begin{bmatrix}i \Omega \\ \Delta \pm \delta\end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.39)

In terms of {\lvert {a} \rangle} and {\lvert {b} \rangle}, we then have

\begin{aligned}{\lvert {\pm} \rangle} = \frac{1}{{\sqrt{ 2 \delta (\delta \pm \Delta)} }}\left(i \Omega {\lvert {a} \rangle}+ (\Delta \pm \delta) {\lvert {b} \rangle} \right).\end{aligned} \hspace{\stretch{1}}(3.40)

Note that our Hamiltonian has a simple form in this basis. That is

\begin{aligned}H = \frac{\delta \hbar}{2} ({\lvert {+} \rangle}{\langle {+} \rvert} - {\lvert {-} \rangle}{\langle {-} \rvert} )\end{aligned} \hspace{\stretch{1}}(3.41)

Observe that once we do the diagonalization, we have a Hamiltonian that appears to have the form of a scaled projector for an open Stern-Gerlach aparatus.

Observe that the diagonalized Hamiltonian operator makes the time evolution operator’s form also simple, which is, by inspection

\begin{aligned}U(t) = e^{-i t \frac{\delta}{2}} {\lvert {+} \rangle}{\langle {+} \rvert} + e^{i t \frac{\delta}{2}} {\lvert {-} \rangle}{\langle {-} \rvert}.\end{aligned} \hspace{\stretch{1}}(3.42)

Since we are asked for this in terms of {\lvert {a} \rangle}, and {\lvert {b} \rangle}, the projectors {\lvert {\pm} \rangle}{\langle {\pm} \rvert} are required. These are

\begin{aligned}{\lvert {\pm} \rangle}{\langle {\pm} \rvert} &= \frac{1}{{2 \delta (\delta \pm \Delta)}}\Bigl( i \Omega {\lvert {a} \rangle} + (\Delta \pm \delta) {\lvert {b} \rangle} \Bigr)\Bigl( -i \Omega {\langle {a} \rvert} + (\Delta \pm \delta) {\langle {b} \rvert} \Bigr) \\ \end{aligned}

\begin{aligned}{\lvert {\pm} \rangle}{\langle {\pm} \rvert} = \frac{1}{{2 \delta (\delta \pm \Delta)}}\Bigl(\Omega^2 {\lvert {a} \rangle}{\langle {a} \rvert}+(\delta \pm \delta)^2 {\lvert {b} \rangle}{\langle {b} \rvert}+i \Omega (\Delta \pm \delta) ({\lvert {a} \rangle}{\langle {b} \rvert}-{\lvert {b} \rangle}{\langle {a} \rvert})\Bigr)\end{aligned} \hspace{\stretch{1}}(3.43)

Substitution into 3.42 and a fair amount of algebra leads to

\begin{aligned}U(t) = \cos(\delta t/2) \Bigl( {\lvert {a} \rangle}{\langle {a} \rvert} + {\lvert {b} \rangle}{\langle {b} \rvert} \Bigr)+ i \frac{\Omega}{\delta} \sin(\delta t/2) \Bigl( {\lvert {a} \rangle}{\langle {a} \rvert} - {\lvert {b} \rangle}{\langle {b} \rvert} -i ({\lvert {a} \rangle}{\langle {b} \rvert} - {\lvert {b} \rangle}{\langle {a} \rvert} )\Bigr).\end{aligned} \hspace{\stretch{1}}(3.44)

Note that while a big cumbersome, we can also verify that we can recover the original Hamiltonian from 3.41 and 3.43.

\paragraph{Q: (b)}

Suppose that the initial state of the system at time t = 0 is {\lvert {\phi(0)} \rangle}= {\lvert {b} \rangle}. Find an expression for the state at some later time t > 0, {\lvert {\phi(t)} \rangle}.


Most of the work is already done. Computation of {\lvert {\phi(t)} \rangle} = U(t) {\lvert {\phi(0)} \rangle} follows from 3.44

\begin{aligned}{\lvert {\phi(t)} \rangle} =\cos(\delta t/2) {\lvert {b} \rangle}- i \frac{\Omega}{\delta} \sin(\delta t/2) \Bigl( {\lvert {b} \rangle} +i {\lvert {a} \rangle}\Bigr).\end{aligned} \hspace{\stretch{1}}(3.45)

\paragraph{Q: (c)}

Suppose that an observable, specified by the operator X = {\lvert {a} \rangle}{\langle {b} \rvert} + {\lvert {b} \rangle}{\langle {a} \rvert}, is measured for this system. What is the probabilbity that, at time t, the result 1 is obtained? Plot this probability as a function of time, showing the maximum and minimum values of the function, and the corresponding values of t.


The language of questions like these attempt to bring some physics into the mathematics. The phrase “the result 1 is obtained”, is really a statement that the operator X, after measurement is found to have the eigenstate with numeric value 1.

We can calcuate the eigenvectors for this operator easily enough and find them to be \pm 1. For the positive eigenvalue we can also compute the eigenstate to be

\begin{aligned}{\lvert {X+} \rangle} = \frac{1}{{\sqrt{2}}} \Bigl( {\lvert {a} \rangle} + {\lvert {b} \rangle} \Bigr).\end{aligned} \hspace{\stretch{1}}(3.46)

The question of what the probability for this measurement is then really a question asking for the computation of the amplitude

\begin{aligned}{\left\lvert{\frac{1}{{\sqrt{2}}}\left\langle{{ (a + b)}} \vert {{\phi(t)}}\right\rangle}\right\rvert}^2\end{aligned} \hspace{\stretch{1}}(3.47)

From 3.45 we find this probability to be

\begin{aligned}{\left\lvert{\frac{1}{{\sqrt{2}}}\left\langle{{ (a + b)}} \vert {{\phi(t)}}\right\rangle}\right\rvert}^2&=\frac{1}{{2}} \left(\left(\cos(\delta t/2) + \frac{\Omega}{\delta} \sin(\delta t/2)\right)^2+ \frac{ \Omega^2 \sin^2(\delta t/2)}{\delta^2}\right) \\ &=\frac{1}{{4}} \left( 1 + 3 \frac{\Omega^2}{\delta^2} + \frac{\Delta^2}{\delta^2} \cos (\delta t) + 2 \frac{ \Omega}{\delta} \sin(\delta t) \right)\end{aligned}

We have a simple superposition of two sinusuiods out of phase, periodic with period 2 \pi/\delta. I’d attempted a rough sketch of this on paper, but won’t bother scanning it here or describing it further.

\paragraph{Q: (d)}

Suppose an experimenter has control over the values of the parameters \Delta and \Omega. Explain how she might prepare the state ({\lvert {a} \rangle} + {\lvert {b} \rangle})/\sqrt{2}.


For this part of the question I wasn’t sure what approach to take. I thought perhaps this linear combination of states could be made to equal one of the energy eigenstates, and if one could prepare the system in that state, then for certain values of \delta and \Delta one would then have this desired state.

To get there I note that we can express the states {\lvert {a} \rangle}, and {\lvert {b} \rangle} in terms of the eigenstates by inverting

\begin{aligned}\begin{bmatrix}{\lvert {+} \rangle} \\ {\lvert {-} \rangle} \\ \end{bmatrix}=\frac{1}{{\sqrt{2\delta}}}\begin{bmatrix}\frac{i \Omega}{\sqrt{\delta + \Delta}} & \sqrt{\delta + \Delta} \\ \frac{i \Omega}{\sqrt{\delta - \Delta}} & -\sqrt{\delta - \Delta}\end{bmatrix}\begin{bmatrix}{\lvert {a} \rangle} \\ {\lvert {b} \rangle} \\ \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.48)

Skipping all the algebra one finds

\begin{aligned}\begin{bmatrix}{\lvert {a} \rangle} \\ {\lvert {b} \rangle} \\ \end{bmatrix}=\begin{bmatrix}-i\sqrt{\delta - \Delta} & -i\sqrt{\delta + \Delta} \\ \frac{\Omega}{\sqrt{\delta - \Delta}} &-\frac{\Omega}{\sqrt{\delta + \Delta}} \end{bmatrix}\begin{bmatrix}{\lvert {+} \rangle} \\ {\lvert {-} \rangle} \\ \end{bmatrix}.\end{aligned} \hspace{\stretch{1}}(3.49)

Unfortunately, this doesn’t seem helpful. I find

\begin{aligned}\frac{1}{{\sqrt{2}}} ( {\lvert {a} \rangle} + {\lvert {b} \rangle} ) = \frac{{\lvert {+} \rangle}}{\sqrt{\delta - \Delta}}( \Omega - i (\delta - \Delta) )-\frac{{\lvert {-} \rangle}}{\sqrt{\delta + \Delta}}( \Omega + i (\delta + \Delta) )\end{aligned} \hspace{\stretch{1}}(3.50)

There’s no obvious way to pick \Omega and \Delta to leave just {\lvert {+} \rangle} or {\lvert {-} \rangle}. When I did this on paper originally I got a different answer for this sum, but looking at it now, I can’t see how I managed to get that answer (it had no factors of i in the result as the one above does).

3. One dimensional harmonic oscillator.

Consider a one-dimensional harmonic oscillator with the Hamiltonian

\begin{aligned}H = \frac{1}{{2m}}P^2 + \frac{1}{{2}} m \omega^2 X^2\end{aligned} \hspace{\stretch{1}}(3.51)

Denote the ground state of the system by {\lvert {0} \rangle}, the first excited state by {\lvert {1} \rangle} and so on.

\paragraph{Q: (a)}
Evaluate {\langle {n} \rvert} X {\lvert {n} \rangle} and {\langle {n} \rvert} X^2 {\lvert {n} \rangle} for arbitrary {\lvert {n} \rangle}.


Writing X in terms of the raising and lowering operators we have

\begin{aligned}X = \frac{\alpha}{\sqrt{2}} (a^\dagger + a),\end{aligned} \hspace{\stretch{1}}(3.52)

so \left\langle{{X}}\right\rangle is proportional to

\begin{aligned}{\langle {n} \rvert} a^\dagger + a {\lvert {n} \rangle} = \sqrt{n+1} \left\langle{{n}} \vert {{n+1}}\right\rangle + \sqrt{n} \left\langle{{n}} \vert {{n-1}}\right\rangle = 0.\end{aligned} \hspace{\stretch{1}}(3.53)

For \left\langle{{X^2}}\right\rangle we have

\begin{aligned}\left\langle{{X^2}}\right\rangle&=\frac{\alpha^2}{2}{\langle {n} \rvert} (a^\dagger + a)(a^\dagger + a) {\lvert {n} \rangle} \\ &=\frac{\alpha^2}{2}{\langle {n} \rvert} (a^\dagger + a) \left( \sqrt{n+1} {\lvert {n+1} \rangle} + \sqrt{n-1} {\lvert {n-1} \rangle}\right)  \\ &=\frac{\alpha^2}{2}{\langle {n} \rvert} \Bigl( (n+1) {\lvert {n} \rangle} + \sqrt{n(n-1)} {\lvert {n-2} \rangle}+ \sqrt{(n+1)(n+2)} {\lvert {n+2} \rangle} + n {\lvert {n} \rangle} \Bigr).\end{aligned}

We are left with just

\begin{aligned}\left\langle{{X^2}}\right\rangle = \frac{\hbar}{2 m \omega} (2n + 1).\end{aligned} \hspace{\stretch{1}}(3.54)

\paragraph{Q: (b)}

Suppose that at t=0 the system is prepared in the state

\begin{aligned}{\lvert {\psi(0)} \rangle} = \frac{1}{{\sqrt{2}}} ( {\lvert {0} \rangle} + i {\lvert {1} \rangle} ).\end{aligned} \hspace{\stretch{1}}(3.55)

If a measurement of position X were performaed immediately, sketch the propability distribution P(x) that a particle would be found within dx of x. Justify how you construct the sketch.


The probability that we started in state {\lvert {\psi(0)} \rangle} and ended up in position x is governed by the amplitude \left\langle{{x}} \vert {{\psi(0)}}\right\rangle, and the probability of being within an interval \Delta x, surrounding the point x is given by

\begin{aligned}\int_{x'=x-\Delta x/2}^{x+\Delta x/2} {\left\lvert{ \left\langle{{x'}} \vert {{\psi(0)}}\right\rangle }\right\rvert}^2 dx'.\end{aligned} \hspace{\stretch{1}}(3.56)

In the limit as \Delta x \rightarrow 0, this is just the squared amplitude itself evaluated at the point x, so we are interested in the quantity

\begin{aligned}{\left\lvert{ \left\langle{{x}} \vert {{\psi(0)}}\right\rangle }\right\rvert}^2  = \frac{1}{{2}} {\left\lvert{ \left\langle{{x}} \vert {{0}}\right\rangle + i \left\langle{{x}} \vert {{1}}\right\rangle }\right\rvert}^2.\end{aligned} \hspace{\stretch{1}}(3.57)

We are given these wave functions in the supplemental formulas. Namely,

\begin{aligned}\left\langle{{x}} \vert {{0}}\right\rangle &= \psi_0(x) = \frac{e^{-x^2/2\alpha^2}}{ \sqrt{\alpha \sqrt{\pi}}} \\ \left\langle{{x}} \vert {{1}}\right\rangle &= \psi_1(x) = \frac{e^{-x^2/2\alpha^2} 2 x }{ \alpha \sqrt{2 \alpha \sqrt{\pi}}}.\end{aligned} \hspace{\stretch{1}}(3.58)

Substituting these into 3.57 we have

\begin{aligned}{\left\lvert{ \left\langle{{x}} \vert {{\psi(0)}}\right\rangle }\right\rvert}^2 = \frac{1}{{2}} e^{-x^2/\alpha^2}\frac{1}{{ \alpha \sqrt{\pi}}}{\left\lvert{ 1 + \frac{2 i x}{\alpha \sqrt{2} } }\right\rvert}^2=\frac{e^{-x^2/\alpha^2}}{ 2\alpha \sqrt{\pi}}\left( 1 + \frac{2 x^2}{\alpha^2 } \right).\end{aligned} \hspace{\stretch{1}}(3.60)

This \href{^(-x^2)+(1+

\paragraph{Q: (c)}

Now suppose the state given in (b) above were allowed to evolve for a time t, determine the expecation value of X and \Delta X at that time.


Our time evolved state is

\begin{aligned}U(t) {\lvert {\psi(0)} \rangle} = \frac{1}{{\sqrt{2}}}\left(e^{-i \hbar \omega \left( 0 + \frac{1}{{2}} \right) t/\hbar } {\lvert {0} \rangle}+ i e^{-i \hbar \omega \left( 1 + \frac{1}{{2}} \right) t/\hbar } {\lvert {0} \rangle}\right)=\frac{1}{{\sqrt{2}}}\left(e^{-i \omega t/2 } {\lvert {0} \rangle}+ i e^{- 3 i \omega t/2 } {\lvert {1} \rangle}\right).\end{aligned} \hspace{\stretch{1}}(3.61)

The position expectation is therefore

\begin{aligned}{\langle {\psi(t)} \rvert} X {\lvert {\psi(t)} \rangle}&= \frac{\alpha}{2 \sqrt{2}}\left(e^{i \omega t/2 } {\langle {0} \rvert}- i e^{ 3 i \omega t/2 } {\langle {1} \rvert}\right)(a^\dagger + a)\left(e^{-i \omega t/2 } {\lvert {0} \rangle}+ i e^{- 3 i \omega t/2 } {\lvert {1} \rangle}\right) \\ \end{aligned}

We have already demonstrated that {\langle {n} \rvert} X {\lvert {n} \rangle} = 0, so we must only expand the cross terms, but those are just {\langle {0} \rvert} a^\dagger + a {\lvert {1} \rangle} = 1. This leaves

\begin{aligned}{\langle {\psi(t)} \rvert} X {\lvert {\psi(t)} \rangle}= \frac{\alpha}{2 \sqrt{2}}\left( -i e^{i \omega t} + i e^{-i \omega t} \right)=\sqrt{\frac{\hbar}{2 m \omega}} \cos(\omega t)\end{aligned} \hspace{\stretch{1}}(3.62)

For the squared position expectation

\begin{aligned}{\langle {\psi(t)} \rvert} X^2 {\lvert {\psi(t)} \rangle}&= \frac{\alpha^2}{4 (2)}\left(e^{i \omega t/2 } {\langle {0} \rvert}- i e^{ 3 i \omega t/2 } {\langle {1} \rvert}\right)(a^\dagger + a)^2\left(e^{-i \omega t/2 } {\lvert {0} \rangle}+ i e^{- 3 i \omega t/2 } {\lvert {1} \rangle}\right) \\ &=\frac{1}{{2}} ( {\langle {0} \rvert} X^2 {\lvert {0} \rangle} + {\langle {1} \rvert} X^2 {\lvert {1} \rangle} )+ i \frac{\alpha^2 }{8} ( - e^{ i \omega t} {\langle {1} \rvert} (a^\dagger + a)^2 {\lvert {0} \rangle}+ e^{ -i \omega t} {\langle {0} \rvert} (a^\dagger + a)^2 {\lvert {1} \rangle})\end{aligned}

Noting that (a^\dagger + a) {\lvert {0} \rangle} = {\lvert {1} \rangle}, and (a^\dagger + a)^2 {\lvert {0} \rangle} = (a^\dagger + a){\lvert {1} \rangle} = \sqrt{2} {\lvert {2} \rangle} + {\lvert {0} \rangle}, so we see the last two terms are zero. The first two we can evaluate using our previous result 3.54 which was \left\langle{{X^2}}\right\rangle = \frac{\alpha^2}{2} (2n + 1). This leaves

\begin{aligned}{\langle {\psi(t)} \rvert} X^2 {\lvert {\psi(t)} \rangle} = \alpha^2 \end{aligned} \hspace{\stretch{1}}(3.63)

Since \left\langle{{X}}\right\rangle^2 = \alpha^2 \cos^2(\omega t)/2, we have

\begin{aligned}(\Delta X)^2 = \left\langle{{X^2}}\right\rangle - \left\langle{{X}}\right\rangle^2 = \alpha^2 \left(1 - \frac{1}{{2}} \cos^2(\omega t) \right)\end{aligned} \hspace{\stretch{1}}(3.64)

\paragraph{Q: (d)}

Now suppose that initially the system were prepared in the ground state {\lvert {0} \rangle}, and then the resonance frequency is changed abrubtly from \omega to \omega' so that the Hamiltonian becomes

\begin{aligned}H = \frac{1}{{2m}}P^2 + \frac{1}{{2}} m {\omega'}^2 X^2.\end{aligned} \hspace{\stretch{1}}(3.65)

Immediately, an energy measurement is performed ; what is the probability of obtaining the result E = \hbar \omega' (3/2)?


This energy measurement E = \hbar \omega' (3/2) = \hbar \omega' (1 + 1/2), corresponds to an observation of state {\lvert {1'} \rangle}, after an initial observation of {\lvert {0} \rangle}. The probability of such a measurement is

\begin{aligned}{\left\lvert{ \left\langle{{1'}} \vert {{0}}\right\rangle }\right\rvert}^2\end{aligned} \hspace{\stretch{1}}(3.66)

Note that

\begin{aligned}\left\langle{{1'}} \vert {{0}}\right\rangle &=\int dx \left\langle{{1'}} \vert {{x}}\right\rangle\left\langle{{x}} \vert {{0}}\right\rangle \\ &=\int dx \psi_{1'}^{*} \psi_0(x) \\ \end{aligned}

The wave functions above are

\begin{aligned}\phi_{1'}(x) &= \frac{ 2 x e^{-x^2/2 {\alpha'}^2 }}{ \alpha' \sqrt{ 2 \alpha' \sqrt{\pi} } } \\ \phi_{0}(x) &= \frac{ e^{-x^2/2 {\alpha}^2 } } { \sqrt{ \alpha \sqrt{\pi} } } \end{aligned} \hspace{\stretch{1}}(3.67)

Putting the pieces together we have

\begin{aligned}\left\langle{{1'}} \vert {{0}}\right\rangle &=\frac{2 }{ \alpha' \sqrt{ 2 \alpha' \alpha \pi } }\int dxx e^{-\frac{x^2}{2}\left( \frac{1}{{{\alpha'}^2}} + \frac{1}{{\alpha^2}} \right) }\end{aligned} \hspace{\stretch{1}}(3.69)

Since this is an odd integral kernel over an even range, this evaluates to zero, and we conclude that the probability of measuring the specified energy is zero when the system is initially prepared in the ground state associated with the original Hamiltonian. Intuitively this makes some sense, if one thinks of the Fourier coefficient problem: one cannot construct an even function from linear combinations of purely odd functions.

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , , , , , , , , , , , , , , | Leave a Comment »

Oct 12, PHY356F lecture notes.

Posted by peeterjoot on October 12, 2010

Today as an experiment I tried taking live notes in latex in class (previously I’d not taken any notes since the lectures followed the text so closely).

[Click here for a PDF of this post with nicer formatting]

Oct 12.

Review. What have we learned?

Chapter 1.

Information about systems comes from vectors and operators. Express the vector {\lvert {\phi} \rangle} describing the system in terms of eigenvectors {\lvert {a_n} \rangle}. n \in 1,2,3,\cdots.

of some operator A. What are the coefficients c_n? Act on both sides by {\langle {a_m} \rvert} to find

\begin{aligned}\left\langle{{a_m}} \vert {{\phi}}\right\rangle &= \sum_n c_n \underbrace{\left\langle{{a_m}} \vert {{a_n}}\right\rangle}_{\text{Kronicker delta}}  \\ &= \sum c_n \delta_{mn} \\ &= c_m\end{aligned}

\begin{aligned}c_m = \left\langle{{a_m}} \vert {{\phi}}\right\rangle\end{aligned}


\begin{aligned}\mathbf{v} = \sum_i v_i \mathbf{e}_i \end{aligned}

\begin{aligned}\mathbf{e}_1 \cdot \mathbf{v} = \sum_i v_i \mathbf{e}_1 \cdot \mathbf{e}_i = v_1\end{aligned}

Physical information comes from the probability for obtaining a measurement of the physical entity associated with operator A. The probability of obtaining outcome a_m, an eigenvalue of A, is {\left\lvert{c_n}\right\rvert}^2

Chapter 2.

Deal with operators that have continuous eigenvalues and eigenvectors.

We now express

\begin{aligned}{\lvert {\phi} \rangle} = \int dk f(k) {\lvert {k} \rangle}\end{aligned}

Here the coeffecients f(k) are analogous to c_n.

Now if we project onto k'

\begin{aligned}\left\langle{{k'}} \vert {{\phi}}\right\rangle &= \int dk f(k) \underbrace{\left\langle{{k'}} \vert {{k}}\right\rangle}_{\text{Dirac delta}} \\ &= \int dk f(k) \delta(k' -k) \\ &= f(k') \end{aligned}

Unlike the discrete case, this is not a probability. Probability density for obtaining outcome k' is {\left\lvert{f(k')}\right\rvert}^2.

Example 2.

\begin{aligned}{\lvert {\phi} \rangle} = \int dk f(k) {\lvert {k} \rangle}\end{aligned}

Now if we project x onto both sides

\begin{aligned}\left\langle{{x}} \vert {{\phi}}\right\rangle &= \int dk f(k) \left\langle{{x}} \vert {{k}}\right\rangle \\ \end{aligned}

With \left\langle{{x}} \vert {{k}}\right\rangle = u_k(x)

\begin{aligned}\phi(x) &\equiv \left\langle{{x}} \vert {{\phi}}\right\rangle \\ &= \int dk f(k) u_k(x)  \\ &= \int dk f(k) \frac{1}{{\sqrt{L}}} e^{ikx}\end{aligned}

This is with periodic boundary value conditions for the normalization. The infinite normalization is also possible.

\begin{aligned}\phi(x) &= \frac{1}{{\sqrt{L}}} \int dk f(k) e^{ikx}\end{aligned}

Multiply both sides by e^{-ik'x}/\sqrt{L} and integrate. This is analogous to multiplying {\lvert {\phi} \rangle} = \int f(k) {\lvert {k} \rangle} dk by {\langle {k'} \rvert}. We get

\begin{aligned}\int \phi(x) \frac{1}{{\sqrt{L}}} e^{-ik'x} dx&= \frac{1}{{L}} \iint dk f(k) e^{i(k-k')x} dx \\ &= \int dk f(k) \Bigl( \frac{1}{{L}} \int e^{i(k-k')x} \Bigr) \\ &= \int dk f(k) \delta(k-k') \\ &= f(k')\end{aligned}

\begin{aligned}f(k') &=\int \phi(x) \frac{1}{{\sqrt{L}}} e^{-ik'x} dx\end{aligned}

We can talk about the state vector in terms of its position basis \phi(x) or in the momentum space via Fourier transformation. This is the equivalent thing, but just expressed different. The question of interpretation in terms of probabilities works out the same. Either way we look at the probability density.

The quantity

\begin{aligned}{\lvert {\phi} \rangle} = \int dk f(k) {\lvert {k} \rangle}\end{aligned}

is also called a wave packet state since it involves a superposition of many stats {\lvert {k} \rangle}. Example: See Fig 4.1 (Gaussian wave packet, with {\left\lvert{\phi}\right\rvert}^2 as the height). This wave packet is a snapshot of the wave function amplitude at one specific time instant. The evolution of this wave packet is governed by the Hamiltonian, which brings us to chapter 3.

Chapter 3.


\begin{aligned}{\lvert {\phi} \rangle} = \int dk f(k) {\lvert {k} \rangle}\end{aligned}

How do we find {\lvert {\phi(t)} \rangle}, the time evolved state? Here we have the option of choosing which of the pictures (Schr\”{o}dinger, Heisenberg, interaction) we deal with. Since the Heisenberg picture deals with time evolved operators, and the interaction picture with evolving Hamiltonian’s, neither of these is required to answer this question. Consider the Schr\”{o}dinger picture which gives

\begin{aligned}{\lvert {\phi(t)} \rangle} = \int dk f(k) {\lvert {k} \rangle} e^{-i E_k t/\hbar}\end{aligned}

where E_k is the eigenvalue of the Hamiltonian operator H.

STRONG SEEMING HINT: If looking for additional problems and homework, consider in detail the time evolution of the Gaussian wave packet state.

Chapter 4.

For three dimensions with V(x,y,z) = 0

\begin{aligned}H &= \frac{1}{{2m}} \mathbf{p}^2 \\ \mathbf{p} &= \sum_i p_i \mathbf{e}_i \\ \end{aligned}

In the position representation, where

\begin{aligned}p_i &= -i \hbar \frac{d}{dx_i}\end{aligned}

the Sch equation is

\begin{aligned}H u(x,y,z) &= E u(x,y,z) \\ H &= -\frac{\hbar^2}{2m} \boldsymbol{\nabla}^2 \\ = -\frac{\hbar^2}{2m} \left( \frac{\partial^2}{\partial {x}^2}+\frac{\partial^2}{\partial {y}^2}+\frac{\partial^2}{\partial {z}^2}\right) \end{aligned}

Separation of variables assumes it is possible to let

\begin{aligned}u(x,y,z) = X(x) Y(y) Z(z)\end{aligned}

(these capital letters are functions, not operators).

\begin{aligned}-\frac{\hbar^2}{2m} \left( YZ \frac{\partial^2 X}{\partial {x}^2}+ XZ \frac{\partial^2 Y}{\partial {y}^2}+ YZ \frac{\partial^2 Z}{\partial {z}^2}\right)&= E X Y Z\end{aligned}

Dividing as usual by XYZ we have

\begin{aligned}-\frac{\hbar^2}{2m} \left( \frac{1}{{X}} \frac{\partial^2 X}{\partial {x}^2}+ \frac{1}{{Y}} \frac{\partial^2 Y}{\partial {y}^2}+ \frac{1}{{Z}} \frac{\partial^2 Z}{\partial {z}^2} \right)&= E \end{aligned}

The curious thing is that we have these three derivatives, which is supposed to be related to an Energy, which is independent of any x,y,z, so it must be that each of these is separately constant. We can separate these into three individual equations

\begin{aligned}-\frac{\hbar^2}{2m} \frac{1}{{X}} \frac{\partial^2 X}{\partial {x}^2} &= E_1 \\ -\frac{\hbar^2}{2m} \frac{1}{{Y}} \frac{\partial^2 Y}{\partial {x}^2} &= E_2 \\ -\frac{\hbar^2}{2m} \frac{1}{{Z}} \frac{\partial^2 Z}{\partial {x}^2} &= E_3\end{aligned}


\begin{aligned}\frac{\partial^2 X}{\partial {x}^2} &= \left( - \frac{2m E_1}{\hbar^2} \right) X  \\ \frac{\partial^2 Y}{\partial {x}^2} &= \left( - \frac{2m E_2}{\hbar^2} \right) Y  \\ \frac{\partial^2 Z}{\partial {x}^2} &= \left( - \frac{2m E_3}{\hbar^2} \right) Z \end{aligned}

We have then

\begin{aligned}X(x) = C_1 e^{i k x}\end{aligned}


\begin{aligned}E_1 &= \frac{\hbar^2 k_1^2 }{2m} = \frac{p_1^2}{2m} \\ E_2 &= \frac{\hbar^2 k_2^2 }{2m} = \frac{p_2^2}{2m} \\ E_3 &= \frac{\hbar^2 k_3^2 }{2m} = \frac{p_3^2}{2m} \end{aligned}

We are free to use any sort of normalization procedure we wish (periodic boundary conditions, infinite Dirac, …)

Angular momentum.

HOMEWORK: go through the steps to understand how to formulate \boldsymbol{\nabla}^2 in spherical polar coordinates. This is a lot of work, but is good practice and background for dealing with the Hydrogen atom, something with spherical symmetry that is most naturally analyzed in the spherical polar coordinates.

In spherical coordinates (We won’t go through this here, but it is good practice) with

\begin{aligned}x &= r \sin\theta \cos\phi \\ y &= r \sin\theta \sin\phi \\ z &= r \cos\theta\end{aligned}

we have with u = u(r,\theta, \phi)

\begin{aligned}-\frac{\hbar^2}{2m} \left( \frac{1}{{r}} \partial_{rr} (r u) +  \frac{1}{{r^2 \sin\theta}} \partial_\theta (\sin\theta \partial_\theta u) + \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi} u \right)&= E u\end{aligned}

We see the start of a separation of variables attack with u = R(r) Y(\theta, \phi). We end up with

\begin{aligned}-\frac{\hbar^2}{2m} &\left( \frac{r}{R} (r R')' +  \frac{1}{{Y \sin\theta}} \partial_\theta (\sin\theta \partial_\theta Y) + \frac{1}{{Y \sin^2\theta}} \partial_{\phi\phi} Y \right) \\ \end{aligned}

\begin{aligned}r (r R')' + \left( \frac{2m E}{\hbar^2} r^2 - \lambda \right) R &= 0\end{aligned}

\begin{aligned}\frac{1}{{Y \sin\theta}} \partial_\theta (\sin\theta \partial_\theta Y) + \frac{1}{{Y \sin^2\theta}} \partial_{\phi\phi} Y &= -\lambda\end{aligned}

Application of separation of variables again, with Y = P(\theta) Q(\phi) gives us

\begin{aligned}\frac{1}{{P \sin\theta}} \partial_\theta (\sin\theta \partial_\theta P) + \frac{1}{{Q \sin^2\theta}} \partial_{\phi\phi} Q &= -\lambda \end{aligned}

\begin{aligned}\frac{\sin\theta}{P } \partial_\theta (\sin\theta \partial_\theta P) +\lambda  \sin^2\theta+ \frac{1}{{Q }} \partial_{\phi\phi} Q &= 0\end{aligned}

\begin{aligned}\frac{\sin\theta}{P } \partial_\theta (\sin\theta \partial_\theta P) + \lambda \sin^2\theta - \mu = 0\frac{1}{{Q }} \partial_{\phi\phi} Q &= -\mu\end{aligned}


\begin{aligned}\frac{1}{P \sin\theta} \partial_\theta (\sin\theta \partial_\theta P) +\lambda -\frac{\mu}{\sin^2\theta} &= 0\end{aligned} \hspace{\stretch{1}}(1.1)

\begin{aligned}\partial_{\phi\phi} Q &= -\mu Q\end{aligned} \hspace{\stretch{1}}(1.2)

The equation for P can be solved using the Legendre function P_l^m(\cos\theta) where \lambda = l(l+1) and l is an integer

Replacing \mu with m^2, where m is an integer

\begin{aligned}\frac{d^2 Q}{d\phi^2} &= -m^2 Q\end{aligned}

Imposing a periodic boundary condition Q(\phi) = Q(\phi + 2\pi), where (m = 0, \pm 1, \pm 2, \cdots) we have

\begin{aligned}Q &= \frac{1}{{\sqrt{2\pi}}} e^{im\phi}\end{aligned}

There is the overall solution r(r,\theta,\phi) = R(r) Y(\theta, \phi) for a free particle. The functions Y(\theta, \phi) are

\begin{aligned}Y_{lm}(\theta, \phi) &= N \left( \frac{1}{{\sqrt{2\pi}}} e^{im\phi} \right) \underbrace{ P_l^m(\cos\theta) }_{ -l \le m \le l }\end{aligned}

where N is a normalization constant, and m = 0, \pm 1, \pm 2, \cdots. Y_{lm} is an eigenstate of the \mathbf{L}^2 operator and L_z (two for the price of one). There’s no specific reason for the direction z, but it is the direction picked out of convention.

Angular momentum is given by

\begin{aligned}\mathbf{L} = \mathbf{r} \times \mathbf{p}\end{aligned}


\begin{aligned}\mathbf{R} = x \hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}\end{aligned}


\begin{aligned}\mathbf{p} = p_x \hat{\mathbf{x}} + p_y\hat{\mathbf{y}} + p_z\hat{\mathbf{z}}\end{aligned}

The important thing to remember is that the aim of following all the math is to show that

\begin{aligned}\mathbf{L}^2 Y_{lm} = \hbar^2 l (l+1) Y_{lm}\end{aligned}

and simultaneously

\begin{aligned}\mathbf{L}_z Y_{lm} = \hbar m Y_{lm}\end{aligned}

Part of the solution involves working with \left[{L_z},{L_{+}}\right], and \left[{L_z},{L_{-}}\right], where

\begin{aligned}L_{+} &= L_x + i L_y \\ L_{-} &= L_x - i L_y\end{aligned}

An exercise (not in the book) is to evaluate

\begin{aligned}\left[{L_z},{L_{+}}\right] &= L_z L_x + i L_z L_y - L_x L_z - i L_y L_z \end{aligned} \hspace{\stretch{1}}(1.3)


\begin{aligned}\left[{L_x},{L_y}\right]  &= i \hbar L_z \\ \left[{L_y},{L_z}\right]  &= i \hbar L_x \\ \left[{L_z},{L_x}\right]  &= i \hbar L_y\end{aligned} \hspace{\stretch{1}}(1.4)

Substitution back in 1.3 we have

\begin{aligned}\left[{L_z},{L_{+}}\right] &=\left[{L_z},{L_x}\right] + i \left[{L_z},{L_y}\right]  \\ &=i \hbar ( L_y - i L_x ) \\ &=\hbar ( i L_y +  L_x ) \\ &=\hbar L_{+}\end{aligned}

Posted in Math and Physics Learning. | Tagged: , , , , , , , , | Leave a Comment »

Derivation of the spherical polar Laplacian

Posted by peeterjoot on October 9, 2010

[Click here for a PDF of this post with nicer formatting]


In [1] was a Geometric Algebra derivation of the 2D polar Laplacian by squaring the quadient. In [2] was a factorization of the spherical polar unit vectors in a tidy compact form. Here both these ideas are utilized to derive the spherical polar form for the Laplacian, an operation that is strictly algebraic (squaring the gradient) provided we operate on the unit vectors correctly.

Our rotation multivector.

Our starting point is a pair of rotations. We rotate first in the x,y plane by \phi

\begin{aligned}\mathbf{x} &\rightarrow \mathbf{x}' = \tilde{R_\phi} \mathbf{x} R_\phi \\ i &\equiv \mathbf{e}_1 \mathbf{e}_2 \\ R_\phi &= e^{i \phi/2}\end{aligned} \hspace{\stretch{1}}(2.1)

Then apply a rotation in the \mathbf{e}_3 \wedge (\tilde{R_\phi} \mathbf{e}_1 R_\phi) = \tilde{R_\phi} \mathbf{e}_3 \mathbf{e}_1 R_\phi plane

\begin{aligned}\mathbf{x}' &\rightarrow \mathbf{x}'' = \tilde{R_\theta} \mathbf{x}' R_\theta \\ R_\theta &= e^{ \tilde{R_\phi} \mathbf{e}_3 \mathbf{e}_1 R_\phi \theta/2 } = \tilde{R_\phi} e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } R_\phi\end{aligned} \hspace{\stretch{1}}(2.4)

The composition of rotations now gives us

\begin{aligned}\mathbf{x}&\rightarrow \mathbf{x}'' = \tilde{R_\theta} \tilde{R_\phi} \mathbf{x} R_\phi R_\theta = \tilde{R} \mathbf{x} R \\ R &= R_\phi R_\theta = e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 }.\end{aligned}

Expressions for the unit vectors.

The unit vectors in the rotated frame can now be calculated. With I = \mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 we can calculate

\begin{aligned}\hat{\boldsymbol{\phi}} &= \tilde{R} \mathbf{e}_2 R  \\ \hat{\mathbf{r}} &= \tilde{R} \mathbf{e}_3 R  \\ \hat{\boldsymbol{\theta}} &= \tilde{R} \mathbf{e}_1 R\end{aligned} \hspace{\stretch{1}}(3.6)

Performing these we get

\begin{aligned}\hat{\boldsymbol{\phi}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_2 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_2 e^{ i \phi },\end{aligned}


\begin{aligned}\hat{\mathbf{r}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_3 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } (\mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta ) e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_3 \cos\theta +\mathbf{e}_1 \sin\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \mathbf{e}_1 \sin\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi } ) \\ &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta},\end{aligned}


\begin{aligned}\hat{\boldsymbol{\theta}}&= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } e^{ -\mathbf{e}_3 \mathbf{e}_1 \theta/2 } \mathbf{e}_1 e^{ \mathbf{e}_3 \mathbf{e}_1 \theta/2 } e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= e^{ -\mathbf{e}_1 \mathbf{e}_2 \phi/2 } ( \mathbf{e}_1 \cos\theta - \mathbf{e}_3 \sin\theta ) e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } \\ &= \mathbf{e}_1 \cos\theta e^{ \mathbf{e}_1 \mathbf{e}_2 \phi/2 } - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} \cos\theta - \mathbf{e}_3 \sin\theta \\ &= i \hat{\boldsymbol{\phi}} (\cos\theta + \hat{\boldsymbol{\phi}} i \mathbf{e}_3 \sin\theta ) \\ &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned}

Summarizing these are

\begin{aligned}\hat{\boldsymbol{\phi}} &= \mathbf{e}_2 e^{ i \phi } \\ \hat{\mathbf{r}} &= \mathbf{e}_3 e^{I \hat{\boldsymbol{\phi}} \theta} \\ \hat{\boldsymbol{\theta}} &= i \hat{\boldsymbol{\phi}} e^{I \hat{\boldsymbol{\phi}} \theta}.\end{aligned} \hspace{\stretch{1}}(3.9)

Derivatives of the unit vectors.

We’ll need the partials. Most of these can be computed from 3.9 by inspection, and are

\begin{aligned}\partial_r \hat{\boldsymbol{\phi}} &= 0 \\ \partial_r \hat{\mathbf{r}} &= 0 \\ \partial_r \hat{\boldsymbol{\theta}} &= 0 \\ \partial_\theta \hat{\boldsymbol{\phi}} &= 0 \\ \partial_\theta \hat{\mathbf{r}} &= \hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \\ \partial_\theta \hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} \\ \partial_\phi \hat{\boldsymbol{\phi}} &= \hat{\boldsymbol{\phi}} i \\ \partial_\phi \hat{\mathbf{r}} &= \hat{\boldsymbol{\phi}} \sin\theta \\ \partial_\phi \hat{\boldsymbol{\theta}} &= \hat{\boldsymbol{\phi}} \cos\theta\end{aligned} \hspace{\stretch{1}}(4.12)

Expanding the Laplacian.

We note that the line element is ds = dr + r d\theta + r\sin\theta d\phi, so our gradient in spherical coordinates is

\begin{aligned}\boldsymbol{\nabla} &= \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi.\end{aligned} \hspace{\stretch{1}}(5.21)

We can now evaluate the Laplacian

\begin{aligned}\boldsymbol{\nabla}^2 &=\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) \cdot\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right).\end{aligned} \hspace{\stretch{1}}(5.22)

Evaluating these one set at a time we have

\begin{aligned}\hat{\mathbf{r}} \partial_r \cdot \left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) &= \partial_{rr},\end{aligned}


\begin{aligned}\frac{1}{{r}} \hat{\boldsymbol{\theta}} \partial_\theta \cdot \left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right)&=\frac{1}{{r}} \left\langle{{\hat{\boldsymbol{\theta}} \left(\hat{\mathbf{r}} I \hat{\boldsymbol{\phi}} \partial_r + \hat{\mathbf{r}} \partial_{\theta r}+ \frac{\hat{\boldsymbol{\theta}}}{r} \partial_{\theta\theta} + \frac{1}{{r}} \hat{\boldsymbol{\theta}} I \hat{\boldsymbol{\phi}} \partial_\theta+ \hat{\boldsymbol{\phi}} \partial_\theta \frac{1}{{r\sin\theta}} \partial_\phi\right)}}\right\rangle \\ &= \frac{1}{{r}} \partial_r+\frac{1}{{r^2}} \partial_{\theta\theta},\end{aligned}


\begin{aligned}\frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi &\cdot\left( \hat{\mathbf{r}} \partial_r + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_\theta + \frac{\hat{\boldsymbol{\phi}}}{r\sin\theta} \partial_\phi \right) \\ &=\frac{1}{r\sin\theta} \left\langle{{\hat{\boldsymbol{\phi}}\left(\hat{\boldsymbol{\phi}} \sin\theta \partial_r + \hat{\mathbf{r}} \partial_{\phi r} + \hat{\boldsymbol{\phi}} \cos\theta \frac{1}{r} \partial_\theta + \frac{\hat{\boldsymbol{\theta}}}{r} \partial_{\phi \theta }+ \hat{\boldsymbol{\phi}} i \frac{1}{r\sin\theta} \partial_\phi + \hat{\boldsymbol{\phi}} \frac{1}{r\sin\theta} \partial_{\phi \phi }\right)}}\right\rangle \\ &=\frac{1}{{r}} \partial_r+ \frac{\cot\theta}{r^2}\partial_\theta+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi}\end{aligned}

Summing these we have

\begin{aligned}\boldsymbol{\nabla}^2 &=\partial_{rr}+ \frac{2}{r} \partial_r+\frac{1}{{r^2}} \partial_{\theta\theta}+ \frac{\cot\theta}{r^2}\partial_\theta+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\phi\phi}\end{aligned} \hspace{\stretch{1}}(5.23)

This is often written with a chain rule trick to considate the r and \theta partials

\begin{aligned}\boldsymbol{\nabla}^2 \Psi &=\frac{1}{{r}} \partial_{rr} (r \Psi)+ \frac{1}{{r^2 \sin\theta}} \partial_\theta \left( \sin\theta \partial_\theta \Psi \right)+ \frac{1}{{r^2 \sin^2\theta}} \partial_{\psi\psi} \Psi\end{aligned} \hspace{\stretch{1}}(5.24)

It’s simple to verify that this is identical to 5.23.


[1] Peeter Joot. Polar form for the gradient and Laplacian. [online].

[2] Peeter Joot. Spherical Polar unit vectors in exponential form. [online]. .

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , | Leave a Comment »

Jacobians and spherical polar gradient

Posted by peeterjoot on December 8, 2009

[Click here for a PDF of this post with nicer formatting]


The dumbest and most obvious way to do a chain of variables for the gradient is to utilize a chain rule expansion producing the Jacobian matrix to transform the coordinates. Here we do this to calculate the spherical polar representation of the gradient.

There are smarter and easier ways to do this, but there is some surprising simple structure to the resulting Jacobians that seems worth noting.

Spherical polar gradient coordinates in terms of Cartesian.

We wish to do a change of variables for each of the differential operators of the gradient. This is essentially just application of the chain rule, as in

\begin{aligned}\frac{\partial {}}{\partial {r}} = \frac{\partial {x}}{\partial {r}} \frac{\partial {}}{\partial {x}}+\frac{\partial {y}}{\partial {r}} \frac{\partial {}}{\partial {y}}+\frac{\partial {z}}{\partial {r}} \frac{\partial {}}{\partial {z}}.\end{aligned} \quad\quad\quad(1)

Collecting all such derivatives we have in column vector form

\begin{aligned}\begin{bmatrix}\partial_r \\ \partial_\theta \\ \partial_\phi\end{bmatrix}= \begin{bmatrix}\frac{\partial {x}}{\partial {r}} &\frac{\partial {y}}{\partial {r}} &\frac{\partial {z}}{\partial {r}}  \\ \frac{\partial {x}}{\partial {\theta}} &\frac{\partial {y}}{\partial {\theta}} &\frac{\partial {z}}{\partial {\theta}}  \\ \frac{\partial {x}}{\partial {\phi}} &\frac{\partial {y}}{\partial {\phi}} &\frac{\partial {z}}{\partial {\phi}} \end{bmatrix}\begin{bmatrix}\partial_x \\ \partial_y \\ \partial_z\end{bmatrix}.\end{aligned} \quad\quad\quad(2)

This becomes a bit more tractable with the Jacobian notation

\begin{aligned}\frac{\partial (x,y,z)}{\partial (r,\theta,\phi)}=\begin{bmatrix}\frac{\partial {x}}{\partial {r}} &\frac{\partial {y}}{\partial {r}} &\frac{\partial {z}}{\partial {r}}  \\ \frac{\partial {x}}{\partial {\theta}} &\frac{\partial {y}}{\partial {\theta}} &\frac{\partial {z}}{\partial {\theta}}  \\ \frac{\partial {x}}{\partial {\phi}} &\frac{\partial {y}}{\partial {\phi}} &\frac{\partial {z}}{\partial {\phi}}\end{bmatrix}.\end{aligned} \quad\quad\quad(3)

The change of variables for the operator triplet is then just

\begin{aligned}\begin{bmatrix}\partial_r \\ \partial_\theta \\ \partial_\phi\end{bmatrix}= \frac{\partial (x,y,z)}{\partial (r,\theta,\phi)}\begin{bmatrix}\partial_x \\ \partial_y \\ \partial_z\end{bmatrix}.\end{aligned} \quad\quad\quad(4)

This Jacobian matrix is also not even too hard to calculate. With \mathbf{x} = r \hat{\mathbf{r}}, we have x_k = r \hat{\mathbf{r}} \cdot \mathbf{e}_k, and

\begin{aligned}\frac{\partial {x_k}}{\partial {r}} &= \hat{\mathbf{r}} \cdot \mathbf{e}_k \\ \frac{\partial {x_k}}{\partial {\theta}} &= r \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \cdot \mathbf{e}_k \\ \frac{\partial {x_k}}{\partial {\phi}} &= r \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \cdot \mathbf{e}_k.\end{aligned} \quad\quad\quad(5)

The last two derivatives can be calculated easily if the radial unit vector is written out explicitly, with S and C for sine and cosine respectively, these are

\begin{aligned}\hat{\mathbf{r}} &= \begin{bmatrix}S_\theta C_\phi \\ S_\theta S_\phi \\ C_\theta \end{bmatrix} \\ \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} &= \begin{bmatrix}C_\theta C_\phi \\ C_\theta S_\phi \\ -S_\theta \end{bmatrix} \\ \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} &= \begin{bmatrix}-S_\theta S_\phi \\ S_\theta C_\phi \\ 0\end{bmatrix} .\end{aligned} \quad\quad\quad(8)

We can plug these into the elements of the Jacobian matrix explicitly, which produces

\begin{aligned}\frac{\partial (x,y,z)}{\partial (r,\theta,\phi)}=\begin{bmatrix} S_\theta C_\phi & S_\theta S_\phi & C_\theta \\ r C_\theta C_\phi & r C_\theta S_\phi & - r S_\theta \\ -r S_\theta S_\phi & rS_\theta C_\phi & 0\end{bmatrix},\end{aligned} \quad\quad\quad(11)

however, we are probably better off just referring back to 8, and writing

\begin{aligned}\frac{\partial (x,y,z)}{\partial (r,\theta,\phi)}=\begin{bmatrix} \hat{\mathbf{r}}^\text{T} \\ r \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\theta}} \\ r \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\phi}} \end{bmatrix}.\end{aligned} \quad\quad\quad(12)

Unfortunately, this is actually a bit of a dead end. We really want the inverse of this matrix because the desired quantity is

\begin{aligned}\boldsymbol{\nabla} = \begin{bmatrix}\mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3  \end{bmatrix}\begin{bmatrix}\partial_{x_1} \\ \partial_{x_2} \\ \partial_{x_3}\end{bmatrix}.\end{aligned} \quad\quad\quad(13)

(Here my matrix of unit vectors treats these abusively as single elements and not as column vectors).

The matrix of equation 12 does not look particularly fun to invert directly, and that is what we need to substitute into
13. One knows that in the end if it was attempted things should mystically simplify (presuming this was done error free).

Cartesian gradient coordinates in terms of spherical polar partials.

Let’s flip things upside down and calculate the inverse Jacobian matrix directly. This is a messier job, but it appears less messy than the matrix inversion above.

\begin{aligned}r^2 &= x^2 + y^2 + z^2  \\ \sin^2 \theta &= \frac{x^2 + y^2}{x^2 + y^2 + z^2} \\ \tan\phi &= \frac{y}{x}.\end{aligned} \quad\quad\quad(14)

The messy task is now the calculation of these derivatives.

For the first, from r^2 = x^2 + y^2 + z^2, taking partials on both sides, we have

\begin{aligned}\frac{\partial {r}}{\partial {x_k}} = \frac{x_k}{r}.\end{aligned} \quad\quad\quad(17)

But these are just the direction cosines, the components of our polar unit vector \hat{\mathbf{r}}. We can then write for all of these derivatives in column matrix form

\begin{aligned}\boldsymbol{\nabla} r = \hat{\mathbf{r}}\end{aligned} \quad\quad\quad(18)

Next from \sin^2\theta = (x^2 + y^2)/r^2, we get after some reduction

\begin{aligned}\frac{\partial {\theta}}{\partial {x}} &= \frac{1}{{r}} C_\theta C_\phi \\ \frac{\partial {\phi}}{\partial {y}} &= \frac{1}{{r}} C_\theta S_\phi \\ \frac{\partial {\phi}}{\partial {z}} &= -\frac{S_\theta}{r}.\end{aligned} \quad\quad\quad(19)

Observe that we can antidifferentiate with respect to theta and obtain

\begin{aligned}\boldsymbol{\nabla} \theta &= \frac{1}{{r}}\begin{bmatrix}C_\theta C_\phi \\ C_\theta S_\phi \\ -S_\theta\end{bmatrix} \\ &=\frac{1}{{r}}\frac{\partial {}}{\partial {\theta}}\begin{bmatrix}S_\theta C_\phi \\ S_\theta S_\phi \\ C_\theta\end{bmatrix}.\end{aligned}

This last column vector is our friend the unit polar vector again, and we have

\begin{aligned}\boldsymbol{\nabla} \theta &= \frac{1}{{r}}\frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}}\end{aligned} \quad\quad\quad(22)

Finally for the \phi dependence we have after some reduction

\begin{aligned}\boldsymbol{\nabla} \phi &=\frac{1}{{r S_\theta}}\begin{bmatrix}-S_\phi \\ C_\phi \\ 0\end{bmatrix}.\end{aligned} \quad\quad\quad(23)

Again, we can antidifferentiate

\begin{aligned}\boldsymbol{\nabla} \phi &=\frac{1}{{r (S_\theta)^2}}\begin{bmatrix}-S_\theta S_\phi \\ S_\theta C_\phi \\ 0\end{bmatrix} \\ &=\frac{1}{{r (S_\theta)^2}}\frac{\partial {}}{\partial {\phi}}\begin{bmatrix}S_\theta C_\phi \\ S_\theta S_\phi \\ C_\theta\end{bmatrix}.\end{aligned}

We have our unit polar vector again, and our \phi partials nicely summarized by

\begin{aligned}\boldsymbol{\nabla} \phi &=\frac{1}{{r (S_\theta)^2}}\frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}}.\end{aligned} \quad\quad\quad(24)

With this we can now write out the Jacobian matrix either explicitly, or in column vector form in terms of \hat{\mathbf{r}}. First a reminder of why we want this matrix, for the following change of variables

\begin{aligned}\begin{bmatrix}\partial_x \\ \partial_y \\ \partial_z\end{bmatrix}= \begin{bmatrix}\frac{\partial {r}}{\partial {x}} &\frac{\partial {\theta}}{\partial {x}} &\frac{\partial {\phi}}{\partial {x}}  \\ \frac{\partial {r}}{\partial {y}} &\frac{\partial {\theta}}{\partial {y}} &\frac{\partial {\phi}}{\partial {y}}  \\ \frac{\partial {r}}{\partial {z}} &\frac{\partial {\theta}}{\partial {z}} &\frac{\partial {\phi}}{\partial {z}} \end{bmatrix}\begin{bmatrix}\partial_r \\ \partial_\theta \\ \partial_\phi\end{bmatrix}.\end{aligned} \quad\quad\quad(25)

We want the Jacobian matrix

\begin{aligned}\frac{\partial (r,\theta,\phi)}{\partial (x, y, z)}=\begin{bmatrix}\boldsymbol{\nabla} r & \boldsymbol{\nabla} \theta & \boldsymbol{\nabla} \phi\end{bmatrix}=\begin{bmatrix}\hat{\mathbf{r}} & \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} & \frac{1}{{r \sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}}\end{bmatrix}.\end{aligned} \quad\quad\quad(26)

Explicitly this is

\begin{aligned}\frac{\partial (r,\theta,\phi)}{\partial (x, y, z)}=\begin{bmatrix}S_\theta C_\phi & \frac{1}{{r}} C_\theta C_\phi & -\frac{1}{{r S_\theta}} S_\phi \\ S_\theta S_\phi & \frac{1}{{r}} C_\theta S_\phi & \frac{C_\phi}{r S_\theta} \\ C_\theta        & -\frac{1}{{r}} S_\theta       &  0\end{bmatrix}.\end{aligned} \quad\quad\quad(27)

As a verification of correctness multiplication of this with 11 should produce identity. That’s a mess of trig that I don’t really feel like trying, but we can get a rough idea why it should all be the identity matrix by multiplying it out in block matrix form

\begin{aligned}\frac{\partial (x,y,z)}{\partial (r,\theta,\phi)}\frac{\partial (r,\theta,\phi)}{\partial (x, y, z)}&=\begin{bmatrix} \hat{\mathbf{r}}^\text{T} \\ r \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\theta}} \\ r \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\phi}} \end{bmatrix}\begin{bmatrix}\hat{\mathbf{r}} & \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} & \frac{1}{{r \sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}}\end{bmatrix} \\ &=\begin{bmatrix} \hat{\mathbf{r}}^\text{T} \hat{\mathbf{r}}                & \frac{1}{{r}} \hat{\mathbf{r}}^\text{T} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}}      & \frac{1}{{r \sin^2 \theta}} \hat{\mathbf{r}}^\text{T} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \\ r \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\theta}} \hat{\mathbf{r}} & \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} & \frac{1}{{\sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \\ r \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\phi}} \hat{\mathbf{r}}   & \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\phi}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}}   & \frac{1}{{\sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}^\text{T}}}{\partial {\phi}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}}\end{bmatrix}.\end{aligned}

The derivatives are vectors that lie tangential to the unit sphere. We can calculate this to verify, or we can look at the off diagonal terms which say just this if we trust the math that says these should all be zeros. For each of the off diagonal terms to be zero must mean that we have

\begin{aligned}0 = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \cdot \hat{\mathbf{r}} = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \cdot \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \cdot \hat{\mathbf{r}} \end{aligned} \quad\quad\quad(28)

This makes intuitive sense. We can also verify quickly enough that ({\partial {\hat{\mathbf{r}}}}/{\partial {\theta}})^2 = 1, and ({\partial {\hat{\mathbf{r}}}}/{\partial {\phi}})^2 = \sin^2\theta (I did this with a back of the envelope calculation using geometric algebra). That is consistent with what this matrix product implies it should equal.

Completing the gradient change of variables to spherical polar coordinates.

We are now set to calculate the gradient in spherical polar coordinates from our Cartesian representation. From 13 and
25, and 26 we have

\begin{aligned}\boldsymbol{\nabla} =\begin{bmatrix}\mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3  \end{bmatrix}\begin{bmatrix}\hat{\mathbf{r}} \cdot \mathbf{e}_1 & \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \cdot \mathbf{e}_1 & \frac{1}{{r \sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \cdot \mathbf{e}_1 \\ \hat{\mathbf{r}} \cdot \mathbf{e}_2 & \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \cdot \mathbf{e}_2 & \frac{1}{{r \sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \cdot \mathbf{e}_2 \\ \hat{\mathbf{r}} \cdot \mathbf{e}_3 & \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \cdot \mathbf{e}_3 & \frac{1}{{r \sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \cdot \mathbf{e}_3 \end{bmatrix}\begin{bmatrix}\partial_r \\ \partial_\theta \\ \partial_\phi\end{bmatrix}.\end{aligned} \quad\quad\quad(29)

The Jacobian matrix has been written out explicitly as scalars because we are now switching to an abusive notation using matrices of vector elements. Our Jacobian, a matrix of scalars happened to have a nice compact representation in column vector form, but we cannot use this when multiplying out with our matrix elements (or perhaps could if we invented more conventions, but lets avoid that). Having written it out in full we see that we recover our original compact Jacobian representation, and have just

\begin{aligned}\boldsymbol{\nabla} = \begin{bmatrix}\hat{\mathbf{r}} & \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} & \frac{1}{{r \sin^2\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \end{bmatrix}\begin{bmatrix}\partial_r \\ \partial_\theta \\ \partial_\phi\end{bmatrix}.\end{aligned} \quad\quad\quad(30)

Expanding this last product we have the gradient in its spherical polar representation

\begin{aligned}\boldsymbol{\nabla} = \begin{bmatrix}\hat{\mathbf{r}} \frac{\partial {}}{\partial {r}} + \frac{1}{{r}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \frac{\partial {}}{\partial {\theta}} + \frac{1}{{r \sin\theta}} \frac{1}{{\sin\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \frac{\partial {}}{\partial {\phi}}\end{bmatrix}.\end{aligned} \quad\quad\quad(31)

With the labels

\begin{aligned}\hat{\boldsymbol{\theta}} &= \frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \\ \hat{\boldsymbol{\phi}} &= \frac{1}{{\sin\theta}} \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}},\end{aligned} \quad\quad\quad(32)

(having confirmed that these are unit vectors), we have the final result for the gradient in this representation

\begin{aligned}\boldsymbol{\nabla} = \hat{\mathbf{r}} \frac{\partial {}}{\partial {r}} + \frac{1}{{r}} \hat{\boldsymbol{\theta}} \frac{\partial {}}{\partial {\theta}} + \frac{1}{{r \sin\theta}} \hat{\boldsymbol{\phi}} \frac{\partial {}}{\partial {\phi}}.\end{aligned} \quad\quad\quad(34)

Here the matrix delimiters for the remaining one by one matrix term were also dropped.

General expression for gradient in orthonormal frames.

Having done the computation for the spherical polar case, we get the result for any orthonormal frame for free. That is just

\begin{aligned}\boldsymbol{\nabla} = \sum_i (\boldsymbol{\nabla} q_i) \frac{\partial {}}{\partial {q_i}}.\end{aligned} \quad\quad\quad(35)

From each of the gradients we can factor out a unit vector in the direction of the gradient, and have an expression that structurally has the same form as 34. Writing \hat{\mathbf{q}}_i = (\boldsymbol{\nabla} q_i)/{\left\lvert{\boldsymbol{\nabla} q_i}\right\rvert}, this is

\begin{aligned}\boldsymbol{\nabla} = \sum_i {\left\lvert{\boldsymbol{\nabla} q_i}\right\rvert} \hat{\mathbf{q}}_i \frac{\partial {}}{\partial {q_i}}.\end{aligned} \quad\quad\quad(36)

These individual direction gradients are not necessarily easy to compute. The procedures outlined in [1] are a more effective way of dealing with this general computational task. However, if we want, we can at proceed this dumb obvious way and be able to get the desired result knowing only how to apply the chain rule, and the Cartesian definition of the gradient.


[1] F.W. Byron and R.W. Fuller. Mathematics of Classical and Quantum Physics. Dover Publications, 1992.

Posted in Math and Physics Learning. | Tagged: , , , | Leave a Comment »

An aborted attempt at a scientific style paper (geometric algebra N spherical pendulum)

Posted by peeterjoot on November 26, 2009

I’m not going to blog-ize this post since I’d have to first teach my latex to wordpress script to handle the theorem and subequations environment:

Matrix factorization of vectors in Geometric Algebra with application to the spherical N-pendulum problem

This was an attempt to cleanup my previous N spherical pendulum treatment (ie. equations of motion for an idealized chain). I’d used matrixes of multivector elements to arrive at an explicit expression for the kinetic energy for this system and then evaluated the Euler-Lagrange equations. I thought initially that this was pretty nifty, but as I was cleaning this up for an attempted arxiv post, I rudely realized that exactly the same result follows by factoring our a column vector of generalized coordinates (an angular velocity vector of sorts) from the scalar energy expressions (factoring a “one by one matrix” of scalar values into a product of matrixes).

No use of geometric algebra is required. Without the use of geometic algebra with its compact representation for the spherical parameterized points on the unit sphere, I don’t think I’d have even attempted this generalized problem. However, the idea that makes the problem tractable is completely independent of GA, and renders the paper as written just an obfuscated way to tackle it (since only a handful of people know this algebraic approach).

In case anybody is curious, here’s the abstract and introduction.


Geometric algebra provides a formalism that allows for coordinate and matrix free representations of vectors, linear transformations, and other mathematical structures. The replacement of matrix techniques with direct geometric algebra methods is a well studied field, but this need not be an exclusive replacement. While the geometric algebra multivector space does not constitute a field, operations on multivector element matrices can be defined in a consistent fashion. This use of geometric algebraic objects in matrices provides a hybrid mathematical framework with immediate applications to mathematical physics. Such an application is the dynamical examination of chain like structures, idealizable as an N mass pendulum with no planar constraints. The double planar pendulum and single mass spherical pendulum problems are well treated in Lagrangian physics texts, but due to complexity a similar treatment of the spherical N-pendulum problem is not pervasive. It will be shown that the multivector matrix technique lends itself nicely to this problem, allowing the complete specification of the equations of motion for this generalized pendulum system.


Derivation of the equations of motion for a planar motion constrained double pendulum system and a single spherical pendulum system are given as problems or examples in many texts covering Lagrangian mechanics. Setup of the Lagrangian, particularly an explicit specification of the system kinetic energy, is the difficult aspect of the multiple mass pendulum problem. Each mass in the system introduces additional interaction coupling terms, complicating the kinetic energy specification. In this paper, we utilize geometric (or Clifford) algebra to determine explicitly the Lagrangian for the spherical N pendulum system, and to evaluate the Euler-Lagrange equations for the system.

It is well known that the general specification of the kinetic energy for a system of independent point masses takes the form of a symmetric quadratic form \cite{goldstein1951cm} \cite{hestenes1999nfc}. However, actually calculating that energy explicitly for the general N-pendulum is likely thought too pedantic for even the most punishing instructor to inflict on students as a problem or example.

Considering this problem in a geometric algebra context motivates the introduction of multivector matrices. Given a one by one matrix containing a sum of vector terms, we can factor that matrix into a pair of $M \times 1$ and $1 \times M$ multivector matrices. In general there are many possible multivector factorizations of any given vector. For the pendulum problem all the required vector factorizations involve products with exponential rotation operators. This matrix factorization allows the velocity vector for each mass to be separated into a product containing one angular velocity coordinate vector and one multivector factor. Such a grouping can be used to tidily separate the kinetic energy into an explicit quadratic form, sandwiching a Hermitian multivector matrix between two vectors of generalized velocity coordinates.

While matrix algebra over the field of real numbers of complex numbers is familiar and well defined, geometric algebra multivector objects do not constitute a field, generally lacking both commutative multiplication and multiplicative inverses. This does not prevent a coherent definition of multivector matrix multiplication, provided that care is taken to retain appropriate order of element products. In the geometric algebra literature, there is significant focus on how to replace matrix and coordinate methods with geometric algebra, and less focus on how to use both together. A hybrid approach utilizing both matrix and geometric algebra has some elegance, and appears particularly well suited to the specific problem of the spherical N pendulum.

This paper is divided into two major sections. In the first, we start with a brief and minimal summary of required geometric algebra fundamentals and definitions. This includes the axioms, some nomenclature, and the use of one sided exponential rotation operators. To anybody already familiar with geometric algebra, this can be skipped or referred to only for notational conventions. Next, also in the first section, we define multivector matrix operations. A notion of Hermiticity useful for forming the quadratic forms used to express the kinetic energy for a multiple point mass system is defined here. A few examples of vector factorization into multivector matrix products are supplied for illustration purposes.

The second part of this paper applies the geometric algebra matrix techniques to the pendulum problem. For the single and double pendulum problems, already familiar to many, the methods and the results are compared to traditional coordinate geometry techniques used elsewhere. It is noteworthy that even the foundational geometric algebra mechanics text \cite{hestenes1999nfc} falls back to coordinate geometry for the planar double pendulum problem, instead of treating it in a way that is arguably more natural given the available tools.

Enroute to the Lagrangian we find an expression for the kinetic energy as a sum of bilinear forms, where the matrices used to express these forms are composed of pairs of block matrix products. This is surely a result that can be obtained by non geometric algebra methods, although how to obtain this result using alternate tools is not immediately obvious.

The end result of this paper is a complete and explicit specification of the Lagrangian and evaluation of the Euler-Lagrange equations for the chain-like N spherical pendulum system. While, this end result is free of geometric algebra, being nothing more than a non-linear set of coupled differential equations, it is believed that this geometric algebra matrix approach is an elegant and compact way to obtain it or to tackle similar problems. At the very least, this may prove to be one more useful addition to the physicist’s mathematical toolbox.

Posted in Math and Physics Learning. | Tagged: , , , , , , | Leave a Comment »

Spherical polar pendulum for one and multiple masses (Take II)

Posted by peeterjoot on November 10, 2009

[Click here for a PDF of this post with nicer formatting]


Attempting the multiple spherical pendulum problem with a bivector parameterized Lagrangian has just been attempted ([1]), but did not turn out to be an effective approach. Here a variation is used, employing regular plain old scalar spherical angle parameterized Kinetic energy, but still employing Geometric Algebra to express the Hermitian quadratic form associated with this energy term.

The same set of simplifying assumptions will be made. These are point masses, zero friction at the pivots and rigid nonspringy massless connecting rods between the masses.

The Lagrangian.

A two particle spherical pendulum is depicted in figure (\ref{fig:sPolarMultiPendulum:pendulumDouble})

\caption{Double spherical pendulum.}

The position vector for each particle can be expressed relative to the mass it is connected to (or the origin for the first particle), as in

\begin{aligned}z_k &= z_{k-1} + \mathbf{e}_3 l_k e^{j_k \theta_k} \\ j_k &= \mathbf{e}_3 \wedge \left( \mathbf{e}_1 e^{i \phi_k} \right) \\ i &= \mathbf{e}_1 \wedge \mathbf{e}_2\end{aligned} \quad\quad\quad(1)

To express the Kinetic energy for any of the masses m_k, we need the derivative of the incremental difference in position

\begin{aligned}\frac{d}{dt} \left( \mathbf{e}_3 e^{j_k \theta_k} \right)&=\mathbf{e}_3 \left( j_k \dot{\theta}_k e^{j_k \theta_k} + \frac{d j_k }{dt} \sin\theta_k \right)  \\ &=\mathbf{e}_3 \left( j_k \dot{\theta}_k e^{j_k \theta_k} + \mathbf{e}_3 \mathbf{e}_2 \dot{\phi}_k e^{i \phi_k} \sin\theta_k \right)  \\ &=\left( \frac{d}{dt}\begin{bmatrix}\theta_k & \phi_k\end{bmatrix} \right)\begin{bmatrix}\mathbf{e}_1 e^{i \phi_k} e^{j_k \theta_k} \\ \mathbf{e}_2 e^{i \phi_k} \sin\theta_k\end{bmatrix}\end{aligned}

Introducing a Hermitian conjugation A^\dagger = \tilde{A}^\text{T}, reversing and transposing the matrix, and writing

\begin{aligned}A_k &=\begin{bmatrix}\mathbf{e}_1 e^{i \phi_k} e^{j_k \theta_k} \\ \mathbf{e}_2 e^{i \phi_k} \sin\theta_k\end{bmatrix} \\ \boldsymbol{\Theta}_k &=\begin{bmatrix}\theta_k \\ \phi_k\end{bmatrix}\end{aligned} \quad\quad\quad(4)

We can now write the relative velocity differential as

\begin{aligned}(\dot{z}_k - \dot{z}_{k-1})^2 = l_k^2 {\dot{\boldsymbol{\Theta}}_k}^\dagger A_k A_k^\dagger \dot{\boldsymbol{\Theta}}_k\end{aligned} \quad\quad\quad(6)

Observe that the inner product is Hermitian under this definition since (A_k A_k^\dagger)^\dagger = A_k A_k^\dagger. \footnote{Realized later, and being too lazy to adjust everything in these notes, the use of reversion here is not neccessary. Since the generalized coordinates are scalars we could use transposition instead of Hermitian conjugation. All the matrix elements are vectors so reversal doesn’t change anything.}

The total (squared) velocity of the kth particle is then

\begin{aligned}\boldsymbol{\Theta} &=\begin{bmatrix}\boldsymbol{\Theta}_1 \\ \boldsymbol{\Theta}_2 \\ \vdots \\ \boldsymbol{\Theta}_N \\ \end{bmatrix} \\ B_k &=\begin{bmatrix}l_1 A_1 \\ l_2 A_2 \\ \vdots \\ l_k A_k \\ 0 \\ \end{bmatrix} \\ (\dot{z}_k)^2 &=\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(7)

(where the zero matrix in B_k is a N-k by one zero). Summing over all masses and adding in the potential energy we have for the Lagrangian of the system

\begin{aligned}K &=\frac{1}{{2}} \sum_{k=1}^N m_k \dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}} \\ \mu_k &= \sum_{j=k}^N m_j \\ \Phi &=g \sum_{k=1}^N \mu_k l_k \cos\theta_k \\ \mathcal{L} &= K - \Phi\end{aligned} \quad\quad\quad(10)

There’s a few layers of equations involved and we still have an unholy mess of matrix and geometric algebra in the kernel of the kinetic energy quadratic form, but at least this time all the generalized coordinates of the system are scalars.

Some tidy up.

Before continuing with evaluation of the Euler-Lagrange equations it is helpful to make a couple of observations about the structure of the matrix products that make up our velocity quadratic forms

\begin{aligned}\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}&=\dot{\boldsymbol{\Theta}}^\dagger \begin{bmatrix}\begin{bmatrix}l_1^2 A_1 A_1^\dagger & l_1 l_2 A_1 A_2^\dagger & \hdots & l_1 l_k A_1 A_k^\dagger \\ l_2 l_1 A_2 A_1^\dagger & l_2^2 A_2 A_2^\dagger & \hdots & l_2 l_k A_2 A_k^\dagger \\ \vdots \\ l_k l_1 A_k A_1^\dagger & l_k l_2 A_k A_2^\dagger & \hdots & l_k^2 A_k A_k^\dagger \end{bmatrix} & 0 \\ 0 & 0\end{bmatrix}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(14)

Specifically, consider the A_a A_b^\dagger products that make up the elements of the matrices Q_k = B_k B_k^\dagger. Without knowing anything about the grades that make up the elements of Q_k, since it is Hermitian (by this definition of Hermitian) there can be no elements of grade order two or three in the final matrix. This is because reversion of such grades inverts the sign, and the matrix elements in Q_k all equal their reverse. Additionally, the elements of the multivector column matrices A_k are vectors, so in the product A_a A_b^\dagger we can only have scalar and bivector (grade two) elements. The resulting one by one scalar matrix is a sum over all the mixed angular velocities \dot{\theta}_a \dot{\theta}_b, \dot{\theta}_a \dot{\phi}_b, and \dot{\phi}_a \dot{\phi}_b, so once this summation is complete any bivector grades of A_a A_b^\dagger must cancel out. This is consistent with the expectation that we have a one by one scalar matrix result out of this in the end (i.e. a number). The end result is a freedom to exploit the convienence of explicitly using a scalar selection operator that filters out any vector, bivector, and trivector grades in the products A_a A_b^\dagger. We will get the same result if we write

\begin{aligned}\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}&=\dot{\boldsymbol{\Theta}}^\dagger \begin{bmatrix}\begin{bmatrix}l_1^2 \left\langle{{A_1 A_1^\dagger}}\right\rangle & l_1 l_2 \left\langle{{A_1 A_2^\dagger}}\right\rangle & \hdots & l_1 l_k \left\langle{{A_1 A_k^\dagger}}\right\rangle \\ l_2 l_1 \left\langle{{A_2 A_1^\dagger}}\right\rangle & l_2^2 \left\langle{{A_2 A_2^\dagger}}\right\rangle & \hdots & l_2 l_k \left\langle{{A_2 A_k^\dagger}}\right\rangle \\ \vdots \\ l_k l_1 \left\langle{{A_k A_1^\dagger}}\right\rangle & l_k l_2 \left\langle{{A_k A_2^\dagger}}\right\rangle & \hdots & l_k^2 \left\langle{{A_k A_k^\dagger}}\right\rangle\end{bmatrix} & 0 \\ 0 & 0\end{bmatrix}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(15)

Pulling in the summation over m_k we have

\begin{aligned}\sum_k m_k\dot{\boldsymbol{\Theta}}^\dagger B_k B_k^\dagger \dot{\boldsymbol{\Theta}}&=\dot{\boldsymbol{\Theta}}^\dagger {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{A_r A_c^\dagger}}\right\rangle\end{bmatrix}}_{rc}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(16)

It appears justifiable to lable the \mu_{\max(r,c)} l_r l_c factors of the angular velocity matrices as moments of inertia in a generalized sense. Using this block matrix form, and scalar selection, we can now write the Lagrangian in a slightly tidier form

\begin{aligned}\mu_k &= \sum_{j=k}^N m_j \\ Q &= {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c A_r A_c^\dagger \end{bmatrix}}_{rc} \\ K &=\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q\dot{\boldsymbol{\Theta}} =\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} \left\langle{{Q}}\right\rangle\dot{\boldsymbol{\Theta}} \\ \Phi &=g \sum_{k=1}^N \mu_k l_k \cos\theta_k \\ \mathcal{L} &= K - \Phi\end{aligned} \quad\quad\quad(17)

After some expansion, writing S_\theta = \sin\theta, C_\phi = \cos\phi and so forth, one can find that the scalar parts of the block matrixes A_r A_c^\dagger contained in Q are

\begin{aligned}\left\langle{{A_r A_c^\dagger}}\right\rangle=\begin{bmatrix}C_{\phi_c - \phi_r} C_{\theta_r}C_{\theta_c}+S_{\theta_r}S_{\theta_c} &-S_{\phi_c - \phi_r} C_{\theta_r} S_{\theta_c} \\ S_{\phi_c - \phi_r} C_{\theta_c} S_{\theta_r} &C_{\phi_c - \phi_r} S_{\theta_r} S_{\theta_c}\end{bmatrix}\end{aligned} \quad\quad\quad(22)

The diagonal blocks are particularily simple and have no \phi dependence

\begin{aligned}\left\langle{{A_r A_r^\dagger}}\right\rangle=\begin{bmatrix}1 & 0 \\ 0 & \sin^2 \theta_r\end{bmatrix}\end{aligned} \quad\quad\quad(23)

Observe also that \left\langle{{A_r A_c^\dagger}}\right\rangle^T = \left\langle{{A_c A_r^\dagger}}\right\rangle, so the scalar matrix

\begin{aligned}\left\langle{{Q}}\right\rangle = {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{ A_r A_c^\dagger }}\right\rangle\end{bmatrix}}_{rc}\end{aligned} \quad\quad\quad(24)

is a real symmetric matrix. We have the option of using this explicit scalar expansion if desired for further computations associated with this problem. That completely eliminates the Geometric algebra from the problem, and is probably a logical way to formulate things for numerical work since one can then exploit any pre existing matrix algebra system without having to create one that understands non-commuting variables and vector products.

Evaluating the Euler-Lagrange equations.

For the acceleration terms of the Euler-Lagrange equations our computation reduces nicely to a function of only \left\langle{{Q}}\right\rangle

\begin{aligned}\frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {\dot{\theta}_a}}&=\frac{1}{{2}} \frac{d}{dt} \left(\frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}}^\text{T}\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}+\dot{\boldsymbol{\Theta}}^\text{T}\left\langle{{Q}}\right\rangle \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}}\right)  \\ &=\frac{d}{dt} \left({\begin{bmatrix}\delta_{ac}\begin{bmatrix}1 & 0\end{bmatrix}\end{bmatrix}}_c\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\right) \end{aligned}


\begin{aligned}\frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {\dot{\phi}_a}}&=\frac{1}{{2}} \frac{d}{dt} \left(\frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\phi}_a}}^\text{T}\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}+\dot{\boldsymbol{\Theta}}^\text{T}\left\langle{{Q}}\right\rangle \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\phi}_a}}\right)  \\ &=\frac{d}{dt} \left({\begin{bmatrix}\delta_{ac}\begin{bmatrix}0 & 1\end{bmatrix}\end{bmatrix}}_c\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\right) \end{aligned}

The last groupings above made use of \left\langle{{Q}}\right\rangle = \left\langle{{Q}}\right\rangle^\text{T}, and in particular (\left\langle{{Q}}\right\rangle + \left\langle{{Q}}\right\rangle^\text{T})/2 = \left\langle{{Q}}\right\rangle. We can now form a column matrix putting all the angular velocity gradient in a tidy block matrix representation

\begin{aligned}\nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} = {\begin{bmatrix}\begin{bmatrix}\frac{\partial {\mathcal{L}}}{\partial {\dot{\theta}_r}} \\ \frac{\partial {\mathcal{L}}}{\partial {\dot{\phi}_r}} \\ \end{bmatrix}\end{bmatrix}}_r = \left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(25)

A small aside on Hamiltonian form. This velocity gradient is also the conjugate momentum of the Hamiltonian, so if we wish to express the Hamiltonian in terms of conjugate momenta, we require invertability of \left\langle{{Q}}\right\rangle at the point in time that we evaluate things. Writing

\begin{aligned}P_{\boldsymbol{\Theta}} = \nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} \end{aligned} \quad\quad\quad(26)

and noting that (\left\langle{{Q}}\right\rangle^{-1})^\text{T} = \left\langle{{Q}}\right\rangle^{-1}, we get for the kinetic energy portion of the Hamiltonian

\begin{aligned}K = \frac{1}{{2}} {P_{\boldsymbol{\Theta}}}^\text{T} \left\langle{{Q}}\right\rangle^{-1} P_{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(27)

Now, the invertability of \left\langle{{Q}}\right\rangle cannot be taken for granted. Even in the single particle case we do not have invertability. For the single particle case we have

\begin{aligned}\left\langle{{Q}}\right\rangle =m l^2 \begin{bmatrix}1 & 0 \\ 0 & \sin^2 \theta\end{bmatrix}\end{aligned} \quad\quad\quad(28)

so at \theta = \pm \pi/2 this quadratic form is singlular, and the planar angular momentum becomes a constant of motion.

Returning to the evaluation of the Euler-Lagrange equations, the problem is now reduced to calculating the right hand side of the following system

\begin{aligned}\frac{d}{dt} \left( \left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}} \right) ={\begin{bmatrix}\begin{bmatrix}\frac{\partial {\mathcal{L}}}{\partial {\theta_r}} \\ \frac{\partial {\mathcal{L}}}{\partial {\phi_r}} \\ \end{bmatrix}\end{bmatrix}}_r\end{aligned} \quad\quad\quad(29)

With back substituition of 22, and 24 we have a complete non-multivector expansion of the left hand side. For the right hand side taking the \theta_a and \phi_a derivatives respectively we get

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\theta_a}}=\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{\frac{\partial {A_r}}{\partial {\theta_a}} A_c^\dagger + A_r \frac{\partial {A_c}}{\partial {\theta_a}}^\dagger}}\right\rangle\end{bmatrix}}_{rc} \dot{\boldsymbol{\Theta}}-g \mu_a l_a \sin\theta_a \end{aligned} \quad\quad\quad(30)

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\phi_a}}=\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{\frac{\partial {A_r}}{\partial {\phi_a}} A_c^\dagger + A_r \frac{\partial {A_c}}{\partial {\phi_a}}^\dagger}}\right\rangle\end{bmatrix}}_{rc} \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(31)

So to procede we must consider the \left\langle{{A_r A_c^\dagger}}\right\rangle partials. A bit of thought shows that the matrices of partials above are mostly zeros. Illustrating by example, consider {\partial {\left\langle{{Q}}\right\rangle}}/{\partial {\theta_2}}, which in block matrix form is

\begin{aligned}\frac{\partial {\left\langle{{Q}}\right\rangle}}{\partial {\theta_2}}=\begin{bmatrix}0 & \frac{1}{{2}} \mu_2 l_1 l_2 \left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle & 0 & \hdots & 0 \\ \frac{1}{{2}} \mu_2 l_2 l_1 \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_1^\dagger}}\right\rangle &\frac{1}{{2}} \mu_2 l_2 l_2 \left\langle{{A_2 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger + \frac{\partial {A_2}}{\partial {\theta_2}} A_2^\dagger}}\right\rangle &\frac{1}{{2}} \mu_3 l_2 l_3 \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_3^\dagger}}\right\rangle & \hdots &\frac{1}{{2}} \mu_N l_2 l_N \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_N^\dagger}}\right\rangle \\ 0 & \frac{1}{{2}} \mu_3 l_3 l_2 \left\langle{{A_3 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle & 0 & \hdots & 0 \\ 0 & \vdots & 0 & \hdots & 0 \\ 0 & \frac{1}{{2}} \mu_N l_N l_2 \left\langle{{A_N \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle & 0 & \hdots & 0 \\ \end{bmatrix}\end{aligned} \quad\quad\quad(32)

Observe that the diagonal term has a scalar plus its reverse, so we can drop the one half factor and one of the summands for a total contribution to {\partial {\mathcal{L}}}/{\partial {\theta_2}} of just

\begin{aligned}\mu_2 {l_2}^2 {\dot{\boldsymbol{\Theta}}_2}^\text{T} \left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_2^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2\end{aligned}

Now consider one of the pairs of off diagonal terms. Adding these we contributions to {\partial {\mathcal{L}}}/{\partial {\theta_2}} of

\begin{aligned}\frac{1}{{2}} \mu_2 l_1 l_2 {\dot{\boldsymbol{\Theta}}_1}^\text{T}\left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2+\frac{1}{{2}} \mu_2 l_2 l_1 {\dot{\boldsymbol{\Theta}}_2}^\text{T}\left\langle{{\frac{\partial {A_2}}{\partial {\theta_2}} A_1^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_1&=\frac{1}{{2}} \mu_2 l_1 l_2 {\dot{\boldsymbol{\Theta}}_1}^\text{T}\left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger + A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2 \\ &=\mu_2 l_1 l_2 {\dot{\boldsymbol{\Theta}}_1}^\text{T}\left\langle{{A_1 \frac{\partial {A_2}}{\partial {\theta_2}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_2 \\ \end{aligned}

This has exactly the same form as the diagonal term, so summing over all terms we get for the position gradient components of the Euler-Lagrange equation just

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\theta_a}}&=\sum_{k}\mu_{\max(k,a)} l_k l_a {\dot{\boldsymbol{\Theta}}_k}^\text{T}\left\langle{{A_k \frac{\partial {A_a}}{\partial {\theta_a}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_a -g \mu_a l_a \sin\theta_a \end{aligned} \quad\quad\quad(33)

\begin{aligned}\frac{\partial {\mathcal{L}}}{\partial {\phi_a}}&=\sum_{k}\mu_{\max(k,a)} l_k l_a {\dot{\boldsymbol{\Theta}}_k}^\text{T}\left\langle{{A_k \frac{\partial {A_a}}{\partial {\phi_a}}^\dagger}}\right\rangle \dot{\boldsymbol{\Theta}}_a \end{aligned} \quad\quad\quad(34)

The only thing that remains to do is evaluate the \left\langle{{A_k {\partial {A_a}}/{\partial {\phi_a}}^\dagger}}\right\rangle matrixes.

It should be possible but it is tedious to calculate the block matrix derivative terms from the A_a partials using

\begin{aligned}\frac{\partial {A_a}}{\partial {\theta_a}} &=\begin{bmatrix}-\mathbf{e}_3 e^{j_a \theta_a} \\ \mathbf{e}_2 e^{i \phi_a} C_{\theta_a}\end{bmatrix}\end{aligned} \quad\quad\quad(35)

\begin{aligned}\frac{\partial {A_a}}{\partial {\phi_a}}&=\begin{bmatrix}\mathbf{e}_2 e^{i \phi_a} C_{\theta_a} \\ -\mathbf{e}_1 e^{i \phi_a} S_{\theta_a}\end{bmatrix}\end{aligned} \quad\quad\quad(36)

However multiplying this out and reducing is a bit tedious and would be a better job for a symbolic algebra package. With 22 available to use, one gets easily

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_c}}{\partial {\theta_c}}^\dagger }}\right\rangle&=\begin{bmatrix}-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} + S_{\theta_k} C_{\theta_a} &-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} \\ -S_{\phi_a - \phi_k} S_{\theta_a} S_{\theta_k} &C_{\phi_a - \phi_k} (1 + \delta_{k a}) S_{\theta_k} C_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(37)

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_a}}{\partial {\phi_a}}^\dagger }}\right\rangle&=\begin{bmatrix}-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} + S_{\theta_k} S_{\theta_a} &-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} \\ C_{\phi_a - \phi_k} C_{\theta_a} S_{\theta_k} &-S_{\phi_a - \phi_k} S_{\theta_k} S_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(38)

The right hand side of the Euler-Lagrange equations now becomes

\begin{aligned}\nabla_{\boldsymbol{\Theta}} \mathcal{L} =\sum_k{\begin{bmatrix}\begin{bmatrix}\mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\theta_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \\ \mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\phi_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \end{bmatrix}\end{bmatrix}}_r- g{\begin{bmatrix}\mu_r l_r \sin\theta_r \begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_r\end{aligned} \quad\quad\quad(39)

Can the \dot{\boldsymbol{\Theta}}_a matrices be factored out, perhaps allowing for expression as a function of \dot{\boldsymbol{\Theta}}? How to do that if it is possible is not obvious. The driving reason to do so would be to put things into a tidy form where things are a function of the system angular velocity vector \boldsymbol{\Theta}, but this is not possible anyways since the gradient is non-linear.

Hamiltonian form and linearization.

Having calculated the Hamiltonian equations for the multiple mass planar pendulum in [2], doing so for the spherical pendulum can now be done by inspection. With the introduction of a phase space vector for the system using the conjugate momenta (for angles where these conjugate momenta are non-singlular)

\begin{aligned}\mathbf{z} = \begin{bmatrix}P_{\boldsymbol{\Theta}} \\ \boldsymbol{\Theta}\end{bmatrix}\end{aligned} \quad\quad\quad(40)

we can write the Hamiltonian equations

\begin{aligned}\frac{d\mathbf{z}}{dt} = \begin{bmatrix}\nabla_{\boldsymbol{\Theta}} \mathcal{L} \\ \left\langle{{Q}}\right\rangle^{-1} P_{\boldsymbol{\Theta}}\end{bmatrix}\end{aligned} \quad\quad\quad(41)

The position gradient is given explicitly in 39, and that can be substituted here. That gradient is expressed in terms of \dot{\boldsymbol{\Theta}}_k and not the conjugate momenta, but the mapping required to express the whole system in terms of the conjugate momenta is simple enough

\begin{aligned}\dot{\boldsymbol{\Theta}}_k = {\begin{bmatrix}\delta_{kc} I_{22}\end{bmatrix}}_c \left\langle{{Q}}\right\rangle^{-1} P_{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(42)

It is apparent that for any sort of numerical treatment use of a angular momentum and angular position phase space vector is not prudent. If the aim is nothing more than working with a first order system instead of second order, then we are probably better off with an angular velocity plus angular position phase space system.

\begin{aligned}\frac{d}{dt}\begin{bmatrix}\left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}} \\ \boldsymbol{\Theta}\end{bmatrix}=\begin{bmatrix}\nabla_{\boldsymbol{\Theta}} \mathcal{L} \\ \dot{\boldsymbol{\Theta}}\end{bmatrix}\end{aligned} \quad\quad\quad(43)

This eliminates the requirement for inverting the sometimes singular matrix \left\langle{{Q}}\right\rangle, but one is still left with something that is perhaps tricky to work with since we have the possibility of zeros on the left hand side. The resulting equation is of the form

\begin{aligned}M \mathbf{x}' = f(\mathbf{x})\end{aligned} \quad\quad\quad(44)

where M = \left[\begin{smallmatrix}\left\langle{{Q}}\right\rangle & 0 \\  0 & I\end{smallmatrix}\right] is a possibly singular matrix, and f is a non-linear function of the components of \boldsymbol{\Theta}, and \dot{\boldsymbol{\Theta}}. This is concievably linearizable in the neighbourhood of a particular phase space point \mathbf{x}_0. If that is done, resulting in an equation of the form

\begin{aligned}M \mathbf{y}' = f(\mathbf{x}_0) + B \mathbf{y} \end{aligned} \quad\quad\quad(45)

where \mathbf{x} = \mathbf{y} + \mathbf{x}_0 and B is an appropriate matrix of partials (the specifics of which don’t really have to be spelled out here). Because of the possible singularities of M the exponentiation techniques applied to the linearized planar pendulum may not be possible with such a linearization. Study of this less well formed system of LDEs probably has interesting aspects, but is also likely best tackled independently of the specifics of the spherical pendulum problem.

Thoughts about the Hamiltonian singularity.

The fact that the Hamiltonian goes singular on the horizontal in this spherical polar representation is actually what I think is the most interesting bit in the problem (the rest being a lot mechanical details). On the horizontal \phi=0 or \dot{\phi} = 37000 radians/sec makes no difference to the dynamics. All you can say is that the horizontal plane angular momentum is a constant of the system. It seems very much like the increasing uncertaintly that you get in the corresponding radial QM equation. Once you start pinning down the \theta angle, you loose the ability to say much about \phi.

It is also kind of curious how the energy of the system is never ill defined but a choice of a particular orientation to use as a reference for observations of the momenta introduces the singularity as the system approaches the horizontal in that reference frame.

Perhaps there are some deeper connections relating these classical and QM similarity. Would learning about symplectic flows and phase space volume invariance shed some light on this?

A summary.

A fair amount of notation was introduced along the way in the process of formulating the spherical pendulum equations. It is worthwhile to do a final consise summary of notation and results before moving on for future reference.

The positions of the masses are given by

\begin{aligned}z_k &= z_{k-1} + \mathbf{e}_3 l_k e^{j_k \theta_k} \\ j_k &= \mathbf{e}_3 \wedge \left( \mathbf{e}_1 e^{i \phi_k} \right) \\ i &= \mathbf{e}_1 \wedge \mathbf{e}_2\end{aligned} \quad\quad\quad(46)

With the introduction of a column vector of vectors (where we multiply matrices using the Geometric vector product),

\begin{aligned}\boldsymbol{\Theta}_k &=\begin{bmatrix}\theta_k \\ \phi_k\end{bmatrix}\end{aligned} \quad\quad\quad(49)

\begin{aligned}\boldsymbol{\Theta} &={\begin{bmatrix}\boldsymbol{\Theta}_1 &\boldsymbol{\Theta}_2 &\hdots &\boldsymbol{\Theta}_N \end{bmatrix}}^\text{T}\end{aligned} \quad\quad\quad(50)

and a matrix of velocity components (with matrix multiplication of the vector elements using the Geometric vector product), we can form the Lagrangian

\begin{aligned}A_k &=\begin{bmatrix}\mathbf{e}_1 e^{i \phi_k} e^{j_k \theta_k} \\ \mathbf{e}_2 e^{i \phi_k} S_{\theta_k}\end{bmatrix} \end{aligned} \quad\quad\quad(51)

\begin{aligned}\mu_k &= \sum_{j=k}^N m_j \\ \left\langle{{Q}}\right\rangle &= {\begin{bmatrix}\mu_{\max(r,c)} l_r l_c \left\langle{{A_r A_c^\text{T}}}\right\rangle\end{bmatrix}}_{rc} \\ K &= \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} \left\langle{{Q}}\right\rangle\dot{\boldsymbol{\Theta}} \\ \Phi &=g \sum_{k=1}^N \mu_k l_k C_{\theta_k} \\ \mathcal{L} &= K - \Phi\end{aligned} \quad\quad\quad(52)

An explicit scalar matrix evaluation of the (symmetric) block matrix components of \left\langle{{Q}}\right\rangle was evaluated and found to be

\begin{aligned}\left\langle{{A_r A_c^\text{T}}}\right\rangle=\begin{bmatrix}C_{\phi_c - \phi_r} C_{\theta_r}C_{\theta_c}+S_{\theta_r}S_{\theta_c} &-S_{\phi_c - \phi_r} C_{\theta_r} S_{\theta_c} \\ S_{\phi_c - \phi_r} C_{\theta_c} S_{\theta_r} &C_{\phi_c - \phi_r} S_{\theta_r} S_{\theta_c}\end{bmatrix}\end{aligned} \quad\quad\quad(57)

These can be used if explicit evaluation of the Kinetic energy is desired, avoiding redundant summation over the pairs of skew entries in the quadratic form matrix \left\langle{{Q}}\right\rangle

\begin{aligned}K = \frac{1}{{2}} \sum_k \mu_k l_k^2 {\dot{\boldsymbol{\Theta}}_k}^T \left\langle{{A_k A_k^\text{T}}}\right\rangle \dot{\boldsymbol{\Theta}}_k+ \sum_{r<c} \mu_{\max(r,c)} l_r l_c {\dot{\boldsymbol{\Theta}}_r}^\text{T}\left\langle{{A_r A_c^\text{T}}}\right\rangle\dot{\boldsymbol{\Theta}}_c\end{aligned} \quad\quad\quad(58)

We utilize angular position and velocity gradients

\begin{aligned}\nabla_{\boldsymbol{\Theta}_k} &= \begin{bmatrix}\frac{\partial}{\partial \theta_k} \\ \frac{\partial}{\partial \phi_k} \end{bmatrix} \\ \nabla_{\dot{\boldsymbol{\Theta}}_k} &= \begin{bmatrix}\frac{\partial }{\partial \dot{\theta}_k} \\ \frac{\partial }{\partial \dot{\phi}_k} \end{bmatrix} \\ \nabla_{\boldsymbol{\Theta}} &= {\begin{bmatrix}{\nabla_{\boldsymbol{\Theta}_1}}^\text{T} & {\nabla_{\boldsymbol{\Theta}_2}}^\text{T} & \hdots  & {\nabla_{\boldsymbol{\Theta}_N}}^\text{T} \end{bmatrix}}^\text{T} \\ \nabla_{\boldsymbol{\dot{\Theta}}} &= {\begin{bmatrix}{\nabla_{\boldsymbol{\dot{\Theta}}_1}}^\text{T} & {\nabla_{\boldsymbol{\dot{\Theta}}_2}}^\text{T} & \hdots  & {\nabla_{\boldsymbol{\dot{\Theta}}_N}}^\text{T} \end{bmatrix}}^\text{T} \end{aligned} \quad\quad\quad(59)

and use these to form the Euler-Lagrange equations for the system in column vector form

\begin{aligned}\frac{d}{dt} \nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} = \nabla_{\boldsymbol{\Theta}} \mathcal{L}\end{aligned} \quad\quad\quad(63)

For the canonical momenta we found the simple result

\begin{aligned}\nabla_{\dot{\boldsymbol{\Theta}}} \mathcal{L} = \left\langle{{Q}}\right\rangle \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(64)

For the position gradient portion of the Euler-Lagrange equations 63 we found in block matrix form

\begin{aligned}\nabla_{\boldsymbol{\Theta}} \mathcal{L} =\sum_k{\begin{bmatrix}\begin{bmatrix}\mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\theta_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \\ \mu_{\max(k,r)} l_k l_r {\dot{\boldsymbol{\Theta}}_k}^\text{T} \left\langle{{ A_k \frac{\partial {A_r}}{\partial {\phi_r}}^\dagger }}\right\rangle \dot{\boldsymbol{\Theta}}_r \end{bmatrix}\end{bmatrix}}_r- g{\begin{bmatrix}\mu_r l_r S_{\theta_r}\begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_r\end{aligned} \quad\quad\quad(65)

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_c}}{\partial {\theta_c}}^\dagger }}\right\rangle&=\begin{bmatrix}-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} + S_{\theta_k} C_{\theta_a} &-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} \\ -S_{\phi_a - \phi_k} S_{\theta_a} S_{\theta_k} &C_{\phi_a - \phi_k} (1 + \delta_{k a}) S_{\theta_k} C_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(66)

\begin{aligned}\left\langle{{ A_k \frac{\partial {A_a}}{\partial {\phi_a}}^\dagger }}\right\rangle&=\begin{bmatrix}-S_{\phi_a - \phi_k} C_{\theta_k} C_{\theta_a} + S_{\theta_k} S_{\theta_a} &-C_{\phi_a - \phi_k} C_{\theta_k} S_{\theta_a} \\ C_{\phi_a - \phi_k} C_{\theta_a} S_{\theta_k} &-S_{\phi_a - \phi_k} S_{\theta_k} S_{\theta_a} \end{bmatrix}\end{aligned} \quad\quad\quad(67)

A set of Hamiltonian equations for the system could also be formed. However, this requires that one somehow restrict attention to the subset of phase space where the canonical momenta matrix \left\langle{{Q}}\right\rangle is non-singular, something not generally possible.


[1] Peeter Joot. {Spherical polar pendulum for one and multiple masses, and multivector Euler-Lagrange formulation.} [online].

[2] Peeter Joot. Hamiltonian notes. [online].

Posted in Math and Physics Learning. | Tagged: , , , , , , , , , , | Leave a Comment »

Spherical polar pendulum for one and multiple masses, and multivector Euler-Lagrange formulation.

Posted by peeterjoot on November 4, 2009

[Click here for a PDF of this post with nicer formatting]


The planar multiple pendulum problem proved somewhat tractable in the Hamiltonian formulation. Generalizing this to allow for three dimensional motion is a logical next step. Here this is attempted, using a Geometric Algebra scalar plus bivector parametrization of the spherical position of each dangling mass relative to the position of what it is attached to, as in

\begin{aligned}\mathbf{z} = l \mathbf{e}_3 e^{j\theta}\end{aligned} \quad\quad\quad(1)

The exponential is essentially a unit quaternion, rotating the vector l \mathbf{e}_3 from the polar axis to its \theta,\phi angle dependent position. Two sided rotation operators are avoided here by requiring of the unit bivector j = \mathbf{e}_3 \wedge \mathbf{m}, where \mathbf{m} is a vector in the plane of rotation passing through the great circle from \mathbf{e}_3 through \mathbf{z}. Note that we are free to pick \mathbf{m} = \mathbf{e}_1 e^{\mathbf{e}_1 \mathbf{e}_2 \phi}, the unit vector in the x,y plane at angle \phi from the x-axis. When that is done j = \mathbf{e}_3 \mathbf{m} since these are perpendicular. Setting up the Lagrangian in terms of the bivector j instead of the scalar angle \phi will be attempted, since this is expected to have some elegance and will be a fun way to try the problem. This should also provide a concrete example of a multivector Lagrangian in a context much simpler than electromagnetic fields or quantum mechanics.

Note finally that a number of simplifying assumptions will be made. These include use of point masses, zero friction at the pivots and rigid nonspringy massless connecting rods between the masses.

Kinetic energy for the single pendulum case.

Let’s compute derivatives of the unit vector

\begin{aligned}\hat{\mathbf{z}} = \mathbf{e}_3 e^{j\theta} = e^{-j\theta} \mathbf{e}_3\end{aligned} \quad\quad\quad(2)

This can be done with both the left and right factorization of \mathbf{e}_3, and are respectively

\begin{aligned}\dot{\hat{\mathbf{z}}} &= \mathbf{e}_3 \left( j \dot{\theta} e^{j\theta} + \frac{dj}{dt} \sin\theta \right) \\ &=\mathbf{e}_3\begin{bmatrix}j e^{j\theta} & \sin\theta\end{bmatrix}\frac{d}{dt} \begin{bmatrix}\theta \\ j\end{bmatrix}\end{aligned} \quad\quad\quad(3)

\begin{aligned}\dot{\hat{\mathbf{z}}} &= \left( -j \dot{\theta} e^{-j\theta} - \frac{dj}{dt} \sin\theta \right) \mathbf{e}_3 \\ &=\left(\frac{d}{dt} \begin{bmatrix}\theta & -j\end{bmatrix}\right)\begin{bmatrix}-j e^{-j\theta} \\ \sin\theta\end{bmatrix}\mathbf{e}_3\end{aligned} \quad\quad\quad(5)

These derivatives have been grouped into a matrix factors that allow a natural seeming conjugate operation to be defined. That is for a matrix of multivector elements a_{ij}

\begin{aligned}A =\begin{bmatrix}a_{ij}\end{bmatrix}\end{aligned} \quad\quad\quad(7)

define a conjugate matrix, as the transpose of the reversed elements

\begin{aligned}A^\dagger \equiv\begin{bmatrix}\tilde{a}_{ji}\end{bmatrix}\end{aligned} \quad\quad\quad(8)

With this definition, plus two helpers

\begin{aligned}\boldsymbol{\Theta} &\equiv\begin{bmatrix}\theta \\  j\end{bmatrix} \\ R &= \begin{bmatrix}j e^{j\theta} & \sin\theta\end{bmatrix}\end{aligned} \quad\quad\quad(9)

Our velocity becomes

\begin{aligned}{\dot{\hat{\mathbf{z}}}}^2 &= {\dot{\boldsymbol{\Theta}}}^\dagger R^\dagger R \dot{\boldsymbol{\Theta}} \\ \end{aligned} \quad\quad\quad(11)

Explicitly, expanding the inner matrix product we can write

\begin{aligned}Q &\equiv R^\dagger R \\ &=\begin{bmatrix}1 & -j e^{-j\theta} \sin\theta \\ j e^{j\theta} \sin\theta & \sin^2 \theta\end{bmatrix} \\ {\dot{\hat{\mathbf{z}}}}^2 &= {\dot{\boldsymbol{\Theta}}}^\dagger Q \dot{\boldsymbol{\Theta}} \\ \end{aligned} \quad\quad\quad(13)

This is a slightly unholy mix of geometric and matrix algebra, but it works to compactly express the velocity dependence. Observe that this inner matrix Q = Q^\dagger, so it is Hermitian with this definition of conjugation.

Our Lagrangian for the one particle pendulum, measuring potential energy from the horizontal, is then

\begin{aligned}\mathcal{L} = {\dot{\boldsymbol{\Theta}}}^\dagger \frac{1}{{2}} m l^2 Q \dot{\boldsymbol{\Theta}} - m g l \cos\theta\end{aligned} \quad\quad\quad(17)

We also have a mechanism that should generalize fairly easily to the two or many pendulum cases too.

Before continuing, it should be noted that there were assumptions made in this energy expression derivation that are not reflected in the Lagrangian above. One of these was the unit bivector assumption for j, as well as a \mathbf{e}_3 containment assumption for the plane this represents (\mathbf{e}_3 \wedge j = 0). So for completeness we should probably add to the Lagrangian above some Lagrange multiplier enforced constraints

\begin{aligned}\lambda ( {j}^2 + 1 ) + \alpha \dot{c} ( \mathbf{e}_3 \wedge j )\end{aligned} \quad\quad\quad(18)

Here \alpha has got to be a trivector multiplier for the Lagrangian to be a scalar. Can we get away with omitting these constraints?

Two and multi particle case.

Having constructed a way that can express the velocity of a single spherical pendulum in a tidy way, we can move on to consider the multiple pendulum case as shown in figure (\ref{fig:sPolarMultiPendulum:pendulumDouble})

\caption{Double spherical pendulum.}

There are two bivectors depicted, j_1 and j_2 representing oriented planes passing through great circles from a local polar axis (in direction \mathbf{e}_3). Let the positions of the respective masses be z_1 and z_2, where each mass is connected by a rigid massless rod of length l_1 and l_2 respectively. The masses are rotated by angles \theta_1 and \theta_2 in the planes j_1 and j_2 from an initial direction of \mathbf{e}_3. We can express the position of the second mass as

\begin{aligned}\mathbf{z}_2 = \mathbf{z}_1 + \mathbf{e}_3 e^{j_2 \theta_2}\end{aligned} \quad\quad\quad(19)

We can use the same factorization as previously used for the single mass case and write for our collection of angular velocities

\begin{aligned}\boldsymbol{\Theta} &\equiv\begin{bmatrix}\theta_1 \\  j_1 \\ \theta_2 \\  j_2 \\ \end{bmatrix} \end{aligned} \quad\quad\quad(20)

Using this the total Kinetic energy is

\begin{aligned}K &= {\dot{\boldsymbol{\Theta}}}^\dagger \frac{1}{{2}} Q \dot{\boldsymbol{\Theta}} \\ R_1 &=\begin{bmatrix}l_1 j_1 e^{j_1\theta_1} & l_1 \sin\theta_1 &0 &0\end{bmatrix} \\ R_2 &=\begin{bmatrix} l_1 j_1 e^{j_1\theta_1} &  l_1 \sin\theta_1 & l_2 j_2 e^{j_2\theta_2} &  l_2 \sin\theta_2 \end{bmatrix} \\ Q &=m_1 {R_1}^\dagger R_1+m_2 {R_2}^\dagger R_2\end{aligned} \quad\quad\quad(21)

Notation has been switched slightly from the single mass case, and the m l^2 factor is now incorporated directly into Q for convenience.

An expansion of Q is essentially one of block matrix multiplication (where we already have to be careful with order of operations as we do for the geometric product elements themselves). We have something like

\begin{aligned}R_1 &= \begin{bmatrix}A_1 & 0\end{bmatrix} \\ {R_1}^\dagger &= \begin{bmatrix}{A_1}^\dagger \\ 0\end{bmatrix} \\ R_2 &= \begin{bmatrix}A_1 & A_2\end{bmatrix} \\ {R_2}^\dagger &= \begin{bmatrix}{A_1}^\dagger \\ {A_2}^\dagger \\ \end{bmatrix} \end{aligned} \quad\quad\quad(25)

We have for the products

\begin{aligned}{R_1}^\dagger R_1 &=\begin{bmatrix}{A_1}^\dagger A_1 & 0 \\ 0 & 0\end{bmatrix} \\ {R_2}^\dagger R_2 &=\begin{bmatrix}{A_1}^\dagger A_1 ^\dagger A_2 \\ {A_2}^\dagger A_1 ^\dagger A_2 \\ \end{bmatrix} \end{aligned} \quad\quad\quad(29)

So our quadratic form matrix is

\begin{aligned}Q =\begin{bmatrix}(m_1 + m_2) {A_1}^\dagger A_1 & m_2 {A_1}^\dagger A_2 \\ m_2 {A_2}^\dagger A_1 & m_2 {A_2}^\dagger A_2 \\ \end{bmatrix} \end{aligned} \quad\quad\quad(31)

In general for the multiple particle case this is

\begin{aligned}Q &={\begin{bmatrix}\left(\sum_{k=\max(r,c)}^N m_k \right){A_r}^\dagger A_c\end{bmatrix}}_{rc} \\ A_k &= l_k \begin{bmatrix}j_k e^{j_k \theta_k} & \sin\theta_k\end{bmatrix}\end{aligned} \quad\quad\quad(32)

Expanded explicitly this is

\begin{aligned}Q &={\begin{bmatrix}\left(\sum_{k=\max(r,c)}^N m_k \right) l_r l_c\begin{bmatrix}- j_r e^{-j_r \theta_r} j_c e^{j_c\theta_c} & - j_r e^{-j_r \theta_r} \sin\theta_c \\ j_c e^{j_c \theta_c} \sin\theta_r & \sin\theta_r \sin\theta_c\end{bmatrix}\end{bmatrix}}_{rc}\end{aligned} \quad\quad\quad(34)

Observe that the order of products in this expansion is specifically ordered, since the j_c and j_r bivectors do not necessarily commute.

The potential in the multiple particle case is also fairly straightforward to compute. Consider the two particle case to illustrate the pattern. Using the lowest point as the potential reference we have

\begin{aligned}\phi' = g \sum m_i h_i= m_1 r_1 (1 + \cos\theta_1) + m_2 \left( r_1(1 + \cos\theta_1) + r_2( 1 + \cos\theta_2) \right)\end{aligned} \quad\quad\quad(35)

Alternately, dropping all the constant terms (using the horizon as the potential reference) we have for the general case

\begin{aligned}\phi = g \sum_i \left( \sum_{k=i}^N m_k \right) r_i \cos\theta_i\end{aligned} \quad\quad\quad(36)

Lets collect all the bits and pieces now for the multiple pendulum Lagrangian now, repeating for coherency, and introducing a tiny bit more notation (mass sums and block angular velocity matrices) for convenience

\begin{aligned}\mathcal{L} &= K - \phi + \sum_k \lambda_k ({j_k}^2 + 1 ) \\ \boldsymbol{\Theta}_i &=\begin{bmatrix}\theta_i \\ j_i\end{bmatrix} \\ \boldsymbol{\Theta} &={\begin{bmatrix}\boldsymbol{\Theta}_r\end{bmatrix}}_{r} \\ \mu_i &=\sum_{k=i}^N m_k \\ Q &={\begin{bmatrix}\mu_{\max(r,c)}l_r l_c\begin{bmatrix}- j_r e^{-j_r \theta_r} j_c e^{j_c\theta_c} & - j_r e^{-j_r \theta_r} \sin\theta_c \\ j_c e^{j_c \theta_c} \sin\theta_r & \sin\theta_r \sin\theta_c\end{bmatrix}\end{bmatrix}}_{rc} \\ K &=\frac{1}{{2}} {\dot{\boldsymbol{\Theta}}}^\dagger Q \dot{\boldsymbol{\Theta}} \\ \phi &=g \sum_{i=1}^N \mu_i r_i \cos\theta_i\end{aligned} \quad\quad\quad(37)

Building up to the Multivector Euler-Lagrange equations.

Rather than diving right into any abstract math, lets consider a few specific examples of multivector Lagrangians to develop some comfort with non-scalar Lagrangian functions. The generalized “coordinates” in the Lagrangian for the spherical pendulum problem being considered include include bivectors, and we don’t know how to evaluate the Euler Lagrange equations for anything but scalar generalized coordinates.

The goal is to develop Euler-Lagrange equations that can handle a Lagrangian for these more general functions, but until we figure out how to do that, we can at least tackle the problem using variation of the action around a stationary solution.

A first example to build intuition.

To help understand what we have to do, lets consider the very simplest bivector parametrized Lagrangian, that of a spherical pendulum constrained (perhaps by a track or a surface) of moving only in a ring. This is shown pictorially in figure (\ref{fig:sPolarMultiPendulum:pendulumPolarCircular})

\caption{Circularly constrained spherical pendulum.}

The potential energy is fixed on this surface, so our Lagrangian is purely kinetic

\begin{aligned}\mathcal{L} = -\frac{1}{{2}} m l^2 \sin^2 \theta_0 {j'}^2\end{aligned} \quad\quad\quad(44)

We’d like to directly vary the action for the Lagrangian around a stationary solution

\begin{aligned}S = \int \mathcal{L} dt\end{aligned} \quad\quad\quad(45)

Introducing a bivector variation j = \bar{j} + \epsilon, and writing I = m l^2 \sin^2\theta_0 we have

\begin{aligned}\bar{S} + \delta S &= -\frac{1}{{2}} I \int (\bar{j}' + \epsilon')^2 dt \\ &= -\frac{1}{{2}} I \int (\bar{j}')^2 dt-\frac{1}{{2}} I \int \left( {\bar{j}}' \epsilon' + \epsilon' {\bar{j}}' \right) dt-\frac{1}{{2}} I \int (\epsilon')^2 dt\end{aligned}

The first term is just \bar{S}. Setting the variation \delta S = 0 (neglecting the quadratic \epsilon' term) and integrating by parts we have

\begin{aligned}0 &= \delta S \\ &= \int dt \left( \frac{d}{dt}\left( \frac{1}{{2}} I {\bar{j}}' \right) \epsilon + \epsilon \frac{d}{dt}\left( \frac{1}{{2}} I {\bar{j}}' \right) \right) \\ &= \int \left( \frac{d}{dt} I {\bar{j}}' \right) \dot{c} \epsilon dt \end{aligned}

With \epsilon arbitrary it appears that the solutions of the variation problem are given by

\begin{aligned}\frac{d}{dt}\left( I j' \right) = 0\end{aligned} \quad\quad\quad(46)

Has anything been lost by requiring that this is zero identically when all we had originally was the dot product of this with the variation bivector was zero? If this zero describes the solution set, then we should be able to integrate this, yielding a constant. However, following this to its logical conclusion leads to inconsistency. Integrating 46, producing a bivector constant \kappa we have

\begin{aligned}I j' = \kappa\end{aligned} \quad\quad\quad(47)

The original constraint on j was the bivector spanning the plane from the polar axis to the azimuthal unit vector in the x,y plane at angle a \phi. The \phi dependence wasn’t specified, and left encoded in the bivector representation. Without words that was

\begin{aligned}j = \mathbf{e}_3 \mathbf{e}_1 e^{\mathbf{e}_{12} \phi}\end{aligned} \quad\quad\quad(48)

Inserting back into 47 this gives

\begin{aligned}\frac{d}{dt} e^{\mathbf{e}_{12} \phi} = \frac{\mathbf{e}_1 \mathbf{e}_3 \kappa}{I}\end{aligned} \quad\quad\quad(49)

One more integration is trivially possible yielding

\begin{aligned}e^{\mathbf{e}_{12} \phi(t)} = e^{\mathbf{e}_{12} \phi_0} + \frac{\mathbf{e}_1 \mathbf{e}_3 \kappa t}{I} \end{aligned} \quad\quad\quad(50)

There are two possibilities for the grades of \kappa, one is \kappa \propto \mathbf{e}_3 \mathbf{e}_1, so that the time dependence is a scalar, and the other is \kappa \propto \mathbf{e}_3 \mathbf{e}_2 so that we have an x,y plane bivector component. Allowing for both, and separating into real and imaginary parts we have

\begin{aligned}\cos\phi - \cos\phi_0 &= \frac{1}{{I}} \kappa_r t \\ \sin\phi - \sin\phi_0 &= \frac{1}{{I}} \kappa_i t\end{aligned} \quad\quad\quad(51)

This isn’t anything close to the solution that is expected if we were to start with the scalar Lagrangian for the same problem. That Lagrangian is

\begin{aligned}\mathcal{L} = \frac{1}{{2}} I \dot{\phi}^2\end{aligned} \quad\quad\quad(53)

and the Euler Lagrange equations give us, for a scalar constant \mu

\begin{aligned}I \dot{\phi} = \mu\end{aligned} \quad\quad\quad(54)

So the solution for \phi is just

\begin{aligned}\phi - \phi_0 = \frac{\mu t}{I}\end{aligned} \quad\quad\quad(55)

In the absence of friction this makes sense. Our angle increases monotonically, and we have circular motion with constant angular velocity. Compare this to the messier 51 derived from the bivector “solution” to the variational problem. There is definitely something wrong with the variational approach (or conclusion) when the variable is a bivector.

Does it help to include the constraints explicitly? The bivector parametrized Lagrangian with the unit bivector multiplier constraint for this system is

\begin{aligned}\mathcal{L} = -\frac{1}{{2}} I (j')^2 + \lambda (j^2 + 1)\end{aligned} \quad\quad\quad(56)

Doing the variation we get a set of two equations

\begin{aligned}- I j'' &= 2 \lambda j \\ j^2 &= -1\end{aligned} \quad\quad\quad(57)

Once again re-inserting j = \mathbf{e}_{31} e^{\mathbf{e}_{12} \phi} one gets

\begin{aligned}e^{i\phi} = A \sin(\sqrt{I/2\lambda} t) + B \cos(\sqrt{I/2\lambda} t)\end{aligned} \quad\quad\quad(59)

Setting \lambda = 1, A = i, B = 1, we have

\begin{aligned}\phi \propto t\end{aligned} \quad\quad\quad(60)

This is now consistent with the scalar Lagrangian treatment. In both cases we have the angle linearly proportional to time, and have a single parameter to adjust the angular velocity (used \mu in the scalar treatment above and have \lambda this time (although it was set to one here for convenience). The conclusion has to be that the requirement to include the multipliers for the constraints is absolutely necessary to get the right physics out of this bivector parametrized Lagrangian. The good news is that we can get the right physics out of a non-scalar treatment. What is a bit disturbing is that it was fairly difficult in the end to get from the results of the variation to a solution, and that this will not likely get any easier with a more complex system.

A second example.

The scalar expansion 82 of the kinetic term in our spherical polar Lagrangian shows a couple of other specific multivector functions we can consider the variation of.

We have considered an specific example of a Lagrangian function \mathcal{L} = f(j' \dot{c} j') without (yet) utilizing or deriving a multivector form of the Euler-Lagrange equations. Let’s consider a few more specific simple examples motivated by the expansion of the kinetic energy in 82. Lets start with

\begin{aligned}\mathcal{L} = \theta' a \dot{c} j'\end{aligned} \quad\quad\quad(61)

where a is a bivector constant, \theta a scalar, and j a bivector variable for the system. Expanding this around stationary points j = \bar{j} + \epsilon, and \theta = \bar{\theta} + \phi we have to first order

\begin{aligned}\int \delta \mathcal{L} dt &\approx \int \left( \phi' a \dot{c} \bar{j}' + \bar{\theta}' a \dot{c} \epsilon' \right) dt \\ &= \int \left( \phi \frac{d}{dt}( - a \dot{c} \bar{j}' ) + \frac{d}{dt}( - \bar{\theta}' a ) \dot{c} \epsilon \right) dt \\ \end{aligned}

In this simple system where both scalar variable \theta and bivector variable j are cyclic “coordinates”, we have solutions to the variational problem given by the pair of equations

\begin{aligned}- a \dot{c} j' &= \text{constant} \\ - \theta' a &= \text{constant}\end{aligned} \quad\quad\quad(62)

As to what, if anything, this particular Lagrangian (a fragment picked out of a real Kinetic energy function) represents physically that doesn’t matter so much, since the point of this example was to build up to treating the more general case where we are representing something physical.

A third example.

If we throw a small additional wrench into the problem above and allow a to be one of the variables our system is dependent on.

\begin{aligned}\mathcal{L}(\theta, \theta', a, a', j, j') = \theta' a \dot{c} j'\end{aligned} \quad\quad\quad(64)

It is less obvious how to do a first order Taylor expansion of this Lagrangian required for the variation around the stationary solution. If all the coordinates in the Lagrangian were scalars as in

\begin{aligned}\mathcal{L}(\theta, \theta', a, a', j, j') = \theta' a j'\end{aligned} \quad\quad\quad(65)

(dropping dot products now that these are all scalar variables), then the variation requires nothing abnormal. Suppose our stationary point has coordinates \bar{\theta}, \bar{a}, \bar{j}, with variations \alpha, \beta, \gamma that vanish at the extremes of the integral as usual.

With scalar variables our path is clear, and we just form

\begin{aligned}\delta S &= \int \delta \mathcal{L} \\ &= \int \left( \alpha \frac{\partial {\mathcal{L}}}{\partial {\theta}}+\alpha' \frac{\partial {\mathcal{L}}}{\partial {\theta'}}+\beta \frac{\partial {\mathcal{L}}}{\partial {a}}+\beta' \frac{\partial {\mathcal{L}}}{\partial {a'}}+\gamma \frac{\partial {\mathcal{L}}}{\partial {j}}+\gamma' \frac{\partial {\mathcal{L}}}{\partial {j'}}\right) \\ \end{aligned}

Here is it implied that all the partials are evaluated at the stationary points \bar{\theta}, \bar{a}, \bar{j}. Doing the integration by parts we have

\begin{aligned}\delta S&=\int \left( \alpha \left( \frac{\partial {\mathcal{L}}}{\partial {\theta}} - \frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {\theta'}} \right) +\beta \left( \frac{\partial {\mathcal{L}}}{\partial {a}} - \frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {a'}} \right) +\gamma \left( \frac{\partial {\mathcal{L}}}{\partial {j}} - \frac{d}{dt} \frac{\partial {\mathcal{L}}}{\partial {j'}} \right) \right)\end{aligned}

Setting \delta S = 0 this produces the Euler-Lagrange equations for the system. For our specific Lagrangian this procedure gives

\begin{aligned}\int \delta \mathcal{L} &=\int \alpha' \bar{a} \bar{j}' + \bar{\theta}' \beta \bar{j}' + \bar{\theta}' \bar{a} \gamma' \\ &=\int \alpha \frac{d}{dt}\left( - \bar{a} \bar{j}' \right) + \bar{\theta}' \beta \bar{j}' + \frac{d}{dt}\left( -\bar{\theta}' \bar{a}\right) \gamma \\ \end{aligned}

Each time we do the first order expansion we are varying just one of the coordinates. It seems likely that the correct answer for the multivariable case will be

\begin{aligned}\int \delta \mathcal{L} &=\int \alpha \frac{d}{dt}\left( - \bar{a} \dot{c} \bar{j}' \right) + \beta \dot{c} (\bar{\theta}' \bar{j}') + \frac{d}{dt}\left( -\bar{\theta}' \bar{a}\right) \dot{c} \gamma \\ \end{aligned}

Thus the variational problem produces solutions (the coupled equations we still have to solve) of

\begin{aligned}- a \dot{c} j' &= \text{constant} \\ \theta' j' &= 0 \\ -\theta' a &= \text{constant}\end{aligned} \quad\quad\quad(66)

Multivector Euler-Lagrange equations.


Having considered a few specific Lagrangians dependent on multivector generalized “coordinates”, some basic comfort that it is at least possible has been gained. Let’s now move on to the general case. It is sufficient to consider Lagrangian functions that only depend on blades, since we can write any more general multivector as a sum of such blades


\begin{aligned}\mathcal{L} = \mathcal{L}(X_k, X_k')\end{aligned} \quad\quad\quad(69)

where X_k \in \bigwedge^{m_k} is a blade of grade m_k. We can do a regular old chain rule expansion of this Lagrangian if coordinates are (temporarily) introduced for each of the blades X_k. For example, if X is a bivector in \mathbb{R}^{3}, we can write

\begin{aligned}X = \mathbf{e}_{12} a^{12} +\mathbf{e}_{13} a^{13} +\mathbf{e}_{23} a^{23} \end{aligned}

and then we can do a first order Taylor series expansion of \mathcal{L}(\bar{X} + \mathbf{\epsilon}) in terms of these coordinates. With \mathbf{\epsilon} = \mathbf{e}_{12}\epsilon^{12} + \mathbf{e}_{13}\epsilon^{13} + \mathbf{e}_{23}\epsilon^{23}, for this specific single variable Lagrangian we have

\begin{aligned}\delta \mathcal{L} &= \mathcal{L}(\bar{X} + \mathbf{\epsilon}) - \mathcal{L}(\bar{X}) \\ & \approx \epsilon^{12} {\left.{{\frac{\partial {\mathcal{L}}}{\partial {a^{12}}}}}\right\vert}_{{\bar{X}}}+ \epsilon^{23} {\left.{{\frac{\partial {\mathcal{L}}}{\partial {a^{23}}}}}\right\vert}_{{\bar{X}}}+ \epsilon^{13} {\left.{{\frac{\partial {\mathcal{L}}}{\partial {a^{13}}}}}\right\vert}_{{\bar{X}}}\end{aligned}

If we write the coordinate expansion of our blades X_k as

\begin{aligned}X_k = \sum_{b_k} {x_k}^{b_k} \sigma_{b_k}\end{aligned}

where b_k is some set of indexes like \{12, 23, 13\} from the \mathbb{R}^{3} bivector example, then our chain rule expansion of the Lagrangian to first order about a stationary point becomes

\begin{aligned}\delta \mathcal{L} &=\sum_{b_k} {\epsilon_k}^{b_k} {\left.{{ \frac{\partial {\mathcal{L}}}{\partial {{x_k}^{b_k}}} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}}+\sum_{b_k} ({\epsilon_k}^{b_k})' {\left.{{ \frac{\partial {\mathcal{L}}}{\partial {({x_k}^{b_k})'}} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}}\end{aligned} \quad\quad\quad(70)

Trying to write this beastie down with abstract indexing variables is a bit ugly. There’s something to be said for not trying to be general since writing this for the single variable bivector example was much clearer. However, having written down the ugly beastie, it can now be cleaned up nicely by introducing position and velocity gradient operators for each of the grades

\begin{aligned}\nabla_{X_k} &\equiv \sigma^{b_k} \frac{\partial }{\partial {{x_k}^{b_k}}} \\ \nabla_{X_k'} &\equiv \sigma^{b_k} \frac{\partial }{\partial {({x_k}^{b_k})'}}\end{aligned} \quad\quad\quad(71)

Utilizing this (assumed) orthonormal basis pair \sigma^{b_k} \dot{c} \sigma_{b_j} = {\delta^k}_j we have

\begin{aligned}\delta S &= \int \delta \mathcal{L} \\ &=\int \sum_k\mathbf{\epsilon}_k \dot{c} {\left.{{ \nabla_{X_k} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}}+{\mathbf{\epsilon}_k}' \dot{c} {\left.{{ \nabla_{X_k'} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}} \\ &=\int \sum_k\mathbf{\epsilon}_k \dot{c} \left( {\left.{{ \nabla_{X_k} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}} - \frac{d}{dt} {\left.{{ \nabla_{X_k'} \mathcal{L} }}\right\vert}_{{\bar{X}_k, \bar{X}_k'}} \right) \\ \end{aligned}

Setting \delta S = 0 we have a set of equations for each of the blade variables, and the Euler-Lagrange equations take the form

\begin{aligned}\nabla_{X_k} \mathcal{L} = \frac{d}{dt} \nabla_{X_k'} \mathcal{L} \end{aligned} \quad\quad\quad(73)

Work some examples.

For the pendulum problem we are really only interested in applying 73 for scalar and bivector variables. Let’s revisit some of the example Lagrangians, functions of bivectors, already considered and see how we can evaluate the position and velocity gradients. This generalized Euler-Lagrange formulation doesn’t do much good if we can’t work with it.

Let’s take the bivector gradient of a couple bivector functions (all the ones considered previously leading up to the multivector Euler-Lagrange equations).

\begin{aligned}f(B) &= A \dot{c} B \\ g(B) &= B^2 \end{aligned}

It is actually sufficient (in a less than 4D space) to consider only the first,

\begin{aligned}\nabla_B f &= \nabla_B \left\langle{{ A B }}\right\rangle \\ &= \sum_{a<b} \sigma^{ab} \frac{\partial }{\partial {b^{ab}}} \left\langle{{ A b^{a' b'} \sigma_{a' b'} }}\right\rangle \\ &= \sum_{a<b} \sigma^{ab} \left\langle{{ A \sigma_{ab} }}\right\rangle \\ &= \sum_{a<b} \sigma^{ab} A \dot{c} \sigma_{ab} \\ &= A\end{aligned}

For g we then have

\begin{aligned}\nabla_B g &= \nabla_B B B \\ &= \nabla_B \left\langle{{ B B}}\right\rangle \\ &= 2 B\end{aligned}

This is now enough to evaluate our bivector parameterized Lagrangian from the first example, 56, reproducing the result obtained by direct variation (as in Feynman’s lectures)

\begin{aligned}\frac{d}{dt} \nabla_{j'} \mathcal{L} &= \nabla_{j} \mathcal{L} \\ - I j'' &= \\         &= 2 \lambda j\end{aligned}

By inspection we can see that this works for the remaining two motivational examples too.

Evaluating the pendulum Euler-Lagrange equations (scalar, bivector parameterized KE).

We now have the tools to evaluate the Euler-Lagrange equations for the pendulum problem 37. Since all the d{\theta}_a/dt and dj_a/dt dependence is in the kinetic energy we can start with that. For

\begin{aligned}K = \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(74)

We want each of \frac{\partial {K}}{\partial {\dot{\theta}_a}} and \nabla_{j_a'} K. Of these the \dot{\theta} derivative is easier, so lets start with that

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}}&=\frac{1}{{2}} \left( \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}}^\dagger \right) Q \dot{\boldsymbol{\Theta}}+\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q \left( \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}} \right)  \\ \end{aligned}

Each of these are scalars and thus equal their Hermitian conjuagate. This leaves us with just one term doubled

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}}&=\dot{\boldsymbol{\Theta}}^\dagger Q \left( \frac{\partial {\dot{\boldsymbol{\Theta}}}}{\partial {\dot{\theta}_a}} \right) \\ \end{aligned}

A final evaluation of the derivatives, in block matrix form over rows r, gives us

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}}=\dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_{r}\end{aligned} \quad\quad\quad(75)

For the bivector velocity gradients, we can do the same,

\begin{aligned}\nabla_{j_a'} K&=\frac{1}{{2}} \left( \stackrel{ \rightarrow }\nabla_{j_a'} \dot{\boldsymbol{\Theta}}^\dagger \right) Q \dot{\boldsymbol{\Theta}}+\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger Q \left( \dot{\boldsymbol{\Theta}} \stackrel{ \leftarrow }\nabla_{j_a'} \right)  \\ \end{aligned}

Only the j_a' parts of \dot{\boldsymbol{\Theta}} contribute to the velocity gradients, so we need to know how to evaluate a bivector gradient (as opposed to the square which was done earlier). Expanding in coordinates

\begin{aligned}\nabla_j j &= \frac{1}{{2}} \sum_{a b} \sigma^{ab} \frac{\partial }{\partial {j^{ab}}} \frac{1}{{2}} \sum_{a'b'} \sigma_{a' b'} j^{a' b'} \\ &= \frac{1}{{2}} \sum_{a b} \sigma^{ab} \sigma_{a b} \\ &= 3\end{aligned}

(three here because we are in \mathbb{R}^{3}, and our bivectors have three independent coordinates).

\begin{aligned}\nabla_{j_a'} K&=\frac{3}{2} {\begin{bmatrix}\delta_{ac}\begin{bmatrix}0 & -1 \end{bmatrix}\end{bmatrix}}_{c}Q \dot{\boldsymbol{\Theta}}+\frac{3}{2} \dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}0 \\ 1 \end{bmatrix}\end{bmatrix}}_{r}\end{aligned}

These terms are both one-by-one bivector matrixes, and negate with Hermitian conjugation, so we can again double up and eliminate one of the two terms, producing

\begin{aligned}\nabla_{j_a'} K&= 3 \dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}0 \\ 1 \end{bmatrix}\end{bmatrix}}_{r}\end{aligned} \quad\quad\quad(76)

Completing the Euler-Lagrange equation evaluation we have for the \theta_a coordinates

\begin{aligned}\frac{d}{dt} \left(\dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}1 \\ 0\end{bmatrix}\end{bmatrix}}_{r}\right)= \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\dagger \frac{\partial {Q}}{\partial {\theta_a}} \dot{\boldsymbol{\Theta}} - \frac{\partial {\phi}}{\partial {\theta_a}}\end{aligned} \quad\quad\quad(77)

plus one equation for each of the bivectors j_a

\begin{aligned} 3 \frac{d}{dt} \left(\dot{\boldsymbol{\Theta}}^\dagger Q {\begin{bmatrix}\delta_{ar}\begin{bmatrix}0 \\ 1 \end{bmatrix}\end{bmatrix}}_{r}\right)=\frac{1}{{2}} \sum_{e < f} \mathbf{e}_f \mathbf{e}_e \dot{\boldsymbol{\Theta}}^\dagger \frac{\partial {Q}}{\partial {{j_a}^{e f}}} \dot{\boldsymbol{\Theta}} + 2 \lambda_a j_a\end{aligned} \quad\quad\quad(78)

Because the bivector \mathbf{e}_f \mathbf{e}_e does not (nessessarily) commute with bivectors j_a' that are part of \dot{\boldsymbol{\Theta}} there doesn’t look like there’s much hope of assembling the left hand j_a gradients into a nice non-coordinate form. Additionally, looking back at this approach, and the troubles with producing meaningful equations of motion even in the constrained single pendulum case, it appears that a bivector parameterization of the kinetic energy is genally not a good approach. This is an unfortunate conclusion after going through all the effort to develop intuition that led to the multivector Euler-Lagrange formulation for this problem.

Oh well.

The Hermitian formulation used here should still provide a nice compact way of expressing the kinetic energy, even if we work with plain old scalar spherical polar angles \theta, and \phi. Will try this another day, since adding that to these notes will only make them that much more intractable.

Appendix calculation. A verification that the Kinetic matrix product is a real scalar.

In the kinetic term of the rather scary looking Lagrangian of 37 we have what should be a real scalar, but it is not obvious that this is the case. As a validation that nothing very bad went wrong, it seems worthwhile to do a check that this is in fact the case, expanding this out explicitly in gory detail.

One way to try this expansion is utilizing a block matrix summing over the diagonal and paired skew terms separately. That is

\begin{aligned}K &=\frac{1}{{2}}\sum_{k=1}^N\mu_k{l_k}^2{\dot{\boldsymbol{\Theta}}_k}^\dagger\begin{bmatrix}1 & - j_k e^{-j_k \theta_k} \sin\theta_k \\ j_k e^{j_k \theta_k} \sin\theta_k & \sin^2\theta_k \end{bmatrix}\dot{\boldsymbol{\Theta}}_k \\ &+\frac{1}{{2}}\sum_{a<b}\mu_bl_a l_b\left({\dot{\boldsymbol{\Theta}}_a}^\dagger\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} & - j_a e^{-j_a \theta_a} \sin\theta_b \\ j_b e^{j_b \theta_b} \sin\theta_a & \sin\theta_a \sin\theta_b\end{bmatrix}\dot{\boldsymbol{\Theta}}_b +{\dot{\boldsymbol{\Theta}}_b}^\dagger\begin{bmatrix}- j_b e^{-j_b \theta_b} j_a e^{j_a\theta_a} & - j_b e^{-j_b \theta_b} \sin\theta_a \\ j_a e^{j_a \theta_a} \sin\theta_b & \sin\theta_b \sin\theta_a\end{bmatrix}\dot{\boldsymbol{\Theta}}_a\right)\end{aligned}

Examining the diagonal matrix products and expanding one of these (dropping the k suffix for tidiness), we have

\begin{aligned}{\dot{\boldsymbol{\Theta}}}^\dagger\begin{bmatrix}1 & - j e^{-j \theta} \sin\theta \\ j e^{j \theta} \sin\theta & \sin^2\theta \end{bmatrix}\dot{\boldsymbol{\Theta}} &=\dot{\theta}^2 -\sin^2\theta \left(\frac{dj}{dt}\right)^2 - \dot{\theta} \sin\theta\cos\theta \left( j \frac{dj}{dt} + \frac{dj}{dt} j \right)\end{aligned} \quad\quad\quad(79)

Since we are working in 3D this symmetric sum is twice the dot product of the bivector j with its derivative, which means that it is a scalar. We expect this to be zero though, and can observe that this is the case since j was by definition a unit bivector

\begin{aligned}j \frac{dj}{dt} + \frac{dj}{dt} j = \frac{d j^2}{dt} = \frac{d (-1)}{dt} = 0\end{aligned} \quad\quad\quad(80)

(thus j and its derivative represent orthogonal oriented planes rather like \hat{\mathbf{r}} and its derivative are orthogonal on a circle or sphere). The implication is that the diagonal subset of the kinetic energy expansion contains just

\begin{aligned}\frac{1}{{2}}\sum_{k=1}^N\mu_k{l_k}^2\left(\left(\frac{d\theta_k}{dt}\right)^2 -\sin^2\theta_k \left(\frac{d j_k}{dt}\right)^2  \right)\end{aligned} \quad\quad\quad(81)

If we are going to have any complex interaction terms then they will have to come from the off diagonal products. Expanding the first of these

\begin{aligned}{\dot{\boldsymbol{\Theta}}_a}^\dagger&\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} & - j_a e^{-j_a \theta_a} \sin\theta_b \\ j_b e^{j_b \theta_b} \sin\theta_a & \sin\theta_a \sin\theta_b\end{bmatrix}\dot{\boldsymbol{\Theta}}_b \\ &=\begin{bmatrix}\theta_a' & -j_a'\end{bmatrix}\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} & - j_a e^{-j_a \theta_a} \sin\theta_b \\ j_b e^{j_b \theta_b} \sin\theta_a & \sin\theta_a \sin\theta_b\end{bmatrix}\begin{bmatrix}\theta_b' \\ j_b'\end{bmatrix} \\ &=\begin{bmatrix}\theta_a' & -j_a'\end{bmatrix}\begin{bmatrix}- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} \theta_b' - j_a e^{-j_a \theta_a} \sin\theta_b j_b' \\ j_b e^{j_b \theta_b} \sin\theta_a \theta_b' + \sin\theta_a \sin\theta_b j_b'\end{bmatrix} \\ &=- j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} \theta_a' \theta_b' - j_a e^{-j_a \theta_a} \sin\theta_b \theta_a' j_b' -j_a' j_b e^{j_b \theta_b} \sin\theta_a \theta_b' - \sin\theta_a \sin\theta_b j_a' j_b'\end{aligned}

Adding to this the a \leftrightarrow b exchanged product and rearranging yields

\begin{aligned}- \theta_a' \theta_b' (j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} + j_b e^{-j_b \theta_b} j_a e^{j_a\theta_a} ) - \sin\theta_a \sin\theta_b (j_a' j_b' + j_b' j_a') \\ - \sin\theta_b \theta_a' ( j_a e^{-j_a \theta_a} j_b' + j_b' j_a e^{j_a \theta_a} )- \sin\theta_a \theta_b' ( j_b e^{-j_b \theta_b} j_a' + j_a' j_b e^{j_b \theta_b} )\end{aligned}

Each of these multivector sums within the brackets is of the form A + \tilde{A}, a multivector plus its reverse. There can therefore be no bivector or trivector terms since they negate on reversal, and the resulting sum can have only scalar and vector grades. Of these the second term, j_a' j_b' + j_b' j_a' = 2 j_a' \dot{c} j_b' so it is unarguably a scalar as expected, but additional arguments are required to show this of the other three terms. Of these remaining three, the last two have the same form. Examining the first of these two

\begin{aligned}j_b e^{-j_b \theta_b} j_a' + j_a' j_b e^{j_b \theta_b} &=(j_b \cos\theta_b + \sin\theta_b) j_a' + j_a' (j_b \cos\theta_b - \sin\theta_b) \\ &=\cos\theta_b (j_b j_a' + j_a' j_b ) \\ &=2 \cos\theta_b (j_b \dot{c} j_a')\end{aligned}

The first term actually expands in a similarly straightforward way. The vector terms all cancel, and one is left with just

\begin{aligned}j_a e^{-j_a \theta_a} j_b e^{j_b\theta_b} + j_b e^{-j_b \theta_b} j_a e^{j_a\theta_a} = 2 \cos\theta_a\cos\theta_b j_a \dot{c} j_b - 2 \sin\theta_a \sin\theta_b \end{aligned}

Writing S_{\theta_k} = \sin\theta_k and C_{\theta_k} = \cos\theta_k (for compactness to fit things all in since the expanded result is messy), all of this KE terms can be assembled into the following explicit scalar expansion

\begin{aligned}\begin{array}{l l}K &=\frac{1}{{2}}\sum_{k=1}^N\mu_k{l_k}^2\left( (\theta_k')^2 - (S_{\theta_k} j_k')^2 \right) \\ &-\sum_{a<b}\mu_bl_a l_b\left(\theta_a'  \theta_b'  (C_{\theta_a}C_{\theta_b} (j_a \dot{c} j_b) - S_{\theta_a} S_{\theta_b} )+S_{\theta_a} S_{\theta_b} (j_a'  \dot{c} j_b' )+S_{\theta_b} C_{\theta_b} \theta_a'  (j_b \dot{c} j_a' )+S_{\theta_a} C_{\theta_a} \theta_b'  (j_a \dot{c} j_b' )\right)\end{array}\end{aligned} \quad\quad\quad(82)

Noting that all the bivector, bivector dot products are scalars really completes the desired verification. We can however, be more explicit using j_a = \mathbf{e}_{31} e^{\mathbf{e}_{12}\phi_a}, which gives after a bit of manipulation

\begin{aligned}j_a \dot{c} j_b &= - \cos(\phi_a - \phi_b) \\ (j_a')^2 &= -(\phi_a')^2 \\ j_a \dot{c} j_b' &= \phi_b' \sin(\phi_a - \phi_b) \\ j_a' \dot{c} j_b' &= -\phi_a' \phi_b' \cos(\phi_a - \phi_b)\end{aligned} \quad\quad\quad(83)

These can then be inserted back into 82 in a straightforward fashion, but it is not any more illuminating to do so.

Posted in Math and Physics Learning. | Tagged: , , , , , , , , | Leave a Comment »

spherical polar line element (continued).

Posted by peeterjoot on October 25, 2009

[Click here for a PDF of this sequence of posts with nicer formatting]


Line element.

The line element for the particle moving on a spherical surface can be calculated by calculating the derivative of the spherical polar unit vector \hat{\mathbf{r}}

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \frac{d\phi}{dt} +\frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \frac{d\theta}{dt}\end{aligned} \quad\quad\quad(17)

than taking the magnitude of this vector. We can start either in coordinate form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta \cos\phi + \mathbf{e}_2 \sin\theta \sin\phi\end{aligned} \quad\quad\quad(18)

or, instead do it the fun way, first grouping this into a complex exponential form. This factorization was done above, but starting over allows this to be done a bit more effectively for this particular problem. As above, writing i = \mathbf{e}_1 \mathbf{e}_2, the first factorization is

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \end{aligned} \quad\quad\quad(19)

The unit vector \boldsymbol{\rho} = \mathbf{e}_1 e^{i\phi} lies in the x,y plane perpendicular to \mathbf{e}_3, so we can form the unit bivector \mathbf{e}_3\boldsymbol{\rho} and further factor the unit vector terms into a \cos + i \sin form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \boldsymbol{\rho} \sin\theta) \\ \end{aligned}

This allows the spherical polar unit vector to be expressed in complex exponential form (really a vector-quaternion product)

\begin{aligned}\hat{\mathbf{r}} = \mathbf{e}_3 e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} = e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \mathbf{e}_3\end{aligned} \quad\quad\quad(20)

Now, calculating the unit vector velocity, we get

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} &= \mathbf{e}_3 \mathbf{e}_3 \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \dot{\theta} + \mathbf{e}_1 \mathbf{e}_1 \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \\ &= \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) \\ &= \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho}\end{aligned}

The last two lines above factor the \boldsymbol{\rho} vector and the e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} quaternion to the left and to the right in preparation for squaring this to calculate the magnitude.

\begin{aligned}\left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2&=\left\langle{{ \left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2 }}\right\rangle \\ &=\left\langle{{ \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) }}\right\rangle \\ &=\dot{\theta}^2 + \sin^2\theta \dot{\phi}^2+ \sin\theta \dot{\phi} \dot{\theta}\left\langle{{ \mathbf{e}_2 e^{i\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \rho \theta}+e^{-\mathbf{e}_3 \rho \theta} \boldsymbol{\rho} e^{-i\phi} \mathbf{e}_2}}\right\rangle \\ \end{aligned}

This last term (\in \text{span} \{\rho\mathbf{e}_1, \rho\mathbf{e}_2, \mathbf{e}_1\mathbf{e}_3, \mathbf{e}_2\mathbf{e}_3\}) has only grade two components, so the scalar part is zero. We are left with the line element

\begin{aligned}\left(\frac{d (r\hat{\mathbf{r}})}{dt}\right)^2 = r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(21)

In retrospect, at least once one sees the answer, it seems obvious. Keeping \theta constant the length increment moving in the plane is ds = \sin\theta d\phi, and keeping \phi constant, we have ds = d\theta. Since these are perpendicular directions we can add the lengths using the Pythagorean theorem.

Line element using an angle and unit bivector parameterization.

Parameterizing using scalar angles is not the only approach that we can take to calculate the line element on the unit sphere. Proceding directly with a alternate polar representation, utilizing a unit bivector j, and scalar angle \theta is

\begin{aligned}\mathbf{x} = r \mathbf{e}_3 e^{j\theta}\end{aligned} \quad\quad\quad(22)

For this product to be a vector j must contain \mathbf{e}_3 as a factor (j = \mathbf{e}_3 \wedge m for some vector m.) Setting r = 1 for now, the deriviate of \mathbf{x} is

\begin{aligned}\dot{\mathbf{x}} &= \mathbf{e}_3 \frac{d}{dt} \left( \cos\theta + j \sin\theta \right) \\ &= \mathbf{e}_3 \dot{\theta} \left( -\sin\theta + j \cos\theta \right) + \mathbf{e}_3 \frac{d j}{dt} \sin\theta  \\ &= \mathbf{e}_3 \dot{\theta} j \left( j \sin\theta + \cos\theta \right) + \mathbf{e}_3 \frac{d j}{dt} \sin\theta  \\ \end{aligned}

This is

\begin{aligned}\dot{\mathbf{x}} = \mathbf{e}_3 \left( \frac{d\theta}{dt} j e^{j\theta} + \frac{d j}{dt} \sin\theta \right)\end{aligned} \quad\quad\quad(23)

Alternately, we can take derivatives of \mathbf{x} = r e^{-j\theta} \mathbf{e}_3, for


\begin{aligned}\dot{\mathbf{x}} = -\left( \frac{d\theta}{dt} j e^{-j\theta} + \frac{d j}{dt} \sin\theta \right) \mathbf{e}_3\end{aligned} \quad\quad\quad(24)

Together with 23, the line element for position change on the unit sphere is then

\begin{aligned}\dot{\mathbf{x}}^2 &= \left\langle{{-\left( \frac{d\theta}{dt} j e^{-j\theta} + \frac{d j}{dt} \sin\theta \right) \mathbf{e}_3 \mathbf{e}_3 \left( \frac{d\theta}{dt} j e^{j\theta} + \frac{d j}{dt} \sin\theta \right) }}\right\rangle \\ &= \left\langle{{-\left( \frac{d\theta}{dt} j e^{-j\theta} + \frac{d j}{dt} \sin\theta \right) \left( \frac{d\theta}{dt} j e^{j\theta} + \frac{d j}{dt} \sin\theta \right) }}\right\rangle \\ &=\left(\frac{d\theta}{dt}\right)^2 - \left(\frac{d j}{dt}\right)^2 \sin^2\theta - \frac{d\theta}{dt} \sin\theta \left\langle{{ \frac{dj}{dt} j e^{j\theta} + j e^{-j\theta} \frac{dj}{dt} }}\right\rangle \\ \end{aligned}

Starting with cyclic reordering of the last term, we get zero

\begin{aligned}\left\langle{{ \frac{dj}{dt} j e^{j\theta} + j e^{-j\theta} \frac{dj}{dt} }}\right\rangle &=\left\langle{{ \frac{dj}{dt} j \left( e^{j\theta} + e^{-j\theta} \right) }}\right\rangle  \\ &=\left\langle{{ \frac{dj}{dt} j 2 j \sin\theta }}\right\rangle  \\ &=- 2 \sin\theta \frac{d}{dt} \underbrace{\left\langle{{ j }}\right\rangle}_{=0}  \\ \end{aligned}

The line element (for constant r) is therefore

\begin{aligned}\dot{\mathbf{x}}^2 =r^2 \left( \dot{\theta}^2 - \left(\frac{dj}{dt}\right)^2 \sin^2 \theta \right)\end{aligned} \quad\quad\quad(25)

This is essentially the same result as we got starting with an explicit r, \theta, \phi. Repeating for comparision that was

\begin{aligned}\dot{\mathbf{x}}^2 = r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(26)

The bivector that we have used this time encodes the orientation of the plane of rotation from the polar axis down to the position on the sphere corresponds to the angle \phi in the
scalar parameterization. The negation in sign is expected due to the negative bivector square.

Also comparing to previous results it is notable that we can explicitly express this bivector in terms of the scalar angle if desired as

\begin{aligned}\boldsymbol{\rho} &= \mathbf{e}_1 e^{\mathbf{e}_1 \mathbf{e}_2 \phi} = \mathbf{e}_1 \cos\phi + \mathbf{e}_2 \sin\phi \\ j &= \mathbf{e}_3 \wedge \boldsymbol{\rho} = \mathbf{e}_3 \boldsymbol{\rho}\end{aligned} \quad\quad\quad(27)

The inverse mapping, expressing the scalar angle using the bivector representation is also possible, but not unique. The principle angle for that inverse mapping is

\begin{aligned}\phi &= -\mathbf{e}_1 \mathbf{e}_2 \ln(\mathbf{e}_1 \mathbf{e}_3 j) \end{aligned} \quad\quad\quad(29)

Allowing the magnitude to vary.

Writing a vector in polar form

\begin{aligned}\mathbf{x} = r \hat{\mathbf{r}}\end{aligned} \quad\quad\quad(30)

and also allowing r to vary, we have

\begin{aligned}\left(\frac{d\mathbf{x}}{dt}\right)^2 &= \left( \frac{dr}{dt} \hat{\mathbf{r}} + r \frac{d\hat{\mathbf{r}}}{dt} \right)^2 \\ &= \left( \frac{dr}{dt} \right)^2 + r^2 \left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2 + 2 r \frac{dr}{dt} \hat{\mathbf{r}} \dot{c} \frac{d\hat{\mathbf{r}}}{dt} \end{aligned}

The squared unit vector derivative was previously calculated to be

\begin{aligned}\left(\frac{d \hat{\mathbf{r}}}{dt}\right)^2 = \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \end{aligned} \quad\quad\quad(31)

Picturing the geometry is enough to know that \dot{\hat{\mathbf{r}}} \dot{c} \hat{\mathbf{r}} = 0 since \dot{\hat{\mathbf{r}}} is always tangential to the sphere. It should also be possible to algebraically show this, but without going through the effort we at least now know the general line element

\begin{aligned}\dot{\mathbf{x}}^2 = {\dot{r}}^2 + r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(32)

Posted in Math and Physics Learning. | Tagged: , , , | Leave a Comment »

Hamiltonian treatment of rigid Spherical pendulum.

Posted by peeterjoot on October 11, 2009

[Click here for a PDF incorporating this post with nicer formatting]

For the spherical rigid pendulum of length l, we have for the distance above the lowest point

\begin{aligned}h = l (1 + \cos\theta)\end{aligned} \quad\quad\quad(157)

(measuring \theta down from the North pole as conventional). The Lagrangian is therefore

\begin{aligned}\mathcal{L} = \frac{1}{{2}} m l^2 (\dot{\theta}^2 + \sin^2\theta \dot{\phi}^2) - m g l (1 + \cos\theta)\end{aligned} \quad\quad\quad(158)

We can drop the constant term, using the simpler Lagrangian

\begin{aligned}\mathcal{L} = \frac{1}{{2}} m l^2 (\dot{\theta}^2 + \sin^2\theta \dot{\phi}^2) - m g l \cos\theta\end{aligned} \quad\quad\quad(159)

To express the Hamiltonian we need first the conjugate momenta, which are

\begin{aligned}P_\theta &= \frac{\partial {\mathcal{L}}}{\partial {\dot{\theta}}} = m l^2 \dot{\theta} \\ P_\phi &= \frac{\partial {\mathcal{L}}}{\partial {\dot{\phi}}} = m l^2 \sin^2\theta \dot{\phi}\end{aligned} \quad\quad\quad(160)

We can now write the Hamiltonian

\begin{aligned}H = \frac{1}{{2 m l^2}} \left({P_\theta}^2 + \frac{1}{{\sin^2\theta}} {P_\phi}^2\right) + m g l \cos\theta\end{aligned} \quad\quad\quad(162)

Before going further one sees that there is going to be trouble where \sin\theta = 0. Curiously, this is at the poles, the most dangling position and the upright. The south pole is the usual point where we solve the planar pendulum problem using the harmonic oscillator approximation, so it is somewhat curious that the energy of the system appears to go undefined at this point where the position is becoming more defined. It seems almost like a quantum uncertainty phenomena until one realizes that the momentum conjugate to \phi is itself proportional to \sin^2 \theta. By expressing the energy in terms of this P_\phi momentum we have to avoid looking at the poles for a solution to the equations. If we go back to the Lagrangian and the Euler-Lagrange equations, this point becomes perfectly tractable since we are no longer dividing through by \sin^2\theta.

Examining the polar solutions is something to return to. For now, let’s avoid that region. For regions where \sin\theta is nicely non-zero, we get for the Hamiltonian equations

\begin{aligned}\frac{\partial {H}}{\partial {P_\phi}} &= \dot{\phi} = \frac{1}{{ m l^2 \sin^2 \theta}} P_\phi \\ \frac{\partial {H}}{\partial {P_\theta}} &= \dot{\theta} = \frac{1}{{ m l^2 }} P_\theta \\ \frac{\partial {H}}{\partial {\phi}} &= -\dot{P}_\phi = 0 \\ \frac{\partial {H}}{\partial {\theta}} &= -\dot{P}_\theta = -\frac{\cos\theta}{ m l^2 \sin^3 \theta} {P_\phi}^2 - m g l \sin\theta\end{aligned} \quad\quad\quad(163)

These now expressing the dynamics of the system. The first two equations are just the definitions of the canonical momenta that we started with using the Lagrangian. Not surprisingly, but unfortunate, we have a non-linear system here like the planar rigid pendulum, so despite this being one of the most simple systems it does not look terribly tractable. What would it take to linearize this system of equations?

Lets write the state space vector for the system as

\begin{aligned}\mathbf{x} =\begin{bmatrix}P_\theta \\ \theta \\ P_\phi \\ \phi \\ \end{bmatrix}\end{aligned} \quad\quad\quad(167)

lets also suppose that we are interested in the change to the state vector in the neighborhood of an initial state

\begin{aligned}\mathbf{x} =\begin{bmatrix}P_\theta \\ \theta \\ P_\phi \\ \phi \\ \end{bmatrix}={\begin{bmatrix}P_\theta \\ \theta \\ P_\phi \\ \phi \\ \end{bmatrix}}_0+ \mathbf{z}\end{aligned} \quad\quad\quad(168)

The Hamiltonian equations can then be written

\begin{aligned}\frac{d\mathbf{z}}{dt} &=\begin{bmatrix}\frac{\cos\theta}{ m l^2 \sin^3 \theta} {P_\phi}^2 + m g l \sin\theta \\ \frac{1}{{ m l^2 }} P_\theta \\ 0 \\ \frac{1}{{ m l^2 \sin^2 \theta}} P_\phi \\ \end{bmatrix}\end{aligned} \quad\quad\quad(169)

Getting away from the specifics of this particular system is temporarily helpful. We have a set of equations that we wish to calculate a linear approximation for

\begin{aligned}\frac{dz_\mu}{dt} = A_\mu(x_\nu) \approx A_\mu(\mathbf{x}_0) + \sum_\alpha {\left. \frac{\partial {A_\mu}}{\partial {x_\alpha}} \right\vert}_{\mathbf{x}_0} z_\alpha\end{aligned} \quad\quad\quad(170)

Our linear approximation is thus

\begin{aligned}\frac{d\mathbf{z}}{dt} &\approx{\begin{bmatrix}\frac{\cos\theta}{ m l^2 \sin^3 \theta} {P_\phi}^2 + m g l \sin\theta \\ \frac{1}{{ m l^2 }} P_\theta \\ 0 \\ \frac{1}{{ m l^2 \sin^2 \theta}} P_\phi \\ \end{bmatrix}}_0+{\begin{bmatrix}0 & -\frac{{P_\phi}^2 (1 + 2 \cos^2 \theta)}{m l^2 \sin^4 \theta} +m g l \cos\theta & \frac{2 \cos\theta}{ m l^2 \sin^3 \theta} P_\phi & 0 \\ \frac{1}{{m l^2}} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & \frac{-2 P_\phi \cos\theta}{m l^2 \sin^3\theta} & \frac{1}{{ m l^2 \sin^2 \theta}} & 0 \\ \end{bmatrix}}_0\mathbf{z}\end{aligned} \quad\quad\quad(171)

Now, this is what we get blindly trying to set up the linear approximation of the state space differential equation. We see that the cyclic coordinate \phi leads to a bit of trouble since no explicit \phi dependence in the Hamiltonian makes the resulting matrix factor non-invertible. It appears that we would be better explicitly utilizing this cyclic coordinate to note that P_\phi = \text{constant}, and to omit this completely from the state vector. Our equations in raw form are now

\begin{aligned}\dot{\theta} &= \frac{1}{{ m l^2 }} P_\theta \\  \dot{P}_\theta &= \frac{\cos\theta}{ m l^2 \sin^3 \theta} {P_\phi}^2 + m g l \sin\theta \\ \dot{\phi} &= \frac{1}{{ m l^2 \sin^2 \theta}} P_\phi \end{aligned} \quad\quad\quad(172)

We can treat the \phi dependence later once we have solved for \theta. That equation to later solve is just this last

\begin{aligned}\dot{\phi} &= \frac{1}{{ m l^2 \sin^2 \theta}} P_\phi \end{aligned} \quad\quad\quad(175)

This integrates directly, presuming \theta = \theta(t) is known, and we have

\begin{aligned}\phi - \phi(0) &= \frac{P_\phi}{ m l^2} \int_0^t \frac{d\tau}{\sin^2 \theta(\tau)} \end{aligned} \quad\quad\quad(176)

Now the state vector and its perturbation can be redefined omitting all but the \theta dependence. Namely

\begin{aligned}\mathbf{x} =\begin{bmatrix}P_\theta \\ \theta \\ \end{bmatrix}\end{aligned} \quad\quad\quad(177)

\begin{aligned}\mathbf{x} =\begin{bmatrix}P_\theta \\ \theta \\ \end{bmatrix}={\begin{bmatrix}P_\theta \\ \theta \\ \end{bmatrix}}_0+ \mathbf{z}\end{aligned} \quad\quad\quad(178)

We can now write the remainder of this non-linear system as

\begin{aligned}\frac{d\mathbf{z}}{dt} &=\begin{bmatrix}\frac{\cos\theta}{ m l^2 \sin^3 \theta} {P_\phi}^2 + m g l \sin\theta \\ \frac{1}{{ m l^2 }} P_\theta \\ \end{bmatrix}\end{aligned} \quad\quad\quad(179)

and make the linear approximation around \mathbf{x}_0 as

\begin{aligned}\frac{d\mathbf{z}}{dt} &\approx{\begin{bmatrix}\frac{\cos\theta}{ m l^2 \sin^3 \theta} {P_\phi}^2 + m g l \sin\theta \\ \frac{1}{{ m l^2 }} P_\theta \\ \end{bmatrix}}_0+{\begin{bmatrix}0 & -\frac{{P_\phi}^2 (1 + 2 \cos^2 \theta)}{m l^2 \sin^4 \theta} +m g l \cos\theta  \\ \frac{1}{{m l^2}} & 0 \\ \end{bmatrix}}_0\mathbf{z}\end{aligned} \quad\quad\quad(180)

This now looks a lot more tractable, and is in fact exactly the same form now as the equation for the linearized planar pendulum. The only difference is the normalization required to switch to less messy dimensionless variables. The main effect of allowing the trajectory to have a non-planar component is a change in the angular frequency in the \theta dependent motion. That frequency will no longer be \sqrt{{\left\lvert{\cos\theta_0}\right\rvert} g/l}, but also has a P_\phi and other more complex trigonometric \theta dependencies. It also appears that we can probably have hyperbolic or trigonometric solutions in the neighborhood of any point, regardless of whether it is a northern hemispherical point or a southern one. In the planar pendulum the unambiguous sign of the matrix terms led to hyperbolic only above the horizon, and trigonometric only below.

Posted in Math and Physics Learning. | Tagged: , , , , , | 8 Comments »

Line element for Spherical Polar unit vectors using Geometric Algebra.

Posted by peeterjoot on October 9, 2009

[Click here for a PDF incorporating this post with nicer formatting]

The line element for the particle moving on a spherical surface can be calculated by calculating the derivative of the spherical polar unit vector \hat{\mathbf{r}}

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} = \frac{\partial {\hat{\mathbf{r}}}}{\partial {\phi}} \frac{d\phi}{dt} +\frac{\partial {\hat{\mathbf{r}}}}{\partial {\theta}} \frac{d\theta}{dt}\end{aligned} \quad\quad\quad(17)

than taking the magnitude of this vector. We can start either in coordinate form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta \cos\phi + \mathbf{e}_2 \sin\theta \sin\phi\end{aligned} \quad\quad\quad(18)

or, instead do it the fun way, first grouping this into a complex exponential form. Writing i = \mathbf{e}_1 \mathbf{e}_2, the first factorization is

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \end{aligned} \quad\quad\quad(19)

The unit vector \boldsymbol{\rho} = \mathbf{e}_1 e^{i\phi} lies in the x,y plane perpendicular to \mathbf{e}_3, so we can form the unit bivector \mathbf{e}_3\boldsymbol{\rho} and further factor the unit vector terms into a \cos + i \sin form

\begin{aligned}\hat{\mathbf{r}} &= \mathbf{e}_3 \cos\theta + \mathbf{e}_1 \sin\theta e^{i\phi} \\ &= \mathbf{e}_3 (\cos\theta + \mathbf{e}_3 \boldsymbol{\rho} \sin\theta) \\ \end{aligned}

This allows the spherical polar unit vector to be expressed in complex exponential form (really a vector-quaternion product)

\begin{aligned}\hat{\mathbf{r}} = \mathbf{e}_3 e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} = e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \mathbf{e}_3\end{aligned} \quad\quad\quad(20)

Now, calcuating the unit vector velocity, we get

\begin{aligned}\frac{d\hat{\mathbf{r}}}{dt} &= \mathbf{e}_3 \mathbf{e}_3 \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \dot{\theta} + \mathbf{e}_1 \mathbf{e}_1 \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \\ &= \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) \\ &= \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho}\end{aligned}

The last two lines above factor the \boldsymbol{\rho} vector and the e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} quaternion to the left and to the right in preparation for squaring this to calculate the magnitude.

\begin{aligned}\left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2&=\left\langle{{ \left( \frac{d\hat{\mathbf{r}}}{dt} \right)^2 }}\right\rangle \\ &=\left\langle{{ \left( \dot{\theta} + \mathbf{e}_2 \sin\theta e^{i\phi} \dot{\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \boldsymbol{\rho} \theta} \right) \left(\dot{\theta} + e^{-\mathbf{e}_3 \boldsymbol{\rho} \theta} \boldsymbol{\rho} \sin\theta e^{-i\phi} \mathbf{e}_2 \dot{\phi}\right) }}\right\rangle \\ &=\dot{\theta}^2 + \sin^2\theta \dot{\phi}^2+ \sin\theta \dot{\phi} \dot{\theta}\left\langle{{ \mathbf{e}_2 e^{i\phi} \boldsymbol{\rho} e^{\mathbf{e}_3 \rho \theta}+e^{-\mathbf{e}_3 \rho \theta} \boldsymbol{\rho} e^{-i\phi} \mathbf{e}_2}}\right\rangle \\ \end{aligned}

This last term (\in \text{span} \{\rho\mathbf{e}_1, \rho\mathbf{e}_2, \mathbf{e}_1\mathbf{e}_3, \mathbf{e}_2\mathbf{e}_3\}) has only grade two components, so the scalar part is zero. We are left with the line element

\begin{aligned}\left(\frac{d (r\hat{\mathbf{r}})}{dt}\right)^2 = r^2 \left( \dot{\theta}^2 + \sin^2\theta \dot{\phi}^2 \right)\end{aligned} \quad\quad\quad(21)

In retrospect, at least once one sees the answer, it seems obvious. Keeping \theta constant the length increment moving in the plane is ds = \sin\theta d\phi, and keeping \phi constant, we have ds = d\theta. Since these are perpendicular directions we can add the lengths using the Pythagorean theorem.

Posted in Math and Physics Learning. | Tagged: , , , | Leave a Comment »