
Gaussian quadratic form integrals and multivariable approximation of exponential integrals

Posted by peeterjoot on January 26, 2014

[Click here for a PDF of this post with nicer formatting (especially if my latex to wordpress script has left FORMULA DOES NOT PARSE errors.)]

Motivation

In [1] eq. I.2.20 is the approximation

\begin{aligned}\int d\mathbf{q} e^{-f(\mathbf{q})/\hbar} \approx e^{-f(\mathbf{a})/\hbar} \sqrt{\frac{2 \pi \hbar}{\text{Det} f''(\mathbf{a})} },\end{aligned} \hspace{\stretch{1}}(1.1.1)

where [ f''(\mathbf{a}) ]_{ij} \equiv {\left.{{\partial^2 f/\partial q_i \partial q_j}}\right\vert}_{{\mathbf{q} = \mathbf{a}}}. Here \mathbf{a} is assumed to be an extremum of f. This follows from a generalization of the Gaussian integral result. Let’s derive both in detail.

Guts

First, to second order, let’s expand f(\mathbf{q}) around a min or max at \mathbf{q} = \mathbf{a}. The usual trick, presuming that one doesn’t remember the form of this generalized Taylor expansion, is to expand g(t) = f(\mathbf{a} + t \mathbf{q}) around t = 0, then evaluate at t = 1. We have

\begin{aligned}g'(t) &= \sum_i \frac{\partial {f(\mathbf{a} + t \mathbf{q})}}{\partial {(a_i + t q_i)}} \frac{d{{ (a_i + t q_i) }}}{dt} \\ &= \sum_i q_i \frac{\partial {f(\mathbf{a} + t \mathbf{q})}}{\partial {(a_i + t q_i)}} \\ &= \mathbf{q} \cdot \left(  {\left.{{\boldsymbol{\nabla}_\mathbf{q} f(\mathbf{q})}}\right\vert}_{{\mathbf{q} = \mathbf{a} + t \mathbf{q}}}  \right).\end{aligned} \hspace{\stretch{1}}(1.2.2)

The second derivative is

\begin{aligned}g''(t) = \sum_{i j} q_i q_j \frac{\partial {}}{\partial {(a_j + t q_j)}} \frac{\partial {f(\mathbf{a} + t \mathbf{q})}}{\partial {(a_i + t q_i)}},\end{aligned} \hspace{\stretch{1}}(1.2.3)

This gives

\begin{aligned}\begin{aligned}g'(0) &= \mathbf{q} \cdot {\left.{{\boldsymbol{\nabla}_\mathbf{q} f(\mathbf{q})}}\right\vert}_{{\mathbf{q} = \mathbf{a}}} = \sum_i q_i {\left.{{\frac{\partial {f}}{\partial {q_i}}}}\right\vert}_{\mathbf{a}} \\ g''(0) &= {\left.{{\left( \mathbf{q} \cdot \boldsymbol{\nabla}_\mathbf{q} \right)^2 f(\mathbf{q})}}\right\vert}_{{\mathbf{q} = \mathbf{a}}} = \sum_{i j} q_i q_j {\left.{{\frac{\partial^2 f}{\partial q_i \partial q_j}}}\right\vert}_{\mathbf{a}}.\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.2.4)

Putting these together, the expansion to second order in t is

\begin{aligned}f(\mathbf{a} + t \mathbf{q}) \approx f(\mathbf{a}) + \sum_i q_i {\left.{{\frac{\partial {f}}{\partial {q_i}}}}\right\vert}_{\mathbf{a}} \frac{t^1}{1!}+ \sum_{i j} q_i q_j {\left.{{\frac{\partial^2 f}{\partial q_i \partial q_j}}}\right\vert}_{\mathbf{a}} \frac{t^2}{2!},\end{aligned} \hspace{\stretch{1}}(1.2.5)

or

\begin{aligned}f(\mathbf{a} + \mathbf{q}) \approx f(\mathbf{a}) + \sum_i q_i {\left.{{ \left(  \frac{\partial {f}}{\partial {q_i}} \right)}}\right\vert}_{\mathbf{a}}+ \frac{1}{{2}} \sum_{i j} q_i q_j {\left.{{\left(  \frac{\partial^2 f}{\partial q_i \partial q_j}  \right)}}\right\vert}_{\mathbf{a}}.\end{aligned} \hspace{\stretch{1}}(1.2.6)

We can put the terms up to second order into nice tidy matrix form. Writing

\begin{aligned}\mathbf{b} = {\left.{{\left(  \boldsymbol{\nabla}_\mathbf{q} f \right)}}\right\vert}_{\mathbf{a}}\end{aligned} \hspace{\stretch{1}}(1.0.7a)

\begin{aligned}A = {\begin{bmatrix}{\left.{{ \left(  \frac{\partial^2 f}{\partial q_i \partial q_j} \right)}}\right\vert}_{\mathbf{a}}\end{bmatrix}}_{i j}.\end{aligned} \hspace{\stretch{1}}(1.0.7b)

Note that the matrix of eq. 1.0.7b is real and symmetric, and can thus be reduced to diagonal form by an orthonormal transformation. Putting the pieces together, we have

\begin{aligned}f(\mathbf{a} + \mathbf{q}) \approx f(\mathbf{a}) + \mathbf{q}^\text{T} \mathbf{b} + \frac{1}{{2}} \mathbf{q}^\text{T} A \mathbf{q}.\end{aligned} \hspace{\stretch{1}}(1.0.8)

Integrating this, we have

\begin{aligned}\int dq_1 dq_2 \cdots dq_N \exp\left( -\left(  f(\mathbf{a}) + \mathbf{q}^\text{T} \mathbf{b} + \frac{1}{{2}} \mathbf{q}^\text{T} A \mathbf{q}  \right)\right)=e^{-f(\mathbf{a})}\int dq_1 dq_2 \cdots dq_N \exp\left(  -\mathbf{q}^\text{T} \mathbf{b} - \frac{1}{{2}} \mathbf{q}^\text{T} A \mathbf{q}  \right).\end{aligned} \hspace{\stretch{1}}(1.0.9)

Employing an orthonormal change of variables to diagonalize the matrix

\begin{aligned}A = O^\text{T} D O,\end{aligned} \hspace{\stretch{1}}(1.0.10)

and \mathbf{r} = O \mathbf{q}, or r_i = O_{ik} q_k, the volume element after transformation is

\begin{aligned}dr_1 dr_2 \cdots dr_N &= \frac{\partial(r_1, r_2, \cdots, r_N)}{\partial(q_1, q_2, \cdots, q_N)}dq_1 dq_2 \cdots dq_N \\ &= \begin{vmatrix}O_{11} & O_{12} & \cdots & O_{1N} \\ O_{21} & O_{22} & \cdots & O_{2N} \\ \vdots & \vdots & \ddots & \vdots \\  O_{N1} & O_{N2} & \cdots & O_{NN} \\ \end{vmatrix}dq_1 dq_2 \cdots dq_N \\ &= (\text{Det} O)dq_1 dq_2 \cdots dq_N \\ &= dq_1 dq_2 \cdots dq_N, \end{aligned} \hspace{\stretch{1}}(1.0.10)

since we are free to pick the orthonormal transformation with \text{Det} O = 1.

Our integral is

\begin{aligned}e^{-f(\mathbf{a})}\int dq_1 dq_2 \cdots dq_N \exp\left(  -\mathbf{q}^\text{T} \mathbf{b} - \frac{1}{{2}} \mathbf{q}^\text{T} A \mathbf{q}  \right) &= e^{-f(\mathbf{a})}\int dr_1 dr_2 \cdots dr_N \exp\left(  -\mathbf{q}^\text{T} O^\text{T} O \mathbf{b} - \frac{1}{{2}} \mathbf{q}^\text{T} O^\text{T} D O \mathbf{q}  \right) \\ &= e^{-f(\mathbf{a})}\int dr_1 dr_2 \cdots dr_N \exp\left(  -\mathbf{r}^\text{T} (O \mathbf{b}) - \frac{1}{{2}} \mathbf{r}^\text{T} D \mathbf{r}  \right) \\ &= e^{-f(\mathbf{a})}\int dr_1 e^{ -\frac{1}{{2}} r_1^2 \lambda_1 - r_1 (O \mathbf{b})_1 }\int dr_2 e^{ -\frac{1}{{2}} r_2^2 \lambda_2 - r_2 (O \mathbf{b})_2 }\cdots \int dr_N e^{ -\frac{1}{{2}} r_N^2 \lambda_N - r_N (O \mathbf{b})_N }.\end{aligned} \hspace{\stretch{1}}(1.0.10)

We now have products of terms that are of the regular Gaussian form. One such integral is

\begin{aligned}\int dx e^{-a x^2/2 + J x} &= \int dx \exp\left(-\frac{1}{{2}} \left(\left(  \sqrt{a} x - J/\sqrt{a}  \right)^2- \left(  J/\sqrt{a}  \right)^2\right)\right) \\ &= e^{J^2/2a} \sqrt{2 \pi \int_0^\infty r dr e^{-a r^2/2}},\end{aligned} \hspace{\stretch{1}}(1.0.10)

where the usual squaring and polar coordinates trick evaluates the remaining Gaussian integral.

This is just

\begin{aligned}\int dx e^{-a x^2/2 + J x}= e^{J^2/2a} \sqrt{ \frac{2 \pi}{a} }.\end{aligned} \hspace{\stretch{1}}(1.0.10)

Applying this to the integral of interest, and writing m_i = (O \mathbf{b})_i, we have

\begin{aligned}\begin{aligned}e^{-f(\mathbf{a})}\int &dq_1 dq_2 \cdots dq_N \exp\left(  -\mathbf{q}^\text{T} \mathbf{b} - \frac{1}{{2}} \mathbf{q}^\text{T} A \mathbf{q}  \right) \\ &=e^{-f(\mathbf{a})}e^{m_1^2/2\lambda_1} \sqrt{ \frac{2 \pi}{\lambda_1}}e^{m_2^2/2\lambda_2} \sqrt{ \frac{2 \pi}{\lambda_2}}\cdots e^{m_N^2/2\lambda_N} \sqrt{ \frac{2 \pi}{\lambda_N}} \\ &=e^{-f(\mathbf{a})}\sqrt{\frac{(2 \pi)^N}{\text{Det} A}}\exp\left(\frac{1}{{2}}\left(  m_1^2/\lambda_1 + m_2^2/\lambda_2 + \cdots + m_N^2/\lambda_N  \right) \right).\end{aligned}\end{aligned} \hspace{\stretch{1}}(1.0.10)

This last exponential argument can be put into matrix form

\begin{aligned}m_1^2/\lambda_1+m_2^2/\lambda_2+ \cdots +m_N^2/\lambda_N &= (O \mathbf{b})^\text{T} D^{-1} O \mathbf{b} \\ &= \mathbf{b}^\text{T} O^\text{T} D^{-1} O \mathbf{b} \\ &= \mathbf{b}^\text{T} A^{-1} \mathbf{b}.\end{aligned} \hspace{\stretch{1}}(1.0.10)

Finally, referring back to eq. 1.0.7, we have

\begin{aligned}\int d\mathbf{q} e^{-f(\mathbf{q})} \approx e^{-f(\mathbf{a})}\sqrt{\frac{(2 \pi)^N}{\text{Det} A}}e^{\mathbf{b}^\text{T} A^{-1} \mathbf{b}/2}.\end{aligned} \hspace{\stretch{1}}(1.0.10)

Observe that we can recover eq. 1.1.1 by noting that \mathbf{b} = 0 was assumed for that system (i.e. \mathbf{a} was an extremum point), and by noting that with f \rightarrow f/\hbar each second partial picks up a factor 1/\hbar, so that \text{Det}(A/\hbar) = \text{Det} A/\hbar^N turns the (2\pi)^N above into the (2 \pi \hbar)^N factor that eq. 1.1.1 writes schematically as 2 \pi \hbar.
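As a sanity check on the signs and factors above, here is a quick numerical comparison of a two dimensional instance of this result against brute force quadrature. This is my addition, not part of the original derivation, and assumes a Python environment with numpy and scipy available:

```python
import numpy as np
from scipy.integrate import dblquad

# An arbitrary 2x2 symmetric positive definite A and a linear term b.
A = np.array([[2.0, 0.3],
              [0.3, 1.0]])
b = np.array([0.2, -0.5])

def integrand(y, x):
    q = np.array([x, y])
    return np.exp(-q @ b - 0.5 * q @ A @ q)

# Brute force quadrature over a box wide enough to capture the Gaussian tails.
numeric, _ = dblquad(integrand, -10, 10, lambda x: -10, lambda x: 10)

# Closed form: sqrt((2 pi)^N / Det A) exp(b^T A^{-1} b / 2), with N = 2.
closed = np.sqrt((2 * np.pi) ** 2 / np.linalg.det(A)) * np.exp(0.5 * b @ np.linalg.solve(A, b))

print(numeric, closed)  # these agree to quadrature precision
```

In particular the exponent \mathbf{b}^\text{T} A^{-1} \mathbf{b}/2 comes out positive, as found above.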

An afterword on notational sugar:

We didn’t need it, but it seems worth noting that we can write the Taylor expansion of eq. 1.0.8 in operator form as

\begin{aligned}f(\mathbf{a} + \mathbf{q}) = \sum_{k = 0}^\infty \frac{1}{{k!}} {\left.{{ \left( \mathbf{q} \cdot \boldsymbol{\nabla}_{\mathbf{q}'} \right)^k f(\mathbf{q}') }}\right\vert}_{{\mathbf{q}' = \mathbf{a}}}={\left.{{ e^{\mathbf{q} \cdot \boldsymbol{\nabla}_{\mathbf{q}'}} f(\mathbf{q}') }}\right\vert}_{{\mathbf{q}' = \mathbf{a}}}.\end{aligned} \hspace{\stretch{1}}(1.0.18)
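This operator form can be spot checked symbolically (my addition, assuming sympy), comparing the multinomially expanded series against a direct shift of the argument for a polynomial test function, for which the truncated series is exact:

```python
import sympy as sp

q1, q2, a1, a2 = sp.symbols('q1 q2 a1 a2')

f = a1**2 * a2   # an arbitrary cubic test function of q' = (a1, a2)

# Truncated operator series sum_k (q . grad')^k f / k!, expanded multinomially.
series = sum(
    sp.binomial(k, i) * q1**i * q2**(k - i)
    * sp.diff(f, a1, i, a2, k - i) / sp.factorial(k)
    for k in range(4) for i in range(k + 1)
)
print(sp.expand(series - f.subs([(a1, a1 + q1), (a2, a2 + q2)])))  # 0
```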

References

[1] A. Zee. Quantum field theory in a nutshell. Universities Press, 2005.


PHY450H1S. Relativistic Electrodynamics Tutorial 1 (TA: Simon Freedman).

Posted by peeterjoot on January 21, 2011

[Click here for a PDF of this post with nicer formatting]

Worked question.

The TA blasted through a problem from Hartle [1], section 5.17 (all the while apologizing for going so slow). I’m going to have to look these notes over carefully to figure out what on Earth he was doing.

At one point he asked if anybody was completely lost. Nobody said yes, but given the class title, I had the urge to say “No, just relatively lost”.

Q: In its rest frame S, a source emits radiation isotropically with frequency \omega and number flux f(\text{photons}/\text{cm}^2 s). The source moves along the x-axis with speed V in an observer frame O. What does the energy flux in O look like?

A brief intro with four vectors

A 3-vector:

\begin{aligned}\mathbf{a} &= (a_x, a_y, a_z) = (a^1, a^2, a^3) \\ \mathbf{b} &= (b_x, b_y, b_z) = (b^1, b^2, b^3)\end{aligned} \hspace{\stretch{1}}(1.1)

For this we have the dot product

\begin{aligned}\mathbf{a} \cdot \mathbf{b} = \sum_{\alpha=1}^3 a^\alpha b^\alpha\end{aligned} \hspace{\stretch{1}}(1.3)

Greek letters in this course (opposite to everybody else in the world, because of Landau and Lifshitz) run from 1 to 3, whereas roman letters run through the set \{0,1,2,3\}.

We want to put space and time on an equal footing and form the composite quantity (four vector)

\begin{aligned}x^i = (ct, \mathbf{r}) = (x^0, x^1, x^2, x^3),\end{aligned} \hspace{\stretch{1}}(1.4)

where

\begin{aligned}x^0 &= ct \\ x^1 &= x \\ x^2 &= y \\ x^3 &= z.\end{aligned} \hspace{\stretch{1}}(1.5)

It will also be convenient to drop indexes when referring to all the components of a four vector and we will use lower or upper case non-bold letters to represent such four vectors. For example

\begin{aligned}X = (ct, \mathbf{r}),\end{aligned} \hspace{\stretch{1}}(1.9)

or

\begin{aligned}v = \gamma \left(c, \mathbf{v} \right).\end{aligned} \hspace{\stretch{1}}(1.10)

Three vectors will be represented as letters with over arrows \vec{a} or (in text) bold face \mathbf{a}.

Recall that the squared spacetime interval between two events X_1 and X_2 is defined as

\begin{aligned}{S_{X_1, X_2}}^2 = (ct_1 - c t_2)^2 - (\mathbf{x}_1 - \mathbf{x}_2)^2.\end{aligned} \hspace{\stretch{1}}(1.11)

In particular, with one of these events at the origin, we have an operator which takes a single four vector and spits out a scalar, measuring a “distance” from the origin

\begin{aligned}s^2 = (ct)^2 - \mathbf{r}^2.\end{aligned} \hspace{\stretch{1}}(1.12)

This motivates the introduction of a dot product for our four vector space.

\begin{aligned}X \cdot X = (ct)^2 - \mathbf{r}^2 = (x^0)^2 - \sum_{\alpha=1}^3 (x^\alpha)^2\end{aligned} \hspace{\stretch{1}}(1.13)

Utilizing the spacetime dot product of 1.13 we have for the dot product of the difference between two events

\begin{aligned}(X - Y) \cdot (X - Y)&=(x^0 - y^0)^2 - \sum_{\alpha =1}^3 (x^\alpha - y^\alpha)^2 \\ &=X \cdot X + Y \cdot Y - 2 x^0 y^0 + 2 \sum_{\alpha =1}^3 x^\alpha y^\alpha.\end{aligned}

From this, assuming our dot product 1.13 is both linear and symmetric, we have for any pair of spacetime events

\begin{aligned}X \cdot Y = x^0 y^0 - \sum_{\alpha =1}^3 x^\alpha y^\alpha.\end{aligned} \hspace{\stretch{1}}(1.14)

How do our four vectors transform? This is really just a notational issue, since this has already been discussed. In this new notation we have

\begin{aligned}{x^0}' &= ct' = \gamma ( ct - \beta x) = \gamma ( x^0 - \beta x^1 ) \\ {x^1}' &= x' = \gamma ( x - \beta ct ) = \gamma ( x^1 - \beta x^0 ) \\ {x^2}' &= x^2 \\ {x^3}' &= x^3\end{aligned} \hspace{\stretch{1}}(1.15)

where \beta = V/c, and \gamma^{-2} = 1 - \beta^2.

In order to put some structure to this, it can be helpful to express this dot product as a quadratic form. We write

\begin{aligned}A \cdot B = \begin{bmatrix}a^0 & \mathbf{a}^\text{T} \end{bmatrix}\begin{bmatrix}1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}\begin{bmatrix}b^0 \\ \mathbf{b}\end{bmatrix}= A^\text{T} G B.\end{aligned} \hspace{\stretch{1}}(1.19)

We can write our Lorentz boost as a matrix

\begin{aligned}\begin{bmatrix}\gamma & -\beta \gamma & 0 & 0 \\ -\beta \gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\end{aligned} \hspace{\stretch{1}}(1.20)

so that the dot product between two transformed four vectors takes the form

\begin{aligned}A' \cdot B' = A^\text{T} O^\text{T} G O B\end{aligned} \hspace{\stretch{1}}(1.21)
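The content of eq. 1.21 is that the dot product is invariant because the boost matrix preserves the metric, O^\text{T} G O = G, so that A' \cdot B' = A \cdot B. A quick numerical check of that property (my addition, not from the tutorial, assuming numpy):

```python
import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta ** 2)

# The metric G from (1.19) and the boost O from (1.20).
G = np.diag([1.0, -1.0, -1.0, -1.0])
O = np.array([
    [gamma, -beta * gamma, 0.0, 0.0],
    [-beta * gamma, gamma, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

# The boost preserves the metric, so the four vector dot product is frame independent.
print(np.allclose(O.T @ G @ O, G))  # True
```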

Back to the problem.

We will work in momentum space, where we have

\begin{aligned}p^i &= (p^0, \mathbf{p}) = \left( \frac{E}{c}, \mathbf{p}\right) \\ p^2 &= \frac{E^2}{c^2} -\mathbf{p}^2 \\ \mathbf{p} &= \hbar \mathbf{k} \\ E &= \hbar \omega \\ p^i &= \hbar k^i \\ k^i &= \left(\frac{\omega}{c}, \mathbf{k}\right)\end{aligned} \hspace{\stretch{1}}(1.22)

Justifying this.

Now, the TA blurted all this out. We know some of it from the QM context, and, if we've been reading ahead, a bit of it from our text [2] (the energy momentum four vector relationships). Let's go back to classical electromagnetism and recall what we know about the relation of frequency and wave numbers for continuous fields. We want solutions to Maxwell's equation in vacuum, and can show that such solutions also imply that our fields obey a wave equation

\begin{aligned}\frac{1}{{c^2}} \frac{\partial^2 \Psi}{\partial t^2} - \boldsymbol{\nabla}^2 \Psi = 0,\end{aligned} \hspace{\stretch{1}}(1.28)

where \Psi is one of \mathbf{E} or \mathbf{B}. We have other constraints imposed on the solutions by Maxwell’s equations, but require that they at least obey 1.28 in addition to these constraints.

With application of a spatial Fourier transformation of the wave equation, we find that our solution takes the form

\begin{aligned}\Psi = (2 \pi)^{-3/2} \int \tilde{\Psi}(\mathbf{k}, 0) e^{i (\omega t \pm \mathbf{k} \cdot \mathbf{x}) } d^3 \mathbf{k}.\end{aligned} \hspace{\stretch{1}}(1.29)

If one takes this as a given and applies the wave equation operator to this as a test solution, one finds without doing the Fourier transform work that we also have a constraint. That is

\begin{aligned}\frac{1}{{c^2}} (i \omega)^2 \Psi - (\pm i \mathbf{k})^2 \Psi = 0.\end{aligned} \hspace{\stretch{1}}(1.30)

So even in the continuous field domain, we have a relationship between frequency and wave number. We see that this also happens to have the form of a lightlike spacetime interval

\begin{aligned}\frac{\omega^2}{c^2} - \mathbf{k}^2 = 0.\end{aligned} \hspace{\stretch{1}}(1.31)

Also recall that the photoelectric effect imposes an experimental constraint on photon energy, where we have

\begin{aligned}E = h \nu = \frac{h}{2\pi} 2 \pi \nu = \hbar \omega\end{aligned} \hspace{\stretch{1}}(1.32)

Therefore if we impose a mechanics like P = (E/c, \mathbf{p}) relativistic energy-momentum relationship on light, it then makes sense to form a nilpotent (lightlike) four vector for our photon energy. This combines our special relativistic expectations, with the constraints on the fields imposed by classical electromagnetism. We can then write for the photon four momentum

\begin{aligned}P = \left( \frac{\hbar \omega}{c}, \hbar k \right)\end{aligned} \hspace{\stretch{1}}(1.33)

Back to the TA’s formula blitz.

Utilizing spherical polar coordinates in momentum (wave number) space, measuring the polar angle from the k^1 (x-like) axis, we can compute this polar angle in both frames,

\begin{aligned} \cos \alpha &= \frac{k^1}{{\left\lvert{\mathbf{k}}\right\rvert}} = \frac{k^1}{\omega/c} \\ \cos \alpha' &= \frac{{k^1}'}{\omega'/c} = \frac{\gamma (k^1 + \beta \omega/c)}{\gamma(\omega/c + \beta k^1)}\end{aligned} \hspace{\stretch{1}}(1.34)

Note that this requires us to assume that wave number four vectors transform in the same fashion as regular mechanical position and momentum four vectors. Also note that we have the primed frame moving negatively along the x-axis, instead of the usual positive origin shift. The question is vague enough to allow this since it only requires motion.

Check 1:

As \beta \rightarrow 1 (i.e. our primed frame velocity approaches the speed of light relative to the rest frame), \cos \alpha' \rightarrow 1, so \alpha' = 0. The emission surface gets more and more compressed toward the forward direction.

In the original reference frame the radiation was isotropic. In the new frame how does it change with respect to the angle? This is really a question to find this number flux rate

\begin{aligned}f'(\alpha') = ?\end{aligned} \hspace{\stretch{1}}(1.36)

In our rest frame the total number of photons traveling through the surface in a given interval of time is

\begin{aligned}N = \int d\Omega dt f(\alpha) = \int d\phi \sin \alpha d\alpha dt f(\alpha) = -2 \pi \int d(\cos\alpha) dt f(\alpha).\end{aligned} \hspace{\stretch{1}}(1.37)

Here we utilize the spherical solid angle d\Omega = \sin \alpha d\alpha d\phi = - d(\cos\alpha) d\phi, and integrate \phi over the [0, 2\pi] interval. We also have to assume that our number flux density is not a function of horizontal angle \phi in the rest frame.

In the moving frame we similarly have

\begin{aligned}N' &= -2 \pi \int d(\cos\alpha') dt' f'(\alpha'),\end{aligned} \hspace{\stretch{1}}(1.39)

and we again have had to assume that our transformed number flux density is not a function of the horizontal angle \phi. This seems like a reasonable move since {k^2}' = k^2 and {k^3}' = k^3 as they are perpendicular to the boost direction.

\begin{aligned}f'(\alpha') = \frac{d(\cos\alpha)}{d(\cos\alpha')} \left( \frac{dt}{dt'} \right) f(\alpha)\end{aligned} \hspace{\stretch{1}}(1.40)

Now, utilizing a particle number conservation argument, we can argue that N = N'. Regardless of the motion of the frame, the same number of particles move through the surface. Taking ratios, and examining an infinitesimal time interval, and the associated flux through a small patch, we have

\begin{aligned}\left( \frac{d(\cos\alpha)}{d(\cos\alpha')} \right) = \left( \frac{d(\cos\alpha')}{d(\cos\alpha)} \right)^{-1} = \gamma^2 ( 1 + \beta \cos\alpha)^2\end{aligned} \hspace{\stretch{1}}(1.41)

Part of the statement above was a do-it-yourself. First recall that c t' = \gamma ( c t + \beta x ), so dt/dt' evaluated at x=0 is 1/\gamma.

The rest is messier. We can calculate the d(\cos) values in the ratio above using 1.34. For example, for d(\cos(\alpha)) we have

\begin{aligned}d(\cos\alpha) &= d \left( \frac{k^1}{\omega/c} \right) \\ &= dk^1 \frac{1}{{\omega/c}} - c \frac{k^1}{{\omega^2}} d\omega.\end{aligned}

If one does the same thing for d(\cos\alpha'), after a whole whack of messy algebra one finds that the differential terms and a whole lot more mystically cancels, leaving just

\begin{aligned}\frac{d\cos\alpha'}{d\cos\alpha} = \frac{\omega^2/c^2}{(\omega/c + \beta k^1)^2} (1 - \beta^2)\end{aligned} \hspace{\stretch{1}}(1.42)

A bit more reduction with reference back to 1.34 verifies 1.41.

Also note that again from 1.34 we have

\begin{aligned}\cos\alpha' = \frac{\cos\alpha + \beta}{1 + \beta \cos\alpha}\end{aligned} \hspace{\stretch{1}}(1.43)

and rearranging this for \cos\alpha gives us

\begin{aligned}\cos\alpha = \frac{\cos\alpha' - \beta}{1 - \beta \cos\alpha'},\end{aligned} \hspace{\stretch{1}}(1.44)

which we can substitute back into 1 + \beta \cos\alpha to find that

\begin{aligned}1 + \beta \cos\alpha = \frac{1 - \beta^2}{1 - \beta \cos\alpha'} = \frac{1}{{\gamma^2 (1 - \beta \cos \alpha') }},\end{aligned} \hspace{\stretch{1}}(1.45)

so putting all the pieces together we have

\begin{aligned}f'(\alpha') = \frac{1}{{\gamma}} \frac{f(\alpha)}{(\gamma (1-\beta \cos\alpha'))^2}\end{aligned} \hspace{\stretch{1}}(1.46)
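These aberration manipulations are easy to get sign-wrong, so here is a symbolic verification of eq. 1.45 and eq. 1.46 (my addition, assuming sympy):

```python
import sympy as sp

beta, ap = sp.symbols('beta alpha_prime', positive=True)
gamma = 1 / sp.sqrt(1 - beta**2)

# eq 1.44: cos(alpha) in terms of cos(alpha').
cos_a = (sp.cos(ap) - beta) / (1 - beta * sp.cos(ap))

# eq 1.45: 1 + beta cos(alpha) = 1/(gamma^2 (1 - beta cos(alpha'))).
lhs = 1 + beta * cos_a
print(sp.simplify(lhs - 1 / (gamma**2 * (1 - beta * sp.cos(ap)))))  # 0

# eq 1.41 with dt/dt' = 1/gamma gives f' = gamma^2 (1 + beta cos(alpha))^2 f / gamma,
# which should match eq 1.46, f' = (1/gamma) f/(gamma (1 - beta cos(alpha')))^2 (take f = 1).
fp = gamma**2 * lhs**2 / gamma
print(sp.simplify(fp - 1 / (gamma**3 * (1 - beta * sp.cos(ap))**2)))  # 0
```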

The question asks for the energy flux density. We get this by multiplying the number flux by the energy \hbar \omega of each photon. As a function of the polar angle, in each of the frames, this is

\begin{aligned}L(\alpha) &= \hbar \omega(\alpha) f(\alpha) = \hbar \omega f \\ L'(\alpha') &= \hbar \omega'(\alpha') f'(\alpha') = \hbar \omega' f'\end{aligned} \hspace{\stretch{1}}(1.47)

But we have

\begin{aligned}\omega'(\alpha')/c = \gamma( \omega/c + \beta k^1 ) = \gamma \omega/c ( 1 + \beta \cos\alpha )\end{aligned} \hspace{\stretch{1}}(1.49)

As an aside, for \beta \ll 1,

\begin{aligned}\omega' = \omega ( 1 + \beta \cos\alpha) + O(\beta^2) = \omega + \delta \omega\end{aligned} \hspace{\stretch{1}}(1.50)

\begin{aligned}\delta \omega &= \omega \beta, \quad \alpha = 0 		\qquad \text{blue shift} \\ \delta \omega &= -\omega \beta, \quad \alpha = \pi 		\qquad \text{red shift}\end{aligned} \hspace{\stretch{1}}(1.51)

The TA then writes

\begin{aligned}L'(\alpha') = \frac{L/\gamma}{(\gamma (1 - \beta \cos\alpha'))^3}\end{aligned} \hspace{\stretch{1}}(1.53)

although, I calculate

\begin{aligned}L'(\alpha') = \frac{L}{\gamma^4 (\gamma (1 - \beta \cos\alpha'))^4}\end{aligned} \hspace{\stretch{1}}(1.54)

He then says the forward to backward ratio is

\begin{aligned}L'(0)/L'(\pi) = {\left( \frac{ 1 + \beta }{1-\beta} \right)}^3\end{aligned} \hspace{\stretch{1}}(1.55)

The forward radiation is much bigger than the backwards radiation.

For this I get:

\begin{aligned}L'(0)/L'(\pi) = {\left( \frac{ 1 + \beta }{1-\beta} \right)}^4\end{aligned} \hspace{\stretch{1}}(1.56)

It is still bigger for \beta positive, which I think is the point.

If I can somehow manage to keep my signs right as I do this course I may survive. Why did he pick a positive sign way back in 1.34?

References

[1] J.B. Hartle. Gravity: an introduction to Einstein's general relativity. Addison-Wesley, 2003.

[2] L.D. Landau and E.M. Lifshitz. The classical theory of fields. Butterworth-Heinemann, 1980.


Hamiltonian equations for double and multiple planar pendulums, and general quadratic form velocity dependence.

Posted by peeterjoot on October 19, 2009

[Click here for a PDF of this post with nicer formatting]

(NOTE: the process of converting from standalone latex to wordpress got messed up in a few places. These have been fixed manually, but if anything seems wrong, check first in the standalone PDF above.)

In the following section I started off with the goal of treating two connected pendulums moving in a plane. Even setting up the Hamiltonians for this turned out to be a bit messy, requiring a matrix inversion. Tackling the problem in the guise of a more general quadratic form (which works for the two particle as well as the N particle case) seemed like it would actually be simpler than using the specifics of the angular velocity dependence of the pendulum problem itself. Once the Hamiltonian equations were found in this form, an attempt was made to do the first order Taylor expansion, as was done for the single planar pendulum and the spherical pendulum. This turned out to be a nasty mess, and is seen to be not particularly illuminating. I didn't know ahead of time that this is how it would turn out, since I had my fingers crossed for some sort of magic simplification once the final substitutions were made. If such a simplification is possible, the procedure to do so is not obvious.

Although the Hamiltonian equations for a spherical pendulum have been considered previously, for the double pendulum case it seems prudent to avoid temptation, and to first see what happens with a simpler first step, a planar double pendulum.

Setting up coordinates with the x axis down and the y axis to the left, and writing i = \hat{\mathbf{x}} \hat{\mathbf{y}}, we have for the position of the first mass m_1, at angle {\theta_1} and length l_1,

\begin{aligned}z_1 = \hat{\mathbf{x}} l_1 e^{i{\theta_1}}\end{aligned} \quad\quad\quad(180)

If the second mass, dangling from the first, is at an angle {\theta_2} from the x axis, its position is

\begin{aligned}z_2 = z_1 + \hat{\mathbf{x}} l_2 e^{i{\theta_2}}\end{aligned} \quad\quad\quad(181)

We need the velocities, and their magnitudes. For z_1 this is

\begin{aligned}{\left\lvert{\dot{z}_1}\right\rvert}^2 = {l_1}^2 {\dot{\theta}_1}^2\end{aligned} \quad\quad\quad(182)

For the second mass

\begin{aligned}\dot{z}_2 = \hat{\mathbf{x}} i \left( l_1 {\dot{\theta}_1} e^{i{\theta_1}} + l_2 {\dot{\theta}_2} e^{i{\theta_2}} \right)\end{aligned} \quad\quad\quad(183)

Taking conjugates and multiplying out we have

\begin{aligned}{\left\lvert{\dot{z}_2}\right\rvert}^2 ={l_1}^2 {\dot{\theta}_1}^2+ 2 l_1 l_2 {\dot{\theta}_1} {\dot{\theta}_2} \cos({\theta_1} - {\theta_2})+{l_2}^2 {\dot{\theta}_2}^2\end{aligned} \quad\quad\quad(184)

That’s all that we need for the Kinetic terms in the Lagrangian. Now we need the height for the m g h terms. If we set the reference point at the lowest point for the double pendulum system, the height of the first particle is

\begin{aligned}h_1 = l_2 + l_1 (1 -\cos{\theta_1})\end{aligned} \quad\quad\quad(185)

For the second particle, the distance from the horizontal is

\begin{aligned}d = l_1 \cos{\theta_1} + l_2 \cos{\theta_2}\end{aligned} \quad\quad\quad(186)

So the total distance from the reference point is

\begin{aligned}h_2 = l_1 (1 - \cos{\theta_1}) + l_2 (1 -\cos{\theta_2})\end{aligned} \quad\quad\quad(187)

We now have the Lagrangian

\begin{aligned}\mathcal{L}' & =\frac{1}{{2}} m_1 {l_1}^2 {\dot{\theta}_1}^2+ \frac{1}{{2}} m_2 \left({l_1}^2 {\dot{\theta}_1}^2+ 2 l_1 l_2 {\dot{\theta}_1} {\dot{\theta}_2} \cos({\theta_1} - {\theta_2})+{l_2}^2 {\dot{\theta}_2}^2\right) \\ & -m_1 g (l_2 + l_1 (1 -\cos{\theta_1}))-m_2 g ( l_1 (1 - \cos{\theta_1}) + l_2 (1 -\cos{\theta_2}) )\end{aligned} \quad\quad\quad(188)

Dropping constant terms (effectively choosing a different reference point for the potential) and rearranging a bit, also writing M = m_1 + m_2, we have the simpler Lagrangian

\begin{aligned}\mathcal{L} & =\frac{1}{{2}} M {l_1}^2 {\dot{\theta}_1}^2 + \frac{1}{{2}} m_2 {l_2}^2 {\dot{\theta}_2}^2+ m_2 l_1 l_2 {\dot{\theta}_1} {\dot{\theta}_2} \cos({\theta_1} - {\theta_2})+M l_1 g \cos{\theta_1}+m_2 l_2 g \cos{\theta_2}\end{aligned} \quad\quad\quad(190)

The conjugate momenta that we need for the Hamiltonian are

\begin{aligned}P_{\theta_1} & =M {l_1}^2 {\dot{\theta}_1}+ m_2 l_1 l_2 {\dot{\theta}_2} \cos({\theta_1} - {\theta_2}) \\ P_{\theta_2} & =m_2 {l_2}^2 {\dot{\theta}_2}+ m_2 l_1 l_2 {\dot{\theta}_1} \cos({\theta_1} - {\theta_2})\end{aligned} \quad\quad\quad(191)

Unlike any of the other simpler Hamiltonian systems considered so far, the coupling between the velocities means that we have a system of equations that we must first invert before we can even express the Hamiltonian in terms of the respective momenta.

That is

\begin{aligned}\begin{bmatrix}P_{\theta_1} \\ P_{\theta_2}\end{bmatrix}=\begin{bmatrix}M {l_1}^2 & m_2 l_1 l_2 \cos({\theta_1} - {\theta_2}) \\ m_2 l_1 l_2 \cos({\theta_1} - {\theta_2}) & m_2 {l_2}^2\end{bmatrix}\begin{bmatrix}{\dot{\theta}_1} \\ {\dot{\theta}_2}\end{bmatrix}\end{aligned} \quad\quad\quad(193)

While this is easily invertible, doing so and attempting to substitute back results in an unholy mess (albeit perhaps one that can be simplified). Is there a better way? A possibly promising approach is motivated by observing that this matrix, a function of the angular difference \delta = {\theta_1} - {\theta_2}, looks something like a moment of inertia tensor. If we call this \mathcal{I}, and write

\begin{aligned}\boldsymbol{\Theta} & \equiv \begin{bmatrix}{\theta_1} \\ {\theta_2}\end{bmatrix}\end{aligned} \quad\quad\quad(194)

Then the relation between the conjugate momenta in vector form

\begin{aligned}\mathbf{p} & \equiv \begin{bmatrix}P_{\theta_1} \\ P_{\theta_2}\end{bmatrix}\end{aligned} \quad\quad\quad(195)

and the angular velocity vector can be written

\begin{aligned}\mathbf{p} & = \mathcal{I}(\delta)\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(196)

Can we write the Lagrangian in terms of \dot{\boldsymbol{\Theta}}? The first Kinetic term is easy, just

\begin{aligned}\frac{1}{{2}} m_1 {l_1}^2 {\dot{\theta}_1}^2 = \frac{1}{{2}} m_1\dot{\boldsymbol{\Theta}}^\text{T}\begin{bmatrix}l_1^2 & 0 \\ 0 & 0\end{bmatrix}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(197)

For the second mass, going back to 183, we can write

\begin{aligned}\dot{z}_2 = \hat{\mathbf{x}} i\begin{bmatrix}l_1 e^{i{\theta_1}} & l_2 e^{i{\theta_2}}\end{bmatrix}\dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(198)

Writing \mathbf{r} for this 1 \times 2 matrix, we can utilize the associative property for compatible sized matrices to rewrite the speed for the second particle in terms of a quadratic form

\begin{aligned}{\left\lvert{\dot{z}_2}\right\rvert}^2=(\mathbf{r} \dot{\boldsymbol{\Theta}}) (\bar{\mathbf{r}} \dot{\boldsymbol{\Theta}})=\dot{\boldsymbol{\Theta}}^\text{T} (\mathbf{r}^\text{T} \bar{\mathbf{r}}) \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(199)

The Lagrangian kinetic can all now be grouped into a single quadratic form

\begin{aligned}Q \equiv m_1\begin{bmatrix}l_1 \\ 0\end{bmatrix}\begin{bmatrix}l_1 & 0\end{bmatrix}+m_2\begin{bmatrix}l_1 e^{i{\theta_1}} \\ l_2 e^{i{\theta_2}}\end{bmatrix}\begin{bmatrix}l_1 e^{-i{\theta_1}} & l_2 e^{-i{\theta_2}}\end{bmatrix}\end{aligned} \quad\quad\quad(200)

\begin{aligned}\mathcal{L} =\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} Q \dot{\boldsymbol{\Theta}}+M l_1 g \cos{\theta_1}+m_2 l_2 g \cos{\theta_2}\end{aligned} \quad\quad\quad(201)

It is also clear that this generalizes easily to multiple connected pendulums, as follows

\begin{aligned}K & = \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} \sum_k m_k Q_k \dot{\boldsymbol{\Theta}} \\ Q_k & ={\begin{bmatrix}l_r l_s e^{i(\theta_r - \theta_s)}\end{bmatrix}}_{r,s \le k} \\ \phi & = - g \sum_k l_k \cos\theta_k \sum_{j=k}^N m_j \\ \mathcal{L} & = K - \phi\end{aligned} \quad\quad\quad(202)

In the expression for Q_k above, it is implied that the matrix is zero for any indexes r,s > k, so it would perhaps be better to write explicitly

\begin{aligned}Q = \sum_k m_k Q_k={\begin{bmatrix}\sum_{j=\max(r,s)}^N m_j l_r l_s e^{i(\theta_r - \theta_s)}\end{bmatrix}}_{r,s}\end{aligned} \quad\quad\quad(206)
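As an aside, the real symmetric combination \frac{1}{{2}}(Q + Q^\text{T}) that shows up below as the inertial matrix \mathcal{I} is simple to construct numerically from eq. 206, since the symmetric part of e^{i(\theta_r - \theta_s)} is just \cos(\theta_r - \theta_s). A small sketch (my addition, assuming numpy; the function name is mine):

```python
import numpy as np

def inertia_matrix(masses, lengths, thetas):
    """I_rs = sum_{j >= max(r,s)} m_j l_r l_s cos(theta_r - theta_s),
    the real symmetric part of Q from (206)."""
    m = np.asarray(masses, dtype=float)
    l = np.asarray(lengths, dtype=float)
    th = np.asarray(thetas, dtype=float)
    tail = np.cumsum(m[::-1])[::-1]   # tail[k] = m_k + m_{k+1} + ... + m_{N-1}
    N = len(m)
    I = np.empty((N, N))
    for r in range(N):
        for s in range(N):
            I[r, s] = tail[max(r, s)] * l[r] * l[s] * np.cos(th[r] - th[s])
    return I

# For N = 2 this reproduces the double pendulum matrix of (193).
print(inertia_matrix([1.0, 2.0], [1.0, 0.5], [0.3, -0.1]))
```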

Returning to the problem, it is convenient and sufficient in many cases to only discuss the representative double pendulum case. For that we can calculate the conjugate momenta from 201 directly

\begin{aligned}P_{\theta_1} & =\frac{\partial {}}{\partial {{\dot{\theta}_1}}}\left( \frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} Q \dot{\boldsymbol{\Theta}} \right) \\ & =\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} Q\begin{bmatrix}1 \\ 0\end{bmatrix}+\frac{1}{{2}}\begin{bmatrix}1 & 0\end{bmatrix}Q \dot{\boldsymbol{\Theta}} \\ & =\begin{bmatrix}1 & 0\end{bmatrix}\left(\frac{1}{{2}}(Q + Q^\text{T})\right) \dot{\boldsymbol{\Theta}} \\ \end{aligned}

Similarly the {\theta_2} conjugate momentum is

\begin{aligned}P_{\theta_2}=\begin{bmatrix}0 & 1\end{bmatrix}\left(\frac{1}{{2}}(Q + Q^\text{T})\right) \dot{\boldsymbol{\Theta}} \\ \end{aligned}

Putting both together, it is straightforward to verify that this recovers 193, which can now be written

\begin{aligned}\mathbf{p} & =\frac{1}{{2}}(Q + Q^\text{T}) \dot{\boldsymbol{\Theta}} = \mathcal{I} \dot{\boldsymbol{\Theta}}\end{aligned} \quad\quad\quad(207)

Observing that \mathcal{I} = \mathcal{I}^\text{T}, and thus (\mathcal{I}^\text{T})^{-1} = \mathcal{I}^{-1}, we now have everything required to express the Hamiltonian in terms of the conjugate momenta

\begin{aligned}H = \mathbf{p}^\text{T} \left( \frac{1}{{2}} \mathcal{I}^{-1} Q \mathcal{I}^{-1} \right) \mathbf{p} - M g l_1 \cos{\theta_1} - m_2 l_2 g \cos{\theta_2}\end{aligned} \quad\quad\quad(208)

This is now in a convenient form to calculate the first set of Hamiltonian equations.
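Before grinding through the derivatives, a numerical spot check (my addition, assuming numpy) that the momentum space kinetic term of eq. 208 reproduces the velocity space kinetic term of the Lagrangian, complex Q and all:

```python
import numpy as np

rng = np.random.default_rng(0)
m1, m2, l1, l2 = 1.0, 2.0, 1.0, 0.7
M = m1 + m2

th1, th2 = rng.uniform(-np.pi, np.pi, 2)
d = th1 - th2

# Q from (232), complex exponential form, and its symmetric part I from (207).
Q = np.array([[M * l1**2, m2 * l1 * l2 * np.exp(-1j * d)],
              [m2 * l1 * l2 * np.exp(1j * d), m2 * l2**2]])
I = (0.5 * (Q + Q.T)).real

w = rng.uniform(-1.0, 1.0, 2)   # angular velocities Theta_dot
p = I @ w                       # conjugate momenta, p = I Theta_dot

K_velocity = 0.5 * w @ Q @ w    # Lagrangian kinetic term (imaginary parts cancel)
Iinv = np.linalg.inv(I)
K_momentum = p @ (0.5 * Iinv @ Q @ Iinv) @ p

print(np.isclose(K_velocity.imag, 0.0), np.isclose(K_velocity.real, K_momentum.real))
```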

\begin{aligned}\dot{\theta}_k & =\frac{\partial {H}}{\partial {P_{\theta_k}}} \\ & =\frac{\partial {\mathbf{p}^\text{T}}}{\partial {P_{\theta_k}}} \frac{1}{{2}} \mathcal{I}^{-1} Q \mathcal{I}^{-1} \mathbf{p}+ \mathbf{p}^\text{T} \frac{1}{{2}} \mathcal{I}^{-1} Q \mathcal{I}^{-1} \frac{\partial {\mathbf{p}}}{\partial {P_{\theta_k}}} \\ & ={\begin{bmatrix}\delta_{kj}\end{bmatrix}}_j\frac{1}{{2}} \mathcal{I}^{-1} Q \mathcal{I}^{-1} \mathbf{p}+ \mathbf{p}^\text{T} \frac{1}{{2}} \mathcal{I}^{-1} Q \mathcal{I}^{-1}{\begin{bmatrix}\delta_{ik}\end{bmatrix}}_i \\ & ={\begin{bmatrix}\delta_{kj}\end{bmatrix}}_j\mathcal{I}^{-1} \underbrace{\frac{1}{{2}}(Q + Q^\text{T})}_{\mathcal{I}} \mathcal{I}^{-1} \mathbf{p} \\ & ={\begin{bmatrix}\delta_{kj}\end{bmatrix}}_j\mathcal{I}^{-1} \mathbf{p} \\ \end{aligned}

So, when the velocity dependence is a quadratic form as identified in 200, the first half of the Hamiltonian equations in vector form are just

\begin{aligned}\dot{\boldsymbol{\Theta}} & ={\begin{bmatrix}\frac{\partial {}}{\partial {P_{\theta_1}}} \cdots \frac{\partial {}}{\partial {P_{\theta_N}}}\end{bmatrix}}^\text{T} H=\mathcal{I}^{-1} \mathbf{p}\end{aligned} \quad\quad\quad(209)

This is exactly the relation we used in the first place to re-express the Lagrangian in terms of the conjugate momenta in preparation for this calculation. The remaining Hamiltonian equations are trickier, and are what we now want to calculate. Without specific reference to the pendulum problem, let's do this calculation for the general Hamiltonian for a non-velocity dependent potential. That is

\begin{aligned}H = \mathbf{p}^\text{T} \left( \frac{1}{{2}} \mathcal{I}^{-1} Q \mathcal{I}^{-1} \right) \mathbf{p} + \phi(\boldsymbol{\Theta})\end{aligned} \quad\quad\quad(210)

The remaining Hamiltonian equations are {\partial {H}}/{\partial {\theta_a}} = -\dot{P}_{\theta_a}, and the tricky part of evaluating this is all going to reside in the Kinetic term. Diving right in, this is

\begin{aligned}\frac{\partial {K}}{\partial {\theta_a}} & = \mathbf{p}^\text{T} \left( \frac{1}{{2}} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_a}} Q \mathcal{I}^{-1} \right) \mathbf{p} +\mathbf{p}^\text{T} \left( \frac{1}{{2}} \mathcal{I}^{-1} \frac{\partial {Q}}{\partial {\theta_a}} \mathcal{I}^{-1} \right) \mathbf{p} +\mathbf{p}^\text{T} \left( \frac{1}{{2}} \mathcal{I}^{-1} Q \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_a}} \right) \mathbf{p} \\ & = \mathbf{p}^\text{T} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_a}} \underbrace{\frac{1}{{2}}(Q + Q^\text{T})}_{=\mathcal{I}} \mathcal{I}^{-1} \mathbf{p} +\mathbf{p}^\text{T} \left( \frac{1}{{2}} \mathcal{I}^{-1} \frac{\partial {Q}}{\partial {\theta_a}} \mathcal{I}^{-1} \right) \mathbf{p}  \\ & = \mathbf{p}^\text{T} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_a}} \mathbf{p} +\mathbf{p}^\text{T} \left( \frac{1}{{2}} \mathcal{I}^{-1} \frac{\partial {Q}}{\partial {\theta_a}} \mathcal{I}^{-1} \right) \mathbf{p}  \\ \end{aligned}

For the two particle case we can expand this inverse easily enough, and then take derivatives to evaluate this, but this is messier and intractable for the general case. We can however, calculate the derivative of the identity matrix using the standard trick from rigid body mechanics

\begin{aligned}0 & = \frac{\partial {I}}{\partial {\theta_a}} \\ & = \frac{\partial {(\mathcal{I} \mathcal{I}^{-1})}}{\partial {\theta_a}} \\ & = \frac{\partial {\mathcal{I}}}{\partial {\theta_a}} \mathcal{I}^{-1}+\mathcal{I} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_a}} \\ \end{aligned}

Thus the derivative of the inverse (moment of inertia?) matrix is

\begin{aligned}\frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_a}} & = -\mathcal{I}^{-1}\frac{\partial {\mathcal{I}}}{\partial {\theta_a}} \mathcal{I}^{-1} \\ & = -\mathcal{I}^{-1}\frac{1}{{2}}\left( \frac{\partial {Q}}{\partial {\theta_a}} + \frac{\partial {Q^\text{T}}}{\partial {\theta_a}} \right) \mathcal{I}^{-1}\end{aligned}

This gives us for the Hamiltonian equation

\begin{aligned}\frac{\partial {H}}{\partial {\theta_a}} & = -\frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_a}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  + \frac{\partial {\phi}}{\partial {\theta_a}} \end{aligned} \quad\quad\quad(211)

If we introduce a phase space position gradients

\begin{aligned}\nabla & \equiv {\begin{bmatrix}\frac{\partial {}}{\partial {\theta_1}} \cdots \frac{\partial {}}{\partial {\theta_N}}\end{bmatrix}}^\text{T} \\ \end{aligned} \quad\quad\quad(212)

Then for the second half of the Hamiltonian equations we have the vector form

\begin{aligned}-\nabla H = \dot{\mathbf{p}} = {\begin{bmatrix}\frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p} \end{bmatrix}}_r- \nabla \phi\end{aligned} \quad\quad\quad(214)

The complete set of Hamiltonian equations for 210, in block matrix form, describing all the phase space change of the system is therefore

\begin{aligned}\frac{d}{dt} \begin{bmatrix}\mathbf{p} \\ \boldsymbol{\Theta}\end{bmatrix}=\begin{bmatrix}{\begin{bmatrix}\frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p} \end{bmatrix}}_r- \nabla \phi \\ \mathcal{I}^{-1} \mathbf{p} \\ \end{bmatrix}=\begin{bmatrix}{\begin{bmatrix}\frac{1}{{2}} \dot{\boldsymbol{\Theta}} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_r- \nabla \phi \\ \dot{\boldsymbol{\Theta}}\end{bmatrix}\end{aligned} \quad\quad\quad(215)

This is a very general relation, much more so than required for the original two particle problem. We have the same non-linearity that prevents this from being easily solved. If we want a linear expansion around a phase space point to find an approximate first order solution, we can get that by applying the chain rule, calculating all the {\partial {}}/{\partial {\theta_k}} and {\partial {}}/{\partial {P_{\theta_k}}} derivatives of the top N rows of this matrix.

If we write

\begin{aligned}\mathbf{z} \equiv \begin{bmatrix}\mathbf{p} \\ \boldsymbol{\Theta}\end{bmatrix}-{{\begin{bmatrix}\mathbf{p} \\ \boldsymbol{\Theta}\end{bmatrix}}}_{{t=0}}\end{aligned} \quad\quad\quad(216)

and the Hamiltonian equations 215 as

\begin{aligned}\frac{d}{dt}\begin{bmatrix}\mathbf{p} \\ \boldsymbol{\Theta}\end{bmatrix}= A(\mathbf{p}, \boldsymbol{\Theta})\end{aligned} \quad\quad\quad(217)

Then the linearization, without yet simplifying or making anything explicit, is

\begin{aligned}\dot{\mathbf{z}} \approx{{\begin{bmatrix}{\begin{bmatrix}\frac{1}{{2}} \dot{\boldsymbol{\Theta}} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_r- \nabla \phi \\ \dot{\boldsymbol{\Theta}}\end{bmatrix}}}_{{t=0}}+{\left.{{\begin{bmatrix}\frac{\partial {A}}{\partial {P_{\theta_1}}} \cdots \frac{\partial {A}}{\partial {P_{\theta_N}}} \cdots \frac{\partial {A}}{\partial {\theta_1}} \cdots \frac{\partial {A}}{\partial {\theta_N}} \end{bmatrix}}}\right\vert}_{{t=0}} \mathbf{z}\end{aligned} \quad\quad\quad(218)

For brevity the constant term evaluated at t=0 is expressed in terms of the original angular velocity vector from our Lagrangian. The task is now evaluating the derivatives in the first order term of this Taylor series. Let’s do these one at a time and then reassemble all the results afterward.

So that we can discuss just the first order terms, let's write \Delta for the matrix of first order derivatives in our Taylor expansion, as in

\begin{aligned}f(\mathbf{p}, \boldsymbol{\Theta}) = {\left.{{f(\mathbf{p}, \boldsymbol{\Theta})}}\right\vert}_{{0}} + {\left.{{\Delta f}}\right\vert}_{{0}} \mathbf{z} + \cdots \end{aligned} \quad\quad\quad(219)

First, let's do the potential gradient.

\begin{aligned} \Delta (\nabla \phi) = \begin{bmatrix} 0 & {\begin{bmatrix} \frac{\partial^2 \phi}{ \partial \theta_r \partial \theta_c } \end{bmatrix}}_{r,c} \end{bmatrix} \end{aligned} \quad\quad\quad(220)

Next in terms of complexity is the first order term of \dot{\boldsymbol{\Theta}}, for which we have

\begin{aligned}\Delta (\mathcal{I}^{-1} \mathbf{p}) = \begin{bmatrix} {\begin{bmatrix} {\begin{bmatrix}\mathcal{I}^{-1} \delta_{rc}\end{bmatrix}}_{r}\end{bmatrix}}_{c} & {\begin{bmatrix}\frac{\partial (\mathcal{I}^{-1})}{\partial \theta_c} \mathbf{p} \\ \end{bmatrix}}_{c}\end{bmatrix} \\ \end{aligned}

The \delta over all rows r and columns c is the identity matrix and we are left with

\begin{aligned}\Delta (\mathcal{I}^{-1} \mathbf{p}) & =\begin{bmatrix}\mathcal{I}^{-1} & {\begin{bmatrix}\frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_c}} \mathbf{p} \\ \end{bmatrix}}_{c}\end{bmatrix} \end{aligned} \quad\quad\quad(221)

Next, consider just the P_\theta dependence in the elements of the row vector

\begin{aligned}{\begin{bmatrix}\frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p} \end{bmatrix}}_r\end{aligned} \quad\quad\quad(222)

We can take derivatives of this, exploiting the fact that these elements are scalars, so that they equal their own transposes. Also noting that {A^{-1}}^\text{T} = {A^\text{T}}^{-1}, and \mathcal{I} = \mathcal{I}^\text{T}, we have

\begin{aligned}\frac{\partial {}}{\partial {P_{\theta_c}}}\left( \frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  \right) & =\frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} {\begin{bmatrix}\delta_{rc}\end{bmatrix}}_r+\frac{1}{{2}} \left({\begin{bmatrix}\delta_{rc}\end{bmatrix}}_r \right)^\text{T}\mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p} \\ & =\mathbf{p}^\text{T} \mathcal{I}^{-1} \left( \frac{\partial {}}{\partial {\theta_r}}\frac{1}{{2}} \left( Q + Q^T \right) \right)\mathcal{I}^{-1} {\begin{bmatrix}\delta_{rc}\end{bmatrix}}_r \\ & =\mathbf{p}^\text{T} \mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_r}}\mathcal{I}^{-1} {\begin{bmatrix}\delta_{rc}\end{bmatrix}}_r \\ \end{aligned}

Since we also have B' B^{-1} + B (B^{-1})' = 0, for invertible matrices B, this reduces to

\begin{aligned}\frac{\partial {}}{\partial {P_{\theta_c}}}\left( \frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  \right) & =-\mathbf{p}^\text{T} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_r}}{\begin{bmatrix}\delta_{rc}\end{bmatrix}}_r \\ \end{aligned}

Forming the matrix over all rows r, and columns c, we get a trailing identity multiplying from the right, and are left with

\begin{aligned}{\begin{bmatrix}\frac{\partial {}}{\partial {P_{\theta_c}}}\left( \frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  \right)\end{bmatrix}}_{r,c} & ={\begin{bmatrix}-\mathbf{p}^\text{T} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_r}}\end{bmatrix}}_{r} ={\begin{bmatrix}-\frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_c}} \mathbf{p}\end{bmatrix}}_{c} \end{aligned} \quad\quad\quad(223)

Okay, getting closer. The only thing left is to consider the remaining \theta dependence of 222; we now want the \theta partials of the scalar matrix elements

\begin{aligned}\frac{\partial {}}{\partial {\theta_c}} & \left( \frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  \right) \\ & =\mathbf{p}^\text{T} \left( \frac{\partial {}}{\partial {\theta_c}}\left( \frac{1}{{2}} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \right)\right)\mathbf{p}  \\ & =\mathbf{p}^\text{T} \frac{1}{{2}} \mathcal{I}^{-1} \frac{\partial^2 Q^\text{T}}{\partial \theta_c \partial \theta_r} \mathcal{I}^{-1} \mathbf{p} +\mathbf{p}^\text{T} \frac{1}{{2}} \left( \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_c}} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} +\mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \frac{\partial {(\mathcal{I}^{-1} )}}{\partial {\theta_c}}\right)\mathbf{p} \\ & =\mathbf{p}^\text{T} \frac{1}{{2}} \mathcal{I}^{-1} \frac{\partial^2 Q^\text{T}}{\partial \theta_c \partial \theta_r} \mathcal{I}^{-1} \mathbf{p} +\mathbf{p}^\text{T} \frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_c}} \frac{\partial {\mathcal{I}}}{\partial {\theta_r}} \mathcal{I}^{-1} \mathbf{p} \\ \end{aligned}

There is a slight asymmetry between the first and last terms here that can possibly be eliminated. Using {B^{-1}}' = -B^{-1} B' B^{-1}, we can factor out the \mathcal{I}^{-1}\mathbf{p} = \dot{\boldsymbol{\Theta}} terms

\begin{aligned}\frac{\partial {}}{\partial {\theta_c}}\left( \frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  \right) & =\dot{\boldsymbol{\Theta}}^\text{T} \left(\frac{1}{{2}} \frac{\partial^2 Q^\text{T}}{\partial \theta_c \partial \theta_r} -\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1}\frac{\partial {\mathcal{I}}}{\partial {\theta_r}} \right)\dot{\boldsymbol{\Theta}}\end{aligned}

Is this any better? Maybe a bit. Since we are forming the matrix over all r,c indexes, and can assume mixed partial commutation, the transpose can be dropped, leaving us with

\begin{aligned}{\begin{bmatrix}\frac{\partial {}}{\partial {\theta_c}}\left( \frac{1}{{2}} \mathbf{p}^\text{T} \mathcal{I}^{-1} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \mathcal{I}^{-1} \mathbf{p}  \right) \end{bmatrix}}_{r,c} & ={\begin{bmatrix}\dot{\boldsymbol{\Theta}}^\text{T} \left(\frac{1}{{2}} \frac{\partial^2 Q}{\partial \theta_c \partial \theta_r} -\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1}\frac{\partial {\mathcal{I}}}{\partial {\theta_r}} \right)\dot{\boldsymbol{\Theta}}\end{bmatrix}}_{r,c} \end{aligned} \quad\quad\quad(224)

We can now assemble all these individual derivatives

\begin{aligned}\dot{\mathbf{z}} & \approx {{\begin{bmatrix}{\begin{bmatrix}\frac{1}{{2}} \dot{\boldsymbol{\Theta}} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_r- \nabla \phi \\ \dot{\boldsymbol{\Theta}}\end{bmatrix}}}_{{t=0}} +{{\begin{bmatrix}-{\begin{bmatrix}\frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_c}} \mathbf{p}\end{bmatrix}}_{c} & {\begin{bmatrix}\dot{\boldsymbol{\Theta}}^\text{T} \left(\frac{1}{{2}} \frac{\partial^2 Q}{\partial \theta_c \partial \theta_r} -\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1}\frac{\partial {\mathcal{I}}}{\partial {\theta_r}} \right)\dot{\boldsymbol{\Theta}}-\frac{\partial^2 \phi}{ \partial \theta_r \partial \theta_c }\end{bmatrix}}_{r,c} \\ \\ \mathcal{I}^{-1} & {\begin{bmatrix}\frac{\partial {(\mathcal{I}^{-1})}}{\partial {\theta_c}} \mathbf{p} \\ \end{bmatrix}}_{c}\end{bmatrix} }}_{{t=0}} \mathbf{z}\end{aligned} \quad\quad\quad(225)

We have both {\partial {(\mathcal{I}^{-1})}}/{\partial {\theta_k}} and {\partial {\mathcal{I}}}/{\partial {\theta_k}} derivatives above, which will complicate things when trying to evaluate this for any specific system. A final elimination of the derivatives of the inverse inertial matrix leaves us with

\begin{aligned}\dot{\mathbf{z}} & \approx {{\begin{bmatrix}{\begin{bmatrix}\frac{1}{{2}} \dot{\boldsymbol{\Theta}} \left(\frac{\partial {Q}}{\partial {\theta_r}}\right)^\text{T} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_r- \nabla \phi \\ \dot{\boldsymbol{\Theta}}\end{bmatrix}}}_{{t=0}} +{{\begin{bmatrix}{\begin{bmatrix}\mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_{c} & {\begin{bmatrix}\dot{\boldsymbol{\Theta}}^\text{T} \left(\frac{1}{{2}} \frac{\partial^2 Q}{\partial \theta_c \partial \theta_r} -\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1}\frac{\partial {\mathcal{I}}}{\partial {\theta_r}} \right)\dot{\boldsymbol{\Theta}}-\frac{\partial^2 \phi}{ \partial \theta_r \partial \theta_c }\end{bmatrix}}_{r,c} \\ \\ \mathcal{I}^{-1} & -{\begin{bmatrix}\mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_{c}\end{bmatrix} }}_{{t=0}} \mathbf{z}\end{aligned} \quad\quad\quad(226)

Single pendulum verification.

Having accumulated this unholy mess of abstraction, let's first verify it against the previous result obtained for the single planar pendulum. Then, if that checks out, calculate these matrices explicitly for the double and multiple pendulum cases. For the single mass pendulum we have

\begin{aligned}Q & = \mathcal{I} = m l^2 \\ \phi & = - m g l \cos\theta\end{aligned} \quad\quad\quad(227)

So all the \theta partials except that of the potential are zero. For the potential we have

\begin{aligned}{\left.{{- \frac{\partial^2 \phi}{\partial \theta^2}}}\right\vert}_{{0}} = - m g l \cos\theta_0\end{aligned} \quad\quad\quad(229)

and for the angular gradient

\begin{aligned}{\left.{{-\nabla \phi}}\right\vert}_{{0}} = \begin{bmatrix} - m g l \sin\theta_0\end{bmatrix} \end{aligned} \quad\quad\quad(230)

Putting these all together in this simplest application of 226 we have for the linear approximation of a single point mass pendulum about some point in phase space at time zero:

\begin{aligned}\dot{\mathbf{z}} \approx \begin{bmatrix} - m g l \sin\theta_0 \\ \dot{\theta}_0\end{bmatrix} +\begin{bmatrix} 0 & - m g l \cos\theta_0 \\ \frac{1}{{m l^2}} & 0\end{bmatrix} \mathbf{z}\end{aligned} \quad\quad\quad(231)
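As one more check (my addition, assuming sympy), the eigenvalues of this first order matrix are \pm i \sqrt{g \cos\theta_0/l}, so linearizing about the bottom of the swing (\theta_0 = 0) recovers the familiar harmonic frequency \sqrt{g/l}:

```python
import sympy as sp

m, l, g, theta0 = sp.symbols('m l g theta0', positive=True)
A = sp.Matrix([[0, -m * g * l * sp.cos(theta0)],
               [1 / (m * l**2), 0]])
print(A.eigenvals())  # two purely imaginary eigenvalues, +/- i sqrt(g cos(theta0)/l)
```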

Excellent. I have not gotten into too much trouble with the math so far. This is consistent with the previous results obtained considering the simple pendulum directly (it actually pointed out an error in the earlier pendulum treatment, which is now fixed; I'd dropped the \dot{\theta}_0 term).

Double pendulum explicitly.

For the double pendulum, with \delta = \theta_1 - \theta_2, and M = m_1 + m_2, we have

\begin{aligned}Q = \begin{bmatrix}M {l_1}^2 & m_2 l_2 l_1 e^{i(\theta_2 - \theta_1)} \\ m_2 l_1 l_2 e^{i(\theta_1 - \theta_2)} & m_2 {l_2}^2 \\ \end{bmatrix}=\begin{bmatrix}M {l_1}^2 & m_2 l_2 l_1 e^{-i\delta} \\ m_2 l_1 l_2 e^{i\delta} & m_2 {l_2}^2 \\ \end{bmatrix}\end{aligned} \quad\quad\quad(232)

\begin{aligned}\frac{1}{{2}} \dot{\boldsymbol{\Theta}}^\text{T} \left(\frac{\partial {Q}}{\partial {\theta_1}} \right)^\text{T} \dot{\boldsymbol{\Theta}} & = \frac{1}{{2}} m_2 l_1 l_2 i\dot{\boldsymbol{\Theta}}^\text{T} {\begin{bmatrix}0 & - e^{-i\delta} \\ e^{i\delta} & 0\end{bmatrix}}^\text{T}\dot{\boldsymbol{\Theta}} \\ & = \frac{1}{{2}} m_2 l_1 l_2 i\dot{\boldsymbol{\Theta}}^\text{T} \begin{bmatrix}e^{i\delta} \dot{\theta}_2 \\ -e^{-i\delta} \dot{\theta}_1 \end{bmatrix} \\ & = \frac{1}{{2}} m_2 l_1 l_2 i \dot{\theta}_1 \dot{\theta}_2 (e^{i\delta} -e^{-i\delta}) \\ & = -m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin\delta\end{aligned}

The \theta_2 derivative is the same but inverted in sign, so we have most of the constant term calculated. We need the potential gradient to complete it. Our potential was

\begin{aligned}\phi = - M l_1 g \cos{\theta_1} - m_2 l_2 g \cos{\theta_2}\end{aligned} \quad\quad\quad(233)

So, the gradient is

\begin{aligned}\nabla \phi =\begin{bmatrix}M l_1 g \sin{\theta_1} \\ m_2 l_2 g \sin{\theta_2}\end{bmatrix}\end{aligned} \quad\quad\quad(234)

Putting things back together we have for the linear approximation of the two pendulum system

\begin{aligned}\dot{\mathbf{z}} = {{\begin{bmatrix}m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) \begin{bmatrix}-1 \\ 1\end{bmatrix} - g\begin{bmatrix}M l_1 \sin{\theta_1} \\ m_2 l_2 \sin{\theta_2} \\ \end{bmatrix} \\ \dot{\theta}_1 \\ \dot{\theta}_2 \\ \end{bmatrix}}}_{{t=0}}+ A \mathbf{z}\end{aligned} \quad\quad\quad(235)

Where A is still to be determined (from 226).

One of the blocks of A is the matrix of potential second derivatives. These are

\begin{aligned}\begin{bmatrix}\frac{\partial {\nabla \phi}}{\partial {\theta_1}} & \frac{\partial {\nabla \phi}}{\partial {\theta_2}}\end{bmatrix}=\begin{bmatrix}M l_1 g \cos\theta_1 & 0 \\ 0 & m_2 l_2 g \cos\theta_2\end{bmatrix}\end{aligned} \quad\quad\quad(236)

We also need the inertial matrix and its inverse. These are

\begin{aligned}\mathcal{I}=\begin{bmatrix}M {l_1}^2 & m_2 l_2 l_1 \cos\delta \\ m_2 l_1 l_2 \cos\delta & m_2 {l_2}^2 \\ \end{bmatrix}\end{aligned} \quad\quad\quad(237)

\begin{aligned}\mathcal{I}^{-1}=\frac{1}{{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}}\begin{bmatrix}m_2 {l_2}^2 & -m_2 l_2 l_1 \cos\delta \\ -m_2 l_1 l_2 \cos\delta & M {l_1}^2 \\ \end{bmatrix}\end{aligned} \quad\quad\quad(238)

Since

\begin{aligned}\frac{\partial {Q}}{\partial {\theta_1}} & = m_2 l_1 l_2 i\begin{bmatrix}0 & - e^{-i\delta} \\ e^{i\delta} & 0\end{bmatrix} \end{aligned} \quad\quad\quad(239)

We have

\begin{aligned}\frac{\partial {}}{\partial {\theta_1}} \frac{\partial {Q}}{\partial {\theta_1}} & =-m_2 l_1 l_2 \begin{bmatrix}0 & e^{-i\delta} \\ e^{i\delta} & 0\end{bmatrix} \\ \frac{\partial {}}{\partial {\theta_2}} \frac{\partial {Q}}{\partial {\theta_1}} & =m_2 l_1 l_2 \begin{bmatrix}0 & e^{-i\delta} \\ e^{i\delta} & 0\end{bmatrix} \\ \frac{\partial {}}{\partial {\theta_1}} \frac{\partial {Q}}{\partial {\theta_2}} & =m_2 l_1 l_2 \begin{bmatrix}0 & e^{-i\delta} \\ e^{i\delta} & 0\end{bmatrix}  \\ \frac{\partial {}}{\partial {\theta_2}} \frac{\partial {Q}}{\partial {\theta_2}} & =-m_2 l_1 l_2 \begin{bmatrix}0 & e^{-i\delta} \\ e^{i\delta} & 0\end{bmatrix} \end{aligned} \quad\quad\quad(240)

and the matrix of derivatives becomes

\begin{aligned}\frac{1}{{2}}\dot{\boldsymbol{\Theta}}^\text{T} \frac{\partial {}}{\partial {\theta_c}}\frac{\partial {Q}}{\partial {\theta_r}} \dot{\boldsymbol{\Theta}}=m_2 l_1 l_2 \dot{\theta_1} \dot{\theta_2} \cos(\theta_1 - \theta_2)\begin{bmatrix}-1 & 1 \\ 1 & -1\end{bmatrix} \end{aligned} \quad\quad\quad(244)

For the remaining two types of terms in the matrix A we need \mathcal{I}^{-1} {\partial {\mathcal{I}}}/{\partial {\theta_k}}. The derivative of the inertial matrix is

\begin{aligned}\frac{\partial {\mathcal{I}}}{\partial {\theta_k}}=-m_2 l_1 l_2 (\delta_{k1} - \delta_{k2})\begin{bmatrix}0 & \sin\delta \\ \sin\delta & 0\end{bmatrix}\end{aligned} \quad\quad\quad(245)

Computing the product

\begin{aligned}\mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_k}} & =\frac{ -m_2 l_1 l_2 (\delta_{k1} - \delta_{k2}) }{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\begin{bmatrix}m_2 {l_2}^2 & -m_2 l_2 l_1 \cos\delta \\ -m_2 l_1 l_2 \cos\delta & M {l_1}^2 \\ \end{bmatrix}\begin{bmatrix}0 & \sin\delta \\ \sin\delta & 0\end{bmatrix} \\ & =\frac{ -m_2 l_1 l_2 (\delta_{k1} - \delta_{k2}) \sin\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\begin{bmatrix}-m_2 l_2 l_1 \cos\delta & m_2 {l_2}^2 \\ M {l_1}^2 & -m_2 l_1 l_2 \cos\delta \\ \end{bmatrix}\end{aligned}

We want the matrix of \mathcal{I}^{-1} {\partial {\mathcal{I}}}/{\partial {\theta_c}} \dot{\boldsymbol{\Theta}} over columns c, and this is

\begin{aligned}{\begin{bmatrix}\mathcal{I}^{-1} {\partial {\mathcal{I}}}/{\partial {\theta_c}} \dot{\boldsymbol{\Theta}}\end{bmatrix}}_{c}=\frac{ m_2 l_1 l_2 \sin\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\begin{bmatrix}m_2 l_2 l_1 \cos\delta \dot{\theta}_1 - m_2 {l_2}^2 \dot{\theta}_2 & -m_2 l_2 l_1 \cos\delta \dot{\theta}_1 + m_2 {l_2}^2 \dot{\theta}_2 \\ -M {l_1}^2 \dot{\theta}_1 +m_2 l_1 l_2 \cos\delta \dot{\theta}_2 & M {l_1}^2 \dot{\theta}_1 -m_2 l_1 l_2 \cos\delta \dot{\theta}_2\end{bmatrix}\end{aligned} \quad\quad\quad(246)

Very messy. Perhaps it would be better not to even bother expanding this explicitly? The last term in the matrix A is probably no better. For that we want

\begin{aligned}-\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_r}} & =\frac{ -{m_2}^2 {l_1}^2 {l_2}^2 (\delta_{c1} - \delta_{c2}) (\delta_{r1} - \delta_{r2}) \sin^2\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}\begin{bmatrix}-m_2 l_2 l_1 \cos\delta & m_2 {l_2}^2 \\ M {l_1}^2 & -m_2 l_1 l_2 \cos\delta \\ \end{bmatrix} \\ & =\frac{ -{m_2}^2 {l_1}^2 {l_2}^2 (\delta_{c1} - \delta_{c2}) (\delta_{r1} - \delta_{r2}) \sin^2\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\begin{bmatrix}M {l_1}^2 & -m_2 l_1 l_2 \cos\delta \\ -m_2 l_2 l_1 \cos\delta & m_2 {l_2}^2 \\ \end{bmatrix} \\ \end{aligned}

With a sandwich of this between \dot{\boldsymbol{\Theta}}^\text{T} and \dot{\boldsymbol{\Theta}} we are almost there

\begin{aligned}-\dot{\boldsymbol{\Theta}}^\text{T}\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_r}}\dot{\boldsymbol{\Theta}} & =\frac{ -{m_2}^2 {l_1}^2 {l_2}^2 (\delta_{c1} - \delta_{c2}) (\delta_{r1} - \delta_{r2}) \sin^2\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\left( M {l_1}^2 {\dot{\theta}_1}^2 - 2 m_2 l_1 l_2 \cos\delta \dot{\theta}_1 \dot{\theta}_2 + m_2 {l_2}^2 {\dot{\theta}_2}^2 \right)\end{aligned}

Forming the matrix of these scalars over all r,c, we have

\begin{aligned}{\begin{bmatrix}-\dot{\boldsymbol{\Theta}}^\text{T}\frac{\partial {\mathcal{I}}}{\partial {\theta_c}} \mathcal{I}^{-1} \frac{\partial {\mathcal{I}}}{\partial {\theta_r}}\dot{\boldsymbol{\Theta}}\end{bmatrix}}_{rc} & =\frac{ {m_2}^2 {l_1}^2 {l_2}^2 \sin^2\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\left(M {l_1}^2 {\dot{\theta}_1}^2 - 2 m_2 l_1 l_2 \cos\delta \dot{\theta_1} \dot{\theta}_2 + m_2 {l_2}^2 {\dot{\theta}_2}^2  \right)\begin{bmatrix}-1 & 1 \\  1 & -1\end{bmatrix} \end{aligned} \quad\quad\quad(247)

Putting all the results for the matrix A together is going to make a disgusting mess, so let's summarize in block matrix form

\begin{aligned}A & = {{\begin{bmatrix}B & C \\ \mathcal{I}^{-1} & -B\end{bmatrix}}}_{{t=0}} \\ B & =\frac{ m_2 l_1 l_2 \sin\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\begin{bmatrix}m_2 l_2 l_1 \cos\delta \dot{\theta}_1 - m_2 {l_2}^2 \dot{\theta}_2 & -m_2 l_2 l_1 \cos\delta \dot{\theta}_1 + m_2 {l_2}^2 \dot{\theta}_2 \\ -M {l_1}^2 \dot{\theta}_1 +m_2 l_1 l_2 \cos\delta \dot{\theta}_2 & M {l_1}^2 \dot{\theta}_1 -m_2 l_1 l_2 \cos\delta \dot{\theta}_2\end{bmatrix} \\ C & =\left(m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \cos \delta+\frac{ {m_2}^2 {l_1}^2 {l_2}^2 \sin^2\delta}{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}\left(M {l_1}^2 {\dot{\theta}_1}^2 - 2 m_2 l_1 l_2 \cos\delta \dot{\theta}_1 \dot{\theta}_2 + m_2 {l_2}^2 {\dot{\theta}_2}^2  \right)\right)\begin{bmatrix}-1 & 1 \\ 1 & -1\end{bmatrix} \\ & \qquad-\begin{bmatrix}M l_1 g \cos\theta_1 & 0 \\ 0 & m_2 l_2 g \cos\theta_2\end{bmatrix} \\ \mathcal{I}^{-1} & =\frac{1}{{{l_1}^2 {l_2}^2 m_2 (M - m_2 \cos^2\delta)}}\begin{bmatrix}m_2 {l_2}^2 & -m_2 l_2 l_1 \cos\delta \\ -m_2 l_1 l_2 \cos\delta & M {l_1}^2 \\ \end{bmatrix} \\ \mathbf{b} & =\begin{bmatrix}m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) \begin{bmatrix}-1 \\ 1\end{bmatrix} - g\begin{bmatrix}M l_1 \sin{\theta_1} \\ m_2 l_2 \sin{\theta_2} \\ \end{bmatrix} \\ \dot{\theta}_1 \\ \dot{\theta}_2 \\ \end{bmatrix}\end{aligned}

where these are all related by the first order matrix equation

\begin{aligned}\frac{d\mathbf{z}}{dt} & = {\left.{\mathbf{b}}\right\vert}_{{t=0}} + {\left.{{A}}\right\vert}_{{t=0}} \mathbf{z}\end{aligned} \quad\quad\quad(248)

Wow, even to just write down the equations required to get a linear approximation of the two pendulum system is horrendously messy, and this isn't even trying to solve it. Numerical and/or symbolic computation is really called for here. If one elected to do this numerically, which looks pretty much mandatory since the analytic approach didn't turn out to be simple even for just the two pendulum system, then one is probably better off going all the way back to 215 and just calculating the increment for the trajectory using a very small time increment, doing this repeatedly (i.e. a zeroth order numerical procedure instead of the first order one, which turns out to be much more complicated).
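To make that zeroth order procedure concrete, here is a minimal sketch (my addition, with arbitrarily chosen masses and lengths, assuming numpy) that steps 215 directly for the double pendulum:

```python
import numpy as np

# Arbitrary masses, lengths, and gravity for a concrete double pendulum.
m1, m2, l1, l2, g = 1.0, 1.0, 1.0, 1.0, 9.8
M = m1 + m2

def inertia(delta):
    c = np.cos(delta)
    return np.array([[M * l1**2, m2 * l1 * l2 * c],
                     [m2 * l1 * l2 * c, m2 * l2**2]])

def rates(theta, p):
    """Right hand side of (215): Theta_dot = I^{-1} p, and p_dot from the theta gradient."""
    delta = theta[0] - theta[1]
    theta_dot = np.linalg.solve(inertia(delta), p)
    coupling = m2 * l1 * l2 * theta_dot[0] * theta_dot[1] * np.sin(delta)
    p_dot = np.array([-coupling - M * l1 * g * np.sin(theta[0]),
                      coupling - m2 * l2 * g * np.sin(theta[1])])
    return theta_dot, p_dot

# Zeroth order (forward Euler) stepping with a very small time increment.
theta = np.array([1.0, 0.5])   # initial angles
p = np.zeros(2)                # released from rest
dt = 1.0e-4
for _ in range(int(10.0 / dt)):
    theta_dot, p_dot = rates(theta, p)
    theta = theta + dt * theta_dot
    p = p + dt * p_dot
print(theta, p)
```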
