### Geodesic Spheres and Gram Matrices

#### Posted by Tom Leinster

This is a short weekend diversion, containing nothing profound.

Any sphere in $\mathbb{R}^n$ can be given the geodesic metric, where the distance between two points is defined to be the length of the arc of a great circle between them. It’s not a big surprise that the same is true in any real inner product space $X$: for instance, the unit sphere $S(X)$ centred at the origin can be given a geodesic metric in a natural way. This defines a functor

$S: \mathbf{IPS} \to \mathbf{MS}$

from inner product spaces to metric spaces, where in both cases the maps are not-necessarily-surjective isometries.

What’s more interesting is that it’s not quite as straightforward to prove as you might imagine. One way to do it leads into the topic of Gram matrices, as I’ll explain.

In $\mathbb{R}^n$, we define the geodesic metric $d$ on the unit sphere $S(\mathbb{R}^n)$ by

$d(x, y) = \cos^{-1}(x \cdot y) \in [0, \pi].$

The Cauchy–Schwarz inequality tells us that

$|x \cdot y| \leq \|x\| \|y\| = 1$

(since $x$ and $y$ are unit vectors), so the inverse cosine makes sense.
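In coordinates this distance is easy to compute. Here is a minimal sketch in Python with NumPy (an illustration only, not part of the argument); the dot product is clipped to $[-1, 1]$ to absorb floating-point round-off before applying $\cos^{-1}$:

```python
import numpy as np

def geodesic_distance(x, y):
    """Geodesic distance between unit vectors x and y in R^n:
    arccos of their dot product, valued in [0, pi]."""
    # Clip guards against round-off pushing the dot product outside [-1, 1].
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

print(geodesic_distance(e1, e2))   # orthogonal unit vectors: pi/2
print(geodesic_distance(e1, -e1))  # antipodal points: pi
```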

So in an arbitrary inner product space $X$, we’re obviously going to define the geodesic metric $d$ on the unit sphere $S(X)$ by

$d(x, y) = \cos^{-1} \langle x, y \rangle.$

Again, the Cauchy–Schwarz inequality guarantees that the inverse cosine makes sense, and again we interpret $\cos^{-1}$ as taking values in $[0, \pi]$.

Great! We’ve defined $d$. The next step is to check that it really is a
metric — and here’s where it gets interesting. All the axioms are
easy to check, *except* the triangle inequality:

$\cos^{-1} \langle x, y \rangle + \cos^{-1} \langle y, z \rangle \geq \cos^{-1} \langle x, z \rangle$

for all unit vectors $x$, $y$ and $z$.

How can we prove this?

In short, the first step is to follow your nose and do all the obvious
moves in order to get rid of the trig functions. Here’s how it goes (and
you may now wish to skip to the next paragraph): we’re treating $\cos^{-1}$
as a function $[-1, 1] \to [0, \pi]$, and it’s strictly decreasing there,
so we might as well take the cosine of each side and reverse the
inequality. That gives something of the form $\cos(A + B)$ on the
left-hand side, but we can apply the usual trigonometric formula. *That*
gives something involving sines of inverse cosines, but since $\sin$ is nonnegative on $[0, \pi]$ we can use the fact that $\sin = \sqrt{1 - \cos^2}$ there to get it back in terms of cosines. But now we’ve got some undesired square roots, so we rearrange and square both sides (taking care of the dangers inherent in such a move).
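Written out with the abbreviations $a = \langle x, y \rangle$, $b = \langle y, z \rangle$, $c = \langle x, z \rangle$, and in the only nontrivial case $\cos^{-1} a + \cos^{-1} b \leq \pi$ (otherwise the triangle inequality is automatic, since $\cos^{-1} c \leq \pi$), the chain of moves is:

$\cos^{-1} a + \cos^{-1} b \geq \cos^{-1} c \iff \cos\bigl(\cos^{-1} a + \cos^{-1} b\bigr) \leq c \iff a b - \sqrt{1 - a^2} \sqrt{1 - b^2} \leq c.$

If $a b \leq c$, this last inequality is immediate. If $a b > c$, both sides of the rearranged form $a b - c \leq \sqrt{(1 - a^2)(1 - b^2)}$ are nonnegative, so squaring is harmless and gives $(a b - c)^2 \leq (1 - a^2)(1 - b^2)$, which expands to $1 + 2 a b c - (a^2 + b^2 + c^2) \geq 0$.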

After working through this (and it’s actually less work than I’ve made it sound), we end up concluding that the triangle inequality for $x$, $y$ and $z$ holds if and only if $\langle x, y \rangle \langle y, z \rangle \leq \langle x, z \rangle$ or

$1 + 2 \langle x, y \rangle \langle y, z \rangle \langle z, x \rangle - \bigl\{ \langle x, y \rangle^2 + \langle y, z \rangle^2 + \langle z, x \rangle^2 \bigr\} \geq 0.$

The first inequality in this either-or statement only sometimes holds, but the second inequality always holds, as we’ll show. How? Well, that’s where Gram matrices come in.

We’ll now change direction and forget about geodesic metrics for a while.

Given a finite list $x_1, \ldots, x_n$ of vectors in a real inner product space, there arises an $n \times n$ matrix, the so-called Gram matrix

$G(x_1, \ldots, x_n) = \bigl(\langle x_i, x_j \rangle\bigr)_{i, j = 1}^n.$

A famous type of Gram matrix is the covariance matrix of a family $X_1, \ldots, X_n$ of random variables. Technicalities aside, random variables form an inner product space with covariance as the inner product, and the covariance matrix $(\operatorname{Cov}(X_i, X_j))_{i, j}$ of $X_1, \ldots, X_n$ is by definition their Gram matrix.

The crucial point now is that *the determinant of a Gram matrix is
nonnegative*. There are a couple of ways to prove this, but probably the
best is to prove the superficially stronger statement that every Gram
matrix is positive semidefinite. (“Positive semidefinite” is usually defined only for *symmetric* matrices, which Gram matrices obviously are.) And
this is easy: for all $c = (c_1, \ldots, c_n) \in \mathbb{R}^n$,

$\sum_{i, j} c_i \langle x_i, x_j \rangle c_j = \Bigl\| \sum_i c_i x_i \Bigr\|^2 \geq 0.$

The same argument shows that the Gram matrix of *linearly independent*
vectors $x_1, \ldots, x_n$ is positive *definite*, and in particular has
*strictly* positive determinant. But we won’t need this extra precision.
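As a concrete illustration (a sketch with randomly chosen vectors, not part of the proof): for vectors stored as the columns of a matrix $X$, the Gram matrix is $X^{\mathsf{T}} X$, and the quadratic form $c^{\mathsf{T}} G c$ agrees numerically with $\bigl\| \sum_i c_i x_i \bigr\|^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.normal(size=(5, 3))   # columns x_1, x_2, x_3 in R^5
G = X.T @ X                   # Gram matrix: G[i, j] = <x_i, x_j>

c = rng.normal(size=3)
quadratic_form = c @ G @ c                   # c^T G c
norm_squared = np.linalg.norm(X @ c) ** 2    # ||sum_i c_i x_i||^2

print(np.isclose(quadratic_form, norm_squared))  # True
print(quadratic_form >= 0)                       # True
```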

So: for any finite list $x_1, \ldots, x_n$ of vectors in any inner product space, their Gram matrix has nonnegative determinant. This gives a whole sequence of inequalities (“Gram inequalities”),

$\det G(x_1, \ldots, x_n) \geq 0,$

one for each $n \geq 0$. Let’s go through them for the first few values of $n$.

$n = 0$: the Gram matrix of a list of length $0$ is the unique $0 \times 0$ matrix, which has determinant $1$, so Gram’s inequality for $n = 0$ says $1 \geq 0$. Undoubtedly.

$n = 1$: the Gram matrix of a single vector $x$ is $(\|x\|^2)$, which has determinant $\|x\|^2$, so Gram’s inequality for $n = 1$ says $\|x\|^2 \geq 0$.

$n = 2$: take vectors $x$ and $y$ in a real inner product space. Their Gram matrix is $G(x, y) = \begin{pmatrix} \|x\|^2 & \langle x, y \rangle \\ \langle y, x \rangle & \|y\|^2 \end{pmatrix}.$ Gram’s inequality for $n = 2$ therefore says that $\|x\|^2 \|y\|^2 - \langle x, y \rangle^2 \geq 0.$ This is exactly the Cauchy–Schwarz inequality! Why’s that exciting? First, because it suggests that the Gram inequalities for higher $n$ can be regarded as higher Cauchy–Schwarz inequalities. And second, because given the all-pervasive nature of Cauchy–Schwarz, it’s a strong hint that the higher Gram inequalities are going to be useful too. So we have a machine for generating useful inequalities.
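For instance (a purely numerical sanity check), the $2 \times 2$ Gram determinant of two random vectors is exactly the Cauchy–Schwarz gap $\|x\|^2 \|y\|^2 - \langle x, y \rangle^2$, and comes out nonnegative:

```python
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.normal(size=4), rng.normal(size=4)

G = np.array([[x @ x, x @ y],
              [y @ x, y @ y]])

# det G(x, y) = ||x||^2 ||y||^2 - <x, y>^2  >=  0   (Cauchy-Schwarz)
cs_gap = (x @ x) * (y @ y) - (x @ y) ** 2
print(np.isclose(np.linalg.det(G), cs_gap))  # True
print(cs_gap >= 0)                           # True
```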

$n = 3$: take vectors $x$, $y$ and $z$ in a real inner product space. Their Gram matrix is $G(x, y, z) = \begin{pmatrix} \|x\|^2 & \langle x, y \rangle & \langle x, z \rangle \\ \langle y, x \rangle & \|y\|^2 & \langle y, z \rangle \\ \langle z, x \rangle & \langle z, y \rangle & \|z\|^2 \end{pmatrix}$ and Gram’s inequality for $n = 3$ says that $\det G(x, y, z) \geq 0$. I won’t bother writing that out explicitly for *general* $x$, $y$, $z$, but look what it says when $\|x\| = \|y\| = \|z\| = 1$: $1 + 2 \langle x, y \rangle \langle y, z \rangle \langle z, x \rangle - \bigl\{ \langle x, y \rangle^2 + \langle y, z \rangle^2 + \langle z, x \rangle^2 \bigr\} \geq 0.$ That’s exactly the inequality we wanted to prove earlier — the one that proves the triangle inequality for the geodesic metric!
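One can check this determinant identity for unit vectors directly. The following sketch (illustrative only) verifies that $\det G(x, y, z)$ matches $1 + 2abc - (a^2 + b^2 + c^2)$, writing $a = \langle x, y \rangle$, $b = \langle y, z \rangle$, $c = \langle z, x \rangle$:

```python
import numpy as np

rng = np.random.default_rng(2)

def unit(v):
    return v / np.linalg.norm(v)

x, y, z = (unit(rng.normal(size=5)) for _ in range(3))
a, b, c = x @ y, y @ z, z @ x

G = np.array([[1.0, a, c],
              [a, 1.0, b],
              [c, b, 1.0]])

# For unit vectors, det G(x, y, z) = 1 + 2abc - (a^2 + b^2 + c^2).
formula = 1 + 2 * a * b * c - (a**2 + b**2 + c**2)
print(np.isclose(np.linalg.det(G), formula))  # True
print(formula >= 0)                           # True
```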

So: we’ve now proved that on the unit sphere of any inner product space, the “geodesic metric”

$d(x, y) = \cos^{-1} \langle x, y \rangle$

satisfies the triangle inequality, and is, therefore, genuinely a metric.
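As a closing sanity check (purely illustrative, and no substitute for the proof above), one can confirm the triangle inequality on many random triples of unit vectors:

```python
import numpy as np

rng = np.random.default_rng(3)

def unit(v):
    return v / np.linalg.norm(v)

def d(u, v):
    """Geodesic metric on the unit sphere: arccos of the inner product."""
    return np.arccos(np.clip(u @ v, -1.0, 1.0))

holds = all(
    d(x, y) + d(y, z) >= d(x, z) - 1e-12   # small tolerance for round-off
    for x, y, z in (
        tuple(unit(rng.normal(size=6)) for _ in range(3))
        for _ in range(10_000)
    )
)
print(holds)  # True
```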

## Re: Geodesic Spheres and Gram Matrices

My initial instinct is that phrasing this on $\mathbb{R}^n$ is a bit of a red herring, since if one can solve the problem for the unit sphere in a $3$-dimensional real inner product space one gets it for the unit sphere in all higher dimensions.

That said, at present I don’t see a “nice” way to exploit the reduction to this special case – my analyst’s training is to then WLOG place my vectors $x_1$, $x_2$ and $x_3$ with $x_1$ and $x_2$ in the $z=0$ plane, and have $x_3$ somewhere on the “upper hemisphere”, and then start hacking. So certainly the approach via Gram matrices, which I’ve not seen before, is more appealing!