
November 16, 2019

Geodesic Spheres and Gram Matrices

Posted by Tom Leinster

This is a short weekend diversion, containing nothing profound.

Any sphere in $\mathbb{R}^n$ can be given the geodesic metric, where the distance between two points is defined to be the length of the arc of a great circle between them. It's not a big surprise that the same is true in any real inner product space $X$: for instance, the unit sphere $S(X)$ centred at the origin can be given a geodesic metric in a natural way. This defines a functor

$$ S: \mathbf{IPS} \to \mathbf{MS} $$

from inner product spaces to metric spaces, where in both cases the maps are not-necessarily-surjective isometries.

What’s more interesting is that it’s not quite as straightforward to prove as you might imagine. One way to do it leads into the topic of Gram matrices, as I’ll explain.

In $\mathbb{R}^n$, we define the geodesic metric $d$ on the unit sphere $S(\mathbb{R}^n)$ by

$$ d(x, y) = \cos^{-1}(x \cdot y) \in [0, \pi]. $$

The Cauchy–Schwarz inequality tells us that

$$ |x \cdot y| \leq \|x\| \|y\| = 1 $$

(since $x$ and $y$ are unit vectors), so the inverse cosine makes sense.

So in an arbitrary inner product space $X$, we're obviously going to define the geodesic metric $d$ on the unit sphere $S(X)$ by

$$ d(x, y) = \cos^{-1} \langle x, y \rangle. $$

Again, the Cauchy–Schwarz inequality guarantees that the inverse cosine makes sense, and again we interpret $\cos^{-1}$ as taking values in $[0, \pi]$.
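As a concrete aside, this definition is a one-liner to implement. Here is a minimal Python sketch (the function name and the clamping guard are my own; the clamp only compensates for floating-point drift, since Cauchy–Schwarz already guarantees the inner product lies in $[-1, 1]$):

```python
import math

def geodesic_distance(x, y):
    """Geodesic distance between unit vectors x and y: arccos of their inner product.

    The clamp guards against floating-point rounding pushing <x, y>
    a hair outside [-1, 1]; mathematically it is always inside.
    """
    ip = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, ip)))
```

For example, two orthogonal unit vectors are at distance $\pi/2$, and antipodal points are at distance $\pi$.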

Great! We’ve defined $d$. The next step is to check that it really is a metric — and here’s where it gets interesting. All the axioms are easy to check, except the triangle inequality:

$$ \cos^{-1} \langle x, y \rangle + \cos^{-1} \langle y, z \rangle \geq \cos^{-1} \langle x, z \rangle $$

for all unit vectors $x$, $y$ and $z$.

How can we prove this?

In short, the first step is to follow your nose and do all the obvious moves in order to get rid of the trig functions. Here’s how it goes (and you may now wish to skip to the next paragraph): we’re treating $\cos^{-1}$ as a function $[-1, 1] \to [0, \pi]$, and it’s strictly decreasing there, so we might as well take the cosine of each side and reverse the inequality. That gives something of the form $\cos(A + B)$ on the left-hand side, but we can apply the usual trigonometric formula. That gives something involving sines of inverse cosines, but we can use the fact that $\sin = \sqrt{1 - \cos^2}$ to get it back in terms of cosines. But now we’ve got some undesired square roots, so we rearrange and square both sides (taking care of the dangers inherent in such a move).

After working through this (and it’s actually less work than I’ve made it sound), we end up concluding that the triangle inequality for $x$, $y$ and $z$ holds if and only if $\langle x, y \rangle \langle y, z \rangle \leq \langle x, z \rangle$ or

$$ 1 + 2 \langle x, y \rangle \langle y, z \rangle \langle z, x \rangle - \bigl\{ \langle x, y \rangle^2 + \langle y, z \rangle^2 + \langle z, x \rangle^2 \bigr\} \geq 0. $$

The first inequality in this either-or statement only sometimes holds, but the second inequality always holds, as we’ll show. How? Well, that’s where Gram matrices come in.
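Both claims are easy to sanity-check numerically before proving anything: for random unit vectors in $\mathbb{R}^3$, the triangle inequality holds and the second inequality is always satisfied. A Python sketch (helper names are mine):

```python
import math
import random

def unit(rng):
    """Random unit vector in R^3: a normalized Gaussian sample."""
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def ip(u, v):
    """Standard inner product, clamped to [-1, 1] against rounding drift."""
    return max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))

rng = random.Random(0)
for _ in range(1000):
    x, y, z = unit(rng), unit(rng), unit(rng)
    a, b, c = ip(x, y), ip(y, z), ip(z, x)
    # triangle inequality: d(x, z) <= d(x, y) + d(y, z)
    assert math.acos(c) <= math.acos(a) + math.acos(b) + 1e-9
    # the always-true second inequality of the either-or statement
    assert 1 + 2*a*b*c - (a*a + b*b + c*c) >= -1e-9
```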

We’ll now change direction and forget about geodesic metrics for a while.

Given a finite list $x_1, \ldots, x_n$ of vectors in a real inner product space, there arises an $n \times n$ matrix, the so-called Gram matrix

$$ G(x_1, \ldots, x_n) = \bigl( \langle x_i, x_j \rangle \bigr)_{i, j = 1}^n. $$

A famous type of Gram matrix is the covariance matrix of a family $X_1, \ldots, X_n$ of random variables. Technicalities aside, random variables form an inner product space with covariance as the inner product, and the covariance matrix $(\mathrm{Cov}(X_i, X_j))_{i, j}$ of $X_1, \ldots, X_n$ is by definition their Gram matrix.

The crucial point now is that the determinant of a Gram matrix is nonnegative. There are a couple of ways to prove this, but probably the best is to prove the superficially stronger statement that every Gram matrix is positive semidefinite. (“Positive semidefinite” is usually only defined for symmetric matrices, which Gram matrices obviously are.) And this is easy: for all $c = (c_1, \ldots, c_n) \in \mathbb{R}^n$,

$$ \sum_{i, j} c_i \langle x_i, x_j \rangle c_j = \Bigl\| \sum_i c_i x_i \Bigr\|^2 \geq 0. $$
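The identity behind this one-line proof is easy to check numerically: the quadratic form $c^T G c$ agrees with $\|\sum_i c_i x_i\|^2$ for random data. A Python sketch (helper names are mine):

```python
import random

def gram(vectors):
    """Gram matrix of a list of vectors: entries <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

rng = random.Random(0)
xs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]  # three vectors in R^4
c = [rng.gauss(0, 1) for _ in range(3)]
G = gram(xs)

# quadratic form c^T G c
quad = sum(c[i] * G[i][j] * c[j] for i in range(3) for j in range(3))

# squared norm of the linear combination sum_i c_i x_i
combo = [sum(c[i] * xs[i][k] for i in range(3)) for k in range(4)]
norm_sq = sum(t * t for t in combo)

assert abs(quad - norm_sq) < 1e-9   # the two sides of the identity agree
assert quad >= 0                    # so the quadratic form is nonnegative
```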

The same argument shows that the Gram matrix of linearly independent vectors $x_1, \ldots, x_n$ is positive definite, and in particular has strictly positive determinant. But we won’t need this extra precision.

So: for any finite list $x_1, \ldots, x_n$ of vectors in any inner product space, their Gram matrix has nonnegative determinant. This gives a whole sequence of inequalities (“Gram inequalities”),

$$ \det G(x_1, \ldots, x_n) \geq 0, $$

one for each $n \geq 0$. Let’s go through them for the first few values of $n$.

  • $n = 0$: the Gram matrix of a list of length $0$ is the unique $0 \times 0$ matrix, which has determinant $1$, so Gram’s inequality for $n = 0$ says $1 \geq 0$. Undoubtedly.

  • $n = 1$: the Gram matrix of a single vector $x$ is $(\|x\|^2)$, which has determinant $\|x\|^2$, so Gram’s inequality for $n = 1$ says $\|x\|^2 \geq 0$.

  • $n = 2$: take vectors $x$ and $y$ in a real inner product space. Their Gram matrix is $$ G(x, y) = \begin{pmatrix} \|x\|^2 & \langle x, y \rangle \\ \langle y, x \rangle & \|y\|^2 \end{pmatrix}. $$ Gram’s inequality for $n = 2$ therefore says that $$ \|x\|^2 \|y\|^2 - \langle x, y \rangle^2 \geq 0. $$ This is exactly the Cauchy–Schwarz inequality! Why’s that exciting? First, because it suggests that the Gram inequalities for higher $n$ can be regarded as higher Cauchy–Schwarz inequalities. And second, because given the all-pervasive nature of Cauchy–Schwarz, it’s a strong hint that the higher Gram inequalities are going to be useful too. So we have a machine for generating useful inequalities.

  • $n = 3$: take vectors $x$, $y$ and $z$ in a real inner product space. Their Gram matrix is $$ G(x, y, z) = \begin{pmatrix} \|x\|^2 & \langle x, y \rangle & \langle x, z \rangle \\ \langle y, x \rangle & \|y\|^2 & \langle y, z \rangle \\ \langle z, x \rangle & \langle z, y \rangle & \|z\|^2 \end{pmatrix} $$ and Gram’s inequality for $n = 3$ says that $\det G(x, y, z) \geq 0$. I won’t bother writing that out explicitly for general $x, y, z$, but look what it says when $\|x\| = \|y\| = \|z\| = 1$: $$ 1 + 2 \langle x, y \rangle \langle y, z \rangle \langle z, x \rangle - \bigl\{ \langle x, y \rangle^2 + \langle y, z \rangle^2 + \langle z, x \rangle^2 \bigr\} \geq 0. $$ That’s exactly the inequality we wanted to prove earlier — the one that proves the triangle inequality for the geodesic metric!
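For the skeptical, the $n = 3$ case can be confirmed numerically: expanding $\det G(x, y, z)$ for unit vectors really does give the displayed expression, and it really is nonnegative. A Python sketch (helper names are mine):

```python
import math
import random

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def unit(rng):
    """Random unit vector in R^3."""
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def ip(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(0)
for _ in range(1000):
    x, y, z = unit(rng), unit(rng), unit(rng)
    a, b, c = ip(x, y), ip(y, z), ip(z, x)
    # Gram matrix of three unit vectors: diagonal entries are 1
    G = [[1, a, c], [a, 1, b], [c, b, 1]]
    expanded = 1 + 2*a*b*c - (a*a + b*b + c*c)
    assert abs(det3(G) - expanded) < 1e-9   # det G matches the expression
    assert det3(G) >= -1e-9                 # Gram's inequality for n = 3
```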

So: we’ve now proved that on the unit sphere of any inner product space, the “geodesic metric”

$$ d(x, y) = \cos^{-1} \langle x, y \rangle $$

satisfies the triangle inequality, and is therefore genuinely a metric.

Posted at November 16, 2019 11:12 PM UTC


18 Comments & 0 Trackbacks

Re: Geodesic Spheres and Gram Matrices

My initial instinct is that phrasing this on $\mathbb{R}^n$ is a bit of a red herring, since if one can solve the problem for the unit sphere in a $3$-dimensional real inner product space one gets it for the unit sphere in all higher dimensions.

That said, at present I don’t see a “nice” way to exploit the reduction to this special case – my analyst’s training is to then WLOG place my vectors $x_1$, $x_2$ and $x_3$ with $x_1$ and $x_2$ in the $z = 0$ plane, and have $x_3$ somewhere on the “upper hemisphere”, and then start hacking. So certainly the approach via Gram matrices, which I’ve not seen before, is more appealing!

Posted by: Yemon Choi on November 17, 2019 3:43 PM

Re: Geodesic Spheres and Gram Matrices

Ah yes, that’s a very good point!

Oddly enough, I had actually thought of the reduction to $\mathbb{R}^n$, but it had only occurred to me to use this reduction in the part of the argument concerning Gram matrices. That is: suppose we want to prove that $n \times n$ Gram matrices have nonnegative determinants. I gave one argument in the post. Another one runs as follows.

Let $x_1, \ldots, x_n$ be vectors in some inner product space; we want to prove that

$$ \det G(x_1, \ldots, x_n) \geq 0. $$

The subspace spanned by $x_1, \ldots, x_n$ has dimension $d \leq n$, and is therefore isometrically isomorphic to $\mathbb{R}^d$ (with its standard inner product), and therefore embeds isometrically into $\mathbb{R}^n$. So, it’s enough to prove the inequality in the case where $x_1, \ldots, x_n \in \mathbb{R}^n$. Write $x_i = (x_{i 1}, \ldots, x_{i n})$ and let $M$ be the $n \times n$ matrix $(x_{i j})$. Then

$$ G(x_1, \ldots, x_n) = M^T M $$

— or is it $M M^T$? I’m feeling too lazy to work it out, but in any case, it follows that

$$ \det G(x_1, \ldots, x_n) = (\det M)^2 \geq 0. $$

And there we are. And all we need for the triangle inequality is the case $n = 3$.
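For anyone equally lazy: with the $x_i$ taken as the rows of $M$ the product is $M M^T$ (with the column convention it is $M^T M$), and either way the determinant identity holds, since $\det M^T = \det M$. A quick numerical check in Python (helper names are mine):

```python
import random

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

rng = random.Random(0)
# rows of M are the vectors x_1, x_2, x_3 in R^3
M = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(3)]

# Gram matrix: entries <x_i, x_j>
G = [[sum(a * b for a, b in zip(M[i], M[j])) for j in range(3)] for i in range(3)]

# with row vectors, G coincides with M M^T
Mt = [list(col) for col in zip(*M)]
MMt = matmul(M, Mt)
assert all(abs(G[i][j] - MMt[i][j]) < 1e-9 for i in range(3) for j in range(3))

# det G = (det M)^2 >= 0
assert abs(det3(G) - det3(M) ** 2) < 1e-9
assert det3(G) >= -1e-9
```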

Posted by: Tom Leinster on November 17, 2019 7:24 PM

Re: Geodesic Spheres and Gram Matrices

I guess this has a kind of converse? Suppose $(X, d)$ is a metric space with $d(x, y) \in [0, \pi]$, and define $f(x, y) := \cos d(x, y)$. If $f$ has the property that $\bigl( f(x_i, x_j) \bigr)$ is positive semi-definite for any finite sequence $x_1, \dots, x_n \in X$, then I expect that $(X, d) \approx (X', d')$, where $X'$ is a subset of a unit sphere in some real inner product space, and $d'$ is the restriction of the geodesic metric to it.

Posted by: Charles Rezk on November 17, 2019 9:18 PM

Re: Geodesic Spheres and Gram Matrices

Interesting… why do you expect that to be the case?

We know lots about metric spaces such that the matrix $(e^{-d(x_i, x_j)})_{i, j}$ is positive (semi)definite for all points $x_1, \ldots, x_n$ in it. But I don’t know what’s known about metric spaces such that $(\cos(d(x_i, x_j)))_{i, j}$ is positive (semi)definite for all $x_1, \ldots, x_n$.

Posted by: Tom Leinster on November 19, 2019 8:50 PM

Re: Geodesic Spheres and Gram Matrices

I think this: given $(X, d)$ as I described, you can construct a universal map $\rho \colon X \to V$ where $V$ is an inner product space, such that $\langle \rho(x), \rho(y) \rangle = f(x, y)$.

In fact, let $V$ be the vector subspace of the set of all functions $X \to \mathbb{R}$ which is spanned by the functions $$ \rho(x) \colon X \to \mathbb{R}, \qquad \rho(x)(y) := f(y, x). $$ Then the function $f$ naturally extends to a bilinear map $\langle -, - \rangle \colon V \times V \to \mathbb{R}$, and the Gram matrix condition implies that $\langle -, - \rangle$ is positive semi-definite. I don’t know if it’s automatically positive definite, but if not I guess you can replace $V$ with $V / V_0$, where $V_0 = \{ v \in V \mid \langle v, v' \rangle = 0 \text{ for all } v' \in V \}$ is the subspace of elements with trivial pairing.

Given this, it’s clear that $\langle \rho(x), \rho(y) \rangle = 1$ iff $f(x, y) = 1$ iff $x = y$, so $\rho \colon X \to V$ is a monomorphism into the unit sphere.

I believe this is more-or-less the same as what functional analysts call the Moore–Aronszajn theorem, though I learned about it from this paper. The function $\rho$ is kind of a linear algebra analogue of the Yoneda embedding.
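For a finite $X$ this construction can be carried out concretely: when the matrix $(f(x_i, x_j))$ is positive definite, a Cholesky factorization $f = L L^T$ exhibits the points as unit vectors whose pairwise inner products are exactly $f(x_i, x_j)$ — an isometric copy of $(X, d)$ on a unit sphere. A Python sketch under that positive-definiteness assumption (helper names are mine):

```python
import math

def cholesky(A):
    """Cholesky factorization A = L L^T of a positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

# three unit vectors in R^3, their geodesic distances, and f = cos d
pts = [(1, 0, 0), (0, 1, 0), (1/math.sqrt(2), 0, 1/math.sqrt(2))]
d = [[math.acos(max(-1.0, min(1.0, sum(a*b for a, b in zip(u, v)))))
      for v in pts] for u in pts]
f = [[math.cos(d[i][j]) for j in range(3)] for i in range(3)]

# rows of L are the reconstructed unit vectors rho(x_i):
# their Gram matrix reproduces f, hence the metric
L = cholesky(f)
for i in range(3):
    for j in range(3):
        recovered = sum(L[i][k] * L[j][k] for k in range(3))
        assert abs(recovered - f[i][j]) < 1e-9
```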

Posted by: Charles Rezk on November 20, 2019 5:46 PM

Re: Geodesic Spheres and Gram Matrices

This is very much in the orbit of the ‘support vector machine’ approach to machine learning. Back in my machine learning phase I posted about such matters: Kernels in Machine Learning I and Kernels in Machine Learning II.

Posted by: David Corfield on November 22, 2019 9:36 AM

Re: Geodesic Spheres and Gram Matrices

Cool! Thanks for that.

The function $\rho$ is kind of a linear algebra analogue of the Yoneda embedding.

It’s impossible for category theorists to read about reproducing kernel Hilbert spaces without feeling that God is teasing them.

Posted by: Tom Leinster on November 20, 2019 10:01 PM

Re: Geodesic Spheres and Gram Matrices

It’s impossible for category theorists to read about reproducing kernel Hilbert spaces without feeling that God is teasing them.

I, on the other hand, feel like Charles’s comment gets me closer than I have before to getting the point of the Yoneda embedding.

(And yes, I agree this is basically the Moore–Aronszajn theorem; I missed the point earlier because I got distracted by the cosines.)

Posted by: Mark Meckes on November 21, 2019 4:56 PM

Re: Geodesic Spheres and Gram Matrices

I’m not sure about Charles’s suggestion, but it’s worth noting that there is a closely related exact characterization of metric spaces that embed in Euclidean space (i.e., the converse is true in that setting). Here’s the theorem:

Given a finite metric space $X$ with points $x_0, x_1, \ldots, x_n$, form the $(n+1) \times (n+1)$ matrix $M$ with entries $m_{i j} = d(x_0, x_i)^2 + d(x_0, x_j)^2 - d(x_i, x_j)^2$. Then $X$ embeds isometrically into a Euclidean space if and only if $M$ is positive semidefinite.

In this context it’s interesting to note that if $x_1, \ldots, x_n$ lie in the unit sphere, equipped with the Euclidean rather than geodesic metric, and we set $x_0 = 0$, then $m_{i j} = 2 \langle x_i, x_j \rangle$.
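That identity is a short computation — for unit vectors, $d(x_0, x_i) = \|x_i\| = 1$ and $d(x_i, x_j)^2 = \|x_i - x_j\|^2 = 2 - 2 \langle x_i, x_j \rangle$, so $m_{i j} = 1 + 1 - (2 - 2\langle x_i, x_j \rangle) = 2 \langle x_i, x_j \rangle$ — and easy to confirm numerically. A Python sketch (helper names are mine):

```python
import math
import random

def euclid(u, v):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
pts = []
for _ in range(4):
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    pts.append([c / n for c in v])   # random unit vectors
origin = [0.0, 0.0, 0.0]             # the extra point x_0 = 0

for u in pts:
    for v in pts:
        m = euclid(origin, u)**2 + euclid(origin, v)**2 - euclid(u, v)**2
        twice_ip = 2 * sum(a * b for a, b in zip(u, v))
        assert abs(m - twice_ip) < 1e-9   # m_ij = 2 <x_i, x_j>
```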

Anyway, I think the theorem I quoted goes back to Schoenberg, and I bet the answer to Charles’s question is somewhere in Schoenberg’s papers.

Posted by: Mark Meckes on November 20, 2019 4:37 PM

Re: Geodesic Spheres and Gram Matrices

Thank you for this interesting post! Silly typo: At one point, what is obviously supposed to be $\lvert \sum_i c_i x_i \rvert$ inadvertently became ${}_{\lvert\rvert} \sum_i c_i x_i$.

Posted by: L Spice on November 18, 2019 4:33 PM

Re: Geodesic Spheres and Gram Matrices

Thanks! Most of my posts do contain typos, but I think in this particular case I typed what I meant to: each $x_i$ is a vector, so it really is a norm rather than an absolute value.

Posted by: Tom Leinster on November 18, 2019 6:57 PM

Re: Geodesic Spheres and Gram Matrices

Sorry to fuss (and hopefully these posts can be deleted once it’s been fixed), but my objection wasn’t to $\lVert x \rVert$ versus $\lvert x \rvert$, but rather to ${}_{\lvert\rvert} x$ (which is what’s there: either two absolute-value bars or one norm bar, on the left, as a subscript) versus $\lVert x \rVert$. (Where this appears in the post, $x$ is $\sum_i c_i x_i$.)

Posted by: L Spice on November 18, 2019 11:49 PM

Re: Geodesic Spheres and Gram Matrices

I don’t see it either. Could be a browser issue. I know that at the nLab, which uses MathML as does this site, it is said

Presently only Firefox and its derivatives have implemented native rendering of MathML. Presently all other browsers fall back to invoking MathJax. This works fine on small pages, but on pages with substantial content the MathJax rendering takes up to several minutes.

This means that presently you should use Firefox or its derivatives to view the nLab.

Posted by: Todd Trimble on November 19, 2019 1:30 PM

Re: Geodesic Spheres and Gram Matrices

No need to apologize; I appreciate your help! But either there’s some software issue that means you’re seeing things differently to me, or there’s a wetware issue between my ears. I can’t see a double-bar subscript typo anywhere. What I do see is a line like this:

displayed formula

(That’s a graphics file taken from a screenshot.) Does it display differently for you, or am I still not getting the point?

Posted by: Tom Leinster on November 19, 2019 3:10 AM

Re: Geodesic Spheres and Gram Matrices

It does render differently for me on Safari, roughly as I described it; I tried including a picture of what I see, but no luck. (How do you do it?) I get a different, although still wrong, rendering on Mobile Safari. It is thus clearly a browser issue, and, as Todd suggested upthread, viewing it in Firefox shows what is surely the expected result. I apologise for the de-rail; please feel free to delete this whole thread.

Posted by: L Spice on November 20, 2019 8:03 PM

Re: Geodesic Spheres and Gram Matrices

Nice! You took us up to the third Gram inequality and showed why they’re all interesting up to that point. Does anyone know anything interesting about the fourth one?

Posted by: John Baez on November 18, 2019 10:56 PM

Re: Geodesic Spheres and Gram Matrices

> Nice! You took us up to the third Gram inequality and showed why they’re all interesting up to that point. Does anyone know anything interesting about the fourth one?

Interestingly, Yemon Choi’s comment and its descendants suggest the opposite: that it is only really the 3rd Gram inequality that is interesting.

(This reminds me, probably spuriously but who knows, of my own research in character formulæ, in which somehow the terms $\operatorname{ad}(X)Y$ and $\operatorname{ad}(X)^2 Y$ in the Campbell–Baker–Hausdorff formula are important, with the $\operatorname{ad}(X)^2 Y$ terms giving rise to a quadratic form, but $\operatorname{ad}(X)^3 Y$ manages somehow apparently not to play a role.)

Posted by: L Spice on November 18, 2019 11:54 PM

Craig

Speaking of “higher Cauchy–Schwarz inequalities”, what happens if you look at $\left\lVert \sum_i c_i x_i \right\rVert^4 \geq 0$? You wind up with a 4-dimensional tensor whose entries look like $\sum_a x_i^a x_j^a x_k^a x_m^a$; does that tensor have an interesting invariant which is not immediately derived from the Gram determinant?

Posted by: Craig on November 19, 2019 3:39 PM
