
November 16, 2019

Geodesic Spheres and Gram Matrices

Posted by Tom Leinster

This is a short weekend diversion, containing nothing profound.

Any sphere in $\mathbb{R}^n$ can be given the geodesic metric, where the distance between two points is defined to be the length of the arc of a great circle between them. It's not a big surprise that the same is true in any real inner product space $X$: for instance, the unit sphere $S(X)$ centred at the origin can be given a geodesic metric in a natural way. This defines a functor

$$ S: \mathbf{IPS} \to \mathbf{MS} $$

from inner product spaces to metric spaces, where in both cases the maps are not-necessarily-surjective isometries.

What’s more interesting is that it’s not quite as straightforward to prove as you might imagine. One way to do it leads into the topic of Gram matrices, as I’ll explain.

In $\mathbb{R}^n$, we define the geodesic metric $d$ on the unit sphere $S(\mathbb{R}^n)$ by

$$ d(x, y) = \cos^{-1}(x \cdot y) \in [0, \pi]. $$

The Cauchy–Schwarz inequality tells us that

$$ |x \cdot y| \leq \|x\| \|y\| = 1 $$

(since $x$ and $y$ are unit vectors), so the inverse cosine makes sense.

So in an arbitrary inner product space $X$, we're obviously going to define the geodesic metric $d$ on the unit sphere $S(X)$ by

$$ d(x, y) = \cos^{-1} \langle x, y \rangle. $$

Again, the Cauchy–Schwarz inequality guarantees that the inverse cosine makes sense, and again we interpret $\cos^{-1}$ as taking values in $[0, \pi]$.
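As a concrete aside, this definition is a one-liner to implement. Here is a minimal Python sketch (the function name and the clamping guard are my own; the clamp only compensates for floating-point drift, since Cauchy–Schwarz already guarantees the inner product lies in $[-1, 1]$):

```python
import math

def geodesic_distance(x, y):
    """Geodesic distance between unit vectors x and y: arccos of their inner product.

    The clamp guards against floating-point rounding pushing <x, y>
    a hair outside [-1, 1]; mathematically it is always inside.
    """
    ip = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, ip)))
```

For example, two orthogonal unit vectors are at distance $\pi/2$, and antipodal points are at distance $\pi$.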

Great! We’ve defined $d$. The next step is to check that it really is a metric — and here’s where it gets interesting. All the axioms are easy to check, except the triangle inequality:

$$ \cos^{-1} \langle x, y \rangle + \cos^{-1} \langle y, z \rangle \geq \cos^{-1} \langle x, z \rangle $$

for all unit vectors $x$, $y$ and $z$.

How can we prove this?

In short, the first step is to follow your nose and do all the obvious moves in order to get rid of the trig functions. Here’s how it goes (and you may now wish to skip to the next paragraph): we’re treating $\cos^{-1}$ as a function $[-1, 1] \to [0, \pi]$, and it’s strictly decreasing there, so we might as well take the cosine of each side and reverse the inequality. That gives something of the form $\cos(A + B)$ on the left-hand side, but we can apply the usual trigonometric formula. That gives something involving sines of inverse cosines, but we can use the fact that $\sin = \sqrt{1 - \cos^2}$ to get it back in terms of cosines. But now we’ve got some undesired square roots, so we rearrange and square both sides (taking care of the dangers inherent in such a move).

After working through this (and it’s actually less work than I’ve made it sound), we end up concluding that the triangle inequality for $x$, $y$ and $z$ holds if and only if $\langle x, y \rangle \langle y, z \rangle \leq \langle x, z \rangle$ or

$$ 1 + 2 \langle x, y \rangle \langle y, z \rangle \langle z, x \rangle - \bigl\{ \langle x, y \rangle^2 + \langle y, z \rangle^2 + \langle z, x \rangle^2 \bigr\} \geq 0. $$

The first inequality in this either-or statement only sometimes holds, but the second inequality always holds, as we’ll show. How? Well, that’s where Gram matrices come in.
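Both claims are easy to sanity-check numerically before proving anything: for random unit vectors in $\mathbb{R}^3$, the triangle inequality holds and the second inequality is always satisfied. A Python sketch (helper names are mine):

```python
import math
import random

def unit(rng):
    """Random unit vector in R^3: a normalized Gaussian sample."""
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def ip(u, v):
    """Standard inner product, clamped to [-1, 1] against rounding drift."""
    return max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))

rng = random.Random(0)
for _ in range(1000):
    x, y, z = unit(rng), unit(rng), unit(rng)
    a, b, c = ip(x, y), ip(y, z), ip(z, x)
    # triangle inequality: d(x, z) <= d(x, y) + d(y, z)
    assert math.acos(c) <= math.acos(a) + math.acos(b) + 1e-9
    # the always-true second inequality of the either-or statement
    assert 1 + 2*a*b*c - (a*a + b*b + c*c) >= -1e-9
```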

We’ll now change direction and forget about geodesic metrics for a while.

Given a finite list $x_1, \ldots, x_n$ of vectors in a real inner product space, there arises an $n \times n$ matrix, the so-called Gram matrix

$$ G(x_1, \ldots, x_n) = \bigl( \langle x_i, x_j \rangle \bigr)_{i, j = 1}^n. $$

A famous type of Gram matrix is the covariance matrix of a family $X_1, \ldots, X_n$ of random variables. Technicalities aside, random variables form an inner product space with covariance as the inner product, and the covariance matrix $(\mathrm{Cov}(X_i, X_j))_{i, j}$ of $X_1, \ldots, X_n$ is by definition their Gram matrix.

The crucial point now is that the determinant of a Gram matrix is nonnegative. There are a couple of ways to prove this, but probably the best is to prove the superficially stronger statement that every Gram matrix is positive semidefinite. (“Positive semidefinite” is usually only defined for symmetric matrices, which Gram matrices obviously are.) And this is easy: for all $c = (c_1, \ldots, c_n) \in \mathbb{R}^n$,

$$ \sum_{i, j} c_i \langle x_i, x_j \rangle c_j = \Bigl\| \sum_i c_i x_i \Bigr\|^2 \geq 0. $$
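The identity behind this one-line proof is easy to check numerically: the quadratic form $c^T G c$ agrees with $\|\sum_i c_i x_i\|^2$ for random data. A Python sketch (helper names are mine):

```python
import random

def gram(vectors):
    """Gram matrix of a list of vectors: entries <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

rng = random.Random(0)
xs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]  # three vectors in R^4
c = [rng.gauss(0, 1) for _ in range(3)]
G = gram(xs)

# quadratic form c^T G c
quad = sum(c[i] * G[i][j] * c[j] for i in range(3) for j in range(3))

# squared norm of the linear combination sum_i c_i x_i
combo = [sum(c[i] * xs[i][k] for i in range(3)) for k in range(4)]
norm_sq = sum(t * t for t in combo)

assert abs(quad - norm_sq) < 1e-9   # the two sides of the identity agree
assert quad >= 0                    # so the quadratic form is nonnegative
```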

The same argument shows that the Gram matrix of linearly independent vectors $x_1, \ldots, x_n$ is positive definite, and in particular has strictly positive determinant. But we won’t need this extra precision.

So: for any finite list $x_1, \ldots, x_n$ of vectors in any inner product space, their Gram matrix has nonnegative determinant. This gives a whole sequence of inequalities (“Gram inequalities”),

$$ \det G(x_1, \ldots, x_n) \geq 0, $$

one for each $n \geq 0$. Let’s go through them for the first few values of $n$.

  • $n = 0$: the Gram matrix of a list of length $0$ is the unique $0 \times 0$ matrix, which has determinant $1$, so Gram’s inequality for $n = 0$ says $1 \geq 0$. Undoubtedly.

  • $n = 1$: the Gram matrix of a single vector $x$ is $(\|x\|^2)$, which has determinant $\|x\|^2$, so Gram’s inequality for $n = 1$ says $\|x\|^2 \geq 0$.

  • $n = 2$: take vectors $x$ and $y$ in a real inner product space. Their Gram matrix is $$ G(x, y) = \begin{pmatrix} \|x\|^2 & \langle x, y \rangle \\ \langle y, x \rangle & \|y\|^2 \end{pmatrix}. $$ Gram’s inequality for $n = 2$ therefore says that $$ \|x\|^2 \|y\|^2 - \langle x, y \rangle^2 \geq 0. $$ This is exactly the Cauchy–Schwarz inequality! Why’s that exciting? First, because it suggests that the Gram inequalities for higher $n$ can be regarded as higher Cauchy–Schwarz inequalities. And second, because given the all-pervasive nature of Cauchy–Schwarz, it’s a strong hint that the higher Gram inequalities are going to be useful too. So we have a machine for generating useful inequalities.

  • $n = 3$: take vectors $x$, $y$ and $z$ in a real inner product space. Their Gram matrix is $$ G(x, y, z) = \begin{pmatrix} \|x\|^2 & \langle x, y \rangle & \langle x, z \rangle \\ \langle y, x \rangle & \|y\|^2 & \langle y, z \rangle \\ \langle z, x \rangle & \langle z, y \rangle & \|z\|^2 \end{pmatrix} $$ and Gram’s inequality for $n = 3$ says that $\det G(x, y, z) \geq 0$. I won’t bother writing that out explicitly for general $x, y, z$, but look what it says when $\|x\| = \|y\| = \|z\| = 1$: $$ 1 + 2 \langle x, y \rangle \langle y, z \rangle \langle z, x \rangle - \bigl\{ \langle x, y \rangle^2 + \langle y, z \rangle^2 + \langle z, x \rangle^2 \bigr\} \geq 0. $$ That’s exactly the inequality we wanted to prove earlier — the one that proves the triangle inequality for the geodesic metric!
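For the skeptical, the $n = 3$ case can be confirmed numerically: expanding $\det G(x, y, z)$ for unit vectors really does give the displayed expression, and it really is nonnegative. A Python sketch (helper names are mine):

```python
import math
import random

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def unit(rng):
    """Random unit vector in R^3."""
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def ip(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(0)
for _ in range(1000):
    x, y, z = unit(rng), unit(rng), unit(rng)
    a, b, c = ip(x, y), ip(y, z), ip(z, x)
    # Gram matrix of three unit vectors: diagonal entries are 1
    G = [[1, a, c], [a, 1, b], [c, b, 1]]
    expanded = 1 + 2*a*b*c - (a*a + b*b + c*c)
    assert abs(det3(G) - expanded) < 1e-9   # det G matches the expression
    assert det3(G) >= -1e-9                 # Gram's inequality for n = 3
```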

So: we’ve now proved that on the unit sphere of any inner product space, the “geodesic metric”

$$ d(x, y) = \cos^{-1} \langle x, y \rangle $$

satisfies the triangle inequality, and is therefore genuinely a metric.

Posted at November 16, 2019 11:12 PM UTC


18 Comments & 0 Trackbacks

Re: Geodesic Spheres and Gram Matrices

My initial instinct is that phrasing this on $\mathbb{R}^n$ is a bit of a red herring, since if one can solve the problem for the unit sphere in a $3$-dimensional real inner product space one gets it for the unit sphere in all higher dimensions.

That said, at present I don’t see a “nice” way to exploit the reduction to this special case – my analyst’s training is to then WLOG place my vectors $x_1$, $x_2$ and $x_3$ with $x_1$ and $x_2$ in the $z = 0$ plane, and have $x_3$ somewhere on the “upper hemisphere”, and then start hacking. So certainly the approach via Gram matrices, which I’ve not seen before, is more appealing!

Posted by: Yemon Choi on November 17, 2019 3:43 PM

Re: Geodesic Spheres and Gram Matrices

Ah yes, that’s a very good point!

Oddly enough, I had actually thought of the reduction to $\mathbb{R}^n$, but it had only occurred to me to use this reduction in the part of the argument concerning Gram matrices. That is: suppose we want to prove that $n \times n$ Gram matrices have nonnegative determinants. I gave one argument in the post. Another one runs as follows.

Let $x_1, \ldots, x_n$ be vectors in some inner product space; we want to prove that

$$ \det G(x_1, \ldots, x_n) \geq 0. $$

The subspace spanned by $x_1, \ldots, x_n$ has dimension $d \leq n$, and is therefore isometrically isomorphic to $\mathbb{R}^d$ (with its standard inner product), and therefore embeds isometrically into $\mathbb{R}^n$. So, it’s enough to prove the inequality in the case where $x_1, \ldots, x_n \in \mathbb{R}^n$. Write $x_i = (x_{i 1}, \ldots, x_{i n})$ and let $M$ be the $n \times n$ matrix $(x_{i j})$. Then

$$ G(x_1, \ldots, x_n) = M^T M $$

— or is it $M M^T$? I’m feeling too lazy to work it out, but in any case, it follows that

$$ \det G(x_1, \ldots, x_n) = (\det M)^2 \geq 0. $$

And there we are. And all we need for the triangle inequality is the case $n = 3$.
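For anyone equally lazy: with the $x_i$ taken as the rows of $M$ the product is $M M^T$ (with the column convention it is $M^T M$), and either way the determinant identity holds, since $\det M^T = \det M$. A quick numerical check in Python (helper names are mine):

```python
import random

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

rng = random.Random(0)
# rows of M are the vectors x_1, x_2, x_3 in R^3
M = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(3)]

# Gram matrix: entries <x_i, x_j>
G = [[sum(a * b for a, b in zip(M[i], M[j])) for j in range(3)] for i in range(3)]

# with row vectors, G coincides with M M^T
Mt = [list(col) for col in zip(*M)]
MMt = matmul(M, Mt)
assert all(abs(G[i][j] - MMt[i][j]) < 1e-9 for i in range(3) for j in range(3))

# det G = (det M)^2 >= 0
assert abs(det3(G) - det3(M) ** 2) < 1e-9
assert det3(G) >= -1e-9
```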

Posted by: Tom Leinster on November 17, 2019 7:24 PM

Re: Geodesic Spheres and Gram Matrices

I guess this has a kind of converse? Suppose $(X, d)$ is a metric space with $d(x, y) \in [0, \pi]$, and define $f(x, y) := \cos d(x, y)$. If $f$ has the property that $\bigl( f(x_i, x_j) \bigr)$ is positive semi-definite for any finite sequence $x_1, \dots, x_n \in X$, then I expect that $(X, d) \approx (X', d')$, where $X'$ is a subset of a unit sphere in some real inner product space, and $d'$ is the restriction of the geodesic metric to it.

Posted by: Charles Rezk on November 17, 2019 9:18 PM

Re: Geodesic Spheres and Gram Matrices

Interesting… why do you expect that to be the case?

We know lots about metric spaces such that the matrix $(e^{-d(x_i, x_j)})_{i, j}$ is positive (semi)definite for all points $x_1, \ldots, x_n$ in it. But I don’t know what’s known about metric spaces such that $(\cos(d(x_i, x_j)))_{i, j}$ is positive (semi)definite for all $x_1, \ldots, x_n$.

Posted by: Tom Leinster on November 19, 2019 8:50 PM

Re: Geodesic Spheres and Gram Matrices

I think this: given $(X, d)$ as I described, you can construct a universal map $\rho \colon X \to V$ where $V$ is an inner product space, such that $\langle \rho(x), \rho(y) \rangle = f(x, y)$.

In fact, let $V$ be the vector subspace of the set of all functions $X \to \mathbb{R}$ which is spanned by the functions $$ \rho(x) \colon X \to \mathbb{R}, \qquad \rho(x)(y) := f(y, x). $$ Then the function $f$ naturally extends to a bilinear map $\langle -, - \rangle \colon V \times V \to \mathbb{R}$, and the Gram matrix condition implies that $\langle -, - \rangle$ is positive semi-definite. I don’t know if it’s automatically positive definite, but if not I guess you can replace $V$ with $V / V_0$, where $V_0 = \{ v \in V \mid \langle v, v' \rangle = 0 \text{ for all } v' \in V \}$ is the subspace of elements with trivial pairing.

Given this, it’s clear that $\langle \rho(x), \rho(y) \rangle = 1$ iff $f(x, y) = 1$ iff $x = y$, so $\rho \colon X \to V$ is a monomorphism into the unit sphere.

I believe this is more-or-less the same as what functional analysts call the Moore–Aronszajn theorem, though I learned about it from this paper. The function $\rho$ is kind of a linear algebra analogue of the Yoneda embedding.
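For a finite $X$ this construction can be carried out concretely: when the matrix $(f(x_i, x_j))$ is positive definite, a Cholesky factorization $f = L L^T$ exhibits the points as unit vectors whose pairwise inner products are exactly $f(x_i, x_j)$ — an isometric copy of $(X, d)$ on a unit sphere. A Python sketch under that positive-definiteness assumption (helper names are mine):

```python
import math

def cholesky(A):
    """Cholesky factorization A = L L^T of a positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

# three unit vectors in R^3, their geodesic distances, and f = cos d
pts = [(1, 0, 0), (0, 1, 0), (1/math.sqrt(2), 0, 1/math.sqrt(2))]
d = [[math.acos(max(-1.0, min(1.0, sum(a*b for a, b in zip(u, v)))))
      for v in pts] for u in pts]
f = [[math.cos(d[i][j]) for j in range(3)] for i in range(3)]

# rows of L are the reconstructed unit vectors rho(x_i):
# their Gram matrix reproduces f, hence the metric
L = cholesky(f)
for i in range(3):
    for j in range(3):
        recovered = sum(L[i][k] * L[j][k] for k in range(3))
        assert abs(recovered - f[i][j]) < 1e-9
```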

Posted by: Charles Rezk on November 20, 2019 5:46 PM

Re: Geodesic Spheres and Gram Matrices

This is very much in the orbit of the ‘support vector machine’ approach to machine learning. Back in my machine learning phase I posted about such matters: Kernels in Machine Learning I and Kernels in Machine Learning II.

Posted by: David Corfield on November 22, 2019 9:36 AM

Re: Geodesic Spheres and Gram Matrices

Cool! Thanks for that.

The function $\rho$ is kind of a linear algebra analogue of the Yoneda embedding.

It’s impossible for category theorists to read about reproducing kernel Hilbert spaces without feeling that God is teasing them.

Posted by: Tom Leinster on November 20, 2019 10:01 PM

Re: Geodesic Spheres and Gram Matrices

It’s impossible for category theorists to read about reproducing kernel Hilbert spaces without feeling that God is teasing them.

I, on the other hand, feel like Charles’s comment gets me closer than I have before to getting the point of the Yoneda embedding.

(And yes, I agree this is basically the Moore–Aronszajn theorem; I missed the point earlier because I got distracted by the cosines.)

Posted by: Mark Meckes on November 21, 2019 4:56 PM

Re: Geodesic Spheres and Gram Matrices

I’m not sure about Charles’s suggestion, but it’s worth noting that there is a closely related exact characterization of metric spaces that embed in Euclidean space (i.e., the converse is true in that setting). Here’s the theorem:

Given a finite metric space $X$ with points $x_0, x_1, \ldots, x_n$, form the $(n+1) \times (n+1)$ matrix $M$ with entries $m_{i j} = d(x_0, x_i)^2 + d(x_0, x_j)^2 - d(x_i, x_j)^2$. Then $X$ embeds isometrically into a Euclidean space if and only if $M$ is positive semidefinite.

In this context it’s interesting to note that if $x_1, \ldots, x_n$ lie in the unit sphere, equipped with the Euclidean rather than geodesic metric, and we set $x_0 = 0$, then $m_{i j} = 2 \langle x_i, x_j \rangle$.
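That identity is a short computation — for unit vectors, $d(x_0, x_i) = \|x_i\| = 1$ and $d(x_i, x_j)^2 = \|x_i - x_j\|^2 = 2 - 2 \langle x_i, x_j \rangle$, so $m_{i j} = 1 + 1 - (2 - 2\langle x_i, x_j \rangle) = 2 \langle x_i, x_j \rangle$ — and easy to confirm numerically. A Python sketch (helper names are mine):

```python
import math
import random

def euclid(u, v):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
pts = []
for _ in range(4):
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    pts.append([c / n for c in v])   # random unit vectors
origin = [0.0, 0.0, 0.0]             # the extra point x_0 = 0

for u in pts:
    for v in pts:
        m = euclid(origin, u)**2 + euclid(origin, v)**2 - euclid(u, v)**2
        twice_ip = 2 * sum(a * b for a, b in zip(u, v))
        assert abs(m - twice_ip) < 1e-9   # m_ij = 2 <x_i, x_j>
```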

Anyway, I think the theorem I quoted goes back to Schoenberg, and I bet the answer to Charles’s question is somewhere in Schoenberg’s papers.

Posted by: Mark Meckes on November 20, 2019 4:37 PM

Re: Geodesic Spheres and Gram Matrices

Thank you for this interesting post! Silly typo: At one point, what is obviously supposed to be $\lvert \sum_i c_i x_i \rvert$ inadvertently became ${}_{\lvert\rvert} \sum_i c_i x_i$.

Posted by: L Spice on November 18, 2019 4:33 PM

Re: Geodesic Spheres and Gram Matrices

Thanks! Most of my posts do contain typos, but I think in this particular case I typed what I meant to: each $x_i$ is a vector, so it really is a norm rather than an absolute value.

Posted by: Tom Leinster on November 18, 2019 6:57 PM

Re: Geodesic Spheres and Gram Matrices

Sorry to fuss (and hopefully these posts can be deleted once it’s been fixed), but my objection wasn’t to $\lVert x \rVert$ versus $\lvert x \rvert$, but rather to ${}_{\lvert\rvert} x$ (which is what’s there: either two absolute-value bars or one norm bar, on the left, as a subscript) versus $\lVert x \rVert$. (Where this appears in the post, $x$ is $\sum_i c_i x_i$.)

Posted by: L Spice on November 18, 2019 11:49 PM

Re: Geodesic Spheres and Gram Matrices

I don’t see it either. Could be a browser issue. I know that at the nLab, which uses MathML as does this site, it is said

Presently only Firefox and its derivatives have implemented native rendering of MathML. Presently all other browsers fall back to invoking MathJax. This works fine on small pages, but on pages with substantial content the MathJax rendering takes up to several minutes.

This means that presently you should use Firefox or its derivatives to view the nLab.

Posted by: Todd Trimble on November 19, 2019 1:30 PM

Re: Geodesic Spheres and Gram Matrices

No need to apologize; I appreciate your help! But either there’s some software issue that means you’re seeing things differently to me, or there’s a wetware issue between my ears. I can’t see a double-bar subscript typo anywhere. What I do see is a line like this:

displayed formula

(That’s a graphics file taken from a screenshot.) Does it display differently for you, or am I still not getting the point?

Posted by: Tom Leinster on November 19, 2019 3:10 AM

Re: Geodesic Spheres and Gram Matrices

It does render differently for me on Safari, roughly as I described it; I tried including a picture of what I see, but no luck. (How do you do it?) I get a different, although still wrong, rendering on Mobile Safari. It is thus clearly a browser issue, and, as Todd suggested upthread, viewing it in Firefox shows what is surely the expected result. I apologise for the de-rail; please feel free to delete this whole thread.

Posted by: L Spice on November 20, 2019 8:03 PM

Re: Geodesic Spheres and Gram Matrices

Nice! You took us up to the third Gram inequality and showed why they’re all interesting up to that point. Does anyone know anything interesting about the fourth one?

Posted by: John Baez on November 18, 2019 10:56 PM

Re: Geodesic Spheres and Gram Matrices

> Nice! You took us up to the third Gram inequality and showed why they’re all interesting up to that point. Does anyone know anything interesting about the fourth one?

Interestingly, Yemon Choi’s comment and its descendants suggest the opposite: that it is only really the 3rd Gram inequality that is interesting.

(This reminds me, probably spuriously but who knows, of my own research in character formulæ, in which somehow the terms $\operatorname{ad}(X)Y$ and $\operatorname{ad}(X)^2 Y$ in the Campbell–Baker–Hausdorff formula are important, with the $\operatorname{ad}(X)^2 Y$ terms giving rise to a quadratic form, but $\operatorname{ad}(X)^3 Y$ manages somehow apparently not to play a role.)

Posted by: L Spice on November 18, 2019 11:54 PM

Craig

Speaking of “higher Cauchy–Schwarz inequalities”, what happens if you look at $\left\lVert \sum_i c_i x_i \right\rVert^4 \geq 0$? You wind up with a 4-dimensional tensor whose entries look like $\sum_a x_i^a x_j^a x_k^a x_m^a$; does that tensor have an interesting invariant which is not immediately derived from the Gram determinant?

Posted by: Craig on November 19, 2019 3:39 PM
