Hacks for thinking about high-dimensional space

Posted on June 2, 2015 by Isomorphismes in R bloggers | 0 Comments

[This article was first published on isomorphismes, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

High-dimensional Euclidean space is ℝ×ℝ×ℝ×ℝ×ℝ×…. Cartesian product of many continuous quantities.

You are already familiar with the concept via “an arbitrary number of sliders” or “an arbitrary number of columns in a spreadsheet”. Pivot tables are the normal way business people navigate a low-dimensional subset of a high-dimensional space.

OLAP OLAP OLAP

One database or slider-config would be one point in a high-dimensional space. But what do these spaces look like overall?

Lots of Corners

Think about the graph-skeleton of cubes and squares. Zoom in on a corner.

Look at how the number of neighbours increases as you increase dimension. The corner of a box in 2-D has two sticks coming out of it; in 3-D three sticks come out of it. In 30 dimensions there would be 30 sticks coming out of each ball. (Toward the interior. You’d need 2•30 sticks if we’re talking about a grid rather than a box.)

3d grid

More on the outside, less on the inside.

Think about a Matryushka of nested spheres ⬤⊂⬤⊂⬤⊂… or, ok, even dolls.

The 10th one requires more plastic than the 1st one.

Picture by Borb. Licensed under CC BY-SA 3.0 via Wikimedia Commons.

In 3-D a two-dimensional surface gets larger as radius². In 30 dimensions surface areas will go up like radius29.

Fatter at the equator.

Circles of latitude are wider in the middle of the ball than at the top.

But this is a consequence of the coördinatisation (lat,long), not of the object itself. A hectare in Finland is as big as a hectare in Kenya. And I’m not suddenly faster running in Finland either.

(Manifolds solve the problem by patching together different charts, each of which shows one area more faithfully.)

The sphere is tiny or the cube’s mass is in its spikes.

This is justified by comparing spheres with cubes.

In the 3-D the corners of the cube stick out beyond the sphere. I could measure the size of these corners by putting another sphere outside the corners, and calculating volume(circumsphere) − volume(insphere). The Earth has more volume in its mantle than in its inner core.

Combine this with what I said above in the Beatles Matryushka section. I’m seeing the mantle and core as successions of nested shells of volume ≈ ∫radius², each bigger than the last. Once again this pattern will only get more extreme in high dimensions.

Simplices

These are the spaces on which probability happens. In 3 variables I can think about (⅓,⅓,⅓), (⅔,0,⅓), (0,1,0), and (0,0,1).

Do they have to get more complicated when we pad them with zeroes? (0,0,⅓,0,0,⅓,0,0,0,⅓) and (0,0,⅔,0,0,0,0,0,0,⅓) tell me that a 10-simplex has 10 choose 3 = 120 3-simplices embedded in it. But (.1,.1,.1,.1,.1,.1,.1,.1,.1,.1) tells me it contains more than that.

I can still think about the 10-simplex as connections between (1,0,0,0,0,0,0,0,0,0), (0,1,0,0,0,0,0,0,0,0), … (0,0,0,0,0,0,0,0,0,1), but now there are more ways to connect—since each of those points has up to 9 buddies it can be heading toward at a time.

When 2 entries are nonzero, that’s a 2-D surface. When 7 entries are nonzero, that’s a 7-D surface. (Just think of 7 sliders.)

sliders

3-Spheres on Up

All of this mention of spheres, it’d be nice to have an alternative description to w²+x²+y²+z²=1 and sin θ₁ sin θ₂ sin θ₃ … cos θₙ, which I’ve always found hard to digest.

I know that S¹×S¹ isn’t S², because that would be a torus instead. So what do I combine, and how, to get S²?

I like the Drawstring Bag model:

That takes me from the open disk to S² (= the surface of the Earth—sans magma). This is a topological construction so small perturbations are ok. Calling the geoid a “sphere” is ok. The disk doesn’t have to be exactly circular.

To get to the 3-sphere I then imagine a solid sphere (= Earth + magma), attach another point ∞, and identify all of the Earth’s crust with ∞.

Going back to the open disk, here’s another imagination-exercise I do to clear this up for myself. In some of the early Mario games, a glitch you could do was to put half of Mario’s body on one side of the screen and half on the right. This is what I think of when I imagine two sides of a square being identified.

(You can play inside an octagon-with-identifications too. Thicken this octagon and imagine yourself flying around inside it. Imagine waving your arm through eg a̅₁. Where does it come out? Turn your head and watch what it’s doing. BTW, just like Mario and Pac–Man, this identification manifold was considered by entertainment artists around the same time period—remember Scooby-Doo and the Gang running in and out of hotel doors? But the animators will change the rules either over time or for different people or introduce noncommutativity—part of the humour, which proves that people do intuitively understand identification polyhedra (as well as, obviously, object permanence), or else those changes wouldn’t be funny.)

“DunceHatSpace”. Licensed under Public Domain via Wikimedia Commons.

To do this with the drawstring-bag, I imagine a circle of friends. Even though they look like they’re standing all the way across the disk from each other, they’re actually immediately next to each other at ★, and they could make a ring and touch hands and dance around it.

Or, a solitary explorer at ★ could take small steps and look, in the disk model, like he’s jumped all the way across the Earth. He stretches his hand out and, like Mario, it shows up on the other side. But this is just an aberrance of the disk model, not of S². (This also tells you how the gridding should go on the disk.)

So in symbols, T² = ◯×◯ ≠ S² = ⬤/◯ = ⬤/∂⬤ (The pattern Im ƒ / ker ƒ is much broader, including the +c of integration.)

Specific dimension sometimes matters.

A lot of what I’m walking you through is dimension → ∞ kind of stuff. But sometimes odd or even matters. And sometimes there are structures that show up in like dimension 7 or even seemingly random high numbers. Just a warning.

Computing on the surface of a high-dimensional sphere.

Persi Diaconis explains how to do this. The trick is to think of “matrices with determinant one” as the group SO(n), which represents either a point on a sphere or the action of twisting a sphere so ★ moves to that point. (The “noun” and “verb” versions are so easily interchangeable that mathematicians will usually elide the two.) If you allow determinant ±1, that’s O(n)—so allowing reflection—and O(n) is what the Gram-Schmidt algorithm reduces matrices to in order to solve [A]⋅x⃗ = b⃗.

To do what Dr Diaconis says in R you would run this 2-liner:

n = 3
rnorm(n**2) %>% matrix(n,n) %>% qr %>% qr.Q

which you can verify has determinant ±1 and is orthogonal with things like:

rnorm(n**2) %>% matrix(n,n) %>% qr %>% qr.Q -> diaconis
diaconis %>% eigen
{diaconis %>% eigen}$values %>% Mod     #eigenvalues of an O(n) are length 1
diaconis %>% det
diaconis %>% crossprod %>% zapsmall     #diaconis %*% t(diaconis)

Use rotations.

A heap of linear algebra—and therefore the logic of high-dimensional Euclidean space—is simplified if you can imagine quotienting by SO(n).

Having the drawstring-bag model and the stick+ball model and the ability to compute examples with matrices and the rectangular-grid model helps me imagine high-dimensional space. Hope that was useful for you.

Next steps

Tits Buildings
Coxeter-Dynkin diagrams
Polytopes
cell complexes

To leave a comment for the author, please follow the link and comment on their blog: isomorphismes.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Hacks for thinking about high-dimensional space

Lots of Corners

More on the outside, less on the inside.

Fatter at the equator.

The sphere is tiny or the cube’s mass is in its spikes.

Simplices

3-Spheres on Up

Specific dimension sometimes matters.

Computing on the surface of a high-dimensional sphere.

Use rotations.

Next steps

Related

Lots of Corners

More on the outside, less on the inside.

Fatter at the equator.

The sphere is tiny or the cube’s mass is in its spikes.

Simplices

3-Spheres on Up

Specific dimension sometimes matters.

Computing on the surface of a high-dimensional sphere.

Use rotations.

Next steps

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)