Part of a series on Dimension Diagrams:

  1. (This post).
  2. Physics
  3. Data Engineering

Table of Contents

  I. Vectors
  II. Matrices

I. Vectors

Behold a three-dimensional vector space:

[Figure: 3 dimensions]

And a vector:

[Figure: a 3d vector]

Here I’ve labeled the components $v_1$, $v_2$, $v_3$ of the vector, called $\mathbf{v}$, in the coordinate system. Simple enough.

But if there were more than three dimensions, we couldn’t simply draw a vector in perspective. Instead, let’s collapse the extras onto a single axis:

[Figure: an N-dimensional vector]

The double line represents “one or more dimensions”—the rest of the dimensions, 3 through $N$, for any $N$; call that subspace $W$. And $\mathbf{v}_W$ stands for the projection of the vector onto that entire subspace.

Maybe you even decide to forget how many dimensions are in $W$, “reducing” the information in those axes into a single scalar $w$. This will then give a 3-dimensional space as in the first diagram.

But there are various ways to perform this reduction—how do you pick? An obvious choice is to map that component to its length $w = |\mathbf{v}_W|$. Since $|\mathbf{v}|^2 = v_1^2 + v_2^2 + |\mathbf{v}_W|^2$, this preserves the length of the full vector:

$$ v_1^2 + v_2^2 + w^2 \;=\; |\mathbf{v}|^2. $$

But that isn’t the only option; you could, for example, project to a single one of the dimensions, such as $w = v_3$.
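As a minimal numerical sketch of these two reductions (NumPy; the example vector and the choice to keep the first two axes are mine, not the post’s):

```python
import numpy as np

v = np.array([3.0, 1.0, 2.0, 0.5, -1.0, 4.0])   # a vector in N = 6 dimensions
head, tail = v[:2], v[2:]                        # keep dims 1-2; collapse dims 3..N

# Reduction 1: replace the collapsed component by its length.
reduced_norm = np.array([*head, np.linalg.norm(tail)])
print(np.isclose(np.linalg.norm(reduced_norm), np.linalg.norm(v)))   # True: length preserved

# Reduction 2: project onto a single one of the collapsed dimensions (here, dimension 3).
reduced_proj = np.array([*head, tail[0]])
print(np.linalg.norm(reduced_proj) <= np.linalg.norm(v))             # True, but length generally shrinks
```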

Or you could go the opposite direction: maybe you thought you were working in 3 dimensions, and then: more pop out! You “unreduce” one dimension into a whole $k$-dimensional subspace. This clearly adds information, so you have some choice in how you do it. This choice will in turn determine how operations on the original vector extend to the unreduced vector—what, for example, will happen to a rotation which previously took the “1” dimension into the “3” dimension? How should this map into the larger space? It could rotate into “3” only (the inverse of “projection”), or it could map to all of the new dimensions “equally”, i.e. into the vector $\tfrac{\ell}{\sqrt{k}}(1, 1, \ldots, 1)$, which has the same length $\ell$ as the original but spread evenly over all $k$ dimensions (making this the inverse of the reduction to a length). Or something else!
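In symbols (my notation, not necessarily the post’s): a component of length $\ell$ on the old “3” axis can be unreduced into the $k$ new dimensions either as

$$ \ell \;\mapsto\; (\ell, 0, \ldots, 0) \qquad\text{or}\qquad \ell \;\mapsto\; \tfrac{\ell}{\sqrt{k}}\,(1, 1, \ldots, 1), $$

both of which have length $\ell$: the first is the inverse of “project onto a single coordinate”, the second the inverse of “reduce to a length”.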

Now, if you have one vector $\mathbf{v}$, you can decompose some other vector $\mathbf{a}$ into components parallel and perpendicular to $\mathbf{v}$.

[Figure: decomposing a vector]

Here I’ve adopted a few conventions:

  • $v$ and $v_\perp$ are subspaces, and will not be typeset in boldface. These are labeled in the diagram as lines without arrows, while vectors have arrows at the end. Their “negative” halves are not depicted (what would negative mean?)—but it might be useful to depict these in other instances.
  • A double line is again used for a subspace of greater than one dimension. Here, in $N$ dimensions, $v_\perp$ will be $(N-1)$-dimensional.
  • On the left are shown the vector projection $\mathbf{a}_v$ and rejection $\mathbf{a}_{v_\perp}$.
  • On the right, the scalar projection $a_v$ and rejection $a_{v_\perp}$ are shown. The “sides” of $\mathbf{a}$ are labeled with something like their lengths, but the $v_\perp$ side should be thought of as containing $N-1$ components.
  • These projections and rejections are written to suggest that they are operations between $\mathbf{a}$ and the subspaces $v$ and $v_\perp$, rather than between $\mathbf{a}$ and the vector $\mathbf{v}$ itself. This is helpful because it avoids having to make reference to any particular basis on the subspace $v_\perp$.

The same decomposition as an equation is:

$$ \mathbf{a} \;=\; \mathbf{a}_v + \mathbf{a}_{v_\perp}. $$

More conventions: $\mathbf{v}$ is a vector, but $v_\perp$ cannot be. Only in two dimensions is there a vector spanning the space orthogonal to $\mathbf{v}$, and even then there’s no particular reason to choose any particular vector on the subspace $v_\perp$. In more than two dimensions, we will take $v_\perp$, $\mathbf{a}_{v_\perp}$, and $a_{v_\perp}$ to mean “whatever they need to” for the above to make sense—a matrix, or a particular choice of basis, or an oriented area? We’ll figure it out later.

In any case, once we’ve defined $v_\perp$, we should just as easily be able to start over by projecting onto that. Then we’d call the term $\mathbf{a}_{v_\perp}$ the “projection” (onto an $(N-1)$-dimensional space) and $\mathbf{a}_v$ the rejection. We should get the same result:

$$ \mathbf{a} \;=\; \mathbf{a}_{v_\perp} + \mathbf{a}_v. $$

So whatever $v_\perp$ means, it ought to be able to play the role of “projection” and “rejection” equally well.
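Here is a small sketch of the decomposition in NumPy (the helper name `decompose` and the example vectors are my own, for illustration):

```python
import numpy as np

def decompose(a, v):
    """Split a into its components parallel and perpendicular to v."""
    v_hat = v / np.linalg.norm(v)
    a_par = np.dot(a, v_hat) * v_hat   # vector projection onto the subspace spanned by v
    a_perp = a - a_par                 # vector rejection: the component lying in v-perp
    return a_par, a_perp

a = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([1.0, 1.0, 0.0, 0.0])

a_par, a_perp = decompose(a, v)
assert np.allclose(a_par + a_perp, a)          # the decomposition is exact
assert np.isclose(np.dot(a_par, a_perp), 0.0)  # and orthogonal

# Scalar projection and rejection: the "lengths" of the two sides.
s_par, s_perp = np.linalg.norm(a_par), np.linalg.norm(a_perp)
assert np.isclose(s_par**2 + s_perp**2, np.dot(a, a))  # Pythagoras across the two subspaces
```

Projecting onto $v_\perp$ instead just swaps which term gets called the projection; the sum is the same either way.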

The right diagram above depicted a vector in terms of its two components $a_v$ and $a_{v_\perp}$. The latter component could be taken to stand for $N-1$ components at once, or could represent a “reduction” to a single scalar as discussed earlier. With this convention we could “draw” an $N$-dimensional vector in a plane, and at least preserve the apparent “orthogonality” of the components parallel to and perpendicular to $\mathbf{v}$.

Let’s now throw out the rule that “right angles represent orthogonal dimensions”. Instead, for the rest of this post, we’ll use a half-axis starting from the origin to stand for an entire dimension, orthogonal to all the rest, no matter what angle they’re drawn at. Double-lines will represent multiple dimensions collapsed into a single half-axis. We can then fit more than two or three dimensions in a single diagram. Here’s a 10-dimensional space and a vector (which is zero on dimensions 5 through 9). I’m encoding the absolute values of the projections onto each axis as a dotted line.

The obvious definition of an expression like $v_{5\ldots9}$ is as the norm of that component, $\sqrt{v_5^2 + \cdots + v_9^2}$, but I am hoping to keep open the option of using different “reductions” rather than only using the norm.
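For instance, the dotted-line lengths in such a diagram could be computed as follows (NumPy; the particular vector is an illustrative choice of mine):

```python
import numpy as np

# A 10-dimensional vector that is zero on dimensions 5 through 9.
v = np.array([2.0, -1.0, 3.0, 0.5, 0, 0, 0, 0, 0, 1.5])

# One dotted line per drawn half-axis: the absolute projection onto each single
# axis, plus a norm-reduction for the collapsed 5-9 group.
for i in (1, 2, 3, 4, 10):
    print(f"axis {i}: {abs(v[i - 1])}")
print(f"axes 5-9 (norm reduction): {np.linalg.norm(v[4:9])}")   # 0.0 here
```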

Is the polygon anything? Probably not. It’s quite underdetermined: it’s defined only up to the signs of the components and up to the choice of “projection” on any multi-dimensional axes (the 5 through 9 axis here). This kind of diagram might be more sensible for depicting probabilities, which can’t be negative anyway.

II. Matrices

We can do something similar with a matrix. Suppose you have some $10 \times 10$ real matrix $M$, which can be (block-)diagonalized as follows:

$$ M \;\sim\; \begin{pmatrix} \lambda_1 & & & & \\ & \lambda_2 & & & \\ & & r\,R(\theta) & & \\ & & & c\, I_5 & \\ & & & & 0 \end{pmatrix}, \qquad R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. $$

That is, this matrix:

  • scales its first two eigendimensions by $\lambda_1$ and $\lambda_2$ respectively.
  • rotates dimensions 3 and 4 into each other, while scaling by $r$.
  • scales dimensions 5 through 9 by a common factor $c$.
  • annihilates dimension 10.
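Concretely, one such matrix could be built like this (NumPy; the particular values of $\lambda_1$, $\lambda_2$, $r$, $\theta$, and $c$ are mine):

```python
import numpy as np

theta = 0.3
M = np.zeros((10, 10))
M[0, 0], M[1, 1] = 10.0, 7.0                    # scale eigendimensions 1, 2 by lambda_1, lambda_2
M[2:4, 2:4] = 2.0 * np.array([[np.cos(theta), -np.sin(theta)],
                              [np.sin(theta),  np.cos(theta)]])  # rotate dims 3-4 into each other, scale by r
M[4:9, 4:9] = 0.5 * np.eye(5)                   # scale dims 5-9 by a common factor c
# dimension 10 (index 9) is left at zero: annihilated
print(np.linalg.matrix_rank(M))                 # 9
```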

This can be visualized as follows:

[Figure: 10 eigenvectors]

This could be read as a vector along the lines of the previous section, but I don’t think that interpretation would be very meaningful. Instead it should now be thought of as simply standing for the diagonal representation of the matrix itself.

The first two eigenvalues are the biggest, so we could approximate this matrix by only its first two “principal components”, i.e. by zeroing all of the eigenvalues but the first two, defining a new matrix $M_2$:

[Figure: PCA of a 10D matrix]

Explicitly:

$$ M_2 \;\sim\; \operatorname{diag}(\lambda_1, \lambda_2, 0, 0, \ldots, 0). $$

This is something like a “principal component analysis”. The action of $M_2$ on a given input vector $\mathbf{x}$ will deviate from that of the original $M$ in some way that depends on $\mathbf{x}$’s components along the zeroed dimensions:

$$ M\mathbf{x} - M_2\mathbf{x} \;=\; \sum_{i \geq 3} M\mathbf{e}_i\,(\mathbf{e}_i \cdot \mathbf{x}), $$

where $\mathbf{e}_i$ are the eigendirections and the last expression is to be understood as a standard inner product.
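A numerical sketch of that deviation, using a diagonal stand-in for $M$ (the eigenvalue list is mine, and I ignore the rotation block for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)

eigvals = np.array([10.0, 7.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0])
M = np.diag(eigvals)                                       # written in its eigenbasis
M2 = np.diag(np.concatenate([eigvals[:2], np.zeros(8)]))   # keep only the two largest eigenvalues

x = rng.normal(size=10)
error = (M - M2) @ x

# The deviation lives entirely in the zeroed dimensions, weighted by the dropped
# eigenvalues and the inner products of x with those eigendirections.
assert np.allclose(error[:2], 0.0)
assert np.allclose(error[2:], eigvals[2:] * x[2:])
```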

We can equivalently think of this PCA-like reduction as a reduction of the vector space to three dimensions, where the first two dimensions are chosen as principal eigendimensions. Under this reduction $M$ transforms into a reduced $\tilde{M}$:

$$ \tilde{M} \;\sim\; \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \epsilon \end{pmatrix}, $$

or, graphically:

[Figure: PCA with an error dimension]

where $\epsilon$ is something representing the error of the transformation. Its action on vectors is whatever is needed to stand in for what $M$ does on the reduced-away dimensions. The action of $\tilde{M}$ in the reduced space would then be

$$ \tilde{M}(x_1, x_2, w) \;=\; (\lambda_1 x_1,\; \lambda_2 x_2,\; \epsilon * w), $$

where $*$ is some other thing, and the multiplication operation in the last expression is not specific. If it were full-rank matrix multiplication, we’d recover the original $M$, but the point is to leave its implementation unspecified.

Meanwhile the reduced-truncated $\tilde{M}_2$ has no component on the remaining dimensions:

$$ \tilde{M}_2 \;\sim\; \operatorname{diag}(\lambda_1, \lambda_2, 0). $$

One constraint we can place on the meaning of $\epsilon$ is that it should reduce to a scalar in the case of only one additional dimension. One obvious scalar value would be the Frobenius norm of the reduced-away part $M - M_2$, which for normal matrices is the norm of the eigenvalues considered as a vector:

$$ \epsilon \;=\; \lVert M - M_2 \rVert_F \;=\; \sqrt{\textstyle\sum_{i \geq 3} \lvert \lambda_i \rvert^2}. $$

Then if we take the reduced component $w$ to also be a norm,

$$ w \;=\; \bigl\lvert (x_3, x_4, \ldots, x_{10}) \bigr\rvert, $$

we get an interpretation of $\epsilon\, w$ as an upper bound on the error of $\tilde{M}$:

$$ \bigl\lvert M\mathbf{x} - M_2\mathbf{x} \bigr\rvert \;\leq\; \epsilon\, w. $$

(All this would be more general if working with singular values, but I can’t be bothered.)
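A quick numerical check of that bound, reusing a diagonal stand-in for $M$ (my numbers, not the post’s):

```python
import numpy as np

rng = np.random.default_rng(1)

eigvals = np.array([10.0, 7.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0])
M = np.diag(eigvals)
M2 = np.diag(np.concatenate([eigvals[:2], np.zeros(8)]))

eps = np.linalg.norm(M - M2)                         # Frobenius norm of the dropped part...
assert np.isclose(eps, np.linalg.norm(eigvals[2:]))  # ...equals the norm of the dropped eigenvalues

x = rng.normal(size=10)
w = np.linalg.norm(x[2:])                            # norm-reduction of the truncated components
true_error = np.linalg.norm(M @ x - M2 @ x)
assert true_error <= eps * w + 1e-12                 # epsilon * w really does bound the error
```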

Another “reduction” operation would be to define $\epsilon$ as the determinant of the minor of $M$ on dimensions 3 through 10. In the present example this would be zero, since we said $M$ annihilates dimension 10, but there might be some other class of matrices for which this makes more sense. And the determinant (of a minor) has as its most natural interpretation the scaling of (here, 8-dimensional) unit volumes—so perhaps the action on a mere vector is not what we should be thinking of here.
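With the same diagonal stand-in, that determinant-of-a-minor reduction looks like this (again my example values):

```python
import numpy as np

eigvals = np.array([10.0, 7.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0])
M = np.diag(eigvals)

# Determinant of the minor on the reduced-away dimensions 3 through 10:
minor = M[2:, 2:]
print(np.linalg.det(minor))   # 0.0 here, because dimension 10 is annihilated
```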

If we can reduce away some of the dimensions of a matrix, we can reduce away all of them, getting a single scalar:

[Figure: reducing to a determinant or something]

Some good candidates for the “one-dimensional reduction” of a matrix are:

  • the determinant, or geometric mean, of eigenvalues
  • the trace
  • the Frobenius norm
  • the sum of absolute values of eigenvalues/SVs.
  • the largest eigenvalue, or perhaps its absolute value. Same for SVs.
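For comparison, here is a quick sketch computing each candidate side by side (NumPy; the matrix is an arbitrary example of mine):

```python
import numpy as np

A = np.diag([10.0, 7.0, 2.0, 0.5])      # any normal matrix will do for this comparison
eig = np.linalg.eigvals(A)
sv = np.linalg.svd(A, compute_uv=False)

candidates = {
    "determinant":            np.prod(eig).real,
    "geometric mean":         np.prod(np.abs(eig)) ** (1 / len(eig)),
    "trace":                  np.trace(A),
    "Frobenius norm":         np.linalg.norm(A),
    "sum |eigenvalues|":      np.abs(eig).sum(),
    "largest |eigenvalue|":   np.abs(eig).max(),
    "largest singular value": sv.max(),
}
for name, value in candidates.items():
    print(f"{name:>22}: {value:.3f}")
```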

This is really the starting point for this whole line of thinking: it seems that there are multiple sensible ways to “reduce” a linear operator to a single number. Each ought to be able to be thought of as an operation on the vector space instead of on the operator, and it should be possible to “partially apply” any of them to give successive “approximations” to the original operator.

My point here is not really to discover a new matrix operation, but to arrive at a “unified framework” in which to understand a number of disparate linear algebra concepts—in particular, from which to motivate them. The idea of such operations as applying to the space rather than to the operator is highly suggestive to me—it is something like

Now, when we mapped the extra dimensions down to one, there was an obvious way to do that so that the length was preserved. But there’s no reverse operation that brings all that information back: you could unfold the scalar into any particular vector in your new dimensions, but there is no way to distinguish any of the new dimensions from each other unless you also specify that structure.

Or, consider rotating around dimension 2 (rotating dimension 1 into 3), before and after the unfolding. Before, $\mathbf{e}_1$ mapped to $\mathbf{e}_3$. After, does it map to any particular vector? To all vectors in the unfolded subspace? To an equivalence class of vectors? All of these will work, but I think the most sensible and least opinionated target is to map to the volume element on the unfolded space:

$$ \mathbf{e}_1 \;\mapsto\; \mathbf{e}_3 \wedge \mathbf{e}_4 \wedge \cdots \wedge \mathbf{e}_{2+k}. $$

This looks like the reverse of what we just did with the matrix $M$, so we can run that backwards to see where to go from here. We arrived at the determinant by forgetting the dimensions of the matrix; therefore, anywhere it appears, we can restore the matrix’s full dimensions by pulling eigenvalues out of the determinant:

[Figure: expanding a determinant]

The number $\det M$ could of course be the determinant of many matrices—knowing it belonged to $M$ amounts to a choice of how to unpack it—the same choice we would have to make with our unfolded vector above. Some applications might be indifferent to the choice; others might depend on the specific choice, or require that the same choice be made each time an unfolding occurs.
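To make that non-uniqueness concrete (a tiny sketch of mine):

```python
import numpy as np

A = np.diag([6.0, 1.0])
B = np.diag([3.0, 2.0])
C = np.array([[2.0, 1.0],
              [1.0, 3.5]])                # not even diagonal
for X in (A, B, C):
    print(np.linalg.det(X))               # all (approximately) 6.0: the scalar alone cannot tell them apart
```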

And of course, we can equally well imagine folding 10 dimensions into 2 or 3 rather than 1, and the same “choices” will arise.

All of this isn’t so strange: we do this to get $\mathbb{C}$ from $\mathbb{R}$ all the time. “Actually this is 2D”:

[Figure: discovering a complex plane]

And there are different ways you can do it. The obvious one is to identify $x \in \mathbb{R}$ with $x + 0i$, but it could just as easily be mapped to any other vector. Or you could do something weirder: maybe you take $x$ to the volume element $x\,(1 \wedge i)$, or to the set of all the complex numbers of radius $|x|$, i.e. $\{\, z : |z| = |x| \,\}$, with the sign of $x$ perhaps encoding something else. This is all to say: unfolding dimensions is inherently underspecified, as it maps into a larger space.