Part of a series on elementary linear algebra:

  1. Vectors
  2. (This post)
  3. Division


Vector Multiplication

Our definitions of multiplication or division by a scalar will carry over directly from the 1D case. We have, for some scalar $c$:

$$c\,\vec{a} = (c\,a_x,\ c\,a_y) \qquad \vec{a}/c = (a_x/c,\ a_y/c)$$

But how should we define multiplication or division between two vectors?

In one dimension it was simple enough; for $\vec{a} = a\,\hat{x}$ and $\vec{b} = b\,\hat{x}$ we found that the obvious definitions were:

$$\vec{a}\,\vec{b} = ab\,\hat{x}^2 \qquad \frac{\vec{a}}{\vec{b}} = \frac{a\,\hat{x}}{b\,\hat{x}} = \frac{a}{b}$$

The thing $\hat{x}^2$ we could assign a sense like “square meters”. Or we could simply shave it off and call this the number $ab$, since one-dimensional vectors act just like regular numbers.

This might work in 1D, but for higher dimensions we will want to be more thorough. We’ll start by asking: what properties should the multiplication of two vectors obey?

  1. Multiplication ought to reduce to the obvious definitions in simple cases—otherwise it’s not really “multiplication”! The two “familiar” senses of the multiplication of numbers $ab$ are:

    • The length of a line consisting of $b$ copies of $a$, or $a$ copies of $b$.
    • The area of a rectangle with sides $a$ and $b$.
  2. The multiplication operation ought to play nicely with the “linear combinations” we used to define vectors in the first place. Multiplication of numbers distributes like $a(b + c) = ab + ac$, and it seems sensible that a similar “distributive property” hold over combinations of vectors. If we call the undefined operation $\ast$ for the time being, we want:

    $$(c\,\vec{a}) \ast \vec{b} = c\,(\vec{a} \ast \vec{b}) = \vec{a} \ast (c\,\vec{b})$$

    $$(\vec{a} + \vec{b}) \ast \vec{c} = \vec{a} \ast \vec{c} + \vec{b} \ast \vec{c}$$

    Taken together we will call these properties “linearity”.

  3. The value we get for multiplication ought to depend on the exact vectors involved, and on the fact that they are vectors at all. We could just define multiplication to simply multiply the lengths of the vectors, like $\vec{a} \ast \vec{b} = |\vec{a}|\,|\vec{b}|$, which obviously means something. But this doesn’t even care that they’re vectors; they could just as well be numbers.

  4. If possible, multiplication should be a purely geometric operation, defined in terms of the vectors themselves without reference to a choice of basis and, if possible, without reference to a choice of metric—although it seems likely that this second condition won’t be achievable, since both simple cases in condition (1) make explicit reference to “length”.

Of all of these properties, the first is the most intuitive, so we’ll start there.

Linearity

We gave two “senses” of the multiplication of numbers: “$b$ copies of $\vec{a}$”, which clearly ought to give $|\vec{a}|\,|\vec{b}|$ for two parallel vectors, and the “area of a rectangle”, which naturally equals $|\vec{a}|\,|\vec{b}|$ for two perpendicular vectors. The simplest multiplication operation giving both of these would just be to multiply the lengths of the two vectors, i.e. $\vec{a} \ast \vec{b} = |\vec{a}|\,|\vec{b}|$. But this doesn’t conform to condition 3—it doesn’t really care that these two vectors are vectors at all; it simply treats them as numbers without a meaningful sense of “direction”.

Let’s suppose instead that we have one multiplication operation $\ast$ which gives the prescribed results for parallel and perpendicular vectors, and now let us ask what “linearity” tells us about its value on any other pair of vectors. We’ll first illustrate how linearity works in practice with a simple graphical example:

vectors and a line

In the image above, we have chosen two vectors $\vec{a}$ and $\vec{b}$, and then drawn the line of vectors you would get if you added various multiples of $\vec{b}$ to $\vec{a}$, with a few values of the multiple shown. If we now imagine some arbitrary “linear” function of vectors $f(\vec{v})$, we can imagine drawing the whole line representing this function’s value on “multiples of $\vec{b}$ added to $\vec{a}$”, as a function of the multiple:

adding multiples of vectors in a line

For fixed $\vec{a}$ and $\vec{b}$, this is just a function of the parameter $t$:

$$g(t) = f(\vec{a} + t\,\vec{b})$$

a linear function of a vector

This should hold for any vector $\vec{b}$; the condition is that “varying the argument to $f$ along any line produces some linear change in the value of $f$”. The slope of this line is $f(\vec{b})$, as we can see if we write the linearity condition out, with $t$ representing the number of copies of $\vec{b}$ to add to $\vec{a}$:

$$f(\vec{a} + t\,\vec{b}) = f(\vec{a}) + t\,f(\vec{b})$$

And actually, this tells us more; it also means that $f(\vec{0}) = 0$, since $f(\vec{0}) = f(0\,\vec{a}) = 0\,f(\vec{a})$. Apparently, if we know $f(\vec{a})$ and $f(\vec{b})$ for two non-parallel vectors, then we know $f$ everywhere:

$$f(s\,\vec{a} + t\,\vec{b}) = s\,f(\vec{a}) + t\,f(\vec{b})$$

a linear transformation of the plane
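By the way, if you like to check these things numerically: here’s a quick Python sketch of that claim. The particular function `f` and the test vectors are arbitrary choices of mine, for illustration only.

```python
# A linear function on the plane: f(v) = 3*v_x - 2*v_y (an arbitrary choice).
def f(v):
    return 3 * v[0] - 2 * v[1]

a, b = (1.0, 0.5), (-0.3, 2.0)  # two non-parallel vectors

# Knowing f(a) and f(b) determines f on any combination s*a + t*b:
s, t = 1.7, -0.4
v = (s * a[0] + t * b[0], s * a[1] + t * b[1])
assert abs(f(v) - (s * f(a) + t * f(b))) < 1e-12

# And linearity forces f(0) = 0 -- no "y-intercept" allowed:
assert f((0.0, 0.0)) == 0.0
```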

Linearity, then, is a very strong condition: the only thing a linear function on the 2D plane can do is “tilt” the plane, as can be seen in the above diagram. It can’t even have a “$y$-intercept”.[1]

Now, having established the geometric meaning of “linearity”, we can ask what it means for “multiplying two vectors” to be linear in each argument. If we fix one vector $\vec{a}$, then $\vec{a} \ast \vec{b}$ can be thought of as a function of just the second vector: $f_{\vec{a}}(\vec{b}) = \vec{a} \ast \vec{b}$. Then our “intuitive” senses of multiplication suggest that the new $f_{\vec{a}}$ should produce the “natural” value $|\vec{a}|\,b$ when the second vector is exactly equal to one of the two vectors with some length $b$ and oriented either parallel or perpendicular to $\vec{a}$, that is, when $\vec{b} = b\,\hat{a}$ and when $\vec{b} = b\,\hat{a}^\perp$.

If we linearly interpolate between those exact two vectors, then the value of $f_{\vec{a}}$ must vary linearly between $|\vec{a}|\,b$ and… $|\vec{a}|\,b$:

$$f_{\vec{a}}\big(b\,\big[(1-t)\,\hat{a} + t\,\hat{a}^\perp\big]\big) = (1-t)\,|\vec{a}|\,b + t\,|\vec{a}|\,b = |\vec{a}|\,b$$

But the only way for this to be linear is if the value does not vary along the line at all. Then we immediately know the value everywhere else, because $f_{\vec{a}}(\vec{0})$ has to be zero, and we get a function that increases along the direction $\hat{a} + \hat{a}^\perp$ and is constant along the line perpendicular to this:

… well, apparently this is a function we could define. But it doesn’t look like a particularly useful one—except for the values we started with, and for the opposite orientations, the other values don’t admit any obvious interpretation.

Let’s step back. Maybe those two conditions on the products of parallel and perpendicular vectors don’t need to be cases of the same operation. We can instead define two multiplication operations, each in terms of projections:

  1. One which multiplies the parallel components of vectors, $|\vec{a}|\,|\vec{b}_{\parallel \vec{a}}|$, which we will write as $\vec{a} \cdot \vec{b}$ and read as a “dot product”. This is a natural generalization of the multiplication of two one-dimensional vectors; we simply ignore the rest of the space!
  2. One which multiplies the perpendicular components of vectors, $|\vec{a}|\,|\vec{b}_{\perp \vec{a}}|$, which we will write as $\vec{a} \times \vec{b}$, read as a “cross product”. This is the natural generalization of the sense of multiplication producing an “area”.

Visualizing these as linear functions of the $\vec{b}$ argument, relative to some fixed $\vec{a}$, we see that they “tilt” the plane either along $\vec{a}$ or perpendicular to it—more natural, I think, than the peculiar diagonal-tilting function we came up with before.

Multiplication 1: The Dot Product

The “dot” product needs to multiply the lengths of vectors that are parallel. The simplest way to define an operation that does this is to multiply the length of $\vec{a}$ by the projection of $\vec{b}$ onto $\vec{a}$, i.e.:

$$\vec{a} \cdot \vec{b} \equiv |\vec{a}|\,|\vec{b}_{\parallel \vec{a}}|$$

What about the opposite projection, $|\vec{b}|\,|\vec{a}_{\parallel \vec{b}}|$? If we cite the definition of the projection in terms of $\cos\theta$ from earlier, we see that these are equal:

$$|\vec{a}|\,|\vec{b}_{\parallel \vec{a}}| = |\vec{a}|\,|\vec{b}|\cos\theta = |\vec{b}|\,|\vec{a}_{\parallel \vec{b}}|$$

So we ought to think of this product as a “symmetric projection”—it is a projection which incorporates both lengths, and doesn’t care which one you projected onto the other or the order of the two arguments. In all we have a few names:

  • “dot product”
  • “symmetric multiplication”
  • “symmetric projection”
  • or a “scalar product”, because it produces a scalar, although in two dimensions $\vec{a} \times \vec{b}$ will also give a scalar, so this won’t be very helpful.
  • “inner product”

If we don’t want to use $\cos\theta$, perhaps believing we pulled that out of a hat, then we’ll have to find another way to write $\vec{a} \cdot \vec{b}$ in terms of a basis $(\hat{x}, \hat{y})$.

We do some geometry: first, draw $\vec{a}$, $\vec{b}$, and their sum $\vec{a} + \vec{b}$.

a, b, and a + b

We’re going to compute the length of $\vec{a} + \vec{b}$ in two ways. The first is shown on the right above: we decompose $\vec{b}$ into components on and off $\vec{a}$:

$$\vec{b} = \vec{b}_{\parallel \vec{a}} + \vec{b}_{\perp \vec{a}}$$

Then we can write a Pythagorean theorem relating the three (using $|\vec{b}_{\parallel \vec{a}}|^2 + |\vec{b}_{\perp \vec{a}}|^2 = |\vec{b}|^2$):

$$|\vec{a} + \vec{b}|^2 = \big(|\vec{a}| + |\vec{b}_{\parallel \vec{a}}|\big)^2 + |\vec{b}_{\perp \vec{a}}|^2 = |\vec{a}|^2 + |\vec{b}|^2 + 2\,|\vec{a}|\,|\vec{b}_{\parallel \vec{a}}|$$

Then at the same time we can expand $|\vec{a} + \vec{b}|^2$ in our standard basis of $(\hat{x}, \hat{y})$:

$$|\vec{a} + \vec{b}|^2 = (a_x + b_x)^2 + (a_y + b_y)^2 = |\vec{a}|^2 + |\vec{b}|^2 + 2\,(a_x b_x + a_y b_y)$$

Comparing the two, we see the following equivalence:

$$|\vec{a}|\,|\vec{b}_{\parallel \vec{a}}| = a_x b_x + a_y b_y$$

The argument was completely symmetric between $\vec{a}$ and $\vec{b}$, so we can write it either way. This is our dot product in terms of components:

$$\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y$$

Because we didn’t specify what that basis was, this formula must hold in any basis (at least, any basis of perpendicular unit-length vectors).
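As a quick numeric sanity check, here’s a Python sketch (the vectors are arbitrary, and the `dot` helper is mine) comparing the component formula against the $|\vec{a}|\,|\vec{b}|\cos\theta$ form:

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

a, b = (2.0, 1.0), (-1.0, 3.0)  # arbitrary example vectors

# The angle between the vectors, measured counterclockwise from a to b:
theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])

# Component formula vs. |a||b|cos(theta):
assert abs(dot(a, b) - math.hypot(*a) * math.hypot(*b) * math.cos(theta)) < 1e-12
```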

We can do a lot with this. For one, we can write the squared length of any vector as the dot product with itself:

$$|\vec{a}|^2 = \vec{a} \cdot \vec{a} = a_x^2 + a_y^2$$

We discussed a lot of caveats about the definition of “length” above—have we now found the “innate length” of a vector? No—the dot product depends on the choice of metric, since it is composed of vector lengths, and in fact the expression $a_x b_x + a_y b_y$ would be more complicated if our basis vectors were not unit-length.

In fact, our derivation provided an expression for the dot product which depends only on vector lengths:

$$\vec{a} \cdot \vec{b} = \frac{1}{2}\Big(|\vec{a} + \vec{b}|^2 - |\vec{a}|^2 - |\vec{b}|^2\Big)$$

This is one form of what is called the “polarization identity”.

We can also use the dot product to write the “projection” in a way that’s easier to calculate:

$$\vec{b}_{\parallel \vec{a}} = \frac{\vec{a} \cdot \vec{b}}{\vec{a} \cdot \vec{a}}\,\vec{a}$$
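In code this is a one-liner. A minimal Python sketch, with a hypothetical `project` helper built from dot products alone:

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def project(b, a):
    """Projection of b onto a, using only dot products (no explicit cosine)."""
    c = dot(a, b) / dot(a, a)
    return (c * a[0], c * a[1])

a, b = (3.0, 0.0), (2.0, 5.0)
print(project(b, a))  # (2.0, 0.0): only the component of b along a survives
```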

As a “symmetric projection” the dot product must give zero for perpendicular vectors. Therefore $\vec{a} \cdot \vec{b}$ won’t be affected if we add any multiple of a vector perpendicular to $\vec{a}$ to $\vec{b}$:

$$\vec{a} \cdot (\vec{b} + c\,\vec{a}^\perp) = \vec{a} \cdot \vec{b}$$

Graphically, this says that all of these vectors have the same projection onto $\vec{a}$, and therefore give the same value for $\vec{a} \cdot \vec{b}$:

all the vectors with the same projection

Finally, we can use the other projection definition to derive the Law of Cosines by considering the triangle spanned by $\vec{a}$ and $\vec{b}$, with third side $\vec{c} = \vec{b} - \vec{a}$:

$$|\vec{c}|^2 = |\vec{b} - \vec{a}|^2 = |\vec{a}|^2 + |\vec{b}|^2 - 2\,\vec{a} \cdot \vec{b} = |\vec{a}|^2 + |\vec{b}|^2 - 2\,|\vec{a}|\,|\vec{b}|\cos\theta$$

visualization of the law of cosines
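And a quick numeric check of this, on arbitrary example vectors (chosen with integer components so the equality is exact):

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

a, b = (2.0, 1.0), (-1.0, 3.0)
c = (b[0] - a[0], b[1] - a[1])  # the side opposite the angle between a and b

# |c|^2 = |a|^2 + |b|^2 - 2 a.b, which is the Law of Cosines:
assert dot(c, c) == dot(a, a) + dot(b, b) - 2 * dot(a, b)
```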

Multiplication 2: The Cross Product

Now, what about our other definition of multiplication, generalizing the “area of a rectangle”? Clearly the analogous multiplication formula is:

$$\vec{a} \times \vec{b} \equiv |\vec{a}|\,|\vec{b}_{\perp \vec{a}}|$$

To find this in components, let’s write $|\vec{b}_{\perp \vec{a}}|^2 = |\vec{b}|^2 - |\vec{b}_{\parallel \vec{a}}|^2$, and square both sides. We have:

$$(\vec{a} \times \vec{b})^2 = |\vec{a}|^2\,|\vec{b}|^2 - (\vec{a} \cdot \vec{b})^2 = (a_x b_y - a_y b_x)^2$$

$$\vec{a} \times \vec{b} = \pm\,(a_x b_y - a_y b_x)$$

We get a result with a $\pm$ sign, and furthermore, we were trying to write an “area”, but what we got can be negative.

Well, our original “linearity” condition actually required that some areas come out negative; otherwise we couldn’t have linearity hold for two vectors adding to zero:

$$\vec{a} \times \big(\vec{b} + (-\vec{b})\big) = \vec{a} \times \vec{b} + \vec{a} \times (-\vec{b}) = \vec{a} \times \vec{0} = 0$$

But the coordinate expression also tells us that flipping the arguments must flip the sign. So we get:

$$\vec{a} \times \vec{b} = +\,|\vec{a}|\,|\vec{b}_{\perp \vec{a}}|$$

$$\vec{b} \times \vec{a} = -\,|\vec{b}|\,|\vec{a}_{\perp \vec{b}}|$$

The second line holds because we have defined the $\perp$ line to a vector to be rotated $+90°$ from the original vector. But this means that one of $\vec{b}_{\perp \vec{a}}$, $\vec{a}_{\perp \vec{b}}$ must point opposite to the original $\perp$ vector, giving a minus. This product, then, is “antisymmetric”, and one name for it will simply be the “antisymmetric product”. Another name might be the “area” or “areal” product—this would avoid some of the confusion that sets in with cross products, I think.

So these are “signed” or “oriented” areas; the sign depends on the order you choose the vectors around the rectangle:

oriented areas

You can always take an absolute value if you need the literal area.

I’ll write these as:

$$\vec{a} \times \vec{b} = +\,\text{Area} \qquad \vec{b} \times \vec{a} = -\,\text{Area}$$

The $\pm$ which arose when we took a square root in the component formula for $\vec{a} \times \vec{b}$ implies that this multiplication can be defined with either sign. We’ll choose the positive sign, which for vectors $\hat{x}$ and $\hat{y}$ can be seen to give a positive area when the arguments sweep from $\hat{x}$ to $\hat{y}$. On a usual coordinate plane this is a counterclockwise orientation. This is consistent with $\hat{x} \times \hat{y} = +1$ for axes having the usual orientation around the unit circle.

We have been casually using $\vec{a}^\perp$ for a vector perpendicular to $\vec{a}$, and $\perp$ as some kind of operation on $\vec{a}$ which gives this perpendicular vector $\vec{a}^\perp = \perp\vec{a}$. It has not mattered which direction $\vec{a}^\perp$ is chosen to be, but in defining $\vec{a} \times \vec{b}$ as $a_x b_y - a_y b_x$, we have now linked these two choices together. The choice of counterclockwise orientation implies we must define $\perp$ to be

$$\perp(a_x,\ a_y) = (-a_y,\ a_x)$$

which is a rotation of $+90°$, sweeping from $\hat{x}$ to $\hat{y}$.

This definition also implies that we can write a cross product as a dot product with $\vec{a}^\perp$:

$$\vec{a} \times \vec{b} = \vec{a}^\perp \cdot \vec{b}$$

Note that these definitions only apply in 2D. In higher dimensions the $\perp$ of a vector will not be a single vector but a plane or a larger space—we’ll deal with that when we come to it.
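In 2D, though, the convention is easy to spell out in code. Here’s a sketch (the `perp` and `cross` names are my own) of the $+90°$ rotation and the “dot with the perpendicular” form of the cross product:

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def perp(a):
    """Rotate a by +90 degrees, sweeping x-hat toward y-hat."""
    return (-a[1], a[0])

def cross(a, b):
    """2D cross product, written as a dot product with the perpendicular."""
    return dot(perp(a), b)

print(cross((1, 0), (0, 1)))  # 1: x-hat cross y-hat is +1 (counterclockwise)
print(cross((0, 1), (1, 0)))  # -1: swapping the arguments flips the sign
```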

Now, we found this definition by looking for a multiplication which gives the area of the rectangle formed by perpendicular vectors. What does it do on non-perpendicular vectors? If two vectors are parallel, antisymmetry requires that:

$$\vec{a} \times (c\,\vec{a}) = c\,(\vec{a} \times \vec{a}) = -c\,(\vec{a} \times \vec{a}) = 0$$

So parallel vectors have a cross product of zero.

What about vectors at some angle to each other? The value of $\vec{a} \times \vec{b}$ can easily be seen to give the area of the parallelogram formed by the two vectors:

the area of a parallelogram

In the above, the left triangle could be moved to the right. So we have:

$$\vec{a} \times \vec{b} = |\vec{a}|\,|\vec{b}_{\perp \vec{a}}| = \text{area of the parallelogram}$$

Note that adding any component of $\vec{a}$ to $\vec{b}$ will shear the parallelogram, lengthening its sides, but will not change its area. This can also be seen from the fact that adding any multiple of $\vec{a}$ to the second argument of $\vec{a} \times \vec{b}$ doesn’t change the value:

$$\vec{a} \times (\vec{b} + c\,\vec{a}) = \vec{a} \times \vec{b} + c\,(\vec{a} \times \vec{a}) = \vec{a} \times \vec{b}$$

Drawing a diagonal in either direction divides the parallelogram into two triangles:

the area of triangles

Each has half the area of the whole parallelogram, which gives us a vector formula for the area of a triangle:

$$\text{Area} = \frac{1}{2}\,|\vec{a} \times \vec{b}|$$

If we write this as $\frac{1}{2}\,|\vec{a}|\,|\vec{b}_{\perp \vec{a}}|$, we get something equivalent to the elementary “$\frac{1}{2}\,\text{base} \times \text{height}$”.

If we the angle between is , then we can also express these area with a sine:

We even get the right sign from $\sin\theta$, as long as we consider clockwise angles to be negative.
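Another quick numeric check, on vectors I’ve chosen arbitrarily: the cross product agrees with $|\vec{a}|\,|\vec{b}|\sin\theta$, and half of it gives the triangle area:

```python
import math

def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]

a, b = (2.0, 0.0), (1.0, 1.0)  # 45 degrees apart

theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])

# Parallelogram area two ways: the cross product vs. |a||b|sin(theta):
assert abs(cross(a, b) - math.hypot(*a) * math.hypot(*b) * math.sin(theta)) < 1e-12

# The triangle spanned by a and b has half that area:
print(cross(a, b) / 2)  # 1.0
```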

If we draw the triangle formed by $\vec{a}$ and $\vec{b}$ and call the third side $\vec{c}$, with opposite angles $A$, $B$, $C$,

the law of sines

we can write three equivalent expressions for the unsigned area, in terms of each pair of sides, which leads to the Law of Sines:

$$\frac{1}{2}\,ab\sin C = \frac{1}{2}\,bc\sin A = \frac{1}{2}\,ca\sin B \quad \Longrightarrow \quad \frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}$$

The Shoelace Formula

Now we’ll demonstrate the usefulness of “negative areas”.

Negative areas mean that we can find the area of any polygon if we know the coordinates of all its vertices—we just carve it into triangles:

the area of a slice of a polygon

With $O$ as the origin, the total area of this polygon is the sum of the areas of all the triangles:

$$\text{Area} = \sum_i \text{Area}\big(O,\ \vec{v}_i,\ \vec{v}_{i+1}\big)$$

(The sum is periodic, so $\vec{v}_{n+1} = \vec{v}_1$.)

The area of each triangle is equal to half the cross product of its two sides. We can use either the two sides formed by one vertex and the next (akin to taking $\frac{1}{2}\,\vec{a} \times \vec{b}$ in the left parallelogram above) or those formed by one vertex and the vector to the next vertex (akin to $\frac{1}{2}\,\vec{a} \times (\vec{b} - \vec{a})$ on the right):

$$\text{Area} = \frac{1}{2}\sum_i \vec{v}_i \times \vec{v}_{i+1} = \frac{1}{2}\sum_i \vec{v}_i \times (\vec{v}_{i+1} - \vec{v}_i)$$

These formulas are equivalent because $\vec{v}_i \times \vec{v}_i = 0$. So we’ve expressed the entire area in terms of only the vertex vectors, drawn with arrows below:

a sliced-up polygon

This is called the Shoelace Formula for the area of a polygon, for reasons you can read on the wiki. It’s simple! It could be taught in high school![2]
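And it’s as short in code as it is on paper. A Python sketch (the function name is my own; the result is signed, positive for counterclockwise vertex order):

```python
def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]

def shoelace_area(vertices):
    """Signed area of a polygon: half the sum of the cross products of
    consecutive vertex vectors. Counterclockwise order gives a positive sign."""
    n = len(vertices)
    return sum(cross(vertices[i], vertices[(i + 1) % n]) for i in range(n)) / 2

# A unit square, counterclockwise starting from the origin:
print(shoelace_area([(0, 0), (1, 0), (1, 1), (0, 1)]))  # 1.0

# The same square listed clockwise comes out negative:
print(shoelace_area([(0, 0), (0, 1), (1, 1), (1, 0)]))  # -1.0
```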

Remarkably, this still works if the origin is outside of the polygon—the oriented areas become negative for the part outside of the polygon, in exactly the way that cancels out all the overlapping contributions:

demonstrating the shoelace formula

The corresponding sum is:

with the sign in the second term becoming negative because the original vertices were in a clockwise (negatively-oriented) order; this becomes a minus sign out front in the second line where I’ve rewritten them in counterclockwise order.

And it works even if the polygon’s vertices loop back on themselves, as long as we cancel out all of the negatively-oriented areas correctly:

shoelace with cancellations

The above stands for the corresponding sum of oriented triangle areas.

This one’s harder to see, but if you stack all of those pieces of the sum on top of each other everything will cancel exactly to give the area of the l.h.s. polygon.

Multiplication 3: The Geometric Product

To recap: we asked that our multiplication operation on vectors have four properties:

  1. intuitiveness, and in fact we found we needed two products to match both senses of intuition for parallel and perpendicular vectors
  2. linearity, which is manifest in the definitions $\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y$ and $\vec{a} \times \vec{b} = a_x b_y - a_y b_x$.
  3. that they were purely “geometric” concepts, independent of a coordinate system. This we have satisfied by defining both operations in terms of projection vectors, though the values of the multiplications do still depend on “length” through the length of the projections.
  4. that they depend on the exact vectors involved. We somewhat satisfied this; each multiplication has some vectors it considers equivalent (like sliding the parallelogram) but otherwise does depend on its inputs.

We made use of the purely-geometric concepts of projection/rejection in defining the two products, so we should expect they will be well-formed. But let’s also check: what do these results do if we rotate the whole coordinate system? If we express both of our input vectors in terms of their “angle with respect to the $\hat{x}$ axis”, we have:

$$\vec{a} = |\vec{a}|\,(\cos\alpha,\ \sin\alpha) \qquad \vec{b} = |\vec{b}|\,(\cos\beta,\ \sin\beta)$$

Then we can write both products in terms of these angles:

$$\vec{a} \cdot \vec{b} = |\vec{a}|\,|\vec{b}|\,(\cos\alpha\cos\beta + \sin\alpha\sin\beta) = |\vec{a}|\,|\vec{b}|\cos(\beta - \alpha)$$

$$\vec{a} \times \vec{b} = |\vec{a}|\,|\vec{b}|\,(\cos\alpha\sin\beta - \sin\alpha\cos\beta) = |\vec{a}|\,|\vec{b}|\sin(\beta - \alpha)$$

We get two trig identities, each depending on the difference of the angles—which means they will be rotational invariants: $\theta = \beta - \alpha$. Therefore we can conclude that a rotation of the whole space by $\phi$, which would take $\alpha \to \alpha + \phi$ and $\beta \to \beta + \phi$, will have no effect on the values of these products.

This also suggests we can write a Pythagorean theorem with the two products:

$$(\vec{a} \cdot \vec{b})^2 + (\vec{a} \times \vec{b})^2 = |\vec{a}|^2\,|\vec{b}|^2$$

This is called Lagrange’s Identity. Lagrange’s Identity lets us write a sort of “polarization identity” for $\vec{a} \times \vec{b}$ as well:

$$\vec{a} \times \vec{b} = \pm\sqrt{|\vec{a}|^2\,|\vec{b}|^2 - (\vec{a} \cdot \vec{b})^2}$$

So both products can be derived from only the definition of length. But the cross product involves a square root and therefore a $\pm$, which isn’t too surprising.
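A numeric spot-check of Lagrange’s Identity, on arbitrary integer-valued vectors so the equality is exact:

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]

a, b = (2.0, 1.0), (-1.0, 3.0)

# (a . b)^2 + (a x b)^2 = |a|^2 |b|^2
assert dot(a, b) ** 2 + cross(a, b) ** 2 == dot(a, a) * dot(b, b)
```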

Interestingly, Lagrange’s Identity suggests we might think of the two products as two components of a vector in a new vector space—something we didn’t even consider at the outset of this “multiplication” section. Calling its basis $(\hat{u}, \hat{v})$, we get a new candidate multiplication operation, called the “geometric product”, which I’ll write as $\ast$ as in our original experiments:

$$\vec{a} \ast \vec{b} = (\vec{a} \cdot \vec{b})\,\hat{u} + (\vec{a} \times \vec{b})\,\hat{v}$$

The geometric product maps two vectors (here in $\mathbb{R}^2$) to a new vector in a new space. Only in 2D does the output space have the same dimension as the inputs, as we’ll see.

the geometric product
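Tabulating it in code makes the “two components” picture concrete. A sketch (the pair representation and function name are mine):

```python
def geometric_product(a, b):
    """The 2D geometric product, tabulated as a pair: (a . b, a x b)."""
    return (a[0] * b[0] + a[1] * b[1],   # symmetric component: the dot product
            a[0] * b[1] - a[1] * b[0])   # antisymmetric component: the cross product

print(geometric_product((2, 0), (3, 0)))  # (6, 0): parallel, dot component only
print(geometric_product((2, 0), (0, 3)))  # (0, 6): perpendicular, cross component only
```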

The geometric product *is* linear in both arguments. Recall that our “linearity” argument did not allow us to define a single product which produced a scalar and satisfied both intuitive definitions of “multiplication”. Rather than choose one or the other as we did for the dot and cross, here we’ve simply chosen both.

Unlike the dot and cross products, the geometric product has no particular symmetry w.r.t. a swap of the two input vectors: its first term is symmetric while the second is antisymmetric.

How should we interpret this “geometric product”? When the two vectors are parallel it has only the first component; when they are perpendicular it has only the second. It is proportional to the lengths of both vectors. And we know it is invariant with respect to rotations of the original space $\mathbb{R}^2$.

So it would appear to be some kind of characterization of the relationship between the two vectors: perhaps it tells us how to transform one into the other? This will bear out, but we don’t have enough machinery to go any further at this point.

More Products

You could write two more products which look like the dot and cross products but make different choices for the signs. Their angular forms would then involve angle-addition trig identities:

$$a_x b_x - a_y b_y = |\vec{a}|\,|\vec{b}|\cos(\alpha + \beta) \qquad a_x b_y + a_y b_x = |\vec{a}|\,|\vec{b}|\sin(\alpha + \beta)$$

These two products are not rotationally invariant; instead they rotate at double the rate of the coordinate system. But they do obey Lagrange’s identity:

$$\big(|\vec{a}|\,|\vec{b}|\cos(\alpha + \beta)\big)^2 + \big(|\vec{a}|\,|\vec{b}|\sin(\alpha + \beta)\big)^2 = |\vec{a}|^2\,|\vec{b}|^2$$

These in fact correspond to the real and imaginary parts in the multiplication of complex numbers:

$$(a_x + i\,a_y)(b_x + i\,b_y) = (a_x b_x - a_y b_y) + i\,(a_x b_y + a_y b_x)$$

The angle-addition behavior implies that multiplication according to these products acts like “composing rotations”, relative to the fixed axis we measured from.
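You can check the correspondence with Python’s built-in complex numbers. Note also, as my own side observation consistent with the component formulas above, that conjugating the first factor recovers the dot and cross components instead:

```python
# Treat 2D vectors as complex numbers: (x, y) -> x + y*i.
a = complex(2, 1)   # the vector (2, 1)
b = complex(-1, 3)  # the vector (-1, 3)

# Plain complex multiplication gives the two "opposite-sign" products:
p = a * b
print(p.real, p.imag)  # a_x b_x - a_y b_y = -5,  a_x b_y + a_y b_x = 5

# Conjugating the first factor instead recovers the dot and cross products:
q = a.conjugate() * b
print(q.real, q.imag)  # a . b = 1,  a x b = 7
```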

We can also write a “fully general product” which simply distributes over all the basis vectors, the “tensor product”:

$$\vec{a} \otimes \vec{b} = a_x b_x\,(\hat{x} \otimes \hat{x}) + a_x b_y\,(\hat{x} \otimes \hat{y}) + a_y b_x\,(\hat{y} \otimes \hat{x}) + a_y b_y\,(\hat{y} \otimes \hat{y})$$

In the tensor product we don’t even try to combine any of the terms into scalars; we leave them all distributed out, and we consider basis elements like $\hat{x} \otimes \hat{y}$ and $\hat{y} \otimes \hat{x}$ to all be distinct from each other.

It is not obvious from the expression, but the tensor product does not depend on the choice of basis: $\vec{a} \otimes \vec{b}$ is a distinct “thing” within the space of all possible pairs of two vectors from our original space (though there are many pairs of vectors whose tensor products produce the same “tensor” as $\vec{a} \otimes \vec{b}$).

We won’t do much with this thing, but it’s worth remarking that our other products amount to particular ways of “mapping” the general tensor product to smaller spaces.

The dot product amounts to choosing $\hat{x} \otimes \hat{x},\ \hat{y} \otimes \hat{y} \to 1$ and $\hat{x} \otimes \hat{y},\ \hat{y} \otimes \hat{x} \to 0$, while the cross product chooses $\hat{x} \otimes \hat{y} \to 1$, $\hat{y} \otimes \hat{x} \to -1$, and $\hat{x} \otimes \hat{x},\ \hat{y} \otimes \hat{y} \to 0$.

The two “opposite-sign” products just mentioned make the opposite choices.
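Here’s the whole picture in a small sketch (the 2×2 table representation is my own choice): the tensor product keeps all four coefficients, and the other products are particular ways of collapsing them.

```python
def tensor(a, b):
    """Tensor product of two 2D vectors, kept as a 2x2 table of coefficients
    on the four basis elements xx, xy, yx, yy."""
    return [[a[0] * b[0], a[0] * b[1]],
            [a[1] * b[0], a[1] * b[1]]]

a, b = (2.0, 1.0), (-1.0, 3.0)
t = tensor(a, b)

# Dot product: send xx and yy to 1, the off-diagonal elements to 0:
print(t[0][0] + t[1][1])  # 1.0
# Cross product: send xy to 1 and yx to -1, the diagonal elements to 0:
print(t[0][1] - t[1][0])  # 7.0
```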

Footnotes

  1. What we’re describing here is a function which is “linear in its argument $\vec{v}$”. This is not really the same as a “linear function of an argument”, like the 1D line $f(x) = mx + b$; a function “linear in an argument” must have $f(0) = 0$.

  2. I distantly recall expecting to learn how to calculate polygon areas in high school. But we never did, all because high school math is afraid of vectors. In fact they’re a completely natural object to consider after some basic geometry and algebra. I think the kinds of things I’m demonstrating in this post would make a good Algebra II curriculum, and would feel far less arbitrary than what is normally considered at that point in one’s education: conic sections? Roots of polynomials? Who cares?