Part of a series on elementary linear algebra:

  1. (This post)
  2. Multiplication
  3. Division

Introduction

This post is part one of an experimental course in linear algebra. It will differ from the conventional treatment in a number of ways:

  • in the use of geometric and graphical arguments throughout, in preference to proofs and especially to matrix-based derivations.
  • in the use of “generalized inverses”, introduced initially to implement an analog of “division” for the dot and cross products.
  • in the use of exterior algebra and wedge products, with the goal of making the determinant, when it arrives, completely trivial.
  • in the use of unconventional notations for the atomic operations out of which more complex operations (like matrix inversion) will be defined.

The level of technicality will be inconsistent. This is not intended to be pedagogical, but rather to develop the ideas which would be in a pedagogical treatment in a logical order, while justifying the various decisions and notations. These discussions will tend to be more technical, but after they are all stripped away, the ideal result would be a course which could plausibly be taught at a late-high-school level—a high bar, or a low one, I suppose.

We begin with the simplest case.

One Dimension

Vectors

Behold a “vector”:

a vector

For now I’ll notate this as .

We can make a longer vector by doubling it:

This is the same as adding to itself:

We can easily imagine multiplying by any number , giving a new vector , which can be longer or shorter than the original vector. In this context we call the number a “scalar”, because it “scales” vectors.

There’s no reason can’t be negative too, reversing the direction:

Then we can add and subtract these, which simply adds or subtracts their coefficients:

Therefore must be a valid vector too, which we’ll just call “zero” or the “zero vector”, written as .

We can divide by a constant, which just acts on the coefficient:

We can even divide two vectors by each other, as long as the denominator isn’t zero:

If one is reversed, we get the negative:

(Note: this notion of dividing vectors is the first of many unconventional things in this post.)
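The 1D arithmetic above fits in a few lines of code. Here's a minimal sketch; the class name `Vec1`, and the choice to store only a coefficient on an unnamed base arrow, are my own framing, not the post's notation:

```python
class Vec1:
    """A 1D vector: some multiple of a fixed (unnamed) reference arrow."""

    def __init__(self, c):
        self.c = c  # the coefficient: "how many arrows"

    def __add__(self, other):
        return Vec1(self.c + other.c)   # adding vectors adds coefficients

    def __sub__(self, other):
        return Vec1(self.c - other.c)   # subtracting subtracts them

    def __mul__(self, scalar):
        return Vec1(self.c * scalar)    # a scalar "scales" the vector

    def __truediv__(self, other):
        if isinstance(other, Vec1):     # vector / vector -> a scalar
            if other.c == 0:
                raise ZeroDivisionError("cannot divide by the zero vector")
            return self.c / other.c
        return Vec1(self.c / other)     # vector / scalar -> a vector

v = Vec1(1.0)
w = v * 2 + v          # doubling and adding: three arrows long
print(w / v)           # 3.0 -- dividing two 1D vectors gives a scalar
print((v * -2) / v)    # -2.0 -- a reversed vector gives a negative ratio
```

Note that vector-divided-by-vector is the one operation here that returns a plain number rather than another `Vec1`.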

The set of all multiples of this single vector is what we’ll call a “vector space”. Calling this space , we can write it as the “set of all linear combinations of ”, along with the operations we’ve described, that is:

This is one-dimensional: all of the vectors are multiples of a single vector , making it more of a “line” than a “space”.

Note we can also get the same vector space by starting with any other vector in the same space, say, , because any multiple of the first can be written in terms of the second . So the two are equal as sets:

We can see this construction as an operation "" which gives the same set for both elements:

And in fact, apparently, it hardly matters what the vectors in the space are multiples of—the important feature of this “vector space” is the coefficients, which give it the same structure as the real numbers .

We can therefore “vectorize” any single object and get a vector space with the same structure as , for example, an apple or orange. We won’t go so far as to say that the spaces 🍏 and 🍊 are the same space, though—they only have the same structure. But we will identify the vector spaces generated by and as the same space—this is possible because we additionally have asserted that these objects have an innate relationship; each is a multiple of the other according to .

Length

Now, what is the length of this arrow ?

Well, it’s of, uh, whatever the length of is. Clearly our second vector above has twice its length:

And any constant multiple will have -times the length, e.g.:

Maybe we think of as having length “1 meter” and our as a meter stick. But we can just as easily measure in inches instead. We could use any vector in the space to measure the others. For example, our long-left-arrow would produce lengths which are half as large:

A vector, then, has no inherent length—not until we pick some reference vector to measure it with. Then its length carries “units” of that unit length. (The zero vector , however, does have an innate length of zero.)

Given our one-dimensional vector space , you could choose any specific vector (like or ), declare that it has length “1”, and this would give a length for every other vector. Note that for any choice, there will be exactly two vectors in our 1D vector-space with that length. It’s still just a number line. When you think of a vector space you should really not think of “things with lengths”—the vectors themselves are just dumb arrows that can be added and scaled; which one has length is really an additional choice you get to make—it’s no more “inherent” to the vectors themselves than the choice of whether to measure length in meters vs. inches.
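To make this concrete: since every vector in the line is a coefficient times the same arrow, “length in units of a chosen reference vector” is nothing but a ratio of coefficients. A minimal sketch (the function name is hypothetical):

```python
def length_in_units(v_coeff, ref_coeff):
    """Length of a 1D vector, measured in units of a chosen reference vector.

    Both vectors are represented by their coefficients on the same base arrow.
    Lengths are positive, so we take the absolute value of the ratio.
    """
    if ref_coeff == 0:
        raise ValueError("the zero vector cannot serve as a unit of length")
    return abs(v_coeff / ref_coeff)

print(length_in_units(2.0, 1.0))   # 2.0: twice the original arrow
print(length_in_units(2.0, -2.0))  # 1.0: the long left arrow measures it as one unit
```

Swapping the reference vector rescales every length in the space at once, which is exactly the meters-vs-inches point above.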

Multiplication

Above, we demonstrated the following operations:

  • vector times scalar , giving a vector.
  • vector divided by scalar, giving a vector.
  • vector divided by (nonzero) vector, giving a scalar, as in

What’s missing is the multiplication of two vectors. The obvious definition would be:

but we get something that is neither a multiple of , nor a scalar. Instead it has two units of ; if our original vector had represented meters, then this new thing has units of “meters squared”. Its length in units of is just , and, since we noted above that it doesn’t matter what you created your vector space out of, you could easily assign this new vector to a vector space:

This seems sensible enough, but we’ll have to see if it’s useful.
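One way to see what this product does is to track units explicitly: coefficients multiply, while powers of the underlying arrow add, just as meters times meters gives meters squared. A sketch; the pair representation `(coefficient, power)` is my own bookkeeping, not the post's:

```python
def vec_mul(a, b):
    """Multiply two 1D 'arrow' quantities.

    Each quantity is (coefficient, power), where power counts how many
    factors of the base arrow it carries: 0 for scalars, 1 for vectors,
    2 for the squared things produced here. Coefficients multiply and
    powers add, like units under multiplication.
    """
    (ca, pa), (cb, pb) = a, b
    return (ca * cb, pa + pb)

v = (2.0, 1)          # the vector 2 * arrow
print(vec_mul(v, v))  # (4.0, 2): neither a scalar nor a plain vector
```

The power-2 result lives in its own space, which is the point: the product has left the original vector space entirely.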

Two Dimensions

Vectors

So far our vectors aren’t very interesting: they act just like the real numbers , and the arrow is just a strange “unit” for these numbers. We’ll have to add another dimension to make vectors meaningfully distinct from numbers.

Now imagine we start with two objects. Unlike and above, we assume the two objects are unrelated to each other; they could as well stand for “apples” and “oranges” or anything else. We will draw them as arrows in two perpendicular directions:

2 2D vectors

We’ll notate these as, what else, and , for now.

Now, clearly we could create a one-dimensional vector space out of either of these individually, with all of the properties defined above. The two spaces would be:

It’s easy to imagine the next step: we create a single space out of both vectors by considering any linear combination of the two, i.e.

Some elements of include:

Of course, we get a unique vector for each ordered pair . So this vector space has the same structure as the two-dimensional coordinate plane: one vector for each point, one point for each vector.
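In code, “one vector for each ordered pair” just means building linear combinations of two fixed generators. A quick sketch, assuming perpendicular coordinates for the two arrows since that's how the post draws them:

```python
# The two generating arrows, with assumed perpendicular coordinates.
e1 = (1.0, 0.0)
e2 = (0.0, 1.0)

def combo(a, b):
    """The linear combination a*e1 + b*e2: one vector per ordered pair (a, b)."""
    return (a * e1[0] + b * e2[0], a * e1[1] + b * e2[1])

print(combo(3, 2))     # (3.0, 2.0)
print(combo(-1, 0.5))  # (-1.0, 0.5)
```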

Components

If we have some arbitrary vector in our two-dimensional , like , we can decompose it into a “component” along each of and individually. Here it’s

Because no amount of can ever point upward or downward, and no amount of can ever point rightward or leftward, there must only be one way to represent any given vector as a sum of our original two generating vectors. These are the “projections” onto each of the original vectors, which we write:

At this point all these arrows are cumbersome, so we’ll switch to a normal notation. Henceforth we write vectors as and will refer to our original two arrows as:

We’ll write the projection of onto as . Then the above reads, for :

Now, we originally constructed this vector space out of our two vectors and . In one dimension we observed that we could have generated the whole line by starting with any single arrow. In 2D, likewise, any pair of vectors which are not parallel could be used to generate the entire plane. Any such pair is called a “basis”; for example, two other bases would be:

and the components of the same vector in the bases from before would be:

Therefore the sets of numbers describing a specific vector, such as or , are the “coordinates of the vector in the basis” and depend on the choice of basis.

Any “basis” of , then, consists of a set of vectors which “span” , in that any vector in can be written uniquely as a linear combination of the basis vectors. All bases of a given space contain the same number of vectors, and that shared number is what we mean by the space’s “dimension”. Here, .
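Finding a vector’s coordinates in some other basis amounts to solving two linear equations in two unknowns, which in 2D can be done with one explicit formula. A sketch, with an arbitrary example basis (the denominator below is the 2×2 determinant, which the post promises to make trivial later; it's nonzero exactly when the basis vectors aren't parallel):

```python
def coords_in_basis(v, b1, b2):
    """Solve c1*b1 + c2*b2 = v for the coordinates (c1, c2) of v in basis (b1, b2).

    Valid whenever b1 and b2 are not parallel, so that det is nonzero.
    """
    det = b1[0] * b2[1] - b1[1] * b2[0]
    c1 = (v[0] * b2[1] - v[1] * b2[0]) / det
    c2 = (b1[0] * v[1] - b1[1] * v[0]) / det
    return (c1, c2)

v = (3.0, 2.0)                    # coordinates in the original basis
b1, b2 = (1.0, 1.0), (1.0, -1.0)  # a different, non-parallel pair
c1, c2 = coords_in_basis(v, b1, b2)
print(c1, c2)                     # 2.5 0.5
# Check: the combination c1*b1 + c2*b2 reproduces the same vector.
print(c1 * b1[0] + c2 * b2[0], c1 * b1[1] + c2 * b2[1])  # 3.0 2.0
```

Same vector, two descriptions: (3, 2) in one lens, (2.5, 0.5) in the other.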

We will always think of a vector as being a fundamentally “geometric” object, rather than a “set of coordinates”—the coordinates are a description of the vector in a basis. A basis, then, can be thought of as a “lens” we can look at the space through. Many such lenses are possible, and not all see the space the same way, but the space also has inherent properties that do not depend on the lens we use.

Length

In one dimension we observed that our vectors had no “inherent” length. But we were able to define the length of a vector in “units” of some standard vector. Choosing as this standard vector, we can write things like:

But we could just as easily choose as the standard, in which case the two vectors in the example would have lengths of 1 unit only.

Clearly we could do the same for any individual direction in two dimensions.

Now, how should we define the length of a two-dimensional vector in general? The obvious answer is to assert that some pair of basis vectors both have length “1”. The obvious choice is the pair of vectors we originally used to build the space :

Once two unit vectors are chosen we can define a length by a standard Pythagorean theorem:

But, just as in 1D, we could as easily choose some other set of vectors to be the “unit” of length, such as . If we choose , which is , then the Pythagorean length of a generic vector would be:

Therefore the “lengths” of vectors are only defined up to choice of basis. And, while it might seem like and are the natural choices to define as length “1”, there is actually nothing special about these vectors that makes them equal in length—they do not inherently even have a length, except for the fact that I’ve chosen to represent both with equal-length arrows in my notation.

If you define to have length 1, then so it is, and now has length in these units. This transformation would squash the “circle of vectors of length 1” in half in one direction, but the overall structure of the vector space is unchanged.
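The metric-dependence is easy to see numerically: the same vector gets a different “length” when we rescale which arrow counts as one unit. A sketch with assumed coordinates:

```python
import math

v = (1.0, 1.0)  # coordinates in the original basis

# Metric 1: both original arrows declared to have length 1.
print(math.hypot(v[0], v[1]))        # 1.4142... = sqrt(2)

# Metric 2: declare the *doubled* first arrow to be the unit instead.
# Measured in that unit, v's first coordinate is halved before Pythagoras.
print(math.hypot(v[0] / 2.0, v[1]))  # 1.1180... = sqrt(5)/2
```

Nothing about the vector changed; only the measuring stick did.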

The concept of length is not actually necessary to work with vectors at all. While all the vectors we’ll consider are naturally lengthed, it might make more sense to do without if considering, for example, a vector space of “apples” and “oranges”.

We will again aim to view our vectors as “geometric” or “graphical” objects, and therefore will not consider any vector to have an inherent length, except in view of a particular set of reference vectors—just as we did not consider vectors to have inherent coordinates, except with respect to a basis.

The operation of “length” can then be seen as an act of “measurement” with respect to a set of unit-length reference vectors. This choice of reference vectors is called choosing a “metric”. Just like choosing a “basis”, the choice of “metric” acts like a lens through which the vector space can be viewed and described. In the simplest cases, the chosen basis consists of unit-length vectors, which is the case when we use . But in general the two choices are independent—you could measure length relative to , while representing your vectors in terms of and , which would each have length . Then would have length rather than 1. So, in general, the metric and basis can be chosen or varied independently. Varying the basis is more common; in most instances the “metric” is simply taken to be some obvious choice and is not discussed directly. You rarely think about changing the metric until you’re working at a pretty advanced level.

Often we simplify the whole description by considering bases comprised of perpendicular and unit-length vectors, i.e., “orthonormal bases”. Then the lengths of all vectors can be equally well measured in any orthonormal basis using a Pythagorean theorem.
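We can check this agreement numerically: re-describe a vector in a rotated orthonormal basis, and the Pythagorean length comes out the same. A sketch (the 30° angle is an arbitrary choice):

```python
import math

v = (3.0, 4.0)
print(math.hypot(v[0], v[1]))  # 5.0 in the standard orthonormal basis

# Coordinates of the same vector in an orthonormal basis rotated by 30 degrees:
t = math.radians(30)
c1 = v[0] * math.cos(t) + v[1] * math.sin(t)   # component along the rotated first axis
c2 = -v[0] * math.sin(t) + v[1] * math.cos(t)  # component along the rotated second axis
print(math.hypot(c1, c2))      # 5.0 again: every orthonormal basis agrees
```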

We will often write the length of a vector as the same name without the vector notation, . (This rule will only apply when no subscripts are involved, which we’ll see in a moment.) Then we write the unit-length vector parallel to with a hat, as . As an example, the above basis becomes:

And in the basis we have our earlier .

For the rest of this post we’ll stick to a standard metric which takes the two vectors as having “unit length”, and we’ll notate them with hats to indicate this.

Projection and Rejection

We’ve been writing the same vector in a few different bases—using unit-vector bases for now, we had:

We’ll now start writing these decompositions generically, in terms of subscripted constants :

Clearly are the coordinates of in the basis of , and this holds for any vector. We’ll use numeric subscripts like only for the bases named like . These subscripts will always be defined relative to unit basis vectors. While it might seem more natural to define such that , this makes the scalar unitless, which seems to cause confusion. Instead we’ll adopt the convention that a subscript applied to the name of a vector never changes its “units”. If is in meters, then must be too.

We can generalize “the component along…” to any pair of vectors. For :

  • is the “vector projection of onto ”, which is the vector . Read this as “-parallel-”.
  • is the “scalar projection of onto ”; the component of along , which makes this exactly the length of the vector projection: . Read this as “-along-” or “-sub-”. Note that this makes the scalar projection a “length”- or “metric”-dependent quantity. 1

Here we are making a distinction between the object named "" and the object which we intend to represent the “oriented line” along which points—this line knows about the direction of but not its length; clearly this is all we need to define a projection. We will likewise use to refer to the oriented line rotated from the direction of . We will treat these “line”-typed objects more thoroughly when we reach higher dimensions.

Graphically, the vector projections are:

two projections

In each case we find a projection by “dropping a line” from the end of one vector until it meets the line of the other vector at a right angle. Note that is a scalar multiple of ; the exact ratio is , with being positive or negative.

If we know the angle between the vectors, then the projection vectors and their lengths are:

The terms “projection” and “component” are mostly interchangeable. We will typically use “component” when referring to a “projection on a basis vector”, whereas “projection” is an operation between any two vectors. One also sees the notations and for the projection.

When the vector is one of our numbered basis vectors , we will write the projections themselves with numbers . When the target of the projection is a unit vector, we will also sometimes omit the symbol, writing or , since in this case vector projection can be written in terms of the scalar projection as . The “components of a vector in a basis” are then just the projections onto each basis unit vector.

Given the projection , we can also define the “rejection” of with , which we will write as either:

  • , which reads as “ perpendicular to ”, or just “-perp-”.
  • , read as “-not-”.

The notation should be read as “projection onto the line perpendicular to ”, with being, here, the “oriented line” rotated from the direction of , which is then being used as the argument to the projection operation. It will be helpful later to also define the vector as vector rotated by ; i.e. the vector of the same length as along the line . This vector is depicted in the diagrams below. Clearly the rejection could be defined as projection onto either of or , but the former is more general.

The notation , on the other hand, conveys the sense of “rejection” by the vector itself. In higher dimensions this object will no longer be a vector, and the equivalence of the two definitions will be less trivial, but for now they mean the same thing.

The definition of “rejection” is just “whatever’s left after projection”:

Graphically:

rejections

We can also define a scalar rejection, in terms of the angles shown in the graphic:

Note that again the scalar rejection is not simply the length of the vector rejection, because it can be negative. Instead we have, using now:

Here we can see the projection and rejection together:

projections and rejections

In each case the projection and rejection together are two sides of a triangle, and their sum gives the original vector. Therefore we can write a Pythagorean theorem relating their lengths:
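As a preview of those coming definitions, here is a sketch using the standard dot-product formula for projection, which the post has not yet introduced; it lets us verify both the sum identity and the Pythagorean relation numerically:

```python
def dot(a, b):
    """Standard dot product of two 2D vectors (assumed orthonormal coordinates)."""
    return a[0] * b[0] + a[1] * b[1]

def project(v, u):
    """Vector projection of v onto the line of u."""
    s = dot(v, u) / dot(u, u)
    return (s * u[0], s * u[1])

def reject(v, u):
    """Vector rejection: whatever's left of v after projecting onto u."""
    p = project(v, u)
    return (v[0] - p[0], v[1] - p[1])

v, u = (3.0, 2.0), (2.0, 0.0)
p, r = project(v, u), reject(v, u)
print(p, r)                              # (3.0, 0.0) (0.0, 2.0)
print(p[0] + r[0], p[1] + r[1])          # 3.0 2.0 -- projection + rejection = v
# Pythagorean relation between the squared lengths:
print(dot(p, p) + dot(r, r), dot(v, v))  # 13.0 13.0
```

The two pieces are perpendicular by construction, which is why their squared lengths add up to the squared length of the original.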

We will be able to give more useful definitions for the projection and rejection, not making use of , once we introduce some additional tools. (Arguably we have not really yet defined what angles mean at all.)

Footnotes

  1. Note that my notation uses for the length . A different “natural” definition for the “scalar projection” would be , which has the nice property that , and makes a metric-independent quantity. I am not making this choice because I’ll be using a different expression for this quantity later on, , and I want to have the two mean distinct things. I’m okay with being metric-dependent because the other lower case scalars are too. Unlike those scalars, which were lengths and therefore positive-only, can be negative. 2

  2. But… what if we allowed the in the definition of length to include the negative branch of the ? Later on I’ll derive the orientedness of areas by picking one branch of the ; who’s to say we can’t do the same for lengths? Hmm.