Translations Using Matrices

Dec 20, 2020

If you have taken a linear algebra course, you may be familiar with the idea that matrices can be used to represent linear transformations (or maps). For example, in two dimensions, the matrix: $\begin{pmatrix} 0 & 1 \\ 1 & 0 \\ \end{pmatrix}$ corresponds to a reflection across the line $$y = x$$. In general, a matrix can be used to represent any linear map. A map $$f: \mathbb{R}^n \rightarrow \mathbb{R}^n$$, which takes every point in $$n$$-dimensional space and maps it to a new location, is said to be linear if and only if it satisfies the following conditions:

• $$f(\vec{u} + \vec{v}) = f(\vec{u}) + f(\vec{v})$$ (addition)
• $$f(c\vec{u}) = cf(\vec{u})$$ (scalar multiplication)

In the above conditions, $$\vec{u}$$ and $$\vec{v}$$ are points in $$\mathbb{R}^n$$ and $$c$$ is a scalar. Many common types of transformations satisfy these conditions of linearity, including rotations about the origin, reflections across lines through the origin, and scalings. However, there is a notable and simple type of transformation that does not satisfy linearity: translations. A translation is a "shift" of all points in the space by the same amount; for example, moving all the points up by one unit.

To see why translations are not linear, observe that the scalar multiplication property implies that all linear transformations must map the origin $$\vec{0}$$ to itself: $f(\vec{0}) = f(0 \cdot \vec{u}) = 0 \cdot f(\vec{u}) = \vec{0}$ As any nonzero translation shifts the origin to a new location, it cannot be a linear transformation. Therefore, there is no way to express a nonzero translation in $$n$$ dimensions as an $$n \times n$$ matrix.
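To make this concrete, here is a small sketch in plain Python that tests both linearity conditions for one particular choice of $$\vec{u}$$, $$\vec{v}$$, and $$c$$. The function names (`reflect`, `translate`, `looks_linear`) are made up for this illustration. Note that passing the check for a single sample does not prove a map is linear, but failing it does prove the map is not.

```python
def reflect(p):
    """Reflection across the line y = x: swap the two coordinates."""
    x, y = p
    return (y, x)


def translate(p):
    """Translation that moves every point up by one unit."""
    x, y = p
    return (x, y + 1)


def add(u, v):
    """Componentwise sum of two 2D tuples."""
    return (u[0] + v[0], u[1] + v[1])


def scale(c, u):
    """Multiply a 2D tuple by the scalar c."""
    return (c * u[0], c * u[1])


def looks_linear(f, u, v, c):
    """Check both linearity conditions for this one choice of u, v, and c."""
    additive = f(add(u, v)) == add(f(u), f(v))
    homogeneous = f(scale(c, u)) == scale(c, f(u))
    return additive and homogeneous


u, v, c = (1, 2), (3, -4), 5
print(looks_linear(reflect, u, v, c))    # True
print(looks_linear(translate, u, v, c))  # False
print(translate((0, 0)))                 # (0, 1): the origin is moved
```

The last line shows the obstruction directly: the translation sends the origin to $$(0, 1)$$, which no linear map can do.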

This seems unfortunate, as translations appear often and we would like to be able to apply the powerful machinery we have from linear algebra to handle them. Luckily, there is a clever technique that allows us to represent translations using matrices. The main idea is to represent points in $$n$$-dimensional space using $$n+1$$ coordinates, such that the extra coordinate is always set to one. With this change, transformations will be represented by $$(n+1) \times (n+1)$$ matrices. Now, the first $$n$$ entries of the last column of the transformation matrix can be used to specify the amount and direction to translate the points by!
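As a rough sketch of this construction for any number of dimensions, the plain Python below builds the $$(n+1) \times (n+1)$$ matrix for a given offset and applies it to a point. The helper names (`translation_matrix`, `to_homogeneous`, `apply`) are my own for illustration, and matrices are stored as lists of rows.

```python
def translation_matrix(offset):
    """Build the (n+1) x (n+1) matrix that translates by `offset`."""
    n = len(offset)
    # Start from the identity so the original coordinates pass through unchanged.
    m = [[1 if i == j else 0 for j in range(n + 1)] for i in range(n + 1)]
    # The first n entries of the last column hold the translation amounts.
    for i, t in enumerate(offset):
        m[i][n] = t
    return m


def to_homogeneous(point):
    """Append the extra coordinate, set to one, to an n-dimensional point."""
    return list(point) + [1]


def apply(matrix, coords):
    """Multiply a matrix (list of rows) by a column of coordinates."""
    return [sum(a * b for a, b in zip(row, coords)) for row in matrix]


T = translation_matrix([1, -2, 4])       # a translation in three dimensions
p = to_homogeneous([5, 7, 0])            # [5, 7, 0, 1]
print(apply(T, p))                       # [6, 5, 4, 1]
```

Because the extra coordinate is one, each row of the product adds exactly one copy of the corresponding offset to the original coordinate, and the last row keeps the extra coordinate equal to one.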

As an example, let us construct a matrix that corresponds to a reflection across $$y = x$$, followed by a translation of 2 units to the right and 3 units up. Starting with a point $$\begin{pmatrix} a \\ b \end{pmatrix}$$, we first append the extra coordinate to get $$\begin{pmatrix} a \\ b \\ 1 \end{pmatrix}$$. Then, we want to find the matrix $$R$$ such that $R \begin{pmatrix} a \\ b \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ a \\ 1 \end{pmatrix}$ since this corresponds to a reflection across $$y = x$$. By inspection, this matrix is: $R = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ Next, we want to find the matrix $$T$$ such that $T \begin{pmatrix} b \\ a \\ 1 \end{pmatrix} = \begin{pmatrix} b + 2 \\ a + 3 \\ 1 \end{pmatrix}$ since this corresponds to the desired translation. By inspection, this matrix is: $T = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}$ Note that without the extra coordinate, we wouldn't have been able to construct a matrix to represent the translation! Finally, we can compute the matrix that represents the overall transformation $$M = TR$$, which equals: $\begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 3 \\ 0 & 0 & 1 \end{pmatrix}$ To summarize, in the modified coordinate space where every point has an extra coordinate that is set to one, the matrix $$M$$ corresponds to a reflection followed by a translation. In our original coordinate system, it is impossible to represent this transformation as a matrix.
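The product $$M = TR$$ can be double-checked with a few lines of plain Python. This is only a sketch: `matmul` and `apply` are hypothetical helpers written for this post, and the sample point $$(4, 9)$$ is an arbitrary choice.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply(matrix, coords):
    """Multiply a matrix (list of rows) by a column of coordinates."""
    return [sum(a * b for a, b in zip(row, coords)) for row in matrix]


R = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]  # reflection across y = x
T = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]  # 2 units right, 3 units up

M = matmul(T, R)                       # R is applied first, then T
print(M)                               # [[0, 1, 2], [1, 0, 3], [0, 0, 1]]

point = [4, 9, 1]                      # the point (4, 9) with the extra coordinate
print(apply(M, point))                 # [11, 7, 1]
print(apply(T, apply(R, point)))       # [11, 7, 1], same as applying M directly

print(matmul(R, T))                    # [[0, 1, 3], [1, 0, 2], [0, 0, 1]]
```

The last line is a reminder that the order of multiplication matters: $$RT$$ translates first and then reflects, which is a different transformation from $$M$$.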

As a side note, in this modified coordinate space, the extra coordinate is set to zero for vectors. The difference between points and vectors is that points represent locations whereas vectors represent directions. In this way, the extra coordinate is acting as a boolean flag to indicate whether a coordinate represents a point or a vector. The nice thing about this boolean flag is that it behaves well with the algebraic operations that we'd like to perform on coordinates. For example, if we multiply a vector by $$M$$, then only the reflection is applied, but not the translation. This behavior is desired as it does not really make sense to translate a direction. Furthermore, the boolean flag also behaves nicely with addition, as it correctly encodes that a direction plus a direction equals a direction ($$0 + 0 = 0$$), a position plus a direction equals a position ($$1 + 0 = 1$$), and a position plus a position is undefined ($$1 + 1 = 2$$)[1]. How elegant!
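Here is a short sketch of the flag in action, again in plain Python with made-up helper names (`apply`, `add`). It applies the matrix $$M$$ from the example to a point and to a vector with the same first two coordinates, and then looks at what addition does to the extra coordinate.

```python
def apply(matrix, coords):
    """Multiply a matrix (list of rows) by a column of coordinates."""
    return [sum(a * b for a, b in zip(row, coords)) for row in matrix]


def add(u, v):
    """Componentwise sum, including the extra coordinate."""
    return [a + b for a, b in zip(u, v)]


M = [[0, 1, 2], [1, 0, 3], [0, 0, 1]]  # reflection, then translation

point = [4, 9, 1]      # extra coordinate 1: a location
direction = [4, 9, 0]  # extra coordinate 0: a direction

print(apply(M, point))      # [11, 7, 1]: reflected and translated
print(apply(M, direction))  # [9, 4, 0]: reflected only

print(add(direction, direction)[-1])  # 0: still a direction
print(add(point, direction)[-1])      # 1: still a point
print(add(point, point)[-1])          # 2: neither a point nor a direction
```

With the extra coordinate set to zero, the last column of $$M$$ is multiplied by zero, which is exactly why the translation drops out for vectors.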

This idea of appending an extra coordinate is a specific form of a more general concept called homogeneous coordinates. I learned this material in a computer graphics course I took this semester, MIT's 6.837. As you may or may not expect, computer graphics involves a lot of linear algebra like this!

Footnotes

1. Both "3 miles north plus 2 miles east" and "New York plus 3 miles north" make sense, but what would "New York plus Boston" mean?