
Linear transformations

A linear transformation (also called a linear map or, in this context, a linear function) is a function \[f\] of vectors which has the following properties:

  • \[f(x+y)=f(x)+f(y)\] for any vectors \[x\] and \[y\].
  • \[f(ax)=af(x)\] for any vector \[x\] and any scalar \[a\].

These properties ensure that the function \[f\] has "no curvature": it behaves like a straight line through the origin, but possibly in higher dimensions.

This should not be mixed up with the "linear function" from school algebra, \[f(x)=a+bx\], which is affine rather than linear and does not fulfil the properties when \[a\ne 0\]. To show why, \[f(u+v)=a+b(u+v)=(a+bu)+(a+bv)-a=f(u)+f(v)-a\ne f(u)+f(v)\], which tells us that \[a+bx\] does not fulfil the requirement \[f(x+y)=f(x)+f(y)\].
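As a quick sanity check, here is a small Python sketch of the failure above; the values of \[a\], \[b\], \[u\], \[v\] are chosen arbitrarily:

```python
# f(x) = a + b*x is affine, not linear, whenever a != 0.
a, b = 3, 2
f = lambda x: a + b * x

u, v = 5, 7
lhs = f(u + v)        # a + b*(u+v) = 3 + 24 = 27
rhs = f(u) + f(v)     # 2a + b*(u+v) = 6 + 24 = 30
assert lhs == rhs - a # off by exactly a, as the algebra predicts
assert lhs != rhs

# A linear map must also send the zero vector to zero; f does not.
print(f(0))  # prints 3, i.e. the origin gets translated by a
```

The last line is the origin-preservation test: any genuinely linear \[f\] satisfies \[f(0)=0\].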

This brings us to the next point, being the reasoning behind the properties. A linear transformation must preserve the origin (setting \[a=0\] in the second property gives \[f(0)=0\]), and it maps a vector space into another vector space. The intuitive way of seeing it is that a linear transformation takes vectors and rotates/scales them, possibly projecting them onto a subspace. An affine function does the same, but at the end it also translates the result, which moves the origin.

Do note that if we combine multiple transformation matrices, say \[S\] and \[R\], where \[S\] is a stretching matrix and \[R\] is a rotation matrix, and apply them to a vector \[\vec{v}\] as \[SR\vec{v}\], the transformations are applied in the order \[S(R\vec{v})\], i.e. \[R\] first and then \[S\]. In general \[RS\ne SR\].

Showing a transformation \[T\] is linear

To show \[T:\mathbb{R}^{2}\to \mathbb{R}\], \[T\begin{pmatrix} x\\y \end{pmatrix}=2x-y\] is linear, we just have to show it fulfills both properties above.

First property (additivity):

\begin{align*} T \left( \begin{pmatrix} x_{1}\\y_{1} \end{pmatrix}+\begin{pmatrix} x_{2}\\y_{2} \end{pmatrix} \right)&=T \begin{pmatrix} x_{1}+x_{2}\\y_{1}+y_{2} \end{pmatrix}\\ &=2(x_{1}+x_{2})-(y_{1}+y_{2})\\ &=(2x_{1}-y_{1})+(2x_{2}-y_{2})\\ &=T\begin{pmatrix} x_{1}\\y_{1} \end{pmatrix}+T\begin{pmatrix} x_{2}\\y_{2} \end{pmatrix} \end{align*}

Second property (homogeneity):

\begin{align*} T \left( \lambda\begin{pmatrix} x\\y \end{pmatrix} \right)&=T \begin{pmatrix} \lambda x\\\lambda y \end{pmatrix}\\ &=2\lambda x-\lambda y\\ &=\lambda (2x-y)\\ &=\lambda T\begin{pmatrix} x\\y \end{pmatrix} \end{align*}
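The same two checks can be run numerically. This is a sketch in Python; the sample vectors and scalars are arbitrary choices, not exhaustive proof:

```python
# T(x, y) = 2x - y, checked on a few sample inputs.
def T(v):
    x, y = v
    return 2 * x - y

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, v):
    return (c * v[0], c * v[1])

# Additivity: T(u + v) == T(u) + T(v)
for u, v in [((1, 2), (3, -4)), ((0, 0), (5, 6)), ((-2, 7), (1, 1))]:
    assert T(add(u, v)) == T(u) + T(v)

# Homogeneity: T(c*v) == c*T(v)
for c, v in [(3, (1, 2)), (-1, (4, 5)), (0, (7, -8))]:
    assert T(scale(c, v)) == c * T(v)

print("both properties hold on all samples")
```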

Examples of non-linear \[T\] include \[T:\mathbb{R}\to \mathbb{R},T(x)=\left| x \right|\] (which fails homogeneity for negative scalars) and \[T:\mathbb{R}^{3}\to \mathbb{R}^{3},T(\mathbf{x})=\mathbf{x}+\begin{pmatrix} 1\\0\\0 \end{pmatrix}\] (a translation, which moves the origin).
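Both failures can be exhibited with single counterexamples, sketched here in Python:

```python
# Absolute value fails homogeneity for negative scalars.
T_abs = lambda x: abs(x)
assert T_abs(-1 * 5) != -1 * T_abs(5)   # |-5| = 5, but -|5| = -5

# Translation by (1, 0, 0) fails additivity (it moves the origin).
def T_shift(v):
    return (v[0] + 1, v[1], v[2])

u, v = (1, 2, 3), (4, 5, 6)
summed = (u[0] + v[0], u[1] + v[1], u[2] + v[2])
lhs = T_shift(summed)                                        # (6, 7, 9)
rhs = tuple(a + b for a, b in zip(T_shift(u), T_shift(v)))   # (7, 7, 9)
assert lhs != rhs
```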

Order when combining transformations

To demonstrate why the order is important (and why matrix multiplication is non-commutative), we should think of transformations as functions instead. Say we have two transformation matrices, \[A=\begin{pmatrix} 1&1\\0&1 \end{pmatrix}\] and \[B=\begin{pmatrix} 2&0\\0&3 \end{pmatrix}\].

Rewriting them as functions, we get \[T(x,y)=(x+y,y)\] for \[A\] and \[S(x,y)=(2x,3y)\] for \[B\].

Assume we want to apply \[T\] followed by \[S\]. Logically, it would be \[S(T(x,y))\], which would be equal to \[S(T(x,y))=(2x+2y,3y)\].

If we were to use the logic that "since \[T\] followed by \[S\] then it should be \[T(S(x,y))\]", we would get a different (incorrect) combined transformation, \[T(S(x,y))=(2x+3y,3y)\].

Relating back to matrices, since \[T(\mathbf{x})=A\mathbf{x},S(\mathbf{x})=B\mathbf{x}\] then \[S(T(\mathbf{x}))=S(A\mathbf{x})=B(A\mathbf{x})=BA\mathbf{x}\].
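The whole argument can be replayed in Python with the matrices \[A\] and \[B\] from above:

```python
# A is the shear T(x, y) = (x+y, y); B is the stretch S(x, y) = (2x, 3y).
# Applying T then S corresponds to the product BA, not AB.
def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

A = [[1, 1], [0, 1]]
B = [[2, 0], [0, 3]]
v = (1, 1)

step_by_step = apply(B, apply(A, v))   # S(T(v)) = (2x+2y, 3y) = (4, 3)
combined = apply(matmul(B, A), v)      # BA applied in one go
assert step_by_step == combined == (4, 3)
assert matmul(B, A) != matmul(A, B)    # order matters
print(step_by_step)
```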

Transformations and area

\[\text{area of image}=\text{area of object}\cdot\det(T)\]

See determinant.

Say we have a shape,

(Figure: a hexagonal shape with vertices labelled 1 to 6 and the origin labelled 0.)

What we can do is draw red lines from the origin to the vertices of the polygon and sum up the signed areas of the triangles \[(0,1,2),(0,2,3),(0,3,4),(0,4,5),(0,5,6),(0,6,1)\]. Since, as noted under determinant, area can be "negative" (as a result of orientation), the former three triangles would have positive area and the latter three negative area. Summing up everything then gives exactly the area of the shape.

Notice that the two red sides of each triangle can be represented as vectors (forming a parallelogram). Writing them as the columns of \[A=\begin{pmatrix} x_{1}&x_{2}\\y_{1}&y_{2} \end{pmatrix}\], the area of each triangle can be calculated as \[\frac{1}{2}\det(A)\], essentially taking half the area of the parallelogram.
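The signed-triangle sum can be sketched in Python. The hexagon coordinates below are hypothetical stand-ins for the figure; they are listed counter-clockwise, so every signed triangle area happens to come out positive:

```python
# Signed area of a polygon as the sum of signed triangle areas
# (0, p_i, p_{i+1}); each term is 1/2 * det of the matrix whose
# columns are p_i and p_{i+1}.
def det2(u, v):
    return u[0] * v[1] - u[1] * v[0]

def polygon_area(vertices):
    n = len(vertices)
    return sum(0.5 * det2(vertices[i], vertices[(i + 1) % n])
               for i in range(n))

hexagon = [(2, 0), (1, 2), (-1, 2), (-2, 0), (-1, -2), (1, -2)]
print(polygon_area(hexagon))                  # 12.0
print(polygon_area(hexagon[::-1]))            # -12.0: reversed orientation
```

Reversing the vertex order flips every determinant's sign, which is the orientation effect described above.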

Let \[T\] be a transformation matrix, defined as \[T=\begin{pmatrix} a&b\\c&d \end{pmatrix}\]. We apply \[T\] to each triangle to form a new matrix, call it \[B\], where,

\begin{align*} B&=TA\\ &=\begin{pmatrix} a&b\\c&d \end{pmatrix}\begin{pmatrix} x_{1}&x_{2}\\y_{1}&y_{2} \end{pmatrix}\\ &=\begin{pmatrix} x_{1}^{\prime}&x_{2}^{\prime}\\y_{1}^{\prime}&y_{2}^{\prime} \end{pmatrix}\\ \end{align*}

Then the area of the new transformed triangle is \[\frac{1}{2}\det(B)=\frac{1}{2}\det(TA)=\det(T)\cdot \frac{1}{2}\det(A)\], proving that \[\text{area of image}=\text{area of object}\cdot\det(T)\].

If we were to sum all the triangle areas after the transformation, we would get \[\sum(\text{area of triangles}\cdot\det(T))=\det(T)\cdot\sum(\text{area of triangles})\], so the area of the whole shape also scales by \[\det(T)\].
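A concrete numerical check of the scaling identity, with a hypothetical \[T\] and triangle matrix \[A\] chosen for illustration:

```python
# Applying T scales every (signed) triangle area by det(T).
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

T = [[2, 1], [0, 3]]   # det(T) = 6
A = [[1, 2], [3, 4]]   # triangle sides as columns; det(A) = -2
B = matmul(T, A)

# 1/2 det(B) = det(T) * 1/2 det(A): -6.0 on both sides.
assert 0.5 * det2(B) == det2(T) * (0.5 * det2(A))
print(0.5 * det2(B))   # -6.0 (signed: the orientation survives)
```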

Sometimes, if we're sure that the sign of the area of an object is irrelevant, we can replace \[\det(T)\] with \[\left| \det(T) \right|\] in the formula.

Reversing changes

Consider a matrix \[B\]:

\begin{align*} B= \begin{pmatrix} 2 & 0 \\ 0 & 2 \\ \end{pmatrix} \end{align*}

If we multiply \[B\] with \[A\] we get \[BA\]. To reverse the effects, we simply multiply \[BA\] with \[B^{-1}\] to form \[B^{-1}(BA)=IA=A\].

Suppose we have applied two transformation matrices, as in \[C(BA)\], where \[C\] is another transformation matrix. Since we applied \[B\] to \[A\] followed by \[C\], to get \[A\] back we have to undo \[C\] first and then \[B\] (think of it as having stored \[A\] inside a box labelled \[B\], then inside a box labelled \[C\]; to retrieve \[A\] one needs to open box \[C\] first, then box \[B\]). In general, \[(CB)^{-1}=B^{-1}C^{-1}\].
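A sketch of undoing the composition in Python; the specific \[A\], \[B\], \[C\] are arbitrary example matrices, and the inverses use exact rational arithmetic to avoid rounding:

```python
# Undoing C(BA): apply C^{-1} first, then B^{-1}.
from fractions import Fraction as F

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    # Inverse of a 2x2 matrix via the adjugate formula.
    a, b, c, d = M[0][0], M[0][1], M[1][0], M[1][1]
    det = a * d - b * c
    return [[F(d, det), F(-b, det)], [F(-c, det), F(a, det)]]

A = [[1, 2], [3, 4]]
B = [[2, 0], [0, 2]]
C = [[1, 1], [0, 1]]

transformed = matmul(C, matmul(B, A))
# Open box C, then box B: B^{-1} (C^{-1} (C B A)) = A.
recovered = matmul(inv2(B), matmul(inv2(C), transformed))
assert recovered == [[1, 2], [3, 4]]
print(recovered)
```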
