Understanding 3D matrix transforms

Translation, Scaling, Rotation, and Skewing?!

In elementary school, we are taught translation, rotation, re-sizing/scaling, and reflection. The first three are used heavily in computer graphics, and they're done using matrix multiplication.

If you've ever built the UI for a 2D or 3D game, you might have encountered transformations. Tutorials elsewhere describe them very superficially; they don't dive into the mathematics on which those concepts were built. That works if you're building fairly simple applications. At some point, though, it becomes necessary to know the math.

Here, I'm going to describe how transformations apply to points (and then objects) in a coordinate space.

Redefining points & vectors to fit our needs

A simple set of rules can help in reinforcing the definitions of points and vectors:

  • In an n-dimensional space, a point can be represented as an ordered n-tuple: a pair in 2D, a triple in 3D.
  • A vector can be added to a point to get another point.

[Figure: vector <a, b> can be added to point (x, y)]

  • Similarly, the difference of two points can be taken to get a vector.
  • A vector can be "scaled", i.e. multiplied by a scalar to increase or decrease its magnitude. If that scalar is negative, the vector is also flipped, i.e. rotated by 180 degrees.
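The rules above can be sketched in a few lines of JavaScript. The helper names (`addVector`, `subtractPoints`, `scaleVector`) are made up for this illustration and do not come from any library.

```javascript
// Points and vectors are both stored as plain { x, y } objects here;
// the distinction lives in how we use them (hypothetical helper names).
const addVector = (p, v) => ({ x: p.x + v.x, y: p.y + v.y });      // point + vector -> point
const subtractPoints = (p, q) => ({ x: p.x - q.x, y: p.y - q.y }); // point - point -> vector
const scaleVector = (v, k) => ({ x: v.x * k, y: v.y * k });        // scalar * vector -> vector

const p = { x: 1, y: 2 };
const v = { x: 3, y: 4 };

console.log(addVector(p, v));                    // { x: 4, y: 6 }
console.log(subtractPoints(addVector(p, v), p)); // { x: 3, y: 4 }
console.log(scaleVector(v, -1));                 // { x: -3, y: -4 } (flipped)
```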

Vector Space

Our n-dimensional vector space is described using the origin O(0, 0[, 0]). Any point can be derived as the sum of the origin O and a vector V.

[, a] notation demonstrates the concept for one higher dimension, i.e., P(x, y[, z]) is P(x, y) in a 2D context and P(x, y, z) in a 3D context.



I'm going to demonstrate how matrices can be used to translate, scale, and rotate any object consisting of vertices/control-points. Each transformation is applied to each point, rather than the object as a whole.

But, why should we use matrices for translation and scaling? After all, they are basic addition and multiplication operations on a 2D point. This is because of the associative property of matrix multiplication. You can multiply the matrices of multiple transformations to form one resulting matrix that can be directly applied on a point.

This is one reason why GPUs are optimized for fast matrix multiplications. In computer graphics, we need to apply lots of transforms to our 3D model to display it to the end-user on a 2D monitor. Those transforms are compiled down into one matrix which is applied to all the points in the 3D world.
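To make the associativity argument concrete, here is a minimal sketch. It assumes a 2D point is written as the row [x, y, 1] and each transform as a 3×3 matrix; the three matrices are arbitrary example values, and `multiply` and `applyToPoint` are hypothetical helpers.

```javascript
// Multiply two square matrices stored as arrays of rows.
function multiply(A, B) {
  return A.map((row) =>
    B[0].map((_, j) => row.reduce((sum, value, k) => sum + value * B[k][j], 0))
  );
}

// Multiply a row vector by a matrix.
function applyToPoint(p, M) {
  return M[0].map((_, j) => p.reduce((sum, value, k) => sum + value * M[k][j], 0));
}

// Three arbitrary example transforms.
const A = [[2, 0, 0], [0, 2, 0], [0, 0, 1]];
const B = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]];
const C = [[1, 0, 0], [0, 1, 0], [5, -3, 1]];

const p = [1, 2, 1];

// Applying the transforms one at a time...
const stepByStep = applyToPoint(applyToPoint(applyToPoint(p, A), B), C);
// ...gives the same point as applying the pre-multiplied matrix once.
const combined = applyToPoint(p, multiply(multiply(A, B), C));

console.log(stepByStep); // [ 1, -1, 1 ]
console.log(combined);   // [ 1, -1, 1 ]
```

The combined matrix is computed once, no matter how many points it is later applied to.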

Points as matrices

As we're going to be using matrices, a point needs to be represented as a matrix rather than an ordered set.

Before going into 3D space, we're going to first handle the simpler 2D case, where a point is written as a product of matrices.

A point is essentially the product of two matrices: one describing the point's coordinates and the other describing the unit vectors and origin of the vector space.

$$P = \begin{bmatrix} x & y & 1 \end{bmatrix} \begin{bmatrix} \hat{\imath} \\ \hat{\jmath} \\ O \end{bmatrix} = x\,\hat{\imath} + y\,\hat{\jmath} + O$$

Hence, we are going to shorten the matrix form to:

$$P = \begin{bmatrix} x & y & 1 \end{bmatrix}$$

1. Translation

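Suppose we want to translate a point P(x, y) by (Δx, Δy) to get P′(x + Δx, y + Δy).

$$\begin{bmatrix} x & y & 1 \end{bmatrix} \rightarrow \begin{bmatrix} x + \Delta x & y + \Delta y & 1 \end{bmatrix}$$

To do this translation, we multiply P by the translation matrix:

$$P \times T(\Delta x, \Delta y) = \begin{bmatrix} x & y & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \Delta x & \Delta y & 1 \end{bmatrix} = P'$$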
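As a quick check, the translation can be sketched in code. This assumes the row-vector form [x, y, 1] used in this article, so the offsets sit in the last row of the matrix; `translation` and `applyToPoint` are hypothetical helper names.

```javascript
// Translation matrix for row vectors [x, y, 1].
function translation(dx, dy) {
  return [
    [1, 0, 0],
    [0, 1, 0],
    [dx, dy, 1],
  ];
}

// Multiply a row vector [x, y, 1] by a 3x3 matrix.
function applyToPoint(p, M) {
  return [0, 1, 2].map((j) => p[0] * M[0][j] + p[1] * M[1][j] + p[2] * M[2][j]);
}

const moved = applyToPoint([2, 3, 1], translation(4, -1));
console.log(moved); // [ 6, 2, 1 ]
```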

2. Scaling

It is counter-intuitive to think of "scaling" a point, rather than an object. So let's take a rectangle centered at the origin. We want to zoom in 2x; by intuition, we multiply the coordinates of each point by 2.

[Figure: the inner rectangle is scaled 2x to produce the outer rectangle]

And it actually works. However, this doesn't work for an object that isn't centered at the origin: scaling also moves the whole object away from the origin.

[Figure: the smaller rectangle is scaled directly by 2x; the result is shifted to the top-right]

To solve this "automatic shifting" problem, we do scaling in three steps:

  • Translate the object so that its center lies on the origin. Call the translation vector V.
  • Scale the control-points one-by-one.
  • Reverse the first step, i.e. translate the object with vector -V.

Scaling transform: instead of multiplying the coordinates of each point by the scale, we can use the following matrix:

$$S(s_x, s_y) = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

To complete all three steps, we multiply three transformation matrices as follows:

$$P' = P \times T(-c_x, -c_y) \times S(s_x, s_y) \times T(c_x, c_y)$$

This is the full scaling transformation when the object's barycenter lies at c(x, y). Here s_x and s_y are the scale factors along each axis, and the barycenter c is just the average of all the control-points.
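The three steps can be sketched as one combined matrix. This is a minimal illustration assuming row vectors [x, y, 1]; all helper names are hypothetical and the rectangle is example data.

```javascript
const translation = (dx, dy) => [[1, 0, 0], [0, 1, 0], [dx, dy, 1]];
const scaling = (sx, sy) => [[sx, 0, 0], [0, sy, 0], [0, 0, 1]];

// 3x3 matrix product (arrays of rows).
function multiply(A, B) {
  return A.map((row) =>
    [0, 1, 2].map((j) => row[0] * B[0][j] + row[1] * B[1][j] + row[2] * B[2][j])
  );
}

// Row vector [x, y, 1] times a 3x3 matrix.
function applyToPoint(p, M) {
  return [0, 1, 2].map((j) => p[0] * M[0][j] + p[1] * M[1][j] + p[2] * M[2][j]);
}

// Barycenter = average of all control points.
function barycenter(points) {
  const sum = points.reduce((acc, p) => [acc[0] + p[0], acc[1] + p[1]], [0, 0]);
  return [sum[0] / points.length, sum[1] / points.length];
}

// Translate to the origin, scale, translate back: one matrix for all points.
function scaleAboutCenter(points, sx, sy) {
  const [cx, cy] = barycenter(points);
  const M = multiply(multiply(translation(-cx, -cy), scaling(sx, sy)), translation(cx, cy));
  return points.map((p) => applyToPoint(p, M));
}

// A 2x2 square whose center is (3, 3), not the origin.
const rect = [[2, 2, 1], [4, 2, 1], [4, 4, 1], [2, 4, 1]];
console.log(scaleAboutCenter(rect, 2, 2));
// [ [ 1, 1, 1 ], [ 5, 1, 1 ], [ 5, 5, 1 ], [ 1, 5, 1 ] ]
```

The scaled square is still centered at (3, 3), so the "automatic shifting" is gone.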

3. Rotation

2D rotation is fairly simple to visualize. It is done around the origin, and positive angles rotate counter-clockwise when the y-axis points up (on a screen whose y-axis points down, the same rotation appears clockwise).

[Figure: rotating (2, 1) by 90 degrees about the origin]

High school math tells us that a point P(x, y) becomes P′(X, Y) after rotating through θ, where

$$X = x\cos\theta - y\sin\theta, \qquad Y = x\sin\theta + y\cos\theta$$

Check https://matthew-brett.github.io/teaching/rotation_2d.html out for a proof/explanation!

The rotation matrix is fairly simple to follow:

$$R(\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Again, when we are rotating an object about its center, we must first bring its center to the origin via translation. Rotating an object around its barycenter c(x, y) then becomes:

$$P' = P \times T(-c_x, -c_y) \times R(\theta) \times T(c_x, c_y)$$
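A minimal sketch of rotation about a center, assuming row vectors [x, y, 1] and the standard formula X = x cos θ − y sin θ, Y = x sin θ + y cos θ. The helper names are hypothetical.

```javascript
const translation = (dx, dy) => [[1, 0, 0], [0, 1, 0], [dx, dy, 1]];

// Rotation matrix for row vectors [x, y, 1].
function rotation(theta) {
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [[c, s, 0], [-s, c, 0], [0, 0, 1]];
}

function multiply(A, B) {
  return A.map((row) =>
    [0, 1, 2].map((j) => row[0] * B[0][j] + row[1] * B[1][j] + row[2] * B[2][j])
  );
}

function applyToPoint(p, M) {
  return [0, 1, 2].map((j) => p[0] * M[0][j] + p[1] * M[1][j] + p[2] * M[2][j]);
}

// Bring the center to the origin, rotate, then move the center back.
function rotateAbout(p, center, theta) {
  const [cx, cy] = center;
  const M = multiply(multiply(translation(-cx, -cy), rotation(theta)), translation(cx, cy));
  return applyToPoint(p, M);
}

// Rotating (3, 1) by 90 degrees about (1, 1) lands (up to rounding) on (1, 3).
const rotated = rotateAbout([3, 1, 1], [1, 1], Math.PI / 2);
console.log(rotated.map((value) => Number(value.toFixed(6))));
```

Because `Math.cos(Math.PI / 2)` is not exactly zero in floating point, the raw result is only approximately (1, 3); the example rounds it for display.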

3D Transformations

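If you work with OpenGL or WebGL, you're going to work in a 3D vector space; hence, generalizing the previous three transforms to 3D makes them a lot more useful.

In a 3D space, a point is represented by a 1×4 matrix: the three coordinates plus the constant 1, just like the extra 1 in the 2D case.

$$P = \begin{bmatrix} x & y & z & 1 \end{bmatrix}$$

1. Translation

$$T(\Delta x, \Delta y, \Delta z) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \Delta x & \Delta y & \Delta z & 1 \end{bmatrix}$$

2. Scaling

$$S(s_x, s_y, s_z) = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$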

Again, we must translate an object so that its center lies on the origin before scaling it.

3. Rotation

Rotation is a complicated scenario for 3D transforms. Here, you need an axis around which you rotate the object.

Before generalizing the rotation to any axis, let's do it around the x-, y-, and z-axes. After doing it with one axis, the other two become fairly easy.

  • z-axis: Imagine a 3D coordinate system where the x-y plane is your screen/monitor. A point on this plane is (h, k, 0); when you rotate it about the z-axis, which points towards you, its z-component stays zero, i.e. it becomes (h′, k′, 0). Hence, you can treat the rotation as happening in 2D with the x-y coordinates only.

$$R_z(\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

  • y-axis: Here, you are rotating in the z-x plane with y unchanged. It can be treated as a 2D rotation with the z-x coordinates only.

$$R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & -\sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ \sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

  • x-axis: Again, the rotation is in the y-z plane and x is unchanged.

$$R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
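The three axis rotations can be sketched as 4×4 matrices. This assumes row vectors [x, y, z, 1] and a right-handed convention in which a quarter turn takes x to y (about z), z to x (about y), and y to z (about x); the function names are hypothetical.

```javascript
function rotationZ(theta) {
  const c = Math.cos(theta), s = Math.sin(theta);
  return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]];
}

function rotationY(theta) {
  const c = Math.cos(theta), s = Math.sin(theta);
  return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]];
}

function rotationX(theta) {
  const c = Math.cos(theta), s = Math.sin(theta);
  return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]];
}

// Row vector [x, y, z, 1] times a 4x4 matrix.
function applyToPoint(p, M) {
  return M[0].map((_, j) => p.reduce((sum, value, k) => sum + value * M[k][j], 0));
}

// Round away floating-point noise for display (the "+ 0" turns -0 into 0).
const tidy = (p) => p.map((value) => Math.round(value * 1e6) / 1e6 + 0);

const quarterTurn = Math.PI / 2;
console.log(tidy(applyToPoint([1, 0, 0, 1], rotationZ(quarterTurn)))); // [ 0, 1, 0, 1 ]
console.log(tidy(applyToPoint([0, 0, 1, 1], rotationY(quarterTurn)))); // [ 1, 0, 0, 1 ]
console.log(tidy(applyToPoint([0, 1, 0, 1], rotationX(quarterTurn)))); // [ 0, 0, 1, 1 ]
```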

Generalization to any axis: An axis is essentially a 3D line. It can be characterized with a point A on that line and a vector L along the line.

To rotate P about an axis, we choose A to be the intersection of the axis and its perpendicular passing through P, i.e. the orthogonal projection of P on the axis.

To do the transformation, we translate A to the origin and then rotate the vector L until it is aligned with one of the coordinate axes (we'll use the z-axis here).

  1. Translate A to the origin.
  2. Rotate vector L about the y-axis so that it lies in the y-z plane.

[Figure: the original vector (unchanged). We must rotate as shown by the curved arrow. The vector is projected on the x-y plane for clarity.]

  3. Rotate the vector about the x-axis so that it is aligned with the z-axis.

Now, we have transformed our coordinates so that our axis is aligned with the z-axis. We can apply the R(z) transform directly, provided we have the angle alpha, which is the rotation we want.

After applying the R(z) rotation, we must undo the three preliminary transformations in reverse order.

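Overall, the whole rotation can be written as the product of 7 matrices:

$$P' = P \times T(-A) \times R_y(\beta) \times R_x(\gamma) \times R_z(\alpha) \times R_x(-\gamma) \times R_y(-\beta) \times T(A)$$

Here β and γ stand for the angles used in steps 2 and 3, and α is the rotation we actually want.

Remember, these seven transformations can be multiplied beforehand to form one matrix, which is then applied to each control point. This is the beauty of matrices in the world of graphics.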
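The seven-matrix product can be sketched as follows. This is an illustration under stated assumptions, not a production routine: it uses row vectors [x, y, z, 1], right-handed axis rotations, and my own choice of angles `beta = atan2(-lx, lz)` and `gamma = atan2(ly, hypot(lx, lz))` for steps 2 and 3. Any point A on the axis works for the translation step. All names are hypothetical.

```javascript
function multiply(A, B) {
  return A.map((row) =>
    B[0].map((_, j) => row.reduce((sum, value, k) => sum + value * B[k][j], 0))
  );
}

function applyToPoint(p, M) {
  return M[0].map((_, j) => p.reduce((sum, value, k) => sum + value * M[k][j], 0));
}

const translation = (dx, dy, dz) =>
  [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [dx, dy, dz, 1]];

function rotationX(t) {
  const c = Math.cos(t), s = Math.sin(t);
  return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]];
}
function rotationY(t) {
  const c = Math.cos(t), s = Math.sin(t);
  return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]];
}
function rotationZ(t) {
  const c = Math.cos(t), s = Math.sin(t);
  return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]];
}

// A: a point on the axis, L: direction of the axis, alpha: rotation angle.
function rotationAboutAxis(A, L, alpha) {
  const [lx, ly, lz] = L;
  const beta = Math.atan2(-lx, lz);                // step 2: bring L into the y-z plane
  const gamma = Math.atan2(ly, Math.hypot(lx, lz)); // step 3: align L with the z-axis
  const steps = [
    translation(-A[0], -A[1], -A[2]),
    rotationY(beta),
    rotationX(gamma),
    rotationZ(alpha),
    rotationX(-gamma),
    rotationY(-beta),
    translation(A[0], A[1], A[2]),
  ];
  // Left-to-right product, matching the row-vector convention.
  return steps.reduce((product, step) => multiply(product, step));
}

// Rotate (2, 1, 0) by 90 degrees about the vertical line through (1, 1, 0).
const M = rotationAboutAxis([1, 1, 0], [0, 0, 1], Math.PI / 2);
console.log(applyToPoint([2, 1, 0, 1], M).map((value) => Number(value.toFixed(6))));
// approximately [ 1, 2, 0, 1 ]
```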

This method of rotation suffers from gimbal lock; hence, a more advanced method called "quaternion rotation" is employed in real-world implementations. I'll discuss that in a separate story!

Matrices in code with PixiJS

PixiJS is a 2D graphics engine written around WebGL. The @pixi/math package contains a Matrix class. Let's see how it's documented.

constructor(a=1, b=0, c=0, d=1, tx=0, ty=0);

This constructor creates a matrix as follows:

$$M = \begin{bmatrix} a & b & 0 \\ c & d & 0 \\ t_x & t_y & 1 \end{bmatrix}$$

The documentation shows the transpose of this form. However, the code transforms a point by multiplying it with the form above.

On multiplying a point [x, y, 1] with the above matrix, you get:

$$\begin{bmatrix} x & y & 1 \end{bmatrix} M = \begin{bmatrix} a x + c y + t_x & b x + d y + t_y & 1 \end{bmatrix}$$

a and d act as the scaling factors in the x- and y-directions. c and b are called the y-skew and x-skew. t_x and t_y are the translations in the x- and y-directions.

PixiJS allows you to multiply this matrix with a translation, rotation, or scaling transform. It also provides basic matrix operation methods like identity, inverse, and application to a point.
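To see the six parameters at work without pulling in PixiJS, here is a small stand-in. `createMatrix` and `applyMatrix` are hypothetical names for this sketch only; they mirror the constructor's parameter order and the point formula above, not the library's actual methods.

```javascript
// Plain-object stand-in for a 2D affine matrix with the same six fields.
function createMatrix(a = 1, b = 0, c = 0, d = 1, tx = 0, ty = 0) {
  return { a, b, c, d, tx, ty };
}

// [x, y, 1] times the matrix: x' = a*x + c*y + tx, y' = b*x + d*y + ty.
function applyMatrix(m, p) {
  return {
    x: m.a * p.x + m.c * p.y + m.tx,
    y: m.b * p.x + m.d * p.y + m.ty,
  };
}

const identity = createMatrix();
console.log(applyMatrix(identity, { x: 7, y: 9 })); // { x: 7, y: 9 }

// Scale by (2, 3), then translate by (10, 20).
const m = createMatrix(2, 0, 0, 3, 10, 20);
console.log(applyMatrix(m, { x: 1, y: 1 })); // { x: 12, y: 23 }
```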

Transform class

The Matrix class is bare-bones and doesn't do any bookkeeping. It also doesn't keep a parent-child relationship, which is particularly important for PixiJS as it uses a hierarchical object structure to draw the UI.

Transform objects have four transformation properties of type ObservablePoint. An ObservablePoint represents an (x, y) ordered pair that triggers a callback when its value is modified.

  • position : The position of the object relative to the parent; in other words, the translation for this object.
  • scale : The scale along the X- and Y-axes.
  • pivot : The point around which the object is rotated.
  • skew : The shear factors along the X- and Y-axes for skewing (I'll talk about skewing/shear-mapping later on!)

The rotation property is the angle of rotation, in radians, about the pivot point.

On top of the Matrix class, Transform provides these features:

  • Individual setting of the five transformation arguments. The matrix is automatically updated.
  • The pivot property allows you to apply rotation around any center point rather than the origin.

A Transform object represents the transformation of an object w.r.t. its parent. Hence, it has two matrices:

  1. localTransform: the transformation w.r.t. the immediate parent. For example, if you scale a rectangle drawn inside a parent rectangle that is rotated, the child is inherently rotated too.
  2. worldTransform: the resulting transformation, which is essentially the product of the parent's worldTransform and localTransform. The parent can be specified using the updateTransform(parent) method.

[Figure: the parent's transformation **must** be done first, since matrix multiplication isn't commutative.]
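A small sketch of why the order matters. It assumes the row-vector form [x, y, 1], in which the child's local matrix is the left factor and the parent's world matrix (which has to be up to date before the child is updated) is the right factor. The matrices are example values and the helper names are hypothetical; this is not PixiJS's own code.

```javascript
const translation = (dx, dy) => [[1, 0, 0], [0, 1, 0], [dx, dy, 1]];
const scaling = (sx, sy) => [[sx, 0, 0], [0, sy, 0], [0, 0, 1]];

function multiply(A, B) {
  return A.map((row) =>
    [0, 1, 2].map((j) => row[0] * B[0][j] + row[1] * B[1][j] + row[2] * B[2][j])
  );
}

function applyToPoint(p, M) {
  return [0, 1, 2].map((j) => p[0] * M[0][j] + p[1] * M[1][j] + p[2] * M[2][j]);
}

const parentWorld = translation(100, 50); // parent sits at (100, 50) in the world
const local = scaling(2, 2);              // child is drawn at twice its size

// Child space -> parent space -> world space.
const world = multiply(local, parentWorld);
console.log(applyToPoint([1, 1, 1], world)); // [ 102, 52, 1 ]

// Swapping the factors scales the parent's offset as well, which is wrong here.
const swapped = multiply(parentWorld, local);
console.log(applyToPoint([1, 1, 1], swapped)); // [ 202, 102, 1 ]
```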


I've touched on the word skew multiple times here. It is also called shear mapping, transvection, or just shearing. It's related to the physics term "shear stress", which occurs when a force is applied horizontally to an object whose base is fixed.

[Figure: the dashed rectangle is deformed by the force τ while its base is fixed. As a result, it is moved by delta L. Courtesy of Wikipedia.]

"Shear mapping displaces each point in fixed direction, by an amount proportional to its signed distance from the line that is parallel to that direction and goes through the origin" (Wikipedia)

The shear stress diagram depicts a horizontal shear mapping. Here the displacement occurs along the x-axis, and is proportional to the distance from the x-axis (i.e. the y-coordinate).

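$$x' = x + k_x y, \qquad y' = y$$

Similarly, vertical shear mapping occurs along the y-axis.

$$x' = x, \qquad y' = y + k_y x$$

Horizontal and vertical shearing can be combined into the following generalized form, where k_x and k_y are the shear factors:

$$x' = x + k_x y, \qquad y' = k_y x + y$$

Shearing and scaling can be carefully combined to cause a rotation. This is because of the relation below:

$$x\cos\theta - y\sin\theta = \cos\theta\,(x - y\tan\theta), \qquad x\sin\theta + y\cos\theta = \cos\theta\,(x\tan\theta + y)$$

In other words, a rotation through θ is a shear with k_x = −tan θ and k_y = tan θ combined with a uniform scale of cos θ.

The shear mapping transformation matrix:

$$\begin{bmatrix} 1 & k_y & 0 \\ k_x & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Now you can understand why c and b in the PixiJS Matrix constructor were called the skews.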
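Here is a minimal sketch of a shear matrix, plus a numeric check that a shear with factors −tan θ and tan θ followed by a uniform scale of cos θ equals a rotation. It assumes row vectors [x, y, 1]; `kx` and `ky` are my own names for the horizontal and vertical shear factors, and the helpers are hypothetical.

```javascript
// x' = x + kx*y, y' = ky*x + y for row vectors [x, y, 1].
const shear = (kx, ky) => [[1, ky, 0], [kx, 1, 0], [0, 0, 1]];
const scaling = (sx, sy) => [[sx, 0, 0], [0, sy, 0], [0, 0, 1]];

function rotation(theta) {
  const c = Math.cos(theta), s = Math.sin(theta);
  return [[c, s, 0], [-s, c, 0], [0, 0, 1]];
}

function multiply(A, B) {
  return A.map((row) =>
    [0, 1, 2].map((j) => row[0] * B[0][j] + row[1] * B[1][j] + row[2] * B[2][j])
  );
}

function applyToPoint(p, M) {
  return [0, 1, 2].map((j) => p[0] * M[0][j] + p[1] * M[1][j] + p[2] * M[2][j]);
}

// Horizontal shear: the point slides along x in proportion to its y.
console.log(applyToPoint([1, 2, 1], shear(0.5, 0))); // [ 2, 2, 1 ]

// Shear by (-tan, +tan), then scale uniformly by cos: this is a rotation.
const theta = Math.PI / 6;
const viaShear = multiply(
  shear(-Math.tan(theta), Math.tan(theta)),
  scaling(Math.cos(theta), Math.cos(theta))
);
const direct = rotation(theta);
const sameMatrix = viaShear.every((row, i) =>
  row.every((value, j) => Math.abs(value - direct[i][j]) < 1e-12)
);
console.log(sameMatrix); // true
```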

Transformation decomposition

PixiJS's Matrix has an interesting method, decompose, which essentially converts a Matrix into a Transform object and spits out the position, scale, rotation, and skew properties.

I showed you how scaling & skewing together can resemble a rotation. To decompose a transformation matrix, we have to solve that equation to check whether scaling & skewing are being done separately or whether a rotation was intended.

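A PixiJS matrix is denoted as:

$$\begin{bmatrix} a & b & 0 \\ c & d & 0 \\ t_x & t_y & 1 \end{bmatrix}$$

Ignoring the translation, suppose the matrix was built from a scale (s_x, s_y) followed by a rotation through θ. Then, for every point (x, y):

$$a x + c y = s_x x\cos\theta - s_y y\sin\theta, \qquad b x + d y = s_x x\sin\theta + s_y y\cos\theta$$

Since this expression is defined for all x and y, the constants must equal each other:

$$a = s_x\cos\theta, \quad b = s_x\sin\theta, \quad c = -s_y\sin\theta, \quad d = s_y\cos\theta \quad\Longrightarrow\quad \boxed{\operatorname{atan2}(b, a) = \operatorname{atan2}(-c, d) = \theta}$$

If rotation was intended, the relation in the box will hold.

This algorithm is used in the Matrix#decompose method in PixiJS. Check out the code: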

// decompose() in Matrix; I've added comments
const skewX = -Math.atan2(-c, d); // -theta
const skewY = Math.atan2(b, a); // +theta

// if theta's are nearly equal, then delta is
// nearly zero
const delta = Math.abs(skewX + skewY);

if (delta < 0.00001 || Math.abs(PI_2 - delta) < 0.00001)
{
    transform.rotation = skewY; // theta
    transform.skew.x = transform.skew.y = 0;
}
else
{
    transform.rotation = 0;
    transform.skew.x = skewX;
    transform.skew.y = skewY;
}

// next set scale
transform.scale.x = Math.sqrt((a * a) + (b * b));
transform.scale.y = Math.sqrt((c * c) + (d * d));

// next set position
transform.position.x = this.tx;
transform.position.y = this.ty;
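To try the excerpt outside PixiJS, here is a standalone sketch of the same logic. It takes a plain object with the six matrix fields and returns a plain result object instead of writing into a Transform; the function shape is my own, and `PI_2` is assumed to be a full turn (2π).

```javascript
const PI_2 = Math.PI * 2; // assumed: one full turn

// Standalone version of the logic above, working on plain objects.
function decompose({ a, b, c, d, tx, ty }) {
  const skewX = -Math.atan2(-c, d); // -theta for a pure rotation
  const skewY = Math.atan2(b, a);   // +theta for a pure rotation
  const delta = Math.abs(skewX + skewY);
  const isRotation = delta < 0.00001 || Math.abs(PI_2 - delta) < 0.00001;

  return {
    position: { x: tx, y: ty },
    scale: { x: Math.sqrt(a * a + b * b), y: Math.sqrt(c * c + d * d) },
    rotation: isRotation ? skewY : 0,
    skew: isRotation ? { x: 0, y: 0 } : { x: skewX, y: skewY },
  };
}

// A matrix built from scale (2, 3) and a 30 degree rotation.
const theta = Math.PI / 6;
const rotated = decompose({
  a: 2 * Math.cos(theta), b: 2 * Math.sin(theta),
  c: -3 * Math.sin(theta), d: 3 * Math.cos(theta),
  tx: 5, ty: 7,
});
console.log(rotated.rotation, rotated.scale); // rotation near 0.5236, scale near (2, 3)

// A pure horizontal shear is reported as skew, not rotation.
const sheared = decompose({ a: 1, b: 0, c: 0.5, d: 1, tx: 0, ty: 0 });
console.log(sheared.rotation, sheared.skew); // rotation 0, skew.x near 0.4636
```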

Hey, I'm the creator of the Silcos kernel. I've created prototypes for playing Tonkin and editing B-Splines.

Additional reading:

  • Inside PixiJS's high-performance update loop
  • The Advent of Cooperative Scheduling in the JavaScript world
  • Curves & how they're stored in computers
