DEV Community

San Askaruly

Posted on • Edited on • Originally published at github.com

A Visual Guide to Affine Transformations: Translation, Scaling, Rotation, and Shear

Gif affine transformation

Image credits: murray, Stack Exchange

What is affine transformation exactly?

Affine transformation is a technique used in image processing to modify the geometry of an image while preserving certain properties. It combines a linear transformation with a translation, which together can change an image's position, size, shape, and orientation [1, 2].

In simple terms, affine transformations allow you to:

  1. Move the image (translation)
  2. Resize the image (scaling)
  3. Tilt or slant the image (shear)
  4. Rotate the image (rotation)

These transformations can be applied individually or combined to achieve various effects [1, 2]. The key characteristic of affine transformations is that they preserve:

  • Straight lines (they remain straight after transformation)
  • Parallel lines (they stay parallel after transformation)
  • Ratios of distances between points on a line [3, 4]
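As a quick check of these properties, the sketch below applies an arbitrary affine map to two parallel segments and to a midpoint, using only NumPy. The matrix values and the helper name `apply_affine` are illustrative assumptions, not part of the original post.

```python
import numpy as np

def apply_affine(A, B, pts):
    """Apply T(p) = A @ p + B to each row of pts (shape N x 2)."""
    return pts @ A.T + B

# Arbitrary (hypothetical) linear part and translation
A = np.array([[1.5, 0.4],
              [-0.3, 0.8]])
B = np.array([10.0, -5.0])

# Two parallel segments: both have direction (2, 1)
seg1 = np.array([[0.0, 0.0], [2.0, 1.0]])
seg2 = np.array([[1.0, 3.0], [3.0, 4.0]])
t1 = apply_affine(A, B, seg1)
t2 = apply_affine(A, B, seg2)

# Direction vectors after the transformation
d1 = t1[1] - t1[0]
d2 = t2[1] - t2[0]
# The 2D cross product is zero when the directions are parallel
cross = d1[0] * d2[1] - d1[1] * d2[0]
print("still parallel:", np.isclose(cross, 0.0))

# The midpoint of a segment maps to the midpoint of the transformed segment
mid = seg1.mean(axis=0, keepdims=True)
mid_ok = np.allclose(apply_affine(A, B, mid), t1.mean(axis=0))
print("midpoint preserved:", mid_ok)
```

Distances and angles are generally not preserved by such a map; only the three properties listed above are guaranteed.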

Why is it useful to know?

In the context of image processing, affine transformations are commonly used to:

  • Correct distortions in images, such as those caused by camera angles or lens effects
  • Align or register multiple images
  • Prepare images for further analysis or processing

For example, in satellite imagery, affine transformations help correct distortions from wide-angle lenses and create accurate, flat maps from curved Earth images [2]. This makes it easier to analyze and work with the imagery without having to account for distortions.
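For alignment tasks such as registration, the six unknowns of an affine transformation can be recovered from three non-collinear point correspondences. The sketch below solves that small linear system with NumPy; the function name `estimate_affine` and the sample points are hypothetical. OpenCV offers `cv2.getAffineTransform` for the same three-point case.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Return the 2x3 matrix M with dst = M @ [x, y, 1] for three point pairs."""
    src = np.asarray(src_pts, dtype=np.float64)   # shape (3, 2)
    dst = np.asarray(dst_pts, dtype=np.float64)   # shape (3, 2)
    # Homogeneous source points: each row is [x, y, 1]
    src_h = np.hstack([src, np.ones((3, 1))])
    # Solve src_h @ M.T = dst for M.T (3 x 2); raises if the points are collinear
    return np.linalg.solve(src_h, dst).T

# Hypothetical correspondences: 2x scaling combined with a shift by (20, 50)
src_pts = [(0, 0), (1, 0), (0, 1)]
dst_pts = [(20, 50), (22, 50), (20, 52)]
M_est = estimate_affine(src_pts, dst_pts)
print(M_est)
```

With more than three (noisy) correspondences, a least-squares fit would be the usual replacement for the exact solve used here.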


Outline of this post

In this tutorial, we will cover and visualize the common affine transformations: translation, scaling, shear, and rotation, using Python with the OpenCV and NumPy libraries. The post is structured as follows: Definitions, Translation, Scale, Shear (in the X and Y directions), and Rotation.

Source code: https://github.com/tuttelikz/notes/blob/main/affine/code.ipynb


Definitions

Mathematically, an affine transformation is a relation between two images that can be expressed as a matrix multiplication (linear transformation), $A$, followed by a vector addition (translation), $B$.

$$T = A\cdot\begin{bmatrix} x \\ y \end{bmatrix} + B = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix}\cdot\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix}$$

$$T = \begin{bmatrix} a_{00}x + a_{01}y + b_{00} \\ a_{10}x + a_{11}y + b_{10} \end{bmatrix}$$

In Python, an affine transformation can be applied with cv2.warpAffine, which takes the $2 \times 3$ transformation matrix $M$, formed by placing $A$ and $B$ side by side ($M = [A \; B]$).

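import cv2
import numpy as np

src = cv2.imread("images/lena.jpg")
h, w = src.shape[:2]

# M is the 2x3 affine transformation matrix. The identity below is only a
# placeholder; each section that follows defines its own M.
M = np.float32([[1, 0, 0], [0, 1, 0]])

# dsize is the output size as (width, height)
warp_dst = cv2.warpAffine(
    src=src, M=M, dsize=(w, h)
)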
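To make the link between A, B, and M concrete, the following sketch stacks a linear part and a translation into a 2 × 3 matrix and checks that multiplying it by the homogeneous point [x, y, 1] gives the same result as A · [x, y] + B. The numeric values are arbitrary assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical linear part A (2x2) and translation B (2x1)
A = np.array([[2.0, 0.5],
              [0.0, 1.0]])
B = np.array([[20.0],
              [50.0]])

# M = [A | B] is the 2x3 matrix that cv2.warpAffine expects
M = np.hstack([A, B]).astype(np.float32)

# A sample point in column form, and its homogeneous version [x, y, 1]
p = np.array([[3.0], [4.0]])
p_h = np.vstack([p, [[1.0]]])

direct = A @ p + B          # T = A * p + B
via_M = M @ p_h             # same result through the 2x3 matrix
print(direct.ravel(), via_M.ravel())
```

Appending the constant 1 to each point is what lets a single matrix product carry both the linear part and the translation.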

Translation

A translation

To perform image translation, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} dx \\ dy \end{bmatrix}$$

where $dx$ is the desired shift in the horizontal direction and $dy$ is the shift in the vertical direction. Recalling that an affine transformation is $T = A\cdot\begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the translation for the simplest case. Let's take four points as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of them is located after translation (the point labels $A$ to $D$ below are unrelated to the matrices $A$ and $B$):

  • $A=(0,0) \rightarrow A'=(dx, dy)$
  • $B=(0,1) \rightarrow B'=(dx, 1+dy)$
  • $C=(1,0) \rightarrow C'=(1+dx, dy)$
  • $D=(1,1) \rightarrow D'=(1+dx, 1+dy)$

Translation

To translate an image using cv2.warpAffine mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np

dx = 20 # shift in X (pixels)
dy = 50 # shift in Y (pixels)
M = np.array([
    [1,0,dx],
    [0,1,dy]
]).astype(np.float32)

And here is the output of image translation:
Image translation
Note: You may have noticed that translation of the image is inverted when compared to the graph. This is because in an image, the origin is at the top-left corner, and the y-axis goes down, while in a typical coordinate system, the origin is at the bottom-left, and the y-axis goes up.
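The effect described in the note can be reproduced without OpenCV. The sketch below mimics a pure translation on a tiny array (integer, non-negative shifts only; the helper `translate_image` is hypothetical and is not how cv2.warpAffine is implemented). A positive dy moves pixels toward larger row indices, which is downward on screen.

```python
import numpy as np

def translate_image(img, dx, dy):
    """Shift img by dx columns (right) and dy rows (down); uncovered pixels become 0.

    Only non-negative integer shifts are handled in this sketch.
    """
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    if dx < w and dy < h:
        out[dy:, dx:] = img[:h - dy, :w - dx]
    return out

# Tiny 4x5 test image with a single bright pixel at row 0, column 1
img = np.zeros((4, 5), dtype=np.uint8)
img[0, 1] = 255

shifted = translate_image(img, dx=2, dy=1)
row, col = np.argwhere(shifted == 255)[0]
print(row, col)  # the pixel is now at row 1, column 3
```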


Scale

A scale

To perform image scaling, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} wx & 0 \\ 0 & wy \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $wx$ is the desired scaling factor in the horizontal direction and $wy$ is the scaling factor in the vertical direction. Recalling that an affine transformation is $T = A\cdot\begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the scaling transformation for the simplest case. Let's take four points as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of them is located after scaling:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(0, wy)$
  • $C=(1,0) \rightarrow C'=(wx, 0)$
  • $D=(1,1) \rightarrow D'=(wx, wy)$

Scale

To scale an image using cv2.warpAffine mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np

wx = 2 # scale in horizontal direction
wy = 3 # scale in vertical direction
M = np.array([
    [wx,0,0],
    [0,wy,0]
]).astype(np.float32)

And here is the output of image scaling:
Image scaling
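Note that with dsize=(w, h) the output canvas keeps the original size, so an enlarged image is cropped to its top-left portion. One way to keep the whole result, sketched below under the assumption that the warped image stays at non-negative coordinates, is to transform the four image corners and size the canvas from their extent. The helper name `output_size` and the 512 × 512 input size are hypothetical.

```python
import numpy as np

def output_size(M, w, h):
    """Return (width, height) large enough to hold a w x h image warped by M."""
    # Image corners in homogeneous form, one column per corner
    corners = np.array([[0, w, 0, w],
                        [0, 0, h, h],
                        [1, 1, 1, 1]], dtype=np.float64)
    warped = M @ corners                      # 2 x 4 array of (x, y) positions
    max_x, max_y = warped.max(axis=1)
    return int(np.ceil(max_x)), int(np.ceil(max_y))

wx, wy = 2, 3
M = np.array([[wx, 0, 0],
              [0, wy, 0]], dtype=np.float32)

dsize = output_size(M, w=512, h=512)
print(dsize)  # (1024, 1536)
```

The returned tuple could then be passed as dsize to cv2.warpAffine in place of (w, h).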


Shear

Shear in X direction

A shear in X

To perform shear in the X direction, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} 1 & \tan\phi \\ 0 & 1 \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $\tan\phi$ represents the amount of horizontal shear. Recalling that an affine transformation is $T = A\cdot\begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the horizontal shear for the simplest case. Let's take four points as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of them is located after shearing:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(\tan\phi, 1)$
  • $C=(1,0) \rightarrow C'=(1, 0)$
  • $D=(1,1) \rightarrow D'=(1+\tan\phi, 1)$

Shear in X direction

To horizontally shear an image using cv2.warpAffine mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np
import math

phi = math.pi/8
M = np.array([
    [1,math.tan(phi),0], # amount of shear in X direction
    [0,1,0]
]).astype(np.float32)

And here is the output of image shear in X direction:
Image shear in X direction
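A useful property to check here is that shear changes shape but not area: the determinant of its linear part is 1, whereas scaling multiplies area by wx · wy. The sketch below compares the two with NumPy, reusing the parameter values from this post; the variable names are illustrative.

```python
import math
import numpy as np

phi = math.pi / 8
shear_x = np.array([[1.0, math.tan(phi)],
                    [0.0, 1.0]])
scale = np.array([[2.0, 0.0],
                  [0.0, 3.0]])

# |det(A)| is the factor by which the transformation changes area
area_factor_shear = abs(np.linalg.det(shear_x))
area_factor_scale = abs(np.linalg.det(scale))
print(area_factor_shear, area_factor_scale)
```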

Shear in Y direction

A shear in Y

To perform shear in the Y direction, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \tan\psi & 1 \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $\tan\psi$ represents the amount of vertical shear. Recalling that an affine transformation is $T = A\cdot\begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the vertical shear for the simplest case. Let's take four points as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of them is located after shearing:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(0, 1)$
  • $C=(1,0) \rightarrow C'=(1, \tan\psi)$
  • $D=(1,1) \rightarrow D'=(1, 1+\tan\psi)$

Shear in Y direction

To vertically shear an image using cv2.warpAffine mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np
import math

psi = math.pi/10
M = np.array([
    [1,0,0],
    [math.tan(psi),1,0] # amount of shear in Y direction
]).astype(np.float32)

And here is the output of image shear in Y direction:
Image shear in Y direction
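To undo a transformation such as this shear, the affine matrix can be inverted. The sketch below does so with NumPy by extending M to a 3 × 3 matrix; the helper name `invert_affine` is hypothetical, and OpenCV provides `cv2.invertAffineTransform` for the same purpose.

```python
import math
import numpy as np

def invert_affine(M):
    """Return the 2x3 matrix that reverses the 2x3 affine matrix M."""
    # Append the row [0, 0, 1] so the matrix becomes square and invertible
    M3 = np.vstack([M, [0.0, 0.0, 1.0]])
    return np.linalg.inv(M3)[:2, :]

psi = math.pi / 10
M = np.array([[1.0, 0.0, 0.0],
              [math.tan(psi), 1.0, 0.0]])
M_inv = invert_affine(M)

# Shear a point, then apply the inverse to recover the original
p = np.array([1.0, 1.0, 1.0])            # homogeneous point (1, 1)
sheared = M @ p                           # (1, 1 + tan(psi))
restored = M_inv @ np.append(sheared, 1.0)
print(restored)
```

For a shear, the inverse is simply the same matrix with the sign of the shear term flipped, which the computed M_inv confirms.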


Rotation

A rotation

To perform rotation, the elements of $A$ and $B$ should be as follows:

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

$$B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

where $\theta$ represents the desired angle of rotation. Recalling that an affine transformation is $T = A\cdot\begin{bmatrix} x \\ y \end{bmatrix} + B$, we can derive the rotation for the simplest case. Let's take four points as an example, $(0,0), (0,1), (1,0), (1,1)$, and derive where each of them is located after rotation:

  • $A=(0,0) \rightarrow A'=(0, 0)$
  • $B=(0,1) \rightarrow B'=(-\sin\theta, \cos\theta)$
  • $C=(1,0) \rightarrow C'=(\cos\theta, \sin\theta)$
  • $D=(1,1) \rightarrow D'=(\cos\theta-\sin\theta, \sin\theta+\cos\theta)$

Rotation

To rotate an image using cv2.warpAffine mentioned previously, the $2 \times 3$ transformation matrix $M$ can be created using NumPy as follows:

import numpy as np
import math

angle = math.pi/6 # rotation angle
M = np.array([
    [math.cos(angle),-math.sin(angle),0],
    [math.sin(angle),math.cos(angle),0]
]).astype(np.float32)

And here is the output of image rotation:
Image rotation
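The matrix above rotates about the origin, which for an image is the top-left corner, so much of the result can leave the frame. A common remedy is to compose three steps: move the image center to the origin, rotate, and move back. The sketch below builds that composition with 3 × 3 matrices in NumPy; the helper names `translation` and `rotation` and the 512 × 512 size are assumptions. OpenCV's cv2.getRotationMatrix2D builds a center-based rotation matrix directly, though it takes the angle in degrees and its sign convention may differ from the matrix used here.

```python
import math
import numpy as np

def translation(dx, dy):
    """3x3 homogeneous matrix for a shift by (dx, dy)."""
    return np.array([[1.0, 0.0, dx],
                     [0.0, 1.0, dy],
                     [0.0, 0.0, 1.0]])

def rotation(angle):
    """3x3 homogeneous matrix for a rotation about the origin (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

w, h = 512, 512                # assumed image size
cx, cy = w / 2, h / 2          # rotation center
angle = math.pi / 6

# Rightmost matrix is applied first: shift center to origin, rotate, shift back
M3 = translation(cx, cy) @ rotation(angle) @ translation(-cx, -cy)
M = M3[:2, :].astype(np.float32)   # 2x3 matrix in the form cv2.warpAffine takes

# The center should map to itself
center = M3 @ np.array([cx, cy, 1.0])
print(center[:2])
```

The same pattern of multiplying 3 × 3 matrices works for chaining any of the transformations covered in this post; the order of the factors matters.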


References

[1] Educative. What is affine transformation? Educative. Retrieved from https://www.educative.io/answers/what-is-affine-transformation
[2] MathWorks. Affine transformation. MathWorks. Retrieved from https://www.mathworks.com/discovery/affine-transformation.html
[3] Hughes, R. Affine transformation. University of Edinburgh. Retrieved from https://homepages.inf.ed.ac.uk/rbf/HIPR2/affine.htm
[4] Wikipedia. Affine transformation. Wikipedia. Retrieved from https://en.wikipedia.org/wiki/Affine_transformation


Thanks for reading! Stay tuned for more content, and feel free to share your thoughts and feedback! Your reactions help me improve and create even more useful posts 🙂

Alternate URL: https://github.com/tuttelikz/notes/tree/main/affine
