Part 2.2: Projection Matrix

What is the Projection Matrix?

While the View Matrix transforms world coordinates into the camera’s local space, the Projection Matrix transforms that 3D camera space into clip space, the homogeneous space the GPU clips against before the scene is flattened to 2D.
This is the step where perspective is introduced: far objects look smaller, and near objects look larger.

view space -> clip space

The projection matrix defines how the 3D scene (the game) is projected onto a 2D surface (your screen).
It determines:

What part of the world is visible through the camera (field of view).
How deep the camera can see (near/far clipping planes).
How the perspective distortion should look (objects farther away appear smaller, etc.).
The shape of the 3D view frustum.

There are two main types of projection matrices:

Perspective Projection
Orthographic Projection

Here we will only be looking at the Perspective Projection Matrix.

What the Projection Matrix Does:

The projection matrix transforms view space into clip space. The graphics pipeline then performs the perspective divide (dividing by w) to convert clip space into Normalized Device Coordinates (NDC).
In NDC (for DirectX):

The x and y coordinates are squeezed into the range −1 to +1, and the z coordinate is squeezed into the range 0 to +1.
Everything inside this volume is visible.
Everything outside is clipped (not rendered).
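
As a rough illustration of the perspective divide and the visibility test described above, here is a minimal Python sketch. The function names and the sample points are hypothetical and chosen only for this example. Real GPUs clip in clip space before the divide; this sketch checks the result after the divide purely for simplicity.

```python
def perspective_divide(clip):
    """Convert a clip-space point (x, y, z, w) to NDC by dividing by w."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def is_visible(clip):
    """True if the point ends up inside the DirectX-style NDC volume."""
    x, y, z, w = clip
    if w <= 0.0:
        return False  # on or behind the camera plane, never visible
    nx, ny, nz = perspective_divide(clip)
    return -1.0 <= nx <= 1.0 and -1.0 <= ny <= 1.0 and 0.0 <= nz <= 1.0


# Hypothetical clip-space points, not taken from any real game.
inside = (1.0, -2.0, 3.0, 4.0)   # NDC is (0.25, -0.5, 0.75)
outside = (6.0, 0.0, 3.0, 4.0)   # NDC x is 1.5, outside the -1..+1 range
```

Calling `is_visible(inside)` returns `True`, while `is_visible(outside)` returns `False` because x falls outside the −1 to +1 range.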

The Projection Matrix represents how the camera sees. It gives the world its sense of depth and perspective, and its parameters (FOV, aspect ratio, near and far planes) define the visible 3D frustum.

Layout of the Perspective Projection Matrix

Here is the standard layout of a perspective projection matrix (DirectX):

\[P = \begin{bmatrix} x_{scale} & 0 & 0 & 0 \\ 0 & y_{scale} & 0 & 0 \\ 0 & 0 & \dfrac{z_{far}}{z_{far} - z_{near}} & 1 \\ 0 & 0 & -\dfrac{z_{near} \cdot z_{far}}{z_{far} - z_{near}} & 0 \end{bmatrix}\]

Let's break down all the parameters:

zNear: Near clip plane
zFar: Far clip plane
xScale:
\(x_{scale} = \frac{1}{\tan{\left(\frac{Fov_X}{2}\right)}}\)
yScale:
\(y_{scale} = \frac{1}{\tan{\left(\frac{Fov_Y}{2}\right)}}\)

or equivalently, in terms of the horizontal FOV (with AspectRatio = Width / Height),
\(y_{scale} = \frac{AspectRatio}{\tan{\left(\frac{Fov_X}{2}\right)}}\)
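
To make the layout concrete, the following Python sketch builds the matrix from a vertical FOV, an aspect ratio, and the two clip planes, then pushes a point through it. It assumes the row-vector convention implied by the layout above (a point is multiplied as `v * P`) and an aspect ratio defined as width divided by height; the function names and sample values are illustrative only.

```python
import math


def perspective_matrix(fov_y, aspect, z_near, z_far):
    """Standard DirectX-style perspective matrix (near -> 0, far -> 1).

    fov_y is the vertical field of view in radians,
    aspect is width / height (an assumption of this sketch).
    """
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect  # same as 1 / tan(fov_x / 2)
    depth_range = z_far - z_near
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, z_far / depth_range, 1.0],
        [0.0, 0.0, -z_near * z_far / depth_range, 0.0],
    ]


def to_clip(point, m):
    """Row vector (x, y, z, 1) times matrix m, giving clip space (x, y, z, w)."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(v[r] * m[r][c] for r in range(4)) for c in range(4))


# Hypothetical camera settings: 60 degree vertical FOV, 16:9, near 0.1, far 1000.
P = perspective_matrix(math.radians(60.0), 16.0 / 9.0, 0.1, 1000.0)
near_clip = to_clip((0.0, 0.0, 0.1), P)
far_clip = to_clip((0.0, 0.0, 1000.0), P)
```

Dividing the z component by w for `near_clip` gives a depth of about 0, and for `far_clip` about 1, which matches the DirectX depth range. Note that w simply ends up holding the view-space z, which is what makes the later divide shrink distant points.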

For a deep dive and visual derivation of how these parameters (FOV, aspect ratio, near/far planes) determine each element of the Perspective Projection Matrix, check out this video by pikuma.
Perspective Projection Matrix (Math for Game Developers)

Relationship Between Horizontal and Vertical FOV

Let A denote the Aspect Ratio, defined as

\[A = \text{AspectRatio} = \frac{\text{Width}}{\text{Height}}\]

The relationship between \(FOV_x\) and \(FOV_y\) is:

\[\tan\left(\frac{FOV_X}{2}\right) = A \cdot \tan\left(\frac{FOV_Y}{2}\right)\]

From this, we can express each one in terms of the other:

\[FOV_X = 2 \cdot \tan^{-1}\!\left(A \cdot \tan\!\left(\frac{FOV_Y}{2}\right)\right)\] \[FOV_Y = 2 \cdot \tan^{-1}\!\left(\frac{\tan\!\left(\frac{FOV_X}{2}\right)}{A}\right)\]
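
The two conversion formulas translate directly into code. This is a small Python sketch with hypothetical function names; angles are in radians and the aspect ratio is width divided by height, as defined above.

```python
import math


def fov_x_from_fov_y(fov_y, aspect):
    """Horizontal FOV (radians) from vertical FOV and aspect ratio."""
    return 2.0 * math.atan(aspect * math.tan(fov_y / 2.0))


def fov_y_from_fov_x(fov_x, aspect):
    """Vertical FOV (radians) from horizontal FOV and aspect ratio."""
    return 2.0 * math.atan(math.tan(fov_x / 2.0) / aspect)


# Example values chosen for illustration: 60 degrees vertical on a 16:9 screen.
aspect = 16.0 / 9.0
fov_y = math.radians(60.0)
fov_x = fov_x_from_fov_y(fov_y, aspect)   # roughly 91.5 degrees
round_trip = fov_y_from_fov_x(fov_x, aspect)
```

Converting back with `fov_y_from_fov_x` recovers the original 60 degrees up to floating-point error, which is a quick sanity check that the two formulas are inverses of each other.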

If you want to learn how these formulas are derived, I recommend this write-up by Wojtek Sterna:
Field of View – Horizontal and Vertical Conversion

What is Reverse Z?

Since this game uses the Reverse Z technique for its projection matrix, I will quickly explain what it is:

Reverse Z is a trick used in modern engines to fix depth precision issues. Normally, the projection matrix maps the near plane to 0.0 and the far plane to 1.0, but that crams most of the precision near the camera. The farther away objects are, the less depth precision they get, which leads to Z-fighting: two surfaces flicker against each other because their depth values end up almost identical.

Reverse Z simply flips this: the near plane becomes 1.0 and the far plane becomes 0.0. That way you get far better precision at large distances.

This works because floating-point values are denser near zero. With a floating-point depth buffer, reversing the range places distant depths near 0.0, so precision goes where it is actually needed: objects farther away, not just those right in front of the camera.
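
The precision argument can be checked numerically. The Python sketch below assumes a 32-bit floating-point depth buffer (the depth format is an assumption here, not something stated for this game) and uses made-up clip planes. It rounds the depth of 100 distant points, spaced one unit apart, to 32-bit floats and counts how many distinct values survive under each mapping.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


def standard_depth(z, z_near, z_far):
    """Depth after the perspective divide with near -> 0 and far -> 1."""
    return z_far * (z - z_near) / ((z_far - z_near) * z)


def reversed_depth(z, z_near, z_far):
    """Depth after the perspective divide with near -> 1 and far -> 0."""
    return z_near * (z_far - z) / ((z_far - z_near) * z)


def count_distinct_depths(depth_fn, distances, z_near, z_far):
    """Number of different 32-bit depth values produced for the distances."""
    return len({to_float32(depth_fn(z, z_near, z_far)) for z in distances})


# Hypothetical setup: near 0.1, far 10000, points at 5000, 5001, ..., 5099.
Z_NEAR, Z_FAR = 0.1, 10000.0
distances = [5000.0 + i for i in range(100)]
standard_count = count_distinct_depths(standard_depth, distances, Z_NEAR, Z_FAR)
reversed_count = count_distinct_depths(reversed_depth, distances, Z_NEAR, Z_FAR)
```

With the standard mapping, the 100 points collapse into only a handful of distinct depth values because they all sit just below 1.0, where 32-bit floats are coarse. With the reversed mapping, every point keeps its own depth value, so `reversed_count` equals 100.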

Layout of Reverse Z Projection Matrix

\[P = \begin{bmatrix} x_{scale} & 0 & 0 & 0 \\ 0 & y_{scale} & 0 & 0 \\ 0 & 0 & -\dfrac{z_{near}}{z_{far} - z_{near}} & 1 \\ 0 & 0 & \dfrac{z_{far} \cdot z_{near}}{z_{far} - z_{near}} & 0 \end{bmatrix}\]

Swapped far for negative near in the numerator at m22
Flipped the sign of the numerator at m32

This is the specific layout of the projection matrix used in Ghost of Tsushima; it is not the only possible layout for Reverse-Z.
What defines a Reverse-Z projection is simply the depth mapping after the perspective divide:

near → 1

far → 0

As long as this mapping holds true, it qualifies as Reverse-Z.
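
That definition suggests a simple test for any projection matrix you come across: push the near and far distances through it and look at the resulting depth. The Python sketch below does this for the layout shown above. It assumes the row-vector convention of that layout; the helper names, tolerance, and sample values are hypothetical.

```python
def reverse_z_matrix(x_scale, y_scale, z_near, z_far):
    """Reverse-Z perspective matrix in the layout shown above."""
    depth_range = z_far - z_near
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, -z_near / depth_range, 1.0],
        [0.0, 0.0, z_far * z_near / depth_range, 0.0],
    ]


def ndc_depth(view_z, m):
    """NDC depth of a point on the view axis at distance view_z."""
    clip_z = view_z * m[2][2] + m[3][2]
    clip_w = view_z * m[2][3] + m[3][3]
    return clip_z / clip_w  # the perspective divide


def is_reverse_z(m, z_near, z_far, tol=1e-6):
    """True if the matrix maps near to 1 and far to 0 after the divide."""
    near_ok = abs(ndc_depth(z_near, m) - 1.0) < tol
    far_ok = abs(ndc_depth(z_far, m)) < tol
    return near_ok and far_ok


# Illustrative values only: unit scales, near 0.1, far 1000.
R = reverse_z_matrix(1.0, 1.0, 0.1, 1000.0)
result = is_reverse_z(R, 0.1, 1000.0)
```

Here `result` is `True`. Feeding the standard layout (near → 0, far → 1) into `is_reverse_z` returns `False`, so the same check can tell the two conventions apart when the clip planes are known.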

We’ll later see how this matrix fits into the full pipeline and how the 3D position finally lands on your screen.