Part 6: World To Screen Explanation And Code

What “World to Screen” Means

The name says it all: world → screen. A World to Screen (W2S) function takes a 3D position in the game’s world space, represented as a Vector3 (x, y, z), and transforms it into a 2D screen-space coordinate: an (X, Y) pixel position on your monitor.

This transformation lets you take an object’s position in the 3D environment and work out exactly where it should be drawn on your screen. In practice, this is the core of how an ESP works: without W2S, you can know an enemy’s position in the game world, but you won’t know where to render the box, snapline, or marker on your screen.

World to Screen Code Explanation:

bool WorldToScreen(const Vec3& worldPos, Vec2& screenPos, const Matrix4x4& viewProj, int screenWidth, int screenHeight)
{
    // Multiply the world position (x, y, z, 1) by the view-projection matrix
    // (row-major, indexed as m[row][column])
    float x = worldPos.x * viewProj.m[0][0] + worldPos.y * viewProj.m[1][0] + worldPos.z * viewProj.m[2][0] + viewProj.m[3][0];
    float y = worldPos.x * viewProj.m[0][1] + worldPos.y * viewProj.m[1][1] + worldPos.z * viewProj.m[2][1] + viewProj.m[3][1];
    float z = worldPos.x * viewProj.m[0][2] + worldPos.y * viewProj.m[1][2] + worldPos.z * viewProj.m[2][2] + viewProj.m[3][2]; // depth, not needed for the 2D result
    float w = worldPos.x * viewProj.m[0][3] + worldPos.y * viewProj.m[1][3] + worldPos.z * viewProj.m[2][3] + viewProj.m[3][3];

    // Clip check
    if (w < 0.0001f)
        return false;

    // Perspective divide: clip space to normalized device coordinates
    float invW = 1.0f / w;
    x *= invW;
    y *= invW;

    // Convert to screen space
    screenPos.x = (x * 0.5f + 0.5f) * screenWidth;
    screenPos.y = (1.0f - (y * 0.5f + 0.5f)) * screenHeight; // Invert Y for top-left origin

    return true;
}
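The function uses three small types that the snippet itself does not define. As a minimal sketch, they could look like the following. The names Vec2, Vec3 and Matrix4x4 and the members x, y, z and m come from the code above; the exact layouts are assumptions, and the game or engine you are working with will have its own.

```cpp
// Hypothetical minimal definitions for the types WorldToScreen relies on.
// Only the member names are taken from the function; the layouts are assumed.
struct Vec2 {
    float x;
    float y;
};

struct Vec3 {
    float x;
    float y;
    float z;
};

struct Matrix4x4 {
    float m[4][4];  // indexed as m[row][column]
};
```

If the matrix you read from the game is stored the other way around (column-major), the indices in WorldToScreen have to be swapped, so it is worth checking this before anything else.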

We’ll go through the WorldToScreen function line by line to see how a 3D world position gets converted into a 2D screen coordinate.

World Space → Clip Space

float x = worldPos.x * viewProj.m[0][0] + worldPos.y * viewProj.m[1][0] + worldPos.z * viewProj.m[2][0] + viewProj.m[3][0];
float y = worldPos.x * viewProj.m[0][1] + worldPos.y * viewProj.m[1][1] + worldPos.z * viewProj.m[2][1] + viewProj.m[3][1];
float z = worldPos.x * viewProj.m[0][2] + worldPos.y * viewProj.m[1][2] + worldPos.z * viewProj.m[2][2] + viewProj.m[3][2];
float w = worldPos.x * viewProj.m[0][3] + worldPos.y * viewProj.m[1][3] + worldPos.z * viewProj.m[2][3] + viewProj.m[3][3];

This is a 4D vector-matrix multiplication. It transforms the Vec3 worldPos into a 4D (x, y, z, w) coordinate.
Why do we do this? This single calculation transforms a point from World space to Clip space by doing two things at once: moving and rotating the entire game world so the camera sits at the origin (the View transformation), and applying the perspective squish that makes distant objects appear smaller (the Projection transformation).
We assume 1.0 for the 4th component of the position, which is why the m[3][...] terms are simply added instead of being multiplied by a coordinate.
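The four expanded lines are one row-vector-times-matrix product written out by hand. As a sketch, the same computation can be expressed as a loop. Vec4 and TransformToClip are hypothetical names introduced here for illustration, and the matrix is assumed to be indexed as m[row][column], like in the function above.

```cpp
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };      // hypothetical helper type
struct Matrix4x4 { float m[4][4]; };    // indexed as m[row][column]

// Treats worldPos as the row vector (x, y, z, 1) and computes v * M.
// Output component `col` is the dot product of v with column `col` of M.
Vec4 TransformToClip(const Vec3& worldPos, const Matrix4x4& viewProj) {
    const float v[4] = { worldPos.x, worldPos.y, worldPos.z, 1.0f };
    float out[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            out[col] += v[row] * viewProj.m[row][col];
        }
    }
    return Vec4{ out[0], out[1], out[2], out[3] };
}
```

With an identity matrix the point comes back unchanged with w = 1. Putting 10 into m[3][0] shifts x by 10, which shows that in this convention the bottom row of the matrix carries the translation.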

Clip Check

if (w < 0.0001f) return false;

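This is a safety check that discards any point behind the camera, or so close to it that it cannot be projected reliably.
With a standard perspective projection, the w value from the previous step tells us the point’s depth relative to the camera.
- If w>0, the point is in front of the camera.
- If w<0, the point is behind the camera.
- If w=0, the point lies on the plane that passes through the camera itself, where the upcoming division by w is undefined.
The check uses a small threshold (0.0001) instead of 0 so that tiny positive values of w, which would produce enormous coordinates, are rejected too. Note that returning false only filters out points behind the camera: a point in front of the camera but off to the side still returns true, just with coordinates outside the screen bounds.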
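To see the sign of w in action, here is a small sketch that computes only the w component and applies the same threshold. ClipW, IsInFrontOfCamera and MakeToyViewProj are hypothetical helpers, and the toy matrix is an assumption made for illustration: a camera at the origin looking down +z with m[2][3] = 1, so that w equals the point’s z, as in a Direct3D-style left-handed perspective projection. Other engines may use a different sign or axis.

```cpp
struct Vec3 { float x, y, z; };
struct Matrix4x4 { float m[4][4]; };  // indexed as m[row][column]

// Computes only the w component of (x, y, z, 1) * viewProj.
float ClipW(const Vec3& p, const Matrix4x4& viewProj) {
    return p.x * viewProj.m[0][3] + p.y * viewProj.m[1][3]
         + p.z * viewProj.m[2][3] + viewProj.m[3][3];
}

// Same threshold as the clip check above: rejects points behind the camera
// and points so close to the camera plane that 1/w would blow up.
bool IsInFrontOfCamera(const Vec3& p, const Matrix4x4& viewProj) {
    return ClipW(p, viewProj) >= 0.0001f;
}

// Toy matrix (assumed, not read from any game): camera at the origin,
// looking down +z, with w = z. Only the fourth column matters for this demo.
Matrix4x4 MakeToyViewProj() {
    Matrix4x4 vp = {};
    vp.m[0][0] = 1.0f;
    vp.m[1][1] = 1.0f;
    vp.m[2][2] = 1.0f;
    vp.m[2][3] = 1.0f;
    return vp;
}
```

With this matrix, a point at z = 5 gives w = 5 and passes, a point at z = -5 gives w = -5 and is rejected, and a point at z = 0 gives w = 0 and is rejected as well.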

Clip Space → NDC

float invW = 1.0f / w;
x *= invW;
y *= invW;

This is the perspective divide, and it is what makes distant objects appear smaller. By dividing the x and y coordinates by w (depth), we “squish” their screen positions toward the center.
A point far away has a large w, so dividing x by a large number makes the resulting x small. Conversely, a point close to the camera has a small w, so dividing by a small number keeps x large, which is why nearby objects spread out over more of the screen.
Why do we multiply instead of divide? This is an optimization. Floating-point division is slower than multiplication on most CPUs, so instead of computing x / w and y / w (two divisions), we do one division to get invW (the inverse of w) and then two fast multiplications.
The resulting x and y are now in Normalized Device Coordinates (NDC). This is a standardized, abstract space where the visible screen is a 2D box from -1 to +1 on each axis, with (0, 0) at the center.
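A quick worked example makes the effect concrete. The sketch below isolates the divide in a hypothetical PerspectiveDivide helper (not part of the original function) so it can be tried with hand-picked numbers.

```cpp
struct Vec2 { float x, y; };

// Divides clip-space x and y by w using one division and two multiplies,
// the same way the snippet does. The caller must make sure w is not near
// zero, which is what the clip check is for.
Vec2 PerspectiveDivide(float clipX, float clipY, float w) {
    const float invW = 1.0f / w;
    return Vec2{ clipX * invW, clipY * invW };
}
```

A clip-space point (2, 1) at w = 4 lands at NDC (0.5, 0.25). The same (2, 1) at w = 8 lands at (0.25, 0.125): twice as far away, half the distance from the screen center.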

NDC → Screen Space

screenPos.x = (x * 0.5f + 0.5f) * screenWidth;
screenPos.y = (1.0f - (y * 0.5f + 0.5f)) * screenHeight;

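These lines convert NDC coordinates to actual screen pixels.
Explanation of x:
- x * 0.5f + 0.5f maps [-1,1] to [0,1].
- Multiplying by screenWidth scales that to the width, in pixels, of the area the game renders to (your monitor in fullscreen, or the game window otherwise).
Explanation of y:
- y * 0.5f + 0.5f also maps [-1,1] → [0,1].
- 1.0f - (…) inverts the Y-axis, because NDC has +Y pointing up while screen coordinates usually have (0,0) at the top-left corner with Y growing downward.
- Multiplying by screenHeight scales the result to the height in pixels.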
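The mapping is easy to check by hand with a few known points. Here is a sketch of the same two lines wrapped in a hypothetical NdcToScreen helper, assuming a 1920x1080 render area for the examples.

```cpp
struct Vec2 { float x, y; };

// Maps NDC in [-1, 1] to pixel coordinates with (0, 0) at the top-left.
Vec2 NdcToScreen(float ndcX, float ndcY, int screenWidth, int screenHeight) {
    Vec2 out;
    out.x = (ndcX * 0.5f + 0.5f) * screenWidth;            // [-1, 1] -> [0, width]
    out.y = (1.0f - (ndcY * 0.5f + 0.5f)) * screenHeight;  // flip Y, then scale
    return out;
}
```

At 1920x1080, NDC (0, 0) maps to the center of the screen at (960, 540), NDC (-1, 1) maps to the top-left corner at (0, 0), and NDC (1, -1) maps to the bottom-right corner at (1920, 1080).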

return true;

Finally, if the point passed the clip check, the function returns true and screenPos now contains the 2D coordinates where the world point should be drawn.
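Putting it together, here is a sketch of how a caller might use the function. The type definitions, the toy matrix and the IsOnScreen helper are assumptions added for illustration; in a real ESP the matrix would be read from the game. The unused z line is left out to keep the sketch short.

```cpp
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };
struct Matrix4x4 { float m[4][4]; };  // indexed as m[row][column]

bool WorldToScreen(const Vec3& worldPos, Vec2& screenPos,
                   const Matrix4x4& viewProj, int screenWidth, int screenHeight) {
    float x = worldPos.x * viewProj.m[0][0] + worldPos.y * viewProj.m[1][0]
            + worldPos.z * viewProj.m[2][0] + viewProj.m[3][0];
    float y = worldPos.x * viewProj.m[0][1] + worldPos.y * viewProj.m[1][1]
            + worldPos.z * viewProj.m[2][1] + viewProj.m[3][1];
    float w = worldPos.x * viewProj.m[0][3] + worldPos.y * viewProj.m[1][3]
            + worldPos.z * viewProj.m[2][3] + viewProj.m[3][3];

    if (w < 0.0001f) return false;  // behind (or too close to) the camera

    float invW = 1.0f / w;
    x *= invW;
    y *= invW;

    screenPos.x = (x * 0.5f + 0.5f) * screenWidth;
    screenPos.y = (1.0f - (y * 0.5f + 0.5f)) * screenHeight;
    return true;
}

// Hypothetical helper: true only means "in front of the camera", so a
// caller may still want to skip points that fall outside the window.
bool IsOnScreen(const Vec2& p, int screenWidth, int screenHeight) {
    return p.x >= 0.0f && p.x <= screenWidth &&
           p.y >= 0.0f && p.y <= screenHeight;
}

// Toy matrix (assumed): clip x = world x, clip y = world y, w = world z.
Matrix4x4 MakeToyViewProj() {
    Matrix4x4 vp = {};
    vp.m[0][0] = 1.0f;
    vp.m[1][1] = 1.0f;
    vp.m[2][3] = 1.0f;
    return vp;
}
```

With this toy matrix at 1920x1080, the world point (4, 2, 8) projects to (1440, 405). The point (0, 0, -8) is rejected by the clip check. The point (16, 0, 8) returns true but lands at x = 2880, outside the window, which is the case IsOnScreen is meant to catch.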

Summary of the Transformation Pipeline

World Space → Clip Space: Multiply by the ViewProjection matrix.
Clip Space → NDC: Divide by w (perspective divide).
NDC → Screen Space: Scale to your screen’s width and height, flipping Y-axis as needed.

This step-by-step breakdown shows that World to Screen is really just a short series of linear algebra operations: one matrix multiplication, one divide, and one rescale.

TL;DR of W2S:

Take your entity’s position as a Vec4(x, y, z, 1), multiply it by the ViewProjection matrix, perform a clip check to discard points behind the camera, then divide by w to get Normalized Device Coordinates (NDC) and finally convert those to screen space. That’s essentially what World-to-Screen (W2S) does.