Wednesday, October 14, 2015

Real-time Raytracing part 1

In the next few posts I will talk about real-time raytracing. The 'real-time' part has different interpretations. Some define it as being 'interactive', which is another vague term, since an application running at 1 FPS can still be interactive. Here we will define the real-time constraint as a budget of roughly 16 ms per frame (which comes down to about 60 frames per second). This is a common constraint in games, which have to stay responsive.

Now for the raytracing part. The basic concept of raytracing is tracing the paths of light for every pixel on an image plane. This way we can simulate visual realism by mimicking the real process. You can create pretty images without much code:



Unfortunately raytracing also has downsides. Tracing light paths requires access to the entire scene, so it cannot work the way a rasterizer does (drawing objects one by one). This makes the whole rendering process more difficult, and the standard route of rendering objects one at a time no longer applies. Raytracing also has a very high computational cost. A basic algorithm for checking which triangle is hit already shows the problem:
for (int x = 0; x < screenwidth; x++)
    for (int y = 0; y < screenheight; y++)
        for (int t = 0; t < nTriangles; t++)
            RayTriangle(x, y, t);

The whole algorithm scales with the screen size and the number of triangles. Running this algorithm on the dragon model shown above is basically asking for your computer to explode. The model has exactly 100,000 triangles, and checking that many triangles for every pixel at even a modest resolution (1024x1024) amounts to roughly 105 billion ray-triangle checks. A single ray-triangle intersection already takes quite a few instructions; needless to say, this algorithm isn't going to run in real time...
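For a sense of scale, written out as a back-of-the-envelope count per frame:

$$1024 \times 1024 \times 100{,}000 \approx 1.05 \times 10^{11}\ \text{ray-triangle tests}$$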

So, how do we make it real-time? Several decades of research show that there are plenty of ways to speed up raytracing. In the upcoming posts I will cover some of the techniques I used to speed up my path tracer. Most of the posts will be about parallel programming on the GPU using CUDA. I will try my best to keep everything as readable as possible; any unexplained terminology can be looked up in the excellent ray tracing tutorials on scratchapixel.

If you're really interested in ray tracing and haven't read it already: Ray Tracey's blog has a huge collection of ray tracing posts.

Part 2.

Friday, April 3, 2015

Compute Shader Framework

In the last few posts about ray tracing I briefly mentioned compute shaders. If you don't know what they are, here is a short summary:

Introduction


Compute shaders are not part of the ordinary graphics pipeline; they can be used separately from any other stage and are meant specifically for general computation on the GPU. Compute shaders are written in the same language as the other pipeline stages, such as the pixel shader; in this case that's HLSL. They take advantage of the huge speedup the GPU can offer over the CPU by exploiting its parallel computation power.

When I first started out with compute shaders, I saw them as a black box and didn't really understand how to get started with one. After I found out how useful they can be for large computations, I decided to implement one for my ray tracing project (with success). After that, I decided to make a simple framework that gives everyone access to compute shaders in a more user-friendly way. It is only intended for DirectX compute shaders in C#.

Framework

So without further ado, here is the framework: ComputeShader. You can also view it on GitHub.
On the first run, the framework will download some NuGet packages from SharpDX. If you already have SharpDX installed, you can simply reference those assemblies to skip this step.

With this framework it's possible to bind any structure to the GPU. You can do numerous things with these structures in your shader and then output whatever data you want to know. You can read this data back in your code and use it later on! A simple example would be updating a particle system: you dump all positions and velocities to the GPU, and then calculate the next positions in your shader.

Usage:
In your project, either reference the ComputeShaderAddon.dll or add the project to your solution and reference the project.
You can now calculate anything on the GPU using the following four lines of code (don't forget to include ComputeShaderAddon):
ComputeShaderHelper CSHelper = new ComputeShaderHelper(Device, "effect.fx");
int index = CSHelper.SetData<ExampleStruct>(data);
CSHelper.Execute(50);
CSHelper.GetData<ExampleStruct>(index);

This is the code from the example in the framework. What it does, line by line:
- Initialize the helper; this compiles the shader (if necessary) and sets it up.
- Set your data, from any struct you like, in the GPU buffers. The returned index is stored so the data can be retrieved later on.
- Execute the compute shader; the number is the amount of cores (threads) used on the GPU. The maximum is 1024, but that will use all of the GPU's computation power at once!
- Retrieve the data from the GPU, using the index from above.
Next, create your compute shader. Set the amount of cores you want to use in the brackets above the main function, like this: [numthreads(cores, 1, 1)]

Likewise, save the length of the array of structs somewhere in the compute shader if you want to use it like I did in the framework.
Done! Run your project!
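For reference, here is a minimal sketch of what the calling side can look like, using a struct with a single Vector3 field like the one in the example project. The struct layout and the assumption that GetData returns the updated array are mine; the framework's actual sample may differ slightly.

// Requires: using SharpDX; using System.Runtime.InteropServices;
// Hypothetical struct mirroring the example; sequential layout so it maps
// cleanly onto the GPU buffer.
[StructLayout(LayoutKind.Sequential)]
struct ExampleStruct
{
    public Vector3 Data;
}

// Fill some input data, upload it, run the shader and read the results back.
ExampleStruct[] data = new ExampleStruct[10000];

ComputeShaderHelper CSHelper = new ComputeShaderHelper(Device, "effect.fx");
int index = CSHelper.SetData<ExampleStruct>(data);  // upload to a GPU buffer
CSHelper.Execute(50);                               // run with 50 'cores' (threads)
data = CSHelper.GetData<ExampleStruct>(index);      // assumes the updated array is returned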

Results
The example is a small program I wrote to test the computation power of the GPU in comparison with the CPU. The operation to perform is simple: for every struct, sum the numbers from zero up to the length of the array and store the result in the struct. Below you'll find the CPU and GPU versions of this in code:
// CPU
for (int i = 0; i < amount; i++)
{
    int result = 0;
    for (int j = 0; j < amount; j++)
        result += j;
    data[i].Data = new Vector3(result, result, result);
}

// GPU -- ComputeShaderExample.fx in framework
[numthreads(nThreads, 1, 1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    int range = nStructs / nThreads;
    for (uint i = id.x * range; i < id.x * range + range; i++)
    {
        int result = 0;
        for (uint j = 0; j < nStructs; j++)
        {
            result += j;
        }
        data[i].Data = float3(result, result, result);
    }
}

The framework ran with 50 cores on the GPU, and the results are as follows:

Last data: X:4,9995E+07 Y:4,9995E+07 Z:4,9995E+07
It took the GPU: 102 milliseconds
Last data: X:4,9995E+07 Y:4,9995E+07 Z:4,9995E+07
It took the CPU: 283 milliseconds

With this small calculation the GPU, using 50 cores, is about 3 times faster than the CPU, using only one core.

Some scalings:
Structs   Cores   GPU time (ms)   CPU time (ms)
10k       50      102             283
20k       50      380             1131
30k       50      751             2463
10k       10      452             283
10k       100     119             283
10k       1000    480             283
You can see that the amount of cores is something to fiddle with, since the resulting times differ greatly. This is because the overhead of launching all the threads can cost more than the actual computation itself, so be careful with this!

Fun fact: the outcome is easily calculated as n(n + 1) / 2, the closed-form expression for the sum of the first n integers. In this case n = 9999 (because we start counting at 0).
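Written out for the run above (n = 9999), this matches the output printed earlier:

$$\sum_{j=0}^{9999} j = \frac{9999 \cdot 10000}{2} = 49{,}995{,}000 \approx 4.9995 \times 10^{7}$$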

Future work

This framework currently only supports Unordered Access View (UAV) bindings. With DirectX 10 you can bind only one array of structures to the compute shader, and that's it; in DirectX 11 this is increased to eight, which is what this example project supports.

Currently you still have to set the length of the array and the number of threads manually in the compute shader. I don't know whether it's possible to change these dynamically from code, but if I ever find a way, I will update the framework for sure.

Wednesday, January 21, 2015

Raytracing Part 3

The final post in this series about my ray tracer. I spent last weekend creating some black magic called a compute shader, which allows you to use the GPU to aid in calculations. In my case this came in quite handy with all these ray calculations. The GPU is optimal for calculations that can be executed completely in parallel, and fortunately raytracing falls into that category.

This is the first result worth looking at:



The first thing to notice is that it's quite grainy. This is because there are not that many samples per pixel, only about five. However, the (still local) Cook-Torrance shading is already working, as you can see on the metal-ish sphere in the back.



The main thing to notice here is the soft shadows. A small change, but 'realism' awaits. There are still some incorrect results in this image: you can see that the diffuse reflection of the green box is biased. The local shading model pulls the reflected rays towards the two light sources, so the result is not entirely correct.



The box in the middle suddenly became radioactive and now emits light. The image is still biased: even the emitting box only emits towards the light sources.



Experimenting with rougher and glossier surfaces.



The first results of refracting rays.



A perfectly refracting sphere, showing that the bias is indeed gone here. The grain is also gone, since I actually took the time to sit back and wait for the image to converge (it took about two minutes to reach this state).



The same scene with different materials and a more strongly emitting 'roof'.

Wednesday, January 14, 2015

Raytracing Part 2

In this post, some more images of the continuing raytracing project. First of all, I found out that the bounding spheres were way too big and were a big performance hit on my ray tracer. Instead of spheres, I now calculate bounding boxes around all the triangles in the model and test against these instead. Still, this improvement alone did not speed things up enough. Looking at my quickly made octree, I found some fatal flaws that rendered it quite worthless: all of the triangles were stored in the topmost bounding box instead of in the leaves, where they should be.
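As an aside, computing such an axis-aligned bounding box boils down to taking the componentwise minimum and maximum over all triangle vertices. A minimal sketch, assuming SharpDX's Vector3 and a hypothetical Triangle type with three vertices (not the project's actual class):

// Compute an axis-aligned bounding box (AABB) over a set of triangles.
struct Triangle
{
    public Vector3 V0, V1, V2; // hypothetical triangle layout for this sketch
}

static void ComputeBounds(Triangle[] triangles, out Vector3 min, out Vector3 max)
{
    min = new Vector3(float.MaxValue);
    max = new Vector3(float.MinValue);
    foreach (Triangle t in triangles)
    {
        min = Vector3.Min(min, Vector3.Min(t.V0, Vector3.Min(t.V1, t.V2)));
        max = Vector3.Max(max, Vector3.Max(t.V0, Vector3.Max(t.V1, t.V2)));
    }
}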

This image shows the octree of the Stanford dragon and 2 teapots:



After I got all this working I was quite amazed by the quick rendering times. It took about half the time it did before, making it possible to render even this huge dragon:





The next image shows some reflections in the windows and the body of the car:



Of course, I could never stop here. So why not add some refraction as well:



By combining the result of reflections and refraction:



Now, these images are already quite fancy. However, there is still a lot of work to do. The following image shows a sneak peek of the next and final part of this raytracing series. By tracing multiple rays within a pixel area we gather more information about its color. Combining this with a Gaussian filter over multiple pixels, I created a nice anti-aliasing function:



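As a rough sketch of the supersampling half of that idea (the Gaussian weighting over neighbouring pixels is left out, and TracePixelSample is a hypothetical stand-in for the real per-sample trace call):

// Trace several jittered rays within a pixel and weight the samples with a
// small Gaussian centered on the pixel (sigma = 0.5 here, chosen arbitrarily).
Vector3 AntiAliasedPixel(int x, int y, int samples, Random rng)
{
    Vector3 sum = Vector3.Zero;
    float weightSum = 0.0f;
    for (int s = 0; s < samples; s++)
    {
        // Jitter the sample position within the pixel area.
        float dx = (float)rng.NextDouble() - 0.5f;
        float dy = (float)rng.NextDouble() - 0.5f;

        float weight = (float)Math.Exp(-(dx * dx + dy * dy) / (2.0f * 0.5f * 0.5f));

        sum += TracePixelSample(x + dx, y + dy) * weight;
        weightSum += weight;
    }
    return sum / weightSum;
}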
In the next part I will show results of a distributed raytracer slowly converging towards rendering an image with global illumination effects.

Tuesday, January 6, 2015

Raytracing Part 1

This post shows the progress of an assignment I'm currently working on for a university course: creating a raytracer from scratch. I chose to implement it using SharpDX. I started by creating a standard renderer, which draws triangles on the GPU using DirectX. Once I could load simple objects and draw them on the screen, I created a basic raytracer.

First I had to convert the models loaded through SharpDX into triangles I could actually trace against. After some work I finally got some results and started taking screenshots. Here is the progress so far in pictures:

The model drawn with the basic effect



Silhouette by raytracing the boundary. The gray area is the bounding sphere (which is a little off scale).



Material loading, getting the 'correct' colors for the car. The light color interfered here; that's why the colors are still a little off.



I forgot to order the triangles by depth! What a mess..



After ordering the triangles



Blinn-Phong shading.



More lights! Shadows!



Soft shading, interpolating the three vertex normals for each point on a triangle.



Our car is back! However, some normals are still incorrect; you can see the headlights are missing.



Soft shadows! Sending 20 rays from every point to the light source to see whether we are in shadow. The correct normals have been put in.


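A rough sketch of that soft-shadow test, assuming a hypothetical Scene.Occluded visibility query and a spherical light with a given radius (not the project's actual API):

// Cast a number of shadow rays from the hit point towards jittered positions
// around the light and return the fraction that reaches it unoccluded.
float ShadowFactor(Vector3 hitPoint, Vector3 lightPosition, float lightRadius, Random rng)
{
    const int samples = 20;
    int unoccluded = 0;
    for (int i = 0; i < samples; i++)
    {
        // Jitter the target position around the light (uniform in a cube for simplicity).
        Vector3 offset = new Vector3(
            (float)(rng.NextDouble() * 2 - 1),
            (float)(rng.NextDouble() * 2 - 1),
            (float)(rng.NextDouble() * 2 - 1)) * lightRadius;

        if (!Scene.Occluded(hitPoint, lightPosition + offset))
            unoccluded++;
    }
    return unoccluded / (float)samples; // 0 = fully shadowed, 1 = fully lit
}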
Blinn-Phong shading back in the picture. The car looks really metallic now:

There are still some strange 'light artifacts' on the floor and the car window. More updates will be in the next post.

Monday, September 8, 2014

From zero to lighting in 2D

This tutorial is about spicing up your 2D game with some awesome lighting. I made this tutorial because there aren't any out there yet, and it's a very cool effect to add to your 2D game.

What you need to know

First off, I assume you have some familiarity with C#, or coding in general. Secondly, it is advised to have programmed in a shader language before; this tutorial uses HLSL. The rest is what this tutorial is for!

The final result

You can download the code example here (or you can scroll down to find some explanations). The final result will look something like this:




Normal maps

Normal maps are textures containing colors that encode the normal of each pixel. How does this work? Take the normal map used in the code example alongside the original texture:




The left texture contains the normal colors, and on the right are the generated colors that define the normals. To calculate the actual normal vector, we have to apply a little transformation:
$$normal = 2.0 * pixelcolor - 1.0$$

To explain this better, we take the most common color in the picture, a light blue-ish color. In RGB values it is about $(128, 128, 255)$, which maps to $(0.5, 0.5, 1.0)$ in the range $[0, 1]$. After applying our transformation the value becomes $(0.0, 0.0, 1.0)$, which is a normal pointing in the Z direction.

In our 2D game this value will point from the screen towards the viewer. As you can see in the normal map image, there are several red and green colored portions, which affect the direction of the normal. With this information we can add fake depth to a plain 2D texture!

The drawing setup

To pass this information to our lighting effect we need to have this normal map ready, which means we can't draw everything to the screen immediately. The setup used here comes close to deferred shading. We only need the color and normal buffers, since the depth buffer is of no use in 2D.

All the regular sprites are drawn first, as you're used to, except this time we save them to a render target. After that, I draw all of the normals to another render target. Unlike standard deferred shading, I chose to render the lights to a separate render target here. If you wish, you can combine drawing the textures and drawing the lights into a single pass; I only kept them separate to show the different rendering steps.

The actual lighting magic
Because it's a 2D game, you might expect to need to draw a "lighting texture", something like this:




But we don't need to! Since we have a normal, we can simply apply the technique used to draw a light in 3D, which is really fancy and easy to create. The effect I used for this tutorial is called diffuse reflection, or Lambertian reflectance. We set up a point light (that is, a point from which light emanates) and calculate the pixel color on the GPU.

Diffuse reflection requires three things: the position of the light, the position of the current pixel being shaded, and the normal at that position. From the first two you can calculate the light direction, and from the dot product of the light direction and the normal you can determine the lighting coefficient.
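Written as a formula, this is the standard Lambertian term, combined with the distance attenuation that the shader below also applies:

$$\vec{L} = \frac{P_{light} - P_{pixel}}{\lVert P_{light} - P_{pixel} \rVert}, \qquad \text{diffuse} = \max(0,\ \vec{N} \cdot \vec{L}) \cdot \text{lightColor} \cdot \text{intensity} \cdot \text{attenuation}$$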

Sometimes you will want to rotate the normal retrieved from the normal map. This is done by creating a separate rotation matrix and adding it to the shader. More information about creating such a matrix can be found in my other tutorial series on rotations.

Finding the correct normal for the pixel position is rather easy: we have a full-screen buffer of normals and a position given by the draw call. Dividing this position by the screen size gives the texture coordinates of the normal pixel, ranging in [0,1]. Exactly what we need!

Code example

All of this code can be found in the source code, but I'd like to point out a few things in this article. Here's the code used for lighting in HLSL:
// Basic XNA Vertex shader
float4x4 MatrixTransform;
void SpriteVertexShader(inout float4 color    : COLOR0,
     inout float2 texCoord : TEXCOORD0,
     inout float4 position : SV_Position)
{
     position = mul(position, MatrixTransform);
}

float4 PixelShaderFunction(float2 position : SV_POSITION, 
         float4 color : COLOR0,
         float2 TexCoordsUV : TEXCOORD0) : COLOR0
{
     // Obtain texture coordinates corresponding to the current pixel on screen
     float2 TexCoords = position.xy / screenSize;
     TexCoords += 0.5f / screenSize;

     // Sample the input texture
     float4 normal = 2.0f * tex2D(NormalSampler, TexCoords) - 1.0f;

     // Transform input position to view space
     float3 newPos = float3(position.xy, 0.0f);
     float4 pos = mul(newPos, InverseVP);

     // Calculate the lighting with given normal and position
     float4 lighting = CalculateLight(pos.xyz, normal.xyz);
     return lighting;
}

// Calculates diffuse light with attenuation and normal dot light
float4 CalculateLight(float3 pos, float3 normal)
{
   float3 lightDir = LightPosition - pos;

   float attenuation = saturate(1.0f - length(lightDir) / LightRadius);
   lightDir = normalize(lightDir); 
 
   float NdL = max(0, dot(normal, lightDir));
   float4 diffuseLight = NdL * LightColor * LightIntensity * attenuation;
 
   return float4(diffuseLight.rgb, 1.0f);
}



As you can see, the shader consists of a vertex and a pixel shader. The vertex shader simply passes on the color, texture coordinates and position; it only transforms the position with the given matrix. After the vertex shader we know the rectangle on the screen, and the pixel shader will process all the pixels inside it.

The first thing the pixel shader does is compute the texture coordinates from the position. This is done by dividing the position by the screen size and adding half a pixel width (so we sample at the center of a pixel). With this coordinate we can sample the normal map to get the color of the normal and, as shown earlier in this article, calculate the actual normal from it. Next, the original position is retrieved by multiplying it with the inverse view-projection matrix. We can now calculate the lighting with the given parameters.

The light calculation method is little more than the dot product of the normal and the light direction, which determines how much the surface gets lit. And not to forget the attenuation, which looks at the range of the light and smoothly caps the result by multiplying it with this value.


Optimization

You want to draw a lot of lights, right? Normal deferred shading can't handle a lot of point lights, since you have to redraw the whole screen for every point light. That leads to the first optimization: if we only draw a small square on the screen where we expect the light to shine, we don't have to draw the rest of the screen. This is done quite simply by adding a light radius, from which we can create a rectangle to draw in SpriteBatch.

Since we don't need to draw any textures, but we still have to make a draw call, I found the following optimization: SpriteBatch requires a texture for every draw call, so the best way to use the previous optimization is to draw a single pixel and scale it up to the square. This way the pixel shader can sample the normal map and output the lighting at the positions given by the draw call. In the code I just pass the normal map as the texture for simplicity.
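As a sketch of those two optimizations combined; the light fields, pixelTexture and lightEffect names here are illustrative assumptions, not necessarily the sample's exact names:

// Draw each point light as a small screen-space rectangle instead of a
// full-screen quad. pixelTexture is a 1x1 white texture that gets scaled up,
// lightEffect is the lighting shader shown above.
Rectangle lightRect = new Rectangle(
    (int)(light.ScreenPosition.X - light.Radius),
    (int)(light.ScreenPosition.Y - light.Radius),
    (int)(light.Radius * 2),
    (int)(light.Radius * 2));

spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.Additive,
                  null, null, null, lightEffect);
spriteBatch.Draw(pixelTexture, lightRect, Color.White);
spriteBatch.End();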


I hope you learned something from this tutorial, and sure hope to see some awesome games created with this effect!

Friday, May 9, 2014

Bloomifier

I spent some time working out an idea that randomly popped into my head. I thought: XNA runs in Windows Forms, so is it possible to add some Windows functionality to the XNA window? Turns out it is!

By just adding some simple form code in XNA, you can treat the window as a form, which of course led to some fun ideas to work out. My idea: drag and drop a picture into the XNA window, then load it as a Texture2D. After we have successfully retrieved this texture, we can add some bloom to make it interesting.

Here is the (simple) code to get the form handle and allow files to be dropped onto the window:
form = Form.FromHandle(Window.Handle) as Form;

if (form == null)
{
    throw new InvalidOperationException("Unable to get underlying Form.");
}

// We must set AllowDrop to true in order to let Windows know that we're accepting drag and drop input
form.AllowDrop = true;
form.DragEnter += new System.Windows.Forms.DragEventHandler(form_DragEnter);
form.DragOver += new System.Windows.Forms.DragEventHandler(form_DragOver);
form.DragDrop += new System.Windows.Forms.DragEventHandler(form_DragDrop);


Of course, you still need to define the DragEnter, DragOver and DragDrop handlers yourself.
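For completeness, a minimal version of what those handlers could look like; this is a sketch with error handling omitted, and droppedTexture is an assumed field of the game class:

void form_DragEnter(object sender, System.Windows.Forms.DragEventArgs e)
{
    // Only accept file drops.
    if (e.Data.GetDataPresent(System.Windows.Forms.DataFormats.FileDrop))
        e.Effect = System.Windows.Forms.DragDropEffects.Copy;
}

void form_DragOver(object sender, System.Windows.Forms.DragEventArgs e)
{
    e.Effect = e.Data.GetDataPresent(System.Windows.Forms.DataFormats.FileDrop)
        ? System.Windows.Forms.DragDropEffects.Copy
        : System.Windows.Forms.DragDropEffects.None;
}

void form_DragDrop(object sender, System.Windows.Forms.DragEventArgs e)
{
    string[] files = (string[])e.Data.GetData(System.Windows.Forms.DataFormats.FileDrop);
    using (var stream = System.IO.File.OpenRead(files[0]))
    {
        // Load the dropped image as a Texture2D (XNA 4.0).
        droppedTexture = Texture2D.FromStream(GraphicsDevice, stream);
    }
}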

Once I had this, I added a little bloom effect and combined this with the original. The result was pretty cool:



Here's the link: Bloomifier. To successfully run this on any Windows system, you need the XNA redistributable: XNA