O. Fryazinov, A. Pasko
National Centre for Computer Animation, Bournemouth University, UK
ofryazinov@bournemouth.ac.uk, apasko@bournemouth.ac.uk
A new method is presented for rendering general FRep (functionally represented) models using GPU-accelerated ray casting. We use the GPU acceleration for all computations in the rendering algorithm: ray-surface intersection calculation, function evaluation, and normal vector computation. Performing geometric intersection calculations in parallel with shading allows us to combine the whole process of rendering within one fragment program on GPU. The algorithm is well-suited for modern GPU and provides good performance with good quality of results; it is practically memoryless and does not require a powerful CPU
The function representation (FRep) defines a geometric object by a single continuous real function of point coordinates as: F(X) ≥ 0 [15], where the function is evaluated while tracing an underlying tree structure or by running a “black box” evaluation procedure. Functionally based models are also called implicit models or implicit surfaces. Methods of constructing implicit models are developed well enough; however, rendering of these models with interactive rates remains an open problem.
At present, there are two ways to render FRep models. The first one is the polygonization, where the surface of a FRep object is represented approximately with a set of polygons. The second one is ray-tracing. The polygonization has become popular due to the intensive development of software and hardware for polygonal mesh visualization. However, it is memory- and computationally expensive to generate in real time and moreover it is not robust, because features like spikes and sharp edges can be lost during the polygonal mesh generation. Ray-tracing is regarded as more precise method to visualize FRep models; however it is computationally expensive to perform in real time too.
In this paper, we present a method of ray-casting (ray-tracing primary rays only) accelerated using GPUs (graphics processing units) and specialized for rendering FRep models with interactive rates. In recent years, the evolution of graphics hardware has resulted in using graphics cards not only for rendering polygons, but for solving more general problems. It is due to introduction of shaders, which are GPU programs for processing vertices and fragments. Although shaders basically allow for the parallel calculations of positions of vertices or colours of pixels, they can be used as programs for more general computations. These computations can be performed faster on GPUs than similar computations on CPUs because of using multi-core parallel processing in modern GPUs.
The ray-tracing algorithm computes the ray-surface intersection for the particular pixel independently from other pixels; therefore we can accelerate these computations with parallel processing on a GPU. In our method, we use GPU for all the necessary computations: ray-surface intersection calculation, function evaluation, and calculation of the normal. By performing shading computations in a shader program we avoid superfluous computations on CPU and process all the data in one shader pass. Computation of normals of the FRep surface allows us to use per-pixel lighting, leading to better surface rendering. Moreover, we only need to store ray data (two vectors) for each pixel, so our method is practically memoryless, thereby alleviating the large memory consumption problems essential to polygonization based rendering.


a. b.
Fig. 1. Rendering of some FRep objects using our method. a) stand (virtual shikki), b) sandbox (Hyperfun gallery)
By using acceleration on GPU, we achieve ray-tracing performance competitive with real-time. We also present techniques for additional accelerations of the ray-tracing algorithm that allow for further improving its performance.
Ray-tracing of functionally based models and especially skeletal implicit surfaces was examined by many researchers in recent years. Classical methods of ray-tracing were summarized in [9]. These methods generally are very slow even on modern hardware and they were improved for different special cases. Thus, in [17] ray-surface intersection was accelerated using polynomial approximations of implicit surfaces. The authors of [5] deal with implicit models represented by tree data structures; rendering was accelerated due to this restriction on models. For models represented by tree structures, acceleration was achieved in [10] because of analytical root finding in the tree leaves with implicit surface primitives. Although these methods provide good performance, they are not appropriate for general case objects, because not each procedural model can be easily represented with a tree-like structure. In [4] acceleration was achieved for dynamic scenes, where the previous frame was used as input data for the current frame. However, rendering of the first frame and finding the difference between frames remains computationally expensive.
Ray-tracing on GPU is also a well researched area. However, most papers have focused on polygonal meshes and parametric surfaces. Thus, GPU-accelerated ray-tracing for triangle meshes was introduced in [16, 1], where computations were divided between GPU and CPU because of limitations of graphics hardware. Further development of GPU-based ray-tracing of scenes composed of triangle meshes was considered in [2], where classical recursive ray-tracing was implemented for GPU. In [6], performance of ray-tracing was improved using kd-trees, in [20] acceleration of ray-tracing was achieved using threaded bounding box hierarchy stored as a geometry image. GPU ray-tracing of volumetric data using graphic hardware was introduced in [12]. The work [14] presented a method of GPU rendering of piecewise algebraic surfaces.
GPU-accelerated ray-tracing of implicit surfaces was introduced only for several particular types of surfaces. The work [3] considered ray-tracing of implicit surfaces defined by radial based functions. Rendering of quadratic implicit surfaces on GPU was reviewed in [13]; in [19] effective ray-tracing on GPU was implemented for objects represented by CSG-trees with pre-defined primitives; and [8] introduced ray-tracing of discrete isosurfaces.
The main idea of our
algorithm is using GPU for most calculations in classical ray-tracing
algorithms adapted to FRep objects. As it was
mentioned above, the calculation of the ray-surface intersection for each pixel
does not depend on other pixels. Therefore, we can use parallel calculations in
GPU to accelerate rendering. We use a program called a fragment shader to
perform computations on pixels. We use the fragment shader program for
performing the following:
o
Find the
ray-surface intersection
o
Calculate
the surface normal
o
Compute
shading
The contents of the
fragment shader are shown in Fig. 2.
The main calculation
load is accounted for the function value calculation at the given point. If we
use any classic method such as
o
Calculation
of the function value
o
Root
finding when the function value is known
The first part
concerns the internal model representation. The second part concerns the
ray-surface intersection algorithm. The part of the shader that calculate the
function value depends on the model and should be re-generated for each new
model; the root-finding part is similar for all the models and depends only on
the selected method of the ray-surface intersection.
For our implementation, we use OpenGL and the shader language GLSL. Note that we should bare in mind GPU restrictions such as inability to use recursion or early breaks in functions. Hardware restrictions depend on current graphics hardware, in this paper we mention restrictions that we met during the implementation.
Fig. 2. Scheme of the
fragment shader
3.1 Model representation
We briefly review how the model can be represented by a fragment program in GPU. In the function representation any object can be described by a real value function with real value arguments. This function can be either given by a text file describing a tree structure (as in BlobTree [5]) or by an evaluation procedure (HyperFun [11]). In this work, we use HyperFun objects, because both HyperFun and GLSL are C-alike languages and the conversion between them can be done easily. Thus, the model representation in the form of a HyperFun file is converted to a shader program in GLSL.
The problem in the conversion from HyperFun could be with the library of primitives and operations, but all the library functions can be implemented in GLSL without any problem. For example, we show below how a HyperFun model of a sphere can be represented with GLSL functions:
HyperFun:
my_model(x[3], a[1])
{
my_model = 9 - x[1]* x[1] - x[2]*
x[2] - x[3]* x[3];
}
GLSL:
bool my_model(in vec3 vecX, in vec3 vecA, inout float fValue)
{
fValue = 9-vecX.x*vecX.x-vecX.y* vecX.y-vecX.z*vecX.z;
bool bResult = (fMyModel>=0);
return bResult;
}
Thus, the function related part of the shader can be generated as follows:
o
Check the input file for using library functions
o
For the used library functions, include their implementations into the shader
o
Convert objects from the input model to the shader language
In the similar way, practically all HyperFun programs can be converted to the GLSL representation. Although user can write function for the model representation directly on the GLSL language and avoid conversion, it’s necessary to have in mind the syntax of the function, because the rest of the shader uses the model functions with defined context and defined arguments. As we have shown above, the function should have the point coordinates and the vector of parameters as the input and the function value and the function sign as the output. Note that the main return is the function sign instead of value. We did it for optimize the operations quantity in the shader.
After all we should have the function, where the functional model is defined.
3.2 Ray-surface intersection
For a FRep model, ray-surface intersection means the search of zero roots of the defining function along a ray. This process can be done using either approximate methods or methods with exact root search. Moreover we can use localization of several roots and approximate search of the others instead of exact root search.
Approximate methods were discussed in detail in [9] for software ray-tracing. In our method, we use interval analysis combined with the Newton method. This method was selected as the easiest one to implement, light weighted from the point of view of the quantity of shader operators and relatively robust. Thus, during the generation of the fragment shader, we add the ray-surface intersection part built with the following algorithm:
o
calculate
function value at the first point of the ray
o
subdivide
the ray into intervals
o
for each
interval
o
calculate
the function value at the end of the interval
o
compare
signs of the function at the beginning of the interval and at the end
o
if signs
are different, set the flag of the found root as true
o
if the
interval with a root is not found, return the no-intersection flag
o
depending
on the interval tolerance calculate the number of iterations for the
o
at each
iteration refine the root with the
o
return the
intersection point coordinates
The length of the interval and all needed tolerances are set manually by the user in the beginning and then our technology writes these tolerances as the constants in the shader.. Input data for the ray-surface intersection are given for each pixel and include the ray beginning vector as the pixel-specific data and the ray direction as the shader-specific data. As much as we use only primary rays, we do not have to store ray direction for each pixel, because all directions are the same. We store input data in the texture and in the uniform variables.
The performance of the ray-surface intersection algorithm can be increased using exact root search. Unfortunately, in general case we cannot find exact roots, but we can use it as preprocessed data for some case studies. In section 4.2 we consider details of these modifications.
3.3 Calculating the surface normal and shading
The result of the fragment shader is the colour of the pixel. Many applications use graphics hardware to perform advanced per-pixel shading. We use it also in the generated fragment shader in such way that shading computations take place together with other computations considered above.
In addition to global light parameters such as light position and colour, eye position and view direction, we have to calculate the surface normal vector at the found ray-surface intersection point. We use an approximate method to calculate the normal locally as:
Note that we have to calculate additional function values for the normal calculation; therefore the normal calculation procedure should be inserted after the defining function in the generated shader. The shading is performed using the Phong method or similar. In the shading step we can also add procedural texture using methods, described in [18].
In this section we
show how rendering of a functionally based model is performed using the
fragment shader generated by our algorithm. Because of restrictions of modern
graphics hardware, we have to use auxiliary polygonal data for rendering.
However, we can render only a single fit-to-viewport polygon and it is
sufficient for performing the shader calculations.
The standard pipeline
for polygonal object processing is shown in Fig. 3.
Fig. 3. Standard rendeing pipeline
There are two programs
in the pipeline; one is applied to vertices and another one – to fragments. In
our algorithm we do not use a vertex program at all because all the
calculations are performed in the fragment shader. As mentioned above, we use
texture as the container for input data and we apply this texture to the
fit-to-viewport polygon using 1:1 (one pixel – one texel)
mapping.
In this work we use
the main algorithm for general models and its modification with pre-processing
on CPU for better speed and quality for case studies. Below we describe these
rendering algorithms in detail.
4.1 Rendering using the fragment shader
The implementation of
the rendering algorithm on GPU is a classical approach from the point of view
of general-purpose calculations technology on GPU, also known as GPGPU [7]. As
it was mentioned above, the auxiliary model for rendering is the viewport-sized
polygon. We also use viewport-sized texture that we map to this polygon, so we
have 1:1 mapping from texel to pixel. This texture is
the data source for our method, and we store point coordinates in it. The
viewing direction and the bounding box are defined as global variables that we
pass to the fragment shader.
In the fragment
shader, we obtain the ray position coordinates from input data stored in
texture, then calculate the ray-surface intersection
and the normal, and after that in the same fragment shader we make shading.
Return data is pixel colour in output area. If there is no intersection, we
shade the pixel to the background colour. As we make shading inside the
fragment shader along with other calculations, we do not have to return
anything to the CPU programs unlike general GPGPU programs. The rendering
process in this case looks as follows (Fig. 4.):
Fig. 4. Rendering using the fragment shader.
Advantages of such a
process are:
1.
All calculations performed only in the fragment
shader, there are no suspicious calculations. Moreover, we do not encounter
unnecessary rasterization issues, because there is
only one quad (two triangles) as the input data.
2.
We can get the best image quality for general
functionally represented models. However, the better quality-lower speed law is
applicable.
3.
Shading is performed along with other
calculations, so there are no unnecessary data transfers.
Disadvantages of the
described approach:
1.
For general models we use interval analysis, so
the object has thin or sharp features, we can skip an interval, where the root
is located. The solution is to decrease the interval length, but in this case
the number of intervals increases and the speed is reduced.
2.
Some GPU cannot handle fragment programs with
many instructions; therefore some complicated models cannot be rendered with
our method.
4.2 Rendering with CPU pre-processing
We use this
modification of the general algorithm when the ray-surface intersections can be
calculated at the pre-processing step. This technology was introduced in [10]
for CPU-based rendering. The pre-processing step on CPU can include:
1)
General
procedural solution for the ray-surface intersection points. In this case we
just need to substitute initial data such as ray origin and direction to the
provided solution and find the function root. These calculations can be
executed on GPU and the rendering process looks as following (Fig. 5.):
Fig. 5. Rendering with CPU pre-processing (procedural
solution).
2)
Exact
roots. In this case we calculate the roots on CPU, then calculate normals and
perform shading on GPU (Fig. 6):
Fig. 6. Rendering with CPU pre-processing (exact
roots).
Advantages of this
modification:
1)
If the
exact root finding is possible, the speed of the rendering and the quality is
the best between all these methods.
2)
The number
of operations on GPU is less than in other modifications, so we can use more
complicated models.
Disadvantages:
1)
Unfortunately,
exact function roots for the ray-surface intersection cannot be found for an
arbitrary model. An even relatively simple object such as blended union between
two cylinders leads to the root search in polynomials of degree of 5. In such
cases we have to use approximate methods and the speed with quality can
decrease.
2)
If there
are many possible roots, problems can occur with transferring these data from
CPU to GPU because of limits of the data which can be transferred within one
pass.



Fig. 7. Rendering of a torus primitive: real time rotating and zoom, Phong shading
(2 light sources).
We tested our
algorithm implementation with rendering several relatively simple and more
complicated functionally based models (Fig. 1, 7, 8). In the performance results
shown below, we use a standard torus primitive as a
simple model. We also used complicated models from the Virtual Shikki project [11]. For computations in general methods we
use the tolerance value that produces minimum of artefacts.
Fig. 8. Rendering of dynamic model with procedural
texturing: metamorphosis from a rabbit (Hyperfun gallery) to a sandbox
(Hyperfun gallery)
Performance
characteristics of our implementation were measured on a PC with single NVIDIA GeForce 6800 card and Intel Pentium 4 3.20GHz CPU.
According to specifications, this graphics card can process up to 16 pixels per
clock. All models were rendered on a 256x256 pixels viewport. The speed
characteristics were measured for the general algorithm (“Fragment” in the
table) and for cases where we use solution pre-processed on CPU(“CPU/Fragment”
in the table). For comparison with CPU, we also measure speed characteristics
of a CPU-based ray-tracer implemented in PovRay (“CPU
general” in the table) and a CPU-based ray-tracer with root solving implemented
in [10] (“CPU, root solving” in the table). We provide the result in the
following table, where speed is measured in frames per second (bigger fps means
higher speed).
|
CPU,
general
|
CPU, root
solving
|
Fragment
|
CPU/Fragment
|
Torus
|
0.16
|
6
|
30
|
100
|
Cup
|
0.25
|
N/A
|
20
|
60
|
Rabbit
|
0.03
|
4.7
|
12
|
50
|
Sandbox
|
0.015
|
N/A
|
10
|
N/A
|
Stand
|
0.02
|
N/A
|
8.56
|
N/A
|
It can be seen from
the table that with the proposed GPU-based algorithm we can achieve the
substantial acceleration of rendering (up to five times) compared with the
fastest CPU-based method.


a b
Fig. 9. Rendering of a cup model. a) Fragment b)
CPU/Fragment. Note that because of interval method errors we have bowed edge in
the first case.
Adding pre-processing on CPU, when the model allows, brings further
increase in speed with the factor up to three and, in some cases, improves
quality (Fig. 9). Another shown advantage of the GPU-based solution is that
it can process general procedural models (see Fig. 10), while other methods
have limitations on model structure or complexity.
Fig. 10. Procedural 3D fractals rendered with ~ 5 FPS
by the fragment shader (models courtesy of F. Delhoume).
6. Conclusions and Future Work
In this paper we
presented the method of real-time rendering of general procedural FRep models with GPU-accelerated ray casting. As it was
shown, with this method good image quality can be obtained along with high
rendering speed providing interactive frame rates. This has been achieved
because of:
-
Parallel
calculations on GPU of intersection points of cast rays with functionally
represented models.
-
Depending
on the algorithm modification, we can obtain higher speed along with lower
quality for approximate visualisation; also we can use modifications to fit the
algorithm implementation to a concrete graphic card.
-
We perform
shading in the GPU shader, so we obtain a model image with per-pixel lighting.
-
If the
exact root finding is possible, we can increase the speed of rendering with
pre-processing on CPU.
However, our method
has some limitations:
-
In our
method, we find only the first intersection of the ray with the surface, so it
is not currently possible to render functionally based models with transparent
materials.
-
Traditional
methods of texturing will not work in the fragment shader and in CPU/fragment
modifications, because of binding of texture coordinates to vertices, but we do
not have vertices as the direct ray casting is performed.
-
It is
impossible to represent recursively defined models, because recursion is not
supported by modern GPUs. Probably this possibility
will appear in the next-generation GPUs.
-
Modern GPUs can handle many instructions per shader. However, it
still can be insufficient for very complicated models.
The removal of these limitations and further optimization of the proposed method is the subject for future research and development
[1] N. A. Carr, J. D. Hall, J. C. Hart. GPU
Algorithms for Radiosity and Subsurface Scattering. Proceedings of the
ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware,
[2] M. Christen. Ray Tracing on GPU. Master's thesis, Univ. of Applied Sciences
Basel (FHBB), 2005
[3] A. Corrigan, H. Quynh Dinh.
Computing and Rendering Implicit Surfaces Composed of Radial Basis Functions on
the GPU. International
Workshop on Volume Graphics, June
2005
[4] E. de Groot, B. Wyvill. Rayskip: Faster Ray Tracing of Implicit Surface
Animations. International Conference on Computer Graphics and Interactive
Techniques in
[5] M. Fox, C. Galbraith, B. Wyvill. Efficient Use of the BlobTree for
Rendering Purposes. Proceedings of the International
Conference on Shape Modeling & Applications,
2001, p 306.
[6] T. Foley, J. Sugerman. KD-tree
acceleration structures for a GPU raytracer. Proc.
SIGGRAPH/Eurographics Workshop on graphics hardware
2005, pp. 15-2
[7] Mark Harris. GPGPU: Beyond Graphics. NVIDIA GDC
Presentations, 2004
[8] M. Hadwiger, C. Sigg,
H. Scharsach, K. Bühler,
M. Gross. Real-Time Ray-Casting and Advanced Shading of
Discrete Isosurfaces. In Eurographics,
Blackwell Publishing, M. Alexa and J. Marks, Eds.,
vol. 24.
[9] J. C. Hart. Ray Tracing Implicit Surfaces. Siggraph 93 Course Notes No 25, pp 1-15.
[10] M. Hašan. An Efficient F-rep
Visualization Framework. Master thesis, Faculty of Mathemetics,
Physics and Informatics,
[11] V. Adzhiev, R. Catwright,
E. Fausett, A. Ossipov, A. Pasko, V. Savchenko. HyperFun project: Language and Software tools for F-rep
Shape Modeling. Computer Graphics & Geometry,
vol. 1, No 10, 1999
[12] J. Kruger, R. Westermann. Acceleration Techniques for GPU-based Volume Rendering. Proceedings
of the 14th IEEE Visualization 2003, pp 38.
[13] C. Lessig. Interaktives Ray-Tracing und
Ray-Casting auf programmierbaren Grafikkarten. Bachelor Thesis, Bauhaus University
[14] C. Loop, J. Blinn.
Real-Time GPU Rendering of Piecewise Algebraic Surfaces. Proceedings of Siggraph 2006. pp 664-670.
[15] A. Pasko, V. Adzhiev, A. Sourin, V. Savchenko. Function representation in geometric modeling: concepts, implementation and applications, The
Visual Computer, vol.11, No.8, 1995, pp. 429-446.
[19] R. Toledo, B. Levy. Extending the graphic
pipeline with new GPU-accelerated primitives. Tech report, INRIA (2004)
[20] N. A. Carr, J. Hoberock, K. Crane, J. C. Hart. Fast GPU Ray Tracing of
Dynamic Meshes using Geometry Images. ACM International Conference Proceeding
Series; Vol. 137, pp 203-209