Geometry Caching
To execute tasks in parallel with good load balancing, it is essential to
subdivide the entire rendering operation into small tasks, which are called
jobs in mental ray. Jobs may perform a wide variety of operations:
tessellating a portion of a surface, loading a portion of a texture, casting
a bundle of rays, rendering a section of the image, balancing a photon map,
and many others.
mental ray executes jobs on demand, managed by a dependency graph of jobs:
a job is executed only when another job accesses the data it generates. The
data generated by jobs enters a memory pool, traditionally called the
geometry cache.
This cache manages almost all data stored in memory by mental ray, including
textures, photons, images, and more. Data does not only enter the cache; it
may also be deleted when memory fills up, provided that it has not been used
for a while and that the source data is known to be available or can be
recreated at any time. This is handled by mental ray memory management.
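The on-demand execution described above can be illustrated with a minimal
Python sketch. It is not mental ray's API: the `JobGraph` class, its methods,
and the job names are hypothetical, chosen only to show that a job runs when
another job first asks for its data, and that jobs nobody asks for never run.

```python
class JobGraph:
    """Hypothetical on-demand job scheduler (illustration only)."""

    def __init__(self):
        self._jobs = {}      # job name -> (dependency names, function)
        self._cache = {}     # job name -> data produced by that job
        self.executed = []   # names of jobs in the order they actually ran

    def add_job(self, name, deps, func):
        # Registering a job does not execute it.
        self._jobs[name] = (list(deps), func)

    def get(self, name):
        # A job runs only when its data is requested and not yet cached.
        if name not in self._cache:
            deps, func = self._jobs[name]
            inputs = [self.get(dep) for dep in deps]  # pull dependencies first
            self._cache[name] = func(*inputs)
            self.executed.append(name)
        return self._cache[name]


graph = JobGraph()
graph.add_job("tessellate_floor", [], lambda: ["triangle"] * 4)
graph.add_job("tessellate_offscreen", [], lambda: ["triangle"] * 1000)
graph.add_job("load_texture", [], lambda: "pixels")
graph.add_job(
    "render_tile",
    ["tessellate_floor", "load_texture"],
    lambda tris, tex: f"tile with {len(tris)} triangles and {tex}",
)

# Requesting the tile triggers exactly the jobs it depends on.
result = graph.get("render_tile")
print(result)          # tile with 4 triangles and pixels
print(graph.executed)  # ['tessellate_floor', 'load_texture', 'render_tile']
```

The off-screen tessellation job is registered but never executed, because no
other job requests its data.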
Dynamic job execution, with the geometry cache as the central hub of
mental ray, has several advantages:
- rendering begins almost immediately, instead of having to wait for the
completion of a possibly large number of preparation jobs.
- data that is not used will not be computed. For example, typically
only parts of shadow maps are ever actually used; mental ray may compute
only those parts instead of creating the entire shadow map. Also,
in fly-through scenes, only parts of the scene may actually be used
for rendering (if global illumination effects are not enabled).
- not all data is used at the same time. Data enters the cache when needed,
and may get displaced later when memory fills up. Memory usage rises smoothly
up to the defined memory limit and then tends to stay there.
- parallelism is enhanced, because threads no longer become idle
near the end of each phase. Job execution is completely independent of
(host) location and of other jobs, and can be freely scheduled all over
the network to make use of available resources.
- scene complexity has lower impact on performance. If the complexity of the
scene exceeds the capacity of the system, performance degrades gracefully
instead of hitting the wall by failing completely or by getting bogged down
in excessive disk swapping.
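The displacement behavior named in the list above (data enters the cache when
needed and may be displaced when memory fills up) can be sketched as a
size-limited cache. This is only an illustration under stated assumptions:
the `GeometryCache` class is hypothetical, the least-recently-used policy is
a common choice that the text does not confirm for mental ray, and every
entry is assumed to be recreatable through a registered build function.

```python
from collections import OrderedDict


class GeometryCache:
    """Hypothetical size-limited cache with least-recently-used eviction."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0
        self.builds = 0                 # how often data had to be (re)created
        self._entries = OrderedDict()   # key -> (data, size), oldest first
        self._builders = {}             # key -> (build function, size)

    def register(self, key, build, size):
        # Only a recipe is stored; nothing is computed yet.
        self._builders[key] = (build, size)

    def access(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as recently used
            return self._entries[key][0]
        build, size = self._builders[key]
        # Displace the least recently used entries until the new one fits.
        while self._entries and self.used + size > self.limit:
            _, (_, old_size) = self._entries.popitem(last=False)
            self.used -= old_size
        data = build()                       # create, or re-create, on demand
        self.builds += 1
        self._entries[key] = (data, size)
        self.used += size
        return data


cache = GeometryCache(limit_bytes=100)
for name in ("a", "b", "c"):
    cache.register(name, build=lambda name=name: f"mesh {name}", size=40)

peak = 0
for name in ("a", "b", "a", "c", "b"):
    cache.access(name)
    peak = max(peak, cache.used)

print(peak)          # 80: usage rises to just under the limit and stays there
print(cache.builds)  # 4: "b" was displaced once and had to be re-created
```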
In general, geometry caching has the effect of exploiting scene coherence
very effectively. mental ray achieves significantly higher speed with
noticeably lower memory usage than the previous generation. Unlike traditional
geometry caching methods, this advantage is not limited to specific scenes;
it works with all aspects of mental ray, including ray tracing and global
illumination. It works better if the scene has a high degree of coherence,
but advantages remain even for
less-coherent scenes. For example, scenes that make extensive use of global
illumination, which by definition is global and less coherent, may require a
larger cache but will still benefit from its use.
Demand Loading
The capability of mental ray to defer operations until the time they are
actually required can be utilized to optimize loading and storage of the scene
elements during rendering. This avoids consuming memory for things that are
currently not needed, and which might in fact never be accessed. It is quite
common to render a small portion of a scene, like a specific view of a large
virtual world. Not having to fit the entire scene in memory, including
potential off-screen parts, can save a lot of space and time.
The demand loading mechanisms in mental ray make it possible to defer the
typically expensive file reading, scene parsing, geometry translation, and
tessellation tasks until late in rendering, when the rendering algorithm
actually requests their results. As a benefit, the renderer is able to throw
out parts of the scene temporarily, also known as memory flushing, to make
room in tight memory situations and still finish successfully. Those elements
can be re-created at any later time if demanded. mental ray uses this
procedure internally for all tessellation operations, for instance when
generating triangle representations from higher-level geometry descriptions
like free-form surfaces or subdivision surfaces. It also makes it possible to
handle extremely detailed tessellations and huge triangle counts, especially
in cases of displacement with fine approximations. Similarly, input textures
read from texture files are always loaded on demand.
Demand loading of scene elements is also available at the .mi file level and
through geometry shaders, by using assemblies for delayed reading or
translation of sub-scenes, or placeholder objects for individual geometry.
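As a rough illustration of the placeholder idea, the following Python sketch
defers creation of geometry until it is first requested and allows it to be
flushed and re-created later. It does not use .mi syntax or the geometry
shader interface: the `Placeholder` class, the `visible` helper, and the
one-dimensional bounding intervals are all assumptions made for the example.

```python
class Placeholder:
    """Hypothetical stand-in for geometry that is created on first access."""

    def __init__(self, name, bounds, create):
        self.name = name
        self.bounds = bounds      # cheap to store, assumed known up front
        self._create = create     # expensive callback, run only on demand
        self._geometry = None
        self.loads = 0

    @property
    def loaded(self):
        return self._geometry is not None

    def geometry(self):
        if self._geometry is None:
            self._geometry = self._create()
            self.loads += 1
        return self._geometry

    def flush(self):
        # Drop the data; it can be re-created by the callback if demanded.
        self._geometry = None


def visible(bounds, view):
    """Simplified 1D overlap test standing in for a real visibility query."""
    (lo, hi), (view_lo, view_hi) = bounds, view
    return lo <= view_hi and view_lo <= hi


scene = [
    Placeholder("house", (0.0, 10.0), lambda: ["triangle"] * 500),
    Placeholder("mountain", (200.0, 900.0), lambda: ["triangle"] * 50000),
]
view = (-5.0, 20.0)

# Only geometry whose bounds overlap the view is ever created.
triangle_count = sum(
    len(p.geometry()) for p in scene if visible(p.bounds, view)
)
print(triangle_count)                  # 500
print([p.loaded for p in scene])       # [True, False]

# Flushing frees the data; the next access re-creates it.
scene[0].flush()
scene[0].geometry()
print(scene[0].loads)                  # 2
```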
Memory Management
The memory management system in mental ray allows efficient use of the memory
resources available on the rendering machine, and provides the basis for the
capability to render huge scenes on systems with limited memory. Any memory
that is allocated in the mental ray core and in shaders (using the memory
shader interface functions) is registered in the memory module together
with other essential information. This makes it possible to monitor overall
consumption and to react to progress and failure conditions during rendering.
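The bookkeeping described above might look roughly like the following Python
sketch. It is an illustration only: `MemoryRegistry`, its owner labels, and
the `on_pressure` callback are hypothetical names and do not correspond to
the actual shader interface functions.

```python
class MemoryRegistry:
    """Hypothetical registry that tracks allocations by owner label."""

    def __init__(self, limit=None, on_pressure=None):
        self.limit = limit              # None stands for "unlimited"
        self.on_pressure = on_pressure  # called when usage exceeds the limit
        self.by_owner = {}              # owner label -> bytes currently held
        self.total = 0
        self.peak = 0

    def allocate(self, owner, size):
        self.by_owner[owner] = self.by_owner.get(owner, 0) + size
        self.total += size
        self.peak = max(self.peak, self.total)
        over_limit = self.limit is not None and self.total > self.limit
        if over_limit and self.on_pressure is not None:
            self.on_pressure(self)      # give the caller a chance to react

    def release(self, owner, size):
        self.by_owner[owner] -= size
        self.total -= size


warnings = []
registry = MemoryRegistry(
    limit=1000,
    on_pressure=lambda reg: warnings.append(reg.total),
)
registry.allocate("core", 600)
registry.allocate("shader", 500)   # pushes total usage past the limit
registry.release("shader", 500)

print(warnings)        # [1100]
print(registry.total)  # 600
print(registry.peak)   # 1100
```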
The primary control to tune memory usage in mental ray is the memory limit.
By default it is set to about 1 GB on 32-bit systems, which works well on most
existing systems, and to unlimited on 64-bit systems. It may be set to any
other value, and even to "unlimited" in special cases. In the "unlimited"
case, mental ray depends entirely on the operating system to provide memory,
and starts to intervene only when the system fails to allocate. However, in
this case, typically very few resources are left to perform clean-up
procedures in reasonable time, and mental ray's attempt to survive the
bottleneck and finish the current rendering may take a long time or fail
anyway. Setting the limit too high has a similar effect. On the other hand,
lowering the limit can help to render a huge scene successfully, at the
expense of noticeably reduced performance (due to releasing and re-creating
pieces of the scene). It is best practice to model scene elements in smaller
pieces instead of creating huge monolithic objects, and to utilize callback
mechanisms like object placeholders and assemblies whenever possible, to
benefit most from the memory management system.
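The trade-off between a lower memory limit and the cost of re-creating scene
pieces can be made concrete with a small simulation. The sketch below is a
toy model, not a measurement of mental ray: the `count_creations` function,
the access pattern, the piece size, and the least-recently-used displacement
policy are all assumptions chosen for illustration.

```python
from collections import OrderedDict


def count_creations(accesses, piece_size, limit):
    """Count how often a scene piece must be created or re-created.

    `limit` is the memory limit in the same unit as `piece_size`;
    None stands for "unlimited".
    """
    resident = OrderedDict()   # pieces currently in memory, oldest first
    used = 0
    creations = 0
    for piece in accesses:
        if piece in resident:
            resident.move_to_end(piece)    # already in memory, no cost
            continue
        # Release the least recently used pieces until the new one fits.
        while limit is not None and resident and used + piece_size > limit:
            resident.popitem(last=False)
            used -= piece_size
        resident[piece] = True
        used += piece_size
        creations += 1
    return creations


# Four scene pieces, accessed with some coherence (neighbors reused).
accesses = [0, 1, 0, 1, 2, 3, 2, 3, 0, 1, 0, 1]

print(count_creations(accesses, piece_size=100, limit=None))  # 4
print(count_creations(accesses, piece_size=100, limit=200))   # 6
print(count_creations(accesses, piece_size=100, limit=100))   # 12
```

In this toy model every piece is created once when memory is unlimited, while
the tightest limit forces a re-creation on every access: the rendering still
completes, but with much more repeated work.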
The memory limit may be adjusted globally by a registry variable, or
overridden for a specific rendering by a command line option.
Copyright © 1986, 2015
NVIDIA ARC GmbH. All rights reserved.