Geometry Caching

Efficient parallel execution depends on load balancing, which in turn requires subdividing the entire rendering operation into small tasks, called jobs in mental ray. Jobs may perform a wide variety of operations: tessellating a portion of a surface, loading a portion of a texture, casting a bundle of rays, rendering a section of the image, balancing a photon map, and many others.

mental ray executes jobs on demand, managed by a dependency graph of jobs: a job is executed only when another job accesses the data it generates. The data generated by jobs enters a memory pool, traditionally called the geometry cache. This cache manages almost all data stored in memory by mental ray, including textures, photons, images, and more. Data does not only enter the cache; it may also be deleted from it when memory fills up, provided that the data has not been used for a while and that its source data is known to be available or can be recreated at any time. This is handled by mental ray memory management.
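The on-demand behavior described above can be sketched in a few lines of Python. This is only an illustration of the principle, not mental ray's implementation: the Job and JobCache classes, the job names, and the lambda bodies are all hypothetical. A job runs only when some other job asks for its data, and its result then stays in the cache.

```python
class Job:
    """Hypothetical unit of work; its result is computed only on first access."""

    def __init__(self, name, compute, deps=()):
        self.name = name
        self.compute = compute    # function that receives the dependency results
        self.deps = list(deps)    # jobs whose data this job reads


class JobCache:
    """Hypothetical stand-in for the geometry cache: stores job results by name."""

    def __init__(self):
        self.results = {}
        self.executed = []        # names of jobs in the order they actually ran

    def get(self, job):
        # Accessing a job's data is what triggers its execution.
        if job.name not in self.results:
            inputs = [self.get(dep) for dep in job.deps]
            self.results[job.name] = job.compute(*inputs)
            self.executed.append(job.name)
        return self.results[job.name]


# Illustrative jobs only; real jobs would tessellate, load textures, and so on.
tessellate = Job("tessellate", lambda: ["tri0", "tri1"])
texture = Job("texture", lambda: "pixels")
shadow_map = Job("shadow_map", lambda tris: len(tris), deps=[tessellate])
tile = Job("tile", lambda tris, tex: (len(tris), tex), deps=[tessellate, texture])

cache = JobCache()
tile_result = cache.get(tile)     # pulls in tessellate and texture, nothing else
```

Requesting the tile runs the tessellation and texture jobs first and then the tile itself. The shadow map job is defined but never requested, so it never runs.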

Dynamic job execution, with the geometry cache as the central hub of mental ray, has several advantages:

rendering begins almost immediately,
instead of having to wait for the completion of a possibly large number of preparation jobs.
data that is not used will not be computed.
For example, typically only parts of shadow maps are ever actually used; mental ray may only compute those parts instead of creating the entire shadow map. Also, in fly-through scenes, only parts of the scene may actually be used for rendering (if global illumination effects are not enabled).
not all data is used at the same time.
Data enters the cache when needed, and may be displaced later when memory fills up. Memory usage rises smoothly up to the defined memory limit and then tends to stay there.
parallelism is enhanced,
because threads no longer become idle near the end of each phase. Job execution is completely independent of (host) location and of other jobs, and can be freely scheduled all over the network to make use of available resources.
scene complexity has lower impact on performance.
If the complexity of the scene exceeds the capacity of the system, performance degrades gracefully instead of hitting a wall, either by failing completely or by bogging down in excessive disk swapping.
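To illustrate the point about parallelism, the following Python sketch lets several worker threads demand the same piece of data at once. It is a simplified model under stated assumptions, not mental ray's scheduler: SharedJobCache, render_tile, and the "tessellation" job are hypothetical names, and the assumption that a shared job runs exactly once while other threads wait for its result is made for the sake of the example.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class SharedJobCache:
    """Hypothetical thread-safe cache: each named job is started only once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._futures = {}
        self.run_count = {}       # how many times each job was started

    def get(self, name, compute):
        with self._lock:
            future = self._futures.get(name)
            owner = future is None
            if owner:
                # The first thread to ask becomes responsible for the job.
                future = Future()
                self._futures[name] = future
                self.run_count[name] = self.run_count.get(name, 0) + 1
        if owner:
            try:
                future.set_result(compute())
            except Exception as exc:
                future.set_exception(exc)
        # Every other thread blocks here until the result is available.
        return future.result()


cache = SharedJobCache()


def render_tile(index):
    # Every tile needs the same tessellation; only one thread computes it.
    triangles = cache.get("tessellation", lambda: list(range(1000)))
    return index, len(triangles)


with ThreadPoolExecutor(max_workers=4) as pool:
    tiles = list(pool.map(render_tile, range(8)))
```

All eight tiles see the same 1000 triangles, and the tessellation is started once regardless of which thread asked first.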

In general, geometry caching exploits scene coherence very effectively. mental ray achieves significantly higher speed with noticeably lower memory usage than the previous generation. Unlike traditional geometry caching methods, this advantage is not limited to specific scenes; it works with all aspects of mental ray including ray tracing and global illumination. It works better if the scene has a high degree of coherence, but advantages remain even for less-coherent scenes. For example, scenes that make extensive use of global illumination, which by definition is global and less coherent, may require a larger cache but will still benefit from its use.

Demand Loading

The capability of mental ray to defer operations until the time they are actually required can be utilized to optimize loading and storage of the scene elements during rendering. This avoids consuming memory for things that are currently not needed, and which might in fact never be accessed. It is quite common to render a small portion of a scene, like a specific view of a large virtual world. Not having to fit the entire scene in memory, including potential off-screen parts, can save a lot of space and time.

The demand loading mechanisms in mental ray allow the typically expensive file reading, scene parsing, geometry translation, or tessellation tasks to be moved to late render time, when the rendering algorithm finally requests their results. As a benefit, the renderer is able to throw out parts of the scene temporarily, also known as memory flushing, to make room in tight memory situations and finish successfully anyway. Those elements can be re-created at any later time if demanded. mental ray uses this procedure internally for all tessellation operations, for instance when generating triangle representations from higher-level geometry descriptions like free-form surfaces or subdivision surfaces. It also makes it possible to handle extremely detailed tessellations and huge triangle counts, especially in cases of displacement with fine approximations. Similarly, input textures read from texture files are always loaded on demand.
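The deferred loading and flushing described above can be modeled with a small Python sketch. Everything in it is hypothetical and chosen for illustration: the Placeholder class, the loader callbacks, and in particular the use of a 2D bounding-box overlap test to decide whether an element is needed are assumptions of this example, not a description of mental ray's internals.

```python
def overlaps(a, b):
    """True if two axis-aligned boxes ((minx, miny), (maxx, maxy)) overlap."""
    (a_min, a_max), (b_min, b_max) = a, b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(2))


class Placeholder:
    """Hypothetical scene element whose content is created only when requested."""

    def __init__(self, name, bbox, loader):
        self.name = name
        self.bbox = bbox          # cheap to keep in memory at all times
        self.loader = loader      # expensive: file reading, tessellation, ...
        self.data = None
        self.loads = 0            # how often the content had to be (re)created

    def get(self):
        if self.data is None:
            self.data = self.loader()
            self.loads += 1
        return self.data

    def flush(self):
        # Drop the content to free memory; the loader can re-create it later.
        self.data = None


def render(scene, view):
    # Only elements whose bounding box overlaps the view are ever loaded.
    return [p.get() for p in scene if overlaps(p.bbox, view)]


scene = [
    Placeholder("house", ((0, 0), (2, 2)), lambda: "house triangles"),
    Placeholder("far_city", ((50, 50), (90, 90)), lambda: "city triangles"),
]
view = ((0, 0), (10, 10))

first = render(scene, view)       # loads the house only
scene[0].flush()                  # simulate a flush under memory pressure
second = render(scene, view)      # the house is re-created on demand
```

The off-screen city is never loaded, and the flushed house is simply loaded a second time when it is requested again.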

Demand loading of scene elements is also available on the .mi file level and through geometry shaders, by using assemblies for delayed reading or translation of sub-scenes, or placeholder objects for individual geometry.

Memory Management

The memory management system in mental ray allows efficient use of the memory resources available on the rendering machine, and provides the basis for rendering huge scenes on systems with limited memory. Any memory that is allocated in the mental ray core and in shaders (using the memory shader interface functions) is registered in the memory module together with other essential information. This makes it possible to monitor overall consumption and to react to progress and failure conditions during rendering.

The primary control to tune memory usage in mental ray is the memory limit. On 32-bit systems it defaults to about 1 GB, which works well on most existing systems; on 64-bit systems it defaults to unlimited. It may be set to any other value, and even to "unlimited" in special cases. In the "unlimited" case, mental ray depends entirely on the operating system to provide memory, and starts to intervene only when the system fails to allocate. However, in this case, typically very few resources are left to perform clean-up procedures in reasonable time, and mental ray's attempt to survive the bottleneck and finish the current rendering may take a long time or fail anyway. Setting the limit too high has a similar effect. On the other hand, lowering the limit can help to render a huge scene successfully, at the expense of noticeably reduced performance (due to releasing and re-creating pieces of the scene). To benefit most from the memory management system, it is best practice to model scene elements in smaller pieces instead of creating huge monolithic objects, and to utilize callback mechanisms like object placeholders and assemblies whenever possible.
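The trade-off can be made concrete with a toy simulation in Python. It is a sketch under assumptions, not a model of mental ray's actual policy: the simulate function is hypothetical, sizes are in arbitrary units, and least-recently-used eviction is assumed as a simple stand-in for "has not been used for a while".

```python
from collections import OrderedDict


def simulate(sizes, accesses, limit):
    """Replay accesses against an LRU cache; return (re-creations, peak usage)."""
    cache = OrderedDict()         # name -> size, least recently used first
    created = set()
    used = peak = recreations = 0
    for name in accesses:
        if name in cache:
            cache.move_to_end(name)
            continue
        if name in created:
            recreations += 1      # data was flushed earlier and must be rebuilt
        created.add(name)
        # Flush old entries until the new one fits or nothing is left to flush.
        while cache and used + sizes[name] > limit:
            _, freed = cache.popitem(last=False)
            used -= freed
        cache[name] = sizes[name]
        used += sizes[name]
        peak = max(peak, used)
    return recreations, peak


# The same 100 units of scene data, modeled as four pieces or as one object.
pieces = {"p0": 25, "p1": 25, "p2": 25, "p3": 25}
accesses = ["p0", "p1", "p2", "p3"] * 3

unlimited = simulate(pieces, accesses, float("inf"))   # nothing is ever flushed
tight = simulate(pieces, accesses, 50)                 # fits, but rebuilds often
whole = simulate({"all": 100}, ["all"] * 12, 50)       # cannot be brought under 50
```

With no limit, nothing is re-created and usage peaks at 100 units. With a limit of 50, the split scene stays within the limit but pays for it with eight re-creations. The single 100-unit object cannot be reduced by flushing at all, so its peak exceeds the limit, which is why smaller pieces give the memory manager more room to work.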

The memory limit may be adjusted globally by a registry variable, or overridden for a specific rendering by a command line option.