Architecture

Parallelism

mental ray has been designed to take full advantage of parallel hardware and achieve maximal performance benefit through concurrent execution of rendering tasks. On multi-processor and multi-core machines it automatically exploits thread parallelism where multiple threads of computation can access shared memory. No user intervention is required to take advantage of this type of parallelism.

mental ray is also capable of exploiting thread and process level parallelism where multiple threads or processes cooperate in the rendering of a single image but do not share memory. This is done using a distributed shared database that provides demand-driven transparent sharing of database items on multiple systems [1]. This allows parallel execution across a network of computers for distributed rendering.

mental ray also supports hyperthreading on capable processors if it is enabled. Hyperthreading, introduced on Intel Pentium 4 CPUs, is a technique that runs a second thread, including a separate program counter, on a single CPU chip to exploit otherwise idle execution units on the chip. Since this increases memory access frequency and reduces locality of reference, hyperthreading achieves a performance increase of only about 15-25%. Early machines with hyperthreading support are not very reliable, so most vendors disable hyperthreading in their hardware, BIOS, or operating system even though the CPU supports it. mental ray will not pull a license for a hyperthreading companion thread.

Distributed Rendering

A mental ray renderer that is started on a host to read or translate the scene, or that runs inside an application it is integrated into, can act as a master. The master uses remote machines as slave renderers that contribute to the rendering of the current image and thereby reduce the overall rendering time. mental ray must be available as a service on the remote machines to support distributed rendering.

The master is responsible for connecting to all other machines, assigning the rendering of a specific tile of the image to a slave, transporting the requested scene data to the remote host, and collecting the final results back on the master to write the complete image.
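The master's role described above can be illustrated with a minimal Python sketch. It is not mental ray code: the function names (`split_into_tiles`, `render_tile`, `render_distributed`) are hypothetical, the shading function is a dummy, and a local thread pool merely stands in for remote slave hosts, which in a real setup would not share memory with the master.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(width, height, tile_size):
    """Return (x, y, w, h) rectangles that cover the image exactly once."""
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            tiles.append((x, y, w, h))
    return tiles


def render_tile(tile):
    """Stand-in for a slave: compute a dummy value for every pixel of the tile."""
    x0, y0, w, h = tile
    pixels = [[(x + y) % 256 for x in range(x0, x0 + w)]
              for y in range(y0, y0 + h)]
    return tile, pixels


def render_distributed(width, height, tile_size, num_workers):
    """Master: hand out tiles, collect the results, assemble the full image."""
    image = [[None] * width for _ in range(height)]
    tiles = split_into_tiles(width, height, tile_size)
    # Assumption: worker threads play the role of remote slave hosts.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for (x0, y0, w, h), pixels in pool.map(render_tile, tiles):
            for dy, row in enumerate(pixels):
                image[y0 + dy][x0:x0 + w] = row
    return image
```

Calling `render_distributed(10, 7, 4, 3)` splits a 10x7 image into six tiles, lets three workers process them, and returns the assembled image. The final image is complete only once every tile has been collected.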

A slave is always tied to a specific master; message communication and resource usage are typically managed by the master. The machine of a slave may be used by another user at the same time; systems do not become unavailable for other jobs if used as slaves. However, running a mental ray slave on a host may significantly degrade the performance of independent interactive application programs, such as modelers, on that host. If a slave aborts rendering, the entire network participating in the render (the master and all slaves) will be notified and abort as well. mental ray periodically checks the health of all hosts on the master/slave network.

The computation of the actual rendering tasks may be pushed to the slave(s) in the no master rendering mode. This can be used to reduce the load and resource consumption on the master machine, which is helpful if the master also runs front-end applications that must remain responsive during rendering of demanding scenes.

The list of remote machines for distributed rendering can be provided to the master in several ways.

Note that the master and a slave can run on the same local machine. They are executed in separate processes that communicate through the network layer of the system. Each operates on its own copy of the database, and they do not share memory. This configuration is therefore useful only in special cases, such as the no master rendering mode.

Progressive Rendering

By default, mental ray computes images by splitting them into rectangular pieces called tiles, which are rendered to their final quality with as much parallelism as possible. This is well suited for distributed rendering of larger scenes at highest quality. It has the disadvantage that the final image is available only after all tiles have been finished. The progressive rendering mode, in contrast, computes and delivers images at full resolution with incrementally improving quality. This supports interactive rendering applications of mental ray with full support for its feature set. This mode is well suited even for computationally expensive effects like ray tracing and indirect illumination, since the first few rendered frames already provide a visual impression of the final result. Users can adjust their model immediately without waiting for the complete image. Best performance is achieved for scenes that fit completely into the memory of the machine, coupled with a selection of features and rendering options in mental ray that best matches the purpose of the rendering.
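The contrast with tile rendering can be sketched in a few lines of Python. This is only an illustration of the general idea of progressive refinement, assuming that each pass adds one noisy sample per pixel and that the displayed frame is the running average; it does not describe how mental ray samples internally. The names `progressive_passes` and `noisy_gradient` are hypothetical.

```python
import random


def progressive_passes(width, height, sample_pixel, num_passes, seed=0):
    """Yield a full-resolution frame after every pass.

    Each pass adds one sample per pixel; the frame is the running average,
    so later frames are less noisy than earlier ones.
    """
    rng = random.Random(seed)
    sums = [[0.0] * width for _ in range(height)]
    for n in range(1, num_passes + 1):
        for y in range(height):
            for x in range(width):
                sums[y][x] += sample_pixel(x, y, rng)
        yield [[s / n for s in row] for row in sums]


def noisy_gradient(x, y, rng):
    """Hypothetical shading function: true value x + y plus zero-mean noise."""
    return (x + y) + rng.uniform(-1.0, 1.0)
```

Every frame yielded by `progressive_passes(4, 3, noisy_gradient, 200)` already covers the whole image, so a viewer can display the first frame at once and replace it as later, cleaner frames arrive.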

Additional optimization techniques are available in this mode. They can be used for typical scene setups, such as illumination with IBL (image-based lighting), to accelerate progressive rendering even further, at the expense of accuracy in certain rendering effects that may not contribute much to the intended visual result.

To gain speed, some advanced mental ray features are not supported in progressive rendering; see Known Limitations. This rendering mode can be enabled on the command line or with scene options.

Note that rendering a frame in this mode stops when certain quality criteria are fulfilled or a specified time limit is reached. The default settings are very high, so by default the render runs for a very long time before it stops automatically.
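The two stopping conditions can be sketched as a simple loop. The sketch below is an assumption about the general shape of such a loop, not mental ray's actual criteria: `run_pass` is a hypothetical callback that performs one refinement pass and returns an error estimate, and `error_threshold` and `time_limit_s` are hypothetical parameters.

```python
import time


def render_until_done(run_pass, error_threshold, time_limit_s,
                      clock=time.monotonic):
    """Run refinement passes until quality is reached or time runs out.

    Returns (number_of_passes, reason), where reason is "quality" or "time".
    """
    start = clock()
    passes = 0
    while True:
        error = run_pass()          # one refinement pass, returns an error estimate
        passes += 1
        if error <= error_threshold:
            return passes, "quality"
        if clock() - start >= time_limit_s:
            return passes, "time"
```

With a very strict threshold and a very long time limit, which corresponds to the high defaults mentioned above, such a loop keeps refining for a long time unless the caller supplies tighter limits.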

[1] The parallel rendering technology required to support distributed shared databases was originally developed by mental images as part of the ESPRIT Project 6173, Design by Simulation and Rendering on Parallel Architectures (DESIRE). See Herken94.

Copyright © 1986, 2015 NVIDIA ARC GmbH. All rights reserved.