GPU Settings Rollout

tyFlow’s various solvers may utilize GPU compute cores, depending on your simulation settings. This rollout allows you to control which GPU will be used for computations.


OpenCL GPU

Use the listbox to select which GPU should be used for OpenCL computations.

CUDA GPU

Use the listbox to select which GPU should be used for CUDA computations.

Only NVIDIA GPUs with up-to-date drivers can be used for CUDA computations.

  • Compatibility Mode: enabling this mode replaces the faster cuBLAS functions with simpler matrix multiplication kernels. The simpler kernels are slower, but may prevent CUDA crashes on certain systems where unexplained cuBLAS crashes have been occurring. Until the cause of those crashes is found and fixed, this mode exists as an option for users running into CUDA issues.

If CUDA crashes, try enabling this mode. You must restart 3ds Max after enabling it for the change to take effect.

PhysX

  • CUDA: controls whether PhysX computations will be accelerated with CUDA.

For CUDA acceleration to work, tyFlow requires that two DLL files (PhysXDevice64.DLL and PhysXGPU64.DLL, both available on the tyFlow download page) be placed in the same folder that the tyFlow DLO file is loaded from. CUDA acceleration is not compatible with sticky starting penetrations (on by default in the PhysX Shape operator, in v1.118+).
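The DLL requirement above can be checked before launching 3ds Max. The following is a minimal sketch, not part of tyFlow: the function name missing_physx_dlls and the example folder path are hypothetical, and the file names are taken as written in the text above. It only reports whether the two files sit in a given folder.

```python
from pathlib import Path

# File names as given in the text above.
REQUIRED_DLLS = ("PhysXDevice64.DLL", "PhysXGPU64.DLL")


def missing_physx_dlls(plugin_dir):
    """Return the required DLL names that are not present in plugin_dir."""
    folder = Path(plugin_dir)
    if not folder.is_dir():
        # A folder that does not exist cannot contain either file.
        return list(REQUIRED_DLLS)
    # Compare case-insensitively, since Windows file names are case-insensitive.
    present = {entry.name.lower() for entry in folder.iterdir() if entry.is_file()}
    return [name for name in REQUIRED_DLLS if name.lower() not in present]


if __name__ == "__main__":
    # Hypothetical path: replace with the folder your tyFlow DLO is loaded from.
    plugin_dir = r"C:\path\to\your\plugins"
    missing = missing_physx_dlls(plugin_dir)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Both PhysX CUDA DLLs are present.")
```

An empty result means both files were found next to the DLO. It does not confirm that the DLL versions match your tyFlow build.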

CUDA acceleration does not guarantee faster simulations. Because transferring the necessary data to and from the GPU has a performance cost, speed benefits from CUDA might not be apparent until hundreds or thousands of rigidbody particles are in the simulation. When a simulation has only a few rigidbody particles, enabling CUDA acceleration may actually decrease overall performance.
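The trade-off described above can be illustrated with a simple break-even calculation. This is a sketch under strong assumptions: it treats the GPU transfer cost as a fixed per-step overhead and the per-rigidbody cost as constant on both CPU and GPU, which real simulations will not follow exactly. The function name and all numbers are hypothetical placeholders, not measurements of tyFlow or PhysX.

```python
import math


def break_even_bodies(transfer_overhead_ms, cpu_ms_per_body, gpu_ms_per_body):
    """Smallest rigidbody count at which the modelled GPU step beats the CPU step.

    Assumed linear model (illustrative only):
        cpu_time(n) = n * cpu_ms_per_body
        gpu_time(n) = transfer_overhead_ms + n * gpu_ms_per_body
    Returns None if the GPU is never faster under this model.
    """
    saving_per_body = cpu_ms_per_body - gpu_ms_per_body
    if saving_per_body <= 0:
        return None
    # Smallest integer n such that n * saving_per_body > transfer_overhead_ms.
    return math.floor(transfer_overhead_ms / saving_per_body) + 1


if __name__ == "__main__":
    # Placeholder costs chosen only to show the shape of the trade-off.
    n = break_even_bodies(transfer_overhead_ms=2.0,
                          cpu_ms_per_body=0.004,
                          gpu_ms_per_body=0.001)
    print("Modelled break-even rigidbody count:", n)
```

Below the break-even count the fixed transfer overhead dominates, which is why a scene with only a few rigidbodies can run slower with CUDA enabled. The only reliable way to find the real threshold for a given scene is to time the simulation with the setting on and off.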

  • Memory MB: the amount of GPU memory to allocate for CUDA computations.

Setting the CUDA memory limit higher than the default value does not make the simulation run faster. The memory limit controls the amount of VRAM allocated for constraint/contact processing, and a CUDA simulation generally does not require much VRAM to process all contacts, even if many rigidbodies are present in the simulation. Setting this value very high is usually unnecessary, and can actually contribute to slowdowns at the beginning of the simulation because of the time it takes to initially allocate the VRAM. Having a GPU with a lot of VRAM does not necessarily mean you should increase this setting from its default value. For reference, the default value suggested by NVIDIA is approximately 140 MB.