FAQ: Performance

•  Is tyFlow multi-threaded?

Yes, every part of tyFlow is designed to make use of as many available CPU cores as possible.

•  When simulating, why aren’t all of my available CPU cores at 100% usage?

Every operator within tyFlow is multi-threaded, but not every algorithm uses all cores at all times. Frequent context switching and intermediate data collation can also show up as lower reported CPU usage. To measure the precise benefit a multi-core system is having on simulation time, try reducing the maximum number of threads a flow can use to 1, and compare the time it takes to complete the simulation at that reduced thread count against the time it takes with all threads enabled. More info about setting max threads can be found here.
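The comparison described above comes down to one division. As a minimal sketch in Python, assuming you have noted the two wall-clock times by hand (the function name and the timings are hypothetical, and nothing here calls into tyFlow):

```python
def measured_speedup(single_thread_seconds: float, multi_thread_seconds: float) -> float:
    """Return how many times faster the multi-threaded run finished."""
    if single_thread_seconds <= 0 or multi_thread_seconds <= 0:
        raise ValueError("timings must be positive")
    return single_thread_seconds / multi_thread_seconds


# Hypothetical timings: 480 s with max threads set to 1, 60 s with all threads.
print(f"{measured_speedup(480.0, 60.0):.1f}x faster")  # prints "8.0x faster"
```

A result well below your core count is expected; it only tells you how much the extra cores are contributing for that particular flow.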

•  I have X available cores. Why don’t my simulations run X times faster than using 1 core?

Simulations are both CPU and memory bound. Adding more cores to a system won’t necessarily increase the speed of simulations in a linear fashion. As more cores try to access the same memory, memory bandwidth limits will result in diminishing overall returns as the number of available CPU cores increases. For example, a system with 16 cores may see an 8-10x speedup on a complex simulation over a system with a single core, rather than a full 16x speedup.
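To put the 16-core example into numbers, the short Python sketch below computes parallel efficiency (speedup divided by core count). The function name is hypothetical, and the figures are just the ones quoted above, not new measurements:

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means a full N-times speedup)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup / cores


# The 8-10x range on 16 cores from the example above.
for s in (8.0, 10.0):
    print(f"{s:.0f}x on 16 cores -> {parallel_efficiency(s, 16):.1%} of linear scaling")
# prints:
# 8x on 16 cores -> 50.0% of linear scaling
# 10x on 16 cores -> 62.5% of linear scaling
```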

•  Will tyFlow use my GPU?

If you have an OpenCL 1.2-compatible GPU with applicable drivers installed, tyFlow will make use of it in its Particle Bind Solver (used to simulate cloth/soft-bodies/grains/etc), as well as the Particle Physics operator (used to calculate inter-particle forces). Currently, no other part of tyFlow’s simulation loop is GPU-accelerated.

If your GPU does not have enough memory to store all necessary bindings during a solve, a warning will be printed to the 3ds Max Listener and the solver will fall back to the CPU for its calculations. For simulations with millions of particles, a GPU with at least 8-12 GB of memory is recommended.

The GPU solver is heavily bound by a GPU’s memory bandwidth, and transferring necessary data to and from the GPU can take up a lot of the overall simulation time. Older GPUs with limited bandwidth may be slower than a CPU solve. Users will have to experiment with their own hardware to see if using their GPU actually offers a performance benefit. If it does not, GPU simulation mode can be disabled within the Particle Bind Solver settings.

Currently, tyFlow can only use one GPU at a time. Multiple GPUs are not supported.

If you want to use OpenCL-accelerated operators, make sure you have installed your GPU’s latest drivers. OpenCL acceleration is completely dependent on compatible driver support, so keeping your drivers up to date helps prevent OpenCL acceleration from failing due to a software issue.

If you have an OpenCL-compatible device, but tyFlow is reporting that OpenCL is not found or cannot initialize, a re-install of your GPU drivers may be necessary. Also check your 3ds Max root directory for improper OpenCL.DLL files (back up and then remove any OpenCL.DLL files found which are less than 100 KB in size). Several users have reported that doing so fixed their problems and allowed tyFlow to make use of their OpenCL-compatible device.
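If you would rather not check file sizes by hand, the lookup described above can be sketched in Python. This is only an illustration: the function name is hypothetical, "100 KB" is assumed to mean 100 * 1024 bytes, and the script only lists candidate files. Backing them up and removing them is left to you.

```python
from pathlib import Path

# Assumption: "less than 100 KB" is read as fewer than 100 * 1024 bytes.
SIZE_LIMIT_BYTES = 100 * 1024


def find_suspect_opencl_dlls(max_root: str) -> list[Path]:
    """List OpenCL.DLL files directly inside max_root that are under the size limit."""
    root = Path(max_root)
    return sorted(
        p
        for p in root.iterdir()  # root directory only, no subfolders
        if p.is_file()
        and p.name.lower() == "opencl.dll"  # match regardless of letter case
        and p.stat().st_size < SIZE_LIMIT_BYTES
    )


# Example call (the path is a placeholder; substitute your own 3ds Max root directory):
# for dll in find_suspect_opencl_dlls(r"C:\path\to\3ds Max"):
#     print(dll, dll.stat().st_size, "bytes")
```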

•  Why is tyFlow using so much memory?

Most of tyFlow is designed to use as little memory as possible, but sometimes there is no getting around the fact that a lot of RAM is still required to compute a simulation, depending on its complexity. Even with measures in place to reduce data bloat, millions of particles with complex properties and interactions may still require many gigabytes of RAM to compute, and that’s not even considering the amount of RAM required to store sequences of those particles in memory (if timeline caching is enabled). tyFlow itself makes no effort to monitor available RAM while a simulation is running. This can result in unexpected behavior if system RAM becomes full, so it is a good idea for users to monitor system RAM usage if they believe it may reach maximum capacity during flow evaluation.
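For a rough sense of scale, the back-of-the-envelope arithmetic can be sketched in Python. Everything here is an assumption for illustration: the function name is hypothetical, and the 200 bytes per particle is a made-up placeholder, not a figure published for tyFlow. Real usage depends on which particle properties and interactions your flow uses.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def estimate_particle_bytes(particles: int, bytes_per_particle: int, frames: int = 1) -> int:
    """Rough estimate: particle count x bytes per particle x frames held in memory."""
    if particles < 0 or bytes_per_particle < 0 or frames < 0:
        raise ValueError("inputs must be non-negative")
    return particles * bytes_per_particle * frames


# Placeholder inputs: 5 million particles at an assumed 200 bytes each.
one_frame = estimate_particle_bytes(5_000_000, 200)
cached = estimate_particle_bytes(5_000_000, 200, frames=100)
print(f"one frame:         {one_frame / GIB:.2f} GiB")
print(f"100 cached frames: {cached / GIB:.1f} GiB")
```

The point of the sketch is the multiplication by frame count: a simulation that fits comfortably in RAM for a single frame can exceed it once a long sequence is cached on the timeline.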