GPU Acceleration

Tissue Forge supports modular, runtime-configurable GPU acceleration of a simulation using CUDA. Computational features of Tissue Forge that support GPU acceleration can be configured, offloaded to a GPU, brought back to the CPU and reconfigured at any time during a simulation. For Tissue Forge installations with GPU acceleration enabled, no computations are performed on a GPU by default. Rather, GPU-supporting features of Tissue Forge must be explicitly configured and offloaded to a GPU using their corresponding interactive interface. This modular, configurable approach allows fine-grained control of computations to achieve maximum performance for a given set of hardware and a particular simulation.

For example, suppose a simulation begins with a few hundred particles. Such a simulation would likely not benefit from GPU acceleration (or even run slower on a GPU). However, suppose that over the course of the simulation, hundreds of thousands of particles are created. At some point, this simulation will run faster on a GPU. Tissue Forge easily handles such a situation by allowing the computations of the particle interactions to be offloaded to a GPU mid-execution of the simulation (and brought back to the CPU, should the particle number significantly decrease).
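As a sketch of this workflow, the engine can be deployed or recalled based on the current particle count (the threshold value and the particle-count query used here are illustrative assumptions, not prescribed usage),

import tissue_forge as tf

tf.init()

cuda_config_engine = tf.Simulator.getCUDAConfig().engine  # Get engine cuda runtime interface

GPU_THRESHOLD = 100_000  # Illustrative particle-count threshold, not a recommendation

def update_engine_placement():
    # len(tf.Universe.particles) is an assumed particle-count query; adapt to the installed API
    num_particles = len(tf.Universe.particles)
    if num_particles >= GPU_THRESHOLD and not cuda_config_engine.on_device():
        cuda_config_engine.to_device()    # Large population: offload the engine
    elif num_particles < GPU_THRESHOLD and cuda_config_engine.on_device():
        cuda_config_engine.from_device()  # Small population: bring the engine back

for _ in range(1000):
    update_engine_placement()
    tf.step()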

Deployment on a GPU is best accomplished when running Tissue Forge in windowless mode, since real-time rendering of interactive Tissue Forge simulations also utilizes available GPUs.
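For example, a script intended for GPU offloading might be initialized without a rendering window (the windowless keyword argument shown here is an assumption about the initialization interface),

import tissue_forge as tf

# Initialize without real-time rendering so available GPUs are free for computation
# (windowless initialization keyword assumed for illustration)
tf.init(windowless=True)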

Note

Tissue Forge currently supports acceleration using a single GPU. Future releases will support deploying computations on multiple GPUs by computational feature.

Tissue Forge includes a flag has_cuda to check whether GPU acceleration is supported by the installation (hasCuda in C++),

import tissue_forge as tf

print(tf.has_cuda)  # True if GPU acceleration is installed; False otherwise
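This flag can guard GPU deployment so that the same script runs on installations with and without GPU support,

import tissue_forge as tf

tf.init()

if tf.has_cuda:
    # Only attempt offloading when the installation supports GPU acceleration
    tf.Simulator.getCUDAConfig().engine.to_device()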

GPU-Accelerated Simulator

Simulator provides access to runtime control of GPU-accelerated simulation features. Each GPU-accelerated simulation feature has its own runtime control interface for configuring and deploying on a GPU. GPU runtime control of simulation modules can be accessed directly from Simulator,

cuda_config_sim: tf.cuda.SimulatorConfig = tf.Simulator.getCUDAConfig()

The returned cuda.SimulatorConfig (cuda::SimulatorConfig in C++) provides convenient access to all current GPU-accelerated simulation features.
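For example, the engine, bond and angle runtime interfaces described below can all be reached from one returned object,

cuda_config_sim = tf.Simulator.getCUDAConfig()

cuda_config_engine = cuda_config_sim.engine  # Engine GPU runtime control
cuda_config_bonds = cuda_config_sim.bonds    # Bond GPU runtime control
cuda_config_angles = cuda_config_sim.angles  # Angle GPU runtime control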

GPU-Accelerated Engine

Engine GPU acceleration is a GPU-accelerated simulation feature that offloads nonbonded potential interactions, fluxes, particle sorting and space partitioning onto a GPU. All runtime controls of engine GPU acceleration are available on cuda.EngineConfig (cuda::EngineConfig in C++), which is an attribute with name engine on cuda.SimulatorConfig,

cuda_config_engine = tf.Simulator.getCUDAConfig().engine  # Get engine cuda runtime interface

Engine GPU acceleration can be enabled, disabled and customized during simulation according to hardware capabilities and simulation state,

cuda_config_engine.set_blocks(numBlocks=64)               # Set number of blocks
cuda_config_engine.set_threads(numThreads=32)             # Set number of threads per block
cuda_config_engine.to_device()                            # Send engine to GPU
# Simulation code here...
if cuda_config_engine.on_device():                        # Ensure engine is on GPU
    cuda_config_engine.from_device()                      # Bring engine back from GPU

Setting a number of blocks specifies the maximum number of CUDA thread blocks that can be deployed during a simulation step, which work on various engine tasks (e.g., calculating interactions among particles in a subspace of the simulation space). Setting a number of threads per block specifies the number of threads launched per block to work on each engine task.
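As a sketch (assuming the launch configuration is changed while the engine is off the device), the block and thread counts could be revised as the simulation grows and the engine redeployed; the values shown are illustrative only,

if cuda_config_engine.on_device():
    cuda_config_engine.from_device()          # Bring engine back before reconfiguring
cuda_config_engine.set_blocks(numBlocks=128)  # Illustrative values; tune for the hardware
cuda_config_engine.set_threads(numThreads=64)
cuda_config_engine.to_device()                # Redeploy with the new launch configuration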

Many Tissue Forge operations automatically update data when running on a GPU. However, some operations (e.g., binding a Potential) require manual refreshing of engine data for changes to be reflected when running on a GPU. Engine GPU acceleration runtime control provides methods to explicitly tell Tissue Forge to refresh data on a GPU at various levels of granularity,

cuda_config_engine.refresh_potentials()           # Capture changes to potentials
cuda_config_engine.refresh_fluxes()               # Capture changes to fluxes
cuda_config_engine.refresh_boundary_conditions()  # Capture changes to boundary conditions
cuda_config_engine.refresh()                      # Capture all changes

Refer to the Tissue Forge API Reference for which operations automatically update engine data on a GPU.

Note

It’s not always clear which changes are automatically detected by Tissue Forge when running on a GPU. When in doubt, refresh the data! A refresh comes with additional computational cost, but it only needs to be performed once, after all changes to simulation data have been made and before the next simulation step is called.
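As an example of this pattern, a new potential might be bound while the engine is deployed, with a single refresh issued before the next step (the potential construction, the particle types AType and BType, and the binding call are shown as illustrative assumptions),

# Engine is already on the GPU at this point
pot = tf.Potential.harmonic(k=10.0, r0=1.0)  # Illustrative potential
tf.bind.types(pot, AType, BType)             # AType, BType: assumed particle types

# Batch all changes, then refresh once before the next step
cuda_config_engine.refresh_potentials()
tf.step()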

GPU-Accelerated Bonds

Bond GPU acceleration is a GPU-accelerated simulation feature that offloads bonded interactions onto a GPU. All runtime controls of bond GPU acceleration are available on cuda.BondConfig (cuda::BondConfig in C++), which is an attribute with name bonds on cuda.SimulatorConfig,

cuda_config_bonds = tf.Simulator.getCUDAConfig().bonds    # Get bond cuda runtime interface

The bond GPU acceleration runtime control interface is very similar to that of engine GPU acceleration. Bond GPU acceleration can be enabled, disabled and customized at any point in simulation,

cuda_config_bonds.set_blocks(numBlocks=64)                # Set number of blocks
cuda_config_bonds.set_threads(numThreads=32)              # Set number of threads per block
cuda_config_bonds.to_device()                             # Send bonds to GPU
# Simulation code here...
if cuda_config_bonds.on_device():                         # Ensure bonds are on GPU
    cuda_config_bonds.from_device()                       # Bring bonds back from GPU

Setting a number of blocks specifies the maximum number of CUDA thread blocks that can be deployed during a simulation step, which calculate pairwise forces due to each bond. Setting a number of threads per block specifies the number of threads launched per block to work on the force calculations.

Adding and destroying bonds both automatically update data while running on a GPU. However, changes to bond properties (e.g., half life) and bond potentials require manual refreshing of bond data for changes to be reflected when running on a GPU. Bond GPU acceleration runtime control provides methods to explicitly tell Tissue Forge to refresh data on a GPU at various levels of granularity,

cuda_config_bonds.refresh_bond(bond)    # Capture changes to a bond
cuda_config_bonds.refresh_bonds(bonds)  # Capture changes to multiple bonds
cuda_config_bonds.refresh()             # Capture all changes
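For example, after modifying a property of an existing bond while bonds are deployed, the individual bond can be refreshed before the next step (the bond handle and the half_life attribute name are assumptions for illustration),

# bond: an existing handle to a bond that has already been sent to the GPU
bond.half_life = 10.0                 # Assumed property name, for illustration
cuda_config_bonds.refresh_bond(bond)  # Push the change to the GPU before the next step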

Angle GPU acceleration is a similar GPU-accelerated simulation feature that offloads angle interactions onto a GPU. Its runtime control interface is practically identical to that of bond GPU acceleration (e.g., refresh_angles for angle GPU acceleration is analogous to refresh_bonds for bond GPU acceleration) and is accessible on cuda.AngleConfig (cuda::AngleConfig in C++), which is an attribute with name angles on cuda.SimulatorConfig,

cuda_config_angles = tf.Simulator.getCUDAConfig().angles  # Get angle cuda runtime interface
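A deployment sketch mirrors the bond interface, assuming the analogous method names carry over as described,

cuda_config_angles.set_blocks(numBlocks=64)    # Set number of blocks
cuda_config_angles.set_threads(numThreads=32)  # Set number of threads per block
cuda_config_angles.to_device()                 # Send angles to GPU
# Simulation code here...
if cuda_config_angles.on_device():             # Ensure angles are on GPU
    cuda_config_angles.from_device()           # Bring angles back from GPU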

Refer to the Tissue Forge API Reference for which operations automatically update bond and angle data on a GPU.