Speeding up LIGGGHTS using a MPI/OpenMP hybrid parallelization
IV International Conference on Particle-based Methods (Particles 2015)
LIGGGHTS uses domain decomposition and message passing (MPI) to scale across many hundreds of processing cores. Because load imbalance is a common problem in granular simulations, the ability to dynamically adjust subdomain boundaries along the coordinate axes was added. However, in many simulation domains heterogeneous particle distributions make finding a good domain decomposition non-trivial, and the balance that can be achieved this way is limited. To better tackle load imbalance, a hybrid parallelization using both MPI and OpenMP has been developed. MPI domain decomposition is still used to generate multiple subdomains, which allows the workload to be distributed among multiple compute nodes. Each node can then be fully utilized using OpenMP threads. Threaded versions of all major computational steps, including particle-particle and particle-wall interactions, were created; they achieve load balancing through various means, for example by using the recursive coordinate bisection (RCB) partitioning algorithm as implemented in existing libraries.
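To make the idea behind RCB concrete, the following is a minimal, self-contained C++ sketch. It is not the LIGGGHTS implementation, which according to the abstract relies on existing libraries. The names `Particle`, `rcb`, and `rcb_partition` are hypothetical and chosen only for illustration, and the sketch assumes that every particle carries the same cost, so balancing particle counts stands in for balancing work. The particles are split recursively at a median along the longest axis of their bounding box until the requested number of groups is reached.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle record for this sketch: position only.
struct Particle {
    std::array<double, 3> x;
};

// Recursively bisect particles[begin, end) into `parts` groups of
// near-equal size. Particles are reordered in place; owner[i] receives
// the group index of the particle that ends up at position i.
void rcb(std::vector<Particle>& p, std::size_t begin, std::size_t end,
         int first_part, int parts, std::vector<int>& owner) {
    if (parts <= 1 || end - begin <= 1) {
        for (std::size_t i = begin; i < end; ++i) owner[i] = first_part;
        return;
    }

    // Bounding box of the current group.
    std::array<double, 3> lo = p[begin].x;
    std::array<double, 3> hi = p[begin].x;
    for (std::size_t i = begin + 1; i < end; ++i) {
        for (int d = 0; d < 3; ++d) {
            lo[d] = std::min(lo[d], p[i].x[d]);
            hi[d] = std::max(hi[d], p[i].x[d]);
        }
    }

    // Cut perpendicular to the longest axis.
    int axis = 0;
    for (int d = 1; d < 3; ++d) {
        if (hi[d] - lo[d] > hi[axis] - lo[axis]) axis = d;
    }

    // Split the particle count in proportion to the number of groups
    // on each side, so odd group counts are handled as well.
    const int left_parts = parts / 2;
    const std::size_t mid = begin + (end - begin) * left_parts / parts;

    const auto first = p.begin() + static_cast<std::ptrdiff_t>(begin);
    const auto nth = p.begin() + static_cast<std::ptrdiff_t>(mid);
    const auto last = p.begin() + static_cast<std::ptrdiff_t>(end);
    std::nth_element(first, nth, last,
                     [axis](const Particle& a, const Particle& b) {
                         return a.x[axis] < b.x[axis];
                     });

    rcb(p, begin, mid, first_part, left_parts, owner);
    rcb(p, mid, end, first_part + left_parts, parts - left_parts, owner);
}

// Convenience wrapper: returns one group index per (reordered) particle.
std::vector<int> rcb_partition(std::vector<Particle>& particles, int parts) {
    assert(parts >= 1);
    std::vector<int> owner(particles.size(), 0);
    rcb(particles, 0, particles.size(), 0, parts, owner);
    return owner;
}
```

In a hybrid setting, one plausible use (again an assumption, not a statement about LIGGGHTS internals) is to call `rcb_partition` on the particles of one MPI subdomain with `parts` equal to the number of OpenMP threads, and let each thread process the particles of its own group. Because the cut positions follow the particle distribution rather than a fixed grid, strongly clustered particles still end up in groups of nearly equal size.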
The benefit of the hybrid approach is the added flexibility of multiple layers of decomposition: one layer maps to the physical layout of the computing hardware, while another focuses on balancing the workload among compute resources. So far, these improvements have led to a performance gain of about 44% compared to an MPI-only parallelization in typical test cases.