Automated Hybrid Variance Reduction on Advanced Architectures in the Shift Monte Carlo Code

Thomas M. Evans; Katherine E. Royston; Steven P. Hamilton; Gregory G. Davidson; Elliott Biondo; Seth R. Johnson

doi:dx.doi.org/10.1080/00295639.2025.2484511

Monte Carlo transport methods are the most accurate schemes for solving problems with complex energy and spatial features, but they come with a high computational cost. Although hybrid methods have enabled the use of Monte Carlo transport for a large class of problems, they still require significant computing resources. Modern multicore CPUs with large core counts and graphics processing units (GPUs) both provide opportunities to optimize the memory and run-time costs of hybrid Monte Carlo methods.

This paper documents the development and analysis of three Monte Carlo transport algorithms in the Shift Monte Carlo code that support hybrid transport using the consistent adjoint-driven importance sampling (CADIS) and forward-weighted CADIS (FW-CADIS) methods. Two are history-based algorithms for multicore CPUs, one using static threading and one using dynamic threading; the third is an event-based algorithm that enables weight window tracking on GPUs.
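As background for the weight window tracking mentioned above, the following is a minimal sketch of a generic weight-window check (splitting above the window, Russian roulette below it). It is not taken from Shift. The function name, the `max_split` cap, and the choice of the window midpoint as the survival weight are all assumptions made for illustration. In CADIS-type methods the window bounds would be derived from a deterministic adjoint solution; here they are simply passed in.

```python
import math
import random


def apply_weight_window(weight, w_low, w_high, rng=random.random, max_split=10):
    """Return the weights of the particles that continue after the check.

    Hypothetical helper, not Shift's implementation.
    - Above the window: split into n equal-weight copies (total weight kept).
    - Below the window: Russian roulette toward a survival weight
      (expected weight kept).
    - Inside the window: particle is unchanged.
    """
    if w_low <= 0 or w_high < w_low:
        raise ValueError("require 0 < w_low <= w_high")

    if weight > w_high:
        # Split; cap the number of copies to avoid runaway splitting.
        n = min(max_split, math.ceil(weight / w_high))
        return [weight / n] * n

    if weight < w_low:
        # Assumption: survival weight is the midpoint of the window.
        w_survive = 0.5 * (w_low + w_high)
        if rng() < weight / w_survive:
            return [w_survive]
        return []  # particle killed

    return [weight]
```

Both branches preserve the expected total weight, which is what keeps the game unbiased: splitting conserves weight exactly, and roulette survives with probability `weight / w_survive` at weight `w_survive`.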

Results are presented for two challenging hybrid problems run on the Frontier supercomputer at the Oak Ridge Leadership Computing Facility. All three methods yield good performance and enable solutions of difficult fixed-source transport problems in less than 2 min on 20 nodes of Frontier. Dynamic threading was observed to give up to 20% better scaling behavior than static threading. Moreover, the AMD Instinct MI250X GPU was found to give 9 to 11 times greater throughput per graphics compute die than the best CPU performance. Additional opportunities for optimization of hybrid transport on GPUs are discussed.
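To illustrate why dynamic threading can scale better than static threading when history costs vary widely, as they do once particles are split by weight windows, here is a toy load-balance model. It is a sketch under stated assumptions, not a description of Shift's threading: the function names are hypothetical, "static" is modeled as contiguous equal-count chunks per thread, and "dynamic" as handing each history to the first thread that becomes free.

```python
import heapq


def static_makespan(costs, n_threads):
    """Finish time when histories are pre-assigned in contiguous,
    equal-count chunks (toy model of static threading)."""
    if not costs:
        return 0.0
    chunk = -(-len(costs) // n_threads)  # ceiling division
    return max(sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk))


def dynamic_makespan(costs, n_threads):
    """Finish time when each history goes to whichever thread is free
    first (toy model of dynamic threading)."""
    free_at = [0.0] * n_threads  # time at which each thread becomes idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)  # earliest-available thread
        heapq.heappush(free_at, start + cost)
    return max(free_at)


# Two expensive histories followed by six cheap ones, on four threads.
costs = [8, 8, 1, 1, 1, 1, 1, 1]
print(static_makespan(costs, 4), dynamic_makespan(costs, 4))
```

In this contrived case the static schedule puts both expensive histories on one thread and finishes at time 16, whereas the dynamic schedule finishes at time 8. The size of the gap is an artifact of the chosen costs and says nothing about the measured 20% figure.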