Milan Hanus, Yunhuang Zhang, Marvin Adams, Nancy Amato, Jim Morel, Lawrence Rauchwerger

*Project Alumni:*
Daryl Hawkins, Timmie Smith, Nathan Thomas, Anthony Barbu, Tarek Ghaddar, Jonathan Madsen, Carolyn McGraw, Michael Adams, Ping An, Teresa Bailey, Jae Chang, Tao Huang, Josh Jarrell, Alin Jula, Alex Maslowski, Mark Mathis, Olga Pearce, Silvius Rus, Armando Solar, Lidia Smith, Gabriel Tanase, Mauro Bianco

The overall goal of the project is the development of strategies that will produce the desired discrete-ordinates transport solutions in the lowest possible wall-clock time on computational platforms of interest to the DOE's Accelerated Strategic Computing Initiative (ASCI). The project began in 1998 with the best known sequential algorithms for solving the discrete-ordinates transport problem as starting points for devising algorithms and coding strategies to yield the best possible performance on modern parallel cache-based systems. To date the project has produced a transport code (PDT) that is capable of solving steady-state discrete-ordinates problems using regular hexahedral grids and multiple transport algorithms and acceleration methods. Work to allow the code to handle arbitrary grids and to solve time-dependent transport problems is in progress.

Problem Statement

The deterministic transport problem solved by PDT can be described briefly as:

**Given**:

- A domain in an N-dimensional space made of known materials.
- An initial flow of particles through the domain at a starting time.
- An initial set of sources generating particles inside the domain.
- Knowledge about the behavior at the domain boundary.

**Compute**:
The flow of particles at a subsequent time at every point in the domain.

Code Overview

PDT is written in C++ using STAPL. The code represents the problem as a set of objects described below. The main function of the application uses the constructors and initialization methods of the objects to setup the problem and then calls the solve method of the problem class.

**Generic Problem**

We represent a problem as a system containing:

- A spatial discretization (Grid).
- A material repository.
- An energy and angular discretization system.
- A solver.

**Generic Grid and ElementMap**

The Grid object represents the topology of the given spatial discretization. It is made of cells.

In order to solve the problem we introduced a parallel topology based on computational elements rather than spatial ones. That is the ElementMap. It is made of elements.

**Parallel Solver**

We have already introduced the concept of ElementMap, which was created to ease the abstraction of solving the problem. There are other constructs that ease the development of solvers for parallel machines:

- The Chunk
- The Executor

**Generic Chunk**

The Chunk is composed of:

- A set of spatial cells
- A set of energy groups
- A set of angles

It is the computation unit for sweeping algorithms. A chunk is the atomic unit of work in the sweep. For distributed memory systems the chunks are also communications atoms since messages carrying information to cells on different processors are buffered until all cells in the set are processed..

**Partitioner and Scheduler**

The Partitioner reads the problem size from the input file and determines the assignment of cells to each thread of execution. The Scheduler accepts a set of dependence graphs on Cells as input and aggregates the cells into cell sets. The Scheduler also produces dependence graphs based on cell sets and angle sets.

**Generic Executor**

The executor is implemented as a pAlgorithm in
STAPL named p_for_all.
The algorithm takes as input a set of dependency graphs on Chunks (which are
represented using the STAPL pRange) and a generic function to be executed on
every Chunk. It manages parallel execution by determining the next
available Chunk to process on each thread based on the dependence graphs
given. On distributed memory systems it manages the communication too
by forcing any buffered messages to be sent after each call of the work
function given as input.

Presentations

Researchers at Lawrence Livermore National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratory are working on deterministic transport calculations. In order to avoid duplication of effort, foster collaboration and cross-fertilization, and expose students and researchers to one another, there has been a series of workshops held at Texas A&M University. At each of the workshops researchers from the labs and students working on the project would present their work on their respective projects.

Labfest 3, May 2001

Labfest 4, November 2003

Labfest 5, April 2004

Labfest 6, May 2005

Papers

**Validation of Full-Domain Massively Parallel Transport Sweep Algorithms**, W Hawkins, Marvin Adams, Michael Adams, Timmie Smith, Nancy Amato, Lawrence Rauchwerger, Teresa Bailey, Peter Brown, Adam Kunen, In *Trans. Amer. Nucl. Soc.*, pp. 699-792, 2014.

Proceedings(pdf, abstract)

**Provably Optimal Parallel Transport Sweeps on Regular Grids**, W Hawkins, Timmie Smith, Michael Adams, Lawrence Rauchwerger, Nancy Amato, Marvin Adams, Teresa Bailey, Robert Falgout, In *Proc. Int. Conf. on
Math. Meth. and Supercomp. for Nuc. App.*, Idaho, May 2013.

Proceedings(pdf, abstract)

**Efficient Massively Parallel Transport Sweeps**, W Hawkins, Timmie Smith, Michael Adams, Lawrence Rauchwerger, Nancy Amato, Marvin Adams, *Trans. Amer. Nucl. Soc.*, 107(1):477-481, Nov 2012.

Journal(pdf)

**A Performance Model of non-Deterministic Particle Transport on Large-Scale Systems**, Mark M. Mathis, Darren J. Kerbyson, Adolfy Hoisie, *Future Generation Computer Systems*, 22(3):324-335, Feb 2006. Also, In *Proc. Int. Conf. on
Computational Science (ICCS)*, Melbourne, Australia, Jun 2003.

Journal(pdf, abstract) Proceedings(ps, pdf, ppt, abstract)

**A General Performance Model of Structured and Unstructured Mesh Particle Transport Computations**, Mark M. Mathis, Darren J. Kerbyson, *Journal of Supercomputing*, 34(2):181 - 199, Nov 2005.

Journal(pdf, abstract)

**Performance Modeling of Unstructured Mesh Particle Transport Computations**, Mark M. Mathis, Darren J. Kerbyson, In *Proc. Int. Par. and Dist.
Proc. Symp. (IPDPS)*, Santa Fe, NM, Apr 2004.

Proceedings(ps, pdf, ppt, abstract)

**Finding Strongly Connected Components in Parallel in Particle Transport Sweeps**, William McLendon, Bruce Hendrickson, Steve Plimpton, Lawrence Rauchwerger, In *Proc. ACM Symp. Par. Alg.
Arch. (SPAA)*, pp. 328-329, Crete, Greece, Jul 2001.

Proceedings(ps, pdf, abstract)

**A General Performance Model for Parallel Sweeps on Orthogonal Grids for Particle Transport Calculations**, Mark M. Mathis, Masters Thesis, Department of Computer Science and Engineering, Texas A&M University, Dec 2000.

Masters Thesis(ps, pdf, abstract)

**A General Performance Model for Parallel Sweeps on Orthogonal Grids for Particle Transport Calculations**, Mark M. Mathis, Nancy M. Amato, Marvin Adams, In *Proc. ACM Int. Conf.
Supercomputing (ICS)*, pp. 255-263, Santa Fe, NM, May 2000. Also, Technical Report, TR00-004, Parasol Laboratory, Department of Computer Science, Texas A&M University, Dec 1999.

Proceedings(ps, pdf, abstract) Technical Report(ps, pdf, abstract)

**Task Scheduling and Parallel Mesh-Sweeps in Transport Computations**, Nancy M. Amato, Ping An, Technical Report, TR00-009, Department of Computer Science and Engineering, Texas A&M University, Jan 2000.

Technical Report(ps, pdf)