Since joining ICL in 2019, I have worked on two projects within the Exascale Computing Project (ECP): CEED and CLOVER.

For CEED, I primarily worked on the libCEED library, leading the effort to port its CUDA backends to HIP for AMD GPUs and contributing to other GPU-related development in the library. For CLOVER, I improved and expanded the MFEM finite element library's interface to Ginkgo, enabling minimal-overhead interoperability between Ginkgo's solvers and preconditioners and those included in MFEM, as well as with MFEM's matrix-free operators. (The podcast episode listed below discusses this interface in more detail.) I have also worked on porting the MAGMA linear algebra library to SYCL for Intel GPUs.
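Both libraries are C++, and the heart of such an interface is a thin wrapper that presents one library's operator action to the other's solvers without copying data. As a language-agnostic sketch of that pattern (using SciPy's LinearOperator in place of Ginkgo's LinOp abstraction, and a hand-rolled stencil in place of an MFEM matrix-free operator; none of this is the actual interface code):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 100

def apply_operator(x):
    # Hypothetical matrix-free action: a 1D Laplacian stencil,
    # standing in for something like an MFEM operator's Mult().
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

# Wrap the action so a generic Krylov solver never needs an
# assembled matrix -- the same role a Ginkgo-side wrapper plays.
A = LinearOperator((n, n), matvec=apply_operator, dtype=np.float64)

b = np.ones(n)
x, info = cg(A, b)
print("converged:", info == 0,
      " residual:", np.linalg.norm(apply_operator(x) - b))
```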

For both projects, I have been involved in efforts to investigate mixed-precision acceleration of simulations on GPUs. The following list gives an overview of the kind of work I have done through ECP.
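One recurring pattern in this kind of mixed-precision work is iterative refinement: the expensive factorization or solve runs in low precision, and cheap residual corrections in high precision recover full accuracy. A minimal dense sketch of the idea (a generic illustration, not code from either project):

```python
# Mixed-precision iterative refinement: factorize once in float32,
# then recover float64 accuracy with cheap residual updates.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well conditioned
b = rng.standard_normal(n)

# Expensive step in low precision (on a GPU, this is the fast path).
lu, piv = lu_factor(A.astype(np.float32))

x = np.zeros(n)
for it in range(10):
    r = b - A @ x                       # residual in float64
    dx = lu_solve((lu, piv), r.astype(np.float32)).astype(np.float64)
    x += dx
    if np.linalg.norm(r) <= 1e-12 * np.linalg.norm(b):
        break

print(f"iterations: {it + 1}, relative residual: "
      f"{np.linalg.norm(b - A @ x) / np.linalg.norm(b):.2e}")
```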



Work with my postdoc advisor, Adrianna Gillman.

The Hierarchical Poincaré-Steklov (HPS) method is a discretization technique based on domain decomposition and classical spectral collocation methods. It is accurate and robust, even for highly oscillatory solutions. An associated fast direct solver is built by hierarchically merging Poincaré-Steklov operators on box boundaries and storing the resulting operators; the solve stage then applies these stored operators in a series of small matrix-vector multiplications.
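At its core, each merge is a Schur complement: eliminating the unknowns shared by two boxes yields the operator for their union, and repeating this up the tree builds the direct solver. The following toy sketch (a dense 2D Laplacian with one level of merging, not the actual HPS spectral discretization) checks that merging hierarchically reproduces the boundary operator obtained by eliminating all interior unknowns at once:

```python
import numpy as np

m = 8                                   # grid is m x m
N = m * m
idx = np.arange(N).reshape(m, m)

# 5-point Laplacian stencil as a dense matrix.
A = 4.0 * np.eye(N)
for i in range(m):
    for j in range(m):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < m and 0 <= jj < m:
                A[idx[i, j], idx[ii, jj]] = -1.0

# Index sets: outer boundary B, left/right box interiors I1/I2,
# and the shared interface column I3 between the two boxes.
interior = idx[1:-1, 1:-1]
mid = m // 2
B = np.setdiff1d(np.arange(N), interior.ravel())
I1 = interior[:, : mid - 1].ravel()
I3 = interior[:, mid - 1].ravel()
I2 = interior[:, mid:].ravel()

def schur(M, keep, elim):
    # Eliminate the 'elim' unknowns from M, keeping 'keep'.
    K = M[np.ix_(keep, keep)]
    E = M[np.ix_(elim, elim)]
    return K - M[np.ix_(keep, elim)] @ np.linalg.solve(
        E, M[np.ix_(elim, keep)])

# One-shot elimination of every interior unknown.
S_direct = schur(A, B, np.concatenate([I1, I2, I3]))

# Hierarchical version: eliminate each child box's interior first,
# then "merge" by eliminating the shared interface.
keep1 = np.concatenate([B, I3])
S_children = schur(A, keep1, np.concatenate([I1, I2]))
S_merged = schur(S_children, np.arange(len(B)),
                 np.arange(len(B), len(keep1)))

print("max difference:", np.abs(S_direct - S_merged).max())  # ~1e-15
```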

We worked on parallelizing, optimizing, and accelerating the algorithm, especially the build stage. Our 2D shared-memory parallelization reduced the build-stage time from approximately 10 minutes to approximately 30 seconds for over 3 million unknowns.

Figure: Speedup achieved in the build and solve portions of the algorithm with 56 available threads.


We developed and implemented a new, high-order finite element–integral equation (FE-IE) coupling method for elliptic interface problems. The method leverages the complementary strengths of FE and IE methods to handle different aspects of the problem. It accommodates general jump conditions at the interface, and the jump conditions appear only in the right-hand side of the system to be solved. Additionally, the method does not suffer from loss of convergence order near the boundary when the boundary is smooth and the problem data can be extended as necessary across the interface. Finally, it requires no construction or use of special basis functions; in fact, much of the standard machinery for both the FE and IE solvers can be reused, simplifying implementation from a software standpoint.
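For concreteness, a model problem of this type (the notation here is mine, not necessarily the paper's) is a Poisson problem with prescribed jumps across an interface Γ contained in the domain Ω:

```latex
\begin{aligned}
  \Delta u &= f && \text{in } \Omega \setminus \Gamma,\\
  [u] &= g_D && \text{on } \Gamma,\\
  [\partial u/\partial n] &= g_N && \text{on } \Gamma,\\
  u &= u_b && \text{on } \partial\Omega,
\end{aligned}
```

where [ · ] denotes the jump across Γ. Because g_D and g_N enter only the right-hand side, changing the jump data means re-solving with a new right-hand side rather than rebuilding the system.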

Figure: An example interface problem with jump conditions along a starfish-shaped interface.


The goal of this project was to create an FE-based version of the particle-particle–particle-mesh (P3M) method for N-body problems. P3M methods separate particle interactions into a short-range part (required only for near neighbors) and a far-field part (smooth, and easily solved on a mesh with your favorite numerical method). A common choice is to achieve the P-P/P-M splitting with Gaussian screen functions (as in the classic Ewald sum) and to solve the mesh problem with FFTs.
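For Gaussian screens, the splitting rests on the identity 1/r = erfc(αr)/r + erf(αr)/r: the erfc term decays rapidly (the particle-particle part), while the erf term, which is the potential of the Gaussian screen itself, is smooth everywhere (the particle-mesh part). A quick numerical check:

```python
# Ewald splitting of the Coulomb kernel: 1/r = erfc(a r)/r + erf(a r)/r.
# The erfc term is short ranged (P-P part); the erf term is smooth
# and handled on the mesh (P-M part).
import numpy as np
from scipy.special import erf, erfc

alpha = 2.0
r = np.linspace(0.1, 5.0, 50)

short_range = erfc(alpha * r) / r
far_field = erf(alpha * r) / r

assert np.allclose(short_range + far_field, 1.0 / r)
print("short-range part at r = 3:", erfc(alpha * 3.0) / 3.0)  # negligible
```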

We developed a method which constructs special polynomial screen functions out of finite element basis functions. Thus the screens are represented exactly on the finite element mesh, removing several possible sources of error in a finite element solution to the mesh problem. Indeed, we aimed to make the mesh problem especially suited to being solved with finite elements, avoiding the geometry restrictions and parallel communication burdens of the FFT.
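One payoff of compactly supported screens is that, for a radially symmetric screen, the potential outside the screen's support matches the point charge exactly, so the short-range sum truncates with no splitting error, unlike a Gaussian's infinite tail. A toy radial illustration (a hypothetical bump-function screen, not the actual FE construction from our method):

```python
# Toy illustration: a compactly supported radial screen density
# rho(r) ~ (1 - (r/R)^2)^2 for r < R, normalized to unit charge.
# By the shell theorem its potential equals 1/r exactly for r > R,
# so the short-range (point minus screen) part vanishes beyond R.
import numpy as np
from scipy.integrate import quad

R = 1.0
# Normalize so the total charge (integral of 4 pi s^2 rho) equals 1.
norm, _ = quad(lambda s: 4 * np.pi * s**2 * (1 - (s / R) ** 2) ** 2, 0, R)
rho = lambda s: (1 - (s / R) ** 2) ** 2 / norm

def potential(r):
    # phi(r) = (charge inside r) / r + integral of 4 pi s rho(s) beyond r
    inner, _ = quad(lambda s: 4 * np.pi * s**2 * rho(s), 0, min(r, R))
    outer, _ = quad(lambda s: 4 * np.pi * s * rho(s), min(r, R), R)
    return inner / r + outer

for r in (0.5, 1.0, 2.0, 4.0):
    print(f"r = {r}: screen potential = {potential(r):.6f}, 1/r = {1/r:.6f}")
# For r >= R the two columns agree: the short-range correction is zero.
```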

Figure: Polynomial screens formed from basis functions of increasing order, for a charge located at the starred position.