CMU 15-418/618 (Spring 2013) Final Project Proposal

Project Proposal:

Molecular Dynamics Simulations on a GPU

Summary

I plan to create a molecular dynamics simulation. The computation will be done on a GPU using OpenCL to write the kernel. It will be displayed using OpenGL. I plan on exploring multiple strategies and comparing the performance. I will vary the strategy for distributing work and possibly the method for approximating potentials.

Background

Simulations of physical systems in chemistry and physics are computationally expensive problems. The complexity of these problems scale asymptotically anywhere from $O(n^2)$ to $O(n^6)$. Further, the simulations only yield interesting results for large inputs and accurate results for a timestep on the order of femtoseconds. Therefore, designing fast algorithms is a necessity. Fortunately, these problems have reasonable opportunities for parallelism. To my knowledge, comparatively little work in this field has been done based on the GPU. It has generally been centered around supercomputers and large clusters using message passing and SIMD on a CPU. Therefore, my project will investigate the viability of the GPU as a platform for these computations.

Molecular dynamics simulations are based around Newton's laws of motion: update force, then update position, repeat. The interesting part of this is calculating the force. Here, there is a huge tradeoff between accuracy and speed. For my simple calculation, I'll being using the Lenndard-Jones potential[2]. If I can, I would like to compare this to the ab initio Hartree-Fock method[4]. This will tax my knowledge of quantum mechanics far more than my knowledge of parallel computation, which is why I'm not initially committing to it. Since varying the strategy for parallelizing the calculation is more relevant, that will be my primary focus. However, since Hartree-Fock would have a very different arithmetic intensity, it may have an impact on this strategy.

Challenge

The challenges of this task vary considerably from the existing CPU-based solutions mentioned above. For example, I won't have to worry about efficient message passing. On the other hand, I expect the challenge of my approach is centered on the difficultly of organizing the work of the computation to efficiently utilize the GPU. The major step of the computation involves populating a sparse matrix of forces generated from potentials. Since most of these are not evaluated, effectively organizing the computation will be extremely challenging, since just evaluating each element in the matrix would be prohibitively wasteful. In this regard, it is much more difficult than the CPU-based approach. For these reason, I plan on implementing and fine-tuning multiple strategies.

Resources

I will be using the assignment 2 code as a jumping off point for the OpenGL code. Other than that, I plan on pretty much coding the rest from scratch. I may need a linear algebra library, but I haven't looked into that yet.

I considered finding an existing code base to work with. However, these tend to be huge packages that do much more than I need (thus are difficult to become familiar with quickly) and involve far more Fortran than I care to go near.

Goals/Deliverables

The Basics:

I plan on implementing 3 kernel assignment strategies, based on atom, force, and spatial decomposition[1].
Use Lennard-Jones to calculate the potential[2][3].
Display the 2D simulation in meaningful way. If possible, show the grouping.
The program should be sufficiently fast to run a meaningful simulation in a short amount of time. I don't have a great target, but let's say 1000s of atoms for 1000s of timesteps on the order of minutes.

Extras:

Add a quantum mechanical approximation for the potential (Hartree-Fock[4]).
Extend to 3D (if this turns out to be sensible to just do from the beginning, then I will).
Port code to WebGL/WebCL, just because it would be cool to have in a browser instead (a distant last goal, since it's not terribly relevant).

Platform

The project will be done on a GPU using OpenCL. I will probably do most of the development on my computer, running a GTX 560 Ti. for the sake of compaision, I will also do some runs on the various GPUs available on GHC machines.

Proposed Schedule

Week	What We Plan To Do
Apr 1-7	Research the basic principles and methods for MD.
Apr 8-14	Choose specific algorithms. Put together a MWE of an OpenCL/OpenGL program that just does random particle motion.
Apr 15-21	Implement the calculation of potentials based on Lennard-Jones. Use a really dumb assignment scheme to run the kernel just to verify that things are working.
Apr 22-28	Add new kernel assignment strategies. Compare the atom, force, and spatial approach to decomposition[1].
Apr 29-May 5	Fine tune performance. Determine whether I know QM well enough to try Hartree-Fock.
May 6-11	Work on "Extras." This depends on whether I think I can pull off Hartree-Fock.

References

Steve Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, Journal of Computational Physics, Volume 117, Issue 1, 1 March 1995, Pages 1-19, ISSN 0021-9991, 10.1006/jcph.1995.1039. (http://www.sciencedirect.com/science/article/pii/S002199918571039X)
http://en.wikipedia.org/wiki/Lennard-Jones_potential
Molecular Dynamics Simulations
http://en.wikipedia.org/wiki/Hartree%E2%80%93Fock_method