Compiling with CUDA support
---------------------------

These tools are intended to allow CP2K to perform portions of the calculation on NVIDIA graphics cards using the CUDA API.  In order to use these features you need a CUDA compatible graphics card.  In addition, you must download and install the CUDA toolkit, CUDA SDK, and a CUDA compatible NVIDIA driver.  These can be obtained from NVIDIA's website.

To compile CP2K with CUDA support, three modifications must be made to the arch file.  An example is provided in arch/Linux-x86-64-cuda.sopt.  The necessary modifications are:

1)  Add a line "NVCC = nvcc" pointing the makefile to the nvcc compiler.

2)  Add -D__CUDA and -D__FFTSGL to the DFLAGS and NVFLAGS environmental variables

3)  Add libcudart.so, libcufft.so, and cublas to the LIBS variable.

Then, compile as normal.

Alternatively one can comile with support for running DGEMM and DSYMM on the gpu, but nothing else.  This is accomplised as above, but instead of __CUDA and __FFTSGL one should include __CUBLASDP.

Features
--------
At the moment, the only portion of the calculation which can be performed on the graphics card is the FFT, the associated scatter/gather operations, and some linear algebra.  Because NVIDIA graphics cards only reach full performance for single precision floating point arithmetic, the FFT must be performed in single precision (hence the -D__FFTSGL option.)

