Scientific Computing on GPUs
The Graphics Processing Unit (GPU) is increasingly used for scientific computing. For a relatively low cost one can obtain supercomputer performance (on the order of 1 Teraflop).
Some work has to be done, however, to make an ordinary program suitable for the GPU. An important tool is CUDA (Compute Unified Device Architecture), an extension of the C programming language that makes it relatively easy to program the GPU. Furthermore, in many cases algorithms have to be adapted to make them suitable for GPU computing. Finally, optimization of the algorithm, the implementation, and the use of the GPU's memory hierarchy is needed to obtain really high speedups.
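To give a flavour of the CUDA programming model, the sketch below (an illustrative example, not taken from our course material) adds two vectors on the GPU; the block size and the use of unified memory are just convenient choices for a short example:

```cuda
#include <cstdio>

// CUDA kernel: each thread computes one element c[i] = a[i] + b[i].
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];   // guard against threads past the array end
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory keeps the example short; explicit cudaMemcpy is also common.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover n elements
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();                    // wait for the kernel to finish

    printf("c[0] = %f\n", c[0]);

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The `<<<blocks, threads>>>` launch syntax is the CUDA extension to C referred to above: it maps the loop over array elements onto many lightweight GPU threads.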
Below are some items that are important for the Numerical Analysis Group of TU Delft.
GPU Teaching
Delft University of Technology is a recognized NVIDIA GPU Education Center.
We teach the course "Introduction/advanced course Programming on the GPU with CUDA" a number of times.
The teachers are Prof.dr.ir. C. Vuik, Ir. C.W.J. Lemmens, and Dr. M. Möller.
The course will next be given on January 21 and 22, 2016 (registration). Please consult the flyer for more details.
Schedule
2015
- 23/24-11-2015 Advanced course Programming on the GPU with CUDA, TU Delft
- 22/23-06-2015 Advanced course Programming on the GPU with CUDA, TU Delft
- 12/13-02-2015 Advanced course Programming on the GPU with CUDA, TU Delft
2014
- 25/26-11-2014 Advanced course Programming on the GPU with CUDA, TU Delft
- 25/26-9-2014 Advanced course Programming on the GPU with CUDA, TU Delft
- 5-6-2014 Introduction course Programming on the GPU with CUDA, TU Delft
- 24-1-2014 Introduction course Programming on the GPU with CUDA, TU Delft
2013
- 13-9-2013 Introduction course Programming on the GPU with CUDA, TU Delft
- 26-4-2013 Introduction course Programming on the GPU with CUDA, TU Delft
- 31-1-2013 Introduction course Programming on the GPU with CUDA, TU Delft
GPU Research
The core of our research is how to devise and implement algorithms that solve systems of discretized partial differential equations efficiently. Below we list some of the work that has been done and some new Bachelor's, Master's, and PhD thesis projects.
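As an illustration of what "solving discretized partial differential equations on the GPU" involves (this is a generic textbook example, not code from the projects below), a kernel applying the standard 5-point Laplacian of a 2D Poisson problem can compute y = Ax without ever storing the matrix A:

```cuda
// 5-point Laplacian stencil on an n x n grid with homogeneous Dirichlet
// boundary conditions: y = A*x, matrix-free. One thread per grid point.
__global__ void laplace2d(int n, const double *x, double *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // column index
    int j = blockIdx.y * blockDim.y + threadIdx.y;   // row index
    if (i < n && j < n) {
        int k = j * n + i;                 // lexicographic unknown number
        double v = 4.0 * x[k];             // diagonal entry
        if (i > 0)     v -= x[k - 1];      // west neighbour
        if (i < n - 1) v -= x[k + 1];      // east neighbour
        if (j > 0)     v -= x[k - n];      // south neighbour
        if (j < n - 1) v -= x[k + n];      // north neighbour
        y[k] = v;
    }
}
```

Because every grid point is updated independently, such stencil operations map naturally onto the thousands of threads a GPU provides; the harder research questions concern the preconditioners and solvers built on top of them.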
Bachelor Projects
Master Projects
PhD Projects
- GPU accelerated iterative solvers for discretized partial differential equations (Rohit Gupta, start September 2010)
A presentation and slides of this work were given at the GPU Technology Conference 2012, May 14-17, 2012, San Jose, California, USA.
- On a GPU implementation of shifted Laplace preconditioned solvers for the Helmholtz equation
(Hans Knibbe, start September 2009). He obtained one of the first results on the LGM. A paper has also appeared, and a presentation was given at the ENUMATH Conference 2011. A presentation and slides of this work were also given at the GPU Technology Conference 2012, May 14-17, 2012, San Jose, California, USA.
Presentations
GPU Software
This zip file contains software that solves a linear system Ax = b with the Deflated Preconditioned Conjugate Gradient method on the GPU, under certain assumptions. Please save the file and unzip it. The resulting directory contains two readme files, which should be read in the following order:
preREADME
operatingREADME
Deflation is also used in the PARALUTION library, which lets you run various sparse iterative solvers and preconditioners on multi/many-core CPU and GPU devices.
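For illustration only (the actual implementation is in the zip file above and may differ), the central building block of any Conjugate Gradient solver on the GPU is the sparse matrix-vector product. A minimal kernel for a matrix stored in CSR (compressed sparse row) format, with one thread per row, could look like:

```cuda
// Sparse matrix-vector product y = A*x for an n x n matrix in CSR format:
// rowptr[r]..rowptr[r+1] delimits the nonzeros of row r, stored in
// val[] with column indices in colidx[]. One thread handles one row.
__global__ void spmv_csr(int n, const int *rowptr, const int *colidx,
                         const double *val, const double *x, double *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n) {
        double sum = 0.0;
        for (int j = rowptr[row]; j < rowptr[row + 1]; ++j)
            sum += val[j] * x[colidx[j]];
        y[row] = sum;
    }
}
```

This scalar one-thread-per-row scheme is the simplest variant; production codes often assign a warp per row or use vendor libraries such as cuSPARSE to get better memory coalescing, and the deflation and preconditioning steps are built around this kernel together with dot products and vector updates.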
GPU Hardware
The Little GREEN Machine
The Little GREEN Machine (LGM) supercomputer is a Beowulf cluster built from off-the-shelf hardware. It contains many-core Graphics Processing Units (GPUs), which offer a tremendous amount of computational power at a relatively low price and energy consumption (Press bulletins).
Hans Knibbe obtained one of the first
results
on the LGM.
Configuration:
1 head node
- 2 Intel hexacore X5650
- 24 GB memory
- 24 TB disk (RAID)
1 large ram node
- 2 Intel quadcore E5620
- 96 GB memory
- 8 TB disk
- 2 NVIDIA GTX480 (in 2011 replaced by 2 NVIDIA C2070)
1 secondary Tesla node
- 2 Intel quadcore E5620
- 24 GB memory
- 8 TB disk
- 2 NVIDIA GTX480
1 test node
- 2 Intel quadcore E5620
- 24 GB memory
- 2 TB disk
- 1 NVIDIA C2050
- 1 NVIDIA GTX480
20 LGM general computing nodes
- 2 Intel quadcore E5620
- 24 GB memory
- 2 TB disk
- 2 NVIDIA GTX480
interconnect
- 40Gbps Infiniband
The machine is funded by:
Press bulletins
64-bit Linux clusters
Besides this, DIAM also has two 64-bit Linux clusters, one with 8 nodes and the other with 16. Both have state-of-the-art dual- or quad-core Intel processors with about 16 GByte of internal memory per node. These systems are mainly used for heavy computations that cannot be done on an ordinary desktop. The applications run either standalone or in an MPI-based cluster environment.
GPU processors
The most recent (Nov 2010) GPU is the Nvidia Tesla C2070, also known as Fermi, with 6 Gigabyte of internal memory. Recently, all cluster nodes were equipped with Nvidia GPUs, which are known to give a performance boost of a factor 20-100 for several mathematical operations that do not involve recurrences. We also acquired one of the fastest architectures available at the time (Feb 2010): the Nvidia Tesla C1060, which will be used in the near future for advanced mathematical computations.
Student lab facility
DIAM also has a student lab facility with 16 simple Linux desktops (now also equipped with a simple GPU). This lab room is used for instruction connected with our math courses, but also for courses where we teach our students and new researchers how to use, e.g., MPI on the clusters and the GPUs.
Contact information:
Kees Vuik
Back to the home page of Kees Vuik