Computer Physics Communications Program Library
Programs in Physics & Physical Chemistry

aebn_v2_0.tar.gz (1668 Kbytes)
Manuscript Title: Introducing PROFESS 2.0: a parallelized, fully linear scaling program for orbital-free density functional theory calculations
Authors: Linda Hung, Chen Huang, Ilgyou Shin, Gregory S. Ho, Vincent L. Lignères, Emily A. Carter
Program title: PROFESS
Catalogue identifier: AEBN_v2_0
Distribution format: tar.gz
Journal reference: Comput. Phys. Commun. 181 (2010) 2208
Programming language: Fortran 90.
Computer: Intel with ifort; AMD Opteron with pathf90.
Operating system: Linux.
Has the code been vectorised or parallelized?: Yes. Parallelization is implemented through domain decomposition using MPI.
RAM: Problem dependent, but 2 GB is sufficient for up to 10,000 ions.
Keywords: Orbital-free density functional theory, Optimization, Electronic structure.
PACS: 31.15.-p.
Classification: 7.3.

External routines: FFTW 2.1.5 (http://www.fftw.org)

Does the new version supersede the previous version?: Yes

Nature of problem:
Given a set of coordinates describing the initial ion positions under periodic boundary conditions, recovers the ground state energy, electron density, ion positions, and cell lattice vectors predicted by orbital-free density functional theory. The computation of all terms is effectively linear scaling. Parallelization is implemented through domain decomposition, and up to ~10,000 ions may be included in the calculation on just a single processor, limited by RAM. For example, when optimizing the geometry of ~50,000 aluminum ions (plus vacuum) on 48 cores, a single iteration of conjugate gradient ion geometry optimization takes ~40 minutes wall time. However, each CG geometry step requires two or more electron density optimizations, so step times will vary.
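As a rough illustration of what domain decomposition means here, the sketch below splits the planes of a real-space grid along one axis into contiguous slabs, one per process. The one-dimensional slab layout and the names `slab_bounds` and `decompose` are assumptions made for this example; in an actual run the partition is determined by the parallel FFT library, and the source does not state the layout PROFESS uses.

```python
def slab_bounds(n_planes, n_ranks, rank):
    """Return (start, stop) of the contiguous block of grid planes owned by `rank`.

    Hypothetical helper: planes are split as evenly as possible, and the
    first `n_planes % n_ranks` ranks receive one extra plane each.
    """
    base, extra = divmod(n_planes, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def decompose(n_planes, n_ranks):
    """List the (start, stop) slab owned by every rank, in rank order."""
    return [slab_bounds(n_planes, n_ranks, r) for r in range(n_ranks)]
```

Each process then stores only the density and potential values on its own slab, which is why the memory needed per process falls as more processes are used.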

Solution method:
Computes energies as described in the text of the paper; minimizes the total energy with respect to the electron density, ion positions, and cell lattice vectors.
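When the square root of the density is the variational parameter, one way to keep the electron count fixed during a line search is to mix the current point with an orthogonalized search direction through an angle. The sketch below shows that idea on a discretized grid, in the spirit of mixing schemes such as the one in Reference [4]. It is not taken from the PROFESS source; the function names, the plain-list grid representation, and the volume element `dv` are all assumptions of this example. It presumes the input `phi` already integrates to `n_elec` electrons.

```python
import math


def dot(a, b, dv):
    """Discrete integral of a*b over the grid with volume element dv."""
    return sum(x * y for x, y in zip(a, b)) * dv


def conserving_direction(phi, d, n_elec, dv):
    """Project d orthogonal to phi, then rescale so its squared norm is n_elec."""
    overlap = dot(phi, d, dv) / dot(phi, phi, dv)
    d_perp = [y - overlap * x for x, y in zip(phi, d)]
    scale = math.sqrt(n_elec / dot(d_perp, d_perp, dv))
    return [scale * y for y in d_perp]


def mix(phi, d_perp, theta):
    """Trial point phi*cos(theta) + d_perp*sin(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x + s * y for x, y in zip(phi, d_perp)]
```

Because `phi` and `d_perp` are orthogonal and both have squared norm `n_elec`, the squared norm of the mixed point is `n_elec * (cos(theta)**2 + sin(theta)**2)`, so the line search can vary `theta` freely without any renormalization step.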

Reasons for new version:
To allow much larger systems to be simulated using PROFESS.

Summary of revisions:
  • PROFESS can run in parallel [1]. Parallelization is implemented through domain decomposition using MPI. (However, copies of all ion positions, which take up a relatively small amount of memory, are saved on all processors.) An updated serial version of PROFESS, with some memory-efficient features specific to the use of a single process, can also be compiled from the same code.
  • Instead of linking to the FFTW3 library, we use FFTW 2.1.5, which was, at the time of this release, the most recent version of FFTW that supports MPI parallel transforms.
  • Ion-ion and ion-electron calculations can scale as O(N ln N) through the use of cardinal B-splines [1]-[3]. (For ion-ion calculations, this is known as particle-mesh Ewald.)
  • The line search during electron density optimization (when using the square root of electron density as the variational parameter) automatically conserves the total number of electrons in the system, using a line search mixing scheme similar to that in Reference [4].
  • The square root density CG and TN optimizations are generally more stable.
  • Conjugate gradient ion optimization is improved and stable.
  • Positions of chosen ions can be held fixed during geometry optimization.
  • Variable time steps are used during cell lattice optimization instead of fixed steps.
  • The CAT kinetic energy density functional [5] is available.
  • A cutoff to avoid divergence in vacuum regions is now an option for some kinetic energy and exchange-correlation functionals (keywords WTV, WGV, CAV, PBEC) [6].
  • The PBE exchange-correlation subroutine is more stable.
  • Calculations of energy and potential for some functionals are more efficient after removing duplicate computations. (Note: CAT, LQ, and HQ functionals have not yet been consolidated.)
  • Density and potential output files have a new format that is more convenient for output from multiple processes. A utility to convert between old and new density formats, as well as Tecplot format, is provided in RhoConvert.f90.
  • The interpolation scheme used when reading in pseudopotentials is more accurate.
  • WGC kernel integration uses the Runge-Kutta method for better accuracy.
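The cardinal B-splines mentioned in the list above can be evaluated with the standard recursion used in particle-mesh Ewald methods [2]. The sketch below is a generic illustration rather than the PROFESS implementation; the function names and the recursive formulation are choices made for this example. It computes the spline M_n(u) and the weights with which a point charge is spread over n consecutive mesh points.

```python
def cardinal_bspline(n, u):
    """Cardinal B-spline M_n(u) of order n >= 2, nonzero only for 0 < u < n."""
    if u <= 0.0 or u >= n:
        return 0.0
    if n == 2:
        # Order 2 is the triangular "hat" function centered at u = 1.
        return 1.0 - abs(u - 1.0)
    # Recursion from order n - 1 to order n.
    return (u * cardinal_bspline(n - 1, u)
            + (n - u) * cardinal_bspline(n - 1, u - 1.0)) / (n - 1)


def spread_weights(n, frac):
    """Weights for spreading a unit charge at fractional mesh offset `frac`
    (0 <= frac < 1) over n consecutive mesh points (hypothetical helper)."""
    return [cardinal_bspline(n, frac + k) for k in range(n)]
```

The weights always sum to one, so spreading conserves the total charge. Once the ions are represented on the mesh in this way, the reciprocal-space sums can be evaluated with fast Fourier transforms, which is where the O(N ln N) scaling comes from.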

Restrictions:
PROFESS cannot use nonlocal (such as ultrasoft) pseudopotentials. A variety of local pseudopotential files are available at the Carter group website (http://www.princeton.edu/mae/people/faculty/carter/homepage/research/local-pseudopotentials/). Also, due to the current state of the kinetic energy functionals, PROFESS is only reliable for main group metals and some properties of semiconductors.

Running time:
Problem dependent: the test example provided with the code takes less than a second to run. Timing results for large scale problems are given in the PROFESS paper and Reference [1].

[1] L. Hung and E.A. Carter, Chem. Phys. Lett. 475 (2009) 163.
[2] U. Essmann, L. Perera, M.L. Berkowitz, T. Darden, H. Lee, L.G. Pedersen, J. Chem. Phys. 103 (1995) 8577.
[3] N. Choly and E. Kaxiras, Phys. Rev. B 67 (2003) 155101.
[4] H. Jiang and W. Yang, J. Chem. Phys. 121 (2004) 2030.
[5] D. García-Aldea and J.E. Alvarellos, Phys. Rev. A 76 (2007) 052504.
[6] I. Shin, A. Ramasubramaniam, C. Huang, L. Hung, and E.A. Carter, Philos. Mag. 89 (2009) 3195.