Programs in Physics & Physical Chemistry
|Distribution file: acex_v1_0.gz (21 Kbytes)|
|Manuscript Title: ICCG3: subprograms for the solution of a linear symmetric matrix equation arising from a 7, 15, 19 or 27 point 3D discretization. See erratum Comput. Phys. Commun. 31 (1984) 439.|
|Authors: D.V. Anderson|
|Program title: ICCG3|
|Catalogue identifier: ACEX_v1_0|
|Distribution format: gz|
|Journal reference: Comput. Phys. Commun. 30(1983)51|
|Programming language: Fortran.|
|Operating system: CTSS.|
|Word size: 64|
|Keywords: General purpose, Matrix, Partial differential equations, Elliptic, Parabolic, Three-dimensional, Implicit, Preconditioned, Conjugate gradient algorithm, ICCG, ILUCG, Symmetric matrix, Incomplete Cholesky method.|
Nature of problem:
Elliptic and parabolic partial differential equations which arise in plasma physics applications (as well as in others) are solved in three dimensions. Plasma diffusion, equilibria, and phase space transport (Fokker-Planck equation) have been treated by similar methods in two dimensions using the codes ICCG2 and ILUCG2. These problems share the common feature of being stiff and requiring implicit solution techniques. Sometimes, the resulting matrix equations are symmetric; we solve them here with the ICCG3 program. In a previous article we described a slower and more general algorithm, ILUCG3, which must be used when the matrix is asymmetric.
Solution method:
A generalization of the incomplete Cholesky conjugate gradient (ICCG) algorithm is used to solve the linear symmetric matrix equation.
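The essence of the ICCG method can be sketched in a few lines: factor the matrix approximately as A ≈ L·Lᵀ, keeping only entries inside A's sparsity pattern (IC(0)), then use L·Lᵀ as the preconditioner in a conjugate gradient iteration. The sketch below is illustrative pseudocode in Python, not the published Fortran subprograms; all names are our own, and dense storage is used only for brevity (ICCG3 itself works on banded storage).

```python
# Hedged sketch of incomplete-Cholesky preconditioned conjugate gradient
# (not taken from the ICCG3 source). Dense lists stand in for banded storage.
import math

def ic0(A):
    """Incomplete Cholesky IC(0): L keeps only the sparsity pattern of A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            if A[i][j] != 0.0:  # drop any fill outside A's pattern
                L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def solve_LLt(L, r):
    """Apply the preconditioner: solve L L^T z = r by two triangular solves."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):  # forward substitution (the recursive, non-vectorizable part)
        y[i] = (r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution with L^T
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def iccg(A, b, tol=1e-10, maxit=100):
    """Preconditioned CG; stops on relative L2 residual below tol."""
    L = ic0(A)
    x = [0.0] * len(A)
    r = b[:]
    z = solve_LLt(L, r)
    p = z[:]
    rz = dot(r, z)
    bnorm = math.sqrt(dot(b, b))
    for it in range(maxit):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) / bnorm < tol:
            break
        z = solve_LLt(L, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, it + 1
```

For a tridiagonal test matrix, IC(0) produces no dropped fill, so the preconditioner is the exact Cholesky factorization and the iteration converges in a single step; for the 7- to 27-diagonal matrices treated by ICCG3 the factorization is genuinely incomplete and several iterations are needed.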
The discretization of the three-dimensional PDE and its boundary conditions must result in a spatial operator with a 7, 15, 19 or 27 point stencil, which can be represented by a symmetric block-tridiagonal supermatrix composed of block-tridiagonal submatrices whose elements are elementary tridiagonal matrices. The resulting sparse symmetric matrix has 7, 15, 19 or 27 non-zero diagonals.
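To see where the banded structure comes from, consider the simplest case, a 7-point stencil: with the natural ordering m = i + nx·(j + ny·k), each grid point couples only to columns at offsets 0, ±1, ±nx and ±nx·ny, giving 7 non-zero diagonals. The following sketch (our own illustration, not code from ICCG3) assembles the 7-point discretization of the negative Laplacian on a small grid with Dirichlet boundaries:

```python
# Hedged sketch: the 7-point 3D stencil yields a symmetric matrix whose
# non-zeros lie on 7 diagonals. Illustrative only; not from the ICCG3 source.
nx, ny, nz = 3, 3, 3
n = nx * ny * nz
A = [[0.0] * n for _ in range(n)]
for k in range(nz):
    for j in range(ny):
        for i in range(nx):
            m = i + nx * (j + ny * k)  # natural (lexicographic) ordering
            A[m][m] = 6.0  # diagonal of -Laplacian, Dirichlet boundaries
            for di, dj, dk in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                ii, jj, kk = i + di, j + dj, k + dk
                if 0 <= ii < nx and 0 <= jj < ny and 0 <= kk < nz:
                    A[m][ii + nx * (jj + ny * kk)] = -1.0

# Collect the column-minus-row offsets of all non-zero entries:
offsets = sorted({c - r for r in range(n) for c in range(n) if A[r][c] != 0.0})
```

With nx = ny = 3 the offsets come out as {0, ±1, ±3, ±9}, i.e. exactly 7 diagonals; the 15-, 19- and 27-point stencils add further diagonals adjacent to these in the same supermatrix pattern.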
Unusual features:
The loops are arranged to vectorize on the Cray-1 (with the CFT compiler) wherever possible. Recursive loops, which cannot be vectorized, are written for optimum scalar speed. Some of these loops were made more efficient by increasing the arithmetic per pass while reducing the number of passes; this significantly reduced loop overhead and made the scalar loops 30% faster. One of the sources is not in standard FORTRAN; we show how to make it standard.
Running time:
Running times are problem dependent, because ill-conditioned matrices require more iterations than well-conditioned ones. However, the running time per iteration is known. For a sufficiently large number of equations (1287 here) the running time per iteration per equation becomes independent of the number of equations. The running times also depend on which of the 16 versions is used. In the test problems, which we regard as typical, the time for the incomplete Cholesky factorization ranged from 3.8 to 25.9 μs per unknown for the versions that employed 7-banded and 27-banded L sparsity patterns, respectively. Each conjugate gradient iteration required from 1.9 to 10.9 μs per unknown; these times depend strongly on the sparsity pattern of L and weakly on that of M. In the test problems the number of iterations required to reach a relative residual error of 10^-10 (in the L2 norm) ranged from 8 to 18.