Releases: pyscf/gpu4pyscf

v1.5.0

24 Nov 21:03
ce3d8f9

  • New Features (PBC systems)
    • PBC GDF extended to k-mesh computations; k-point GDF integrals stored in host memory with compression.
    • Multigrid algorithm supports PBC k-point SCF and band structure calculations.
    • Add the .analyze() method for PBC gamma-point and k-mesh DFT to summarize results and charge populations.
    • Fermi and Gaussian smearing for PBC and molecular DFT.
    • PBC RSJK algorithm for J/K matrix evaluation (J via MD J-engine; K via Rys quadrature).
    • Analytical nuclear gradients using RSJK and AFTDF for PBC gamma-point HF, k-mesh HF, and hybrid DFT.
    • Stress tensor evaluation using RSJK and AFTDF for PBC gamma-point HF, k-mesh HF, and hybrid DFT.
    • Geometry optimizer for PBC DFT.
  • New Features (molecular systems)
    • Support for QMMM point charges and external electric fields.
    • 3c2e integrals contracted with density matrices and auxiliary vectors for memory-efficient DF Coulomb matrix evaluation.
    • DFT Hessian second-derivative grid response.
    • Minimum energy crossing point (MECP) search functionality.
    • PCM support for TDDFT derivative coupling calculations.
    • Basic GKS and two-component numerical integration, including GPU-accelerated multi-collinear functionals.
    • Multi-collinear spin-flip TDA/TDDFT excitation energies and analytical gradients.
  • Improvements (PBC systems)
    • Linear-dependency handling for basis functions in molecular and PBC DFT calculations.
    • Refactored PBC nuclear gradients for more efficient GTH pseudopotential evaluation.
    • Faster GTH pseudopotential evaluation using the multigrid algorithm on large systems.
  • Improvements (molecular systems)
    • Optimized DFT numerical integration memory usage, achieving ~20% performance gains.
    • Refactored and optimized molecular four-center-integral J/K builder, achieving 50-100% speed-up.
    • Improved phase determination method for NACV.
    • More numerically stable Hessian integrals for large-exponent GTOs.
    • MD J-engine optimized with reduced CUDA register pressure.
    • Third-order XC derivatives can be evaluated on GPU (requires gpu4pyscf-libxc 0.7).
    • Default auxbasis_response level increased to 2 for Hessian calculations with DF integrals.
    • Dimension checks for eigh, enabling scipy fallback for large arrays (size > 21350).
  • Fixes
    • Handle eps=inf in solvent models.
    • Fixed an edge case in EDA electrostatics when cross-fragment nocc is 1.
    • Fixed EDA crash caused by fragments accessing JK matrices after DF 3-index tensors were freed.
    • Workaround for CUDA 13 compiler bugs affecting ECP kernels (disabling compiler optimizations).
    • Molecular and PBC 3c2e integral dimension issues for generally contracted basis sets.
    • Removed pre-allocated streams that caused inconsistent synchronization.
    • SMD Hessian.
    • UHF crash when level_shift is enabled.
  • API updates
    • Fixed the to_gpu/to_cpu interface in SMD, TDDFT, and PCM-TDDFT.
    • Added from_cpu hook on the GPU side, allowing pyscf to invoke this hook in its to_gpu method.
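
The Fermi and Gaussian smearing listed under the PBC features can be illustrated without any GPU code. The following is a minimal standalone sketch, not gpu4pyscf's implementation: the function names, the bisection search for the chemical potential, and the sample orbital energies are all assumptions made for illustration. It assigns fractional occupations so that they sum to a target electron count for one spin channel.

```python
import math


def fermi_occ(energy, mu, sigma):
    """Fermi-Dirac occupation 1 / (1 + exp((e - mu) / sigma))."""
    x = (energy - mu) / sigma
    # Clamp the exponent so math.exp cannot overflow.
    if x > 500.0:
        return 0.0
    if x < -500.0:
        return 1.0
    return 1.0 / (1.0 + math.exp(x))


def gaussian_occ(energy, mu, sigma):
    """Gaussian-smeared occupation 0.5 * erfc((e - mu) / sigma)."""
    return 0.5 * math.erfc((energy - mu) / sigma)


def find_mu(energies, nocc, sigma, occ_fn, tol=1e-12):
    """Bisect for the chemical potential that yields nocc electrons.

    Assumes 0 < nocc < len(energies); the total occupation grows
    monotonically with mu, so bisection is sufficient.
    """
    lo = min(energies) - 10.0 * sigma - 1.0
    hi = max(energies) + 10.0 * sigma + 1.0
    for _ in range(200):
        mu = 0.5 * (lo + hi)
        total = sum(occ_fn(e, mu, sigma) for e in energies)
        if total > nocc:
            hi = mu
        else:
            lo = mu
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def smeared_occupations(energies, nocc, sigma, occ_fn=fermi_occ):
    """Return (mu, occupations) for one spin channel."""
    mu = find_mu(energies, nocc, sigma, occ_fn)
    return mu, [occ_fn(e, mu, sigma) for e in energies]
```

With the hypothetical energies [-1.0, -0.5, -0.5, 0.3], nocc=2, and sigma=0.01, the chemical potential settles on the degenerate pair and each of those two levels receives about half an electron.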

v1.4.3

19 Aug 03:14

  • New Features
    • Geometry optimization for excited states using TDDFT-ris methods.
    • Non-adiabatic coupling vectors for TDDFT-ris methods.
    • Analytical gradients for DFT+U with k-point sampling.
    • Stress tensor calculations for semi-local DFT with k-point sampling and at the gamma-point.
    • ASE interface to support crystal lattice optimization.
    • Multigrid v2 algorithm for meta-GGA functionals.
  • Improvements
    • GPU kernels for PBC overlap and kinetic integrals.
    • Reduced GPU memory usage for TDDFT-ris by storing tensors in host memory.
    • In PBC methods, scaled k-points (fractional coordinates) are stored to
      simplify lattice optimization calculations.
    • A preconditioned Krylov solver to accelerate convergence in TDDFT and
      dynamic polarizability calculations.
  • Fixes
    • Basis decontraction issue for d and f orbitals.
    • A bug in Multigrid v2 algorithm related to non-orthogonal lattices.
    • Incorrect virtual orbital energies when level_shift was enabled.

v1.4.2

23 Jul 17:22

  • New Features
    • Raman spectrum calculations.
    • Non-adiabatic coupling vector for time-dependent RKS, including the coupling
      between the ground state and excited states as well as among excited states.
    • DFT+U for molecule and PBC systems.
    • ALMO EDA 2 method.
    • Analytical gradients for TDDFT-ris method.
    • Analytical gradients for PBC k-point DFT.
    • Efficient analytical gradients for PBC Gamma-point DFT using the multigrid algorithm.
    • A custom CuPy memory pool to reduce GPU memory usage.
  • Improvements
    • Improved PBC GDF integral computation at the Gamma point, including reduced
      GPU memory usage and enhanced computational efficiency.
    • Set the J engine as the default Coulomb matrix algorithm in the direct SCF driver.
    • Efficient multigrid integral algorithm for various functions in PBC DFT
      Gamma-point computations, such as get_nuc, get_pp, and GGA functionals.
    • Support for the xc='HF' setting in DFT.
  • Fixes
    • Ensured compatibility with CUDA 12.3.
    • Issues related to the combination of density fitting, PCM solvent, and TDDFT.

v1.4.1

27 May 13:51
17ac106

  • New Features
    • Analytical Hessian for VV10 functionals
    • DFT polarizability with VV10 functionals
    • TDDFT for VV10 functionals
    • Non-adiabatic coupling constants for TDDFT states
    • TDDFT gradients and geometry optimization solver for excited states
    • LR-PCM for TDDFT and TDDFT gradients
    • TDDFT-ris method
  • Improvements
    • Optimized CUDA kernel and integral screening for the MD J-engine. The MD
      J-engine is used by default for HF and DFT computations on large systems.
    • Optimization for PBC Gaussian density fitting at the gamma point.
    • ECP gradients CUDA kernel
    • Reduced atomicAdd overhead in Rys JK kernel
  • Fixes
    • MINAO initial guess for ghost atoms

v1.4.0

08 Apr 13:38
9a22053

  • New Features
    • RKS and UKS TDDFT Gradients for density fitting and direct-SCF methods.
    • ECP integrals and their first and second derivatives accelerated on GPU.
    • Multigrid algorithm for Coulomb matrix and LDA, GGA, MGGA functionals computation.
    • PBC Gaussian density fitting integrals.
    • ASE interface for molecular systems.
  • Improvements
    • Reduce memory footprint in SCF driver.
    • Reduce memory requirements for PCM energy and gradients.
    • Reduce memory requirements for DFT gradients.
    • Exploit the sparsity of the cart2sph coefficients in the cart2sph transformation of the scf.jk kernel.
    • Molecular 3c2e integrals generated using the block-divergent algorithm.
    • Support I orbitals in DFT.
  • Fixes
    • LRU-cached cart2sph in multi-GPU environments.
    • A maxDynamicSharedMemorySize setting bug in gradient and Hessian calculations in multi-GPU environments.
    • Removed the limit of 6000 GTO shells in the DFT numerical integration module.

v1.3.2

10 Mar 22:05
3b0c194

  • Improvements
    • Dump xc and grid information to the log file
    • Optimize 4-center integral evaluation CUDA kernels using a warp-divergent algorithm
    • Support up to I orbitals in DFT
    • Fix out-of-bound issue in DFT hessian for heavy atoms (>=19)
  • Deprecation
    • SM60 is not supported in the PyPI package

v1.3.1

05 Feb 02:31
b82e219

  • New Features
    • Analytical Hessian for PCM solvent model
    • Driver for 3c methods (wB97x-3c, R2Scan-3c, B97-3c, etc.)
  • Improvements
    • Preconditioner and computation efficiency of Davidson iterations for TDDFT

v1.3.0

08 Jan 02:15
0427e6a

  • New Features
    • PBC analytical Fourier transform on GPU
  • Improvements
    • Optimized computation efficiency and memory footprint for density fitting Hessian
    • Support pickle serialization for most classes (SCF, DF, PCM, etc.)
    • Efficiency of moving CuPy arrays between GPU cards

v1.2.1

20 Dec 22:57
4247242

What's Changed

  • Change the license from GPL v3.0 to Apache 2.0
  • Support direct SCF algorithms with multi-GPU
  • Change the default conv_tol_cpscf = 1e-3 / batch of atoms to conv_tol_cpscf = 1e-6 / atom
  • Add PBC HF and DFT with k-points, UHF/UKS, and density fitting

Improvements

  • Fix numerical instability in complex-valued TDHF diagonalization
  • Improve PCM and QMMM with int1e_grids kernel
  • Support non-symmetric int3c2e integral
  • Optimize Hessian calculation with direct SCF
  • Improve the numerical stability of int3c2e for point charge
  • Add CI workflow for multi-GPU

Bugfixes

  • Fix non-contiguous array error in p2p transfer between GPUs.
  • Fix bugs in NMR calculations

Merry Christmas!

Full Changelog: v1.2.0...v1.2.1

v1.2.0

09 Dec 19:08
5811bb4

New Features

  • Spin-conserved TDA and TDDFT methods
  • Spin-flip TDA method.
  • J-engine using the McMurchie-Davidson integral algorithm
  • Support Multi-GPU density fitting energy, gradients and Hessian computation.
  • Second order SCF solver

Improvements

  • Support non-hermitian density matrix in J/K builder
  • Secondary grids for CPHF solver
  • 3-center integral computation efficiency for gradients and Hessian
  • One-electron Coulomb integrals against point charges and Gaussian charge distributions on grids.
  • Automatically apply SCF initial guess from existing wavefunction