ctsm5.4.024: Hot fix with updated derecho_intel and one line fix needed for it #3796

Draft
ekluzek wants to merge 10 commits into ESCOMP:master from ekluzek:ccs_config_1077_derecho_intel_needed_updates

Conversation

@ekluzek
Collaborator

@ekluzek ekluzek commented Mar 4, 2026

Description of changes

Update to ccs_config1.0.77, which updates derecho_intel to intel/2024.3.2, plus a one-line fix so that the code builds with it.

Also fix the parameter file selection so that, when coupled to CAM, we get the clm6_0_cam7.0 parameter file.

Specific notes

Contributors other than yourself, if any: @jedwards4b

CTSM Issues Fixed (include github issue #):
Fixes #3791
Fixes #3795

Are answers expected to change (and if so in what way)? Yes, some

Any User Interface Changes (namelist or namelist defaults changes)? No

Does this create a need to change or add documentation? Did you do so? No. No.

Testing performed, if any: will run regular testing.
So far I have run:
SMS.f45_f45_mg37.I2000Clm60FatesSpRsGs.derecho_nvhpc.clm-FatesColdSatPhen
SMS_D.f10_f10_mg37.I1850Clm60BgcCrop.derecho_intel.clm-ciso_soil_matrixcn_only
ERS_D_Ld5.f45_f45_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdCamLndTuningMode

which all pass and compare identically to the ctsm5.4.022 baseline

ekluzek added 2 commits March 3, 2026 23:23
… and clarify the strings with a concatenation so it's clearer and there's the needed space between the "and" and the namelist option name do_transient_urban
@ekluzek ekluzek self-assigned this Mar 4, 2026
@ekluzek ekluzek added the enhancement, code health, priority: Immediate, modernization, and size: small labels Mar 4, 2026
@github-project-automation github-project-automation bot moved this to Ready to start (or start again) in CTSM: Upcoming tags Mar 4, 2026
@ekluzek ekluzek moved this from Ready to start (or start again) to In progress - master in CTSM: Upcoming tags Mar 4, 2026
@ekluzek ekluzek moved this from Todo to In Progress in LMWG: Sprint Planning Board Mar 4, 2026
@ekluzek
Collaborator Author

ekluzek commented Mar 4, 2026

This is on top of #3786, so that PR needs to come to master first before this one shows the true differences.

@ekluzek ekluzek marked this pull request as draft March 4, 2026 08:30
@ekluzek
Collaborator Author

ekluzek commented Mar 4, 2026

There were 73 tests on Derecho with changed answers, which isn't surprising since this was a compiler update:

ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire--clm-matrixcnOn
ERP_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-default
ERP_Ld9.f45_g37.I2000Clm60Bgc.derecho_intel.clm-default
ERP_Ld9.f45_g37.I2000Clm60BgcCrujra.derecho_intel.clm-default
ERP_Ly3_P64x2.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-cropMonthOutput
ERP_Ly3_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings
ERP_P128x2_Ld30.f45_f45_mg37.I2000Clm60FatesSpCruRsGs.derecho_intel.clm-FatesColdSatPhen
ERP_P64x2_Ld1096.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-clm50cropIrrigMonth_interp
ERP_P64x2_Ld1096.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-irrig_o3falk_reduceOutput
ERP_P64x2_Ld366.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-irrig_alternate_monthly
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly            EXPECTED POSSIBILITY
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings            EXPECTED POSSIBILITY
ERP_P64x2_Ld765.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-monthly
ERS_Ld1640_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCrop.derecho_intel.clm-ciso_monthly_matrixcn_spinup
ERS_Ld3.f10_f10_mg37.I2000Clm60Bgc.derecho_intel.clm-ciso_cwd_hr
ERS_Ld3.f10_f10_mg37.I2000Clm60Bgc.derecho_intel.clm-ciso_cwd_hr--clm-matrixcnOn
ERS_Ld3.f10_f10_mg37.I2000Clm60BgcCrujra.derecho_intel.clm-ciso_cwd_hr
ERS_Ld30.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdFixedBiogeo
ERS_Ld30.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdSizeAgeMort
ERS_Ld396.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-monthly_matrixcn_fast_spinup
ERS_Ld9.f10_f10_mg37.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdCH4Off
ERS_Ly20_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.derecho_intel.clm-cropMonthlyNoinitial
ERS_Ly20_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.derecho_intel.clm-cropMonthlyNoinitial--clm-matrixcnOn
ERS_Ly3.f10_f10_ais8gris4_mg37.I1850Clm60SpGag.derecho_intel
ERS_Ly3.f10_f10_mg37.I1850Clm60BgcCropCmip6.derecho_intel.clm-basic
ERS_Ly3.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel
ERS_Ly3_P64x2.f10_f10_mg37.IHistClm50BgcCropG.derecho_intel.clm-cropMonthOutput
ERS_Ly5_P128x1.f10_f10_mg37.IHistClm45BgcCrop.derecho_intel.clm-cropMonthOutput
ERS_Ly5_P128x1.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput
ERS_Ly5_P128x1.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings
ERS_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_intel.clm-cropMonthOutput
ERS_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings
ERS_P128x1_Ld765.f10_f10_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdNoComp
LCISO_Ld396.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-ciso_monthly_matrixcn_spinup
LCISO_Ld396.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-ciso_monthly
LCISO_Ld396.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-ciso_monthly--clm-matrixcnOn_ignore_warnings
MKSURFDATAESMF_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel
NCK_Ld1.f10_f10_mg37.I2000Clm50Sp.derecho_intel.clm-default--clm-nofireemis
PEM_Ld1.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-till
PEM_Ld1.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-till--clm-matrixcnOn
PFS_Ld10_PS.f19_g17.I2000Clm50BgcCrop.derecho_intel.clm-default--clm-matrixcnOn
REP_P64x2_Ld13.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings
RXCROPMATURITYSKIPGEN_Ld1097.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput
SMS.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-crop
SMS.ne30pg3_t232.I2000Clm60BgcCrop.derecho_intel.clm-clm60cam7LndTuningMode
SMS_D_Ld1_Mmpi-serial.ne3_ne3_mg37.I2000Clm50SpRs.derecho_intel.clm-ptsRLA
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60SpRs.derecho_intel.clm-default--clm-NEON-TOOL--clm-nofireemis
SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.derecho_intel.clm-f09_dec1990Start_GU_LULCC
SMS_Ld3_PS.f09_g17.IHistClm50BgcCrop.derecho_intel.clm-f09_dec1990Start_GU_LULCC--clm-matrixcnOn_ignore_warnings
SMS_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-crop
SMS_Ld5.f10_f10_mg37.I2000Clm45Fates.derecho_intel.clm-FatesCold
SMS_Ld5.f10_f10_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesCold
SMS_Lm1.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-clm60_monthly_matrixcn_soilCN30
SMS_Lm1.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-leafcn_t_evolving
SMS_Lm1.ne30pg3_t232.I1850Clm60BgcCropCrujraG.derecho_intel.clm-clm60cam7LndTuningMode
SMS_Lm1.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningMode
SMS_Ln9.f09_f09_mg17.I1850Clm45Bgc.derecho_intel.clm-clm45cam4LndTuningModeZDustSoilErod
SMS_Ln9.f10_f10_mg37.I1850Clm45Bgc.derecho_intel.clm-clm45cam4LndTuningModeZDustSoilErod
SMS_Ln9.f10_f10_mg37.I2000Clm50Sp.derecho_intel.clm-clm50cam5LndTuningModeZDustSoilErod--clm-nofireemis
SMS_Ln9.f19_f19_mg17.IHistClm50Sp.derecho_intel.clm-clm50cam7LndTuningMode_1979Start--clm-nofireemis
SMS_Ln9.ne30pg2_ne30pg2_mg17.I1850Clm50Sp.derecho_intel.clm-clm50cam6LndTuningMode--clm-nofireemis
SMS_Ln9.ne30pg2_ne30pg2_mg17.I2000Clm50BgcCrop.derecho_intel.clm-clm50cam6LndTuningMode
SMS_Ln9_P256x2.C96_C96_mt232.IHistClm50BgcCrop.derecho_intel.clm-clm50cam6LndTuningMode
SMS_Ly1_Mmpi-serial.1x1_brazil.IHistClm60BgcQianRs.derecho_intel.clm-output_bgc_highfreq
SMS_Ly2_PS.f19_g17.I2000Clm60BgcCrop.derecho_intel.clm-cropMonthOutput
SMS_Ly3.f10_f10_mg37.I1850Clm50SpG.derecho_intel.clm-glcMEC_long--clm-nofireemis
SMS_Ly3.f10_f10_mt232.IHistClm60BgcCropCrujra.derecho_intel.clm-ciso_cmip7_monthly_2013Start
SMS_Ly3.f10_f10_mt232.IHistClm60BgcCropCrujra.derecho_intel.clm-ciso_monthly_2013Start
SMS_Ly5_Mmpi-serial.1x1_brazil.IHistClm50BgcQianRs.derecho_intel.clm-newton_krylov_spinup
SOILSTRUCTUD_Ld5.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-default
SSPMATRIXCN_Ly5_Mmpi-serial.1x1_numaIA.I2000Clm60BgcCropQianRs.derecho_intel.clm-ciso_monthly
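
For triage, a list like the one above can be grouped by test type. The following is a minimal sketch, not part of CTSM or CIME: it assumes the names follow the pattern `TESTTYPE[_OPTIONS].GRID.COMPSET.MACHINE_COMPILER[.TESTMODS]`, and the helper names (`CimeTestName`, `parse_test_name`) are hypothetical. Only five names from the list are used as sample input.

```python
from collections import Counter
from typing import NamedTuple, Optional, Tuple


class CimeTestName(NamedTuple):
    test_type: str            # e.g. "ERS"
    options: Tuple[str, ...]  # e.g. ("Ly3",) or ("Ln9", "P256x2")
    grid: str
    compset: str
    machine_compiler: str
    testmods: Optional[str]   # None when the name has no test-mods field


def parse_test_name(line: str) -> CimeTestName:
    """Split one test name into its dot-separated fields.

    Anything after the first whitespace (such as a trailing
    "EXPECTED POSSIBILITY" note) is ignored.
    """
    name = line.split()[0]
    fields = name.split(".", 4)
    if len(fields) < 4:
        raise ValueError(f"unexpected test name: {name!r}")
    head, grid, compset, machine_compiler = fields[:4]
    testmods = fields[4] if len(fields) == 5 else None
    test_type, *options = head.split("_")
    return CimeTestName(test_type, tuple(options), grid, compset,
                        machine_compiler, testmods)


# Sample input copied from the changed-answers list (a subset, not all 73).
SAMPLE = [
    "ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default",
    "ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly            EXPECTED POSSIBILITY",
    "ERS_Ly3.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel",
    "SMS_Ln9_P256x2.C96_C96_mt232.IHistClm50BgcCrop.derecho_intel.clm-clm50cam6LndTuningMode",
    "SMS.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-crop",
]

parsed = [parse_test_name(s) for s in SAMPLE]
by_type = Counter(p.test_type for p in parsed)
for test_type, count in sorted(by_type.items()):
    print(f"{test_type}: {count}")
```

Feeding the full list through `parse_test_name` and counting on `compset` or `testmods` instead would show whether the changed answers cluster in a particular configuration.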

And there are seven unexpected fails that I need to look into:

ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings        (RUN)
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings       (RUN)
ERS_D_Ld7_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropRs.derecho_intel.clm-decStart1851_noinitial     (RUN)
ERS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm60FatesRs.derecho_intel.clm-FatesCold  (RUN)
SMS_D_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm45BgcCropQianRs.derecho_intel.clm-cropMonthOutput        (RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_intel.clm-FatesFireLightningPopDens--clm-NEON-FATES-NIWO       (RUN)
SMS_Lm3_D_Mmpi-serial.1x1_brazil.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdHydro        (RUN)
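
One pattern in the seven failures above can be checked mechanically: every name carries the `D` case option (which in CIME test names requests a debug build), and five of them also carry `Mmpi-serial`. The following is a minimal sketch with a hypothetical helper (`case_options`); it only inspects the names and does not establish the cause. Two debug tests (the ptsRLA and NEON-TOOL cases) also appear in the changed-answers list, so a debug build alone is not sufficient to trigger a failure.

```python
# The seven unexpected failures, copied from the list above.
FAILED = """
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings        (RUN)
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings       (RUN)
ERS_D_Ld7_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropRs.derecho_intel.clm-decStart1851_noinitial     (RUN)
ERS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm60FatesRs.derecho_intel.clm-FatesCold  (RUN)
SMS_D_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm45BgcCropQianRs.derecho_intel.clm-cropMonthOutput        (RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_intel.clm-FatesFireLightningPopDens--clm-NEON-FATES-NIWO       (RUN)
SMS_Lm3_D_Mmpi-serial.1x1_brazil.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdHydro        (RUN)
"""


def case_options(line: str) -> list:
    """Return the underscore-separated options that follow the test type."""
    name = line.split()[0]           # drop the trailing "(RUN)" phase marker
    head = name.split(".", 1)[0]     # e.g. "ERP_D_P64x2_Ld3"
    return head.split("_")[1:]       # e.g. ["D", "P64x2", "Ld3"]


failed = [line for line in FAILED.splitlines() if line.strip()]
debug = [line for line in failed if "D" in case_options(line)]
serial = [line for line in failed if "Mmpi-serial" in case_options(line)]

print(f"failed={len(failed)} debug={len(debug)} mpi-serial={len(serial)}")
```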

@ekluzek
Collaborator Author

ekluzek commented Mar 4, 2026

The first two fails are an assertion failure in SparseMatrixMultiplyMod.F90 (line 973, reached from the CN soil matrix code), with errors that look like this:

cesm.log:

dec2454.hsn.de.hpc.ucar.edu 6:  ERROR in SparseMatrixMultiplyMod.F90 at line 973
dec2454.hsn.de.hpc.ucar.edu 62:  ERROR in SparseMatrixMultiplyMod.F90 at line 973
dec2454.hsn.de.hpc.ucar.edu 47:  Negative conc. in ch4tran. c,j,deficit (mol):           7           2
dec2454.hsn.de.hpc.ucar.edu 47:   1.046852068611785E-003
dec2454.hsn.de.hpc.ucar.edu 62: Image              PC                Routine            Line        Source
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           0000000003A188E5  shr_abort_backtra         110  shr_abort_mod.F90
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           0000000003A188A1  shr_abort_abort            65  shr_abort_mod.F90
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           0000000003A196CE  shr_assert                 95  shr_assert_mod.F90.in
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           0000000003A19A32  shr_assert_all_1d         112  shr_assert_mod.F90.in
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           0000000002AFF8C8  spmp_b_acc                973  SparseMatrixMultiplyMod.F90
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           0000000001256C95  cnsoilmatrix              614  CNSoilMatrixMod.F90
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           000000000331EFB4  cndriverleaching         1113  CNDriverMod.F90
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           00000000015BD670  ecosystemdynamics        1127  CNVegetationFacade.F90
dec2454.hsn.de.hpc.ucar.edu 62: cesm.exe           000000000091FCE9  clm_driver_mp_clm        1157  clm_driver.F90

For these tests:

ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings        (RUN)
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings       (RUN)

@ekluzek
Collaborator Author

ekluzek commented Mar 4, 2026

The five mpi-serial tests that fail:

ERS_D_Ld7_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropRs.derecho_intel.clm-decStart1851_noinitial     (RUN)
ERS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm60FatesRs.derecho_intel.clm-FatesCold  (RUN)
SMS_D_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm45BgcCropQianRs.derecho_intel.clm-cropMonthOutput        (RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_intel.clm-FatesFireLightningPopDens--clm-NEON-FATES-NIWO       (RUN)
SMS_Lm3_D_Mmpi-serial.1x1_brazil.I2000Clm50FatesCruRsGs.derecho_intel.clm-FatesColdHydro        (RUN)

All five do so because of a floating divide by zero inside ESMF regridding (reached from shr_strdata_init), which looks like this:

forrtl: error (73): floating divide by zero
Image              PC                Routine            Line        Source             
libc.so.6          000014FBD7847900  Unknown               Unknown  Unknown
libesmf.so         000014FBD9E4B616  _ZN5ESMCI4BBoxC1E     Unknown  Unknown
libesmf.so         000014FBDA2CF410  _ZN5ESMCI16OctSea     Unknown  Unknown
libesmf.so         000014FBDA2D1568  _ZN5ESMCI9OctSear     Unknown  Unknown
libesmf.so         000014FBDA0C9F01  _ZN5ESMCI6InterpC     Unknown  Unknown
libesmf.so         000014FBDA239978  _ZN5ESMCI6regridE     Unknown  Unknown
libesmf.so         000014FBDA2783B8  _Z19ESMCI_regrid_     Unknown  Unknown
libesmf.so         000014FBDA1F8EC3  _ZN5ESMCI7MeshCap     Unknown  Unknown
libesmf.so         000014FBDA2C16B2  c_esmc_regrid_cre     Unknown  Unknown
libesmf.so         000014FBDA974A50  esmf_regridmod_mp     Unknown  Unknown
libesmf.so         000014FBDA6E7578  esmf_fieldregridm     Unknown  Unknown
cesm.exe           00000000037EA55C  shr_strdata_init          639  dshr_strdata_mod.F90
cesm.exe           00000000037DE3C2  shr_strdata_init_         347  dshr_strdata_mod.F90
cesm.exe           00000000037282BB  init                      115  ch4FInundatedStreamType.F90
cesm.exe           00000000030ACF25  init                      241  ch4Mod.F90
cesm.exe           0000000000936BBB  clm_instinit              396  clm_instMod.F90
cesm.exe           00000000009290FC  initialize2               419  clm_initializeMod.F90
cesm.exe           000000000085A034  initializerealize         677  lnd_comp_nuopc.F90
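
When a traceback like the one above is dominated by library frames with no line information, it helps to pull out the first frame that does resolve to a source line. The following is a minimal sketch, assuming the five-column layout shown here (image, PC, routine, line, source); the names `Frame`, `parse_frames`, and `first_resolved_frame` are hypothetical and the sample is an excerpt of the traceback, with routine names truncated exactly as printed.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    image: str
    routine: str
    line: Optional[int]   # None when the traceback prints "Unknown"
    source: str


# image, 16-hex-digit PC, routine, line number or "Unknown", source file
FRAME_RE = re.compile(
    r"(\S+)\s+([0-9A-Fa-f]{16})\s+(\S+)\s+(\d+|Unknown)\s+(\S+)\s*$"
)


def parse_frames(text: str) -> list:
    """Return every line of the text that looks like a traceback frame."""
    frames = []
    for raw in text.splitlines():
        m = FRAME_RE.search(raw)
        if m is None:
            continue  # error message, column header, or unrelated log line
        image, _pc, routine, line, source = m.groups()
        frames.append(Frame(image, routine,
                            int(line) if line.isdigit() else None, source))
    return frames


def first_resolved_frame(text: str) -> Optional[Frame]:
    """Innermost frame that has a real line number, or None."""
    for frame in parse_frames(text):
        if frame.line is not None:
            return frame
    return None


TRACE = """\
forrtl: error (73): floating divide by zero
Image              PC                Routine            Line        Source
libc.so.6          000014FBD7847900  Unknown               Unknown  Unknown
libesmf.so         000014FBD9E4B616  _ZN5ESMCI4BBoxC1E     Unknown  Unknown
libesmf.so         000014FBDA974A50  esmf_regridmod_mp     Unknown  Unknown
libesmf.so         000014FBDA6E7578  esmf_fieldregridm     Unknown  Unknown
cesm.exe           00000000037EA55C  shr_strdata_init          639  dshr_strdata_mod.F90
cesm.exe           00000000037DE3C2  shr_strdata_init_         347  dshr_strdata_mod.F90
cesm.exe           00000000037282BB  init                      115  ch4FInundatedStreamType.F90
"""

frame = first_resolved_frame(TRACE)
print(frame)
```

On this excerpt the first resolved frame is the call in dshr_strdata_mod.F90 at line 639, which is where the model enters the ESMF regrid creation that raises the exception.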

@samsrabin samsrabin changed the title ctsm5.4.023: Hot fix with updated derecho_intel and one line fix needed for it ctsm5.4.024: Hot fix with updated derecho_intel and one line fix needed for it Mar 9, 2026
ekluzek added 3 commits March 9, 2026 16:00
…he namelist move for ESCOMP#3391 the move_nml_parameters branch
…he paramfile"

Temporarily revert the change to fix the coupled model parameter file,
so that it comes in on an answer changing tag.

This reverts commit 580708d.
…-issues' into ccs_config_1077_derecho_intel_needed_updates
@ekluzek
Collaborator Author

ekluzek commented Mar 10, 2026

@slevis-lmwg and I are discussing this.

On the CN fails: in #3360 we noted that resubmitting enabled the test to pass, so I'll do the same here.

Also, there is a matrixcn testlist, and I should run it to check more CN matrix tests. Actually, looking again, it looks like all the matrixcn tests are in aux_clm. Maybe we should delete the matrixcn testlist then? Actually, let's keep it: the list is 50 tests, and it makes it easier to run just that set.


Labels

code health - improving internal code structure to make easier to maintain (sustainability)
enhancement - new capability or improved behavior of existing capability
modernization - E.g., for improving ability to perform on new computing architectures
priority: Immediate - Highest priority, something that was unexpected
size: small

Projects

Status: In progress - master
Status: In Progress

Development

Successfully merging this pull request may close these issues.

Update to ccs_config1.0.77 which has further updates to the derecho_intel compiler environment intel/2025.3.2 wants a one line change

1 participant