Conversation
kkamdin
left a comment
There was a problem hiding this comment.
Hey David, I don't see any big issues so this will probably be fine as-is. I flagged two things in comments that you may want to check out in bl823/alcf.py before merging:
- mixed logging modules
- if/else and ValueError raise behavior in the flow steps
| # Transfer A: Send reconstructed data (tiff) to data832 | ||
| logger.info(f"Transferring {file_name} from {config.alcf832_scratch} " | ||
| f"at ALCF to {config.data832_scratch} at data832") | ||
| data832_tiff_transfer_success = transfer_controller.copy( |
There was a problem hiding this comment.
does Step 2B (the multi-res flow below) depend on data832_tiff_transfer_success being true? If yes, you could wrap 2B in an if/else statement so it doesn't try to run and crash.
Or do a similar value error raise if that's the right behavior like you do here (if I have this right -- if alcf_multi_res_success is not true you don't try to run the zarr transfer "Transfer B" even further below):
if not alcf_multi_res_success:
logger.error("Tiff to Zarr Failed.")
raise ValueError("Tiff to Zarr at ALCF Failed")
else:
logger.info("Tiff to Zarr Successful.")
There was a problem hiding this comment.
The multiresolution step does not rely on the TIFF transfer being successful, just that reconstruction was successful. I switched the transfer order so that users have the option to view the TIFF stack at the beamline before waiting for the Zarr conversion to complete (an extra 10+ minute overhead).
…ng the TIFF reconstruction file transfer to happen before Zarr generation for more immediate results.
… stage is complete
Simple fixes to the ALCF recon flow as outlined in this issue: #87
Update: also adding a timeout to globus compute jobs to address this issue: #45