
Update ESACCI-OZONE CMORizer(formatter and downloader) for REF #3899

Open
wants to merge 28 commits into
base: main


diegokam
Contributor

@diegokam diegokam commented Jan 31, 2025

Description

This PR ports the ESACCI-OZONE CMORizer from NCL to Python for REF. It produces two aggregated output files, one for toz (1995-2023) and one for o3 (1984-2022); both variables are defined in the CMOR table AERmon.


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

New or updated recipe/diagnostic

New or updated data reformatting script


To help with the number of pull requests:

@diegokam diegokam added the REF Important for the CMIP Rapid Evaluation Framework (REF) label Jan 31, 2025
@diegokam diegokam self-assigned this Jan 31, 2025
@diegokam diegokam marked this pull request as ready for review February 7, 2025 14:46
@diegokam diegokam requested a review from a team as a code owner February 7, 2025 14:46
Contributor

@schlunma schlunma left a comment


Thanks Diego!

I have a couple of comments on the code. In addition, please also call the function fix_coords (if possible) on the data to make sure that all coordinate metadata is set properly.

@diegokam
Contributor Author

Thanks @schlunma and @axel-lauer for your reviews. I confirm that I have addressed all the comments related to the cmorizer. Please check whether the cmorized output files are fine now; the coordinate metadata does indeed look better, thanks!

Contributor

@schlunma schlunma left a comment


Thanks Diego for addressing my comments!

Just minor ones remaining. Please make sure to add the appropriate download instructions and source (I am sure @axel-lauer will point you to the right location).

I haven't tested this though.

… config file to switch to OBS6, add check_obs entries for O3 and toz in OBS6
@diegokam
Contributor Author

Following clarification with @schlunma, I had to switch to the OBS6 project: running check_obs on the latest output files failed because the CMORization used the project OBS, which refers to the CMIP5 tables rather than the CMIP6 ones. I also had to use the fix_alt16_metadata function from esmvalcore.cmor._fixes.native_datasets.
The output filenames now follow this pattern:
OBS6_ESACCI-OZONE_sat_L3_AERmon_o3_YYYYMM-YYYYMM.nc
OBS6_ESACCI-OZONE_sat_L3_AERmon_toz_YYYYMM-YYYYMM.nc
The new output files now pass the check_obs tests.
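
For illustration, the naming pattern above could be generated as in the minimal sketch below. The function name `obs6_filename` and its arguments are hypothetical and not taken from the formatter; only the fixed parts of the name come from the two filenames listed in this comment, and the dates in the usage line are example values.

```python
from datetime import date


def obs6_filename(short_name: str, start: date, end: date) -> str:
    """Build a filename following the OBS6 pattern quoted in this comment.

    Hypothetical helper for illustration only; the actual formatter may
    assemble the name differently.
    """
    # Fixed parts (project, dataset, type, version, MIP table) are copied
    # from the filenames above; only the variable and period vary.
    return (
        f"OBS6_ESACCI-OZONE_sat_L3_AERmon_{short_name}_"
        f"{start:%Y%m}-{end:%Y%m}.nc"
    )


# Example values only; the real period depends on the data that was processed.
print(obs6_filename("o3", date(1984, 10, 1), date(2022, 12, 31)))
```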

Contributor

@schlunma schlunma left a comment


Code looks good now. Will approve once the documentation is updated properly. Please also update the datasets.yml file.

Contributor

@schlunma schlunma left a comment


Code looks good! I didn't run the code or look at the output from a scientific point of view, though.

Also the link to the data does not work for me: https://cds.climate.copernicus.eu/datasets/satellite-ozone-v1

Contributor

@axel-lauer axel-lauer left a comment


Here are a few first comments:

  • I think the unit conversion needs some changes (see code suggestions).
  • The default time range in cmorization is 1984/1/1 to 2023/12/31. This needs to be changed depending on the dataset that is processed (formatter does not run otherwise):
    • GTO-ECV covers the period 1995/7 to 2023/04
    • SAGE-CCI-OMPS covers the period 1984/10 to 2022/12
  • The old downloader is still the one for the old dataset (used with the NCL formatter). At a minimum, the downloader needs to be removed, but I would prefer to add a downloader that retrieves the current data from the CDS. Examples for retrieving data from the CDS are cds_satellite_soil_moisture.py and cds_xch4.py. I believe they can be used as a template with some small modifications, see CDS changing infrastructure: changes to users and possibly some cmorizers #3750. Maybe @bettina-gier can advise what is needed to download data from the (new) CDS.
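
One possible way to handle the dataset-dependent time range from the second point is sketched below. It is not the formatter's actual logic: the names `COVERAGE` and `clip_to_coverage` are hypothetical, and the day-of-month values are assumptions, since only year and month are given above.

```python
from datetime import date

# Coverage periods as quoted in the review comment above. The day values
# (first/last day of the month) are assumptions made for this sketch.
COVERAGE = {
    "GTO-ECV": (date(1995, 7, 1), date(2023, 4, 30)),
    "SAGE-CCI-OMPS": (date(1984, 10, 1), date(2022, 12, 31)),
}


def clip_to_coverage(dataset: str, start: date, end: date) -> tuple[date, date]:
    """Restrict a requested period to the period the dataset covers."""
    first, last = COVERAGE[dataset]
    clipped_start = max(start, first)
    clipped_end = min(end, last)
    if clipped_start > clipped_end:
        raise ValueError(
            f"{dataset} has no data between {start} and {end}"
        )
    return clipped_start, clipped_end


# The default range mentioned above, clipped for each dataset.
for name in COVERAGE:
    print(name, clip_to_coverage(name, date(1984, 1, 1), date(2023, 12, 31)))
```

With a helper along these lines, the default range 1984/1/1 to 2023/12/31 would not need to be edited by hand for each dataset.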

@diegokam diegokam changed the title from "Update ESACCI-OZONE CMORizer(python version) for REF" to "Update ESACCI-OZONE CMORizer(formatter and downloader) for REF" Feb 19, 2025
@diegokam
Contributor Author

Thank you for your review @axel-lauer.
I confirm that I have addressed all your comments regarding the formatter, and it should work properly now.
Regarding the downloader, I also pushed a working version that downloads the dataset from the CDS, but I am not sure whether this is a proper solution: the new API requires a .cdsapirc file, as @bettina-gier mentioned, which has to be saved in my home directory and contains the key assigned to my CDS account.
At the moment, the downloader creates two zip files containing all the files over the time periods for the o3 and toz variables and tries to use an ESMValTool function to automatically extract them into the OBS folder.
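
The extraction step described here could look roughly like the sketch below. It uses only Python's standard `zipfile` module instead of the ESMValTool helper, and the function name `extract_archives` and the directory arguments are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, target_dir):
    """Unpack every .zip file in download_dir into target_dir.

    Returns the names of all extracted members. Illustrative stand-in
    for the ESMValTool extraction helper, not its actual implementation.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    # Sort for a deterministic order, e.g. the o3 archive before the toz one.
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            extracted.extend(zf.namelist())
    return extracted
```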

@bettina-gier
Contributor

Regarding CDS

  • I checked their forum yesterday and there was a post in November about a solution for the different URLs coming "soon"... who knows when soon is. In this post they say you can pass the URL when calling the client; I did not check whether it's possible to pass only the URL and not the key.
  • The .cdsapirc file in the home folder was always needed for the CDS downloaders, so that is fine. The only problem is that we don't want to change the URL in someone's home folder file through ESMValTool.
  • I also saw they split the format key from e.g. format: 'netcdf.zip' to data_format: 'netcdf' and download_format: 'unarchived'. Not sure if this applies here, I didn't look into the code. In case you didn't want to download zips you have to unpack.

Labels
approved by technical reviewer in scientific review REF Important for the CMIP Rapid Evaluation Framework (REF)

4 participants