Standard names: *ocean model grid information* #219

Open
jmecki opened this issue Sep 8, 2024 · 106 comments
Labels
CMIP7 Vocabulary proposals for CMIP7 variables standard name (added by template) Requests and discussions for standard names and other controlled vocabulary

Comments

@jmecki

jmecki commented Sep 8, 2024

Before submitting an issue be sure you have read and understood the rules for vocabulary changes and review the guidance for constructing standard names

Please note that it is fine to group together a number of proposals in a single GitHub issue (i.e. it is not necessary to open a separate issue for each vocabulary term). Change proposals should include the following information as applicable.

Proposer's name
Jenny Mecking

Date
Sept. 8 2024

For each term please try to give the following:

- Term
eastward_ocean_gridbox_length_at_t_point

- Description
The length of the gridbox in the east/west direction for the t-point gridbox

- Units (If applicable).
m

- Term
northward_ocean_gridbox_length_at_t_point

- Description
The length of the gridbox in the north/south direction for the t-point gridbox (i.e. dy on t-points)

- Units (If applicable).
m

- Term
eastward_ocean_gridbox_length_at_u_point

- Description
The length of the gridbox in the east/west direction for the u-point gridbox

- Units (If applicable).
m

- Term
northward_ocean_gridbox_length_at_u_point

- Description
The length of the gridbox in the north/south direction for the u-point gridbox (i.e. dy on u-points)

- Units (If applicable).
m

- Term
eastward_ocean_gridbox_length_at_v_point

- Description
The length of the gridbox in the east/west direction for the v-point gridbox

- Units (If applicable).
m

- Term
northward_ocean_gridbox_length_at_v_point

- Description
The length of the gridbox in the north/south direction for the v-point gridbox (i.e. dy on v-points)

- Units (If applicable).
m

@jmecki jmecki added add to cfeditor (added by template) Moderators are requested to add this proposal to the CF editor standard name (added by template) Requests and discussions for standard names and other controlled vocabulary labels Sep 8, 2024

github-actions bot commented Sep 8, 2024

Thank you for your proposal. These terms will be added to the cfeditor (http://cfeditor.ceda.ac.uk/proposals/1) shortly. Your proposal will then be reviewed and commented on by the community and Standard Names moderator.

@JonathanGregory
Contributor

Dear Jenny @jmecki

It is interesting that standard names haven't previously been defined for the horizontal size of gridcells. Usually this information is not necessary to store as a data variable, because you can calculate it when needed from the bounds of the latitude and longitude coordinate variables. Please could you describe why these standard names are needed?
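For a regular latitude-longitude grid, the calculation from coordinate bounds mentioned here can be sketched as follows. This is only an illustration under stated assumptions: a spherical Earth with an assumed radius, 1-D bounds arrays, and a hypothetical helper name `cell_widths_from_bounds`. It is not a procedure defined by CF.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # assumed spherical Earth radius, in metres


def cell_widths_from_bounds(lat_bnds, lon_bnds, radius=EARTH_RADIUS_M):
    """Return (dx, dy) in metres for a regular latitude-longitude grid.

    lat_bnds has shape (ny, 2) and lon_bnds has shape (nx, 2), in degrees.
    dx is measured along the parallel through the cell centre.
    """
    lat_bnds = np.radians(np.asarray(lat_bnds, dtype=float))
    lon_bnds = np.radians(np.asarray(lon_bnds, dtype=float))
    lat_centre = lat_bnds.mean(axis=1)
    dlat = np.abs(lat_bnds[:, 1] - lat_bnds[:, 0])
    dlon = np.abs(lon_bnds[:, 1] - lon_bnds[:, 0])
    # Broadcast the 1-D increments to 2-D (ny, nx) fields.
    dy = radius * dlat[:, None] * np.ones(dlon.size)[None, :]
    dx = radius * np.cos(lat_centre)[:, None] * dlon[None, :]
    return dx, dy


dx, dy = cell_widths_from_bounds(
    lat_bnds=[[-0.5, 0.5], [59.5, 60.5]],
    lon_bnds=[[0.0, 1.0], [1.0, 2.0]],
)
```

The widths obtained this way describe the geometry of the grid only; they cannot reflect any value that a model has altered by hand.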

Best wishes and thanks

Jonathan

@ChrisBarker-NOAA

I'd note also that these would only apply for north-east aligned rectangular grids.

maybe there's a more general standard name that could be used?

@japamment japamment added the CMIP7 Vocabulary proposals for CMIP7 variables label Sep 11, 2024
@jmecki
Author

jmecki commented Sep 12, 2024

Thank you for your responses, it's the first time I've proposed variable names so I might have missed something when I looked things up.

In response to Jonathan's comment: while you can compute it from the latitude and longitude (bounds/vertices), models often artificially narrow channels, for example the Strait of Gibraltar and channels in the Indonesian Throughflow region, so the computed values are incorrect there. This is especially true for models whose ocean resolution is similar to that of the models used in CMIP6 (i.e. 1 degree).

In response to Chris Barker's comment, what I had in mind were the lengths/widths of the grid boxes which if they are on an irregular grid then they wouldn't be strictly northward or eastward but along the model grid lines.

Maybe the names:
ocean_gridbox_length_at_t_point
ocean_gridbox_width_at_t_point
ocean_gridbox_length_at_u_point
ocean_gridbox_width_at_u_point
ocean_gridbox_length_at_v_point
ocean_gridbox_width_at_v_point

However, how length and width are defined might be unclear.

@ChrisBarker-NOAA

models often artificially narrow channels

very interesting -- I had no idea. So yes, this does make sense to me as something to capture in a standard_name

Is "gridbox" the correct term, however? Though I have no idea what a better term might be:

channel_width? or some such?

However, how length and width are defined might be unclear.

yes, that's a trick -- "x" and "y" -- I can't recall if those terms are currently used to mean "logical x [y]" rather than literal.

@atreguier

The convention so far in standard names is to use x and y, e.g., "ocean_heat_x_transport: "x" indicates a vector component along the grid x-axis, positive with increasing x"

@ChrisBarker-NOAA

thanks @atreguier -- then "x" and "y" do seem to mean the correct thing in this context.

@martinjuckes

Could this information be provided using the mesh topology attributes defined in Appendix K of the CF conventions?

@JonathanGregory
Contributor

Dear Jenny

I'm a bit puzzled. In answer to my question above you agreed you can get gridcell dimensions from the cell bounds, but you say that these are not necessarily the "right" answers. Surely they are the true answers for the grid, even if not for reality. Is it perhaps not the gridcell dimensions that you want to record and give standard names to, but the real distances across straits etc?

Best wishes

Jonathan

@taylor13

Hi Jenny,

I would find it helpful if you could describe how a model actually treats the processes in these "special" grid cells, where somehow the equations governing the ocean are modified to take into account a channel that is narrower than what the cell can represent. Is there a reference for how they are treated?

thanks,
Karl

@atreguier

Dear Karl, dear Jenny,
I know that adjusting the width of a one-grid-point-wide strait (narrowing it to reduce the volume flux) was often done in the NEMO model. It may be referenced in the manual, I am not sure. Even when no "ad hoc" modification has been made, in NEMO the "scale factors" (the dx, dy used in the discretized equations) are not exactly the distances between grid points (to machine accuracy), because they are computed analytically from the equations describing the grid. So, in order to close the mass budgets, we need to use the original "scale factors" of the model. I do not know how it is for other models, especially finite volume models with triangular meshes. It would certainly be interesting for modelling groups to publish on ESGF everything a user needs to compute budgets.
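The difference between an analytically computed scale factor and a distance measured between grid points can be illustrated with a simple case. The sketch below assumes a spherical Earth and a regular latitude-longitude grid, which is not how NEMO defines its grid; the function names are hypothetical. It compares the arc length along a parallel with the great-circle distance between the same two points.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # assumed spherical Earth radius, in metres


def analytic_scale_factor(lat_deg, dlon_deg, radius=EARTH_RADIUS_M):
    """Arc length along a parallel: R * cos(lat) * dlon."""
    return radius * np.cos(np.radians(lat_deg)) * np.radians(dlon_deg)


def great_circle_distance(lat_deg, dlon_deg, radius=EARTH_RADIUS_M):
    """Great-circle distance between two points on the same parallel."""
    half = 0.5 * np.radians(dlon_deg)
    return 2.0 * radius * np.arcsin(np.cos(np.radians(lat_deg)) * np.sin(half))


lats = np.array([0.0, 45.0, 80.0])
e1 = analytic_scale_factor(lats, 1.0)
dist = great_circle_distance(lats, 1.0)
# Zero at the equator, small but well above rounding error elsewhere.
relative_difference = (e1 - dist) / e1
```

The two measures agree at the equator and drift apart towards the poles. The discrepancy is small for a single cell, but it is systematic, which is why it matters when budgets are summed over many cells.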

@jmecki
Author

jmecki commented Sep 17, 2024

Thank you Anne-Marie, I agree with what you are saying, and I am also most familiar with NEMO. The variables I'm looking for are basically the equivalent of the NEMO variables 'e1t, e2t, e1u, e2u, etc...', while the e3* variables are a bit easier to estimate (I also opened issue 220 for these). Not having these has led to difficulties in estimating transports in models. I think just having widths of channels will not be sufficient, because the channels with artificial narrowing might end up being different in different models and resolutions. I tried estimating the grid box dx/e1* and dy/e2* (lengths and widths) at T, U and V points from the information provided in the CMIP5 and CMIP6 databases, and have found that several models have this issue, but mainly the NEMO-based ones.


This issue has had no activity in the last 30 days. Accordingly:

  • If you proposed this issue or have contributed to the
    discussion, please reply to any outstanding concerns.
  • If there has been little or no discussion, please comment
    on this issue, to assist with reaching a decision.
  • If the proposal seems to have come to a consensus, please
    wait for the moderators to take the next steps towards
    acceptance.

Standard name moderators are also reminded to review @feggleton @japamment @efisher008

@github-actions github-actions bot added the moderator attention (added by GitHub action) Moderators are requested to consider this issue label Oct 18, 2024
@jmecki
Author

jmecki commented Oct 20, 2024

Has there been a decision made about having a name for this? It would be useful to have one even if all models might not be able to provide this field.

Would a possible suggestion now be:

ocean_gridbox_x_length_at_t_point
ocean_gridbox_y_length_at_t_point
ocean_gridbox_x_length_at_u_point
ocean_gridbox_y_length_at_u_point
ocean_gridbox_x_length_at_v_point
ocean_gridbox_y_length_at_v_point

@JonathanGregory
Contributor

Dear Jenny

Speaking for myself, I don't think it's yet clear what these quantities actually are. Please could you describe them in geophysical terms, as we need for standard names? How do you use them in a calculation, for example? That might clarify the issue. I think that "gridbox lengths" would generally be understood as the distance between coordinates, as I interpreted it before.

Also, I don't think we would mention T, U and V points in standard names. CF doesn't have a notion of grid arrangements, to which that refers. (Maybe it should, as has been discussed a few times, but that's not the issue here.) I think we need to give a name to a distance needed for some purpose; the gridpoints it applies to should be obvious from the coordinates or bounds.

Please don't give up. These discussions are often difficult, but usually achieve a good result!

Best wishes

Jonathan

@github-actions github-actions bot removed the moderator attention (added by GitHub action) Moderators are requested to consider this issue label Oct 21, 2024
@fmassonn

Hi,

I agree that having these terms would be great to recompute fluxes, or to compute fluxes over non-standard straits.

François

@taylor13

I'm trying to understand how your model handles a single grid cell or column of single cells (representing a strait). For a strait separating a northern and southern body of water, are the equations applied there the same as for grid cells elsewhere (but, of course, with no east-west transport)? Or is that grid cell narrowed to the size of the actual strait?

If the equations are applied to a grid cell that is narrower than the nearby cells, then I would say your grid is not a simple latxlon grid, but rather a special 2-d grid which is mostly identical to a latxlon grid except for a few grid cells. You could represent the true grid as described in section 5.2 of the conventions and define the cell shapes following the approach described in section 7.1 under Bounds for 2-D coordinate variables with 4-sided cells. The true area of the grid cell (as represented in the model) could then be calculated using these bounds for 2-D coordinate variables. You wouldn't be able to correctly calculate the grid cell area (which is useful for several purposes) if your longitude bounds as defined in the model formulation are not correctly given everywhere (even for the strait cells).

Of course, I might be misinterpreting how the model handles these special "strait" grid cells. If it solves the equations assuming the strait cells are narrower than nearby cells, then I would think there would be real problems calculating the vertical exchanges between the ocean and the atmosphere. Are those calculations also performed assuming the cells are narrow? If so, then the land model must have wider than normal cells on either side of the strait. That seems unlikely, so maybe the above is all wrong.

@taylor13

taylor13 commented Oct 28, 2024

@Mecki Hi Jenny, Somehow I got an email from you that was, I think, supposed to be posted here, so I've copied it here:

Hi Jonathan and Taylor,

I hope this helps answer some of the questions.

I would use these variables as follows:

To compute the volume transport along a constant line in the y/j direction:

$volTrans(t)=\int_{xmin}^{xmax} \int_{H}^{surface}$

Perhaps something has been lost in the rendering. (width of strait)*(depth of strait) will give you the cross-sectional area (in the x-z plane) spanning the strait (separating a more-or-less northern ocean basin from a southern ocean basin). To get a volume transport you would multiply this by a velocity, but that's missing from your formula.

The following remains unclear to me: Has the velocity reported in model output been calculated by the model on a regular grid and then inflated to account for the fact that the strait is narrower than the grid cell? If that's the case, then I see that you must know what the model assumed to be the true strait width if you want to calculate a volume transport.

@ChrisBarker-NOAA

I edited the TeX in the post -- I think I got it right, but please do correct it if I misunderstood what it was supposed to be.

But as to the topic -- in some models (ROMS), what is computed is the flux through a grid cell -- though it may be provided in the output as a velocity -- in that case, you'd need to know the cross sectional area of the channel -- and if that doesn't match the area of the cell face, you'd need to know that -- is that what this is for?

@jmecki
Author

jmecki commented Oct 28, 2024

I edited the TeX in the post -- I think I got it right, but please do correct it if I misunderstood what it was supposed to be.

But as to the topic -- in some models (ROMS), what is computed is the flux through a grid cell -- though it may be provided in the output as a velocity -- in that case, you'd need to know the cross sectional area of the channel -- and if that doesn't match the area of the cell face, you'd need to know that -- is that what this is for?

Thanks, I was really struggling with the latex in github and accidentally posted it before it was finished. I will post a more complete response once I have it typed up better.

@jmecki
Author

jmecki commented Nov 8, 2024

Hi All,

Sorry it took me a while to respond.

A simple example for which I would use dx on the v grid points would be to compute the volume transport through a section as follows:

[Image: screenshot of the volume transport equation, "Screenshot 2024-11-08 at 16 04 40"]

where you would have to compute dx on the v points either from the vertices or from the centre points of the grid boxes, and the computation would differ depending on the Arakawa grid used in the model. While this isn't too complex, it gets more complex when changing to temperature/heat and salinity/freshwater transports, where you have to move all the data (temperature, salinity and velocity) to the same grid point, typically along the grid box edge of the t-points. This again depends on the model grid used and on having dx, dy and dz on the different grid points (i.e. t, u, v, etc.).
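The heat transport case described above, where temperature has to be moved to the velocity points, might look roughly like this. It is a minimal sketch assuming an Arakawa C-grid in which each row of v points lies between two rows of t points, simple two-point averaging, and constant reference density and heat capacity. The function name, array layout and constants are assumptions, not taken from any particular model.

```python
import numpy as np

RHO0 = 1026.0  # assumed reference density, kg m-3
CP = 3991.9    # assumed specific heat capacity, J kg-1 K-1


def northward_heat_transport(theta_t, v, dxv, dzv, rho0=RHO0, cp=CP):
    """Heat transport (W) across each row of v points.

    theta_t : (nz, ny, nx) temperature on t points, degC
    v       : (nz, ny - 1, nx) y velocity on v points, m s-1 (0 on land)
    dxv     : (ny - 1, nx) x length of the v cells, m
    dzv     : (nz, ny - 1, nx) thickness of the v cells, m
    """
    # Move temperature to the v points by averaging the two neighbouring rows.
    theta_v = 0.5 * (theta_t[:, :-1, :] + theta_t[:, 1:, :])
    flux = rho0 * cp * v * theta_v * dxv[None, :, :] * dzv
    # Sum over depth and along the row to get one value per v row.
    return flux.sum(axis=(0, 2))


theta_t = np.full((2, 3, 4), 10.0)
v = np.full((2, 2, 4), 0.1)
dxv = np.full((2, 4), 1000.0)
dzv = np.full((2, 2, 4), 50.0)
heat_transport = northward_heat_transport(theta_t, v, dxv, dzv)
```

Every factor in the product lives on the v points, which is why dx and dz are needed there and not only on the t points.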

In terms of the comment related to ROMS, yes this is something that it would be used for.

Furthermore, as mentioned before if the grid boxes have an artificially narrowed point for some channels, often estimating dx, dy, and dz give the wrong value leading to spikes that shouldn't be there...

In Mecking and Drijfhout 2023 in the methods we made the comment: 'In global and Indo-Pacific computations of OHT there are spikes at the latitudes which are impacted by Indonesian throughflow in some models. Several ocean models which do not have high enough horizontal resolution to resolve the narrow ocean channels, like the ones present in the Indonesian throughflow, artificially narrow these channels but only on either the U- or V-grid points. The information available in the CMIP5/6 archives only contains information about the grid size on the T-grid points. For the models where these spikes occur, we removed the data from the latitudes where the spikes appear.'

While it is possible to compute dx and dy from the grid box vertices this is not always provided and it adds extra computations and may not give the correct values, especially if there are artificially narrowed channels. Please correct me if I'm wrong, I'm not a model developer, but from the ocean model code I have looked at, both Arakawa-B and Arakawa-C grid models use dx and dy values at the t, u and v grid points in the computations, as opposed to the vertices. I believe that it would still be very useful to have names for these variables even if there might be workarounds.

@ChrisBarker-NOAA

Furthermore, as mentioned before if the grid boxes have an artificially narrowed point for some channels, often estimating dx, dy, and dz give the wrong value leading to spikes that shouldn't be there...

While it is possible to compute dx and dy from the grid box vertices this is not always provided and it adds extra computations and may not give the correct values, especially if there are artificially narrowed channels.

I think that "artificially narrowed channels" is the key point here -- otherwise, the grid is well defined, and I don't think we need standard names. But if there's been an adjustment in the model, such that you can't compute the correct flux from the grid geometry, then you absolutely need to know the "virtual" channel size.

So I support the idea -- but have no opinion about how to exactly spell the name :-)

@atreguier

Hi everybody, I just realized that this procedure to artificially narrow some straits is well explained in the NEMO manual, with a nice figure. Here it is:
nemo_manual_handmade_grid_corrections.pdf

@JonathanGregory
Contributor

Dear Jenny @jmecki et al.

Thanks for explaining. I agree that working out dx and dy on velocity points from the tracer grid is quite intricate, so it's convenient to record them as data variables with metadata to identify them. We can regard dx and dy as metrics of the grid. We already have a standard name for dz, namely cell_thickness. It would be consistent to use standard names of cell_x_width and cell_y_width for dx and dy respectively. The coordinates of the variable containing the quantity show what grid it's on, so that shouldn't be in the standard name. To make the correct variables easier to find, we could define thickness, x_width and y_width as keywords for the cell_measures attribute, which currently supports only area and volume.

However, if straits have effective widths that differ from what the gridpoint coordinates indicate, that's not a metric of the grid, and you can't work them out. I don't exactly understand your equation, because of three appearances of v on the right-hand side. I think we can write the northward volume transport in m3 s-1 through a section at latitude row $j$ as

$$V(j,t) = \sum_{i=\mbox{west}}^{\mbox{east}} \sum_{k=\mbox{surface}}^{\mbox{bottom}} \delta V(i,j,k,t)$$

where

$$\delta V(i,j,k,t) = v(i,j,k,t)\ dx'\ dz$$

is the contribution from gridbox $(i,j,k)$ to the transport, where $dx'$ is the effective width of the cell for transport and $dx'\leq dx$, the cell x-width. Is that right? If so,

$$dx' = \frac{\delta V/dz}{v}$$

This is the ratio of the ocean_volume_y_transport $\delta V$ (an existing standard name, unit m3 s-1) per unit thickness $1/dz$ to sea_water_y_velocity $v$ (an existing standard name, unit m s-1). Hence we could give $dx'$ the standard name ratio_of_ocean_y_transport_per_unit_thickness_to_sea_water_y_velocity. Would that make sense to you?
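The relations above can be checked numerically for a single section at fixed $j$ and $t$. The sketch below is only illustrative: the function names and the array layout (levels first, then $i$) are assumptions, and cells with zero velocity are returned as NaN because $dx'$ cannot be recovered there.

```python
import numpy as np


def volume_transport(v, dx_eff, dz):
    """Return (delta_V, V) for one section.

    v, dz  : (nk, ni) y velocity (m s-1) and thickness (m)
    dx_eff : (ni,) effective x width dx' used for transport (m)
    """
    delta_v = v * dx_eff[None, :] * dz
    return delta_v, delta_v.sum()


def effective_width(delta_v, v, dz):
    """Recover dx' = (delta_V / dz) / v wherever v is non-zero."""
    width = np.full(v.shape, np.nan)
    ok = v != 0.0
    width[ok] = delta_v[ok] / dz[ok] / v[ok]
    return width


v = np.array([[0.1, 0.2], [0.0, 0.3]])
dz = np.array([[10.0, 10.0], [20.0, 20.0]])
dx_eff = np.array([1000.0, 500.0])

delta_v, total = volume_transport(v, dx_eff, dz)
recovered = effective_width(delta_v, v, dz)
```

Here the second column stands for a narrowed cell: its effective width of 500 m is recovered from the transport and the velocity, even though it could not be derived from the coordinates.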

Best wishes

Jonathan

PS How marvellous that you can write $\TeX$ in markdown. I didn't know that before!

@jmecki
Author

jmecki commented Nov 11, 2024

Hi All, sorry about all the v's in the equation. That's my bad: in my head I think of dx on the v grid points as dxv, and similarly dy as dyv and dz as dzv. I should have been more precise and defined this. I like Jonathan's suggestion about using cell_x_width and cell_y_width, that makes it clear for me.

@atreguier

Hi @JonathanGregory, I did not follow the beginning of the discussion, so I'm not sure how I can help... If more description is needed, where must these descriptions/definitions be submitted?
@jmecki, I agree that cell_x_length_for_tracers is more appealing to the NEMO community (we talk about the "T" points as "tracer" points), but I am afraid this may lead to discussions like "what is a tracer? which tracers? which units?" etc. If there is no danger of this happening, then yes, I prefer "tracers" as well.

@jmecki
Author

jmecki commented Feb 14, 2025

@japamment can tracers and/or examples of tracers be mentioned in the description, regardless of whether for_tracers or for_temperature is used?

@atreguier I worry the other way around, that people might not understand that for_temperature should also be used for salinity...

@taylor13

It appears that many of the names proposed apply only to specific types of grids, so should the "grid type" be included in the standard name? There may be a problem also defining these in terms of different associated variables (e.g., tracers vs. velocities). It's true that velocities may be calculated on cell interfaces, but values are often reported (for some uses) at the same location as the tracers. So cell_x_length_for_x_velocity would depend on which of the two grids was used.

I wonder if the "ugrid" conventions might be a better way to describe more completely the grid structure.

Perhaps for now, we should not try to address the general "grid description" problem and deal with the seemingly special case (see top of this issue) of how to indicate the modified width of some grid cells (in certain models), which is done in an attempt to better represent processes in narrow straits (e.g. vertical mixing that depends on vertical shear in horizontal velocity).

@atreguier

Hi,

I like to compute quantities in a way consistent with the model code; it is necessary when you want to close budgets. In every model on a C grid, the grid lengths are used in the discretized equations. cell_x_length_for_y_velocity is called "e1v" in NEMO, "dxCv" in MOM6, etc. I have not seen any way to submit those variables on ESGF. The grid cells are not exact rectangles. An approximate computation of these lengths (as opposed to taking the ones that were actually used by the model) sometimes results in sizeable errors when, e.g., you re-compute a meridional overturning or a transport across many cells. This happens even if there has not been any local modification of the cell lengths.

If there was a way to submit these grid variables to ESGF, I suppose that they would be found useful for ocean analyses of all the models that use rectangular grids. Jenny's proposal of the new variables dxu, dxt, etc. has been welcomed by the "ocean and sea-ice" CMIP7 data request group.

I am not sure that I understand well the relation between CF names and CMOR names. Could there be fewer CF names than CMOR names? For example: a generic "cell_x_length_for_rectangular_grid" that would be used for variables Ofx.dxu, dxv, dxt, and a generic "cell_y_length_for_rectangular_grid" that would be used for variables Ofx.dyu, dyv, dyt?

@JonathanGregory
Contributor

Dear Anne Marie, Jenny, Karl, Adam et al.

The purpose of standard names is to distinguish and identify quantities in scientific terms. The description of grids isn't part of the function of standard names; that's done by the coordinates. Therefore I don't think the standard names should include any information referring to Arakawa grids or describing the grid type. For the same kind of reason, for instance, air_temperature always has the same standard name regardless of whether it's on height levels, pressure levels or any other kind of vertical coordinate surface.

This relates to Anne Marie's last question. There is no relationship between CF standard names and CMOR names. Exactly as you say, there are fewer CF standard names than CMOR names. CMOR names correspond to distinct data requests, and involve the choice of grid as well as the scientific quantity in their definition.

I agree that it's important to be able to compute quantities in a way which is consistent with model code. That is why CF provides ways to store areas and volumes of cells, and that's why you want to store and identify the relevant grid lengths. For this purpose, it makes sense to provide data variables with standard names cell_x_length and cell_y_length, both (y,x) where x and y are the horizontal dimensions, to contain the values actually used by the model for each gridcell to calculate derivatives, fluxes and so on. If there is more than one grid, there will be different cell-length data variables for each grid (e.g. the four grids shown in Adam's diagram).

It is fine to have four variables with the standard name cell_x_length, although they can't have the same netCDF variable name of course if they're in the same file. They can be distinguished by the dimensions or coordinates. That's a bit laborious, however. If it's useful, we could add a new feature to the CF convention, so that the appropriate metric variables could be linked to each data variable using the cell_measures attribute. We do this for cell area and volume, for the same reason of convenience. It would look something like this on an Arakawa B-grid:

float opottemp(level,yt,xt);
  opottemp:standard_name="sea_water_potential_temperature";
  opottemp:units="degC";
  opottemp:cell_measures="area: areacellot x_length: dxt y_length: dyt";
float uo(level,yu,xu);
  uo:standard_name="sea_water_x_velocity";
  uo:units="m s-1";
  uo:cell_measures="area: areacellou x_length: dxu y_length: dyu";
float dxt(yt,xt);
  dxt:standard_name="cell_x_length";
  dxt:units="m";
  dxt:long_name="distance across tracer cells in the x direction";
float dyt(yt,xt);
  dyt:standard_name="cell_y_length";
  dyt:units="m";
  dyt:long_name="distance across tracer cells in the y direction";
float areacellot(yt,xt);
  areacellot:standard_name="cell_area";
  areacellot:units="m2";
  areacellot:long_name="area of tracer cells";
float dxu(yu,xu);
  dxu:standard_name="cell_x_length";
  dxu:units="m";
  dxu:long_name="distance across velocity cells in the x direction";
float dyu(yu,xu);
  dyu:standard_name="cell_y_length";
  dyu:units="m";
  dyu:long_name="distance across velocity cells in the y direction";
float areacellou(yu,xu);
  areacellou:standard_name="cell_area";
  areacellou:units="m2";
  areacellou:long_name="area of velocity cells";
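To show how an analysis tool could use such an attribute, here is a minimal sketch that splits a cell_measures string into its "measure: variable" pairs. The helper name `parse_cell_measures` is hypothetical, and the x_length and y_length keywords are the ones proposed in this comment; they are not part of the current CF conventions, which support only area and volume.

```python
def parse_cell_measures(attr):
    """Split 'area: areacellot x_length: dxt' into {'area': 'areacellot', ...}."""
    tokens = attr.split()
    if len(tokens) % 2 != 0:
        raise ValueError(f"unpaired token in cell_measures: {attr!r}")
    measures = {}
    for key, name in zip(tokens[0::2], tokens[1::2]):
        # Each measure keyword must be followed by a colon.
        if not key.endswith(":"):
            raise ValueError(f"expected 'measure:' but found {key!r}")
        measures[key[:-1]] = name
    return measures


measures = parse_cell_measures("area: areacellou x_length: dxu y_length: dyu")
```

With this mapping, the metric variables for uo are found by name (dxu, dyu, areacellou) without having to compare dimensions or coordinates.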

For a lot of this discussion we were talking about "modified" cell lengths, say dxu_strait and dyu_strait. I understood that these are like dxu and dyu (computed as differences between tracer gridpoints), except that they take narrow channels into account, so that at some gridcells they are smaller than dxu and dyu. Computing the flux through the face of a cell in the x-direction involves the product of x_velocity and dyu_strait, for instance.

The proposed standard names cell_x_length_for_y_velocity and cell_y_length_for_x_velocity would be appropriate for dxu_strait and dyu_strait. They're different quantities from the usual cell_x_length and cell_y_length, because they aren't gridpoint separations, they're values used for computing fluxes. Is that right? The for_[xy]_velocity part of the name is there to indicate their scientific use, not to indicate which grid they're on. Included in the above example, they would look like this:

float dxu_strait(yu,xu);
  dxu_strait:standard_name="cell_x_length_for_y_velocity";
  dxu_strait:units="m";
  dxu_strait:long_name="distance across velocity cells for calculating fluxes in the x direction";
float dyu_strait(yu,xu);
  dyu_strait:standard_name="cell_y_length_for_x_velocity";
  dyu_strait:units="m";
  dyu_strait:long_name="distance across velocity cells for calculating fluxes in the y direction";

Does this make sense?

Happy weekend

Jonathan

@taylor13

taylor13 commented Feb 14, 2025

Earlier today (before Jonathan's post), I started to compose a comment about this but got interrupted. My comment started:

Apologies if I've missed something, but if this issue is about defining characteristics of cells and not simply recording the cell widths that are modified for certain purposes, then I think we should consider expanding "cell_measures" to include an option like "effective_cross_section". Just as the cell_measures can record the area of a cell, we could have it record the "width of the cell orthogonal to the direction of the dimension".

So I think that two of us at least support exploring the use of cell_measures* to record this sort of information. Like "cell area" the cell width is useful for those analyzing data.
Karl

*corrected JMG

@atreguier

Hi @JonathanGregory,

The solution to have "cell_x_length" and "cell_y_length" as standard names seems fine to me. I don't think we need "modified" cell lengths, because once the cells are modified before running the model, the original values are never used. dyu would be the y-width of "u" cells, used to calculate zonal transport everywhere, whatever the way it has been computed or modified.

The issue of modified lengths is one of the motivations to publish e.g., dyu on esgf, because in that "modified" case if you try to re-compute the dyu approximately from the coordinates of neighbouring points you will not be able to recover something that could approximate the "true" value used by the model at that specific location.
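One way to see where this matters in practice is to compare the lengths published by a model with an estimate made from the coordinates, and flag the points where they disagree. This is only a sketch: the function name and the 1% tolerance are arbitrary assumptions, and the estimate itself would come from whatever coordinate-based calculation the analyst uses.

```python
import numpy as np


def flag_modified_lengths(length_model, length_estimated, rtol=0.01):
    """Boolean mask of cells where the model length differs from the estimate."""
    length_model = np.asarray(length_model, dtype=float)
    length_estimated = np.asarray(length_estimated, dtype=float)
    return np.abs(length_model - length_estimated) > rtol * length_estimated


# Hypothetical values in metres: one cell has been narrowed to 20 km.
dyu_model = np.array([[100e3, 100e3], [20e3, 100e3]])
dyu_estimated = np.full((2, 2), 100e3)
modified = flag_modified_lengths(dyu_model, dyu_estimated)
```

Such a comparison can locate the affected cells, but only the published dyu gives the value the model actually used there.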

I have never used the "cell_measure" attribute myself, but your suggestion seems great.

@JonathanGregory
Contributor

@atreguier, wouldn't you need the unmodified cell lengths (= distances between points on another grid) to calculate derivatives consistently with the model code?

@atreguier

Hi @JonathanGregory, here is the page in the NEMO manual that explains the narrowing of straits: https://github.com/user-attachments/files/17686527/nemo_manual_handmade_grid_corrections.pdf .
To my knowledge, this is done when the width of one single grid point is significantly larger than the width of the channel in the real world. So, only channels that are one grid point wide are narrowed, and there is land on each side, so the derivative of the along channel velocity in the cross channel direction is never used anywhere. The velocity derivatives that are computed using the modified scale factors are all masked because they fall in land.

@JonathanGregory
Contributor

Thanks, @atreguier. I see, yes, the unmodified cell length is not used in the model code. However, that manual page explains that the cell area is still the product of the unmodified cell lengths and hence the cell volume depends on the unmodified length too. For that reason I think we ought to stick to what we'd discussed before: define the plain cell_x_length and cell_y_length standard names for the lengths that relate only to the geometry of the grid, and also define cell_x_length_for_y_velocity and cell_y_length_for_x_velocity for the modified ones used for flux calculations. Is that OK for you and Jenny?

As Karl @taylor13 and I suggested, we can also define new cell_measures to make it easy to find the right variables for the grid of interest. That would be a conventions enhancement, not a standard name issue.

@atreguier

atreguier commented Feb 18, 2025

@JonathanGregory, you quote "the cell area is still the product of the unmodified cell lengths and hence the cell volume depends on the unmodified length too". However, the volume mentioned here is the volume of the tracer cell, dxt*dyt. The two lengths at T point, dxt and dyt are never modified. The manual says that only the lengths at u or v points can be modified without losing consistency (heat conservation, for example). The "unmodified" dyu and dxv are never used and should never be used when analyzing the model results.

What you propose will be a bit confusing for NEMO users, because the information about unmodified dyu, dxv is not kept around in the files we use (meshmask, meshgrid...). This information no longer exists. For a NEMO user, there is only one dyu array and one dxv array. Does your proposal imply that we would need to define different CMOR names for modified/unmodified dyu and dxv? That would complicate things, I think. An automatic analysis program based on the CMOR name "dyu" would fail when the width has been modified at some grid point, because in that case the variable "dyu" would not be published and would be replaced by "dyu_modified" or something like it. This would be a pity. Jenny is on vacation until March 17th, she can't answer now, but I think that her intent was to facilitate model analysis, not complicate it.

The process of modifying a strait width by hand is the same thing as modifying the bathymetry to ensure that a strait is open, or deep enough in the model's world. All modelling groups modify the bathymetry, it is necessary when you run a 1° model. I don't think anybody has suggested that the model groups publish their original (unmodified) bathymetry on esgf. What is needed for the analysis is the bathymetry that the model has been run with. Same for the dyu, dxv: what we need on esgf is the variables that should be used to analyse the model.

Does this analogy help make my case?

@JonathanGregory
Contributor

Dear Anne Marie @atreguier

Thanks for further explanation. I'm sure that no-one wants to complicate the work of analysis - certainly not me. I have been seeking to clarify whether there were any purposes for which the unmodified cell length would be needed, because earlier in this discussion it was not clear. If the modified cell length is used for all purposes, both in analysis and the model code, and you can't imagine a physically useful reason for needing the unmodified grid spacing instead, then I agree with you that we only need cell_x_length and cell_y_length standard names for the NEMO data. The description should define them as being the extent of the cell in the relevant dimension according to its cell boundaries and the model code. It would probably be helpful to remark in the description that they are not necessarily equal to the spacing of points on another grid of the model.

Best wishes

Jonathan

@atreguier

Thanks!
We are all set then!
Please let me know if I need to do something in the coming weeks, while Jenny is away.
Cheers,
Anne Marie

@ChrisBarker-NOAA

which is what confused me -- e.g. there are two U values, on the opposite edges of the cell. Those edges are not necessarily the same length. Is it correct to use the (presumably average) value for the "width" of the overall cell? In practice, cells aren't all that distorted, so it probably doesn't matter, but we should be clear about what a "cell length" really means.

Hi @ChrisBarker-NOAA , it's a grid, so the two U values have different indices and, as you say, the cell lengths that correspond to each of these points are not necessarily the same. For NEMO, T(i,j) is equidistant between U(i,j) and U(i-1,j):

And the cell_x_length and cell_y_length would be defined on the U and V points?

If so, then I think that this is all OK, but I'm not completely thrilled with the terminology. The problem is that the U and V "points" are really on the edges of the cell (sometimes treated as points, sometimes as the full edge, e.g. for flux).

So I think the quantity we are talking about is not the "cell length", but really the "edge length"
-- but so far, in CF, we don't have any way to talk about cell edges -- just cells [*].

So the question is -- if CF adds the concept of edges some day (e.g. something like SGRID) -- would it still work for the standard names of cell_?_length to be used for this same quantity?

I'm thinking probably yes, so we're OK.

[*] -- well, we do have edges in the UGRID spec -- and in that case, "cell length" makes no sense, but "edge length" does (though there's less need for that -- unstructured grids are designed to avoid these challenges :-)

@davidhassell
Collaborator

Hello,

I think that the descriptions posted earlier, which provide a really useful reference, have been superseded by the subsequent discussion. Might it be possible for someone with all this in their head to post some new standard name descriptions?

Thanks,
David

@atreguier

Hi, we can start with the proposal by @JonathanGregory

cell_x_length
Units: m
Description: "Cell" refers to a model grid-cell. "Cell_x_length" is the length of the grid-cell in the x direction of the model grid.

cell_y_length
Units: m
Description: "Cell" refers to a model grid-cell. "Cell_y_length" is the length of the grid-cell in the y direction of the model grid.

We could add: "These lengths are used to compute transports, integrals and derivatives. In the case of staggered grids different model variables have different locations in space, and multiple cell_x_lengths can be defined in each cell, centered on the different variables (tracers or velocities). These lengths may not be exactly equal to the distance between two variables on adjacent grid cells. "

@JonathanGregory
Contributor

Dear @atreguier et al.

Thanks for your helpful text. I have some suggestions:

  • Although these are most likely to be used for model data, they could apply to any gridded dataset. Perhaps we could omit the first sentence? ("'Cell' refers to a model grid-cell.")

  • Because "length" might not be self-explanatory, I suggest we use a different phrase to describe it in the second sentence. For instance, "cell_x_length is the distance across the grid-cell in the x-direction of the grid."

  • Since CF doesn't associate the staggered grids, we don't have a notion of corresponding cells on different grids, so it's not quite correct to say that we'd define multiple lengths for "each cell".

  • It occurs to me that the point of the final sentence doesn't only apply to staggered grids. It's possible that even on an A-grid you might have cell lengths which aren't equal to grid point separations. Hence I would suggest separating these two remarks:

Although the locations of the grid points may be geometrically related, the cell lengths are not in general required to be equal to the spacing of grid points. In the case of staggered grids (for instance, an Arakawa B-grid), different variables in a dataset may have different locations in space (e.g. tracers or velocities), and each grid has its own cell lengths.

Best wishes

Jonathan

@davidhassell
Collaborator

cell_x_length is the distance across the grid-cell in the x-direction of the grid.

Isn't it the case that cell_x_length is a distance that is representative of the grid cell size in the x-direction of the grid? This is acknowledging cases where there is no unique distance (such as will almost always be the case for cells defined in spherical polar coordinates)

@JonathanGregory
Contributor

Yes, I agree, that's a better way to say it, @davidhassell. Thanks.

@atreguier

agreed as well, thanks.

@ChrisBarker-NOAA

Although the locations of the grid points may be geometrically related, the cell lengths are not in general required to be equal to the spacing of grid points. In the case of staggered grids (for instance, an Arakawa B-grid), different variables in a dataset may have different locations in space (e.g. tracers or velocities), and each grid has its own cell lengths.

Hmm -- there's only one grid -- "each grid" doesn't make sense here.

maybe: "different variables in a dataset may be defined at different locations on the cells (e.g. tracers or velocities), and each location on the cell can have different cell lengths"

???

which is why I'm uncomfortable with "cell" here -- it may not be the length of a cell, but rather the length of a cell edge -- but no one else seems to share that discomfort, so here we are.

@taylor13

I also think of one grid of cells with cell vertices and edges and areas and such. But I haven't thought of another term (yet).

@atreguier

Hi,

The lengths may not all be easily defined as "edges". In rectangular grids the area may be defined by the product of two lengths that are not "edges". In NEMO, areacello (cell centered on a T point) =dxt*dyt (lengths centered at T point, in the x and y directions).
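As a small illustration of the relation areacello = dxt*dyt, here is a sketch with made-up values. The 2x3 arrays and the helper `cell_area` are hypothetical; real scale factors would be read from the model's mesh files.

```python
# Hypothetical T-point scale factors in metres (rows are j, columns are i)
dxt = [[100_000.0, 100_000.0, 100_000.0],
       [90_000.0, 90_000.0, 90_000.0]]
dyt = [[110_000.0, 110_000.0, 110_000.0],
       [110_000.0, 110_000.0, 110_000.0]]


def cell_area(dx, dy):
    """Element-wise product of the two T-point lengths, giving areas in m^2."""
    return [[x * y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(dx, dy)]


# Area of each T cell as the product of two lengths that are not cell edges
areacello = cell_area(dxt, dyt)
print(areacello[0][0], areacello[1][2])
```

Neither dxt nor dyt is the length of an edge of the T cell here; both are centred on the T point, which is why "edge length" would not cover this case.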

@davidhassell
Collaborator

Some thoughts on "cells" ...

In the CF data model, the data of one variable are defined on the cells of a grid (i.e. the "domain"), and each cell describes the extent (typically spatio-temporal, but not necessarily) to which a value of that data applies. The data of any other variables are of no consequence here. A cell is $N$-dimensional ($N\ge 0$), and may even be non-contiguous (e.g. climatological cells, geometry cells). The edges or vertices of a cell with 1 or more dimension are not themselves cells - rather they are lower-level elements that define the cell's extent. This is not to say that those same vertices or edges can't be re-purposed to form the cells for a different variable's data.

(Aside: UGRID explicitly links up to three different grids into one "mesh" - those defined by nodes, the edges connecting those nodes (i.e. vertices), and the faces enclosed by those edges. This allows you to know the relationship between two variables defined at different mesh locations, but does not mean that data defined on any one of these grids is also defined on the other two.)

If a cell is 1-d (an "edge") then what does cell_x_length mean - is it the length of the X component of the vector that you'd follow when traveling between the edge's two vertices? Or would a name of cell_length be more appropriate, defined as the distance along the edge?

@JonathanGregory
Contributor

JonathanGregory commented Feb 20, 2025

Dear all

Putting together and elaborating the words which David and I previously suggested, for the definition we could say

cell_x_length is the linear extent of the cell in the x-direction of the grid. For a rectangular cell whose sides are parallel to the Cartesian axes of a plane, the cell_x_length is the distance between the two sides in the x-direction. For the usual case of horizontal grids in geoscience, where the grid cells are not planar or rectangular, the cell_x_length is a distance which is representative of the extent of the cell in the x-direction. Generally it is not uniquely defined, but for a model dataset it should be a distance which is consistent with the model's own computation of extensive quantities e.g. fluxes or areas.

That's quite a lot, but is it clear? Is it what you have in mind, Anne Marie @atreguier?

Chris is right that my reference to "grids" isn't satisfactory, since there's only one grid for a given data variable. For that part of the text, how about

Although the locations of the grid points may be geometrically related, the cell lengths are not in general required to be equal to the spacing of grid points. When a dataset contains variables on more than one grid (for instance, the Arakawa B tracer and velocity grids), each grid has its own cell lengths.

Best wishes

Jonathan

@ChrisBarker-NOAA

The lengths may not all be easily defined as "edges". In rectangular grids the area may be defined by the product of two lengths that are not "edges". In NEMO, areacello (cell centered on a T point) =dxt*dyt (lengths centered at T point, in the x and y directions).

here's another problem then -- if the area can not be calculated from the geometry of the cell, then it should be specified.

As @davidhassell said:
"cell_x_length is a distance that is representative of the grid cell size"
-- and this is the key problem:

If a cell is not rectangular, then there is no one x-length -- hence the "representative" -- but representative of what? apparently this is well defined for NEMO, but is it the same for all models? Or for that matter all potential non-rectangular cells?

Earlier on in this discussion, there was the idea of adding a "for_velocity" or some such, making it clear what it means.

If we do want to keep it this simple, then some language to convey:

x_length and y_length are the lengths of an equivalent rectangular cell, and can be used for any calculation where the dimensions of the cell are needed.

Is that what we are trying to do here?

@atreguier

Hello,
I don't understand what you mean by "equivalent rectangular cell". We wish to have the exact lengths for the cells that a given model is using.

Let's consider a typical ocean model, with the equations discretized on a curvilinear orthogonal grid on the sphere, and the variables staggered in space following what we call an Arakawa B or C grid (most ocean models have these characteristics: GFDL and ACCESS, NCAR, NEMO-based models in Europe, NorESM, all the Chinese models...). When making calculations using the model results, it is more accurate to use the exact lengths that were used in the model equations. The point of having a CF name is to make it possible for modelling groups to publish these exact lengths. Of course these lengths can be approximated from grid point positions, but the approximation is not good enough for some types of analyses.
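To make this concrete, here is a sketch of a volume transport computed through a strait that is one grid point wide, once with a modified width and once with a width approximated from grid point positions. All numbers are invented, and `e3u` and `umask` (layer thickness and land/sea mask at U points) are NEMO-style names assumed for illustration; they are not part of this proposal.

```python
def zonal_transport_sv(u, dyu, e3u, umask):
    """Sum u * dyu * e3u over the wet U points of a column; result in Sv (1e6 m^3/s)."""
    total = 0.0
    for vel, width, thickness, wet in zip(u, dyu, e3u, umask):
        total += vel * width * thickness * wet
    return total / 1.0e6


# Hypothetical single-column strait with three vertical levels
u = [0.5, 0.2, -0.1]        # zonal velocity in m/s
e3u = [10.0, 20.0, 50.0]    # layer thickness in m
umask = [1, 1, 0]           # the bottom level is land

dyu_model = [20_000.0] * 3   # width used by the model (hand-narrowed, made up)
dyu_geom = [100_000.0] * 3   # width approximated from neighbouring coordinates (made up)

transport_model = zonal_transport_sv(u, dyu_model, e3u, umask)
transport_geom = zonal_transport_sv(u, dyu_geom, e3u, umask)
print(f"{transport_model:.2f} Sv vs {transport_geom:.2f} Sv")
```

In this toy case the two transports differ by the ratio of the two widths, so an analysis based on the approximated width would not match the transport the model actually carried.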

@ChrisBarker-NOAA

When making calculations using the model results, it is more accurate to use the exact lengths that were used in the model equations.

of course -- but what are they???

In a logically rectangular, curvilinear grid, the cells are non-rectangular quads -- there are four vertices, and four edges -- each edge has a length, the cell has an area, but there is no one "cell length" or "cell width".

So what is used in the calculations??? I have no idea if all the, e.g. Arakawa C grid models do the same thing, but if you want to conserve mass, you need to know the flux through the cell edges, yes? So each edge is going to have a length (or area in 3D) -- if that's what you want, shouldn't we call it something like the edge length, not cell length?
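A sketch of the flux argument for a single tracer cell on a C-grid, assuming a 2D layer of uniform thickness. The function `cell_divergence` and all values are hypothetical and are not taken from any particular model; the lengths are the ones attached to the four velocity points around the cell.

```python
def cell_divergence(u_west, u_east, v_south, v_north,
                    dy_west, dy_east, dx_south, dx_north, area):
    """Net outflow per unit area (1/s) of one tracer cell, from its four face fluxes."""
    outflow = (u_east * dy_east - u_west * dy_west
               + v_north * dx_north - v_south * dx_south)
    return outflow / area


# Hypothetical cell: same velocity on both x faces, but the east face is narrowed
div_narrowed = cell_divergence(
    u_west=0.1, u_east=0.1, v_south=0.0, v_north=0.0,
    dy_west=100_000.0, dy_east=50_000.0,
    dx_south=100_000.0, dx_north=100_000.0,
    area=1.0e10,
)

# Same cell with equal face lengths: the uniform flow is non-divergent
div_uniform = cell_divergence(
    u_west=0.1, u_east=0.1, v_south=0.0, v_north=0.0,
    dy_west=100_000.0, dy_east=100_000.0,
    dx_south=100_000.0, dx_north=100_000.0,
    area=1.0e10,
)
print(div_narrowed, div_uniform)
```

The budget only closes if the analysis uses the same face lengths as the model, whatever name those lengths end up with.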

What I've seen from ROMS output is that "tracers" are computed/defined on what ROMS calls the "rho points" -- which are at the center of the cells. What does it mean to have a single "cell_x_length" defined on those points? It seems that is the length of an equivalent rectangular cell (that is, you can compute an area from it) -- but if you want area, maybe save the area?

On the other hand, on the "U points" or "V points" -- those are on the edges of the cells -- and to compute, e.g. flux, you'd need the length of that edge. IIUC, the proposal is to call that length the "cell_x_length" which seems potentially confusing to me. Also, a given edge is either the x direction or the y direction, so hopefully no one will define the x_length on a y-edge ....

The fundamental problem here is that CF has no way to fully define a curvilinear grid -- we can define the cells themselves (what ROMS calls the Psi grid) but not how the different pieces fit together (what some folks are calling different "grids" -- because CF has no way to define one grid with different pieces). It does talk about cell connectivity, but I don't think that the concept of an edge as its own thing is there.

If the geometry of the grid (e.g. the size of the cell) can be fully computed from its nodal coordinates, then we're OK -- but this whole topic is about what to do when that's not the case.

Which is (one of) the reasons we introduced the UGRID standard -- yes, you can theoretically fully describe a UGRID as a set of independent cells with vertices, but it's a whole lot more efficient, and useful, to describe the mesh as a mesh, including how the pieces fit together.

Again -- I'm not intending to block consensus here -- if most folks are happy with this, then fine, go ahead -- but as a user of these types of model results, this feels like a confusing and not-complete solution to me.

And ultimately, I'd like to see a complete solution, such as the SGRID proposal, introduced to CF.

(https://sgrid.github.io/sgrid/)

I don't think we should block this name proposal while we wait for that, but maybe keep it in mind when finalizing the names?

To the NEMO folks -- have you looked at SGRID? It would be great to know if that would work for NEMO as well and how -- there are currently examples from Delft3D, ROMS, and WRF.

@davidhassell
Collaborator

Thanks, Chris. I agree with your concerns about non-rectangular cells. I'd be fine with restricting the description to rectangular cells, so as to avoid such ambiguities. (Tripolar grids (like NEMO) have rectangular cells, but they get very stretched and twisted in the northern high latitudes - again questioning "what is x", perhaps?)

On the other hand, on the "U points" or "V points" -- those are on the edges of the cells

I think this can be a source of confusion. In many models, there are U and V cells whose centres coincide with the mid-points of the edges of Rho cells (or at the vertices of rho cells, depending on the staggering). We are talking about the "size" of cells, not the length of the common interface between adjacent cells. Right?

I need to remind myself about SGRID ...

@ChrisBarker-NOAA

On the other hand, on the "U points" or "V points" -- those are on the edges of the cells

I think this can be a source of confusion. In many models, there are U and V cells whose centres coincide with the mid-points of the edges of Rho cells (or at the vertices of rho cells, depending on the staggering).

I'm thinking of C-grid models, where, yes, this is the case. But I finally figured out the (my) confusion.

IIUC:

In CF, a "cell" can be 1D, or 2D, or 3D.

This is why we used the terms "face" and "volume", rather than "cell" in UGRID (and SGRID)

-- and a "grid" or "mesh" really isn't a concept at all, it's just a bunch of cells, for which you have to assume or compute the connectivity (though there are some features that make that easy for the usual cases).

So for a curvilinear, quadrilateral, logically rectangular model grid, you have quadrilateral 2D cells, defined by the "psi points" -- those are the faces, in SGRID terms.
(Link to the classic ROMS grid diagram, for reference)
the "rho points" are at the center of those quads, and the U and V points are on the edges of those quads.

But in CF terms, there is no overall "grid" -- only cells, so in that case, the rho points are on the 2D quadrilateral cells defined by the psi points, and the U and V points are on 2D cells defined also by psi points.

Though that gets back to my point -- if we want to know the length of the edge that, e.g. the U points are on (to compute a flux, for instance) then that's the length of a 1D cell, in CF parlance, not the x_length or y_length of a 2D cell.

But:

(like NEMO) have rectangular cells

Whoa! I'm sorry, I have been confused all along -- I thought NEMO supported curvilinear grids -- but if indeed, it is only rectangular, then the whole cell_x_length and cell_y_length does make sense! Sorry for all the noise caused by my confusion.

One question though:

In the case of an artificially narrowed cell (which is what this is all about, yes) -- where exactly is the narrowing applied?

What I mean is that if a cell is narrowed, then presumably the adjacent cell must be consistent -- so the edge between those two cells is what is narrowed, rather than the whole cell -- so we are back to specifying an edge length, rather than a cell length.

Or maybe I'm missing something key here.
