timedelta64 #12

d-v-b · 2025-05-06T11:54:34Z

This PR adds timedelta64, based on the numpy data type of the same name.

Zarr v2 deferred to numpy's data type semantics, which meant that Zarr v2 users could transparently create arrays using numpy's timedelta64 data type. The data type defined in this PR enables the same usage pattern for Zarr v3. This will be valuable for Zarr v2 users who intend to migrate their data to Zarr v3, and for numpy users who want a simple way to store their data in Zarr v3 arrays.

Thus, the goal of this PR is not to specify an excellent data type for representing temporal durations. We should evaluate this spec based on how well it captures the semantics already defined by the numpy timedelta64 data type.

partially addresses #11

rabernat · 2025-05-06T13:16:03Z

data-types/timedelta64/README.md

+## Fill value representation
+
+`timedelta64` fill values are represented as one of:
+- a JSON number with no fraction or exponent part that is within the range `[-2^63, 2^63 - 1]`. 
+- the string `"NaT"`, which denotes the value `NaT`. 
+
+> Note: the `NaT` value may optionally be encoded as the JSON number `-9223372036854775808`, i.e., `-2^63`. 


I don't understand this part. Here it seems like the user can configure their own custom fill value? If so, shouldn't fill_value be in configuration? And what would be the use case for that?

Isn't it simpler if we just say that, like numpy, the integer -2^63 represents NaT?

What I'm trying to say here is that the following two cases are the only acceptable fill values:

"fill_value" : <a JSON integer in the range [-2^63, 2^63 - 1]>

"fill_value" : "NaT"

With one degenerate case:

"fill_value": "NaT"

has the same meaning as

"fill_value": -9223372036854775808

The statement "timedelta64 fill values are represented as one of" is intended to mean "there are two possible forms for the fill_value metadata". Maybe I should make this clearer. I definitely don't want to convey that users can configure a custom fill value.
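To make the two accepted forms concrete, here is a minimal sketch in plain Python of how a reader might decode the fill_value metadata. The function name `decode_fill_value` and the constants are hypothetical, not part of the spec or of any Zarr library, and the sketch assumes the metadata has already been parsed with `json.loads`.

```python
import json

# Assumption: NaT is stored as the smallest int64, matching numpy's convention.
NAT_SENTINEL = -2**63
INT64_MAX = 2**63 - 1


def decode_fill_value(raw):
    """Return the int64 value denoted by a parsed `fill_value` entry.

    `raw` is the Python object produced by json.loads for the metadata field.
    """
    # Form 1: the string "NaT".
    if raw == "NaT":
        return NAT_SENTINEL
    # Form 2: a JSON integer within the int64 range. json.loads yields a float
    # for numbers with a fraction or exponent part, so those are rejected.
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(raw, int) and not isinstance(raw, bool):
        if NAT_SENTINEL <= raw <= INT64_MAX:
            return raw
    raise ValueError(f"invalid timedelta64 fill value: {raw!r}")


# The degenerate case: both spellings of NaT decode to the same value.
from_string = decode_fill_value(json.loads('"NaT"'))
from_number = decode_fill_value(json.loads("-9223372036854775808"))
assert from_string == from_number == NAT_SENTINEL
```

Neither form lets a user choose which integer stands for NaT; the string is only an alternative spelling of the fixed sentinel.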

normanrz · 2025-05-06T17:11:18Z

This PR looks good. Are you ready to have it merged or are you still looking for more feedback from the community?

rabernat

I think I get it now. Thanks for walking me through the fill value stuff.

rabernat · 2025-05-06T17:23:50Z

data-types/timedelta64/README.md

+| Y        | year   |
+| M        | month   |


Just noting that year and month are super problematic as units because they don't actually have a fixed duration (leap years, variable months). I would hate to see us proliferating data with this encoding into the world. But I guess if the goal is numpy compatibility, we should leave them in.
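The variable-length point can be checked with the standard library alone. This sketch is only illustrative: it uses Python's `calendar` module rather than numpy, and the variable names are made up for the example.

```python
import calendar

# Days in two consecutive calendar years: 2024 is a leap year, 2023 is not.
days_per_year = {
    year: 366 if calendar.isleap(year) else 365 for year in (2023, 2024)
}

# Days in a few months of 2025. monthrange returns
# (weekday of the first day, number of days in the month).
days_per_month = {
    month: calendar.monthrange(2025, month)[1] for month in (1, 2, 4)
}

print(days_per_year)   # {2023: 365, 2024: 366}
print(days_per_month)  # {1: 31, 2: 28, 4: 30}
```

A count of years or months therefore cannot be turned into a count of days or seconds without knowing which calendar years or months are meant.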

100% agree that the numpy definition is problematic. But I think there's value in a data type that numpy users (or zarr v2 users) can adopt without thinking. We should specify a less problematic, more generally useful datetime data type in a separate PR.

Maybe it would be useful to rename this data type to numpy.timedelta64 to signal the intent that it is only meant for compatibility?

numpy.timedelta64 is actually my preferred name, but iirc @rabernat was not a fan.

since this naming concern affects all the numpy dtypes, we should resolve that conversation in #4.

jbms · 2025-05-06T17:44:36Z

For general use I'd suggest a more general "unit" mechanism rather than a data type but this seems reasonable for numpy compatibility.

Note that "year" and "month" still seem like very plausibly useful units even though they can't be precisely converted to seconds --- for example you may have a table listing the ages of people in years, or of children/infants in months. The source data may well not contain any more precise information anyway.

Technically this issue also exists with every other unit because datetime64 excludes leap seconds.

d-v-b · 2025-05-06T19:32:29Z

> This PR looks good. Are you ready to have it merged or are you still looking for more feedback from the community?

I think we should keep this open for a few days at a minimum. I'm very open to feedback on certain things (e.g., should it be named timedelta64 or numpy.timedelta64), and there's not really a rush to get this merged from my POV

d-v-b · 2025-05-09T15:14:36Z

this data type is now identified as numpy.timedelta64.

d-v-b added 6 commits May 6, 2025 13:36

add timedelta64 data type

46809a5

clarify step size lower bound

30f7e46

prose

9442288

prose

6279747

lint and prose and typos

7d29fb8

use scale factor consistently

0d708de

rabernat reviewed May 6, 2025

update fill value section

deedf6a

d-v-b commented May 6, 2025

d-v-b added 2 commits May 6, 2025 17:20

reflow text

e2676ad

fix typo

967f4c2

rabernat approved these changes May 6, 2025

d-v-b mentioned this pull request May 6, 2025

datetime64 #14

Merged

use numpy prefix

c8432bf

normanrz approved these changes May 12, 2025

normanrz merged commit b256e2b into zarr-developers:main May 15, 2025

normanrz mentioned this pull request May 16, 2025

datetime64 and timedelta64 #11

Closed

normanrz mentioned this pull request Jun 2, 2025

Define codec for LZW compression #16

Draft
