Add precise definitions and limits for all calendars #7123
base: main
///
/// This calendar is intended to represent the traditional Chinese lunar calendar as used
/// officially in the People's Republic of China as of 2025. This takes a best-effort approach
/// to match past and future dates as used in the region for the year 1900 onwards.
Issue: don't name a specific year, because then we need to debate "why 1900". Just say that it intends to match ground truth dates over an arbitrarily long range. Then in the next paragraph we can name 1900 as part of the data source.
This is deliberate. I am defining the calendar as matching ground truth in a particular time period, as its core definition. And then on top of that we specify additional behaviors.
I could instead choose to define it as matching ground truth for a different time period. I would prefer not to define it as matching ground truth for an unspecified period, but we could do that too. This is the core definitional discussion we need to have.
I want these definitions to be useful, so actually giving a minimum that is the core definition is good. If you'd like I can change this number to 1950. I think we should have a number where "best effort" is correct.
Defining this as 1950 and potentially increasing it in the future is one easy 2.1 option.
The answer to "why 1900" is that the ground truth 1900 onwards is easily available in multiple sources so there is very little chance of needing to deal with discrepancies (which is the problem when attempting to define a calendar: you don't want to deal with "well these people handled the calendar this way, and these people handled it this way"). That does start to become a problem the further back you go.
Right, "why 1900" is "the ground truth 1900 onwards is easily available in multiple sources". That's not how we define the calendar, though. The definition is that it is a calendar that matches ground truth in China. The implementation constraint is that we are choosing to use the data source that has data for 1900 onwards.
That's fine, but this statement is serving a different purpose that is important to the definition.
Specifically, it is important to define it as something that we make a best effort attempt to match ground truth in 2026, and 2027, and so on, that does not change meaning as time moves on. This distinguishes it from a calendar that is attempting to be stable in behavior even when projected ground truth changes.
So I do not wish to say "for future dates", I wish to say "best effort attempt to match ground truth from $year onwards". I don't have a strong reason to care about what $year is, it should just not be something we need to update every year. It could be 1900, or 1950, or 2000, or 2025 (kept static). I think picking a year further in the past is nice because it gives people an idea of what they can consider stable, but we also have a separate section on stability anyway.
If you can offer wording for the definition that works for that, I can get rid of this bit.
Shane convinced me to instead just say "best effort attempt for future dates".
/// # Precise definition and limits
///
/// This calendar is defined algorithmically as a solar calendar that has 13 months, with a leap day in
/// the 13th month every 4 years, as used by the Coptic orthodox church as of 2025. This calendar extends proleptically
Nit, here and elsewhere: rather than 2025, name the year the calendar was adopted in an official status
This was a deliberate choice: it is a calendar disambiguator in case the church decides to change things around. It's supposed to be modern. I could say "in 2025" instead of "as of 2025" if that makes it clearer.
"What calendar did $entity use in 2025" is an easy thing to verify. "what calendar did $entity use in $far_past_year" is often hard, especially since identity is tricky. There have been many splits and rejoins of countries and churches and this avoids dealing with that problem when we are doing a baseline disambiguation.
It also avoids the problem of discovering later that the Coptic church actually tweaked the precise algorithm in 1000 AD or something, or potentially made a mixup with incorrect calculations (similar to the pre-4 AD Julian calendar). An algorithm tweak happened multiple times with Hebrew; I am not 100% convinced it has never happened with the other calendars.
ICU4X encodes useful calendars that are in modern use (plus Julian). When attempting to unambiguously identify a calendar, talking about its modern use is more important than talking about it in terms of when it was introduced, because what we care about is modern use.
I'd prefer saying "as currently used by the Coptic church". Otherwise next January we have to either update them all to 2026 or our library will look outdated
No, "as currently used by the Coptic church" means that this is defined as "whatever the church uses" and is prone to issues if the church ever decides to discard this calendar.
I could pick 2000 instead if people want a nice round number that is clear as an arbitrary number.
We should not be updating the date every year. This is a part of the disambiguation, "the calendar used by the Coptic church in 2025" is unambiguous and will remain so. "the calendar currently used by the Coptic church" does not have that property.
They've used this for almost 2000 years now, I would consider this very stable.
I don't think we should focus on who uses these calendars in 2025. The Coptic calendar was used by the Coptic church since its inception, that's its whole identity. Saying it's used by the Coptic Church "as of 2025" sounds like they keep changing calendars, which is not true. If you see this code in 2026, you have to wonder what calendar the Coptic church uses currently, which is not what this documentation should make you wonder.
> They've used this for almost 2000 years now, I would consider this very stable.
I could say "in at least the period XXX CE - 2025 CE" but that makes it sound like they stopped in 2025.
I do not wish to make definite claims about the future here.
Shane convinced me to say "as of the publication date of this crate".
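For reference, here is a minimal sketch of the Coptic rule quoted at the top of this thread (13 months, with a leap day in the 13th month every 4 years). The function names are hypothetical and are not ICU4X API. The 30-day length of the first twelve months and the `year % 4 == 3` placement of leap years are assumptions taken from the usual arithmetic formulation of the calendar, not from the text of this PR.

```rust
/// Hypothetical helper (not ICU4X API): whether a Coptic year has a leap day.
/// Assumption: leap years are those with `year mod 4 == 3`. `rem_euclid` keeps
/// the rule well-defined for proleptic (zero or negative) years.
fn is_coptic_leap_year(year: i32) -> bool {
    year.rem_euclid(4) == 3
}

/// Hypothetical helper: number of days in a given month, or `None` if the
/// month number is outside 1..=13.
fn coptic_days_in_month(year: i32, month: u8) -> Option<u8> {
    match month {
        // Assumption: months 1-12 always have 30 days.
        1..=12 => Some(30),
        // The short 13th month carries the leap day.
        13 => Some(if is_coptic_leap_year(year) { 6 } else { 5 }),
        _ => None,
    }
}

/// Hypothetical helper: total number of days in a year.
fn coptic_days_in_year(year: i32) -> u16 {
    if is_coptic_leap_year(year) { 366 } else { 365 }
}
```

Under these assumptions the definition is fully computable for any year, which is the sense in which the doc comment calls the calendar "defined algorithmically".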
I'm confused by the different levels of details for the "precise definition" of the different calendars. But then, if we link to the Wikipedia page of the calendar, do we also need to provide a "precise definition" if that's unambiguous (Gregorian et al)?
I also don't like classifying calendars as "algorithmic" and (implicitly) non-algorithmic. We don't provide a single non-algorithmic calendar (the only non-algorithmic calendar I'm aware of is an observational Hijri). We sometimes hardcode data because we don't implement the algorithms, but the calendars are still very much algorithmic.
/// # Precise definition and limits
///
/// This calendar is intended to represent the traditional Chinese lunar calendar as used
/// officially in the People's Republic of China as of 2025. This takes a best-effort approach
let's try to be generic over "China", this is also used in Taiwan
- /// officially in the People's Republic of China as of 2025. This takes a best-effort approach
+ /// officially in China as of 2025. This takes a best-effort approach
I chose a polity because I am trying to unambiguously indicate a particular calendar.
I think "China" is better because the calendar is used throughout the region. If different sub-regions adopt different rules, then we might want to get more specific with the naming. And in that case, I might say "in Beijing" rather than "in China".
I decided that saying "China" is fine, and if the two polities ever diverge here we have sufficient leeway to decide what to do then.
/// # Precise definition and limits
///
/// This calendar is intended to represent the traditional Chinese lunar calendar as used
/// officially in the People's Republic of China as of 2025. This takes a best-effort approach
It's not a "best-effort approach" from the year 1900 onwards. It's "matches ground truth" from the year 1900 onwards.
No, it doesn't. It is a best-effort attempt to match ground truth from 1900 onwards. It succeeds for the years 1900-2025. We cannot make definite statements beyond that. I can frame that better, but then we get to the problem you are already complaining about, where there is a year in the docs we need to keep updating.
Based on feedback from Shane, I've changed this to be "best effort for future dates"
/// This calendar is defined algorithmically as a solar calendar with a leap month every 4 years, as used
/// by the Roman Empire since 1 CE.
"as used by the Roman Empire since 1 CE" is a weird thing to say. the Roman empire doesn't exist anymore, many different entities have used this calendar for centuries, but this is all already explained in previous paragraphs.
Yeah, this could be framed better.
///
/// # Precise definition and limits
///
/// This calendar is defined algorithmically as a solar calendar with a leap month every 4 years, as used
not a very precise definition of the calendar
///
/// # Precise definition and limits
///
/// This calendar is defined as a solar calendar which uses the astronomical vernal equinox as its new year, and
how many months? how long?
Is it necessary to be that precise in each calendar? Expressing some characteristics and a polity where it is official is pretty unambiguous.
"algorithmic" is about the definition of the calendar. The Gregorian calendar is defined as a precise, computable algorithm. The United States deciding to add a day to a year does not change that. This is not true for UAQ or Chinese, where while there is an algorithm we implement, we are also following ground truth.
Not necessarily, I was trying to be as redundant as possible. This did end up with different levels of detail. I don't think consistency across calendars is important here when it comes to level of detail.
/// This calendar generically covers any pure lunar calendar used liturgically in Islam,
/// with 12 months each of length 29 or 30, with an epoch intended to mark the Hijrah in 622 CE.
Note: I'm fine saying that the epoch is the Hijrah, but I thought we previously said we needn't make that commitment, which is why we added ECMA reference year functions to the hijri::Rules trait.
I think from a calendrical definition perspective saying the epoch is Hijrah is probably good.
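As an illustration of what "any pure lunar calendar with 12 months each of length 29 or 30" can look like in code, here is a minimal sketch with one possible rule set plugged in. The trait `LunarMonthRules` is hypothetical and is deliberately NOT the real `hijri::Rules` trait, whose methods are not shown in this thread. The rule set is the common tabular arithmetic scheme with leap years at `(14 + 11 * year) mod 30 < 11`, chosen here only as an example; it is an assumption and not something this PR specifies.

```rust
/// Hypothetical trait (NOT the real `hijri::Rules`): a rule set only has to
/// say how long each of the 12 months of a given year is.
trait LunarMonthRules {
    /// Lengths of months 1..=12; each entry is expected to be 29 or 30.
    fn month_lengths(&self, year: i32) -> [u8; 12];

    /// Total number of days in the year, derived from the month lengths.
    fn days_in_year(&self, year: i32) -> u16 {
        self.month_lengths(year).iter().map(|&d| u16::from(d)).sum()
    }
}

/// Example rule set (an assumption for illustration): a tabular scheme with
/// 11 leap years in every 30-year cycle.
struct TabularExample;

impl LunarMonthRules for TabularExample {
    fn month_lengths(&self, year: i32) -> [u8; 12] {
        // Months alternate 30/29; the 12th month gains a day in leap years.
        let mut lengths = [30, 29, 30, 29, 30, 29, 30, 29, 30, 29, 30, 29];
        if (14 + 11 * i64::from(year)).rem_euclid(30) < 11 {
            lengths[11] = 30;
        }
        lengths
    }
}
```

An observational or data-backed variant would implement the same hypothetical trait from a lookup table instead of a formula, which is why the doc comment can describe the calendar generically while leaving the month lengths to the rule set.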
Fixes #6459, fixes #7026
This is an attempt to define calendars in terms of my position in #6970 (comment).
For each calendar, I have attempted to first give enough information to unambiguously identify the calendar. Typically, this means mentioning whether it is lunar or solar, talking a little bit about the leap situation, and if it is a civil or otherwise officially-used calendar, mentioning a country that it is the official calendar of in 2025.
For solar calendars, I have identified when the calendar was first introduced and explicitly called it out as being proleptic before that.
For lunar calendars, I have attempted to unambiguously identify the exact algorithm when there is one, and if there is not, I have defined it as the ground truth in a region for a given range of dates, and also specified what we do outside that range. There's some playing around we can do with what we guarantee vs what we do now. I have also attempted to specify the ways in which these implementations may change in the future.
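A minimal sketch of the "ground truth for a given range of dates, specified behavior outside that range" shape described above. All names here (`TableBackedYears`, `Source`, `fallback`) are hypothetical and do not reflect how ICU4X stores its data; the table contents are supplied by the caller and no real calendar data is included.

```rust
/// Where an answer came from (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    /// Taken from the precomputed table that is meant to match ground truth.
    GroundTruth,
    /// Computed by the fallback rule; may change if the implementation changes.
    BestEffort,
}

/// Hypothetical container: year lengths for `first_year`, `first_year + 1`, ...
/// plus a rule to use outside the covered range.
struct TableBackedYears<'a> {
    first_year: i32,
    year_lengths: &'a [u16],
    fallback: fn(i32) -> u16,
}

impl<'a> TableBackedYears<'a> {
    /// Returns the year length and whether it came from the table.
    fn days_in_year(&self, year: i32) -> (u16, Source) {
        let known = year
            .checked_sub(self.first_year)
            // Years before the table start have a negative offset.
            .filter(|offset| *offset >= 0)
            // Years past the end of the table are simply not found.
            .and_then(|offset| self.year_lengths.get(offset as usize));
        match known {
            Some(&days) => (days, Source::GroundTruth),
            None => ((self.fallback)(year), Source::BestEffort),
        }
    }
}
```

Returning the `Source` alongside the value is one way to make the guaranteed range visible to callers; whether the real API should expose that distinction is a separate design question.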