Member

@Manishearth Manishearth commented Oct 18, 2025

Fixes #6459, fixes #7026

This is an attempt to define calendars in terms of my position in #6970 (comment).

For each calendar, I have attempted to first give enough information to unambiguously identify the calendar. Typically, this means mentioning whether it is lunar or solar, talking a little bit about the leap situation, and if it is a civil or otherwise officially-used calendar, mentioning a country that it is the official calendar of in 2025.

For solar calendars, I have identified when the calendar was first introduced and explicitly called it out as being proleptic before that.

For lunar calendars, I have attempted to unambiguously identify the exact algorithm when there is one. When there is not, I have defined the calendar as the ground truth in a region for a given range of dates, and also specified what we do outside that range. There is some room to adjust what we guarantee versus what we do now. I have also attempted to specify the ways in which these implementations may change in the future.

///
/// This calendar is intended to represent the traditional Chinese lunar calendar as used
/// officially in the People's Republic of China as of 2025. This takes a best-effort approach
/// to match past and future dates as used in the region for the year 1900 onwards.
Member

Issue: don't name a specific year, because then we need to debate "why 1900". Just say that it intends to match ground truth dates over an arbitrarily long range. Then in the next paragraph we can name 1900 as part of the data source.

Member Author

This is deliberate. I am defining the calendar as matching ground truth in a particular time period, as its core definition. And then on top of that we specify additional behaviors.

I could instead choose to define it as matching ground truth for a different time period. I would prefer not to define it as matching ground truth for an unspecified period, but we could do that too. This is the core definitional discussion we need to have.

I want these definitions to be useful, so actually giving a minimum that is the core definition is good. If you'd like I can change this number to 1950. I think we should have a number where "best effort" is correct.

Member Author

@Manishearth Manishearth Oct 18, 2025

Defining this as 1950 and potentially increasing it in the future is one easy 2.1 option.

The answer to "why 1900" is that the ground truth 1900 onwards is easily available in multiple sources so there is very little chance of needing to deal with discrepancies (which is the problem when attempting to define a calendar: you don't want to deal with "well these people handled the calendar this way, and these people handled it this way"). That does start to become a problem the further back you go.

Member

Right, "why 1900" is "the ground truth 1900 onwards is easily available in multiple sources". That's not how we define the calendar, though. The definition is that it is a calendar that matches ground truth in China. The implementation constraint is that we are choosing to use the data source that has data for 1900 onwards.

Member Author

@Manishearth Manishearth Oct 20, 2025

That's fine, but this statement is serving a different purpose that is important to the definition.

Specifically, it is important to define it as something that we make a best effort attempt to match ground truth in 2026, and 2027, and so on, that does not change meaning as time moves on. This distinguishes it from a calendar that is attempting to be stable in behavior even when projected ground truth changes.

So I do not wish to say "for future dates", I wish to say "best effort attempt to match ground truth from $year onwards". I don't have a strong reason to care about what $year is, it should just not be something we need to update every year. It could be 1900, or 1950, or 2000, or 2025 (kept static). I think picking a year further in the past is nice because it gives people an idea of what they can consider stable, but we also have a separate section on stability anyway.

If you can offer wording for the definition that works for that, I can get rid of this bit.

Member Author

Shane convinced me to instead just say "best effort attempt for future dates".
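
As a rough illustration of the split discussed in this thread (checked data for a known range of years, a best-effort algorithm elsewhere), here is a minimal sketch. The names `YearSource`, `DATA_YEARS`, and `source_for_year` are hypothetical and are not ICU4X APIs. The start year 1900 comes from the data source mentioned in the thread; the end year 2025 is an assumption for illustration only.

```rust
use std::ops::RangeInclusive;

/// Hypothetical marker for how a given year's month layout is obtained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum YearSource {
    /// The year is covered by data checked against published ground truth.
    GroundTruthData,
    /// The year is outside the data range, so an algorithm approximates it.
    AlgorithmicFallback,
}

/// Assumed data range: 1900 is named in the thread, 2025 is illustrative.
const DATA_YEARS: RangeInclusive<i32> = 1900..=2025;

/// Decide which source backs a year, identified here by its related ISO year.
fn source_for_year(related_iso_year: i32) -> YearSource {
    if DATA_YEARS.contains(&related_iso_year) {
        YearSource::GroundTruthData
    } else {
        YearSource::AlgorithmicFallback
    }
}
```

Under this framing, the range is an implementation constraint that can grow over time, while the definition ("matches ground truth, best effort for future dates") stays fixed.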

/// # Precise definition and limits
///
/// This calendar is defined algorithmically as a solar calendar that has 13 months, with a leap day in
/// the 13th month every 4 years, as used by the Coptic orthodox church as of 2025. This calendar extends proleptically
Member

Nit, here and elsewhere: rather than 2025, name the year the calendar was adopted in an official status

Member Author

This was a deliberate choice: it is a calendar disambiguator in case the church decides to change things around. It's supposed to be modern. I could say "in 2025" instead of "as of 2025" if that makes it clearer.

"What calendar did $entity use in 2025" is an easy thing to verify. "what calendar did $entity use in $far_past_year" is often hard, especially since identity is tricky. There have been many splits and rejoins of countries and churches and this avoids dealing with that problem when we are doing a baseline disambiguation.

Member Author

It also avoids the problem of discovering later that the Coptic church actually tweaked the precise algorithm in 1000 AD or something, or potentially made a mixup with incorrect calculations (similar to the pre-4 AD Julian calendar). An algorithm tweak happened multiple times with Hebrew; I am not 100% convinced it has never happened with the other calendars.

ICU4X encodes useful calendars that are in modern use (plus Julian). When attempting to unambiguously identify a calendar, talking about its modern use is more important than talking about it in terms of when it was introduced, because what we care about is modern use.

Member

I'd prefer saying "as currently used by the Coptic church". Otherwise next January we have to either update them all to 2026 or our library will look outdated

Member Author

No, "as currently used by the Coptic church" means that this is defined as "whatever the church uses" and is prone to issues if the church ever decides to discard this calendar.

I could pick 2000 instead if people want a nice round number that is clear as an arbitrary number.

We should not be updating the date every year. This is a part of the disambiguation, "the calendar used by the Coptic church in 2025" is unambiguous and will remain so. "the calendar currently used by the Coptic church" does not have that property.

Member

They've used this for almost 2000 years now, I would consider this very stable.

Member

I don't think we should focus on who uses these calendars in 2025. The Coptic calendar was used by the Coptic church since its inception, that's its whole identity. Saying it's used by the Coptic Church "as of 2025" sounds like they keep changing calendars, which is not true. If you see this code in 2026, you have to wonder what calendar the Coptic church uses currently, which is not what this documentation should make you wonder.

Member Author

> They've used this for almost 2000 years now, I would consider this very stable.

I could say "in at least the period XXX CE - 2025 CE" but that makes it sound like they stopped in 2025.

I do not wish to make definite claims about the future here.

Member Author

Shane convinced me to say "as of the publication date of this crate".
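
For reference, the arithmetic in the doc comment under review (13 months, with a leap day in the 13th month every 4 years) can be sketched as follows. This is a minimal illustration, not the crate's implementation: the function names are hypothetical, and the choice of `year % 4 == 3` as the leap-year condition is an assumption taken from common calendrical references rather than from this PR.

```rust
/// Twelve 30-day months plus one short 13th month.
const MONTHS_PER_YEAR: u8 = 13;

/// Assumption: leap years are those with `year % 4 == 3`.
/// `rem_euclid` keeps the rule well defined for proleptic (non-positive) years.
fn is_coptic_leap_year(year: i32) -> bool {
    year.rem_euclid(4) == 3
}

/// Days in a month, or `None` if the month number is out of range.
fn coptic_days_in_month(year: i32, month: u8) -> Option<u8> {
    match month {
        1..=12 => Some(30),
        // The leap day lands in the 13th month.
        13 => Some(if is_coptic_leap_year(year) { 6 } else { 5 }),
        _ => None,
    }
}

/// Total days in a year, summed from the month lengths.
fn coptic_days_in_year(year: i32) -> u16 {
    (1..=MONTHS_PER_YEAR)
        .filter_map(|month| coptic_days_in_month(year, month))
        .map(u16::from)
        .sum()
}
```

Because the rule is purely arithmetic, it extends proleptically in both directions without any reference to who used the calendar in a given year.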

@Manishearth Manishearth requested a review from sffc October 18, 2025 07:10
Member

@robertbastian robertbastian left a comment

I'm confused by the different levels of details for the "precise definition" of the different calendars. But then, if we link to the Wikipedia page of the calendar, do we also need to provide a "precise definition" if that's unambiguous (Gregorian et al)?

I also don't like classifying calendars as "algorithmic" and (implicitly) non-algorithmic. We don't provide a single non-algorithmic calendar (the only non-algorithmic calendar I'm aware of is an observational Hijri). We sometimes hardcode data because we don't implement the algorithms, but the calendars are still very much algorithmic.

/// # Precise definition and limits
///
/// This calendar is intended to represent the traditional Chinese lunar calendar as used
/// officially in the People's Republic of China as of 2025. This takes a best-effort approach
Member

let's try to be generic over "China", this is also used in Taiwan

Suggested change
- /// officially in the People's Republic of China as of 2025. This takes a best-effort approach
+ /// officially in China as of 2025. This takes a best-effort approach

Member Author

I chose a polity because I am trying to unambiguously indicate a particular calendar.

Member

I think "China" is better because the calendar is used throughout the region. If different sub-regions adopt different rules, then we might want to get more specific with the naming. And in that case, I might say "in Beijing" rather than "in China".

Member Author

I decided that saying "China" is fine, and if the two polities ever diverge here we have sufficient leeway to decide what to do then.

/// # Precise definition and limits
///
/// This calendar is intended to represent the traditional Chinese lunar calendar as used
/// officially in the People's Republic of China as of 2025. This takes a best-effort approach
Member

It's not a "best-effort approach" from the year 1900 onwards. It "matches ground truth" from the year 1900 onwards.

Member Author

No it doesn't. It is a best effort attempt to match ground truth from 1900 onwards. It succeeds for the years 1900-2025. We cannot make sure statements beyond that. I can frame that better, but then we get to the problem you are already complaining about where there is a year in the docs we need to keep updating.

Member Author

Based on feedback from Shane, I've changed this to be "best effort for future dates"

Comment on lines 66 to 67
/// This calendar is defined algorithmically as a solar calendar with a leap month every 4 years, as used
/// by the Roman Empire since 1 CE.
Member

"as used by the Roman Empire since 1 CE" is a weird thing to say. the Roman empire doesn't exist anymore, many different entities have used this calendar for centuries, but this is all already explained in previous paragraphs.

Member Author

Yeah, this could be framed better.

///
/// # Precise definition and limits
///
/// This calendar is defined algorithmically as a solar calendar with a leap month every 4 years, as used
Member

not a very precise definition of the calendar

///
/// # Precise definition and limits
///
/// This calendar is defined as a solar calendar which uses the astronomical vernal equinox as its new year, and
Member

how many months? how long?

Member Author

Is it necessary to be that precise in each calendar? Expressing some characteristics and a polity where it is official is pretty unambiguous.

@Manishearth
Member Author

> I also don't like classifying calendars as "algorithmic" and (implicitly) non-algorithmic. We don't provide a single non-algorithmic calendar (the only non-algorithmic calendar I'm aware of is an observational Hijri). We sometimes hardcode data because we don't implement the algorithms, but the calendars are still very much algorithmic.

"algorithmic" is about the definition of the calendar. The Gregorian calendar is defined as a precise, computable algorithm. The United States deciding to add a day to a year does not change that.

This is not true for UAQ or Chinese, where while there is an algorithm we implement, we are also following ground truth.
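
To make the "defined as a precise, computable algorithm" point concrete, here is a minimal sketch of the Gregorian leap-year rule. The function names are hypothetical and this is not presented as ICU4X's code. The point is only that the whole definition fits in a pure function of the year, with no lookup against what any authority announced.

```rust
/// Gregorian rule: every 4th year is a leap year, except century years,
/// unless the century year is divisible by 400.
fn is_gregorian_leap_year(year: i32) -> bool {
    year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)
}

/// Year length follows directly from the leap-year rule.
fn gregorian_days_in_year(year: i32) -> u16 {
    if is_gregorian_leap_year(year) {
        366
    } else {
        365
    }
}
```

A calendar that follows ground truth cannot be reduced to a function like this alone, because the authoritative answer for a given year may differ from what the implemented algorithm predicts.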

> But then, if we link to the Wikipedia page of the calendar, do we also need to provide a "precise definition" if that's unambiguous (Gregorian et al)?

Not necessarily, I was trying to be as redundant as possible. This did end up with different levels of detail. I don't think consistency across calendars is important here when it comes to level of detail.

Comment on lines +55 to +56
/// This calendar generically covers any pure lunar calendar used liturgically in Islam,
/// with 12 months each of length 29 or 30, with an epoch intended to mark the Hijrah in 622 CE.
Member

Note: I'm fine saying that the epoch is the Hijrah, but I thought we previously said we needn't make that commitment, which is why we added ECMA reference year functions to the hijri::Rules trait.

Member Author

I think from a calendrical definition perspective saying the epoch is Hijrah is probably good.
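
As one concrete instance of "12 months each of length 29 or 30", here is a minimal sketch of a tabular (arithmetic) Hijri variant. It is illustrative only: the function names are hypothetical, it does not use or model the `hijri::Rules` trait, and the 30-year leap cycle chosen here, `(14 + 11 * year) % 30 < 11`, is an assumption. It is one of several leap-year patterns in use, and observational or government-published calendars will differ from it.

```rust
/// Assumed leap rule: 11 leap years in each 30-year cycle.
fn is_tabular_hijri_leap_year(year: i32) -> bool {
    (14 + 11 * i64::from(year)).rem_euclid(30) < 11
}

/// Days in a month, or `None` if the month number is out of range.
fn tabular_hijri_days_in_month(year: i32, month: u8) -> Option<u8> {
    match month {
        // In a leap year the 12th month gains a day.
        12 if is_tabular_hijri_leap_year(year) => Some(30),
        // Otherwise odd months have 30 days and even months have 29.
        1..=12 => Some(if month % 2 == 1 { 30 } else { 29 }),
        _ => None,
    }
}

/// A tabular year has 354 days, or 355 in a leap year.
fn tabular_hijri_days_in_year(year: i32) -> u16 {
    if is_tabular_hijri_leap_year(year) {
        355
    } else {
        354
    }
}
```

Any rule set of this family satisfies the generic description in the doc comment; which month lengths a particular year actually gets is what distinguishes one variant from another.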

@Manishearth Manishearth requested a review from sffc October 20, 2025 23:13
Development

Successfully merging this pull request may close these issues.

Buddhist: should it have an April new year before 1941?
Document calendar validity ranges

3 participants