kccqzy 17 hours ago

I wrote my own date calculation functions a while ago. And during that, I had an aha moment to treat March 1 as the beginning of the year during internal calculations[0]. I thought it was a stroke of genius. It turns out this article says that’s the traditional way.

[0]: https://github.com/kccqzy/smartcal/blob/9cfddf7e85c2c65aa6de...

  • zokier 16 hours ago

    not completely coincidentally, March was also the first month of the year in many historical calendars. Afaik that also explains why the month names have offset to them (sept, oct, nov, dec)

    edit: I just love that there are like 5 different comments pointing out this same thing

    • Mikhail_Edoshin 15 hours ago

      I've read that not only March was the first month, but the number of months was only ten: winter months did not need to be counted because there was no agricultural work to be done (which was the primary purpose of the calendar). So after the tenth month there was a strange unmapped period.

      • Dylan16807 9 hours ago

        How do you figure out it's March 1 if you're not counting days?

        • eru 5 hours ago

          Equinox or something like that?

          • Dylan16807 2 hours ago

            The precise equinox sounds fussy to measure and even then you need to know three weeks before the equinox. While counting days is very easy.

        • rusk 3 hours ago

          Druid tells you

      • _dain_ 13 hours ago

        >So after the tenth month there was a strange unmapped period.

        this is when time-travelling fugitives hide out

    • stouset 10 hours ago

      I thought Sept, Oct, Nov, and Dec were shifted by the addition of July (Julius) and August (Augustus)?

      • jayknight 10 hours ago

        That's a common misconception. Those were just renamed for the Caesars. January and February were added; before that there was just a gap in the winter.

        • rusk 3 hours ago

          What were they before? Quintember? Sextober?

          • sfblah 3 hours ago

            Basically, yes. Quintilis and Sextilis.

    • thaumasiotes an hour ago

      > not completely coincidentally, March was also the first month of the year in many historical calendars.

      And often the last month too. The early modern English calendar began the year on March 25.

      This is coincidental in relation to the offset in the names of the months. The Romans started their year in January just like we do today.

      (Though in a very broad sense, it's common to begin the year with the new spring. That's the timing of Chinese new year and Persian new year. I believe I've read that the Roman shift two months backward was an administrative reform so that the consuls for the year would have time to prepare for the year's upcoming military campaigns before it was time to march off to war.)

    • Izikiel43 14 hours ago

      > explains why the month names have offset to them (sept, oct, nov, dec)

      Everything now makes sense, I always wondered why September was the ninth month with a 7 prefix.

  • silisili 16 hours ago

    At the risk of me feeling stupid, could you briefly explain the benefit of this?

    • kccqzy 16 hours ago

      I just added a link to the code with a brief comment. Basically, it simplifies the leap year date calculation. If February is the last month of the year, then the possibly-existing leap day is the last day of the year. If you do it the normal way, your calculations for March through December need to know whether the year is a leap year. Now none of that is needed. You don’t even need explicit code to calculate whether a given year is a leap year: it’s implicit in the constants 146097, 36524, and 1461.
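
      A minimal Python sketch of that idea, assuming days are counted from 1970-01-01 and the proleptic Gregorian calendar. The function name `civil_from_days` and the offset 719468 (days from 0000-03-01 to 1970-01-01) are choices made for this illustration; it is not the linked smartcal code. Note that there is no explicit leap-year test anywhere, only the three constants:

```python
from datetime import date, timedelta


def civil_from_days(days: int) -> tuple[int, int, int]:
    """Days since 1970-01-01 -> (year, month, day), with years starting on March 1."""
    z = days + 719468                      # re-base so day 0 is 0000-03-01
    era, day_of_era = divmod(z, 146097)    # 400-year cycles
    # The first three centuries of an era have 36524 days, the fourth has one more.
    century = min(day_of_era // 36524, 3)
    day_of_century = day_of_era - century * 36524
    # 4-year blocks of 1461 days; the leap day is the last day of each block.
    quad, day_of_quad = divmod(day_of_century, 1461)
    year_of_quad = min(day_of_quad // 365, 3)
    day_of_year = day_of_quad - year_of_quad * 365   # 0 = March 1
    month_index = (5 * day_of_year + 2) // 153       # 0 = March ... 11 = February
    day = day_of_year - (153 * month_index + 2) // 5 + 1
    month = month_index + 3 if month_index < 10 else month_index - 9
    year = era * 400 + century * 100 + quad * 4 + year_of_quad
    if month <= 2:                                   # Jan/Feb belong to the next civil year
        year += 1
    return year, month, day


# Cross-check a few day numbers against the standard library.
for n in (-1, 0, 59, 11016, 19782):
    d = date(1970, 1, 1) + timedelta(days=n)
    assert civil_from_days(n) == (d.year, d.month, d.day)
```

      Because the leap day is always the last day of a March-based year, the `min(..., 3)` clamps are the only places where the longer fourth century and fourth year show up.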

      • zamadatix 16 hours ago

        The magic numbers at the end of this explanation are the number of days of each part of the leap year cycle:

        146097 days = 400 year portion of the leap year cycles (including leap years during that)

        36524 days = same for the 100 year portion of the leap year cycles

        1461 days = 4 year cycle (4 × 365 days + 1 leap day)
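
        These three values can be re-derived from the Gregorian leap-year rules. A short Python check (variable names are just for illustration), cross-checked against the standard library's `calendar.isleap`:

```python
from calendar import isleap

# 4 years contain exactly one leap day.
days_per_4_years = 4 * 365 + 1
# 100 years: 25 multiples of 4, minus the century year that is skipped.
days_per_100_years = 100 * 365 + 24
# 400 years: 100 multiples of 4, minus 4 century years, plus the one divisible by 400.
days_per_400_years = 400 * 365 + 97

# Count the days of one full 400-year cycle directly.
counted = sum(366 if isleap(y) else 365 for y in range(2000, 2400))

print(days_per_4_years, days_per_100_years, days_per_400_years, counted)
```

        Dividing the 400-year figure by 400 gives the mean Gregorian year of 365.2425 days.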

      • d--b 16 hours ago

        IIRC, it's also why the leap day was set to Feb 29th in the first place. At the time (romans?) the year started March 1st.

        In case someone was wondering why in the world someone said we should add a day to the second month of the year...

        • jcranmer 12 hours ago

          The calendar was regularized to include a leap day during the reign of Julius Caesar (hence the name "Julian calendar"), which would have been 45 BC.

          The Roman calendar moved to January as the first month of the year in 153 BC, over a hundred years before the leap day was added. The 10-month calendar may not have even existed--we see no contemporary evidence of its existence, only reports of its existence from centuries hence and the change there is attributed to a mythical character.

          • eru 5 hours ago

            Btw, the Romans had leap days before Julius Caesar, but they were added ad hoc by the Pontifex Maximus.

            Caesar happened to be the Pontifex Maximus (an office you hold for life once elected to), but he wasn't in Rome much to do that job. So after he came back from hanging out with Cleopatra in Egypt, he set the calendar on auto-pilot.

          • avadodin 2 hours ago

            I don't know if it ever made it to production, and I don't remember exactly why it made sense at the time, but one early hack I did was passing a date in Julian format because there weren't enough bits to pass a full timestamp.

          • pbhjpbhj 7 hours ago

            Wikipedia gives 3 dates for January being the first month, either (approx) 700, 450, 150 BCE.

            It's fair to say January was the first month of the Roman calendar; despite it having formerly been March.

          • anyfoo 12 hours ago

            Are you saying that while we do see evidence that September, October, November, December were once the 7th, 8th, 9th, and 10th month, we don't see any evidence that the calendar was ever "10 months long"? (How would that have worked anyway, did they have more days per month?)

            • jcranmer 11 hours ago

              Pretty much.

              > How would that have worked anyway, did they have more days per month?

              The way I've heard, they just simply didn't track the date during the winter.

        • variaga 15 hours ago

          That's correct, the Romans had March as the first month of the year, so leap day was the last day of the year and September, October, November and December were the 7th (sept), 8th (oct), ninth (nov) and 10th (dec) months.

          • ljsprague 14 hours ago

            June and July used to be Quintilis and Sextilis.

            • quesera 11 hours ago

              I think Quintilis and Sextilis were renamed to July and August, in honor of Julius and Augustus, respectively.

        • amenghra 15 hours ago

          And (oct)ober was the 8th month of the year, (nov)ember the ninth, (dec)ember the tenth!

          • sltkr 11 hours ago

            Weird parenthesization. The latin numbers are septem, octo, novem, decem for 7, 8, 9, 10. And then they all have a -ber suffix.

            • amenghra 3 hours ago

              The three letter prefixes show up in English (eg oct in octal, dec in decimal, etc.).

              • thaumasiotes an hour ago

                The prefix in "decimal" is "decim", like you'd expect given that the root is "decem". There is no "dec-" prefix.

                (You do see "deca" used as a prefix, "a" included, but that doesn't come from Latin.)

          • Izikiel43 14 hours ago

            Don't forget (sep)tember being the 7th month

        • wizzwizz4 12 hours ago

          Technically, the leap day (bissextus) was the 24th. (Wikipedia tells me this is because that's when Mercedonius used to be, before the Julian reforms.)

    • da_chicken 16 hours ago

      It's easy to know what day of the year it is because leap days are at the end.

    • 3eb7988a1663 15 hours ago

      Not so relevant, but some fun history: the Roman calendar did start in March, so tacking on the leap day was done at the end of the year. This also meant that the roots of the month names matched their positions: October (the "oct" means 8) was also the eighth month of the year.

    • zimpenfish 14 hours ago

      As well as the leap year stuff people have mentioned, there was something else that I've got a vague memory of (from an old SciAm article, IIRC, which was about using March as the first month for calculations) which pointed out that if you use March as 0, you can multiply the month number by (I forget exactly what but it was around 30.4ish?) and, if you round the fraction up, you get the day number of the start of that month and it all works out correctly for the right 31-30-31 etc sequence.
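
      I can't confirm the exact constant from that article, but one widely used variant of the trick multiplies by 153/5 = 30.6 and uses integer division. A small Python sketch (the name `month_start` is made up here) that checks it against the real month lengths, with March as month 0:

```python
from itertools import accumulate

# Lengths of March, April, ..., January (February, the last month, is not needed).
MONTH_LENGTHS_FROM_MARCH = [31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 31]


def month_start(month_index: int) -> int:
    """Day-of-year (0 = March 1) on which month `month_index` begins."""
    return (153 * month_index + 2) // 5


# Reference values: running sum of the real month lengths.
expected = [0] + list(accumulate(MONTH_LENGTHS_FROM_MARCH))
computed = [month_start(m) for m in range(12)]
print(computed)
```

      The formula only works this neatly because the irregular month, February, sits at the end and its length never has to be added to anything.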

benjoffe 4 days ago

A write-up of a new Gregorian date conversion algorithm.

It achieves a 30–40% speed improvement on x86-64 and ARM64 (Apple M4 Pro) by reversing the direction of the year count and reducing the operation count (4 multiplications instead of the usual 7+).

Paper-style explanation, benchmarks on multiple architectures, and full open-source C++ implementation.

  • sltkr 16 hours ago

    Very cool algorithm and great write-up!

    I was a bit confused initially about what your algorithm actually did, until I got to the pseudo-code. Ideally there would be a high level description of what the algorithm is supposed to do before that.

    Something as simple as: “a date algorithm converts a number of days elapsed since the UNIX epoch (1970-01-01) to a Gregorian calendar date consisting of day, month, and year” would help readers understand what they're about to read.
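
    For anyone who wants that description in executable form, here is a minimal Python sketch using only the standard library. The name `date_from_days` is made up for illustration, and this shows the expected input and output only, not the article's fast algorithm:

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def date_from_days(days_since_epoch: int) -> tuple[int, int, int]:
    """Map a day count relative to 1970-01-01 to (year, month, day)."""
    d = UNIX_EPOCH + timedelta(days=days_since_epoch)
    return d.year, d.month, d.day


print(date_from_days(0))      # the epoch itself
print(date_from_days(11016))  # a leap day in 2000
```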

  • zozbot234 12 hours ago

    How would this algorithm change on 16-bit or 8-bit devices? Or does some variety of the traditional naïve algorithm turn out to be optimal in that case? There's quite a bit of microcontroller software that might have to do date conversions, where performance might also matter. It's also worth exploring alternative epochs and how they would affect the calculation.

  • digitalPhonix 17 hours ago

    Very nice writeup!

    > Years are calculated backwards

    How did that insight come about?

zX41ZdbW 15 hours ago

Interesting how it compares with the ClickHouse implementation, which uses a lookup table: https://github.com/ClickHouse/ClickHouse/blob/master/src/Com...

So that a day number can be directly mapped to year, month, and day, and the calendar date can be mapped back with a year-month LUT.
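
To make the lookup-table idea concrete, here is a toy Python sketch. It is not ClickHouse's code: the table size, the names `DAY_TO_DATE`, `MONTH_START`, `to_date` and `to_day_number`, and the use of `datetime` to fill the tables are all assumptions made for illustration.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
LUT_DAYS = 366 * 10  # hypothetical small range; a real table would cover more

DAY_TO_DATE = []     # day number -> (year, month, day)
MONTH_START = {}     # (year, month) -> day number of the first of that month

for n in range(LUT_DAYS):
    d = EPOCH + timedelta(days=n)
    DAY_TO_DATE.append((d.year, d.month, d.day))
    if d.day == 1:
        MONTH_START[(d.year, d.month)] = n


def to_date(day_number: int) -> tuple[int, int, int]:
    """One array index, no arithmetic on the calendar rules."""
    return DAY_TO_DATE[day_number]


def to_day_number(year: int, month: int, day: int) -> int:
    """Year-month lookup plus the day offset within the month."""
    return MONTH_START[(year, month)] + day - 1
```

The trade-off is memory and a bounded date range in exchange for a conversion that is a single indexed load.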

  • simlevesque 11 hours ago

    Simply, ClickHouse only works on a 399-year span while OP's algorithm parses any date, over 3 trillion years.

swiftcoder 17 hours ago

Nice to see the micro-optimising folks are still making progress on really foundational pieces of the programming stack

ape4 7 hours ago

Perhaps nicer to avoid the comment and write:

    const C1 = 505054698555331      // floor(2^64*4/146097)

as

    constexpr int C1 = floor(2^64*4/146097);

  • aw1621107 5 hours ago

    std::floor was made constexpr in C++23, which is pretty recent as far as C++ standards go. It's possible the author didn't think using C++23 was worth the constraints it places on who could use the code.

    • CodesInChaos 5 hours ago

      That's a mathematical expression, not a C++ expression. And floor here isn't the C++ floor function, it's just describing the usual integer division semantics. The challenge here is that you need 128-bit integers to avoid overflowing.
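
      Python integers are arbitrary precision, so the expression from the comment can be checked directly there. This is only a sanity check of the constant, not a suggestion for how the C++ should be written:

```python
numerator = 4 << 64             # 2^64 * 4, i.e. 2^66
C1 = numerator // 146097        # integer division is the "floor" in the comment

print(C1)
print(numerator.bit_length())   # bits needed for the intermediate value
print(C1.bit_length())          # bits needed for the final constant
```

      The intermediate value needs more than 64 bits even though the result fits comfortably, which is why a 128-bit type is needed to compute it at compile time.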

      • aw1621107 4 hours ago

        Ah, you're right. I saw that the expression in the comment and in the code was the same and assumed that the commented bit was valid C++ code. You got me to look again and it's obvious that that isn't the case. I had even gone looking through the codebase to see if std::floor was included, and still missed the incorrect `^`.

        I guess in that case as long as the 128-bit type supports constexpr basic math operations that should suffice to replace the hardcoded constants with their source expressions.

juancn 16 hours ago

It took me a while to understand that internally it uses 128-bit numbers; that `>> 64` in the pseudocode was super confusing until I saw the C++ code.

Neat code though!

  • brucehoult 10 hours ago

    Not really. It looks like that in the C code, but in the generated machine code it'll just be a single `MULH` instruction giving (only) the upper 64 bits of the result, no shift needed.
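
    A small Python model of that "multiply, keep the high half" operation, in case it helps. The helper name `mulhi_u64` is made up, and the way it is used below only illustrates the general multiply-by-reciprocal trick; I'm not claiming it is the exact step the article performs:

```python
MASK64 = (1 << 64) - 1


def mulhi_u64(a: int, b: int) -> int:
    """Upper 64 bits of the 128-bit product of two unsigned 64-bit values."""
    if not (0 <= a <= MASK64 and 0 <= b <= MASK64):
        raise ValueError("operands must fit in 64 unsigned bits")
    return ((a * b) >> 64) & MASK64


C1 = (4 << 64) // 146097             # floor(2^64 * 4 / 146097)

n = 1_000_000
approx = mulhi_u64(n, C1)            # multiply, then keep the high half
exact = (4 * n) // 146097            # what a real division would give

# C1 is rounded down, so exactly at a multiple of 146097 the shortcut lands one low.
at_boundary = mulhi_u64(146097, C1)  # true quotient would be 4
```

    That boundary case is why reciprocal constants in real code come with a carefully chosen rounding or offset.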

zkmon 16 hours ago

Nice to see that there are still some jewels left to be dug out from the algorithm land.

  • rurban an hour ago

    Well searching for strings, appending strings, comparing strings. All still unimplemented in standard libs. (Strings being unicode of course)

aidenn0 11 hours ago

An interesting writeup on using a different representation for time is here[1]. It can represent any specific second from March 1, 2000 +/-2.9Myears with 62 bits and can efficiently calculate Gregorian dates using only 32-bit arithmetic. An optimization involving a 156K lookup table is also discussed.

A few notes for those not familiar with Lisp:

1. Common Lisp defines a time called "universal time" that is similar to unix time, just with a different epoch

2. A "fixnum" is a signed integer that is slightly (1-3 bits) smaller than the machine word size (32 bits at the time the article was written). The missing bits are used for run-time type tagging. Erik's math assumes 31 bits for a fixnum (2.9M years is approximately 2^30 days and fixnums are signed).

3. Anywhere he talks about "vectors of type (UNSIGNED-BYTE X)" this means a vector of x-bit unsigned values. Most lisp implementations will allow vectors of unboxed integers for reasonable values of X (e.g. 1, 8, 16, 32, 64), and some will pack bits for arbitrary values of X, doing the shift/masking for you.

1: https://naggum.no/lugm-time.html
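
The 2.9M-year figure in note 2 can be checked with a few lines of Python, assuming a 31-bit signed fixnum and the mean Gregorian year of 146097/400 days (the variable names are mine, not from the article):

```python
DAYS_PER_GREGORIAN_YEAR = 146097 / 400   # 365.2425

FIXNUM_BITS = 31                          # assumption: 1 sign bit + 30 magnitude bits
max_days = 2 ** (FIXNUM_BITS - 1)         # largest day offset in either direction

years_each_way = max_days / DAYS_PER_GREGORIAN_YEAR
print(round(years_each_way))
```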

wood_spirit 16 hours ago

Admittedly in a different league speed wise but also scope wise is my very fast timestamp library for Java https://github.com/williame/TimeMillis

This focuses on string <-> timestamp and a few other utilities that are super common in data processing and where the native Java date functions are infamously slow.

I wrote it for some hot paths in some pipelines but was super pleased my employer let me share it. Hope it helps others.

Findecanor 13 hours ago

TIL that Unix Time does not count leap seconds. If it did, it wouldn't have been possible to write routines that are this fast.
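
Concretely, because every Unix day is exactly 86,400 seconds long, splitting a timestamp into a day number and a time of day is one division with no table of leap seconds. A minimal Python sketch (the function name `split_unix_time` is made up for illustration):

```python
from datetime import date, timedelta

SECONDS_PER_DAY = 86_400


def split_unix_time(timestamp: int) -> tuple[int, int]:
    """Return (days since 1970-01-01, seconds into that day)."""
    # divmod floors, so timestamps before the epoch also work.
    return divmod(timestamp, SECONDS_PER_DAY)


days, seconds = split_unix_time(1_000_000_000)
calendar_day = date(1970, 1, 1) + timedelta(days=days)
hours, rest = divmod(seconds, 3600)
minutes, secs = divmod(rest, 60)
print(calendar_day, hours, minutes, secs)
```

The day number is then exactly the input that a days-to-date algorithm like the one in the article expects.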

  • toast0 6 hours ago

    If Unix Time enumerated leap seconds, you couldn't convert future timestamps into localized times.

    • dxdm 3 hours ago

      Could you elaborate on what you mean? I think it's already impossible to accurately turn a future timestamp into a local time, leap seconds or not, because of timezone shenanigans. So I'm probably misunderstanding what you're talking about.

kittikitti 16 hours ago

Thank you for sharing. This is a great achievement not only in the ability to invent a novel algorithm with significant performance gains but also the presentation of the work. It's very thorough and detailed, and I appreciated reading it.

hyperhello 11 hours ago

Can this algorithm tell me how old I was last year?

pyrolistical 13 hours ago

For something this short that is pure math, why not just hand write asm for the most popular platforms? Prevents compiler from deoptimizing in the future.

Have a fallback with this algorithm for all other platforms.

  • flumpcakes 11 hours ago

    This pretty much is assembly written as C++... there's not much the compiler can ruin.

vladde 17 hours ago

> The algorithm provides accurate results over a period of ±1.89 Trillion years

i'm placing my bets that in a few thousand years we'll have changed calendar system entirely haha

but, really interesting to see the insane methods used to achieve this

  • layer8 15 hours ago

    Maybe not in a few thousand years, but given the deceleration of the Earth’s rotation around its axis, mostly due to tidal friction with the moon, in a couple hundred thousand years our leap-day count will stop making sense. In roughly a million years, day length will have increased such that the year length will be close to 365.0 days.

    I therefore agree that a trillion years of accuracy for broken-down date calculation has little practical relevance. The question is if the calculation could be made even more efficient by reducing to 32 bits, or maybe even just 16 bits.

    • aidenn0 11 hours ago

      > The question is if the calculation could be made even more efficient by reducing to 32 bits, or maybe even just 16 bits.

      This is somewhat moot considering that 64-bits is the native width of most modern computers and Unix time will exceed 32-bits in just 12 years.
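
      For reference, the rollover moment for a signed 32-bit seconds counter can be computed in Python (assuming the usual 1970-01-01 UTC epoch; variable names are mine):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value a signed 32-bit integer can hold.
last_32bit_second = EPOCH + timedelta(seconds=2**31 - 1)
print(last_32bit_second.isoformat())
```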

    • PaulHoule 11 hours ago

      Shorter term the Gregorian calendar has the ratio for leap years just a tiny bit wrong which will be a day off by 3000 years or so.
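
      A back-of-the-envelope version of that estimate in Python. The tropical year length used here (about 365.24219 days) is an approximate value I'm assuming; it also drifts slowly over time, so treat the result as a rough order of magnitude:

```python
GREGORIAN_YEAR_DAYS = 146097 / 400   # 365.2425, fixed by the leap-year rules
TROPICAL_YEAR_DAYS = 365.24219       # assumption: approximate mean tropical year

drift_per_year = GREGORIAN_YEAR_DAYS - TROPICAL_YEAR_DAYS
years_until_one_day_off = 1 / drift_per_year
print(round(years_until_one_day_off))
```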

  • hugmynutus 15 hours ago

    > i'm placing my bets that in a few thousand years we'll have changed calendar system entirely haha

    Given the chronostrife will occur in around 40_000 years (give or take 2_000) I somewhat doubt that </humor>

  • ____tom____ 15 hours ago

    The calendar system already changed. So this won't get correct dates, meaning the dates actually used, past that date. Well, those dates, as different countries changed at different times.

  • fnordsensei 16 hours ago

    Wouldn’t it be accurate for that as well? Unless we change to base 10 time units or something. Then we all have a lot of work to do.

    But if it’s just about starting over from 0 being the AI apocalypse or something, I’m sure it’ll be more manageable, and the fix could hopefully be done on a cave wall using a flint spear tip.

    • jandrese 16 hours ago

      Or set 0 to be the Big Bang and make the type unsigned. Do it the same time we convert all temperature readings to Kelvin.

      • Tor3 5 hours ago

        And count Planck time instead of seconds.. it's not as impossible as it may sound. You'll need more than 128 bits but less than 256 bits even if the epoch is the Big Bang (I can't recall exactly how many bits are needed, but I did the math once, some years ago). And it'll be compatible with alien or future time systems too, in case what we call a second (currently defined by caesium-133 periods) changes.
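
        Redoing that math as a rough Python estimate. The age of the universe (about 13.8 billion years) and the Planck time (about 5.39e-44 s) are approximate values I'm assuming here, so only the order of magnitude matters:

```python
import math

AGE_OF_UNIVERSE_YEARS = 13.8e9       # assumption: approximate current estimate
SECONDS_PER_YEAR = 365.25 * 86_400   # Julian year, good enough for an estimate
PLANCK_TIME_SECONDS = 5.39e-44       # assumption: approximate value

ticks_so_far = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR / PLANCK_TIME_SECONDS
bits_needed = math.ceil(math.log2(ticks_so_far))
print(bits_needed)
```

        That lands between 128 and 256 bits as stated, with some headroom left in a 256-bit counter for the future.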