Let me start at the end for those who don't want to read the whole thing. COBOL does not have a "default date" or any kind of default value for a missing date. COBOL doesn't even have a "date" data type. ISO8601 is an interchange format for dates. "May 20, 1875" has no prominence in that standard or any other date standard.
That's the short version. How did we get here?
Elon Musk mentioned in a press briefing that there were many names in the Social Security database of people who weren't marked as dead but were 150 years old. Clearly, this is bad data. At the time, he provided no other information, so speculation abounded on the internet.
In the middle of all of it, this appeared on Threads:
https://www.threads.net/@ashmore_glenn/post/DGDfmj6TsZS
It's interesting that he starts off making the claim that "in COBOL, if a date is missing the program defaults to 1875. e.g. 2025-1875=150". This claim isn't just false, it's false in multiple ways:
- There is no "date" data type in COBOL. Dates are stored however the programmer wants, but usually as numeric character strings
- There's no "default" date, even if there were such a data type
- Even if there were a default, 1875 would be a bizarre choice
His initial post has over 20,000 "likes" and numerous shares as of this writing. The followup - included in the picture above - has him backpedaling hard and pointing out that they were "taught to never leave a null date" and "I assume some group at IBM or elsewhere decided that May 20,1875, the date of the Convention on the Meter, would be the standard filler." That got 10 likes. Welcome to the internet.
Later on in the same thread he gets a lot of pushback from other programmers, and then we find this reply with 1000+ likes:
I'll put the whole thing here:
Correct. The original ISO standard for a default reference date was May 20, 1875 - the date the international standards and metrics treaty was signed. Geeky to be sure but that is coders for you. This standard was changed in 2019.
As you've probably guessed, that's not correct, either.
But this was added to the lore and we had some disinformation with running shoes on.
ISO8601
In the ISO8601:2004 standard, "May 20, 1875" is mentioned as a "reference date" in a section discussing the Gregorian calendar.
For those of you unaware, the Gregorian calendar is almost certainly the calendar that you're using. The "Julian calendar" is still used in some contexts in the west, mostly among Orthodox churches. You most likely see this as "Orthodox Christmas" and "Orthodox Easter" on your (Gregorian) calendar.
Without getting into all the specifics, Julius Caesar adopted what's now known as the "Julian calendar" in 46 BC. It averages 365.25 days in a year: a normal year has 365 days, and every fourth year a leap day is added. That's pretty close.
Unfortunately, "pretty close" doesn't work over the course of centuries, and people determined that the real number of days in a year is slightly less than 365.25. So, in 1582 Pope Gregory XIII established a new calendar system that's pretty similar, but to get closer to the actual length of the year, the leap day is skipped in years divisible by 100 unless the year is also a multiple of 400.
In order to "reset" the calendar, October 4, 1582 (Julian) was followed by October 15, 1582 (Gregorian), which was October 5, 1582 (Julian).
The Julian calendar will continue to drift apart from the Gregorian calendar. The starting offset in 1582 was 10 days. The Gregorian calendar skipped leap years in 1700, 1800, and 1900, but the Julian didn't skip those. As of 2025, there are 13 days between the two.
So, Christmas is December 25 in both calendar systems. But those aren't the same day. "December 25, 2024" in the Julian calendar is "January 7, 2025" Gregorian. That's why "Orthodox Christmas" is celebrated in what most of us call "January". Inside the Julian calendar, it's still December.
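If you want to check the calendar arithmetic yourself, here's a rough sketch using Ruby's standard date library. The helper names (gregorian_leap?, julian_leap?, calendar_gap_days, julian_to_gregorian) are made up for this illustration; Date::JULIAN and Date::GREGORIAN are the library's own constants for "always Julian" and "always (proleptic) Gregorian".

```ruby
require "date"

# Gregorian rule: every 4th year is a leap year, except century years,
# unless the century year is a multiple of 400.
def gregorian_leap?(year)
  (year % 4).zero? && (year % 100 != 0 || (year % 400).zero?)
end

# Julian rule: every 4th year is a leap year, no exceptions.
def julian_leap?(year)
  (year % 4).zero?
end

# How many days the Julian calendar trails the Gregorian one, measured
# on July 1 of the given year (hypothetical helper for this post).
def calendar_gap_days(year)
  Date.new(year, 7, 1, Date::JULIAN).jd - Date.new(year, 7, 1, Date::GREGORIAN).jd
end

# Read a year/month/day on the Julian calendar and re-express the same
# day on the proleptic Gregorian calendar.
def julian_to_gregorian(year, month, day)
  Date.new(year, month, day, Date::JULIAN).gregorian
end

[1700, 1800, 1900, 2000].each do |year|
  puts "#{year}: Julian leap=#{julian_leap?(year)}, Gregorian leap=#{gregorian_leap?(year)}"
end

puts calendar_gap_days(1582)                    # 10
puts calendar_gap_days(2025)                    # 13
puts julian_to_gregorian(1582, 10, 5).iso8601   # 1582-10-15
puts julian_to_gregorian(2024, 12, 25).iso8601  # 2025-01-07
```

The three century years where the two rules disagree (1700, 1800, 1900) are exactly where the gap grows from 10 days to 13.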
In the ISO8601 specification, 2004 edition, section 3.2.1 discusses the Gregorian calendar as it is considered the "standard" calendar and the one that ISO8601 assumes. In this discussion of the Gregorian calendar, it is mentioned:
The Gregorian calendar has a reference point that assigns 20 May 1875 to the calendar day that the "Convention du Mètre" was signed in Paris.
This is a statement about the Gregorian calendar, not ISO8601. It is saying that the date that the "Meter Convention" was signed in Paris is "May 20, 1875" on the Gregorian calendar, as an example.
Equally valid statements:
"The Julian calendar has a reference point that assigns 8 May 1875 to the calendar day that the 'Convention du Mètre' was signed in Paris."
Even more:
"The Gregorian calendar has a reference point that assigned 29 Aug 1958 to the calendar day that Michael Joseph Jackson was born."
That's not a statement about ISO8601 - it's a statement about a calendar system. It was removed from the 2019 update to ISO8601.
https://www.loc.gov/standards/datetime/iso-tc154-wg5_n0038_iso_wd_8601-1_2016-02-16.pdf
Due to what was shown on Threads, this morphed into a full-scale disinformation campaign that usually disparaged Elon Musk or his "DOGE" team as hapless idiots who don't know about COBOL and thus don't know that "May 20, 1875 is the epoch in COBOL".
Ah, the epoch. What's an "epoch"?
Let's back up a bit. The ISO8601 standard for exact dates is pretty simple:
yyyy-mm-dd (dashes optional)
That's the entire standard for an exact date. Four-digit year, two-digit month, and two-digit day of month, each with leading zeroes as applicable. So, "May 20, 1875" is "1875-05-20". "May 19, 1875" is "1875-05-19". "October 15, 1582" is "1582-10-15". "October 10, 1582" - ha! fooled you there, that date technically doesn't exist. The Gregorian calendar didn't exist then, so we use the "proleptic" version which just numbers the days as if they did exist. So, "October 10, 1582" Gregorian would be "September 30, 1582" Julian, and we would still use "1582-10-10". Anyway....
I use ISO8601 all the time. It's the format used by HTML form fields for dates and times. It's a date/time interchange format, and it's useful because a date/time value includes all the information needed to specify the exact date and time of an event, even if someone in a different time zone consumes the data being produced or if daylight saving time rules are different. It's also great for dates as they can be sorted easily.
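Here's a small sketch of those points in Ruby, using only the standard date library. The variable names are mine, and the list of dates at the end is just sample data picked from this post.

```ruby
require "date"

meter_convention = Date.new(1875, 5, 20)
puts meter_convention.iso8601             # 1875-05-20 (with dashes)
puts meter_convention.strftime("%Y%m%d")  # 18750520 (without dashes)

# Ruby's default calendar switches from Julian to Gregorian in October 1582,
# so it rejects the skipped days unless we ask for proleptic Gregorian.
puts Date.valid_date?(1582, 10, 10)                   # false
puts Date.valid_date?(1582, 10, 10, Date::GREGORIAN)  # true

proleptic = Date.new(1582, 10, 10, Date::GREGORIAN)
puts proleptic.iso8601         # 1582-10-10
puts proleptic.julian.iso8601  # 1582-09-30

# ISO 8601 dates sort chronologically even when compared as plain strings.
dates = ["2025-02-19", "1875-05-20", "1970-01-01", "1582-10-15"]
puts dates.sort == dates.sort_by { |text| Date.iso8601(text) }  # true
```

The sorting trick works because the fields run from most significant (year) to least significant (day) and are zero-padded to a fixed width.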
Epochs
An "epoch" in computer lore is basically when a system of timekeeping starts. Year "1" is the first year in our Gregorian calendar system. Times before that are just "BC". There is no year "0", which is a bit odd, but exact dates that long ago typically don't matter.
In the Unix operating system, the "epoch" is January 1, 1970 at 12:00:00 AM. Later, that was determined to be "UTC". The traditional Unix timestamp is 32 bits, but it's a "signed" number, so only 31 bits are available for counting forward. That means that we can count up to 2,147,483,647 seconds past January 1, 1970 in this system, which puts us in January 2038 when the number "overflows" and wraps around to a negative value.
Put another way, with this system we can specify a date and time within about 68 years either way from January 1, 1970. Here, I show what this means using the Ruby programming language:
3.3.6 :001 > Time.at(0).utc
=> 1970-01-01 00:00:00 UTC
3.3.6 :002 > Time.at(2**31-1).utc
=> 2038-01-19 03:14:07 UTC
3.3.6 :003 > Time.at(-2**31).utc
=> 1901-12-13 20:45:52 UTC
As I type this, we are at 1,740,000,100 seconds since the epoch.
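Ruby's integers never overflow on their own, so to see the 2038 wraparound we have to fake a 32-bit field. This is only a sketch: wrap_int32 is a helper name I made up, and it imitates two's-complement wraparound with plain arithmetic rather than touching real 32-bit storage.

```ruby
# Simulate storing an integer in a signed 32-bit field: values past
# 2**31 - 1 wrap around to the negative end of the range.
def wrap_int32(value)
  ((value + 2**31) % 2**32) - 2**31
end

seconds_per_year = 365.25 * 86_400
last_good = 2**31 - 1

puts Time.at(last_good).utc                  # 2038-01-19 03:14:07 UTC
puts wrap_int32(last_good + 1)               # -2147483648
puts Time.at(wrap_int32(last_good + 1)).utc  # 1901-12-13 20:45:52 UTC
puts((2**31 / seconds_per_year).round(1))    # 68.0
```

One second past the last representable moment lands back in December 1901, which is the "about 68 years either way" range in action.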
The system that I just described has a couple of features that we can see as relevant:
- It only works within a fairly narrow range of years
- A timestamp of "0" will be obvious since it's always the same value
When I'm working with systems and I see the date/time is "January 1, 1970 12:00:00am UTC" or "December 31, 1969 6:00:00pm CST" I know right away that I have a timestamp that is 0. This is a pretty common bug form - we forget to initialize a value, it ends up being 0, so the date that shows up is January 1, 1970 or December 31, 1969.
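A minimal sketch of how that bug sneaks in, assuming a record whose timestamp was never set. The record hash and the timestamp_to_time helper are hypothetical names for this example, not taken from any real system.

```ruby
# Convert whatever is in the timestamp field to a Time.
# nil.to_i is 0 in Ruby, so a missing value quietly becomes the epoch.
def timestamp_to_time(value)
  Time.at(value.to_i)
end

# A record whose timestamp was never filled in (made-up sample data).
record = { name: "example", created_at: nil }

moment = timestamp_to_time(record[:created_at])
puts moment.getutc              # 1970-01-01 00:00:00 UTC
puts moment.getlocal("-06:00")  # 1969-12-31 18:00:00 -0600
```

Nothing raises an error here, which is why this kind of bug tends to show up as a strange date in a report rather than as a crash.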
There are other epochs. A bunch of others. One example is a timestamp for a file in MS-DOS, which starts in 1980.
Now, in ISO8601 the specification for an exact date is just the four-digit year, two-digit month, and two-digit day of month. There's no "epoch" other than, basically, year "1" or something like that. In this system, May 20, 1875 has no significance. That's why the line was removed from the standard in the 2019 edition.
Disinformation Multiplies
Musk countered the disinformation by showing the counts by age bracket:
https://x.com/elonmusk/status/1891350795452654076
By this point, the disinformation was "if Musk and company were competent they would have noticed the huge spike at 150 years old and known that was important." As the data shows, there is no spike at 150 but rather the downward slope past 70 that one would expect with this data. And it ends not long after 150, not surprising since those are the first Social Security recipients. There are a couple of outliers on there as well.
It was too late to stop the disinformation campaign. It was repeated by Rachel Maddow on MSNBC, and shows up in other places. For instance, it's repeated by this "Politifact" "fact-check":
https://www.politifact.com/article/2025/feb/17/are-150-year-old-americans-receiving-social-securi/
This guy repeated it on X and got quite a few views:
https://x.com/toshiHQ/status/1889928670887739902
It showed up in this Wired article:
Note that the guy who wrote this - David Gilbert - supposedly covers "disinformation". Yeah, he covered it. By repeating it.
Here it shows up in a Yahoo! News article:
https://www.yahoo.com/news/trump-press-secretary-hit-embarrassing-191941730.html
And this article from someone on Daily Kos:
This disinformation is showing up almost exclusively on left-wing media and "fact checkers", and usually includes digs at Elon Musk and the "young" DOGE employees.
COBOL
So, how does COBOL store dates? That's up to the programmer, and I'm sure there are many, many different ways that dates have been stored.
I pulled out my "Structured COBOL, Pseudocode Edition" by Shelly, Cashman, and Forsythe, 1986 edition. ISBN 0-87835-196-5. This is pre-Y2K, and it's mind-blowing to me that they weren't planning for Y2K yet.
On page 6.32, "Defining the Date Work Area":
On most computers, the current date is stored in computer memory as a 6 character field (Year, Month, Day). In the sample program, the date, which will be printed in the report heading, is obtained from computer memory and is placed in a work area within the Working Storage Section of the Data Division. The work area and the method used in the Procedure Division of the sample program to place the date in the work area are shown in Figure 6-48.
In Figure 6-48, the DATE-WORK field is defined on line 006080. The DATE-WORK field is then subdivided into three 2-digit numeric fields - the YEAR-WORK field, the MONTH-WORK field, and the DAY-WORK field. The date is six characters in length and is stored in YYMMDD format (i.e. - January 25, 1987 would be stored as 870125).
The date is obtained from computer memory using the Accept statement. When the Accept statement is executed, the reserved word DATE identifies that the current date is to be copied from the area in main computer memory where it is stored by the operating system to the field DATE-WORK which has been defined in the program.
The Accept statement can also be used to retrieve the day of the year and the time of day. The day of the year is returned in a Julian date format. The first two numeric characters are the year and the next three numeric characters are the day of the year. Thus, the value returned for January 25, 1987 would be 87025. The time is returned as a two-digit numeric hour, a two-digit numeric minute, and a two-digit numeric second, and a two-digit numeric hundredths of a second. Thus, the time 9:15 a.m. would be returned as 09150000.
Figure 6-48, Data Division:
006070 01  WORK-AREAS.
006080     05  DATE-WORK.
006090         10  YEAR-WORK   PIC 99.
006100         10  MONTH-WORK  PIC 99.
006110         10  DAY-WORK    PIC 99.
Procedure Division:
014070     ACCEPT DATE-WORK FROM DATE.
014080     MOVE MONTH-WORK TO MONTH-HEADING.
014090     MOVE DAY-WORK TO DAY-HEADING.
014100     MOVE YEAR-WORK TO YEAR-HEADING.
For those of you wondering how we had the Y2K bug, there it is in all its glory. This was written 14 years before 2000, and in the world of COBOL nobody apparently thought to point out that this was going to cause problems very, very soon.
The part of the book shown above is the entire treatise on handling dates (times aren't even mentioned except in that section) out of hundreds of pages. They show two possible date formats: "YYMMDD" or "YYDDD". COBOL generally stored data like this as characters, so the standard date format would be six separate characters.
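To make the two layouts concrete, here's a rough Ruby equivalent. This is not how COBOL does it; yymmdd and yyddd are helper names I made up, and strftime's %y and %j directives stand in for the two-digit year and three-digit day of year.

```ruby
require "date"

# Six-character year/month/day layout, like the textbook's DATE-WORK field.
def yymmdd(date)
  date.strftime("%y%m%d")
end

# Five-character year/day-of-year layout (what the book calls a Julian date).
def yyddd(date)
  date.strftime("%y%j")
end

sample = Date.new(1987, 1, 25)
puts yymmdd(sample)  # 870125
puts yyddd(sample)   # 87025

# With only two year digits, January 1, 2000 sorts before 1987.
puts(yymmdd(Date.new(2000, 1, 1)) < yymmdd(sample))  # true
```

That last comparison is the Y2K bug in miniature: "000101" compares as earlier than "870125".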
Contrast that with a Unix timestamp which is a binary format that uses the space of four characters total and is able to specify more than 130 years of dates and times. The YYMMDD format can specify only 100 years of dates, and uses six character spaces. In modern terms, that's 48 bits to store around 36,525 days, which handily fits into 16 bits.
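The arithmetic behind those numbers, as a quick Ruby sketch. The days_since_base helper and its 1900-01-01 base date are my own assumptions, picked only to show that a century of days fits in 16 bits.

```ruby
require "date"

seconds_per_year = 365.25 * 86_400

# Span of a 32-bit counter of seconds, in years.
puts((2**32 / seconds_per_year).round(1))  # 136.1

# Days in 100 years, and how many bits it takes to count that high.
days_in_century = (100 * 365.25).to_i
puts days_in_century             # 36525
puts days_in_century.bit_length  # 16

# Six 8-bit characters for YYMMDD.
puts 6 * 8                       # 48

# One way to squeeze a date into 16 bits: count days from a base date.
# The base date here is an arbitrary choice for illustration.
def days_since_base(date, base = Date.new(1900, 1, 1))
  (date - base).to_i
end

offset = days_since_base(Date.new(1987, 1, 25))
puts(offset < 2**16)             # true
```

Notice that the compact version needs a base date, which is an epoch by another name.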
But the nice thing about YYMMDD is that there is no endianness to worry about, no epoch, no real calculation to determine the date. Just add a couple more Ys and you have ISO8601. That's why it's a great interchange format.
Summary
To sum up: "May 20, 1875" has nothing to do with ISO8601, COBOL, or basically anything else other than the day that the Meter Convention was signed in Paris.
It was interesting to see a disinformation campaign take off like this, and I think at this point those involved have moved on to something else. But this will live on because of the internet and all the newly-minted COBOL date experts expounding on how Musk and company don't know something that's just so obvious. Sigh.
Never change, internet. Never change.
Top comments (4)
Yes, it has been very frustrating to see how this took off like a brush fire in California. I have done my part to try to correct people when I have seen references to this, but people have been very stubborn about it. I usually mention the fact that I started COBOL programming in 1990, and I have never in my life heard any reference to 1875 being special in any way - until last week! It is the latest example of how people love to repeat things on the interwebs just to jump on what they think is a popular bandwagon, even though they have no clue what they are talking about. Then as more people repeat the same BS, you have people referring to the number of times it has been repeated as some kind of proof that it must be correct. Very lazy thinking!
My favorite part of it is watching people who know nothing about computers beyond how to send an email talk condescendingly about Musk and the DOGE team being so stupid that they don't know something this trivial that everybody knows.
★★★★★★★★★★ out of ☆☆☆☆☆
Well done!!! Glad to see others pull out their hair!! (Not really but ya know!)
Signed,
90s COBOL student and programmer
Not 1875 but I definitely remember a lot of the Y2K "fixes" we implemented had logic around YY > 75 being CC 19 and < 75 being CC 20. Probably still lurking in the JPMorgan mainframes
Wondering if that's another potential misunderstanding along the way