Hey,
DeepCode offers AI-based static program analysis for Java, JavaScript and TypeScript, and Python. As you might know, DeepCode uses thousands of open source repos to train its engine. We asked the engine team to provide some stats on the findings. In this series of blog articles, we want to introduce the top suggestions from our engine and give some background on each. This one made me dizzy: we are looking back on roughly 70 years of software engineering and still have problems with (drumroll) date formats…
Language: Java
Defect: Date Data Format (Category General 1)
Diagnosis: Ambiguous date-time output formats (e.g. 12h output without am/pm suffix)
Background:
Quick, what is the difference between the date format strings `MM-DD-YYYY`, `mm-dd-yyyy`, and `MM-dd-yyyy`? I can tell you, they will all give the same result when you call them at the right moment. But obviously, they are fundamentally different. Well, let us shed some light…
Java had a bad start regarding its date classes. `java.util.Date` had serious design flaws when it was introduced, which led to lots of confusion. A `Date` instance in Java is actually not a date but a moment in time, so it has (1) no time zone, (2) no format, and (3) no calendar system. I suggest this valuable blog post for the full story. After years, the Java community acted and introduced new classes. Still, there are lots of traps (for example, `java.text.SimpleDateFormat` is not thread-safe while `java.time.format.DateTimeFormatter` is), but for now, let us focus on the most common mistakes, which are the ones flagged by DeepCode:
- Using `mm` for months and/or `MM` for minutes (wrong!). `MM` is the month, `mm` is the minute.
- Using `hh` for “hour of the day” when really `HH` was intended. `HH` ranges from 0 to 23 while `h` ranges from 1 to 12 and is mostly used as a single character. It needs the AM/PM information or it is ambiguous.
- Using `YYYY` for year. It is meant to be used in conjunction with “week of the year” and can lead to unexpected results in the first and last week of a year. Normally, you want to use `yyyy`.
- Using `DD` for “day of the month” when in reality it means “day of the year”.

Make sure to use the correct pattern characters. As a reference, the following applies to Java's `java.text.SimpleDateFormat`.
Pattern Character | Date or Time Component | Example Result |
---|---|---|
G | Era designator | AD |
y | Year | 2020 (yyyy), 20 (yy) |
Y | Week year (may give unexpected results in the first and last week of the year) | 2020 (YYYY), 20 (YY) |
M | Month in year | July (MMMM), Jul (MMM), 07 (MM) |
w | Week in year | 16 |
W | Week in month | 3 |
D | Day in year | 266 |
d | Day in month | 09 (dd), 9 (d) |
F | Day of week in month | 4 |
E | Day name in week | Tuesday, Tue |
u | Day number of week (1 = Monday, 2 = Tuesday, and so on) | 2 |
a | AM or PM marker | AM |
H | Hour in day (0-23) | 12 |
k | Hour in day (1-24) | 23 |
K | Hour in am/pm for 12-hour format (0-11) | 0 |
h | Hour in am/pm for 12-hour format (1-12) | 12 |
m | Minute in hour | 59 |
s | Second in minute | 35 |
S | Millisecond | 978 |
z | Time zone | Pacific Standard Time; PST; GMT-08:00 |
Z | Time zone offset (RFC 822 pattern) | -0800 |
X | Time zone offset (ISO 8601 format) | -08; -0800; -08:00 |
Note: This is Java. Do not simply expect this to be the same elsewhere. Always check the documentation.
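To make two of the pitfalls listed before the table concrete, here is a minimal sketch using `java.text.SimpleDateFormat`. The class name and the `format` helper are made up for illustration, and the formatter is pinned to UTC and `Locale.US` on purpose, because the week-year rules depend on the locale.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DatePatternPitfalls {

    // Hypothetical helper: formats an ISO-8601 instant with a SimpleDateFormat
    // pattern, pinned to UTC and Locale.US so the output is reproducible.
    static String format(String pattern, String isoInstant) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(Date.from(Instant.parse(isoInstant)));
    }

    public static void main(String[] args) {
        String afternoon = "2019-12-30T15:07:00Z";
        String night = "2019-12-30T03:07:00Z";

        // 12-hour clock without an AM/PM marker: both lines print 03:07
        System.out.println(format("hh:mm", afternoon));
        System.out.println(format("hh:mm", night));

        // 24-hour clock keeps the two moments apart
        System.out.println(format("HH:mm", afternoon)); // 15:07

        // Calendar year vs. week year on Monday, 30 December 2019
        System.out.println(format("yyyy-MM-dd", afternoon)); // 2019-12-30
        System.out.println(format("YYYY-MM-dd", afternoon)); // 2020-12-30
    }
}
```

The last line is the nasty one: the output looks like a perfectly valid date, it is just a year off, and only for a few days around New Year.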
This is not an exhaustive list of the problems around dates and times. We could talk about the difference between UTC offsets and time zones, the problems around time zone abbreviations (is BST British Summer Time, British Standard Time, or rather Bougainville Standard Time? No, I did not make this up), or different calendars in different locales.
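Just to hint at the offset versus time zone difference, here is a small sketch using `java.time`. The class and the `offsetAt` helper are hypothetical names for this post. It uses region IDs instead of abbreviations like BST, and shows that one time zone can map to different UTC offsets depending on the date.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class OffsetVersusZone {

    // Hypothetical helper: the UTC offset a region applies at a given instant.
    static ZoneOffset offsetAt(String zoneId, String isoInstant) {
        return ZoneId.of(zoneId).getRules().getOffset(Instant.parse(isoInstant));
    }

    public static void main(String[] args) {
        // Same time zone, two different offsets over the year
        System.out.println(offsetAt("Europe/London", "2020-01-15T12:00:00Z")); // Z
        System.out.println(offsetAt("Europe/London", "2020-07-15T12:00:00Z")); // +01:00

        // The "other" BST, addressed by an unambiguous region ID
        System.out.println(offsetAt("Pacific/Bougainville", "2020-07-15T12:00:00Z"));
    }
}
```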
To provide the answer to our little puzzle above, you probably almost always want `MM-dd-yyyy`.
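If you want to see the puzzle in action, the following sketch formats two moments with all three patterns. It assumes `java.text.SimpleDateFormat` pinned to UTC and `Locale.US`; the class name and the `format` helper are invented for this example. One minute past midnight on New Year's Day is such a "right moment" where all three agree.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DatePatternPuzzle {

    // Hypothetical helper: formats an ISO-8601 instant in UTC with Locale.US.
    static String format(String pattern, String isoInstant) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(Date.from(Instant.parse(isoInstant)));
    }

    public static void main(String[] args) {
        String[] patterns = {"MM-DD-YYYY", "mm-dd-yyyy", "MM-dd-yyyy"};

        // The "right moment": month, minute, day of month and day of year
        // are all 01, and week year equals calendar year.
        for (String p : patterns) {
            System.out.println(p + " -> " + format(p, "2020-01-01T00:01:00Z"));
        }
        // Each line ends in 01-01-2020

        // An ordinary moment: 9 July 2020, 15:30 UTC
        for (String p : patterns) {
            System.out.println(p + " -> " + format(p, "2020-07-09T15:30:00Z"));
        }
        // MM-DD-YYYY -> 07-191-2020
        // mm-dd-yyyy -> 30-09-2020
        // MM-dd-yyyy -> 07-09-2020
    }
}
```

A unit test that happens to run with a fixed date in early January would pass for all three patterns, which is one reason this kind of bug survives for so long.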
CU
0xff