Description
Currently, there are a few subtle bugs in the code that gets executed by the harvest API. It's all very terrible and mostly due to badly designed libs IMO. I could identify these four problems:
- By default, GSON uses the local time zone to format
Date
values. If sender and receiver are in different time zones, stuff breaks. - By default, GSON serializes
Date
values without milliseconds. So we lose milliseconds in a GSON roundtrip. Date
with JPAs@Temporal(TemporalType.TIMESTAMP)
is turned into aDATETIME
in mysql which by default does not store milliseconds. This is different from PostgreSQL where milliseconds are stored (required by the SQL standard for this type, but who cares about that...)- Something weird is going on with
Date
and JPA regarding time zones. Where I'm testing right now, the mysql server runs in UTC, while the Java application runs in UTC+2. This leads to wrong (2 hours too early) times received from the database.
While GSON and JPA is certainly also to blame, the common denominator seems to be Date
. And in fact, apparently it's a terrible legacy class that one should not use anymore at all. Apparently with Java 8 a bunch of not-terribly-broken classes were added with java.time
. It seems like we should just use those where ever possible.
I stopped working on this now as it was quite a time sink (plus very annoying) and the concrete problem with the Bern deployment was fixed. Somehow. I don't know what changed but it's not a problem anymore.
Some notes from my research into this:
- JPA 2.2 adds support for
java.time
classes. - JPA is just a specification. Common implementations are hibernate and eclipselink. Opencast uses eclipselink.
- We likely want to use
java.time.Instant
orjava.time.OffsetDateTime
formodified_date
anddeletion_date
.- To avoid time zone confusions in the future, it's probably useful to store the time zone in the DB explicitly (always UTC!). This is technically wasted memory, but probably worth it to save dev time and headaches.
OffsetDateTime
would fit that description more closely.- mariadb/mysql don't seem to support
timestamp with time zone
?!
- mariadb/mysql don't seem to support
- But of course, storing the explicit offset in every database row is not necessary and one can just declare the timestamp to be in UTC. And in Java, using
Instant
is more appropriate here. Instant
should be mapped toTIMESTAMP_WITH_TIMEZONE
. For mysql, it results in the typedatetime(3)
.Instant
is not mentioned in JPA 2.2 directly and not supported by eclipselink. It is supported by hibernate.OffsetDateTime
is supported by both.- This answer says
Instant
is better: https://stackoverflow.com/a/50039491/2408867
- To avoid time zone confusions in the future, it's probably useful to store the time zone in the DB explicitly (always UTC!). This is technically wasted memory, but probably worth it to save dev time and headaches.
- One can use
javax.persistence.AttributeConverter
to map the type of your entity during (de)serialization. - The standard serialization of GSON for
Instant
is:{ "seconds": 1656425299, "nanos": 763000000 }
- WIP patch can be found here
- The GSON serialization is not nice to read, but it's probably fine because the relevant API is ONLY for internal remote-impl use anyway. Or does this serialization appear in any other API?
- The database stuff also seems to work alright.
OffsetDateTime
is stored to the DB (which, DB systems that support it, should result intimestamp with time zone
), butInstant
is used in the Java code. Whether there can be weird effects when talking to DB, I don't know.
Some other links:
- Relevant code of
eclipselink
. - https://in.relation.to/2018/02/20/java8-date-time-mapping/
- https://velog.io/@lsb156/Instant-vs-LocalDateTime
- https://thorben-janssen.com/map-date-time-api-jpa-2-2
- https://docs.oracle.com/javase/8/docs/api/java/time/package-summary.html
- https://stackoverflow.com/q/49309076/2408867