Right now, the persistent identifier contains a UTC timestamp. This should be local time in the "Europe/Brussels" time zone. Using moment.js, we should be able to get the right behavior. If we decide to go ahead with this, we should also change it in the repo https://github.com/iRail/gtfsrt2lc.
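To make the difference concrete, here is a minimal sketch of how the same instant yields different timestamps (and even different dates) in UTC and in Brussels local time. It uses the built-in `Intl.DateTimeFormat` rather than moment.js, and the function name `formatInZone` and the `YYYYMMDDTHHmm` format are assumptions for illustration, not the format the current code uses.

```javascript
// Sketch: render one instant as a compact timestamp in a given IANA time zone.
// formatInZone and the YYYYMMDDTHHmm layout are hypothetical, for illustration only.
function formatInZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-GB', {
    timeZone,
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', hourCycle: 'h23',
  }).formatToParts(date);
  // Pick a single field (year, month, ...) out of the formatted parts.
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}${get('month')}${get('day')}T${get('hour')}${get('minute')}`;
}

// 24 July 2016, 23:30 UTC is already 25 July, 01:30 in Brussels (CEST, UTC+2).
const instant = new Date(Date.UTC(2016, 6, 24, 23, 30));
const utcStamp = formatInZone(instant, 'UTC');
const brusselsStamp = formatInZone(instant, 'Europe/Brussels');
console.log(utcStamp, brusselsStamp);
```

Late-evening departures therefore land on a different calendar day depending on which zone the identifier is built in.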
Perhaps we should get rid of the time altogether and use only a date. However, the problem is bigger than this:
Problem
The GTFS ZIP files are updated every 3 months and each covers a time span of 4 months. This means that every 3 months there is a 1-month overlap of schedules that may have changed slightly. E.g., the planned departure time of a train may have shifted by one or more minutes (this actually happens from time to time). With our current identifier strategy, which includes the departure time, we will get a lot of duplicate connections.
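A small sketch of why this happens, under assumed names: the URI template, the field names, and the sample values below are all made up and do not reflect the real identifier format. The same train appears in two overlapping feeds with a one-minute shift; a time-based identifier treats it as two connections, while a date-only identifier does not.

```javascript
// Hypothetical identifier builder: the base URI and field names are assumptions.
function connectionId(conn, { withTime }) {
  // Keep the full YYYYMMDDTHHmm stamp, or only the YYYYMMDD date part.
  const stamp = withTime ? conn.departure : conn.departure.slice(0, 8);
  return `http://example.org/connections/${conn.stop}/${stamp}/${conn.route}`;
}

// Made-up data: the same train in two overlapping feeds; the newer feed
// moved the planned departure by one minute.
const oldFeed = { stop: 'stopA', departure: '20160801T0905', route: 'route1' };
const newFeed = { stop: 'stopA', departure: '20160801T0906', route: 'route1' };

const timed = new Set([oldFeed, newFeed].map((c) => connectionId(c, { withTime: true })));
const dated = new Set([oldFeed, newFeed].map((c) => connectionId(c, { withTime: false })));
console.log(timed.size, dated.size);
```

With the time included, the set holds two identifiers for one real-world connection; with only the date, it holds one.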
Possible solutions
1. We use the departure date (year, month, day) of the trip's departure for all connections
Then it becomes hard to generate a URI automatically for a connection whose trip started on a previous day. Could we find a solution for this somehow?
Mind that in this case we will still have a duplicate entry problem when a trip's start date changes to the next day. However, the probability that this happens is pretty low.
2. We use the departure date (year, month, day) of the connection itself
Then we still have the problem that a connection can move from one day to another. The chance that this happens is small though, and the problems for our archive are small?
Earlier tests showed that this resulted in a lot of duplicate URIs. I should investigate why this happens, as a route id should only occur once a day...
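To help with that investigation, here is a rough sketch of a duplicate-URI detector that can be run against either option. The URI templates, the field names (`tripStartDate`, `departureDate`, `stop`, `route`), and the sample data are all assumptions; the third sample entry is a deliberate made-up collision to exercise the detector, not a claim about the actual cause.

```javascript
// Hypothetical URI builders for the two options; template and fields are assumptions.
const base = 'http://example.org/connections';
const byTripDate = (c) => `${base}/${c.tripStartDate}/${c.stop}/${c.route}`;
const byConnectionDate = (c) => `${base}/${c.departureDate}/${c.stop}/${c.route}`;

// Group connections by URI and keep only URIs shared by more than one connection.
function findDuplicates(connections, toUri) {
  const groups = new Map();
  for (const c of connections) {
    const uri = toUri(c);
    groups.set(uri, [...(groups.get(uri) || []), c]);
  }
  return new Map([...groups].filter(([, members]) => members.length > 1));
}

// Made-up data: one trip that passes midnight, plus a deliberate repeat at stopA.
const connections = [
  { route: 'route1', stop: 'stopA', tripStartDate: '20160801', departureDate: '20160801' },
  { route: 'route1', stop: 'stopB', tripStartDate: '20160801', departureDate: '20160802' },
  { route: 'route1', stop: 'stopA', tripStartDate: '20160801', departureDate: '20160801' },
];

const dupByTripDate = findDuplicates(connections, byTripDate);
const dupByConnectionDate = findDuplicates(connections, byConnectionDate);
console.log([...dupByTripDate.keys()], [...dupByConnectionDate.keys()]);
```

Printing the members of each duplicate group, rather than only the URIs, should show which fields the colliding connections actually differ in.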
pietercolpaert changed the title from "Uses UTC time in URI: should be local time" to "Persistent URI strategy is broken" on Jul 24, 2016