These data are stored in the partitioned table schema ttc.cis
.
Column | Type/Format | Description/Format |
---|---|---|
message_datetime | timestamp | Position timestamp (from server poll, not vehicle) |
route | INT | route id |
run | INT | run id |
vehicle | INT | vehicle id |
latitude | TEXT | latitude position, stored as string |
longitude | TEXT | longitude position, stored as string |
position | geometry(Point, 4326) | geometry of the position, derived from the two previous columns using the trg_mk_position_geom |
See CIStable.ipynb
for an exploration of the 2017 data we received from the TTC.
The data were sent to us in monthly compressed csvs. The cis_batch_upload.sh
bash script cycles through .csv.gz
files in the current directory and copies them to ttc.cis
, a partitioned table with the above structure. Partitioning is currently by year and is handled by the cis_insert_trigger
function.
Currently data is provided in daily extracts available through sftp. This can be pulled using the pull_cis_data
python script.
You will need to add the host key to your ~/.ssh/known_hosts
file. You can do this in bash by getting the key with ssh-keyscan
:
ssh-keyscan -p 2222 sftp_url.ca
# sftp_url.ca SSH-salfkjdsa;'flkjdsa
[example.com:2222] ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB...
Copy the second line of output and paste it into your known_hosts
file. Do some text editing to remove the port number such that you're left with sftp_url.ca ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB...
.
Usage: pull_cis_data.py [OPTIONS] COMMAND [ARGS]...
Pull CIS data from the TTC's sftp server from --startdate to --enddate
The default is to grab yesterday's data. If using --startdate to
--enddate, will loop through those dates inclusively to pull each day of
data. If a dump of data spanning multiple days is available, use
--filename instead
Options:
-s, --startdate TEXT YYYYMMDD
-e, --enddate TEXT YYYYMMDD
-d, --config PATH .cfg file containing db, email, and sftp settings
--filename PATH filename to pull from sftp instead of using dates.
--help Show this message and exit.
Commands:
get Copy file of data from TTC's sftp server
upload Upload datafile to database
The default usage would be to run python pull_cis_data.py
to pull yesterday's data and upload it to the server. If there has been a large dump of data you can use the --filename
option like: `python pull_cis_data.py --filename
cis_processing.md
details the process for generating stop arrival and departure times from the CIS GPS positions with a companion Jupyter Notebook validating_cis_processing.ipynb
that details validation of each step of the processing.
Previous version of the AVL TTC data.
Incoming data is stored in the avl
table of the ttc
schema.
Column | Type/Format | Description/Format |
---|---|---|
LineNumber | smallint | Route number |
Date_Key | integer | Date values concatenated into a number; YYYYMMDD |
Date_KeyName | text | Day of week |
PatternName | text | Some shorthand description of the preplanned route |
DirectionName | text | Direction of travel; EAST or WEST |
TripId | integer | Identifier for the trip; 8 digits |
VehicleNumber | smallint | Vehicle identifier; 4 digits |
VehicleType | text | Describes the type of vehicle; streetcars or replacement buses |
TimePeriod_FromStop | text | Time period of the day; AM Peak, PM Peak, Midday, Evening, Other; with hour range |
FromStop | text | Stop description of From stop |
ArrivalTime_FromStop | timestamp without time zone | Timestamp of actual arrival time to From stop; mm/dd/yyyy h:MM:ss AM/PM |
ScheduledArrivalTime_FromStop | timestamp without time zone | Timestamp of scheduled arrival time to From stop; mm/dd/yyyy h:MM:ss AM/PM |
ScheduleAdherence_FromStop(Seconds) | integer | Difference between actual and scheduled arrival times; (+) is late, (-) is early |
ToStop | text | Stop description of To stop |
ArrivalTimeTime_ToStop | timestamp without time zone | Timestamp of actual arrival time to To stop; mm/dd/yyyy h:MM:ss AM/PM |
ScheduledArrivalTime_ToStop | timestamp without time zone | Timestamp of scheduled arrival time to To stop; mm/dd/yyyy h:MM:ss AM/PM |
ScheduleAdherence_ToStop(Seconds) | integer | Difference between actual and scheduled arrival times; (+) is late, (-) is early |
CurrentStop_Latitude | numeric | FromStop; Latitude to 6 decimal points |
CurrentStop_Longitude | numeric | FromStop; Longitude to 6 decimal points |
CurrentStop_PointStop_Key | integer | FromStop; Identifier for lookup to PointStop table |
Lead_Latitude | numeric | ToStop; Latitude to 6 decimal points |
Lead_Longitude | numeric | ToStop; Longitude to 6 decimal points |
Lead_PointStop_Key | integer | ToStop; Identifier for lookup to PointStop table |
Using route_id
stored in table ttc_routes
, segment_id
stored in table ttc_segments
, and the connection and order provided by table ttc_route_segments
, a lookup table was created so that TTC segments could be aggregated into matching Bluetooth segments.
Column | Type | Description |
---|---|---|
bdit_id | serial | unique id |
bt_id | integer | Bluetooth id derived from report name |
bt_id_name | text | report name of Bluetooth segment, from all_analyses |
segment_id | integer | TTC segment id |
f_id | integer | from stop id |
f_stopname | text | from stop name |
t_id | integer | to stop id |
t_stopname | text | to stop name |