
Add cycling network #551

Open
wants to merge 14 commits into
base: cycling-2025
from

Conversation

ryn-trnr
Collaborator

@ryn-trnr ryn-trnr commented Jul 1, 2025

The purpose of this new cycling-2025 branch is to provide a starting point for generating cycling-related indicators to add to the GHSCI software in future. This code contribution aims to generate a cycling network, as distinct from the existing pedestrian network. We still need to discuss whether to focus on 'protected' cycling networks or not, as this will require further work to fine-tune how we fetch this kind of data from OpenStreetMap for any city.

In the meantime, this code contribution appends the suffix _pedestrian to all existing instances of the network and its associated edges, nodes and intersections, and generates a second network whose edges, nodes and intersections carry the suffix _cycling. The 'cycleway' attribute has also been added to both outputs, as it will probably be useful for future processing. Unlike the existing default 'pedestrian' network, the cycling network uses no custom filter; it uses the OSMnx 'bike' network type instead, the definition of which can be found here:

https://github.com/gboeing/osmnx/blob/b5538d0182b012fe20da46d682db3b6df43f3f6a/osmnx/_overpass.py#L108-L115

A custom filter could be added following these steps:

  1. In process/configuration/templates/config.yml, around line 36, add a new 'cycling' query following the syntax of the existing 'pedestrian' query.
  2. Then, in process/subprocesses/_03_create_network_resources.py, around line 525 (where G_proj_cycling is created), replace this line:
    r, ghsci.settings['network_analysis'],
    with:
    r, ghsci.settings['network_analysis']['cycling'],
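
As a rough sketch of the two steps above, the snippet below composes a cycling filter string and looks it up by mode from a settings dictionary. The helper names `build_custom_filter` and `select_filter`, the layout of `settings`, and the filter values themselves are illustrative assumptions; they are not the project's actual configuration, nor the OSMnx 'bike' definition linked above.

```python
# Hypothetical sketch: names, settings layout and filter values are assumptions
# made for illustration only.


def build_custom_filter(excluded_highways, extra_clauses=()):
    """Compose an Overpass-style filter string of the kind OSMnx accepts as a custom filter."""
    excluded = "|".join(excluded_highways)
    highway_clause = f'["highway"!~"{excluded}"]'
    return '["highway"]["area"!~"yes"]' + highway_clause + "".join(extra_clauses)


# Stand-in for the parsed configuration: one query per mode under 'network_analysis'.
settings = {
    "network_analysis": {
        "pedestrian": '["highway"]["area"!~"yes"]["foot"!~"no"]',  # placeholder value
        "cycling": build_custom_filter(
            ["motorway", "steps", "construction"],
            ['["bicycle"!~"no"]', '["service"!~"private"]'],
        ),
    }
}


def select_filter(settings, mode):
    """Return the filter string configured for a given mode."""
    return settings["network_analysis"][mode]


cycling_filter = select_filter(settings, "cycling")
print(cycling_filter)
```

The resulting string is the kind of value that could be passed to the `custom_filter` argument of OSMnx's graph download functions in place of a named network type.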

@ryn-trnr ryn-trnr requested a review from carlhiggs July 1, 2025 14:06
@ryn-trnr ryn-trnr self-assigned this Jul 1, 2025
@carlhiggs
Collaborator

carlhiggs commented Jul 2, 2025

Thanks for your work on this @ryn-trnr, this is a really interesting contribution. I have had a quick look through the code and can see you've put a lot of thought into it. One thing we should consider (before or after merging into the cycling branch, but before we bring it into the main software) is how to avoid overwhelming people with options. I can understand the logic of splitting the optional 'intersections' configuration (and in principle, there should also be 'network') into 'pedestrian intersections' and 'cycling intersections' in the new branch (and, if we implemented it, 'pedestrian network' and 'cycling network'), but I wonder how to do this without making the configuration file overwhelming (e.g. as by default, people won't specify this).

The role of custom intersections data is to evaluate street connectivity per sqkm, e.g. using a local government's official network, so that estimates are 'clean' and might make more sense than some OpenStreetMap representations that could be compromised by individual mappers' cartographic, routing or other interests. (OSMnx has useful algorithms for cleaning networks, but sometimes, if it's available, an official dataset might still be preferable.) Is this a metric people are interested in for cycling? I can imagine it could be in theory, but if it isn't, maybe we shouldn't over-complicate things. We don't use this data for routing.

When we have a GUI implemented for configuration (I made a start on this, but stopped as there seemed to be more interest in starting from scratch), this could sidestep things, as optional configurations could be hidden under advanced options. But as I was reviewing the modified existing configuration files, I wondered if we are making things more overwhelming.

The other consideration is that, if we're modifying the configuration file, we should also modify its schema, which in principle should test whether the configuration file matches the schema and perhaps alert if it doesn't. I set up a workflow for this previously, but I wonder now if it's working as intended, as I'd expect it to fail with these modifications (or maybe it doesn't because the modified file is a superset?). Anyway, we should consider and maintain the JSON schema as part of this change.

You've done a lot of work here Ryan and I want to do it justice; I'll have a closer look later in the week.

@@ -52,4 +52,4 @@ documentation:
# Contact e-mail (for metadata)
points_of_interest:
destination_list: [supermarket_osm, bakery_osm, meat_seafood_osm, fruit_veg_osm, deli_osm, convenience_osm, petrolstation_osm, newsagent_osm, market_osm, pt_any]
- osm_destination_definitions: process/configuration/osm_destination_definitions.csv
+ osm_destination_definitions: process/configuration/osm_destination_definitions.csv
Collaborator

I'm not sure what's changed here?

@@ -130,4 +130,4 @@ def main():


if __name__ == '__main__':
- main()
+ main()
Collaborator

not clear to me what has changed here

@@ -74,7 +74,7 @@ def derive_population_grid_variables(r):
SELECT h."grid_id",
COUNT(i.*) intersection_count
FROM {r.config["population_grid"]} h
- LEFT JOIN {r.config["intersections_table"]} i
+ LEFT JOIN {r.config["intersections_pedestrian_table"]} i
Collaborator

@carlhiggs carlhiggs Jul 3, 2025

I'm not necessarily convinced that it is worthwhile evaluating cycling intersection connectivity as distinct from pedestrian connectivity... Is this something that people use and are interested in? If it's not, I'd be inclined to avoid the extra complication of introducing a separate dataset for cycling (but see my later comment on a possible approach using a single dataset).

Collaborator

As per other comments, I'm not sure that the complication of splitting out cycling from pedestrian intersections (assuming that is the data the user configures) is worthwhile if this is not a measure that will actually be used. Let's do it for now, but canvass our colleagues whose research focuses on cycling as to whether this distinction matters for them.

@@ -24,6 +24,8 @@ def osmnx_configuration(r):
"""Set up OSMnx for network retrieval and analysis, given a configured ghsci.Region (r)."""
ox.settings.use_cache = True
ox.settings.log_console = True
+ # Include 'cycleway' as an attribute in the 'edges' outputs
+ ox.settings.useful_tags_way = ox.settings.useful_tags_way + ['cycleway']
Collaborator

@carlhiggs carlhiggs Jul 3, 2025

Nice! I can see this as being a really useful addition for post hoc analyses (as suggested elsewhere); thanks @ryn-trnr

Collaborator Author

Any additional OSM tags that we may need to include in the edges outputs can be added using this syntax

Collaborator

@carlhiggs carlhiggs Jul 3, 2025

I wonder: rather than force a distinction between 'pedestrian' (which in our definition implicitly allows cycling, in contrast to OSMnx's default modes that distinguish walk, which excludes cycleways, and bike) and 'cycling' for the routable network, perhaps we could instead (maybe more simply) use the existing active transport network (or potentially an updated version that explicitly includes cycling) and conduct additional routing analyses for cycling- or other mode-specific purposes using some derived 'stress weight' based on the available tags, which could serve as a kind of impedance additional to distance. So, when finding the least-cost path, it would take into account the degree to which the existing network actually provides quality cycling infrastructure, based on the available information. This would save constructing multiple networks and avoid the risk of complicating configuration options. Perhaps instead there could be an advanced option for specifying a list of accessibility penalties, each associated with a meaningful name; one of these could be 'cycling', but potentially other penalty functions could be defined. Then all accessibility indicators would be calculated for the status quo, plus the penalty scenarios. Additional amenity maps could be generated to highlight the attributes defined as facilitators of or barriers to cycling. I imagine this would be feasible and involve less code modification: it uses the same network, but evaluates a continuous score for suitability for cycling. What do you think @ryn-trnr?
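
To make the penalty idea concrete, here is a minimal sketch of least-cost routing over a single network, where each named scenario multiplies edge length by a penalty derived from the edge's tags. Everything in it is an assumption for illustration: the `PENALTIES` dictionary, the multiplier values, the toy edge list and the hand-written Dijkstra search stand in for whatever routing the software actually uses.

```python
import heapq

# Hypothetical penalty functions keyed by scenario name; multipliers are illustrative only.
PENALTIES = {
    "status_quo": lambda tags: 1.0,
    "cycling": lambda tags: 1.0 if tags.get("cycleway") else 2.0,
}


def least_cost(edges, source, target, penalty):
    """Dijkstra over undirected edges (u, v, length_m, tags); cost = length * penalty(tags)."""
    graph = {}
    for u, v, length, tags in edges:
        cost = length * penalty(tags)
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return float("inf")


# Toy network: a short direct street without cycling infrastructure,
# and a longer detour along a tagged cycleway.
edges = [
    ("a", "b", 100.0, {}),
    ("a", "c", 80.0, {"cycleway": "track"}),
    ("c", "b", 80.0, {"cycleway": "track"}),
]

results = {name: least_cost(edges, "a", "b", fn) for name, fn in PENALTIES.items()}
print(results)
```

In this toy case the status quo scenario takes the direct street, while the 'cycling' scenario prefers the detour because the unprotected street is penalised; the same network serves both.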

Collaborator

In the level of traffic stress (LTS) or bicycle routing world, the roadway itself is often considered part of the routable network for bikes (as a high-stress road that can still act as part of the active travel network). If we are to calculate accessibility indicators, looking only at ways specifically tagged as cycleway or footway might be too narrow a definition and may underestimate accessibility. If we are to think in terms of the stress level framework, we should probably take into account the different roadway classifications and attributes such as maximum speed limit, number of lanes, etc. @carlhiggs I like your thinking on the derived 'stress weight'; it is similar to our approach of assigning stress levels to ways based on different roadway attributes. It would be worth a discussion to come up with a generalized LTS framework for the global context. Once we have the stress level method specified, we could probably introduce the R5 router to calculate bike accessibility indicators on networks at different stress levels. The frameworks from these two articles, which someone in our group shared before, could be quite useful: https://pmc.ncbi.nlm.nih.gov/articles/PMC12153112/ and https://doi.org/10.1016/j.cities.2024.105526. We can draw on them for low-stress bike path indicators like connectivity and accessibility, as they do inform recreational cycling and physical activity.
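
A tag-based stress assignment of the kind described could look something like the sketch below. The rules and thresholds are placeholders invented for illustration; they are not taken from the linked articles or from any agreed LTS framework, and the `stress_level` and `to_number` helpers are hypothetical.

```python
def to_number(value):
    """Parse the leading number of a tag value such as '50' or '4'; None if absent or unparsable."""
    try:
        return float(str(value).split()[0])
    except (ValueError, IndexError):
        return None


def stress_level(tags):
    """Return an illustrative stress level from 1 (low) to 4 (high) for one OSM way.

    Assumption: thresholds are placeholders, and maxspeed is read as a bare
    number without unit conversion (so '30 mph' is treated as 30).
    """
    if tags.get("highway") == "cycleway" or tags.get("cycleway") == "track":
        return 1  # separated cycling infrastructure
    speed = to_number(tags.get("maxspeed"))
    lanes = to_number(tags.get("lanes"))
    if tags.get("highway") in ("residential", "living_street") and (
        speed is None or speed <= 30
    ):
        return 2  # quiet local street
    if (speed is not None and speed > 50) or (lanes is not None and lanes > 2):
        return 4  # fast or wide road
    return 3


examples = [
    {"highway": "cycleway"},
    {"highway": "residential", "maxspeed": "30"},
    {"highway": "tertiary", "maxspeed": "50", "lanes": "2"},
    {"highway": "primary", "maxspeed": "60", "lanes": "4"},
]
levels = [stress_level(tags) for tags in examples]
print(levels)
```

A level assigned this way could then feed a routing weight, or be used to select the sub-network at or below a chosen stress threshold.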

Collaborator

This code here is actually pretty crucial for routing; it's involved in pre-computing accessibility distances from sampled locations to their nearest intersections, which we later use to calculate full distances. So, if you were routing by different networks, maybe the code here should be modified to evaluate this for the other network. I think it might actually over-complicate things (I understand why you didn't go this far, for that reason). My inclination would be not to have a separate network strictly for routing, but to account for this in routing via a suitability weight (which could be evaluated in permutations for different scenarios, or modes), as suggested in a different comment.

Collaborator

@carlhiggs carlhiggs Jul 3, 2025

If we were able to keep things simple by using a single network, but evaluating it for different purposes based on a custom definition that interprets the attributes on each edge as a suitability weight, then we wouldn't need to change this file, except for the routing portion. The current changes really only point to the renamed existing active transport network and replicate the status quo, so if we didn't distinguish networks a priori (only post hoc as required, by querying, e.g. using the cycleway tag you've added), the current edits here wouldn't be required.

if (
- 'intersections' in r['network']
- and r['network']['intersections'] is not None
+ 'intersections_pedestrian' in r['network']
Collaborator

@carlhiggs carlhiggs Jul 3, 2025

As per elsewhere, I'm not sure the distinction between pedestrian and cycling for intersection evaluation in this way (purely for calculating the intersections per sqkm statistic) is necessary, and so it may be an over-complication. Perhaps there could be a different approach where the single set of configured intersections is evaluated post hoc (as suggested elsewhere), e.g. for 'junction stress'. This wouldn't strictly require a separate dataset; it would instead identify some intersections post hoc as more suitable than others. That could be interesting for an interactive map perhaps, or to account for in routing. Or, if it were of interest, there could be a cycling-weighted intersection density statistic (using the same data, but with a custom weight for cycling suitability). Do you think something like that could work @ryn-trnr?
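
As a minimal sketch of a cycling-weighted intersection density, the same set of intersections can be summed with and without a suitability weight. The `signalised` attribute, the weight values and the `intersection_density` helper are hypothetical and chosen only to show the arithmetic.

```python
def intersection_density(intersections, area_sqkm, weight=lambda i: 1.0):
    """Sum of per-intersection weights divided by area; the default weight gives a plain count per sqkm."""
    return sum(weight(i) for i in intersections) / area_sqkm


# Toy data: 'signalised' is a made-up attribute standing in for any
# junction characteristic relevant to cycling suitability.
intersections = [
    {"id": 1, "signalised": True},
    {"id": 2, "signalised": False},
    {"id": 3, "signalised": True},
    {"id": 4, "signalised": False},
]


def cycling_weight(intersection):
    """Illustrative weight: count less suitable junctions as half an intersection."""
    return 1.0 if intersection["signalised"] else 0.5


plain = intersection_density(intersections, area_sqkm=2.0)
weighted = intersection_density(intersections, area_sqkm=2.0, weight=cycling_weight)
print(plain, weighted)
```

Both statistics come from one dataset, so no separate cycling intersections table would need to be configured.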

@ryn-trnr
Collaborator Author

Thanks for all the comments @carlhiggs. Yes, I agree that perhaps making a second version of the network is not necessary and probably overcomplicates things, especially with the intersection datasets.

I totally agree with your comment that, instead of using two networks for 'pedestrian' and 'cycling', we should rather use the existing network (renamed to active transport) and then 'conduct additional routing analyses for cycling using some derived 'stress weight' based on the available tags, that could serve as a kind of impedance additional to distance.'

So for now, I'll go through and push a few more commits to this PR to undo some of these changes and make sure all the relevant tags that we would require to perform mode-specific routing are included in the 'edges' output. I'll include any tags from our existing custom pedestrian definition, and from OSMnx's 'walk' and 'bike' filters, that aren't already included in our 'edges' output. In this way, we can filter the network based on these tags after network generation, and then design some code to apply different routing definitions based on mode, i.e. walking or cycling.

@carlhiggs
Collaborator

Thanks @ryn-trnr, It would be good to loop @shiqin-liu , @gboeing and @rschifan into this from both indicator and software perspectives for modelling cycling to get their thoughts. They might have different perspectives or other insights, or fully endorse the approach, which would be great to know. Thanks again Ryan for all your thought and work on this.

@ryn-trnr
Collaborator Author

ryn-trnr commented Jul 11, 2025

Ok, basically all this PR now does is add four attributes to the 'edges' output. Adding foot, sidewalk, bicycle and cycleway gives us all the elements I believe we need to filter the network based on our existing custom 'pedestrian' definition and OSMnx's 'walk' and 'bike' definitions. We can now perform mode-specific routing, such as for walking and cycling, based on different filtering of the 'active travel' network.
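
For illustration, filtering a single 'active travel' edge set by mode using these four tags might look like the sketch below. The rules in `usable` are simplified assumptions, not the project's pedestrian definition or OSMnx's 'walk' and 'bike' filters, and the edge records are toy data.

```python
# Toy edge records carrying the tags added in this PR (foot, sidewalk, bicycle, cycleway).
EDGES = [
    {"id": 1, "highway": "footway", "bicycle": "no"},
    {"id": 2, "highway": "residential"},
    {"id": 3, "highway": "cycleway", "foot": "no"},
    {"id": 4, "highway": "primary", "sidewalk": "both", "cycleway": "lane"},
]


def usable(edge, mode):
    """Hypothetical mode rules based only on the edge's tags."""
    if mode == "walk":
        return edge.get("foot") != "no" and edge.get("highway") != "cycleway"
    if mode == "bike":
        return edge.get("bicycle") != "no" and edge.get("highway") not in ("footway", "steps")
    raise ValueError(f"unknown mode: {mode}")


walk_ids = [e["id"] for e in EDGES if usable(e, "walk")]
bike_ids = [e["id"] for e in EDGES if usable(e, "bike")]
print(walk_ids, bike_ids)
```

The same pattern would apply to a GeoDataFrame of edges, with the rules expressed as boolean masks over the tag columns.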

This PR is also showing some diffs in the very last line of some py files that I modified previously and then reverted to the original main state. I guess this is because of end-of-line differences between CRLF and LF. I am working on Windows but haven't had this happen before. It doesn't seem to affect functionality; it is just troublesome because there aren't supposed to be any tangible changes to those files.
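
One way to check whether a last-line diff comes from line endings or a missing final newline is to inspect the raw bytes. The helper below is a small sketch with a hypothetical name; it works on bytes such as those returned by `pathlib.Path(...).read_bytes()`.

```python
def line_ending_summary(data: bytes):
    """Count CRLF and bare LF line endings and report whether the data ends with a newline."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LF characters not preceded by CR
    return {"crlf": crlf, "lf": lf, "final_newline": data.endswith(b"\n")}


# Three byte strings that look identical in a rendered diff.
windows_style = line_ending_summary(b"if __name__ == '__main__':\r\n    main()\r\n")
unix_style = line_ending_summary(b"if __name__ == '__main__':\n    main()\n")
no_final_newline = line_ending_summary(b"if __name__ == '__main__':\n    main()")
print(windows_style, unix_style, no_final_newline)
```

If the summaries differ between the branch and main, a repository-level `.gitattributes` rule such as `* text=auto` is one possible way to normalise line endings across platforms.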

@ryn-trnr
Collaborator Author

Fixed the end of file differences issue - now only the 2 files that are supposed to have changes have been modified in this PR

@@ -24,6 +24,8 @@ def osmnx_configuration(r):
"""Set up OSMnx for network retrieval and analysis, given a configured ghsci.Region (r)."""
ox.settings.use_cache = True
ox.settings.log_console = True
+ # Include additional attributes in the 'edges' outputs for later mode-specific routing and analysis
+ ox.settings.useful_tags_way = ox.settings.useful_tags_way + ['foot'] + ['sidewalk'] + ['bicycle'] + ['cycleway']
Collaborator

Can simplify to:

ox.settings.useful_tags_way += ['foot', 'sidewalk', 'bicycle', 'cycleway']

@shiqin-liu
Collaborator

Thanks @ryn-trnr, It would be good to loop @shiqin-liu , @gboeing and @rschifan into this from both indicator and software perspectives for modelling cycling to get their thoughts. They might have different perspectives or other insights, or fully endorse the approach, which would be great to know. Thanks again Ryan for all your thought and work on this.

Thanks @ryn-trnr for the hard work, and @carlhiggs for looping me in. It is a great idea to add those customized tag attributes to allow further configuration of the network. It would be especially useful as we are thinking of further categorizing the bicycle network in terms of stress level. As I mentioned in an earlier comment, there might be other tags we could consider adding that would help classify the stress, which would need more thorough thinking. I agree it is a good starting point for the cycling work within our software framework.


4 participants