Skip to content

Conversation

@thelleboid-tsr
Copy link
Contributor

Summary:
update TripAndShapeDistanceValidator to ignore shape with no shape_dist_traveled

Expected behavior:

Some GTFS feeds do not provide shape_dist_traveled in shapes.txt In this case, the value is set to 0 in the code, so sorting shapes based on ShapeDistTraveled does not work and usually returns the first shape row.

It leads to an error, the first shape of the shape is usually very far away from the last stop of the trips following the shape.

We can assume the biggest shape_dist_traveled of a given shape should never be 0, otherwise it means the value is not set. In this case, we return early to ensure no validation error will be issued.

Fixes #2018

⚠️ I've not been able to run unit tests on this PR

  • Run the unit tests with gradle test to make sure you didn't break anything
  • Add or update any needed documentation to the repo
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification(https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@welcome
Copy link

welcome bot commented Apr 18, 2025

Thanks for opening this pull request! You're awesome. We use semantic commit messages to streamline the release process. Before your pull request can be merged, you should update your pull request title to start with a semantic prefix. Examples of titles with semantic prefixes:

  • fix: Bug with ssl network connections + Java module permissions.
  • feat: Initial support for multiple @PrimaryKey annotations.
  • docs: update RELEASE.md with new process
    To get this PR to the finish line, please do the following:
  • Read our Contribution Guidelines
  • Follow Google Java style coding standards
  • Include tests when adding/changing behavior
  • Include screenshots

@CLAassistant
Copy link

CLAassistant commented Apr 18, 2025

CLA assistant check
All committers have signed the CLA.

@emmambd
Copy link
Contributor

emmambd commented Apr 22, 2025

Thank you for this great contribution @thelleboid-tsr! The team at MobilityData will take a look and review this week.

.setShapePtLon(lonLat)
.setShapePtSequence(0)
.setShapeDistTraveled(shapeDistTraveled + i)
.setShapeDistTraveled(shapeDistTraveled * i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!
Why this change?
It makes some tests fail:

TripAndShapeDistanceValidatorTest.testTripDistanceExceedsShapeDistance
TripAndShapeDistanceValidatorTest.testTripDistanceExceedsShapeDistanceWarning

Your test will work if you keep the line as is (i.e +1), since you set shapeDistTraveled to 0 when calling createShapeTable and only one row is created.

You can run the tests locally with:
./gradlew test

List<ValidationNotice> notices =
generateNotices(
createTripTable(2),
createStopTimesTable(1, 10.0),
Copy link
Contributor

@jcpitre jcpitre Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will not really test your change. You want to set test data that would normally create a notice but will not because shapeTable.shapeDistTraveled is 0 (please correct me if I am wrong).
But to generate a notice you need the distance to be at least 11.1 meters. And this test sets the distance to 10 meters. So the test as it is now will not generate notices, because the distance falls below the threshold.

I believe that if you set createStopTimesTable(1, 100.0) (for example), it will test properly, meaning the distance is above the 11.1 meter threshold, but it will still not generate a notice because shapeTable.shapeDistTraveled is 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I wanted to implement: "if ShapeDistTraveled is zero, then it does not generate notice nor warning" (because it returns early on line 89).

That's why I've tried setShapeDistTraveled(shapeDistTraveled * i) , so when shapeDistTraveled is 0, shapeDistTraveled * i is also 0 too.

I'm a bit lost about which tests currently fail. I do see job https://github.com/MobilityData/gtfs-validator/actions/runs/14531693846/job/40961320600?pr=2020 failed, but can't find where.

I'll revert my change to set shapeDistTraveled + i back, but I suspect the test I've introduced will raise a notice, instead of returning without any notice/warning.

Feel free to push changes on my PR to fix it! I don't mind at all (and will learn a bit about Java :-)

@skalexch
Copy link
Contributor

@thelleboid-tsr I don't see a problem spec-wise. Once you're able to generate acceptance tests, I can take a look at the data and notices.

@emmambd
Copy link
Contributor

emmambd commented Apr 23, 2025

Hi @skalexch! @thelleboid-tsr already generated the acceptance tests at https://github.com/MobilityData/gtfs-validator/actions/runs/14531693839 so we can review them there. There's a separate issue for generating a comment when the acceptance test is created from forked PRs. .

Copy link
Contributor

@skalexch skalexch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emmambd thank you. @thelleboid-tsr looking at the datasets, it seems like the error is dropped for both feeds that do not have the column (eg mdb-67) and feeds that have the field but it's empty (eg mdb-1270). All good spec-wise! I'll leave the approval for dev.

…pe_dist_traveled (MobilityData#2018)

Some GTFS feeds do not provide shape_dist_traveled in shapes.txt
In this case, the value is set to 0 in the code, so sorting shapes based
on ShapeDistTraveled does not work and usually returns the first shape
row.

It leads to an error, the first shape of the shape is usually very far
away from the last stop of the trips following the shape.

We can assume the biggest shape_dist_traveled of a given shape should
never be 0, otherwise it means the value is not set. In this case, we
return early to ensure no validation error will be issued.

Fixes MobilityData#2018
Copy link
Contributor

@jcpitre jcpitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
The increased number of corrupted feeds is not related to this PR.

@jcpitre jcpitre merged commit cdd44eb into MobilityData:master Apr 28, 2025
133 of 134 checks passed
@welcome
Copy link

welcome bot commented Apr 28, 2025

🥳 Congrats on getting your first pull request merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

trip_distance_exceeds_shape_distance: do not trigger when shapes.shape_dist_traveled column is empty

5 participants