py conventions for trailing slash management #97

Closed
pvgenuchten opened this issue Apr 19, 2019 · 16 comments
Labels
stale Issue marked stale by stale-bot

Comments

@pvgenuchten
Contributor

I noticed that users sometimes arrive at collections/obs/items and sometimes at collections/obs/items/. Both URLs return a valid response, but browsers treat them differently when resolving relative links. From these locations a '..' link takes users to /collections in case 1 and to /collections/obs in case 2.
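
As a rough illustration of the difference described above, the following sketch resolves a '..' link against both forms with Python's standard `urllib.parse.urljoin`, which follows RFC 3986 reference resolution. The host `example.org` is an assumption made for the example; only the paths come from the report.

```python
from urllib.parse import urljoin

# Hypothetical host; only the paths are taken from the issue text.
without_slash = "https://example.org/collections/obs/items"
with_slash = "https://example.org/collections/obs/items/"

# Resolve the same relative reference against each base URL.
parent_without = urljoin(without_slash, "..")
parent_with = urljoin(with_slash, "..")

print(parent_without)  # https://example.org/collections/
print(parent_with)     # https://example.org/collections/obs/
```

The same relative reference lands one level apart, depending only on whether the base URL ends with a slash.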

Some 'solutions'

  • do not use relative links; always link to a full URL
  • check all links for consistency in adding the trailing slash (or not)
  • automatically add or remove the trailing slash when a request arrives in the other form (rewrite rule)
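
One possible shape for the rewrite-rule option is a small normalization step applied to the requested URL before links are generated or a redirect is issued. This is only a sketch: the helper name `strip_trailing_slash` and the host are hypothetical, and it is not taken from pygeoapi.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_trailing_slash(url: str) -> str:
    """Return url with trailing slashes removed from its path.

    A root path ("/") or an empty path is left unchanged.
    Hypothetical helper for illustration, not a pygeoapi function.
    """
    parts = urlsplit(url)
    # Fall back to the first character ("/" or "") when the path is only slashes.
    path = parts.path.rstrip("/") or parts.path[:1]
    return urlunsplit(parts._replace(path=path))


normalized = strip_trailing_slash("https://example.org/collections/obs/items/?f=json")
print(normalized)  # https://example.org/collections/obs/items?f=json
```

A server could compare the requested URL with the normalized one and redirect when they differ, so that every resource is served under a single form.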
@pvgenuchten
Contributor Author

@justb4 part of this seems solved by #103. Should we close this, or can it still be relevant in other scenarios?

@justb4
Member

justb4 commented May 15, 2019

@pvgenuchten I think this can be closed as indicated above. You can try it on http://geo.kralidis.ca/pygeoapi, which runs the current master branch.

@pvgenuchten
Contributor Author

pvgenuchten commented Oct 31, 2023

I want to reopen this issue, because we seem to be incompatible with STAC.

I'm not aware whether Common/Features/Records has any recommendation on trailing slashes.

STAC seems to require a trailing slash: https://github.com/radiantearth/stac-spec/blob/e8c409513fb46685556c2e340d06deaf5d4c7084/best-practices.md#consistent-uris

our behaviour:

@pvgenuchten pvgenuchten reopened this Oct 31, 2023
@pvgenuchten
Contributor Author

pvgenuchten commented Nov 1, 2023

chat with Clemens Portele on discord:

  • if you use a trailing slash, it is a different URI. In OGC API the resources do not have a trailing slash, which - from my experience - is the more common approach in APIs.
  • see how stac references use of trailing slash https://github.com/radiantearth/stac-spec/blob/e8c409513fb46685556c2e340d06deaf5d4c7084/best-practices.md#consistent-uris
  • Just look at the paths that are specified in the standards. None of them has a trailing slash.
    (Caveat: Strictly, you could argue that the Landing Page is an exception where the path is stated as "/", but this is a side effect from OpenAPI where all paths MUST start with a slash. It is not the intention that the Landing Page URI has a slash at the end.)


As per RFC4, this Issue has been inactive for 90 days. In order to manage maintenance burden, it will be automatically closed in 7 days.

@github-actions github-actions bot added the stale Issue marked stale by stale-bot label Mar 10, 2024

As per RFC4, this Issue has been closed due to there being no activity for more than 90 days.

@github-actions github-actions bot closed this as not planned Mar 24, 2024
@dblodgett-usgs

@pvgenuchten, @webb-ben, @tomkralidis -- I feel I may have stumbled into a hornet's nest?

radiantearth/stac-browser#486 (comment)

Is there any way around this issue that will satisfy both OGC-API and STAC or is the issue of trailing slashes a point of divergence and incompatibility?

@pvgenuchten
Contributor Author

pvgenuchten commented Oct 15, 2024

I guess the second:

the issue of trailing slashes is a point of divergence and incompatibility

a way forward could be to introduce a trailing slash setting in config

@m-mohr

m-mohr commented Oct 15, 2024

Wrong guess. There should not be any divergence or incompatibility. Resolving relative URIs is well-defined. Either an implementation follows RFC 3986 (see chapter 5) or it does not. STAC and OGC APIs must follow RFC 3986, otherwise they are pretty much unusable because the URIs are ambiguous and can't be resolved properly.

A trailing slash is not required, but depending on whether a trailing slash is provided in the base URL, the relative URLs must be provided differently. See some examples here: https://github.com/radiantearth/stac-spec/blob/master/best-practices.md#consistent-uris (and although this is in the STAC specification, this is generally applicable and not STAC specific at all.)

@ksonda
Contributor

ksonda commented Oct 16, 2024

I'll admit I don't know STAC exactly, but the consistent-URI guidance underscores the issue: "To avoid issues it is recommended to consistently add a slash at the end of the URL if it doesn't point to a file."

In OGC-API Features (and Common), ./collections is a resource (one could say, a file), whereas ./collections/ could be construed as a base URI for specific collections (and constituent items). However, in all extant documentation the /collections self link, and its target URL in other links, is defined in the OGC API Features spec as a resource, which thus is not supposed to be a base URI. Basically, "base URI" does not seem to be a thing that can be communicated to clients consistently by servers implementing OGC API specs. For example, constituent items of collections are advised to link to /collections/{collectionId}, not /collections/{collectionId}/ (see Example 14). So I don't think it's as simple as "conform to IETF RFC 3986", since the OGC-API Features spec would seem to be in contradiction with and/or agnostic to it. As a humble manager of developers who contribute to pygeoapi, I would not know to point people to RFC 3986, but would know to point people to OGC specs, and would look to the OGC to ensure their standards support compliance with such things if they are essential to interoperability.

@dblodgett-usgs

dblodgett-usgs commented Oct 17, 2024

Not sure how to interpret a thumbs down... but I think it is worth being clear here and teasing this apart more than that.

It seems that there are two interpretations of how OGC-API links should be constructed from a geojson feature collection.

On one hand, the id of a geojson feature within a geojson feature collection can be interpreted as a relative url where self is thought to be the base url. In this context, we would expect the overall geojson document to conform to IETF RFC 3986. So we would expect trailing slashes on self, even though the OGC-API spec does not include trailing slashes on paths, or we would expect geojson ids to be valid relative url fragments, something they are not thought to be in common usage.

On the other hand, the id of a geojson feature is an API-instance-scoped id of a feature that can be used according to the OGC-API Features convention for constructing a get item by id request. In this context, we would expect client software to have knowledge of the path template /collections/{collectionId}/items/{featureId} in requirement 33A of OGC API Features where, according to requirement 33B, "featureId is the local identifier of the feature."
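
To make the second interpretation concrete, here is a minimal sketch of a client that builds the item URL from the path template rather than resolving the id as a relative reference. The helper name `item_url`, the API root, and the feature id `0` are assumptions chosen for illustration.

```python
from urllib.parse import quote

# Path template quoted in the comment above (OGC API Features).
ITEM_TEMPLATE = "/collections/{collectionId}/items/{featureId}"


def item_url(api_root: str, collection_id: str, feature_id) -> str:
    """Build an item URL from the template; hypothetical helper for illustration."""
    path = ITEM_TEMPLATE.format(
        # Percent-encode each id so it always occupies exactly one path segment.
        collectionId=quote(str(collection_id), safe=""),
        featureId=quote(str(feature_id), safe=""),
    )
    return api_root.rstrip("/") + path


# The feature id 0 is an assumed value, not taken from the thread.
url = item_url("https://demo.pygeoapi.io/master", "lakes", 0)
print(url)  # https://demo.pygeoapi.io/master/collections/lakes/items/0
```

Under this reading the id is treated as opaque data, so a trailing slash on the self link never enters the picture.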

This is why I was asking about the use of the OGC-API Common "conformance" document. Since stac-browser is querying for conformance information, it should know that some servers that do not return an indication of STAC conformance but do indicate that they adhere to OGC-API Features will likely adhere to the latter interpretation of how geojson id should be bound to a url.

Am I off base here? I'm concerned that we are pushing the IETF URL construction standard into a place that it is not intended to apply.

@tomkralidis
Member

I would suggest we bring this up at OGC API - Common or Features as a next step.

@dblodgett-usgs

dblodgett-usgs commented Oct 17, 2024

Scanning issues -- this one pops up. opengeospatial/ogcapi-features#918

This? opengeospatial/ogcapi-features#742

Also this. opengeospatial/ogcapi-features#139

Any others that folks see that may have discussed this issue previously?

@m-mohr

m-mohr commented Oct 17, 2024

I feel like there's a mixup of various things, as the discussion is around resources, files, Features, feature IDs, trailing slashes in self links, etc., which is all irrelevant, I think. It should just be about how to resolve a relative link against a base URL. How that's done is described in RFC 3986.

It's pretty simple. Let's say you have a Feature at https://example.com/collections/123/items/321 and you want to have a link from the collection 123 to that feature, using a self link and a relative link. You have two options (you can omit the ./):

  1. self url: https://example.com/collections/123, relative URL to the feature: ./123/items/321
  2. self url: https://example.com/collections/123/, relative URL to the feature: ./items/321

Those are your options. No one cares whether you do 1 or 2, as long as it complies with RFC 3986.

Important: For this example, self url: https://example.com/collections/123 and relative URL to the feature: ./items/321 resolves to https://example.com/collections/items/321 according to the RFC, so it is not valid for this example!
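
The three combinations can be checked with Python's `urllib.parse.urljoin`, which implements RFC 3986 resolution. This is only a sketch of the example above, with the path spelled /collections/ as in the Feature URL; the variable names are illustrative.

```python
from urllib.parse import urljoin

target = "https://example.com/collections/123/items/321"

# Option 1: self URL without trailing slash, relative URL repeats the last segment.
option_1 = urljoin("https://example.com/collections/123", "./123/items/321")

# Option 2: self URL with trailing slash, relative URL starts below the collection.
option_2 = urljoin("https://example.com/collections/123/", "./items/321")

# Mixed form: no trailing slash, but the relative URL assumes there is one.
mixed = urljoin("https://example.com/collections/123", "./items/321")

print(option_1 == target)  # True
print(option_2 == target)  # True
print(mixed)               # https://example.com/collections/items/321
```

In the mixed form the last path segment (123) is dropped during resolution, which is the lost collection id described in the comment.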

I'm actually not sure what to discuss here. Follow the RFC or accept that your API can't be properly handled by clients because the URL resolving is ambiguous/invalid. 🤷‍♂️

@dblodgett-usgs

The discussion is that the convention we are using (OGC-API Features) doesn't actually say that there should be a relative URL encoded in hypermedia from https://example.com/collections/123 to https://example.com/collections/123/items/321 at all.

The URL template /collections/{collectionId}/items/{featureId} is a convention of the API specification that must be constructed by a client application with knowledge of the API, rather than by hypermedia URL construction that would be governed by RFC 3986.

So for example, we look here:

https://demo.pygeoapi.io/master/collections/lakes?f=json and can follow the absolute link to https://demo.pygeoapi.io/master/collections/lakes/items?f=json

Notice that the links in the items response do not include links to individual items. The featureId must be retrieved from the geojson id -- which is not a relative url and, by that token, should not be subject to RFC 3986 URL rules.

Notice that if we give the example collection of lakes to stac-browser we do get tiles for each lake.

https://radiantearth.github.io/stac-browser/#/external/demo.pygeoapi.io/master/collections/lakes?f=json

But if we click a link, the geojson id (not a relative url) is put into a url that drops the collection id in /collections/{collectionId}/items/{featureId}.

@webb-ben made a modification to https://reference.geoconnex.us/ which adds a trailing slash to urls such as:

https://reference.geoconnex.us/collections/hu02/?f=jsonld which was https://reference.geoconnex.us/collections/hu02?f=jsonld

e.g. https://radiantearth.github.io/stac-browser/#/external/reference.geoconnex.us/collections/hu02/?f=json&.language=en

Which does get stac-browser to template the url as would be expected. But I still feel like we are pushing RFC3986 rules into an area (geojson IDs) where it shouldn't necessarily be thought to apply dogmatically.

Am I just missing where a relative URL actually exists in these documents? I feel like I'm just not seeing something here.

@m-mohr

m-mohr commented Oct 17, 2024

Aha! That makes a lot of sense and was never clear to me. But I think in your original examples there were relative links?! Can't find them anymore, but anyway:

STAC is all about hypermedia, there should be no "URL construction by knowing the API" or so, just a fallback mechanism to care for some invalid APIs, which ideally I actually should remove from STAC Browser to encourage valid implementations.
So if there are no links you just rely on a fallback mechanism in STAC Browser, which indeed requires you to add a trailing slash to the self link right now. The issue is, the Browser can't know whether having or not having a slash at the end is intentional or not, whether it should stay or not. So I can't reliably construct URLs for every case (remember: STAC and Records have static catalogs with actual files which you want to remove in URL resolving!). But the RFC says that if the last path component should stay, it should have a trailing slash, so that's the only thing we can follow here really...

Also note, there's no official support for OGC API - Features/Records in STAC Browser. If someone wants to fund that long-term (i.e. implementation and maintenance), I'm happy to talk.


6 participants