Protocol of URI without authority not correctly identified #1811

Open
@observingClouds

Description

I am currently developing an fsspec driver for a filesystem protocol that neither needs nor has an authority component, such that:

URI = protocol ":" path

is a valid URI.
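
For reference, Python's standard-library `urllib.parse.urlsplit` already distinguishes the two forms. A minimal sketch, where `myproto` is a hypothetical scheme name used only for illustration and `describe` is a helper made up for this example:

```python
from urllib.parse import urlsplit


def describe(uri):
    """Return (scheme, authority, path) as parsed by urllib.parse.urlsplit."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


# "myproto" is a hypothetical scheme, not a registered fsspec protocol.
for uri in ("myproto:/some/path", "myproto://host/some/path"):
    print(uri, "->", describe(uri))
```

The first URI yields an empty authority and the path `/some/path`; the second yields the authority `host` and the same path.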

However, fsspec's split_protocol function seems to assume that an authority is always given, i.e.:

URI = protocol ":" "//" authority path

def split_protocol(urlpath):
    """Return protocol, path pair"""
    urlpath = stringify_path(urlpath)
    if "://" in urlpath:
        protocol, path = urlpath.split("://", 1)
        if len(protocol) > 1:
            # excludes Windows paths
            return protocol, path
    if urlpath.startswith("data:"):
        return urlpath.split(":", 1)
    return None, urlpath

The only exceptions are data (handled explicitly in the code above) and, indirectly, file, both of which receive special treatment.
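
The current behaviour can be reproduced without installing fsspec. In the sketch below, `stringify_path` is a simplified stand-in for the fsspec helper of the same name (an assumption: the real one may handle more input types), and `myproto` is a hypothetical scheme:

```python
import os


def stringify_path(filepath):
    # Simplified stand-in (assumption): only converts os.PathLike objects.
    return os.fspath(filepath) if isinstance(filepath, os.PathLike) else filepath


def split_protocol(urlpath):
    """Return protocol, path pair (copy of the current logic quoted above)."""
    urlpath = stringify_path(urlpath)
    if "://" in urlpath:
        protocol, path = urlpath.split("://", 1)
        if len(protocol) > 1:
            # excludes Windows paths
            return protocol, path
    if urlpath.startswith("data:"):
        return urlpath.split(":", 1)
    return None, urlpath


print(split_protocol("s3://bucket/key"))     # ('s3', 'bucket/key')
print(split_protocol("myproto:/some/path"))  # (None, 'myproto:/some/path')
```

The second call shows the problem: without `//`, the scheme is not recognised and the whole string is returned as the path.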

Are there any issues with also supporting protocols without //authority? For example, could we change split_protocol to something like:

def split_protocol(urlpath):
    """Return protocol, path pair"""
    urlpath = stringify_path(urlpath)
    if "://" in urlpath:
        protocol, path = urlpath.split("://", 1)
        if len(protocol) > 1:
            # excludes Windows paths
            return protocol, path
    elif ":/" in urlpath:
        protocol, path = urlpath.split(":/", 1)
        if len(protocol) > 1:
            path = "/" + path
            return protocol, path
    if urlpath.startswith("data:"):
        return urlpath.split(":", 1)
    return None, urlpath
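
To get a feel for how this variant behaves, the sketch below runs it against a few inputs. `stringify_path` is again a simplified stand-in so the snippet runs without fsspec (an assumption), the function is renamed `split_protocol_proposed` only to keep it apart from the current one, and `myproto` and `backup:` are made-up names:

```python
import os


def stringify_path(filepath):
    # Simplified stand-in (assumption): only converts os.PathLike objects.
    return os.fspath(filepath) if isinstance(filepath, os.PathLike) else filepath


def split_protocol_proposed(urlpath):
    """Return protocol, path pair (the variant proposed above)."""
    urlpath = stringify_path(urlpath)
    if "://" in urlpath:
        protocol, path = urlpath.split("://", 1)
        if len(protocol) > 1:
            # excludes Windows paths
            return protocol, path
    elif ":/" in urlpath:
        protocol, path = urlpath.split(":/", 1)
        if len(protocol) > 1:
            # restore the leading slash removed by the split
            path = "/" + path
            return protocol, path
    if urlpath.startswith("data:"):
        return urlpath.split(":", 1)
    return None, urlpath


cases = [
    "myproto:/some/path",    # authority-less URI, absolute path
    "s3://bucket/key",       # unchanged: handled by the first branch
    "C:/Users/me/file.txt",  # Windows path: one-letter "protocol" is rejected
    "myproto:some/path",     # authority-less URI, rootless path
    "backup:/file.txt",      # local relative path with a colon in a directory name
]
for case in cases:
    print(case, "->", split_protocol_proposed(case))
```

Two things stand out: a rootless URI such as `myproto:some/path` is still not recognised, and a local path whose first component ends in a colon (`backup:/file.txt`) would now be read as a protocol. Whether either matters in practice is an open question.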
