Improve error message of UnicodeEncodeError in zoneinfo.load_tzdata #140145

@LamentXU123

Feature or enhancement

Proposal:

While looking at #140039, I thought we could also separate UnicodeEncodeError from the file-related errors. The source code is here:

def load_tzdata(key):
    from importlib import resources

    components = key.split("/")
    package_name = ".".join(["tzdata.zoneinfo"] + components[:-1])
    resource_name = components[-1]

    try:
        path = resources.files(package_name).joinpath(resource_name)
        # gh-85702: Prevent PermissionError on Windows
        if path.is_dir():
            raise IsADirectoryError
        return path.open("rb")
    except (ImportError, FileNotFoundError, UnicodeEncodeError, IsADirectoryError):
        # There are four types of exception that can be raised that all amount
        # to "we cannot find this key":
        #
        # ImportError: If package_name doesn't exist (e.g. if tzdata is not
        #   installed, or if there's an error in the folder name like
        #   Amrica/New_York)
        # FileNotFoundError: If resource_name doesn't exist in the package
        #   (e.g. Europe/Krasnoy)
        # UnicodeEncodeError: If package_name or resource_name are not UTF-8,
        #   such as keys containing a surrogate character.
        # IsADirectoryError: If package_name without a resource_name specified.
        raise ZoneInfoNotFoundError(f"No time zone found with key {key}")

Four errors are caught together and all re-raised as a ZoneInfoNotFoundError. Issue #140039 suggests separating ImportError out to give more detail. I think this is reasonable, since ZoneInfoNotFoundError(f"No time zone found with key {key}") gives very little information to debug with when there are four possible causes. It would be better to separate them and give each a detailed message.
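To illustrate the problem, here is a small sketch that looks up two bad keys with different underlying causes (a misspelled folder name and a missing resource, both taken from the comments in the source above). The helper name failure_for is hypothetical. With the load_tzdata quoted above, both lookups are expected to surface as the same kind of ZoneInfoNotFoundError; the exact message text may differ between Python versions, so no output is shown.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def failure_for(key):
    """Return the exception raised when looking up `key`, or None on success.

    `failure_for` is a hypothetical helper used only for this illustration.
    """
    try:
        ZoneInfo(key)
    except ZoneInfoNotFoundError as exc:
        return exc
    return None


# Two different root causes, per the comments in load_tzdata:
#   "Amrica/New_York" -> the folder (package) name is wrong
#   "Europe/Krasnoy"  -> the folder exists, but the resource does not
for key in ("Amrica/New_York", "Europe/Krasnoy"):
    print(f"{key!r}: {failure_for(key)!r}")
```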

For now, I think we can leave the file-related errors (FileNotFoundError, IsADirectoryError) as they are, since they don't need much more description (the original message is OK), and separate ImportError and UnicodeEncodeError out with detailed messages. The ImportError one is done in #140040, and I'm ready to send a PR for the Unicode one.
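A minimal sketch of what the separation could look like, assuming the exception type stays ZoneInfoNotFoundError and only the message changes. The function name load_tzdata_sketch and the wording of the new message are assumptions for illustration, not the text of the eventual PR.

```python
from importlib import resources
from zoneinfo import ZoneInfoNotFoundError


def load_tzdata_sketch(key):
    """Hypothetical variant of load_tzdata with a separate Unicode branch."""
    components = key.split("/")
    package_name = ".".join(["tzdata.zoneinfo"] + components[:-1])
    resource_name = components[-1]

    try:
        path = resources.files(package_name).joinpath(resource_name)
        # gh-85702: Prevent PermissionError on Windows
        if path.is_dir():
            raise IsADirectoryError
        return path.open("rb")
    except UnicodeEncodeError as exc:
        # Assumed message wording: say that the key itself is malformed,
        # rather than only that no zone was found.
        raise ZoneInfoNotFoundError(
            f"Key {key!r} cannot be encoded (for example, it contains "
            f"a surrogate character)"
        ) from exc
    except (ImportError, FileNotFoundError, IsADirectoryError):
        # ImportError is left in the generic branch here; #140040 handles
        # it separately.
        raise ZoneInfoNotFoundError(f"No time zone found with key {key}")
```

Which branch a given key reaches depends on the environment: if tzdata is not installed, the ImportError is raised first, before any encoding of the key is attempted.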

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Labels: stdlib (Standard Library Python modules in the Lib/ directory), type-feature (A feature request or enhancement)