Unable to validate date and date-time with jsonschema #420

@rennerocha

After #358, validating date fields with jsonschema no longer works as it did before. Spidermon used to serialize date fields into strings (https://github.com/scrapinghub/spidermon/pull/358/files#diff-7937ac85a30630fe837b9c133f4459ee590680bb5dfce72775db6005f2b45f51L142) before passing the item to the jsonschema validators, so the date and date-time format checkers (https://python-jsonschema.readthedocs.io/en/stable/validate/#validating-formats) worked. Now, if the item contains a datetime.date or a datetime.datetime instance, the value reaches the validator unserialized and validation fails because it is not a string.

Given the code:

import datetime

from itemadapter import ItemAdapter
from jsonschema import FormatChecker
from jsonschema.validators import validator_for
from spidermon.contrib.scrapy.pipelines import ItemValidationPipeline

format_checker = FormatChecker()

schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "date": {
            "description": "Date of the gazette",
            "type": "string",
            "format": "date"
        }
    },
    "required": [
        "date",
    ]
}

validator_cls = validator_for(schema)
validator = validator_cls(schema=schema, format_checker=format_checker)
original_data = {
    'date': datetime.date.today()
}

Validating with spidermon 1.20.0

>>> item_adapter = ItemAdapter(original_data)
>>> item_dict = item_adapter.asdict()
>>> errors = validator.iter_errors(item_dict)
>>> [error for error in errors]
[<ValidationError: "datetime.date(2023, 9, 19) is not of type 'string'">]

With spidermon 1.17.0

>>> data = ItemValidationPipeline._convert_item_to_dict(_, original_data)
>>> errors = validator.iter_errors(data)
>>> [error for error in errors]
[]

Validating with spidermon 1.20.0

>>> errors = validator.iter_errors(data)
>>> [error for error in errors]
[<ValidationError: "datetime.date(2023, 9, 19) is not of type 'string'">]
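
A possible workaround, sketched here under the assumption that converting dates to ISO 8601 strings before validation is acceptable: the helper below (serialize_dates is a hypothetical name, not part of Spidermon or jsonschema) walks a plain dict and replaces datetime.date and datetime.datetime values with their isoformat() strings. It is not the serialization Spidermon performed before #358, only an illustration of the idea.

```python
import datetime


def serialize_dates(value):
    """Recursively replace date/datetime objects with ISO 8601 strings.

    Hypothetical helper for illustration; not part of Spidermon.
    """
    # datetime.datetime is a subclass of datetime.date, so one check covers both.
    if isinstance(value, datetime.date):
        return value.isoformat()
    if isinstance(value, dict):
        return {key: serialize_dates(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [serialize_dates(item) for item in value]
    return value


item_dict = {"date": datetime.date(2023, 9, 19)}
print(serialize_dates(item_dict))  # {'date': '2023-09-19'}
```

The converted dict could then be passed to validator.iter_errors() in place of item_dict, so that the "type": "string" check sees a string.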
