fix: serialisation of since datetime #88

Open
wants to merge 9 commits into
base: master
from

Conversation

palkerecsenyi
Member

@palkerecsenyi palkerecsenyi commented Jul 7, 2025

Please release this PR at the same time as the changes in #87 (comment).

Closes #87.


  • The since parameter of many jobs is represented in Python with a datetime.datetime object. json.dumps cannot serialise this directly, so default=str was added to the json.dumps call. However, this serialises the date into a non-standard format instead of ISO 8601, which is what job handlers expect to receive.

  • The marshmallow schema PredefinedArgsSchema already assigns the iso format to the since argument, but its description showed a non-ISO format. The description has been changed to be clearer.

  • Added a custom default function to turn any datetime into an ISO 8601 formatted string, without affecting any other job args.
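
To illustrate the difference described above, here is a minimal sketch comparing `default=str` with an ISO-aware `default` hook. The function name `iso_datetime_default`, the sample args, and the `TypeError` fallback are assumptions made for illustration; they are not taken from the PR's actual `job_arg_json_dumper`.

```python
import json
from datetime import datetime


def iso_datetime_default(obj):
    """Hypothetical stand-in for a custom json.dumps `default` hook."""
    if isinstance(obj, datetime):
        # ISO 8601 form, with "T" between the date and the time.
        return obj.isoformat()
    # Assumption: anything else json cannot encode is still an error.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Sample job args (illustrative values only).
args = {"since": datetime(2025, 7, 7, 12, 30, 0), "batch_size": 100}

with_str = json.dumps(args, default=str)
with_iso = json.dumps(args, default=iso_datetime_default)

print(with_str)  # {"since": "2025-07-07 12:30:00", "batch_size": 100}
print(with_iso)  # {"since": "2025-07-07T12:30:00", "batch_size": 100}
```

The `default` hook is only called for values that `json` cannot encode on its own, so the integer argument passes through untouched in both cases.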

* The `since` parameter of many jobs is represented in Python with a
datetime.datetime object. This cannot be directly serialised into a str,
so a `default=str` was added to `json.dumps`. However, this serialises
the date into a non-standard format instead of ISO 8601 which is what
job handlers expect to receive.

* Added a custom default function to turn any datetime into an ISO 8601
formatted string, without affecting any other job args.
* To ensure the UI appears consistent with the actual job arguments, I
generalised the `json.dumps` custom `default` function and used it
everywhere a job is serialised.

* The uses in `services/results.py` are purely UI-related but it's still
important to ensure the timestamps are displayed consistently (as well
as any future changes to the custom serialiser)
* Some job implementations downstream assume the `since` argument in
jobs is an ISO timestamp *including timezone info*. We store all dates
in the DB in UTC but without associated TZ info.

* It is safe to add in the TZ info (UTC) when setting the `since` arg as
we know it is always in the UTC zone. Job implementors can then more
easily (and without ambiguity about the TZ) use the `since` arg.
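
A small sketch of the idea in this commit message, assuming a naive datetime that is known to be in UTC (the variable names and the sample value are illustrative, not from the codebase):

```python
from datetime import datetime, timezone

# A naive datetime, as it might come back from the DB (assumed to be UTC).
stored = datetime(2025, 7, 7, 12, 30, 0)

# replace() only attaches the zone; it does not shift the clock time.
aware = stored.replace(tzinfo=timezone.utc)

naive_iso = stored.isoformat()
aware_iso = aware.isoformat()

print(naive_iso)  # 2025-07-07T12:30:00
print(aware_iso)  # 2025-07-07T12:30:00+00:00
```

`replace(tzinfo=...)` is used here rather than `astimezone()`, because `astimezone()` would interpret a naive value as local time and could shift it on a machine that is not running in UTC.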
@palkerecsenyi palkerecsenyi marked this pull request as ready for review July 8, 2025 11:27
@palkerecsenyi palkerecsenyi linked an issue Jul 8, 2025 that may be closed by this pull request
@@ -50,7 +51,7 @@ def data(self):
         job_dict["last_run"] = self._obj.last_run.dump()
         job_dict["last_runs"] = self._obj.last_runs
         job_dict["default_args"] = json.dumps(
-            self._obj.default_args, indent=4, sort_keys=True, default=str
+            self._obj.default_args, default=job_arg_json_dumper
Member

@palkerecsenyi were indent and sort_keys removed because they are the default values?

Member Author

As far as I could tell, we aren't using the dumped JSON for anything user-facing; it's only stored in the database. As such, we don't need indent/sort_keys, which (I think) serve purely aesthetic purposes. With the default values, json.dumps returns a more compact JSON string that also saves some space in the DB.

Member

Hmm we should be using it when you serialize the job args in a configured job in the administration panel, no?

Member Author

In the admin panel this is passed into a JS-based UI which prettifies the JSON anyway, so the format it receives the object in doesn't matter in theory
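
The point made in this thread can be checked with a short sketch: both forms decode to the same object, and only the string layout differs. The sample `default_args` value is made up for illustration.

```python
import json

# Illustrative job arguments (not taken from the codebase).
default_args = {"since": "2025-07-07T12:30:00", "batch_size": 100}

pretty = json.dumps(default_args, indent=4, sort_keys=True)
compact = json.dumps(default_args)

# Any consumer that parses the JSON sees identical data either way.
same_data = json.loads(pretty) == json.loads(compact)

print(same_data)
print(len(pretty), len(compact))
```

Note that the default separators still include a space after `,` and `:`. A fully minified string would need `separators=(",", ":")`, which the diff above does not use.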

def job_arg_json_dumper(obj):
    """Handle non-serializable values such as datetimes when dumping the arguments of a job run."""
    if isinstance(obj, datetime):
        return obj.isoformat()
Contributor

As discussed, let's ensure here that there is no timezone or Z char added to the string, so that we avoid manipulating the string in any celery task or UI. Can you please add a test?

@zzacharo
Member

zzacharo commented Jul 10, 2025 via email

* Changed the `since` argument so task implementors always receive a
timestamp with the `+00:00` syntax instead of `Z`, which
`datetime.fromisoformat` does not accept before Python 3.11

* Added an error message shown on the admin dashboard when running a job
with an invalid `since` timestamp (i.e. not ISO compliant)

* Added unit test for the custom `json.dumps` serialiser that
stringifies dates to ISO timestamps

* Modified the job unit test to use the `since` argument, and added a
test case for an invalid timestamp to ensure the correct error message
is returned to the client
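
As a rough sketch of the validation described in this commit, assuming a hypothetical helper named `parse_since` (the real service and the exact error message shown on the admin dashboard are not reproduced here):

```python
from datetime import datetime


def parse_since(value):
    """Hypothetical helper: parse an ISO 8601 `since` string or raise ValueError."""
    # Older versions of datetime.fromisoformat reject a trailing "Z",
    # so normalise it to the "+00:00" form first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(value)
    except ValueError as exc:
        # Illustrative message only; the PR's actual wording may differ.
        raise ValueError(
            f"Invalid 'since' timestamp, expected ISO 8601: {value!r}"
        ) from exc


ok = parse_since("2025-07-07T12:30:00+00:00")
also_ok = parse_since("2025-07-07T12:30:00Z")

try:
    parse_since("07/07/2025 12:30")
    error = None
except ValueError as exc:
    error = str(exc)

print(ok == also_ok)
print(error)
```

Sending `+00:00` from the start, as the commit does, means task implementors can call `datetime.fromisoformat` directly without the normalisation step.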
@palkerecsenyi
Member Author

Tests are currently failing due to inveniosoftware/pytest-invenio#128

@utnapischtim
Contributor

I answered on the pytest-invenio issue.

@palkerecsenyi
Member Author

For reference: the tests are passing locally with the changes in #94, but we won't cherry-pick the commit into this branch to avoid duplication. We can merge this PR, and the tests will pass once #94 is also merged.

Local test output
==================================== tests coverage ====================================
___________________ coverage: platform linux, python 3.12.10-final-0 ___________________

Name                                                                  Stmts   Miss  Cover   Missing
---------------------------------------------------------------------------------------------------
invenio_jobs/__init__.py                                                  3      0   100%
invenio_jobs/administration/__init__.py                                   0      0   100%
invenio_jobs/administration/jobs.py                                      73     15    79%   105, 109-115, 119-124, 133-138
invenio_jobs/administration/runs.py                                      48     26    46%   39-53, 57-68, 72-73, 77-78
invenio_jobs/alembic/1f896f6990b8_update_jobs_module_table_names.py      20     10    50%   25-32, 40-45
invenio_jobs/alembic/371f4cbcb73d_create_invenio_jobs_branch.py          10      2    80%   22, 27
invenio_jobs/alembic/356496a01197_create_invenio_jobs_tables.py          17      4    76%   30-62, 105-106
invenio_jobs/api.py                                                       4      0   100%
invenio_jobs/config.py                                                   28      0   100%
invenio_jobs/errors.py                                                    4      2    50%   16-17
invenio_jobs/ext.py                                                      57      1    98%   106
invenio_jobs/jobs.py                                                     44      9    80%   66-68, 90, 103, 106-114, 131, 135
invenio_jobs/logging/__init__.py                                          0      0   100%
invenio_jobs/logging/celery_signals.py                                   19      5    74%   23-26, 38-40, 49
invenio_jobs/logging/index_templates/__init__.py                          0      0   100%
invenio_jobs/logging/index_templates/os-v1/__init__.py                    0      0   100%
invenio_jobs/logging/index_templates/os-v2/__init__.py                    0      0   100%
invenio_jobs/logging/jobs.py                                             65      6    91%   97, 113, 123-125, 129-130
invenio_jobs/logging/tasks.py                                             8      2    75%   20-21
invenio_jobs/models.py                                                  120     13    89%   80-88, 147, 152, 195-197
invenio_jobs/proxies.py                                                  12      0   100%
invenio_jobs/registry.py                                                 26      5    81%   23, 32-35
invenio_jobs/resources/__init__.py                                        3      0   100%
invenio_jobs/resources/config.py                                         56      0   100%
invenio_jobs/resources/resources.py                                     126     14    89%   48-53, 58-63, 210-217, 237-243, 249-255
invenio_jobs/services/__init__.py                                         4      0   100%
invenio_jobs/services/config.py                                          79      0   100%
invenio_jobs/services/errors.py                                          25      8    68%   38-41, 49-51, 63
invenio_jobs/services/links.py                                           11      1    91%   38
invenio_jobs/services/permissions.py                                     25      0   100%
invenio_jobs/services/results.py                                         97     13    87%   51-52, 81, 90-93, 118-119, 146-147, 165, 167-168
invenio_jobs/services/scheduler.py                                       50     29    42%   28-29, 34, 54, 61, 65-66, 70-89, 94-101, 108-112
invenio_jobs/services/schema.py                                         144     10    93%   51-56, 132, 183, 259, 291-292
invenio_jobs/services/services.py                                       153     25    84%   51, 59-63, 78, 81, 194, 277-291, 296-302, 311, 342, 346, 355, 361-362
invenio_jobs/services/ui_schema.py                                       19      0   100%
invenio_jobs/services/uow.py                                             10      0   100%
invenio_jobs/tasks.py                                                    30     20    33%   24-28, 34-63
invenio_jobs/utils.py                                                    28     19    32%   20-31, 36-47, 55
invenio_jobs/views.py                                                    14      0   100%
invenio_jobs/webpack.py                                                   2      0   100%
---------------------------------------------------------------------------------------------------
TOTAL                                                                  1434    239    83%
==================== 42 passed, 112 skipped, 94 warnings in 15.05s =====================

Successfully merging this pull request may close these issues.

Incorrect since: datetime argument handling
6 participants