
"indexing.IndexQueue.optimize": index requests affecting the same catalog data are not merged  #94

Open
@d-maurer


While performing Archetypes -> Dexterity migrations (with plone.app.contenttypes.migration), I observed extremely strange catalog inconsistencies: the catalog indexes were inconsistent with the catalog metadata (for the same objects). The whole issue is described in "plone/plone.app.contenttypes#556"; the IndexQueue related part mainly in "plone/plone.app.contenttypes#556 (comment)".

Here is a short summary of the problem: the IndexQueue records indexing requests for objects. When the actual reindexing is necessary, the queue is optimized. During the optimization, some requests are merged. Whether two requests can be merged is decided by comparing the object hashes and the object paths: both must agree. After the optimization, the remaining/merged requests are executed, not necessarily in the original order. A problem arises when the queue contains reindexing requests for different objects with the same path. In this case, those requests are not merged (because the objects are different), but they affect the same catalog data (because the path is identical). As a consequence, the execution order becomes important for correctness. But the optimization can change the order, which leads to inconsistencies.
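The effect can be reproduced with a toy model. The sketch below is not the real IndexQueue code: `Obj`, `merge_key`, `optimize`, `execute` and the dict used as a "catalog" are hypothetical names invented for illustration. It only assumes the merge rule summarized above (object hash and path must both agree) and shows that two requests for different objects at the same path survive optimization, so the final catalog state depends on the order in which they run.

```python
# Toy model of the problem; none of these names come from the real code.
class Obj:
    def __init__(self, path, title):
        self.path = path
        self.title = title


def merge_key(obj):
    # Merge rule as summarized above: hash AND path must agree.
    return (hash(obj), obj.path)


def optimize(requests):
    merged = {}
    for op, obj in requests:
        # Simplification: the last request for a key replaces earlier ones.
        merged[merge_key(obj)] = (op, obj)
    return list(merged.values())


def execute(requests):
    catalog = {}  # path -> catalog data, standing in for indexes + metadata
    for op, obj in requests:
        if op == "index":
            catalog[obj.path] = obj.title
        elif op == "unindex":
            catalog.pop(obj.path, None)
    return catalog


# Two different objects that live at the same path, as during a migration.
old = Obj("/site/doc", "Archetypes doc")
new = Obj("/site/doc", "Dexterity doc")
queue = [("unindex", old), ("index", new)]

optimized = optimize(queue)                      # nothing is merged
in_order = execute(optimized)                    # new object ends up indexed
reordered = execute(list(reversed(optimized)))   # path ends up unindexed
```

With the original order the path holds the data of the new object; with the reversed order the same two requests leave the path empty.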

I have the strong feeling that IndexQueue should only use the path when it determines whether two requests should be merged -- because requests with the same path affect the same catalog data. I.e.: two requests should be merged if and only if they refer to the same path.
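A minimal sketch of that proposal, again as a toy model rather than a patch: `Obj` and `optimize_by_path` are hypothetical names, and "the last request for a path wins" is an assumed simplification of whatever merge logic the real queue would apply. The point is only that keying on the path leaves at most one request per path, so reordering can no longer change the outcome for that path.

```python
# Toy model of path-only merging; names are invented for illustration.
class Obj:
    def __init__(self, path, title):
        self.path = path
        self.title = title


def optimize_by_path(requests):
    merged = {}
    for op, obj in requests:
        # Key on the path alone: requests for the same path always merge.
        # Simplification: the later request replaces the earlier one.
        merged[obj.path] = (op, obj)
    return list(merged.values())


old = Obj("/site/doc", "Archetypes doc")
new = Obj("/site/doc", "Dexterity doc")
other = Obj("/site/other", "Unrelated doc")
queue = [("unindex", old), ("index", new), ("index", other)]

optimized = optimize_by_path(queue)
paths = [obj.path for _, obj in optimized]
```

Here the two requests for `/site/doc` collapse into a single `index` request for the new object, while the request for the unrelated path is kept as is.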
