While performing Archetypes -> Dexterity migrations (with `plone.app.contenttypes.migration`), I observed extremely strange catalog inconsistencies: the catalog indexes were inconsistent with the catalog metadata (for the same objects). The whole issue is described in "plone/plone.app.contenttypes#556"; the `IndexQueue` related part mainly in "plone/plone.app.contenttypes#556 (comment)".
Here is a short summary of the problem: the `IndexQueue` records indexing requests for objects. When the actual reindexing becomes necessary, the queue is optimized. During the optimization, some requests are merged. Whether two requests can be merged is decided by comparing the object hashes and the object paths: both must agree. After the optimization, the remaining/merged requests are executed, not necessarily in the original order. A problem arises when the queue contains reindexing requests for different objects with the same path. In this case, those requests are not merged (because the objects are different) but they affect the same catalog data (because the path is identical). As a consequence, the execution order becomes important for correctness. But the optimization can change the order, which leads to inconsistencies.
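To make the failure mode concrete, here is a toy model of it. It is only a sketch: `Obj`, `Request`, `optimize_by_hash_and_path` and `run` are hypothetical names invented for illustration, not the real `IndexQueue` code, and the reordering is simulated by simply reversing the optimized list.

```python
from collections import namedtuple

# Toy stand-ins (hypothetical), not the real catalog queue classes.
# "uid" plays the role of the object hash.
Obj = namedtuple("Obj", "uid path")
Request = namedtuple("Request", "op obj")


def optimize_by_hash_and_path(queue):
    """Merge requests only when object hash AND path both agree."""
    merged = {}
    for req in queue:
        merged[(req.obj.uid, req.obj.path)] = req
    return list(merged.values())


def run(requests):
    """Apply requests to a toy catalog keyed by path; the last write wins."""
    catalog = {}
    for req in requests:
        catalog[req.obj.path] = req.obj.uid
    return catalog


# Two different objects that occupy the same path during a migration.
old = Obj("old-archetypes", "/site/doc")
new = Obj("new-dexterity", "/site/doc")
queue = [Request("reindex", old), Request("reindex", new)]

optimized = optimize_by_hash_and_path(queue)

# Both requests survive the merge, so the result depends on execution order.
in_order = run(optimized)
reordered = run(list(reversed(optimized)))
print(in_order, reordered)
```

In this model the two requests are not merged, and the catalog entry for the path ends up describing whichever object happens to be processed last.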
I have the strong feeling that `IndexQueue` should only use the path when it determines whether two requests should be merged -- because requests with the same path affect the same catalog data. I.e.: two requests should be merged if and only if they refer to the same path.
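A minimal sketch of what merging by path alone could look like, again in a toy model with hypothetical names (`Request`, `optimize_by_path`, `run`). It assumes, as a simplification, that merging two requests just keeps the later one; the real merge rules for combinations of index, reindex and unindex operations are more involved.

```python
from collections import namedtuple

# Hypothetical toy request: operation, object identity, catalog path.
Request = namedtuple("Request", "op uid path")


def optimize_by_path(queue):
    """Keep exactly one request per path (simplification: the latest wins)."""
    merged = {}
    for req in queue:
        merged[req.path] = req
    return list(merged.values())


def run(requests):
    """Apply requests to a toy catalog keyed by path."""
    catalog = {}
    for req in requests:
        catalog[req.path] = req.uid
    return catalog


queue = [
    Request("reindex", "old-archetypes", "/site/doc"),
    Request("reindex", "other", "/site/news"),
    Request("reindex", "new-dexterity", "/site/doc"),
]

optimized = optimize_by_path(queue)

# With at most one request per path, execution order no longer matters.
forward = run(optimized)
backward = run(list(reversed(optimized)))
print(forward == backward)
```

Because every path appears at most once after the merge, any execution order yields the same catalog state in this model.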
Maybe you saw that already, but there is an environment variable CATALOG_OPTIMIZATION_DISABLED that disables the whole catalog optimization, as the name implies 😅
That would at least help you on the migration and while cooking a patch to fix the current situation.
Could it be though, that the problem you describe is mostly happening only on migrations, rather than on regular work? 🤔
Gil Forcada Codinachs wrote at 2020-6-18 01:05 -0700:
> Maybe you saw that already, but there is an environment variable `CATALOG_OPTIMIZATION_DISABLED` that disables the whole catalog optimization, as the name implies 😅
I have seen this indeed.
But I did not want to disable the optimizations:
the migration causes a very large number of reindexing requests
and takes a long time -- even with optimizations.
It would be significantly more time consuming without optimizations.
> That would at least help you on the migration and while cooking a patch to fix the current situation.
This issue report refers to one in `plone/plone.app.contenttypes`.
There, I describe how I work around the problem for my migration.
> Could it be though, that the problem you describe is mostly happening only on migrations, rather than on regular work? 🤔
Sure.
The problem occurs when the `IndexQueue` contains indexing requests
for different objects with the same "path".
For "regular work", an old object at path "p" needs to be deleted
(and this means it gets unindexed) before a new object can occupy the same
path. `IndexQueue.optimize` ensures that `unindex` requests are performed
before other indexing requests. Thus, "regular work" should be unaffected.
Nevertheless, there is a conceptual bug: `optimize` should merge
all requests affecting the same catalog data (i.e. referring to the same
path) and if this is impossible for any reason, it must execute the
requests in the original order. This would ensure correctness
independent of the concrete use (e.g. "regular" versus "migration").
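The rule stated above (merge what affects the same path, otherwise keep the original order) could be sketched roughly as follows. This is a toy model, not a patch: `Request`, `optimize_preserving_order` and the `can_merge` predicate are hypothetical, merging is simplified to "the later request replaces the earlier one", and requests for different paths are assumed to be independent of each other.

```python
from collections import namedtuple

# Hypothetical toy request: operation, object identity, catalog path.
Request = namedtuple("Request", "op uid path")


def optimize_preserving_order(queue, can_merge):
    """Merge same-path requests where allowed; otherwise keep queue order."""
    result = []          # surviving requests, in original relative order
    last_for_path = {}   # path -> position in result of its latest request
    for req in queue:
        pos = last_for_path.get(req.path)
        if pos is not None and can_merge(result[pos], req):
            result[pos] = req            # simplified merge: later wins
        else:
            last_for_path[req.path] = len(result)
            result.append(req)           # cannot merge: keep original order
    return result


def same_object(a, b):
    """Example predicate: only merge requests for the same object."""
    return a.uid == b.uid


queue = [
    Request("reindex", 1, "/site/doc"),
    Request("reindex", 2, "/site/doc"),
    Request("reindex", 3, "/site/news"),
    Request("reindex", 2, "/site/doc"),
]
optimized = optimize_preserving_order(queue, same_object)
print([(r.uid, r.path) for r in optimized])
```

Requests for the same path that cannot be merged stay in their original relative order here, so the last request for a path always determines the final catalog state.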