[edison-mongo] AbstractMongoRepository.size() is unsuitable for big collections #121

@simonwiesmann8732

Hello edisoneers,

AbstractMongoRepository.size() uses the MongoDB collection's countDocuments() internally. This method is unsuitable for timely responses on big collections, as it "wraps the $group aggregation stage with a $sum expression to perform the count and is available for use in Transactions."

For us, the operation timed out after 30 seconds (30,000 ms) on a collection with more than 30 million documents. The original behavior of edison-mongo changed with this commit, which switched from the deprecated count() (which was not transaction-safe) to countDocuments().

The direct counterpart to the deprecated count() is estimatedDocumentCount().

The danger of size() in its current form is that everything is fine when you create a new collection with only a few documents, but it will start to fail as the collection grows, without any change to the code. Thus, it would make sense to support both methods and make the difference obvious (maybe estimatedSize() and atomicSize()?).
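
To make the proposal concrete, here is a minimal sketch of what the two methods could look like. It is not the actual edison-mongo code: CountableCollection is a hypothetical stand-in for the two counting calls on the driver's MongoCollection (countDocuments() and estimatedDocumentCount()), used so that the example compiles without the driver, and SizeAwareRepository is likewise an invented name. Only estimatedSize() and atomicSize() come from the suggestion above.

```java
/**
 * Hypothetical stand-in for the two counting methods of the MongoDB
 * Java driver's MongoCollection. A real repository would call the
 * driver's collection directly instead of this interface.
 */
interface CountableCollection {
    /** Exact count; performs an aggregation over the documents, so it slows down as the collection grows. */
    long countDocuments();

    /** Approximate count taken from collection metadata; fast, but not exact. */
    long estimatedDocumentCount();
}

/** Sketch of a repository that exposes both counts under explicit names. */
class SizeAwareRepository {
    private final CountableCollection collection;

    SizeAwareRepository(CountableCollection collection) {
        this.collection = collection;
    }

    /** Fast, approximate size; the counterpart to the deprecated count(). */
    long estimatedSize() {
        return collection.estimatedDocumentCount();
    }

    /** Exact size; may time out on very large collections. */
    long atomicSize() {
        return collection.countDocuments();
    }
}
```

With names like these, a caller has to choose between speed and exactness explicitly, instead of discovering the cost of size() only once the collection has grown.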
