This repository was archived by the owner on May 16, 2025. It is now read-only.

Scalability Issue: outages / timeouts / slow responses in the recrawler service may lead to message queue buildups #43

Open
@jayaddison


Describe the bug
The recrawler service has been switched off since early January because it was not returning query results; that problem will be opened and tracked as a separate issue for that service.

If no recrawler pods are available, requests to that service fail with connection errors -- after a considerable timeout -- as visible here in the backend-worker deployment logs:

[2021-01-27 18:28:19,290: WARNING/ForkPoolWorker-2] Recrawling failed due to "ConnectionError" exception
[2021-01-27 18:28:19,291: WARNING/ForkPoolWorker-3] Recrawling failed due to "ConnectionError" exception
[2021-01-27 18:30:30,362: WARNING/ForkPoolWorker-1] Recrawling failed due to "ConnectionError" exception
[2021-01-27 18:30:30,366: WARNING/ForkPoolWorker-3] Recrawling failed due to "ConnectionError" exception

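This causes the throughput of the backend-worker instances to drop dramatically, since most of the task worker time is spent waiting on connection attempts that eventually fail (in the log excerpt above, ForkPoolWorker-3 reports consecutive failures roughly two minutes apart).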

It may be useful to consider both a short-term and a longer-term fix here. Since we are not currently receiving results from the recrawler service, a short-term patch could be to re-deploy that service so that it responds immediately with empty results (effectively a no-op). Longer-term, we likely want to isolate the queue workers that handle event logs, and perhaps add circuit breakers and/or reduce the connection timeouts they use.
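
As a rough illustration of the circuit-breaker idea, here is a minimal sketch in Python. It is not taken from the RecipeRadar codebase: the `CircuitBreaker` and `CircuitOpenError` names, the thresholds and the `clock` parameter are all assumptions made for the example. After a configurable number of consecutive connection failures, further calls are rejected immediately for a cool-down period instead of waiting on a timeout each time.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Hypothetical helper: fail fast while a dependency appears to be down."""

    def __init__(self, max_failures=3, reset_after=60.0,
                 errors=(ConnectionError, TimeoutError), clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down period, in seconds
        self.errors = errors              # exception types counted as failures
        self.clock = clock                # injectable so tests can control time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: reject without touching the network.
                raise CircuitOpenError("dependency marked unavailable")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except self.errors:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Open the circuit, or re-open it after a failed trial call.
                self.opened_at = self.clock()
            raise
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A worker task could wrap its recrawl request in `breaker.call(...)` and treat `CircuitOpenError` as "skip recrawling for now". If the request is made with the `requests` library, its exceptions do not derive from the built-in `ConnectionError`, so one would likely pass `errors=(requests.exceptions.RequestException,)` and also set an explicit `timeout=` on the request so that each individual attempt fails quickly.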

Expected behavior
Throughput for the majority of the RecipeRadar message queues should not be adversely affected by outages in a minor service.
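
One possible way to get that isolation, assuming the workers are Celery-based (the `ForkPoolWorker` log lines suggest so, but this is an assumption), would be to route the recrawl task to its own queue so that slow attempts cannot occupy the workers serving other queues. The task and queue names below are hypothetical placeholders, not names from the actual codebase; `task_routes` and `task_default_queue` are standard Celery settings.

```python
# Hypothetical Celery settings: task names and queue names are illustrative only.
task_routes = {
    # Event-log handling gets its own queue...
    "reciperadar.workers.events.store_event": {"queue": "events"},
    # ...and so does recrawling, so a slow recrawler only backs up this queue.
    "reciperadar.workers.recipes.recrawl": {"queue": "recrawl"},
}

# Everything not listed above continues to use the default queue.
task_default_queue = "default"
```

A dedicated worker could then consume only the isolated queue, for example with `celery -A <app> worker -Q recrawl`, while the existing workers are started with `-Q default,events`.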

Labels: bug