MDEV-36226 Stall and crash when page cleaner fails to generate free pages during Async flush #3885

Open
wants to merge 1 commit into base: 10.6

Conversation

mariadb-DebarunBanerjee (Contributor)

  • The Jira issue number for this PR is: MDEV-36226

Description

During a regular iteration, the page cleaner flushes from the flush list with some flush target and then generates free pages from the LRU tail. When asynchronous flush is triggered, i.e. when 7/8th of the LSN margin in the redo log is filled, the flush target for the flush list is set to innodb_io_capacity_max. If the page cleaner could flush the entire target, the flush bandwidth for the LRU flush is currently set to zero. If the LRU tail has dirty pages, the page cleaner then ends up freeing no pages in that iteration. This scenario can repeat across multiple iterations until the async flush target is reached. During this time the database is starved of free pages, resulting in an apparent stall and, in some cases, a dict_sys latch fatal error.

Fix: In the page cleaner iteration, before the LRU flush, ensure we provide a large enough flush limit so that freeing pages is not blocked by dirty pages in the LRU tail. Also log the IO and flush state if the doublewrite flush wait is long. A rough sketch of the intended logic follows.
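
As a minimal sketch only, not the actual patch: the loop shape loosely follows the page cleaner in storage/innobase/buf/buf0flu.cc, while page_cleaner_iteration() is a made-up name and buf_flush_list()/buf_flush_LRU() are shown with simplified signatures.

  #include <algorithm>

  /* Sketch only: simplified stand-ins for InnoDB's flush-list and
     LRU flush entry points; the real signatures differ. */
  static void page_cleaner_iteration(ulint flush_list_target)
  {
    /* During async flush the flush-list target is raised to
       innodb_io_capacity_max; previously, meeting it in full left
       zero bandwidth for the LRU flush below. */
    const ulint flushed= buf_flush_list(flush_list_target);

    /* Fix idea: always reserve a minimum LRU budget so that freeing
       pages from the LRU tail is not blocked by dirty pages there,
       even when the flush-list flush consumed its whole target.
       The srv_io_capacity / 2 floor is illustrative. */
    ulint lru_budget= srv_LRU_scan_depth;
    if (flushed >= flush_list_target)
      lru_budget= std::max<ulint>(lru_budget, srv_io_capacity / 2);
    buf_flush_LRU(lru_budget);
  }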

Impact: It could result in increased IO due to LRU flushing in specific cases.

Release Notes

None

How can this PR be tested?

Regular InnoDB tests should cover the path. Performance and stress tests should be run to judge the possible impact.

Reproducing the base issue would require a large buffer pool, a long run, and synchronization between foreground and InnoDB background threads.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.


@dr-m (Contributor) left a comment

The change to the logic looks reasonable to me, but in the diagnostic output I’d avoid excessive numbers of calls to operator<<() and use the common logging functions sql_print_warning() or sql_print_information().
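
For instance, a chain of operator<<() calls can usually be collapsed into one printf-style call. A sketch only; the message text and the pending_reads/pending_writes counters are made up here, not taken from the patch:

  /* Hypothetical example of the suggested style; pending_reads and
     pending_writes stand for whatever size_t state is being logged. */
  sql_print_warning("InnoDB: Long doublewrite flush wait: "
                    "%zu pending reads, %zu pending writes",
                    pending_reads, pending_writes);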

Comment on lines 2697 to 2704
sql_print_information("Innodb: LSN flush parameters\n"
"-------------------\n"
"System LSN : %" PRIu64 "\n"
"Checkpoint LSN: %" PRIu64 "\n"
"Flush ASync LSN: %" PRIu64 "\n"
"Flush Sync LSN: %" PRIu64 "\n"
"-------------------",
lsn, clsn, buf_flush_async_lsn, buf_flush_sync_lsn);

Usually all InnoDB messages are prefixed by InnoDB: (note the case). Do we need this many rows for the output? You need to write uint64_t{buf_flush_async_lsn} or similar to avoid compilation errors:

error: cannot pass object of non-trivial type 'Atomic_relaxed<lsn_t>' (aka 'Atomic_relaxed<unsigned long long>') through variadic function; call will abort at runtime [-Wnon-pod-varargs]
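
A sketch of a message following those suggestions, with the reviewer's uint64_t{} conversion so the atomics can pass through the C-style varargs call; the exact wording is illustrative:

  /* Atomic_relaxed<lsn_t> cannot travel through varargs; convert to
     a trivial integer type first, e.g. with uint64_t{...}. */
  sql_print_information("InnoDB: log LSN %" PRIu64
                        ", checkpoint LSN %" PRIu64
                        ", flush async LSN %" PRIu64
                        ", flush sync LSN %" PRIu64,
                        lsn, clsn,
                        uint64_t{buf_flush_async_lsn},
                        uint64_t{buf_flush_sync_lsn});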

ulint lru_size= UT_LIST_GET_LEN(LRU);
ulint dirty_size= UT_LIST_GET_LEN(flush_list);
ulint free_size= UT_LIST_GET_LEN(free);
ulint dirty_pct= lru_size ? dirty_size * 100 / (lru_size + free_size) : 0;

dirty_pct seems to be redundant information that can be calculated from the rest. It could also be totally misleading, because we were reading these fields without proper mutex or flush_list_mutex protection.
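
A sketch of how the raw counts could be gathered consistently instead, dropping dirty_pct and assuming the 10.6 buf_pool.mutex / buf_pool.flush_list_mutex names:

  /* Read each list length under the mutex that protects it; a log
     reader can derive any percentage from the raw counts. */
  mysql_mutex_lock(&buf_pool.mutex);
  const ulint lru_size= UT_LIST_GET_LEN(buf_pool.LRU);
  const ulint free_size= UT_LIST_GET_LEN(buf_pool.free);
  mysql_mutex_unlock(&buf_pool.mutex);

  mysql_mutex_lock(&buf_pool.flush_list_mutex);
  const ulint dirty_size= UT_LIST_GET_LEN(buf_pool.flush_list);
  mysql_mutex_unlock(&buf_pool.flush_list_mutex);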

Commit message: MDEV-36226 Stall and crash when page cleaner fails to generate free pages during Async flush

During regular iteration the page cleaner flushes from the flush list
with some flush target and then generates free pages from the LRU
tail. When asynchronous flush is triggered, i.e. when 7/8th of the
LSN margin in the redo log is filled, the flush target for the flush
list is set to innodb_io_capacity_max. If it could flush the entire
target, the flush bandwidth for the LRU flush is currently set to
zero. If the LRU tail has dirty pages, the page cleaner ends up
freeing no pages in that iteration. The scenario could repeat across
multiple iterations until the async flush target is reached. During
this time the DB system is starved of free pages, resulting in an
apparent stall and, in some cases, a dict_sys latch fatal error.

Fix: In the page cleaner iteration, before the LRU flush, ensure we
provide a large enough flush limit so that freeing pages is not
blocked by dirty pages in the LRU tail. Log the IO and flush state if
the doublewrite flush wait is long.

Reviewed by: Marko Mäkelä