MDEV-19574 innodb_stats_method is not honored when innodb_stats_persistent=ON #3886
…stent=ON
Problem:
=======
InnoDB persistent statistics do not take the innodb_stats_method variable into account when calculating n_diff_pfx for the n-prefix index columns. They also do not record the number of non-NULL key values for the n-prefix index columns.
Solution:
=========
To address the above issues, InnoDB considers all NULLs to be different values when innodb_stats_method is set to NULLS_UNEQUAL or NULLS_IGNORED. It also adds the statistics n_nonnull_pfx01, n_nonnull_pfx02, etc., to indicate how many non-NULL values exist for each n-prefix of the index.
0a7a0f5 to 1446c74
I see that this is implementing my rough ideas from MDEV-19574 that I expressed back in 2022. Since you are targeting a non-GA release with this, I think that we should consider some larger format changes.
Specifically, I’d like to understand what it would take to make innodb_stats_method control only how the statistics are used, while collecting statistics that can serve all variants. What would be the minimum amount of statistics to store for accommodating this, and how would these aggregate statistics be combined for each value of innodb_stats_method?
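One possible shape of an answer, as a rough sketch only: the struct and field names below are hypothetical and are not part of the PR or of the persistent statistics schema. It assumes that, per index prefix, the collector stores the distinct count with NULLs compared as equal, the number of distinct prefixes that contain a NULL, and the number of rows whose prefix contains a NULL. It also follows the PR description in treating NULLS_IGNORED like NULLS_UNEQUAL for the distinct count.

```cpp
#include <cstdint>

// Hypothetical method-independent statistics for one index prefix.
struct PrefixStats {
    std::uint64_t n_rows;             // rows sampled for this prefix
    std::uint64_t n_diff_nulls_equal; // distinct prefixes, NULL == NULL
    std::uint64_t n_null_groups;      // distinct prefixes containing a NULL
    std::uint64_t n_null_rows;        // rows whose prefix contains a NULL
};

enum class StatsMethod { NULLS_EQUAL, NULLS_UNEQUAL, NULLS_IGNORED };

// Derive the distinct count for a given method at read time.
std::uint64_t n_diff_for(const PrefixStats& s, StatsMethod method)
{
    if (method == StatsMethod::NULLS_EQUAL) {
        return s.n_diff_nulls_equal;
    }
    // Every NULL-containing row becomes its own value: drop the groups
    // that were merged under NULL == NULL and count the rows instead.
    return s.n_diff_nulls_equal - s.n_null_groups + s.n_null_rows;
}

// Rows whose prefix contains no NULL, independent of the method.
std::uint64_t n_non_null(const PrefixStats& s)
{
    return s.n_rows - s.n_null_rows;
}
```

With sampled statistics the derived values would only be estimates, so whether three counters per prefix are really the minimum is an open question.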
vidxcd n_diff_pfx01 d
vidxcd n_diff_pfx02 d,DB_ROW_ID
vidxcd n_leaf_pages Number of leaf pages in the index
vidxcd n_nonnull_pfx01 d
vidxcd n_nonnull_pfx02 d,DB_ROW_ID
vidxcd size Number of pages in the index
vidxe n_diff_pfx01 e
vidxe n_diff_pfx02 e,DB_ROW_ID
vidxe n_leaf_pages Number of leaf pages in the index
vidxe n_nonnull_pfx01 e
vidxe n_nonnull_pfx02 e,DB_ROW_ID
vidxe size Number of pages in the index
The name nonnull is rather misleading here. In MariaDB, virtual columns cannot ever be NOT NULL; in MySQL they can.
Can we have a more descriptive name, such as replacing nonnull with diff_null, if these statistics are covering different prefixes for NULLS_UNEQUAL? In the commit message it is unclear which values were stored by the n_diff_pfx statistics up until now, and how that would be changing here.
if (n_not_null != NULL) {
	btr_record_not_null_field_in_rec(
		n_cols, offsets_rec, n_not_null);
}
btr_record_not_null_field_in_rec(
	n_cols, offsets_rec, n_not_null);
We seem to be unnecessarily calling this function when all the columns in the index are declared NOT NULL. In that case, n_not_null[] should be identical to n_diff[], right?
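To illustrate the kind of guard this suggests, here is a standalone model, not the InnoDB code. It assumes that the per-record function increments a counter for each prefix length until it meets the first NULL column; the names record_not_null_prefixes and index_has_nullable_column are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of the per-record counting: for each prefix length, count the
// record only if none of the columns in that prefix is SQL NULL.
void record_not_null_prefixes(const std::vector<bool>& is_null,
                              std::vector<std::uint64_t>& n_not_null)
{
    for (std::size_t i = 0; i < is_null.size() && i < n_not_null.size(); ++i) {
        if (is_null[i]) {
            break;  // every longer prefix also contains this NULL
        }
        ++n_not_null[i];
    }
}

// Computed once per index: is any indexed column declared nullable?
bool index_has_nullable_column(const std::vector<bool>& column_nullable)
{
    for (bool nullable : column_nullable) {
        if (nullable) {
            return true;
        }
    }
    return false;
}
```

A caller could evaluate index_has_nullable_column() once before scanning and skip record_not_null_prefixes() for every record when it returns false.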
--- stats_method.result 2025-03-10 15:30:38.087625820 +0530
+++ stats_method.reject 2025-03-10 15:34:26.697129924 +0530
Please do not add any timestamps; 88d9348 just recently tried to get rid of them.
-n_diff_pfx01 16384 f3
+n_diff_pfx01 1 f3
Wouldn’t it be better to make the statistics collection independent of the parameter, and only make innodb_stats_method affect how previously collected statistics are used?
n_diff_pfx01 16341 DB_ROW_ID
n_leaf_pages 37 Number of leaf pages in the index
n_nonnull_pfx01 0 DB_ROW_ID
size 97 Number of pages in the index
n_diff_pfx01 16384 f1
n_diff_pfx02 16384 f1,f3
n_diff_pfx03 16384 f1,f3,DB_ROW_ID
n_leaf_pages 1 Number of leaf pages in the index
n_nonnull_pfx01 0 f1
n_nonnull_pfx02 0 f1,f3
n_nonnull_pfx03 0 f1,f3,DB_ROW_ID
Because DB_ROW_ID as well as f1 are declared NOT NULL, it feels very strange that n_nonnull_pfx01 for those fields is different from n_diff_pfx01.
As far as I understand, it is necessary to store n_nonnull_pfx02 and n_nonnull_pfx03, because the column f3 allows NULL values. I think that we should try not to store any redundant statistics, such as n_nonnull_pfx01 here.
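A minimal sketch of that rule, assuming a prefix needs its own non-NULL statistic only once it includes at least one nullable column. The function name is hypothetical and the input is just a per-column "nullable" flag in index order.

```cpp
#include <cstddef>
#include <vector>

// Returns the 1-based prefix lengths for which a separate non-NULL
// statistic carries information, i.e. the prefix includes a nullable column.
std::vector<std::size_t>
prefixes_needing_nonnull_stat(const std::vector<bool>& column_nullable)
{
    std::vector<std::size_t> result;
    bool seen_nullable = false;
    for (std::size_t i = 0; i < column_nullable.size(); ++i) {
        seen_nullable = seen_nullable || column_nullable[i];
        if (seen_nullable) {
            result.push_back(i + 1);
        }
    }
    return result;
}
```

For the index on (f1, f3, DB_ROW_ID) above, with only f3 nullable, this selects prefixes 2 and 3 and skips prefix 1.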
Description
Problem:
InnoDB persistent statistics do not take the innodb_stats_method variable into account when calculating n_diff_pfx for the n-prefix index columns. They also do not record the number of non-NULL key values for the n-prefix index columns.
Solution:
To address the above issues, InnoDB considers all NULLs to be different values when innodb_stats_method is set to NULLS_UNEQUAL or NULLS_IGNORED. It also adds the statistics n_nonnull_pfx01, n_nonnull_pfx02, etc., to indicate how many non-NULL values exist for each n-prefix of the index.
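As an illustration of the counting rule described here, the following standalone sketch works on a sorted single-column key list. It is a model only: the Key struct and function names are hypothetical, and InnoDB's real estimator samples pages rather than scanning every key.

```cpp
#include <cstddef>
#include <vector>

enum class StatsMethod { NULLS_EQUAL, NULLS_UNEQUAL, NULLS_IGNORED };

// Hypothetical key model: is_null == true stands for SQL NULL.
struct Key {
    bool is_null;
    int value;
};

// Distinct values in a sorted key list. With NULLS_EQUAL all NULLs form
// one value; with NULLS_UNEQUAL or NULLS_IGNORED each NULL is counted as
// a different value, as the description above states.
std::size_t count_distinct(const std::vector<Key>& sorted_keys,
                           StatsMethod method)
{
    std::size_t n_diff = 0;
    for (std::size_t i = 0; i < sorted_keys.size(); ++i) {
        const Key& cur = sorted_keys[i];
        if (cur.is_null) {
            const bool first_null = (i == 0) || !sorted_keys[i - 1].is_null;
            if (method != StatsMethod::NULLS_EQUAL || first_null) {
                ++n_diff;
            }
            continue;
        }
        const bool differs = (i == 0) || sorted_keys[i - 1].is_null
            || sorted_keys[i - 1].value != cur.value;
        if (differs) {
            ++n_diff;
        }
    }
    return n_diff;
}

// Number of keys that are not NULL (the idea behind n_nonnull_pfx).
std::size_t count_non_null(const std::vector<Key>& keys)
{
    std::size_t n = 0;
    for (const Key& k : keys) {
        if (!k.is_null) {
            ++n;
        }
    }
    return n;
}
```

For the keys NULL, NULL, NULL, 1, 1, 2 this gives 3 distinct values under NULLS_EQUAL and 5 under the other two methods, with 3 non-NULL keys.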
Release Notes
innodb_stats_method is honoured when innodb_stats_persistent=1
How can this PR be tested?
./mtr innodb.stats_method
Basing the PR against the correct MariaDB version
main branch.
PR quality check