Skip to content

Conversation

@k-rus
Copy link
Member

@k-rus k-rus commented Dec 1, 2025

What is the issue

PartitionAwarePrimaryKeyMap implements overcomplicated ceiling method calling exactRowIdOrInvertedCeiling.
Part of https://github.com/riptano/cndb/issues/15608

What does this PR fix and why was it fixed

Simplifies PartitionAwarePrimaryKeyMap.ceiling to use the corresponding correct method from the reader directly.
This can be seen as a follow up to https://github.com/datastax/cassandra/pull/1096/files#diff-c5011580ab9b0d99d9e504570c4cccb152221d3dbe62c8a956e83fce9070b380

CNDB's PR: https://github.com/riptano/cndb/pull/16154

PartitionAwarePrimaryKeyMap implements unused exactRowIdOrInvertedCeiling
and overcomplicated ceiling method.
@github-actions
Copy link

github-actions bot commented Dec 1, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 1, 2025

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-2142 rejected by Butler


4 regressions found
See build details here


Found 4 new test failures

Test Explanation Runs Upstream
o.a.c.distributed.test.FullRepairCoordinatorTimeoutTest.prepareRPCTimeout[DATACENTER_AWARE/true] REGRESSION 🔵🔴 0 / 19
o.a.c.distributed.test.repair.ConcurrentValidationRequestsTest.testConcurrentValidations REGRESSION 🔴🔵 0 / 19
o.a.c.index.sai.cql.VectorCompaction100dTest.testZeroOrOneToManyCompaction[dc true] NEW 🔴 0 / 19
o.a.c.index.sai.cql.VectorSiftSmallTest.testSiftSmall[dc false] REGRESSION 🔴 0 / 19

Found 2 known test failures

@k-rus
Copy link
Member Author

k-rus commented Dec 1, 2025

4 regressions found See build details here

Found 4 new test failures

Test Explanation Runs Upstream
o.a.c.distributed.test.FullRepairCoordinatorTimeoutTest.prepareRPCTimeout[DATACENTER_AWARE/true] REGRESSION 🔵🔴 0 / 19
o.a.c.distributed.test.repair.ConcurrentValidationRequestsTest.testConcurrentValidations REGRESSION 🔴🔵 0 / 19
o.a.c.index.sai.cql.VectorCompaction100dTest.testZeroOrOneToManyCompaction[dc true] NEW 🔴⚪ 0 / 19
o.a.c.index.sai.cql.VectorSiftSmallTest.testSiftSmall[dc false]

Test failures are unrelated to the PR.

@k-rus k-rus changed the title [WIP] CNDB-15608 simplify ceiling row id from primary key CNDB-15608 simplify ceiling row id from primary key Dec 1, 2025
@k-rus
Copy link
Member Author

k-rus commented Dec 1, 2025

Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The implementation is identical, AFAICT. The one minor detail is that neither indexOf nor ceilingRowId actually return negative values beyond the Long.MIN_VALUE or the -1 for when lastIndex >= valueCount. This does not break the logic of ceiling, but does lead to a pessimization that should be fixed. I will create a subsequent issue to track that work.

    @Override
    public long ceilingRowId(long targetValue)
    {
        // already out of range
        if (lastIndex >= valueCount)
            return -1;

        long rowId = findBlockRowId(targetValue);
        lastIndex = rowId >= 0 ? rowId : -rowId - 1;
        return lastIndex >= valueCount ? -1 : lastIndex;
    }

    @Override
    public long indexOf(long targetValue)
    {
        // already out of range
        if (lastIndex >= valueCount)
            return Long.MIN_VALUE;

        long rowId = findBlockRowId(targetValue);
        lastIndex = rowId >= 0 ? rowId : -rowId - 1;
        return rowId >= valueCount ? Long.MIN_VALUE : rowId;
    }

@k-rus
Copy link
Member Author

k-rus commented Dec 1, 2025

The implementation is identical, AFAICT. The one minor detail is that neither indexOf nor ceilingRowId actually return negative values beyond the Long.MIN_VALUE or the -1 for when lastIndex >= valueCount. This does not break the logic of ceiling, but does lead to a pessimization that should be fixed. I will create a subsequent issue to track that work.

@michaeljmarshall
This is another difference, which was the reason for me to mention:
ceilingRowId compares lastIndex >= valueCount, while indexOf compares rowId >= valueCount before returning the actual value or the corresponding negative value. I guess subtracting 1 in lastIndex = rowId >= 0 ? rowId : -rowId - 1 guarantees for not exceeding valueCount on negative rowId.

@k-rus k-rus merged commit 42ae0f3 into main Dec 1, 2025
486 of 496 checks passed
@k-rus k-rus deleted the rf-15608-simplify-v1-ceiling branch December 1, 2025 18:49
@michaeljmarshall
Copy link
Member

In re-reviewing the logic, I was mistaken about the misused methods and the pessimization. Everything looks correct to me.

ceilingRowId compares lastIndex >= valueCount, while indexOf compares rowId >= valueCount before returning the actual value or the corresponding negative value. I guess subtracting 1 in lastIndex = rowId >= 0 ? rowId : -rowId - 1 guarantees for not exceeding valueCount on negative rowId.

I'm not sure I follow your point about exceeding valueCount. As far as I can tell, this logic:

        long rowId = findBlockRowId(targetValue);
        lastIndex = rowId >= 0 ? rowId : -rowId - 1;
        return rowId >= valueCount ? Long.MIN_VALUE : rowId;

works by getting either the exact row match back or -(low + 1) from the binary search when no match is found. The -rowId - 1 undoes that -(low + 1) logic and gets the next value. When we check rowId >= valueCount, I think it would work fine if we also had lastIndex >= valueCount since in the negative case, the caller needs to convert the inverted ceiling to a value and then discovers it is out of scope anyway.

@k-rus
Copy link
Member Author

k-rus commented Dec 1, 2025

The -rowId - 1 undoes that -(low + 1) logic and gets the next value.

This is for lastIndex.

When we check rowId >= valueCount, I think it would work fine if we also had lastIndex >= valueCount since in the negative case, the caller needs to convert the inverted ceiling to a value and then discovers it is out of scope anyway.

Why will it be out of the scope? lastIndex is a positive value due to -(low + 1) and it might not go out of scope like rowId >= valueCount for negative rowId.

cc @michaeljmarshall

k-rus added a commit that referenced this pull request Dec 12, 2025
PartitionAwarePrimaryKeyMap implements overcomplicated `ceiling` method
calling `exactRowIdOrInvertedCeiling`.

This commit Simplifies PartitionAwarePrimaryKeyMap.ceiling to use the corresponding
correct method from the reader directly.
This can be seen as a follow up to
https://github.com/datastax/cassandra/pull/1096/files#diff-c5011580ab9b0d99d9e504570c4cccb152221d3dbe62c8a956e83fce9070b380
djatnieks pushed a commit that referenced this pull request Dec 17, 2025
PartitionAwarePrimaryKeyMap implements overcomplicated `ceiling` method
calling `exactRowIdOrInvertedCeiling`.

This commit Simplifies PartitionAwarePrimaryKeyMap.ceiling to use the corresponding
correct method from the reader directly.
This can be seen as a follow up to
https://github.com/datastax/cassandra/pull/1096/files#diff-c5011580ab9b0d99d9e504570c4cccb152221d3dbe62c8a956e83fce9070b380
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants