Conversation

Contributor

@SrdjanLL SrdjanLL commented Nov 27, 2025

Relates to: https://github.com/elastic/sdh-kibana/issues/5923

Summary

  • Handle Lock Manager's alias resolution during migration:
    • when checking lock index mappings
    • in cases when concrete index creation is attempted, but the name is already used as an alias
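
To illustrate the first bullet: when the lock index is reached through an alias, a get-mapping response is keyed by the concrete index behind the alias, so looking up the requested name directly yields `undefined`. The sketch below is a minimal illustration of that lookup, not the code from this PR; `MappingResponse` and `resolveLockIndexMappings` are hypothetical names, and the index names are taken from the test scenario discussed in this review.

```typescript
// Assumed, simplified shape of a get-mapping response: keyed by concrete index name.
type MappingResponse = Record<string, { mappings: Record<string, unknown> }>;

// Hypothetical helper (not the PR's implementation): returns the mappings for
// `requestedName`, falling back to the first entry when the name was an alias.
function resolveLockIndexMappings(
  res: MappingResponse,
  requestedName: string
): Record<string, unknown> | undefined {
  // Direct hit: the requested name is itself a concrete index.
  const direct = res[requestedName];
  if (direct !== undefined) {
    return direct.mappings;
  }
  // Alias case: the response is keyed by the index behind the alias.
  const [firstEntry] = Object.values(res);
  return firstEntry?.mappings;
}

// Response as it might look after the index was reindexed and aliased.
const reindexedResponse: MappingResponse = {
  '.kibana_locks-000001-reindexed-for-10': { mappings: { dynamic: false } },
};

const resolved = resolveLockIndexMappings(reindexedResponse, '.kibana_locks-000001');
console.log(resolved);
```

Falling back to the first entry assumes the alias points at exactly one index, which is an assumption of this sketch rather than something the PR states.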

Testing

New unit tests:

yarn test:jest src/platform/packages/shared/kbn-lock-manager/src/setup_lock_manager_index.test.ts

Updated integration tests:

node scripts/functional_tests_server --config x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.ai_assistant.stateful.config.ts

# Then run in a separate terminal:
node scripts/functional_test_runner --config x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.ai_assistant.stateful.config.ts --grep="LockManager"

Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

Identify risks

  • This PR aims to mitigate risks that occur during migration. The implementation makes certain assumptions about how the migration tooling works. In the unlikely event that this changes, the logic may need to be revisited.

Reviewer notes

  • AI assisted coding with Cursor and Claude 4.5 Opus:
    • Plan generation (to help cover all relevant places)
    • Unit test generation

@SrdjanLL SrdjanLL added the backport:version and Team:obs-ai labels Nov 27, 2025
});

describe('when lock index is accessed via alias (simulating reindexed scenario)', () => {
const reindexedIndexName = `${LOCKS_CONCRETE_INDEX_NAME}-reindexed-for-10`;
Member

Thanks for covering this scenario!

Comment on lines +854 to +861
// This should NOT throw "Cannot destructure property 'mappings' of 'res[LOCKS_CONCRETE_INDEX_NAME]' as it is undefined"
try {
  await setupLockManagerIndex(es, logger);
} catch (error) {
  expect().fail(
    `setupLockManagerIndex should not throw when index is accessed via alias, but got: ${error.message}`
  );
}
Member

I don't understand this test. Do you expect it to throw or not?
The comment seems to say that it should not throw, but the code assumes that it throws because the expectation is in the catch block

Member

Ah, I never used expect().fail() but I suppose that fails the test if it reaches the catch block

Member

@sorenlouv sorenlouv Nov 28, 2025

btw. if the index ${LOCKS_CONCRETE_INDEX_NAME}-reindexed-for-10 exists, and setupLockManagerIndex is called, will that create a new index called ${LOCKS_INDEX_ALIAS}-000001?

Based on this I think it will:

await esClient.indices.create({ index: LOCKS_CONCRETE_INDEX_NAME });

Can you double check if that's the case? Because that could be problematic

Contributor Author

> Ah, I never used expect().fail() but I suppose that fails the test if it reaches the catch block

Yeah.. it's kinda like re-throwing an exception and enriching it with more context. Happy to remove and just let fail if that's not our common practice.
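
For readers unfamiliar with the pattern, here is a minimal sketch of "re-throwing with more context" in plain TypeScript. It does not use `expect().fail()` itself; `enrichError` and `expectNotToReject` are hypothetical helpers written only for this illustration.

```typescript
// Hypothetical helper: wraps any thrown value in an Error that carries extra context.
function enrichError(context: string, error: unknown): Error {
  const reason = error instanceof Error ? error.message : String(error);
  return new Error(`${context}, but got: ${reason}`);
}

// Hypothetical wrapper: resolves if `action` resolves, otherwise rejects with
// the enriched error so the failure explains what was expected.
async function expectNotToReject(
  action: () => Promise<unknown>,
  context: string
): Promise<void> {
  try {
    await action();
  } catch (error) {
    throw enrichError(context, error);
  }
}

const enriched = enrichError(
  'setupLockManagerIndex should not throw when index is accessed via alias',
  new Error('invalid_index_name_exception')
);
console.log(enriched.message);
```

The trade-off is the one raised above: the test reads slightly less directly, but a failure names the expectation that was violated instead of only surfacing the raw Elasticsearch error.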

> btw. if the index ${LOCKS_CONCRETE_INDEX_NAME}-reindexed-for-10 exists, and setupLockManagerIndex is run, will that create a new index called ${LOCKS_INDEX_ALIAS}-000001?

About this, you're right to question it; it's what I noticed when running the integration tests (which is why I left the PR in draft).

After resolving the issue with mappings, I found the integration tests still failing with invalid_index_name_exception:

Error: setupLockManagerIndex should not throw when index is accessed via alias, but got: invalid_index_name_exception
        Root causes:
                invalid_index_name_exception: Invalid index name [.kibana_locks-000001], already exists as alias

I guess this happens within the migration tooling during index migration (aliasing on the concrete index name) and will affect the index creation as you mention.

I got around this locally by catching the exception and skipping creation, similarly to resource_already_exists_exception, since a usable alias already exists. Does that workaround sound good to you?

Member

> Happy to remove and just let fail if that's not our common practice.

No, I like it 👍

Member

> I got around this locally by catching the exception and skipping creation, similarly to resource_already_exists_exception, since a usable alias already exists. Does that workaround sound good to you?

Yes, I think if an alias exists and it points to a concrete index, we should handle (swallow) the error and skip index creation.
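
The agreed behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: it reads the error type structurally instead of using the Elasticsearch client's `errors.ResponseError`, and `getErrorType`, `isIgnorableCreateError`, and `createIndexIfMissing` are hypothetical names.

```typescript
// Error types after which a usable index or alias already exists.
const IGNORABLE_CREATE_ERRORS = new Set<string>([
  'resource_already_exists_exception',
  'invalid_index_name_exception',
]);

// Hypothetical helper: reads `body.error.type` from an unknown error value.
function getErrorType(error: unknown): string | undefined {
  if (typeof error !== 'object' || error === null) {
    return undefined;
  }
  const body = (error as { body?: { error?: { type?: unknown } } }).body;
  const type = body?.error?.type;
  return typeof type === 'string' ? type : undefined;
}

// True when index creation failed only because the name is already taken.
function isIgnorableCreateError(error: unknown): boolean {
  const type = getErrorType(error);
  return type !== undefined && IGNORABLE_CREATE_ERRORS.has(type);
}

// Hypothetical wrapper: runs `createIndex` and swallows the ignorable errors.
async function createIndexIfMissing(
  createIndex: () => Promise<unknown>
): Promise<'created' | 'skipped'> {
  try {
    await createIndex();
    return 'created';
  } catch (error) {
    if (isIgnorableCreateError(error)) {
      return 'skipped';
    }
    throw error;
  }
}
```

Like the change discussed here, the sketch matches on the error type only; it does not inspect the error reason.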

Contributor Author

Great, thanks for confirming - I've added that change already (might just get lost in all the auto-generated moon changes 🌔 )

@SrdjanLL SrdjanLL marked this pull request as ready for review November 28, 2025 10:24
@SrdjanLL SrdjanLL requested review from a team as code owners November 28, 2025 10:24
@elasticmachine
Contributor

Pinging @elastic/obs-ai-team (Team:obs-ai)

Comment on lines +108 to +113
// Handle the case where the index name already exists as an alias (e.g., after a reindex operation during cluster upgrade)
const isIndexNameExistsAsAliasError =
  error instanceof errors.ResponseError &&
  error.body.error.type === 'invalid_index_name_exception';

if (isIndexAlreadyExistsError || isIndexNameExistsAsAliasError) {
Contributor Author

@sorenlouv I applied the changes I suggested for what you mentioned in this comment. Thought it's easier to iterate on it if it's seen in the works (and I no longer see the integ test errors). Let me know what you think!

Member

@sorenlouv sorenlouv Nov 28, 2025

Does the existing integration test cover this? Does it fail without this change? (it should)

Contributor Author

> Does it fail without this change? (it should)

Yes, confirmed it fails with:

2)    Stateful Observability - Deployment-agnostic AI Assistant API integration tests
       observability AI Assistant
         LockManager
           index assets
             when lock index is accessed via alias (simulating reindexed scenario)
               should successfully acquire a lock when index is accessed via alias:

      ResponseError: invalid_index_name_exception
        Root causes:
                invalid_index_name_exception: Invalid index name [.kibana_locks-000001], already exists as alias

Member

btw. not needed now, but I wonder if ensureTemplatesAndIndexCreated should create the alias .kibana_locks pointing to .kibana_locks-000001.
And if the alias (or index) already exists we do nothing.

But we'd have to consider if this is fully backwards compatible.
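
A rough sketch of what that idea could look like, purely for discussion. It is not part of this PR; `planLockIndexSetup` and `SetupAction` are hypothetical, and the constant values are inferred from the names quoted in this thread.

```typescript
// Values inferred from this thread; treat them as assumptions.
const LOCKS_INDEX_ALIAS = '.kibana_locks';
const LOCKS_CONCRETE_INDEX_NAME = '.kibana_locks-000001';

type SetupAction =
  | { kind: 'noop' }
  | { kind: 'create'; index: string; aliases: Record<string, object> };

// Hypothetical planner: `existingNames` holds every index and alias name that
// is already present in the cluster.
function planLockIndexSetup(existingNames: ReadonlySet<string>): SetupAction {
  // If either name is already taken (as an index or as an alias), do nothing.
  if (existingNames.has(LOCKS_INDEX_ALIAS) || existingNames.has(LOCKS_CONCRETE_INDEX_NAME)) {
    return { kind: 'noop' };
  }
  // Otherwise create the concrete index together with the alias pointing at it.
  return {
    kind: 'create',
    index: LOCKS_CONCRETE_INDEX_NAME,
    aliases: { [LOCKS_INDEX_ALIAS]: {} },
  };
}

console.log(planLockIndexSetup(new Set<string>()));
console.log(planLockIndexSetup(new Set<string>(['.kibana_locks-000001'])));
```

Whether existing deployments that only have the concrete index should gain the alias later is exactly the backwards-compatibility question raised above, and the sketch does not answer it.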

Contributor Author

Sounds like a good, future-proof suggestion, but given the backwards-compatibility considerations and my (just acquired) familiarity with the lock manager, I'd suggest leaving it out of this PR and tracking it as a follow-up.

If you feel strongly, I'm happy to figure it out, but it may need more heads as I'm rotating out of SDH and we'll be focusing on the AB offsite next week.

Member

No, I agree. Let's leave that out

@elasticmachine
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@SrdjanLL SrdjanLL force-pushed the sdh-lock-manager-alias-handling branch from 614071f to b95fe4a Compare November 28, 2025 11:56
@SrdjanLL SrdjanLL removed request for a team November 28, 2025 11:57
Member

@sorenlouv sorenlouv left a comment

Thank you so much! Great improvement

@SrdjanLL SrdjanLL removed the v9.2.1 label Dec 2, 2025
@SrdjanLL SrdjanLL enabled auto-merge (squash) December 2, 2025 08:23
@SrdjanLL SrdjanLL merged commit 6b81b37 into elastic:main Dec 2, 2025
12 checks passed
@kibanamachine
Contributor

Starting backport for target branches: 8.19, 9.1, 9.2

https://github.com/elastic/kibana/actions/runs/19855390956

@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

@kibanamachine
Contributor

💔 All backports failed

| Branch | Result |
|--------|--------|
| 8.19 | Backport failed because of merge conflicts |
| 9.1 | Backport failed because of merge conflicts |
| 9.2 | Backport failed because of merge conflicts |

Manual backport

To create the backport manually run:

node scripts/backport --pr 244559

Questions?

Please refer to the Backport tool documentation

NicholasPeretti pushed a commit to NicholasPeretti/kibana that referenced this pull request Dec 2, 2025
…ngs (elastic#244559)

Relates to: elastic/sdh-kibana#5923

## Summary

- Handle Lock Manager's alias resolution during migration:
  - when checking lock index mappings
  - in cases when concrete index creation is attempted, but the name is
    already used as an alias

## Testing
New unit tests:
```bash
yarn test:jest src/platform/packages/shared/kbn-lock-manager/src/setup_lock_manager_index.test.ts
```

Updated integration tests:

```bash
node scripts/functional_tests_server --config x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.ai_assistant.stateful.config.ts

# Then run in a separate terminal:
node scripts/functional_test_runner --config x-pack/solutions/observability/test/api_integration_deployment_agnostic/configs/stateful/oblt.ai_assistant.stateful.config.ts --grep="LockManager"
```
### Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios


### Identify risks

- This PR aims to mitigate risks that occur during migration. The
implementation makes certain assumptions about how the migration tooling
works. In the unlikely event that this changes, the logic may need
to be revisited.


### Reviewer notes
- AI assisted coding with Cursor and Claude 4.5 Opus:
  - Plan generation (to help cover all relevant places)
  - Unit test generation

---------

Co-authored-by: Søren Louv-Jansen <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
@kibanamachine kibanamachine added the backport missing label Dec 3, 2025
@kibanamachine
Contributor

Friendly reminder: Looks like this PR hasn’t been backported yet.
To create backports automatically, add a backport:* label, or prevent reminders by adding the backport:skip label.
You can also create backports manually by running node scripts/backport --pr 244559 locally
cc: @SrdjanLL

Labels

backport missing, backport:version, release_note:fix, Team:obs-ai, Team:obs-ux-management, v8.19.8, v9.1.8, v9.2.2, v9.3.0

4 participants