fix(sagemaker): singleton window bug #8385

Merged
laileni-aws merged 7 commits into aws:master from aws-ajangg:fix-singleton-window-bug
Jan 15, 2026

Conversation

Contributor

@aws-ajangg aws-ajangg commented Dec 4, 2025

Problem

  • When a user connects to a space, a duplicate window is created for the same space.

Solution

  • Assign a unique identifier to each workspace, so that connecting to the same workspace opens the same external window.

Notes

  • Test cases are added in a separate PR with unit tests.
  • Retested the most recent changes locally with a new vsix.

  • Treat all work as PUBLIC. Private feature/x branches will not be squash-merged at release time.
  • Your code changes must meet the guidelines in CONTRIBUTING.md.
  • License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

@aws-ajangg aws-ajangg requested a review from a team as a code owner December 4, 2025 00:03
@amazon-inspector-ohio

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done

@github-actions

github-actions bot commented Dec 4, 2025

  • This pull request modifies code in src/* but no tests were added/updated.
    • Confirm whether tests should be added or ensure the PR description explains why tests are not required.
  • This pull request implements a feat or fix, so it must include a changelog entry (unless the fix is for an unreleased feature). Review the changelog guidelines.
    • Note: beta or "experiment" features that have active users should announce fixes in the changelog.
    • If this is not a feature or fix, use an appropriate type from the title guidelines. For example, telemetry-only changes should use the telemetry type.

@amazon-inspector-ohio

✅ I finished the code review, and didn't find any security or code quality issues.

  const hyperPodEnv: NodeJS.ProcessEnv = {
      AWS_REGION: region,
-     SESSION_ID: session || '',
+     SESSION_ID: hostname || '',
Contributor

why can't this be session?

Contributor Author

@aws-ajangg aws-ajangg Dec 4, 2025

hostname is also derived from session, but session is randomly generated in the presigned URL, and that is what causes a new external window to open each time the user connects to the workspace.

After discussing with @edwardps, we decided to use a deterministic unique identifier for the hostname instead.
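
The effect described in this reply can be sketched as follows. This is only an illustration, not the extension's code: `connect`, `openWindows`, and `deterministicSession` are hypothetical names, and the `Map` merely stands in for however the editor matches a hostname to an already open remote window.

```typescript
// Hypothetical stand-in for window tracking: one window per distinct hostname.
const openWindows = new Map<string, number>()
let nextWindowId = 1

// Returns the window id for a hostname, creating a new window only when the
// hostname has not been seen before.
function connect(hostname: string): number {
    const existing = openWindows.get(hostname)
    if (existing !== undefined) {
        return existing
    }
    const id = nextWindowId++
    openWindows.set(hostname, id)
    return id
}

// Deterministic identifier: the same workspace attributes always yield the same string.
function deterministicSession(
    workspaceName: string,
    namespace: string,
    clusterName: string,
    region: string,
    accountId: string
): string {
    return [workspaceName, namespace, clusterName, region, accountId].join('_')
}

const first = connect(deterministicSession('ws1', 'default', 'cluster-a', 'us-west-2', '123456789012'))
const second = connect(deterministicSession('ws1', 'default', 'cluster-a', 'us-west-2', '123456789012'))
console.log(first === second) // the second connection reuses the first window
```

With a randomly generated session, every call to `connect` would see a new key and therefore open a new window, which matches the duplicate-window symptom in the problem statement.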

if (domain === '' && eksClusterArn) {
    const { accountId, region, clusterName } = parseEKSClusterArn(eksClusterArn)
    connectionType = 'sm_hp'
    session = `${workspaceName}_${namespace}_${clusterName}_${region}_${accountId}`

${workspaceName}_${namespace}

Please check the naming conventions (limits) to confirm that every character allowed by k8s is also accepted in an SSH hostname.

Contributor Author

Added k8s naming convention validation

Added k8s naming convention validation

This is not about validating whether the name is k8s compliant. The comment is about whether any characters are allowed by k8s but not allowed by SSH in an SSH hostname.

Contributor Author

@aws-ajangg aws-ajangg Jan 13, 2026

misunderstood the ask. corrected to validate if session is compliant with ssh
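
The PR's actual `isValidSshHostname` is not shown in this conversation, so the following is only a sketch under an assumed rule set: letters, digits, `.`, `-`, and `_` are accepted, the name must start and end with a letter or digit, and the total length may not exceed 253. The underscore is accepted here only because the session string joins its parts with `_`; the real check may differ.

```typescript
// Assumed rule set for illustration; not the PR's implementation.
const MAX_HOSTNAME_LENGTH = 253

function isValidSshHostnameSketch(name: string): boolean {
    if (name.length === 0 || name.length > MAX_HOSTNAME_LENGTH) {
        return false
    }
    // Letters, digits, '.', '-' and '_' only; must start and end with a letter or digit.
    return /^[a-zA-Z0-9]([a-zA-Z0-9._-]*[a-zA-Z0-9])?$/.test(name)
}

console.log(isValidSshHostnameSketch('my-ws_default_cluster-a_us-west-2_123456789012'))
console.log(isValidSshHostnameSketch('bad name'))
```

A check like this answers the reviewer's question directly: it tests the joined session string against what an SSH hostname can carry, rather than re-validating the k8s naming rules.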

@aws-ajangg aws-ajangg force-pushed the fix-singleton-window-bug branch from 83cf892 to 9ec8341 Compare December 4, 2025 01:03
@aws-ajangg aws-ajangg force-pushed the fix-singleton-window-bug branch from 9ec8341 to b025fb0 Compare December 4, 2025 18:55
  let connectionType = 'sm_dl'
- if (domain === '') {
+ if (!domain && eksClusterArn && workspaceName && namespace) {
      const { accountId, region, clusterName } = parseEKSClusterArn(eksClusterArn)

nit: can you use parseArn instead? You may need to import @aws-sdk/util-arn-parser.

Contributor Author

nit addressed
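
For readers unfamiliar with ARN parsing, here is a hand-rolled sketch that returns the same three fields the PR destructures (`accountId`, `region`, `clusterName`). It deliberately does not use `@aws-sdk/util-arn-parser`, and it assumes the resource part has the form `cluster/<name>`; both the function name `parseClusterArnSketch` and that resource layout are assumptions, since the PR's own `parseArn` body is not shown here.

```typescript
interface ParsedClusterArn {
    accountId: string
    region: string
    clusterName: string
}

function parseClusterArnSketch(arn: string): ParsedClusterArn {
    // General ARN layout: arn:partition:service:region:account-id:resource
    const parts = arn.split(':')
    if (parts.length < 6 || parts[0] !== 'arn') {
        throw new Error(`Invalid ARN: ${arn}`)
    }
    const region = parts[3]
    const accountId = parts[4]
    // The resource may itself contain ':', so rejoin the remainder first.
    const resource = parts.slice(5).join(':')
    // Assumption: the resource looks like "cluster/<name>".
    const slash = resource.indexOf('/')
    const clusterName = slash >= 0 ? resource.slice(slash + 1) : resource
    if (!region || !accountId || !clusterName) {
        throw new Error(`ARN is missing region, account id, or cluster name: ${arn}`)
    }
    return { accountId, region, clusterName }
}

const parsed = parseClusterArnSketch('arn:aws:sagemaker:us-west-2:123456789012:cluster/my-cluster')
console.log(parsed.region, parsed.accountId, parsed.clusterName)
```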

@aws-jasakshi

Any testing or unit tests? Can you update the description if this is already done?

@aws-ajangg aws-ajangg force-pushed the fix-singleton-window-bug branch from 0a9e8db to d851720 Compare January 13, 2026 00:10
@aakashmandavilli96 aakashmandavilli96 left a comment

LGTM. Have you tested this locally? Also, some lint checks are failing; are they expected to fail?

Contributor Author

aws-ajangg commented Jan 13, 2026

@aakashmandavilli96
Updated description: tested locally with vsix.
As for the failing lint checks, they are coming from SMUS and codewhisperer.

): string {
    const sanitize = (str: string): string =>
        str
            .replace(/[^a-zA-Z0-9.-]/g, '-')

If uppercase characters could cause issues, should we keep the hostname all lowercase?

Contributor Author

SSH hostname naming is case-insensitive, but keeping it lowercase-only is also a safe approach. Reverted the sanitization to use lowercase characters only.

str
    .replace(/[^a-zA-Z0-9.-]/g, '-')
    .replace(/^-+|-+$/g, '')
    .substring(0, 50)

which attribute could exceed 50? if it's truncated, does that break our logic?

Contributor Author

workspaceName, namespace, and clusterName could exceed 50. I adjusted the limit for each attribute based on the maximum number of characters allotted to that attribute.

    const components = [
        sanitize(workspaceName, 63), // K8s limit
        sanitize(namespace, 63), // K8s limit
        sanitize(clusterName, 63), // HP cluster limit
        sanitize(region, 16), // Longest AWS region limit
        sanitize(accountId, 12), // Fixed
    ].filter((c) => c.length > 0)
    // Total: 63 + 63 + 63 + 16 + 12 + 4 separators + 3 chars for hostname header = 224 < 253 (max limit)
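
The length budget in the comment above can be checked mechanically. The sketch below pairs a hypothetical per-component sanitizer (`sanitizeComponent`, which follows the replace, trim, truncate order seen in the reviewed snippets and lowercases first, per the earlier review thread) with the worst-case arithmetic. The 3-character "hostname header" is taken from the comment as-is; what it consists of is not shown in this conversation.

```typescript
// Hypothetical per-component sanitizer; the PR's real sanitize() may differ.
function sanitizeComponent(value: string, maxLength: number): string {
    return value
        .toLowerCase()
        .replace(/[^a-z0-9.-]/g, '-') // replace anything outside the allowed set
        .replace(/^-+|-+$/g, '') // trim leading and trailing hyphens
        .substring(0, maxLength) // enforce the per-component limit
}

// Per-component limits quoted in the comment: workspaceName, namespace,
// clusterName, region, accountId.
const limits = [63, 63, 63, 16, 12]
const separators = limits.length - 1 // one '_' between adjacent components
const hostnameHeader = 3 // value taken from the comment above
const maxHostnameLength = 253

const worstCase = limits.reduce((sum, n) => sum + n, 0) + separators + hostnameHeader
console.log(worstCase) // 224
console.log(worstCase < maxHostnameLength) // true
```

Because the worst case stays below 253, a final `.substring(0, 253)` on the joined string would never cut into any component under these limits.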


const session = components
    .join('_')
    .substring(0, 253)

It seems the account ID could be truncated when the length is > 253. Is the hostname used only as a unique string, or does it also carry information, e.g. for reconnecting?

Contributor Author

hostname is used as a unique string. With the adjusted character allotment, no attribute will be truncated.

const modifiedUrl = url.toString()
getLogger().info(`Connection Type: ${connectionType}`)
getLogger().info(`Modified Presigned URL: ${modifiedUrl}`)
return { type: connectionType || 'vscode-remote', url: modifiedUrl }
Contributor

@ashishrp-aws ashishrp-aws Jan 14, 2026

nit: why are the URL parsing and string conversion necessary? If they are, can we add a try/catch around the URL parsing of presignedURL?

Contributor Author

URL parsing is necessary because the EKS cluster name from the eksClusterArn was previously used as an attribute to create a unique identifier for each session. However, the EKS cluster name can exceed the max character limit, so it was at risk of being truncated, which might affect other cases such as reconnect. After discussing with the team, we decided to use the HyperPod cluster name from the clusterArn, which satisfies the max limit.

The URL string conversion is necessary in order to parse the attributes in the presigned URL. Without the conversion, a URL object would be returned and cause a type error.

The URL parsing and string conversion are already inside a try/catch. Are you suggesting adding another try/catch inside the existing one?

const { accountId, region, clusterName } = parseArn(clusterArn)
connectionType = 'sm_hp'
session = `${workspaceName}_${namespace}_${clusterName}_${region}_${accountId}`
if (!isValidSshHostname(session)) {
Contributor

nit: avoid nested if

    const sessionCandidate = `${workspaceName}_${namespace}_${clusterName}_${region}_${accountId}`;
    
    session = isValidSshHostname(sessionCandidate) 
        ? sessionCandidate 
        : createValidSshSession(workspaceName, namespace, clusterName, region, accountId);

Contributor Author

nested if has been refactored


getLogger().info(`Connection Type: ${connectionType}`)
getLogger().info(`Presigned URL: ${presignedUrl}`)
const url = new URL(presignedUrl)
Contributor

nit:
What if presignedUrl is undefined? I know we catch the error, but I think it will be too generic.

Contributor Author

added safeguard to throw specific error if presignedUrl is undefined
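
The safeguard itself is not shown in this conversation. One way such a guard could look is sketched below; `parsePresignedUrl` and both error messages are hypothetical. The point is only that an undefined input and a malformed URL produce distinct, specific errors instead of one generic failure from `new URL`.

```typescript
// Hypothetical guard; the PR's actual safeguard is not shown in this thread.
function parsePresignedUrl(presignedUrl: string | undefined): URL {
    if (!presignedUrl) {
        // Specific error for the case the reviewer raised.
        throw new Error('presignedUrl is undefined or empty')
    }
    try {
        return new URL(presignedUrl)
    } catch {
        // new URL() throws a TypeError on malformed input; rethrow with context.
        throw new Error('presignedUrl is not a valid URL')
    }
}

const parsedUrl = parsePresignedUrl('https://example.com/path?sessionId=abc')
console.log(parsedUrl.hostname)
```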

}
}

function parseArn(arn: string): { accountId: string; region: string; clusterName: string } {
Contributor

can we add tests for these utils?

Contributor Author

tests are added to another pr with unit tests

Contributor

tests are added to another pr with unit tests

TODO

@laileni-aws laileni-aws merged commit b698a4f into aws:master Jan 15, 2026
32 of 34 checks passed

8 participants