yehudit1987

  • Migrate frontend tests, multi-arch build, and integration test workflows

  • Optimize during migration:
    • Add npm and Docker build caching for 40-50% faster builds
    • Eliminate code duplication by centralizing common library build
    • Replace manual multi-arch builds with docker/build-push-action
    • Standardize naming conventions and make scripts executable

    Resolves issue #589: [TASK] Migrate TWA test-related workflows from kubeflow/kubeflow to notebooks-v1 branch


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kimwnasptd for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yehudit1987 yehudit1987 force-pushed the feat/migrate_twa_tests branch 3 times, most recently from 01e48e4 to a5bfa91 Compare October 5, 2025 08:41
@google-oss-prow google-oss-prow bot added area/backend area - related to backend components area/frontend area - related to frontend components area/v1 area - version - kubeflow notebooks v1 size/XXL and removed size/L labels Oct 5, 2025
@yehudit1987 yehudit1987 force-pushed the feat/migrate_twa_tests branch 2 times, most recently from 0387628 to 8b9b78f Compare October 5, 2025 08:59
@yehudit1987 yehudit1987 force-pushed the feat/migrate_twa_tests branch from 8b9b78f to 28f1111 Compare October 5, 2025 09:00
@yehudit1987 yehudit1987 marked this pull request as ready for review October 5, 2025 10:46
@google-oss-prow google-oss-prow bot requested a review from orfeas-k October 5, 2025 10:46
Contributor

@andyatmiami andyatmiami left a comment

Really nice investigations/implementations here @yehudit1987 - but for the immediate needs of notebooks-v1 migration - I would prefer we dial back some of these changes to align more with the "status quo" of how kubeflow/kubeflow implemented these workflows.

I am concerned about the massive number of changes being introduced in components/crud-web-apps/tensorboards/frontend/package-lock.json

I tried to replicate the upgrade of cypress as seen on this experimental PR on my fork here:

My process:

  • Check out current upstream/notebooks-v1 branch
  • Create a chore/cypress-update branch
  • nvm use v16.20.2 to ensure we are using Node v16
  • npm install [email protected]
  • commit + push changes to remote branch for inspection

While it's completely possible (likely?) that I have done something wrong myself, I only see ~500 changed lines in my PR, whereas this PR contains almost 18000 changes to package-lock.json.

We need to understand this significant delta. I'm hypothesizing it is due to one (or both) of:

  • using a version of node (+ npm) that is more recent than Node v16
  • an improper means of updating the cypress dependency

Let's discuss this more so I can understand how you updated the lock file.
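
One quick way to test the first hypothesis is to compare the lock file format on both branches. As far as I know, the npm 8 that ships with Node v16 writes lockfileVersion 2, while npm 9 and later default to lockfileVersion 3, which drops the legacy top-level "dependencies" section and could by itself account for a very large diff. Whether that is what happened in this PR is an assumption, not something I have confirmed. A minimal Python sketch (the file paths and script name are placeholders):

```python
import json
import sys
from pathlib import Path


def load_lockfile(path):
    """Parse a package-lock.json file into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_lockfile(lock):
    """Report the lock file format and how many entries each section holds."""
    return {
        "lockfileVersion": lock.get("lockfileVersion"),
        # "packages" is present in lockfileVersion 2 and 3.
        "packages": len(lock.get("packages", {})),
        # The legacy "dependencies" section is present in lockfileVersion 1 and 2.
        "dependencies": len(lock.get("dependencies", {})),
    }


def format_changed(before, after):
    """True when the two lock files were written in different formats."""
    return before.get("lockfileVersion") != after.get("lockfileVersion")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (placeholder names): python compare_locks.py old-lock.json new-lock.json
    before, after = (load_lockfile(p) for p in sys.argv[1:3])
    print("before:", summarize_lockfile(before))
    print("after: ", summarize_lockfile(after))
    print("format changed:", format_changed(before, after))
```

If the format did change, regenerating the lock file under Node v16 should shrink the diff considerably; if it did not, the delta more likely comes from how the dependency was updated.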

Comment on lines +34 to +46
- name: Build multi-arch images
  uses: docker/build-push-action@v5
  with:
    context: components/crud-web-apps
    file: components/crud-web-apps/tensorboards/Dockerfile
    platforms: ${{ env.PLATFORMS }}
    push: false
    load: false
    tags: |
      ${{ env.IMG }}:${{ github.sha }}
      ${{ env.IMG }}:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max
No newline at end of file

Please see my comments provided on a different PR that outlines how/why I'd like to simplify the configuration of build-push-action

@@ -0,0 +1,147 @@
name: TWA Frontend Tests

While I greatly appreciate your effort and investigation into bringing caching into this workflow - I'd prefer for now we keep things more in line with how the kubeflow/kubeflow workflow is structured.

I created a branch off your branch that strips the caching changes out of this workflow and verified it's still running successfully, if you want to use it as a reference:

Main motivations behind preferring to avoid caching for now:

  • K.I.S.S. principles
    • as we are solely focused on getting releases published from the notebooks-v1 branch - I don't want to increase "complexity" by adding features/functionality not present in kubeflow/kubeflow that aren't absolutely necessary
    • note that I realize we are changing a workflow here to use build-push-action as opposed to the Makefile target, but I view that as "absolutely necessary" given the "hanging build" observations you previously highlighted
  • Introducing "unknown unknowns"
    • if a cache somehow gets corrupted and/or incorrectly "stale" - this is a whole new paradigm we are introducing, and the community doesn't have well-established processes to address it
  • Consistency
    • GH repos have limits on caching, and given notebooks-v1 and notebooks-v2 exist within the same repo - they are relegated to sharing the same cache. If we consistently rolled out caching changes in the manner outlined here - I am concerned we wouldn't have enough cache space to support all the various components (which means the cache would be full and constantly churning on cache misses). I saw a very simple NodeJS app consume ~100MB of cache (and we have a 10GB repo limit).
    • If we were to implement these caching changes with agreement from the community - I would prefer that be done as a single PR that updates ALL workflows - vs. implementing it only here in a 'one-off' fashion for tensorboards-web-app
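
To make the cache-space concern concrete, here is a rough back-of-the-envelope sketch in Python. Only the ~100MB and 10GB figures come from the observation above; the number of workflows and the number of cache entries per workflow are purely hypothetical values chosen for illustration.

```python
REPO_CACHE_LIMIT_MB = 10 * 1024  # 10GB repo limit mentioned above
OBSERVED_CACHE_MB = 100          # ~100MB seen for a very simple NodeJS app


def total_cache_mb(workflows, caches_per_workflow, mb_per_cache=OBSERVED_CACHE_MB):
    """Estimate total cache usage if every workflow keeps several cache entries."""
    return workflows * caches_per_workflow * mb_per_cache


def fits_in_limit(total_mb, limit_mb=REPO_CACHE_LIMIT_MB):
    """True when the estimated usage stays within the repo cache limit."""
    return total_mb <= limit_mb


if __name__ == "__main__":
    # Upper bound on how many ~100MB entries fit in the repo limit.
    print("max entries:", REPO_CACHE_LIMIT_MB // OBSERVED_CACHE_MB)

    # Hypothetical: 30 workflows across notebooks-v1 and notebooks-v2,
    # each keeping 4 live cache entries (assumed numbers, not measured).
    estimate = total_cache_mb(workflows=30, caches_per_workflow=4)
    print("estimated MB:", estimate, "fits:", fits_in_limit(estimate))
```

Under these assumed numbers the estimate already exceeds the limit, and real Docker layer caches may well be larger than the simple NodeJS case, so the actual headroom would need to be measured before rolling caching out repo-wide.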

If/as we see long run times become a problem for us testing/publishing packages - it's good to keep these capabilities in mind. But for now I don't see the need to cache, as the time taken to test and/or publish is not so egregious as to impact our work.

Happy to discuss more in the event you disagree!

Labels

area/backend area - related to backend components area/ci area - related to ci area/frontend area - related to frontend components area/v1 area - version - kubeflow notebooks v1 size/XXL

Projects

Status: Needs Triage

2 participants