Releases: determined-ai/determined

v0.38.1-ee

19 Mar 23:12
6bcda5d

Release Notes

v0.38.1-ee

Changelog

v0.38.1

19 Mar 23:12
6bcda5d

Release Notes

v0.38.1

Changelog

v0.38.0-ee

07 Mar 00:27

Release Notes

v0.38.0-ee

Changelog

  • 0fb554e ci: fix 0.38.0-ee release
  • 79ea08a ci: add gorelease ee dryrun
  • e3161d6 ci: remove dependency on codecov (#10238)
  • e6d1952 ci: remove deploying latest gke cluster (#10235)
  • f4e5909 ci: remove deploying latest main and preview clusters (#10234)
  • 7b1fc8e ci: fix apex installation [DT-5] (#10233)
  • de1c0f1 test: update CI setup for running GPU unit tests (#10230)
  • 7154424 chore: release notes 0.38.0 (#10231)
  • 13e49a7 [AUTO-BACKPORT release-0.38.0] 10226: chore: eliminate use of fury repo (#10229)
  • a554cd0 [AUTO-BACKPORT release-0.38.0] 10224: fix: make some k8s tests pass (#10228)
  • 0d373b1 [AUTO-BACKPORT release-0.38.0] 10221: fix: use new migration gist (#10222)
  • c93b848 [AUTO-BACKPORT release-0.38.0] 10213: fix: port k8s perf fix (#10220)
  • 0cc57df chore: backport 10208 to release 0.38.0 (#10219)
  • 7d9c5ed [AUTO-BACKPORT release-0.38.0] 10216: fix: license check tests (#10217)
  • e2d8f47 [AUTO-BACKPORT release-0.38.0] 10206: ci: remove datadog from ci (#10214)
  • 9619dcf [AUTO-BACKPORT release-0.38.0] 10211: chore: fix license check (#10215)
  • 332cefc [AUTO-BACKPORT release-0.38.0] 10207: fix: revert: fix: resolve indefinitely queued (STOPPING_COMPLETED) trials (#10210)
  • e693655 [AUTO-BACKPORT release-0.38.0] 10203: revert: log search (#10205)
  • 50b7690 chore: 0.38.0 environment images (#10197)
  • bb6f140 [AUTO-BACKPORT 10160] fix: maxPoolSlotCapacity bug (#10195)
  • 7db183e [AUTO-BACKPORT 10182] docs: docs changes for searcher context removal (#10194)
  • 23f9793 [AUTO-BACKPORT 10192] fix: keras continue from cloud checkpoint (#10193)
  • 508d400 [AUTO-BACKPORT 10174] docs: update docs for non-Trial-centric world (#10186)
  • 87f5ff8 [AUTO-BACKPORT 10188] fix: include max_length in continue expconf (#10190)
  • e725918 [AUTO-BACKPORT 10183] docs: fix typos in the release note (#10185)
  • 23687db [AUTO-BACKPORT 10178] docs: known issue of tb_plugin (#10181)
  • 5427a68 [AUTO-BACKPORT 10172] fix: ban archive columns in filter for experiment/search search (#10176)
  • 88c8887 [AUTO-BACKPORT 10173] fix: client.logout() re-enables client.login() (#10177)
  • 42f74e6 [AUTO-BACKPORT 10168] chore: ignore test_e2e_longrunning tests when merging auto-backports (#10179)
  • 020fc43 [AUTO-BACKPORT 10161] fix: fix diffusion example [DET-10470] (#10169)
  • c69aa68 [AUTO-BACKPORT 10140] fix: set max slots and checkpoint gc policy should comply with config policies (#10167)
  • b5e6315 fix: set max slots and checkpoint gc policy should comply with config policies (#10140)
  • 8e6a658 [AUTO-BACKPORT 10105] chore: change det deploy aws's default deployment type to simple-rds (#10162)
  • 6fc6710 [AUTO-BACKPORT 10153] docs: checkpoint storage note for config policies (#10165)
  • b366f80 [AUTO-BACKPORT 10138] feat: determined_master_host and friends helm support, better defaults (#10159)
  • d8afc57 [AUTO-BACKPORT 10155] fix: fix iris example to use reported metric name (#10156)
  • 38ae54b [AUTO-BACKPORT 10149] fix: error message fix for duplicate model name (#10154)
  • 47ba6a9 build: INFENG-943: GoReleaser configure prerelease (#10146)
  • aad58c1 build: INFENG-942: Conditionally bypass build-react job checks (#10145)
  • d7f0bbf chore: lock published urls to preserve redirects
  • e3c31f0 Temporarily disable GitHub Actions credentials.
  • 3be954b build: INFENG-938: Update version format in Makefiles (#10142)
  • 69b93b0 build: INFENG-940: Fix logic error in CircleCI config make-component job (#10143)
  • 00870f5 build: INFENG-937: Publish Helm chart release candidates (#10141)
  • 3910426 feat: remove searcher context from harness and master [MD-498] (#10131)
  • 27bebdd build: INFENG-938: Tweak version string format (#10139)
  • 30ad3c0 feat: add master configurations for access token max and default lifespans [DET-10464] (#10101)
  • 782f7a0 revert: "chore: determined_master_host and friends helm support, better defaults" (#10134)
  • 233e095 chore: add checkpoint and max slots config policy enforcements in PATCH experiment (#10125)
  • b3f928b chore: determined_master_host and friends helm support, better defaults (#10092)
  • 6755467 chore: bump Go version used by CI builds to 1.22.8 (#10127)
  • 834eeda feat: add actual select all to glide tables [ET-238] (#10081)
  • c7e0fb5 docs: add log signal release note and update docs (#10126)
  • 02fcc74 test: Add test for filtering user by Role Id (#10095)
  • f97fb5a build: INFENG-933: add GitHub action to start a minor release (#10112)
  • 685918d docs: Add aurora postgres release note (#10115)
  • a84f8c6 chore: SSO improvement feature requires Enterprise Edition. (#10124)
  • c71617c feat: Log Signal Exp Config and Monitoring (#9947)
  • 06b0b31 chore: fix merge exp flake (#10122)
  • 962810a chore: improve messaging when workspace configs conflict with global … (#10121)
  • 6158ef7 docs: Update postgres aurora info (#10116)
  • 4b0c065 docs: log policies restore exp config (#10120)
  • 186962c chore: add config policies to CLI reference docs (#10118)
  • 11ea6f4 chore: clarify version overrides during helm installs (#10094)
  • 4394f29 chore: standardize status api errors for task config policies (#10119)
  • e834302 fix: Add on delete cascade to system_metrics (#10113)
  • 3c59233 chore: populate final merged config with defaults when merging invariant configs (#10107)
  • deb3772 feat: additional APIs to support "actual select all" functions [ET-238] (#10102)
  • fd9cd8a feat: Allow master configuration for ssh key type (#10072)
  • 5e9df7c docs: Update release notes (#10114)
  • c655f33 docs: fix internal link in multi-rm docs page. (#10074)
  • e7186fe docs: Update log policies (#10098)
  • 993296b fix: update copy in experiment and trial headers (#10111)
  • d74a462 docs: Describe sso improvements (#10110)
  • 24d3390 chore: conditionally create VolumeSnapshotClass (#10103)
  • f45ebb9 chore: improve documentation surrounding slot caps helm configuration (#10090)
  • 0013fd0 ci: shorten test_pending_hpc.py (#10104)
  • 22ad457 fix: version upgrade notification bug [CM-411] (#10069)
  • 935fa66 fix: Log searche feedbacks (#10088)
  • 29a08ec Revert "docs: Describe arbitrary metadata logging" (#10099)
  • c6c476c chore: remove e2e_slurm_preemption test series (#10053)
  • e6182ed docs: Describe arbitrary metadata logging (#10073)
  • 539df5e chore: update CLI commands to work with global APIs (#10089)
  • 1f2bea0 feat: update ConfigPolicies with docs link [CM-558] (#10055)
  • 4afc15f build: INFENG-926: Fix version.sh version string output (#10085)
  • 04861dd chore: return error if workspace config violates global constraints (#10076)
  • 912f91e docs: task config policies release note (#10087)
  • 6d56101 fix: remove flake-inducing logretention global singleton (#10016)
  • b70a622 fix: correct token creation CLI to ensure it works with default expiry (#10084)
  • b155332 docs: Describe task config policies (#9969)
  • 27a014b fix: Tensorboard broken on unified install [CM-578] (#10080)
  • bdb56a4 chore: INFENG-922: use correct gh_team tag for infrastructure (#10077)
  • 91e358a INFENG-382: Release redesign (#10002)
  • 34e4749 chore: remove redundant rm.ExternalPreemptionPending interface (#10071)
  • 28bc072 feat: SSO Improvement - alter user_sessions table to include access token, implement CRUD ops, GET, POST, PATCH APIs and det token CLIs (#9867)
  • 472baf9 feat: Add copy task id to task list (#10058)
  • 2e822b7 chore: fix update invariant config and constraints (#10078)
  • d69f7cc chore(deps): bump google.golang.org/grpc from 1.64.0 to 1.64.1 (#9910)
  • e796b92 fix: run checkpoint GC more aggressively to ensure tensorboards are GC'd (#10017)
  • a14525f fix: nil deref in usage of incomplete experiment config policies (#10068)
  • 6c46a46 refactor: remove annotations requiring search ids in bulk action js (ET-241) (#10062)
  • 3ca3418 Docs: describe data files apptainer (#10020)
  • 315f65d chore: ntsc config not supported (#10056)
  • 2e8de9b test: User Management test updates [CM-468] (#10051)
  • 3fc9fed chore: experiment config slots to comply with constraint max slots (#10054)
  • 1d5c984 chore: fix slices and maps merge test (#10063)
  • 219409b chore: fix helptext for det user (#10060)
  • 7d6a1a7 docs: add k8s RP example to the helm values.yaml. (#10027)
  • 9efd96d fix: apply config policy constraints to PATCH /experiments/:id (#10048)
  • dd6aeda chore: change error code back (#10042)
  • 5a39ecb chore: check config policies on 'det notebook set priority' (#10047)
  • 2ef2f12 feat: bulk actions matching filters (ET-241) (#9895)
  • ac82b3c chore: default priority earlier to ensure constraints are satisfied [CM-553] (#10043)
  • 34557ef feat: Extend LogViewer to support scrollable search (#10005)
  • dadf75e chore: take invariant_config priority into account with manage job workflow (#10025)
  • 2356f91 chore: remove e2e_slurm_misconfigured series tests (#10023)
  • b243c26 ci: deflake test_disable_agent_zero_slots (#10040)
  • 4e0f1c4 chore: validate global, admin input against task config policies & constraints (#10028)
  • 3c1630f test: add e2e tests to the "move project" functionality on the "List View" (#10037)
  • 0613cc6 docs: revise postgres permission setup instructions. (#10039)
  • 2594d90 chore: remove e2e_slurm_gpu series tests (#10021)
  • 1f7ccad chore: exp invariant config silent override during add or update (#10019)
  • 30b197d feat: Global Config Policies UI [CM-522] (#10022)
  • c27054d feat: add e2e tests for multi-sort filter on experiments lista (#9992)
  • 9faa0cb chore: wait_for_task_state shows logs on failure (#10029)
  • a166826 fix: Workspace Projects and Tasks test flakes [CM-554] (#10026)
  • 33dfdaf test: Workspace Models tests [CM-538] (#9998)
  • 7e8dbac fix: Update action bar row layout in UserManagement page (#9862)
  • ...

v0.38.0

22 Nov 21:17
7154424

Release Notes

v0.38.0

Changelog

  • 7154424 chore: release notes 0.38.0 (#10231)
  • 13e49a7 [AUTO-BACKPORT release-0.38.0] 10226: chore: eliminate use of fury repo (#10229)
  • a554cd0 [AUTO-BACKPORT release-0.38.0] 10224: fix: make some k8s tests pass (#10228)
  • 0d373b1 [AUTO-BACKPORT release-0.38.0] 10221: fix: use new migration gist (#10222)
  • c93b848 [AUTO-BACKPORT release-0.38.0] 10213: fix: port k8s perf fix (#10220)
  • 0cc57df chore: backport 10208 to release 0.38.0 (#10219)
  • 7d9c5ed [AUTO-BACKPORT release-0.38.0] 10216: fix: license check tests (#10217)
  • e2d8f47 [AUTO-BACKPORT release-0.38.0] 10206: ci: remove datadog from ci (#10214)
  • 9619dcf [AUTO-BACKPORT release-0.38.0] 10211: chore: fix license check (#10215)
  • 332cefc [AUTO-BACKPORT release-0.38.0] 10207: fix: revert: fix: resolve indefinitely queued (STOPPING_COMPLETED) trials (#10210)
  • e693655 [AUTO-BACKPORT release-0.38.0] 10203: revert: log search (#10205)
  • 50b7690 chore: 0.38.0 environment images (#10197)
  • bb6f140 [AUTO-BACKPORT 10160] fix: maxPoolSlotCapacity bug (#10195)
  • 7db183e [AUTO-BACKPORT 10182] docs: docs changes for searcher context removal (#10194)
  • 23f9793 [AUTO-BACKPORT 10192] fix: keras continue from cloud checkpoint (#10193)
  • 508d400 [AUTO-BACKPORT 10174] docs: update docs for non-Trial-centric world (#10186)
  • 87f5ff8 [AUTO-BACKPORT 10188] fix: include max_length in continue expconf (#10190)
  • e725918 [AUTO-BACKPORT 10183] docs: fix typos in the release note (#10185)
  • 23687db [AUTO-BACKPORT 10178] docs: known issue of tb_plugin (#10181)
  • 5427a68 [AUTO-BACKPORT 10172] fix: ban archive columns in filter for experiment/search search (#10176)
  • 88c8887 [AUTO-BACKPORT 10173] fix: client.logout() re-enables client.login() (#10177)
  • 42f74e6 [AUTO-BACKPORT 10168] chore: ignore test_e2e_longrunning tests when merging auto-backports (#10179)
  • 020fc43 [AUTO-BACKPORT 10161] fix: fix diffusion example [DET-10470] (#10169)
  • c69aa68 [AUTO-BACKPORT 10140] fix: set max slots and checkpoint gc policy should comply with config policies (#10167)
  • b5e6315 fix: set max slots and checkpoint gc policy should comply with config policies (#10140)
  • 8e6a658 [AUTO-BACKPORT 10105] chore: change det deploy aws's default deployment type to simple-rds (#10162)
  • 6fc6710 [AUTO-BACKPORT 10153] docs: checkpoint storage note for config policies (#10165)
  • b366f80 [AUTO-BACKPORT 10138] feat: determined_master_host and friends helm support, better defaults (#10159)
  • d8afc57 [AUTO-BACKPORT 10155] fix: fix iris example to use reported metric name (#10156)
  • 38ae54b [AUTO-BACKPORT 10149] fix: error message fix for duplicate model name (#10154)
  • 47ba6a9 build: INFENG-943: GoReleaser configure prerelease (#10146)
  • aad58c1 build: INFENG-942: Conditionally bypass build-react job checks (#10145)
  • d7f0bbf chore: lock published urls to preserve redirects
  • e3c31f0 Temporarily disable GitHub Actions credentials.
  • 3be954b build: INFENG-938: Update version format in Makefiles (#10142)
  • 69b93b0 build: INFENG-940: Fix logic error in CircleCI config make-component job (#10143)
  • 00870f5 build: INFENG-937: Publish Helm chart release candidates (#10141)
  • 3910426 feat: remove searcher context from harness and master [MD-498] (#10131)
  • 27bebdd build: INFENG-938: Tweak version string format (#10139)
  • 30ad3c0 feat: add master configurations for access token max and default lifespans [DET-10464] (#10101)
  • 782f7a0 revert: "chore: determined_master_host and friends helm support, better defaults" (#10134)
  • 233e095 chore: add checkpoint and max slots config policy enforcements in PATCH experiment (#10125)
  • b3f928b chore: determined_master_host and friends helm support, better defaults (#10092)
  • 6755467 chore: bump Go version used by CI builds to 1.22.8 (#10127)
  • 834eeda feat: add actual select all to glide tables [ET-238] (#10081)
  • c7e0fb5 docs: add log signal release note and update docs (#10126)
  • 02fcc74 test: Add test for filtering user by Role Id (#10095)
  • f97fb5a build: INFENG-933: add GitHub action to start a minor release (#10112)
  • 685918d docs: Add aurora postgres release note (#10115)
  • a84f8c6 chore: SSO improvement feature requires Enterprise Edition. (#10124)
  • c71617c feat: Log Signal Exp Config and Monitoring (#9947)
  • 06b0b31 chore: fix merge exp flake (#10122)
  • 962810a chore: improve messaging when workspace configs conflict with global … (#10121)
  • 6158ef7 docs: Update postgres aurora info (#10116)
  • 4b0c065 docs: log policies restore exp config (#10120)
  • 186962c chore: add config policies to CLI reference docs (#10118)
  • 11ea6f4 chore: clarify version overrides during helm installs (#10094)
  • 4394f29 chore: standardize status api errors for task config policies (#10119)
  • e834302 fix: Add on delete cascade to system_metrics (#10113)
  • 3c59233 chore: populate final merged config with defaults when merging invariant configs (#10107)
  • deb3772 feat: additional APIs to support "actual select all" functions [ET-238] (#10102)
  • fd9cd8a feat: Allow master configuration for ssh key type (#10072)
  • 5e9df7c docs: Update release notes (#10114)
  • c655f33 docs: fix internal link in multi-rm docs page. (#10074)
  • e7186fe docs: Update log policies (#10098)
  • 993296b fix: update copy in experiment and trial headers (#10111)
  • d74a462 docs: Describe sso improvements (#10110)
  • 24d3390 chore: conditionally create VolumeSnapshotClass (#10103)
  • f45ebb9 chore: improve documentation surrounding slot caps helm configuration (#10090)
  • 0013fd0 ci: shorten test_pending_hpc.py (#10104)
  • 22ad457 fix: version upgrade notification bug [CM-411] (#10069)
  • 935fa66 fix: Log searche feedbacks (#10088)
  • 29a08ec Revert "docs: Describe arbitrary metadata logging" (#10099)
  • c6c476c chore: remove e2e_slurm_preemption test series (#10053)
  • e6182ed docs: Describe arbitrary metadata logging (#10073)
  • 539df5e chore: update CLI commands to work with global APIs (#10089)
  • 1f2bea0 feat: update ConfigPolicies with docs link [CM-558] (#10055)
  • 4afc15f build: INFENG-926: Fix version.sh version string output (#10085)
  • 04861dd chore: return error if workspace config violates global constraints (#10076)
  • 912f91e docs: task config policies release note (#10087)
  • 6d56101 fix: remove flake-inducing logretention global singleton (#10016)
  • b70a622 fix: correct token creation CLI to ensure it works with default expiry (#10084)
  • b155332 docs: Describe task config policies (#9969)
  • 27a014b fix: Tensorboard broken on unified install [CM-578] (#10080)
  • bdb56a4 chore: INFENG-922: use correct gh_team tag for infrastructure (#10077)
  • 91e358a INFENG-382: Release redesign (#10002)
  • 34e4749 chore: remove redundant rm.ExternalPreemptionPending interface (#10071)
  • 28bc072 feat: SSO Improvement - alter user_sessions table to include access token, implement CRUD ops, GET, POST, PATCH APIs and det token CLIs (#9867)
  • 472baf9 feat: Add copy task id to task list (#10058)
  • 2e822b7 chore: fix update invariant config and constraints (#10078)
  • d69f7cc chore(deps): bump google.golang.org/grpc from 1.64.0 to 1.64.1 (#9910)
  • e796b92 fix: run checkpoint GC more aggressively to ensure tensorboards are GC'd (#10017)
  • a14525f fix: nil deref in usage of incomplete experiment config policies (#10068)
  • 6c46a46 refactor: remove annotations requiring search ids in bulk action js (ET-241) (#10062)
  • 3ca3418 Docs: describe data files apptainer (#10020)
  • 315f65d chore: ntsc config not supported (#10056)
  • 2e8de9b test: User Management test updates [CM-468] (#10051)
  • 3fc9fed chore: experiment config slots to comply with constraint max slots (#10054)
  • 1d5c984 chore: fix slices and maps merge test (#10063)
  • 219409b chore: fix helptext for det user (#10060)
  • 7d6a1a7 docs: add k8s RP example to the helm values.yaml. (#10027)
  • 9efd96d fix: apply config policy constraints to PATCH /experiments/:id (#10048)
  • dd6aeda chore: change error code back (#10042)
  • 5a39ecb chore: check config policies on 'det notebook set priority' (#10047)
  • 2ef2f12 feat: bulk actions matching filters (ET-241) (#9895)
  • ac82b3c chore: default priority earlier to ensure constraints are satisfied [CM-553] (#10043)
  • 34557ef feat: Extend LogViewer to support scrollable search (#10005)
  • dadf75e chore: take invariant_config priority into account with manage job workflow (#10025)
  • 2356f91 chore: remove e2e_slurm_misconfigured series tests (#10023)
  • b243c26 ci: deflake test_disable_agent_zero_slots (#10040)
  • 4e0f1c4 chore: validate global, admin input against task config policies & constraints (#10028)
  • 3c1630f test: add e2e tests to the "move project" functionality on the "List View" (#10037)
  • 0613cc6 docs: revise postgres permission setup instructions. (#10039)
  • 2594d90 chore: remove e2e_slurm_gpu series tests (#10021)
  • 1f7ccad chore: exp invariant config silent override during add or update (#10019)
  • 30b197d feat: Global Config Policies UI [CM-522] (#10022)
  • c27054d feat: add e2e tests for multi-sort filter on experiments lista (#9992)
  • 9faa0cb chore: wait_for_task_state shows logs on failure (#10029)
  • a166826 fix: Workspace Projects and Tasks test flakes [CM-554] (#10026)
  • 33dfdaf test: Workspace Models tests [CM-538] (#9998)
  • 7e8dbac fix: Update action bar row layout in UserManagement page (#9862)
  • 5b1380c chore: check experiment constraints (#10018)
  • f609a2d fix: remove formatDatetime (#10011)
  • 9b6f0ac docs: Update release notes date (#9999)
  • f5400ea feat: Add regex search to task logs API (#9994)
  • ddca766 fix: correct expToWebhookConfig cache locking (#10014)
  • 80b29fa feat: Config Policies UI, Workspaces Experiments [CM-521] (#10009)
  • 262b4a9 chore: check task conf...

0.35.1

09 Nov 01:04

Release Notes

0.35.1

Changelog

  • 9d4bed2 chore: bump version: 0.35.1-rc0 -> 0.35.1
  • 46b3761 fix: perf issue with too many API reqs when listing pods in all ns (#10202)
  • 5b03599 chore: bump version: 0.35.0 -> 0.35.1-rc0
  • 4182da4 chore: bump current environment image versions to 0.35.1

0.37.0

30 Sep 15:28

Release Notes

0.37.0

Changelog

  • c415087 chore: bump version: 0.37.0-rc4 -> 0.37.0
  • 736fba6 docs: add release notes for 0.37.0 (#9995)
  • 73dee98 docs: fix broken links (#9996)
  • ecf8ac7 chore: bump version: 0.37.0-rc3 -> 0.37.0-rc4
  • 1b50305 fix: fix default id search for runs (#9988)
  • 0990c11 chore: bump version: 0.37.0-rc2 -> 0.37.0-rc3
  • a78b190 fix: fix hf on_save raise exception (#9977)
  • 0560939 fix: bring in handleEmptyCell from #9963 (#9984)
  • 7caf18a chore: bump version: 0.37.0-rc1 -> 0.37.0-rc2
  • 08d782a fix: show search progress in run table (#9976)
  • 478c78f fix: Cluster page height (#9975)
  • 2772a3c fix: correct dataPath for hyperparameters (#9971)
  • 94f2d95 chore: bump version: 0.37.0-rc0 -> 0.37.0-rc1
  • 63e7df0 chore: 0.37.0 environment images (#9967)
  • b2267d1 chore: bump version: 0.37.0-dev0 -> 0.37.0-rc0
  • f758303 chore: lock published urls to preserve redirects
  • 2a8e7dd chore: lock api state for backward compatibility check
  • 3f54d07 chore: bump version: 0.36.1-dev0 -> 0.37.0-dev0
  • baf451f chore: do not log error for resource pools with zero agents (#9960)
  • 6a8606e docs: Add hpc installation guide (#9945)
  • 3241edb fix: fix flaky generic task pause test (#9962)
  • 43556e9 fix: Remove CSS rule for hiding the Form.Item error message (#9872)
  • 5906001 perf: improve the initial page load speed (#9939)
  • eb1b0de docs: Add workload alerting (#9938)
  • cedfcfe chore: refactor and test RBAC config policies work [CM-530] (#9943)
  • 2d884b9 docs: Add cluster overview (#9936)
  • e17d12c feat: release notes and improvements for workload alerting (#9944)
  • 0db2e3b ci: deflake make slurmcluster, hopefully (#9957)
  • 95f079d feat: add GET global config policies API (#9952)
  • d943d85 chore: fix global PUT for task config policies (#9941)
  • 410edf6 fix: broken MNIST download in e2e tests (#9937)
  • 004c194 ci: fix flaky test_allocation_csv tests (#9953)
  • 88a4c67 feat: add Config Policies GET API and modify CRUD functions to accept both Workload types (#9946)
  • a73c8db test: debug auth [TESTENG-95] (#9942)
  • 13db674 test: experiment list show archived filter [ET-753] (#9932)
  • 02e302f chore: remove unused languages from code editor (#9898)
  • f6d874d docs: Replace slack links (#9919)
  • 26b0954 chore: implement Delete config policies API handlers (#9927)
  • 2d12be1 test: add projects tests [CM-467] (#9928)
  • 062cb52 fix: use different modules for Trial and Cluster topology (#9917)
  • 0928958 chore: change log level for log retention policies (#9935)
  • b559467 chore: bump coverage target (#9920)
  • 3a2ea56 fix: do not filter slots for mixed-slot-type pools (#9902)
  • a58ed7c chore: reassign RM code to CM in CODEOWNERS (#9926)
  • cb3515e fix: update LogRetentionDays from master config when master starts/upgrades (#9930)
  • 13b7b3f ci: increase timeout for k8s intg tests (#9929)
  • 6f36969 fix: flaky workspace test (#9931)
  • 867eb31 fix: update huggingface example (#9925)
  • 5b2275f fix: Refactor sorting logic in WorkspaceProjects for filtering projects (#9903)
  • fd7f77a fix: move validation dataloader check in PyTorchTrial [MD-515] (#9923)
  • db2881f chore: fix config policy unmarshal tests (#9924)
  • 3900742 chore: update test log pattern webhook cache (#9922)
  • f44687d chore: create config policies table and add NTSC CRUD operations (#9915)
  • de89f68 feat: support updating web hook url [MD-482] (#9890)
  • 02fbdbb fix: huggingface callback raise process preempted exception (#9913)
  • 8c799b8 chore: prune cruft out of no_op fixture (#9912)
  • 11de119 chore(deps): bump path-to-regexp and express in /webui/react (#9909)
  • 03961b5 test: add workspace tests (#9905)
  • c877383 fix: GetTrialRemainingLogRetentionDays should take global log retention days into account [CM-518] (#9914)
  • fb0d5f9 fix: change workspace name and set resource quota simultaneously (#9847)
  • 8fb9f6b docs: Update ROCM support (#9893)
  • 481bddb chore(deps): bump github.com/docker/docker from 24.0.9+incompatible to 25.0.6+incompatible (#9780)
  • c1499ac chore: removing model_hub references from Makefile (#9901)
  • c961dbd feat: new run object for Run Centric API (#9897)
  • bfeb418 feat: Implement custom trigger for webhooks (#9879)
  • b6eb05e chore: Remove model hub (#9869)
  • 4a28c10 chore: add unmarshal functions for task config policies (#9896)
  • d842383 fix: timezone handling error in queued allocation time update (#9892)
  • 55b3f9b test: cover project id filtering on bulk actions [ET-138] (#9870)
  • 036477b chore: stub new APIs for task config policies [CM-485] (#9880)
  • be2622a test: Delete workspace after webhook test (#9891)
  • a30bc25 feat: Add rbac for config policies (#9873)
  • 8c83d31 chore: create WorkloadType enum and Go config + constraints structs (#9885)
  • 0a18c5a fix: add backwards compatibility for Pods to Jobs for k8s <v1.27 [CM-461] (#9878)
  • 8e6bba8 ci: fix master-config syntax (#9889)
  • d5d647a fix: inconsistent timezone handling in daily allocation aggregation (#9888)
  • b4209ef test: login redirect with nested route (#9881)
  • 8cacba6 ci: add e2e bulk kill test (#9868)
  • 590c362 fix: Hf callback metric naming (#9887)
  • 61fd26b fix: reset Model Registry page number on pageload [ET-640] (#9876)
  • ce27f81 fix: show - for empty data in run table (#9871)
  • b1c0814 fix: prevent hyperparameter search modal submitting the same request multiple times (#9883)
  • d54713c fix: use new ruamel yaml APIs (#9882)
  • ad5fe5a fix: prevent out of bounds navigation on new list views (#9875)
  • a605f00 fix: reject reconnecting agents with different resource pool configuration (#9815)
  • db92bad feat: Support RBAC in webhook (#9859)
  • 0ef81aa fix: sorting by arbitrary metadata (#9874)
  • c1b7767 feat: Auto-Populate POSIX Information on sign in using SSO [CM-399] (#9755)
  • 54b6165 feat: Logic of different modes for webhook (#9865)
  • a773551 fix: allow for objects inside array metadata to be typed properly (#9864)
  • ee269c8 test: successful login with weak or strong password (#9858)
  • e21fc6f ci: pin chromadb version to avoid incompatibility (#9849)
  • a1234a1 chore: bump version: 0.36.0-dev0 -> 0.36.1-dev0
  • d79c90d chore: add docs dropdown link for new version
  • ce6da74 docs: add release notes for 0.36.0 (#9854)
  • a55af74 fix: use task sessions in Core API [MD-509] (#9860)
  • 3ee88bb fix: replace tree with code mirror for metadata view (#9853)
  • 8dd46d5 chore: Improve CompareTrials perfomance (#9807)
  • 6e08303 fix: fix error toast popping up in Workpace Creator view (#9855)
  • fb95df8 chore: add backport github action (#9835)
  • a37e6e7 fix: prevent loading issues with ipynb files (#9850)
  • 9de4f72 feat: configurable preemption timeout [MD-500] (#9833)
  • 640126b feat: Add workspaceId, mode, name to webhook (#9820)
  • d436c23 fix: reset pinned column state when resetting columns (#9852)
  • 3a91552 fix: fix fallback logic for partially provided custom logos (#9842)
  • 707ad07 Revert "chore: add tracing info to some backend APIs" (#9843)
  • 73a756a fix: update broken tensorflow & certbot links (#9846)
  • 771bbe4 ci: sequential metric count sweep test [Scale-35] (#9791)
  • 32fafdd perf: remove duplicate ids in ExpMetricNames api (#9848)
  • a8fa015 docs: Fix broken links (#9845)
  • 2b1856a fix: model version name overflow on mobile [ET-384] (#9827)
  • e13de20 docs: Document rbac editorprojectrestricted role (#9844)
  • 2838af4 chore: add tracing info to some backend APIs (#9841)
  • e3dfb0a fix: change filter form to say "Show runs" in flat runs view [ET-740] (#9840)
  • 52f2b9f chore: add release notes for PR 9822 (#9837)
  • a37d482 fix: experiment single trial tabs don't scroll on load (#9831)
  • aff486c feat: Rocm bumpenvs (#9830)
  • 13622ad feat: Add report_progress to TrainContext (#9826)
  • d831461 fix: replace rawsource attribute with node directly, due to removal of rawsource in Docutil 2.0 (#9838)
  • 7ed9e83 feat: add EOL notice regarding Aurora V1 & Postgres 12 along with Master Log warnings for Postgres <=12 [CM-413] [CM-416] (#9832)
  • 5c5f107 docs: Minor docs enhancements (#9836)

0.36.0

23 Aug 21:25

Release Notes

0.36.0

Changelog

  • c349314 chore: bump version: 0.36.0-rc7 -> 0.36.0
  • 39db2a8 docs: add release notes for 0.36.0 (#9854)
  • 61538a2 chore: bump version: 0.36.0-rc6 -> 0.36.0-rc7
  • 9494823 fix: fix error toast popping up in Workpace Creator view (#9855)
  • bd33228 chore: bump version: 0.36.0-rc5 -> 0.36.0-rc6
  • fa155de chore: bump version: 0.36.0-rc4 -> 0.36.0-rc5
  • 9332ab9 chore: 0.36.0 environment images (#9851)
  • 838cafe Revert "chore: add tracing info to some backend APIs" (#9843)
  • 1e2447d chore: bump version: 0.36.0-rc3 -> 0.36.0-rc4
  • f70a03d fix: update broken tensorflow & certbot links (#9846)
  • e3695a9 perf: remove duplicate ids in ExpMetricNames api (#9848)
  • 101441d docs: Fix broken links (#9845)
  • 8e28493 docs: Document rbac editorprojectrestricted role (#9844)
  • 9e73cd3 chore: bump version: 0.36.0-rc2 -> 0.36.0-rc3
  • 8acaee5 chore: add tracing info to some backend APIs (#9841)
  • 46a400e fix: change filter form to say "Show runs" in flat runs view [ET-740] (#9840)
  • 119d544 chore: bump version: 0.36.0-rc1 -> 0.36.0-rc2
  • 5affb09 chore: add release notes for PR 9822 (#9837)
  • 21bc083 feat: Rocm bumpenvs (#9830)
  • 26f8ed2 chore: bump version: 0.36.0-rc0 -> 0.36.0-rc1
  • 89d5ddb fix: replace rawsource attribute with node directly, due to removal of rawsource in Docutil 2.0 (#9838)
  • d58ff68 feat: add EOL notice regarding Aurora V1 & Postgres 12 along with Master Log warnings for Postgres <=12 [CM-413] [CM-416] (#9832)
  • 4be07af docs: Minor docs enhancements (#9836)
  • 34b567e chore: bump version: 0.36.0-dev0 -> 0.36.0-rc0
  • e11629b chore: lock published urls to preserve redirects
  • 6e0b9d1 chore: lock api state for backward compatibility check
  • e1a2273 chore: bump version: 0.35.1-dev0 -> 0.36.0-dev0
  • 42c2efa docs: Docs cleanup (#9834)
  • 3ed0a39 docs: Make docs consistent with run centric ux (#9824)
  • a367cd0 chore: deprecate Custom Searcher [MD-504] (#9829)
  • f7846cb feat: allow users with role Viewer and above to view resource quotas (#9822)
  • 97353c9 fix: Group and User management (CM-436) (#9825)
  • 358ed28 fix: hide metadata section if there's no metadata (#9823)
  • 287f3be chore: unskip flaky test (#9819)
  • e85ac89 Clarify basic data lineage to mldm (#9828)
  • c0ca659 fix: checkpoint table action menu shouldn't vanish on polling [ET-277] (#9812)
  • 740b0e7 docs: Describe basic lineage steps (#9813)
  • e5d4b7f chore: initial k8s rocm support [CM-367] (#9794)
  • 9548790 chore: fix torch version to 2.2.2 for intel mac (#9821)
  • b2a82e8 chore: deprecate kubernetes priority w/ preemption scheduler (#9763)
  • 2002bf0 docs: Getting a list of files in a checkpoint (#9818)
  • 91d0b67 docs: Fix broken links (#9816)
  • e357849 fix: don't ignore failures during experiment shutdown (#9693)
  • 9b96416 test: add go unit tests for experiment bulk actions [ET-138] (#9658)
  • 92a7ff5 feat: support filter by metadata with string type (#9810)
  • 9da5620 feat: exclude Array type columns (#9808)
  • 79ffa52 chore: bump version: 0.35.0-dev0 -> 0.35.1-dev0
  • 9949ab0 chore: add docs dropdown link for new version
  • 261e2e7 docs: add release notes for 0.35.0 (#9786)
  • a11e9e8 chore(deps): bump torch from 1.11.0 to 2.3.0 (#9726)
  • bebaf17 fix: make navigation sidebar scrollable [ET-633] (#9803)
  • f7e18fc fix: prevent multiple calls to time-series on compare view select (#9805)
  • db98c4f ci: Add a portable testing framework and scalability tests [SCALE-29] (#9762)
  • 9702d22 fix: prevent extra initial calls to search endpoints (#9782)
  • 4e47a1e chore: change the comment for defaultNamespace in values.yaml (#9793)
  • d3f3e76 test: datagrid action pause flake (#9802)
  • 1f7473c fix: return proper error message when moving a project with a matching names (#9795)
  • 15d1a60 ci: fix scripting for make slurmcluster job (#9801)
  • 8173cab fix: forked from link (#9798)
  • c3400df feat: add editor project restricted role and testing [DET-10428] (#9796)
  • 2cb1022 test: base model package dependency update [TESTENG-59] (#9777)
  • 4f31942 test: omnibar tree-extension tests [ET-203] (#9783)
  • cdbbedd fix: don't filter single runs in the comparison view (#9789)
  • 80822eb ci: label make slurmcluster instances for cloud spend [CM-405] (#9792)
  • ea589d8 chore: fix readme typo (#9797)
  • 7b4f01c fix: Add loading indicator when creating HP search (#9774)
  • a4d74af chore: readme should include codecov (#9787)
  • 786f258 fix: uncomment helm values (#9790)
  • a034964 fix: fixed helm chart values and master-config.yaml (#9788)
  • fe14062 feat: show metadata in run table (#9776)
  • 2b589c4 feat: add array column type for abitrary metadata (#9759)
  • 094c58b test: skip flaky test (#9784)
  • 49c3fa0 chore: add a utility for connecting devcluster to remote k8s clusters (#9739)
  • 13ebf47 chore: add Cluster Name title and change helm value (#9775)
  • 61aad78 fix: fix contains filter for hyperparameters and metadata (#9779)
  • 15226b7 feat: add master config option to provide custom logo (#9664)
  • f42daca feat: make groups scope optional to support azure with OIDC (#9773)
  • 6105b3f docs: fix insecure link to systemd docs (#9772)
  • 068b959 feat: checkpoint view for flat runs [ET-658] (#9769)
  • dab6978 feat: add code tab to run page [ET-657] (#9771)
  • 2c91098 test: use previously created experiment for pause test (#9727)
  • 935799d fix: use run checkpoint data instead of experiment for run table filter (#9767)
  • 30d6e79 fix: extract searcher metric from experiment payload (#9768)
  • b8c6773 fix: fix missing task_stats start_time on restored allocation (#9745)
  • a094ea1 chore: pin numpy version and upgrade sphinx [MD-468] (#9736)
  • 0806597 feat: add Metadata section to TrialDetailsOverview (ET-224) (#9639)
  • 287faf7 chore: bumpenv pin numpy to 1.x [MD-470] (#9748)
  • becd8b6 chore: remove RM Name from RP descriptions (#9758)
  • fc8ac0b chore: undo test skip after fix was merged (#9754)
  • de898c9 Revert "chore: add configurable posix claims fields to master config [RM-398]" (#9753)
  • 623c945 fix: load trial data for single run searches in search view #9742 (#9752)
  • 41a512e fix: debounce searches column width settings #9700 (#9751)
  • bc721bf refactor: change 'close' to 'save' on button in ManageJob modal [DET-10446] (#9750)
  • 0ce2ff1 fix: change external_run_id to string type in FlatRun proto (#9749)
  • 20ed126 fix: reduce the number of api calls from Workspace Create Modal (#9735)
  • 61bc7bb chore: add configurable posix claims fields to master config [RM-398] (#9690)
  • 2cdfdf9 fix: change external_run_id to string type in FlatRun proto (#9744)
  • 36aaed7 chore: fine-tune error and help messages of CLI commands for slot caps (#9743)
  • 0df7ad3 test: workspace and project tests [TESTENG-60] (#9740)
  • e00d9f4 chore: add release note for ComparisonView bugfix (#9741)
  • 35ec077 chore: add 'masterService.annotations' to Helm (#9697)
  • 5f8dae3 chore: fix exp delete log msg (#9716)
  • 9dc0afa fix: deadlock issue (#9728)
  • f8067ba chore: skip failing Deactivate and reactivate user test (#9723)
  • 9efb216 feat: CLI command to list the members of a Workspace [RM-388] (#9686)
  • dc12336 chore: lengthen abbreviation to avoid ambiguity (#9733)
  • e3524b7 chore: add release notes for metrics fetching UI bug (#9737)
  • 719f8be chore: update copy when f_flat_runs is on (#9642)
  • 6c4f69b test: workspace and project api [TESTENG-46] (#9731)
  • 4aa6ffa docs: Add release docs for continue trial, edit hp search, resource a… (#9729)
  • d46d776 fix: use before/after search params for historic allocation CSV download endpoint [DET-10442] (#9730)
  • a32b010 fix: show selections in ComparisonView on any page (ET-189) (#9694)
  • 7260f04 chore: default flat runs to on (#9709)
  • 202ab62 fix: Endless fetching for cancelled experiment without metrics (#9714)
  • 4466c33 feat: change search-experiments from GET to POST [ET-602] (#9717)
  • 787a2f3 docs: Fix workspace cli doc (#9720)
  • c3ca1d4 docs: Describe link to mldm data (#9718)

0.35.0

08 Aug 18:10

Release Notes

0.35.0

Changelog

  • 7d1b0df chore: bump version: 0.35.0-rc20 -> 0.35.0
  • e770ee5 docs: add release notes for 0.35.0 (#9786)
  • 7f03a87 chore: bump version: 0.35.0-rc19 -> 0.35.0-rc20
  • 3c9a188 fix: prevent multiple calls to time-series on compare view select (#9805)
  • c65c6cd chore: bump version: 0.35.0-rc18 -> 0.35.0-rc19
  • da58c92 fix: prevent extra initial calls to search endpoints (#9782)
  • 8074fd9 chore: bump version: 0.35.0-rc17 -> 0.35.0-rc18
  • 6fed766 chore: change the comment for defaultNamespace in values.yaml (#9793)
  • 6d9b780 chore: bump version: 0.35.0-rc16 -> 0.35.0-rc17
  • f02b6b5 fix: forked from link (#9798)
  • 7928af1 chore: bump version: 0.35.0-rc15 -> 0.35.0-rc16
  • 1841a8e fix: don't filter single runs in the comparison view (#9789)
  • 0b89fad chore: bump version: 0.35.0-rc14 -> 0.35.0-rc15
  • c451482 fix: uncomment helm values (#9790)
  • f041440 chore: bump version: 0.35.0-rc13 -> 0.35.0-rc14
  • f144957 fix: fixed helm chart values and master-config.yaml (#9788)
  • bfe7912 chore: bump version: 0.35.0-rc12 -> 0.35.0-rc13
  • 2794cdc chore: add Cluster Name title and change helm value (#9775)
  • 720dcbb chore: bump version: 0.35.0-rc11 -> 0.35.0-rc12
  • 94f916d fix: fix contains filter for hyperparameters and metadata (#9779)
  • 501d45c chore: bump version: 0.35.0-rc10 -> 0.35.0-rc11
  • 44e4786 feat: checkpoint view for flat runs [ET-658] (#9769)
  • e1ff8bb chore: bump version: 0.35.0-rc9 -> 0.35.0-rc10
  • 5314c58 feat: add code tab to run page [ET-657] (#9771)
  • f23ca4c chore: bump version: 0.35.0-rc8 -> 0.35.0-rc9
  • 2b1f0e7 fix: use run checkpoint data instead of experiment for run table filter (#9767)
  • b750c30 fix: extract searcher metric from experiment payload (#9768)
  • 461434e chore: bump version: 0.35.0-rc7 -> 0.35.0-rc8
  • 5cb9a32 fix: fix missing task_stats start_time on restored allocation (#9745)
  • ca4df77 chore: bump current environment image versions to 0.35.0 (#9760)
  • b0b9d84 chore: bumpenv pin numpy to 1.x [MD-470] (#9748)
  • 375244d Revert "chore: 0.35.0 images (#9732)"
  • bd4af9d chore: remove RM Name from RP descriptions (#9758)
  • 22c5ae9 chore: bump version: 0.35.0-rc6 -> 0.35.0-rc7
  • 2b76ac8 refactor: change 'close' to 'save' on button in ManageJob modal [DET-10446] (#9746)
  • f739956 fix: load trial data for single run searches in search view (#9742)
  • d1520a4 chore: bump version: 0.35.0-rc5 -> 0.35.0-rc6
  • ed47fb0 fix: reduce the number of api calls from Workspace Create Modal (#9735)
  • e345871 fix: change external_run_id to string type in FlatRun proto (#9744)
  • 8ef93f8 chore: fine-tune error and help messages of CLI commands for slot caps (#9743)
  • c17fc72 chore: add release note for ComparisonView bugfix (#9741)
  • 1dbdf00 chore: bump version: 0.35.0-rc4 -> 0.35.0-rc5
  • 27b8dbd fix: deadlock issue (#9728)
  • f939bc4 chore: bump version: 0.35.0-rc3 -> 0.35.0-rc4
  • c554aec chore: lengthen abbreviation to avoid ambiguity (#9733)
  • e173bdb chore: add release notes for metrics fetching UI bug (#9737)
  • b6051af chore: update copy when f_flat_runs is on (#9642)
  • 1dddfa3 chore: bump version: 0.35.0-rc2 -> 0.35.0-rc3
  • 40e1f56 chore: 0.35.0 images (#9732)
  • f68e36e docs: Add release docs for continue trial, edit hp search, resource a… (#9729)
  • a341f9e chore: bump version: 0.35.0-rc1 -> 0.35.0-rc2
  • 9f5fefb fix: use before/after search params for historic allocation CSV download endpoint [DET-10442] (#9730)
  • 9947eba fix: show selections in ComparisonView on any page (ET-189) (#9694)
  • 76dc02f chore: bump version: 0.35.0-rc0 -> 0.35.0-rc1
  • 5e22807 chore: default flat runs to on (#9709)
  • f84d4eb fix: Endless fetching for cancelled experiment without metrics (#9714)
  • 556de9c feat: change search-experiments from GET to POST [ET-602] (#9717)
  • 8be3e85 docs: Fix workspace cli doc (#9720)
  • ca375ad docs: Describe link to mldm data (#9718)
  • d375364 chore: bump version: 0.35.0-dev0 -> 0.35.0-rc0
  • 408a609 chore: bump version: 0.34.1-dev0 -> 0.35.0-dev0
  • 24feb35 chore: add release notes for workspace slot caps (#9706)
  • 3fbc8c0 chore: lock published urls to preserve redirects
  • 248a1ba chore: lock api state for backward compatibility check
  • fbb5d24 test: skip flaky test until after release (#9719)
  • 12fc71c feat: add workspace namespace bindings and resource quotas to workspaces (#9180)
  • 769d600 docs: basic lineage release notes (#9708)
  • 0703e8a chore: saas authz changes for rbac (#9657)
  • bd59ced fix: disallow slots in exp config [MD-454] (#9698)
  • 3995b72 test: create Pause experiment action test [ET-644] (#9699)
  • 004d9d0 chore: fix makeslurm workload option reference (#9645)
  • 4f81548 fix: assign only run in a single run experiment as best_trial_id (#9051)
  • 543380d fix: show the correct empty page in flat run table when filters are applied (#9702)
  • d5b6181 docs: Describe workspace slot level caps (#9687)
  • 1255d07 feat: auto-redirect to SSO provider when expired remote session detected (DET-10392) (#9613)
  • 95c636c test: move wait statement to shared code (#9701)
  • def5e34 feat: obfuscate data.secrets if present in experiment config [DET-10232] (#9635)
  • 8776e35 fix: unmanaged experiment checkpoint storage path (#9625)
  • e000e70 docs: use postgres 14 in code snippets. (#9691)
  • 4a0eb5f fix: Remove experiments immediately after deleting by filtering out deleting experiments (#9688)
  • f672a88 fix: use existed templates when launching notebook (#9681)
  • f721751 test: experiment list sort [INFENG-766] (#9675)
  • ca90a63 fix: highlight Data Input link (#9676)
  • 160445a fix: require full object for debounced settings updates (#9682)
  • 88b93a2 fix: links for Recent Submissions on Dashboard (#9651)
  • 1c7f4b5 docs: Link to cluster info (#9633)
  • 5eb0bda ci: disable telemetry for ci tests (#9671)
  • 086de84 chore: fix e2e test Hf trainer searcher lengths (#9683)
  • c70dd8c fix: resolve indefinitely queued (STOPPING_COMPLETED) trials (#9605)
  • f4ecd91 chore: remove ContainerState from ResourcesStateChanged (#9680)
  • 3baece2 fix: default value in filter form (#9678)
  • 4bbde85 test: workspace spec rearrange + one more test case for now [TESTENG-24] (#9674)
  • 1500f39 docs: announce deprecation of Kubernetes priority scheduler (#9667)
  • ccac2c1 chore: add determined_master_scheme for K8s multirm (#9673)
  • c36705b chore: raise exception in HF trainer for mismatched train units [MD-456] (#9669)
  • e956f28 feat: Add selection label to FlatRuns page (ET-309) (#9670)
  • bbd6f8a test: collect detailed logs for tests in datadog[infeng-752] (#9637)
  • 274d763 fix: update sort settings (#9665)
  • 4b3a100 chore: Deprecate model hub (#9628)
  • c3e5211 fix: SearchDetails Pivot bug [ET-632] (#9656)
  • 0786e08 chore: bumpenvs for jupyter upgrades [MD-242] (#9660)
  • 6d1f778 docs: expand det abbreviation in docs (#9652)
  • 6528500 chore: remove singularity agent znode nightly test (#9659)
  • 6cb6a90 fix: user IDs instead of user session IDs for notebook sessions [MD-453] (#9627)
  • e4a9ae3 feat: add Metadata to project columns in Run Table (#9629)
  • 3663c5b fix: upload non-conflicting files for sharded checkpointing [MD-298] (#9598)
  • 4ece949 test: add test cases for filter group (#9647)
  • 02da2a2 chore: Improve core v2 init API [MD-441] (#9560)
  • 0494cdf fix: ensure historic usage charts assume the correct timezone (DET-10407) (#9650)
  • 9a8591f feat: Add resources allocation-csv to det cli [DFR-519] (#9649)
  • 3c310c7 ci: environment in e2e-slurm-enroot-znode [MD-451] (#9617)
  • 71a9c4b docs: Update historical csv release note (#9654)
  • c3e0a41 chore: deprecating container_runtime config, agentrm supporting singularity,podman, and apptainer (#9516)
  • 4eeb4db feat: add metadata filtering to SearchRuns (#9611)
  • 0709cab chore: hide the docker password in cloudformation stacks (#9641)
  • a85fd6d docs: Describe flat runs view (#9644)
  • e630bfb feat: support image pull secrets in genai (#9653)
  • 8379b13 feat: Add/remove HPs when creating experiment through HP search (#9610)
  • e9e4458 fix: allocation csv: gpu_hours -> slot_hours, add resource_pool [DET-10408] (#9616)
  • 6299dcd feat: add basic lineage MDLM link (#9482)
  • a498008 chore: send alerts can wait forever and fail for broken workflows (#9638)
  • 4ac569d ci: remove deprecated label from agent config (#9648)
  • 2aa54e1 chore(deps): bump anchore/scan-action from 3 to 4 (#9634)
  • 7fab87b test: chore: fix some typos (#9646)
  • 53aa974 feat: switch det deploy aws and CI from m5.large to m6i.large (#9636)
  • 13aa327 fix: overflow in hyperparameter modal (#9626)
  • d4c50b5 chore: set cli pwd warning go to stderr (#9536)
  • d11c3ee docs: Update remote users auto redirect (#9623)
  • c0fc4c4 feat: add Search actions [ET-603] (#9622)
  • 1ec5d01 chore: deprecate job move within priority group (#9624)
  • d257b89 feat: add getMetadataValues for projects (#9618)
  • 55b6d25 chore: remove deprecated labels config option (#9609)
  • 3a8c042 feat: pause unpause UX (#9615)
  • 58fbf68 feat: continue trial from WebUI for multi-trial experiment (#9589)
  • 95d1d2f test: chore: add gratitude to test page model readme [INFENG-767] (#9621)
  • 547a4c4 test: Add e2e test for experiment edit (#9619)
  • 576e244 fix: column picker should effect pinned columns in compare view [ET-605] (#9608)
  • 262ad5a test: collect det job ci logs only in case of failure (#9537)
  • 3f7cad6 Revert "Docs/improve sample master yaml" (#9620)
  • c28545d fix: docker version bump to unpin requests (#9614)
  • 0a57cde Docs/improve sample master yaml (#9607)
  • 0e7b3ab docs: Describe supported k8s versions (#9277)
  • 2958d42 docs: Add link to FSDP example (#9606)
  • 000c679 ci: extend experiment timeout for slurm test (#9601)
  • a6a79b8 fix: comparison view parall...

0.34.0

28 Jun 20:09

Release Notes

0.34.0

Changelog

  • ede2396 chore: bump version: 0.34.0-rc12 -> 0.34.0
  • f0d825d chore: bump version: 0.34.0-rc11 -> 0.34.0-rc12
  • 1556c18 fix: Pause/Resume run test flake (#9592)
  • a74e389 docs: add release notes for 0.34.0 (#9561)
  • e5fc5f1 chore: bump version: 0.34.0-rc10 -> 0.34.0-rc11
  • a51a640 fix: edit/move modals for projects in workspaces unexpectedly closes [DET-10388] (#9588)
  • ce3ea17 chore: bump version: 0.34.0-rc9 -> 0.34.0-rc10
  • 5b40a5c chore: remove shared cluster test for circle ci (#9579)
  • 0b4dec4 chore: bump version: 0.34.0-rc8 -> 0.34.0-rc9
  • 01baf33 chore: Release 0.34.0 bumpenvs (#9578)
  • bad22b2 chore: add Nvidia drivers version matching test and bump env [MD-413] (#9567)
  • 9adbe7c Revert "chore: 0.34.0 bumpenvs (#9565)"
  • 60ada0c chore: bump version: 0.34.0-rc7 -> 0.34.0-rc8
  • cde8a18 fix: wrong notebook idleness payload [MD-447] (#9571)
  • 3f292f5 chore: bump version: 0.34.0-rc6 -> 0.34.0-rc7
  • f36b110 fix: correct workspace_id column type on allocation_workspace_info (#9574)
  • f0f45f8 chore: bump version: 0.34.0-rc5 -> 0.34.0-rc6
  • a6c7918 fix: persist workspace id/name & experiment id for historic allocations [DET-10378] (#9550)
  • f66e816 chore: bump version: 0.34.0-rc4 -> 0.34.0-rc5
  • eaabab1 fix: add validation to patching project key (ET-305)
  • d8b80ad fix: do not modify cached GetAgentsResponse (#9569)
  • 25804d7 chore: bump version: 0.34.0-rc3 -> 0.34.0-rc4
  • ead2232 fix: return workspace name for breadcrumb in Project Details page (#9564)
  • 8da67d2 chore: 0.34.0 bumpenvs (#9565)
  • 2677dc2 chore: bump version: 0.34.0-rc2 -> 0.34.0-rc3
  • 42bea1a chore: fix boto3 requirement syntax (#9551)
  • b2c7e22 chore: bump version: 0.34.0-rc1 -> 0.34.0-rc2
  • 2f1283d fix: hide warning for weak password unless it actually applies [DET-10216] (#9538)
  • ca208b9 chore: bump version: 0.34.0-rc0 -> 0.34.0-rc1
  • f15bda8 feat: det deploy local generates a password for you [DET-10197] (#9518)
  • abaf2e3 chore: bump version: 0.34.0-dev0 -> 0.34.0-rc0
  • 0cf7aba chore: lock published urls to preserve redirects
  • cd85b44 chore: lock api state for backward compatibility check
  • 25b6299 chore: bump version: 0.33.1-dev0 -> 0.34.0-dev0
  • 83b9a8b feat: add connect modal for notebook and shell tasks [MD-404] (#9545)
  • b9ea173 chore: Bumpenvs 8c90e80 (#9544)
  • f9a5dd5 fix: update getProjectColumns calls (ET-270) (#9509)
  • 325d47e pre-commit lint check fix (#9543)
  • 553521e feat: enable token auth for Jupyter notebooks [MD-404] (#9452)
  • ea929fc test: det framework supports "nth" component [testeng-1] (#9540)
  • 7568129 docs: address two link check failures (#9539)
  • 3641bfc feat: support proxied Determined tasks on remote k8s clusters (#9469)
  • 44f446c fix: Huggingface Trust Remote Repo (#9535)
  • 3320107 chore: allow empty run metadata requests to delete existing metadata (#9524)
  • 8006e2e fix: localize debounced settings updates (#9513)
  • fec31a1 chore: handle empty nested structs in run metadata as nil leaf nodes (#9526)
  • 88b01c6 refactor: remove DataGrid pagination code (ET-259) (#9520)
  • edbeee9 test: increase timeouts for running experiments on k8s after env split (#9530)
  • 0f6eb24 fix: webui page height (#9527)
  • 1630c45 docs: Clarify startuphook (#9517)
  • 63a4163 feat: support node selectors & affinities for Kubernetes resource pools (#9428)
  • 6cd7d06 chore: Improving SearchRuns performance when doing hyperparameter filtering (#9489)
  • 4321143 ci: add new feature signoff checkbox [INFENG-710][skip ci] (#9410)
  • 10667f1 feat: remove round robin scheduler for agentrm (#9493)
  • 735fb2c chore: remove hyperparameters from projects table (#9504)
  • 66ec006 feat: warn users to change their passwords [DET-10216] (#9519)
  • 2bce8b6 fix: historical allocations not appearing (#9522)
  • f9ba7f4 fix: skip webhook regex matching for exp config (#9511)
  • b51bc93 docs: Fix broken links (#9523)
  • 9adc092 fix: partially scheduled k8s jobs should display as queued (#9468)
  • 32585ad feat: flat runs comparison view [ET-190] (#9477)
  • d44013c feat: add arbitrary metadata GET/POST endpoints (#9130)
  • 21ecda5 test: preparing a homework assignment [TESTENG-3] (#9510)
  • ee66d15 fix: allow doesnotcontains filters on hyperparameter column (#8842)
  • 382995c fix: historical allocations not showing task allocation workspace (#9496)
  • 8e9067b feat: Framework Splitting and Bumpenvs (#9457)
  • f0d26db ci: fix some failing long-running tests related to password requirements (#9421)
  • d0d30cf test: collect det task logs as artifacts for ci jobs (#9459)
  • e3d01c1 chore: remove debug logs that were accidentally committed (#9503)
  • 3857b7a test: upload unit and intg tests to datadog[infeng-501] (#9505)
  • 3afe5df chore: check non-multiples of slots per pod for kubernetes rm [MD-403] (#9393)
  • 2ca7733 fix: ensure number of project keys possible for testing is not exceeded (#9501)
  • 0f30189 docs: Update the URL to the genai docs (#9507)
  • 86e6b68 feat: add cluster-wide message (#9261)
  • e138267 fix: Searches view fixes (ET-297) (#9487)
  • aa6521b fix: run columns mismatching sort/filter columns for run table (#9479)
  • de03909 fix: use num pods in k8 job summary (#9497)
  • 439734b chore: avoid payload limitation (#9164)
  • dde6362 fix: Use experiment config to determine is_multi_trial in api_runs queries (#9475)
  • f87214b test: preparing a homework assignment [TESTENG-4] (#9495)
  • 27e7307 feat: add custom key to projects table, backfilling based on current project name, and API support (#9134)
  • a5cf959 fix: pin huggingface version to <0.23.0 (#9483)
  • 97667c5 tc: Test format (#9490)
  • ffee34f test: update playwright [TESTENG-4] (#9484)
  • 869b96a test: readme and test name revisions [TESTENG-5] (#9463)
  • 698ab6c test: docstring revisions [TESTENG-5] (#9478)
  • 96c061b ci: lower hf trainer accuracy target + improve failure messages (#9322)
  • 84299a6 chore: upgrade golangci linter to 1.57.2 (#9279)
  • 2588eea feat: Pause & Resume run (#9129)
  • df3919c docs: remove all references to PowerPC/PPC64 (#9476)
  • ca03da1 chore: switch to mockery config file (#9473)
  • 9160ae9 docs: correct release note for deprecating PPC64/POWER builds (#9470)
  • 418525e fix: convert invalid hparam types to json string (#9449)
  • 934aeb6 fix: job state shows as scheduled when resources are allocated (#9466)
  • 8d64508 feat: remove genai from experimental feature list and enable via /master feature switches [GAS-1016] (#9435)
  • 4d8596c chore: deprecate PPC64/POWER builds (#9467)
  • 13a5142 fix: Revert to get_checkpoints.sql call to enable NaN & Infinity values in searcher metric (#9440)
  • d50433d chore: no longer store ee artifacts in circleCI (#9426)
  • a45aa1e feat: add SearchDetails page (ET-53) (#9436)
  • 4c821c3 docs: clarify data collected by telemetry (#9445)
  • 57bece4 fix: job queue's allocated slots should be correct after restarts (#9461)
  • c49eeea test: datagrid tests [INFENG-687] (#9400)
  • 8a9839a feat: add option for Checkpoint_GC pod spec in task container defaults (#9406)
  • d960f29 chore: only connect to the database once (#9456)
  • 0fdb822 feat(rm): convert Kubernetes submissions from pods to jobs (#9438)
  • f54fb7c test: react test datadog integration [infeng-497] (#9455)
  • cc4ad2b docs: fix observability README docs link (#9453)
  • ca60325 chore: bump version: 0.33.0-dev0 -> 0.33.1-dev0
  • 4936847 chore: add docs dropdown link for new version
  • 7b81df7 docs: add release notes for 0.33.0 (#9444)
  • da2f943 feat: add heatmap to runs table [ET-230] (#9429)
  • 0599d0e test: create test users through the API [INFENG-673] (#9431)
  • ac459f7 docs: Add historical cluster usage warning (#9439)
  • cf22597 docs: update broken nvidia anchor link (#9441)
  • d94e299 fix: notify master for core checkpoint deletes [MD-325] (#9415)
  • b96ccba fix: dont utilize the default efs mount on normal aws deploys (#9437)
  • a0f2e33 fix: redirect on sso login (#9369)
  • 9abde37 chore: remove stdlib errors package from lint blocklist (#9381)
  • 515c135 fix: add Admin Settings to NavigationTabbar (ET-194) (#9423)
  • 00bbda6 fix: set the defaults for shared_fs mount in genai correctly (#9433)
  • 9d54093 chore: skip TestSchedule until flake is fixed (#9434)
  • 9524dd4 ci: use priority scheduler in e2e tests (#9430)
  • 58b31e6 docs: terraforming an EKS cluster with autoscaling and EFS. (#9427)
  • 8a6f571 docs: ignore anchor for observability links (#9412)
  • 684c38b fix: add feature gate for checking for blank admin/determined password [DET-10197] (#9425)
  • 6ad9d73 fix: Keep template modal open when config is invalid (#9424)
  • 3dfb9ec test: remove confusing, unused slurm-related ci code (#9417)
  • cdd7a82 test: ensure make unslurmcluster always runs in CI (#9420)
  • ba31f03 fix: reset InteractiveTable pagination when filters applied [ET-183] [ET-121] (#9413)
  • 3cbe805 fix: master checks db newness before migrating [DET-10312] (#9414)
  • da46208 fix: bulk action bug in the old experiment table that cannot trigger bulk actions across pages (#9404)
  • 657286c feat: Add Run columns to GetProjectColumns (#9146)

0.33.0

29 May 20:46

Release Notes

0.33.0

Changelog
