Releases · llm-d-incubation/workload-variant-autoscaler

04 Nov 18:26

mamy-CS

v0.0.3

13d1bdf

v0.0.3 Pre-release

What's Changed

Refactoring deployment scripts for Kubernetes and Kind by @WheelyMcBones in #226
Add prom queries for avg input tokens and ttft by @vishakha-ramani in #194
Enhance Helm chart security configuration and add development values by @mamy-CS in #242
Refactoring deployment scripts and integrating llm-d-inference-sim in emulated deployment environment by @WheelyMcBones in #237

Full Changelog: v0.0.2...v0.0.3

Contributors

mamy-CS, WheelyMcBones, and vishakha-ramani

24 Oct 17:11

mamy-CS

v0.0.2

b829546

v0.0.2 Pre-release

What's Changed

Update crd docs by @asm582 in #184
Update README.md to include ref to new home for container image by @clubanderson in #183
Update ci-release.yaml to do multi arch by @clubanderson in #182
Removing unsupported architectures from image multi-arch build by @WheelyMcBones in #192
Setting minNumReplicas to 1 by default by @WheelyMcBones in #189
Update ci-release.yaml to include 'latest' tag by @clubanderson in #193
Update QuickStart documentation to disable Rosetta on Apple Silicon by @myechuri in #195
Fix zero divide in queue analyzer by @atantawi in #190
add helm chart for ocp, supporting files, and instructions to install wva, prom adaptor, and llm-d in simple form by @clubanderson in #200
removed 'oc adm policy' (replaced with clusterrolebinding) and rollout commands - no longer needed by @clubanderson in #207
Deployment script and readme for wva + llmd on openshift by @mamy-CS in #203
updated files to consume llmd model name and modelID by @clubanderson in #209
Fixing tolerance function by @WheelyMcBones in #213
Remove resource accounting logic and disable limited mode by @mamy-CS in #210
E2es on openshift using sharegpt data by @mamy-CS in #201
remove epp from wva helm chart and instructions to delete it after wv… by @clubanderson in #215
Adding unit tests for the internal optimizer code by @WheelyMcBones in #204
Add metrics validation and health monitoring system with Kubernetes conditions by @mamy-CS in #214
refactor: Reorganize repository structure and documentation by @mamy-CS in #216
Changing OCP deployment script to use the Helm chart by @WheelyMcBones in #212
Documentation update by @mamy-CS in #223
Parameterizing OCP script by @WheelyMcBones in #218
Optimize VariantAutoscaling's owner setting by @learner0810 in #219
refactor: Externalize metric names and labels to constants package by @ev-shindin in #228
Enhancements to OC E2E Testing by @Vezio in #220
Helm Chart Refactoring by @Vezio in #222

New Contributors

@myechuri made their first contribution in #195
@learner0810 made their first contribution in #219
@Vezio made their first contribution in #220

Full Changelog: v0.0.1...v0.0.2

Contributors

clubanderson, atantawi, and 7 other contributors

23 Sep 18:18

asm582

v0.0.1

9abd73c

v0.0.1 Pre-release

What's Changed

Add link to prerequisites in readme by @atantawi in #3
Fix references in docs and readme files by @atantawi in #8
Have sample demo data in a common repo by @atantawi in #9
Remove control loop by @atantawi in #10
Changes to move to new design by @asm582 in #11
change interface of optimizer by @asm582 in #13
add logger with additional code changes by @asm582 in #15
Initial integration of inferno model analyzer and optimizer functionality by @atantawi in #17
feat: install cluster with multiple nodes, gpus by @haroldship in #21
Fix prometheus address by @haroldship in #25
Inferno emulator mode by @mamy-CS in #18
Revert "Inferno emulator mode" by @mamy-CS in #27
Vendor vllme (new-metric branch) into inferno-autoscaler by @mamy-CS in #24
automated inferno deployment for dev by @mamy-CS in #28
add license file by @mamy-CS in #31
Scale down variant to one replica when no traffic by @atantawi in #33
Fix setupwithmanager in controller by @asm582 in #29
use ctrl runtime backed cache for listing nodes by @asm582 in #26
resolve merge issue by @asm582 in #34
Variant to keep accelerator by @atantawi in #30
add retries by @asm582 in #35
enable HA in controller by @asm582 in #36
improve error handling and retries for current design by @asm582 in #37
update reconciler to smaller, composable helper functions and reduce inline logic by @asm582 in #38
Align vllme metrics with vllm for autoscaler compatibility by @vishakha-ramani in #32
rem modelservice requeue for reconcile by @asm582 in #40
Support configured maximum batch size by @atantawi in #41
Remove hard-coded accelerator name by @atantawi in #43
llm-d integration by @WheelyMcBones in #42
Actuator Emit custom metrics to Prometheus by @mamy-CS in #39
Remove requeue when optimization fails by @asm582 in #49
return errors for cm config by @asm582 in #53
Deployment and test env cleanup by @mamy-CS in #50
update readme by @mamy-CS in #54
simplify watchandrun loop by @asm582 in #56
Fix llmd integration by @WheelyMcBones in #58
move to openAI modelid format by @asm582 in #57
Use reconciler to run periodically by @asm582 in #59
Update readme and install cm in inferno ns by @asm582 in #61
multi-arch build by @mamy-CS in #62
Modified vllme load generator by @vishakha-ramani in #60
loadgen deterministic mode doc update by @mamy-CS in #63
fixed waiting for Gateway and EPP deployments by @WheelyMcBones in #64
Add documentation for modeling and analysis by @atantawi in #67
Report allocatable resources by @asm582 in #66
Update modelling documentation by @atantawi in #69
enabling installation for amd64 arch by @WheelyMcBones in #68
x86 llmd infra installation fixes by @mamy-CS in #72
improve logging readability by @WheelyMcBones in #77
refactoring backoff logic into global backoff by @WheelyMcBones in #89
align slo names to community terms by @asm582 in #101
first basic E2E tests by @WheelyMcBones in #90
fix lint errors by @asm582 in #105
E2E scaling tests by @WheelyMcBones in #104
fix make test by @asm582 in #108
Add optimize section by @asm582 in #109
Remove dummy analyzer code by @asm582 in #110
add crd api docs by @asm582 in #111
fixes for crd and docs by @asm582 in #113
fix shortname issue by @asm582 in #115
add tls configuration by @mamy-CS in #103
Testing continuous generated load and multiple VAs scenarios by @WheelyMcBones in #112
add gha workflows by @clubanderson in #121
remove precommit by @clubanderson in #124
Fix llm-d deploy and modify tests to align with community feedback by @WheelyMcBones in #122
add make test-e2e by @clubanderson in #126
ignore sync errors from zap logger by @asm582 in #125
Handle infeasible optimization solution by @atantawi in #102
Fix E2E test execution on CI by @WheelyMcBones in #130
Remove manual trigger logic by @asm582 in #128
Add unit tests to Inferno-autoscaler components by @WheelyMcBones in #133
Changing E2E to check emitted Inferno metrics by @WheelyMcBones in #135
HPA integration by @WheelyMcBones in #137
Integrating llm-d infra into E2E tests by @WheelyMcBones in #138
Add api unit tests by @asm582 in #140
Include HPA in README config by @WheelyMcBones in #142
consolidate yaml samples in markdown file by @vishakha-ramani in #141
Computing scaling decision based on ratio metric for HPA by @WheelyMcBones in #143
Changes to the collector: token query by @vishakha-ramani in #146
Upgrade optimizer-light to v0.5.0 by @atanta...

Contributors

clubanderson, haroldship, and 5 other contributors
