AWS EKS Prometheus and value_error #346

Closed
mxw-sec opened this issue Oct 3, 2024 · 6 comments

mxw-sec commented Oct 3, 2024

Describe the bug
Running krr simple --history-duration X, I get the same output regardless of the value of X:

WARNING Not enough history available for cluster arn:aws:eks:us-east-1:{ACCOUNT}:cluster/staging-karpenter. runner.py:235
WARNING If the cluster is freshly installed, it might take some time for the enough data to be available. runner.py:238
WARNING Enough data is estimated to be available after 2024-10-04 02:22:55, but will try to calculate recommendations anyway. runner.py:241
INFO Listing scannable objects in arn:aws:eks:us-east-1:{ACCOUNT}:cluster/staging-karpenter __init__.py:58
ERROR An unexpected error occurred runner.py:349
Traceback (most recent call last):
File "robusta_krr/core/runner.py", line 342, in run
File "robusta_krr/core/runner.py", line 288, in _collect_result
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 531, in list_scannable_objects
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 534, in
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 63, in list_scannable_objects
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 249, in _list_scannable_objects
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 249, in
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 172, in __build_scannable_object
File "robusta_krr/core/models/allocations.py", line 89, in from_container
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for ResourceAllocations
limits
invalid literal for int() with base 10: '3200e6' (type=value_error)

To Reproduce
Run any command, such as:

krr simple --history-duration X

Expected behavior
KRR completes the scan and prints its output instead of failing with the error above.


Desktop:

  • OS: Ubuntu
  • Browser: N/A
  • Version: Latest


mxw-sec changed the title from "AWS EKS" to "AWS EKS Prometheus and value_error" on Oct 3, 2024
aantn (Contributor) commented Oct 6, 2024

Weird, I'm not sure what the problem is. The limits field is not an integer; it's a float field, which makes the error especially weird.

Do you see any more information when running with --verbose?
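
For reference, the error text itself can be reproduced in plain Python, independent of KRR: int() rejects exponent notation, while float() accepts it. This is only a minimal sketch of the Python behavior; it does not show which code path inside KRR actually calls int().

```python
# Plain Python, not KRR code: int() cannot parse exponent notation, float() can.
value = "3200e6"

try:
    int(value)
    int_error = None
except ValueError as exc:
    int_error = str(exc)

as_float = float(value)

print(int_error)  # invalid literal for int() with base 10: '3200e6'
print(as_float)   # 3200000000.0
```

So a message mentioning int() suggests the string is converted to an integer somewhere before it reaches the float field.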

mxw-sec (Author) commented Oct 11, 2024

Nothing new when running with --verbose

Running Robusta's KRR (Kubernetes Resource Recommender) v.1.15.0
Using strategy: Simple
Using formatter: table
[10:48:29] DEBUG An error occurred while checking for a new version runner.py:85
Traceback (most recent call last):
File "robusta_krr/core/runner.py", line 79, in __check_newer_version_available
File "robusta_krr/core/runner.py", line 75, in __parse_version_string
ValueError: invalid literal for int() with base 10: ''

[10:48:30] DEBUG Creating kubernetes python cli monkey patches patch.py:10
DEBUG Found 4 clusters: arn:aws:eks:us-east-1:{Account}:cluster/karpenter-test, arn:aws:eks:us-east-1:{Account}:cluster/test-cluster1, arn:aws:eks:us-west-2:{Account}:cluster/indtest-eks, arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:493
DEBUG Current cluster: arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:494
DEBUG Configured clusters: [] __init__.py:496
INFO Using clusters: ['arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter'] runner.py:280
INFO Prometheus URL is specified, will not auto-detect a metrics service loader.py:55
INFO Trying to connect to Prometheus for arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter cluster prometheus_metrics_service.py:68
INFO Using Prometheus at http://127.0.0.1:9090 for cluster arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter prometheus_metrics_service.py:97
INFO Prometheus found loader.py:74
INFO Prometheus connected successfully for arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter cluster loader.py:47
DEBUG History range for arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter: (datetime.datetime(2024, 10, 11, 5, 48, 31), datetime.datetime(2024, 10, 11, 10, 48, 31)) runner.py:231
[10:48:31] INFO Listing scannable objects in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:58
DEBUG Namespaces: * __init__.py:59
DEBUG Resources: * __init__.py:60
DEBUG Listing HPA-v2s in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Found 3 HPA-v2 in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:221
DEBUG Listing Deployments in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Listing Rollouts in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Listing DeploymentConfigs in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Listing StatefulSets in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Listing DaemonSets in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Listing Jobs in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG Listing CronJobs in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:189
DEBUG DeploymentConfig API not available in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:253
DEBUG Rollout API not available in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:253
DEBUG Found 2 Job in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:221
DEBUG Found 1 CronJob in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:221
DEBUG Found 3 StatefulSet in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:221
DEBUG Found 6 DaemonSet in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:221
DEBUG Found 184 Deployment in arn:aws:eks:us-east-1:{Account}:cluster/staging-karpenter __init__.py:221
ERROR An unexpected error occurred runner.py:349
Traceback (most recent call last):
File "robusta_krr/core/runner.py", line 342, in run
File "robusta_krr/core/runner.py", line 288, in _collect_result
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 531, in list_scannable_objects
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 534, in
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 63, in list_scannable_objects
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 249, in _list_scannable_objects
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 249, in
File "robusta_krr/core/integrations/kubernetes/__init__.py", line 172, in __build_scannable_object
File "robusta_krr/core/models/allocations.py", line 89, in from_container
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for ResourceAllocations
limits
invalid literal for int() with base 10: '3200e6' (type=value_error)

mxw-sec (Author) commented Oct 11, 2024

I am running KRR on Ubuntu 22.04.5 in WSL. I installed it via brew and just updated to 1.16.0, but I'm still getting the same issue.

arikalon1 (Contributor) commented Oct 11, 2024

I think 3200e6 is a legal limit in k8s, and krr doesn't know how to parse it correctly.
3200e6 = 3200000000, right?
Screenshot 2024-10-11 at 7 40 15 PM

The issue seems to occur while handling a Deployment.

Can you try running this, and see if indeed you have that memory limit?
kubectl get deployments -A -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.containers[*].resources.limits.memory}{"\n"}{end}'
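
To illustrate why 3200e6 is a legal value: the Kubernetes quantity format allows a number followed by a binary suffix (Ki, Mi, ...), a decimal suffix (m, k, M, ...), or a decimal exponent (e6, E6). Below is a minimal sketch of a parser that handles all three forms. The function name parse_quantity and the overall structure are hypothetical; this is not KRR's code and not necessarily how the fix was implemented.

```python
import re
from decimal import Decimal

# Suffix tables following the Kubernetes resource quantity format.
BINARY_SI = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
DECIMAL_SI = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"), "": Decimal(1),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
    "T": Decimal("1e12"), "P": Decimal("1e15"), "E": Decimal("1e18"),
}

# A signed decimal number, followed by whatever suffix remains.
_QUANTITY = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+))(.*)$")
# Decimal exponent suffix such as "e6" or "E-3".
_EXPONENT = re.compile(r"^[eE]([+-]?\d+)$")


def parse_quantity(text: str) -> Decimal:
    """Hypothetical helper: convert a Kubernetes quantity string to a Decimal."""
    match = _QUANTITY.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid quantity: {text!r}")
    number, suffix = Decimal(match.group(1)), match.group(2)
    if suffix in BINARY_SI:
        return number * BINARY_SI[suffix]
    if suffix in DECIMAL_SI:  # a bare "E" means exa, not an exponent
        return number * DECIMAL_SI[suffix]
    exponent = _EXPONENT.match(suffix)
    if exponent:
        return number * Decimal(10) ** int(exponent.group(1))
    raise ValueError(f"unknown suffix {suffix!r} in quantity {text!r}")


print(parse_quantity("3200e6"))  # 3200000000
print(parse_quantity("512Mi"))   # 536870912
```

Decimal is used here rather than float so that values such as 100m stay exact; a real implementation might reasonably choose float instead.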

arikalon1 (Contributor) commented

[Screenshot 2024-10-11 at 7 53 15 PM]

moshemorad self-assigned this on Nov 12, 2024

moshemorad (Contributor) commented

The issue was fixed in #361.
