Selenium-grid autoscaler not working as expected #1624

@miguel-cardoso-mindera

Description

I'm having issues with the selenium-grid scaler: there are always sessions waiting in the queue, and the scaler never scales up far enough to drain it.

These are my Selenium Helm chart values:

selenium-operator:
  global:
    seleniumGrid:
      imageRegistry: XXXXX
      imageTag: 4.35.0-20250808
      nodesImageTag: 4.35.0-20250808
      kubectlImage: XXXXX
  ingress:
    enabled: false

  basicAuth:
    enabled: false

  autoscaling:
    enableWithExistingKEDA: true
    scalingType: deployment
    scaledOptions:
      minReplicaCount: 1
      maxReplicaCount: 100
      pollingInterval: 5
    #scaledJobOptions:
    #  scalingStrategy:
    #    strategy: default

  # Configuration for isolated components (applied only if `isolateComponents: true`)
  components:
    router:
      nodeSelector:
        grid: "true"

    # Configuration for distributor component
    distributor:
      nodeSelector:
        grid: "true"

    # Configuration for Event Bus component
    eventBus:
      nodeSelector:
        grid: "true"

    # Configuration for Session Map component
    sessionMap:
      nodeSelector:
        grid: "true"

    # Configuration for Session Queue component
    sessionQueue:
      nodeSelector:
        grid: "true"

  # Configuration for selenium hub deployment (applied only if `isolateComponents: false`)
  hub:
    imageTag: 4.35.0-20250808
    imageRegistry: XXXXXXXX
    nameOverride: selenium-operator-hub
    resources:
      requests:
        memory: "8Gi"
        cpu: "1"
      limits:
        memory: "12Gi"
        cpu: "3"
    nodeSelector:
      fanduel.com/spot: utils
    tolerations:
      - key: "fanduel.com/spot"
        operator: "Equal"
        value: "utils"
        effect: "NoSchedule"

  # Configuration for chrome nodes
  chromeNode:
    enabled: true
    deploymentEnabled: true
    replicas: 2
    imageRegistry: XXXXX
    imageTag: 138.0-20250808
    nameOverride: selenium-operator-chrome-node

    resources:
      requests:
        memory: "2Gi"
        cpu: "1"
      limits:
        memory: "3Gi"
        cpu: "1500m"

    nodeSelector:
      fanduel.com/spot: utils
    tolerations:
      - key: "fanduel.com/spot"
        operator: "Equal"
        value: "utils"
        effect: "NoSchedule"

    extraEnvironmentVariables:
      - name: SE_NODE_GRID_URL
        value: ""
      - name: SCREEN_WIDTH
        value: "1600"
      - name: SCREEN_HEIGHT
        value: "900"

    dshmVolumeSizeLimit: 2Gi

    hostAliases:
      - ip: "127.0.0.1"
        hostnames:
          - "api.lab.amplitude.com"

  # Configuration for firefox nodes
  firefoxNode:
    enabled: false

  # Configuration for edge nodes
  edgeNode:
    enabled: false

Expected Behavior

I expect KEDA to scale the Chrome nodes until no sessions remain in the queue, and to keep adding nodes as new sessions are created.
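
A minimal sketch of this expectation, written as a simplified model and not as KEDA's actual selenium-grid scaler calculation: every queued or active session should get a slot, clamped to the configured bounds. The function name and the `sessions_per_node` parameter are hypothetical, and one session per node is an assumption; the default bounds mirror `minReplicaCount: 1` and `maxReplicaCount: 100` from the values above.

```python
import math


def expected_replicas(queued, active, sessions_per_node=1,
                      min_replicas=1, max_replicas=100):
    """Return the node count needed to give every session a slot.

    Simplified model only; sessions_per_node=1 is an assumption,
    not a value taken from the chart.
    """
    if sessions_per_node < 1:
        raise ValueError("sessions_per_node must be at least 1")
    # Total demand is everything waiting plus everything already running.
    needed = math.ceil((queued + active) / sessions_per_node)
    # Clamp to the bounds configured under autoscaling.scaledOptions.
    return max(min_replicas, min(needed, max_replicas))


print(expected_replicas(queued=12, active=2))  # 14
```

Under this model, a queue of 12 with 2 running sessions should lead to 14 Chrome nodes, well below the configured maximum of 100.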

Actual Behavior

The queue never drains; it consistently holds more than 10 sessions.

Steps to Reproduce the Problem

  1. Deploy KEDA and the Selenium Helm chart
  2. Run a pipeline that requests dozens of tests and sessions
  3. Observe the tests fail with timeouts because session requests sit in the queue
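
To observe the queue while the pipeline runs, the Grid's GraphQL endpoint can be polled. The sketch below is one possible way to do that, not part of the original setup: the helper names are hypothetical, and the hub address passed in is whatever the Grid is reachable at in a given cluster.

```python
import json
import urllib.request

# Fields of the Grid type exposed by the Selenium Grid 4 GraphQL endpoint.
GRID_QUERY = "{ grid { sessionQueueSize sessionCount maxSession nodeCount } }"


def build_request(grid_url):
    """Build the POST request for the Grid's /graphql endpoint."""
    payload = json.dumps({"query": GRID_QUERY}).encode("utf-8")
    return urllib.request.Request(
        grid_url.rstrip("/") + "/graphql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_grid_status(body):
    """Extract the grid summary dict from a GraphQL JSON response body."""
    return json.loads(body)["data"]["grid"]


def fetch_grid_status(grid_url, timeout=10):
    """Query a live Grid and return its summary (performs a network call)."""
    with urllib.request.urlopen(build_request(grid_url), timeout=timeout) as resp:
        return parse_grid_status(resp.read().decode("utf-8"))
```

Calling `fetch_grid_status` with the hub URL every few seconds and logging `sessionQueueSize` next to `nodeCount` would show whether new nodes are registering while the queue stays above 10.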

Specifications

  • KEDA Version: 2.16.1
  • Kubernetes Version: 1.32
  • Scaler(s): selenium-grid

Labels: bug, stale
