bug: The prometheus format output is not standard #5386

@Spoutnik97

Describe the bug

I am trying to scrape the BentoML /metrics route with Fluent Bit.
The Fluent Bit prometheus_scrape input throws an error: [2025/06/19 14:19:35] [error] [input:prometheus_scrape:prometheus_scrape.0] error decoding Prometheus Text format

The issue seems to come from the order of the histogram samples.

All the _sum samples appear at the beginning of the metric, followed by the _bucket and _count samples.
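
To make the ordering easier to see, here is a minimal sketch (not part of BentoML or Fluent Bit) that collapses the samples of one histogram family into runs of consecutive suffixes. The helper name `suffix_runs` and the shortened sample text are illustrative assumptions, and the regular expression assumes label values contain no `}` character.

```python
import re
from itertools import groupby

# Matches `name{labels} value`. Assumption: label values contain no `}`.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)


def suffix_runs(text, family):
    """Return (suffix, run_length) pairs for the samples of one histogram family."""
    suffixes = []
    for line in text.splitlines():
        m = SAMPLE_RE.match(line)
        # Comment lines (# HELP / # TYPE) and other families are skipped.
        if m is None or not m.group("name").startswith(family + "_"):
            continue
        suffixes.append(m.group("name")[len(family):])
    return [(suffix, len(list(group))) for suffix, group in groupby(suffixes)]


# Shortened, hypothetical sample with the same shape as the reported output.
SAMPLE = """\
# TYPE prediction_time_seconds histogram
prediction_time_seconds_sum{company_id="a"} 1.5
prediction_time_seconds_sum{company_id="b"} 0.5
prediction_time_seconds_bucket{company_id="a",le="0.5"} 1.0
prediction_time_seconds_bucket{company_id="a",le="+Inf"} 2.0
prediction_time_seconds_count{company_id="a"} 2.0
prediction_time_seconds_bucket{company_id="b",le="0.5"} 1.0
prediction_time_seconds_bucket{company_id="b",le="+Inf"} 1.0
prediction_time_seconds_count{company_id="b"} 1.0
"""

runs = suffix_runs(SAMPLE, "prediction_time_seconds")
print(runs)
# [('_sum', 2), ('_bucket', 2), ('_count', 1), ('_bucket', 2), ('_count', 1)]
```

A leading `_sum` run that covers every label set, as in the output above, is the pattern described in this report.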

To reproduce

  1. Deploy a basic BentoML container with metrics enabled
  2. Install Fluent Bit (brew install fluent-bit on macOS)
  3. Create a basic configuration file, fluent-bit.conf:
[SERVICE]
    Flush         2
    Log_level     debug
    Daemon        off
    HTTP_Server   on
    HTTP_Listen   0.0.0.0
    HTTP_PORT     2020

[INPUT]
    Name                  prometheus_scrape
    Tag                   local_metrics
    Scrape_interval       2s
    Host                  localhost
    Port                  8080
    Metrics_path          /test-metrics.txt

[OUTPUT]
    Name                  stdout
    Match                 *
    Format                json_lines
  4. Create a test-metrics.txt file with the content of the metrics below
  5. Launch a basic HTTP server: python3 -m http.server 8080
  6. Launch Fluent Bit: fluent-bit -c fluent-bit.conf

Content of the test-metrics.txt file used for the reproduction:

# HELP prediction_time_seconds Time taken for predictions
# TYPE prediction_time_seconds histogram
prediction_time_seconds_sum{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs"} 56.312395095825195
prediction_time_seconds_sum{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs"} 2.419936180114746
prediction_time_seconds_sum{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs"} 0.5229167938232422
prediction_time_seconds_sum{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs"} 4.157390356063843
prediction_time_seconds_sum{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs"} 8.153648376464844
prediction_time_seconds_sum{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs"} 0.32573604583740234
prediction_time_seconds_sum{company_id="e3874ca4-3ea0-46d7-8e8c-359065b0fab9",endpoint="predict_process_collection_and_costs"} 1.031454086303711
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="0.5"} 219.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="1.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="2.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="5.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="10.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="30.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="60.0"} 220.0
prediction_time_seconds_bucket{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs",le="+Inf"} 220.0
prediction_time_seconds_count{company_id="96a16b00-d289-45e6-856c-b45d7b83a09d",endpoint="predict_process_collection_and_costs"} 220.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="0.5"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="1.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="2.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="5.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="10.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="30.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="60.0"} 8.0
prediction_time_seconds_bucket{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs",le="+Inf"} 8.0
prediction_time_seconds_count{company_id="be545906-1849-4c10-a331-6fffc88aa3ba",endpoint="predict_process_collection_and_costs"} 8.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="0.5"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="1.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="2.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="5.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="10.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="30.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="60.0"} 2.0
prediction_time_seconds_bucket{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs",le="+Inf"} 2.0
prediction_time_seconds_count{company_id="c0cc5509-1249-4b1a-958b-af1dac4af697",endpoint="predict_process_collection_and_costs"} 2.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="0.5"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="1.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="2.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="5.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="10.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="30.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="60.0"} 15.0
prediction_time_seconds_bucket{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs",le="+Inf"} 15.0
prediction_time_seconds_count{company_id="1a3dd9b6-ba28-408d-aa7f-bb27e2d00f46",endpoint="predict_process_collection_and_costs"} 15.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="0.5"} 31.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="1.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="2.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="5.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="10.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="30.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="60.0"} 32.0
prediction_time_seconds_bucket{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs",le="+Inf"} 32.0
prediction_time_seconds_count{company_id="62b6fe8f-6dce-407c-9e6b-8c588a2d9501",endpoint="predict_process_collection_and_costs"} 32.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="0.1"} 0.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="0.5"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="1.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="2.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="5.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="10.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="30.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="60.0"} 1.0
prediction_time_seconds_bucket{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs",le="+Inf"} 1.0
prediction_time_seconds_count{company_id="1b700fec-6f92-484e-8243-7cb1a47e7afc",endpoint="predict_process_collection_and_costs"} 1.0
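
As a possible stopgap, the text could be regrouped before it is served to the scraper. The sketch below is an assumption-heavy illustration, not a verified fix: it has not been checked against Fluent Bit, the helper name `regroup_histogram` is hypothetical, it assumes the input holds a single histogram family, and it assumes label values contain no `}`. It emits each label set's `_bucket` lines, then `_sum`, then `_count`, which is the order used in the histogram example of the Prometheus text format documentation.

```python
import re

# Matches `name{labels} value`. Assumption: label values contain no `}`.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{([^}]*)\})?\s+(\S+)')
# Matches the `le` label only when it is a whole label name (not e.g. `handle`).
LE_RE = re.compile(r'(?:^|,)le="[^"]*"')


def regroup_histogram(text, family):
    """Regroup one histogram family so each label set's samples are contiguous."""
    passthrough = []  # comments, blank lines, and samples of other families
    groups = {}       # label set without `le` -> suffix -> original lines
    for line in text.splitlines():
        m = SAMPLE_RE.match(line)
        if m is None or not m.group(1).startswith(family + "_"):
            passthrough.append(line)
            continue
        suffix = m.group(1)[len(family):]
        key = LE_RE.sub("", m.group(2) or "").strip(",")
        groups.setdefault(key, {}).setdefault(suffix, []).append(line)

    out = list(passthrough)
    for by_suffix in groups.values():
        for suffix in ("_bucket", "_sum", "_count"):
            out.extend(by_suffix.pop(suffix, []))
        # Any other suffix (for example `_created`) is kept after the known ones.
        for leftover in by_suffix.values():
            out.extend(leftover)
    return "\n".join(out) + "\n"


# Shortened, hypothetical input with the same shape as the file above.
UNGROUPED = """\
# TYPE prediction_time_seconds histogram
prediction_time_seconds_sum{company_id="a"} 1.5
prediction_time_seconds_sum{company_id="b"} 0.5
prediction_time_seconds_bucket{company_id="a",le="+Inf"} 2.0
prediction_time_seconds_count{company_id="a"} 2.0
prediction_time_seconds_bucket{company_id="b",le="+Inf"} 1.0
prediction_time_seconds_count{company_id="b"} 1.0
"""

regrouped = regroup_histogram(UNGROUPED, "prediction_time_seconds")
print(regrouped)
```

The label sets keep the order in which they first appear, and every sample line is passed through unmodified, so only the line order changes.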

Expected behavior

No response

Environment

bentoml==1.3.20
python>=3.10

Activity

added a commit that references this issue on Jun 21, 2025
87fd1c3
        bug: The prometheus format output is not standard · Issue #5386 · bentoml/BentoML