
Estimate # prompts, # input tokens per probe #1071

Open
leondz opened this issue Jan 11, 2025 · 0 comments
Labels
architecture Architectural upgrades

Comments

@leondz
Collaborator

leondz commented Jan 11, 2025

Summary

We need a way to estimate the inference load a run will incur.

One interface might be an `.estimate_prompt_count()` method on probes.

The exact prompt count is first known after the probe has enqueued its attempts, but before they are sent for inference.

It would be worth making this method overridable for probes that are dynamic or adaptive.

Returned values are estimates and don't need to be precise, but should be useful for e.g. building a run-level tqdm bar. Users should expect estimates to always be within an order of magnitude of the real count, and often within 10% of it.

@leondz leondz added the architecture Architectural upgrades label Jan 11, 2025
@leondz leondz added this to the 25.02 Efficiency milestone Jan 11, 2025