
Fix LiteLLM split iteration in greedy_until to avoid duplicate API requests#1177

Open
dyurchenko98 wants to merge 1 commit into huggingface:main from dyurchenko98:fix/litellm_split_iteration

@dyurchenko98

Summary

This PR fixes a regression in LiteLLMClient.greedy_until: inside the split loop, contexts were built from the full dataset instead of from the current split.

Root Cause

In the split loop in litellm_model.py, this line used full-dataset iteration:

contexts = [self.prompt_manager.prepare_prompt_api(doc) for doc in dataset]

As a result, every split re-sent the whole dataset: for S splits and N docs, LiteLLM sent ~N*S requests instead of N.
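The request count can be illustrated with a toy sketch. This is not the code in litellm_model.py: `count_requests`, the `iterate_full_dataset` flag, and the string "prompts" are hypothetical stand-ins used only to show the arithmetic, assuming one request per built context.

```python
def count_requests(splits, iterate_full_dataset):
    """Count the contexts (one request each) built across all splits.

    Hypothetical stand-in for the split loop; not the real greedy_until.
    """
    # The full dataset is just every doc from every split.
    dataset = [doc for split in splits for doc in split]
    sent = 0
    for split in splits:
        # Buggy behavior iterates the whole dataset on every split;
        # fixed behavior iterates only the current split.
        source = dataset if iterate_full_dataset else split
        contexts = [f"prompt:{doc}" for doc in source]
        sent += len(contexts)
    return sent


splits = [["a", "b"], ["c"], ["d", "e", "f"]]  # S = 3 splits, N = 6 docs
buggy = count_requests(splits, iterate_full_dataset=True)   # N * S
fixed = count_requests(splits, iterate_full_dataset=False)  # N
print(buggy, fixed)
```

With S = 3 and N = 6, the full-dataset variant builds 18 contexts while the split-local variant builds 6.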

Changes

Updated split loop context construction to use split-local docs:

contexts = [self.prompt_manager.prepare_prompt_api(doc) for doc in split]

Added regression test:
tests/unit/models/endpoints/test_litellm_split_iteration.py
Verifies that each API call receives only the current split's docs and that the total number of requests sent equals the total number of docs (no redundant requests).
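The shape of such a check might look like the sketch below. It does not reproduce the contents of test_litellm_split_iteration.py: `RecordingClient`, `run_splits`, and the string prompts are hypothetical stand-ins for the real client and prompt manager, used only to show the two assertions described above.

```python
class RecordingClient:
    """Hypothetical fake that records every batch of contexts it is sent."""

    def __init__(self):
        self.calls = []

    def send(self, contexts):
        # Store a copy so later mutation of the list cannot change the record.
        self.calls.append(list(contexts))
        return [f"response:{context}" for context in contexts]


def run_splits(splits, client):
    """Stand-in for the fixed loop: contexts come from the current split only."""
    responses = []
    for split in splits:
        contexts = [f"prompt:{doc}" for doc in split]
        responses.extend(client.send(contexts))
    return responses


splits = [["a", "b"], ["c"]]
client = RecordingClient()
responses = run_splits(splits, client)

# Each call saw only its own split's docs.
assert client.calls == [["prompt:a", "prompt:b"], ["prompt:c"]]
# Total requests sent equals total docs.
total_docs = sum(len(split) for split in splits)
assert sum(len(call) for call in client.calls) == total_docs
```

Recording the inputs per call, rather than only counting calls, is what lets the check catch a loop that sends the right number of batches with the wrong contents.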

Why this matters

Prevents redundant API calls and extra cost.
Ensures split-level params (generation_size, use_logits, num_samples, stop_sequences) are applied to the correct group.
Prevents the extra responses from being silently dropped after get_original_order.
