feat: Allow prefetching dependencies using a Threadpool #175
base: main
Conversation
…a bottle-neck there
Encord Agents test report: 93 tests, 93 ✅, 1m 58s ⏱️. Results for commit 25f99c4.
with ThreadPoolExecutor() as executor:
    dependency_list = list(
        executor.map(
            lambda context: solve_dependencies(
                context=context, dependant=runner_agent.dependant, stack=stack
            ),
            batch,
        )
    )
I don't see why we'd want to wait for all tasks to be fetched before starting the compute? Can't we just call the agent when iterating the output of the executor.map call rather than wrapping it in list?
One could do. I was just replicating the customer's original behaviour, where all fetching happens ahead of task execution.
Notably, the customer explicitly wanted sequential inference, and if the map included the agent execution as well, we would lose that guarantee.
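For reference, lazy consumption would also preserve sequential inference, since executor.map yields results in submission order while the remaining fetches continue in the background. A minimal sketch, assuming the same batch, runner_agent, stack, and solve_dependencies as in the diff above; the runner_agent.callable invocation is hypothetical:

from concurrent.futures import ThreadPoolExecutor

def run_batch(batch, runner_agent, stack):
    with ThreadPoolExecutor() as executor:
        # Same fetch as in the diff, but consumed lazily instead of via list(...).
        results = executor.map(
            lambda context: solve_dependencies(
                context=context, dependant=runner_agent.dependant, stack=stack
            ),
            batch,
        )
        # map() yields in submission order, so agent execution below stays
        # strictly sequential while later fetches run concurrently.
        for dependencies in results:
            runner_agent.callable(**dependencies.values)  # hypothetical invocation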
        help="Max number of tasks to try to process per stage on a given run. If `None`, will attempt all",
    ),
] = None,
pre_fetch_factor: Annotated[
Not sure the name and functionality match here. A factor is something you multiply by; this seems to be an absolute number, at least based on the docstring.
Could be pre-fetch batch size. My view on "factor" was: if set to x, we perform a (grouped) dependency fetch N/x times rather than N times.
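To make the N/x reading concrete, here is a sketch of the grouping; the batched helper is hypothetical (Python 3.12's itertools.batched offers the same behaviour):

from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

def batched(tasks: Iterable[T], pre_fetch_factor: int) -> Iterator[list[T]]:
    # With N tasks and pre_fetch_factor = x, this yields roughly N / x
    # chunks, i.e. N / x grouped dependency fetches rather than N.
    iterator = iter(tasks)
    while chunk := list(islice(iterator, pre_fetch_factor)):
        yield chunk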
except Exception:
    print(f"[attempt {attempt+1}/{num_retries+1}] Agent failed with error: ")
    traceback.print_exc()

with ThreadPoolExecutor() as executor:
Should we not specify how many threads (perhaps even expose it as an argument)?
We could do. By default, ThreadPoolExecutor picks the worker count based on the number of CPUs. I didn't necessarily want to expand the interface too much, but I definitely see your point.
An endless amount of effort could be put into this. Ideally there would be separate fetch and execute queues, and ideally we could mark our dependencies as async so that while one fetch is waiting on I/O, another can proceed.
This is just one approach.
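A sketch combining both ideas, with hypothetical fetch_dependencies and run_agent stand-ins: an explicit max_workers argument for the fetch pool, and a bounded queue that separates fetching from strictly sequential execution:

import queue
import threading
from concurrent.futures import ThreadPoolExecutor

def fetch_dependencies(task):  # stand-in for the real dependency fetch
    return {"task": task}

def run_agent(deps):  # stand-in for the sequential agent execution
    print("running", deps)

def run_pipeline(tasks, max_workers: int | None = None, prefetch: int = 8) -> None:
    fetched: queue.Queue = queue.Queue(maxsize=prefetch)
    sentinel = object()

    def fetcher() -> None:
        # max_workers=None keeps ThreadPoolExecutor's default of
        # min(32, os.cpu_count() + 4) workers (Python 3.8+).
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            for deps in executor.map(fetch_dependencies, tasks):
                # The bounded put limits how many completed results can
                # pile up ahead of the sequential consumer.
                fetched.put(deps)
        fetched.put(sentinel)

    threading.Thread(target=fetcher, daemon=True).start()
    # The consumer side stays strictly sequential.
    while (deps := fetched.get()) is not sentinel:
        run_agent(deps)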