-
Hi Jack, a few thoughts:
-
I’m not familiar with MPI, but are you basically running an external command? If so, you might get more mileage out of writing a wrapping `Task` that submits the job to your SLURM cluster directly, plus a cheap polling loop in that wrapping task that reports the status of the SLURM job in the Prefect GUI. This way you can control the requirements for each task separately, and the Dask cluster you’ll need to run the flow can be a lot smaller.
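A rough sketch of that pattern (the job script path, polling interval, and the way completion is detected are placeholders here, not Prefect- or SLURM-mandated choices):

```python
import subprocess
import time

import prefect
from prefect import task


@task
def run_on_slurm(job_script: str, poll_seconds: int = 30) -> str:
    """Submit a SLURM job script and block until it leaves the queue."""
    logger = prefect.context.get("logger")

    # `sbatch --parsable` prints only the job id on success.
    job_id = subprocess.run(
        ["sbatch", "--parsable", job_script],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    logger.info("Submitted SLURM job %s", job_id)

    # Cheap polling loop: while squeue still lists the job it is pending or
    # running; once it disappears it has finished. A real task would also
    # check `sacct` for the final state and raise on failure.
    while True:
        listing = subprocess.run(
            ["squeue", "-h", "-j", job_id],
            capture_output=True, text=True,
        ).stdout.strip()
        if not listing:
            break
        logger.info("SLURM job %s is still in the queue", job_id)
        time.sleep(poll_seconds)

    return job_id
```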
On Mon, 24 Aug 2020 at 17:00, jacksund wrote:
The adaptive deployment method looks super useful!! Thank you. I'll definitely switch to SLURMCluster.adapt() over the SLURMCluster.scale() method when I use this type of architecture. This solves one of my main problems with Dask (wasting resources by holding onto them), but I still need to see whether dask-jobqueue can limit a worker to one task, execute a task via mpirun, and localize tasks to each worker / a single directory. Dask seems more complicated than the simple agents I'm used to, but I'm reading more to see if this would work. I think it will end up coming down to how Dask manages tasks/memory between workers: the more isolated they are, the better. Prefect agents look to be more isolated (and higher level), which is why I'm starting here.
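For reference, the adaptive deployment mentioned above looks roughly like this with dask-jobqueue; the resource values are placeholders, and `processes=1, cores=1` is one way to keep each worker to a single task at a time:

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# One single-threaded process per worker, so each worker runs one task at a time.
cluster = SLURMCluster(
    queue="general",           # placeholder partition name
    cores=1,
    processes=1,
    memory="4GB",
    walltime="02:00:00",
    local_directory="/tmp",    # scratch space for the worker itself
)

# Scale between 0 and 10 SLURM jobs based on pending work, so resources are
# released whenever the scheduler has nothing queued.
cluster.adapt(minimum=0, maximum=10)

client = Client(cluster)
```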
-
The alternative is to write a cluster-level `Executor` that will let you take control of task submission. You implement something that takes a `Callable` and returns something like a `Future`, and then something that can wait on those futures.
The upside of this is full control of task submission. The downside is that every single task will be a cluster task: even if it only takes two seconds, you’ll still pay queue time (but Prefect should gain you some parallelism).
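A very rough sketch of the submit/wait pair such an executor needs; the Prefect `Executor` base class and its exact hook signatures are left out here, and a thread pool stands in for the SLURM-side bookkeeping, so treat every name as illustrative:

```python
import concurrent.futures
from typing import Any, Callable, List


class SlurmExecutorSketch:
    """Illustrative only: each submitted callable would become one SLURM job.

    A real implementation would subclass Prefect's Executor and wire these
    two methods into its submit/wait hooks.
    """

    def __init__(self, max_workers: int = 8) -> None:
        # The threads only babysit sbatch/squeue calls; the heavy work runs
        # on the cluster, so a small pool is plenty.
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers)

    def submit(self, fn: Callable, *args: Any, **kwargs: Any) -> concurrent.futures.Future:
        # In a real executor, `fn` would be serialized into a job script and
        # handed to sbatch; running it in a thread is just a stand-in.
        return self._pool.submit(fn, *args, **kwargs)

    def wait(self, futures: List[concurrent.futures.Future]) -> List[Any]:
        # Block until every future resolves, preserving submission order.
        return [f.result() for f in futures]
```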
On Mon, 24 Aug 2020 at 18:16, jacksund wrote:
I won't be able to do this, unfortunately. This approach leads to a number of issues when I'm using multiple clusters (I submit to in-lab clusters, UNC clusters, and national clusters all through the same workflow manager), such as bottlenecks when one cluster hits a longer queue than the others. It also loses data, such as how long the task took to run (queue time is absorbed into the task duration). There are other problems that pop up, but these are the big two for me. It's a good idea, though; it just falls apart for my specific application.
-
Just wanted to leave an update. I've been looking into writing a minimalist Agent for this purpose, but I've admittedly gotten overwhelmed by the Agent base class... I'll probably drop this for now and perhaps revisit it later in the year. For the moment, I've settled on the LocalAgent for testing. So instead of submitting a single task per job, it's one flow per SLURM job. It works so far across multiple clusters, where I'm starting agents up with...
When I submit a bunch of SLURM jobs on one cluster (where many Agents are being launched/killed/relaunched), I have all the Agents simply use the same RUNNER API token. No errors occur when I run agents at the same time with the same token (tested without the max_polls kwarg), but I haven't fully tested this because I'm still using the developer edition of Prefect Cloud. I'm limited to just one concurrent flow, so I have no idea whether this setup will break down with >1 concurrency. Should I just fill out this form to request a small increase in flow concurrency for testing?

The key thing I have working is a queue manager that tries to maintain N SLURM jobs (and therefore N Agents) on a cluster at any given time. This is really just a rewrite of FireWorks' queue module. Once I have that ironed out, I'll share it here and perhaps submit it to prefect.contrib.
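For what it's worth, a sketch of that queue manager could be as small as the following; the agent command inside the job script, the resource lines, and the target job count are all placeholders that would need to match your Prefect version and cluster:

```python
import subprocess

TARGET_JOBS = 5                # keep roughly this many agent jobs queued/running
JOB_NAME = "prefect-agent"     # used to count our own jobs in squeue

# The payload each SLURM job runs: start one agent, let it poll once, exit.
# The agent command here is a placeholder for whatever your Prefect version uses.
JOB_SCRIPT = """#!/bin/bash
#SBATCH --job-name=prefect-agent
#SBATCH --time=02:00:00
#SBATCH --mem=4G
prefect agent start local --max-polls 1
"""


def count_agent_jobs() -> int:
    """How many of our agent jobs are currently pending or running."""
    listing = subprocess.run(
        ["squeue", "-h", "-n", JOB_NAME, "-o", "%i"],
        capture_output=True, text=True,
    ).stdout.strip()
    return len(listing.splitlines()) if listing else 0


def top_up_queue() -> None:
    """Submit enough new jobs to bring the total back up to TARGET_JOBS."""
    for _ in range(TARGET_JOBS - count_agent_jobs()):
        # sbatch reads the job script from stdin when no file is given.
        subprocess.run(["sbatch"], input=JOB_SCRIPT, text=True, check=True)
```

Calling `top_up_queue()` from cron or a small loop on a login node would keep the pool of agents topped up.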
-
Hey everyone,
I apologize in advance for how much I wrote here... I realize that I'm attempting to use Prefect in a manner it wasn't originally designed for, so I include a longer explanation of why I'm doing this below.
While most users are coming from Airflow, I'm trying to transition from FireWorks (https://github.com/materialsproject/fireworks).
So here's my main question:
Can I set up Prefect with many Agents, each of which makes a single task request and then terminates? In addition, if an Agent starts and no tasks are ready for execution, it should terminate as well. If this is possible, what would you estimate the overhead per task to be? I would expect the main overhead to come from the Agent's connection to Prefect Cloud/Server.
This may sound like an inefficient use of Prefect, but it is intentional. FireWorks is designed with this setup in mind and is thus limited to about 6 tasks per second; this is acceptable because the average task submitted via FireWorks is on the hour timescale.
I'm a materials chemistry researcher at UNC, where I must submit tasks as individual SLURM jobs, each with its own time and memory restrictions. These tasks (DFT energy calculations) vary drastically in their required resources (one could need <1 GB of memory while another needs >200 GB), launch in parallel using mpirun, and require their own isolated directory.
I could use a single Executor like Dask, which supports queueing systems like SLURM, but this would cause a number of problems for me. Dask holds onto worker resources indefinitely, which research-cluster admins don't want; if no tasks are ready to execute, the cluster's resources should be released. Dask (as far as I'm aware) does not allow setting time/memory limits on a per-task basis. And I'm unsure how Dask will handle tasks that execute via mpirun and also need isolated directories per task.
FireWorks was made by the materials chemistry community specifically for this. In that setup, you constantly submit SLURM jobs; once a SLURM job makes it through the queue, the job itself simply starts an Agent, runs a single task, and then closes. This submission architecture is something I would like to replicate with Prefect. FireWorks has a number of limitations that I think Prefect can fix, such as its Workflow classes and MongoDB meta-database, so I'm looking into switching over.
Should this be of interest to others, it may be worth making QueueAdaptors for Prefect, similar to FireWorks' adapters (https://materialsproject.github.io/fireworks/queue_tutorial.html).
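A hypothetical sketch of what such an adapter might look like on the SLURM side (class and field names are invented here; the payload would be whatever command the task ultimately runs, e.g. an mpirun line):

```python
import subprocess
from dataclasses import dataclass


@dataclass
class SlurmQueueAdapter:
    """Hypothetical adapter: turn per-task resource specs into an sbatch call."""
    walltime: str = "01:00:00"
    memory: str = "4G"
    ntasks: int = 1

    def render(self, payload: str) -> str:
        # Build a job script from this adapter's resource settings.
        return (
            "#!/bin/bash\n"
            f"#SBATCH --time={self.walltime}\n"
            f"#SBATCH --mem={self.memory}\n"
            f"#SBATCH --ntasks={self.ntasks}\n"
            f"{payload}\n"
        )

    def submit(self, payload: str) -> str:
        # sbatch reads the script from stdin; --parsable returns just the job id.
        result = subprocess.run(
            ["sbatch", "--parsable"],
            input=self.render(payload),
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


# e.g. SlurmQueueAdapter(memory="200G", ntasks=16).submit("mpirun my_dft_code")
```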
Again, sorry for the long write-up! Thanks for reading through, and let me know if you think a multi-Agent approach is possible with Prefect.
-Jack