Personal deploy slow vs replicate's hosted API

I'd like to get this set up as my own API so that I can make changes to it and extend it.

I deployed the `dev.yaml` config to Replicate on an `A100` GPU. It works, but predictions take about 18s (that's 18s of active time, after the model has booted and is warm), whereas the one at https://replicate.com/black-forest-labs/flux-dev takes about 3s. This won't do :(.

I'm deploying from a Windows 11 machine using Cog.

Any idea as to what could be going wrong? Even without "go_fast" checked on the hosted API, it only takes 6.4s. Is the hosted one on a MUCH faster GPU?
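
For anyone who wants to reproduce the comparison, here is a minimal timing sketch in Python. It is not part of the repo: `time_predictions` and `summarize` are hypothetical helper names, and `fake_predict` is a stand-in workload. It measures wall-clock time on the client, so it includes network and queue time and will read higher than the active time Replicate reports.

```python
import statistics
import time
from typing import Callable, List


def time_predictions(predict: Callable[[], object], runs: int = 5, warmup: int = 1) -> List[float]:
    """Call `predict` repeatedly and return wall-clock seconds for each timed run."""
    for _ in range(warmup):
        predict()  # discard warm-up calls so a cold boot doesn't skew the numbers
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> str:
    """Format median, min, and max of the measured durations."""
    return (
        f"median {statistics.median(durations):.2f}s, "
        f"min {min(durations):.2f}s, max {max(durations):.2f}s"
    )


if __name__ == "__main__":
    # Stand-in workload (assumption); swap in a real prediction call to compare deployments.
    def fake_predict() -> None:
        time.sleep(0.01)

    print(summarize(time_predictions(fake_predict, runs=3)))
```

To compare the two deployments, `predict` could wrap a call such as `replicate.run("black-forest-labs/flux-dev", input={"prompt": "a test prompt", "go_fast": False})` from the Replicate Python client, run once against the hosted model and once against the personal deployment. The `prompt` input name is an assumption here, and the call needs `REPLICATE_API_TOKEN` set.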

Ideally I'd get this one up and running on Replicate, going just as fast as the hosted one, and then extend it from there.

Thanks for any help here!

Personal deploy slow vs replicate's hosted API #76