Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Support Postgres as DB backend #3892

Open
its-josh4 opened this issue Jul 10, 2023 · 9 comments · May be fixed by #5355
Open

[Feature] Support Postgres as DB backend #3892

its-josh4 opened this issue Jul 10, 2023 · 9 comments · May be fixed by #5355

Comments

@its-josh4
Copy link
Contributor

Stash stores all data in a SQLite database. Although SQLite is great and is the best option for 90% or more of Stash users, supporting external DB servers such as Postgres would be great.

Postgres:

  • Is more reliable
  • Has better tooling for operations (incl. backups and restore)
  • Can be hosted on a separate node (better for K8s users)
  • Doesn't fail spectacularly if the data is on a NFS/SMB drive like SQLite
  • Can offer better performance.

At a quick glance, it appears that supporting Postgres shouldn't be too complex in stash. We should be able to add that as an option (alternative to SQLite).

Only thing that is an open question is what to do with the built-in backup feature of stash, since it's currently implemented so it does a VACUUM into a separate SQLite database. This is obviously not an option. My suggestion would be to disable built-in backups when using a DB server, since backups should be performed in the server.

@trollboy
Copy link

I'd throw bounty cash at this if it also included mysql, I get that postgres is more popular right now.
IRT Backups, why not a giant JSON dump? JSON would be SQL agnostic.

@MrX292
Copy link
Contributor

MrX292 commented Jul 22, 2023

@trollboy you mean like the Export task?

@trollboy
Copy link

@MrX292 yeah. Exporting/backing up to JSON would be RDMS agnostic.

@NodudeWasTaken
Copy link
Contributor

NodudeWasTaken commented Mar 21, 2024

After investigating the main hindrance seems to be the custom functions registered to sqlite.

ConnectHook: func(conn *sqlite3.SQLiteConn) error {
funcs := map[string]interface{}{
"regexp": regexFn,
"durationToTinyInt": durationToTinyIntFn,
"basename": basenameFn,
"phash_distance": phashDistanceFn,
}
for name, fn := range funcs {
if err := conn.RegisterFunc(name, fn, true); err != nil {
return fmt.Errorf("error registering function %s: %v", name, err)
}
}

To do this in postgres you would need to register a binary to the service with said functions, which could be possible but would require some careful considerations about how to do it.

It might also require some deep dives into specific sqlite and postgres macros like current date etc.

EDIT2:
So after some investigation we need:

  • Alternative mitigations (AUTOINCREMENT/integer, datetime => timestamp, etc. needs replacing)
  • Create a Postgresql binary that registers the missing commands (not too hard, but i dont know that the project wants to maintain that).
  • Specify database type and connection string in stash config.

EDIT3:
So i added a bunch of switches for db type, i think the strategy currently is to create premigrations for creating tables such that i can select what type of index it is pr db type ("integer not null primary key autoincrement", vs "serial not null primary key")

@Traxey
Copy link

Traxey commented Apr 28, 2024

Couldn't this be handled by using an abstraction layer?

@adfaklsdjf
Copy link

Doesn't fail spectacularly if the data is on a NFS/SMB drive like SQLite

Just happened to me on my Unraid server. Wishing I'd stored in the dbms I'm already running on the same server :(

@adfaklsdjf
Copy link

I would also throw a small bounty, like maybe 100+ usd, for support for using either MySQL or Postgres as the backend. I wouldn't need those settings exposed in the settings UI, hand-editing config.yml is fine, or automating the setup of the binary service mentioned above, basic instructions (community-sourced maybe?) for how that might be done and user has to set it up themselves.

I know it would require abstracting parts of the system that currently aren't abstracted like the regex, phash_distance, etc functions mentioned above, so it wouldn't be trivial... but the performance gains for very large databases would be awesome..

I've never done bounties before so I'm not sure how that works, but I give small amounts to all kinds of things.. $60/mo patreon budget, monthly donations to Wikipedia, Internet Archive...

@NodudeWasTaken
Copy link
Contributor

NodudeWasTaken commented Sep 27, 2024

I have create some preliminary work on a branch here:
https://github.com/NodudeWasTaken/stash/tree/postgres-support-2
Its MOSTLY FUNCTIONAL, DONT USE IT.

TODOS:

  • Some backup specific functions arent implemented in postgres.
    Given we only have a network connection we cant run pg_dump.
    We can fetch everything and dump it into an sqlite file (like anonymize does), but i fear this may cause issues.
    I could be heavy handed and demand some backup file exists like pg_backup_{date}.gz and give the command to generate it, since this is an advanced feature anyway.
  • The missing functions are created, but the operator necessitates sql changes
    Im not that great at C, but here's the initial version of the plugin:
    https://github.com/NodudeWasTaken/stash_pge
    I added the collation using a builtin postgresql function, i need to find if this is sufficient or i need another c function:
    CREATE COLLATION IF NOT EXISTS NATURAL_CI (provider = icu, locale = 'en@colNumeric=yes');
  • I was lazy with the migrations, so we have one big one and it bugs out on first boot saying "found 2 migrations, expected 68" or something.
    I should probably just lower the number for postgres, like its always 68 behind, but increments with it.

There are probably more bugs which i haven't found.
Any help would be appreciated 😊

@NodudeWasTaken NodudeWasTaken linked a pull request Oct 8, 2024 that will close this issue
@its-josh4
Copy link
Contributor Author

@NodudeWasTaken I would be very happy to help. Let me know how we can sync! (Discord?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

6 participants