Performance issues after upgrading to 4.12.3 (from 4.11.2) #4646

Open
zgael opened this issue Feb 12, 2025 · 3 comments

Labels
defect (Something isn't working) · pending more information

Comments

zgael commented Feb 12, 2025

Current Behavior

Hello,

I have deployed Dependency Track at my company, and it is widely used.
We may be using it slightly differently than intended, so please let me know if that's the case: although we have, let's say, a hundred projects in the company, there are around 4000 projects in Dependency Track, because our projects are very active and every new version published to Dependency Track counts as a new project. We are obviously planning to clean up the older projects that no longer make much sense.

Still, this is where we are. We have upgraded from 4.11.2 to 4.12.3 and are experiencing performance issues:

  • The frontend takes a long time to load the project list.
  • API requests on the project list (api/v1/project?page=1&limit=100) also take a long time (between 15 and 45 seconds).
  • When used in conjunction with the Jenkins Dependency-Track plugin, publishing BOMs hits the timeout (the timeout is configurable, but the performance issue impacts this use case as well).

We are deployed in the cloud and have looked at the pod/node metrics, but nothing really stands out (no huge CPU/RAM consumption; IO waits may be the cause).

Yesterday we rebooted everything to get a clean state after the upgrade process.
It was better... for one day, and we considered it fixed... until today, when the performance issues came back.

It seems the database is using a lot more disk than before. Do you have any experience with ~4000 projects, or an idea of what could explain the performance changes in the "recent" versions?
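
A query along these lines (standard PostgreSQL catalog views, nothing Dependency-Track-specific) should show which relations account for the extra disk usage, if that helps:

  -- List the ten largest relations by total on-disk size
  -- (table data + indexes + TOAST), to see which ones grew after the upgrade.
  SELECT relname,
         pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 10;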

Also, we've been wondering if the issue could be linked to the fact that the apiserver pod was automatically killed several times during the upgrade process (the readinessProbe delay was too low to withstand the upgrade). A few errors appear in the database log:

ERROR: duplicate key value violates unique constraint "CONFIGPROPERTY_U1" DETAIL: Key ("GROUPNAME", "PROPERTYNAME")=(artifact, bom.validation.mode) already exists

STATEMENT: select * from 'public.SCHEMAVERSION';
2025-02-12 10:03:25.108 GMT [2812] ERROR: relation "public.schemaversion" does not exist at character 15

These don't seem to be big blocking errors, but I'm still not fond of ignoring them...

Any feedback, pointers on what could explain the situation, where to dig further, or what to do to improve things?

Thanks!

Steps to Reproduce

Not easy to reproduce, as the installation of version 4.12.3 on a testing environment went smoothly! It may have to do with the database load, as there are far fewer projects in the testing environment.

Expected Behavior

The UI loads fast enough (it currently sometimes exceeds 30 seconds, which seems to be the frontend timeout) and interactions with Dependency Track are reasonably fluid.

Dependency-Track Version

4.12.3

Dependency-Track Distribution

Container Image

Database Server

PostgreSQL

Database Server Version

11.6.0

Browser

N/A

EDIT: At first, I said the upgrade was from 4.8.2 to 4.12.3, but actually it was from 4.11.2 to 4.12.3.

zgael added the defect (Something isn't working) and in triage labels on Feb 12, 2025
nscuro (Member) commented Feb 12, 2025

Going from 4.8.2 to 4.12.3 is almost 2 years' worth of changes, so pinpointing this will be a challenge.

We are deployed in the cloud and have looked at the pod/node metrics, but nothing really stands out (no huge CPU/RAM consumption; IO waits may be the cause).

Given this, see if you can get any insights (metrics) from your database to find out where the time is spent. This is impossible to tell from the application's side.

Look for slow queries, stuck DDLs due to the container restarts during migration, lock waits, or similar.
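
For example, something along these lines (standard pg_stat_activity and pg_blocking_pids, nothing Dependency-Track-specific) will surface long-running statements and sessions blocked on locks:

  -- Long-running, non-idle statements and what they are currently waiting on
  SELECT pid,
         now() - query_start AS runtime,
         state,
         wait_event_type,
         wait_event,
         query
  FROM pg_stat_activity
  WHERE state <> 'idle'
  ORDER BY runtime DESC NULLS LAST;

  -- Sessions blocked behind a lock held by another session
  SELECT blocked.pid   AS blocked_pid,
         blocking.pid  AS blocking_pid,
         blocked.query AS blocked_query
  FROM pg_stat_activity AS blocked
  JOIN LATERAL unnest(pg_blocking_pids(blocked.pid)) AS b(pid) ON true
  JOIN pg_stat_activity AS blocking ON blocking.pid = b.pid;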

Another simple measure you could take is running VACUUM ANALYZE on the tables to ensure statistics are up to date. After migrations that affect many rows, that can be necessary.
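
For example (database-wide is simplest; the quoted upper-case "PROJECT" table is just one example of the tables Dependency-Track creates):

  -- Reclaim dead tuples and refresh planner statistics for the whole database;
  -- VERBOSE prints per-table progress.
  VACUUM (ANALYZE, VERBOSE);

  -- Or target a single table:
  VACUUM (ANALYZE) "PROJECT";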

REINDEX <index_name> CONCURRENTLY is also something you could try to ensure that borked indexes are not the problem.
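
Note that REINDEX ... CONCURRENTLY only exists from PostgreSQL 12 onwards; on the PostgreSQL 11 reported above, plain REINDEX is the built-in option and takes an exclusive lock on the table while it runs. The index name below is purely illustrative:

  -- PostgreSQL 12+: rebuild an index without blocking writes
  REINDEX INDEX CONCURRENTLY some_suspect_index;

  -- PostgreSQL 11: CONCURRENTLY is not available for REINDEX;
  -- this locks the table exclusively while it runs
  REINDEX TABLE "PROJECT";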

In any case, database metrics should give you a clearer picture of what's wrong.

zgael changed the title from "Performance issues after upgrading to 4.12.3 (from 4.8.2)" to "Performance issues after upgrading to 4.12.3 (from 4.11.2)" on Feb 12, 2025
zgael (Author) commented Feb 12, 2025

Hi, actually I was completely mistaken: the upgrade was definitely not from 4.8.2 but from 4.11.2 (so "only" 8 months), clearly not 2 years' worth of changes (I edited the post and title to fix this).

Yeah, we've been looking into the database to find blocked queries, and discovered that the upgrade didn't complete properly (the apiserver pod was killed by the readinessProbe before the database upgrade was completed).

So we're cleaning this up and are trying to ensure that the database schema is up-to-date. That can't hurt, and then we can see if that solves the problem or if we keep running into issues.

Do you have any feedback on the use of Dependency Track with thousands of projects?

  • Not a problem, we have boards with multiple thousands of projects, and it is not causing any problem (pagination helps)
  • Looks weird, but shouldn't be a problem
  • Seems totally stupid, you definitely need a purge

I've been working with Dependency Track for a couple of years now, but I am still unsure about the philosophy behind permanent usage in a company full of new and old development projects, CI, and so on - so if you could shed some light on this, that would be great.

Thanks a lot!

stohrendorf (Contributor) commented

@zgael could you check whether disabling the L2 cache resolves your issues, as mentioned in #4590 (comment)? If that's the case, 4.13 will resolve them, since that release will disable the L2 cache. Since a reboot seemed to help, this looks like a cache issue, and the L2 cache usually makes up the majority of it.
