[Bug]: Compression Policy Stalling After Upgrade to TimescaleDB 2.16.2 #7502
Comments
Also note that (if I'm not mistaken) I should be able to set a timeout on the compression policy like ...
SELECT add_compression_policy('<hypertable-name>', INTERVAL '1 week');
WITH job AS (
  SELECT job_id
  FROM timescaledb_information.jobs
  WHERE hypertable_name = '<hypertable-name>'
  AND proc_name = 'policy_compression'
)
SELECT alter_job(
  job_id => job.job_id,
  max_runtime => INTERVAL '1 hour',
  max_retries => 3,
  retry_period => INTERVAL '10 minutes'
)
FROM job;
I can confirm that after running the decompress/compress SQL query above, compression jobs are now back running without any issues.
@rdmolony Thank you for reporting this issue. Do you think this issue is specific to Windows?
Thanks @erimatnor, it could well be, I can't be sure.
I was also stuck with "CALL _timescaledb_functions.policy_compression()". I used @rdmolony's script but got stuck on "SELECT decompress_chunk('_timescaledb_internal._hyper_25_27_chunk')" with a "Lock: relation".
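For anyone hitting the same "Lock: relation" wait, a query along these lines may help show which session is holding the lock. This is only a sketch using the standard PostgreSQL `pg_stat_activity` view and `pg_blocking_pids()` function; it is not something posted in this thread, and the column aliases are just illustrative.

```sql
-- Sketch: list sessions waiting on a lock together with the sessions blocking them.
SELECT
  waiting.pid              AS waiting_pid,
  waiting.wait_event_type,
  waiting.wait_event,
  waiting.query            AS waiting_query,
  blocking.pid             AS blocking_pid,
  blocking.state           AS blocking_state,
  blocking.query           AS blocking_query
FROM pg_stat_activity AS waiting
-- pg_blocking_pids() returns an integer[] of PIDs blocking the given PID
CROSS JOIN LATERAL unnest(pg_blocking_pids(waiting.pid)) AS b(pid)
JOIN pg_stat_activity AS blocking ON blocking.pid = b.pid
WHERE waiting.wait_event_type = 'Lock';
```

If the blocking session turns out to be an application insert, that would be consistent with inserts contending with the decompression for the same chunk.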
An update: I detected the same problem, but this time decided to wait a few days to see if it would resolve on its own, and it did. My app was stopped for a couple of hours, so it was probably new inserts blocking the compression job.
Unfortunately, there are instances where the compression policy can take a significant amount of time to re-compress a chunk, which is what I think people are hitting here. This is one of the things we are actively working on improving. One of the improvements is landing in January: #7482. Meanwhile, decompressing and compressing these problematic chunks manually is your best bet.
There are plans to work on lock contention here as well, which can be caused by the compression policy and inserts into the chunk that should be compressed. In the meantime, I suggest adjusting the policy to run on chunks which don't have data inserted into them anymore. For instance, if you are inserting data over the last 3 days, the compression policy should only compress chunks that are older than that. I realize this cannot be done for all workloads, and that's why we plan on addressing this in the near future. Hope this makes sense.
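As a rough illustration of the suggestion above, the policy could be recreated with a larger `compress_after` so that it only picks up chunks that are no longer receiving inserts. The `7 days` value is an assumption chosen to leave a margin over the 3-day insert window in the example, and `<hypertable-name>` is a placeholder.

```sql
-- Sketch: replace the existing compression policy with one that skips recent chunks.
-- Assumption: data is inserted over roughly the last 3 days, so 7 days leaves a margin.
SELECT remove_compression_policy('<hypertable-name>', if_exists => true);
SELECT add_compression_policy('<hypertable-name>', compress_after => INTERVAL '7 days');
```

The right interval depends on the workload; late-arriving or backfilled data would still land in already-compressed chunks.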
What type of bug is this?
Crash
What subsystems and features are affected?
Compression
What happened?
TL;DR: My database's compression policy job (CALL _timescaledb_functions.policy_compression()) has not run successfully since I upgraded from version 2.15.3 to 2.16.2, approximately four months ago. It appears to get stuck on specific chunks during compression.
The issue has been resolved by running a SQL query to manually decompress/compress all hypertable chunks; see the SQL query at the bottom for more info.
Recently, I noticed that the periodic job CALL _timescaledb_functions.policy_compression() was running indefinitely.
So I looked at this job's stats via ...
... and saw that the last time it ran successfully was 4 months ago.
This timing corresponds with an upgrade of Timescale from version 2.15.3 to 2.16.2
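The exact query used for the job stats is not preserved here. As a sketch only, the `timescaledb_information.jobs` and `timescaledb_information.job_stats` views can be joined to see when a compression policy last finished successfully:

```sql
-- Sketch: show run history for all compression policy jobs.
SELECT
  j.job_id,
  j.hypertable_name,
  s.last_run_started_at,
  s.last_successful_finish,
  s.last_run_status,
  s.job_status,
  s.last_run_duration,
  s.total_runs,
  s.total_failures
FROM timescaledb_information.jobs AS j
JOIN timescaledb_information.job_stats AS s USING (job_id)
WHERE j.proc_name = 'policy_compression';
```

A `last_successful_finish` that is months old while the job is still scheduled would match the symptom described here.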
I noticed that the compression policy query CALL _timescaledb_functions.policy_compression() was getting stuck on one particular hypertable chunk by running ...
... while it was executing.
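The query used at this step is also missing from this copy. One way to inspect what a long-running job is doing, offered as a sketch rather than the query originally posted, is to look at non-idle sessions in PostgreSQL's `pg_stat_activity` view:

```sql
-- Sketch: list active sessions, longest-running first, excluding this session.
SELECT
  pid,
  state,
  wait_event_type,
  wait_event,
  now() - query_start AS running_for,
  query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND pid <> pg_backend_pid()
ORDER BY query_start;
```

The `query` column may show the policy call itself or the statement it is currently executing, depending on how the background worker reports its activity.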
I found a relevant similar issue (thanks to ChatGPT 4o) that suggested manually compressing the chunk ...
So I tried this via ...
... and found that the query ran indefinitely.
So (dumbly) I tried decompressing the chunk manually via ...
... and it eventually decompressed after 40 minutes! Then I recompressed it, which this time took less than a minute!
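For reference, a manual decompress followed by a recompress of a single chunk would look roughly like the following. `<chunk-name>` is a placeholder for a schema-qualified chunk such as the `_timescaledb_internal._hyper_25_27_chunk` mentioned in the comments; the original statements are not preserved here, so treat this as a sketch.

```sql
-- Sketch: decompress, then recompress, one problematic chunk.
-- '<chunk-name>' is a placeholder, e.g. a name under the _timescaledb_internal schema.
SELECT decompress_chunk('<chunk-name>');
SELECT compress_chunk('<chunk-name>');
```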
I don't understand quite what went wrong or how this fixed it. ChatGPT 4o believed that this decompress/compress resolved metadata inconsistencies & bloat, but my understanding is not good enough to dig into that.
So I re-ran the compression policy, and this time it got stuck on another chunk.
So now I'm running decompress/compress on all chunks via ...
Fingers crossed this resolves it
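The script used for this step is not included in this copy. A minimal sketch of the same idea, assuming the `timescaledb_information.chunks` view and a placeholder `<hypertable-name>`, loops over every compressed chunk and decompresses then recompresses it:

```sql
-- Sketch: decompress and recompress every compressed chunk of one hypertable.
-- Note: a DO block like this runs as a single transaction, so locks are held until it finishes.
DO $$
DECLARE
  chunk regclass;
BEGIN
  FOR chunk IN
    SELECT format('%I.%I', chunk_schema, chunk_name)::regclass
    FROM timescaledb_information.chunks
    WHERE hypertable_name = '<hypertable-name>'  -- placeholder
      AND is_compressed
    ORDER BY range_start
  LOOP
    RAISE NOTICE 'Recompressing %', chunk;
    PERFORM decompress_chunk(chunk);
    PERFORM compress_chunk(chunk);
  END LOOP;
END
$$;
```

On a large hypertable it may be safer to run the two calls chunk by chunk in separate transactions, since decompressing a single chunk took 40 minutes in the case above.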
TimescaleDB version affected
2.16.2
PostgreSQL version used
PostgreSQL 14.13, compiled by Visual C++ build 1940, 64-bit
What operating system did you use?
Windows Server 2019 Datacenter 17763.6532
What installation method did you use?
Other
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?
This might be hard to do other than upgrading from version 2.15.3 to 2.16.2 on a Windows server (if I've understood the issue)