-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stream_bath - pollWindow size in case of restart client application #8
Comments
We have seen a similar behavior reported before https://scylladb-users.slack.com/archives/C2NLNBXLN/p1632379210266400 In short, after we restart a consumer with progresstable, unprocessed stream_ids will start polling from the generation's start and, with very small A way to "workaround" it is to increase I think the issue/report should be a programmatic way to allow specifying the start Poll time (at user own's risk), instead of polling from the starting generation for unprocessed stream_ids. This will help development/non-write-heavy workloads to catch up faster when polling unprocessed stream IDs. |
FYI @avelanarius. @piodul and @avelanarius let's have a meeting to find a common solution for both Golang and Java |
Please note I've addressed the issue with persisting CDC log reader state + restart behaviour in #10. |
And I also would be interested to help with Java lib + scylla-cdc-source-connector where I'm suspecting similar issues (but even harder to debug/analyse...). |
This should be fixed by #13. Now, when the library starts reading for the first time (without any saved progress), the default progress manager saves the timestamp it originally started reading from and, when restarting, makes sure not to read older changes than that when restarting. Users can also implement the |
@AdamStawarz , could you please update if your issue was resolved by #13 |
Issue description
We are trying to implement storing progress in our cdc-consumer-service using the inbuilt progress manager(provided by scylla-cdc-go). But after the cdc-consumer-service and restarting it, we are unable to receive any more cdc updates.
However, according to the logs the progress is loaded for the last_timestamp(check screenshot attached).
On further checking the library code, we noticed that this could be happening because inside the getPollWindow() function in stream_batch.go, the queryWindowRightEnd and confidenceWindowStart are very far apart. On printing the queryWindowRightEnd, we found this value to be equal to the current_generation value(2021-08-04 05:42:32.878000+0000 - progress table screenshot attached) and confidenceWindowStart= time.Now().
Because of this huge size window we are not receiving cdc updates.
On changing this window value to a small one(say 1second - by hard coding), it is working fine.
So I believe the value of queryWindowRightEnd in getPollWindow() function is not getting set to the appropriate value. Are we missing something here?###hat this could be happening because inside the getPollWindow() function in stream_batch.go, the queryWindowRightEnd and confidenceWindowStart are very far apart. On printing the queryWindowRightEnd, we found this value to be equal to the current_generation value(2021-08-04 05:42:32.878000+0000 - progress table screenshot attached) and confidenceWindowStart= time.Now().
Because of this huge size window we are not receiving cdc updates.
On changing this window value to a small one(say 1second - by hard coding), it is working fine.
So I believe the value of queryWindowRightEnd in getPollWindow() function is not getting set to the appropriate value. Are we missing something here?
The text was updated successfully, but these errors were encountered: