sweepbatcher: do not fail on restoring empty batches #754


Merged

bhandras merged 4 commits into lightninglabs:master from sweepbatcher-empty-batch-fix on May 24, 2024

Conversation

@bhandras bhandras commented May 24, 2024

Previously, storing an empty batch would make the batcher fail to start, since spinning up a restored batch assumes that a primary sweep has already been added. As there's no point in spinning up such a batch, we can simply skip over it.
Furthermore, we ensure that we never try to publish an empty batch, to avoid setting the fee rate too early.
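As a rough illustration of the restore-time behavior described above (simplified, hypothetical types, not the actual sweepbatcher code), the skip can be sketched as:

```go
package main

import "fmt"

// batch is a simplified stand-in for the real sweepbatcher batch type.
type batch struct {
	id     int32
	sweeps []string
}

// spinUpBatchFromDB skips restored batches that have no sweeps instead
// of failing: such a batch has no primary sweep to resume from, so
// there is nothing useful to spin up.
func spinUpBatchFromDB(b *batch) error {
	if len(b.sweeps) == 0 {
		fmt.Printf("skipping restored batch %d as it has no sweeps\n", b.id)
		return nil // previously this path returned an error
	}

	fmt.Printf("restoring batch %d with %d sweeps\n", b.id, len(b.sweeps))
	return nil
}

func main() {
	_ = spinUpBatchFromDB(&batch{id: 1})
	_ = spinUpBatchFromDB(&batch{id: 2, sweeps: []string{"sweep-a"}})
}
```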

Pull Request Checklist

  • Update release_notes.md if your PR contains major features, breaking changes or bugfixes

@bhandras bhandras force-pushed the sweepbatcher-empty-batch-fix branch 2 times, most recently from d18c3d4 to 6ea1536 Compare May 24, 2024 08:57
 	dbSweeps, err := b.store.FetchBatchSweeps(ctx, batch.id)
 	if err != nil {
 		return err
 	}

 	if len(dbSweeps) == 0 {
-		return fmt.Errorf("batch %d has no sweeps", batch.id)
+		log.Infof("skipping restored batch %d as it has no sweeps",
Member:

maybe also delete from storage

Member Author:

done

@@ -321,7 +330,7 @@ func (b *Batcher) handleSweep(ctx context.Context, sweep *sweep,
 	// If one of the batches accepts the sweep, we provide it to that batch.
 	for _, batch := range b.batches {
 		accepted, err := batch.addSweep(ctx, sweep)
-		if err != nil && err != ErrBatchShuttingDown {
+		if err != nil {
Member:

we don't want to crash here, as it's possible that the batch confirmed by the time the sweep tries to enter it, returning an ErrBatchShuttingDown

Member:

it's not critical, but no need to cause an all-around system restart

could perhaps add some simple coverage for this, not sure how involved the unit test would be though

Collaborator:

In the previous loop (trying to addSweep (update) to the batch to which the sweep already belongs), the error from batch.addSweep is checked with a regular err != nil check. Should the checks of errors from batch.addSweep in the two loops be the same?

Another question about that loop: can we return earlier there, if the sweep was added successfully?

diff --git a/sweepbatcher/sweep_batcher.go b/sweepbatcher/sweep_batcher.go
index 0251101..6019f6c 100644
--- a/sweepbatcher/sweep_batcher.go
+++ b/sweepbatcher/sweep_batcher.go
@@ -388,6 +388,9 @@ func (b *Batcher) handleSweep(ctx context.Context, sweep *sweep,
                                        "accepted by batch %d", sweep.swapHash[:6],
                                        batch.id)
                        }
+
+                       // The sweep was updated in the batch, our job is done.
+                       return nil
                }
        }

Member Author:

Note that there's a check below for accepted and if that's true we return early.

Member:

> In the previous loop (trying to addSweep (update) to the batch to which it already belongs), the error from batch.addSweep is checked with a regular err != nil check

Oh, you're right. If the batch just confirmed, it will trigger an ErrBatchShuttingDown, which should not cause a daemon restart. In that case we should do b.monitorSpendAndNotify in order to let the swap know that it was swept.

Collaborator:

Should this check also be if err != nil && !errors.Is(err, ErrBatchShuttingDown) {?

Member Author:

Added the error check to both code paths.
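The pattern discussed in this thread can be sketched as follows. The types here are simplified stand-ins (only the ErrBatchShuttingDown name comes from the PR); the point is that errors.Is is used so the benign shutdown error does not bubble up and restart the daemon:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBatchShuttingDown mirrors the sentinel error used by the batcher.
var ErrBatchShuttingDown = errors.New("batch shutting down")

// tryAddSweep stands in for batch.addSweep: a batch that has already
// confirmed rejects new sweeps with ErrBatchShuttingDown.
func tryAddSweep(shuttingDown bool) (bool, error) {
	if shuttingDown {
		return false, ErrBatchShuttingDown
	}

	return true, nil
}

// handleSweep applies the check agreed on in the review: the batch may
// confirm concurrently, so ErrBatchShuttingDown is benign and must not
// be treated as fatal; any other error is returned to the caller.
func handleSweep(shuttingDown bool) error {
	accepted, err := tryAddSweep(shuttingDown)
	if err != nil && !errors.Is(err, ErrBatchShuttingDown) {
		return err
	}

	if accepted {
		fmt.Println("sweep accepted")
	}

	return nil
}

func main() {
	_ = handleSweep(false) // sweep accepted
	_ = handleSweep(true)  // batch shutting down: ignored, no restart
}
```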


// Now make it quit by canceling the context.
cancel()
wg.Wait()
Member:

could go a bit further

add a sweep and verify that we now have 1 batch in memory, and 2 in storage

Member Author:

done

Member Author:

I've changed the test a bit since the dormant batch is now dropped from the db.

@bhandras bhandras force-pushed the sweepbatcher-empty-batch-fix branch from 6ea1536 to 12ef7d3 Compare May 24, 2024 11:45
@@ -493,7 +493,7 @@ func (b *batch) Run(ctx context.Context) error {
 		b.currentHeight = height

 	case <-timerChan:
-		if b.state == Open {
+		if b.state == Open && len(b.sweeps) > 0 {
Collaborator:

maybe we can move the length check inside publish so that other calls to it also check for the length?

Member Author:

Good idea, done.
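Moving the guard inside publish means every caller (timer tick, fee bump, and so on) gets the empty-batch check for free. A minimal sketch, assuming a simplified batch type rather than the real sweepbatcher one:

```go
package main

import "fmt"

// batch is a minimal stand-in for the real sweepbatcher batch type.
type batch struct {
	id        int32
	sweeps    []string
	published int
}

// publish carries the empty-batch guard internally, so no call site
// can accidentally publish an empty batch and set a fee rate before
// the first sweep arrives.
func (b *batch) publish() error {
	if len(b.sweeps) == 0 {
		return nil
	}

	b.published++
	fmt.Printf("batch %d: published tx with %d sweeps\n", b.id, len(b.sweeps))

	return nil
}

func main() {
	b := &batch{id: 7}
	_ = b.publish() // no-op: the batch is still empty
	b.sweeps = append(b.sweeps, "sweep-a")
	_ = b.publish()
}
```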

@@ -458,6 +465,11 @@ func (b *Batcher) spinUpBatchFromDB(ctx context.Context, batch *batch) error {
 	log: batchPrefixLogger(fmt.Sprintf("%d", batch.id)),
 }

 cfg := batchConfig{
 	maxTimeoutDistance: batch.cfg.maxTimeoutDistance,
 	batchConfTarget:    defaultBatchConfTarget,
Collaborator:

Why is setting defaultBatchConfTarget needed here? If it is used, it means that the confTargets of the sweeps were not taken into account.

Collaborator:

The source of confTarget is the fetchSweep method. I think that after resuming from the DB we should make sure confTarget is filled from fetchSweep before it is used to calculate the fee rate (i.e. before publish). Then we can remove the defaultBatchConfTarget constant.

GeorgeTsagk (Member), May 24, 2024:

We update this when the first sweep comes in, see here

but I agree it's basically a no-op to set this here, as it's guaranteed to be overwritten

Member Author:

Note that this is just a reorder, no added code.

Member:

yeah, ideally let's not introduce unwanted changes


go func() {
	defer wg.Done()
	err = batcher.Run(ctx)
Collaborator:

Let's rename it to runErr, just in case another err is created between this line and the error check at the end.

Member Author:

done

@bhandras bhandras force-pushed the sweepbatcher-empty-batch-fix branch 4 times, most recently from ed9c22f to b0a1a90 Compare May 24, 2024 14:22
@bhandras bhandras force-pushed the sweepbatcher-empty-batch-fix branch from b0a1a90 to 05f1afb Compare May 24, 2024 14:40
starius (Collaborator) left a comment:

LGTM 🏆

@@ -108,6 +111,12 @@ func (s *SQLStore) InsertSweepBatch(ctx context.Context, batch *dbBatch) (int32,
 	return s.baseDb.InsertBatch(ctx, batchToInsertArgs(*batch))
 }

+// DropBatch drops a batch from the database. Note that we only use this call
+// for batches that have no sweeps and so we'd not be able to resume.
+func (s *SQLStore) DropBatch(ctx context.Context, id int32) error {
Collaborator:

I propose to add a check that the number of sweeps in the batch is 0. If the batch is not empty, return an error.

Member Author:

Good idea, added!
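The suggested safety check can be sketched with an in-memory store; the real SQLStore does this against the database, so memStore here is purely a hypothetical stand-in:

```go
package main

import "fmt"

// memStore is an in-memory stand-in for the SQL store.
type memStore struct {
	sweeps map[int32]int // batch id -> number of sweeps in the batch
}

// DropBatch refuses to delete a batch that still has sweeps: only
// empty batches, which could never be resumed anyway, are safe to drop.
func (s *memStore) DropBatch(id int32) error {
	if n := s.sweeps[id]; n != 0 {
		return fmt.Errorf("refusing to drop batch %d: it has %d sweeps",
			id, n)
	}

	delete(s.sweeps, id)

	return nil
}

func main() {
	s := &memStore{sweeps: map[int32]int{1: 0, 2: 3}}
	fmt.Println(s.DropBatch(1)) // empty batch: dropped without error
	fmt.Println(s.DropBatch(2)) // non-empty batch: returns an error
}
```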

@bhandras bhandras force-pushed the sweepbatcher-empty-batch-fix branch from 05f1afb to df06885 Compare May 24, 2024 16:41
bhandras added 4 commits May 24, 2024 18:44
Previously storing an empty batch would make the batcher fail to start
as spinning up a restored batch assumes that there's a primary sweep
added already. As there's no point in spinning up such batch we can just
skip over it.
Furthermore we'll ensure that we won't try to ever publish an empty
batch to avoid setting the fee rate too early.
@bhandras bhandras force-pushed the sweepbatcher-empty-batch-fix branch from df06885 to 14de8f1 Compare May 24, 2024 16:44
hieblmi (Collaborator) left a comment:

LGTM!

@bhandras bhandras merged commit 563e7be into lightninglabs:master May 24, 2024
4 checks passed
@bhandras bhandras deleted the sweepbatcher-empty-batch-fix branch May 24, 2024 17:04
starius added a commit to starius/loop that referenced this pull request May 29, 2024
It is always overwritten with primary sweep's confTarget.

See lightninglabs#754 (comment)
starius added a commit to starius/loop that referenced this pull request May 29, 2024
It is always overwritten with primary sweep's confTarget.

Print a warning if batchConfTarget is 0 in updateRbfRate.

See lightninglabs#754 (comment)
starius added a commit to starius/loop that referenced this pull request May 30, 2024
It is always overwritten with primary sweep's confTarget.

Print a warning if batchConfTarget is 0 in updateRbfRate.

See lightninglabs#754 (comment)
4 participants