Skip to content

Conversation

@utkuozdemir
Copy link
Member

@utkuozdemir utkuozdemir commented Nov 28, 2025

Use SQLite storage to store machine logs.
Use the same SQLite database used for the metrics state.

Refactor machine log storage to define a common interface for circular and sqlite based logs.

Part of #1771

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors machine log storage to use SQLite as the primary storage backend instead of in-memory circular buffers with optional file persistence. The change introduces a common LogStore interface with two implementations: SQLite-based and circular buffer-based, allowing users to choose their preferred storage method.

Key Changes:

  • Introduced a new logstore package with a common interface for log storage implementations
  • Added SQLite-based log store implementation with support for tailing and following logs
  • Refactored the existing circular buffer implementation to implement the new LogStore interface
  • Changed default configuration to enable SQLite storage and disable circular buffer file storage

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
internal/pkg/siderolink/logstore/logstore.go Defines common LogStore and LineReader interfaces for log storage backends
internal/pkg/siderolink/logstore/sqlitelog/sqlitelog.go SQLite-based log store implementation with write, read, and follow functionality
internal/pkg/siderolink/logstore/sqlitelog/manager.go SQLite store manager for creating, checking existence, and removing machine log stores
internal/pkg/siderolink/logstore/sqlitelog/sqlitelog_test.go Tests for SQLite log store read/write and follow operations
internal/pkg/siderolink/logstore/circularlog/circularlog.go Refactored circular buffer implementation to implement LogStore interface
internal/pkg/siderolink/logstore/circularlog/manager.go Circular buffer store manager implementing common management interface
internal/pkg/siderolink/machines.go Major refactoring to use LogStore abstraction instead of direct circular buffer access
internal/pkg/siderolink/loghandler.go Updated to use LogStore interface and pass context to all log operations
internal/pkg/siderolink/loghandler_test.go Updated tests to use new API with context parameter and database setup
internal/pkg/logreceiver/logreceiver.go Updated to pass context through connection handling
internal/pkg/logreceiver/logreceiver_test.go Updated test to provide context parameter
internal/pkg/config/logs.go Added LogsMachineSQLite configuration struct and documented buffer config usage
internal/pkg/config/config.go Changed defaults to enable SQLite storage and disable circular buffer file storage
internal/backend/grpc/support.go Updated support bundle collection to use new log store API with context
internal/backend/grpc/management.go Updated machine logs streaming to use new reader API with context
internal/backend/runtime/omni/state_sqlite.go Refactored to accept pre-opened database connection instead of opening internally
internal/backend/runtime/omni/state.go Updated to pass secondary storage database connection
cmd/omni/pkg/app/app.go Updated to pass secondary storage database to log handler
cmd/omni/cmd/cmd.go Opens secondary storage database at startup and passes it through the stack
internal/integration/integration_test.go Updated integration tests to create and pass test database

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@utkuozdemir utkuozdemir force-pushed the feat/machine-logs-sqlite-v2 branch 2 times, most recently from 45e657c to c497ee1 Compare November 28, 2025 12:26
@utkuozdemir utkuozdemir requested a review from Copilot November 28, 2025 12:27
Copilot finished reviewing on behalf of utkuozdemir November 28, 2025 12:31
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 29 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Use SQLite storage to store machine logs.
Use the same SQLite database used for the metrics state.

Refactor machine log storage to define a common interface for circular and sqlite based logs.

Signed-off-by: Utku Ozdemir <[email protected]>
@utkuozdemir utkuozdemir force-pushed the feat/machine-logs-sqlite-v2 branch from c497ee1 to a1b543b Compare November 28, 2025 13:26
@utkuozdemir utkuozdemir requested a review from Copilot November 28, 2025 13:41
@utkuozdemir utkuozdemir marked this pull request as ready for review November 28, 2025 13:41
@talos-bot talos-bot moved this to In Review in Planning Nov 28, 2025
@utkuozdemir utkuozdemir self-assigned this Nov 28, 2025
Copilot finished reviewing on behalf of utkuozdemir November 28, 2025 13:43
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +82 to +85
ctx, cancel := context.WithTimeout(ctx, s.config.Timeout)
defer cancel()

result, err := s.db.ExecContext(ctx, query, s.id, message, time.Now().Unix())
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WriteLine method creates a new context with timeout that shadows the incoming context, but holds a mutex during the database operation. If the database operation blocks for longer than the timeout, the mutex will remain locked while the context times out. This could lead to deadlocks or mutex contention. Consider releasing the mutex before the database operation or using the original context instead of creating a new one with timeout.

Copilot uses AI. Check for mistakes.
m.logger.Debug("failed to get number of rows affected during logs cleanup", zap.Error(err))
}

m.logger.Info("completed logs cleanup", zap.Int64("rows_affected", numRowsDeleted))
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The log level for completed cleanup is Info, which might be too verbose for a periodic operation that runs every 30 minutes by default. Consider using Debug level instead, or only log at Info level when numRowsDeleted > 0.

Suggested change
m.logger.Info("completed logs cleanup", zap.Int64("rows_affected", numRowsDeleted))
if numRowsDeleted > 0 {
m.logger.Info("completed logs cleanup", zap.Int64("rows_affected", numRowsDeleted))
} else {
m.logger.Debug("completed logs cleanup", zap.Int64("rows_affected", numRowsDeleted))
}

Copilot uses AI. Check for mistakes.
Comment on lines +163 to +169
panichandler.Go(func() {
select {
case <-closeCh: // normal close, reader is already unsubscribed
return
case <-ctx.Done():
s.unsubscribe(followCh)
}
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential race condition in the subscription cleanup goroutine. If the reader is closed before the context is done, the goroutine will wait on closeCh forever since it only listens for either closeCh or ctx.Done(). The goroutine should also check if the reader was already closed by selecting on the followCh being closed, or the goroutine should be properly terminated when Close() is called. Currently, this could leak goroutines if readers are closed without the context being canceled.

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +31
// ctx is the root context for all connections.
ctx context.Context //nolint:containedctx
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storing a context in a struct field (ctx context.Context) is generally discouraged in Go. While there's a nolint directive acknowledging this (//nolint:containedctx), the comment doesn't explain why it's necessary here. Consider documenting why this pattern is required in this specific case, or refactoring to pass the context through method calls instead.

Copilot uses AI. Check for mistakes.
Enabled: true,
Timeout: 10 * time.Second,
CleanupInterval: 30 * time.Minute,
CleanupOlderThan: 24 * 30 * time.Hour,
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The cleanup threshold is set to 24 * 30 * time.Hour which assumes 30 days always have 24 hours. This could be more clearly expressed as 30 * 24 * time.Hour (720 hours) or better yet, use a constant like 720 * time.Hour with a comment explaining it represents 30 days. The current calculation order is correct mathematically but less intuitive.

Suggested change
CleanupOlderThan: 24 * 30 * time.Hour,
// 30 days
CleanupOlderThan: 30 * 24 * time.Hour,

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

2 participants