feat: prevent sync peers sending local addresses #7359
base: development
This PR prevents peers from sending local addresses during peer sync when peer info is requested. The rationale is that a local node runs peer sync to learn about the network and then tries to contact the peers it learned about. A local node cannot contact a remote peer on that peer's internal address.
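To make the rationale concrete, here is a minimal sketch of the kind of filter it implies. It is not the PR's code: `externally_reachable` is a hypothetical helper, and it works on plain `std::net::IpAddr` values rather than the multiaddresses the peer database stores.

```rust
use std::net::IpAddr;

/// Hypothetical helper: keeps only addresses that a remote node could plausibly dial.
/// Loopback, unspecified and private (RFC 1918) IPv4 addresses are dropped,
/// as are the IPv6 loopback and unspecified addresses.
fn externally_reachable(addresses: &[IpAddr]) -> Vec<IpAddr> {
    addresses
        .iter()
        .copied()
        .filter(|addr| match addr {
            IpAddr::V4(v4) => !(v4.is_loopback() || v4.is_unspecified() || v4.is_private()),
            IpAddr::V6(v6) => !(v6.is_loopback() || v6.is_unspecified()),
        })
        .collect()
}
```

Given `127.0.0.1`, `192.168.1.10` and `8.8.8.8`, only `8.8.8.8` would survive, since it is the only address a node outside the sender's network could reach.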
Walkthrough
These changes introduce a new mechanism for distinguishing and filtering external versus internal network addresses throughout the peer management and discovery system. The schema, database access, and peer selection logic are updated to track, store, and query an
Sequence Diagram(s)
sequenceDiagram
participant DHT
participant PeerManager
participant PeerStorageSql
participant PeerDatabaseSql
participant DB
DHT->>PeerManager: closest_n_active_peers(..., external_addresses_only)
PeerManager->>PeerStorageSql: closest_n_active_peers(..., external_addresses_only)
PeerStorageSql->>PeerDatabaseSql: get_closest_n_active_peers(..., external_addresses_only)
PeerDatabaseSql->>DB: SQL query WHERE is_external = true/false
DB-->>PeerDatabaseSql: Peer rows (filtered)
PeerDatabaseSql-->>PeerStorageSql: Peers
PeerStorageSql-->>PeerManager: Peers
PeerManager-->>DHT: Peers
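The flow in the diagram can be approximated in memory as follows. This is only a sketch under stated assumptions: `PeerRow` is a hypothetical stand-in for a database row, `distance` replaces the real node distance metric, and the sketch assumes that passing `external_addresses_only = false` applies no filter at all, which the diagram does not confirm.

```rust
/// Hypothetical, simplified stand-in for a row in the peer table.
#[derive(Debug, Clone, PartialEq)]
struct PeerRow {
    node_id: &'static str,
    distance: u32,     // stand-in for the distance to the target node
    is_external: bool, // the flag the diagram's query filters on
}

/// Returns up to `n` closest peers.
/// ASSUMPTION: when `external_addresses_only` is false, no address filter is applied.
fn closest_n_active_peers(rows: &[PeerRow], n: usize, external_addresses_only: bool) -> Vec<PeerRow> {
    let mut selected: Vec<PeerRow> = rows
        .iter()
        .filter(|row| !external_addresses_only || row.is_external)
        .cloned()
        .collect();
    // Closest first, then keep at most `n` rows.
    selected.sort_by_key(|row| row.distance);
    selected.truncate(n);
    selected
}
```

In the real code the filter is pushed down into the SQL query, so rows that fail it never leave the database layer.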
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/up.sql (1)
6-39: Migration logic is comprehensive and correctly identifies internal IP ranges.
The string pattern matching approach works well with the multiaddress format and covers all major internal IP ranges:
- IPv4: loopback (127.x.x.x), unspecified (0.0.0.0), and private ranges (10.x.x.x, 172.16-31.x.x, 192.168.x.x)
- IPv6: loopback (::1) and unspecified (::)
The explicit enumeration of 172.16.x.x through 172.31.x.x subnets (lines 15-30) correctly covers the entire 172.16.0.0/12 private range.
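The prefix-matching idea can be sketched in Rust as below. The function name and the exact patterns are hypothetical and are not copied from the migration; the sketch only illustrates why sixteen prefixes, `172.16.` through `172.31.`, are needed to cover 172.16.0.0/12.

```rust
/// Hypothetical prefix check over multiaddr strings such as "/ip4/172.20.1.5/tcp/18189".
/// The patterns are illustrative and are not taken from the migration itself.
fn is_internal_multiaddr(addr: &str) -> bool {
    // Fixed IPv4 prefixes: loopback, 10.0.0.0/8 and 192.168.0.0/16.
    const FIXED_PREFIXES: [&str; 3] = ["/ip4/127.", "/ip4/10.", "/ip4/192.168."];
    if FIXED_PREFIXES.iter().any(|p| addr.starts_with(p)) {
        return true;
    }
    // 172.16.0.0/12 spans second octets 16..=31, hence sixteen prefixes.
    // The trailing dot keeps "172.160.x.x" from matching "172.16.".
    if (16..=31).any(|octet| addr.starts_with(&format!("/ip4/172.{octet}."))) {
        return true;
    }
    // Unspecified and IPv6 loopback addresses must match a whole component,
    // so "/ip6/::" does not also match "/ip6/::ffff:1.2.3.4".
    ["/ip4/0.0.0.0", "/ip6/::1", "/ip6/::"]
        .iter()
        .any(|p| addr == *p || addr.starts_with(&format!("{p}/")))
}
```

With these patterns, `/ip4/172.31.255.254/tcp/18189` is classified as internal while `/ip4/172.32.0.1/tcp/18189` is not.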
Consider potential edge cases for completeness.
While the current implementation covers standard cases well, consider these edge cases if they become relevant:
- IPv6 private/link-local ranges (fc00::/7, fe80::/10)
- IPv4 link-local addresses (169.254.0.0/16)
- IPv4 multicast ranges
For now, the current scope appears appropriate for the use case.
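Should those edge cases become relevant, one possible way to detect them outside SQL is sketched below using the Rust standard library. Both function names are hypothetical. The IPv6 ranges are checked by masking the first 16-bit segment so the sketch does not depend on newer standard-library helpers.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// IPv4 link-local (169.254.0.0/16) or multicast (224.0.0.0/4).
fn is_ipv4_link_local_or_multicast(addr: Ipv4Addr) -> bool {
    addr.is_link_local() || addr.is_multicast()
}

/// IPv6 unique local (fc00::/7) or link-local (fe80::/10),
/// detected by masking the first segment of the address.
fn is_ipv6_local_scope(addr: Ipv6Addr) -> bool {
    let first = addr.segments()[0];
    (first & 0xfe00) == 0xfc00 || (first & 0xffc0) == 0xfe80
}
```

The masks follow directly from the prefix lengths: a /7 prefix fixes the top seven bits (`0xfe00`), and a /10 prefix fixes the top ten (`0xffc0`).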
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
common/config/presets/c_base_node_c.toml (2 hunks)
common/config/presets/d_console_wallet.toml (4 hunks)
comms/core/src/net_address/multiaddr_with_stats.rs (2 hunks)
comms/core/src/peer_manager/manager.rs (9 hunks)
comms/core/src/peer_manager/mod.rs (1 hunks)
comms/core/src/peer_manager/peer.rs (1 hunks)
comms/core/src/peer_manager/peer_storage_sql.rs (5 hunks)
comms/core/src/peer_manager/storage/database.rs (21 hunks)
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/down.sql (1 hunks)
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/up.sql (1 hunks)
comms/core/src/peer_manager/storage/schema.rs (1 hunks)
comms/core/src/peer_validator/error.rs (1 hunks)
comms/dht/src/actor.rs (1 hunks)
comms/dht/src/connectivity/mod.rs (1 hunks)
comms/dht/src/connectivity/test.rs (1 hunks)
comms/dht/src/network_discovery/ready.rs (1 hunks)
comms/dht/src/rpc/peer_info.rs (2 hunks)
comms/dht/src/rpc/service.rs (2 hunks)
🧠 Learnings (19)
📓 Common learnings
Learnt from: hansieodendaal
PR: tari-project/tari#7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: tari-project/tari#6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: hansieodendaal
PR: tari-project/tari#6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
Learnt from: hansieodendaal
PR: tari-project/tari#7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
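The difference between the two matching strategies in the last learning can be illustrated with a small sketch. The bit values below are hypothetical and chosen only for illustration; the real `PeerFeatures` flags may differ.

```rust
/// Hypothetical bit values, chosen for illustration only.
const COMMUNICATION_NODE: u32 = 0b0011;
const EXTRA_FEATURE: u32 = 0b0100;

/// Mirrors `peers.features = ?`: matches only the exact feature set.
fn matches_exactly(features: u32, wanted: u32) -> bool {
    features == wanted
}

/// Mirrors `peers.features & ? != 0`: matches any overlap in bits.
fn matches_any_bit(features: u32, wanted: u32) -> bool {
    (features & wanted) != 0
}
```

A peer with `COMMUNICATION_NODE | EXTRA_FEATURE` passes the bitwise check but fails the exact check, which is the behavior the learning describes as intended.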
comms/core/src/peer_manager/storage/schema.rs (2)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
comms/dht/src/connectivity/test.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
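The last learning above can be illustrated with a stand-in function. Everything here is hypothetical: `VALID_BITS`, `features_from_u32_truncate` and `parse_features` are illustrative names, and the real `PeerFeatures` type and its validity rules may differ.

```rust
/// Hypothetical mask of valid flag bits; the real `PeerFeatures` definition may differ.
const VALID_BITS: u8 = 0b0000_0011;

/// Stand-in for a truncating conversion: the upper 24 bits are discarded, yet the
/// remaining byte can still carry bits outside `VALID_BITS`, in which case `None` is returned.
fn features_from_u32_truncate(bits: u32) -> Option<u8> {
    let truncated = (bits & 0xff) as u8;
    if (truncated & !VALID_BITS) == 0 {
        Some(truncated)
    } else {
        None
    }
}

/// The caller still has to map `None` to an error, here with `ok_or_else`.
fn parse_features(bits: u32) -> Result<u8, String> {
    features_from_u32_truncate(bits).ok_or_else(|| format!("invalid peer feature bits: {bits:#x}"))
}
```

In this sketch `0xffff_ff01` truncates to a valid value, while `0x80` survives truncation unchanged and is still rejected, so the error path cannot be dropped.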
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/down.sql (2)
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
comms/dht/src/rpc/service.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
comms/dht/src/actor.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
comms/core/src/peer_validator/error.rs (4)
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #6963
File: common_sqlite/src/error.rs:88-92
Timestamp: 2025-05-23T07:49:57.349Z
Learning: In the StorageError enum in common_sqlite/src/error.rs, the HexError variant should keep the manual From implementation rather than using #[from] attribute, as it stores a String representation of the error rather than the HexError type itself.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
comms/dht/src/network_discovery/ready.rs (9)
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
comms/dht/src/connectivity/mod.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
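As a rough illustration of the learning above, the sketch below shows why truncating to 8 bits does not by itself guarantee a valid flag set. The mask `VALID_FLAGS` and the helpers `truncate_then_validate` and `parse_features` are hypothetical names chosen for this example and are not the Tari API; only the truncate-then-validate pattern comes from the note.

```rust
/// Hypothetical mask of the bits that correspond to defined flags.
const VALID_FLAGS: u8 = 0b0000_0011;

/// Keeps only the low 8 bits of `value`, then rejects any bit outside `VALID_FLAGS`.
fn truncate_then_validate(value: u32) -> Option<u8> {
    let truncated = (value & 0xFF) as u8;
    if (truncated & !VALID_FLAGS) == 0 {
        Some(truncated)
    } else {
        None
    }
}

/// Converts the `None` case into an error, as a caller would with `.ok_or_else()`.
fn parse_features(value: u32) -> Result<u8, String> {
    truncate_then_validate(value)
        .ok_or_else(|| format!("invalid feature bits: {:#010x}", value))
}
```

Under these assumptions, bits above the low byte are silently dropped, while an undefined bit inside the low byte still produces `None`, so the caller needs the error path either way.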
comms/core/src/peer_manager/mod.rs (4)
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
comms/core/src/peer_manager/peer.rs (5)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/up.sql (2)
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
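The difference between the two predicates in the learning above can be sketched outside SQL. The flag values below are hypothetical placeholders, not the real Tari constants, and the two helper functions are illustrative names that mirror `peers.features = ?` and `peers.features & ? != 0` respectively.

```rust
/// Hypothetical flag values for illustration only.
const COMMUNICATION_NODE: u8 = 0b0000_0001;
const EXTRA_FLAG: u8 = 0b0000_0010;

/// Mirrors `peers.features = ?`: matches only when the flag sets are identical.
fn matches_exact(features: u8, wanted: u8) -> bool {
    features == wanted
}

/// Mirrors `peers.features & ? != 0`: matches when any wanted bit is set.
fn matches_any_bit(features: u8, wanted: u8) -> bool {
    (features & wanted) != 0
}
```

With these assumed values, a peer carrying `COMMUNICATION_NODE | EXTRA_FLAG` passes the bitwise test but fails the exact test, which is the exclusion the note describes as intended.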
comms/dht/src/rpc/peer_info.rs (5)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
comms/core/src/net_address/multiaddr_with_stats.rs (6)
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
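The retry condition quoted in the learning above can be written as a small predicate. This is a minimal sketch: the value of `NUM_RETRIES` and the function name `should_retry` are assumptions made for the example, and the boolean parameters stand in for `peers.is_empty()` and `conn.is_connected()`.

```rust
/// Hypothetical retry limit; the real value is not stated in the note.
const NUM_RETRIES: usize = 3;

/// Returns true only for the mid-stream disconnection case:
/// no peers were received, the connection has dropped, and attempts remain.
fn should_retry(peers_is_empty: bool, conn_is_connected: bool, attempt: usize) -> bool {
    peers_is_empty && !conn_is_connected && attempt < NUM_RETRIES
}
```

An empty result on a connection that is still up does not trigger a retry under this predicate, which matches the note's point that the logic is not meant for general connection failures.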
common/config/presets/c_base_node_c.toml (4)
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
common/config/presets/d_console_wallet.toml (9)
Learnt from: SolfataraEmit
PR: #6994
File: docs/src/API_GRPC_Explanation.md:236-244
Timestamp: 2025-04-28T11:41:15.722Z
Learning: The minotari_console_wallet reports a fourth field `timelocked_balance` in the terminal output for the `getBalance` method, which should be reflected in the documentation for accuracy. The `wallet.proto` file confirms that `timelocked_balance` is defined as the fourth field in the `GetBalanceResponse` message.
Learnt from: SolfataraEmit
PR: #6994
File: docs/src/API_GRPC_Explanation.md:236-244
Timestamp: 2025-04-28T11:41:15.722Z
Learning: The minotari_console_wallet reports a fourth field `timelocked_balance` in the terminal output for the `getBalance` method, which should be reflected in the documentation for accuracy.
Learnt from: SWvheerden
PR: #7301
File: clients/rust/base_node_wallet_client/src/client/http.rs:70-106
Timestamp: 2025-07-11T06:07:36.367Z
Learning: In clients/rust/base_node_wallet_client/src/client/http.rs, the HTTP client server selection logic is intentionally designed to try the local API once and then fallback to the seed server permanently without retry mechanisms. This is the preferred behavior - they want to use local API if available, otherwise use seed server, without continuously retrying the local API.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
comms/core/src/peer_manager/storage/database.rs (12)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1655-1658
Timestamp: 2025-05-29T09:40:09.356Z
Learning: In the Tari codebase, node_id hex strings in the database are guaranteed to be valid because they can only be added via `update_peer_sql(peer: Peer)`, which converts from valid NodeId objects, ensuring data integrity at the insertion layer.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
comms/core/src/peer_manager/peer_storage_sql.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
comms/core/src/peer_manager/manager.rs (11)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
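The mid-stream disconnection retry condition quoted in the seed_strap.rs learning above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name `should_retry`, its plain boolean parameters, and the value chosen for `NUM_RETRIES` are hypothetical, and the real `get_peers` code works on a connection object rather than on flags.

```rust
// Hypothetical value; the actual constant in seed_strap.rs is not shown in this review.
const NUM_RETRIES: usize = 3;

/// Returns true only for the mid-stream disconnection case described in the
/// learning: no peers were received, the connection is no longer connected,
/// and there are retries left.
fn should_retry(peers_is_empty: bool, conn_is_connected: bool, attempt: usize) -> bool {
    peers_is_empty && !conn_is_connected && attempt < NUM_RETRIES
}
```

Under this reading, an empty result on a connection that is still up is not retried, and neither is a non-empty result, which matches the note that the retry is not meant for general connection failures.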
🧬 Code Graph Analysis (3)
comms/core/src/peer_manager/storage/schema.rs (1)
comms/core/src/net_address/multiaddr_with_stats.rs (1)
is_external (148-156)
comms/core/src/peer_manager/mod.rs (1)
comms/core/src/peer_manager/manager.rs (3)
create_test_peer (361-408)
create_test_peer_add_internal_addresses (411-416)
create_test_peer_internal_addresses_only (419-441)
comms/core/src/peer_manager/storage/database.rs (4)
comms/core/src/net_address/multiaddr_with_stats.rs (3)
address (143-145)
is_external (148-156)
new (43-59)
comms/core/src/peer_manager/peer_storage_sql.rs (4)
format (676-677)
discovery_syncing_peers_with_external_addresses_only (807-838)
create_test_peer (659-695)
all (182-184)
common_sqlite/src/connection.rs (1)
connect_temp_file_and_migrate (197-211)
comms/core/src/peer_manager/peer.rs (1)
new (103-127)
🧰 Additional context used
🧠 Learnings (19)
📓 Common learnings
Learnt from: hansieodendaal
PR: tari-project/tari#7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: tari-project/tari#6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: hansieodendaal
PR: tari-project/tari#6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
Learnt from: hansieodendaal
PR: tari-project/tari#7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
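The difference between the two SQL predicates in the learning above can be illustrated in plain Rust. This is a minimal sketch: the flag values and the names `OTHER_FEATURE`, `matches_exact`, and `matches_bitwise` are assumptions made for the example and are not taken from the Tari codebase.

```rust
// Hypothetical flag values, for illustration only.
const COMMUNICATION_NODE: u32 = 0b0000_0001;
const OTHER_FEATURE: u32 = 0b0000_0010;

/// Mirrors `peers.features = ?`: true only when the stored flags are exactly `wanted`.
fn matches_exact(features: u32, wanted: u32) -> bool {
    features == wanted
}

/// Mirrors `peers.features & ? != 0`: true when any bit of `wanted` is set.
fn matches_bitwise(features: u32, wanted: u32) -> bool {
    features & wanted != 0
}
```

With these definitions, a peer whose flags are `COMMUNICATION_NODE | OTHER_FEATURE` is selected by the bitwise form but not by the exact form, which is the exclusion the learning describes as intended.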
comms/core/src/peer_manager/storage/schema.rs (2)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
comms/dht/src/connectivity/test.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
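To show why truncation alone does not make the conversion infallible, here is a self-contained sketch of the behaviour the learning above describes. The `Features` type, its `KNOWN_BITS` mask, and `parse_features` are hypothetical stand-ins written for this example; they are not the real `PeerFeatures` implementation, and the bit values are assumptions.

```rust
/// Hypothetical stand-in for a flags type; the known bits are an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Features(u8);

impl Features {
    const KNOWN_BITS: u8 = 0b0000_0011;

    /// Rejects any value that has a bit set outside `KNOWN_BITS`.
    fn from_bits(bits: u8) -> Option<Self> {
        if bits & !Self::KNOWN_BITS == 0 {
            Some(Features(bits))
        } else {
            None
        }
    }

    /// Keeps only the low 8 bits of the u32, then validates what is left.
    fn from_bits_u32_truncate(bits: u32) -> Option<Self> {
        Self::from_bits((bits & 0xff) as u8)
    }
}

/// The caller still needs `.ok_or_else()` because the low byte may be invalid.
fn parse_features(bits: u32) -> Result<Features, String> {
    Features::from_bits_u32_truncate(bits)
        .ok_or_else(|| format!("invalid peer features: {bits:#x}"))
}
```

In this sketch, high bits above the low byte are discarded silently, while an unknown bit inside the low byte still yields an error.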
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/down.sql (2)
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
comms/dht/src/rpc/service.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
comms/dht/src/actor.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
comms/core/src/peer_validator/error.rs (4)
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the `source` field of the `multi_addresses` table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #6963
File: common_sqlite/src/error.rs:88-92
Timestamp: 2025-05-23T07:49:57.349Z
Learning: In the StorageError enum in common_sqlite/src/error.rs, the HexError variant should keep the manual From implementation rather than using #[from] attribute, as it stores a String representation of the error rather than the HexError type itself.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
comms/dht/src/network_discovery/ready.rs (9)
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
comms/dht/src/connectivity/mod.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
comms/core/src/peer_manager/mod.rs (4)
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
comms/core/src/peer_manager/peer.rs (5)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the `hard_delete_all_stale_peers` method in `comms/core/src/peer_manager/storage/database.rs`, the SQL query intentionally uses exact equality (`peers.features = ?`) rather than bitwise operations (`peers.features & ? != 0`) when matching `COMMUNICATION_NODE` features. This is the intended behavior to match only peers with exactly the `COMMUNICATION_NODE` feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The `PeerFeatures::from_bits_u32_truncate` method truncates a u32 to u8 bits but can still return `None` if the resulting bits don't match any valid flags, making the error handling with `.ok_or_else()` necessary even after truncation.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/up.sql (2)
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the source field of the multi_addresses table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the hard_delete_all_stale_peers method in comms/core/src/peer_manager/storage/database.rs, the SQL query intentionally uses exact equality (peers.features = ?) rather than bitwise operations (peers.features & ? != 0) when matching COMMUNICATION_NODE features. This is the intended behavior to match only peers with exactly the COMMUNICATION_NODE feature, excluding those with additional feature flags.
comms/dht/src/rpc/peer_info.rs (5)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the hard_delete_all_stale_peers method in comms/core/src/peer_manager/storage/database.rs, the SQL query intentionally uses exact equality (peers.features = ?) rather than bitwise operations (peers.features & ? != 0) when matching COMMUNICATION_NODE features. This is the intended behavior to match only peers with exactly the COMMUNICATION_NODE feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
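The retry condition quoted in this learning can be written as a small predicate. This is a minimal sketch: the function name should_retry and the value assigned to NUM_RETRIES are assumptions made for illustration, since the learning gives the condition but not the constant's value or the surrounding code.

```rust
/// Hypothetical retry budget; the actual NUM_RETRIES value is not stated in the review.
const NUM_RETRIES: usize = 3;

/// Mirrors the quoted condition
/// `peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES`:
/// retry only when nothing was received, the connection has dropped,
/// and the retry budget is not yet exhausted.
fn should_retry(peer_count: usize, is_connected: bool, attempt: usize) -> bool {
    peer_count == 0 && !is_connected && attempt < NUM_RETRIES
}
```

An empty result on a connection that is still open does not trigger a retry under this predicate, which matches the learning's point that the logic targets mid-stream disconnections rather than general failures.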
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the source field of the multi_addresses table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The PeerFeatures::from_bits_u32_truncate method truncates a u32 to u8 bits but can still return None if the resulting bits don't match any valid flags, making the error handling with .ok_or_else() necessary even after truncation.
comms/core/src/net_address/multiaddr_with_stats.rs (6)
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
common/config/presets/c_base_node_c.toml (4)
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
common/config/presets/d_console_wallet.toml (9)
Learnt from: SolfataraEmit
PR: #6994
File: docs/src/API_GRPC_Explanation.md:236-244
Timestamp: 2025-04-28T11:41:15.722Z
Learning: The minotari_console_wallet reports a fourth field timelocked_balance in the terminal output for the getBalance method, which should be reflected in the documentation for accuracy. The wallet.proto file confirms that timelocked_balance is defined as the fourth field in the GetBalanceResponse message.
Learnt from: SolfataraEmit
PR: #6994
File: docs/src/API_GRPC_Explanation.md:236-244
Timestamp: 2025-04-28T11:41:15.722Z
Learning: The minotari_console_wallet reports a fourth field timelocked_balance in the terminal output for the getBalance method, which should be reflected in the documentation for accuracy.
Learnt from: SWvheerden
PR: #7301
File: clients/rust/base_node_wallet_client/src/client/http.rs:70-106
Timestamp: 2025-07-11T06:07:36.367Z
Learning: In clients/rust/base_node_wallet_client/src/client/http.rs, the HTTP client server selection logic is intentionally designed to try the local API once and then fallback to the seed server permanently without retry mechanisms. This is the preferred behavior - they want to use local API if available, otherwise use seed server, without continuously retrying the local API.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
comms/core/src/peer_manager/storage/database.rs (12)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the hard_delete_all_stale_peers method in comms/core/src/peer_manager/storage/database.rs, the SQL query intentionally uses exact equality (peers.features = ?) rather than bitwise operations (peers.features & ? != 0) when matching COMMUNICATION_NODE features. This is the intended behavior to match only peers with exactly the COMMUNICATION_NODE feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the source field of the multi_addresses table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1655-1658
Timestamp: 2025-05-29T09:40:09.356Z
Learning: In the Tari codebase, node_id hex strings in the database are guaranteed to be valid because they can only be added via update_peer_sql(peer: Peer), which converts from valid NodeId objects, ensuring data integrity at the insertion layer.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
comms/core/src/peer_manager/peer_storage_sql.rs (10)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the hard_delete_all_stale_peers method in comms/core/src/peer_manager/storage/database.rs, the SQL query intentionally uses exact equality (peers.features = ?) rather than bitwise operations (peers.features & ? != 0) when matching COMMUNICATION_NODE features. This is the intended behavior to match only peers with exactly the COMMUNICATION_NODE feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/manager.rs:60-68
Timestamp: 2025-05-26T02:40:23.812Z
Learning: PeerDatabaseSql in the Tari codebase has been specifically refactored to handle concurrent access and mitigate blocking I/O concerns on async executor threads. The implementation has been tested under high load at both system level and through unit tests like test_concurrent_add_or_update_and_get_closest_peers which validates concurrent read/write operations.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: hansieodendaal
PR: #7358
File: comms/core/src/peer_manager/storage/database.rs:566-570
Timestamp: 2025-07-21T16:03:14.260Z
Learning: In the Tari peer database, there was a known issue with JSON serialization corruption in the source field of the multi_addresses table, causing approximately 0.4% of peer validation failures. The migration to Borsh serialization (2025-07-21-170500_peer_address_source) intentionally uses a destructive approach (dropping and recreating tables) because the existing JSON data contains corruption that cannot be reliably converted. This data loss is acceptable to ensure data integrity going forward.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
comms/core/src/peer_manager/manager.rs (11)
Learnt from: hansieodendaal
PR: #7123
File: comms/core/src/peer_manager/storage/database.rs:1517-1541
Timestamp: 2025-05-29T09:42:20.881Z
Learning: In the hard_delete_all_stale_peers method in comms/core/src/peer_manager/storage/database.rs, the SQL query intentionally uses exact equality (peers.features = ?) rather than bitwise operations (peers.features & ? != 0) when matching COMMUNICATION_NODE features. This is the intended behavior to match only peers with exactly the COMMUNICATION_NODE feature, excluding those with additional feature flags.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In comms/core/src/connectivity/manager.rs, the disconnect_if_unused call uses parameters (0, 0) intentionally to be conservative and avoid terminating connections that might be in use. This approach is used while troubleshooting inbound connection drop issues to eliminate the disconnect logic as a potential cause.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the fetch_peers_from_connection and collect_peer_stream functions rely on RPC streaming, and when the main connection is closed by another process, collect_peer_stream times out after STREAM_ITEM_TIMEOUT because it cannot detect that the peer can no longer respond, returning an empty vector of peers. This is why the connection state check is important for the retry logic.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: SWvheerden
PR: #6951
File: base_layer/core/src/base_node/tari_pulse_service/mod.rs:327-352
Timestamp: 2025-04-16T07:06:53.981Z
Learning: The discovery_peer and dial_peer methods in the Tari codebase have built-in timeout mechanisms, so adding explicit timeouts with tokio::time::timeout is unnecessary.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the NUM_RETRIES logic in get_peers is specifically designed to handle peer connections that are closed while trying to RPC stream peer info, not general connection failures. The retry logic only applies when peers.is_empty() && !conn.is_connected() && attempt < NUM_RETRIES, which indicates a mid-stream disconnection.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:352-456
Timestamp: 2025-07-09T08:33:29.320Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the context.connectivity.dial_peer method should fail fast and return an error if a peer cannot be dialed, rather than requiring retry logic for general connection failures.
Learnt from: hansieodendaal
PR: #6963
File: comms/core/src/peer_manager/storage/migrations/2025-04-14-072200_initial/up.sql:24-41
Timestamp: 2025-05-02T14:07:10.892Z
Learning: The peer system design requires each network address to be uniquely associated with exactly one peer, and an address cannot be reused across multiple peers.
Learnt from: hansieodendaal
PR: #7294
File: comms/dht/src/network_discovery/seed_strap.rs:721-735
Timestamp: 2025-07-09T08:13:37.206Z
Learning: In comms/dht/src/network_discovery/seed_strap.rs, the 10-second STREAM_ITEM_TIMEOUT and retry logic are intentionally designed to handle service conflicts where other services kill seed peer connections during seedstrap operations. The underlying discovery_peer/dial_peer API timeouts are too lenient for seedstrap use cases, so the more aggressive timeout with retry logic is appropriate and necessary.
Learnt from: hansieodendaal
PR: #6963
File: comms/dht/src/proto/mod.rs:141-142
Timestamp: 2025-05-02T07:12:23.985Z
Learning: The PeerFeatures::from_bits_u32_truncate method truncates a u32 to u8 bits but can still return None if the resulting bits don't match any valid flags, making the error handling with .ok_or_else() necessary even after truncation.
Learnt from: hansieodendaal
PR: #7307
File: comms/core/src/connection_manager/peer_connection.rs:356-357
Timestamp: 2025-07-12T03:43:22.545Z
Learning: In base_layer/core/src/base_node/tari_pulse_service/mod.rs, the disconnect_if_unused call uses parameters (0, 2) because the tari_pulse_service creates exactly 2 substreams and 0 RPC sessions during health checks.
🧬 Code Graph Analysis (3)
comms/core/src/peer_manager/storage/schema.rs (1)
comms/core/src/net_address/multiaddr_with_stats.rs (1)
- is_external (148-156)
comms/core/src/peer_manager/mod.rs (1)
comms/core/src/peer_manager/manager.rs (3)
- create_test_peer (361-408)
- create_test_peer_add_internal_addresses (411-416)
- create_test_peer_internal_addresses_only (419-441)
comms/core/src/peer_manager/storage/database.rs (4)
comms/core/src/net_address/multiaddr_with_stats.rs (3)
- address (143-145)
- is_external (148-156)
- new (43-59)
comms/core/src/peer_manager/peer_storage_sql.rs (4)
- format (676-677)
- discovery_syncing_peers_with_external_addresses_only (807-838)
- create_test_peer (659-695)
- all (182-184)
common_sqlite/src/connection.rs (1)
- connect_temp_file_and_migrate (197-211)
comms/core/src/peer_manager/peer.rs (1)
- new (103-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: test (mainnet, stagenet)
- GitHub Check: test (nextnet, nextnet)
- GitHub Check: cargo check with stable
- GitHub Check: test (testnet, esmeralda)
- GitHub Check: ci
- GitHub Check: ledger build tests
- GitHub Check: Cucumber tests / Base Layer
- GitHub Check: Cucumber tests / FFI
🔇 Additional comments (41)
comms/dht/src/connectivity/test.rs (1)
123-123: Test updated correctly for new API signature.
The addition of the false parameter correctly aligns with the new external_addresses_only parameter in the closest_n_active_peers method. Using false is appropriate for this test as it allows both internal and external addresses to be considered.
comms/core/src/peer_manager/storage/schema.rs (1)
37-37: Schema addition supports external address filtering.
The new is_external boolean column is correctly positioned and will enable efficient database-level filtering of peers by external address status, directly supporting the PR's objective to prevent local addresses from being shared during peer sync.
comms/dht/src/connectivity/mod.rs (1)
987-987: Correctly filters for external addresses only in DHT connectivity.
Setting external_addresses_only to true in fetch_neighbouring_peers is the correct implementation to ensure only peers with external addresses are selected for DHT connectivity, directly addressing the PR's objective to prevent local addresses from being used in peer synchronization.
comms/dht/src/actor.rs (1)
798-798: LGTM! Correctly filters peers to external addresses only for propagation.
This change ensures that peer propagation only considers peers with external addresses, which aligns perfectly with the PR objective of preventing local address propagation during peer sync.
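To make the external versus local distinction in these comments concrete, the following is a rough sketch of how an IP address might be classified. It is not the PR's is_external implementation, which works on multiaddresses and may treat other address types (for example onion or DNS addresses) differently; the function name is_external_ip and the exact set of excluded ranges are assumptions for illustration.

```rust
use std::net::IpAddr;

/// Returns true when `ip` looks publicly routable, false for loopback,
/// private, link-local, and unspecified addresses.
/// Hypothetical helper; the excluded ranges are an assumption, not the PR's rules.
fn is_external_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            !(v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified())
        },
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            // fc00::/7 is the unique local range, fe80::/10 is link-local unicast.
            let unique_local = (first & 0xfe00) == 0xfc00;
            let link_local = (first & 0xffc0) == 0xfe80;
            !(v6.is_loopback() || v6.is_unspecified() || unique_local || link_local)
        },
    }
}
```

Under these assumptions, 127.0.0.1 and 192.168.1.10 are classified as local while 8.8.8.8 is classified as external, so only the last would be worth sharing with a remote peer.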
comms/core/src/peer_manager/storage/migrations/2025-07-19-085200_external_flag/down.sql (1)
1-2: Migration script correctly removes the external flag column.
The SQL syntax is correct for dropping the is_external column. Note that running this down migration will permanently lose the external/internal address classification data, which is expected behavior for reversing the feature.
comms/dht/src/network_discovery/ready.rs (1)
65-65: Correctly filters network discovery peers to external addresses only.
This ensures network discovery only considers peers with external addresses when selecting peers for discovery rounds, preventing attempts to discover peers through internal addresses that wouldn't be reachable.
comms/dht/src/rpc/service.rs (2)
169-169: Correctly filters returned peers to those with external addresses only. This ensures the `get_closer_peers` RPC method only returns peers that have external addresses, preventing local addresses from being shared with requesting peers.
211-211: Properly filters discovery peers to external addresses only. The `get_peers` RPC method now correctly filters to peers with external addresses, ensuring remote peers only receive information about peers they can actually connect to.
comms/core/src/peer_validator/error.rs (1)
34-34: Improved error message accuracy. The updated error message correctly describes the actual error condition (peer has no address claims) rather than the misleading "was banned" message. This improves debugging and aligns with the enhanced address validation logic in this PR.
comms/core/src/peer_manager/peer.rs (1)
328-354: LGTM! Well-implemented external address filtering method. The implementation correctly filters addresses to include only external ones and handles the case where no external addresses exist. The method preserves all peer fields appropriately while creating a new instance with filtered addresses.
The `#[allow(dead_code)]` annotation suggests this method isn't used yet, which aligns with the summary indicating it's used in the database layer for filtering query results.
comms/core/src/peer_manager/mod.rs (1)
101-106: LGTM! Appropriate expansion of test utilities. The addition of `create_test_peer_add_internal_addresses` and `create_test_peer_internal_addresses_only` to the public test interface enables comprehensive testing of the new external/internal address filtering functionality. These utilities complement the existing `create_test_peer` function well.
comms/dht/src/rpc/peer_info.rs (2)
42-42: Good catch! Typo fix improves code quality. The parameter name correction from `max_addresse_per_claim` to `max_addresses_per_claim` fixes a spelling error and improves code readability.
54-54: Consistent usage of the corrected parameter name. The usage of the corrected parameter name maintains consistency with the method signature fix.
comms/core/src/net_address/multiaddr_with_stats.rs (2)
16-16: Import addition supports the new external address detection. Adding `Protocol` to the import enables accessing individual protocol components within multiaddresses for the new `is_external` method.
147-156: Excellent implementation of external address detection. The `is_external` method correctly identifies internal addresses using robust standard library methods:
- IPv4: Uses `is_loopback()`, `is_unspecified()`, and `is_private()`, which cover the loopback (127.0.0.0/8), unspecified (0.0.0.0), and RFC 1918 private ranges
- IPv6: Covers loopback (::1) and unspecified (::) addresses
- Other protocols: Defaults to external (appropriate for onion3, etc.)
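The classification described in the list above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: it works on a plain `IpAddr` instead of a multiaddr, and the function names `is_external_v4`, `is_external_v6`, and `is_external_ip` are hypothetical.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical helper: an IPv4 address counts as external unless it is
/// loopback, unspecified, or in an RFC 1918 private range.
pub fn is_external_v4(addr: Ipv4Addr) -> bool {
    !(addr.is_loopback() || addr.is_unspecified() || addr.is_private())
}

/// Hypothetical helper: an IPv6 address counts as external unless it is
/// loopback (::1) or unspecified (::).
pub fn is_external_v6(addr: Ipv6Addr) -> bool {
    !(addr.is_loopback() || addr.is_unspecified())
}

/// Hypothetical stand-in for the review's `is_external`, dispatching on the
/// IP version. Non-IP protocols (for example onion3) are out of scope here;
/// per the review they default to external.
pub fn is_external_ip(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => is_external_v4(v4),
        IpAddr::V6(v6) => is_external_v6(v6),
    }
}
```

Note that `Ipv4Addr::is_private()` only matches 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16, so under this sketch a link-local address such as 169.254.1.1 would still be classed as external.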
Relying on standard library methods is more reliable than string pattern matching and aligns with the migration logic in `up.sql`.
common/config/presets/d_console_wallet.toml (2)
62-65: Configuration enhancement for custom base node specification. The new `custom_base_node` option provides a way to specify a custom base node peer for metadata retrieval, which aligns with the PR objective of improving peer management capabilities.
343-372: Comprehensive network discovery configuration options added. These new configuration parameters provide fine-grained control over the bootstrap and peer sync process, including:
- Maximum seed peers to try during bootstrap
- Minimum peers needed for early exit conditions
- Bootstrap timeout settings
- Ban thresholds for misbehaving peers
The configuration options are well-documented and align with the PR's goal of improving peer sync behavior by filtering external addresses.
common/config/presets/c_base_node_c.toml (1)
297-326: Network discovery configuration options aligned with wallet preset. The addition of these commented configuration options provides consistent fine-tuning capabilities for network discovery across both base node and wallet presets. The parameters control bootstrap behavior, peer sync thresholds, and ban policies, supporting the broader peer filtering enhancements in this PR.
comms/core/src/peer_manager/peer_storage_sql.rs (5)
193-213: External address filtering parameter added to discovery syncing. The `external_addresses_only` parameter is properly integrated into the method signature and forwarded to the underlying database call. This enables filtering peers to only those with external addresses during peer sync operations, directly supporting the PR objective.
232-254: External address filtering parameter added to closest peers selection. The `external_addresses_only` parameter is consistently implemented across both peer discovery methods, maintaining API consistency and enabling the same filtering capability for closest peer selection.
393-393: Test helper import added for internal address testing. The import of `create_test_peer_add_internal_addresses` supports the new test functionality for verifying external address filtering behavior.
799-799: Existing test updated to use new parameter. The test correctly passes `true` for the `external_addresses_only` parameter, maintaining existing test behavior while adapting to the new API.
806-838: Comprehensive test for external address filtering functionality. The new test thoroughly verifies the external address filtering behavior by:
- Creating peers with both internal and external addresses
- Verifying that without filtering, peers contain both address types
- Confirming that with filtering enabled, only external addresses are returned
This provides excellent coverage for the core functionality introduced in this PR.
comms/core/src/peer_manager/storage/database.rs (11)
540-544: LGTM! The addition of the `is_external` field correctly captures the external/internal status of addresses during peer insertion.
643-647: LGTM! Consistent implementation for updating the `is_external` field during peer updates.
1369-1378: LGTM! The addition of the `external_addresses_only` parameter enables filtering at the database level, which is more efficient than post-query filtering.
1400-1403: LGTM! Efficient SQL-level filtering for external addresses using the `is_external` column.
1448-1472: LGTM! Proper propagation of the `external_addresses_only` parameter through the method chain.
1674-1680: LGTM! Struct field additions properly support the new database schema with the `is_external` column.
Also applies to: 1693-1697
1770-1774: LGTM! The `From` trait implementations correctly handle the `is_external` field in both conversion scenarios.
Also applies to: 1853-1859
1887-1888: LGTM! Test helper imports support the new external address filtering test.
2106-2115: LGTM! Test calls properly updated with `false` for `external_addresses_only` to maintain existing test behavior.
Also applies to: 2120-2129, 2134-2143, 2149-2158, 2300-2309, 2334-2343, 2371-2380
2636-2684: Well-structured test for external address filtering! The test comprehensively verifies:
- Peers with mixed addresses are correctly identified
- External-only filtering excludes peers with only internal addresses
- Returned peers have only external addresses when filtered
1517-1548: `external_addresses_only` filtering is intentional and required
The SQL filter on `multi_addresses.is_external` ensures we only select peers that have at least one external address, while `as_peer_with_external_addresses_only()` then strips out any internal addresses from the returned `Peer` instances. These two steps serve different purposes and are both necessary:
- SQL filter (in `get_active_peer_node_ids`): includes only peers with at least one external address in the candidate set.
- `as_peer_with_external_addresses_only()`: constructs a new `Peer` containing only external addresses and returns `None` if none exist (guarding against unexpected race conditions).
No redundant checks: this double layer guarantees both correct peer selection and that returned peers carry only external addresses. No changes needed here.
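The two-stage filtering described in this comment can be sketched in memory as follows. This is an illustration under assumptions, not the PR's implementation: the `Peer` struct is a simplified stand-in with only a `node_id` and a list of `IpAddr` values, `select_peers` is a hypothetical function that plays the role of the SQL query, and only the method name `as_peer_with_external_addresses_only` is taken from the review.

```rust
use std::net::IpAddr;

/// Simplified, hypothetical stand-in for the real `Peer` type.
#[derive(Clone, Debug, PartialEq)]
pub struct Peer {
    pub node_id: u32,
    pub addresses: Vec<IpAddr>,
}

/// Assumed classification: loopback, unspecified, and RFC 1918 addresses
/// are internal; everything else is external.
fn is_external(addr: &IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => !(v4.is_loopback() || v4.is_unspecified() || v4.is_private()),
        IpAddr::V6(v6) => !(v6.is_loopback() || v6.is_unspecified()),
    }
}

impl Peer {
    /// Returns a copy carrying only external addresses, or `None` when the
    /// peer has no external address at all.
    pub fn as_peer_with_external_addresses_only(&self) -> Option<Peer> {
        let addresses: Vec<IpAddr> = self
            .addresses
            .iter()
            .copied()
            .filter(|addr| is_external(addr))
            .collect();
        if addresses.is_empty() {
            None
        } else {
            Some(Peer { node_id: self.node_id, addresses })
        }
    }
}

/// Hypothetical in-memory equivalent of the query plus post-processing.
pub fn select_peers(all: &[Peer], external_addresses_only: bool) -> Vec<Peer> {
    if !external_addresses_only {
        return all.to_vec();
    }
    all.iter()
        // Stage 1 (the SQL filter in the PR): keep only candidates with at
        // least one external address.
        .filter(|peer| peer.addresses.iter().any(is_external))
        // Stage 2: strip internal addresses from each surviving peer.
        .filter_map(Peer::as_peer_with_external_addresses_only)
        .collect()
}
```

With `external_addresses_only` set to `false` the sketch returns every peer unchanged, which mirrors how the existing tests in the review keep their previous behavior.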
comms/core/src/peer_manager/manager.rs (7)
170-174: LGTM! Parameter addition aligns with PR objectives. The addition of the `external_addresses_only` parameter to `discovery_syncing` properly supports filtering peers to only those with external addresses, which aligns with the PR objective of preventing local addresses from being sent during peer sync.
227-239: LGTM! Parameter addition is properly implemented. The `external_addresses_only` parameter addition to `closest_n_active_peers` is correctly implemented and propagated to the underlying storage layer call.
374-378: LGTM! IP generation logic excludes internal address ranges. The modification to exclude ranges 10 and 127 from the first octet ensures that `create_test_peer` generates external addresses by default, which aligns with the PR's focus on distinguishing between internal and external addresses.
410-416: LGTM! Well-designed test helper function. The `create_test_peer_add_internal_addresses` function provides a clean way to create test peers with both external and internal addresses, supporting comprehensive testing of the new address filtering functionality.
418-441: LGTM! Comprehensive test helper for internal-only peers. The `create_test_peer_internal_addresses_only` function correctly creates peers with only internal addresses, which is essential for testing the filtering logic that should exclude such peers during sync operations.
443-511: LGTM! Comprehensive coverage of internal address ranges. The `add_internal_addresses` function provides excellent coverage of internal address ranges including:
- IPv4: loopback (127.x.x.x), unspecified (0.0.0.0), and private ranges (10.x.x.x, 172.16-31.x.x, 192.168.x.x)
- IPv6: loopback (::1) and unspecified (::)
The randomization and port range differentiation enhance test coverage and debugging capabilities.
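As a rough illustration of the ranges listed above, a test fixture could produce one representative address per internal range. This sketch is an assumption, not the PR's `add_internal_addresses`: the function name `internal_address_samples` is hypothetical, it derives octets from a caller-supplied `seed` instead of a random number generator, and it omits ports.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical fixture: one address from each internal range named in the
/// review. `seed` varies the host octets deterministically.
pub fn internal_address_samples(seed: u8) -> Vec<IpAddr> {
    vec![
        // IPv4 loopback, 127.0.0.0/8
        IpAddr::V4(Ipv4Addr::new(127, seed, seed, 1)),
        // IPv4 unspecified, 0.0.0.0
        IpAddr::V4(Ipv4Addr::UNSPECIFIED),
        // RFC 1918 private range 10.0.0.0/8
        IpAddr::V4(Ipv4Addr::new(10, seed, seed, seed)),
        // RFC 1918 private range 172.16.0.0/12 (second octet 16..=31)
        IpAddr::V4(Ipv4Addr::new(172, 16 + seed % 16, seed, seed)),
        // RFC 1918 private range 192.168.0.0/16
        IpAddr::V4(Ipv4Addr::new(192, 168, seed, seed)),
        // IPv6 loopback, ::1
        IpAddr::V6(Ipv6Addr::LOCALHOST),
        // IPv6 unspecified, ::
        IpAddr::V6(Ipv6Addr::UNSPECIFIED),
    ]
}
```

A deterministic seed keeps failures reproducible; a randomized variant, as the review describes, trades that for broader coverage across runs.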
595-595: LGTM! Test updates maintain backward compatibility. The existing test calls correctly pass `false` for the new `external_addresses_only` parameter, maintaining backward compatibility and existing test behavior while allowing the new functionality to be tested separately.
Also applies to: 630-630, 843-843, 878-878
…adresses # Conflicts: # common/config/presets/c_base_node_c.toml # common/config/presets/d_console_wallet.toml
Description
This PR prevents peers from sending local addresses during peer sync when peer info is requested. The rationale is that a local node performs peer sync to learn about the network and then attempts to contact those peers, which it cannot do via a remote peer's internal address.
Motivation and Context
How Has This Been Tested?
System-level testing for migrations.
Unit tests have been added to verify that only peers with their external addresses are supplied when requested as such.
What process can a PR reviewer use to test or verify this change?
Code review.
System-level test.
Breaking Changes
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Style
Tests