
Conversation

@Karrq Karrq (Contributor) commented Oct 20, 2025

This PR addresses part of the TODOs remaining in the backend code.

⚠️ Breaking Changes ⚠️

Configuration BREAKING changes:

  • Add configuration items for auth expirations
  • Add a configuration item for the Node callback URL
  • Configurable number of retries for uploads (see the sketch after this list)
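
A minimal sketch of what the new configuration surface could look like, assuming serde-based config parsing. Only the three items above come from the PR; all field names and the seconds unit are assumptions:

```rust
// Hypothetical config shape; field names are illustrative, not the PR's.
#[derive(Debug, serde::Deserialize)]
pub struct AuthConfig {
    /// Lifetime of an issued auth token, in seconds (assumed unit).
    pub token_expiration_secs: u64,
    /// Lifetime of a pending auth challenge, in seconds (assumed unit).
    pub challenge_expiration_secs: u64,
}

#[derive(Debug, serde::Deserialize)]
pub struct BackendConfig {
    pub auth: AuthConfig,
    /// Callback URL the backend hands to the Node.
    pub node_callback_url: String,
    /// How many times an upload is retried before giving up.
    pub upload_retries: u32,
}
```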

API BREAKING changes:

  • Use alloy's Address for addresses instead of strings
  • Authenticate users for upload/download operations
  • The FileList response now has a single tree field at the root, and folder items no longer have a children field (see the sketch after this list)
  • Remove the /distribute endpoint, as the backend is no longer in charge of this
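
A rough sketch of the new FileList shape. Only the root tree field and the absence of children on folders come from the PR; the type names and extra fields are illustrative:

```rust
// Hypothetical response types for the new FileList shape.
#[derive(serde::Serialize)]
pub struct FileListResponse {
    /// Single root field holding the listed items.
    pub tree: Vec<FileTreeItem>,
}

#[derive(serde::Serialize)]
#[serde(tag = "type", rename_all = "lowercase")]
pub enum FileTreeItem {
    File { name: String, file_key: String },
    // No `children` field: a folder's contents are listed via a separate request.
    Folder { name: String },
}
```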

DB Migrations:

  • The Bucket table now has 3 more columns: value_prop_id, total_size, file_count (see the sketch below)
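
A hypothetical diesel schema for the bucket table after the migration; the column names come from the PR, while the SQL types are inferred from the model shown further down (value_prop_id: String, total_size: BigDecimal) and file_count is assumed to be a big integer:

```rust
// Sketch only: pre-existing columns elided, SQL types inferred.
diesel::table! {
    bucket (id) {
        id -> Int8,
        // ...existing columns...
        value_prop_id -> Varchar,
        total_size -> Numeric,
        file_count -> Int8,
    }
}
```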

API changes:

  • Added pagination query parameters to the /files and /buckets endpoints
    • The pagination parameters are optional: limit (the number of items to return per response) and page (which 'page' of results the endpoint should return); see the sketch below
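
A sketch of how the optional parameters could be modeled, assuming a serde-style query extractor; the names limit and page are from the PR, the struct itself is illustrative:

```rust
// Hypothetical query-parameter type for paginated endpoints.
#[derive(Debug, serde::Deserialize)]
pub struct Pagination {
    /// Maximum number of items to return in one response.
    pub limit: Option<u32>,
    /// Which 'page' of results to return.
    pub page: Option<u32>,
}
```

For example: GET /buckets?limit=50&page=2.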

Performance improvements:

  • Cache the MSP ID
  • Use pagination mechanism for db requests with many items

Karrq added 27 commits October 6, 2025 17:14
refactor: use raw fingerprint bytes for file info internally
refactor: get_file_info authenticates user, no need for bucket id
docs: document missing auth for download
fix(test:download): initialize test db
@Karrq Karrq requested review from TDemeco, ffarall and ftheirs October 20, 2025 15:24
@Karrq Karrq requested a review from TDemeco October 28, 2025 13:09
@Karrq Karrq added the B3-backendnoteworthy, breaking, D2-noauditneeded🙈 and B1-sdknoteworthy labels Nov 3, 2025
Karrq added 2 commits November 4, 2025 21:07
This is to allow the /health endpoint to potentially trigger a reconnect
if the connection has errors
@HermanObst HermanObst (Contributor) left a comment

Some preliminary comments

```diff
 let payment_stream_data = self
     .postgres
-    .get_payment_streams_for_user(user_address)
+    .get_payment_streams_for_user(&user_address.to_string())
```
Contributor

Changing user_address downstream to also be Address instead of String was a PITA?

Contributor Author

You mean in the DB itself? Or the internal traits?
In the DB we keep a string because, technically speaking, it could be an SS58 address, which was the case before the solochain runtime.

@HermanObst HermanObst (Contributor) left a comment

Second pass

Comment on lines +98 to +138
```diff
 ///
+/// This method caches the MSP data to avoid repeated database hits.
+/// The cache is automatically refreshed after the configured TTL expires.
 pub async fn get_msp(&self, msp_onchain_id: &OnchainMspId) -> Result<Msp> {
     debug!(target: "indexer_db::client::get_msp", onchain_id = %msp_onchain_id, "Fetching MSP");

-    // TODO: should we cache this?
-    // since we always reference the same msp
-    self.repository
+    // Check if we have a valid cached entry
+    {
+        let cache = self.msp_cache.read().await;
+        if let Some(entry) = &*cache {
+            // Check if the cache entry matches the requested MSP and is still valid
+            if entry.msp.onchain_msp_id == *msp_onchain_id
+                && entry.last_refreshed.elapsed() < Duration::from_secs(MSP_CACHE_TTL_SECS)
+            {
+                return Ok(entry.msp.clone());
+            }
+        }
+    }
+
+    // Cache miss or expired, fetch from database
+    let msp = self
+        .repository
         .get_msp_by_onchain_id(msp_onchain_id)
         .await
-        .map_err(Into::into)
+        .map_err(Into::<crate::error::Error>::into)?;
+
+    // Update cache with the new value
+    {
+        let mut cache = self.msp_cache.write().await;
+        *cache = Some(MspCacheEntry {
+            msp: msp.clone(),
+            last_refreshed: Instant::now(),
+        });
+    }
+
+    Ok(msp)
 }
+
+/// Invalidate the MSP cache if it matches the given MSP
+///
+/// # Arguments
```
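
For context, a sketch of the cache plumbing the snippet above relies on; the actual definitions live elsewhere in the PR and may differ in detail:

```rust
use std::time::Instant;
use tokio::sync::RwLock;

/// Assumed TTL; in the PR this is presumably a constant or configured value.
const MSP_CACHE_TTL_SECS: u64 = 60;

struct MspCacheEntry {
    /// The cached MSP row (`Msp` is the crate's model type).
    msp: Msp,
    /// When the entry was last fetched from the database.
    last_refreshed: Instant,
}

// On the client, a single-slot cache suffices since only one MSP is queried:
// msp_cache: RwLock<Option<MspCacheEntry>>
```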
Contributor

What is the motivation for having this cache?
Also, can't this cause out-of-date info?

Contributor Author

We have the cache because, realistically speaking, the MSP we query for will only ever be one, so this way we save a few redundant calls to the DB...
Not necessary tbh, but not bad to have. Yeah, we might be out of date, but not meaningfully so.

Contributor

Yup, I think we should introduce a cache only if the load requires it.
At this point I think it's best to always have accurate data.
We can discuss it in the daily tomorrow.
cc @ffarall @TDemeco

@HermanObst HermanObst (Contributor) left a comment

GigaBrain work! Let's just discuss all my comments and we're good to go.

```rust
pub updated_at: NaiveDateTime,
pub merkle_root: Vec<u8>,
pub value_prop_id: String,
pub total_size: BigDecimal,
```
Contributor

Do we need a BigInt?

Contributor Author

Most likely not, but this gives us flexibility so we don't have to depend on the runtime's unit typing (technically, file should change too).

Comment on lines 186 to +208
```rust
diesel::delete(file::table)
    .filter(file::file_key.eq(file_key))
    .execute(conn)
    .await?;

// Update bucket counts if file was found
if let Some((bucket_id, file_size)) = file_info {
    Bucket::decrement_file_count_and_size(conn, bucket_id, file_size).await?;
}
```
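
For reference, a decrement like the one above can be a single atomic UPDATE. A hypothetical body, assuming diesel-async; the PR's actual implementation may differ:

```rust
use bigdecimal::BigDecimal;
use diesel::prelude::*;
use diesel_async::{AsyncPgConnection, RunQueryDsl};

impl Bucket {
    /// Decrements the cached file count and total size for a bucket.
    /// Sketch: in-SQL column arithmetic keeps the update atomic.
    pub async fn decrement_file_count_and_size(
        conn: &mut AsyncPgConnection,
        bucket_id: i64,
        file_size: BigDecimal,
    ) -> QueryResult<()> {
        diesel::update(bucket::table.filter(bucket::id.eq(bucket_id)))
            .set((
                bucket::file_count.eq(bucket::file_count - 1i64),
                bucket::total_size.eq(bucket::total_size - file_size),
            ))
            .execute(conn)
            .await?;
        Ok(())
    }
}
```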

Contributor

Out of curiosity, is this happening atomically?

Contributor Author

Yes, it's all in one transaction, started at the beginning of the block processing.

Contributor

How is the db transaction initiated?

Contributor Author

From handle_finality_notification -> index_block, which then trickles the data down to everything else:

```rust
conn.transaction::<(), IndexBlockError, _>(move |conn| {
    Box::pin(async move {
        let block_number_u64: u64 = block_number.saturated_into();
        let block_number_i64: i64 = block_number_u64 as i64;
        ServiceState::update(conn, block_number_i64).await?;
        for ev in block_events {
            self.route_event(conn, &ev.event.into(), block_hash).await?;
        }
        Ok(())
    })
})
.await?;
```
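
(With diesel's transaction API, the closure's writes are committed only if it returns Ok; any Err, e.g. from route_event, rolls back the whole block, which is what makes the delete and bucket counter update above atomic.)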

@Karrq Karrq added the indexer-db label Nov 6, 2025