feat: support non consecutive blocks ryhope #425
base: indexing-with-no-probvable-ext
Conversation
Outside of the minor comments, I have a major one. Although I agree that introducing a lookup epoch table is the only solution that makes sense for randomly indexed trees, it introduces (i) a performance hit for all the linearly-indexed trees, one that worsens the more they grow; and (ii) consequently, a lot of tables that will boil down to [x, x + offset]. Concretely, after a couple of months we will be JOIN-ing tables of several hundred thousand rows for every single query.
Do you think it would make sense/be possible, similarly to what you do with `IS_EXTERNAL_MAPPER`, to add a third specialization that would fall back to the old behavior when the index progression is known to be linear?
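To make the suggestion concrete, here is a minimal sketch of what such a specialization could look like (hypothetical: `LinearEpochMapper` and its non-async methods are illustrative names, not the PR's actual trait surface):

```rust
type UserEpoch = i64;        // aliases as defined in this PR
type IncrementalEpoch = i64;

/// Hypothetical third specialization: when the index progression is
/// known to be linear (user_epoch = incremental_epoch + offset), the
/// mapping is pure arithmetic, so no mapper table and no JOIN are
/// needed at query time.
#[derive(Clone, Copy, Debug)]
pub struct LinearEpochMapper {
    /// Constant shift between the user-facing and internal epoch spaces.
    offset: i64,
}

impl LinearEpochMapper {
    pub fn try_to_incremental_epoch(&self, epoch: UserEpoch) -> Option<IncrementalEpoch> {
        // O(1): no table lookup involved.
        epoch.checked_sub(self.offset)
    }

    pub fn to_user_epoch(&self, epoch: IncrementalEpoch) -> UserEpoch {
        epoch + self.offset
    }
}
```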
parsil/src/queries.rs (outdated)
SELECT {execution_epoch}::BIGINT as {EPOCH},
       {USER_EPOCH} as {KEY}
FROM {table_name} JOIN (
    SELECT * FROM {mapper_table_name}
Could we avoid the `*` and explicitly select columns?
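For instance, assuming the mapper's columns are the `{USER_EPOCH}` and `{INCREMENTAL_EPOCH}` constants used elsewhere in this PR:

```sql
SELECT {USER_EPOCH}, {INCREMENTAL_EPOCH} FROM {mapper_table_name}
```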
parsil/src/queries.rs (outdated)
       {USER_EPOCH} as {KEY}
FROM {table_name} JOIN (
    SELECT * FROM {mapper_table_name}
    WHERE {USER_EPOCH} >= {}::BIGINT AND {USER_EPOCH} <= {}::BIGINT
Could you please compute the arguments above and put them directly in the `{}`, for readability's sake?
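Something along these lines, for example (a sketch; the function name and the column-constant value are illustrative):

```rust
const USER_EPOCH: &str = "user_epoch"; // column-name constant (value assumed here)

/// Build the epoch-range filter with named captures instead of
/// positional `{}` placeholders, so the template reads naturally.
fn epoch_range_filter(min_epoch: i64, max_epoch: i64) -> String {
    format!("WHERE {USER_EPOCH} >= {min_epoch}::BIGINT AND {USER_EPOCH} <= {max_epoch}::BIGINT")
}
```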
pub fn mapper_table_name(table_name: &str) -> String {
    format!("{}_mapper", table_name)
}
If you do that for the `_mapper`, could you please do it for the `_meta` as well, for the sake of homogeneity?
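i.e. something like (sketch):

```rust
pub fn mapper_table_name(table_name: &str) -> String {
    format!("{}_mapper", table_name)
}

/// Companion helper for the metadata table, for homogeneity.
pub fn meta_table_name(table_name: &str) -> String {
    format!("{}_meta", table_name)
}
```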
pub type Epoch = i64;
pub type UserEpoch = i64;

pub type IncrementalEpoch = i64;
Please document the why of this type.
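For example, a doc comment along these lines (wording is only a suggestion, derived from the PR description):

```rust
/// An epoch as actually used by the transactional storage backends: a
/// contiguous, monotonically increasing counter. User-facing epochs
/// (`UserEpoch`, e.g. arbitrary block numbers) are translated into this
/// type through the epoch mapper.
pub type IncrementalEpoch = i64;
```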
/// A timestamp in a versioned storage. Using a signed type allows for easy
/// detection & debugging of erroneous subtractions.
pub type Epoch = i64;
pub type UserEpoch = i64;
Please update the docstring to emphasize the fact that this is not the actual epoch, but the user-facing one.
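e.g. (suggested wording only):

```rust
/// A *user-facing* timestamp in a versioned storage, e.g. an arbitrary
/// block number; this is not the actual epoch used internally, which is
/// an `IncrementalEpoch` obtained through the epoch mapper. Using a
/// signed type allows for easy detection & debugging of erroneous
/// subtractions.
pub type UserEpoch = i64;
```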
        Ok(())
    }
}

impl EpochMapper for InMemoryEpochMapper {
    async fn try_to_incremental_epoch(&self, epoch: UserEpoch) -> Option<IncrementalEpoch> {
        self.try_to_incremental_epoch_inner(epoch)
Could we put these function bodies right here, or is there a reason to implement them in duplicate, outside of the trait implementation?
/// Storage for tree state.
state: VersionedStorage<<T as TreeTopology>::State>,
/// Storage for topological data.
nodes: VersionedKvStorage<<T as TreeTopology>::Key, <T as TreeTopology>::Node>,
/// Storage for node-associated data.
data: VersionedKvStorage<<T as TreeTopology>::Key, V>,
epoch_mapper: SharedEpochMapper<InMemoryEpochMapper, READ_ONLY>,
Please comment this field, especially in which direction it maps.
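e.g. (a suggestion; the direction follows from `try_to_incremental_epoch`):

```rust
/// Read-only view over the mapping from user-facing epochs (e.g. block
/// numbers) to the internal incremental epochs used by this storage.
epoch_mapper: SharedEpochMapper<InMemoryEpochMapper, READ_ONLY>,
```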
}

#[derive(Default)]
- pub struct Tree;
- impl Tree {
+ pub struct Tree<const IS_EPOCH_TREE: bool>;
Could we replace `IS_EPOCH_TREE` with something more generic?
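For instance, something tied to the index progression rather than to the epoch-tree use case (naming is illustrative):

```rust
// Sketch of a more use-case-agnostic parameter name:
pub struct Tree<const IS_SEQUENTIAL: bool>;

// The epoch-tree flavor then becomes an alias rather than a concept
// baked into the type itself:
pub type EpochTree = Tree<true>;
```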
&format!(
    "
    CREATE VIEW {mapper_table_alias} AS
    SELECT * FROM {mapper_table_name}"
I don't think this should be `*`?
    ),
],
)
// Build the subquery that will be used as the source of epochs and block numbers
Use `///` here instead of `//`.
I see, it's a concern I had as well, but I wasn't so sure that a JOIN would be much slower than a plain WHERE over the block range.
I think this would be possible, but maybe it will be more annoying on the API side, because the caller will need to specify an internal implementation detail (i.e., whether the table has incremental epochs or not). Furthermore, we will need to make this information available to parsil as well, which might make the parsil API more cumbersome. But it should still be reasonable, so it could be a good compromise if the performance hit is too high. On this I would also like to hear @nikkolasg's opinion, since it will affect the API and DQ as well (as I expect any table in DQ will need to specify to ryhope how to handle epochs).
IIRC, the
Agreed on this, the parsil/ryhope interfacing is suboptimal. I already tried to split them better, but the best I could do is the
Not sure actually; to me it looked like the
If we take this example, that went from:
to:
the old version is one linear scan over the block range. The new one, however, is one linear scan plus a join over the filtered mapper, which we expect to grow linearly as a function of time. So if b is the block range, m the mapper size, and n the vTable size, then to my understanding the old version is O(n), whereas the new one is O(n [WHERE clause on table] + m [WHERE clause on mapper] + b×b [JOIN table/mapper]). Does that seem correct to you?
Nice job on a complex task! What I would really like to see more of, though, is comments explaining the architectural choices and the hidden assumptions (only one process must update the DB epoch table, etc.), and also splitting into a few more files when that makes sense. The ryhope codebase is fairly dense.
/// The index tree when the primary index is the block number of a blockchain is a sbbst since it
/// is a highly optimized tree for monotonically increasing index. It produces very little
/// tree-manipulating operations on update, and therefore, requires the least amount of reproving
/// when adding a new index.
/// NOTE: when dealing with another type of index, i.e. a general index such as what can happen on
/// a result table, then this tree does not work anymore.
- pub type BlockTree = sbbst::Tree;
+ pub type BlockTree = sbbst::EpochTree;
Change the comments above, which are still very much tied to the sequential use case.
index_table_name: String,
row_table_name: String,
genesis_block: UserEpoch,
alpha: scapegoat::Alpha,
lol we never really played with this alpha to find the best trade-off, something to keep in mind...
// {table} JOIN (
//     SELECT {USER_EPOCH}, {INCREMENTAL_EPOCH} FROM {mapper_table}
//     WHERE {USER_EPOCH} >= $min_block AND {USER_EPOCH} <= $max_block
// ) ON {VALID_FROM} <= {INCREMENTAL_EPOCH} AND {VALID_UNTIL} >= {INCREMENTAL_EPOCH}
I'm really not an expert on the parser framework, but isn't it simpler to write the query in a string and ask the framework to parse it, instead of constructing it manually? Why is that not possible?
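For reference, if parsil builds on the sqlparser crate (an assumption here), parsing from a string looks like this:

```rust
use sqlparser::dialect::PostgreSqlDialect;
use sqlparser::parser::Parser;

fn parse_query(
    sql: &str,
) -> Result<Vec<sqlparser::ast::Statement>, sqlparser::parser::ParserError> {
    // Let the framework build the AST from a string, instead of
    // assembling the AST nodes by hand.
    Parser::parse_sql(&PostgreSqlDialect {}, sql)
}
```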
if self
    .current_epoch()
    .await
    .ok()
Maybe it's worth panicking on errors for now, so that it's easy to remember and change to a proper error type later on? I mean, I'll completely forget about the `ok()` after this review, for sure.
    }
}
#[derive(Clone, Debug)]
pub struct InMemoryEpochMapper(BTreeMap<UserEpoch, IncrementalEpoch>);
Definitely not, if you ask me. To be honest, we don't use the memory backend in practice; maybe in a few tests here and there in ryhope, I believe, but nothing important. So it's clearly not something to spend time optimizing.
async fn fetch_at_inner(&self, epoch: IncrementalEpoch) -> T {
    trace!("[{self}] fetching payload at {}", epoch);
    let connection = self.db.get().await.unwrap();
Can you put `expect()` instead?
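i.e. something like:

```rust
let connection = self
    .db
    .get()
    .await
    .expect("failed to acquire a connection from the pool");
```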
pub(crate) const INITIAL_INCREMENTAL_EPOCH: IncrementalEpoch = 0;

#[derive(Clone, Debug)]
pub struct EpochMapperStorage {
Comments please.
And can we put this whole thing into another file? We're already at 1.5k lines.
in_tx: bool,
/// Set of `UserEpoch`s being updated in the cache since the last commit to the DB
dirty: BTreeSet<UserEpoch>,
pub(super) cache: Arc<RwLock<InMemoryEpochMapper>>,
Can you write down somewhere why
- we need a cache
- we need a shared cache

It wasn't clear to me why we needed them until I chatted with you.
db_tx
    .query(
        &format!(
            "INSERT INTO {} ({USER_EPOCH}, {INCREMENTAL_EPOCH})
             VALUES ($1, $2)",
            self.mapper_table_name()
        ),
        &[&(user_epoch as UserEpoch), &incremental_epoch],
    )
    .await?;
Could we do only one query across all epochs, instead of one query per epoch?
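For example (a sketch, assuming Postgres and that the dirty `(user, incremental)` pairs have been collected into two vectors; `dirty_pairs` is a hypothetical name):

```rust
// One batched INSERT over arrays via UNNEST, instead of one round-trip
// per epoch.
let (users, incrementals): (Vec<i64>, Vec<i64>) = dirty_pairs.into_iter().unzip();
db_tx
    .query(
        &format!(
            "INSERT INTO {} ({USER_EPOCH}, {INCREMENTAL_EPOCH})
             SELECT * FROM UNNEST($1::BIGINT[], $2::BIGINT[])",
            self.mapper_table_name()
        ),
        &[&users, &incrementals],
    )
    .await?;
```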
assert!(s.store(11, 2).await.is_err()); // try to insert a key smaller than the previous one
assert!(s.store(14, 2).await.is_ok());
assert!(s.store(15, 2).await.is_ok());
s.commit_transaction().await?;
Can you check the values of the DB + cache afterwards?
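e.g. along these lines (a sketch; `fetch_mapping_from_db` is a hypothetical test helper):

```rust
// After the commit, the cache and the DB should agree on each mapping.
for epoch in [14, 15] {
    assert_eq!(
        s.try_to_incremental_epoch(epoch).await,
        fetch_mapping_from_db(&db, epoch).await, // hypothetical helper querying the mapper table
    );
}
```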
Ah right, now I see what you meant about a range being applied before. I am trying to think if there is a way to re-write the queries for the new version without the expensive JOIN. I found a way for some types of queries with this solution:
This query should be more efficient, since we are JOINing with a table with only 1 row, and the sub-query should run only on a range of epochs, not on the entire mapper table.
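To sketch the shape of that rewrite (a reconstruction of the idea described above, not the exact query from the discussion): the user-epoch range is translated once into incremental-epoch bounds by a one-row subquery, and the table is then filtered with a plain range predicate:

```sql
SELECT t.* FROM {table} t
JOIN (
    -- One-row subquery: incremental-epoch bounds for the block range.
    SELECT MIN({INCREMENTAL_EPOCH}) AS lo, MAX({INCREMENTAL_EPOCH}) AS hi
    FROM {mapper_table}
    WHERE {USER_EPOCH} >= $min_block AND {USER_EPOCH} <= $max_block
) b ON t.{VALID_FROM} <= b.hi AND t.{VALID_UNTIL} >= b.lo
```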
As we discussed in the meeting, let's first do a benchmark and see what the consequences are.
This PR mainly modifies the ryhope crate to support non-consecutive block numbers in the index tree. It introduces the concept of an epoch mapper, which maps user-defined logical epochs (which can correspond to arbitrary block numbers/timestamps) to the internal incremental epochs used in the transactional storage implementations.
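As a minimal mental model of that mapping (a self-contained sketch; the real mapper is async and backed by either memory or Postgres):

```rust
use std::collections::BTreeMap;

type UserEpoch = i64;        // user-facing, e.g. an arbitrary block number
type IncrementalEpoch = i64; // internal, contiguous storage epoch

fn main() {
    // The in-memory mapper is essentially an ordered map between the two
    // epoch spaces, cf. `InMemoryEpochMapper(BTreeMap<UserEpoch, IncrementalEpoch>)`.
    let mut mapper: BTreeMap<UserEpoch, IncrementalEpoch> = BTreeMap::new();

    // Non-consecutive block numbers map to consecutive internal epochs.
    for (i, block) in [100_i64, 105, 230].into_iter().enumerate() {
        mapper.insert(block, i as IncrementalEpoch);
    }

    assert_eq!(mapper.get(&105).copied(), Some(1));
    assert_eq!(mapper.get(&101).copied(), None); // unknown user epoch
}
```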