Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC Branch for Persistent Entry Eviction #4498

Draft
wants to merge 35 commits into
base: master
Choose a base branch
from

Conversation

SirTyson
Copy link
Contributor

@SirTyson SirTyson commented Oct 3, 2024

This branch contains all the externally observable side effects of persistent entry eviction. This includes evicting persistent entries out of the live bucketlist and eviction meta, requiring proofs for restoration, requiring proofs for persistent entry creation, transaction enforcement of new rules wrt to archival, a new LedgerEntryChange type for restored entries, and the getledgerentry, getrestorationproof, and getarchivalproof endpoints.

This is not a production build, but is intended as an early build for downstream system integration. I've included some unit tests and manually tested the new endpoints, and have successfully ran a watcher of pubnet without forking on protocol 22. All that to say it probably pretty much works, but I wouldn't be surprised if there's a few buggy edge cases.

Two new config flags have been added to assist with testing. When ARTIFICIALLY_SIMULATE_ARCHIVE_FILTER_MISS is set, all persistent contract data keys with a key type of SCV_SYMBOL and a key symbol of "miss" will require a creation proof. This proof can be retrieved via the getcreationproof endpoint.

The REQUIRE_PROOFS_FOR_ALL_EVICTED_ENTRIES should always be set in this build. When this flag is set, all persistent entries will require proofs following their eviction. These proofs can be retrieved via the getrestorationproof endpoint.

@SirTyson
Copy link
Contributor Author

SirTyson commented Oct 11, 2024

The interface for the new endpoints is as follows (where all requests are POST requests on captive-core's QUERY_SERVER endpoint):

getledgerentry

Used to query the live ledger and archival state. While getledgerentryraw does simple key-value lookup on the live BucketList, getledgerentry will query a given key in both the live BucketList, as well as archives. It will also check to see what proofs are required for the given key, if any, and report back whether an entry is archived or live.

getledgerentry
ledgerSeq=NUM&key=Base64&key=Base64...

Key is base 64 XDR encoded LedgerKey. TTL keys must not be queried.
ledgerSeq is the ledger which the snapshot should be based on, but is optional. If not set, captive-core will automatically use the most recent ledgerSeq.

Return payload JSON object in the following format:

{
"entries": [
  {"e": "Base64-LedgerEntry", "state": "live", "ttl": uint32 },
  {"e": "Base64-LedgerKey", "state": "archived_no_proof"},
],
"ledger": ledgerSeq
}

Where entries is a list of all live or archived entries that were found.

State is one of:

live -> Live entry
new_entry_no_proof -> Entry does not exist and does not require creation proof
new_entry_proof -> Entry does not exist and requires creation proof
archived_no_proof -> Entry is archived and can be restored without a proof
archived_proof -> Entry is archived and requires restoration proof

For live Soroban entries, the TTL ledger is also returned in the ttlLedger field. If an entry is not live or is not a Soroban entry type, this field is omitted (Note: not yet implemented).

Note about Temporary Entries

Since temporary entries cannot be restored, a temporary key will return the state of "new_entry_no_proof" iff it has never existed, or has existed but is expired.

getinvokeproof

Used to generate creation proofs, required when writing new keys that have an archival filter false positive. For testing, use ARTIFICIALLY_SIMULATE_ARCHIVE_FILTER_MISS to simulate this.

getinvokeproof
ledgerSeq=NUM&key=Base64&key=Base64...

Key is base 64 XDR encoded LedgerKey
ledgerSeq is the ledger which the snapshot should be based on, but is optional. If not set, captive-core will automatically use the most recent ledgerSeq.

Return payload JSON object in the following format:

{
"ledger": ledgerSeq, // Required
"proof": Base64XDR // optional
}

Where “proof” is a base64 encoded XDR string of the proof payload required for the associated InvokeHostFunctionOp. If no proof is required, “proof” is omitted.

If a ledgerSeq is requested and the captive-core node does not have the given snapshot, a 404 error will be returned.

Note: Only a single proof object is required for all keys in a given TX.

getrestoreproof

Used to generate restoration proofs, required when restoring entries that no longer exist on validators. For testing, use REQUIRE_PROOFS_FOR_ALL_EVICTED_ENTRIES for easier testing.

getrestoreproof
ledgerSeq=NUM&key=Base64&key=Base64...

Key is base 64 XDR encoded LedgerKey
ledgerSeq is the ledger which the snapshot should be based on, but is optional. If not set, captive-core will automatically use the most recent ledgerSeq.

Return payload JSON object in the following format:

{
"ledger": ledgerSeq, // Required
"proof": Base64XDR // optional
}

Where “proof” is a base64 encoded XDR string of the proof payload required for the associated RestoreFootprintOp. If no proof is required, “proof” is omitted.

If a ledgerSeq is requested and the captive-core node does not have the given snapshot, a 404 error will be returned.

Note: Only a single proof object is required for all keys in a given TX.

@SirTyson
Copy link
Contributor Author

Meta Changes

Eviction Meta

Whenever a PERSISTENT entry is evicted, the full LedgerEntry for the evicted data is included in evictedPersistentLedgerEntries. The corresponding TTL entry is stored in evictedTemporaryLedgerKeys. Unfortunately, evictedTemporaryLedgerKeys was originally misnamed, and should be something like "keysDeletedViaEviction", as this field now contains:

  1. Keys of evicted TEMPORARY entries.
  2. Keys of TTL's associated with evicted TEMPORARY entries.
  3. Keys of TTL's associated with evicted PERSISTENT entries.

Restore Op Meta

The RestoreFootprintOperation operation meta has changed to support a new LedgerEntryChange type. There are now two different cases where two different types of meta are produced.

If an entry is restored, but has not yet been evicted (i.e. the entry has not yet been deleted), restoration meta is the same as it is pre protocol 23. In particular, the original value of the restored entries TTL will be emitted as LEDGER_ENTRY_STATE, followed by the new value of the TTL entry via LEDGER_ENTRY_UPDATED.

If an entry is restored after it has been evicted (i.e. the entry has been deleted from validators), both the restored code/data LedgerEntry and it's associated TTL entry will be emitted as type LEDGER_ENTRY_RESTORED.

@dmkozh
Copy link
Contributor

dmkozh commented Oct 15, 2024

getledgerentry

This function is underspecified in terms of temp entry behavior. I would expect temp entries to never be 'expired'/'evicted' and just get a 'new entry, no proof required' response as soon as the entry's TTL expires. But we must specify that clearly.

Key is base 64 XDR encoded LedgerKey. TTL keys must not be queried.

TTL must be returned as a part of per-entry response then. TTL is necessary for the fee computations.

Eviction Meta

I believe we've discussed this a few times, and it would seem that this is not necessary. What's the purpose of eviction meta again?

@dmkozh
Copy link
Contributor

dmkozh commented Oct 15, 2024

Another thing missing is the behavior w.r.t. archived entries. I would expect we make best effort at returning the entry, but that's not documented.

@dmkozh
Copy link
Contributor

dmkozh commented Oct 15, 2024

I also wonder if proof is necessary to read a non-existent entry that otherwise would require a proof for creation. I think it must be required, but that's not obvious from the docs (and we also need to ensure that our read code requires the proofs as well).

@SirTyson
Copy link
Contributor Author

I also wonder if proof is necessary to read a non-existent entry that otherwise would require a proof for creation. I think it must be required, but that's not obvious from the docs (and we also need to ensure that our read code requires the proofs as well).

Good catch. I believe I accidentally implemented this anyway, since I check for creation proof validity on all keys that don't currently exist in the footprint, but we should have an explicit test and spec for this.

Another thing missing is the behavior w.r.t. archived entries. I would expect we make best effort at returning the entry, but that's not documented.

For now, this endpoint will always return correct results since RPC stores all archival history (we decided to remove partial history from the initial spec for simplicity since Soroban disk requirements are still small). In the future sharding and partial archival history at the RPC/captive-core level is possible, but won't be implemented or necessary for some time. Personally I think I'd like to leave it as is for now. I think the "best effort" interface introduces complexity that won't be necessary for some time, and when it comes time for archival sharding, RPC will have to make a lot of changes anyway.

TTL must be returned as a part of per-entry response then. TTL is necessary for the fee computations.

Sorry, I think I underspeced this in the initial design doc and was an oversight. I'll add TTL info to the return payload, but will still not allow direct querying of TTL entries (since the query implicitly does a TTL lookup anyway).

Eviction Meta

This is primarily for Hubble to make ingestion easier. I think currently the status quo of meta is that the meta contains the complete state of a given key following a change (as in the case for LedgerEntryChange tx meta). If we just give a set of keys, it is up to downstream to lookup the evicted entry in their own databases to find the entry body, which is fundamentally different then what meta does today. This is different than a temp entry eviction, since that is logically a delete so the LedgerEntry data does not matter to downstream. Peristent eviction is not a straight delete so the story's a little different, but I'll chat with platform more about this issue.

@SirTyson
Copy link
Contributor Author

This function is underspecified in terms of temp entry behavior. I would expect temp entries to never be 'expired'/'evicted' and just get a 'new entry, no proof required' response as soon as the entry's TTL expires. But we must specify that clearly.

Yeah, expired temp returns "new_entry_no_proof", I'll make this edge case clear in the spec.

@dmkozh
Copy link
Contributor

dmkozh commented Oct 22, 2024

Personally I think I'd like to leave it as is for now.

I'm not sure what 'as is' is. There is no documentation as to what we return for the archived entry. If we don't support the partial archives yet, we should then say that a persistent entry will always be returned, even if has been was evicted.

@SirTyson
Copy link
Contributor Author

I've updated the getledgerentry endpoint to return entry's TTL, and I've also written a (hopefully) detailed spec in commands.md.

I've also change meta as follows:

Starting in protocol 23, all restoration ops will emit both the entry being restored and it's TTL entry as a LEDGER_ENTRY_RESTORED change type. This happens for all restored entries regardless of if they have been evicted.

Additionally, eviction meta has changed such that all evicted keys are included in evictedTemporaryLedgerKeys. This includes evicted temp entries, persistent entries, and all associated TTLs. Unfortunately, evictedTemporaryLedgerKeys is now poorly named, but I'll fix it whenever we release a new version of LedgerCloseMeta xdr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants