SIP - Catastrophic blockchain failures and recovery #10
Thank you for this! Without knowing much about soft and hard forks on the technical side, this SIP gave me a great understanding of what capabilities exist to mitigate a catastrophic failure. Each possible action contained a clear description and action plan, with emphasis on using the least-disruptive method possible.
2. The branch will be submitted as a pull-request to the `master` branch, and
will be reviewed and approved by at least two blockchain engineers, representing
both the Stacks Foundation and at least one other major Stacks ecosystem entity.
3. If warranted, an unofficial announcement will be made to various public
When would this not be warranted? Given:
A failure qualifies as a catastrophic failure if and only if there is no conceivable way for the correct nodes in the network to make progress and preserve safety without human intervention.
Two options that could happen automatically:
- set up GitHub to announce any releases for stacks-blockchain automatically
- set up Discord to accept and relay the message from [email protected]
All the more-disruptive procedures use this procedure as a step. Some of them, like the one below it, require a public embargo on talking about the bug because the act of making the bug public knowledge would make the problem worse. For example, if someone discovered a bug that would let an attacker steal anyone's STX, we would not announce the bug until the fix was already deployed.
I like the idea of relaying all messages from [email protected] to Discord. Is there a bot that can do this? Not too familiar with Discord (I'm old-school), but I'd love to have a dedicated #announce channel which just relayed these messages.
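As a rough illustration of the first option above (announcing releases automatically), here is a minimal sketch that turns a GitHub `release` webhook event into a Discord webhook post. It assumes the relay receives GitHub's standard release payload and posts to a Discord incoming webhook; the function names and the webhook URL are hypothetical, and nothing here is part of the SIP or of any existing Stacks tooling.

```python
import json
import urllib.request
from typing import Optional


def release_message(event: dict) -> Optional[str]:
    """Build an announcement from a GitHub `release` webhook payload.

    Returns None for events that are not a newly published release.
    """
    if event.get("action") != "published":
        return None
    release = event.get("release", {})
    repo = event.get("repository", {}).get("full_name", "")
    tag = release.get("tag_name", "unknown")
    url = release.get("html_url", "")
    return f"New release {tag} of {repo}: {url}"


def build_discord_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to a Discord incoming webhook."""
    body = json.dumps({"content": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical event and webhook URL, for illustration only.
event = {
    "action": "published",
    "release": {"tag_name": "v0.0.0-example", "html_url": "https://example.invalid/release"},
    "repository": {"full_name": "blockstack/stacks-blockchain"},
}
message = release_message(event)
request = build_discord_request("https://example.invalid/webhook", message)
# Sending would be: urllib.request.urlopen(request)
```

Relaying the mailing-list messages themselves would need a separate mail-to-webhook bridge, which this sketch does not cover.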
Why even have a separate (3) step? IMO step (5) could very well be re-worded as "Availability of new binaries will be communicated via standard channels per established norms of the project -- currently via the announce@ email list and Discord"
In particular, is there some specific advantage an "unofficial" announcement has? If Github releases are a source of truth, availability of a new release is automatically a "formal" announcement IMO.
(Github didn't render your comment in the appropriate place for some reason. Meant to post this reply here).
The point is to let node operators know when the source code for the fix is available for scrutiny and testing (which is mentioned at the end of (3)).
To encourage users who discover such sensitive blockchain bugs to report them
while keeping them secret, the Stacks Foundation will a bug-bounty program that
will be set up once this SIP activates.
typo: ..Foundation will start*? bug-bounty program.
Would love to explore this more, may be relevant to COC?
The incentive weights of malicious actors is also interesting to play with in bug-bounties mentioned presumably?
Ty kindly for all this, this kind of SIP goes a long way to opening up maintenance know how in the space 💯 agree with @whoabuddy's sentiments there.
I really appreciate your feedback @HaroldDavis3. Ty kindly for taking the time to read through this SIP :)
> The incentive weights of malicious actors is also interesting to play with in bug-bounties mentioned presumably?
If at all possible, I think we'll want to make it more profitable for malicious actors who find bugs that trigger catastrophic failures to just tell us about the bug privately instead of exploiting it. We'd want to do some research and figure out what the ROI would be for exploiting various kinds of serious bugs, and see if we can offer some kind of ROI-matching bug bounty (it's a tricky calculation -- just because the attacker can steal funds on-chain, for example, doesn't necessarily mean that they can cash them out without consequences).
Would y'all like to talk about this some more at the governance call this week?
Thanks @jcnelson -- this is a great starting point, left bunch of minor comments!
with `fix/` to the Stacks Blockchain reference implementation, hosted at
https://github.com/blockstack/stacks-blockchain.
2. The branch will be submitted as a pull-request to the `master` branch, and
will be reviewed and approved by at least two blockchain engineers, representing
[nit] s/"blockchain engineers"/"developers with write access to the stacks-blockchain repo"
Alternatively: "two members of the Stacks Core Developers group"?
> In particular, is there some specific advantage an "unofficial" announcement has? If Github releases are a source of truth, availability of a new release is automatically a "formal" announcement IMO.
The point is to let node operators know when the source code for the fix is available for scrutiny and testing (which is mentioned at the end of (3)).
Ah, I see Github corrupted the comments and moved the one that I quoted above into a different thread.
I don't think the set of people who have write access to the stacks blockchain repo accurately reflects the set of people who will be called upon to fix catastrophic bugs. I addressed this in 3832ae2 by defining the Stacks Core Developers as a list of developer names and contacts in a supplementary file, which can be added to or removed from by the Steering Committee. To be clear, this is just the list of folks who will be called upon to execute these procedures; this isn't meant to be an exclusionary list of who's-who in the blockchain or some other social club or clique. It's more like a "who is it okay to email at 3am on a Saturday if the blockchain halts and catches fire" list.
@kantai I've added a set of case studies to demonstrate how each of these catastrophic error recovery procedures can be used.
phases vote to accept the soft fork rules within the activation window, then the new
rules will take effect starting in the next whole reward cycle (i.e. right after
the second prepare phase finishes). All new releases of the Stacks node will
adhere to the soft fork rules, since they are now part of the block validation rules.
In this soft-fork process, would the node released in step 1 immediately begin enforcing the soft-fork rules (i.e., ignoring blocks that do not follow the soft fork)? Then is step 4 just a consolidation of this rule (i.e., the declaration that all future releases will also follow the soft fork rule)?
Not necessarily. This SIP only requires that miners who vote for the soft-fork do so through this signaling procedure, and that the new rules must take effect no later than when this activation threshold is met. The SIP intentionally does not specify anything about how early miners can start deciding when to ignore blocks they would no longer consider valid, since miners have this power already (with or without soft forks). In fact, per point 2, the provision that "stricter criteria are permitted" is meant to allow a particular soft-fork upgrade to impose additional requirements for miners to begin activating the new rules. For example, in the third case study below, a particular soft fork to repair a bug in smart contract processing could require miners to begin orphaning blocks in which the bug manifests ASAP.
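To make the "no later than" deadline concrete, here is a minimal sketch of the activation timing quoted in the excerpt above. It assumes one prepare phase at the end of each reward cycle and that two consecutive accepting prepare phases are required (the excerpt only shows "the second prepare phase", so the count is an assumption exposed as a parameter). The function name and inputs are hypothetical, not taken from the node implementation.

```python
from typing import Optional, Sequence


def activation_cycle(
    phase_accepts: Sequence[bool],
    window_start_cycle: int,
    required_consecutive: int = 2,  # assumption: two consecutive prepare phases
) -> Optional[int]:
    """Return the first reward cycle in which the soft-fork rules must apply.

    phase_accepts[i] says whether the prepare phase of reward cycle
    (window_start_cycle + i) voted to accept the soft fork. Only prepare
    phases inside the activation window are passed in. Returns None if the
    threshold is never met within the window.
    """
    streak = 0
    for offset, accepted in enumerate(phase_accepts):
        streak = streak + 1 if accepted else 0
        if streak == required_consecutive:
            # Rules take effect in the next whole reward cycle, i.e. right
            # after the last required prepare phase finishes.
            return window_start_cycle + offset + 1
    return None


# Cycles 10..13: the votes pass in cycles 11 and 12, so rules apply from 13.
first_enforced = activation_cycle([False, True, True, False], window_start_cycle=10)
```

Miners remain free to enforce the rules earlier than the returned cycle; the sketch only computes the latest point at which enforcement must begin.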
In addition, the Stacks Core Developers would coordinate with the Foundation to
release a version of the node software that included a fix for the bug, as well
as code to re-activate smart contracts under the condition that
the vast majority of miners have _rejected_ all the forks in
which this bug's exploits have occurred. In other words, smart contracts would
only re-activate if a fork that does _not_ descend from any block in which an
exploit occurs becomes the dominant fork.
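The re-activation condition in the excerpt above can be sketched as an ancestry check on the dominant fork's tip. This is a simplified illustration: block identifiers, the `parent_of` map, and the set of exploit blocks are hypothetical inputs, and the sketch takes the dominant tip as given rather than measuring how many miners have rejected the exploited forks.

```python
from typing import Dict, Optional, Set


def descends_from_exploit(
    tip: str,
    parent_of: Dict[str, Optional[str]],
    exploit_blocks: Set[str],
) -> bool:
    """Walk from `tip` back to genesis and report whether any ancestor
    (or the tip itself) is a block in which the exploit occurred."""
    block: Optional[str] = tip
    while block is not None:
        if block in exploit_blocks:
            return True
        block = parent_of.get(block)
    return False


def contracts_may_reactivate(
    dominant_tip: str,
    parent_of: Dict[str, Optional[str]],
    exploit_blocks: Set[str],
) -> bool:
    """Smart contracts re-activate only on a dominant fork that is free of
    exploit blocks."""
    return not descends_from_exploit(dominant_tip, parent_of, exploit_blocks)


# Hypothetical chain: g <- a <- b (clean fork) and g <- a <- x <- c (exploit in x).
parent_of = {"g": None, "a": "g", "b": "a", "x": "a", "c": "x"}
exploit_blocks = {"x"}
```

With this data, `contracts_may_reactivate("b", parent_of, exploit_blocks)` holds, while the same call with tip `"c"` does not.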
Could this itself be a hard-fork? Repairing these kinds of bugs could easily create a hard-fork, but is this class of solution only considering soft-fork-able fixes?
So, admittedly this example is a bit contrived because it assumes that there's a bug severe enough that anyone can take anyone else's STX, but at the same time, we can somehow identify each occurrence of the exploit and each STX's true owner. This maybe isn't the best example.
The point of this section is to explain how to use a chain fork to "undo" any active exploits, assuming they could be identified as such. The real-world example I had in mind of this was Bitcoin's integer overflow bug whereby roughly 2**64 satoshis (about 184 billion BTC) got minted. This was handled by a soft fork then, and we could do something similar if there was a STX minting bug or some other kind of bug that resulted in the number of liquid STX increasing unexpectedly.
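For readers unfamiliar with that class of bug, here is a small sketch of how a sum of output values can wrap around in signed 64-bit arithmetic and slip past a naive check, and how a stricter rule (the kind a soft fork can add) rejects it. The output values below are illustrative, not the historical transaction's, and the function names are hypothetical; `MAX_MONEY` is Bitcoin's 21 million coin cap expressed in satoshis.

```python
from typing import Sequence

MAX_MONEY = 21_000_000 * 100_000_000  # Bitcoin's supply cap, in satoshis


def wrap_int64(value: int) -> int:
    """Interpret an integer as a signed 64-bit two's-complement value."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def naive_check(outputs: Sequence[int], inputs_total: int) -> bool:
    """Buggy rule: only compares the (wrapping) output sum to the inputs."""
    return wrap_int64(sum(outputs)) <= inputs_total


def strict_check(outputs: Sequence[int], inputs_total: int) -> bool:
    """Stricter rule: every output and every running total must stay in range."""
    total = 0
    for value in outputs:
        if value < 0 or value > MAX_MONEY:
            return False
        total += value
        if total > MAX_MONEY:
            return False
    return total <= inputs_total


# Two illustrative outputs just under 2**63 each; their sum wraps negative.
outputs = [2**63 - 500_000, 2**63 - 500_000]
wrapped_sum = wrap_int64(sum(outputs))
```

Because the stricter rule only rejects transactions the old rule accepted, nodes running it stay compatible with the existing chain, which is why this kind of repair can be deployed as a soft fork.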
Note: the first goal for the Governance CAB will be to review and comment on SIP-011 by the next governance meeting on 2021/09/16, after which we can discuss any comments and help move this toward being ratified!
Further advancement of this particular SIP is pending more real-world stress testing of it in DR scenarios. It has been used successfully in recent Stacks updates, which should be noted, and moving it to Accepted should be considered.
This SIP attempts to codify a set of procedures for recovering from catastrophic errors in the blockchain that either cause it to crash, or cause severe safety problems for other people's digital assets and code. I was inspired to write this in light of the recent network outage on 7 February. The procedures outlined in this SIP are meant to be "game plans" for dealing with future such events, as well as drawing a few lines in the sand as to what's a legitimate reason for following some of the more severe recovery procedures (e.g. forks).