RPIP-63: Houston Hotfix

langers · 13 September 2024 05:42

This is the official discussion thread for RPIP-63: Houston Hotfix.

The changes described in this RPIP are about to enter audits. We acknowledge that this is not the preferred flow of creating an RPIP ahead of the work. In part this is because the initial scope of the hotfix was purely in “bugfix” territory and it grew to include items that merited a pDAO vote.

Key required bugfixes:

Fixing legacy minipool finalisation
Onchain voting: Prevent abuse of proposals to exit RPL stake
Onchain voting: Prevent attacks that allow a proposer’s bond to be stolen
Reduce the guardrail on on-chain voting quorum (the exact number is up for discussion)

Key items that aren’t required bugfixes:

Allow the minimum per-minipool RPL stake to go to zero without causing reverts elsewhere; while this is a bugfix, it’s only needed to enable something similar to RPIP-62
Use ETH for the scrub check penalty instead of RPL; this is primarily needed to enable something similar to RPIP-62

Valdorff · 17 September 2024 06:11

Kane brought up a good point on github about whether we should tweak the quorum, in addition to the quorum guardrail. The reason to do so would be to avoid getting stuck – 51% of the active vote is quite a high bar.

Given that we have about 50k/94k vote initialized, I think we should do something near the end goal of 15% when everything is initialized. This would be 15% * 94k/50k = 28.2%. Maybe go with 25% or 30% cuz yay round numbers?

langers · 17 September 2024 11:53

Yes that’s a great point. I have raised a PR on the RPIP to update the quorum to 30%.

epineph · 17 September 2024 11:59

So this is a bit in the weeds, but going to make a few points anyway.

we should consider if quorum is the gatekeeper we want. In a quorum system, when quorum is uncertain, supporters of the less popular side are incentivized to boycott and hope quorum isn’t reached. This gives rise to weird gaming behavior, and obscures actual disagreement as the vote appears extremely lopsided.
As an example, which vote represents the will of the DAO better:

14% in favor and 0% opposed, or
8% in favor, 7% opposed?

Consider if we should instead enact something like [most popular outcome that reaches X % of total vote power], because the outcome of “no action” is just status quo, not anarchy like an political election.
There are some things (MC elections eg) where quorum is the better system.

2). I agree 50% active vote quorum (if we use quorum) is too high. however, I think the onus is on the dao to figure out ways to a) get more people initialized and b) get more people voting. I voted against RPIP-4 for this reason and I still maintain a system where 7.6% of vote is required to pass law is broken. Passing something on-chain should be hard-ish, and there should be an expectation that NOs vote OR delegate. And if required, incentives. Our best protection against malicious attack is high fidelity honest voters. So 25-30% quorum of active vote makes sense to me, even after we initialize most voters.
This DAO is not LDO, and we should have higher expectations of ourselves.

Valdorff · 17 September 2024 12:33

I think it’s worth noting that Lido quorum is 5% and they have no sublinear scaling at all. The largest single wallet is at 5%.

It’s fine to aspire even higher, but 3x (15%) the requirement alongside a more robust source of vote power (must be staked on a node, sublinear scaling) is a massive difference.

I do agree that it would be better to use “% in favor” as a minimum threshold for votes instead of “% that voted”, as the latter creates some perverse edge cases (eg: 14.99% of the vote is in, and it’s favor; if I’m against I can still “win” here). That said, I don’t think this patch (bugfixes and enabling changes that have no impact without a further change) is the right place for that kind of change.

knoshua · 17 September 2024 23:00

As with the original Houston RPIPs, the DAO only gets looped in after the team has decided what to do, done it and booked an audit. This undermines the RPIP process. I called this out back then and langers promised to do better in the future, but here we are again with the exact same thing.

I believe there should be separate RPIPs for separate issues. It makes it a lot easier to write and work on and also easier to read.

I think RPIP-63 is partly unnecessary. The scrub attack doesn’t appear viable since it requires a huge amount of attacker capital, a huge amount of capital in the deposit pool, an oDAO that is non responsive and a guardian/security council that is non responsive. It would also be possible to reduce viability further with setting changes alone (lower scrub treshold, longer scrub period, …).

Overall, I will oppose this RPIP based on process. The pDAO should make it clear that this behavior is unacceptable and we’ve seen that simply telling them does not work.

langers · 19 September 2024 05:25

The Houston hotfix was intended to be a bug fix but later in development we incorporated community feedback. As a consequence there were elements that we felt needed pDAO ratification and so we reached out to the community for their opinion on how to proceed in terms of governance. We have been consultative and collaborative throughout the process.

We agree the best process is the DAO ratifies before work starts but in this case the process path was unclear.

Although the attack requires a lot of capital, it has a very high pay off - stealing of 24 ETH of rETH user funds each minipool they get through. It can also be achieved in a timing that the oDAO and guardian may not be able to response (minutes). Using settings has other trade offs that are worse for trust/decentralisation or UX.

Ultimately, our auditors see it as an issue and we would rather not take that risk.

Obviously it is your (and your delegates’) vote to cast.

Making a statement here, makes perfect sense - despite what you think, we do listen and take things on board. I agree the process followed was not optimal because the path was not initially clear (we did our best to consult) but voting against key functionality that is beneficial to the protocol seems shortsighted.

knoshua · 19 September 2024 05:42

I don’t see how, please explain.

As I said in my original post, I believe this should be multiple RPIPs. By bundling them, you are not giving me the opportunity to vote for key functionality that is beneficial to the protocol while opposing ideas that I disagree with.

langers · 19 September 2024 05:44

@epineph I acknowledge this is an interesting subject but the ramifications are much larger than the current hotfix. I think this is worth discussing but I do not believe it should impede the hotfix RPIP progress. Which I don’t think is the intention right?

langers · 19 September 2024 05:49

Correction - it is 12 hours or the scrub time they have to respond but as I stated the stakes are high and ultimately we would rather be as safe as possible.

langers · 19 September 2024 05:53

No other community members during our consultation process expressed that concern. Happy to take that feedback on board though. If other community members feel it is important we will go through the redrafting process. If not, we should probably proceed but we will consider that feedback for future RPIPs.

langers · 21 September 2024 06:12

This discussion has raised the point about setting the default quorum value, which has been addressed. Other feedback, around the nature and effectiveness of quorum, and process improvement going forward, does not seem to be a blocker for proceeding to sentiment poll.

I feel comfortable moving on to a sentiment poll. As usual, this will lead to a Snapshot vote in 7 days if there is promising community sentiment.

The RPIP is close to its final form now, with no expected changes other than potential clarifications and minor error corrections. Since the creation of this discussion thread last week, there have been these changes:

Set default pDAO quorum to 30% (which is effectively 15% when you take into account initialised vote power)
Added @knoshua as author because I reused his security consideration section from RPIP-62

RPIP-63 Sentiment Poll

Support moving to vote; I think this proposal is great
Support moving to vote; I think this is good enough
Oppose moving to vote; I have a specific issue I’m mentioning in the comments below
Oppose moving to vote; other
Undecided; I have a specific question I’d like clarified in the comments below
Undecided; other

0 voters

langers · 26 September 2024 00:47

Sorry for the last minute addition.

During devnet testing we discovered that “auction.log.bidding.enabled” and “auction.lot.create.enabled” are incorrectly not able to be modified by security council (missing from the list of parameters allowed).

We would like to fix this non-conformance with RPIP-33 (RPIP-33: Implementation of an On-Chain pDAO) in this hotfix release. I have raised a PR for the addition.

The addition is a bugfix and not a functional change. The change is extremely small and does not require additional audit time.

Please let me know if there are any concerns.

Valdorff · 26 September 2024 03:41

That seems reasonable. I don’t think those would traditionally need a vote on their own (as they are true fixes that fix the implementation of an earlier RPIP).

knoshua · 30 September 2024 13:00

I voted against for the following reasons:

I believe completely unrelated features should have separate RPIPs with separate votes. While I’m in favor of some of the changes here, I disagree with others and the only option given to me is to vote in favor of all or against all.
We shouldn’t vote on fixes for bugs that are due to deviations from already voted on specifications (as with the auction settings) or are not conflicting with existing specifications and are clearly implementation mistakes (as with the reverting due to division by 0 and finalising legacy minipools issue). Doing so isn’t only unnecessary, but also sets a bad precedent for the future. We may not want to draw attention to existing bugs before fixing them.
I believe the scrub penalty on ETH is not necessary because alternative mitigations for ETH only minipools were available. Including it in this hotfix may have delayed the remaining fixes and therefore delayed ETH only minipools.
The team continues to ignore the RPIP process by making decisions, implementing changes and booking audits ahead of asking for DAO input. This is a continuation from how they handled the Houston RPIPs. I do not buy langers attempts to explain this behavior in this thread. As the github commit history shows, elements that clearly needed pDAO ratification were implemented a month before this forum post (for example Replace scrub RPL penalty with ETH one on August 15). So even if they initially viewed this as bug fixes only, it turned into something else a month before they bothered to involve the DAO and that gap in the timeline remains unexplained.

Patches · 30 September 2024 13:44

Knoshua has a compelling point about bundling, in that I at least believe a path forward without changing to an eth penalty for scrubbed minipools does exist and may merit more discussion, but I feel obligated to vote in favor of fixing the divide-by-0 reverts to unblock eth only minipools.

That said I’ll vote in favor out of expedience. If RPIP-62 didn’t contain bug fixes blocking RPIP-63, I would have considered punting, but I am also cognizant of the handful of node operators who have funds locked in the ecosystem due to the finalization bug as well.

peteris · 30 September 2024 16:18

Voting in favor to make this happen.

But @knoshua raises valid objections.

Valdorff · 3 October 2024 01:44

All right… Being an RPIP editor and providing feedback on the rpip, I have a bit of a complicated relationship here.

An outright mistake, partly mine: The true bug components should not have been in this or any other voted rpip. They should have been in an info RPIP. I should have realized that as it happened.

Something I think the dao and the rpip editors can get better at: The bundling, if we remove the true bug fixes, is a lot less than it seems (2-3 items depending how you count). That said, I do agree that it’s better to vote for things separately - combining things makes it harder for ppl to vote effectively.

Something I think the team can get better at: I do think the scrub stuff should have been raised before coding, or (if the coding wasn’t a huge effort) when the team’s proposal was clear (as shown by the commit). This would have been sentiment polling mid August, and I don’t think it would’ve delayed things.

Votesplanation: With all that I mind, I’m voting Abstain. Let’s get better and be the best dao in web3.

Valdorff · 16 October 2024 04:35

Verifying the right thing is deployed

The RP team provide a tool to help with this at GitHub - rocket-pool/verify-1.3.1.

I don’t really understand the internals of this script toooo well, but what it’s doing is grabbing the verified source from etherscan and comparing it to the git repo at the 1.3.1 tag. I was able to get a happy run, and also build confidence by making specific changes in my local copy of the source and seeing that they got properly identified. It’s pretty cool and shows the actual difference, not just that there is a difference.

As an aside to anyone trying to use it, I did tweak line 62 of verify.js. It was allowing 220ms between calls to the etherscan API, which should be enough, but was failing for me. I set it to 320 and it ran just fine.

Looking at post-audit changes

Sigma prime was up to 6f07f2a.
The 1.3.1 tag is what is deployed, and includes some additional commits after the audit.

The remaining diff: https://github.com/rocket-pool/rocketpool/compare/6f07f2a0bd31c6cf84196566e473d643a51cded6...v1.3.1. I’m only ok at solidity, but the diff is pretty small, so let’s do our best and take a look.

In human language, roughly prioritized, the differences I see are:

(,uint224 currentValue,) = rocketNetworkSnapshots.latest(key); corrected to (,,uint224 currentValue) = rocketNetworkSnapshots.latest(key); when applying ETH match corrections
- the function on the right returns return (true, checkpoint._block, checkpoint._value), and we definitely want the value here, not the block (the expression on the left is using tuple assignment)
- this bug wasn’t caught in the audit
A parameter in _initialiseVoting that wasn’t updated and wasn’t caught in audit
Add missing security council permissions for auction.lot.create.enabled and auction.lot.bidding.enabled (noticed after audit iirc?)
Modify pDAO onchain quorum to 30% (noticed, discussed with community, and decided after audit)
An extra sanity check require() when applying ETH match corrections (that the number is >=0)
Sigma Prime’s RPHF-01; 7 days definition
Comment changes

TL; DR

lgtm