Make 6-duplicate-mutable-accounts lint handle more cases #7

Open
oslfmt opened this issue Jul 13, 2022 · 5 comments
Comments

@oslfmt
Contributor

oslfmt commented Jul 13, 2022

The current lint implementation handles the case of exactly two duplicate accounts just fine, because only a single constraint is needed: a check that the keys of a and b are not equal.

If there are more than 2 duplicate mutable accounts in the program, things become more complex, for two main reasons. First, as more identical accounts are added, the number of constraints does not scale linearly: every pair of accounts needs its own key-inequality constraint, so n accounts need n(n-1)/2 constraints in total. For example, if we have 3 accounts, we don't have 1 extra constraint but 2 extra constraints, and adding 1 account to bring the total to 4 means adding 3 more constraints to make sure all keys are unique.

Second, when multiple constraints are needed, they don't all have to be specified in a single #[account] macro. Instead of one macro with the constraints comma-separated, like #[account(constraint = x, constraint = y, ...)], the program may use several different macros, partitioning the set of required constraints in whatever way, for whatever reason.
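To make that concrete, here is a sketch (struct and account names are hypothetical, not taken from the lint's test cases) of three duplicate mutable accounts where the 3 required pairwise constraints are split unevenly across the fields' #[account] attributes:

```rust
use anchor_lang::prelude::*;

// 3 duplicate mutable accounts need C(3,2) = 3 pairwise key constraints.
// Anchor lets the author spread them across the #[account] attributes
// however they like; here `a` carries none and `c` carries two.
#[derive(Accounts)]
pub struct Update<'info> {
    #[account(mut)]
    pub a: SystemAccount<'info>,
    #[account(mut, constraint = b.key() != a.key())]
    pub b: SystemAccount<'info>,
    #[account(mut, constraint = c.key() != a.key(), constraint = c.key() != b.key())]
    pub c: SystemAccount<'info>,
}
```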

@oslfmt
Contributor Author

oslfmt commented Jul 13, 2022

Issue (2) should already be handled by the current implementation: the check_attribute function catches any attribute macro with account as its name.
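The idea is roughly the following; this is a standalone illustration using syn 2.x, not the lint's actual check_attribute:

```rust
// Standalone sketch (not the lint's check_attribute): collect every field of a
// struct that carries at least one #[account(...)] attribute, regardless of how
// the constraints are partitioned across attributes.
use syn::{parse_str, Data, DeriveInput};

fn fields_with_account_attr(src: &str) -> Vec<String> {
    let ast: DeriveInput = parse_str(src).expect("valid struct source");
    let mut hits = Vec::new();
    if let Data::Struct(s) = ast.data {
        for field in s.fields {
            // match on the attribute path being exactly `account`
            if field.attrs.iter().any(|a| a.path().is_ident("account")) {
                if let Some(ident) = field.ident {
                    hits.push(ident.to_string());
                }
            }
        }
    }
    hits
}
```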

@ndrewh
Contributor

ndrewh commented Jul 13, 2022

I wonder if doing this at the MIR level would be better, since there's no guarantee that the contract would even use the #[account] macro for the constraints, and they could formulate the constraints in any number of ways, right?

I'm thinking maybe you could look at every function body, and if there are any arguments that have an accounts struct with multiple accounts of the same type, then you can walk the MIR (?) and make sure they call .key() on all of them?

  • I'm not sure how the macro gets expanded but the check has got to be in MIR somewhere, right?
  • I'm not sure this is easily solvable for large N because at some point, if you have, say, an array of accounts, you might just use a hash table (sketched below) to check for duplicates instead of doing N^2 comparisons. So while you could try to pattern-match looking for checks like a.key() == b.key() ... this is probably only useful for N=1, is my guess.
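For reference, the hash-table approach mentioned above could look like this (plain std, with keys represented as raw 32-byte arrays so the snippet stands alone):

```rust
use std::collections::HashSet;

// O(N) duplicate detection instead of N^2 pairwise .key() comparisons.
// A program handling an array of accounts might well do this, which is why
// pattern-matching on a.key() == b.key() checks alone won't cover every case.
fn has_duplicate_keys(keys: &[[u8; 32]]) -> bool {
    let mut seen = HashSet::new();
    // `insert` returns false when the value was already present
    keys.iter().any(|k| !seen.insert(*k))
}
```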

@oslfmt
Contributor Author

oslfmt commented Jul 14, 2022

I wonder if doing this at the MIR level would be better, since there's no guarantee that the contract would even use the #[account] macro for the constraints, and they could formulate the constraints in any number of ways, right?

Yeah, this is a good point. My plan was/is to create a lint that is guaranteed to work for all Solana programs using the Anchor framework. This would simplify things a bit, allowing the lint to look at just the struct that has #[derive(Accounts)] on it, due to the way the framework is organized. Then it could check that the #[account] constraints are there. The problem is, as you pointed out, they may do the check in a function body instead (as in the sketch below), and the lint would flag the current secure example as a false positive.
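For concreteness, this is the kind of in-body check that would be missed by a lint that only inspects #[account] constraints (struct and account names are hypothetical):

```rust
use anchor_lang::prelude::*;

#[derive(Accounts)]
pub struct Update<'info> {
    #[account(mut)]
    pub a: SystemAccount<'info>,
    #[account(mut)]
    pub b: SystemAccount<'info>,
    #[account(mut)]
    pub c: SystemAccount<'info>,
}

// The duplicate-account check lives in the handler body rather than in the
// #[account] attributes, so the Accounts struct alone looks "unchecked".
pub fn handler(ctx: Context<Update>) -> Result<()> {
    // all C(3,2) = 3 pairs are compared, so .key() is called on each account twice
    require_keys_neq!(ctx.accounts.a.key(), ctx.accounts.b.key());
    require_keys_neq!(ctx.accounts.a.key(), ctx.accounts.c.key());
    require_keys_neq!(ctx.accounts.b.key(), ctx.accounts.c.key());
    Ok(())
}
```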

I'm thinking maybe you could look at every function body, and if there are any arguments that have an accounts struct with multiple accounts of the same type, then you can walk the MIR (?) and make sure they call .key() on all of them?

This is sorta what I initially had in mind too. It may need some more stringent checks than just making sure .key() is called on each account though; for example, if we have 3 accounts a, b, c, .key() needs to be called on each twice. If the program doesn't do the key check the "recommended" way (with the Anchor macro), then I was thinking of just accepting it as a false positive and encouraging doing it the recommended way. Not sure if this is the best strategy for linting though.

I'm not sure how the macro gets expanded but the check has got to be in MIR somewhere, right?

What exactly do you mean by the check being in MIR? For macro expansion, I thought that happened earlier, before lowering to MIR. I was actually looking at the expanded code this morning, and essentially the check happens in #[derive(Accounts)], which implements try_accounts(); the constraint code is basically embedded somewhere in that function. So yet another strategy would be to combine something like what you outlined above with a walk over the try_accounts() function.

@ndrewh
Contributor

ndrewh commented Jul 14, 2022

What exactly do you mean by the check being in MIR? For macro expansion, I thought that happened earlier, before lowering to MIR. I was actually looking at the expanded code this morning, and essentially the check happens in #[derive(Accounts)], which implements try_accounts(); the constraint code is basically embedded somewhere in that function.

Yeah, what I meant is I suspected you'd be able to look for the check somewhere in the emitted MIR, since macro expansion happened long before. You found out that the check ends up in try_accounts, which sounds great.

It may need some more stringent checks than just making sure .key() is called on each account though; for example, if we have 3 accounts a, b, c, .key() needs to be called on each twice.

My point is they don't actually even have to do all the pairwise comparisons. For larger N they could just make a HashSet and insert each key, checking the return value of insert to detect duplicates. So I'm not sure the number of calls to key() really tells you much.

If the program doesn't do the key check the "recommended" way (with the Anchor macro), then I was thinking of just accepting it as a false positive and encouraging doing it the recommended way.

Yeah, you're right, this is probably easiest and the best we can realistically do imo. My point is that I don't think anybody is actually going to put all the constraints into the macro in this way for N > 3, so I'm thinking don't worry about it that much. So maybe for N > 3, or if there's an array of accounts or something, just always flag it as "might not check for duplicate mutable accounts" and call it done? Rather than trying to pattern-match all the different ways they could do duplicate detection (a HashSet, a bunch of comparisons, ...).

@oslfmt
Contributor Author

oslfmt commented Jul 14, 2022

So maybe for N > 3, or if there's an array of accounts or something, just always flag it as "might not check for duplicate mutable accounts" and call it done?

Gotcha. Yeah this is not a bad idea. I'll continue to think about it.
