Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store hashed token in DB #707

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

Store hashed token in DB #707

wants to merge 1 commit into from

Conversation

eternal-flame-AD
Copy link
Member

fixes #325.

Uses sha256+salt for a compromise between performance and not storing plain text tokens.

@eternal-flame-AD eternal-flame-AD requested a review from a team as a code owner October 23, 2024 07:57
@eternal-flame-AD eternal-flame-AD marked this pull request as draft October 23, 2024 07:57
@eternal-flame-AD eternal-flame-AD changed the title hashed token generation facility Store hashed token in DB Oct 23, 2024
@@ -24,6 +30,64 @@ func randIntn(n int) int {
return int(res.Int64())
}

// Convert a Token to its hashed representation.
func HashToken(s string, salt []byte) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not using bcrypt like we do for hashing passwords?

Copy link
Member Author

@eternal-flame-AD eternal-flame-AD Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too expensive and poses a DoS concern.

Tokens are guaranteed to be pseudorandom and have enough entropy, so many of the mitigations by KDFs like bcrypt is not required. The attacker will need to enumerate every possible key to find the result which is usually not true for user-chosen passwords.

Theoretically you can do a bloom filter with a salt or something ... but it is too much effort for the unlikely scenario this original reported attack will have sufficient preconditions to be exploited and not enough preconditions such that the attacker gain no additional practical advantage that they do not already automatically have.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too expensive and poses a DoS concern.

we could use a low strength / iteration count. The big benefit I see we have a prefix we can check if the token is already hashed $2a$ and the salt is already built in. Less code we have to write.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to salt at all after a second thought. See #707 (comment). We need constant time lookup ourselves. Using any individualized salting (built-in to bcrypt or custom rolled) make this impossible.

@p1gp1g
Copy link

p1gp1g commented Oct 24, 2024

fixes #325.

Uses sha256+salt for a compromise between performance and not storing plain text tokens.

It must not be salted, the token have to be unique

@eternal-flame-AD
Copy link
Member Author

fixes #325.
Uses sha256+salt for a compromise between performance and not storing plain text tokens.

It must not be salted, the token have to be unique

Can you elaborate? I can see why you may argue it might not have to be but why it must not be?

@p1gp1g
Copy link

p1gp1g commented Oct 25, 2024

Passwords don't have to be unique because they are used with a user account id which is unique. So when you login with login+password you know what account does the action, and you test the password against a single hash. (Even if some website tried to force unique password and reply "another user already use that password" 😬 )

For API token, this is different. They are not used with an account id. So they are necessary unique (a token can't identify 2 users). If the tokens are salted: when you generate a new token, you can't easily check if this token already exists, you have to test the new token with all the salts.

There is a 2nd thing: let's say you take the risk of non-unique tokens and store their hash salted. When you receive a request with a token, you will have to hash the token with all salt to see if it matches one (you don't have account id again). This is very inefficient. Without salt, the token is hashed once, and then you can check if the hash is present in the db (the only difference with today's implementation is that we are currently checking if the token is in db)

@eternal-flame-AD
Copy link
Member Author

eternal-flame-AD commented Oct 25, 2024

Passwords don't have to be unique because they are used with a user account id which is unique. So when you login with login+password you know what account does the action, and you test the password against a single hash. (Even if some website tried to force unique password and reply "another user already use that password" 😬 )

For API token, this is different. They are not used with an account id. So they are necessary unique (a token can't identify 2 users). If the tokens are salted: when you generate a new token, you can't easily check if this token already exists, you have to test the new token with all the salts.

There is a 2nd thing: let's say you take the risk of non-unique tokens and store their hash salted. When you receive a request with a token, you will have to hash the token with all salt to see if it matches one (you don't have account id again). This is very inefficient. Without salt, the token is hashed once, and then you can check if the hash is present in the db (the only difference with today's implementation is that we are currently checking if the token is in db)

Yes you are right I can't salt it individual and be able to look it up constant time. Need another way.

You are correct that tokens must be unique. However whether I choose to store it salted or not doesn't change any of this?

Yes theoretically if you generate two random tokens there is a non zero chance for it to accidentally collide. The chance of a ~96 bit (current token length) token collision is 1 every 2^48 tokens (birthday paradox). It would be overwhelmingly more likely that all your drives and backups happened to crash at the same time than for someone to generate enough tokens for this to happen. ( I calculated if an attacker generate 100000 tokens every second it takes almost 100 years on average to get a collision, the database would have crashed for the amount of data)

There is a valid concern that if one day the hash function gets broken and attackers can silently generate collisions for a given hash that are not easily detected but I think this is way outside of our threat model.

I don't feel it is really needed to wire the generation logic to the database. Actually I think it actually creates more security risks of timing or side channel leaks of other kind indicating the distribution of tokens

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Security: Store api token hashed
3 participants