Skip to content

[PM-20614] Cose Content Type #203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 43 commits into
base: main
Choose a base branch
from
Open

[PM-20614] Cose Content Type #203

wants to merge 43 commits into from

Conversation

quexten
Copy link
Contributor

@quexten quexten commented Mar 27, 2025

🎟️ Tracking

https://bitwarden.atlassian.net/browse/PM-20614

📔 Objective

Encrypt type 7 (CoseEncrypt0 with currently Xchacha20poly1305) requires specifying a content format. Since the encrypted content is just a byte array, this alone is not enough to understand what format the data contained has. In some cases, multiple formats may be OK.

These cases include:

  • Private keys
    • PKCS8 DER / CoseKey
  • Symmetric keys
    • Bitwarden Custom Key Byte Serialization ("just key bytes concatenated (and padding if it is a CoseKey)"), CoseKey
  • Utf8 for most cipher fields
  • Future:
    • Blob Cipher: Json/Cbor?

To distinguish between these, we need to introduce a content format that is passed through until the encryption happens. For AES256-CBC-HMAC, it gets ignored, for CoseEncrypt0/Xchacha20poly1305, it gets written as a protected header onto the encstring/encrypt0 object. We do not add validation on the decryption path in this PR.

For key wrapping, when wrapping cosekeys with cosekeys, the Bitwarden Custom Key Byte Serialization with padding is not used, but the raw CoseKey without padding is used; but can be distinguished from other keys on unwrapping by inspecting the content format.

Further to ensure the content format is passed along, the encryptable trait is split into 3 traits by this PR.
CompositeEncryptable is a trait that represents encryption of composite objects. Composite objects here are objects that have sub-encyrptables under the same key. For instance, a cipher is a composite encryptable, since all fields are individually encrypted. It also has some sub-composite encryptables (Uri). Composite encryptables only defer encryption to one of the other traits, and do not themselves encrypt directly, and thus also don't have a content type.

Next, some types always have a unique content type associated. String and &str are always Utf8. It would be cumbersome and error-prone to always pass Utf8 along here, and so a PrimitiveEncryptableWithContentType is introduced.

Finally, the regular encryptable is implemented on more generic types (Vec<u8>), but for these types it is not clear what content-type they should have, so the content format must be passed along by the caller.

For the Utf8 content type specifically, for XChaCha20Poly1305 encryption, a padding scheme is added, in order to hide the direct relationship of ciphertext length to plaintext length. The content type written onto the encrypt0 object is a custom type to indicate this.

⏰ Reminders before review

  • Contributor guidelines followed
  • All formatters and local linters executed and passed
  • Written new unit and / or integration tests where applicable
  • Protected functional changes with optionality (feature flags)
  • Used internationalization (i18n) for all UI strings
  • CI builds passed
  • Communicated to DevOps any deployment requirements
  • Updated any necessary documentation (Confluence, contributing docs) or informed the documentation
    team

🦮 Reviewer guidelines

  • 👍 (:+1:) or similar for great changes
  • 📝 (:memo:) or ℹ️ (:information_source:) for notes or general info
  • ❓ (:question:) for questions
  • 🤔 (:thinking:) or 💭 (:thought_balloon:) for more open inquiry that's not quite a confirmed
    issue and could potentially benefit from discussion
  • 🎨 (:art:) for suggestions / improvements
  • ❌ (:x:) or ⚠️ (:warning:) for more significant problems or concerns needing attention
  • 🌱 (:seedling:) or ♻️ (:recycle:) for future improvements or indications of technical debt
  • ⛏ (:pick:) for minor or nitpick changes

Copy link
Contributor

github-actions bot commented Mar 27, 2025

Logo
Checkmarx One – Scan Summary & Details738476ed-7aa1-4a0f-b8e2-42f92e1063a0

Great job, no security vulnerabilities found in this Pull Request

@quexten quexten changed the title Km/cose content format Cose Content Type Mar 27, 2025
Copy link

Copy link

codecov bot commented Mar 28, 2025

Codecov Report

Attention: Patch coverage is 86.07306% with 61 lines in your changes missing coverage. Please review.

Project coverage is 70.54%. Comparing base (38654e9) to head (9d0ea55).

Files with missing lines Patch % Lines
...rates/bitwarden-crypto/src/keys/key_encryptable.rs 0.00% 13 Missing ⚠️
crates/bitwarden-crypto/src/store/context.rs 88.88% 12 Missing ⚠️
crates/bitwarden-crypto/src/cose.rs 92.40% 6 Missing ⚠️
...rates/bitwarden-crypto/src/enc_string/symmetric.rs 82.35% 6 Missing ⚠️
crates/bitwarden-crypto/src/traits/encryptable.rs 86.66% 6 Missing ⚠️
crates/bitwarden-vault/src/cipher/cipher.rs 96.42% 2 Missing ⚠️
crates/bitwarden-vault/src/cipher/login.rs 60.00% 2 Missing ⚠️
crates/bitwarden-core/src/secrets_manager/state.rs 0.00% 1 Missing ⚠️
crates/bitwarden-crypto/src/keys/utils.rs 95.83% 1 Missing ⚠️
crates/bitwarden-exporters/src/export.rs 0.00% 1 Missing ⚠️
... and 11 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #203      +/-   ##
==========================================
+ Coverage   70.32%   70.54%   +0.22%     
==========================================
  Files         217      217              
  Lines       16992    17287     +295     
==========================================
+ Hits        11949    12195     +246     
- Misses       5043     5092      +49     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Base automatically changed from km/cose to main May 6, 2025 15:13
@quexten quexten changed the base branch from main to km/wrapped-key May 7, 2025 10:47
@quexten quexten changed the base branch from km/wrapped-key to main May 8, 2025 23:05
@quexten quexten changed the title Cose Content Type [PM-20614] Cose Content Type May 8, 2025
@quexten quexten force-pushed the km/cose-content-format branch from f862c19 to e04b781 Compare May 9, 2025 09:20
@quexten quexten force-pushed the km/cose-content-format branch from e04b781 to bc5900b Compare May 9, 2025 09:27
pub enum ContentFormat {
Utf8,
Pkcs8,
CoseKey,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion question: Are we fine with OctetStream for old bitwarden keys? Should we make a custom content format - "BitwardenLegacyKey"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems reasonable to me to create our own custom content format. Attachments and file sends would probably also use the same content type, and there maybe be some others as well, like the FIDO2 private key (or is that just a base64 string?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so for FIDO2, the private key seems to be a PKCS8 DER, B64 encoded (but without the typical PEM header/footer) before encrypting. So we cannot give it the pkcs8 content type, as that refers to PKCS8 DER without B64, but string/our padded string seems reasonable. (Or, I guess we could skip the b64 encoding step when encrypting with a COSE key, and mark it as PKCS8? I'm not sure whether that is worth the effort to ensure correctness with old encryption in all places...)

I'm not sure about files (attachments/send attachments); they seem like just byte buffers, so octetstream seems fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going with a BitwardenLegacyKey, is there a reason not to use this enum to precisely define content of anything that encrypted?

That definitely has value to me from a type safety perspective, but I don't know if a format flag is the right place to do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with using a custom content format here.

Some things to keep in mind: Custom content formats - in contrast to standardized ones - are defined using strings. The standardized ones sadly don't offer a privateuse category:

0-255 | Expert Review
-- | --
256-9999 | IETF Review or IESG Approval
10000-64999 | First Come First Served
65000-65535 | Experimental use (no operational use)

(This is also the case with encstrings, that contain padded utf8 text, which are much more common right now).

The second: So far I've introduced content formats only where they are required to distinguish. (PKCS8 / CoseKey, Utf8/Octetstream CoseKey/OctetStream[or BitwardenLegacyKey].). OctetStream should be unique here, but BitwardenLegacyKey makes this much clearer.

Both of these are not a big deal, and don't hinder using it, and I'm happy to change to it. Given the context, should I change it?

/// Implementations should generally consist of calling [Encryptable::encrypt] for all the fields of
/// the type.
pub trait Encryptable<Ids: KeyIds, Key: KeyId, Output> {
pub trait CompositeEncryptable<Ids: KeyIds, Key: KeyId, Output> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things about the traits:

It seems weird that we have CompositeEncryptable, PrimitiveEncryptableWithContentType, and then the other one is just called Encryptable, but it's also a primitive. I can think of some other ways to name them:

  • CompositeEncryptable,PrimitiveEncryptable, PrimitiveEncryptableWithContentType. All their names with prefixes, so there's never any confusion.
  • Encryptable,PrimitiveEncryptable, PrimitiveEncryptableWithContentType. Composite is the only one that should be implemented outside the crypto crate, so it gets the simpler name. I would probably go with this one as it would probably make a smaller diff for this PR.

This can be left for a followup PR, but now that we have the split, I think it would be a good idea to ensure that the Primitive traits can't be implemented outside this crate. We can use sealed traits for that:

// This trait is only accessible as part of a pub(crate) module, so no one else would be able to implement it
pub(crate) mod private {
    pub trait Sealed {}
}
pub trait CompositeEncryptable<Ids: KeyIds, Key: KeyId, Output>: private::Sealed {

Should we also have a similar split for Decryptable? Presumably to validate the content types there.


Now that we have multiple encryptable traits, I wonder if it would be a good idea to publicly export the traits module and encourage users to do use bitwarden_crypto::traits::*; to avoid having to import them all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also have a similar split for Decryptable? Presumably to validate the content types there.

We can; For now I'm ok not validating them except for CoseKey. CoseKey is the only option where we actually currently distinguish based on content type for now. So is it fine keeping that as a separate ticket?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CompositeEncryptable,PrimitiveEncryptable, PrimitiveEncryptableWithContentType

I prefer this approach, and went with it!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding trait sealing:

How important is that, really? For primitives like Option, String, &str, these can't be expanded outside of the crate since those crates won't own a type. It seems worthwhile allowing people to define fixed content type impls if they want to. In fact, that seems like the safer direction to go for a given type.

@Hinton
Copy link
Member

Hinton commented May 27, 2025

Please fill out the PR template.

@quexten
Copy link
Contributor Author

quexten commented May 27, 2025

Please fill out the PR template.

@Hinton Sorry, not sure what happened here, maybe GitHub ate the edit of the description at the time. Fixed now.

@@ -1,49 +1,116 @@
use crate::{store::KeyStoreContext, CryptoError, EncString, KeyId, KeyIds};
use crate::{store::KeyStoreContext, ContentFormat, CryptoError, EncString, KeyId, KeyIds};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want a more clear explanation about the difference between the traits. We should be mindful that we'd eventually want to migrate to blob based encryption and a more flexible data model for ciphers which will at least partially migrate us away from nested encrypt.

Copy link
Contributor Author

@quexten quexten May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try, though it would be helpful if you pointed out specific questions to where the PR's description falls short so I know which areas to focus on.

CompositeEncryptable:
This trait is used for structs that are not directly encrypted, but have child encryptables (such as the fields for a cipher). Subsequently, there is no associated content type.

Blob encryption is separate, and the split makes this very explicit. If we wanted to implement blob encryption, we would implement PrimitiveEncryptableWithContentType, additionally. The impl could look something like:

fn encrypt(...) {
  let cipher_blob: String = cipher.to_json();
  cipher_blob.encrypt(&ctx, &key) // String has content type UTF8
}

Depending on whether you want the cipherblob or the composite encryption (which we deprecate at that point), we would then call .encrypt, or .encrypt_composite.

PrimitiveEncryptableWithContentType:
This is implemented for any types that have a unique content type associated. For strings this is Utf8. The caller cannot pass along their own ContentFormat.

Encryptable:
This is implemented for any types that do not have a unique content type associated. The caller MUST pass along the ContentFormat.

Copy link
Member

@Hinton Hinton May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the confusion, I want documentation in this file as top level documentation of the module

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, okay, yeah I agree. I'll expand it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@Hinton Hinton requested a review from MGibson1 May 27, 2025 10:31
Copy link
Member

@MGibson1 MGibson1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly piling onto previously existing conversations

Comment on lines 22 to 32
pub(crate) fn pad_bytes(bytes: &mut Vec<u8>, min_length: usize) {
// at least 1 byte of padding is required
let pad_bytes = min_length.saturating_sub(bytes.len()).max(1);
let padded_length = max(min_length, bytes.len() + 1);
bytes.resize(padded_length, pad_bytes as u8);
}

/// Unpads bytes that is padded using the PKCS7-like padding defined by [pad_bytes].
/// The last N bytes of the padded bytes all have the value N.
/// For example, padded to size 4, the value 0,0 becomes 0,0,2,2.
pub(crate) fn unpad_bytes(padded_bytes: &[u8]) -> Result<&[u8], CryptoError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd these mutate, are you trying to take advantage of some Vec amortized resizing costs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I think the only goal in the xchacha pr was to make the code readable.

Comment on lines +150 to +156
(EncString::Aes256Cbc_B64 { iv, data }, SymmetricCryptoKey::Aes256CbcKey(key)) => {
SymmetricCryptoKey::try_from(crate::aes::decrypt_aes256(
iv,
data.clone(),
&key.enc_key,
)?)?
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not an invalid key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this asking whether Aes256-CBC is still supported? Or is this asking whether this code is correctly implementing it?

The former is dropping partially, but still supported for masterkey-wrapped-userkeys (but nothing else). For the latter, I'm not sure what "invalid key" means here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think we can drop EncString::Aes256Cbc_B64 { .. } entirely, but that should be separate work.

//! Some (legacy) encryptables are made up of many small individually encrypted items. For instance,
//! a cipher is currently made up of many small `EncString`s and some further json objects that
//! themselves contain `EncString`s. The use of this is generally discouraged for new designs.
//! Still, this is generally the only trait that should be implemented outside of the crypto crate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intention that this is true in the long term? I expect that the ideal is this trait goes away and direct an evolutionary serialization (in the sense of EDD) to &[u8] is more desired.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we say "a cipher is currently made up". The implication is that there is an intent to change this. The concrete intent to change this is blob ciphers.

/// Implementations should generally consist of calling [Encryptable::encrypt] for all the fields of
/// the type.
pub trait Encryptable<Ids: KeyIds, Key: KeyId, Output> {
pub trait CompositeEncryptable<Ids: KeyIds, Key: KeyId, Output> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding trait sealing:

How important is that, really? For primitives like Option, String, &str, these can't be expanded outside of the crate since those crates won't own a type. It seems worthwhile allowing people to define fixed content type impls if they want to. In fact, that seems like the safer direction to go for a given type.

@quexten quexten requested review from dani-garcia and MGibson1 June 4, 2025 08:49
@quexten quexten requested a review from a team as a code owner June 6, 2025 15:02
@quexten quexten requested a review from gbubemismith June 6, 2025 15:02
Copy link

sonarqubecloud bot commented Jun 6, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants