WIP: Adding Identity Hooks for Enhanced Privacy and Consistency

- [x] I have read the [CLA Document](https://github.com/creator-assertions/identity-assertion/blob/main/contributor-license.md) and I hereby sign the CLA.

----------------------------------------

WORK IN PROGRESS

----------------------------------------

I brought up the concept of “identity hooks” in a previous meeting and want to flesh out why I think this is an essential feature for an identity credential and how I propose it be incorporated into CAWG.

In my view, this change could be implemented in the CAWG standard with relatively minimal overhead, though it does change some of the infrastructure that identity claims aggregators and claim generators would be expected to maintain.

With that said, the potential benefits to privacy, consistency, extensibility, and user empowerment that identity hooks can provide make it a worthwhile feature to consider.

### What is an Identity Hook?
An identity hook is a “created assertion” (i.e. an assertion that does not require human input and that a claim generator can attest to directly) that acts as a pathway for creators to add and append identity information to a credential, either immediately or in the future.

It binds an opaque, one-off key pair to a C2PA credential whose only purpose is to act as a hook for verified identities. The private key should be kept in strict confidence for the creator, and can be custodied by the claim generator, the identity claims aggregator, or the (expert) user themselves.

Because the key pair itself contains no identifiable or human-inputted information, the claim generator can create this hook itself, even if the creator currently has no intent of binding identity info to it. (This assumes, of course, that a good claim generator doesn’t intend to disseminate the private key or bind false identities to the content.)
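To make the "opaque" part concrete, here is a minimal sketch of how a hook identifier could be derived from the public half of the key pair, assuming the hook is expressed as a _did:key_ over an Ed25519 key (as proposed in the technical details below). The function names are hypothetical, and the sketch deliberately does not generate a real key pair: that step would come from whatever cryptographic library the claim generator already uses. The encoding itself (multicodec prefix `0xed 0x01`, then base58btc with a `z` multibase prefix) follows the did:key method.

```python
# Sketch only: derive a did:key string from a raw 32-byte Ed25519 public key.
# Key generation is out of scope here and must come from a real crypto library.

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_PREFIX = bytes([0xED, 0x01])  # multicodec code for an Ed25519 public key


def base58btc_encode(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    n = int.from_bytes(data, "big")
    digits = ""
    while n > 0:
        n, rem = divmod(n, 58)
        digits = B58_ALPHABET[rem] + digits
    # Each leading zero byte is represented by a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + digits


def did_key_from_ed25519(public_key: bytes) -> str:
    """Hypothetical helper: build the did:key for a raw Ed25519 public key."""
    if len(public_key) != 32:
        raise ValueError("an Ed25519 public key is exactly 32 bytes")
    return "did:key:z" + base58btc_encode(ED25519_PREFIX + public_key)


# Placeholder bytes standing in for a real public key (NOT a usable key).
placeholder_public_key = bytes(range(32))
print(did_key_from_ed25519(placeholder_public_key))
```

The resulting string carries nothing beyond the public key itself, which is why a claim generator could mint one per piece of content without asking the creator for anything.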

### Why is this so important?
There are a bunch of reasons:
1. I have many identities. I have a taxpayer ID, a Twitter profile, a pseudonymous Reddit account, a crypto wallet, a work email, and more. The current spec allows this, but with some big caveats we can fix (see the rest of this list).
2. In many cases, I don’t want my identities associated with each other. If my identity claims aggregator signs one piece of content with one identity, and another with another, then the common ID the identity claims aggregator stamps as my username is a dead giveaway that all my identities are associated with each other. My privacy is gone. With an identity hook, those identities only get tied to the one-off key pair, keeping more of my info private.
3. Once the credential is published, if my identity isn’t on it, there’s no going back. Sure, I could wrap a new manifest around the old one, but that’s far less secure or convincing than if I was on the initial credential. I don’t want to lose out on that benefit just because I don’t have an identity claims aggregator accessible & pre-configured at the time I created my content. With an identity hook, the claim generator preserves my right to identify myself into the future without forcing the identity claim upfront.
4. When I’m in the process of creating content, I often have no idea if or how I want to publish it. I don’t want to preemptively bind an identity to the content when the credential is first being created. A hook gives me optionality into the future for which identity I want to put on the content, if I want one at all.
5. Privacy is one of the biggest reasons that identity is not currently part of the core C2PA standard, but identity hooks change that. Sometimes, it might be necessary to prove my content binding to somebody, even if I don’t want to publish that in a credential. With a hook, I can always do this by signing a private message, and because the key pairs are one-offs, that person doesn’t learn anything about my identity beyond my affiliation with that single piece of content.
6. Especially in rights & attribution use cases, knowing _who didn’t_ create a piece of content is just as important as knowing _who did_. If the initial credential doesn’t come with an identity assertion, there’s no credential down the line that can reestablish the identity binding. So the absence of an authentication pathway limits my ability to say when others are taking advantage of my content without attribution. An identity hook solves this, because it means that an unauthorized user is _choosing_ not to prove their creatorship, not that they _can’t_ prove their creatorship - i.e. it has the potential to shift the burden of proof.
7. Public identity is not a prerequisite for the distribution of rights, but consistency of identity is. Identity hooks give me a privacy-preserving pathway to issue downstream licenses to my content without needing to disclose my personal information. Even though this doesn't fall within the scope of CAWG today, it's a very helpful piece of functionality for others to be able to bind to later on.

### The technical details
I haven’t put these in normative form or hammered out the finer points, but I’ve tried to include enough detail below to make the idea clear:
- An identity hook is a _did:key_ derived from a public-private key pair generated uniquely for that assertion. It’s an optional field that lives in the identity assertion alongside the _cawg.role_.
- The id field of the _credentialSubject_ must, if present, either:
    - Match the identity hook of the current assertion. This would be useful for verified identities to be established in the same assertion as the identity hook, and provide a stronger signal that the creator consented to having their identity shared.
    - Match the identity hook of another referenced _cawg.identity_ assertion from a previous manifest. This would be useful for appending identities to previously created assertions. 
- This procedure is most useful for assertions made by the “creator” (i.e. the person who is operating the software/hardware used to create the content), because the device can reliably place the keys in the custody of that person. There might be cases where we want to make an identity assertion about someone who is not the “creator”, like the director of a film or the subject of a photo. In these cases, the binding of identity is naturally weaker because the subject is already one layer removed from the claim generator. So these claims should continue to be allowed, but have the id field omitted in the _credentialSubject_.
    - It’s unclear whether CAWG should specify which _cawg.role_ values are permitted to include identity hooks, or whether it’s fair to allow software implementers to decide this.
- The id field of the _credentialSubject_ is currently not described in the CAWG docs (aside from in one non-normative example), so this change would simply restrict the use of this field.
- For clarity, the concept of appending identities to a previous manifest DOES NOT imply that we modify a manifest - this never happens in C2PA. Instead, we would wrap the manifest in a new manifest that references the previous manifest’s identity hook, and by signing with the identity hook’s private key, we’d accept that the newly-reported identities are associated with the previous creator and tied to the original credential.
- The verifiable credential is signed by the identity hook’s private key (if included) and countersigned by the issuer (i.e. the identity claims aggregator).
- In some cases, multiple identity hooks will be issued at different stages of the editing process (e.g. photo capture and subsequent editing). We can express this consistency by listing an identity hook as a verified identity of another identity hook. In most cases, we’d want the earliest-timestamped hook to serve as the primary identifier, so we’d want the later hook to be expressed as a verified identity for the earlier hook.
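
The matching rule for the id field of the _credentialSubject_ can be sketched as a small check. This is only an illustration of the rule as described above, not normative text: the `IdentityAssertion` class, its field names, and the `did:key:zHook...` values are hypothetical placeholders, and the check compares identifiers only. It does not verify the hook’s signature or the issuer’s countersignature, which a real validator would also need to do.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class IdentityAssertion:
    """Hypothetical, simplified view of a cawg.identity assertion."""
    hook: Optional[str]        # did:key of this assertion's identity hook, if any
    subject_id: Optional[str]  # credentialSubject.id, if present


def subject_id_is_acceptable(
    current: IdentityAssertion,
    referenced_earlier: Sequence[IdentityAssertion],
) -> bool:
    """Apply the proposed rule for credentialSubject.id (identifier match only)."""
    if current.subject_id is None:
        # Omitted id: allowed, e.g. for subjects who are not the "creator".
        return True
    if current.hook is not None and current.subject_id == current.hook:
        # Case 1: identity established in the same assertion as the hook.
        return True
    # Case 2: identity appended to a hook from a referenced earlier manifest.
    return any(
        prev.hook is not None and prev.subject_id is not None and False
        or (prev.hook is not None and current.subject_id == prev.hook)
        for prev in referenced_earlier
    )


# Placeholder identifiers, not real did:key values.
capture = IdentityAssertion(hook="did:key:zHookA", subject_id=None)
appended = IdentityAssertion(hook=None, subject_id="did:key:zHookA")
unrelated = IdentityAssertion(hook=None, subject_id="did:key:zHookB")

print(subject_id_is_acceptable(appended, [capture]))   # matches the earlier hook
print(subject_id_is_acceptable(unrelated, [capture]))  # matches no known hook
```

Keeping the omitted-id case permissive mirrors the point above that assertions about non-creators should remain allowed.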


### FAQ
There are some nuances and intended limitations to identity hooks, and I’ve tried to describe them here.

Q: Does this add a bunch of unnecessary complexity to the workflow? Creators don’t want to learn public-private key encryption.
A: This doesn’t need to introduce any additional complexity into the process: the identity claims aggregator could custody all the information. But it leaves the door open to much richer workflows as well, such as passing credentials between identity claims aggregators, destroying the ability to append identities by deleting a private key, or letting aggregators perform private handshakes to establish identity without a credential.

Q: Is an identity hook a created or gathered assertion, and where should it find its home in the C2PA stack?
A: The reservations I’ve heard so far for including identity assertions in the core C2PA standard, as a part of every credential, stem from either the concern that forcing the inclusion of identity credentials could create a privacy concern, or that identity credentials are “gathered assertions” that require human input. 
Neither of these concerns holds for identity hooks. Any software system can unilaterally create key pairs (which, even in the worst case where everyone ignores the identity hooks, could just pile up on the device to be used later), making the hook a “created assertion”. The opacity and uniqueness of the key pair mean an identity hook contains no identifiable information whatsoever. Since identity cannot be recovered if it is not placed in the initial credential, identity hooks provide many potential benefits with no significant risks, and I believe that identity hooks could be moved into the core C2PA standard and become a part of every credential.
This level of adoption may require the hook to be separated from the gathered assertion that binds verified identities, but to accomplish that, one could just move the hook and role to another assertion type and have it referenced by the identity verification assertion.

Q: If I don’t want my many identities associated with one another, but still want to use the same piece of content between many of those identities, how can I prevent someone from knowing that identities 1 & 2 are associated with each other when I bind them to the same identity hook?
A: A few options:
- I can secretly prove my control over the identity hook to whichever platform is requiring the authentication, without publishing a credential, because I control the key pair behind that content. Because this communication is private, nobody will be able to make the association.
- This might not be an identity problem at all: I can maintain a level of plausible deniability by issuing whichever rights are needed as licenses to the content, so that my two related identities look like common licensees as opposed to a consistent identity, accomplishing whichever goal I had. This falls outside the scope of CAWG as a rights & attribution problem, so it’s okay not to address this directly.

Q: Identity hooks allow someone to effectively append identity information to an existing credential. How large an attack surface does that create?
A: We can break this into a few cases:
- Any identity info, or perhaps none, that I want to assert is done upfront (the existing model). The presence of identity hooks doesn’t change this ability. You can always delete the private key.
- I want to add identity info once, and never again. As soon as I know that I no longer want to append identity info in the future, I can delete the private key.
- I want to leave the door open to appending more identity info, but only of certain types (e.g. social media profiles). This isn’t possible under the current scheme, but it also doesn’t sound much like an identity use case anymore. We’re getting into the territory where I must not fully trust the custodian of my private key, so this is more likely an issue that should be addressed by multi-sig credentials, licensing, or some other mechanism.

Q: Is there ever an indication on the credential when a hook is no longer active?
A: We never explicitly label on the credential when we’re closing the door to additional identities. It might be reasonable to add a field like this, but it’s more of a confirmation to the credential reader than it is useful to the credential holder, and I’m skeptical whether this piece of information could be useful or reliable (since you can’t force people not to delete their private keys without telling you).

WIP: Adding Identity Hooks for Enhanced Privacy and Consistency #216
