Skip to content

Assistant hacking - Awareness of potential hacking of the future #14

@henyckma

Description

@henyckma

Applications can hide natural language prompts from the user to hack the assistant. A literal example is the following: (not hiding it for demonstration purposes)

Screenshot 2023-12-01 143812

Other prompt techniques:

Screenshot 2023-12-01 145532

It selects all text and deletes the "hacking" prompt.

Screenshot 2023-12-01 150047

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions