This is the specification for ZkVM, the zero-knowledge transaction virtual machine.
ZkVM defines a procedural representation for blockchain transactions and the rules for a virtual machine to interpret them and ensure their validity.
- Overview
- Types
- Definitions
- LE32
- LE64
- Scalar
- Point
- Base points
- Pedersen commitment
- Verification key
- Time bounds
- Contract ID
- Anchor
- Transcript
- Predicate
- Program predicate
- Contract payload
- Output structure
- UTXO
- Constraint system
- Constraint system proof
- Transaction
- Transaction log
- Transaction ID
- Merkle binary tree
- Transaction signature
- Unblinding proof
- Taproot
- Call proof
- Blinded program
- VM operation
- Instructions
- Transaction Encoding
- Examples
- Discussion
- Relation to TxVM
- Compatibility
- Static arguments
- Should cloak and borrow take variables and not commitments?
- Why there is no
and
combinator in the predicate tree? - Why we need Wide value and
borrow
? - How to perform an inequality constraint?
- How to perform a logical
not
? - What ensures transaction uniqueness?
- Open questions
TxVM introduced a novel representation for the blockchain transactions:
- Each transaction is an executable program that produces effects to the blockchain state.
- Values as first-class types subject to linear logic.
- Contracts are first-class types that implement object-capability model.
The resulting design enables scalable blockchain state machine (since the state is very small, and its updates are separated from transaction verification), expressive yet safe smart contracts via the sound abstractions provided by the VM, simpler validation rules and simpler transaction format.
TxVM, however, did not focus on privacy and in several places traded off simplicity for unlimited flexibility.
ZkVM is the entirely new design that inherits most important insights from the TxVM, makes the security and privacy its primary focus, and provides a more constrained customization framework, while making the expression of the most common contracts even more straightforward.
A transaction is represented by a transaction object that contains a program that runs in the context of a stack-based virtual machine.
When the virtual machine executes a program, it creates and manipulates data of various types: copyable types and linear types, such as values and contracts.
A value is a specific quantity of a certain flavor that can be merged or split, issued or retired, but not otherwise created or destroyed.
A contract encapsulates a list of data and value items
protected by a predicate (a public key or a program) which must be satisfied
during the VM execution. The contract can be stored in and loaded from the global state
using output
and input
instructions.
Custom logic is represented via programmable constraints applied to variables and expressions (linear combinations of variables). Variables represent quantities and flavors of values, time bounds and user-defined secret parameters. All constraints are arranged in a single constraint system which is proven to be satisfied after the VM has finished execution.
Some ZkVM instructions write proposed updates to the blockchain state to the transaction log, which represents the principal result of executing a transaction.
Hashing the transaction log gives the unique transaction ID.
A ZkVM transaction is valid if and only if it runs to completion without encountering failure conditions and without leaving any data on the stack.
After a ZkVM program runs, the proposed state changes in the transaction log are compared with the global state to determine the transaction’s applicability to the blockchain.
The items on the ZkVM stack are typed. The available types fall into two categories: copyable types and linear types. All copyable types, and some linear types are droppable.
Copyable types can be freely copied with dup
.
Note: the program is not copyable to avoid denial-of-service attacks via repeated execution of the same program that can be scaled exponentially while growing the transaction size only linearly. For the same reason, expressions and constraints are not copyable either.
Droppable types can be destroyed with drop
:
Linear types are subject to special rules as to when and how they may be created and destroyed, and may never be copied.
Only the string, program and value types can be ported across transactions via outputs.
Notes:
- Wide values are not portable because they are not proven to be non-negative.
- Contracts are not portable because they must be satisfied within the current transaction or output their contents themselves.
- Variables, expressions and constraints have no meaning outside the VM state and its constraint system, and therefore cannot be meaningfully ported between transactions.
A string is a variable-length byte array used to represent commitments, scalars, signatures, and proofs.
A string cannot be larger than the entire transaction program and cannot be longer than 2^32-1
bytes (see LE32).
A program type is a variable-length byte array representing a sequence of ZkVM instructions.
Program cannot be larger than the entire transaction program and cannot be longer than 2^32-1
bytes (see LE32).
A contract consists of a predicate and a payload. The payload is guarded by the predicate.
Contracts are created with the contract
instruction and
destroyed by evaluating the predicate, leaving their payload on the stack.
Contracts can be "frozen" with the output
instruction that places the predicate
and the payload into the output structure which is
recorded in the transaction log.
Each contract contains a hidden field anchor that makes it globally unique for safe signing.
Variable represents a secret scalar value in the constraint system bound to its Pedersen commitment.
A point that represents a commitment to a secret scalar can be turned into a variable using the commit
instruction.
A cleartext scalar can be turned into a single-term expression using the scalar
instruction (which does not allocate a variable). Since we do not need to hide their values, a Variable is not needed to represent the cleartext constant.
Variables can be copied and dropped at will, but cannot be ported across transactions via outputs.
Value quantities and flavors are represented as variables.
Constraint system also contains low-level variables that are not individually bound to Pedersen commitments:
when these are exposed to the VM (for instance, from mul
), they have the expression type.
Expression is a linear combination of constraint system variables with cleartext scalar weights.
expr = { (weight0, var0), (weight1, var1), ... }
A variable can be converted to an expression using expr
:
the result is a linear combination with one term with weight 1:
expr = { (1, var) }
Expressions can be added and multiplied, producing new expressions.
An expression that contains one term with the scalar weight assigned to the R1CS 1
is considered
a constant expression:
const_expr = { (weight, 1) }
Instructions add
and mul
preserve constant expressions as an optimization in order to avoid
allocating unnecessary multipliers in the constraint system.
Constraint is a statement within the constraint system. Constraints are formed using expressions
and can be combined using logical operators and
and or
.
There are three kinds of constraints:
- Linear constraint is created using the
eq
instruction over two expressions. - Conjunction constraint is created using the
and
instruction over two constraints of any type. - Disjunction constraint is created using the
or
instruction over two constraints of any type. - Inversion constraint is created using the
not
instruction over a constraint of any type. - Cleartext constraint is created as a result of guaranteed optimization of the above instructions when executed with constant expressions. Cleartext constraint contains a cleartext boolean
true
orfalse
.
Constraints only have an effect if added to the constraint system using the verify
instruction.
A value is a linear type representing a pair of quantity and flavor (see quantity and flavor in the Cloak specification).
Both quantity and flavor are represented as variables.
Quantity is guaranteed to be in a 64-bit range ([0..2^64-1]
).
Values are created with issue
and destroyed with retire
.
Values can be merged and split together with other values using a cloak
instruction.
Only values having the same flavor can be merged.
Values are secured by “locking them up” inside contracts.
Contracts can also require payments by creating outputs using borrowed values.
borrow
instruction produces two items: a non-negative value and a negated wide value,
which must be cleared using appropriate combination of non-negative values.
Each non-negative value keeps the Pedersen commitments
for the quantity and flavor (in addition to the respective variables),
so that they can serialized in the output
.
Wide value is an extension of the value type where
quantity is guaranteed to be in a wider, 65-bit range [-(2^64-1) .. 2^64-1]
.
The subtype Value is most commonly used because it guarantees the non-negative quantity
(for instance, output
instruction only permits positive values),
and the wide value is only used as an output of borrow
and as an input to cloak
.
A non-negative 32-bit integer encoded using little-endian convention. Used to encode lengths of strings, sizes of contract payloads and stack indices.
A non-negative 64-bit integer encoded using little-endian convention. Used to encode value quantities and timestamps.
A scalar is an integer modulo Ristretto group order |G| = 2^252 + 27742317777372353535851937790883648493
.
Scalars are encoded as 32-byte strings using little-endian convention.
Every scalar in the VM is guaranteed to be in a canonical (reduced) form: an instruction that operates on a scalar checks if the scalar is canonical.
A point is an element in the Ristretto group.
Points are encoded as 32-byte strings in compressed Ristretto form.
Each point in the VM is guaranteed to be a valid Ristretto point.
ZkVM defines two base points: primary B
and secondary B2
.
B = e2f2ae0a6abc4e71a884a961c500515f58e30b6aa582dd8db6a65945e08d2d76
B2 = hash-to-ristretto255(SHA3-512(B))
Both base points are orthogonal (the discrete log between them is unknown) and used in Pedersen commitments, verification keys and predicates.
Pedersen commitment to a secret scalar is defined as a point with the following structure:
P = Com(v, f) = v·B + f·B2
where:
P
is a point representing commitment,v
is a secret scalar value being committed to,f
is a secret blinding factor (scalar),B
andB2
are base points.
Pedersen commitments can be used to allocate new variables using the commit
instruction.
Pedersen commitments can be opened using the unblind
instruction.
A verification key P
is a commitment to a secret scalar x
(signing key)
using the primary base point B
: P = x·B
.
Verification keys are used to construct predicates and verify signatures.
Each transaction is explicitly bound to a range of minimum and maximum time.
Each bound is in milliseconds since the Unix epoch: 00:00:00 on 1 Jan 1970 (UTC),
represented by an unsigned 64-bit integer.
Time bounds are available in the transaction as expressions provided by the instructions
mintime
and maxtime
.
Contract ID is a 32-byte unique identifier of the contract and a commitment to its contents (see Output):
T = Transcript("ZkVM.contractid")
T.append("anchor", anchor)
T.append("predicate", predicate encoded as ristretto255)
T.append_u64le("k", number of items)
for each payload item {
// for String:
T.append("type", 0x00)
T.append("n", length)
T.append("string", bytes)
// for Program:
T.append("type", 0x01)
T.append("n", length)
T.append("program", bytes)
// for Value:
T.append("type", 0x02)
T.append("qty", qty encoded as ristretto255)
T.append("flv", qty encoded as ristretto255)
}
id = T.challenge_bytes("id")
Contract ID makes signatures safe against reused verification keys in predicates: a signature covers the unique contract ID, therefore preventing its replay against another contract, even if containing the same payload and predicate.
Uniqueness is provided via the anchor.
Anchor is a 32-byte string that provides uniqueness to the contract ID. Anchors are generated from unique contract IDs used earlier in the same transaction. These are tracked by the VM via last anchor:
- Claimed UTXO (
input
) sets the VM’s last anchor to its ratcheted contract ID (seeinput
). - Newly created contracts and outputs (
contract
,output
) consume the VM’s last anchor and replace it with its contract ID.
VM fails if:
- an
issue
,output
orcontract
is invoked before the anchor is set, - by the end of the execution, no anchor was used (which means that transaction ID is not unique).
Note 1: chaining the anchors this way gives flexibility to the signer: if contract A creates contract B, the B's ID can be computed solely from the contents of A, without the knowledge of the entire transaction.
Note 2: input
ratchets the contract ID because this contract ID was already available as an anchor in the previous transaction (where the output was created).
Transcript is an instance of the Merlin construction, which is itself based on STROBE and Keccak-f with 128-bit security parameter.
Transcript is used throughout ZkVM to generate challenge scalars and commitments.
Transcripts have the following operations, each taking a label for domain separation:
- Initialize transcript:
T := Transcript(label)
- Append bytes of arbitrary length prefixed with a label:
T.append(label, bytes)
- Challenge bytes
T.challenge_bytes<size>(label) -> bytes
- Challenge scalar is defined as generating 64 challenge bytes and reducing the 512-bit little-endian integer modulo Ristretto group order
|G|
:T.challenge_scalar(label) -> scalar T.challenge_scalar(label) == T.challenge_bytes<64>(label) mod |G|
Labeled instances of the transcript can be precomputed to reduce number of Keccak-f permutations to just one per challenge.
A predicate is a representation of a condition that unlocks the contract. It is encoded as a point.
Program predicate is a commitment to a program made using
commitment scalar h
on a secondary base point B2
:
PP(prog) = h(prog)·B2
Commitment scheme is defined using the transcript protocol by committing the program string and squeezing a scalar that is bound to it:
T = Transcript("ZkVM.predicate")
T.append("prog", prog)
h = T.challenge_scalar("h")
PP(prog) = h·B2
Program predicate can be satisfied only via the call
instruction that takes a cleartext program string, verifies the commitment and evaluates the program. Use of the secondary base point B2
prevents using the predicate as a verification key and signing with h
without executing the program.
The contract payload is a list of portable items stored in the contract or output.
Output is a serialized contract:
Output = Anchor || Predicate || LE32(k) || Item[0] || ... || Item[k-1]
Anchor = <32 bytes>
Predicate = <32 bytes>
Item = enum { String, Program, Value }
String = 0x00 || LE32(len) || <bytes>
Program = 0x01 || LE32(len) || <bytes>
Value = 0x02 || <32 bytes> || <32 bytes>
UTXO stands for Unspent Transaction Output. UTXO is uniquely identified by the ID of the contract represented by the output.
The constraint system is the part of the VM state that implements Bulletproof's rank-1 constraint system.
It also keeps track of the variables and constraints, and is used to verify the constraint system proof.
A proof of satisfiability of a constraint system built during the VM execution.
The proof is provided to the VM at the beginning of execution and verified when the VM is finished.
Transaction is a structure that contains all data and logic required to produce a unique transaction ID:
- Version (uint64)
- Time bounds (pair of LE64s)
- Program
- Transaction signature (64 bytes)
- Constraint system proof (variable-length array of points and scalars)
The transaction log contains entries that describe the effects of various instructions.
The transaction log is empty at the beginning of a ZkVM program. It is append-only. Items are added to it upon execution of any of the following instructions:
See the specification of each instruction for the details of which data is stored.
Note: transaction log items are only serialized when committed to a transcript during the transaction ID computation.
Transaction ID is defined as a merkle hash of a list consisting of a header entry followed by all the entries from the transaction log:
T = Transcript("ZkVM.txid")
txid = MerkleHash(T, {header} || txlog )
Entries are committed to the transcript using the following schema.
Header commits the transaction version and time bounds using the LE64 encoding.
T.append("tx.version", LE64(version))
T.append("tx.mintime", LE64(mintime))
T.append("tx.maxtime", LE64(maxtime))
Input entry is added using input
instruction.
T.append("input", contract_id)
where contract_id
is the contract ID of the claimed UTXO.
Output entry is added using output
instruction.
T.append("output", contract_id)
where contract_id
is the contract ID of the newly created UTXO.
Issue entry is added using issue
instruction.
T.append("issue.q", qty_commitment)
T.append("issue.f", flavor_commitment)
Retire entry is added using retire
instruction.
T.append("retire.q", qty_commitment)
T.append("retire.f", flavor_commitment)
Fee entry is added using fee
instruction.
T.append("fee", LE64(fee))
Data entry is added using log
instruction.
T.append("data", data)
The construction of a merkle binary tree is based on the RFC 6962 Section 2.1 with hash function replaced with a transcript.
Leafs and nodes in the tree use the same instance of a transcript provided by the upstream protocol:
T = Transcript(<label>)
The hash of an empty list is a 32-byte challenge string with the label merkle.empty
:
MerkleHash(T, {}) = T.challenge_bytes("merkle.empty")
The hash of a list with one entry (also known as a leaf hash) is
computed by committing the entry to the transcript (defined by the item type),
and then generating 32-byte challenge string the label merkle.leaf
:
MerkleHash(T, {item}) = {
T.append(<field1 name>, item.field1)
T.append(<field2 name>, item.field2)
...
T.challenge_bytes("merkle.leaf")
}
For n > 1, let k be the largest power of two smaller than n (i.e., k < n ≤ 2k). The merkle hash of an n-element list is then defined recursively as:
MerkleHash(T, list) = {
T.append("L", MerkleHash(list[0..k]))
T.append("R", MerkleHash(list[k..n]))
T.challenge_bytes("merkle.node")
}
Note that we do not require the length of the input list to be a power of two. The resulting merkle binary tree may thus not be balanced; however, its shape is uniquely determined by the number of leaves.
Instruction signtx
unlocks a contract if its predicate
correctly signs the transaction ID. The contract‘s predicate
is added to the array of deferred verification keys (alongside with associated contract ID) that
are later used in a multi-message signature protocol to verify a multi-signature over a transaction ID and the associated contract IDs.
- Instantiate the transcript
T
for transaction signature:T = Transcript("ZkVM.signtx")
- Commit the transaction ID:
T.append("txid", txid)
- Perform the multi-message signature protocol using the transcript
T
and the pairs of verification keys and contract IDs as submessages. - Add the verifier's statement to the list of deferred point operations.
Unblinding proof shows the committed value v
and proves
that the blinding factor in the Pedersen commitment is zero:
V == v·B + 0·B2
- Prover shows
v
. - Verifier checks equality
V == v·B
.
Taproot provides efficient and privacy-preserving storage of smart contracts. It is based on a proposal by Gregory Maxwell. Multi-party blockchain contracts typically have a top-level "success clause" (all parties agree and sign) or "alternative clauses" to let a party exit the contract based on pre-determined constraints. This has three significant features.
- The alternative clauses
{C_i}
, each of which is a program, and their corresponding blinding factors are stored as blinded programs and compressed in a Merkle tree with rootM
. - A signing key
X
and the Merkle rootM
(from 1) are committed to a single signing keyP
using a hash functionh1
, such thatP = X + h1(X, M)
. This makes signing forP
possible if parties want to sign forX
and avoids revealing the alternative clauses. - Calling a program will check a call proof to verify the program's inclusion in the Taproot tree before executing the program.
The commitment scheme is defined using the transcript protocol by committing (1) X
, the signing key, as a compressed 32-byte point, and (2) M
, the root of the Merkle tree constructed from the n
programs and (3) squeezing a scalar bound to all n
programs.
// given Merkle root M and signing key X
t = Transcript("ZkVM.taproot")
t.append_message(b"key", X); // Compressed Ristretto
t.append_message(b"merkle", M); // [u8; 32]
t.challenge_scalar(b"h")
Typically, to select a program to run from a leaf of the Merkle tree, we must navigate branches from the Merkle root. Here, the Merkle root is constructed from a provided Merkle path. It is then used to verify a relation between the contract predicate and the provided program. A branch is chosen using the call
instruction, which takes a program, call proof, and contract. The predicate P
is read from the contract. The call proof contains the signing key X
, a list of left-right neighbors, and the Merkle path. The neighbor list, Merkle path, and the provided program are used to reconstruct the Merkle root M
. If the relation P = X + h1(X, M)
is verified, then the program will be executed.
This struct is an argument to the call
instruction. It contains the information to verify that a program is represented in the Taproot Merkle tree, in the following fields:
X
, the signing keyneighbors
, an array of left-right neighbors in the Merkle tree. Each is a hash of its children, stored as a 32-byte string. It is the neighbor of an item on the Merkle path to the leaf hash of the program string. The leaf hash and root hash are not included. Neighbor hashes are ordered from bottom-most to top-most. As an example, consider a Merkle tree with more than three levels. The first element is the neighbor of the leaf hash; the second element is the neighbor of the leaf hash's parent; and the last is a child of the Merkle root.positions
, a pattern indicating each neighbor's position and their count. 0 represents a left neighbor, and 1 represents a right neighbor. This is a 32-bit little-endian integer, where the lower bits indicate the position of the lower-level neighbors. These bits are ordered identically toneighbors
, because there is a one-to-one correspondence between their elements. Additionally, the extra 1 bit is added to indicate the number of neighbors. Hence, the...00000001
indicates no neighbors,...0000001x
indicates one neighbour with position indicated byx
etc. The all-zero positions string is invalid.
Here is a visual example using the tree above. Suppose we are verifying the program with the leaf hash C
, with a right neighbor D
. The parent of C
is J = h(C || D)
, with a left neighbor I
. The parent of J
is M = h(I || J)
, with a right neighbor N
. M
's parent is the Merkle root R
.
This gives the neighbor list [D
, I
, N
] and bit pattern 101
. The Merkle path is red, while the neighbors are blue.
The blinded program is used to construct the Taproot tree. It is an enum that can either be a program or its accompanying blinding factor. A leaf element in the Taproot tree is a hash of a blinded program. This lets us store the hash of the blinding factor in the tree: each odd leaf will be the hash of a program, and each even leaf a hash of the program's blinding factor.
We commit them to the transcript as follows:
// given program prog and key blinding_key
t = Transcript("ZkVM.taproot-blinding")
t.append("program", prog)
t.append("key", blinding_key)
t.challenge_scalar(b"blinding")
Note: this provides domain separation from the ordinary Merkle leaf hash, making it impossible to open a blinding branch into a program. This ensures that in a multi-party contract, an untrusted party cannot hide a malicious clause in a blinding branch; other parties will check that the node's hash is computed via a separate function.
The ZkVM state consists of the static attributes and the state machine attributes.
- Transaction:
version
mintime
andmaxtime
program
tx_signature
cs_proof
- Extension flag (boolean)
- Last anchor or ∅ if unset
- Data stack (array of items)
- Program stack (array of programs with their offsets)
- Current program with its offset
- Transaction log (array of logged items)
- Transaction signature verification keys (array of points)
- Deferred point operations
- Constraint system
The VM is initialized with the following state:
- Transaction as provided by the user.
- Extension flag set to
true
orfalse
according to the transaction versioning rules for the transaction version. - Last anchor is set to ∅.
- Data stack is empty.
- Program stack is empty.
- Current program set to the transaction program; with zero offset.
- Transaction log is empty.
- Array of signature verification keys is empty.
- Array of deferred point operations is empty.
- Constraint system: empty (time bounds are constants that appear only within linear combinations of actual variables), with transcript initialized with label
ZkVM.r1cs
:r1cs_transcript = Transcript("ZkVM.r1cs")
Then, the VM executes the current program till completion:
- Each instruction is read at the current program offset, including its immediate data (if any).
- Program offset is advanced immediately after reading the instruction to the next instruction.
- The instruction is executed per specification below. If the instruction fails, VM exits early with an error result.
- If VM encounters
eval
,call
,signid
orsigntag
instruction, the new program with offset zero is set as the current program. The next iteration of the vm will start from the beginning of the new program. - If the offset is less than the current program’s length, a new instruction is read (go back to step 1).
- Otherwise (reached the end of the current program):
- If the program stack is not empty, pop top item from the program stack and set it to the current program. Go to step 5.
- If the program stack is empty, the transaction is considered finalized and VM successfully finishes execution.
If the execution finishes successfully, VM performs the finishing tasks:
- Checks if the stack is empty; fails otherwise.
- Checks if the last anchor is set; fails otherwise.
- Computes transaction ID.
- Commits the transaction ID into the R1CS proof transcript before committing low-level variables:
r1cs_transcript.append("ZkVM.txid", txid)
- Computes a verification statement for transaction signature.
- Computes a verification statement for constraint system proof.
- Executes all deferred point operations, including aggregated transaction signature and constraint system proof, using a single multi-scalar multiplication. Fails if the result is not an identity point.
If none of the above checks failed, the resulting transaction log is applied to the blockchain state as described in the blockchain specification (TBD).
VM defers operations on points till the end of the transaction in order to batch them with the verification of transaction signature and constraint system proof.
Each deferred operation at index i
represents a statement:
0 == sum{s[i,j]·P[i,j], for all j} + a[i]·B + b[i]·B2
where:
{s[i,j],P[i,j]}
is an array of (scalar,point) tuples,a[i]
is a scalar weight of a primary base pointB
,b[i]
is a scalar weight of a secondary base pointB2
.
All such statements are combined using the following method:
- For each statement, a random scalar
x[i]
is sampled. - Each weight
s[i,j]
is multiplied byx[i]
for all weights per statementi
:z[i,j] = x[i]·s[i,j]
- All weights
a[i]
andb[i]
are independently added up withx[i]
factors:a = sum{a[i]·x[i]} b = sum{b[i]·x[i]}
- A single multi-scalar multiplication is performed to verify the combined statement:
0 == sum{z[i,j]·P[i,j], for all i,j} + a·B + b·B2
- Each transaction has a version number. Each block (reference TBD) also has a version number.
- Block version numbers must be monotonically non-decreasing: each block must have a version number equal to or greater than the version of the block before it.
- The current block version is 1. The current transaction version is 1.
Extensions:
- If the block version is equal to the current block version, no transaction in the block may have a version higher than the current transaction version.
- If a transaction’s version is higher than the current transaction
version, the ZkVM
extension
flag is set totrue
. Otherwise, theextension
flag is set tofalse
.
Each instruction is represented by a one-byte opcode optionally followed by immediate data.
Immediate data is denoted by a colon :
after the instruction name.
Each instruction defines the format for immediate data. See the reference below for detailed specification.
Code | Instruction | Stack diagram | Effects |
---|---|---|---|
Stack | |||
0x00 | push:n:x |
ø → data | |
0x01 | program:n:x |
ø → program | |
0x02 | drop |
x → ø | |
0x03 | dup:k |
x[k] … x[0] → x[k] ... x[0] x[k] | |
0x04 | roll:k |
x[k] … x[0] → x[k-1] ... x[0] x[k] | |
Constraints | |||
0x05 | scalar |
scalar → expr | |
0x06 | commit |
point → var | Adds an external variable to CS |
0x07 | alloc |
ø → expr | Allocates a low-level variable in CS |
0x08 | mintime |
ø → expr | |
0x09 | maxtime |
ø → expr | |
0x0a | expr |
var → expr | Allocates a variable in CS |
0x0b | neg |
expr1 → expr2 | |
0x0c | add |
expr1 expr2 → expr3 | |
0x0d | mul |
expr1 expr2 → expr3 | Potentially adds multiplier in CS |
0x0e | eq |
expr1 expr2 → constraint | |
0x0f | range |
expr → expr | Modifies CS |
0x10 | and |
constr1 constr2 → constr3 | |
0x11 | or |
constr1 constr2 → constr3 | |
0x12 | not |
constr1 → constr2 | Modifies CS |
0x13 | verify |
constraint → ø | Modifies CS |
0x14 | unblind |
V v → V | Defers point ops |
Values | |||
0x15 | issue |
qty flv data pred → contract | Modifies CS, tx log, defers point ops |
0x16 | borrow |
qty flv → –V +V | Modifies CS |
0x17 | retire |
value → ø | Modifies CS, tx log |
0x18 | cloak:m:n |
widevalues commitments → values | Modifies CS |
0x19 | fee |
qty → widevalue | Modifies CS, tx log |
Contracts | |||
0x1a | input |
prevoutput → contract | Modifies tx log |
0x1b | output:k |
items... pred → ø | Modifies tx log |
0x1c | contract:k |
items... pred → contract | |
0x1d | log |
data → ø | Modifies tx log |
0x1e | eval |
prog → results... | |
0x1f | call |
contract(P) proof prog → results... | Defers point operations |
0x20 | signtx |
contract → results... | Modifies deferred verification keys |
0x21 | signid |
contract prog sig → results... | Defers point operations |
0x22 | signtag |
contract prog sig → results... | Defers point operations |
— | ext |
ø → ø | Fails if extension flag is not set. |
push:n:x → data
Pushes a string x
containing n
bytes.
Immediate data n
is encoded as LE32
followed by x
encoded as a sequence of n
bytes.
program:n:x → program
Pushes a program x
containing n
bytes.
Immediate data n
is encoded as LE32
followed by x
, as a sequence of n
bytes.
x drop → ø
Drops x
from the stack.
Fails if x
is not a droppable type.
x[k] … x[0] dup:k → x[k] ... x[0] x[k]
Copies k’th item from the top of the stack.
Immediate data k
is encoded as LE32.
Fails if x[k]
is not a copyable type.
x[k] x[k-1] ... x[0] roll:k → x[k-1] ... x[0] x[k]
Looks past k
items from the top, and moves the next item to the top of the stack.
Immediate data k
is encoded as LE32.
Note: roll:0
is a no-op, roll:1
swaps the top two items.
a scalar → expr
- Pops a scalar
a
from the stack. - Creates an expression
expr
with weighta
assigned to an R1CS constant1
. - Pushes
expr
to the stack.
Fails if a
is not a valid scalar.
P commit → v
- Pops a point
P
from the stack. - Creates a variable
v
from a Pedersen commitmentP
. - Pushes
v
to the stack.
Fails if P
is not a valid point.
alloc → expr
- Allocates a low-level variable in the constraint system and wraps it in the expression with weight 1.
- Pushes the resulting expression to the stack.
This is different from commit
: the variable created by alloc
is not represented by an individual Pedersen commitment and therefore can be chosen freely when the transaction is constructed.
mintime → expr
Pushes an expression expr
corresponding to the minimum time bound of the transaction.
The one-term expression represents time bound as a weight on the R1CS constant 1
(see scalar
).
maxtime → expr
Pushes an expression expr
corresponding to the maximum time bound of the transaction.
The one-term expression represents time bound as a weight on the R1CS constant 1
(see scalar
).
var expr → ex
- Pops a variable
var
. - Allocates a high-level variable in the constraint system using its Pedersen commitment.
- Pushes a single-term expression with weight=1 to the stack:
expr = { (1, var) }
.
Fails if var
is not a variable type.
ex1 neg → ex2
- Pops an expression
ex1
. - Negates the weights in the
ex1
producing new expressionex2
. - Pushes
ex2
to the stack.
Fails if ex1
is not an expression type.
ex1 ex2 add → ex3_
- Pops two expressions
ex2
, thenex1
. - If both expressions are constant expressions:
- Creates a new constant expression
ex3
with the weight equal to the sum of weights inex1
andex2
.
- Creates a new constant expression
- Otherwise, creates a new expression
ex3
by concatenating terms inex1
andex2
. - Pushes
ex3
to the stack.
Fails if ex1
and ex2
are not both expression types.
ex1 ex2 mul → ex3
Multiplies two expressions producing another expression representing the result of multiplication.
This performs a guaranteed optimization: if one of the expressions ex1
or ex2
contains
only one term and this term is for the variable representing the R1CS constant 1
(in other words, the statement is a cleartext constant),
then the other expression is multiplied by that constant in-place without allocating a multiplier in the constraint system.
This optimization is guaranteed because it affects the state of the constraint system: not performing it would make the existing proofs invalid.
- Pops two expressions
ex2
, thenex1
. - If either
ex1
orex2
is a constant expression:- The other expression is multiplied in place by the scalar from that expression.
- The resulting expression is pushed to the stack.
- Otherwise:
- Creates a multiplier in the constraint system.
- Constrains the left wire to
ex1
, and the right wire toex2
. - Creates an expression
ex3
with the output wire in its single term. - Pushes
ex3
to the stack.
Fails if ex1
and ex2
are not both expression types.
Note: if both ex1
and ex2
are constant expressions,
the result does not depend on which one treated as a constant,
and the resulting expression is also a constant expression.
ex1 ex2 eq → constraint
- Pops two expressions
ex2
, thenex1
. - If both
ex1
orex2
are constant expressions:- Creates a cleartext constraint with a boolean
true
if the weights are equal,false
otherwise.
- Creates a cleartext constraint with a boolean
- Otherwise:
- Creates a constraint that represents statement
ex1 - ex2 = 0
.
- Creates a constraint that represents statement
- Pushes the constraint to the stack.
Fails if ex1
and ex2
are not both expression types.
expr range → expr
- Pops an expression
expr
. - Adds an 64-bit range proof for
expr
to the constraint system (see Cloak protocol for the range proof definition). - Pushes
expr
back to the stack.
Fails if expr
is not an expression type.
c1 c2 and → c3
- Pops constraints
c2
, thenc1
. - If either
c1
orc2
is a cleartext constraint:- If the cleartext constraint is
false
, returns it; otherwise returns the other constraint.
- If the cleartext constraint is
- Otherwise:
- Creates a conjunction constraint
c3
containingc1
andc2
.
- Creates a conjunction constraint
- Pushes
c3
to the stack.
No changes to the constraint system are made until verify
is executed.
Fails if c1
and c2
are not constraints.
constraint1 constraint2 or → constraint3
- Pops constraints
c2
, thenc1
. - If either
c1
orc2
is a cleartext constraint:- If the cleartext constraint is
true
, returns it; otherwise returns the other constraint.
- If the cleartext constraint is
- Otherwise:
- Creates a disjunction constraint
c3
containingc1
andc2
.
- Creates a disjunction constraint
- Pushes
c3
to the stack.
No changes to the constraint system are made until verify
is executed.
Fails if c1
and c2
are not constraints.
constr1 not → constr2
- Pops constraint
c1
. - If
c1
is a cleartext constraint, returns its negation. - Otherwise:
- Create two constraints:
where
x * y = 0 x * w = 1-y
w
is a free variable andx
is the evaluation of constraintc1
. - Wrap the output
y
in a constraintc2
. - Push
c2
to the stack.
- Create two constraints:
This implements the boolean not
trick from Setty, Vu, Panpalia, Braun, Ali, Blumberg, Walfish (2012) and implemented in libsnark.
constr verify → ø
- Pops constraint
constr
. - If
constr
is a cleartext constraint:- If it is
true
, returns immediately. - If it is
false
, fails execution.
- If it is
- Otherwise, transforms the constraint
constr
recursively using the following rules:- Replace conjunction of two linear constraints
a
andb
with a linear constraintc
by combining both constraints with a random challengez
:z = transcript.challenge_scalar(b"ZkVM.verify.and-challenge"); c = a + z·b
- Replace disjunction of two linear constraints
a
andb
by constrainting an outputo
of a newly allocated multiplier{r,l,o}
to zero, while adding constraintsr == a
andl == b
to the constraint system.r == a # added to CS l == b # added to CS o == 0 # replaces OR(a,b)
- Conjunctions and disjunctions of non-linear constraints are transformed via rules (1) and (2) using depth-first recursion.
- Replace conjunction of two linear constraints
- The resulting single linear constraint is added to the constraint system.
Fails if constr
is not a constraint.
V v unblind → V
- Pops scalar
v
. - Pops point
V
. - Verifies the unblinding proof for the commitment
V
and scalarv
, deferring all point operations). - Pushes point
V
.
Fails if:
qty flv metadata pred issue → contract
- Pops point
pred
. - Pops string
metadata
. - Pops variable
flv
and commits it to the constraint system. - Pops variable
qty
and commits it to the constraint system. - Creates a value with variables
qty
andflv
for quantity and flavor, respectively. - Computes the flavor scalar defined by the predicate
pred
using the following transcript-based protocol:T = Transcript("ZkVM.issue") T.append("predicate", pred) T.append("metadata", metadata) flavor = T.challenge_scalar("flavor")
- Checks that the
flv
has unblinded commitment toflavor
by deferring the point operation:flv == flavor·B
- Adds a 64-bit range proof for the
qty
to the constraint system (see Cloak protocol for the range proof definition). - Adds an issue entry to the transaction log.
- Creates a contract with the value as the only payload,
protected by the predicate
pred
, consuming VM’s last anchor and replacing it with this contract’s ID.
The value is now issued into the contract that must be unlocked
using one of the contract instructions: signtx
, signid
, signtag
or call
.
Fails if:
pred
is not a valid point,flv
orqty
are not variable types,- VM’s last anchor is not set.
qty flv borrow → –V +V
- Pops variable
flv
and commits it to the constraint system. - Pops variable
qty
and commits it to the constraint system. - Creates a value
+V
with variablesqty
andflv
for quantity and flavor, respectively. - Adds a 64-bit range proof for
qty
variable to the constraint system (see Cloak protocol for the range proof definition). - Creates wide value
–V
, allocating a low-level variableqty2
for the negated quantity and reusing the flavor variableflv
. - Adds a constraint
qty2 == -qty
to the constraint system. - Pushes
–V
, then+V
to the stack.
The wide value –V
is not a portable type, and can only be consumed by a cloak
instruction
(where it is merged with appropriate positive quantity of the same flavor).
Fails if qty
and flv
are not variable types.
value retire → ø
- Pops a value from the stack.
- Adds a retirement entry to the transaction log.
Fails if the value is not a non-negative value type.
widevalues commitments cloak:m:n → values
Merges and splits m
wide values into n
values.
- Pops
2·n
points as pairs of flavor and quantity for each output value, flavor is popped first in each pair. - Pops
m
wide values as input values. - Creates constraints and 64-bit range proofs for quantities per Cloak protocol.
- Pushes
n
values to the stack, placing them in the reverse order as their corresponding commitments:A B C → cloak → C B A
Immediate data m
and n
are encoded as two LE32s.
qty fee → widevalue
- Pops an 4-byte string
qty
from the stack and decodes it as LE32 integer. - Checks that
qty
and accumulated fee is less or equal to2^24
. - Pushes wide value
–V
, with quantity variable constrained to-qty
and with flavor constrained to 0. Both variables are allocated from a single multiplier. - Adds a fee entry to the transaction log with the quantity
qty
.
Fails if the resulting amount of fees is exceeding 2^24
.
prevoutput input → contract
- Pops a string
prevoutput
representing the unspent output structure from the stack. - Constructs a contract based on
prevoutput
and pushes it to the stack. - Adds input entry to the transaction log.
- Sets the VM’s last anchor to the ratcheted contract ID:
T = Transcript("ZkVM.ratchet-anchor") T.append("old", contract_id) new_anchor = T.challenge_bytes("new")
Fails if the prevoutput
is not a string with exact encoding of an output structure.
items... predicate output:k → ø
- Pops
predicate
from the stack. - Pops
k
portable items from the stack. - Creates a contract with the
k
items as a payload, the predicatepred
, and anchor set to the VM’s last anchor. - Adds an output entry to the transaction log.
- Updates the VM’s last anchor with the contract ID of the new contract.
Immediate data k
is encoded as LE32.
Fails if:
- VM’s last anchor is not set,
- payload items are not portable.
items... pred contract:k → contract
- Pops predicate
pred
from the stack. - Pops
k
portable items from the stack. - Creates a contract with the
k
items as a payload, the predicatepred
, and anchor set to the VM’s last anchor. - Pushes the contract onto the stack.
- Update the VM’s last anchor with the contract ID of the new contract.
Immediate data k
is encoded as LE32.
Fails if:
- VM’s last anchor is not set,
- payload items are not portable.
data log → ø
- Pops
data
from the stack. - Adds data entry with it to the transaction log.
Fails if data
is not a string.
prog eval → results...
- Pops program
prog
. - Set the
prog
as current.
Fails if prog
is not a program.
contract(P) proof prog call → results...
- Pops program
prog
, the call proofproof
, and a contractcontract
. - Reads the predicate
P
from the contract. - Reads the signing key
X
, list of neighborsneighbors
, and their positionspositions
from the call proofproof
. - Uses the program
prog
,neighbors
, andpositions
to compute the Merkle rootM
. - Forms a statement to verify a relation between
P
,M
, andX
:0 == -P + X + h1(X, M)·G
- Adds the statement to the deferred point operations.
- Places the payload on the stack (last item on top).
- Set the
prog
as current.
Fails if:
contract(predicate, payload) signtx → items...
- Pops the contract from the stack.
- Adds the contract’s
predicate
as a verification key to the list of deferred keys for transaction signature check at the end of the VM execution. - Places the
payload
on the stack (last item on top), discarding the contract.
Note: the instruction never fails as the only check (signature verification) is deferred until the end of VM execution.
contract(predicate, payload) prog sig signid → items...
- Pop string
sig
, programprog
and the contract from the stack. - Read the
predicate
from the contract. - Place the
payload
on the stack (last item on top), discarding the contract. - Instantiate the transcript:
T = Transcript("ZkVM.signid")
- Commit the contract ID
contract.id
to the transcript:T.append("contract", contract_id)
- Commit the program
prog
to the transcript:T.append("prog", prog)
- Extract nonce commitment
R
and scalars
from a 64-byte stringsig
:R = sig[ 0..32] s = sig[32..64]
- Perform the signature protocol using the transcript
T
, public keyP = predicate
and the valuesR
ands
:(s = dlog(R) + e·dlog(P)) s·B == R + c·P
- Add the statement to the list of deferred point operations.
- Set the
prog
as current.
Fails if:
contract(predicate, payload) prog sig signtag → items... tag
- Pop string
sig
, programprog
and the contract from the stack. - Read the
predicate
from the contract. - Place the
payload
on the stack (last item on top), discarding the contract. - Verifies that the top item is a string, and reads it as a
tag
. The item remains on the stack. - Instantiate the transcript:
T = Transcript("ZkVM.signtag")
- Commit the
tag
to the transcript:T.append("tag", tag)
- Commit the program
prog
to the transcript:T.append("prog", prog)
- Extract nonce commitment
R
and scalars
from a 64-byte datasig
:R = sig[ 0..32] s = sig[32..64]
- Perform the signature protocol using the transcript
T
, public keyP = predicate
and the valuesR
ands
:(s = dlog(R) + e·dlog(P)) s·B == R + c·P
- Add the statement to the list of deferred point operations.
- Set the
prog
as current.
Fails if:
sig
is not a 64-byte long string,- or
prog
is not a program, - or
contract
is not a contract, - or last item in the
payload
(tag
) is not a string.
ø ext → ø
All unassigned instruction codes are interpreted as no-ops. This are reserved for use in the future versions of the VM.
See Versioning.
A Transaction is serialized as follows:
SerializedTx = TxHeader || LE32(len(Program)) || Program || Signature || Proof
TxHeader = LE64(version) || LE64(mintime) || LE64(maxtime)
Program = <len(Program) bytes>
Signature = <64 bytes>
Proof = <14·32 + len(InnerProductProof) bytes>
Locks value with a public key.
... (<value>) <pubkey> output:1
Unlocks a simple contract that locked a single value with a public key. The unlock is performed by claiming the input and signing the transaction.
<serialized_input> input signtx ...
Unlocks three values from the existing inputs,
recombines them into a payment to address A
(pubkey) and a change C
:
<input1> input signtx
<input2> input signtx
<input3> input signtx
<FC> <QC> <FA> <QA> cloak:3:2 # flavor and quantity commitments for A and C
<A> output:1
<C> output:1
Multi-signature predicate can be constructed in three ways:
- For N-of-N schemes, a set of independent public keys can be merged using a MuSig scheme as described in transaction signature. This allows non-interactive key generation, and only a simple interactive signing protocol.
- For threshold schemes (M-of-N, M ≠ N), a single public key can be constructed using a variant of a Feldman-VSS scheme, but this requires interactive key generation.
- Small-size threshold schemes can be instantiated non-interactively using Taproot. Most commonly, 2-of-3 "escrow" scheme can be implemented as 2 keys aggregated as the main branch for the "happy path" (escrow party not involved), while the other two combinations aggregated in the nested branches.
Note that all three approaches minimize computational costs and metadata leaks, unlike Bitcoin, Stellar and TxVM where all keys are enumerated and checked independently.
Offer is a cleartext contract that can be cancelled by the owner or lifted by an arbitrary taker.
Offer locks the value being sold and stores the price as a pair of commitments: for the flavor and quantity.
The cancellation clause is simply a predicate formed by the maker’s public key.
The lift clause when chosen by the taker, borrows the payment amount according to the embedded price, makes an output with the positive payment value and leaves to the taker a negative payment and the unlocked value. The taker than merges the negative payment and the value together with their actual payment using the cloak instruction, and create an output for the lifted value.
contract Offer(value, price, maker) {
OR(
maker,
{
let (payment, negative_payment) = borrow(price.qty, price.flavor)
output(payment, maker)
return (negative_payment, value)
}
)
}
Lift clause bytecode:
<priceqty> <priceflavor> borrow <makerpubkey> output:1
To make it discoverable, each transaction that creates an offer output also creates a data entry describing the value quantity and flavor, and the price quantity and flavor in cleartext format. This way the offer contract does not need to perform any additional computation or waste space for cleartext scalars.
Bytecode creating the offer:
<value> <offer predicate> output:1 "Offer: 1 BTC for 3745 USD" data
The programmatic API for creating, indexing and interacting with offers ties all parts together.
TBD:
Sketch:
price = rational X/Y, total value = V, payment: P, change: C;
constraint: (V - C)*X == Y*P
.
contract unlocks V, borrow
s and unblind
s C, outputs C into the same contract,
borrows P per provided scalar amount, outputs P into recipient's address.
Returns V, -C and -P to the user. User merges V+(-C) and outputs V-C to its address, provides P into the cloak.
What about zero remainder or dust? Contract can have two branches: partial or full lift, and partial lift has minimum remainder to prevent dust.
TBD.
TBD.
Payment channel is a contract that permits a number of parties to exchange value within a given range back-and-forth without publication of each transaction on a blockchain. Instead, only net distribution of balances is settled when the channel is closed.
Assumptions for this example:
- There are 2 parties in a channel.
- The channel uses values of one flavor chosen when the channel is created.
Overview:
- Both parties commit X quantity of funds to a shared contract protected by a simple 2-of-2 multisig predicate. This allows each party to net-send or net-receive up to X units.
- Parties can close the channel mutually at any time, signing off a transaction that distributes the latest balances. They can even choose the target addresses arbitrarily. E.g. if one party needs to make an on-chain payment, they can have their balance split in two outputs: payment and change immediately when closing a channel, without having to make an additional transaction.
- Parties can independently close the channel at any time using a pre-signed authorization predicate, that encodes a distribution of balances with the hard-coded pay-out outputs.
- Each payment has a corresponding pre-signed authorization reflecting a new distribution.
- To prevent publication of a transaction using a stale authorization predicate:
- The predicate locks funds in a temporary "holding" contract for a duration of a "contest period" (agreed upon by both parties at the creation of a channel).
- Any newer predicate can immediately spend that time-locked contract into a new "holding" contract with an updated distribution of funds.
- Users watch the blockchain updates for attempts to use a stale authorization, and counter-act them with the latest version.
- If the channel was force-closed and not (anymore) contested, after a "contest period" is passed, the "holding" contract can be opened by either party, which sends the funds to the pre-determined outputs.
In ZkVM such predicate is implemented with a signed program (signtag
) that plays dual role:
- It allows any party to initiate a force-close.
- It allows any party to dispute a stale force-close (overriding it with a more fresh force-close).
To initiate a force-close, the program P1
does:
- Take the exptime as an argument provided by the user (encrypted via Pedersen commitment).
- Check that
tx.maxtime + D == exptime
(built-in contest period). Transaction maxtime is chosen by the initiator of the tx close to the current time. - Put the exptime and the value into the output under predicate
P2
(built-in predicate committing to a program producing final outputs and checking exptime).
To construct such program P1
, users first agree on the final distribution of balances via the program P2
.
The final-distribution program P2
:1. Checks that tx.mintime >= exptime
(can be done via range:24(tx.mintime - exptime)
which gives 6-month resolution for the expiration time)
2. Creates borrow
/output
combinations for each party with hard-coded predicate for each output.
3. Leaves the payload value and negatives from borrow
on the stack to be consumed by the cloak
instruction.
To dispute a stale force-close, the program P1
has an additional feature:
it contains a sequence number that's incremented for each new authorization predicate P1
, at each payment.
The program P1
, therefore:
- Expect two items on the stack coming from a contract:
seq
scalar andval
value (on top ofseq
). Below these, there isexptime
commitment. - Check
range:24(new_seq - seq - 1)
to make surenew_seq > seq
. If there are more than 16.7 million payments (capped by a 24-bit rangeproof), several contest transactions can be chained together with a distance of 16.7 million payments between each other to reach the latest one. At the tx rate of 1 tx/sec, this implies 6 months worth of channel life per intermediate force-close transaction. - Check the expiration time
tx.maxtime + D == exptime
(D is built into the predicate). - Lock the value and
exptime
in a new output with a built-in predicateP2
.
If P1
is used to initiate force-close on a contract that does not have a sequence number (initial contract),
the user simply provides a zero scalar on the stack under the contract.
Confidentiality properties:
- When the channel is normally closed, it is not even revealed whether it was a payment channel at all. The channel contract looks like an ordinary output indistinguishable from others.
- When the channel is force-closed, it does not reveal the number of payments (sequence number) that went through it, nor the contest period which could be used to fingerprint participants with different settings.
The overhead in the force-close case is 48 multipliers (two 24-bit rangeproofs) — a 37.5% performance overhead on top of 2 outputs which the channel has to create even in a normal-close case. There is no range proof for the re-locked value between P1
and P2
, as it's already known to be in range and it does no go through the cloak
.
TBD.
This section collects discussion of the rationale behind the design decisions in the ZkVM.
ZkVM has similar or almost identical properties as TxVM:
- The format of the transaction is the executable bytecode.
- The VM is a Forth-like stack machine.
- Multi-asset issuable values are first-class types subject to linear logic.
- Contracts are first-class types implementing object-capability model.
- VM assumes small UTXO-based blockchain state and very simple validation rules outside the VM.
- Each unspent transaction output (UTXO) is a contract which holds arbitrary collection of data and values.
- Transaction log as an append-only list of effects of the transaction (inputs, outputs, etc.)
- Contracts use linear logic by imperatively producing the effects they require instead of observing and verifying their context.
- Execution can be delegated to a signed program which is provided at the moment of opening a contract.
At the same time, ZkVM improves on the following tradeoffs in TxVM:
- Runlimit and jumps: ZkVM does not permit recursion and loops and has more predictable cost model that does not require artificial cost metrics per instruction.
- Too abstract capabilities that do not find their application in the practical smart contracts, like having “wrapping” contracts or many kinds of hash functions.
- Uniqueness of transaction IDs enforced via anchors (embedded in values) is conceptually clean in TxVM, although not very ergonomic. In confidential transactions anchors become even less ergonomic in several respects, and issuance is simpler without anchors.
- TxVM allows multiple time bounds and needs to intersect all of them, which comes at odds with zero-knowledge proofs about time bounds.
- Creating outputs is less ergonomic and more complex than necessary (via a temporary contract).
- TxVM allows nested contracts and needs multiple stacks and an argument stack. ZkVM uses one stack.
Forward- and backward-compatible upgrades (“soft forks”) are possible with extension instructions, enabled by the extension flag and higher version numbers.
For instance, to implement a SHA-256 function, an unsued extension instruction
could be assigned verifysha256
name and check if top two strings on the stack
are preimage and image of the SHA-256 function respectively. The VM would fail if
the check failed, while the non-upgraded software could choose to treat the instruction as no-op
(e.g. by ignoring the upgraded transaction version).
It is possible to write a compatible contract that uses features of a newer
transaction version while remaining usable by non-upgraded software
(that understands only older transaction versions) as long as
new-version code paths are protected by checks for the transaction
version. To facilitate that, a hypothetical ZkVM upgrade may introduce
an extension instruction “version assertion” that fails execution if
the version is below a given number (e.g. 4 versionverify
).
Some instructions (output
, roll
etc) have size or index
parameters specified as immediate data (part of the instruction code),
which makes it impossible to compute such argument on the fly.
This allows for a simpler type system (no integers, only scalars), while limiting programs to have pre-determined structure.
In general, there are no jumps or cleartext conditionals.
Note, however, that with the use of delegated signatures (signid
, signtag
),
program structure can be determined right before the use.
- it makes sense to reuse variable created by
blind
- txbuilder can keep the secrets assigned to variable instances, so it may be more convenient than remembering preimages for commitments.
The payload of a contract must be provided to the selected branch. If both predicates must be evaluated and both are programs, then which one takes the payload? To avoid ambiguity, AND can be implemented inside a program that can explicitly decide in which order and which parts of payload to process: maybe check some conditions and then delegate the whole payload to a predicate, or split the payload in two parts and apply different predicates to each part. There's contract
instruction for that delegation.
Imagine your contract is "pay $X to address A in order to unlock £Y".
You can write this contract as "give me the value V, i'll check if V == $X, and lock it with address A, then unlock my payload £Y". Then you don't need any negative values or "borrowing".
However, this is not flexible: what if $X comes from another contract that takes Z£ and part of Z is supposed to be paid by £Y? You have a circular dependency which you can only resolve if you have excess of £ from somewhere to resolve both contracts and then given these funds back.
Also, it's not private enough: the contract now not only reveals its logic, but also requires you to link the whole tx graph where the $X is coming from.
Now let’s see how borrow
simplifies things: it allows you to make $X out of thin air, "borrowing" it from void for the duration of the transaction.
borrow
gives you a positive $X as requested, but it also needs to require you to repay $X from some other source before tx is finalized. This is represented by creating a negative –$X as a less powerful type Wide Value.
Wide values are less powerful (they are super-types of Values) because they are not portable. You cannot just stash such value away in some output. You have to actually repay it using the cloak
instruction.
First, note that inequality constraint only makes sense for integers, not for scalars.
This is because integers are ordered and scalars wrap around modulo group order.
Therefore, any two variables or expressions that must be compared,
must theselves be proven to be in range using the range
instruction (directly or indirectly).
Then, the inequality constraint is created by forming an expression of the form expr ≥ 0
(using instructions neg
and add
) and using a range
instruction to place a range check on expr
.
Constraint | Range check |
---|---|
a ≥ b |
range:64(a - b) |
a ≤ b |
range:64(b - a) |
a > b |
range:64(a - b - 1) |
a < b |
range:64(b - a - 1) |
Logical instructions or
and and
work by combining constraints of form expr == 0
, that is, a comparison with zero. Logical not
then is a check that a secret variable is not zero.
One way to do that is to break the scalar in 253 bits (using 253 multipliers) and add a disjunction "at least one of these bits is 1" (using additional 252 multipliers), spending total 505 multipliers (this is 8x more expensive than a regular 64-bit rangeproof).
On the other hand, using not
may not be a good way to express contracts because of a dramatic increase in complexity: a contract that says not(B)
effectively inverts a small set of inadmissible inputs, producing a big set of addmissible inputs.
We need to investigate whether there are use-cases that cannot be safely or efficiently expressed with only and
and or
combinators.
In ZkVM:
- Transaction ID is globally unique,
- UTXO is globally unique,
- Nonces are replaced with synthetic "nonce units" added via synthetic UTXOs introduced by each block,
- Contract is globally unique,
- Value is not unique.
In contrast, in TxVM:
- Transaction ID is globally unique,
- UTXO is not unique,
Nonce
is globally unique,- Contract is globally unique,
- Value is globally unique.
TxVM ensures transaction uniqueness this way:
- Each value has a unique “anchor” (32-byte payload)
- When values are split/merged, anchor is one-way hashed to produce new anchor(s).
finalize
instruction consumes a zero-quantity value, effectively consuming a unique anchor.- Anchors are originally produced by synthetic UTXOs added by each block.
- Issuance of new assets consumes a zero-quantity value, moving the anchor from it to the new value.
Pro:
UTXO is fully determined before transaction is finalized. So e.g. a child transaction can be formed before the current transaction is completed and its ID is known. This might be handy in some cases.
Meh:
It seems like transaction ID is defined quite simply as a hash of the log, but it also must consume an anchor (since UTXOs alone are not guaranteed to be unique, and one can make a transaction without spending any UTXOs). So transaction ID computation still has some level of special handling.
Con:
Recipient of payment cannot fully specify the contract snapshot because they do not know sender’s anchors. This is not a problem in cleartext TxVM, but a problem in ZkVM where values have to participate in the Cloak protocol.
Storing anchor inside a value turned out to be handy, but is not very ergonomic. For instance, a contract cannot simply “claim” an arbitrary value and produce a negative counterpart, w/o having some value to “fork off” an anchor from.
Another potential issue: UTXOs are not guaranteed to be always unique. E.g. if a contract does not modify its value and other content, it can re-save itself to the same UTXO ID. It can even toggle between different states, returning to the previously spent ID. This can cause issues in some applications that forget that UTXO ID in special circumstances can be resurrected.
ZkVM ensures transaction uniqueness this way:
- Values do not have anchors.
- Instead of the
nonce
instruction, the initial anchors are tied to a block via "nonce units" introduced by each block. issue
does not consume zero valuefinalize
does not consume zero valueclaim/borrow
can produce an arbitrary value and its negative at any point- Each UTXO ID is defined as
Hash(contract)
, that is contents of the contract are unique due to a contract’s anchor. - Transaction ID is a hash of the finalized log.
- Newly created contracts consume the "last anchor" tracked by VM and set it to their contract ID.
- When VM finishes, it checks that "last anchor" is present, which means that the log contains at least one input entry.
- Inputs and outputs are represented in the log as contract IDs which the blockchain state machine should remove/insert accordingly.
Pros:
The values are simply pedersen commitments (Q,A) (quantity, flavor), without any extra payload. This makes handling them much simpler: we can issue, finalize and even “claim” a value in a straightforward manner.
Con:
UTXO IDs are not generally known without fuller view on transaction flow. This could be not a big deal, as we cannot really plan for the next transaction until this one is fully formed and published. Also, in a joint proof scenario, it’s even less reliable to plan the next payment until the multi-party protocol is completed, therefore waiting for transaction ID to be determined becomes a requirement.
That said, for contracts created from another contract, the contract ID is determined locally by the parent contract’s ID.
Previously, we thought we needed them to reblind a received value. (In this now-scrapped protocol, we would encrypt a value to a public key in one transaction and decrypt it in another one.) Now, we will use El-Gamal encryptions and codify encrypted values or data in specific types. That means we do not need to operate on individual values or commitments.
Also, instead of placing bounds on some received value, we now normally use borrow
. While constraints on values are still relevant, the borrow
approach means we do not need to inspect a Value
object. Instead, we can simply constrain raw commitments or variables and then borrow
a value with such a commitment.
Thus, neither of these use cases requires qty
and flavor
.
None at the moment.