| 1 | +.. _untrusted: |
| 2 | + |
| 3 | +.. warning:: |
| 4 | + This feature is currently in public testing and not yet recommended to be |
| 5 | + used for important data. Related UI controls are hidden by a feature flag - |
| 6 | + check the release notes for information on how to test it. |
| 7 | + |
| 8 | +Untrusted Device Encryption |
| 9 | +=========================== |
| 10 | + |
| 11 | +Threat Model / Primary Goals |
| 12 | +---------------------------- |
| 13 | + |
| 14 | +An "untrusted" device can participate in a Syncthing cluster with the |
| 15 | +following assumptions and limitations: |
| 16 | + |
| 17 | +The untrusted device can *not* observe: |
| 18 | + |
| 19 | +- File data |
| 20 | + |
| 21 | +- File or directory names, symlink names, symlink targets |
| 22 | + |
| 23 | +- File modification time, permissions, or modification history (version |
| 24 | + vectors) |
| 25 | + |
| 26 | +The untrusted device *will* be able to observe: |
| 27 | + |
| 28 | +- File sizes [#sizes]_ |
| 29 | + |
| 30 | +- Which parts of files are changed by the other devices and when |
| 31 | + |
| 32 | +The last point is required by the syncing mechanism, in order to avoid |
| 33 | +transferring all unchanged file data when a file block changes. Blocks and |
| 34 | +block hashes are encrypted with a per-file key, in a way that depends on the |
| 35 | +block offset, so correlation is not possible between blocks at different |
| 36 | +offsets or in different files. |
| 37 | + |
| 38 | +In addition the untrusted device *must not* be able to modify, remove or |
| 39 | +introduce data by itself without detection. |
| 40 | + |
| 41 | +Primitives Used |
| 42 | +--------------- |
| 43 | + |
| 44 | +The user input to the system is the *folder ID*, which is a short string |
| 45 | +identifying a given folder between devices, and the *password*. From this we |
| 46 | +generate a *folder key* (32 bytes) using ``scrypt``:: |
| 47 | + |
| 48 | + folderKey = scrypt.Key(password, "syncthing" + folderID) |
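
As a rough illustration of the derivation above, here is a minimal Python sketch using only the standard library. The function name ``folder_key`` and the scrypt cost parameters (``n``, ``r``, ``p``, ``maxmem``) are assumptions made for this example; the text only specifies the password, the salt and the 32 byte output.

```python
import hashlib


def folder_key(password: str, folder_id: str) -> bytes:
    """Derive a 32 byte folder key from the password and folder ID."""
    # The salt is the literal string "syncthing" followed by the folder ID.
    salt = b"syncthing" + folder_id.encode("utf-8")
    # Cost parameters are illustrative assumptions, not taken from the text.
    # maxmem is raised because n=2**15, r=8 needs slightly over 32 MiB.
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**15,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )
```

Because the folder ID is part of the salt, the same password yields unrelated keys for different folders.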
| 49 | + |
| 50 | +The string "syncthing" with the folder ID concatenated makes up the salt. The |
| 51 | +folder key is used to encrypt file names using AES-SIV without nonce:: |
| 52 | + |
| 53 | + encryptedFilename = AES-SIV(filename, folderKey) |
| 54 | + |
| 55 | +Given the key length of 32 bytes, the algorithm in use will be AES-128 |
| 56 | +("AES-SIV-256"). To make the encrypted file name usable again as a file |
| 57 | +name, we encode it using base32 and add slashes at strategic places. |
| 58 | + |
| 59 | +From the folder key and the plaintext file name we derive the *file key* by |
| 60 | +HKDF of the folder key and the plaintext file name:: |
| 61 | + |
| 62 | + fileKey = HKDF(SHA256, folderKey+filename, salt: "syncthing", info: nil) |
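
The file key derivation above can be sketched with a hand-written HKDF (RFC 5869) over the Python standard library. The helper names ``hkdf_sha256`` and ``file_key`` are hypothetical, and the UTF-8 encoding of the file name is an assumption; the inputs (folder key plus file name as key material, "syncthing" as salt, empty info) follow the text.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes = b"", length: int = 32) -> bytes:
    """HKDF with SHA-256: extract a pseudorandom key, then expand it."""
    # Extract step: PRK = HMAC(salt, input key material).
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand step: T(i) = HMAC(PRK, T(i-1) || info || i).
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def file_key(folder_key: bytes, filename: str) -> bytes:
    """Derive the per-file key from the folder key and plaintext file name."""
    # Assumption: the file name is encoded as UTF-8 before concatenation.
    ikm = folder_key + filename.encode("utf-8")
    return hkdf_sha256(ikm, salt=b"syncthing", info=b"", length=32)
```

Each file name gives a different key, so identical blocks in different files do not encrypt under the same key.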
| 63 | + |
| 64 | +This file key is used for all other encryption, specifically file block |
| 65 | +hashes and data blocks. In file metadata, block hashes are encrypted using |
| 66 | +AES-SIV with the file key:: |
| 67 | + |
| 68 | + encryptedBlockHash = AES-SIV(blockHash, fileKey) |
| 69 | + |
| 70 | +Data blocks are encrypted using XChaCha20-Poly1305 with random nonces and |
| 71 | +appended to the nonce itself:: |
| 72 | + |
| 73 | + encryptedBlock = nonce + XChaCha20-Poly1305.Seal(block, nonce, fileKey) |
| 74 | + |
| 75 | +The original file metadata descriptor is encrypted in the same manner and |
| 76 | +attached to the encrypted-file metadata. |
| 77 | + |
| 78 | +Devices sharing a folder need to use the same password. |
| 79 | +To ensure this, a *password token*, in the form of an arbitrary but commonly |
| 80 | +known string encrypted using AES-SIV with the folder key, is sent in the |
| 81 | +:ref:`cluster-config`:: |
| 82 | + |
| 83 | + passwordToken = AES-SIV("syncthing" + folderID, folderKey) |
| 84 | + |
| 85 | +Thus an untrusted device can verify that all its connected devices use the same |
| 86 | +password by comparing the encrypted token, without knowing the password itself. |
| 87 | + |
| 88 | +.. note:: |
| 89 | + |
| 90 | + In Syncthing a file is made up of a number of equal size data blocks, |
| 91 | + followed by a usually shorter last data block. The full size data blocks |
| 92 | + are at minimum 128 KiB, ranging up to 16 MiB in powers of two. The |
| 93 | + last data block can in principle be as small as one byte. For untrusted |
| 94 | + folders the size of the last data block is padded up to a kilobyte if it |
| 95 | + was shorter to begin with. The untrusted device isn't allowed to request |
| 96 | + less than a kilobyte of data. |
| 97 | + |
| 98 | + I don't actually know if this block padding serves a purpose. It was |
| 99 | + added to address a worry that something might break or leak if an |
| 100 | + attacker is allowed to repeatedly request single-byte data blocks of |
| 101 | + their choosing. If there is nothing to worry about here we can remove |
| 102 | + the padding. //jb |
| 103 | + |
| 104 | +.. note:: |
| 105 | + |
| 106 | + While a well behaved implementation is expected to request data blocks |
| 107 | + precisely as announced in the file metadata there is no enforcement of |
| 108 | + this. This means that an attacker on the untrusted side can repeatedly |
| 109 | + request arbitrary ranges of a file and receive the encrypted result. |
| 110 | + With the restriction above, the minimum block size that can be requested |
| 111 | + is 1024 bytes. |
| 112 | + |
| 113 | + |
| 114 | +Implementation Details |
| 115 | +---------------------- |
| 116 | + |
| 117 | +Metadata Encryption |
| 118 | +~~~~~~~~~~~~~~~~~~~ |
| 119 | + |
| 120 | +The Syncthing protocol is essentially two-phase: |
| 121 | + |
| 122 | +- A device sends file metadata (a ``FileInfo`` structure) for a new or changed file |
| 123 | + |
| 124 | +- The other side determines which blocks it needs to construct the new file, and requests these blocks |
| 125 | + |
| 126 | +For untrusted devices a fake FileInfo is constructed, with an encrypted |
| 127 | +name and block list, and with other metadata such as modification time and |
| 128 | +permissions set to static values. |
| 129 | + |
| 130 | +An original file metadata structure looks something like this: |
| 131 | + |
| 132 | +.. graphviz:: |
| 133 | + |
| 134 | + digraph g { |
| 135 | + graph [ |
| 136 | + rankdir = "LR" |
| 137 | + ] |
| 138 | + "fileinfo" [ |
| 139 | + label = "name | type | size | modified | ... | <b> blocks | block size" |
| 140 | + shape = "record" |
| 141 | + ] |
| 142 | + "blocks" [ |
| 143 | + label = "{ <a> offset | size | hash } | { offset | size | hash } | ..." |
| 144 | + shape = "record" |
| 145 | + ] |
| 146 | + fileinfo:b -> blocks:a |
| 147 | + } |
| 148 | + |
| 149 | +The fake FileInfo encrypts and adjusts a couple of attributes: |
| 150 | + |
| 151 | +- The name is encrypted (with the folder key), base32 encoded, and slashes |
| 152 | + are inserted after the first and third characters, and then every 200 |
| 153 | + characters. |
| 154 | + |
| 155 | +- The file size is adjusted for the per block overhead, and rounded up so that |
| 156 | + the last block is a multiple of 1024 bytes. |
| 157 | + |
| 158 | +- The block size is adjusted for block overhead. |
| 159 | + |
| 160 | +Other file attributes are set to static values, for example the modification |
| 161 | +time is set to UNIX epoch time 1234567890 and permissions are set to 0644. |
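
The name encoding described in the list above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function names are hypothetical, the input is assumed to be the already encrypted name bytes, and the choice of the standard base32 alphabet with padding stripped is an assumption, since the text only says "base32".

```python
import base64

MAX_COMPONENT = 200  # characters per path component after the first two


def slashify(encoded: str) -> str:
    """Insert slashes after the 1st and 3rd characters, then every 200."""
    parts = [encoded[:1], encoded[1:3]]
    rest = encoded[3:]
    parts += [rest[i:i + MAX_COMPONENT] for i in range(0, len(rest), MAX_COMPONENT)]
    # Drop empty components that arise for very short inputs.
    return "/".join(p for p in parts if p)


def encode_name(encrypted_name: bytes) -> str:
    """Turn encrypted name bytes into a usable relative path."""
    # Assumption: standard base32 alphabet, "=" padding removed.
    b32 = base64.b32encode(encrypted_name).decode("ascii").rstrip("=")
    return slashify(b32)
```

For example, ``slashify("ABCDEFG")`` gives ``A/BC/DEFG``. Splitting this way keeps each path component short and spreads files over many directories.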
| 162 | + |
| 163 | +The block list is encrypted and adjusted: |
| 164 | + |
| 165 | +- The offset and size are adjusted to account for block overhead |
| 166 | + |
| 167 | +- The hash is encrypted using AES-SIV (with the file key) |
| 168 | + |
| 169 | +The resulting encrypted hash can't be used for data verification by the |
| 170 | +untrusted device, but it can be used as a form of "token" referring to a |
| 171 | +given data block for reuse purposes. |
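
The offset and size adjustment can be sketched as below. The per-block overhead of 40 bytes is taken from the 24 byte nonce and 16 byte tag shown in the data encryption diagram; the function name is hypothetical, and padding only blocks shorter than 1024 bytes is an assumption based on the note about the last data block.

```python
BLOCK_OVERHEAD = 24 + 16   # nonce + authentication tag, in bytes
MIN_PADDED_SIZE = 1024     # blocks shorter than this are padded (assumption)


def fake_blocks(blocks):
    """Map plaintext (offset, size) pairs to encrypted (offset, size) pairs.

    `blocks` must be the complete, ordered block list of one file.
    """
    out = []
    enc_offset = 0
    for _offset, size in blocks:
        enc_size = max(size, MIN_PADDED_SIZE) + BLOCK_OVERHEAD
        out.append((enc_offset, enc_size))
        # The next encrypted block starts right after this one.
        enc_offset += enc_size
    return out


# Two full 128 KiB blocks followed by a 100 byte tail.
plain = [(0, 131072), (131072, 131072), (262144, 100)]
print(fake_blocks(plain))
```

For full size blocks this reproduces "offset + n * overhead" and "size + overhead"; the short last block becomes 1024 + 40 bytes.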
| 172 | + |
| 173 | +Finally, the whole original FileInfo (in protobuf form) is encrypted using |
| 174 | +XChaCha20-Poly1305 with the file key and attached to the fake FileInfo. This |
| 175 | +is retained on the untrusted side and passed along to trusted devices, where |
| 176 | +it will be used in place of the fake FileInfo. |
| 177 | + |
| 178 | +.. graphviz:: |
| 179 | + |
| 180 | + digraph g { |
| 181 | + graph [ |
| 182 | + rankdir = "LR" |
| 183 | + ] |
| 184 | + "fileinfo" [ |
| 185 | + label = "encrypted name | ... | adjusted size | ... | <b> encrypted blocks | adjusted block size | encrypted metadata" |
| 186 | + shape = "record" |
| 187 | + ] |
| 188 | + "blocks" [ |
| 189 | + label = "{ <a> offset + n * overhead | size + overhead | encrypted hash } | { <a> offset + n * overhead | size + overhead | encrypted hash } | ..." |
| 190 | + shape = "record" |
| 191 | + ] |
| 192 | + fileinfo:b -> blocks:a |
| 193 | + } |
| 194 | + |
| 195 | +Incoming Metadata |
| 196 | +~~~~~~~~~~~~~~~~~ |
| 197 | + |
| 198 | +File metadata sent from the untrusted device is always decrypted. This means |
| 199 | +the outer (fake) FileInfo is discarded and the attached encrypted FileInfo is |
| 200 | +decrypted and used instead. If the FileInfo does not decrypt, it is considered |
| 201 | +a protocol error and the connection is dropped. This means only file |
| 202 | +metadata created by a trusted device is accepted. |
| 203 | + |
| 204 | +Data Encryption |
| 205 | +~~~~~~~~~~~~~~~ |
| 206 | + |
| 207 | +When an untrusted device makes a request for a data block, the trusted |
| 208 | +device: |
| 209 | + |
| 210 | +1. decrypts the given filename, |
| 211 | +2. reads the corresponding plaintext data block, |
| 212 | +3. pads the block with random data if the read returned less than 1024 bytes, |
| 213 | +4. encrypts it using the file encryption key and a random nonce, and |
| 214 | +5. responds with the result. |
| 215 | + |
| 216 | +.. graphviz:: |
| 217 | + |
| 218 | + digraph g { |
| 219 | + graph [ |
| 220 | + rankdir = "LR" |
| 221 | + ] |
| 222 | + "u" [ |
| 223 | + label = "<h> plaintext (variable)" |
| 224 | + shape = "record" |
| 225 | + ] |
| 226 | + "e" [ |
| 227 | + label = "nonce (24 B) | <h> ciphertext (variable) | tag (16 B)" |
| 228 | + shape = "record" |
| 229 | + ] |
| 230 | + u:h -> e:h [ label = "XChaCha20-Poly1305" ] |
| 231 | + } |
| 232 | + |
| 233 | +This is repeated for all required blocks. At the end, the untrusted device |
| 234 | +appends the fake FileInfo (which includes the original, encrypted, FileInfo) |
| 235 | +to the file. This serves no purpose during normal operations, but enables |
| 236 | +offline decryption of an encrypted folder without database access and, in |
| 237 | +principle, scanning an encrypted folder to populate the database should it |
| 238 | +be lost or corrupted. |
| 239 | + |
| 240 | +.. graphviz:: |
| 241 | + |
| 242 | + digraph g { |
| 243 | + graph [ |
| 244 | + rankdir = "LR" |
| 245 | + ] |
| 246 | + "u" [ |
| 247 | + label = "<b0> plaintext block | <b1> plaintext block | ..." |
| 248 | + shape = "record" |
| 249 | + ] |
| 250 | + "e" [ |
| 251 | + label = "<b0> encrypted block | <b1> encrypted block | ... | FileInfo (variable) | len(FileInfo) (uint32)" |
| 252 | + shape = "record" |
| 253 | + ] |
| 254 | + u:b0 -> e:b0 [ label = "encryption" ] |
| 255 | + u:b1 -> e:b1 |
| 256 | + } |
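
The on-disk layout in the diagram above (encrypted blocks, then the FileInfo, then its length as a ``uint32``) can be read back without a database, as in this small sketch. The function names are hypothetical and the big-endian byte order of the length field is an assumption; the text only states that it is a ``uint32``.

```python
import struct


def append_trailer(encrypted_blocks: bytes, fileinfo: bytes) -> bytes:
    """Append the serialized FileInfo and its length to the block data."""
    # ">I" = big-endian unsigned 32 bit integer (byte order is an assumption).
    return encrypted_blocks + fileinfo + struct.pack(">I", len(fileinfo))


def split_trailer(data: bytes):
    """Return (encrypted_blocks, fileinfo) from a file with a trailer."""
    if len(data) < 4:
        raise ValueError("file too short to contain a trailer")
    (length,) = struct.unpack(">I", data[-4:])
    if length > len(data) - 4:
        raise ValueError("trailer length exceeds file size")
    end = len(data) - 4
    return data[:end - length], data[end - length:end]
```

Storing the length last means a reader can find the FileInfo by looking only at the end of the file, whatever the number of blocks.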
| 257 | + |
| 258 | +Incoming Data |
| 259 | +~~~~~~~~~~~~~ |
| 260 | + |
| 261 | +Making a request to an untrusted device is mostly the reverse of the above. |
| 262 | +The file name is encrypted and the block offset and size adjusted. The |
| 263 | +resulting data is decrypted and thereby also authenticated, meaning it must |
| 264 | +have originated on a trusted device. Contrary to the usual case we cannot |
| 265 | +simply make arbitrary range requests -- only the precise blocks that were |
| 266 | +encrypted to begin with will decrypt properly. |
| 267 | + |
| 268 | +--- |
| 269 | + |
| 270 | +.. [#sizes] Although files grow slightly due to block |
| 271 | + overhead, and some files are padded up to an even kilobyte, file sizes |
| 272 | + can be determined at least to the closest kilobyte. |