11# Journalist API v2
22
3- This package (in ` securedrop/journalist_app/api2 ` ) implements and documents the synchronization strategy for the v2
4- Journalist API.
5-
6- | File | Contents |
7- | ------------------------------------- | ------------------------------------ |
8- | ` README.md ` | Specification |
9- | ` __init__.py ` | Server implementation |
10- | ` ../../tests/test_journalist_api2.py ` | Test suite for server implementation |
3+ The ` securedrop.journalist_app.api2 ` package implements the synchronization
4+ strategy for the v2 Journalist API.
5+
6+ | File/module | Contents |
7+ | --------------------------------------- | -------------------------------------------------------------------------------------------------------------- |
8+ | ` API2.md ` (you are here) | Specification |
9+ | ` securedrop.journalist_app.api2 ` | Flask blueprint for ` /api/v2/ ` |
10+ | ` securedrop.journalist_app.api2.events ` | Event-handling framework |
11+ | ` securedrop.journalist_app.api2.shared ` | Helper functions factored out of and still shared with the v1 Journalist API (` securedrop.journalist_app.api ` ) |
12+ | ` securedrop.journalist_app.api2.types ` | Types |
13+ | ` securedrop.tests.test_journalist_api2 ` | Test suite for server implementation |
1114
1215A client-side implementation should be able to interact with the endpoints
13- implemented in ` __init__.py ` according to this specification.
16+ implemented in ` securedrop.journalist_app.api2 ` according to this specification.
17+
18+ ## Audience
19+
20+ This API is intended for use by the [SecureDrop journalist app][app], and this
21+ documentation is intended to support its development. We make no guarantees
22+ about support, compatibility, or documentation for other purposes.
23+
24+ [app]: https://github.com/freedomofpress/securedrop-client/tree/main/app
25+
26+ ## Goals and properties
27+
28+ Although the SecureDrop Server remains the source of truth for its clients, the
29+ v2 Journalist API borrows ideas from distributed systems and content-addressable
30+ storage.
31+
32+ 1. Support the Journalist API's "occasionally connected" clients: actions should
33+ be possible while in offline mode, responsive even over flaky Tor connections,
34+ etc.
35+
36+ 2. Provide a single write-read loop in every synchronization round trip, at an
37+ interval of the client's choosing.
38+
39+ 3. Hash a canonical representation of each record (source, item, etc.) to
40+ version it deterministically.
41+
42+ 4. Hash a canonical representation of an endpoint's entire state (all sources,
43+ all items, etc.) to version it deterministically.
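
Goals 3 and 4 can be illustrated with a minimal sketch. It assumes that the canonical representation is JSON with sorted keys and compact separators, and that the hash is SHA-256; neither choice is stated in this section, and the function names (`canonical_json`, `record_version`, `index_version`) are hypothetical rather than taken from the server code.

```python
import hashlib
import json
from typing import Any, Dict


def canonical_json(value: Any) -> bytes:
    """Serialize deterministically: sorted keys, no insignificant whitespace.

    ASSUMPTION: the real canonical form may differ; this is one common choice.
    """
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    ).encode("utf-8")


def record_version(record: Dict[str, Any]) -> str:
    """Version a single record (source, item, etc.) by hashing its canonical form."""
    return hashlib.sha256(canonical_json(record)).hexdigest()


def index_version(records: Dict[str, Dict[str, Any]]) -> str:
    """Version an entire collection by hashing the map of record ID -> record version."""
    versions = {record_id: record_version(r) for record_id, r in records.items()}
    return hashlib.sha256(canonical_json(versions)).hexdigest()
```

Under these assumptions, two parties holding the same records compute the same versions regardless of key order, and any change to one record changes both its own version and the collection's version.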
1444
1545## Overview
1646
1747The request/response schemas referred to in these sequence diagrams are defined
18- as mypy types in ` __init__.py ` .
48+ as mypy types in ` securedrop.journalist_app.api2.types ` .
1949
2050### Initial synchronization
2151
52+ **Figure 1.**
53+
2254``` mermaid
2355sequenceDiagram
2456participant Client
@@ -38,6 +70,8 @@ Server ->> Client: MetadataResponse
3870
3971### Incremental synchronization
4072
73+ **Figure 2.**
74+
4175``` mermaid
4276sequenceDiagram
4377participant Client
6498
6599### Batched events from client
66100
101+ **Figure 3.**
102+
67103``` mermaid
68104sequenceDiagram
69105participant Client
@@ -87,10 +123,88 @@ Note over Client: Global version uvwxyz
87123end
88124```
89125
126+ #### State machine
127+
128+ Events in a given `BatchRequest` are handled in [snowflake-ID](#snowflake-ids)
129+ order. Each event is handled according to the following state machine:
130+
131+ **Figure 4.**
132+
133+ ``` mermaid
134+ stateDiagram-v2
135+ direction TB
136+
137+ [*] --> CacheLookup : process(event)
138+ CacheLookup: status = redis.get(event.id)
139+
140+ CacheLookup --> IdempotentBranch : status in {102 Processing, 200 OK}
141+ CacheLookup --> StartBranch : status == None
142+
143+ state "Enforce idempotency" as IdempotentBranch {
144+ AlreadyReported : 208 AlreadyReported
145+ AlreadyReported --> [*] : return AlreadyReported
146+ }
147+
148+ state "Start processing" as StartBranch {
149+ [*] --> Processing : redis.set(event.id, Processing, ttl)
150+ Processing : 102 Processing
151+ }
152+
153+ Processing --> Handler
154+ state "handle_<event.type>()" as Handler {
155+ [*] --> [*]
156+ }
157+
158+ Handler --> OK
159+ state "Cache and report success" as SuccessBranch {
160+ OK : 200 OK
161+ OK --> UpdateCache
162+
163+ UpdateCache : redis.set(event.id, OK, ttl)
164+ UpdateCache --> [*] : return (OK, delta)
165+ }
166+
167+ Handler --> BadRequest
168+ Handler --> NotFound
169+ Handler --> Conflict
170+ Handler --> Gone
171+ Handler --> NotImplemented
172+ state "Report error" as ErrorBranch {
173+ BadRequest : 400 BadRequest
174+ NotFound : 404 NotFound
175+ Conflict : 409 Conflict
176+ Gone : 410 Gone
177+ NotImplemented : 501 NotImplemented
178+
179+ BadRequest --> ClearCache
180+ NotFound --> ClearCache
181+ Conflict --> ClearCache
182+ Gone --> ClearCache
183+ NotImplemented --> ClearCache
184+
185+ ClearCache : redis.delete(event.id)
186+ ClearCache --> [*] : return error
187+ }
188+ ```
189+
190+ **Notes:**
191+
192+ 1. A client that submits a successful event $E$ will receive HTTP `200 OK` for
193+ $E$ and SHOULD apply the event locally as confirmed based on the returned data
194+ (`sources`, `items`, etc.).
195+
196+ 2. A client that subsequently resubmits $E$ will receive only a cached HTTP `208
197+ Already Reported` and SHOULD apply the event locally as confirmed. The server
198+ will not return data in this case, but the client SHOULD already know the
199+ results of the operation once confirmed.
200+
201+ 3. A client that submits a failed event $E'$ will receive an individual error
202+ code for $E'$. The client MAY resubmit $E'$ immediately, since idempotence is
203+ not enforced for error states.
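
The state machine in Figure 4 can be sketched roughly as follows. This is an illustration, not the server's implementation: a plain `dict` stands in for Redis (so TTLs are omitted), events are plain dictionaries with `id` and `type` keys, and the `Handler` signature and the mapping of unknown event types to `501` are assumptions made for the example.

```python
from enum import IntEnum
from typing import Callable, Dict, Optional, Tuple


class Status(IntEnum):
    PROCESSING = 102
    OK = 200
    ALREADY_REPORTED = 208
    BAD_REQUEST = 400
    NOT_FOUND = 404
    CONFLICT = 409
    GONE = 410
    NOT_IMPLEMENTED = 501


# Hypothetical handler signature: takes an event, returns (status, delta).
Handler = Callable[[dict], Tuple[Status, Optional[dict]]]
Result = Tuple[Status, Optional[dict]]


def process(
    event: dict, cache: Dict[str, Status], handlers: Dict[str, Handler]
) -> Result:
    # CacheLookup: an event that is in flight or already done is not re-run.
    if cache.get(event["id"]) in (Status.PROCESSING, Status.OK):
        return Status.ALREADY_REPORTED, None

    # Start processing: mark the event before invoking its handler.
    cache[event["id"]] = Status.PROCESSING

    handler = handlers.get(event["type"])
    if handler is None:
        # ASSUMPTION: unknown event types are reported as 501.
        status, delta = Status.NOT_IMPLEMENTED, None
    else:
        status, delta = handler(event)

    if status == Status.OK:
        # Cache and report success, so that a resubmission yields 208.
        cache[event["id"]] = Status.OK
        return status, delta

    # Report error: clear the cache entry so the client may resubmit.
    cache.pop(event["id"], None)
    return status, None
```

With this shape, submitting the same successful event twice yields `200` and then `208`, while a failed event leaves no cache entry behind, matching notes 1 through 3.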
204+
90205#### Consistency
91206
92- This diagram implies single-round-trip consistency. To make that expectation
93- explicit:
207+ Figure 3 above depicts single-round-trip consistency. That is:
94208
952091 . If the server $S$ currently has exactly one active client $C$; and
96210
@@ -101,6 +215,10 @@ E_n\}$; and
101215
1022164 . $C$'s index SHOULD match $S$'s index without a subsequent synchronization.
103217
218+ This property does not hold for resubmitted events (returning HTTP `208 Already
219+ Reported`). A subsequent synchronization MAY be necessary for the client to
220+ "catch up" to the effects of accepted events.
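
On the client side, the distinction between `200 OK` and `208 Already Reported` might be handled along these lines. The sketch assumes the client has already reduced the batch response to a mapping from event ID to HTTP status code; that shape, and the names `Reconciliation` and `reconcile`, are hypothetical and not part of the schemas defined for this API.

```python
from typing import Dict, List, NamedTuple


class Reconciliation(NamedTuple):
    confirmed: List[str]  # events to apply locally as confirmed
    retryable: List[str]  # failed events the client MAY resubmit
    needs_sync: bool      # True if a follow-up synchronization is advisable


def reconcile(statuses: Dict[str, int]) -> Reconciliation:
    confirmed: List[str] = []
    retryable: List[str] = []
    needs_sync = False
    for event_id, status in statuses.items():
        if status == 200:
            # Success with returned data: single-round-trip consistency applies.
            confirmed.append(event_id)
        elif status == 208:
            # Confirmed, but no data was returned; a later sync may be needed
            # to catch up on the event's effects.
            confirmed.append(event_id)
            needs_sync = True
        else:
            # Error states are not cached, so the event may be resubmitted.
            retryable.append(event_id)
    return Reconciliation(confirmed, retryable, needs_sync)
```

A client could call `reconcile` after each batch and schedule an extra synchronization only when `needs_sync` is true.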
221+
104222#### Snowflake IDs
105223
106224The ` Event.id ` field is a "snowflake ID", which a client can generate using a