# NATS Development and Integration Log

## [x] Goal: nats from nodejs

- start a nats server in cocalc-docker
- connect from nats cli outside docker
- connect to it from the nodejs client over a websocket

```sh
nats-server -p 5004

nats context save --select --server nats://localhost:5004 nats

nats sub '>'
```

Sending millions of messages a second works -- and you can run about five of these publishers at once without saturating nats-server.

```js
import { connect, StringCodec } from "nats";
const nc = await connect({ port: 5004 });
console.log(`connected to ${nc.getServer()}`);
const sc = StringCodec();

const t0 = Date.now();
for (let i = 0; i < 1000000; i++) {
  nc.publish("hello", sc.encode("world"));
}
await nc.drain();
console.log(Date.now() - t0);
```

That was connecting over TCP. Now can we connect via websocket?

## [x] Goal: Websocket from browser

First we need to start a nats **websocket** server on port 5004 instead:

[https://nats.io/blog/getting-started-nats-ws/](https://nats.io/blog/getting-started-nats-ws/)

```sh
nats context save --select --server ws://localhost:5004 ws
~/nats/nats.js/lib$ nats context select ws
NATS Configuration Context "ws"

  Server URLs: ws://localhost:5004
  Path: /projects/3fa218e5-7196-4020-8b30-e2127847cc4f/.config/nats/context/ws.json

~/nats/nats.js/lib$ nats pub foo bar
21:24:53 Published 3 bytes to "foo"
~/nats/nats.js/lib$
```

- their no-framework html example DOES work for me!
- [https://localhost:4043/projects/3fa218e5-7196-4020-8b30-e2127847cc4f/files/nats/nats.js/lib/ws.html](https://localhost:4043/projects/3fa218e5-7196-4020-8b30-e2127847cc4f/files/nats/nats.js/lib/ws.html)
- It takes about 1-2 seconds to send **one million messages** from the browser outside docker to the nats-server running inside it!
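
A minimal sketch of the same kind of browser-side test using the nats.ws client (not their ws.html; the subject and loop size are illustrative):

```js
// browser-side sketch: connect over the websocket listener and publish a burst
import { connect, StringCodec } from "nats.ws";

const nc = await connect({ servers: "ws://localhost:5004" });
const sc = StringCodec();

const t0 = Date.now();
for (let i = 0; i < 1000000; i++) {
  nc.publish("hello", sc.encode("world"));
}
await nc.flush(); // wait until the server has received everything
console.log(`published 1e6 messages in ${Date.now() - t0}ms`);
```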

## [x] Goal: actually do something useful

- nats server
- browser connects via websocket port 5004
- nodejs hub connects via tcp
- hub answers a ping or something else from the browser...

This worked perfectly with no difficulty. It's very fast and flexible and robust.
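
A sketch of the shape of that ping round trip; the subject name `hub.ping` and the TCP port 4222 are my assumptions here. Both halves are in one script for a local test; in the real setup the responder is the hub (plain TCP client) and the requester is the browser (nats.ws):

```js
import { connect, StringCodec } from "nats";
const sc = StringCodec();
const nc = await connect({ servers: "localhost:4222" });

// hub side: answer pings
const sub = nc.subscribe("hub.ping");
(async () => {
  for await (const msg of sub) {
    msg.respond(sc.encode("pong")); // reply goes straight back to the requester
  }
})();

// browser side: request/reply with a timeout
const resp = await nc.request("hub.ping", sc.encode("ping"), { timeout: 3000 });
console.log(sc.decode(resp.data)); // "pong"
await nc.drain();
```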

Reconnects work, etc.

## [x] Goal: proxying

- nats server with websocket listening on localhost:5004
- proxy it via node-proxy in the hub to localhost:4043/nats
- as above

This totally worked!
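
The wiring is roughly the following sketch with the http-proxy package; in the real hub the upgrade handler would be attached to the existing https server on 4043, the standalone plain-http server here is just for illustration:

```js
// proxy websocket upgrades on /nats through to the nats-server websocket listener
import http from "http";
import httpProxy from "http-proxy";

const proxy = httpProxy.createProxyServer({
  target: "ws://localhost:5004",
  ws: true,
});

const server = http.createServer();
server.on("upgrade", (req, socket, head) => {
  if (req.url && req.url.startsWith("/nats")) {
    proxy.ws(req, socket, head); // hand the upgraded connection to nats-server
  } else {
    socket.destroy();
  }
});
server.listen(4043);
```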

Everything I try is just working?!

Maybe NATS totally kicks ass.

## [x] Goal: do something actually useful.

- authentication: is there a way to tell who the user that made the websocket connection is?
  - worry about this **later** - obviously possible and not needed for a POC
- let's try to make `write_text_file_to_project` also be possible via nats.
- OK, made some of api/v2 usable. Obviously this is really a minimal POC.

## [x] GOAL: do something involving the project

The most interesting use case for nats/jetstream is timetravel collab editing, where this is all a VERY natural fit.

But for now, let's just do _something_ at all.

This worked - I did project exec with the subject `projects.{project_id}.api`.
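
A rough sketch of the project-side listener; the `cmd`/`stdout` message shape and the `COCALC_PROJECT_ID` env variable are my assumptions, not the actual api:

```js
// project side: answer exec requests sent to projects.{project_id}.api
import { connect, JSONCodec } from "nats";
import { exec } from "child_process";
import { promisify } from "util";

const jc = JSONCodec();
const run = promisify(exec);
const project_id = process.env.COCALC_PROJECT_ID; // assumed env variable name
const nc = await connect({ servers: "localhost:4222" });

const sub = nc.subscribe(`projects.${project_id}.api`);
for await (const msg of sub) {
  const { cmd } = jc.decode(msg.data);
  try {
    const { stdout, stderr } = await run(cmd);
    msg.respond(jc.encode({ stdout, stderr }));
  } catch (err) {
    msg.respond(jc.encode({ error: `${err}` }));
  }
}
```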

## [x] Goal: Queue group for hub api

- change this to be a queue group and test by starting a few servers at once
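
The change is basically just passing a queue name when subscribing: every hub subscribes with the same queue name and nats delivers each request to exactly one member. A sketch, with `handleApiCall` as a stand-in for the existing api dispatch:

```js
import { connect, JSONCodec } from "nats";

const jc = JSONCodec();
const nc = await connect({ servers: "localhost:4222" });

// stand-in for the existing api/v2 dispatch
const handleApiCall = async (request) => ({ status: "ok", ...request });

// all hubs use the same queue name, so each message goes to exactly one of them
const sub = nc.subscribe("hub.api.>", { queue: "hub-api" });
for await (const msg of sub) {
  msg.respond(jc.encode(await handleApiCall(jc.decode(msg.data))));
}
```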

## [x] Goal: Auth Strategy that is meaningful

Creating a creds file that encodes a JWT saying what you can publish and subscribe to, then authenticating with it, works.

- make it so a user with account_id can publish to `hub.api.{account_id}` (and only that); then we know the account_id automatically by virtue of the subject that was published to. This works.
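
A sketch of what this looks like on the hub side (the creds file path is illustrative): connect with the creds file, subscribe to `hub.api.*`, and read the account_id straight off the subject.

```js
import { readFileSync } from "fs";
import { connect, credsAuthenticator, JSONCodec } from "nats";

const jc = JSONCodec();
const nc = await connect({
  servers: "localhost:4222",
  authenticator: credsAuthenticator(readFileSync("/secrets/hub.creds")), // illustrative path
});

const sub = nc.subscribe("hub.api.*");
for await (const msg of sub) {
  // the user's JWT only lets them publish to hub.api.{their own account_id},
  // so the last token of the subject is the authenticated account_id
  const account_id = msg.subject.split(".")[2];
  msg.respond(jc.encode({ account_id }));
}
```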

## [x] Goal: Solve Critical Auth Problems

Now we need to solve two problems:

- [x] GOAL: set the creds for a browser client in a secure http cookie, so the browser can't directly access it

I finally figured this out after WASTING a lot of time with stupid AI misleading me and actively trying to get me to write very stupid insecure code as a lazy workaround. AI really is very, very dangerous... The trick was to read the docs repeatedly, increase logging a lot, and -- most importantly -- read the relevant Go source code of NATS itself. The answer is to modify the JWT so that it explicitly has bearer set: `nsc edit user wstein --bearer`

This makes it so the server doesn't check the signature of the JWT against the _user_. Putting exactly the JWT token string in the cookie then works, because "bearer" literally tells the backend server not to do the signature check. I think this is secure and the right approach, because the server still checks that the JWT is valid using the account and operator signatures.
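
On the hub side this is then just an ordinary httpOnly cookie set at sign-in, with the nats-server websocket listener configured to read the JWT from that cookie. A sketch assuming an express-style handler; the cookie name, route, and `getOrCreateNatsJwt` helper are hypothetical:

```js
import express from "express";

const app = express();
app.use(express.json());

app.post("/auth/sign-in", async (req, res) => {
  // ... normal sign-in checks happen first ...
  const jwt = await getOrCreateNatsJwt(req.body.account_id); // hypothetical helper
  res.cookie("nats_jwt", jwt, {
    httpOnly: true, // invisible to client-side javascript
    secure: true,
    sameSite: "strict",
  });
  res.json({ ok: true });
});

app.listen(4043);
```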

**WAIT!** Using signing keys [https://docs.nats.io/using-nats/nats-tools/nsc/signing_keys](https://docs.nats.io/using-nats/nats-tools/nsc/signing_keys) (and https://youtu.be/KmGtnFxHnVA?si=0uvLMBTJ5TUpem4O) is VASTLY superior. There's just one JWT issued to each user, and we make a server-side-only JWT for their account that has everything. The user never has to reconnect or change their JWT. We can adjust the subjects on the fly to account for running projects (or collaboration changes) at any time server side. Also the size limits go away, so we don't have to compress project_id's (probably).

## Goal: Implement Auth Solution for Browsers

- [x] automate creation of creds for browser clients, i.e., what we just did with the nsc tool manually

---

This is my top priority goal for NOW!

What's the plan?

We need to figure out how to do all the nsc stuff from javascript, storing the results in the database.

- Question: how do we manage creating signing keys and users from nodejs? Answer: it's clear from many sources that we must use the nsc CLI tool via subprocess calls (see the sketch below). Seems fine to me.
- [x] When a user signs in, we check for their JWT in the database. If it is there, set the cookie. If not, create the signing key and JWT for them, save them in the database, and set the cookie.
- [x] update nats-server resolver state after modifying the signing key's subjects configuration.

```sh
nsc edit operator --account-jwt-server-url nats://localhost:4222
```

Now I can do `nsc push` and it just works.
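
A sketch of driving nsc from nodejs via subprocess calls, as decided above; the account name, permission subjects, and exact flag combination are illustrative, and the real thing would also handle signing keys and errors:

```js
import { execFile } from "child_process";
import { promisify } from "util";

const run = promisify(execFile);

// create a bearer-token user for a browser client and return the raw JWT,
// which we then store in the database and put in the httpOnly cookie
export async function createBrowserUserJwt(account_id) {
  await run("nsc", ["add", "user", account_id, "--account", "cocalc"]);
  await run("nsc", [
    "edit", "user", account_id, "--account", "cocalc",
    "--bearer",
    "--allow-pub", `hub.api.${account_id}`,
    "--allow-sub", "_INBOX.>",
  ]);
  await run("nsc", ["push", "--account", "cocalc"]); // update the resolver
  const { stdout } = await run("nsc", [
    "describe", "user", account_id, "--account", "cocalc", "--raw",
  ]);
  return stdout.trim();
}
```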

[x] TODO: when signing out, we need to delete the jwt cookie, or dangerous private info leaks... and also new info doesn't get set properly.

- [x] similar creds for projects, i.e., access to a project means you can publish to `projects.{project_id}.>` Also, projects should have access to something under hub.

## [x] Goal: Auth for Projects

Using an env variable, I got a basic useful thing up and running.
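
Roughly what the project-side connection looks like, assuming the creds get handed to the project in an environment variable (the variable name here is my placeholder):

```js
import { connect, credsAuthenticator } from "nats";

// the project was started with its creds (JWT + nkey seed) in an env variable
const creds = process.env.NATS_PROJECT_CREDS ?? "";
const nc = await connect({
  servers: "localhost:4222",
  authenticator: credsAuthenticator(new TextEncoder().encode(creds)),
});
console.log(`project connected to ${nc.getServer()}`);
```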

---

Some thoughts about project auth security:

- [ ] when collaborators leave a project, maybe we change the JWT? Otherwise, in theory any user of a project can probably somehow get access to the project's JWT (it's in memory at least) and still act as the project. Changing the JWT requires a reconnect. This could be "for later", since even now we don't have this level of security!
- [ ] restarting the project could change the JWT. That's analogous to the current project's secret token being changed.

## [ ] Goal: nats-server automation of creation and configuration of system account, operator, etc.

- This looks helpful: https://www.synadia.com/newsletter/nats-weekly-27/
- NOT DONE YET

## [x] Goal: Terminal! Something complicated involving the project which is NOT just request/response

- Implementing terminals goes beyond request/response.
- It could also leverage jetstream if we want, for state (?).
- Multiple connected clients.

The project/compute server sends terminal output to

    project.{project_id}.terminal.{sha1(path)}

Anyone who can read the project gets to see this.

The browser sends terminal input to

    project.{project_id}.{group}.{account_id}.terminal.{sha1(path)}

API calls:

- to start the terminal
- to get history (move to jetstream?)

If I can get this to work, then collaborative editing and everything else is basically the same (just more details).
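
A sketch of that subject layout in code; the payloads, the `group` and `account_id` values, and the env variable are placeholders, and the real messages would carry structured data:

```js
import { connect, StringCodec } from "nats";
import { createHash } from "crypto";

const sc = StringCodec();
const sha1 = (s) => createHash("sha1").update(s).digest("hex");

const nc = await connect({ servers: "localhost:4222" });
const project_id = process.env.COCALC_PROJECT_ID; // assumed env variable name
const path = "a.term";

// project/compute server: broadcast pty output to everyone who can read the project
nc.publish(`project.${project_id}.terminal.${sha1(path)}`, sc.encode("...output chunk..."));

// browser: send keystrokes on a subject that also encodes group and account_id,
// so per-user permissions can be enforced by the subject alone
const group = "owner"; // placeholder
const account_id = "00000000-0000-0000-0000-000000000000"; // placeholder
nc.publish(
  `project.${project_id}.${group}.${account_id}.terminal.${sha1(path)}`,
  sc.encode("ls -l\n"),
);
await nc.flush();
```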

## [x] Goal: Terminal! #now

Make it so an actual terminal works, i.e., UI integration.

## [x] Goal: Terminal JetStream state

Use JetStream to store messages from the terminal, so a user can reconnect without loss!? This is very interesting...

First problem -- we used the system account SYS for all our users; however, SYS can't use jetstreams, as explained here: https://github.com/nats-io/nats-server/discussions/6033

Let's redo *everything* with a new account called "cocalc".

```sh
~/nats$ nsc create account --name=cocalc
[ OK ] generated and stored account key "AD4G6R62BDDQUSCJVLZNA7ES7R3A6DWXLYUWGZV74EJ2S6VBC7DQVM3I"
[ OK ] added account "cocalc"
~/nats$ nats context save admin --creds=/projects/3fa218e5-7196-4020-8b30-e2127847cc4f/.local/share/nats/nsc/keys/creds/MyOperator/cocalc/admin.creds
~/nats$ nsc edit account cocalc --js-enable 1
~/nats$ nsc push -a cocalc
```

```js
// making the stream for ALL terminal activity in the project
await jsm.streams.add({
  name: "project-81e0c408-ac65-4114-bad5-5f4b6539bd0e-terminal",
  subjects: ["project.81e0c408-ac65-4114-bad5-5f4b6539bd0e.terminal.>"],
});

// making a consumer for just one subject (e.g., one terminal frame)
z = await jsm.consumers.add("project-81e0c408-ac65-4114-bad5-5f4b6539bd0e-terminal", {
  name: "9149af7632942a94ea13877188153bd8bf2ace57",
  filter: ["project.81e0c408-ac65-4114-bad5-5f4b6539bd0e.terminal.9149af7632942a94ea13877188153bd8bf2ace57"],
});
c = await js.consumers.get(
  "project-81e0c408-ac65-4114-bad5-5f4b6539bd0e-terminal",
  "9149af7632942a94ea13877188153bd8bf2ace57",
);
for await (const m of await c.consume()) {
  console.log(cc.client.nats_client.jc.decode(m.data));
}
```

NOTE!!! The above consumer is ephemeral -- it disappears if we don't grab it via `c` within a few seconds!!!! https://docs.nats.io/using-nats/developer/develop_jetstream/consumers
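
One way around that (a sketch, not what the code above does) is to create the consumer as a durable by giving it `durable_name`, so it sticks around between connections:

```js
import { connect, AckPolicy, JSONCodec } from "nats";

const jc = JSONCodec();
const nc = await connect({ servers: "localhost:4222" });
const jsm = await nc.jetstreamManager();
const js = nc.jetstream();

const stream = "project-81e0c408-ac65-4114-bad5-5f4b6539bd0e-terminal";
const frame = "9149af7632942a94ea13877188153bd8bf2ace57";

await jsm.consumers.add(stream, {
  durable_name: frame, // durable: survives even when nobody is currently consuming
  ack_policy: AckPolicy.Explicit,
  filter_subject: `project.81e0c408-ac65-4114-bad5-5f4b6539bd0e.terminal.${frame}`,
});

const c = await js.consumers.get(stream, frame);
for await (const m of await c.consume()) {
  console.log(jc.decode(m.data));
  m.ack();
}
```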

## [ ] Goal: Jetstream permissions

- [x] project should set up the stream for capturing terminal outputs.
- [x] delete old messages with a given subject. `nats stream purge project-81e0c408-ac65-4114-bad5-5f4b6539bd0e-terminal --seq=7000`
  - there is a setting max_msgs_per_subject on a stream, so **we just set that and are done!** Geez. It is too easy. (See the sketch after this list.)
- [x] handle the other messages like resize
- [x] need to move those other messages to a different subject that isn't part of the stream!!
- [ ] permissions for jetstream usage and access
- [ ] use non-json for the data....
- [ ] refactor code so basic parameters (e.g., subject names, etc.) are defined in one place that can be imported in both the frontend and backend.
- [ ] font size keyboard shortcut
- [ ] need a better algorithm for sizing, since we don't know when a user disconnects!
  - when one user proposes a size, all other clients get asked their current size and only those that respond matter. How to do this?
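
Sketch of the max_msgs_per_subject idea from the list above, set when the project creates the stream (names and the limit of 1000 are placeholders):

```js
import { connect } from "nats";

const nc = await connect({ servers: "localhost:4222" });
const jsm = await nc.jetstreamManager();
const project_id = process.env.COCALC_PROJECT_ID; // assumed env variable name

await jsm.streams.add({
  name: `project-${project_id}-terminal`,
  subjects: [`project.${project_id}.terminal.>`],
  max_msgs_per_subject: 1000, // keep only the last 1000 messages per terminal frame
});
```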

## [ ] Goal: Basic Collab Document Editing

Plan:

- [x] Use a kv store hosted on nats to track syncstring objects as before. This means anybody can participate (browser, compute server, project) without any need to contact the database, hence eliminating all proxying! (A sketch follows.)
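
A sketch of the nats-hosted kv (bucket name, key, and value shape are placeholders), showing a put and the watch that lets every participant see updates:

```js
import { connect, JSONCodec } from "nats";
import { createHash } from "crypto";

const jc = JSONCodec();
const sha1 = (s) => createHash("sha1").update(s).digest("hex");

const nc = await connect({ servers: "localhost:4222" });
const js = nc.jetstream();
const project_id = process.env.COCALC_PROJECT_ID; // assumed env variable name

// a kv bucket (backed by a jetstream stream) holding syncstring metadata
const kv = await js.views.kv(`project-${project_id}-syncstrings`);
await kv.put(sha1("a.txt"), jc.encode({ path: "a.txt", last_active: Date.now() }));

// any participant (browser, project, compute server) can watch for changes
for await (const entry of await kv.watch()) {
  console.log(entry.key, entry.operation);
}
```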

[x] Next Goal - collaborative file editing -- some sort of "proof of concept"! This requires implementing the "ordered patches list", but on jetstream. Similar to the nats SyncTable I wrote yesterday, except it will use jetstream directly, since it is an event stream, after all.

- [x] synctable-stream: change to one big stream for the whole project, but **consume** a specific subject in that stream?

[ ] cursors - an ephemeral table

---

- [ ] Subject for a particular file: `project.${project_id}.patches.${sha1(path)}`
- [ ] Stream: records everything under `project.${project_id}.patches`
- [ ] It would be very nice if we could use the server-assigned timestamps... but probably not
  - [ ] For transitioning and de-archiving, there must be a way to do this, since they have a backup/restore process
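
A sketch of that stream layout; the stream name and payload shape are placeholders:

```js
import { connect, JSONCodec } from "nats";
import { createHash } from "crypto";

const jc = JSONCodec();
const sha1 = (s) => createHash("sha1").update(s).digest("hex");

const nc = await connect({ servers: "localhost:4222" });
const jsm = await nc.jetstreamManager();
const js = nc.jetstream();
const project_id = process.env.COCALC_PROJECT_ID; // assumed env variable name

// one stream per project records every patch subject under it
await jsm.streams.add({
  name: `project-${project_id}-patches`,
  subjects: [`project.${project_id}.patches.>`],
});

// appending a patch for one file; jetstream also records a server-assigned timestamp
await js.publish(
  `project.${project_id}.patches.${sha1("a.txt")}`,
  jc.encode({ time: Date.now(), patch: "..." }),
);
```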

## [ ] Goal: PostgreSQL Changefeed Synctable

This is critical to solve. This sucks now. This is key to eliminating "hub-websocket". This might be very easy. Here's the plan (a sketch follows the list):

- [x] make a request/response listener that listens on `hub.account.{account_id}` and `hub.db.project.{project_id}` for a db query.
- [x] if changes is false, just respond with the result of the query.
- [ ] if changes is true, get the kv store k named `account-{account_id}` or `project-{project_id}` (which can be used by the project or compute server).
  - let id be the sha1 hash of the query (and options)
  - if k.id.update is less than X seconds ago, do nothing... it's already being updated by another server.
  - do the query to the database (with changes true)
  - write the results into k under k.id.data.key = value.
  - keep watching for changes so long as k.id.interest is at most n*X seconds ago.
  - Also set k.id.update to now.
  - return id
- [ ] another message to `hub.db.{account_id}` which contains a list of id's.
  - When we get this one, update k.id.interest to now for each of the id's.
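
A very rough sketch of that listener following the plan above; `userQuery` and `startChangefeed` are hypothetical stand-ins for the existing database layer, and the subject/bucket names just mirror the list:

```js
import { connect, JSONCodec } from "nats";
import { createHash } from "crypto";

const jc = JSONCodec();
const nc = await connect({ servers: "localhost:4222" });
const js = nc.jetstream();

const sub = nc.subscribe("hub.account.*");
for await (const msg of sub) {
  const account_id = msg.subject.split(".")[2];
  const { query, options, changes } = jc.decode(msg.data);

  if (!changes) {
    // plain query: just respond with the result
    msg.respond(jc.encode(await userQuery({ account_id, query, options }))); // hypothetical
    continue;
  }

  // changefeed: identify the query by a hash and mirror its results into the kv
  const id = createHash("sha1").update(JSON.stringify({ query, options })).digest("hex");
  const kv = await js.views.kv(`account-${account_id}`);
  await kv.put(`${id}.update`, jc.encode(Date.now()));
  startChangefeed({ account_id, query, options }, async (key, value) => { // hypothetical
    await kv.put(`${id}.data.${key}`, jc.encode(value));
    await kv.put(`${id}.update`, jc.encode(Date.now()));
  });
  msg.respond(jc.encode({ id }));
}
```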

With the above algorithm, it should be very easy to reimplement the client side of SyncTable. Moreover, there are many advantages:

- For a fixed account_id or project_id, there's no extra work at all for 1 versus 100 of them. I.e., this is great for opening a bunch of distinct browser windows.
- If you refresh your browser, everything stays stable -- nothing changes at all and you instantly have your data. Same if the network drops and resumes.
- When implementing our new synctable, we can immediately start with the possibly stale data from the last time it was active, then update it to the correct data. Thus even if everything but NATS is down/unavailable, the experience would be much better. It's like "local first", but somehow "network mesh first". With a leaf node it would literally be local first.

---

This is working well!

TODO:

- [x] build a full proof of concept SyncTable on top of my current implementation of synctablekvatomic, to _make sure it is sufficient_
  - this worked and wasn't too difficult

THEN do the following to make it robust and scalable:

- [ ] store in nats which servers are actively managing which synctables
- [ ] store in nats the client interest data, instead of storing it in memory in a server? I.e., instead of a client making an api call, they could just update a kv and say "I am interested in this changefeed". This approach would make everything just keep working easily even as servers scale up/down/restart.

---

## [ ] Goal: Terminal and **compute server**

Another thing to do for compute servers:

- use jetstream and KV to agree on _who_ is running the terminal?
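
One possible shape for that agreement (a sketch; the bucket and field names are placeholders): `kv.create` fails if the key already exists, so exactly one party wins the claim.

```js
import { connect, JSONCodec } from "nats";
import { createHash } from "crypto";

const jc = JSONCodec();
const sha1 = (s) => createHash("sha1").update(s).digest("hex");

const nc = await connect({ servers: "localhost:4222" });
const js = nc.jetstream();
const project_id = process.env.COCALC_PROJECT_ID; // assumed env variable name
const compute_server_id = 1; // placeholder

const kv = await js.views.kv(`project-${project_id}-terminal-owners`);
const key = sha1("a.term");
try {
  // create() only succeeds if the key does not exist yet, so one claimant wins
  await kv.create(key, jc.encode({ compute_server_id, claimed: Date.now() }));
  // we won: start the terminal here
} catch {
  const current = await kv.get(key);
  console.log("already owned by", jc.decode(current.value));
}
```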

This is critical to see how easily we can support compute servers using nats + jetstream.