|
1 | | -# DEVGUIDE.md |
| 1 | +# DEVGUIDE.md (WIP) |
| 2 | + |
| 3 | +> **Note: this DEVGUIDE is under construction and is not complete yet; see `scripts/docs` for documentation on each of the available scripts.** |
2 | 4 |
|
3 | 5 | ## Contents |
4 | | -1. [Running the tests](#running-the-tests) |
5 | | - 1. [Prerequisites](#prerequisites) |
6 | | - 2. [I can't be bothered to read all of this](#i-cant-be-bothered-to-read-all-of-this) |
7 | | - 3. [The custom test script](#the-custom-test-script) |
8 | | - 4. [Test tags](#test-tags) |
9 | | - 5. [Running vectorize tests](#running-vectorize-tests) |
10 | | - 6. [Running the tests on local Stargate](#running-the-tests-on-local-stargate) |
11 | | - 7. [The custom Mocha wrapper](#the-custom-mocha-wrapper) |
12 | | -2. [Typechecking & Linting](#typechecking--linting) |
13 | | -3. [Building the library](#building-the-library) |
14 | | -4. [Publishing](#publishing) |
15 | | -5. [Miscellaneous](#miscellaneous) |
| 6 | +1. [I can't be bothered to read all of this](#i-cant-be-bothered-to-read-all-of-this) |
| 7 | +2. [Building the library](#building-the-library) |
| 8 | +3. [Publishing](#publishing) |
| 9 | +4. [Miscellaneous](#miscellaneous) |
16 | 10 | 1. [nix-shell + direnv support](#nix-shell--direnv-support) |
17 | 11 |
|
18 | | -## Running the tests |
19 | | - |
20 | | -### Prerequisites |
21 | | - |
22 | | -- `npm`/`npx` |
23 | | -- A running Data API instance |
24 | | -- A `.env` with the credentials filled out |
25 | | - |
26 | | -<sub>*DISCLAIMER: The test suite will create any necessary namespaces/collections, and any existing collections in |
27 | | -the database will be deleted.*</sub> |
28 | | - |
29 | | -<sub>*Also, if you for some reason already have an existing namespace called 'slania', it too will be deleted. Not |
30 | | -sure why you'd have a namespace named that, but if you do, you have a good taste in music.*</sub> |
31 | | - |
32 | | -### I can't be bothered to read all of this |
33 | | - |
34 | | -1. Just make sure `CLIENT_DB_URL` and `CLIENT_DB_TOKEN` are set in your `.env` file |
35 | | -2. If you're running the full test suite, copy `vectorize_test_spec.example.json`, fill out the providers you want |
36 | | - to test, and delete the rest |
37 | | -3. Run one of the following commands: |
38 | | - |
39 | | -```sh |
40 | | -# Add '-e dse' or '-e hcd' to the command if running on either of those |
41 | | - |
42 | | -# Runs the full test suite (~10m) |
43 | | -sh scripts/test.sh -all # -e dse|hcd |
44 | | - |
45 | | -# Runs a version of the test suite that omits all longer-running tests (~2m) |
46 | | -sh scripts/test.sh -light # -e dse|hcd |
47 | | -``` |
48 | | - |
49 | | -### The custom test script |
50 | | - |
51 | | -The `astra-db-ts` test suite uses a custom wrapper around [ts-mocha](https://www.npmjs.com/package/ts-mocha), including |
52 | | -its own custom test script. |
53 | | - |
54 | | -While this undeniably adds in extra complexity and getting-started overhead, you can read the complete rationale as to |
55 | | -why [here](https://github.com/datastax/astra-db-ts/pull/66#issue-2430902926), but TL;DR: |
56 | | -- We sped up the complete test suite by 500% |
57 | | -- We improved the test filtering capabilities |
58 | | -- We made it easier to write and work with `astra-db-ts`-esque tests |
59 | | - |
60 | | -The API for the test script is as follows: |
61 | | - |
62 | | -```sh |
63 | | -1. scripts/test.sh |
64 | | -2. [-all | -light | -coverage] |
65 | | -3. [-fand | -for] [-f/F <filter>]+ [-g/G <regex>]+ |
66 | | -4. [-w/W <vectorize_whitelist>] |
67 | | -5. [-b | -bail] |
68 | | -6. [-R | -no-report] |
69 | | -7. [-c <http_client>] |
70 | | -8. [-e <environment>] |
71 | | -``` |
72 | | - |
73 | | -#### 1. The test file (`scripts/test.sh`) |
74 | | - |
75 | | -While you can use `npm run test` or `bun run test` if you so desire, attempting to use the test script's flags with it |
76 | | -may be a bit iffy, as the inputs are first "de-quoted" (evaluated) when you use the shell command, but they're |
77 | | -"de-quoted" again when the package manager runs the actual shell command. |
78 | | - |
79 | | -Just use `scripts/test.sh` (or `sh scripts/test.sh`) directly if you're using command-line flags and want to |
80 | | -avoid a headache. |
81 | | - |
82 | | -#### 2. The test types (`[-all | -light | -coverage]`) |
83 | | - |
84 | | -There are three main test types: |
85 | | -- `-all`: This is a shorthand for enabling the `(LONG)`, `(ADMIN)`, and `(VECTORIZE)` tests (alongside all the normal tests that always run) |
86 | | -- `-light`: This is a shorthand for disabling the aforementioned tests. This runs only the normal tests, which are much quicker to run in comparison |
87 | | -- `-coverage`: This runs all tests, but uses `nyc` to test for coverage statistics. Enables the `-b` (bail) flag, as there is no point continuing if a test fails |
88 | | - |
89 | | -By default, just running `scripts/test.sh` will be like using `-light`, but you can set the default config for which tests |
90 | | -to enable in your `.env` file, through the `CLIENT_RUN_*_TESTS` env vars. |
91 | | - |
92 | | -#### 3. The test filters (`[-fand | -for] [-f/F <filter>]+ [-g/G <regex>]+`) |
93 | | - |
94 | | -The `astra-db-ts` test suite implements fully custom test filtering, inspired by Mocha's, but improved upon. |
95 | | - |
96 | | -You can add a basic filter using `-f <filter>` which acts like Mocha's own `-f` flag. Like Mocha, we also support `-g`, |
97 | | -which is like `-f`, but for regex. Each only needs to match a part of the test name (or its parent describes' names) to |
98 | | -succeed, so use `^$` as necessary. |
99 | | - |
100 | | -Unlike Mocha, there is no `-i` flag—instead, you can invert a filter by using `-F <filter>` or `-G <regex>`, so that the |
101 | | -test needs to NOT match that string/regex to run. |
102 | | - |
103 | | -You can also use multiple filters by simply using multiple of `-f`, `-g`, `-F`, and `-G` as you please. By default, |
104 | | -it'll only run a test if it satisfies all the filters (`-fand`), but you can use the `-for` flag to run a test if |
105 | | -it satisfies any one of the filters. |
106 | | - |
107 | | -In case filters overlap, an inverted filter always wins over a regular filter, and the conflicted test won't run. |
108 | | - |
109 | | -#### 4. The vectorize whitelist (`[-w/W <vectorize_whitelist>]`) |
110 | | - |
111 | | -There's a special filtering system just for vectorize tests, called the "vectorize whitelist", of which there are two |
112 | | -different types: either a piece of regex, or a special filter operator. |
113 | | - |
114 | | -##### Regex filtering |
115 | | - |
116 | | -Every vectorize test is given a test name representing every branch it took to become that specific test. It is |
117 | | -of the following format: |
118 | | - |
119 | | -```sh |
120 | | -# providerName@modelName@authType@dimension |
121 | | -# where dimension := 'specified' | 'default' | <some_number> |
122 | | -# where authType := 'header' | 'providerKey' | 'none' |
123 | | -``` |
124 | | - |
125 | | -Again, the regex only needs to match part of each test's name to succeed, so use `^$` as necessary. |
126 | | - |
127 | | -##### Filter operators |
128 | | - |
129 | | -The vectorize test suite also defines some custom "filter operators" to provide filtering that can't be done through |
130 | | -basic regex. They take the format `-w $<operator>:<colon_separated_args>` |
131 | | - |
132 | | -1. `$limit:<number>` - This is a limit over the total number of vectorize tests, only running up to the specified amount |
133 | | -2. `$provider-limit:<number>` - This limits the amount of vectorize tests that can be run per provider |
134 | | -3. `$model-limit:<number>` - Akin to the above, but limits per model. |
135 | | - |
136 | | -The default whitelist is `$limit-per-model:1`. |
137 | | - |
138 | | -#### 5. Bailing (`[-b | -bail]`) |
139 | | - |
140 | | -Simply sets the bail flag, as it does in Mocha. Forces the test script to exit after a single test failure. |
141 | | - |
142 | | -#### 6. Disabling error reporting (`[-R | -no-report]`) |
143 | | - |
144 | | -By default, the test suite logs the complete error objects of any that may've been thrown during your tests to the |
145 | | -`./etc/test-reports` directory for greatest debuggability. However, this can be disabled for a test run using the |
146 | | -`-R`/`-no-report` flag. |
147 | | - |
148 | | -#### 7. The HTTP client (`[-c <http_client>]`) |
149 | | - |
150 | | -By default, `astra-db-ts` will run its tests on `fetch-h2` using `HTTP/2`, but you can specify a specific client, which |
151 | | -is one of `default:http1`, `default:http2`, or `fetch`. |
152 | | - |
153 | | -#### 8. The Data API environment (`[-e <environment>]`) |
154 | | - |
155 | | -By default, `astra-db-ts` assumes you're running on Astra, but you can specify the Data API environment through this |
156 | | -flag. It should be one of `dse`, `hcd`, `cassandra`, or `other`. You can also provide `astra`, but it wouldn't really |
157 | | -do anything. But I'm not the boss of you; you can make your own big-boy/girl/other decisions. |
158 | | - |
159 | | -### Test tags |
160 | | - |
161 | | -The `astra-db-ts` test suite uses the concept of "test tags" to further advance test filtering. These are tags in |
162 | | -the names of test blocks, such as `(LONG) createCollection tests` or `(ADMIN) (ASTRA) AstraAdmin tests`. |
163 | | - |
164 | | -These tags are automatically parsed and filtered through the custom wrapper our test suite uses, though |
165 | | -you can still interact with them through test filters as well. For example, I commonly use `-f VECTORIZE` to |
166 | | -only run the vectorize tests. |
167 | | - |
168 | | -Current tags include: |
169 | | - - `VECTORIZE` - Enabled if `CLIENT_RUN_VECTORIZE_TESTS` is set (or `-all` is set) |
170 | | - - `LONG` - Enabled if `CLIENT_RUN_LONG_TESTS` is set (or `-all` is set) |
171 | | - - `ADMIN` - Enabled if `CLIENT_RUN_ADMIN_TESTS` is set (or `-all` is set) |
172 | | - - `DEV` - Automatically enabled if running on Astra-dev |
173 | | - - `NOT-DEV` - Automatically enabled if not running on Astra-dev |
174 | | - - `ASTRA` - Automatically enabled if running on Astra |
175 | | - |
176 | | -Attempting to set any other test tag will throw an error. (All test tags must contain only uppercase letters & |
177 | | -hyphens; any tag not matching `\([A-Z-]+?\)` will not be counted.) |
178 | | - |
179 | | -### Running vectorize tests |
180 | | - |
181 | | -To run vectorize tests, you need to have a vectorize-enabled kube running, with the correct tags enabled. |
182 | | - |
183 | | -Ensure `CLIENT_RUN_VECTORIZE_TESTS` and `CLIENT_RUN_LONG_TESTS` are enabled as well (or just pass the `-all` flag to |
184 | | -the test script). |
185 | | - |
186 | | -Lastly, you must create a file, `vectorize_tests.json`, in the root folder, with the following format: |
187 | | - |
188 | | -```ts |
189 | | -type VectorizeTestSpec = { |
190 | | - [providerName: string]: { |
191 | | - headers?: { |
192 | | - [header: `x-${string}`]: string, |
193 | | - }, |
194 | | - sharedSecret?: { |
195 | | - providerKey?: string, |
196 | | - }, |
197 | | - dimension?: { |
198 | | - [modelNameRegex: string]: number, |
199 | | - }, |
200 | | - parameters?: { |
201 | | - [modelNameRegex: string]: Record<string, string>, |
202 | | - }, |
203 | | - warmupErr?: string, |
204 | | - }, |
205 | | -} |
206 | | -``` |
207 | | - |
208 | | -where: |
209 | | -- `providerName` is the name of the provider (e.g. `nvidia`, `openai`, etc.) as found in `findEmbeddingProviders`. |
210 | | -- `headers` sets the embedding headers to be used for header auth. |
211 | | - - resolves to an `EmbeddingHeadersProvider` under the hood—throws error if no corresponding one found. |
212 | | - - optional if no header auth test wanted. |
213 | | -- `sharedSecret` is the block for KMS auth (isomorphic to `providerKey`, but it's an object for future-compatibility). |
214 | | - - `providerKey` is the provider key for the provider (which will be passed in @ collection creation). |
215 | | - - optional if no KMS auth test wanted. |
216 | | -- `parameters` is a mapping of model names to their corresponding parameters. The model name can be some regex that partially matches the full model name. |
217 | | - - `"text-embedding-3-small"`, `"3-small"`, and `".*"` will all match `"text-embedding-3-small"`. |
218 | | - - optional if not required. `azureOpenAI`, for example, will need this. |
219 | | -- `dimension` is also a mapping of model name regex to their corresponding dimensions, like the `parameters` field. |
220 | | - - optional if not required. `huggingfaceDedicated`, for example, will need this. |
221 | | -- `warmupErr` may be set if the provider errors on a cold start |
222 | | - - if set, the provider will be called in a `while (true)` loop until it stops throwing an error matching this message |
223 | | - |
224 | | -This file is .gitignore-d by default and will not be checked into VCS. |
225 | | - |
226 | | -See `vectorize_test_spec.example.json` for, guess what, an example. |
227 | | - |
228 | | -This spec is cross-referenced with `findEmbeddingProviders` to create a suite of tests branching off each possible |
229 | | -parameter, with test names of the format `providerName@modelName@authType@dimension`, where each section is another |
230 | | -potential branch. |
231 | | - |
232 | | -To run *only* the vectorize tests, a common pattern I use is `scripts/test.sh -all -f VECTORIZE [-w <vectorize_whitelist>]`. |
233 | | - |
234 | | -### Running the tests on local Stargate |
235 | | -In another terminal tab, you can do `sh scripts/start-stargate-4-tests.sh` to spin up an ephemeral Data API on DSE |
236 | | -instance which will destroy itself on script exit. The test suite will set up any keyspaces/collections as necessary. |
237 | | - |
238 | | -Then, be sure to set the following vars in `.env` exactly. |
239 | | -```dotenv |
240 | | -CLIENT_DB_URL=http://localhost:8181 |
241 | | -CLIENT_DB_TOKEN=Cassandra:Y2Fzc2FuZHJh:Y2Fzc2FuZHJh |
242 | | -CLIENT_DB_ENVIRONMENT=dse |
243 | | -``` |
244 | | - |
245 | | -Once the local Data API instance is fully started and ready for requests, you can run the tests. |
246 | | - |
247 | | -### The custom Mocha wrapper |
248 | | - |
249 | | -The `astra-db-ts` test suite is massively IO-bound, and desires a more advanced test filtering system than |
250 | | -Mocha provides by default. As such, we have written a (relatively) light custom wrapper around Mocha, extending |
251 | | -it to allow us to squeeze all possible performance out of our tests, and make it easier to write, scale, and work |
252 | | -with tests in both the present, and the future. |
253 | | - |
254 | | -#### The custom test functions |
255 | | - |
256 | | -The most prominent changes are the introduction of 5 new Mocha-API-esque functions (two of which are overhauls) |
257 | | -- [`describe`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/describe.ts) - An overhaul to the existing `describe` block |
258 | | - - Provides fresh instances of the "common fixtures" in its callback |
259 | | - - Performs "tag filtering" on the suite names |
260 | | - - Some suite options to reduce boilerplate |
261 | | - - `truncateColls: 'default'` - Does `deleteMany({})` on the default collection in the default namespace after each test case |
262 | | - - `truncateColls: 'both'` - Does `deleteMany({})` on the default collection in both test namespaces after each test case |
263 | | - - `dropEphemeral: 'after'` - Drops all non-default collections in both test namespaces after all the test cases in the suite |
264 | | - - `dropEphemeral: 'afterEach'` - Drops all non-default collections in both test namespaces after each test case |
265 | | -- [`it`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/it.ts) - An overhaul to the existing `it` block |
266 | | - - Performs "tag filtering" on the test names |
267 | | - - Provides unique string keys for every test case |
268 | | -- [`parallel`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/parallel.ts) - A wrapper around `describe` which runs all of its test cases in parallel |
269 | | - - Only allows `it`, `before`, `after`, and a single layer of `describe` functions |
270 | | - - Will run all tests simultaneously in a `before` hook, capture any exceptions, and rethrow them in reconstructed `it`/`describe` blocks for the most native-like behavior |
271 | | - - Performs tag and test filtering as normal |
272 | | - - Nearly all integration tests have been made parallel |
273 | | -- [`background`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/testlib/background.ts) - A version of `describe` which runs in the background while all the other test cases run |
274 | | - - Only allows `it` blocks |
275 | | - - Will run the test at the very start of the test script, capture any exceptions, and rethrow them in reconstructed `it`/`describe` blocks for the most native-like behavior at the end of the test script |
276 | | - - Performs tag and test filtering as normal |
277 | | - - Meant for independent tests that take a very long time to execute (such as the `integration.devops.db-admin` lifecycle test) |
278 | | - |
279 | | -These are not globals like Mocha's—rather, they are imported, like so: |
280 | | -```ts |
281 | | -import { background, describe, it, parallel } from '@/tests/testlib'; |
282 | | -``` |
283 | | - |
284 | | -#### Examples |
285 | | - |
286 | | -You can find examples of usages of each in most, if not all, test files, such as: |
287 | | -- [`/tests/integration/misc/timeouts.test.ts`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/integration/misc/timeouts.test.ts) (`describe`, `parallel`, `it`) |
288 | | -- [`/tests/integration/devops/lifecycle.test.ts`](https://github.com/datastax/astra-db-ts/blob/60fa445192b6a648b7a139a45986af8525a37ffb/tests/integration/devops/lifecycle.test.ts) (`background`) |
289 | | - |
290 | | -## Typechecking & Linting |
291 | | - |
292 | | -The test script also provides typechecking and linting through the following commands: |
293 | | - |
294 | | -```sh |
295 | | -# Full typechecking |
296 | | -scripts/test.sh -tc |
297 | | - |
298 | | -# Linting |
299 | | -scripts/test.sh -lint |
| 12 | +## I can't be bothered to read all of this |
300 | 13 |
|
301 | | -# Or even both |
302 | | -scripts/test.sh -lint -tc |
303 | | -``` |
| 14 | +yeah, fair enough. |
304 | 15 |
|
305 | 16 | ## Building the library |
306 | 17 |
|
|
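For anyone reviewing what the removed "test filters" section described, here is a minimal sketch of those rules as read from the deleted text: `-f`/`-g` filters match any part of a test name, `-F`/`-G` invert the match, `-fand` (the default) requires all filters to pass, `-for` requires any one, and an inverted filter always wins on conflict. The names `Filter`, `matches`, and `shouldRun` are hypothetical and are not taken from the actual test library; how `-for` treats a run with only inverted filters is also an assumption.

```typescript
// Hypothetical model of the filter rules described in the removed section.
// None of these names come from the real astra-db-ts test library.
type Filter = {
  pattern: string | RegExp; // string for -f/-F, regex for -g/-G (assumed non-global)
  inverted: boolean;        // true for -F/-G
};

// A filter only needs to match part of the (full) test name.
function matches(name: string, pattern: string | RegExp): boolean {
  return typeof pattern === 'string' ? name.includes(pattern) : pattern.test(name);
}

function shouldRun(name: string, filters: Filter[], mode: 'and' | 'or' = 'and'): boolean {
  const inverted = filters.filter((f) => f.inverted);
  const regular = filters.filter((f) => !f.inverted);

  // An inverted filter always wins: any hit excludes the test.
  if (inverted.some((f) => matches(name, f.pattern))) {
    return false;
  }

  // Assumption: with no regular filters, nothing else restricts the test.
  if (regular.length === 0) {
    return true;
  }

  return mode === 'and'
    ? regular.every((f) => matches(name, f.pattern))
    : regular.some((f) => matches(name, f.pattern));
}
```

Under this reading, `-f createCollection -F LONG` would skip a suite named `(LONG) createCollection tests`, since the inverted filter takes precedence over the regular one.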