Commit 0b60614

Add sentinel documentation (#534)
* Add sentinel documentation
  Just a basic docs page on setting up sentinels
* add arch diagram docs
1 parent bbccb4c commit 0b60614

8 files changed

+145
-4
lines changed

generated/routes.json

Lines changed: 12 additions & 4 deletions
@@ -13,11 +13,11 @@
1313
},
1414
"/overview/management-api-reference": {
1515
"relPath": "/overview/management-api-reference.md",
16-
"lastmod": "2025-10-15T00:53:52.000Z"
16+
"lastmod": "2025-11-03T09:07:35.000Z"
1717
},
1818
"/overview/agent-api-reference": {
1919
"relPath": "/overview/agent-api-reference.md",
20-
"lastmod": "2025-10-15T00:53:52.000Z"
20+
"lastmod": "2025-11-03T09:07:35.000Z"
2121
},
2222
"/getting-started": {
2323
"relPath": "/getting-started/index.md",
@@ -121,7 +121,7 @@
121121
},
122122
"/plural-features/continuous-deployment/resource-application-logic": {
123123
"relPath": "/plural-features/continuous-deployment/resource-application-logic.md",
124-
"lastmod": "2025-10-15T13:54:02.028Z"
124+
"lastmod": "2025-10-15T14:09:53.000Z"
125125
},
126126
"/plural-features/continuous-deployment/lua": {
127127
"relPath": "/plural-features/continuous-deployment/lua.md",
@@ -223,6 +223,14 @@
223223
"relPath": "/plural-features/plural-ai/architecture.md",
224224
"lastmod": "2025-03-12T14:59:41.000Z"
225225
},
226+
"/plural-features/plural-ai/sentinels": {
227+
"relPath": "/plural-features/plural-ai/sentinels.md",
228+
"lastmod": "2025-11-08T15:56:41.000Z"
229+
},
230+
"/plural-features/plural-ai/arch-diagram": {
231+
"relPath": "/plural-features/plural-ai/arch-diagram.md",
232+
"lastmod": "2025-11-08T16:28:24.261Z"
233+
},
226234
"/plural-features/plural-ai/cost": {
227235
"relPath": "/plural-features/plural-ai/cost.md",
228236
"lastmod": "2025-03-12T14:59:41.000Z"
@@ -389,7 +397,7 @@
389397
},
390398
"/getting-started/agent-api-reference": {
391399
"relPath": "/overview/agent-api-reference.md",
392-
"lastmod": "2025-10-15T00:53:52.000Z"
400+
"lastmod": "2025-11-03T09:07:35.000Z"
393401
},
394402
"/getting-started/readme": {
395403
"relPath": "/getting-started/index.md",
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
1+
---
2+
title: Deep Infrastructure Research
3+
description: Agentic Search of Your Infrastructure to Generate Arch Diagrams and More
4+
---
5+
6+
The best way to understand complex software infrastructure is to diagram it, turning the data at hand into a visual representation. Plural AI has the tools to automate this process in many cases, even across cloud and Kubernetes boundaries. In particular, Plural has a few key data sources that enable an agentic diagramming engine:
7+
8+
1. A constantly refreshing semantic index of your infrastructure generated by our GitOps engine
9+
2. The ability to reference source code and terraform state when necessary in an agentic process
10+
3. The ability to live query Kubernetes or your cloud when necessary to fill in needed gaps
11+
12+
{% callout severity="info" %}
13+
We are also actively working on eBPF network inspection, which will improve this source data even more.
14+
{% /callout %}
15+
16+
Together, these let our AI draft basic architecture diagrams from a simple prompt.
17+
18+
## Create a Research Session
19+
20+
Diagram creation is simple and UI-based. Navigate to `AI -> Infra Research`. A prompt button there starts a new research session, and you'll see a few threads spawn as the AI works in the background. The process takes about 1-2 minutes and is completely headless, so feel free to grab a coffee while it's churning. Once done, you'll have a full result that looks something like this (using the prompt `Show me the architecture of the grafana deployment`):
21+
22+
![](/assets/ai/research-diagram.png)
23+
24+
![](/assets/ai/research-analysis.png)
25+
26+
This will have:
27+
28+
1. A complete architecture diagram of the infrastructure tied to your prompt
29+
2. A text summary of the infrastructure and any other learnings found
30+
3. A list of notes on what the AI still doesn't understand after its investigation
31+
4. A list of associated Plural Services and Stacks used as source data for the investigation
32+
33+
The graph itself is generated in [Mermaid](https://mermaid.js.org/) format and can be quite complex, which can easily lead to hallucinations. To correct these, you have two tools:
34+
35+
1. AI fix - we provide a "fix with AI" button that takes any JavaScript errors from Mermaid parsing and attempts to correct them.
36+
2. Try it again - in other cases, it's oftentimes easier to just rerun the generation. You can use the `Try Again` button to do this.
37+
38+
## Publish Your Research
39+
40+
Once you feel the diagram and research are ready for a broader audience, you can choose to publish them. From there, anyone can view your research results, and we'll index them for use in future investigations.
Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,88 @@
1+
---
2+
title: At-Scale Infrastructure Testing with Sentinels
3+
description: Automate testing over any set of Kubernetes Clusters with the Sentinel Resource
4+
---
5+
6+
Validating an infrastructure change is a genuinely complex task, with no real equivalent of the local unit test that works so well at the application layer. Slight differences almost always require some degree of live integration testing. Multiply this by n Kubernetes clusters, for any large n, and you definitely need to automate.
7+
8+
Sentinels are meant to provide a flexible abstraction to solve for this. In particular, they allow you to bundle a sequence of checks that can:
9+
10+
* Run terratest-based integration tests across any subset of your clusters and aggregate the results
11+
* Tail logs across any set of clusters using search filters, and analyze them with AI and git-sourced rules files
12+
* Deep-query a kubernetes resource on a cluster and analyze its health with AI and git-sourced rules files
13+
14+
Once a sentinel is defined, it can be run on demand at any time via API. A run can be triggered:
15+
16+
* in our UI
17+
* in GitHub Actions or other CI systems
18+
* in Plural pipelines
19+
20+
Some common use cases they are particularly well suited for:
21+
22+
1. Validating that Kubernetes upgrades do not introduce regressions
23+
2. Cross-cutting Kubernetes operator changes (e.g. Istio upgrades)
24+
3. Validating that network reconfigurations are safe
25+
26+
But there are likely many more.
27+
28+
The motivation behind all of these, and the use of AI, is that confirming infra health often requires aggregating multiple textual data sources and interpreting them with a degree of discretion, which consumes meaningful engineering hours. That interpretation cannot be done deterministically, so a governed AI-based approach is needed. For deterministic correctness, a full terratest run can exercise common paths, such as validating that pods start, storage volumes can be mounted, and networking is enabled.
29+
30+
## Set Up Your First Sentinel
31+
32+
Defining a new sentinel is best done via CRD. If you set up Plural with `plural up`, you can define it in a file like `bootstrap/sentinels/example.yaml`:
33+
34+
```yaml
apiVersion: deployments.plural.sh/v1alpha1
kind: Sentinel
metadata:
  name: example
spec:
  description: Test baseline kubernetes health
  repositoryRef:
    name: infra
    namespace: infra
  git:
    ref: main
    folder: rules
  checks:
  - name: console-logs
    type: LOG
    ruleFile: logrule.md
    configuration:
      log:
        query: error
        duration: 5m
        namespaces:
        - cert-manager
        - external-dns
        - kube-system
  - name: integration-tests
    type: INTEGRATION_TEST
    configuration:
      integrationTest:
        format: JUNIT
        tags:
          tier: dev

        # notice no job image is specified, we ship with a working integration test out of the box that can be used
        # without upfront development.
        jobSpec:
          namespace: plrl-deploy-operator
          serviceAccount: deployment-operator
```
73+
74+
{% callout severity="info" %}
75+
To see the full API spec, go to our [Management API Docs](https://docs.plural.sh/overview/management-api-reference#sentinel).
76+
{% /callout %}
77+
78+
When run, this sentinel will do the following in parallel:
79+
80+
1. Query the logs of the configured namespaces (cert-manager, external-dns, and kube-system, some common low-level operator namespaces) for errors over a 5m duration, then analyze any results according to a rule file stored in git. You as the engineer can tune how the AI operates through that rule file.
81+
2. Launch our default terratest job across all `tier: dev` clusters, doing a basic sequence of health checks.
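
If you want to start smaller, the same manifest can likely be trimmed to a single log check. The following is a minimal sketch, not a reference configuration: it reuses only fields that appear in the example above, the nesting follows the usual Kubernetes manifest layout (an assumption), and the sentinel name, rule file name, and target namespace are hypothetical placeholders. Check the Management API Docs for which fields are actually required.

```yaml
apiVersion: deployments.plural.sh/v1alpha1
kind: Sentinel
metadata:
  name: log-only-example        # hypothetical name
spec:
  description: Scan one namespace for recent errors
  repositoryRef:                # same repository reference as the example above
    name: infra
    namespace: infra
  git:
    ref: main
    folder: rules               # folder expected to contain the rule file below
  checks:
  - name: namespace-logs        # hypothetical check name
    type: LOG
    ruleFile: my-logrule.md     # hypothetical rule file inside the rules folder
    configuration:
      log:
        query: error
        duration: 5m
        namespaces:
        - my-namespace          # hypothetical namespace, replace with your own
```

Starting with one check keeps the first run easy to interpret; further checks can then be appended to the `checks` list as in the fuller example.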
82+
83+
You can run a sentinel at any time in your Plural Console instance by navigating to `AI -> Sentinels -> {sentinel-name}`, and once run, you'll see an experience something like this:
84+
85+
![](/assets/ai/sentinel-landing.png)
86+
87+
![](/assets/ai/sentinel-open.png)
88+
Image file, 619 KB
Image file, 382 KB
Image file, 264 KB

public/assets/ai/sentinel-open.png

Image file, 425 KB

src/routing/docs-structure.ts

Lines changed: 5 additions & 0 deletions
@@ -143,6 +143,11 @@ export const docsStructure: DocSection[] = [
143143
sections: [
144144
{ path: 'setup', title: 'Setup Plural AI' },
145145
{ path: 'architecture', title: 'Plural AI Architecture' },
146+
{
147+
path: 'sentinels',
148+
title: 'At-Scale Infrastructure Testing with Sentinels',
149+
},
150+
{ path: 'arch-diagram', title: 'Infrastructure Deep Research' },
146151
{ path: 'cost', title: 'Plural AI cost analysis' },
147152
{
148153
path: 'multi-model-configuration',
