
Commit 96c0725

Web Documentation Cleanup (#65)
1 parent 3374728 commit 96c0725

3 files changed: +24 -24 lines changed


docs/configuration-guide.rst

Lines changed: 12 additions & 12 deletions
@@ -6,7 +6,7 @@ The full list of parameters can be found in the `Configuration Reference <config
 
 You can find example values files in the `SuperSONIC GitHub repository <https://github.com/fastmachinelearning/SuperSONIC/tree/main/values>`_.
 
-1. Select a Triton Inference Server version
+1. Select a Triton Inference Server Version
 =============================================
 
 - Official versions can be found at `NVIDIA NGC <https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver>`_.
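A minimal sketch of the ``triton.image`` setting described in this section, using the standard NGC image naming (the tag shown is illustrative, not a recommendation):

    triton:
      # Official Triton image from NVIDIA NGC; pick a tag compatible
      # with your CUDA driver stack. Tag below is illustrative only.
      image: nvcr.io/nvidia/tritonserver:24.08-py3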
@@ -90,7 +90,7 @@ Triton version must be specified in the ``triton.image`` parameter in the values
 <br><br>
 
 
-3. Select resources for Triton pods
+3. Select Resources for Triton Pods
 =============================================
 
 - You can configure CPU, memory, and GPU resources for Triton pods via the ``triton.resources`` parameter in the values file:
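The colon above introduces a resources block in the original document; a hedged sketch, assuming ``triton.resources`` follows the standard Kubernetes resources schema (all quantities illustrative):

    triton:
      resources:
        requests:
          cpu: "2"          # CPU cores requested per Triton pod
          memory: 4Gi
        limits:
          cpu: "4"
          memory: 8Gi
          nvidia.com/gpu: 1  # one GPU per pod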
@@ -125,7 +125,7 @@ Triton version must be specified in the ``triton.image`` parameter in the values
 - NVIDIA-L4
 
 
-4. Configure Envoy Proxy
+4. Configure Envoy Proxy
 ================================================
 
 By default, Envoy proxy is enabled and configured to provide per-request
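The sentence continues beyond this hunk. Since the text says the proxy is on by default, a toggle presumably exists; a hedged sketch (key names below ``envoy`` are assumptions, not confirmed by this diff):

    envoy:
      enabled: true   # assumed key; proxy is enabled by default per the text above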
@@ -164,7 +164,7 @@ There are two options:
 In this case, the client connections should be established to ``<load_balancer_url>:8001`` and NOT use SSL.
 
 
-5. (optional) Configure rate limiting in Envoy Proxy
+5. (Optional) Configure Rate Limiting in Envoy Proxy
 ======================================================
 
 There are two types of rate limiting available in Envoy Proxy: *listener-level*, and *prometheus-based*.
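A hedged sketch of how the two rate-limiter types might appear in a values file (every key name below is hypothetical; consult the Configuration Reference for the real schema):

    envoy:
      rate_limiter:            # hypothetical key names throughout
        listener_level:
          enabled: true
          max_requests: 100    # illustrative cap on concurrent requests
        prometheus_based:
          enabled: true        # reuses the autoscaler metric and threshold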
@@ -202,7 +202,7 @@ There are two types of rate limiting available in Envoy Proxy: *listener-level*,
 
 The metric and threshold for the Prometheus-based rate limiter are the same as those used for the autoscaler (see Prometheus Configuration).
 
-6. (optional) Configure authentication in Envoy Proxy
+6. (Optional) Configure Authentication in Envoy Proxy
 ======================================================
 
 At the moment, the only supported authentication method is JWT. Example configuration for IceCube:
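The IceCube example itself is truncated by the diff; only its ``port: 443`` line survives as context in the next hunk. A hedged sketch of the general shape such a JWT block might take (key names are hypothetical):

    envoy:
      auth:                    # hypothetical key names
        enabled: true
        issuer: <issuer_url>   # JWT issuer to validate against
        jwks_host: <jwks_host> # remote JWKS endpoint
        port: 443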
@@ -219,7 +219,7 @@ At the moment, the only supported authentication method is JWT. Example configur
 port: 443
 
 
-7. Deploy a Prometheus server or connect to an existing one
+7. Deploy a Prometheus Server or Connect to an Existing One
 ============================================================
 
 Prometheus is needed to scrape metrics for monitoring, as well as for the rate limiter and autoscaler.
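The next hunk's ``port: <prometheus_port>`` context line suggests a connection block along these lines when pointing at an existing server (nesting and key names partly assumed):

    prometheus:
      external:                  # assumed nesting
        url: <prometheus_url>    # existing Prometheus endpoint
        port: <prometheus_port>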
@@ -272,7 +272,7 @@ Prometheus is needed to scrape metrics for monitoring, as well as for the rate l
 port: <prometheus_port>
 
 
-8. (optional) Configure metrics for scaling and rate limiting
+8. (Optional) Configure Metrics for Scaling and Rate Limiting
 ===============================================================
 
 Both the rate limiter and the autoscaler are currently configured to use the same Prometheus metric and threshold.
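Because both consumers read one metric/threshold pair, the values file defines the pair only once; a hedged sketch (``serverLoadThreshold`` appears verbatim in the next hunk, while the metric key name here is an assumption):

    serverLoadMetric: <prometheus_query>  # assumed key name for the shared metric
    serverLoadThreshold: 100              # illustrative; used by both rate limiter and autoscaler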
@@ -290,7 +290,7 @@ The Prometheus query for the graph is automatically inferred from the value of `
 The graph also displays the threshold value defined in ``serverLoadThreshold`` parameter.
 
 
-9. (optional) Deploy Grafana dashboard
+9. (Optional) Deploy Grafana Dashboard
 ==========================================
 
 Grafana is used to visualize metrics collected by Prometheus.
@@ -318,9 +318,9 @@ Grafana Ingress for web access, and datasources to connect to Prometheus,
 .. figure:: img/grafana.png
    :align: center
    :height: 200
-   :alt: Supersonic Grafana dashboard
+   :alt: SuperSONIC Grafana Dashboard
 
-10. (optional) Enable KEDA autoscaler
+10. (Optional) Enable KEDA Autoscaler
 ==========================================
 
 Autoscaling is implemented via `KEDA (Kubernetes Event-Driven Autoscaler) <https://keda.sh/>`_ and
@@ -353,7 +353,7 @@ Additional optional parameters can control how quickly the autoscaler reacts to
 periodSeconds: 30
 stepsize: 1
 
-11. (optional) Configure Metrics Collector for running ``perf_analyzer``
+11. (Optional) Configure Metrics Collector for Running ``perf_analyzer``
 =========================================================================
 
 To collect Prometheus metrics when using ``perf_analyzer`` for testing,
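The ``periodSeconds``/``stepsize`` context lines above belong to the autoscaler section; a hedged sketch of how they might sit in the values file (nesting and the other keys are assumptions, with ``minReplicaCount``/``maxReplicaCount`` borrowed from standard KEDA terminology):

    autoscaler:                # assumed top-level key
      enabled: true
      minReplicaCount: 1       # standard KEDA fields; placement assumed
      maxReplicaCount: 4
      scaleUp:
        periodSeconds: 30      # from the diff context above
        stepsize: 1            # pods added per scaling step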
@@ -384,7 +384,7 @@ Running with ``perf_analyzer`` is then done with:
 If ingress is not desired, port-forward the metrics collector service and call
 ``--metrics-url localhost:8003/metrics`` to access the metrics.
 
-12. (optional) Configure advanced monitoring
+12. (Optional) Configure Advanced Monitoring
 =============================================
 
 Refer to the `advanced monitoring guide <advanced-monitoring>`_.

docs/getting-started.rst

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ Installation
 This value will be used as a prefix for all resources created by the chart,
 unless ``nameOverride`` is specified in the values file.
 
-1. Successfully executed ``helm install`` command will print a link to auto-generated Grafana dashboard
+Successfully executed ``helm install`` command will print a link to auto-generated Grafana dashboard
 and other useful information.
 
 .. figure:: img/grafana.png

docs/index.rst

Lines changed: 11 additions & 11 deletions
@@ -20,17 +20,17 @@ SuperSONIC GitHub repository: `fastmachinelearning/SuperSONIC <https://github.co
 
 -----
 
-Why "inference-as-a-service"?
+Why Inference-as-a-Service?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 .. container:: twocol
 
    .. container:: leftside
 
       The computing demands of modern scientific experiments are growing at a faster rate than the performance improvements
-      of traditional processors (CPUs). This trend is driven by increasing data collection rates, tightening latency requirements,
+      of traditional general-purpose processors (CPUs). This trend is driven by increasing data collection rates, tightening latency requirements,
       and rising complexity of algorithms, particularly those based on machine learning.
-      Such a computing landscape strongly motivates the adoption of specialized coprocessors, such as FPGAs, GPUs, and TPUs.
+      Such a computing landscape strongly motivates the adoption of specialized coprocessors, such as FPGAs, GPUs, and TPUs. However, this introduces new resource allocation and scaling issues.
 
    .. container:: rightside
 
@@ -41,8 +41,8 @@ Why "inference-as-a-service"?
 `Image source: A3D3 <https://a3d3.ai/about/>`_
 
 
-In "inference-as-a-service" model, the data processing workflows ("clients") off-load computationally intensive steps,
-such as neural network inference, to a remote "server" equipped with coprocessors. This design allows to optimize both
+In the inference-as-a-service model, the data processing workflows ("clients") off-load computationally intensive steps,
+such as neural network inference, to a remote "server" equipped with coprocessors. This design allows for optimization of both
 data processing throughput and coprocessor utilization by dynamically balancing the ratio of CPUs to coprocessors.
 Numerous R&D efforts implementing this paradigm in HEP and MMA experiments are grouped under the name
 **SONIC (Services for Optimized Network Inference on Coprocessors)**.
@@ -54,16 +54,16 @@ Numerous R&D efforts implementing this paradigm in HEP and MMA experiments are g
 
 -----
 
-SuperSONIC: a case for shared server infrastructure
+SuperSONIC: A Case for Shared Server Infrastructure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-A key feature of the SONIC approach is the decoupling of clients from servers and the standardization
+Two of the key features of the SONIC approach are the decoupling of clients from servers and the standardization
 of communication between them.
 While client-side implementations may vary across applications, the server-side infrastructure can remain
 largely the same, since the server functionality requirements (load balancing, autoscaling, etc.) are not
 experiment-specific.
 
-The purpose of SuperSONIC project is to develop server infrastructure that could be reused by scientific
+The purpose of the SuperSONIC project is to develop server infrastructure that could be reused by multiple scientific
 experiments with only small differences in configuration.
 
 -----
@@ -81,7 +81,7 @@ We are open for collaboration and encourage other experiments to try SuperSONIC
 <td style="width:65%; vertical-align: center; padding-right: 1em;">
 <p><a href="https://home.cern/science/experiments/cms">CMS Experiment</a> at the Large Hadron Collider (CERN).</p>
 <p>
-CMS is testing inference-as-a-service approach in Run 3 offline processing workflows, off-loading inferences to GPUs for
+CMS is testing the inference-as-a-service approach in Run 3 offline processing workflows, off-loading inferences to GPUs for
 machine learning models such as <strong>ParticleNet</strong>, <strong>DeepMET</strong>, <strong>DeepTau</strong>, <strong>ParT</strong>.
 In addition, non-ML tracking algorithms such as <strong>LST</strong> and <strong>Patatrack</strong> are being adapted for deployment
 as-a-service.
@@ -120,7 +120,7 @@ We are open for collaboration and encourage other experiments to try SuperSONIC
 <td style="width:65%; vertical-align: center; padding-right: 1em;">
 <p><a href="https://icecube.wisc.edu/">IceCube Neutrino Observatory</a> at the South Pole.</p>
 <p>
-IceCube uses SONIC approach to accelerate event classifier algorithms based on convolutional neural networks (CNNs).
+IceCube uses the SONIC approach to accelerate event classifier algorithms based on convolutional neural networks (CNNs).
 </p>
 </td>
 <td style="width:35%; vertical-align: center;">
@@ -129,7 +129,7 @@ We are open for collaboration and encourage other experiments to try SuperSONIC
 </tr>
 </table>
 
-Deployment sites
+Deployment Sites
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 SuperSONIC has been successfully tested at the computing clusters listed below.
