
Commit 549ba12

jcohen-hdb authored and gitbook-bot committed
GITBOOK-10: Spelling and grammar fixes
1 parent 54ccca8 commit 549ba12

35 files changed: +369 −458 lines changed

docs/administration/administration.md (+3 −3)

```diff
@@ -6,14 +6,14 @@ HarperDB is designed for minimal administrative effort, and with managed service
 
 As a distributed database, data protection and recovery can benefit from different data protection strategies than a traditional single-server database. But multiple aspects of data protection and recovery should be considered:
 
-* Availability: As a distributed database HarperDB is intrinsically built for high-availability and a cluster will continue to run even with complete server(s) failure. The is the first and primary defense for protecting against any downtime or data loss. HarperDB provides fast horizontal scaling functionality with node cloning, which facilitates ease of establishing high availability clusters.
+* Availability: As a distributed database HarperDB is intrinsically built for high-availability and a cluster will continue to run even with complete server(s) failure. This is the first and primary defense for protecting against any downtime or data loss. HarperDB provides fast horizontal scaling functionality with node cloning, which facilitates ease of establishing high availability clusters.
 * [Audit log](logging/audit-logging.md): HarperDB defaults to tracking data changes so malicious data changes can be found, attributed, and reverted. This provides security-level defense against data loss, allowing for fine-grained isolation and reversion of individual data without the large-scale reversion/loss of data associated with point-in-time recovery approaches.
-* Snapshots: When used as a source-of-truth database for crucial data, we recommend using snapshot tools to regularly snapshot databases as a final backup/defense against data loss (this should only be used as a last resort in recovery). HarperDB has a [`get_backup`](../developers/operations-api/databases-and-tables.md#get-backup) operation, which provides direct support for making and retrieving database snapshots. An HTTP request can be used to get a snapshot. Alternately, volume snapshot tools can be used to snapshot data at the OS/VM level. HarperDB can also provide scripts for replaying transaction logs from snapshots to facilitate point-in-time recovery when necessary (often customization may be preferred in certain recovery situations to minimize data loss).
+* Snapshots: When used as a source-of-truth database for crucial data, we recommend using snapshot tools to regularly snapshot databases as a final backup/defense against data loss (this should only be used as a last resort in recovery). HarperDB has a [`get_backup`](../developers/operations-api/databases-and-tables.md#get-backup) operation, which provides direct support for making and retrieving database snapshots. An HTTP request can be used to get a snapshot. Alternatively, volume snapshot tools can be used to snapshot data at the OS/VM level. HarperDB can also provide scripts for replaying transaction logs from snapshots to facilitate point-in-time recovery when necessary (often customization may be preferred in certain recovery situations to minimize data loss).
 
 ### Horizontal Scaling with Node Cloning
 
 HarperDB provides rapid horizontal scaling capabilities through [node cloning functionality described here](cloning.md).
 
 ### Replication Transaction Logging
 
-HarperDB utilizes NATS for replication, which maintains a transaction log. See the [transaction log documentation for information how to query this log](logging/transaction-logging.md).
+HarperDB utilizes NATS for replication, which maintains a transaction log. See the [transaction log documentation for information on how to query this log](logging/transaction-logging.md).
```
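The `get_backup` operation referenced in the snapshot bullet is invoked through the operations API like any other operation. A minimal request body might look like the following sketch; the exact parameter schema and the database name `dev` are assumptions here, so check the linked operations API reference before relying on it:

```json
{
  "operation": "get_backup",
  "database": "dev"
}
```

Posting this body to the operations API endpoint with admin credentials should return a snapshot of the database files.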

docs/administration/cloning.md (+7 −11)

````diff
@@ -1,6 +1,6 @@
-# Cloning
+# Clone Node
 
-Clone node is a configurable node script that can be pointed to another instance of HarperDB and create a full clone it.
+Clone node is a configurable node script that can be pointed to another instance of HarperDB and create a full clone.
 
 To start clone node run `harperdb` as you would normally but have the clone node environment variables set (see below).
 
@@ -11,7 +11,7 @@ To run clone node the following environment variables must be set:
 * `HDB_LEADER_USERNAME` - The leader node admin username.
 * `HDB_LEADER_PASSWORD` - The leader node admin password.
 
-Clone node can be configured through `clone-node-config.yaml`, which should to be located in the `ROOTPATH` directory of your clone. If no configuration is supplied, default values will be used.
+Clone node can be configured through `clone-node-config.yaml`, which should be located in the `ROOTPATH` directory of your clone. If no configuration is supplied, default values will be used.
 
 **Leader node** - the instance of HarperDB you are cloning.\
 **Clone node** - the new node which will be a clone of the leader node.
@@ -36,8 +36,7 @@ componentConfig:
     - name: my-cool-component
 ```
 
-`skipNodeModules` will not include the node\_modules directory when clone node is packaging components in `hdb/components`\
-
+`skipNodeModules` will not include the node\_modules directory when clone node is packaging components in `hdb/components`.
 
 `exclude` can be used to set any components that you do not want cloned.
 
@@ -53,9 +52,7 @@ clusteringConfig:
   httpsRejectUnauthorized: false
 ```
 
-Clone node makes http requests to the leader node, `httpsRejectUnauthorized` is used to set if https requests should be verified.\
-\
-
+Clone node makes http requests to the leader node, `httpsRejectUnauthorized` is used to set if https requests should be verified.
 
 Any HarperDB configuration can also be used in the `clone-node-config.yaml` file and will be applied to the cloned node, for example:
 
@@ -75,8 +72,7 @@ _Note: any required configuration needed to install/run HarperDB will be default
 
 ### Fully connected clone
 
-A fully connected topology is when all nodes are replicating (publish and subscribing) with all other nodes. A fully connected clone maintains this topology with addition of the new node. When a clone is created, replication is added between the leader and the clone and any nodes the leader is replicating with. For example, if the leader is replicating with node-a and node-b, the clone will replicate with the leader, node-a and node-b.\
-
+A fully connected topology is when all nodes are replicating (publish and subscribing) with all other nodes. A fully connected clone maintains this topology with addition of the new node. When a clone is created, replication is added between the leader and the clone and any nodes the leader is replicating with. For example, if the leader is replicating with node-a and node-b, the clone will replicate with the leader, node-a and node-b.
 
 To run clone node with the fully connected option simply pass the environment variable `HDB_FULLY_CONNECTED=true`
 
@@ -96,7 +92,7 @@ When run clone node will execute the following steps:
 
 ## Custom database and table pathing
 
-Currently clone node will not clone a table if it has custom pathing configured. In this situation the full database that the table is located in will not be cloned.
+Currently, clone node will not clone a table if it has custom pathing configured. In this situation the full database that the table is located in will not be cloned.
 
 If a database has custom pathing (no individual table pathing) it will be cloned, however if no custom pathing is provided in the clone config the database will be stored in the default database directory.
````
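The environment variables shown in the diff above can be combined into a launch sketch. Only `HDB_LEADER_USERNAME`, `HDB_LEADER_PASSWORD`, and `HDB_FULLY_CONNECTED` appear in this excerpt; any remaining required variables (such as the one identifying the leader node's address) are in the full docs and are not guessed here. All values below are placeholders:

```shell
# Sketch only: launch a fully connected clone (values are placeholders).
export HDB_LEADER_USERNAME=admin
export HDB_LEADER_PASSWORD='<leader-admin-password>'
export HDB_FULLY_CONNECTED=true

# Start HarperDB as normal; with the clone variables set it runs as clone node.
harperdb
```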

docs/administration/jobs.md (+8 −8)

````diff
@@ -1,12 +1,12 @@
-# Asynchronous Jobs
+# Jobs
 
 HarperDB Jobs are asynchronous tasks performed by the Operations API.
 
 ## Job Summary
 
 Jobs uses an asynchronous methodology to account for the potential of a long-running operation. For example, exporting millions of records to S3 could take some time, so that job is started and the id is provided to check on the status.
 
-The job status can be **COMPLETE** or **IN_PROGRESS**.
+The job status can be **COMPLETE** or **IN\_PROGRESS**.
 
 ## Example Job Operations
 
@@ -20,11 +20,11 @@ Example job operations include:
 
 [import from s3](https://api.harperdb.io/#820b3947-acbe-41f9-858b-2413cabc3a18)
 
-[delete_records_before](https://api.harperdb.io/#8de87e47-73a8-4298-b858-ca75dc5765c2)
+[delete\_records\_before](https://api.harperdb.io/#8de87e47-73a8-4298-b858-ca75dc5765c2)
 
-[export_local](https://api.harperdb.io/#49a02517-ada9-4198-b48d-8707db905be0)
+[export\_local](https://api.harperdb.io/#49a02517-ada9-4198-b48d-8707db905be0)
 
-[export_to_s3](https://api.harperdb.io/#f6393e9f-e272-4180-a42c-ff029d93ddd4)
+[export\_to\_s3](https://api.harperdb.io/#f6393e9f-e272-4180-a42c-ff029d93ddd4)
 
 Example Response from a Job Operation
 
@@ -34,11 +34,11 @@ Example Response from a Job Operation
 }
 ```
 
-Whenever one of these operations is initiated, an asynchronous job is created and the request contains the id of that job which can be used to check on its status.
+Whenever one of these operations is initiated, an asynchronous job is created and the request contains the ID of that job which can be used to check on its status.
 
 ## Managing Jobs
 
-To check on a job's status, use the [get_job](https://api.harperdb.io/#d501bef7-dbb7-4714-b535-e466f6583dce) operation.
+To check on a job's status, use the [get\_job](https://api.harperdb.io/#d501bef7-dbb7-4714-b535-e466f6583dce) operation.
 
 Get Job Request
 
@@ -73,7 +73,7 @@ Get Job Response
 
 ## Finding Jobs
 
-To find jobs (if the id is not know) use the [search_jobs_by_start_date](https://api.harperdb.io/#4474ca16-e4c2-4740-81b5-14ed98c5eeab) operation.
+To find jobs (if the ID is not known) use the [search\_jobs\_by\_start\_date](https://api.harperdb.io/#4474ca16-e4c2-4740-81b5-14ed98c5eeab) operation.
 
 Search Jobs Request
````
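The `get_job` operation referenced in the diff above takes the job ID returned when the job was created. A request body sketch (the placeholder ID is hypothetical; see the linked API reference for the exact schema):

```json
{
  "operation": "get_job",
  "id": "<job-id-returned-when-the-job-was-created>"
}
```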

docs/administration/logging/README.md (+2 −2)

```diff
@@ -1,7 +1,7 @@
 # Logging
 
-HarperDB provides many different logging options for various features and functionality. 
+HarperDB provides many different logging options for various features and functionality.
 
 * [Standard Logging](logging.md): HarperDB maintains a log of events that take place throughout operation.
 * [Audit Logging](audit-logging.md): HarperDB uses a standard HarperDB table to track transactions. For each table a user creates, a corresponding table will be created to track transactions against that table.
-* [Transaction Logging](transaction-logging.md): HarperDB stores a verbose history of all transactions logged for specified database table, including original data records.
+* [Transaction Logging](transaction-logging.md): HarperDB stores a verbose history of all transactions logged for specified database tables, including original data records.
```

docs/administration/logging/audit-logging.md (+1 −1)

````diff
@@ -72,7 +72,7 @@ The above example will return all records whose primary key (`hash_value`) is 31
 
 #### read\_audit\_log Response
 
-The example that follows provides records of operations performed on a table. One thing of note is that this the `read_audit_log` operation gives you the `original_records`.
+The example that follows provides records of operations performed on a table. One thing of note is that the `read_audit_log` operation gives you the `original_records`.
 
 ```json
 {
````

docs/administration/logging/logging.md (+2 −2)

```diff
@@ -20,7 +20,7 @@ The components of a log entry are:
 
 * timestamp - This is the date/time stamp when the event occurred
 * level - This is an associated log level that gives a rough guide to the importance and urgency of the message. The available log levels in order of least urgent (and more verbose) are: `trace`, `debug`, `info`, `warn`, `error`, `fatal`, and `notify`.
-* thread/id - This reports the name of the thread and the thread id, that the event was reported on. Note that NATS logs are recorded by their process name and there is no thread id for them since they are a separate process. Key threads are:
+* thread/ID - This reports the name of the thread and the thread ID that the event was reported on. Note that NATS logs are recorded by their process name and there is no thread id for them since they are a separate process. Key threads are:
   * main - This is the thread that is responsible for managing all other threads and routes incoming requests to the other threads
   * http - These are the worker threads that handle the primary workload of incoming HTTP requests to the operations API and custom functions.
   * Clustering\* - These are threads and processes that handle replication.
@@ -34,7 +34,7 @@ The log level can be changed by modifying `logging.level` in the config file `ha
 
 ## Clustering Logging
 
-HarperDB clustering utilizes two [Nats](https://nats.io/) servers, named Hub and Leaf. The Hub server is responsible for establishing the mesh network that connects instances of HarperDB and the Leaf server is responsible for managing the message stores (streams) that replicate and store messages between instances. Due to the verbosity of these servers there is a separate log level configuration for them. To adjust their log verbosity set `clustering.logLevel` in the config file `harperdb-config.yaml`. Valid log levels from least verbose are `error`, `warn`, `info`, `debug` and `trace`.
+HarperDB clustering utilizes two [Nats](https://nats.io/) servers, named Hub and Leaf. The Hub server is responsible for establishing the mesh network that connects instances of HarperDB and the Leaf server is responsible for managing the message stores (streams) that replicate and store messages between instances. Due to the verbosity of these servers there is a separate log level configuration for them. To adjust their log verbosity, set `clustering.logLevel` in the config file `harperdb-config.yaml`. Valid log levels from least verbose are `error`, `warn`, `info`, `debug` and `trace`.
 
 ## Log File vs Standard Streams
```
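The two log-level settings discussed in the diff above both live in `harperdb-config.yaml`. A fragment showing them together (the values chosen here are illustrative, not defaults):

```yaml
# harperdb-config.yaml (fragment; values are illustrative)
logging:
  level: warn      # standard log verbosity: trace|debug|info|warn|error|fatal|notify
clustering:
  logLevel: info   # NATS Hub/Leaf server verbosity: error|warn|info|debug|trace
```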

docs/deployments/harperdb-cli.md (+2 −2)

```diff
@@ -1,6 +1,6 @@
 # HarperDB CLI
 
-The HarperDB command line interface (CLI) is used to administer [self-installed HarperDB instances](./install-harperdb/).
+The HarperDB command line interface (CLI) is used to administer [self-installed HarperDB instances](install-harperdb/).
 
 ## Installing HarperDB
 
@@ -108,4 +108,4 @@ harperdb status
 
 ## Backups
 
-HarperDB uses a transactional commit process that ensures that data on disk is always transactionally consistent with storage. This means that HarperDB maintains safety of database integrity in the event of a crash. It also means that you can use any standard volume snapshot tool to make a backup of a HarperDB database. Database files are stored in the hdb/database directory. As long as the snapshot is an atomic snapshot of these database files, the data can be copied/movied back into the database directory to restore a previous backup (with HarperDB shut down) , and database integrity will be preserved. Note that simply copying an in-use database file (using `cp`, for example) is _not_ a snapshot, and this would progressively read data from the database at different points in time, which yields unreliable copy that likely will not be usable. Standard copying is only reliable for a database file that is not in use.
+HarperDB uses a transactional commit process that ensures that data on disk is always transactionally consistent with storage. This means that HarperDB maintains database integrity in the event of a crash. It also means that you can use any standard volume snapshot tool to make a backup of a HarperDB database. Database files are stored in the hdb/database directory. As long as the snapshot is an atomic snapshot of these database files, the data can be copied/moved back into the database directory to restore a previous backup (with HarperDB shut down) , and database integrity will be preserved. Note that simply copying an in-use database file (using `cp`, for example) is _not_ a snapshot, and this would progressively read data from the database at different points in time, which yields unreliable copy that likely will not be usable. Standard copying is only reliable for a database file that is not in use.
```
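The caveat about in-use files in the backups paragraph above suggests a simple offline copy procedure when no volume snapshot tool is available. This is a sketch under assumptions: a standard install with data under `hdb/database`, the CLI's `stop`/`start` commands, and a backup destination path that is purely illustrative:

```shell
# Offline backup sketch: safe only because HarperDB is stopped first,
# so the database files are not in use during the copy.
harperdb stop
cp -a ~/hdb/database /backups/hdb-database-$(date +%Y%m%d)
harperdb start
```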
```diff
@@ -1,22 +1,20 @@
-# HarperDB Cloud Instance Size Hardware Specs
+# Instance Size Hardware Specs
 
-While HarperDB Cloud bills by RAM, each instance has other specifications associated with the RAM selection. The following table describes each instance size in detail*.
+While HarperDB Cloud bills by RAM, each instance has other specifications associated with the RAM selection. The following table describes each instance size in detail\*.
 
-| AWS EC2 Instance Size | RAM (GiB) | # vCPUs | Network (Gbps) | Processor |
-|------------------------|------------|----------|-----------------|----------------------------------------|
-| t3.nano | 0.5 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
-| t3.micro | 1 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
-| t3.small | 2 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
-| t3.medium | 4 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
-| m5.large | 8 | 2 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.xlarge | 16 | 4 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.2xlarge | 32 | 8 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.4xlarge | 64 | 16 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.8xlarge | 128 | 32 | 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.12xlarge | 192 | 48 | 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.16xlarge | 256 | 64 | 20 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
-| m5.24xlarge | 384 | 96 | 25 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| AWS EC2 Instance Size | RAM (GiB) | # vCPUs | Network (Gbps) | Processor |
+| --------------------- | --------- | ------- | -------------- | -------------------------------------- |
+| t3.nano | 0.5 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
+| t3.micro | 1 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
+| t3.small | 2 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
+| t3.medium | 4 | 2 | Up to 5 | 2.5 GHz Intel Xeon Platinum 8000 |
+| m5.large | 8 | 2 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.xlarge | 16 | 4 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.2xlarge | 32 | 8 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.4xlarge | 64 | 16 | Up to 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.8xlarge | 128 | 32 | 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.12xlarge | 192 | 48 | 10 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.16xlarge | 256 | 64 | 20 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
+| m5.24xlarge | 384 | 96 | 25 | Up to 3.1 GHz Intel Xeon Platinum 8000 |
 
-
-
-*Specifications are subject to change. For the most up to date information, please refer to AWS documentation: https://aws.amazon.com/ec2/instance-types/.
+\*Specifications are subject to change. For the most up to date information, please refer to AWS documentation: [https://aws.amazon.com/ec2/instance-types/](https://aws.amazon.com/ec2/instance-types/).
```
