Commit bcb5667

Grammer and language (pytorch#1024)
* Edit grammer and count
* update runtime.txt
* updated the file
* fixed some typos, formatted document
1 parent 7ed7ac7 commit bcb5667

22 files changed, +123 -80 lines changed

CODE_OF_CONDUCT.md

+13 -13

@@ -14,22 +14,22 @@ appearance, race, religion, or sexual identity and orientation.
 Examples of behavior that contributes to creating a positive environment
 include:

-* Using welcoming and inclusive language
-* Being respectful of differing viewpoints and experiences
-* Gracefully accepting constructive criticism
-* Focusing on what is best for the community
-* Showing empathy towards other community members
+- Using welcoming and inclusive language
+- Being respectful of differing viewpoints and experiences
+- Gracefully accepting constructive criticism
+- Focusing on what is best for the community
+- Showing empathy towards other community members

 Examples of unacceptable behavior by participants include:

-* The use of sexualized language or imagery and unwelcome sexual attention or
-  advances
-* Trolling, insulting/derogatory comments, and personal or political attacks
-* Public or private harassment
-* Publishing others' private information, such as a physical or electronic
-  address, without explicit permission
-* Other conduct which could reasonably be considered inappropriate in a
-  professional setting
+- The use of sexualized language or imagery and unwelcome sexual attention or
+  advances
+- Trolling, insulting/derogatory comments, and personal or political attacks
+- Public or private harassment
+- Publishing other's private information, such as physical or electronic
+  address, without explicit permission
+- Other conduct which could reasonably be considered inappropriate in a
+  professional setting

 ## Our Responsibilities

CONTRIBUTING.md

+21 -13

@@ -1,19 +1,22 @@
 # Contributing to examples
+
 We want to make contributing to this project as easy and transparent as
 possible.

 ## Pull Requests
+
 We actively welcome your pull requests.

 If you're new we encourage you to take a look at issues tagged with [good first issue](https://github.com/pytorch/examples/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)

 ### For new examples
-0. Create a github issue proposing a new example and make sure it's substantially different from an existing one
+
+0. Create a GitHub issue proposing a new example and make sure it's substantially different from an existing one.
 1. Fork the repo and create your branch from `main`.
-2. If you've added code that should be tested, add tests to `run_python_examples.sh`
+2. If you've added code that should be tested, add tests to `run_python_examples.sh`.
 3. Create a `README.md`.
 4. Add a card with a brief description of your example and link to the repo to
-the `docs/source/index.rst` file and build the docs by running:
+   the `docs/source/index.rst` file and build the docs by running:

 ```
 cd docs
@@ -22,34 +25,39 @@ If you're new we encourage you to take a look at issues tagged with [good first
 pip install -r requirements.txt
 make html
 ```
+
 When done working with `virtualenv`, run `deactivate`.

-5. Verify that there are no issues in your doc build. You can check preview locally
+5. Verify that there are no issues in your doc build. You can check the preview locally
   by installing [sphinx-serve](https://pypi.org/project/sphinx-serve/) and
   then running `sphinx-serve -b build`.
-
-5. Ensure your test passes locally.
-6. If you haven't already, complete the Contributor License Agreement ("CLA").
-7. Address any feedback in code review promptly.
+6. Ensure your test passes locally.
+7. If you haven't already, complete the Contributor License Agreement ("CLA").
+8. Address any feedback in code review promptly.

 ## For bug fixes
+
 1. Fork the repo and create your branch from `main`.
-2. Make sure you have a GPU-enabled machine, either locally or in the cloud. `g4dn.4xlarge` is a good starting point on AWS.
-3. Make your code change.
+2. Make sure you have a GPU-enabled machine, either locally or in the cloud. `g4dn.4xlarge` is a good starting point on AWS.
+3. Make your code change.
 4. First, install all dependencies with `./run_python_examples.sh "install_deps"`.
-5. Then make sure that `./run_python_examples.sh` passes locally by running script end to end.
+5. Then make sure that `./run_python_examples.sh` passes locally by running the script end to end.
 6. If you haven't already, complete the Contributor License Agreement ("CLA").
 7. Address any feedback in code review promptly.

-
 ## Contributor License Agreement ("CLA")
-In order to accept your pull request, we need you to submit a CLA. You only need
+
+To accept your pull request, we need you to submit a CLA. You only need
 to do this once to work on any of Facebook's open source projects.

 Complete your CLA here: <https://code.facebook.com/cla>
+
 ## Issues
+
 We use GitHub issues to track public bugs. Please ensure your description is
 clear and has sufficient instructions to be able to reproduce the issue.
+
 ## License
+
 By contributing to examples, you agree that your contributions will be licensed
 under the LICENSE file in the root directory of this source tree.
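
For reference, the full build-and-preview flow described in the CONTRIBUTING.md hunks above looks roughly like the sketch below; the environment name `venv` and the `pip install sphinx-serve` step are assumptions on top of the commands quoted in the diff:

```sh
cd docs
virtualenv venv                  # any virtualenv name works; 'venv' is just a placeholder
source venv/bin/activate
pip install -r requirements.txt
make html                        # builds the docs into docs/build
pip install sphinx-serve         # assumed install step for the preview tool
sphinx-serve -b build            # preview the generated site locally
deactivate                       # leave the virtualenv when done
```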

README.md

+6 -6

@@ -4,13 +4,13 @@

 https://pytorch.org/examples/

-`pytorch/examples` is a repository showcasing examples of using [PyTorch](https://github.com/pytorch/pytorch). The goal is to have curated, short, few/no dependencies *high quality* examples that are substantially different from each other that can be emulated in your existing work.
+`pytorch/examples` is a repository showcasing examples of using [PyTorch](https://github.com/pytorch/pytorch). The goal is to have curated, short, few/no dependencies _high quality_ examples that are substantially different from each other that can be emulated in your existing work.

-* For tutorials: https://github.com/pytorch/tutorials
-* For changes to pytorch.org: https://github.com/pytorch/pytorch.github.io
-* For a general model hub: https://pytorch.org/hub/ or https://huggingface.co/models
-* For recipes on how to run PyTorch in production: https://github.com/facebookresearch/recipes
-* For general Q&A and support: https://discuss.pytorch.org/
+- For tutorials: https://github.com/pytorch/tutorials
+- For changes to pytorch.org: https://github.com/pytorch/pytorch.github.io
+- For a general model hub: https://pytorch.org/hub/ or https://huggingface.co/models
+- For recipes on how to run PyTorch in production: https://github.com/facebookresearch/recipes
+- For general Q&A and support: https://discuss.pytorch.org/

 ## Available models

cpp/autograd/README.md

+1 -1

@@ -12,7 +12,7 @@ $ cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
 $ make
 ```

-where `/path/to/libtorch` should be the path to the unzipped *LibTorch*
+where `/path/to/libtorch` should be the path to the unzipped _LibTorch_
 distribution, which you can get from the [PyTorch
 homepage](https://pytorch.org/get-started/locally/).
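
For orientation, the `cmake`/`make` lines quoted above belong to the usual out-of-source LibTorch build that all of the C++ example READMEs in this commit share; a minimal sketch of the whole sequence (the `mkdir build` step is assumed from the standard CMake workflow rather than quoted from the diff):

```sh
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..   # point CMake at the unzipped LibTorch distribution
make                                             # produces the example binary inside ./build
```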

cpp/custom-dataset/README.md

+4 -2

@@ -20,9 +20,10 @@ $ make

 where /path/to/libtorch should be the path to the unzipped LibTorch distribution, which you can get from the [PyTorch homepage](https://pytorch.org/get-started/locally/).

-if you see an error like ```undefined reference to cv::imread(std::string const&, int)``` when running the ```make``` command, you should build LibTorch from source using the instructions [here](https://github.com/pytorch/pytorch#from-source), and then set ```CMAKE_PREFIX_PATH``` to that PyTorch source directory.
+if you see an error like `undefined reference to cv::imread(std::string const&, int)` when running the `make` command, you should build LibTorch from source using the instructions [here](https://github.com/pytorch/pytorch#from-source), and then set `CMAKE_PREFIX_PATH` to that PyTorch source directory.

 The build directory should look like this:
+
 ```
 .
 ├── custom-dataset
@@ -38,9 +39,10 @@ The build directory should look like this:
 └── ...
 ```

-```info.txt``` file gets copied from source directory during build.
+`info.txt` file gets copied from source directory during build.

 Execute the compiled binary to train the model:
+
 ```shell
 ./custom-dataset
 Running on: CUDA

cpp/dcgan/README.md

+1 -1

@@ -15,7 +15,7 @@ $ cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
 $ make
 ```

-where `/path/to/libtorch` should be the path to the unzipped *LibTorch*
+where `/path/to/libtorch` should be the path to the unzipped _LibTorch_
 distribution, which you can get from the [PyTorch
 homepage](https://pytorch.org/get-started/locally/).

cpp/distributed/README.md

-1

@@ -23,4 +23,3 @@ To run the code,
 ```shell
 mpirun -np {NUM-PROCS} ./dist-mnist
 ```
-

cpp/mnist/README.md

+1 -1

@@ -15,7 +15,7 @@ $ cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
 $ make
 ```

-where `/path/to/libtorch` should be the path to the unzipped *LibTorch*
+where `/path/to/libtorch` should be the path to the unzipped _LibTorch_
 distribution, which you can get from the [PyTorch
 homepage](https://pytorch.org/get-started/locally/).

cpp/regression/README.md

+1 -1

@@ -12,7 +12,7 @@ $ cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
 $ make
 ```

-where `/path/to/libtorch` should be the path to the unzipped *LibTorch*
+where `/path/to/libtorch` should be the path to the unzipped _LibTorch_
 distribution, which you can get from the [PyTorch
 homepage](https://pytorch.org/get-started/locally/).

cpp/transfer-learning/README.md

+1 -1

@@ -17,4 +17,4 @@ For **prediction**:
 1. `cd build`
 2. `./classify <path_image> <path_to_resnet18_model_without_fc_layer> <model_linear_trained>` : `./classify <path_image> ../resnet18_without_last_layer.pt model_linear.pt`

-Detailed blog on applying Transfer Learning using Libtorch: https://krshrimali.github.io/Applying-Transfer-Learning-Dogs-Cats/.
+Detailed blog on applying Transfer Learning using Libtorch: https://krshrimali.github.io/Applying-Transfer-Learning-Dogs-Cats/.

dcgan/README.md

+3

@@ -10,12 +10,15 @@ with the samples from the generative model.
 After every epoch, models are saved to: `netG_epoch_%d.pth` and `netD_epoch_%d.pth`

 ## Downloading the dataset
+
 You can download the LSUN dataset by cloning [this repo](https://github.com/fyu/lsun) and running
+
 ```
 python download.py -c bedroom
 ```

 ## Usage
+
 ```
 usage: main.py [-h] --dataset DATASET --dataroot DATAROOT [--workers WORKERS]
                [--batchSize BATCHSIZE] [--imageSize IMAGESIZE] [--nz NZ]
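
As a usage illustration for the DCGAN README hunks above, a hypothetical end-to-end invocation; only `--dataset` and `--dataroot` are taken from the usage string shown, while the paths and any further flags are assumptions (check `python main.py -h` for the full option list):

```sh
git clone https://github.com/fyu/lsun && cd lsun
python download.py -c bedroom                      # fetch the LSUN bedroom images
cd ..
python main.py --dataset lsun --dataroot ./lsun    # train DCGAN on the downloaded data
```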

distributed/ddp/README.md

+22 -2

@@ -6,7 +6,8 @@ multiple nodes, each with multiple GPUs using PyTorch's distributed
 [launcher script](https://github.com/pytorch/pytorch/blob/master/torch/distributed/launch.py).

 # Prerequisites
-We assume you are familiar with [PyTorch](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html), the primitives it provides for [writing distributed applications](https://pytorch.org/tutorials/intermediate/dist_tuto.html) as well as training [distributed models](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
+
+We assume you are familiar with [PyTorch](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html), the primitives it provides for [writing distributed applications](https://pytorch.org/tutorials/intermediate/dist_tuto.html) as well as training [distributed models](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).

 The example program in this tutorial uses the
 [`torch.nn.parallel.DistributedDataParallel`](https://pytorch.org/docs/stable/nn.html#distributeddataparallel) class for training models
@@ -20,6 +21,7 @@ application but each one operates on different portions of the
 training dataset.

 # Application process topologies
+
 A Distributed Data Parallel (DDP) application can be executed on
 multiple nodes where each node can consist of multiple GPU
 devices. Each node in turn can run multiple copies of the DDP
@@ -49,6 +51,7 @@ computational costs. In the rest of this tutorial, we assume that the
 application follows this heuristic.

 # Preparing and launching a DDP application
+
 Independent of how a DDP application is launched, each process needs a
 mechanism to know its global and local ranks. Once this is known, all
 processes create a `ProcessGroup` that enables them to participate in
@@ -66,26 +69,32 @@ python -c "from os import path; import torch; print(path.join(path.dirname(torch
 ```

 This will print something like this:
+
 ```sh
 /home/username/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/distributed/launch.py
 ```

 When the DDP application is started via `launch.py`, it passes the world size, global rank, master address and master port via environment variables and the local rank as a command-line parameter to each instance.
 To use the launcher, an application needs to adhere to the following convention:
+
 1. It must provide an entry-point function for a _single worker_. For example, it should not launch subprocesses using `torch.multiprocessing.spawn`
 2. It must use environment variables for initializing the process group.

 For simplicity, the application can assume each process maps to a single GPU but in the next section we also show how a more general process-to-GPU mapping can be performed.

 # Sample application
+
 The sample DDP application in this repo is based on the "Hello, World" [DDP tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).

 ## Argument passing convention
+
 The DDP application takes two command-line arguments:
+
 1. `--local_rank`: This is passed in via `launch.py`
 2. `--local_world_size`: This is passed in explicitly and is typically either $1$ or the number of GPUs per node.

 The application parses these and calls the `spmd_main` entrypoint:
+
 ```py
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
@@ -94,7 +103,9 @@ if __name__ == "__main__":
     args = parser.parse_args()
     spmd_main(args.local_world_size, args.local_rank)
 ```
+
 In `spmd_main`, the process group is initialized with just the backend (NCCL or Gloo). The rest of the information needed for rendezvous comes from environment variables set by `launch.py`:
+
 ```py
 def spmd_main(local_world_size, local_rank):
     # These are the parameters used to initialize the process group
@@ -116,6 +127,7 @@ def spmd_main(local_world_size, local_rank):
 ```
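
To make the environment-variable rendezvous described above concrete, here is a small self-contained sketch of that initialization pattern; it is not the repository's `example.py`, and the helper name is illustrative:

```py
import os
import torch.distributed as dist

def init_from_env(backend="nccl"):
    # launch.py exports MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE for each
    # worker process; with the default init_method ("env://"),
    # init_process_group picks them up without any extra arguments.
    env = {k: os.environ[k] for k in ("MASTER_ADDR", "MASTER_PORT", "RANK", "WORLD_SIZE")}
    print(f"[{os.getpid()}] Initializing process group with: {env}")
    dist.init_process_group(backend=backend)
    print(f"[{os.getpid()}] rank = {dist.get_rank()}, world_size = {dist.get_world_size()}")
```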

 Given the local rank and world size, the training function, `demo_basic` initializes the `DistributedDataParallel` model across a set of GPUs local to the node via `device_ids`:
+
 ```py
 def demo_basic(local_world_size, local_rank):

@@ -144,10 +156,13 @@ def demo_basic(local_world_size, local_rank):
 ```

 The application can be launched via `launch.py` as follows on a 8 GPU node with one process per GPU:
+
 ```sh
 python /path/to/launch.py --nnode=1 --node_rank=0 --nproc_per_node=8 example.py --local_world_size=8
 ```
+
 and produces an output similar to the one shown below:
+
 ```sh
 *****************************************
 Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
@@ -177,16 +192,21 @@ Setting OMP_NUM_THREADS environment variable for each process to be 1 in default
 [238631] rank = 4, world_size = 8, n = 1, device_ids = [4]
 [238627] rank = 0, world_size = 8, n = 1, device_ids = [0]
 ```
+
 Similarly, it can be launched with a single process that spans all 8 GPUs using:
+
 ```sh
 python /path/to/launch.py --nnode=1 --node_rank=0 --nproc_per_node=1 example.py --local_world_size=1
 ```
+
 that in turn produces the following output
+
 ```sh
 [262816] Initializing process group with: {'MASTER_ADDR': '127.0.0.1', 'MASTER_PORT': '29500', 'RANK': '0', 'WORLD_SIZE': '1'}
 [262816]: world_size = 1, rank = 0, backend=nccl
 [262816] rank = 0, world_size = 1, n = 8, device_ids = [0, 1, 2, 3, 4, 5, 6, 7]
 ```

 # Conclusions
-As the author of a distributed data parallel application, your code needs to be aware of two types of resources: compute nodes and the GPUs within each node. The process of setting up bookkeeping to track how the set of GPUs is mapped to the processes of your application can be tedious and error-prone. We hope that by structuring your application as shown in this example and using the launcher, the mechanics of setting up distributed training can be significantly simplified.
+
+As the author of a distributed data parallel application, your code needs to be aware of two types of resources: compute nodes and the GPUs within each node. The process of setting up bookkeeping to track how the set of GPUs is mapped to the processes of your application can be tedious and error-prone. We hope that by structuring your application as shown in this example and using the launcher, the mechanics of setting up distributed training can be significantly simplified.
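
The `n = ...` and `device_ids = [...]` values in the log excerpts above come from giving each local process an equal slice of the node's GPUs. A rough sketch of that mapping and the DDP wrapping, using a toy model (an illustration, not the exact body of `demo_basic` in `example.py`):

```py
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def demo_basic_sketch(local_world_size, local_rank):
    # Split the node's GPUs evenly across the local processes: with 8 GPUs and
    # local_world_size=8 each rank owns one device; with local_world_size=1 a
    # single rank owns all eight, matching the outputs shown above.
    n = torch.cuda.device_count() // local_world_size
    device_ids = list(range(local_rank * n, (local_rank + 1) * n))

    model = nn.Linear(10, 10).cuda(device_ids[0])   # toy model on the first assigned GPU
    ddp_model = DDP(model, device_ids=device_ids)   # gradients sync across all processes

    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.001)

    outputs = ddp_model(torch.randn(20, 10).cuda(device_ids[0]))
    labels = torch.randn(20, 10).cuda(device_ids[0])
    loss_fn(outputs, labels).backward()
    optimizer.step()
```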

distributed/rpc/batch/README.md

+11 -11

@@ -3,15 +3,15 @@
 This folder contains two examples for [`@rpc.functions.async_execution`](https://pytorch.org/docs/master/rpc.html#torch.distributed.rpc.functions.async_execution):

 1. Synchronized Batch Update Parameter Server: uses `@rpc.functions.async_execution`
-  for parameter update and retrieving. This serves as a simple starter example
-  for batch RPC.
-  ```
-  pip install -r requirements.txt
-  python parameter_server.py
-  ```
+   for parameter update and retrieving. This serves as a simple starter example
+   for batch RPC.
+   ```
+   pip install -r requirements.txt
+   python parameter_server.py
+   ```
 2. Multi-Observer with Batch-Processing Agent: uses `@rpc.functions.async_execution`
-  to run multiple observed states through the policy to get actions.
-  ```
-  pip install -r requirements.txt
-  python reinforce.py
-  ```
+   to run multiple observed states through the policy to get actions.
+   ```
+   pip install -r requirements.txt
+   python reinforce.py
+   ```
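
For readers unfamiliar with the decorator both examples rely on, a hypothetical minimal sketch of the batching pattern it enables; the class and method names are illustrative, not taken from `parameter_server.py` or `reinforce.py`, and the usual `rpc.init_rpc`/`rpc.remote` wiring is omitted:

```py
import threading
import torch
import torch.distributed.rpc as rpc

class BatchServer:
    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.inputs = []
        self.lock = threading.Lock()
        self.pending = torch.futures.Future()

    @staticmethod
    @rpc.functions.async_execution
    def submit(server_rref, x):
        # The decorated RPC target returns a Future immediately; the RPC reply
        # is sent only when that Future completes, so the server can hold many
        # callers until a full batch has arrived and answer them all at once.
        self = server_rref.local_value()
        with self.lock:
            fut = self.pending
            self.inputs.append(x)
            if len(self.inputs) == self.batch_size:
                batch = torch.stack(self.inputs)
                self.inputs = []
                self.pending = torch.futures.Future()
                fut.set_result(batch.sum(0))   # every waiting caller receives the batched result
        return fut
```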

distributed/rpc/parameter_server/README.md

+1 -1

@@ -1,6 +1,6 @@
 ### RPC-based distributed training

-This is a basic example of RPC-based training that uses several trainers remotely train a model hosted on a server.
+This is a basic example of RPC-based training that uses several trainers remotely train a model hosted on a server.

 To run the example locally, run the following command worker for the server and each worker you wish to spawn, in separate terminal windows:
 `python rpc_parameter_server.py --world_size=WORLD_SIZE --rank=RANK`. For example, for a master node with world size of 2, the command would be `python rpc_parameter_server.py --world_size=2 --rank=0`. The trainer can then be launched with the command `python rpc_parameter_server.py --world_size=2 --rank=1` in a separate window, and this will begin training with one server and a single trainer.

distributed/rpc/rnn/README.md

+1 -1

@@ -1,6 +1,6 @@
 Distributed RNN Model Parallel Example

-This example shows how to build an RNN model using RPC where different
+This example shows how to build an RNN model using RPC where different
 components of the RNN model can be placed on different workers.

 ```

distributed/sharded_tensor/README.md

-3

@@ -8,13 +8,10 @@ PyTorch native sharding APIs, which include:
 3. A E2E demo of tensor parallel for a given toy model (Forward/backward + optimization).
 4. API to optimize parameters when they are `ShardedTensor`s.

-
 More details about the design can be found:
 https://github.com/pytorch/pytorch/issues/72138

-
 ```
 pip install -r requirements.txt
 python main.py
 ```
-
