
Benchmark FlexLLMGen

NOTE: This benchmark uses dummy weights by default for faster experiments. Randomly generated, garbled output is therefore expected, but the throughput and latency numbers should still be correct.

Mount SSD

The following commands use ~/flexllmgen_offload_dir as the offloading folder by default. For the best performance, it is recommended to mount this folder on a fast SSD. If you use AWS or GCP instances with local SSDs, you can use mount_nvme_aws.sh or mount_nvme_gcp.sh to mount the local SSDs.
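
Before running the benchmarks, it can help to confirm which filesystem actually backs the offloading folder. The following is a minimal sketch, assuming the default path mentioned above and a POSIX-style shell; the `OFFLOAD_DIR` variable is introduced here for illustration only and is not read by the benchmark scripts.

```shell
# OFFLOAD_DIR is a hypothetical helper variable for this check only;
# the benchmark scripts do not read it.
OFFLOAD_DIR="${OFFLOAD_DIR:-$HOME/flexllmgen_offload_dir}"

# Create the folder if it does not exist yet.
mkdir -p "$OFFLOAD_DIR"

# Show which filesystem backs the folder and how much space is free.
df -h "$OFFLOAD_DIR"
```

If the reported filesystem is the root disk rather than the local SSD, the folder is probably not mounted on the fast device yet.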

Single GPU

OPT-6.7B

# fp16
python3 bench_suite.py 6b7_1x1

# with int4 compression
python3 bench_suite.py 6b7_1x1_comp

OPT-30B

# fp16
python3 bench_suite.py 30b_1x1

# with int4 compression
python3 bench_suite.py 30b_1x1_comp

OPT-175B

# fp16
python3 bench_suite.py 175b_1x1

# with int4 compression
python3 bench_suite.py 175b_1x1_comp
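
To run all six single-GPU cases in one go, a small wrapper loop may be convenient. This is only a sketch, assuming bash or a POSIX sh: the case names are copied from the commands above, while `SINGLE_GPU_CASES`, `run_single_gpu_cases`, and the `RUN` switch are hypothetical names that are not part of bench_suite.py.

```shell
# Case names copied from the single-GPU commands above.
SINGLE_GPU_CASES="6b7_1x1 6b7_1x1_comp 30b_1x1 30b_1x1_comp 175b_1x1 175b_1x1_comp"

# Hypothetical helper: by default it only prints the commands (dry run).
# With RUN=1 it executes them in order and stops at the first failure.
run_single_gpu_cases() {
  for case_name in $SINGLE_GPU_CASES; do
    if [ "${RUN:-0}" = "1" ]; then
      python3 bench_suite.py "$case_name" || return 1
    else
      echo "python3 bench_suite.py $case_name"
    fi
  done
}

run_single_gpu_cases
```

The dry-run default makes it possible to review the list first; setting `RUN=1` in the environment before invoking the script runs the cases for real.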

Distributed GPUs

Requirements

sudo apt install openmpi-bin
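
After installing, a quick check that an MPI launcher is on the PATH can save a failed run later. The sketch below assumes the distributed scripts launch through `mpirun` (which the openmpi-bin package provides, though this README does not state how the scripts invoke it); `have_mpirun` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if an mpirun launcher is on the PATH.
have_mpirun() {
  command -v mpirun >/dev/null 2>&1
}

if have_mpirun; then
  # Print the resolved path of the launcher.
  command -v mpirun
else
  echo "mpirun not found; install openmpi-bin first" >&2
fi
```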

OPT-6.7B

# 1 node with 4 GPUs
bash bench_6.7b_1x4.sh

# 4 nodes and one GPU per node
bash bench_6.7b_4x1.sh

OPT-30B

# 1 node with 4 GPUs
bash bench_30b_1x4.sh

# 4 nodes and one GPU per node
bash bench_30b_4x1.sh

OPT-175B

# 1 node with 4 GPUs
bash bench_175b_1x4.sh

# 4 nodes and one GPU per node
bash bench_175b_4x1.sh
