Commands
This section describes in detail the subcommands provided by the torchlambda CLI tool.
Using them, users can deploy PyTorch models to AWS Lambda
with minimal dependencies, as all libraries (libtorch, AWS C++ SDK and AWS C++ Lambda Runtime) are statically linked into a single binary weighing around 30 megabytes in total.
Each command and subcommand has a built-in --help
flag; just issue it from your command line:
$ torchlambda --help
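The same flag works for every subcommand, for example:
$ torchlambda build --help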
Each subcommand has separate sections called Required arguments and Optional arguments where each argument and flag is described in detail.
An exact description of each command and of how all the parts work together is given below:
torchlambda settings
This command generates a yaml file containing settings for deployment, which can be seen below:
---
grad: true
validate_json: true
data: data
validate_data: true
model: /opt/model.ptc
inputs: [1, 3, width, height]
validate_inputs: true
cast: float
divide: 255
normalize:
  means: [0.485, 0.456, 0.406]
  stddevs: [0.229, 0.224, 0.225]
return:
  output:
    type: double
    name: output
    item: false
  result:
    operation: argmax
    arguments: 1
    type: long
    name: labels
    item: true
An extensive description of each field and option is located in the YAML settings reference file. By default the file will be named torchlambda.yaml
and created in your current working directory. It can be passed to torchlambda scheme --yaml /path/to/settings.yaml
to create the C++ code automatically.
⚠️ A lot of fields above already have sane default values, so there is no need to specify everything; see the reference!
Required arguments: None
Optional arguments:
- --destination - path specifying the location of the generated file, e.g. torchlambda settings --destination /path/to/file/to/generate/settings.yaml. By default it will be placed in your current working directory and named torchlambda.yaml.
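For example, assuming the default workflow described in this and the following section, one could generate the settings under a custom name and later feed them to C++ code generation (the file name my_settings.yaml is only an illustrative choice):
$ torchlambda settings --destination my_settings.yaml
$ torchlambda template --yaml my_settings.yaml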
torchlambda template
Creates a C++ code scheme for the user to modify, or creates it based on the specified yaml settings (see the previous step).
In the first case, human-friendly main.cpp C++ example code is generated, by default in a torchlambda folder in your current working directory.
Click here to check generated code
#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>
#include <aws/lambda-runtime/runtime.h>

#include <torch/script.h>
#include <torch/torch.h>

/*!
 *
 * HANDLE REQUEST
 *
 */

static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
        const Aws::Utils::Base64::Base64 &transformer,
        const aws::lambda_runtime::invocation_request &request) {

  const Aws::String data_field{"data"};

  /*!
   *
   * PARSE AND VALIDATE REQUEST
   *
   */

  const auto json = Aws::Utils::Json::JsonValue{request.payload};
  if (!json.WasParseSuccessful())
    return aws::lambda_runtime::invocation_response::failure(
        "Failed to parse input JSON file.", "InvalidJSON");

  const auto json_view = json.View();
  if (!json_view.KeyExists(data_field))
    return aws::lambda_runtime::invocation_response::failure(
        "Required data was not provided.", "InvalidJSON");

  /*!
   *
   * LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
   *
   */

  const auto base64_data = json_view.GetString(data_field);
  Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);
  torch::Tensor tensor =
      torch::from_blob(decoded.GetUnderlyingData(),
                       {
                           static_cast<long>(decoded.GetLength()),
                       },
                       torch::kUInt8)
          .reshape({1, 3, 64, 64})
          .toType(torch::kFloat32) /
      255.0;

  torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
      {0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);

  /*!
   *
   * MAKE INFERENCE
   *
   */

  auto output = module.forward({normalized_tensor}).toTensor();
  const int label = torch::argmax(output).item<int>();

  /*!
   *
   * RETURN JSON
   *
   */

  return aws::lambda_runtime::invocation_response::success(
      Aws::Utils::Json::JsonValue{}
          .WithInteger("label", label)
          .View()
          .WriteCompact(),
      "application/json");
}

int main() {
  /*!
   *
   * LOAD MODEL ON CPU
   * & SET IT TO EVALUATION MODE
   *
   */

  /* Turn off gradient */
  torch::NoGradGuard no_grad_guard{};
  /* No optimization during first pass as it might slow down inference by 30s */
  torch::jit::setGraphExecutorOptimize(false);

  constexpr auto model_path = "/opt/model.ptc";

  torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);
  module.eval();

  /*!
   *
   * INITIALIZE AWS SDK
   * & REGISTER REQUEST HANDLER
   *
   */

  Aws::SDKOptions options;
  Aws::InitAPI(options);
  {
    const Aws::Utils::Base64::Base64 transformer{};
    const auto handler_fn =
        [&module,
         &transformer](const aws::lambda_runtime::invocation_request &request) {
          return handler(module, transformer, request);
        };
    aws::lambda_runtime::run_handler(handler_fn);
  }
  Aws::ShutdownAPI(options);
  return 0;
}
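The handler above expects a JSON payload whose data field holds a base64-encoded flat uint8 buffer that is reshaped to [1, 3, 64, 64]. Below is a minimal Python sketch of building and sending such a request; the boto3 client and the function name my-torchlambda-function are illustrative assumptions on the caller's side, not part of torchlambda:

import base64
import json

import boto3  # assumed client-side dependency, not part of torchlambda
import numpy as np

# Random uint8 image matching the shape the handler reshapes to: [1, 3, 64, 64]
image = np.random.randint(0, 256, size=(1, 3, 64, 64), dtype=np.uint8)

# The C++ handler base64-decodes the "data" field and passes it to torch::from_blob
payload = {"data": base64.b64encode(image.tobytes()).decode("utf-8")}

# Hypothetical function name; replace with your deployed AWS Lambda function
response = boto3.client("lambda").invoke(
    FunctionName="my-torchlambda-function",
    Payload=json.dumps(payload),
)
print(json.loads(response["Payload"].read()))  # e.g. {"label": 3}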
If --yaml is specified, main.cpp will be created in the same default location as well, but this time it will be much less human-friendly and filled in based on the settings. It is not advised to base your custom code on this variant as it uses C++ macros heavily and may lack appropriate formatting.
Required arguments: None
Optional arguments:
- --destination - path specifying the folder where the generated C++ file named main.cpp will be located, e.g. torchlambda template --destination /path/to/folder/where/main/will/be/located. By default a new folder torchlambda will be created in your current working directory (if it doesn't exist) and the files will be written to it (possibly overwriting existing ones).
- --yaml - path specifying the file containing settings from the previous step, e.g. torchlambda template --yaml /path/to/settings.yaml
torchlambda build
This command allows users to change the most in their deployment (besides the created/generated source code).
It builds a .zip package deployable directly to AWS Lambda using one of the specified images.
For the easiest build just run:
$ torchlambda build /path/to/folder/with/source
(please note you should provide a path to the folder, not to your source file! This way all .cpp, .c, .h and .hpp files will be compiled and packed automatically)
torchlambda provides pre-built, relatively small (~600MB) Docker images which will be run to produce the deployment package (~20MB after packing).
By default the following images, based on the PyTorch version, are provided on DockerHub:
- szymonmaszke/torchlambda:latest (head of current PyTorch master branch)
- szymonmaszke/torchlambda:1.5.0
- szymonmaszke/torchlambda:1.4.0 (discouraged due to unstable PyTorch C++ frontend)
The above images are built daily so the AWS C++ SDK and AWS C++ Lambda Runtime are always up to date (unless a major breaking change occurs). Their versions are not currently specifiable via the script due to frequent releases (if this becomes a frequent request, it will be added).
Some flags trigger a build of the torchlambda image. This allows users to create, to some extent, their own deployment- and model-tailored image versions. For this, see Optional arguments below and look for the following warning (any flag carrying it will trigger a customized build):
⚠️ This flag turns on custom deployment image build!
Required arguments:
- source - positional argument pointing to the folder containing the C++ source code to use for deployment (see the torchlambda template command unless you provide your own). Any source code inside this folder will be compiled and linked, so you can freely split your code into multiple files; there is no need for a single main.cpp. Files having one of the .c, .cpp, .h and .hpp extensions will be compiled (support for other filetypes can easily be added).
Optional arguments:
- --destination - path specifying the zipped filename where the generated .zip package ready for deployment will be located, e.g. torchlambda build source --destination /path/to/folder/where/zip/will/be/located.zip. By default it will be placed in your current working directory and named torchlambda.zip.
- --compilation - compilation arguments used for your source code (if you want different options for pytorch or aws, see the --pytorch or --aws options accordingly). Should be passed as a string, e.g. torchlambda build source --compilation "-Wall -O2". If you want to pass only a single flag, add a space after the value, e.g. torchlambda build source --compilation "-O3 ". By default no flags are passed.
- --operations - path to an operations.yaml file containing the operations needed by your model for compilation. This allows for a smaller deployment size if the model's ops are exported (see here); a sketch of exporting such a file is shown after this list. For now this option doesn't seem to influence the final deployment size, probably due to the --whole-archive option passed during libtorch linking because of some issues with the static version.
⚠️ This flag turns on custom deployment image build!
- --pytorch - options used to build the pytorch dependency from source. Will be passed as cmake flags to the PyTorch build script. To see available options check PyTorch's CMakeLists.txt. Should be passed as a variable list of arguments, without the -D CMake prefix, e.g. --pytorch USE_MPI=ON USE_NUMPY=ON. The following overridable CMake flags are set by default (in addition to the ones provided in the aforementioned script):
  - -DBUILD_PYTHON=OFF
  - -DUSE_MPI=OFF
  - -DUSE_NUMPY=OFF
  - -DUSE_ROCM=OFF
  - -DUSE_NCCL=OFF
  - -DUSE_NUMA=OFF
  - -DUSE_MKLDNN=OFF
  - -DUSE_GLOO=OFF
  - -DUSE_OPENMP=OFF
⚠️ This flag turns on custom deployment image build!
- --pytorch-version - GitHub sources of PyTorch will be reverted to the specified version (either a commit or a tag; for the full list of releases see here). For example --pytorch-version "v1.4.0". By default, the current head of PyTorch's master branch will be used.
⚠️ This flag turns on custom deployment image build!
- --aws - options used to build the AWS C++ SDK from source. Will be passed as cmake flags to CMake. To see available options check here. Should be passed as a variable list of arguments, without the -D CMake prefix, e.g. --aws CPP_STANDARD=11 CUSTOM_MEMORY_MANAGEMENT=ON. These flags are used by default:
  - -DBUILD_SHARED_LIBS=OFF (cannot be overridden)
  - -DENABLE_UNITY_BUILD=ON (usually shouldn't be overridden)
  - -DCUSTOM_MEMORY_MANAGEMENT=OFF
  - -DCPP_STANDARD=17
⚠️ This flag turns on custom deployment image build!
⚠️ Please note the flag specifying which components to build (-DBUILD_ONLY) should be passed via --aws-components
- --aws-components - components of the AWS C++ SDK to be built. This flag corresponds to -DBUILD_ONLY from the docs. Should be specified as a variable list of arguments naming the components, e.g. --aws-components s3 dynamodb. Additionally, core will always be built as well. If users need integration with other services in their C++ source, this is the flag to specify. By default only core is built.
⚠️ This flag turns on custom deployment image build!
- --image - name of the Docker image to use. It follows these rules:
  - If a Docker image with that name exists on localhost, it will be used for deployment
  - Otherwise, if the name adheres to one of the prebuilt tags (e.g. szymonmaszke/torchlambda:latest), it will be downloaded from DockerHub and used
  - Otherwise, if one of the flags rebuilding the image is specified, a new image will be built with the specified name
  - If the flag is unspecified (the default case), szymonmaszke/torchlambda:latest will be used for deployment
- --docker - flags passed to the docker command (including every other docker subcommand used during deployment, like docker build or docker run). Should be specified as a string; if you want to pass a single flag you should add a space after the string, e.g. --docker "--debug " will become docker --debug run for the run commands of torchlambda. By default no flags are passed.
- --docker-build - flags passed specifically to the docker build command. Should be specified as a string; if you want to pass a single flag you should add a space after the string, e.g. --docker-build "--compress ". Multiple flags can be passed as well, like --docker-build "--compress --no-cache". By default no flags are passed.
- --docker-run - flags passed specifically to the docker run command. Should be specified as a string; if you want to pass a single flag you should add a space after the string, e.g. --docker-run "--name deployment ". Multiple flags can be passed as well, like --docker-run "--name deployment --mount source=myvol2,target=/home/app". By default no flags are passed.
- --no-run - if specified, do not run compilation of the provided source code (only build the image). This allows for separation of the image build and run steps (useful for decoupling both). If you are only after building the Docker image you can use any folder in place of the source argument, e.g. torchlambda build . --no-run --pytorch-version "1.4.0" --aws-components s3 dynamodb. Default: False.
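As mentioned at the --operations flag above, the list of operators needed by a model can be exported in Python. Below is a minimal sketch; it assumes PyYAML is available and uses torch.jit.export_opnames, and since the exact file layout expected by torchlambda is not described here, treat it only as a starting point (see the link in the flag description):

import torch
import torchvision  # illustrative model source only
import yaml  # assumes PyYAML is installed

# Use the same module you are going to deploy; resnet18 is just an example
model = torchvision.models.resnet18().eval()
scripted = torch.jit.trace(model, torch.rand(1, 3, 64, 64))

# torch.jit.export_opnames lists the operators used by the scripted module
ops = torch.jit.export_opnames(scripted)

# Dump them to a file that can be passed via:
# torchlambda build source --operations operations.yaml
with open("operations.yaml", "w") as handle:
    yaml.safe_dump(ops, handle)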
torchlambda layer
This command packs your torchscript-compiled model as an AWS Lambda layer .zip package ready to deploy.
The basic form of this command is simply:
$ torchlambda layer my_model.ptc
model.zip will be created in your current working directory using store (no compression). This compression method is recommended as it is the fastest to unpack on AWS Lambda, and no compression method supported by AWS Lambda will make your model much smaller AFAIK. If you wish to make your model smaller, please use quantization methods or the like (please note quantization wasn't tested extensively at this point, though it should work).
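For reference, the my_model.ptc file used above is a TorchScript-compiled module produced in Python; a minimal sketch (the torchvision model and the file name are only illustrative choices) could look like this:

import torch
import torchvision  # illustrative model source only

# Any torch.nn.Module can be used; resnet18 is just an example
model = torchvision.models.resnet18()
model.eval()

# Trace with an example input matching the deployment settings (e.g. [1, 3, 64, 64])
example = torch.rand(1, 3, 64, 64)
scripted = torch.jit.trace(model, example)

# Save the compiled module; torchlambda layer packs this file into a Lambda layer
scripted.save("my_model.ptc")

The resulting file can then be packed exactly as shown above with torchlambda layer my_model.ptc.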
Required arguments:
- source - path pointing to the torchscript-compiled PyTorch module to be packed.
Optional arguments:
- --destination - path specifying the location of the generated .zip file, e.g. torchlambda layer model.ptc --destination /path/to/file/to/generate/model.zip. By default it will be placed in your current working directory and named model.zip.
- --directory - archive directory where the model will be saved. Issuing torchlambda layer model.ptc --directory "my/folder" will pack model.ptc inside "my/folder/model.ptc". By default there is no folder, i.e. the model will be unpacked to /opt/model.ptc on AWS Lambda.
- --compression - compression method to use. One of ["STORED", "DEFLATED", "BZIP2", "LZMA"] is available. See the Python documentation for more information. Default: "STORED" (for the reasons mentioned above).
- --compression-level - level of compression to use, 0 through 9. See the Python documentation for more information. Default: None (the default for --compression).