Llambda

Description

This sample project shows how to deploy serverless generative AI (a model served with llama.cpp) on AWS at very low cost.

The main idea is to deploy a container exposing an HTTP endpoint as a Lambda function, and to interact with the model through that endpoint.

Requirements

Local

You need to have installed:

  • Docker;
  • AWS CLI;
  • Make.
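
A quick way to check that they are all available:

docker --version
aws --version
make --version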

AWS

Make sure you have created:

  • an ECR repository.
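
The repository can be created with the AWS CLI, as in the sketch below; the repository name llambda is only a placeholder:

aws ecr create-repository --repository-name llambda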

Getting started

Configuration

Create the .env file from the .env.dist file and update it with your values.

Note that the size of the model should be a little less than the memory limit of the Lambda function, which is at most about 10 GB.
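
For example, assuming a standard dotenv workflow:

# create the local configuration from the template, then edit it with your values
cp .env.dist .env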

Setup

  1. Build and push the image to the registry.

    Download the model:

    make download

    Build the container image and tag it:

    make build
    make tag

    Log in to ECR:

    make ecr-login

    Push the image:

    make push
  2. Create a Lambda function from the image you pushed to your repository (see the CLI sketch after this list).

    Make sure to:

    • set the maximum available memory;
    • enable the function URL;
    • increase the timeout if necessary.
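
As a rough sketch, these steps can also be done with the AWS CLI. The region, account ID, image tag, and IAM role below are placeholders, and make ecr-login may wrap something slightly different in the actual Makefile:

# log in to ECR (what make ecr-login typically wraps)
aws ecr get-login-password --region eu-west-1 | \
    docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com

# create the function from the pushed image, with the maximum
# memory (10240 MB) and a generous timeout
aws lambda create-function \
    --function-name llambda \
    --package-type Image \
    --code ImageUri=123456789012.dkr.ecr.eu-west-1.amazonaws.com/llambda:latest \
    --role arn:aws:iam::123456789012:role/llambda-execution-role \
    --memory-size 10240 \
    --timeout 300

# enable a public function URL and allow unauthenticated invocations
aws lambda create-function-url-config --function-name llambda --auth-type NONE
aws lambda add-permission --function-name llambda \
    --statement-id FunctionURLAllowPublicAccess \
    --action lambda:InvokeFunctionUrl \
    --principal "*" \
    --function-url-auth-type NONE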

Usage

Make a request to the function endpoint to get the model response:

curl "https://{LAMBDA_FUNCTION_URL}/prompt?text=hello"

References

OpenLLaMa on AWS Lambda
