👋 Welcome to Setting Up Your Interactive Dev Environment for Building with LLMs!!

Welcome to the beginning of your journey to becoming an LLM Operations (LLMOps) Engineer! 🎉 Follow these steps to get your development environment teed up!

👋 Welcome to Beyond ChatGPT!!

For a step-by-step YouTube video walkthrough, watch this!
Setting up your LLM Ops Dev Environment

📚 Quick Review

We will be using some terminal commands, so let's make sure you know what they are and what they do!

| Command | Stands For | Description |
|---|---|---|
| `ls` | list | lists all files and directories in the present working directory |
| `ls -a` | list all | lists hidden files as well |
| `cd {dirname}` | change directory | changes to a particular directory |
| `cd ~` | change directory home | navigates to the HOME directory |
| `cd ..` | change directory up | moves one level up |
| `cat {filename}` | concatenate | displays the file content |
| `sudo` | superuser do | allows regular users to run programs with the security privileges of the superuser or root |
| `mv {filename} {newfilename}` | move | renames the file to the new filename |
| `clear` | clear | clears the terminal screen |
| `mkdir {dirname}` | make directory | creates a new directory in the present working directory or at the specified path |
| `rm {filename}` | remove | removes the file with the given filename |
| `touch {filename}.{ext}` | touch | creates a new empty file |
| `rmdir {dirname}` | remove directory | deletes an empty directory |
| `ssh {username}@{ip-address}` or `{hostname}` | secure shell | logs in to a remote Linux machine using SSH |
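
If you are more comfortable reading Python than shell, the sketch below repeats a few of the commands above using the standard library's pathlib. It is only an illustration: the names demo, notes.txt, and todo.txt are made up for this example, and everything happens inside a temporary directory so nothing on your machine is touched.

```python
import tempfile
from pathlib import Path


def shell_tour(base: Path) -> list[str]:
    """Repeat a few of the commands above with pathlib and return the `ls` listing."""
    project = base / "demo"                       # hypothetical directory name
    project.mkdir()                               # mkdir demo
    note = project / "notes.txt"                  # hypothetical file name
    note.touch()                                  # touch notes.txt
    renamed = note.rename(project / "todo.txt")   # mv notes.txt todo.txt
    renamed.write_text("learn git\n")
    print(renamed.read_text(), end="")            # cat todo.txt
    listing = sorted(p.name for p in project.iterdir())  # ls
    renamed.unlink()                              # rm todo.txt
    project.rmdir()                               # rmdir demo (only works on an empty directory)
    return listing


with tempfile.TemporaryDirectory() as tmp:
    print(shell_tour(Path(tmp)))
```

Note that rmdir only succeeds here because the file was removed first, which mirrors how the shell command behaves.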

🛠️ Tools We'll Be Using

We will also be using a few tools such as git, conda, and pip.

Git

Git is a free and open source distributed version control system designed to handle everything from small to very large projects. These are the commands we will be using with git:

git clone -> clone a remote repository to your local computer

git add -> add files to a commit

git commit -m {message} -> commit changes with a message

git push -> push commit to remote repository

Conda & Pip

Conda is an open-source, cross-platform, language-agnostic package manager and environment management system. We will use pip within conda environments to manage our package installations. pip is Python's package management system. conda comes with Anaconda, which is a convenient way to set up your Python programming environment since it includes an environment management tool (conda) and extra packages that are commonly used in data science and ML.

Some commands we will use in this lesson when it comes to conda and pip:

conda create --name llmops-course python=3.11 pip -> This creates a virtual environment. A virtual environment is a Python environment such that the Python interpreter, libraries, and scripts installed into it are isolated from those installed in other environments and from any libraries installed on the system. So basically, this allows you to keep all your project's code/dependencies/libraries separated from other projects. You are specifically saying to create said environment with the name llmops-course, use Python version 3.11, and install pip into it so that it can serve as your package manager. The conda command invokes the underlying logic to actually make the virtual environment and manages said environments for you.

conda activate llmops-course -> This activates the virtual environment you made with the above command for your current terminal session.

pip install numpy pandas matplotlib jupyter openai huggingface_hub -> This installs the six packages mentioned: numpy, pandas, matplotlib, jupyter, openai, and huggingface_hub. numpy is used for scientific computing, pandas is used for data analysis, and matplotlib is used for data graphics. jupyter is discussed later in this tutorial in depth! openai is used to access OpenAI's GPT models through an API key. huggingface_hub is used to push our code and models to Huggingface and host them in a Huggingface Space. pip is the Python package manager, and you are telling it to install the listed packages into your environment.
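
Once the install finishes, a quick way to confirm that the packages landed in the active environment is to ask Python for their installed versions. The snippet below is a minimal sketch using only the standard library; the helper name installed_versions is made up for this example, and it assumes you run it with the interpreter from the llmops-course environment.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to `pip install` above.
COURSE_PACKAGES = ["numpy", "pandas", "matplotlib", "jupyter", "openai", "huggingface_hub"]


def installed_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(COURSE_PACKAGES).items():
    print(f"{name}: {ver if ver else 'NOT INSTALLED'}")
```

If anything shows up as NOT INSTALLED, check that the environment is activated and rerun the pip install command.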

🚀 Let's Get Started!

Let's start off by setting up our environment! Review the environment setup instructions for the local environment that you'll be using in this course.

Windows
wsl --install

(If you find yourself getting stuck on the WSL2 install, here is a link to video instructions)

Give it a test drive!

WindowsTerminal

Continue by installing the following tools using Windows Terminal to set up your environment. When prompted, make sure to add conda to init.

| Tool | Purpose | Command |
|---|---|---|
| 🐍 Anaconda (installed in WSL2) | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh`<br>`bash Anaconda3-2023.07-2-Linux-x86_64.sh`<br>`source ~/.bashrc` |
| :octocat: Git (installed in WSL2) | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
| 📝 VS Code (installed in Windows) | Development Environment | Download |

Linux (Debian/Ubuntu)

Open terminal using Ctrl+Shift+T. Enter the following commands in terminal to set up your environment. When prompted, make sure to add conda to init.

| Tool | Purpose | Command |
|---|---|---|
| 🐍 Anaconda | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh`<br>`bash Anaconda3-2023.07-2-Linux-x86_64.sh`<br>`source ~/.bashrc` |
| :octocat: Git | Version Control | `sudo apt update && sudo apt upgrade`<br>`sudo apt install git-all` |
| 📝 VS Code | Development Environment | Download |

macOS Intel

To get started, we need to download the MacOS package manager, Homebrew 🍺, so that we can download the tools we'll be using in the course. If you don't already have Homebrew installed, run the following commands:

  1. Open terminal using ⌘+Space and type terminal.

  2. Install Homebrew using the command below, following the command prompts:

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

  3. Update Homebrew (This may take a few minutes)

    git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-core fetch --unshallow

    git -C /usr/local/Homebrew/Library/Taps/homebrew/homebrew-cask fetch

  4. Install the wget command to continue following along: brew install wget

Enter the following commands in terminal to set up your environment. When prompted, make sure to add conda to init.

| Tool | Purpose | Command |
|---|---|---|
| 🐍 Anaconda | Python & ML Toolkits | `wget https://repo.anaconda.com/archive/Anaconda3-2023.07-2-MacOSX-x86_64.sh`<br>`bash Anaconda3-2023.07-2-MacOSX-x86_64.sh`<br>`source ~/.bashrc` |
| :octocat: Git | Version Control | `brew install git` |
| 📝 VS Code | Development Environment | Download |

macOS Apple Silicon

To leverage the Mx chip for Python, you must use a special Python distribution called Miniforge. Open terminal using ⌘+Space and type terminal. Enter the following commands in terminal to set up your environment.

Miniforge can be installed using Homebrew or from the source. We suggest trying the Homebrew option first.

Option 1 Homebrew

To get started, we need to download the MacOS package manager, Homebrew 🍺, so that we can download the tools we'll be using in the course. If you don't already have Homebrew installed, run the following commands:

  1. Open terminal using ⌘+Space and type terminal.

  2. Install Homebrew using the command below, following the command prompts:

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

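  3. Update Homebrew (This may take a few minutes). On Apple Silicon, Homebrew lives under /opt/homebrew rather than /usr/local. If these directories do not exist, your Homebrew version likely does not need this step and you can run brew update instead.

    git -C /opt/homebrew/Library/Taps/homebrew/homebrew-core fetch --unshallow

    git -C /opt/homebrew/Library/Taps/homebrew/homebrew-cask fetch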

  4. Install the wget command to continue following along: brew install wget

  5. Install the xcode-select command-line utilities by typing the following command in the terminal

    xcode-select --install

After running the commands in the table below, initialize your conda base environment when prompted by running conda init zsh.

| Tool | Purpose | Command |
|---|---|---|
| 🐍 Miniforge | Python & ML Toolkits | `brew install miniforge` |
| :octocat: Git | Version Control | `brew install git` |
| 📝 VS Code | Development Environment | Download |

Let's configure our VS Code environment!

Install the IntelliCode Extension

IntelliCode is an AI-powered code completion extension to boost coding productivity. 😎

  1. Click the Extensions tab in the navigation panel on the left side of VS Code.

  2. Type "IntelliCode" in the search bar.

  3. Click install on the Microsoft IntelliCode Extension

Install the Python and Jupyter Notebook Extensions
  1. Click the Extensions tab on the left side of the window.

  2. Type "Python" in the search bar.

  3. Click Install on the Python Extension

  4. Then, type "Jupyter" in the search bar.

  5. Click Install on the Microsoft Jupyter Notebook Extension

Set the Python Interpreter
  1. Open VS Code and click on New File...

  2. Open the Command Palette (Mac: ⌘+Shift+P, Windows: Ctrl+Shift+P)

  3. Type "Python" in the search bar.

  4. Click on New Python File

  5. Open the Command Palette again. Can you remember the shortcut? If not, see step 2 above.

  6. Type "Python Interpreter".

  7. Click on Python: Select Interpreter

  8. Select the Conda environment that you installed earlier.

  9. Now you're ready to start coding!
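
To double-check that VS Code picked up the right interpreter, you can run a few lines in your new Python file. This is a small sketch, not an official check: the helper name describe_interpreter is made up for this example, and CONDA_DEFAULT_ENV is an environment variable that conda sets on activation, so it may be empty if the file is run without an activated environment. The executable path is usually the more reliable signal.

```python
import os
import sys


def describe_interpreter(environ=os.environ):
    """Collect a few facts that show which Python is running."""
    return {
        "executable": sys.executable,                    # path of the running interpreter
        "version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),   # set by `conda activate`; None otherwise
    }


info = describe_interpreter()
print(info)
if info["conda_env"] != "llmops-course":
    print("The llmops-course environment does not appear to be active.")
```

If the executable path points inside your llmops-course environment and the version reads 3.11, the interpreter is set correctly.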

🐳 Setting up Docker Desktop and Compose

Windows

👉 💿 Download Docker 👈

  1. Double-click Docker Desktop Installer.exe to run the installer.

  2. When prompted, ensure the Use WSL 2 instead of Hyper-V option on the Configuration page is selected or not depending on your choice of backend.

    If your system only supports one of the two options, you will not be able to select which backend to use.

  3. Follow the instructions on the installation wizard to authorize the installer and proceed with the install.

  4. When the installation is successful, select Close to complete the installation process.

  5. If your admin account is different to your user account, you must add the user to the docker-users group. Run Computer Management as an administrator and navigate to Local Users and Groups > Groups > docker-users. Right-click to add the user to the group. Sign out and sign back in for the changes to take effect.

Ubuntu

To install Docker Desktop successfully, you must:

  • Meet the system requirements
  • Have a 64-bit version of either Ubuntu Jammy Jellyfish 22.04 (LTS) or Ubuntu Impish Indri 21.10. Docker Desktop is supported on x86_64 (or amd64) architecture.
  • For non-Gnome Desktop environments, gnome-terminal must be installed:
    $ sudo apt install gnome-terminal
  1. Update the apt package index and install packages to allow apt to use a repository over HTTPS:

    $ sudo apt-get update
    $ sudo apt-get install ca-certificates curl gnupg
  2. Add Docker's official GPG key:

    $ sudo install -m 0755 -d /etc/apt/keyrings
    $ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    $ sudo chmod a+r /etc/apt/keyrings/docker.gpg
  3. Use the following command to set up the repository:

    $ echo \
      "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
      "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
  4. Update the apt package index:

    $ sudo apt-get update
  5. Download Docker Desktop

    wget https://desktop.docker.com/linux/main/amd64/docker-desktop-4.22.1-amd64.deb
  6. Install the package with apt as follows:

    $ sudo apt-get update
    $ sudo apt-get install ./docker-desktop-4.22.1-amd64.deb
  7. Launch Docker Desktop

         systemctl --user start docker-desktop

    Note

    At the end of the installation process, apt displays an error due to installing a downloaded package. You can ignore this error message.

    N: Download is performed unsandboxed as root, as file '/home/user/Downloads/docker-desktop.deb' couldn't be accessed by user '_apt'. - pkgAcquire::Run (13: Permission denied)
    
macOS (Intel and Apple Silicon)

👉 💿 Download Docker 👈

  1. Double-click Docker.dmg to open the installer, then drag the Docker icon to the Applications folder.

  2. Double-click Docker.app in the Applications folder to start Docker.

  3. The Docker menu (the whale icon) displays the Docker Subscription Service Agreement.

  4. Select Accept to continue.

    Note that Docker Desktop won't run if you do not agree to the terms. You can choose to accept the terms at a later date by opening Docker Desktop.

    For more information, see Docker Desktop Subscription Service Agreement. We recommend that you also read the FAQs.

  5. From the installation window, select either:

    • Use recommended settings (Requires password). This lets Docker Desktop automatically set the necessary configuration settings.
    • Use advanced settings. You can then set the location of the Docker CLI tools either in the system or user directory, enable the default Docker socket, and enable privileged port mapping. See Settings for more information and how to set the location of the Docker CLI tools.
  6. Select Finish. If you have applied any of the above configurations that require a password in step 5, enter your password to confirm your choice.

🔑 Setting Up Keys and Tokens

Generating a GitHub Access Token

Create an account with GitHub here if you do not have one.

Navigate to GitHub's Developer Token settings. Click on Generate new token > Generate new token (classic).

Give the token a description, set the expiration (we recommend 90 days), and check every box. When you're done, click Generate token at the bottom of the page.

Copy the access token and save it for later use. We will use this token to interact with GitHub. Please do not lose this access token or you will need to generate a new one.

Generating an OpenAI API key

Create an account with OpenAI here if you do not have one.

Navigate to OpenAI's API Developer settings and click on + Create new secret key.

Name your key and click Create secret key.

Copy the key and save it for later use. We will use this key several times in deploying projects. Please do not lose this key or you will need to generate a new one.

We recommend you run through our OpenAI Notebook to learn how to utilize the OpenAI API.
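
When you start using the key in code, a common practice is to read it from an environment variable rather than pasting it into a file that might get committed. The sketch below shows one way to do that with the standard library only. The helper names load_api_key and mask are made up for this example; OPENAI_API_KEY is the variable name the openai package looks for by default.

```python
import os


def load_api_key(name="OPENAI_API_KEY", environ=os.environ):
    """Return the key stored in the environment variable `name`, or raise a helpful error."""
    key = environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it in your shell before running.")
    return key


def mask(key, visible=4):
    """Hide all but the last few characters so the key is safer to print."""
    return "*" * max(len(key) - visible, 0) + key[-visible:]


if __name__ == "__main__":
    try:
        print("Loaded key:", mask(load_api_key()))
    except RuntimeError as err:
        print(err)
```

Printing only a masked version keeps the full key out of terminal history, screenshots, and notebook outputs.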

Generating a Huggingface Access Token

Create an account with Huggingface here if you do not have one.

Navigate to Token settings and click on New token.

Name your access token, change the role to write, and click Generate a token.

Copy the token and save it for later use. We will use this token several times in deploying projects. If you lose this token, you can always go back to your token's page and view the token.

Log in to Huggingface using your terminal:

huggingface-cli login

After logging in, press y to add the token to credentials for git.

Let's Make Sure That GitHub is Ready to Roll!

GitHub SSH Setup

Secure Shell Protocol (SSH) provides a secure communication channel over an unsecured network. Let's set it up!

  1. Generate a Private/Public SSH Key Pair.
ssh-keygen -o -t rsa -C "your email address for github"
  2. Save file pair. Default location ~/.ssh/id_rsa is fine!

  3. At the prompt, type in a secure passphrase.

  4. Copy the contents of the public key that we will share with GitHub.

    • Mac: pbcopy < ~/.ssh/id_rsa.pub

    • Windows (WSL): clip.exe < ~/.ssh/id_rsa.pub

    • Linux: xclip -sel c < ~/.ssh/id_rsa.pub

  5. Go to your GitHub account and go to Settings.

  6. Under Access, click on the SSH and GPG keys tab on the left.

  7. Click on the New SSH Key button.

  8. Name the key, and paste the public key that you copied. Click the Add SSH Key button.

Viewing the Repositories

Log in and click on the top right user icon, then go to Your repositories.

Creating a New Repository

When viewing the repository page, click on New and proceed to create your repo.

Filling Repository Details

Create the repository by inputting the following:

  • Repo name
  • Repo description
  • Make repo public
  • Add a README
  • Add .gitignore (Python template)
  • Add license (choose MIT)

Then click Create Repository.


Clone Your Repo
  1. Open your terminal and navigate to a place where you would like to make a directory to hold all your files for this class using the command cd.
cd {directory name}
  2. Once there, make a top level directory using mkdir.
mkdir {directory name}
  3. cd into it and make another directory called code.
cd {directory name}
mkdir code
  4. cd into it and run your git clone {your repo url} command.
cd code
git clone {your repo url}
Adding The AI Makerspace Beyond-ChatGPT Content to Your Repo
  1. cd into your repo and check your git remotes.
cd {your repo name}
git remote -v

At this point, you should only have access to your own repo, listed as the origin remote with both fetch and push URLs.

  2. Let's set up our global configuration:
git config --global user.email "your email address"
git config --global user.name "your name"
  3. Let's add a local branch for development.
git checkout -b LocalDev

You can change anything here in this branch!

git add .

Commit the changes with the branch addition.

git commit -m "Adding a LocalDev branch."
  4. Let's push our local changes to our remote repo.
git checkout main
git merge LocalDev
git push origin main
  5. Add the Beyond-ChatGPT (BC) repo as an extra remote repo:
git remote add BC git@github.com:AI-Maker-Space/Beyond-ChatGPT.git

Let's check our remote repos:

git remote -v

At this point, you should have access to both your own repo and the AI Maker Space repo and should see something like this:

BC    git@github.com:AI-Maker-Space/Beyond-ChatGPT.git (fetch)
BC    git@github.com:AI-Maker-Space/Beyond-ChatGPT.git (push)
origin git@github.com:ai-kadhim/TestRepo.git (fetch)
origin git@github.com:ai-kadhim/TestRepo.git (push)
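
If you would rather check this programmatically than by eye, the output of git remote -v is easy to parse. The following is a rough sketch, not part of the course tooling: the function name parse_remotes is made up for this example, and the origin URL in the sample is a placeholder for your own repo.

```python
def parse_remotes(text):
    """Map each remote name to a dict like {'fetch': url, 'push': url}."""
    remotes = {}
    for line in text.strip().splitlines():
        name, url, kind = line.split()           # e.g. "BC", "git@...", "(fetch)"
        remotes.setdefault(name, {})[kind.strip("()")] = url
    return remotes


# Sample output; the origin URL is a hypothetical placeholder.
sample = """
BC      git@github.com:AI-Maker-Space/Beyond-ChatGPT.git (fetch)
BC      git@github.com:AI-Maker-Space/Beyond-ChatGPT.git (push)
origin  git@github.com:your-username/your-repo.git (fetch)
origin  git@github.com:your-username/your-repo.git (push)
"""

remotes = parse_remotes(sample)
print(sorted(remotes))
print("Both remotes present:", {"BC", "origin"} <= set(remotes))
```

To run it against your real repo, you could capture the command's output with subprocess.run(["git", "remote", "-v"], capture_output=True, text=True).stdout and pass that string in place of sample.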

Let's update our local repos:

git fetch --all

Make a new branch for the Beyond-ChatGPT material (BCBranch).

git checkout --track -b BCBranch BC/main

You should see something like this:

branch 'BCBranch' set up to track 'BC/main'.
Switched to a new branch 'BCBranch'

You can visually check whether you are in that branch:

git log --all --graph

Now let's push our updated local repo to our remote repo!

git checkout main
git merge BCBranch --allow-unrelated-histories

If there are any conflicts you'll need to resolve them.

git add .
git commit -m "message-here"
git push origin main

From now on... after each release follow these steps to update your repo with new content:

git fetch --all
git checkout BCBranch
git merge --ff-only @{u}
git add .
git commit -m "branch is updated"
git checkout main
git merge BCBranch --allow-unrelated-histories

You will be asked to add a comment about why this change is necessary --> add a message.

git push origin main

Bringing it all together with Jupyter Notebooks

Activating Your Conda Environment
  1. Now, let's activate the environment we set up earlier with the command conda activate llmops-course. If you were successful, you should see (llmops-course) at the start of your terminal prompt.
Adding a Feature Branch

Let's add a feature branch to our local repo. Earlier, we showed you how to add a feature branch and content to your repo via the Terminal. This time we are going to show you how to do it using the VS Code GUI.

  1. Click on the main branch in the lower left side of the screen

  2. You will then see a drop-down menu with some branch-level option commands. Select the Create a new branch option.

  3. You will be prompted to enter the name for the branch. Let's give our branch an informative name: feature-hello-world. The feature prefix is a common Git convention and lets our collaborators know the purpose of the branch and the name of the feature.

  4. Now that we have a feature branch to work on, let's add some code to it!

Hello World! - Part 1
  1. Next we will review some terminal commands and make some additions to our repo. Do these in your terminal where your current working directory is your repo.
  • Check your current working directory: pwd

  • Create a new file: touch hello_world.py

  • Create new directory: mkdir app

  • Move file to directory: mv hello_world.py app/hello_world.py

  • Check that the move command worked: cd app and then ls, you should see your hello_world.py file

  • Lastly, let's clear our terminal screen: clear

  2. Click on the Explorer tab.

  3. Click on your hello_world.py file and type the following into the file:

print("hello world! let's do some ml ops!")
  4. Save. Now go to the integrated terminal by pressing Ctrl + `. In the terminal, run your first program of the class with cd app (skip this if you are already in the app directory) followed by python hello_world.py. Congrats, we are off to a great start!
Hello World! - Part 2 - Notebook Edition
  1. Create a new file under app by clicking on the Add file button and let's name this file hello_world.ipynb. The .ipynb extension is a notebook extension which will allow you to interact with your code via a notebook in VS Code, instead of a vanilla Python file. You might need to select your kernel in the top right of the notebook file, if so, choose the one we created previously.

  2. In the first cell of hello_world.ipynb let's do our imports.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    
  3. Run the cell by either clicking the play button or by doing CTRL + ENTER.

  4. Create a new cell and in that put the following code:

    np.random.seed(0)
    
    values = np.random.randn(100) # array of normally distributed random numbers
    s = pd.Series(values) # generate a pandas series
    s.plot(kind='hist', title='Normally distributed random values') # hist computes distribution
    plt.show()   
    
  5. Run the cell and you should see your histogram plot! Well done.
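
If you are curious what the hist step is doing under the hood, the idea can be sketched without any third-party packages: split the range of the data into equal-width bins and count how many values fall in each. This is an illustration only, not how pandas or matplotlib implement it. The function name histogram_counts is made up for this example, bins=10 is an assumption chosen to mirror the plotting default, and the random values come from Python's random module, so they will differ from the NumPy values above.

```python
import random


def histogram_counts(values, bins=10):
    """Split [min, max] into equal-width bins and count the values in each.

    Assumes the values are not all identical (the bin width would be zero).
    """
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - low) / width), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return counts


random.seed(0)
values = [random.gauss(0, 1) for _ in range(100)]  # 100 normally distributed numbers
for i, count in enumerate(histogram_counts(values)):
    print(f"bin {i}: {'#' * count}")
```

The printed rows of # characters form a sideways text histogram, with the tallest rows in the middle bins.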


  6. Now let's commit our code to our remote repository. This can be done in one of two ways: either through the terminal or through VS Code's GUI. I'll explain the VS Code way, but you can also use the terminal method with the commands demonstrated earlier.
  • Click Source Control on the left icon bar.

  • Add a message to your commit by typing in the message field.

  • Click the check mark button under changes to add your files to this commit. If you haven't saved your changes, you will be prompted to Save All and Commit. Click Save All and Commit.

    OPTIONAL: Manually staging individual files

    You can manually stage files by pressing the `+` button.

  • Click the ellipsis in the Source Control ribbon and click Push. You may also be prompted to Sync Changes. This will do a Pull and a Push, which will fetch new changes to the code and push your updates as well.

  • You can then put in a pull request in GitHub to merge into the branch that you pulled from, in this case the main branch. In real life, you would then review the code changes with another developer/team lead/supervisor and address any potential code conflicts.

Build Your First LLM Application

Now that you've completed your IDE setup, head on over to the Beyond-ChatGPT repository to build your first LLM application!