Skip to content

Latest commit

 

History

History
281 lines (177 loc) · 13.7 KB

README.md

File metadata and controls

281 lines (177 loc) · 13.7 KB

MLOS Autotuning for Sqlite Repo

MLOS Autotuning DevContainer

This repo is a fork of the mlos-autotuning-template repo.

It is meant as a basic class demo and example of tuning a local sqlite instance running via benchbase and analyzing the results using MLOS, a framework to help benchmark and automate systems tuning.

Contents

Background

Systems tuning is a difficult and time consuming task, yet more and more necessary as the complexity of cloud systems and all of their varied choices, different deployment environments, and myriad of workloads grows. With it, we can reduce cost, improve performance, lower carbon footprint, improve reliability, etc., by fitting the system to the workload and its execution environment, tuning everything from VM size, to OS parameters, to application configurations.

To mitigate risk to production systems, we often employ offline exploration of this large parameter space to find a better config. Benchmarks are needed to exercise the system, many parameter combinations should be tried, and data collected and compared. This process has traditionally been done manually, which is both time consuming and error prone. Noise in the cloud further makes it less reproducible.

The goal of autotuning systems like MLOS are to use automation and data driven techniques to help reduce the burden of this task and make it more approachable for many different systems and workloads in order to help bring the vision of an autonomous instance optimized cloud closer to reality.

Overview

There are several items in this example:

  1. Some configs and example commands to use mlos_bench to autotune a sqlite workload (see below).

    These can be run in the background while you explore the data in some of the other notebooks.

  2. mlos_demo_sqlite.ipynb

    This is your workbook for this demo. Use it to analyze the data from running mlos_bench to find a better SQLite configuration and help understand what the optimizer found about the performance of that config.

    Initially, there won't be much data here to work with, until the commands from the loop in the previous step have run for a short while.

  3. mlos_demo_sqlite_teachers.ipynb

    Here we analyze the data from running 100 trials of mlos_bench for SQLite optimization, as detailed in the instructions below. The results you obtain during this workshop should look similar to what we have in this notebook.

  4. mlos_demo_mysql.ipynb

    This notebook explores some existing data that we've collected with the mlos_bench tool while optimizing a MySQL Server on Azure.

    It is meant to familiarize yourself with the data access and visualization APIs while the commands from the first step gather new data for the sqlite demo in the background.

Video

You can see a brief example demo of this tool in the following video:

demo video

Setup

Prerequisites

For this demo, we will be using Github's Codespaces feature to provide a pre-configured environment for you to use.

Codespaces

Local

  • VSCode

    For a more pleasant experience, we recommend connecting to the remote codespace using a local instance of VSCode, but it's not required. You can also just use the web interface.

    It is also possible to use a local checkout of the code using git, docker, and a devcontainer, but we omit these instructions for now.

Prior to Class

  1. Create a github account if you do not already have one.

  2. Open the project in your browser.

    Navigate to the green <> Code drop down at the top of page and select the green Create codespace on main button.

    open github repo in codespace
  3. Reopen the workspace (if prompted).

    open workspace prompt

    Note: you can trigger the prompt by browsing to the mlos-autotuning.code-workspace file and following the prompt in the lower right to reopen.

  4. Run the following code in the terminal at the bottom of the page, confirm you get an output back with help text.

    conda activate mlos
    mlos_bench --help

    You should see some help output that looks like the following:

    usage: mlos_bench [-h] [--config CONFIG] [--log_file LOG_FILE] [--log_level LOG_LEVEL] [--config_path CONFIG_PATH [CONFIG_PATH ...]] [--environment ENVIRONMENT] [--optimizer OPTIMIZER] [--storage STORAGE] [--random_init]
                      [--random_seed RANDOM_SEED] [--tunable_values TUNABLE_VALUES [TUNABLE_VALUES ...]] [--globals GLOBALS [GLOBALS ...]] [--no_teardown]
    
    mlos_bench : Systems autotuning and benchmarking tool
    
    options:
      -h, --help            show this help message and exit
      ...
  5. That's it! If you run into any issues, please reach out to the teaching team and we can assist prior to class starting.

Start of Class

These instructions use the Github Codespaces approach described above.

  1. Open the codespace previously created above by browsing to the green <> Code button on the project repo site as before.

    • Use the "Open in VSCode Desktop" option from the triple bar menu on the left hand side to re-open the codespace in a local VSCode instance.
    open codespace in VSCode menu option

    Note this step is optional, but recommended for a better experience. You can alternatively stay in the browser interface for the entire demo.

  2. Make sure the local repo is up to date.

    To be executed in the integrated terminal at the bottom of the VSCode window:

    # Pull the latest sqlite-autotuning demo code.
    git pull
  3. Make sure the MLOS dependencies are up to date.

    To be executed in the integrated terminal at the bottom of the VSCode window:

    # Pull the latest MLOS code.
    git -C MLOS pull
  4. Make sure the mlos_bench.sqlite data is available.

    To be executed in the integrated terminal at the bottom of the VSCode window:

    # Download the previously generated results database.
    test -f mlos_bench.sqlite || wget -Nc https://mlospublic.z13.web.core.windows.net/sqlite-autotuning/mlos_bench.sqlite
  5. Activate the conda environment in the integrated terminal (lower panel):

    conda activate mlos
    codespace terminal activating mlos conda environment
  6. Make sure the TPC-C database is preloaded.

    Note: this is an optimization. If not present, the scripts below will generate it the first time it's needed.

    mkdir -p workdir/benchbase/db.bak
    wget -Nc -O workdir/benchbase/db.bak/tpcc.db https://mlospublic.z13.web.core.windows.net/sqlite-autotuning/tpcc.db

Using mlos_bench

  1. Run the mlos_bench tool as a one-shot benchmark.

    For instance, to run the sqlite example from the upstream MLOS repo (pulled locally):

    To be executed in the integrated terminal at the bottom of the VSCode window:

    # Run the one-shot benchmark.
    # This will run a single experiment trial and output the results to the local results database.
    mlos_bench --config "./config/cli/local-sqlite-bench.jsonc" --globals "./config/experiments/sqlite-sync-journal-pagesize-caching-experiment.jsonc"

    This should take a few minutes to run and does the following:

    • Loads the CLI config ./config/cli/local-sqlite-bench.jsonc

    • The CLI config also references and loads the root environment config ./config/environments/apps/sqlite/sqlite-local-benchbase.jsonc.

      • In that config the setup section lists commands used to
        1. Prepare a config for the sqlite instance based on the tunable parameters specified in the experiment config,
        2. Load or restores a previously loaded copy of a tpcc.db sqlite instance using a benchbase docker image.
      • Next, the run section lists commands used to
        1. execute a TPC-C workload against that sqlite instance
        2. assemble the results into a file that is read in the read_results_file config section in order to store them into the mlos_bench results database.
  2. Run the mlos_bench tool as an optimization loop.

    # Run the optimization loop by referencing a different config file
    # that specifies an optimizer and objective target.
    mlos_bench --config "./config/cli/local-sqlite-opt.jsonc" --globals "./config/experiments/sqlite-sync-journal-pagesize-caching-experiment.jsonc" --trial-config-repeat-count 3 --max-suggestions 100

    The command above will run the optimization loop for 100 suggested configurations, which should take about 30 minutes since each trial should takes about 12 seconds to run.

    Note: a 10 second run is not very long evaluation period. It's used here to keep the demo short, but in practice you would want to run for longer to get more accurate results.

    To do this, it follows the procedure outlined above, but instead of running a single trial, it runs an optimization loop that runs multiple trials, each time updating the tunable parameters based on the results of the previous trial, balancing exploration and exploitation to find the optimal set of parameters.

    The overall process looks like this:

    optimization loop

    Source: LlamaTune: VLDB 2022

    While that's executing you can try exploring other previously collected data using the mlos_demo_mysql.ipynb notebook.

Analyzing the Results

  1. Use the mlos_demo_sqlite.ipynb notebook to analyze the results.

    To do this, you may need to activate the appropriate python kernel in the Jupyter notebook environment.

    1. Select Python Environments from the Select Kernel menu in the upper right of the notebook:

      jupyter select kernel menu
    2. Select mlos from the options listed:

      jupyter select mlos kernel

Troubleshooting

Here's a short list of some tips/tricks of things to try in case you encounter some issues during the demo:

  • If the "Select Kernels" menu is hanging during the notebook steps,

    Or, if the mlos_bench --help step returns a command not found error,

    Then, try to

    1. Update and/or restart VSCode
    2. Restart your codespace

See Also

Here are some additional sources of information:

  • MLOS - the main repo for the mlos_bench tool.

Data Science APIs