This directory contains Python scripts for benchmarking the supported algorithms.
This script can be used to run them locally.
They can also be run on the Databricks AWS-hosted Spark service. See these instructions and accompanying scripts for running a set of high-compute workloads on comparable CPU and GPU clusters. The graph below shows the resulting average running times for Spark ML on CPU and Spark Rapids ML on GPU.
Click the links below for instructions on running the benchmarking scripts in the respective cloud service provider (CSP) Spark environments.