Help message:

usage: k-means.py [-h] [--nocache] [--distance DISTANCE] [--scale SCALE]
                  input output_bucket output_path k

k-means

positional arguments:
  input                Input file location
  output_bucket        Name of bucket to write output
  output_path          Path inside bucket to write output
  k                    how many means

optional arguments:
  -h, --help           show this help message and exit
  --nocache            Persist RDDs
  --distance DISTANCE  Distance measure [euclidian|giant]
  --scale SCALE        Scale to plot on [world|usa]
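The help text above looks like standard `argparse` output, so the command-line interface could be declared roughly as follows. This is only a sketch reconstructed from the usage message, not the script's actual source: the `build_parser` name, the `int` type for `k`, and `store_true` for `--nocache` are assumptions, and no defaults are set because the help message does not show any.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the parser from the help message;
    # argument names and help strings are copied from the usage text.
    parser = argparse.ArgumentParser(prog="k-means.py", description="k-means")
    parser.add_argument("input", help="Input file location")
    parser.add_argument("output_bucket", help="Name of bucket to write output")
    parser.add_argument("output_path", help="Path inside bucket to write output")
    # Assumption: k is parsed as an integer.
    parser.add_argument("k", type=int, help="how many means")
    # Assumption: --nocache is a boolean flag (it takes no value in the usage line).
    parser.add_argument("--nocache", action="store_true", help="Persist RDDs")
    parser.add_argument("--distance", help="Distance measure [euclidian|giant]")
    parser.add_argument("--scale", help="Scale to plot on [world|usa]")
    return parser


# Parse the same arguments as the example invocation in this README.
args = build_parser().parse_args([
    "s3://cleaned-geo-final/clean_devicestatus.txt",
    "geo-final",
    "small/5/euclidian/nocache.png",
    "5",
    "--scale", "usa",
    "--distance", "euclidian",
    "--nocache",
])
print(args.input, args.output_bucket, args.output_path)
print(args.k, args.distance, args.scale, args.nocache)
```

Parsing a fixed list instead of `sys.argv` keeps the sketch runnable outside of `spark-submit`.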

Example use:

spark-submit k-means.py s3://cleaned-geo-final/clean_devicestatus.txt geo-final small/5/euclidian/nocache.png 5 \
    --scale usa --distance euclidian --nocache