> "... This book will be a great resource for both readers looking to implement existing algorithms in a scalable fashion and readers who are developing new, custom algorithms using Spark. ..."
>
> — Dr. Matei Zaharia, original creator of Apache Spark, from the Foreword
This directory contains the code for all chapters of "Data Algorithms with Spark".
- Chapter 01: Introduction to Data Algorithms
- Chapter 02: Transformations in Action
- Chapter 03: Mapper Transformations
- Chapter 04: Reductions in Spark
- Chapter 05: Partitioning Data
- Chapter 06: Graph Algorithms
- Chapter 07: Interacting with External Data Sources
- Chapter 08: Ranking Algorithms
- Chapter 09: Fundamental Data Design Patterns
- Chapter 10: Common Data Design Patterns
- Chapter 11: Join Design Patterns
- Chapter 12: Feature Engineering in PySpark
The following directories contain bonus chapters:
| Bonus Chapter | Description |
|---|---|
| Word Count | Multiple solutions to the word count problem using the reduceByKey() and groupByKey() reducers |
| Anagrams | Finding words that are anagrams of each other: multiple solutions using the reduceByKey(), groupByKey(), and combineByKey() reducers |
| Lambda Expressions | How to use lambda expressions in PySpark programs |
| TF-IDF | Term Frequency - Inverse Document Frequency |
| K-mers | K-mers for DNA Sequences |
| Correlation | All vs. All Correlation |
| mapPartitions() | Complete example of the mapPartitions() transformation |
| UDF | User-Defined Function Example |
| DataFrames Transformations | Examples of creating and transforming DataFrames |
| DataFrames Tutorials | DataFrames tutorials: creating DataFrames from collections and CSV text files |
| Join Operations | Examples of joining RDDs |
| PySpark Tutorial 101 | Examples of using PySpark RDDs and DataFrames |
| Physical Data Partitioning | Tutorial on physical data partitioning |
| Monoid: Design Principle | The monoid as a design principle |
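The Word Count entry contrasts reduceByKey() and groupByKey(). As a rough plain-Python sketch of the semantics (not the book's PySpark code, and without Spark's partitioned, distributed execution): reduceByKey() merges values per key as it goes, while groupByKey() first collects all values per key and aggregates afterward. The helper names below are illustrative, not from the repository.

```python
from collections import defaultdict

def reduce_by_key(pairs, fn):
    """Stand-in for RDD.reduceByKey(fn): merge values per key with fn as pairs arrive."""
    acc = {}
    for k, v in pairs:
        acc[k] = fn(acc[k], v) if k in acc else v
    return acc

def group_by_key(pairs):
    """Stand-in for RDD.groupByKey(): collect every value per key into a list."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return dict(groups)

lines = ["fox jumped", "fox ran", "fox jumped high"]
pairs = [(word, 1) for line in lines for word in line.split()]

counts_reduce = reduce_by_key(pairs, lambda a, b: a + b)
counts_group = {k: sum(vs) for k, vs in group_by_key(pairs).items()}
assert counts_reduce == counts_group == {"fox": 3, "jumped": 2, "ran": 1, "high": 1}
```

In Spark itself, reduceByKey() is usually preferred for aggregations because it combines values within each partition before shuffling, whereas groupByKey() ships every value across the network.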
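The Anagrams entry relies on mapping each word to a canonical key so that anagrams collide. A minimal plain-Python sketch of that keying idea, assuming the common sorted-letters signature (the function name is illustrative, not from the repository):

```python
from collections import defaultdict

def anagram_groups(words):
    """Group words by their sorted-letter signature; in Spark this signature
    would be the key fed to reduceByKey()/groupByKey()/combineByKey()."""
    groups = defaultdict(list)
    for word in words:
        groups["".join(sorted(word))].append(word)
    # keep only signatures shared by more than one word: true anagram sets
    return [sorted(members) for members in groups.values() if len(members) > 1]

words = ["listen", "silent", "enlist", "google", "banana"]
assert anagram_groups(words) == [["enlist", "listen", "silent"]]
```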