Repository for my talk at UCLA's DataFest 2021

Jeremy Guinta - Data Scientist; Litigation Consultant; Statistical Expert; Machine Learning; Lecturer

Jeremy has nearly 20 years’ experience in litigation consulting and complex data analysis. He specializes in complex data analysis, including statistical procedures for identifying trends and building predictive analyses. He also has expertise in machine learning and has developed machine learning processes to improve name matching and categorical descriptions of textual data. Over the course of his career, Jeremy has developed and delivered training courses on Statistics, Data Management and Analysis, R programming, SQL programming, and Data Visualization. He has authored articles on data privacy, data management, statistical analysis, and general areas where law, statistics, and data analysis intersect. Jeremy also teaches a college-level course in beginning and intermediate statistics at California State University Los Angeles.

www.linkedin.com/in/jeremyguinta

Program Description

This two-part program presents my theory of data graphics and an introduction to machine learning. Both parts include hands-on learning and programming in R and RStudio.

Data Visuals

Everyone wants to be able to tell a story, and persuasive storytelling is an important part of an attorney’s role. This program is designed to teach proper data visualization techniques to tell a more powerful story. The program will cover the 4 Os (Observable, Original, Objective, and Open) of great graphics and how you can tell a better story using data visualizations. Specifically, the program will cover 1) how to use facts and information in pictorial form; 2) useful graphic techniques; 3) good and bad data visual methods, pitfalls, and other considerations when creating or analyzing a graph; and 4) the basics of using the ggplot2 package in R to make a great visual.
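As a rough illustration of the ggplot2 basics mentioned above, here is a minimal sketch using the built-in mtcars dataset. The choice of variables, labels, and theme is an assumption made for illustration and is not taken from the talk's script.

```r
# Minimal ggplot2 sketch (illustrative only; not the talk's actual code)
library(ggplot2)

# Map weight to the x-axis and fuel economy to the y-axis,
# then add a point layer, descriptive labels, and a clean theme.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Fuel economy falls as vehicle weight rises",
    x = "Weight (1,000 lbs)",
    y = "Miles per gallon"
  ) +
  theme_minimal()

# Draw the plot on the active graphics device
print(p)
```

A descriptive title and labeled axes with units are one way to keep a graphic readable without outside explanation.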

Machine Learning

Machine Learning is a series of highly sophisticated statistical techniques that are hard to understand and even harder to implement correctly and meaningfully. This program is designed to break open the black box of Machine Learning, so participants can understand the basics of telling their story based on the data and the model through visuals, setting up their data, developing a training and testing dataset, validating their model, and, most importantly, explaining the meaning of their model to a wider audience. This program will work through a very basic Machine Learning example using R.
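The workflow described above (set up the data, split it into training and testing sets, fit a model, validate it) can be sketched in base R as follows. This is a minimal sketch under stated assumptions: it uses mtcars, a 70/30 split, and a plain linear regression as a stand-in model, none of which is confirmed to match the example in the talk.

```r
# Illustrative train/test workflow in base R (assumed setup, not the talk's script)
set.seed(2021)  # arbitrary seed so the split is reproducible

# 1) Split the data: 70% training, 30% testing (assumed proportions)
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

# 2) Fit a simple model on the training data only
fit <- lm(mpg ~ wt + hp, data = train)

# 3) Validate on the held-out testing data using root mean squared error
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))

# 4) Inspect the coefficients to help explain the model to an audience
print(coef(fit))
print(rmse)
```

Scoring the model only on rows it never saw during fitting is what makes the error estimate a fair check of how the model generalizes.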

Data

mtcars - Our favorite built-in dataset from R

indexes - Stock price indices taken from publicly available sources for:

NYSE (https://finance.yahoo.com/quote/%5ENYA?p=^NYA&.tsrc=fin-srch)
Nasdaq (https://finance.yahoo.com/quote/%5EIXIC?p=^IXIC&.tsrc=fin-srch)
Standard & Poors Preferred Stock (https://finance.yahoo.com/quote/%5ESPPREF/)

miller_d - Stock price index for D preferred shares of Miller Energy (https://finance.yahoo.com/news/miller-energy-responds-notice-delisting-213036706.html)

UCI_Credit_Card.csv - Please see UCI_README.txt
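One possible way to load these datasets is sketched below. It assumes the CSV files from this repository (indexes.csv, millerD.csv, UCI_Credit_Card.csv) sit in the current working directory; their column layouts are not described here, so the sketch only inspects their structure.

```r
# Illustrative data loading (assumes the CSVs are in the working directory)

# mtcars ships with R, so it only needs to be attached
data(mtcars)
str(mtcars)

# Read each CSV if it is present; otherwise report that it is missing
csv_files <- c("indexes.csv", "millerD.csv", "UCI_Credit_Card.csv")
datasets <- list()
for (f in csv_files) {
  if (file.exists(f)) {
    datasets[[f]] <- read.csv(f, stringsAsFactors = FALSE)
    str(datasets[[f]])
  } else {
    message("File not found in working directory: ", f)
  }
}
```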

Software Requirements

R (https://mran.microsoft.com/download) or (https://cran.r-project.org/bin/windows/base/). I prefer the MRAN version because it is optimized for multi-threading, but it is not required for this program.
RStudio (https://rstudio.com/products/rstudio/download/)

Scripting

See 20210408_script.r for the R code behind all of the PPTX graphics and the machine learning walkthrough. The script auto-loads all of the required packages.
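For readers curious how a script can load its own packages, a common pattern is sketched below. The helper names `missing_packages` and `load_packages` are hypothetical, and the package list is a placeholder. The script itself may do this differently.

```r
# Illustrative "install if missing, then load" pattern (hypothetical helpers)

# Return the subset of package names that are not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install anything missing, then attach every package in the list
load_packages <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)  # needs a CRAN mirror and network access
  }
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Placeholder list: these two ship with R, so nothing gets installed here
load_packages(c("stats", "utils"))
```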
