Descriptive analytics is a core component of the analysis life cycle in any data science project or research study. Data aggregation, summarization, and visualization are its main pillars. However, multi-dimensional datasets, typically those with more than two attributes, start causing problems because our medium of data analysis and communication is usually restricted to two dimensions. We will explore effective strategies for visualizing data in multiple dimensions (from 1-D up to 6-D) using a hands-on approach with Python and popular open-source visualization libraries such as matplotlib and seaborn. If time permits, we will also briefly cover R visualization libraries such as ggplot.
BONUS: We will also look at ways to visualize unstructured data with several dimensions including text, images and audio!
The talk is usually a 90-minute session, but we will cover it in the scheduled time by focusing on the main aspects of effective data visualization with the grammar of graphics, leveraging popular open-source frameworks in Python. If we have time to spare, we will also cover the bonus material on visualizing unstructured data, including text, audio and images.
Outline:
- Introduction
  - What is Data Visualization?
  - Why Data Visualization?
- Motivation
  - Why Effective Data Visualization
  - Effective Multi-dimensional Data Visualization
- Whirlwind tour of the grammar of graphics
- Visualization tools and frameworks
  - General tools & frameworks
  - Python visualization frameworks
  - R visualization frameworks
- Visualizing Structured Data
  - Univariate analysis and visualizations
  - Multivariate analysis and visualizations
  - Visualizing from 1-D up to 6-D
- BONUS: Visualizing Unstructured Data
  - Text
  - Images
  - Audio
- Final words
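
To give a flavor of the "1-D up to 6-D" part of the outline, here is a minimal sketch of one common way to show more than two attributes on a flat plot: mapping extra attributes to marker size and color. It is not taken from the talk material. The helper names (`scale`, `encode_channels`, `plot_4d`), the output file name, and the synthetic data are all assumptions made for illustration.

```python
import random


def scale(values, lo, hi):
    """Linearly rescale values into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid division by zero for constant columns
    return [lo + (v - vmin) / span * (hi - lo) for v in values]


def encode_channels(rows):
    """Map four numeric attributes of each row onto x, y, marker size, and color."""
    cols = list(zip(*rows))
    return {
        "x": list(cols[0]),            # first attribute  -> horizontal position
        "y": list(cols[1]),            # second attribute -> vertical position
        "s": scale(cols[2], 20, 200),  # third attribute  -> marker area
        "c": list(cols[3]),            # fourth attribute -> colormap value
    }


def plot_4d(channels, path="scatter_4d.png"):
    """Draw the encoded channels as a single scatter plot (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    points = ax.scatter(channels["x"], channels["y"], s=channels["s"],
                        c=channels["c"], cmap="viridis", alpha=0.7)
    fig.colorbar(points, ax=ax, label="attribute 4")
    ax.set_xlabel("attribute 1")
    ax.set_ylabel("attribute 2")
    fig.savefig(path)
    plt.close(fig)


# Synthetic rows of four made-up numeric attributes, for illustration only.
rng = random.Random(0)
rows = [tuple(rng.random() for _ in range(4)) for _ in range(50)]
channels = encode_channels(rows)
```

Calling `plot_4d(channels)` writes the figure to `scatter_4d.png`. Further attributes could be mapped to marker shape or to separate facets in the same spirit, though a plot becomes harder to read as more channels are added.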