Smart Market AI ingests influencer, campaign, and social-performance datasets, enriches them with NLP and embedding pipelines, and outputs clustering and visualization artifacts for go-to-market (GTM) teams. The repo captures the full workflow, from scraping raw data to publishing dashboards and documentation.
- Data Fetching/ – scripts and configs for downloading influencer + competitor feeds.
- Processing Data/ – cleaning utilities, feature builders, and similarity calculations.
- Analysis Module/ – notebooks, clustering experiments, and result CSVs.
- Visualization/ – Plotly/Matplotlib exports and storytelling assets.
- Demonstration/ – walkthrough notebooks + presentation-ready summaries.
- Create/activate an environment and install dependencies:
  ```powershell
  conda create -n smart-market python=3.10 pandas numpy scikit-learn plotly
  conda activate smart-market
  pip install sentence-transformers seaborn openpyxl
  ```
- Place API tokens or brand credentials inside .env (if a connector requires it).
- Fetch the latest datasets via the scripts in Data Fetching/ (most are Jupyter notebooks or .py scripts with documented parameters).
- Execute preprocessing notebooks in Processing Data/ to regenerate the cleaned CSVs consumed by clustering modules.
- Run the analysis notebooks inside Analysis Module/Clustering*/ to produce updated similarity matrices and export them to Analysis Module/Data/.
- Refresh dashboards by re-running the notebooks or using the assets under Visualization/.
- Track large CSVs with Git LFS if they exceed GitHub’s 100 MB limit.
- When updating notebooks, strip outputs (jupyter nbconvert --clear-output) before pushing updates to keep diffs readable.
- Late-window enhancements (2021–2022) should emphasize transformer upgrades, influencer segmentation, and visualization publishing.