- 👋 I’m Charles-Meldhine Madi Mnemoi. I am a Data Scientist in Co-op by day and a full-stack developer for eMush by night.
- 🛠️ Skills
- proficient in Python and SQL for data analysis (Pandas, Matplotlib, Seaborn, Plotly), machine learning (Scikit-learn, PyTorch) and API development (FastAPI);
- familiar with DevOps/MLOps (Docker, CI/CD with GitHub Actions and GitLab CI, unit testing with pytest), GCP (BigQuery, Cloud Run, Vertex AI), generative AI (LangChain, Haystack), vector databases (Chroma, Weaviate, pgvector), front-end development (Vue.js, React.js) and agile development methods (Scrum, Kanban);
- familiar with Infrastructure as Code with Terraform.
- 📫 Reach me by email or LinkedIn
Below are some projects I've worked on.
Stack: Python (FastAPI, pytest), TypeScript (React.js), Haystack, PostgreSQL, Docker, Terraform, GitHub Actions
A web application for interacting with a RAG-based chatbot that answers questions about SightCall using content from its website.
Stack: Python (FastAPI, pytest), TypeScript (Vue.js), Chroma DB, Docker
A chatbot web application that answers questions about eMush using Retrieval-Augmented Generation (RAG) over curated documents.
cmnemoi-learn is a Python package which reimplements machine learning algorithms from scratch (using only numpy) with high-quality development practices:
- unit testing with pytest
- code quality checking with black, pylint and mypy
- CI/CD pipeline with GitHub Actions to version and publish the package automatically to PyPI
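As an illustration of the "from scratch with numpy, tested with pytest" approach, here is a minimal sketch of an ordinary least squares regressor and a pytest-style test. The class name, attribute names and test are hypothetical and are not taken from the package's actual API.

```python
import numpy as np


class LinearRegression:
    """Ordinary least squares using NumPy only (illustrative, not the package's API)."""

    def fit(self, X: np.ndarray, y: np.ndarray) -> "LinearRegression":
        # Prepend a column of ones so the intercept is learned with the weights.
        X_b = np.column_stack([np.ones(len(X)), X])
        # lstsq minimizes ||X_b @ w - y||^2 without explicitly inverting X_b.T @ X_b.
        weights, *_ = np.linalg.lstsq(X_b, y, rcond=None)
        self.intercept_ = weights[0]
        self.coef_ = weights[1:]
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return X @ self.coef_ + self.intercept_


def test_fit_recovers_known_line() -> None:
    # pytest-style unit test on noiseless data generated from y = 2x + 1.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = 2.0 * X[:, 0] + 1.0
    model = LinearRegression().fit(X, y)
    assert np.allclose(model.coef_, [2.0])
    assert np.isclose(model.intercept_, 1.0)
```

Saved in a file such as `test_linear_regression.py` (a hypothetical name), the test would be collected and run by `pytest`.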
Stack: PHP 8.3 (Symfony 6.4, PHPUnit, Codeception), Vue.js 3, PostgreSQL, GitLab, Docker, GitLab CI
eMush is an open source remake of Mush: the greatest space opera epic of Humanity, directly in your browser!
I have been a full-stack developer on the project since July 2022.
KPIs:
- 2000+ users (150+ daily)
- contribution to 100,000+ lines of code
Missions:
- feature development, bugfixes and testing
- enhancement of CI pipelines
- implementing good practices (TDD, BDD, Clean Architecture)
- participation in discussions on project direction and features to be developed
- writing monthly news and patch notes
- running alpha tests
- LiveSplit autosplitters (https://github.com/cmnemoi/NuclearBlazeAutoSplitter)
- refactoring.guru
- srcomapi
- visions
- V tensor library
I built the projects below when I was starting out in Data Science and software engineering; they deserve a reboot now...
Stack: Python (Pandas, Seaborn, Streamlit, scikit-learn, pytest), GCP, GitHub Actions
Data Science project for Lille's Bachelor of Economics, built around the Kaggle competition New York City Taxi Fare Prediction.
- Developed a web application that estimates the price of a ride within a $1.4 range
- Cleaned and analyzed a dataset of 340,000+ rows, removing outliers and reducing noise through normalization
- Created new variables based on ride duration and destinations
- Built the web application using Streamlit
- Set up a code-quality CI pipeline with Git hooks and GitHub Actions (linting with Ruff, testing with pytest)
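The cleaning and feature-creation steps above could look roughly like the sketch below. It is not the project's actual code: the great-circle distance feature, the field names and the outlier thresholds are all assumptions chosen for illustration, and it uses plain Python rather than Pandas to stay dependency-free.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_distance(ride: dict) -> dict:
    # New variable derived from pickup and dropoff coordinates (assumed field names).
    distance = haversine_km(
        ride["pickup_latitude"], ride["pickup_longitude"],
        ride["dropoff_latitude"], ride["dropoff_longitude"],
    )
    return {**ride, "distance_km": distance}


def is_plausible(ride: dict) -> bool:
    # Hypothetical thresholds: drop non-positive or extreme fares and distances.
    return 0 < ride["fare_amount"] <= 200 and 0 < ride["distance_km"] <= 100


def clean(rides: list[dict]) -> list[dict]:
    """Add the distance feature, then keep only rides that pass the outlier filter."""
    enriched = [add_distance(ride) for ride in rides]
    return [ride for ride in enriched if is_plausible(ride)]
```

Calling `clean(rides)` on a list of ride dictionaries returns the enriched rides that pass the filter, ready to be fed to a model.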