I'm a data scientist and startup founder with a Master of Business Administration (MBA) and 6+ years of experience delivering end-to-end data analysis and infrastructure projects for multiple clients. I have a deep industry background in big data, machine learning systems, NLP, and statistics.
I am passionate about building platforms that deliver insight, support decision-making, and solve business problems through cognitive computing and AI.
Currently, I am working on a duplicate bug report detection system in Python that uses multiple NLP packages and three models (TF-IDF, Word2Vec, and BM25F).
- Duplicate Bug Report Detection System (NLP-Based Recommendation Engine)
- Default Risk Predictive Modeling (Logistic Regression & Light Gradient Boosting)
- Demographic Customer Segmentation of VAS Subscribers (K-means & AdaBoost)
- SMS Content Analysis and Classification (NLP-Based Project)
This project aims to propose effective unsupervised and supervised models for detecting duplicate bug reports in the Bugzilla repository. The search engine finds the top-N reports most similar to a given report, which helps deduplicate issues faster. It also presents an analytical dashboard that helps developers understand bug report statistics and the major sources of bugs.
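As a rough illustration of the top-N retrieval idea, here is a minimal sketch that ranks reports by TF-IDF cosine similarity using only the Python standard library. It is not the project's actual implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `top_n_similar`), the whitespace tokenizer, the smoothed IDF formula, and the sample report texts are all assumptions made for the example, and it covers only TF-IDF, not Word2Vec or BM25F.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would clean text further.
    return text.lower().split()


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict of term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Smoothed IDF (an assumption for this sketch) keeps every weight positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_similar(reports, query_index, n=3):
    # Return (index, score) pairs for the n reports most similar to the query.
    vectors = tfidf_vectors(reports)
    query = vectors[query_index]
    scored = [
        (i, cosine(query, v)) for i, v in enumerate(vectors) if i != query_index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Made-up bug report summaries, used only to exercise the sketch.
reports = [
    "browser crashes when opening a new tab",
    "crash on opening new tab in browser",
    "bookmark toolbar icons are missing",
    "printing a page produces blank output",
]
ranked = top_n_similar(reports, query_index=0, n=2)
for index, score in ranked:
    print(index, round(score, 3), reports[index])
```

With these sample reports, the second report ranks first because it shares the most terms with the query. In practice, a library vectorizer and stemming (so that "crash" and "crashes" match) would likely improve the ranking.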
This project aims to build a default risk predictive model to assess the creditworthiness of loan applicants.