The project aims to predict MBTI personality types based on text data using deep learning models. It leverages LSTMs, Bi-Directional LSTMs, and BERT to enhance classification accuracy.

JaspreetSingh-exe/Personality-Prediction-Using-Deep-Learning

Personality Prediction using Deep Learning

🚀 A deep-learning personality classification model that uses LSTMs, Bi-Directional LSTMs, and BERT to predict MBTI personality types from text data.


📌 Project Overview

This project classifies MBTI personality types based on text data. It is structured into three key steps:

  1. Data Visualization & Preprocessing – Cleaning and preparing text data.
  2. Model Training – Training LSTM, Bi-Directional LSTM, and BERT models.
  3. Model Evaluation – Comparing model accuracy, loss, and overall performance.

✔ LSTM Model – Sequential model with embeddings and LSTM layers
✔ Bi-Directional LSTM Model – Enhances sequence learning with bidirectional LSTMs
✔ BERT Model – Transformer-based NLP model for improved contextual understanding
✔ Performance Comparison – Evaluating all models based on accuracy and loss


🛠 Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com/JaspreetSingh-exe/Personality-Prediction-Using-Deep-Learning.git
cd Personality-Prediction-Using-Deep-Learning

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Run the Project

📊 Step 1: Data Visualization & Preprocessing

jupyter notebook data_visualization.ipynb

🏋️‍♂️ Step 2: Model Training

jupyter notebook training.ipynb

📈 Step 3: Model Evaluation & Comparison

jupyter notebook evaluate_model.ipynb

📂 Dataset

The dataset consists of user-written text labeled with Myers-Briggs Type Indicator (MBTI) personality types. Each entry contains a series of posts written by one user and that user's personality type. The dataset is preprocessed by:

  • Removing stopwords and special characters to clean text data.
  • Tokenizing and padding sequences for uniform input.
  • Splitting into training and testing sets for model evaluation.

The dataset helps train models to classify personality types based on textual inputs.
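
The preprocessing steps listed above can be sketched in plain Python. This is a minimal illustration, not the code from the notebooks: the function names, the tiny stopword list, and the choice to pad at the end of each sequence are all assumptions made for the example.

```python
import random
import re

# Tiny illustrative stopword list (an assumption; a real pipeline would use a fuller one).
STOPWORDS = {"a", "an", "the", "and", "or", "is", "are", "to", "of", "in", "i"}


def clean_text(text):
    """Lowercase, drop URLs and non-letter characters, then remove stopwords."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOPWORDS]


def build_vocab(token_lists, max_words=10000):
    """Map the most frequent tokens to ids starting at 1; id 0 is reserved for padding."""
    counts = {}
    for tokens in token_lists:
        for tok in tokens:
            counts[tok] = counts.get(tok, 0) + 1
    # Most frequent first; ties are broken alphabetically so the result is deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: i + 1 for i, tok in enumerate(ranked[: max_words - 1])}


def encode_and_pad(tokens, vocab, max_len):
    """Convert tokens to ids, skip unknown words, then truncate or zero-pad to max_len."""
    ids = [vocab[tok] for tok in tokens if tok in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out the last test_fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[: len(rows) - n_test], rows[len(rows) - n_test:]


# Toy usage with made-up posts
posts = ["I love the quiet of libraries!!! http://example.com", "Parties are the BEST :)"]
tokens = [clean_text(p) for p in posts]
vocab = build_vocab(tokens)
padded = [encode_and_pad(t, vocab, max_len=6) for t in tokens]
train_rows, test_rows = train_test_split(range(10))
```

With the example models in this README, `max_words` and `max_len` would correspond to the `input_dim=10000` and `input_length=1500` arguments of the embedding layer.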

📂 Project Structure

📦 Personality Prediction Using Deep Learning
├── data_visualization.ipynb      # Exploratory Data Analysis & Preprocessing
├── training.ipynb                # Model Training (LSTM, Bi-LSTM, BERT)
├── evaluate_model.ipynb          # Model Evaluation & Comparison
├── cleaned_mbti_data.csv         # Preprocessed dataset
├── README.md                     # Project Documentation
├── requirements.txt              # Dependencies List
├── model_comparison_results.csv  # Performance metrics
└── models/
    ├── lstm_model.h5             # Trained LSTM Model
    ├── bilstm_model.h5           # Trained Bi-LSTM Model
    └── bert_model.h5             # Trained BERT Model

🤖 Understanding LSTM, Bi-Directional LSTM & BERT

🔹 What is LSTM (Long Short-Term Memory)?

LSTM is a type of Recurrent Neural Network (RNN) that is well-suited for sequential data processing.

Example Code for LSTM:

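from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense, Dropout

model = Sequential([
    # Map each of the 10,000 vocabulary ids to a 256-dimensional vector;
    # every input is a sequence of 1,500 token ids
    Embedding(input_dim=10000, output_dim=256, input_length=1500),
    # Single LSTM layer with dropout on both inputs and recurrent connections
    LSTM(100, dropout=0.2, recurrent_dropout=0.2),
    # One output unit per MBTI type (16 classes)
    Dense(16, activation='softmax')
])
# categorical_crossentropy expects one-hot encoded labels
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])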

🔗 Paper Link
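
The example model above ends in a 16-unit softmax layer trained with categorical cross-entropy, so each MBTI label has to be turned into a 16-element one-hot vector. The sketch below shows one way to do that. The class order used here is an assumption for illustration; the notebooks may order the classes differently, and `one_hot` and `decode` are hypothetical helper names.

```python
from itertools import product

# The 16 MBTI types are every combination of the four letter pairs.
# This ordering is an assumption made for the example.
MBTI_TYPES = ["".join(letters) for letters in product("IE", "NS", "TF", "JP")]
TYPE_TO_INDEX = {mbti: i for i, mbti in enumerate(MBTI_TYPES)}


def one_hot(mbti_type):
    """Return a 16-element one-hot list for a type such as 'INTJ'."""
    vec = [0.0] * len(MBTI_TYPES)
    vec[TYPE_TO_INDEX[mbti_type.upper()]] = 1.0
    return vec


def decode(probabilities):
    """Map a 16-way softmax output back to the most likely MBTI type."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return MBTI_TYPES[best]


# Round trip: encode a label, then recover it from the vector
label_vector = one_hot("enfp")
recovered = decode(label_vector)
```

Whatever order is chosen, it must be the same at training and prediction time, otherwise decoded predictions will map to the wrong types.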

🔹 What is Bi-Directional LSTM?

A Bi-Directional LSTM (Bi-LSTM) processes input sequences forward and backward, improving context capture.

Example Code for Bi-Directional LSTM:

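from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, Embedding, Dense, Dropout

model = Sequential([
    Embedding(input_dim=10000, output_dim=256, input_length=1500),
    # First Bi-LSTM returns the full sequence so a second recurrent layer can follow
    Bidirectional(LSTM(100, return_sequences=True)),
    Dropout(0.3),
    # Second Bi-LSTM returns only the final forward and backward states
    Bidirectional(LSTM(50)),
    # One output unit per MBTI type (16 classes)
    Dense(16, activation='softmax')
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])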

🔗 Paper Link

🔹 What is BERT (Bidirectional Encoder Representations from Transformers)?

BERT is a transformer-based NLP model trained on large datasets that captures context from both left and right directions.

Example Code for BERT:

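from transformers import TFBertModel, AutoTokenizer
import tensorflow as tf

# bert-base-uncased has 512 position embeddings, so longer inputs must be truncated
MAX_LEN = 512

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert_layer = TFBertModel.from_pretrained("bert-base-uncased")

# Token ids produced by the tokenizer, padded or truncated to MAX_LEN
input_word_ids = tf.keras.layers.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_word_ids")
bert_outputs = bert_layer(input_word_ids)[0]  # last hidden state, one vector per token
# Classify from the [CLS] token representation at position 0
output = tf.keras.layers.Dense(16, activation="softmax")(bert_outputs[:, 0, :])

bert_model = tf.keras.models.Model(inputs=input_word_ids, outputs=output)
# A small learning rate is typical when fine-tuning a pre-trained model
bert_model.compile(loss="categorical_crossentropy",
                   optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001),
                   metrics=["accuracy"])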

🔗 Paper Link

🏆 Model Comparison

| Model | Accuracy |
|---|---|
| LSTM | 25.4% |
| Bi-Directional LSTM | 53.0% |
| BERT | 85.8% |
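
For reference, the two metrics used to compare the models (accuracy and categorical cross-entropy loss) can be computed as in the sketch below. It assumes integer class labels and one row of predicted probabilities per sample; the numbers are toy values for illustration, not results from this project.

```python
import math


def accuracy(y_true, y_prob):
    """Fraction of samples whose highest-probability class matches the label."""
    hits = 0
    for label, probs in zip(y_true, y_prob):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        hits += int(predicted == label)
    return hits / len(y_true)


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        # Clip at eps so a zero probability does not produce log(0)
        total -= math.log(max(probs[label], eps))
    return total / len(y_true)


# Toy example with 3 classes and 4 samples (made-up values)
y_true = [0, 1, 2, 1]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.5, 0.3],
]
acc = accuracy(y_true, y_prob)
loss = categorical_crossentropy(y_true, y_prob)
```

In the toy data, three of the four samples are classified correctly, so `acc` is 0.75. Loss also reflects how confident each prediction is, so two models with equal accuracy can still differ in loss.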

🤝 Contributing

Want to improve this project? Contributions are welcome!

  1. Fork the repo
  2. Create a new branch
  3. Submit a pull request

📜 License

This project is licensed under the MIT License.

📧 Contact

For queries, reach out to: ✉️ jaspreetsingh01110@gmail.com

