Enhance xGitGuard Scanner with BERT Model for Advanced Secret Detection #34

Open
@radhi1991


Details:
Transformer-based models are better suited to this problem because they capture the context around each line of code. Random forest models generally do not perform well on high-dimensional data and are better suited to non-sequential inputs. Source code is sequential, and for sequential data the proposed transformer models work better than the existing models.
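
To make the point about context concrete, here is a minimal sketch of how a candidate line could be packaged together with its neighbouring lines before being handed to a sequence model. The helper name `context_window`, the `radius` parameter, and the sample snippet are all hypothetical and are not part of xGitGuard.

```python
from typing import List


def context_window(lines: List[str], index: int, radius: int = 2) -> str:
    """Join the candidate line with up to `radius` lines on each side.

    Hypothetical helper: a sequence model would receive this text
    instead of the isolated candidate line.
    """
    start = max(0, index - radius)
    end = min(len(lines), index + radius + 1)
    return "\n".join(line.strip() for line in lines[start:end])


# Made-up sample file contents, for illustration only.
source = [
    "import requests",
    "# staging credentials, rotate monthly",
    'token = "a8f3k2m9x7q1"',
    'headers = {"Authorization": "Bearer " + token}',
    "requests.get(url, headers=headers)",
]

# Candidate at index 2, with one line of context on each side.
window = context_window(source, index=2, radius=1)
print(window)
```

The comment above the assignment and the `Authorization` header below it are signals that are lost when each line is scored in isolation.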

The solution:
We propose to enhance the xGitGuard scanner by integrating a BERT model specifically trained for secret detection.

The steps include:

  1. Training and building models using BERT:
    Develop machine learning models for secret detection based on the BERT architecture.

  2. Integrating BERT into scanners:
    Seamlessly integrate the trained BERT model into the xGitGuard scanner, enhancing its ability to detect sensitive information with higher accuracy.
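
One possible shape for the integration step is sketched below, assuming the scanner calls the model through a plain scoring function. Every name here (`Scorer`, `Detection`, `scan_lines`, `placeholder_scorer`) is hypothetical and does not come from the xGitGuard codebase. The placeholder scorer is a trivial keyword rule that only stands in for a model so the example runs without dependencies. In a real setup it would be replaced by a wrapper around a fine-tuned BERT sequence classifier.

```python
from dataclasses import dataclass
from typing import Callable, List

# A scorer maps a text snippet to a probability-like score in [0, 1].
# Assumption: a fine-tuned BERT classifier would be wrapped behind this type.
Scorer = Callable[[str], float]


@dataclass
class Detection:
    line_number: int  # 1-based line number in the scanned text
    text: str         # the flagged line, stripped of surrounding whitespace
    score: float      # score returned by the scorer


def scan_lines(lines: List[str], scorer: Scorer, threshold: float = 0.5) -> List[Detection]:
    """Score each line and keep those at or above the threshold."""
    detections = []
    for number, line in enumerate(lines, start=1):
        score = scorer(line)
        if score >= threshold:
            detections.append(Detection(number, line.strip(), score))
    return detections


def placeholder_scorer(text: str) -> float:
    """Stand-in for a trained model; NOT a real secret detector."""
    lowered = text.lower()
    has_keyword = "token" in lowered or "password" in lowered
    return 0.9 if has_keyword and '"' in text else 0.1


# Made-up input, for illustration only.
sample = [
    "user = get_user()",
    'password = "hunter2-example"',
    "print(user)",
]
results = scan_lines(sample, placeholder_scorer)
for detection in results:
    print(detection.line_number, detection.text, detection.score)
```

Keeping the model behind a function type means the existing scanner logic would not need to know which model produces the score, so the current models and a BERT model could be swapped or compared with the same calling code.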

Alternatives:
Other pre-trained models, such as PaLM, Gemini, or any of the GPT models, could be used instead.

Additional context:
This approach requires a considerable amount of training data.

Labels: enhancement, help wanted