This project is a simple Flask API that exposes an endpoint to anonymize text, with a particular focus on identifying and masking Personally Identifiable Information (PII) in Italian text using Microsoft Presidio.
This service exposes a single POST endpoint: /anonymize.
Request:
- Method: POST
- Path: /anonymize
- Headers: Content-Type: application/json
- Body: { "text": "String containing the text to be anonymized." }
Successful Response (200 OK):
- Headers: Content-Type: application/json
- Body: { "text": "String containing the anonymized text." }
Error Responses:
- 400 Bad Request: Invalid JSON or missing text field.
- 500 Internal Server Error: Internal processing error.
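The request/response contract above can be sketched as a framework-independent function. This is only an illustration, not the code in app.py: the name handle_anonymize, the injected anonymize callable, and the shape of the error bodies (an "error" key) are assumptions made for the example.

```python
import json
from typing import Callable, Tuple


def handle_anonymize(raw_body: bytes, anonymize: Callable[[str], str]) -> Tuple[int, dict]:
    """Map a raw request body to (status_code, json_body) following the contract above."""
    try:
        payload = json.loads(raw_body)
    except (ValueError, UnicodeDecodeError):
        # Body is not valid JSON.
        return 400, {"error": "Invalid JSON"}

    # The body must be a JSON object with a string "text" field.
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        return 400, {"error": "Missing 'text' field"}

    try:
        return 200, {"text": anonymize(payload["text"])}
    except Exception:
        # Any failure inside the anonymizer is reported as a generic 500.
        return 500, {"error": "Internal processing error"}
```

Passing the anonymizer in as a callable keeps the validation logic testable without Presidio installed; in a Flask view, the returned status and dictionary would be handed to jsonify.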
- Python 3.8+
- Flask: Micro web framework for creating the API.
- Presidio Analyzer: For PII detection.
- Presidio Anonymizer: For PII anonymization/masking.
- spaCy: NLP library used by Presidio for entity recognition (specifically with the Italian model it_core_news_lg).
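As a rough sketch of how these components fit together, the snippet below wires Presidio to the Italian spaCy model and runs detection followed by anonymization. It is not the contents of presidio_logic.py: the function names build_nlp_configuration and anonymize_italian are hypothetical, and the example assumes presidio-analyzer, presidio-anonymizer, and the it_core_news_lg model are already installed.

```python
def build_nlp_configuration(lang_code: str = "it", model_name: str = "it_core_news_lg") -> dict:
    """Return a spaCy NLP engine configuration in the form Presidio's NlpEngineProvider accepts."""
    return {
        "nlp_engine_name": "spacy",
        "models": [{"lang_code": lang_code, "model_name": model_name}],
    }


def anonymize_italian(text: str) -> str:
    """Detect PII in Italian text and replace it (hypothetical helper, assumes Presidio is installed)."""
    # Imported lazily so the configuration helper works without Presidio present.
    from presidio_analyzer import AnalyzerEngine
    from presidio_analyzer.nlp_engine import NlpEngineProvider
    from presidio_anonymizer import AnonymizerEngine

    # Build a spaCy-backed NLP engine for Italian.
    nlp_engine = NlpEngineProvider(nlp_configuration=build_nlp_configuration()).create_engine()
    analyzer = AnalyzerEngine(nlp_engine=nlp_engine, supported_languages=["it"])

    # Step 1: detect entities. Step 2: replace them using the default operators.
    results = analyzer.analyze(text=text, language="it")
    return AnonymizerEngine().anonymize(text=text, analyzer_results=results).text
```

For brevity the engines are created on every call; a long-running service would more likely build them once at startup, since loading the spaCy model is slow.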
- Python 3.8+ and pip
- git (for cloning)
- Clone the repository:
  git clone https://github.com/<YOUR_GITHUB_USER>/<YOUR_REPO_NAME>.git
  cd <YOUR_REPO_NAME>
- Create and activate a virtual environment:
  python3 -m venv venv
  # On Windows:
  # venv\Scripts\activate
  # On macOS/Linux:
  source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
Start the Flask development server:
python3 app.py
The application will typically be available at http://127.0.0.1:3000/.
- app.py: Flask application entry point, API endpoint definition.
- presidio_logic.py: Core Presidio setup, custom recognizers, and anonymization functions.
- requirements.txt: Python dependencies.
- venv/: Virtual environment directory (usually gitignored).
- README.md: This file.
As described in "Run the Application" above:
python app.py
It's recommended to use a virtual environment (venv) to isolate each project's dependencies. Tools like pyenv can also be used for managing multiple Python installations.
While this template doesn't include specific unit tests, you would typically use a framework like pytest or Python's built-in unittest module.
To run unit tests (example using pytest):
- Install pytest: pip install pytest
- Create test files (e.g., test_presidio_logic.py, test_app.py) in a tests/ directory.
- Run tests: pytest
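A test file in tests/ might look roughly like the sketch below. Because the actual function names in presidio_logic.py are not shown here, the example tests a hypothetical stand-in, mask_emails, which uses a simple regular expression instead of Presidio; in a real test you would import the project's own anonymization function in its place.

```python
import re

# Simplified e-mail pattern, used only by the stand-in below.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_emails(text: str) -> str:
    """Hypothetical stand-in for an anonymization function: masks e-mail addresses."""
    return EMAIL_PATTERN.sub("<EMAIL_ADDRESS>", text)


def test_email_is_masked():
    assert mask_emails("Scrivere a mario.rossi@example.com") == "Scrivere a <EMAIL_ADDRESS>"


def test_text_without_pii_is_unchanged():
    assert mask_emails("Nessun dato personale qui.") == "Nessun dato personale qui."
```

pytest collects any function whose name starts with test_ in files named test_*.py, so no extra registration is needed.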
Use tools like curl, Postman, or Insomnia to send POST requests to the /anonymize endpoint.
Example using curl:
curl -X POST \
http://127.0.0.1:3000/anonymize \
-H 'Content-Type: application/json' \
-d '{
"text": "Il signor Mario Rossi vive in Via Roma 123. Contattare a [email protected]"
}'
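The same request can also be sent from Python using only the standard library. This is a minimal sketch: the helper names build_anonymize_request and anonymize_via_api are made up for illustration, and the base URL assumes the development server is running on its default address shown earlier.

```python
import json
import urllib.request


def build_anonymize_request(text: str, base_url: str = "http://127.0.0.1:3000") -> urllib.request.Request:
    """Build (but do not send) a POST request for the /anonymize endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/anonymize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def anonymize_via_api(text: str) -> str:
    """Send the request to a running server and return the anonymized text."""
    with urllib.request.urlopen(build_anonymize_request(text)) as response:
        return json.loads(response.read())["text"]
```

Calling anonymize_via_api("Il signor Mario Rossi vive in Via Roma 123.") requires the Flask server to be running; urlopen raises urllib.error.HTTPError for the 400 and 500 responses.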