7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -2,7 +2,14 @@

## Next

* [feature] 🌟 File multi-column anonymizer, inject sample rows in database from a CSV file.
* [feature] 🌟 File enum anonymizer, inject samples in database from a plain text or CSV file.
* [feature] 🌟 String pattern anonymizer, build complex strings by fetching values from other anonymizers.
* [internal] introduce anonymizer context for carrying environment configuration to anonymizers (#235).
* [bc] Salt in `AbstractAnonymizer::$option->get('salt')` is now in `AbstractAnonymizer::$context->salt` (#235).
* [bc] `AbstractAnonymizer::__construct()` now expects an additional `$context` parameter (#235).
* [bc] `Anonymizator::__construct()` `$salt` parameter was removed (#235).
* [fix] Some minor PHP 8.4 deprecations.

## 2.0.3

3 changes: 3 additions & 0 deletions docs/content/anonymization/core-anonymizers.md
@@ -14,8 +14,11 @@ This page list all *Anonymizers* provided by *DbToolsBundle*.
<!--@include: ./core-anonymizers/md5.md-->
<!--@include: ./core-anonymizers/string.md-->
<!--@include: ./core-anonymizers/pattern.md-->
<!--@include: ./core-anonymizers/file-enum.md-->
<!--@include: ./core-anonymizers/file-column.md-->
<!--@include: ./core-anonymizers/lastname.md-->
<!--@include: ./core-anonymizers/firstname.md-->
<!--@include: ./core-anonymizers/lorem-ipsum.md-->
<!--@include: ./core-anonymizers/address.md-->
<!--@include: ./core-anonymizers/iban-bic.md-->
<!--@include: ./core-anonymizers/file-resolution.md-->
7 changes: 7 additions & 0 deletions docs/content/anonymization/core-anonymizers/address.md
@@ -95,6 +95,13 @@ customer:
#...
```
:::

:::warning
This anonymizer works at the *table level* which means that the PHP attribute
cannot target object properties: you must specify table column names and not
PHP class property names.
:::

@@@

:::tip
125 changes: 125 additions & 0 deletions docs/content/anonymization/core-anonymizers/file-column.md
@@ -0,0 +1,125 @@
## File multiple column

This anonymizer will anonymize multiple columns at once using value rows from an
input file. As of now, only CSV files are supported.

This anonymizer behaves like any other multiple column anonymizer and allows you
to arbitrarily map any sample column to any database table column using the
anonymizer options.

Given the following file:

```txt
Number,Foo,Animal
1,foo,cat
2,bar,dog
3,baz,girafe
```

Then:

@@@ standalone docker

```yaml [YAML]
# db_tools.config.yaml
anonymization:
    default:
        customer:
            my_data:
                anonymizer: file_column
                options:
                    source: ./resources/my_data.csv
                    # Define your CSV file column names.
                    columns: [number, foo, animal]
                    # Other allowed options.
                    file_skip_header: true
                    # Now your columns, keys are CSV column names
                    # you set upper, values are your database column
                    # names.
                    number: my_integer_column
                    foo: my_foo_column
                    animal: my_animal_column
            #...
```

@@@
@@@ symfony

::: code-group
```php [Attribute]
namespace App\Entity;

use Doctrine\ORM\Mapping as ORM;
use MakinaCorpus\DbToolsBundle\Attribute\Anonymize;

#[ORM\Entity()]
#[ORM\Table(name: 'customer')]
#[Anonymize(type: 'file_column', options: [ // [!code ++]
    'source' => './resources/my_data.csv', // [!code ++]
    // Define your CSV file column names. // [!code ++]
    'columns' => ['number', 'foo', 'animal'], // [!code ++]
    // Other allowed options. // [!code ++]
    'file_skip_header' => true, // [!code ++]
    // Now your columns, keys are CSV column names // [!code ++]
    // you set upper, values are your database column // [!code ++]
    // names. // [!code ++]
    'number' => 'my_integer_column', // [!code ++]
    'foo' => 'my_foo_column', // [!code ++]
    'animal' => 'my_animal_column', // [!code ++]
])] // [!code ++]
class Customer
{
    // ...

    #[ORM\Column(length: 255)]
    private ?string $myNumber = null;

    #[ORM\Column(length: 255)]
    private ?string $myFoo = null;

    #[ORM\Column(length: 255)]
    private ?string $myAnimal = null;

    // ...
}
```

```yaml [YAML]
# config/anonymization.yaml
customer:
    my_data:
        anonymizer: file_column
        options:
            source: ./resources/my_data.csv
            # Define your CSV file column names.
            columns: [number, foo, animal]
            # Other allowed options.
            file_skip_header: true
            # Now your columns, keys are CSV column names
            # you set upper, values are your database column
            # names.
            number: my_integer_column
            foo: my_foo_column
            animal: my_animal_column
    #...
```
:::

:::warning
This anonymizer works at the *table level* which means that the PHP attribute
cannot target object properties: you must specify table column names and not
PHP class property names.
:::

@@@

When parsing a file, you can set the following options as well:
- `file_csv_enclosure`: if file is a CSV, use this as the enclosure character (default is `'"'`).
- `file_csv_escape`: if file is a CSV, use this as the escape character (default is `'\\'`).
- `file_csv_separator`: if file is a CSV, use this as the separator character (default is `','`).
- `file_skip_header`: when reading any file, set this to true to skip the first line (default is `false`).

:::tip
The filename can be absolute or relative. For relative file resolution,
please see [*File name resolution*](#file-name-resolution).
:::
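
To make the effect of these options more concrete, here is a minimal sketch of how a CSV sample could be parsed and mapped to database columns. It is not the bundle's implementation: `sketchReadCsvRows()` and `sketchMapRow()` are hypothetical helpers written for illustration, and only the option names and sample data come from this page.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of DbToolsBundle: parses CSV text and
 * returns one associative row per line, keyed by the given column names.
 */
function sketchReadCsvRows(string $contents, array $columns, array $options = []): array
{
    // Defaults mirror the ones documented above.
    $separator = $options['file_csv_separator'] ?? ',';
    $enclosure = $options['file_csv_enclosure'] ?? '"';
    $escape = $options['file_csv_escape'] ?? '\\';
    $skipHeader = $options['file_skip_header'] ?? false;

    $lines = \preg_split('/\R/', \trim($contents));
    if ($skipHeader) {
        // Drop the first line when it only holds column titles.
        \array_shift($lines);
    }

    $rows = [];
    foreach ($lines as $line) {
        $values = \str_getcsv($line, $separator, $enclosure, $escape);
        // Name each value after the configured CSV column at the same position.
        $rows[] = \array_combine($columns, $values);
    }

    return $rows;
}

/**
 * Hypothetical helper: renames CSV column keys to database column names.
 */
function sketchMapRow(array $row, array $mapping): array
{
    $mapped = [];
    foreach ($mapping as $csvColumn => $databaseColumn) {
        $mapped[$databaseColumn] = $row[$csvColumn];
    }

    return $mapped;
}

$csv = "Number,Foo,Animal\n1,foo,cat\n2,bar,dog\n3,baz,girafe\n";

$rows = sketchReadCsvRows($csv, ['number', 'foo', 'animal'], ['file_skip_header' => true]);

$sample = sketchMapRow($rows[0], [
    'number' => 'my_integer_column',
    'foo' => 'my_foo_column',
    'animal' => 'my_animal_column',
]);
```

With `file_skip_header` set to `true`, the title line is ignored and `$sample` holds the first data row keyed by database column names. Without it, the title line would be treated as a regular sample row.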
79 changes: 79 additions & 0 deletions docs/content/anonymization/core-anonymizers/file-enum.md
@@ -0,0 +1,79 @@
## File enum

This anonymizer will fill the configured column with a random value picked from a given
sample, loaded from a plain text or a CSV file.

Given the following file:

```txt
none
bad
good
expert
```

Then:

@@@ standalone docker

```yaml [YAML]
# db_tools.config.yaml
anonymization:
    default:
        customer:
            level:
                anonymizer: file_enum
                options: {source: ./resources/levels.txt}
            #...
```

@@@
@@@ symfony

::: code-group
```php [Attribute]
namespace App\Entity;

use Doctrine\ORM\Mapping as ORM;
use MakinaCorpus\DbToolsBundle\Attribute\Anonymize;

#[ORM\Entity()]
#[ORM\Table(name: 'customer')]
class Customer
{
    // ...

    #[ORM\Column(length: 255)]
    #[Anonymize(type: 'file_enum', options: ['source' => "./resources/levels.txt"])] // [!code ++]
    private ?string $level = null;

    // ...
}
```

```yaml [YAML]
# config/anonymization.yaml
customer:
    level:
        anonymizer: file_enum
        options: {source: ./resources/levels.txt}
    #...
```
:::

@@@

File will be read this way:
- When using a plain text file, each line is a value, no matter what's inside.
- When using a CSV file, the first column will be used instead.
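
The two reading rules above can be sketched as follows. This is an illustration only, not the bundle's code: `sketchLoadEnumValues()` is a hypothetical helper, and the explicit `$isCsv` flag is an assumption made for the sketch, since this page does not say how the file type is detected.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of DbToolsBundle: extracts the candidate
 * values from the contents of a sample file.
 */
function sketchLoadEnumValues(string $contents, bool $isCsv, bool $skipHeader = false): array
{
    $lines = \preg_split('/\R/', \trim($contents));
    if ($skipHeader) {
        \array_shift($lines);
    }

    if (!$isCsv) {
        // Plain text: each line is a value, whatever it contains.
        return $lines;
    }

    $values = [];
    foreach ($lines as $line) {
        // CSV: only the first column is kept.
        $values[] = \str_getcsv($line, ',', '"', '\\')[0];
    }

    return $values;
}

$fromText = sketchLoadEnumValues("none\nbad\ngood\nexpert\n", false);
$withComma = sketchLoadEnumValues("good, really\nbad\n", false);
$fromCsv = sketchLoadEnumValues("level,rank\nnone,0\nbad,1\n", true, true);

// Pick one random value, as the anonymizer would do for each row.
$picked = $fromText[\array_rand($fromText)];
```

Note that in plain text mode a line such as `good, really` stays a single value, whereas in CSV mode everything after the first separator is ignored.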

When parsing a file, you can set the following options as well:
- `file_csv_enclosure`: if file is a CSV, use this as the enclosure character (default is `'"'`).
- `file_csv_escape`: if file is a CSV, use this as the escape character (default is `'\\'`).
- `file_csv_separator`: if file is a CSV, use this as the separator character (default is `','`).
- `file_skip_header`: when reading any file, set this to true to skip the first line (default is `false`).

:::tip
The filename can be absolute or relative. For relative file resolution,
please see [*File name resolution*](#file-name-resolution).
:::
39 changes: 39 additions & 0 deletions docs/content/anonymization/core-anonymizers/file-resolution.md
@@ -0,0 +1,39 @@
## File name resolution

Relative file names can be configured in various places in order to load data.
This section explains how they are resolved.
**All relative file names will be considered relative to a given _base path_.**

The default base path is always stable but depends upon your selected flavor.

@todo examples

@@@ symfony

When parsing Symfony configuration, the base path will always be the project
directory, known as the `%kernel.project_dir%` parameter in Symfony configuration.
This is the directory where your `composer.json` file is located.

@todo examples

@@@
@@@ laravel

When parsing Laravel configuration, the base path will always be the project
directory, as returned by the `base_path()` Laravel function.

@todo examples

@@@
@@@ standalone docker

When parsing configuration in the standalone CLI version or in a Docker context,
the base path will be the directory of the YAML file currently being parsed.

:::tip
If you set the `workdir` option in your configuration file, then its value will
override the file directory and be used as the base path.

@todo link to `workdir` documentation
:::
@@@
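
As a rough illustration of the rule described in this section, the sketch below resolves a file name against a base path. `sketchResolveFilename()` is a hypothetical helper, not the bundle's API, and `/var/www/my_project` is an example base path chosen for the sketch.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of DbToolsBundle: returns absolute file
 * names unchanged and prefixes relative ones with the base path.
 */
function sketchResolveFilename(string $filename, string $basePath): string
{
    // Unix absolute path, or Windows drive-letter path such as "C:\data".
    if (\str_starts_with($filename, '/') || 1 === \preg_match('@^[A-Za-z]:[\\\\/]@', $filename)) {
        return $filename;
    }

    // Drop a leading "./" so the result stays readable.
    if (\str_starts_with($filename, './')) {
        $filename = \substr($filename, 2);
    }

    return \rtrim($basePath, '/') . '/' . $filename;
}

// Assumed base path, for instance a Symfony project directory.
$basePath = '/var/www/my_project';

$relative = sketchResolveFilename('./resources/my_data.csv', $basePath);
$absolute = sketchResolveFilename('/data/my_data.csv', $basePath);
```

Here `$relative` ends up under the base path while `$absolute` is left untouched, which is the behaviour you should expect whichever flavor provides the base path.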
6 changes: 6 additions & 0 deletions docs/content/anonymization/core-anonymizers/iban-bic.md
@@ -74,4 +74,10 @@ customer:
```
:::

:::warning
This anonymizer works at the *table level* which means that the PHP attribute
cannot target object properties: you must specify table column names and not
PHP class property names.
:::

@@@
22 changes: 11 additions & 11 deletions src/Anonymization/Anonymizator.php
@@ -6,6 +6,7 @@

 use MakinaCorpus\DbToolsBundle\Anonymization\Anonymizer\AbstractAnonymizer;
 use MakinaCorpus\DbToolsBundle\Anonymization\Anonymizer\AnonymizerRegistry;
+use MakinaCorpus\DbToolsBundle\Anonymization\Anonymizer\Context;
 use MakinaCorpus\DbToolsBundle\Anonymization\Config\AnonymizationConfig;
 use MakinaCorpus\DbToolsBundle\Anonymization\Config\AnonymizerConfig;
 use MakinaCorpus\DbToolsBundle\Helper\Format;
@@ -42,15 +43,17 @@ class Anonymizator implements LoggerAwareInterface
     ];

     private OutputInterface $output;
+    private readonly Context $defaultContext;

     public function __construct(
         private DatabaseSession $databaseSession,
         private AnonymizerRegistry $anonymizerRegistry,
         private AnonymizationConfig $anonymizationConfig,
-        private ?string $salt = null,
+        ?Context $defaultContext = null,
     ) {
         $this->logger = new NullLogger();
         $this->output = new NullOutput();
+        $this->defaultContext = $defaultContext ?? new Context();
     }

     /**
@@ -71,25 +74,21 @@ public function setOutput(OutputInterface $output): self
         return $this;
     }

+    #[\Deprecated(message: "Will be removed in 3.0, use Context::generateRandomSalt() instead.", since: "2.1.0")]
     public static function generateRandomSalt(): string
     {
-        return \base64_encode(\random_bytes(12));
-    }
-
-    protected function getSalt(): string
-    {
-        return $this->salt ??= self::generateRandomSalt();
+        return Context::generateRandomSalt();
     }

     /**
      * Create anonymizer instance.
      */
-    protected function createAnonymizer(AnonymizerConfig $config): AbstractAnonymizer
+    protected function createAnonymizer(AnonymizerConfig $config, Context $context): AbstractAnonymizer
     {
         return $this->anonymizerRegistry->createAnonymizer(
             $config->anonymizer,
             $config,
-            $config->options->with(['salt' => $this->getSalt()]),
+            $context,
             $this->databaseSession
         );
     }
@@ -127,6 +126,7 @@ public function anonymize(
         }

         $plan = [];
+        $context = clone $this->defaultContext;

         if ($onlyTargets) {
             foreach ($onlyTargets as $targetString) {
@@ -160,7 +160,7 @@ public function anonymize(
         foreach ($plan as $table => $targets) {
             $anonymizers[$table] = [];
             foreach ($this->anonymizationConfig->getTableConfig($table, $targets) as $target => $config) {
-                $anonymizers[$table][] = $this->createAnonymizer($config);
+                $anonymizers[$table][] = $this->createAnonymizer($config, $context);
             }
         }

@@ -910,7 +910,7 @@ public function checkAnonymizationConfig(): array
         foreach ($this->anonymizationConfig->all() as $table => $tableConfig) {
             foreach ($tableConfig as $config) {
                 try {
-                    $this->createAnonymizer($config);
+                    $this->createAnonymizer($config, $this->defaultContext);
                 } catch (\Exception $e) {
                     if (!\key_exists($table, $errors)) {
                         $errors[$table] = [];