Training data should include bullet-like characters

Modern texts, especially business documents, contain bullet-like symbols, e.g. for lists. The middle dot (`·`) is also used with some frequency. While the recognition results for regular text with `eng` and `deu` are nearly perfect, the results for these symbols are "random".

For the next release of trained models, the training data should be improved to cover these symbols, and possibly other symbols as well.
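One way this could be approached is to prefix a share of the existing training text lines with bullet-like symbols before rendering. The following is only a minimal sketch of that idea: the helper `add_bullets`, the `fraction` parameter, and the choice of symbols in `BULLETS` are assumptions made for illustration, not part of any existing training tooling.

```python
import random

# Assumed set of bullet-like symbols (the issue itself only names
# bullets and the middle dot): • · ◦ ‣
BULLETS = ["\u2022", "\u00b7", "\u25e6", "\u2023"]


def add_bullets(lines, fraction=0.2, seed=0):
    """Return a copy of `lines` where roughly `fraction` of the
    non-empty lines are prefixed with a randomly chosen bullet symbol.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    out = []
    for line in lines:
        if line.strip() and rng.random() < fraction:
            out.append(f"{rng.choice(BULLETS)} {line}")
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    sample = ["Trucks", "vans", "bicycles", "Lastwagen", "Fahrräder"]
    for line in add_bullets(sample, fraction=0.5):
        print(line)
```

Mixing bulleted and plain lines, rather than adding bullets everywhere, would presumably keep the existing character distribution mostly intact, but the right proportion is an open question.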

Test image:

![bullets](https://user-images.githubusercontent.com/1275557/129733772-e6110191-8c17-4f8e-bd09-b0dedefc21e7.png)

Tesseract result with `-l eng`:

```
List of vehicles:
* Trucks
* vans
* bicycles
Liste von Fahrzeugen:
e Lastwagen
e Transporter
e Fahrrader

```

Result with `-l deu`:

```
List of vehicles:
« Trucks
« vans
+ bicycles
Liste von Fahrzeugen:
e Lastwagen
e Transporter
e Fahrräder

```
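To make the "random" behaviour easier to compare between models, the leading symbol of each list item in the two results above can be tallied. This is a small sketch only; the function name `leading_symbols` and the way headings are filtered out are assumptions chosen for this example.

```python
from collections import Counter

HEADINGS = {"List of vehicles:", "Liste von Fahrzeugen:"}

# OCR results copied from the two outputs above.
ENG_RESULT = """List of vehicles:
* Trucks
* vans
* bicycles
Liste von Fahrzeugen:
e Lastwagen
e Transporter
e Fahrrader
"""

DEU_RESULT = """List of vehicles:
« Trucks
« vans
+ bicycles
Liste von Fahrzeugen:
e Lastwagen
e Transporter
e Fahrräder
"""


def leading_symbols(ocr_text, headings=HEADINGS):
    """Count the first whitespace-separated token of every non-empty
    line that is not one of the known headings."""
    counts = Counter()
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line or line in headings:
            continue
        counts[line.split()[0]] += 1
    return counts


if __name__ == "__main__":
    print("eng:", dict(leading_symbols(ENG_RESULT)))
    print("deu:", dict(leading_symbols(DEU_RESULT)))
```

For the six list items, `eng` yields two different substitutes (`*` and `e`) and `deu` yields three (`«`, `+` and `e`).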


Training data should include bullet-like characters #45
