Adding more languages for recognition models by default #563

Closed
@felixdittrich92

🚀 The feature

Currently only French is supported by default. It would be great to add more languages that can be selected directly, for example:
model = ocr_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True, language='en')
or
reco_arch = crnn_mobilenet_v3_large(pretrained=True, language='de')
What do you think?

Motivation, pitch

In most cases you need recognition for a specific language. This can be done by training a model yourself, but it would be much easier if some commonly used languages were available without any training of your own.

Alternatives

Some other ideas:

  • auto-detect the language and choose the right model, if one is provided
  • a multilingual model
  • or allow choosing multiple models, e.g. languages=['en', 'fr', 'de']
    Adding ViTSTR #513 (I think I will finish this at the end of the year on my side, which will then provide de, en, es, fr in one model, but there are currently no benchmarks)
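
To make the `language=` / `languages=[...]` idea a bit more concrete, here is a rough sketch of how a requested language could be resolved to a per-language checkpoint. Everything in it is hypothetical: the `CHECKPOINTS` registry, the checkpoint names, and `resolve_checkpoints` are placeholders for illustration and are not part of docTR. It only shows the lookup and the error handling for unsupported languages, not any model loading.

```python
from typing import Dict, List, Tuple

# Hypothetical registry mapping (architecture, language) to a checkpoint name.
# These names are placeholders; no such checkpoints are claimed to exist.
CHECKPOINTS: Dict[Tuple[str, str], str] = {
    ("crnn_vgg16_bn", "fr"): "crnn_vgg16_bn-fr",
    ("crnn_vgg16_bn", "en"): "crnn_vgg16_bn-en",
    ("crnn_vgg16_bn", "de"): "crnn_vgg16_bn-de",
}


def resolve_checkpoints(reco_arch: str, languages: List[str]) -> Dict[str, str]:
    """Return the placeholder checkpoint name for each requested language.

    Raises ValueError if any language has no entry for this architecture.
    """
    missing = [lang for lang in languages if (reco_arch, lang) not in CHECKPOINTS]
    if missing:
        raise ValueError(
            f"No pretrained checkpoint for {reco_arch!r} in languages: {missing}"
        )
    return {lang: CHECKPOINTS[(reco_arch, lang)] for lang in languages}


if __name__ == "__main__":
    # A single language is just the one-element case of the list form.
    print(resolve_checkpoints("crnn_vgg16_bn", ["en"]))
    print(resolve_checkpoints("crnn_vgg16_bn", ["en", "fr", "de"]))
```

With a lookup like this, `language='en'` would simply be shorthand for `languages=['en']`, and an unsupported language fails early with a clear message instead of silently falling back to French.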

Additional context

If you want, I can train all currently existing models in PyTorch for English and German.
