feat: port OCR to C++ #389

Open: wants to merge 16 commits into base: main

Conversation

@JakubGonera (Contributor) commented Jun 12, 2025

Description

Port the native implementation to C++

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update (improves or adds clarity to existing documentation)

Tested on

  • iOS
  • Android

Related issues

#259

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

NorbertKlockiewicz and others added 4 commits July 22, 2025 12:13
Co-authored-by: mlodyjesienin <[email protected]>
@NorbertKlockiewicz NorbertKlockiewicz marked this pull request as ready for review July 22, 2025 14:28
@msluszniak
Member

I'll submit my comments on the C++ code today or tomorrow.

Comment on lines +126 to +128
std::vector<float> inputVector = colorMatToVector(matrix, mean, variance);
return executorch::extension::make_tensor_ptr(tensorDims, inputVector);
}
Member

Intermediate variables usually help make code self-describing. Here, though, inputVector does not add much descriptive value. Passing the result of the function call (an rvalue) directly as the argument can avoid a copy of the vector and does not diminish the quality of the code.

Suggested change
std::vector<float> inputVector = colorMatToVector(matrix, mean, variance);
return executorch::extension::make_tensor_ptr(tensorDims, inputVector);
}
return executorch::extension::make_tensor_ptr(tensorDims, colorMatToVector(matrix, mean, variance));
}
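
To illustrate the point about passing the call result directly, here is a minimal standalone sketch. It assumes the callee takes its argument by value; `Tracked`, `produce`, `consume`, and `g_copies` are hypothetical stand-ins, not names from this PR or from ExecuTorch. A named local passed as an lvalue is copied into the parameter, while a temporary (or a local wrapped in `std::move`) is not.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Counts copy constructions so the call styles can be compared.
int g_copies = 0;

// Hypothetical payload wrapping a pixel vector; only copies are counted.
struct Tracked {
  std::vector<float> data;
  explicit Tracked(std::vector<float> d) : data(std::move(d)) {}
  Tracked(const Tracked &other) : data(other.data) { ++g_copies; }
  Tracked(Tracked &&other) noexcept : data(std::move(other.data)) {}
};

// Hypothetical producer, standing in for a function that returns a vector.
Tracked produce() { return Tracked(std::vector<float>{1.0f, 2.0f, 3.0f}); }

// Hypothetical sink that takes its argument by value.
std::size_t consume(Tracked t) { return t.data.size(); }

// Named local passed as an lvalue: the parameter is copy-constructed.
std::size_t viaNamedLocal() {
  Tracked local = produce();
  return consume(local);
}

// Temporary passed directly: no copy constructor call is needed.
std::size_t viaTemporary() { return consume(produce()); }

// Keeping the descriptive name is still possible by moving the local.
std::size_t viaMove() {
  Tracked local = produce();
  return consume(std::move(local));
}
```

If the intermediate name is worth keeping for readability, the `viaMove` form is one way to keep it without paying for the copy.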

v.reserve(pixelCount);

if (mat.isContinuous()) {
v.assign((float *)mat.data, (float *)mat.data + pixelCount);
Member

Avoid C-style casts; if static_cast does not work, use reinterpret_cast.

Suggested change
v.assign((float *)mat.data, (float *)mat.data + pixelCount);
v.assign(reinterpret_cast<float *>(mat.data), reinterpret_cast<float *>(mat.data) + pixelCount);
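
A small standalone sketch of the cast in question, assuming the buffer really holds floats behind a byte pointer (as `cv::Mat::data` does for a float matrix). `ByteView` and `toFloatVector` are hypothetical names used only for illustration. Two details matter: `static_cast` cannot convert `unsigned char *` to `float *`, so `reinterpret_cast` is needed, and the offset has to be added after the cast so that it counts floats rather than bytes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image buffer that exposes float pixels
// through a byte pointer.
struct ByteView {
  unsigned char *data;
  std::size_t pixelCount;
};

std::vector<float> toFloatVector(const ByteView &view) {
  // Unrelated pointer types: reinterpret_cast is required here.
  const float *first = reinterpret_cast<const float *>(view.data);
  // Pointer arithmetic happens on float*, so the step is sizeof(float).
  const float *last = first + view.pixelCount;
  return std::vector<float>(first, last);
}
```

With OpenCV itself, `mat.ptr<float>()` is an alternative that keeps the cast inside the library.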

Comment on lines +162 to +163
const float heightRatio = (float)targetSize.height / inputSize.height;
const float widthRatio = (float)targetSize.width / inputSize.width;
Member

Suggested change
const float heightRatio = (float)targetSize.height / inputSize.height;
const float widthRatio = (float)targetSize.width / inputSize.width;
const float heightRatio = static_cast<float>(targetSize.height) / inputSize.height;
const float widthRatio = static_cast<float>(targetSize.width) / inputSize.width;
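
For context on why the cast matters at all in these two lines, a minimal sketch under the assumption that the size fields are integers. `Size`, `heightRatio`, and `truncatedHeightRatio` are hypothetical names, with `Size` standing in for `cv::Size`. Casting one operand to `float` before dividing forces floating-point division; without it the quotient is truncated.

```cpp
// Hypothetical stand-in for cv::Size with integer fields.
struct Size {
  int width;
  int height;
};

// One operand is converted first, so the division is done in float.
float heightRatio(Size target, Size input) {
  return static_cast<float>(target.height) / input.height;
}

// Both operands are int, so the quotient is truncated toward zero.
int truncatedHeightRatio(Size target, Size input) {
  return target.height / input.height;
}
```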

Comment on lines +171 to +172
const int cornerPatchSize =
std::max(1, std::min(inputSize.height, inputSize.width) / 30);
Member

1 and 30 are magic numbers here; move them to named constants.
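
One possible shape for this, as a sketch only. The constant names `kMinCornerPatchSize` and `kCornerPatchDivisor` and the helper `cornerPatchSizeFor` are hypothetical and would need to match whatever naming the codebase already uses.

```cpp
#include <algorithm>

// Hypothetical names for the two literals in the expression above.
constexpr int kMinCornerPatchSize = 1;   // never sample less than one pixel
constexpr int kCornerPatchDivisor = 30;  // patch is 1/30 of the shorter side

// Same arithmetic as the original expression, with the literals named.
int cornerPatchSizeFor(int height, int width) {
  return std::max(kMinCornerPatchSize,
                  std::min(height, width) / kCornerPatchDivisor);
}
```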

for (int i = 1; i < corners.size(); i++) {
backgroundScalar += cv::mean(corners[i]);
}
backgroundScalar /= (double)corners.size();
Member

Suggested change
backgroundScalar /= (double)corners.size();
backgroundScalar /= static_cast<double>(corners.size());

Comment on lines +88 to +90
for (int j = 0; j < 4; j++) {
points[j] = {.x = vertices[j].x, .y = vertices[j].y};
}
Member

Suggested change
for (int j = 0; j < 4; j++) {
points[j] = {.x = vertices[j].x, .y = vertices[j].y};
}
#pragma unroll
for (int j = 0; j < 4; j++) {
points[j] = {.x = vertices[j].x, .y = vertices[j].y};
}

minRect.points(vertices);

std::array<Point, 4> points;
for (int j = 0; j < 4; j++) {
Member

Suggested change
for (int j = 0; j < 4; j++) {
#pragma unroll
for (int j = 0; j < 4; j++) {

float minSideLength = std::numeric_limits<float>::max();
std::size_t numOfPoints = points.size();

for (std::size_t i = 0; i < numOfPoints; i++) {
Member

Suggested change
for (std::size_t i = 0; i < numOfPoints; i++) {
#pragma unroll
for (std::size_t i = 0; i < numOfPoints; i++) {

fitLineToShortestSides(std::array<Point, 4> points) {
std::array<std::pair<float, float>, 4> sides;
std::array<Point, 4> midpoints;
for (std::size_t i = 0; i < 4; i++) {
Member

Suggested change
for (std::size_t i = 0; i < 4; i++) {
#pragma unroll
for (std::size_t i = 0; i < 4; i++) {


std::array<Point, 4> pointsFromCvPoints(cv::Point2f cvPoints[4]) {
std::array<Point, 4> points;
for (std::size_t i = 0; i < 4; ++i) {
Member

Suggested change
for (std::size_t i = 0; i < 4; ++i) {
#pragma unroll
for (std::size_t i = 0; i < 4; ++i) {

4 participants