Optimize invalid fields parsing when no callback defined [#2345] #2346

thomas-boucher · 2025-06-18T12:36:54Z

See the description of the issue in #2345.

This proposal avoids allocating a new string containing the current row for each invalid field when the user has not defined the "bad data" callback.
As an example, it reduced the time to parse a 3.5M file with 20k lines from 16 s to 61 ms (the issue includes steps to reproduce).
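
To illustrate the idea, here is a minimal sketch in Python rather than the library's C#. All names (`ToyParser`, `bad_data_found`, `raw_copies`) are hypothetical and this is not the actual parser code; the "invalid field" rule is a toy assumption (a quote inside an unquoted field). The point is only the guard: the row text is copied when a callback exists to receive it, and never otherwise.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical callback signature: receives the raw row text read so far.
BadDataCallback = Callable[[str], None]


class ToyParser:
    """Toy comma splitter that flags fields containing a quote as invalid."""

    def __init__(self, bad_data_found: Optional[BadDataCallback] = None) -> None:
        self.bad_data_found = bad_data_found
        self.raw_copies = 0  # how many times the row text was copied

    def parse_row(self, row: str) -> List[str]:
        fields: List[str] = []
        start = 0
        for end in self._field_ends(row):
            field = row[start:end]
            if '"' in field:  # toy rule for "invalid field"
                self._report_bad_data(row, end)
            fields.append(field)
            start = end + 1
        return fields

    @staticmethod
    def _field_ends(row: str) -> Iterator[int]:
        # Yield the index just past each field: every comma, then end of row.
        for i, ch in enumerate(row):
            if ch == ",":
                yield i
        yield len(row)

    def _report_bad_data(self, row: str, position: int) -> None:
        # Guard first: with no callback, nobody reads the raw record,
        # so the copy is skipped entirely.
        if self.bad_data_found is None:
            return
        self.raw_copies += 1
        self.bad_data_found(row[:position])
```

With no callback, `ToyParser().parse_row(...)` leaves `raw_copies` at zero no matter how many invalid fields the row contains; with a callback, there is still one copy per invalid field.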

I didn't add tests because the callback is already tested.

To go further and improve performance even when the callback is defined, I guess the RawRecord could be filled once, when a line is considered ended (and only if the callback is defined), to avoid a new allocation on each field.
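
A rough sketch of that follow-up idea, again in Python with hypothetical names (`parse_row_deferred` and a two-argument callback are assumptions, not the library's API): invalid fields are only noted while scanning, and the raw record is produced once at the end of the row before the callback is invoked for each of them.

```python
from typing import Callable, List, Optional

# Hypothetical signature: (raw_record, invalid_field) -> None
BadDataCallback = Callable[[str, str], None]


def parse_row_deferred(
    row: str, bad_data_found: Optional[BadDataCallback] = None
) -> List[str]:
    fields = row.split(",")
    if bad_data_found is None:
        return fields  # no callback: nothing to build or report

    # Note invalid fields while scanning; no raw record is built yet.
    invalid = [field for field in fields if '"' in field]  # toy validity rule
    if invalid:
        raw_record = row  # stands in for the single end-of-row allocation
        for field in invalid:
            bad_data_found(raw_record, field)
    return fields
```

Under this assumption, every callback for a row shares the same raw record, so the cost is one allocation per row with bad data instead of one per invalid field.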

thomas-boucher · 2025-06-23T15:40:18Z

hello @JoshClose, can I help to move this forward? Thanks!

JoshClose · 2025-06-23T16:19:06Z

I rewrote the parser from scratch and am currently integrating it back into the system, which may make this invalid. I'll leave this here as a reminder to check that the same thing doesn't happen again.

thomas-boucher · 2025-06-25T07:18:06Z

> I rewrote the parser from scratch and am currently integrating it back into the system, which may make this invalid. I'll leave this here to remember to check and make sure the same thing doesn't happen again.

Thanks @JoshClose, do you have a rough estimate of when the new implementation will be ready? Otherwise, would it make sense to merge this in a minor update of the current code, since it seems very low risk?
Thanks for the support.

Avoid allocating objects for invalid fields when no "bad data" callback defined. (6133bb6)
