feat: Implement Singer msgspec encoding #2541
CodSpeed Performance Report
Merging #2541 will improve performance by ×12
|
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2541      +/-   ##
==========================================
+ Coverage   91.34%   91.42%   +0.08%
==========================================
  Files          63       63
  Lines        5231     5280      +49
  Branches      677      673       -4
==========================================
+ Hits         4778     4827      +49
  Misses        320      320
  Partials      133      133
☔ View full report in Codecov by Sentry.
|
|
I found that it helped to add a |
|
In the https://jcristharif.com/msgspec/perf-tips.html#line-delimited-json json.py: SingerWriter: |
Do you mean in sdk/singer_sdk/_singerlib/encoding/_msgspec.py (lines 73 to 94 in 08b58bf)? |
|
Yes, that is exactly what I meant. Could have definitely been stated clearer on my part😅. |
|
Naive of me to think I could get this across in 1/2 a day of work 😅. I'll come back to this later, there's plenty of time until the planned release date. |
|
Like the pun 😊. Great dad joke material. Kind of an inside joke now, since you dropped (naive) from the title of the PR. |
|
Ok, the tests are passing. Now I want to think of how to make it easy and straightforward for a developer to use msgspec as the SerDe layer, and also keep the door open to the user being the one deciding which serialization layer to use. |
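One way to keep the serialization layer swappable is to hold the writer as an attribute selected through a class-level hook. The sketch below is only an illustration under assumptions: it borrows the `message_writer_class`, `serialize_message`, and `write_message` names from the class diagram in this PR, but the classes themselves (`StdlibJSONWriter`, `SortedKeysJSONWriter`, `ExampleTap`, `SortedTap`) are hypothetical and use the standard-library `json` module rather than msgspec or the SDK's actual implementation.

```python
from __future__ import annotations

import io
import json
import typing as t


class GenericSingerWriter:
    """Hypothetical minimal writer interface (not the SDK's real class)."""

    def serialize_message(self, message: dict[str, t.Any]) -> bytes:
        raise NotImplementedError

    def write_message(self, message: dict[str, t.Any], stream: t.BinaryIO) -> None:
        # Singer messages are newline-delimited JSON documents.
        stream.write(self.serialize_message(message) + b"\n")


class StdlibJSONWriter(GenericSingerWriter):
    """Stand-in default writer built on the standard-library json module."""

    def serialize_message(self, message: dict[str, t.Any]) -> bytes:
        return json.dumps(message, separators=(",", ":")).encode("utf-8")


class SortedKeysJSONWriter(GenericSingerWriter):
    """Stand-in alternative writer; a faster encoder could be plugged in the same way."""

    def serialize_message(self, message: dict[str, t.Any]) -> bytes:
        return json.dumps(message, separators=(",", ":"), sort_keys=True).encode("utf-8")


class ExampleTap:
    """Hypothetical plugin that delegates output to a writer instance."""

    # Developers (or users) override this attribute to pick a SerDe layer.
    message_writer_class: type[GenericSingerWriter] = StdlibJSONWriter

    def __init__(self, output: t.BinaryIO | None = None) -> None:
        self.message_writer = self.message_writer_class()
        self.output = io.BytesIO() if output is None else output

    def write_message(self, message: dict[str, t.Any]) -> None:
        self.message_writer.write_message(message, self.output)


class SortedTap(ExampleTap):
    # Swapping the serialization layer is a one-line change.
    message_writer_class = SortedKeysJSONWriter


default_tap = ExampleTap()
default_tap.write_message({"type": "STATE", "value": {"b": 1, "a": 2}})

sorted_tap = SortedTap()
sorted_tap.write_message({"type": "STATE", "value": {"b": 1, "a": 2}})
```

With this shape, the plugin class never needs to know which encoder is in use, and the choice could later be driven by configuration instead of subclassing.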
|
@sourcery-ai review |
Reviewer's Guide by Sourcery
This pull request introduces
Updated class diagram for BaseSingerReader and BaseSingerWriter
classDiagram
class PluginBase {
+config: dict | PurePath | str | list[PurePath | str] | None
+parse_env_config: bool
+validate_config: bool
}
class BaseSingerReader {
+message_reader_class: type[GenericSingerReader]
+message_reader: GenericSingerReader | None
+listen(file_input: t.IO[str] | None) : None
+process_lines(file_input: t.IO[str] | None) : t.Counter[str]
+process_endofpipe() : None
+_assert_line_requires(message_dict: dict, requires: set[str]) : None
<<abstract>>
+_process_schema_message(message_dict: dict) : None
<<abstract>>
+_process_record_message(message_dict: dict) : None
<<abstract>>
+_process_state_message(message_dict: dict) : None
<<abstract>>
+_process_activate_version_message(message_dict: dict) : None
<<abstract>>
+_process_batch_message(message_dict: dict) : None
}
class BaseSingerWriter {
+message_writer_class: type[GenericSingerWriter]
+message_writer: GenericSingerWriter | None
+write_message(message: t.Any) : None
}
PluginBase <|-- BaseSingerReader
PluginBase <|-- BaseSingerWriter
Class diagram for MsgSpecReader and MsgSpecWriter
classDiagram
class GenericSingerReader {
<<interface>>
+deserialize_json(line: str) : dict
}
class GenericSingerWriter {
<<interface>>
+serialize_message(message: Message) : bytes
}
class MsgSpecReader {
+default_input: t.IO
+deserialize_json(line: str) : dict
}
class MsgSpecWriter {
+serialize_message(message: Message) : bytes
+write_message(message: Message) : None
}
GenericSingerReader <|.. MsgSpecReader : implements
GenericSingerWriter <|.. MsgSpecWriter : implements
|
Hey @edgarrmondragon - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider making the IO implementation an attribute of the Singer class rather than using multiple inheritance, to avoid MRO ordering issues. This would provide a cleaner and more explicit design.
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟡 Security: 1 issue found
- 🟡 Testing: 2 issues found
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Hey @edgarrmondragon - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding msgspec as a dependency to the core extra in pyproject.toml.
- The new base classes BaseSingerReader and BaseSingerWriter duplicate some logic from the original PluginBase class; consider refactoring to avoid this duplication.
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟡 Testing: 1 issue found
- 🟡 Complexity: 1 issue found
- 🟢 Documentation: all looks good
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
SQLiteTap(MsgSpecWriter, SQLTap) and not SQLiteTap(SQLTap, MsgSpecWriter). A better approach might be to make the IO implementation an attribute of the Singer class.
📚 Documentation preview 📚: https://meltano-sdk--2541.org.readthedocs.build/en/2541/
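The base-class ordering concern mentioned in the description follows from Python's method resolution order (MRO): with multiple inheritance, the leftmost base that defines a method wins. The sketch below uses hypothetical stub classes (`StubSQLTap`, `StubMsgSpecWriter`) that are not the SDK's real classes; it only shows why the mixin has to come first for its `write_message` to take effect.

```python
class StubSQLTap:
    """Hypothetical stand-in for a tap whose base class already defines write_message."""

    def write_message(self, message: str) -> str:
        return f"default:{message}"


class StubMsgSpecWriter:
    """Hypothetical stand-in for a mixin that overrides the writer."""

    def write_message(self, message: str) -> str:
        return f"msgspec:{message}"


class WriterFirst(StubMsgSpecWriter, StubSQLTap):
    """Mixin listed first: its write_message is found first in the MRO."""


class TapFirst(StubSQLTap, StubMsgSpecWriter):
    """Tap listed first: the mixin's write_message is silently shadowed."""


writer_first_result = WriterFirst().write_message("x")
tap_first_result = TapFirst().write_message("x")

# Inspect the lookup order Python actually uses for each class.
writer_first_mro = [cls.__name__ for cls in WriterFirst.__mro__]
tap_first_mro = [cls.__name__ for cls in TapFirst.__mro__]
```

Because the wrong order fails silently rather than raising, holding the IO implementation as an attribute sidesteps the problem entirely.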
Summary by Sourcery
Implement msgspec encoding for improved performance.
Enhancements:
- msgspec for serialization and deserialization.
Tests:
- msgspec implementation.