Support the superscript and subscript tags #407

Closed
wants to merge 4 commits into from

cowboysync
Contributor

Update __init__.py to support the superscript and subscript tags

Support the superscript and subscript tags
Owner

Alir3z4 commented Jan 19, 2024

Thanks for the fix, could you please add some tests?

codecov bot commented Jan 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (e375689) 97.23% compared to head (a13f493) 97.26%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #407      +/-   ##
==========================================
+ Coverage   97.23%   97.26%   +0.02%     
==========================================
  Files          11       11              
  Lines        1120     1132      +12     
==========================================
+ Hits         1089     1101      +12     
  Misses         31       31              
Flag Coverage Δ
unittests-3.10 97.26% <100.00%> (+0.02%) ⬆️
unittests-3.11 97.26% <100.00%> (+0.02%) ⬆️
unittests-3.12 97.26% <100.00%> (+0.02%) ⬆️
unittests-3.8 97.26% <100.00%> (+0.02%) ⬆️
unittests-3.9 97.26% <100.00%> (+0.02%) ⬆️


Owner

Alir3z4 commented Jan 24, 2024

I'd strongly recommend keeping the default output of html2text as close to plain text as possible while remaining compatible with HTML.

html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).

While supporting superscript and subscript is a very nice addition, considering the above, it would be much better to have the feature behind a flag. (It's worth noting that many Markdown parsers and renderers don't support these tags.)

For instance, the --images-with-size flag preserves images together with their sizes, but it is off by default.

If we can have the same flexibility here, we can get this merged and included in the next release.
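A minimal sketch of the off-by-default flag pattern described above, using Python's standard argparse module. This is not html2text's actual CLI code: the program name and the `--include-sup-sub` flag name are hypothetical, chosen only for illustration, and `--images-with-size` is reproduced here only as the example named in the comment.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration; not html2text's real CLI module.
    parser = argparse.ArgumentParser(prog="html2text-sketch")
    # Flag cited in the thread: does nothing unless the user passes it.
    parser.add_argument(
        "--images-with-size",
        action="store_true",
        default=False,
        help="preserve images together with their sizes",
    )
    # Hypothetical flag name for the sup/sub feature (an assumption).
    parser.add_argument(
        "--include-sup-sub",
        action="store_true",
        default=False,
        help="keep <sup> and <sub> tags in the output",
    )
    return parser


# With no arguments, output-changing features stay off.
default_args = build_parser().parse_args([])
# The user opts in explicitly.
opted_in = build_parser().parse_args(["--include-sup-sub"])
```

With `store_true` and `default=False`, existing users see no change in output unless they pass the flag, which is the property the maintainer asks for.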

Owner

Alir3z4 commented Jan 25, 2024

CI failed to pass the tests.

Because sup/sub tags are ignored by default, the current style of test (html->md files) won't work: the test runner runs HTML2Text with the default configuration.

The test needs to be changed to a Python file, like the one for newlines on multiple calls. You can create a new file in the tests directory, add a function to it, and call the HTML2Text class with HTML2Text(ignore_sup_sub=False) ... (and of course you can delete the current html->md test files).
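The shape of such a Python test might look roughly like the sketch below. To stay self-contained it uses a small stand-in class built on the standard library's html.parser instead of html2text itself; `SketchConverter` and its `convert` method are hypothetical, the `ignore_sup_sub` name is taken from the comment above (the merged option may be named differently), and emitting the tags as inline HTML is an assumption.

```python
from html.parser import HTMLParser


class SketchConverter(HTMLParser):
    """Hypothetical stand-in for HTML2Text; not the library's implementation."""

    def __init__(self, ignore_sup_sub=True):
        super().__init__()
        # Option name borrowed from the review comment; True means tags are dropped.
        self.ignore_sup_sub = ignore_sup_sub
        self._out = []

    def handle_starttag(self, tag, attrs):
        if tag in ("sup", "sub") and not self.ignore_sup_sub:
            self._out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ("sup", "sub") and not self.ignore_sup_sub:
            self._out.append(f"</{tag}>")

    def handle_data(self, data):
        self._out.append(data)

    def convert(self, html):
        # Reset parser state so the instance can be reused across calls.
        self._out = []
        self.reset()
        self.feed(html)
        self.close()
        return "".join(self._out)


def test_sup_sub_ignored_by_default():
    # Default configuration: the tags are dropped, only the text survives.
    assert SketchConverter().convert("H<sub>2</sub>O") == "H2O"


def test_sup_sub_kept_when_enabled():
    # Non-default configuration, which a fixture-file runner cannot express.
    converter = SketchConverter(ignore_sup_sub=False)
    assert converter.convert("x<sup>2</sup>") == "x<sup>2</sup>"
```

Because each test function builds its own converter, it can exercise a non-default configuration, which is exactly what the html->md file pairs cannot do.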

Owner

Alir3z4 commented Feb 2, 2024

Merged via #408 42278c6

Alir3z4 closed this Feb 2, 2024
Owner

Alir3z4 commented Feb 2, 2024

Thanks for the great contribution.
I did some code cleanup to align the code with the rest of the code base and updated the changelog and documentation files.
