You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi! I'm doing some natural language processing experiments and using html2text to make text sources out of internet pages. My problem is words in links are sticking to each other if I use --ignore-link flag:
example is specially simplified of course, but it "creates" new composite words. I've patched it locally to add spaces after each ignored link, sort of workaround with minimal changes:
Should I open a pull request for this? The code above is sort of workaround, but if it will be useful - I'd be happy to make it cleaner, add tests, changelog, etc.
Version by html2text --version: 2020.1.16 (from pypi, but github/master version is affected too)
Python version python3.8 --version: Python 3.8.0`
The text was updated successfully, but these errors were encountered:
Hi! I'm doing some natural language processing experiments and using html2text to make text sources out of internet pages. My problem is words in links are sticking to each other if I use --ignore-link flag:
example is specially simplified of course, but it "creates" new composite words. I've patched it locally to add spaces after each ignored link, sort of workaround with minimal changes:
and it produces what I need:
Should I open a pull request for this? The code above is sort of workaround, but if it will be useful - I'd be happy to make it cleaner, add tests, changelog, etc.
html2text --version
: 2020.1.16 (from pypi, but github/master version is affected too)python3.8 --version
: Python 3.8.0`The text was updated successfully, but these errors were encountered: