IndexError on empty strong mark on version 2024.2.25 #409

Closed
elmernocon opened this issue Feb 26, 2024 · 1 comment · Fixed by #410

@elmernocon

  • html2text 2024.2.25

  • Python 3.9+

  • Test script

    ✅ default

    import html2text

    converter = html2text.HTML2Text()
    converter.emphasis_mark = "_"
    converter.strong_mark = "**"
    string = "A <b>B</b> <i>C</i>."
    result = converter.handle(string)
    print(result)
    # output: A **B** _C_.

    ✅ emphasis emptied ''

    converter = html2text.HTML2Text()
    converter.emphasis_mark = ""
    converter.strong_mark = "**"
    string = "A <b>B</b> <i>C</i>."
    result = converter.handle(string)
    print(result)
    # output: A **B** C.

    ❌ strong emptied ''

    converter = html2text.HTML2Text()
    converter.emphasis_mark = "_"
    converter.strong_mark = ""
    string = "A <b>B</b> <i>C</i>."
    result = converter.handle(string)
    print(result)
    # expected output: A B _C_.
    Traceback (most recent call last):
      File "script.py", line 29, in <module>
        main()
      File "script.py", line 24, in main
        result = converter.handle(string)
      File "test/venv/lib/python3.9/site-packages/html2text/__init__.py", line 145, in handle
        self.feed(data)
      File "test/venv/lib/python3.9/site-packages/html2text/__init__.py", line 141, in feed
        super().feed(data)
      File "/opt/homebrew/Cellar/python@3.9/3.9.18_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 110, in feed
        self.goahead(0)
      File "/opt/homebrew/Cellar/python@3.9/3.9.18_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 170, in goahead
        k = self.parse_starttag(i)
      File "/opt/homebrew/Cellar/python@3.9/3.9.18_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/html/parser.py", line 344, in parse_starttag
        self.handle_starttag(tag, attrs)
      File "test/venv/lib/python3.9/site-packages/html2text/__init__.py", line 194, in handle_starttag
        self.handle_tag(tag, dict(attrs), start=True)
      File "test/venv/lib/python3.9/site-packages/html2text/__init__.py", line 441, in handle_tag
        and self.preceding_data[-1] == self.strong_mark[0]
    IndexError: string index out of range
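
The last frame of the traceback indexes `self.strong_mark[0]`, which cannot work when `strong_mark` is an empty string. The snippet below is a minimal, stdlib-only sketch of that failure mode and of one possible guard. The helper name `ends_with_mark_start` is hypothetical and is not part of html2text, and this is not necessarily how #410 addresses the problem.

```python
def ends_with_mark_start(preceding_data: str, mark: str) -> bool:
    """Hypothetical helper (not html2text API).

    Return True if preceding_data ends with the first character of mark.
    """
    # Indexing an empty string with [0] or [-1] raises IndexError,
    # so check both strings before comparing.
    if not preceding_data or not mark:
        return False
    return preceding_data[-1] == mark[0]


# The unguarded comparison from the traceback, reproduced with plain strings.
preceding_data = "A "
strong_mark = ""
try:
    preceding_data[-1] == strong_mark[0]
    raised = False
except IndexError:
    raised = True

print(raised)                              # True
print(ends_with_mark_start("A *", "**"))   # True
print(ends_with_mark_start("A ", ""))      # False
```

With a guard of this kind, an empty mark is treated as "no match" instead of raising, which lines up with the expected output `A B _C_.` in the failing case.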
    
@Alir3z4
Owner

Alir3z4 commented Feb 26, 2024

Thanks for the report.
#410 should fix it.

I'll merge and publish a new release tomorrow.
