Get the HTML from https://www.stats.gov.cn/sj/zxfb/202409/t20240914_1956486.html
In the downloaded HTML file, the data appears only once,
but in the converted Markdown text, the data appears twice.
html2text --version
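Script:
from html2text import HTML2Text

def convert_html_to_markdown(html_text: str, base_url: str) -> str:
    # Resolve relative links against base_url and keep links in the output.
    h = HTML2Text(baseurl=base_url)
    h.ignore_links = False
    markdown_text = h.handle(html_text)
    return markdown_text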
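To make the duplication easier to confirm, here is a minimal sketch that lists lines occurring more than once in the converted Markdown. The helper name `repeated_lines` and the `min_length` threshold are hypothetical and not part of html2text. It only compares stripped lines, so it will miss duplicates that html2text wraps differently.

```python
from collections import Counter


def repeated_lines(markdown_text: str, min_length: int = 20) -> dict[str, int]:
    """Return each sufficiently long line that appears more than once, with its count.

    Hypothetical diagnostic helper: short lines (table separators, blank
    lines, bullets) are skipped via min_length to reduce noise.
    """
    lines = (line.strip() for line in markdown_text.splitlines())
    counts = Counter(line for line in lines if len(line) >= min_length)
    return {line: n for line, n in counts.items() if n > 1}


# Intended use with the script above (not executed here):
#   dupes = repeated_lines(convert_html_to_markdown(html_text, base_url))
#   for line, n in dupes.items():
#       print(n, line)
```

Running the same helper on the raw HTML text should show whether the repetition is already present in the source or is introduced during conversion.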