Some emails might be HTML; we need to convert those to a readable plain-text version.
BeautifulSoup is surprisingly bad at this. Any ideas?
    html = '<p>T<i>e</i>st <b>haha</b></p><p>Other\nline</p>'

    from bs4 import BeautifulSoup
    BeautifulSoup(html).get_text()      # 'Test hahaOther\nline'
    BeautifulSoup(html).get_text(' ')   # 'T e st haha Other\nline'
    BeautifulSoup(html).get_text('\n')  # 'T\ne\nst \nhaha\nOther\nline'
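One way to read the outputs above: `get_text()` puts the same separator between every text node, so it cannot tell inline tags (`<i>`, `<b>`) from block tags (`<p>`). Below is a minimal, stdlib-only sketch of a block-aware alternative. The names `BLOCK_TAGS`, `TextExtractor`, and `html_to_text` are made up for illustration, and the tag list is an assumption rather than a complete set.

```python
import re
from html.parser import HTMLParser

# Assumption: an illustrative, non-exhaustive set of tags that should
# start a new line in the text output.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects text, inserting line breaks only around block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Whitespace inside HTML text (including newlines) is not
        # significant, so collapse each run to a single space.
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    lines = "".join(parser.parts).split("\n")
    # Tidy spacing within each line and drop lines that ended up empty.
    cleaned = (" ".join(line.split()) for line in lines)
    return "\n".join(line for line in cleaned if line)


html = '<p>T<i>e</i>st <b>haha</b></p><p>Other\nline</p>'
print(repr(html_to_text(html)))  # 'Test haha\nOther line'
```

This sketch does not skip `<script>` or `<style>` contents and does not render links or lists, so it is probably only a starting point for real emails.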
Aaron Swartz's html2text seems close enough.
    from html2text import HTML2Text
    HTML2Text().handle(html)  # 'T_e_st **haha**\n\nOther line\n\n'