bookmarks/tagged/web-content-extracting.md at master · abhidave001/bookmarks

Bookmarks tagged [web-content-extracting]

^{www.codever.land/bookmarks/t/web-content-extracting}

html2text

^{https://github.com/Alir3z4/html2text}

Convert HTML to Markdown-formatted text.

tags: python, web-content-extracting
source code

lassie

^{https://github.com/michaelhelmick/lassie}

Web Content Retrieval for Humans.

tags: python, web-content-extracting
source code

micawber

^{https://github.com/coleifer/micawber}

A small library for extracting rich content from URLs.

tags: python, web-content-extracting
source code

newspaper

^{https://github.com/codelucas/newspaper}

News extraction, article extraction and content curation in Python.

tags: python, web-content-extracting
source code

python-readability

^{https://github.com/buriy/python-readability}

Fast Python port of arc90's readability tool.

tags: python, web-content-extracting
source code

requests-html

^{https://github.com/kennethreitz/requests-html}

Pythonic HTML Parsing for Humans.

tags: python, web-content-extracting
source code

sumy

^{https://github.com/miso-belica/sumy}

A module for automatic summarization of text documents and HTML pages.

tags: python, web-content-extracting
source code

textract

^{https://github.com/deanmalmgren/textract}

Extract text from any document: Word, PowerPoint, PDFs, etc.

tags: python, web-content-extracting
source code

toapi

^{https://github.com/gaojiuli/toapi}

Every web site provides APIs.

tags: python, web-content-extracting
source code
