Make sure you have pnpm installed: https://pnpm.io/installation

    pnpm install
    pnpm run scrape <url>

You'll see some console output, and then you should have an output directory full of PDF files and a single ___urls.txt file.

| Full | Short | Description |
|---|---|---|
| `--media` | `-m` | What media type you want to generate PDFs with, if the site supports different media types ("screen" or "print" (default)) |
| `--colorScheme` | `-c` | What color scheme you want to generate PDFs with, if the site supports color schemes ("light", "dark", "no-preference" (default)) |
| `--withHeader` | `-h` | Whether or not you want PDFs with generated headers (and footers) (default false) |
| `--dryRun` | `-d` | Perform the web crawl without creating PDFs (default false) |
| `--verbose` | `-v` | Adds additional logging (default false) |
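
To make the options and their defaults concrete, here is a minimal sketch of how the flags listed above could be parsed. It is not this project's actual parsing code: `parseArgs`, `ScrapeOptions`, and the helper type guards are hypothetical names, and the sketch assumes the boolean flags are switched on simply by being present.

```typescript
// Hypothetical sketch: flag names and defaults are taken from the options table,
// but the types and functions below are illustrative, not the project's own code.
type MediaType = "screen" | "print";
type ColorScheme = "light" | "dark" | "no-preference";

interface ScrapeOptions {
  url: string | undefined;
  media: MediaType;
  colorScheme: ColorScheme;
  withHeader: boolean;
  dryRun: boolean;
  verbose: boolean;
}

function isMediaType(value: string | undefined): value is MediaType {
  return value === "screen" || value === "print";
}

function isColorScheme(value: string | undefined): value is ColorScheme {
  return value === "light" || value === "dark" || value === "no-preference";
}

function parseArgs(argv: string[]): ScrapeOptions {
  // Defaults as documented in the options table.
  const options: ScrapeOptions = {
    url: undefined,
    media: "print",
    colorScheme: "no-preference",
    withHeader: false,
    dryRun: false,
    verbose: false,
  };

  const queue = argv.slice();
  while (queue.length > 0) {
    const arg = queue.shift() as string;
    switch (arg) {
      case "--media":
      case "-m": {
        const value = queue.shift(); // the option's value follows the flag
        if (!isMediaType(value)) throw new Error(`Invalid media type: ${value}`);
        options.media = value;
        break;
      }
      case "--colorScheme":
      case "-c": {
        const value = queue.shift();
        if (!isColorScheme(value)) throw new Error(`Invalid color scheme: ${value}`);
        options.colorScheme = value;
        break;
      }
      case "--withHeader":
      case "-h":
        options.withHeader = true; // assumption: presence alone enables the flag
        break;
      case "--dryRun":
      case "-d":
        options.dryRun = true;
        break;
      case "--verbose":
      case "-v":
        options.verbose = true;
        break;
      default:
        if (arg[0] === "-") throw new Error(`Unknown option: ${arg}`);
        options.url = arg; // anything that is not a flag is treated as the URL
    }
  }
  return options;
}
```

Under these assumptions, `parseArgs(["https://example.com", "-m", "screen", "-d"])` returns options with `media` set to `"screen"` and `dryRun` set to `true`, while `colorScheme`, `withHeader`, and `verbose` keep their defaults. In a Node script you would typically pass `process.argv.slice(2)`.
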
- A `--dry-run` flag that shows which URLs will be downloaded and what each page's filename will be
- Asynchronous file download
- Figure out how to actually import and use ESM packages with `ts-node` like `p-limit` and `chalk` v5... ¯\_(ツ)_/¯
- EPUB generation option
- A URL whitelist feature that works with globs/regexes
- An option to combine all of the resulting PDFs into one
- The ability to also download linked ZIP/PDF files (which are currently ignored)
- Dark mode option via the Dark Reader extension
- Update links in PDFs to refer to other saved files