[GH-ISSUE #846] Feature Request: Support saving local webpages or PDFs #524

Closed

opened 2026-03-01 14:44:18 +03:00 by kerem · 1 comment

kerem commented

2026-03-01 14:44:18 +03:00

Owner

Originally created by @Victor239 on GitHub (Sep 12, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/846

Type

- [ ] General question or discussion
- [x] Propose a brand new feature
- [ ] Request modification of existing behavior or design

What is the problem that your feature request solves

Sometimes ArchiveBox fails to archive a URL, but in those instances I'm still able to use the SingleFile browser extension or save the webpage as a PDF. If I open the SingleFile or PDF in my browser, I'd like ArchiveBox to be able to archive these webpages via the local file instead, and then I can manually edit the URL associated with them later if desired.

Also, in some cases there are PDFs I've obtained from emails, or webpages which are already offline, but I'm unable to import them into ArchiveBox. This leaves me having to maintain a separate archive for these files in Zotero, which is a headache and makes me want to just pare things down to one archive program.

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

Instead of just HTTP URLs, ArchiveBox should also accept filenames such as file:///home/user/Downloads/Rupert.html and be able to archive these pages.

kerem

2026-03-01 14:44:18 +03:00

closed this issue
added the
status: wontfix
label

kerem commented

2026-03-01 14:44:19 +03:00

Author

Owner

@pirate commented on GitHub (Sep 16, 2021):

Not possible given the security model. ArchiveBox is essentially running in a separate virtual machine and does not have access to the local filesystem or localhost:... URLs. This is not something that's likely to change in the near/medium-term future. If it were to be implemented, it would be via a 3rd-party / community-contributed extension like https://github.com/ArchiveBox/ArchiveBox/issues/577 sending the files to ArchiveBox.

As a workaround, keep in mind you can always drop files directly into snapshot folders in ArchiveBox's data dir. If you make a new snapshot, let it fail during archiving, then drag some files into the folder manually, it won't delete them, and they'll be considered part of that snapshot's outputs, e.g. some_attachment.pdf -> ~/archivebox/archive/152342345234/some_attachment.pdf.
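
For anyone who would rather script the manual copy than drag files by hand, here is a minimal sketch of the workaround above. It is not part of ArchiveBox: the function name `add_file_to_snapshot` is hypothetical, and the `archive/<snapshot id>/` layout under the data dir is assumed from the example path in this comment.

```python
import shutil
from pathlib import Path


def add_file_to_snapshot(src: Path, data_dir: Path, snapshot_id: str) -> Path:
    """Copy src into <data_dir>/archive/<snapshot_id>/ and return the new path.

    Hypothetical helper: the folder layout is an assumption taken from the
    example path above, not from ArchiveBox's documentation.
    """
    snapshot_dir = data_dir / "archive" / snapshot_id
    if not snapshot_dir.is_dir():
        # Only copy into a snapshot folder that already exists on disk.
        raise FileNotFoundError(f"snapshot folder not found: {snapshot_dir}")

    dest = snapshot_dir / src.name
    if dest.exists():
        # Avoid silently clobbering an output that is already in the snapshot.
        raise FileExistsError(f"refusing to overwrite: {dest}")

    shutil.copy2(src, dest)  # copy2 also preserves the file's timestamps
    return dest
```

With the example from the comment, the call would look like `add_file_to_snapshot(Path("some_attachment.pdf"), Path.home() / "archivebox", "152342345234")`. The sketch refuses to create missing snapshot folders on purpose, since the workaround depends on the snapshot having been created first.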

kerem referenced this issue

2026-03-01 14:48:51 +03:00

[PR #524] [MERGED] Fixed docker-compose.yml url #1207

kerem referenced this issue

2026-03-01 18:00:32 +03:00

[PR #524] [MERGED] Fixed docker-compose.yml url #2717

kerem referenced this issue

2026-03-15 01:32:50 +03:00

[PR #524] [MERGED] Fixed docker-compose.yml url #4221