[GH-ISSUE #457] Feature Request: Allowing the import of already archived URLs. #3322

Closed
opened 2026-03-14 22:06:50 +03:00 by kerem · 1 comment

Originally created by @dbeley on GitHub (Aug 23, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/457

Type

  • [ ] General question or discussion
  • [ ] Propose a brand new feature
  • [x] Request modification of existing behavior or design

What is the problem that your feature request solves

Hi, thanks a lot for ArchiveBox! The documentation states that "[..] ArchiveBox will never re-download sites that have already succeeded previously." (https://github.com/pirate/ArchiveBox/wiki/Configuration#only_new)

But what if I want to periodically export a website whose changes over time are interesting, for example the front page of a newspaper website?

Having a way to export the same website several times could provide several snapshots over months or years that could be very interesting.

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

I think an optional argument to the archivebox add command would fit ideally. It could be named --force-add or --add-if-present or any similar name (I'm bad at naming things). A general configuration setting could also be considered, but I think the optional argument is more fitting.

I don't know what underlying changes would be required, especially to the database models, so I can't really estimate the difficulty of my request.

Thanks in advance!

How badly do you want this new feature?

  • [ ] It's an urgent deal-breaker, I can't live without it
  • [x] It's important to add it in the near-mid term future
  • [ ] It would be nice to have eventually

  • [x] I'm willing to contribute dev time / money to fix this issue
  • [x] I like ArchiveBox so far / would recommend it to a friend
  • [ ] I've had a lot of difficulty getting ArchiveBox set up

@pirate commented on GitHub (Aug 23, 2020):

Duplicate: https://github.com/pirate/ArchiveBox/issues/179
