[PR #107] [MERGED] Optionally import only new links #2569

Closed
opened 2026-03-01 17:59:58 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/107
Author: @f0086
Created: 10/19/2018
Status: Merged
Merged: 10/26/2018
Merged by: @pirate

Base: master ← Head: import-only-new-links


📝 Commits (4)

  • 69c007c Optionally import only new links
  • b1b6be4 merge_links() used wrong index
  • ebc327b Make O(n^2) loop to an O(n) problem.
  • a2f5fa8 Use a more appropriate coding style from @pirate.

📊 Changes

4 files changed (+32 additions, -6 deletions)


📝 README.md (+6 -0)
📝 archiver/archive.py (+17 -6)
📝 archiver/config.py (+1 -0)
📝 archiver/links.py (+8 -0)

📄 Description

When a huge list of links is imported periodically (for example, a big
dump of links from a bookmark service) and it contains a lot of broken
links, these links are rechecked on every import. To skip this, the
environment variable ONLY_NEW can be set so that only new links are
imported and the rest are skipped altogether. This partially fixes #95.
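
The idea can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `filter_new_links`, the dict-based link shape with a `url` key, and the way the ONLY_NEW value is parsed are all assumptions made for the example. It builds a set of already-indexed URLs once, so each imported link is checked in constant time, in the spirit of the "O(n^2) to O(n)" commit.

```python
import os


def only_new_enabled(environ=os.environ):
    """Hypothetical parser: treat ONLY_NEW=true (any case) as enabled."""
    return environ.get('ONLY_NEW', 'False').lower() == 'true'


def filter_new_links(imported_links, existing_links):
    """Return the imported links whose URL is not already in the index.

    Hypothetical helper: links are assumed to be dicts with a 'url' key.
    """
    # Build the lookup set once so each membership test is O(1),
    # instead of scanning the existing list for every imported link.
    existing_urls = {link['url'] for link in existing_links}
    return [link for link in imported_links if link['url'] not in existing_urls]


def links_to_archive(imported_links, existing_links, environ=os.environ):
    """Pick which links to process, depending on the ONLY_NEW setting."""
    if only_new_enabled(environ):
        return filter_new_links(imported_links, existing_links)
    # Default behaviour in this sketch: process every imported link.
    return list(imported_links)
```

With ONLY_NEW enabled, links already present in the index (including broken ones that failed before) are left out of the run, so only previously unseen URLs are processed.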


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
