[PR #707] [CLOSED] #578: Add ability to schedule and manage recurring imports via the admin UI #4280

Closed
opened 2026-03-15 01:36:04 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/707
Author: @pirate
Created: 4/16/2021
Status: Closed

Base: dev ← Head: scheduler-ui


📝 Commits (2)

  • c2f2f4f make add command accept Path as import and return all_links, new_links
  • 940b9fe add beginnings of new scheduler model for recurring imports

📊 Changes

2 files changed (+106 additions, -7 deletions)


📝 archivebox/main.py (+7 -7)
➕ archivebox/scheduler/models.py (+99 -0)

📄 Description

Fixes: #578

Remaining TODOs:

  • figure out which Python scheduler to use
    • huey (my current favorite)
    • celery (ugh...)
    • APScheduler (will require lots of manual models and concurrency control code)
    • yacron (not sure if it can be configured dynamically)
    • dramatiq (doesn't support sqlite)
  • decide whether to keep supporting the system crontab at all, or tear it out (imo we should just tear it out and move to an internal scheduler)
  • fork the scheduled task worker off the server process automatically on startup, so there is no need to run a separate archivebox schedule --foreground process manually
  • figure out how to enforce an "at least once" or "at most once" concurrency model for scheduled tasks
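
To make the last TODO concrete, here is a minimal sketch of an "at most once" claim step on top of SQLite, using only the standard library. It is not code from this PR: the `scheduled_run` table, its columns, and the `create_schema`, `enqueue_run`, and `claim_next_run` helpers are all hypothetical names chosen for illustration. The idea is that a worker flips a run from `pending` to `claimed` with a conditional UPDATE before doing any work, so a run is handed out to only one worker, and a crash after claiming means the run is skipped rather than repeated.

```python
import sqlite3
import time


def create_schema(conn):
    # Hypothetical table for illustration; not the model added in this PR.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS scheduled_run (
            id INTEGER PRIMARY KEY,
            import_path TEXT NOT NULL,
            due_at REAL NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending',
            claimed_by TEXT
        )
        """
    )


def enqueue_run(conn, import_path, due_at):
    # Record one due run of a recurring import.
    with conn:
        conn.execute(
            "INSERT INTO scheduled_run (import_path, due_at) VALUES (?, ?)",
            (import_path, due_at),
        )


def claim_next_run(conn, worker_id, now=None):
    """Claim the oldest due run, or return None if nothing is claimable.

    At-most-once: the row is marked 'claimed' BEFORE the work starts and is
    never handed out again, even if the worker dies mid-run.
    """
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on exception
        row = conn.execute(
            "SELECT id, import_path FROM scheduled_run "
            "WHERE status = 'pending' AND due_at <= ? "
            "ORDER BY due_at LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause makes the claim conditional:
        # if another worker claimed the row first, rowcount is 0.
        cur = conn.execute(
            "UPDATE scheduled_run SET status = 'claimed', claimed_by = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker_id, row[0]),
        )
        if cur.rowcount != 1:
            return None
    return row
```

An "at least once" variant would flip this around: mark the run as done only after the import finishes, and put claims that have gone stale back to `pending`, accepting that a run may execute twice after a crash. Which of the two fits depends on whether re-running an import is harmless.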

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
