[GH-ISSUE #303] Question: RSSMix useless? #1732

Closed
opened 2026-03-01 17:53:13 +03:00 by kerem · 6 comments

Originally created by @dataarchivist on GitHub (Nov 20, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/303

I didn't check whether ArchiveBox can do this by itself, but for now I use http://rssmix.com/ to merge my RSS feeds into one so that a single cron job can cover a lot of websites at once. It's fully free, btw.
If it's not already possible to pull them in easily via a txt file or something similar, it would be nice to have.

kerem closed this issue 2026-03-01 17:53:13 +03:00

@pirate commented on GitHub (Nov 20, 2019):

ArchiveBox never re-downloads URLs that have already been archived, so you can already pipe in all your RSS feeds without merging or deduplicating them first; this isn't needed.
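For reference, a cron-based setup along these lines might look like the following crontab sketch. The data directory and feed URLs are placeholders, and the exact `archivebox add` invocation may vary between ArchiveBox versions, so treat this as an illustration rather than a verified recipe:

```shell
# Crontab sketch (paths and URLs are placeholders): every 6 hours, fetch two
# feeds and pipe their contents into ArchiveBox. Since ArchiveBox skips URLs
# it has already archived, overlapping or unmerged feeds are harmless.
0 */6 * * * cd /data/archivebox && curl -s 'https://example.com/a/rss.xml' 'https://example.com/b/rss.xml' | archivebox add
```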


@dataarchivist commented on GitHub (Nov 20, 2019):

Is an option to re-download a site planned?


@pirate commented on GitHub (Nov 20, 2019):

Yes, see https://github.com/pirate/ArchiveBox/issues/179


@dataarchivist commented on GitHub (Nov 21, 2019):

> Archivebox never re-downloads urls that have been already archived

But it does for RSS feeds.
So if you merge the feeds, you normally get a fresh RSS with new links every x hours.


@pirate commented on GitHub (Nov 21, 2019):

> But its does for rss feeds.

It downloads the feed itself, but it doesn't re-download the pages within it after the first time.


@dataarchivist commented on GitHub (Nov 22, 2019):

> > But its does for rss feeds.
>
> It downloads the feed itself, but it doesn't re-download the pages within it after the first time.

A normal feed doesn't contain the same link twice. ;)
Anyway, I handled this with another cron job; it was just interesting to know.
