mirror of https://github.com/ArchiveBox/ArchiveBox.git synced 2026-04-25 09:06:02 +03:00

[GH-ISSUE #333] Question: How far does archivebox traverse? #240

Closed

opened 2026-03-01 14:41:45 +03:00 by kerem · 3 comments

kerem commented

2026-03-01 14:41:45 +03:00

Owner

Originally created by @vext01 on GitHub (Mar 26, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/333

Hi,

I've recently discovered archivebox -- what a neat tool!

To try it out, I ran it on my personal website and was surprised to find that it followed links outside of my website too!

So my question is: How many links does it follow before stopping? Can this be controlled in any way?

Thanks!

P.S. I'm an OpenBSD developer. If you can get this up on PyPI, I'll happily make a port so that archivebox can be in the package manager.


kerem closed this issue

2026-03-01 14:41:45 +03:00

kerem commented

2026-03-01 14:41:46 +03:00

Author

Owner

@pirate commented on GitHub (Mar 27, 2020):

If you pipe a link in via stdin, it archives just that link; if you pass a URL as an arg it interprets it as a source to import other links from.

See:

https://github.com/pirate/ArchiveBox/wiki/Usage#import-a-single-url-or-list-of-urls-via-stdin

vs

https://github.com/pirate/ArchiveBox/wiki/Usage#import-list-of-links-exported-from-browser-or-another-service

This difference in behavior is intentional but not intuitive, so it's been changed in the upcoming v0.4 [`archivebox add`](https://github.com/pirate/ArchiveBox/wiki/Roadmap#-archivebox-add) CLI design.

Thanks for the offer re: OpenBSD! If you want to subscribe to PR #207 you'll get an update when v0.4 ships on PyPI.
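To make the two input modes described above concrete, here is a minimal Python sketch. It is not ArchiveBox's actual code: the helper names `links_from_stdin` and `links_from_source` and the URL regex are assumptions made purely for illustration.

```python
import re

# Hypothetical helpers for illustration only; not ArchiveBox's real parser.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

def links_from_stdin(piped_text):
    """Piped input: every non-empty line is itself a link to archive."""
    return [line.strip() for line in piped_text.splitlines() if line.strip()]

def links_from_source(source_content):
    """Argument input: the argument names a source, and only the links
    found inside its content are queued for archiving."""
    return URL_PATTERN.findall(source_content)

piped = "https://example.com/post\n"
feed_body = (
    '<a href="https://example.com/a">A</a> '
    '<a href="https://other.example/b">B</a>'
)

stdin_links = links_from_stdin(piped)
source_links = links_from_source(feed_body)
print(stdin_links)
print(source_links)
```

In this sketch the piped URL is archived as-is, while the content fetched from the argument is only mined for links, which is why links to other sites can end up in the archive.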


kerem commented

2026-03-01 14:41:46 +03:00

Author

Owner

@vext01 commented on GitHub (Mar 27, 2020):

> If you pass a URL as an arg it interprets it as a source to import other links from.

I see. That indeed isn't intuitive. The new CLI makes much more sense. Looking forward to that!

So with the current design, if I pass a URL as an arg, it follows links 1 deep. Is that correct?

> If you want to subscribe to PR #207 you'll get an update when v0.4 ships on PyPI.

Many thanks. Subscribed.


kerem commented

2026-03-01 14:41:46 +03:00

Author

Owner

@pirate commented on GitHub (Mar 31, 2020):

In a sense it follows one link deep, but that's not really what you want if you're looking for recursive archiving, since it doesn't archive the original URL. What it's really doing is treating the path/link argument as a *feed* to import a list of links from, e.g. a browser history or Pinboard export.
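A rough way to picture the "feed" behaviour is the Python sketch below. It is an illustration under stated assumptions, not ArchiveBox's implementation: `LinkCollector`, `import_links`, and the example URLs are all hypothetical. The point it shows is that the imported list holds the links found on the page, including external ones, but not the page's own URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the source page's URL.
                    self.links.append(urljoin(self.base_url, value))

def import_links(source_url, html):
    """Treat the page as a feed: return the links it contains, one level deep."""
    collector = LinkCollector(source_url)
    collector.feed(html)
    return collector.links

source_url = "https://my-site.example/"
page = '<a href="/about">About</a> <a href="https://elsewhere.example/x">X</a>'

imported = import_links(source_url, page)
print(imported)
```

Here `source_url` itself never appears in `imported`, which matches the description above: the argument is read as a list of links rather than archived as a page.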


kerem referenced this issue

2026-03-01 14:48:25 +03:00

[PR #240] [MERGED] Add "prefers-color-scheme: dark" support #1099
