[GH-ISSUE #354] Question: Get all links from the same domain #3273

Closed
opened 2026-03-14 21:52:57 +03:00 by kerem · 1 comment

Originally created by @walkero-gr on GitHub (Jul 7, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/354

Hello there,
I would like to ask whether there is a way to archive a website and all its subpages without following links to other domains. This would be helpful when you need to archive and preserve a whole website without having to grab all of its pages by hand, and without the risk of fetching every external website that is linked from the HTML.

Thank you for your time.

kerem closed this issue 2026-03-14 21:53:02 +03:00

@pirate commented on GitHub (Jul 7, 2020):

Duplicate: https://github.com/pirate/ArchiveBox/issues/191
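
For readers who land here, the same-domain restriction asked about above can be illustrated outside of ArchiveBox. The following is only a minimal sketch using the Python standard library; it is not ArchiveBox functionality, and the names `LinkCollector` and `same_domain_links` are hypothetical helpers invented for this example. It assumes you already have a page's HTML and want the list of links that stay on the same host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect raw href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(page_url, html):
    """Return absolute http(s) links in `html` whose host equals the host of `page_url`."""
    base_host = urlparse(page_url).hostname
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        # Skip mailto:, javascript:, and links that point at other hosts.
        if parsed.scheme in ("http", "https") and parsed.hostname == base_host:
            if absolute not in links:
                links.append(absolute)
    return links
```

Note that this sketch compares hosts exactly, so `blog.example.com` would be treated as a different domain from `example.com`; whether subdomains should count as "the same domain" is a design choice left open here. The resulting list of URLs could then be passed to whatever archiving tool you use.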
