[GH-ISSUE #1080] Is there a way to navigate to an archived URL directly without knowing its timestamp, e.g. https://archivebox.example.com/archive/en.wikipedia.org/wiki/Philosophy? #3693

Closed
opened 2026-03-15 00:02:34 +03:00 by kerem · 1 comment
Owner

Originally created by @happening-primal on GitHub (Jan 4, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1080

I've played around with ArchiveBox a few times in the past, and I always run into a limitation related to the archive URL.

As per the documentation, and in reality, the archived URL will look something like this:

https://archive.example.com/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem.html

That's great, and you can scroll through your archived web pages and find this, but it is not very convenient for actually browsing through to archived pages. One way to do this would be to use the Firefox add-on Redirector to take all attempts to go to https://en.wikipedia.org/wiki/Dining_philosophers_problem.html and redirect them to https://archive.example.com/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem.html. The issue is that you must know the archive number, in the above example 1493350273.

So, on to my question, since I've done a lot of searching on this topic with no luck: is there a 'permalink' to the most recent copy of an archived page, so that you can automate browsing your archive without needing to know the archive number?

It might look something like the examples below:

https://archive.example.com/archive/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html
https://archive.example.com/archive/single/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html
https://archive.example.com/pdf/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html
https://archive.example.com/readability/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html

Or is there any other simple algorithmic way to navigate to a desired web page snapshot located in your archive? Any help is much appreciated.

kerem closed this issue 2026-03-15 00:02:39 +03:00

@pirate commented on GitHub (Jan 4, 2023):

ArchiveBox already supports going to `https://archivebox.example.com/archive/https://example.com/some/original/url`, and it will auto-redirect to the snapshot without you needing to know the snapshot number. You can find the redirect logic here: https://github.com/ArchiveBox/ArchiveBox/blob/dev/archivebox/core/views.py#L143
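As a rough illustration of that URL pattern, here is a minimal Python sketch that builds the lookup URL from an original page URL. The helper name `archive_lookup_url` and the base URL are assumptions made for this example and are not part of ArchiveBox. It also assumes the original URL is appended as-is, the way it appears in the example above; URLs with query strings or fragments may need extra care.

```python
def archive_lookup_url(archive_base: str, original_url: str) -> str:
    """Build an ArchiveBox lookup URL of the form <base>/archive/<original URL>.

    Hypothetical helper: it only concatenates strings and does not contact
    the server or check that a snapshot exists.
    """
    # Drop any trailing slash on the base so the result has no double slash.
    return f"{archive_base.rstrip('/')}/archive/{original_url}"


if __name__ == "__main__":
    print(archive_lookup_url(
        "https://archivebox.example.com",
        "https://en.wikipedia.org/wiki/Philosophy",
    ))
    # → https://archivebox.example.com/archive/https://en.wikipedia.org/wiki/Philosophy
```

The same idea could be used in a browser redirect rule: capture the full original URL and append it after `/archive/` on your own ArchiveBox host.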

However, ArchiveBox doesn't offer proxy-replaying (aka seamless browsing with automatic redirecting to archived versions for every URL) directly from a browser.

pywb's proxy-archiving feature would be a better fit for that than ArchiveBox: https://pywb.readthedocs.io/en/develop/manual/configuring.html#http-s-proxy-mode
