mirror of https://github.com/ArchiveBox/ArchiveBox.git synced 2026-04-25 17:16:00 +03:00

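[GH-ISSUE #1080] Is there a way to navigate to an archived URL directly without knowing its timestamp, e.g. `https://archivebox.example.com/archive/en.wikipedia.org/wiki/Philosophy`? #675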

Closed

opened 2026-03-01 14:45:26 +03:00 by kerem · 1 comment

kerem commented

2026-03-01 14:45:26 +03:00

Owner

Originally created by @happening-primal on GitHub (Jan 4, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1080

I've played around with ArchiveBox a few times in the past, and I always run into a limitation related to the archive URL.

As per the documentation, and in reality, the archived URL will look something like this:

https://archive.example.com/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem.html

That's great, and you can scroll through your archived web pages and find this, but it's not very convenient for actually just browsing through to archived pages. One method would be to use the Firefox add-on Redirector to take all attempts to go to https://en.wikipedia.org/wiki/Dining_philosophers_problem.html and redirect them to https://archive.example.com/archive/1493350273/en.wikipedia.org/wiki/Dining_philosophers_problem.html, except the issue is that you must know the archive number (in the above example, 1493350273).

So, to my question, as I've done a lot of searching on this topic with no luck: is there a 'permalink' to the most recent copy of an archived page, such that you can automate the browsing of your archive without needing to know the archive number?

It might look something like the examples below:

https://archive.example.com/archive/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html
https://archive.example.com/archive/single/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html
https://archive.example.com/pdf/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html
https://archive.example.com/readability/mostrecent/en.wikipedia.org/wiki/Dining_philosophers_problem.html

Or is there any other simple algorithmic way to navigate to a desired web page snapshot located in your archive? Any help is much appreciated.

kerem closed this issue

2026-03-01 14:45:26 +03:00

kerem commented

2026-03-01 14:45:27 +03:00

Author

Owner

@pirate commented on GitHub (Jan 4, 2023):

ArchiveBox already supports going to `https://archivebox.example.com/archive/https://example.com/some/original/url`, and it'll auto-redirect without needing to know the snapshot number. You can find the redirect logic here: https://github.com/ArchiveBox/ArchiveBox/blob/dev/archivebox/core/views.py#L143

However, ArchiveBox doesn't offer proxy-replaying (aka seamless browsing with automatic redirecting to archived versions for every URL) directly from a browser.

pywb's proxy-archiving feature would be a better fit for that than ArchiveBox: https://pywb.readthedocs.io/en/develop/manual/configuring.html#http-s-proxy-mode
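As a minimal sketch of the URL-based lookup described in the reply above, the redirecting address can be built by appending the original URL to the `/archive/` path of your ArchiveBox server. The function name `archive_lookup_url` and the example hostnames are hypothetical and used only for illustration; the only behavior assumed is the URL shape quoted in the reply. URLs with query strings or unusual characters may need extra handling that this sketch does not attempt.

```python
def archive_lookup_url(archive_base: str, original_url: str) -> str:
    """Build the lookup address for `original_url` on an ArchiveBox server.

    Hypothetical helper: it only reproduces the URL shape quoted in the
    reply (<server>/archive/<original url>). The redirect to the matching
    snapshot is done by the server itself, not by this code.
    """
    if not original_url:
        raise ValueError("original_url must not be empty")
    # Drop any trailing slash on the base so the result has no double slash
    # before "archive".
    return f"{archive_base.rstrip('/')}/archive/{original_url}"


if __name__ == "__main__":
    # Example hostnames are placeholders, as in the thread.
    print(
        archive_lookup_url(
            "https://archivebox.example.com",
            "https://en.wikipedia.org/wiki/Dining_philosophers_problem",
        )
    )
```

A browser redirect rule (for example in the Redirector add-on mentioned in the question) could apply the same prefixing to every visited URL, though that only covers the top-level page and is not the proxy-style replay that pywb provides.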

kerem referenced this issue

2026-03-01 17:55:07 +03:00

[GH-ISSUE #675] Error: no such function: JSON_VALID during archivebox init when SQLite JSON extension isn't present on Windows #1936

kerem referenced this issue

2026-03-14 22:56:19 +03:00

[GH-ISSUE #675] Error: no such function: JSON_VALID during archivebox init when SQLite JSON extension isn't present on Windows #3445

[GH-ISSUE #1080] Is there a way to navigate to an archived URL directly without knowing its timestamp, e.g. `https://archivebox.example.com/archive/en.wikipedia.org/wiki/Philosophy`? #675