[GH-ISSUE #319] Feature Request: Import Safari history #1740

Closed
opened 2026-03-01 17:53:16 +03:00 by kerem · 5 comments
Owner

Originally created by @wvdk on GitHub (Feb 3, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/319

This is more of a hey-how-much-work-would-this-be than a serious feature request. I'm just getting interested in this archiving space. My preferred browser on macOS is Safari for a number of reasons. My ideal setup would be a daily job which runs ArchiveBox on the latest history from Safari. Interested in hearing people's thoughts on this.

Type

  • [ ] General question or discussion
  • [ ] Propose a brand new feature
  • [x] Request modification of existing behavior or design

What is the problem that your feature request solves

Ability to create automated archives of all websites visited via Safari browser

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

I believe Safari keeps its history in a sqlite db. I don't yet know much about ArchiveBox's input methods but I imagine this could be quite a bit of work if it's currently only built to handle stdin and simple text formats.

What hacks or alternative solutions have you tried to solve the problem?

None really. I did try importing Safari's exported bookmarks (for which ArchiveBox returned `[X] No links found :(`, but that's a separate problem). I'd really like to be able to run ArchiveBox on the browser's history.

How badly do you want this new feature?

  • [x] It's an urgent deal-breaker, I can't live without it
  • [ ] It's important to add it in the near-mid term future
  • [ ] It would be nice to have eventually

  • [x] I'm willing to contribute dev time / money to fix this issue
  • [x] I like ArchiveBox so far / would recommend it to a friend
  • [ ] I've had a lot of difficulty getting ArchiveBox set up

@pirate commented on GitHub (Feb 4, 2020):

It's super easy to add.

```bash
sqlite3 ~/Library/Safari/History.db "select date,url from history_items" > safari_history.txt
./archive < safari_history.txt
```

@wvdk commented on GitHub (Feb 4, 2020):

Wow, perfect. Completely spaced on the fact that I can so easily query sqlite. Shows how little I know lol. Thanks a bunch @pirate.

Note for any future visitors: Had to make a copy of the History.db file (it seems to be protected by Safari but a copy of it is easily queryable).
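
The copy-then-query workaround from the note above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names `snapshot_db` and `dump_safari_urls` are hypothetical, the `-wal` and `-shm` sidecar files are copied only if they happen to exist next to the database, and the query simply reuses the `history_items` table and `url` column from the command earlier in the thread.

```bash
# snapshot_db SRC
# Copy the SQLite file SRC (plus its -wal/-shm sidecar files, if present)
# into a fresh temporary directory and print the path of the copy.
# The function body runs in a subshell, so its variables do not leak.
snapshot_db() (
    src="$1"
    tmp="$(mktemp -d "${TMPDIR:-/tmp}/safari_history.XXXXXX")" || exit 1
    cp "$src" "$tmp/History.db" || exit 1
    for suffix in -wal -shm; do
        if [ -f "$src$suffix" ]; then
            cp "$src$suffix" "$tmp/History.db$suffix" || exit 1
        fi
    done
    printf '%s\n' "$tmp/History.db"
)

# dump_safari_urls OUT_FILE
# Query a snapshot of Safari's history (path is the one used in this thread)
# and write one URL per line to OUT_FILE, then remove the snapshot.
dump_safari_urls() (
    copy="$(snapshot_db "$HOME/Library/Safari/History.db")" || exit 1
    sqlite3 "$copy" "select url from history_items" > "$1"
    status=$?
    rm -rf "$(dirname "$copy")"
    exit "$status"
)
```

Usage would then look like `dump_safari_urls safari_history.txt` followed by `./archive < safari_history.txt`, as in the earlier comment.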


@pirate commented on GitHub (Feb 4, 2020):

I'm actually going to reopen this as a TODO: to add Safari history dumping support to the `bin/archivebox-export-browser-history` script.


@pirate commented on GitHub (Feb 4, 2020):

Aaaaand done, that was fast haha... https://github.com/pirate/ArchiveBox/commit/0c1b1b523cfc98833ec31297aa901360ae98318d

For now this just dumps the URLs. Ideally, in the future we should also import the URLs together with their respective "Last Visit" timestamps so that history order is preserved.
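
One way the "Last Visit" idea could look is sketched below. It is not what the linked commit does. It assumes, without confirmation from this thread, that Safari's database has a `history_visits` table with `visit_time` and `history_item` columns referencing `history_items.id`, and that `visit_time` counts seconds from 2001-01-01 UTC rather than from the Unix epoch. The names `MAC_EPOCH_OFFSET`, `mac_to_unix`, and `dump_last_visits` are hypothetical, and the "timestamp url" output format is illustrative only.

```bash
# Seconds between the Unix epoch (1970-01-01 UTC) and 2001-01-01 UTC.
MAC_EPOCH_OFFSET=978307200

# mac_to_unix SECONDS
# Convert a (possibly fractional) 2001-based timestamp to whole Unix seconds.
mac_to_unix() {
    echo $(( ${1%%.*} + MAC_EPOCH_OFFSET ))
}

# dump_last_visits DB_COPY
# Print "unix_timestamp url" for each URL, ordered by its most recent visit.
# Table and column names are assumptions about Safari's schema.
dump_last_visits() {
    sqlite3 -separator ' ' "$1" "
        SELECT CAST(MAX(v.visit_time) + $MAC_EPOCH_OFFSET AS INTEGER), i.url
        FROM history_items AS i
        JOIN history_visits AS v ON v.history_item = i.id
        GROUP BY i.url
        ORDER BY MAX(v.visit_time);"
}
```

As noted earlier in the thread, `dump_last_visits` should be pointed at a copy of `History.db` rather than the live file.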


@pirate commented on GitHub (Feb 4, 2020):

And added to the docs here: https://github.com/pirate/ArchiveBox/wiki/Usage#import-list-of-links-from-browser-history
