mirror of
https://github.com/go-shiori/shiori.git
synced 2026-04-25 06:25:54 +03:00
[GH-ISSUE #353] Support Obelisk archiving #240
Originally created by @fmartingr on GitHub (Feb 10, 2022).
Original GitHub issue: https://github.com/go-shiori/shiori/issues/353
It seems that shiori depends on warc which is currently archived. We need to find a replacement for warc. Maybe obelisk?
Acceptance criteria
warc for already existing rows, but obelisk as default)
/bookmark/:id/archive handler to load multiple archive types (to load old and new)
POST /api/v1/bookmarks / POST /api/v1/bookmarks/cache / POST /api/v1/bookmarks/:id/cache to select which archiver to use (but hardcode/default it to obelisk).
@efrecon commented on GitHub (Feb 18, 2022):
obelisk is great, I have just tested the latest release on a few examples and it does a good job at preserving the original layout and content.
@fmartingr commented on GitHub (Feb 18, 2022):
I still haven't tested/checked it yet, but the other day I stumbled upon https://github.com/gildas-lormeau/SingleFile and it also seemed quite good (and having just a single HTML file as output is quite useful as well).
@grawlinson commented on GitHub (Feb 19, 2022):
I'm in the process of packaging shiori for the AUR, and I strongly recommend staying within the Go ecosystem (obelisk can be imported as a go module!) as relying on external tools (e.g. SingleFile) defeats one of shiori's major selling points.
EDIT: Additionally, SingleFile requires a browser binary to be present, which is a Pandora's box in itself.
@fmartingr commented on GitHub (Feb 19, 2022):
Just to clarify (because I didn't express myself very well): I like how SingleFile works (the single HTML file output) but I do not plan to replace warc with it. The plan still is to go for Obelisk. :)
Edit: Yeah, when I made my first comment I didn't realise that Obelisk's output is also a Single HTML file 😅
@grawlinson commented on GitHub (Feb 19, 2022):
Thanks for clarifying that!
A package is now available on the AUR, so if there are any bug reports relating to Arch Linux, tag me and I'll attempt to help out.
@gildas-lormeau commented on GitHub (Oct 13, 2022):
For the record, this statement is false. SingleFile can work with JSDOM. Anyway, good luck!
@fmartingr commented on GitHub (Oct 14, 2022):
Thanks for the clarification, and even if I love SingleFile (it has helped me a ton while moving out to a new flat!), it would add unnecessary complexity for us. So far, obelisk seems to provide the expected results, and we could use this migration to move that project further in the Go world :)
@gildas-lormeau commented on GitHub (Oct 14, 2022):
Thanks for the feedback! Personally, I think that in 2022, you have to use a web browser for this kind of task. Also, it's really becoming essential when it comes to determining what to really save. This is where SingleFile, generally, stands out. A very large part of the code consists of optimizing the size of the saved page. To do this, a browser is unfortunately required.
@ivanrg99 commented on GitHub (Jan 18, 2024):
What's the status on this? One of the reasons why we choose to run software like Shiori is for archiving purposes, to prevent link-rot and preserve information/knowledge. Having our bookmarks stored in a binary data format as opposed to plain text hurts data preservation. Do you need any help with the transition to Obelisk? Is anyone working on this at the moment?
@Monirzadeh commented on GitHub (Jan 18, 2024):
Personally, I am trying to make it ready to use later. Currently I am working on https://github.com/go-shiori/obelisk/pull/96 and https://github.com/go-shiori/obelisk/pull/98
We have some open issues there too. You can work on any aspect that you like.
@fmartingr commented on GitHub (Feb 4, 2024):
I need to sit down and pave the way for people to start implementing these features. I started a draft under #481 some time ago but didn't sit down again on that since there were other things that had priority, like the API. I guess the API migration will get faster over time while we refactor the logic in different components, but that's still the main priority now.
For this to work, we will need to isolate the archiving logic in its own domain and provide backwards compatibility, which will require a migration adding a new column specifying which archive format a bookmark is currently in.
What I'm trying to say is that it can be done and is on my radar, but it is not trivial. Once 1.6 is released I need to sit down and work on the roadmap again, defining some issues for the several things we need to work on and probably making some PRs to prepare for that to happen.
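To make the idea of isolating the archiving logic behind a per-bookmark format more concrete, here is a minimal sketch in Go. It is not taken from the shiori code base: the names ArchiveFormat, Archiver, Registry and stubArchiver are hypothetical, and it assumes the new column would hold a format name such as "warc" or "obelisk", with an empty value falling back to obelisk for new requests.

```go
package main

import (
	"errors"
	"fmt"
)

// ArchiveFormat is a hypothetical name for the value stored in the new
// per-bookmark column described above.
type ArchiveFormat string

const (
	FormatWARC    ArchiveFormat = "warc"    // assumed value for already existing rows
	FormatObelisk ArchiveFormat = "obelisk" // assumed default for new archives
)

// Archiver is a hypothetical interface that each archive backend would
// implement so that handlers do not depend on a specific format.
type Archiver interface {
	// Load returns the stored archive content for the given bookmark ID.
	Load(bookmarkID int) ([]byte, error)
}

// ErrUnknownFormat is returned when no backend is registered for a format.
var ErrUnknownFormat = errors.New("unknown archive format")

// Registry maps archive formats to their backends.
type Registry struct {
	archivers map[ArchiveFormat]Archiver
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{archivers: make(map[ArchiveFormat]Archiver)}
}

// Register associates a backend with a format name.
func (r *Registry) Register(format ArchiveFormat, a Archiver) {
	r.archivers[format] = a
}

// Resolve picks the backend for a format. An empty format falls back to
// obelisk, mirroring the "hardcode/default it to obelisk" criterion.
func (r *Registry) Resolve(format ArchiveFormat) (Archiver, error) {
	if format == "" {
		format = FormatObelisk
	}
	a, ok := r.archivers[format]
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrUnknownFormat, format)
	}
	return a, nil
}

// stubArchiver stands in for a real backend; it only formats a string.
type stubArchiver struct {
	name string
}

func (s stubArchiver) Load(bookmarkID int) ([]byte, error) {
	return []byte(fmt.Sprintf("%s archive for bookmark %d", s.name, bookmarkID)), nil
}

func main() {
	reg := NewRegistry()
	reg.Register(FormatWARC, stubArchiver{name: "warc"})
	reg.Register(FormatObelisk, stubArchiver{name: "obelisk"})

	// An old row keeps its stored format; an empty format uses the default.
	for _, format := range []ArchiveFormat{FormatWARC, ""} {
		a, err := reg.Resolve(format)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		data, _ := a.Load(42)
		fmt.Println(string(data))
	}
}
```

With something along these lines, the archive handler would only need the bookmark's stored format to decide how to load old and new archives, and real warc and obelisk backends could replace the stubs without touching the callers.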
@dehlen commented on GitHub (Feb 26, 2024):
Hey,
I am eagerly awaiting the work on this issue. I would like to migrate my catalog of bookmarks saved in instapaper to shiori and self host this on my local network. However what is holding me back is that the current implementation stores the archived bookmark in a bolt database. I am now wondering whether I should wait for obelisk support in shiori or if it makes sense to migrate right away. I do not want to import all my bookmarks again whenever obelisk is added and am wondering how likely it is there will exist a migration path for previously archived bookmarks to be converted from the bolt db to an html output created by obelisk.
As I understand it, this is definitely on your radar, but it's just something you haven't found time to look at yet. My comment shouldn't pressure you in any way; it's more of a +1 for this feature and a way to be subscribed to the ongoing discussion. Whenever you have new information regarding this issue, I am very keen to hear it :)