mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #531] Feature Request: One-Click Deploy to hosting providers #3358
Originally created by @mAAdhaTTah on GitHub (Nov 11, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/531
DigitalOcean is launching a one-click deploy for its AppPlatform. This won't work for us yet because we would need to attach a Volume, which AppPlatform doesn't support, but the documentation linked suggests it will soon/eventually. Alternatively, we could look into configuring it for Heroku.
I'm happy to take the lead on this as well, but wanted to open an issue for visibility/discussion.
Type
What is the problem that your feature request solves
I think it would be helpful for new users to be able to spin up an ArchiveBox instance in the cloud w/ minimal work. Running it on Docker in the first place is really helpful, but would be nice to simplify it even further.
Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes
It should be feasible for a new user
What hacks or alternative solutions have you tried to solve the problem?
I'm still considering how I'm going to host my archive. I initially spun it up on a home server, which works but doesn't help if I want to expose the in-progress REST API to my website. I then put it on a DO droplet, which I'm still fiddling with. I've also considered writing ansible roles for this as well, although that's a bit more involved for the less technical.
The main issue with something like AppPlatform & Heroku is that you don't get CLI access, so everything needs to function via the UI. Downloading sites can take several minutes, which may time out if deployed on AppPlatform (I haven't tested it in that context but it's definitely been happening on my droplet). Maybe worth looking at/considering how we can configure this as background tasks or something? Or maybe deploy to AppPlatform as a worker?
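A minimal sketch of the background-task idea, assuming a single worker thread inside the web process is acceptable. `archive_fn` and `start_worker` are hypothetical names for illustration, not ArchiveBox APIs; the point is only that the request handler enqueues a URL and returns immediately instead of waiting several minutes for the download.

```python
import queue
import threading


def start_worker(archive_fn, results):
    """Start a daemon thread that archives URLs pulled from a queue.

    archive_fn is a hypothetical callable(url) that may take minutes.
    results is a dict the worker fills with url -> result (or exception).
    """
    jobs = queue.Queue()

    def run():
        while True:
            url = jobs.get()          # blocks until a job is available
            try:
                results[url] = archive_fn(url)
            except Exception as exc:  # keep the worker alive on failures
                results[url] = exc
            finally:
                jobs.task_done()

    threading.Thread(target=run, daemon=True).start()
    return jobs
```

A web view would call `jobs.put(url)` and respond right away; the UI can poll `results` later. On a platform that restarts containers, queued jobs held in memory are lost, so a persistent queue would be needed for anything beyond a demo.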
How badly do you want this new feature?
@pirate commented on GitHub (Apr 6, 2021):
Some managed hosting options have popped up in the last few months, might be worth checking out if you're willing to pay $ for hosting:
https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community#managed-archivebox-hosting
@olimart commented on GitHub (Apr 19, 2021):
Heroku button support would be awesome indeed.
https://www.heroku.com/elements/buttons
@mAAdhaTTah commented on GitHub (Apr 19, 2021):
@olimart The biggest issue with doing this is the filesystem. Heroku & DO's App Platform both provide ephemeral filesystems per deploy, so they're wiped on restart/redeploy. We'd need to either configure those platforms for block storage (something DO's AP doesn't support yet; not sure about Heroku) or provide a swappable implementation for the filesystem to save things to S3 or some other object storage (DO's Spaces, which is S3 compatible). I haven't dug into this much but it's definitely not a trivial effort.
@olimart commented on GitHub (Apr 19, 2021):
Thanks @mAAdhaTTah
Yep, would need to provide the ability to configure external storage (S3...)
I briefly saw a reference to SQLite, which is not supported by Heroku either.
Web app on Heroku, storage on Dropbox 😄
@pirate commented on GitHub (Apr 23, 2021):
Here's a WIP DigitalOcean "one-click" deploy template, but as @mAAdhaTTah mentioned it's broken because disk storage is not supported by DO apps yet: https://github.com/ArchiveBox/ArchiveBox/blob/digitalocean/.do/deploy.template.yaml
@mAAdhaTTah commented on GitHub (Apr 25, 2021):
@pirate Yeah, and swapping out for S3 would be tough/impossible with the SQLite db (plus if the tools we use write their own files, that makes it even more difficult).
@pirate commented on GitHub (Apr 25, 2021):
I think it's still feasible though, we can write to local disk / RAM disk and then sync it to s3 or other storage backends every few seconds. It'll have a second or two of lag but I think that's an acceptable trade off.
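A rough sketch of that "write locally, sync every few seconds" loop, using only the standard library. The `upload` callable is a placeholder for whatever storage client is used (S3, DO Spaces, ...); its signature here is an assumption, as are the function names. Changes are detected by comparing each file's mtime and size against the last uploaded state.

```python
import os
import time


def changed_files(root, seen):
    """Yield (path, signature) for files whose mtime or size differs from `seen`."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat (e.g. a temp file)
            sig = (st.st_mtime_ns, st.st_size)
            if seen.get(path) != sig:
                yield path, sig


def sync_once(root, seen, upload):
    """Upload every changed file once; returns how many files were sent.

    upload is a hypothetical callable(local_path, remote_key).
    """
    count = 0
    for path, sig in changed_files(root, seen):
        key = os.path.relpath(path, root).replace(os.sep, "/")
        upload(path, key)
        seen[path] = sig  # record only after the upload returned without error
        count += 1
    return count


def sync_forever(root, upload, interval=5.0):
    """Poll the archive directory and push changes every `interval` seconds."""
    seen = {}
    while True:
        sync_once(root, seen, upload)
        time.sleep(interval)
```

This uploads whole files, not diffs, and it does not handle deletions or files that are still being written when the poll fires.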
@mAAdhaTTah commented on GitHub (Apr 25, 2021):
@pirate How would you handle the db in that instance? Sync it down on boot?
@pirate commented on GitHub (Apr 25, 2021):
Nah just rsync it every few seconds like all the other files. I think S3 supports byte-range requests so you can just sync the diffs instead of the whole thing each time.
@turian commented on GitHub (Aug 12, 2022):
I would also want this feature
@turian commented on GitHub (Sep 11, 2022):
Alternatively, use the DigitalOcean Postgres server. (Or is ArchiveBox sqlite3 only?)
@turian commented on GitHub (Sep 12, 2022):
Additionally, it might be possible to use s3fuse to treat the DO spaces as a local filesystem
This might be kinda gross since you have to overwrite the whole file each time; you can't modify it in place or append to it. That could cause issues.
@mAAdhaTTah commented on GitHub (Sep 12, 2022):
@turian The big issue, as I understand it, is the external binaries write files directly to disk.
@turian commented on GitHub (Sep 12, 2022):
Yeah but @pirate 's suggestion is just to rsync very frequently to s3.
On startup, you rsync back from s3. (I guess this can get expensive if you are not in AWS, since s3 downloads are costly.)
(BTW, digital ocean spaces are s3 compatible.)
The only real issue I can think of is durability, like if the process breaks for some reason and you have a corrupted thing. Then you have to rollback the s3 which could be a pain.
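One way to reduce the corruption risk for the database specifically, sketched under the assumption that the index is a single SQLite file: instead of uploading the live file (which might be copied mid-write), take a snapshot with SQLite's online backup API and upload the snapshot. `snapshot_sqlite` and both paths are hypothetical names for illustration.

```python
import sqlite3


def snapshot_sqlite(live_path, snapshot_path):
    """Copy a live SQLite database to snapshot_path via the online backup API.

    The backup API produces a consistent copy even if other connections
    are writing to the source while the copy is in progress.
    """
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(snapshot_path)
    try:
        src.backup(dst)  # copies all pages of the "main" database
    finally:
        dst.close()
        src.close()
    return snapshot_path
```

The snapshot file can then be pushed to object storage like any other file. This only covers the database; files written directly by external binaries would still need their own handling.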
@mAAdhaTTah commented on GitHub (Sep 12, 2022):
rsync'ing back & forth seems rough for an archive of any serious size. I believe my archive is several GBs at this point and if I had to resync it down on startup and rsync up after archiving, that would be pretty slow.
@turian commented on GitHub (Sep 12, 2022):
@mAAdhaTTah So I don't know the internals of archivebox but:
@pirate commented on GitHub (Sep 15, 2022):
I believe rsyncing bidirectionally on startup can be made reasonably fast/efficient even for large archives as there are advanced rsync options that let you store a sync cache file for faster diffing.
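To illustrate the cached-diff idea in general terms (this is not rsync's own mechanism, just a sketch of the same principle): keep a small manifest of `key -> [mtime, size]` for each side, persist it as JSON, and compare manifests on startup instead of transferring or re-reading every file. All names and the manifest format are assumptions.

```python
import json
import os


def load_manifest(path):
    """Load a {key: [mtime, size]} manifest, or an empty one if none exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def save_manifest(path, manifest):
    """Persist the manifest so the next startup can diff without a full scan."""
    with open(path, "w") as f:
        json.dump(manifest, f)


def plan_sync(local, remote):
    """Compare two manifests and return (keys_to_upload, keys_to_download).

    Entries present on one side only are copied to the other side.
    When both sides differ, the newer mtime wins; ties favor the remote copy.
    """
    upload, download = [], []
    for key in sorted(set(local) | set(remote)):
        l, r = local.get(key), remote.get(key)
        if l == r:
            continue  # identical on both sides, nothing to transfer
        if r is None or (l is not None and l[0] > r[0]):
            upload.append(key)
        else:
            download.append(key)
    return upload, download
```

With this approach, startup cost scales with the number of changed files plus the size of the manifest, not with the total size of the archive. Deletions are not modeled here: a file removed on one side would be copied back from the other.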
@turian commented on GitHub (Sep 15, 2022):
@mAAdhaTTah Also, if you want a one-click deploy of ArchiveBox, you can get one on PikaPods. It costs a few bucks a month.
I think they are running 0.6.2. Unfortunately, this means you will still get crashes from the UTF-8 bug and the youtube-dl bugs, and the archiving will stop; there are PRs for these, but they are not merged yet.
PikaPods builds all their one-click app stuff in house (not open source) I think, so there's no way to customize.
Another option is YunoHost. Their apps are all open-source, so in principle there could be a bleeding edge archivebox app in there too.
@pirate commented on GitHub (Jun 13, 2023):
I'm going to close this for now because realistically the only two options I foresee for the future are:
@boehs commented on GitHub (May 6, 2024):
For what it's worth, I did a Railway deploy; this is a link to it. I think for new users they give you $5 in credit, and once that is used you get $5 credit for a $5 subscription. ArchiveBox uses like $1 of credit or so per month.
Edit: here it is deployed: https://box.boehs.org/archive/1714976395.796772/index.html
@turian commented on GitHub (Oct 24, 2024):
@pirate I just spent the better part of two days trying to write an ansible playbook setting up archivebox on hetzner with caddy and decent security, and it still doesn't work. So I would love it if you launched a managed hosted option. I would pay at least double what your server / PaaS rental costs you, just to give you an idea of possible pricing.
Indeed, I would venture to say that MANY MANY more people are interested in USING archivebox than in maintaining it. See how popular pinboard.in is? This could be the next one, particularly considering that pinboard.in dev goes dark for extended periods of time.
"I turn ArchiveBox into a for-profit enterprise and offer paid ArchiveBox hosting (in which case I have no interest in supporting competing paid deployment solutions for free)" YES PLEASE. I think that is probably the most sustainable path to recurring revenue.
Feel free to email me at lastname at gmail's email service if you want feedback