[GH-ISSUE #916] Feature Request: Consider providing an official AppImage #569

Open
opened 2026-03-01 14:44:39 +03:00 by kerem · 1 comment
Owner

Originally created by @noctux on GitHub (Jan 24, 2022).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/916

Type

  • [ ] General question or discussion
  • [x] Propose a brand new feature
  • [ ] Request modification of existing behavior or design

What is the problem that your feature request solves

ArchiveBox bundles multiple "moving pieces", including a Django website, several Node.js-based services, youtube-dl, and so on. This makes maintaining an installation a time-consuming task, especially on fast-moving rolling-release distributions.
The proposed solution to this is running the Docker container, which among other things requires root access (so it does not work, e.g., on my university's shell account) and a complicated setup.
It would be nice if there were a simpler way to maintain a personal ArchiveBox while, as the README states, "avoid polluting your host system with extra dependencies".

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

AppImages (https://appimage.org/) allow packing an application, along with its dependencies, into a single-file binary. This binary contains a squashfs-based filesystem image of the application, which is mounted and then used to run the app.
Thus, distribution is (ideally) as simple as downloading the binary from the official release page, making it executable, and running it.
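
As a rough sketch of the "make it executable and run it" step: the helper below is hypothetical (the name `run_appimage` is made up for illustration) and assumes the AppImage has already been downloaded by some other means.

```shell
# Hypothetical helper: mark a downloaded AppImage as executable and run it.
# The file passed in is whatever the user downloaded; no official artifact
# name or download URL is implied here.
run_appimage() {
    appimage=$1
    shift
    if [ ! -f "$appimage" ]; then
        echo "AppImage not found: $appimage" >&2
        return 1
    fi
    chmod +x "$appimage" || return 1
    # Use an explicit path so the shell does not search $PATH for the file.
    case $appimage in
        /*) "$appimage" "$@" ;;
        *)  "./$appimage" "$@" ;;
    esac
}
```

With the file name used later in this issue, that would be called as `run_appimage archivebox-0.6.2.glibc2.29-x86_64.AppImage init`.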

Cons:

  • Another maintenance burden
  • AppImages do not provide isolation, in contrast to Docker. This can be both good and bad:
    • The AppImage could, for instance, run a host-provided chromium/ffmpeg/youtube-dl, and it can be interacted with by means other than the web interface, ...
    • Several deployment strategies (such as a systemd unit activated by a systemd timer) do not work with Docker at all; furthermore, these usually provide their own hardening options that allow security isolation
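
To illustrate the systemd-timer deployment mentioned in the list above, here is a minimal sketch that writes a user-level service and timer. Everything specific in it is an assumption made for illustration: the function name, the unit names, the daily schedule, and the archived URL. Hardening directives are deliberately left out, since which of them work together with a mounted AppImage has not been verified here.

```shell
# Hypothetical sketch: write a user-level service and timer for a periodic run.
# Unit names, schedule, and the archived URL are placeholders.
# Paths containing spaces would need extra quoting in ExecStart.
write_archivebox_units() {
    unit_dir=$1      # e.g. "$HOME/.config/systemd/user"
    appimage=$2      # absolute path to the AppImage
    data_dir=$3      # ArchiveBox data directory

    mkdir -p "$unit_dir" || return 1

    cat > "$unit_dir/archivebox.service" <<EOF
[Unit]
Description=Scheduled ArchiveBox run from an AppImage

[Service]
Type=oneshot
WorkingDirectory=$data_dir
ExecStart=$appimage add https://example.com
EOF

    cat > "$unit_dir/archivebox.timer" <<EOF
[Unit]
Description=Run archivebox.service once a day

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}
```

After writing the files, `systemctl --user daemon-reload` followed by `systemctl --user enable --now archivebox.timer` would activate the timer.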

Pros:

  • More or less portable, single file binary might help adoption
  • Similar to the existing docker container, can probably be integrated into the existing release automation I've seen on master

How badly do you want this new feature?

  • [ ] It's an urgent deal-breaker, I can't live without it
  • [ ] It's important to add it in the near-mid term future
  • [x] It would be nice to have eventually

  • [x] I'm willing to contribute dev time to fix this issue

I toyed around with this idea last weekend and used pkg2appimage (https://github.com/AppImage/pkg2appimage, version pkg2appimage-1807-x86_64.AppImage) to create an AppImage for ArchiveBox:

# Note: the BUILD HOST needs python3 (same minor version as in the image, e.g. python3.9 on both), python3-venv, nodejs and npm installed
app: archivebox
ingredients:
  dist: focal
  sources:
    # Order is important, currently (as of 2022-01-23) apt-get.do-download in pkg2appimage prefers earlier sources
    # When switching distributions, be sure to adapt the pythonpath in `script:` as well
    - deb https://deb.nodesource.com/node_14.x focal main
    - deb http://archive.ubuntu.com/ubuntu/ focal main universe
    - deb http://archive.ubuntu.com/ubuntu/ focal-updates main universe
    - deb http://archive.ubuntu.com/ubuntu/ focal-security main universe
  packages:
    - python3.9-venv
    - nodejs
    - npm
    - wget
    - curl
script:
  - wget -c "https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/dev/icon.png" -O archivebox.png
  - cp ./archivebox.png ./usr/share/icons/hicolor/256x256/
  - npm install -g --prefix ./usr 'git+https://github.com/pirate/readability-extractor'
  - npm install -g --prefix ./usr '@postlight/mercury-parser'
  - npm install -g --prefix ./usr 'gildas-lormeau/SingleFile#master'
  - python3.9 -m venv usr
  - source ./usr/bin/activate
  - ./usr/bin/pip3 install --ignore-installed archivebox
  - cat > archivebox.desktop <<\EOF
  - [Desktop Entry]
  - Type=Application
  - Terminal=true
  - Name=archivebox
  - Exec=archivebox
  - Categories=Network;
  - Icon=archivebox
  - EOF
  - usr/bin/pip3 freeze | grep "archivebox" | cut -d "=" -f 3 | head -n1 > ../VERSION
  # Fix up the Python path. Note: the required `import os` is inserted by a later sed command, which places it above these lines
  - sed -i '3s|^|import sys\nsys.path.insert(0, os.getenv("APPDIR") + "/usr/lib/python3.9/site-packages")\nsys.path.insert(0, os.getenv("APPDIR") + "/usr/lib/python3.9")\n|' ./usr/bin/archivebox
  # Prepare it for "recursion", i.e. calling other Python programs
  - sed -i '3s|^|os.environ["PYTHONPATH"] = os.getenv("APPDIR") + "/usr/lib/python3.9:" + os.getenv("APPDIR") + "/usr/lib/python3.9/site-packages"\n|' ./usr/bin/archivebox
  # Appimage chdirs to usr/ by default...
  - sed -i '3s|^|import os\nos.chdir(os.getenv("OWD"))\n|' ./usr/bin/archivebox
  # Patch python interpreter path
  - find . -type f -exec sed -i '1 s|^#!/.*\.AppDir/usr/bin/\(python3*\)|#!/usr/bin/env \1|g' {} +
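
Two details of the recipe's sed steps are easy to miss: each `3s|^|...|` command prepends to line 3, so the command that runs last ends up first in the file, and the final step rewrites the build-time shebang to use `env`. The following demo reproduces both on a throwaway file. The sample file contents and the simplified inserted lines are assumptions for illustration, not the real `usr/bin/archivebox` script, and GNU sed is assumed.

```shell
# Demo on a throwaway file; needs GNU sed (for -i and \n in the replacement).
demo=$(mktemp)
printf '%s\n' '#!/tmp/build/archivebox.AppDir/usr/bin/python3.9' \
    '# -*- coding: utf-8 -*-' \
    'import re' \
    'from archivebox.cli import main' > "$demo"

# Same pattern as the recipe: every command prepends to line 3,
# so the text inserted by the last command lands above the earlier one.
sed -i '3s|^|import sys  # inserted first\n|' "$demo"
sed -i '3s|^|import os  # inserted last\n|' "$demo"

# Same shebang rewrite as the recipe's final step: the group captures
# "python3" and the untouched remainder of the line keeps the ".9".
sed -i '1 s|^#!/.*\.AppDir/usr/bin/\(python3*\)|#!/usr/bin/env \1|g' "$demo"

cat "$demo"
```

In the resulting file, `import os` sits on line 3 above `import sys`, which is why the recipe can use `os.getenv` in lines that were inserted by earlier commands.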

The image can then be built using ARCH=x86_64 ./pkg2appimage-1807-x86_64.AppImage archivebox.yml. Currently, the AppImage has a size of approx. 115M.
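
Since the recipe depends on tools being present on the build host, a small pre-flight check before the build may save a failed run. This is only a sketch: the function names are hypothetical, the tool list (`python3.9`, `node`, `npm`) is an assumption based on the note at the top of the recipe, and it cannot detect a missing python3-venv package.

```shell
# Hypothetical pre-flight check before invoking pkg2appimage.
# Prints the name of every given tool that is not found on $PATH.
missing_build_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

build_archivebox_appimage() {
    # Tool names are assumptions; adjust them to the build host.
    missing=$(missing_build_tools python3.9 node npm)
    if [ -n "$missing" ]; then
        # Unquoted on purpose: print the missing tools on one line.
        echo "missing on the build host:" $missing >&2
        return 1
    fi
    # Build command as given in the text above.
    ARCH=x86_64 ./pkg2appimage-1807-x86_64.AppImage archivebox.yml
}
```

Calling `build_archivebox_appimage` from the directory that contains both the pkg2appimage binary and `archivebox.yml` then either reports what is missing or starts the build.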

I used Debian Bullseye as the build host (because it was at hand; I needed to run "apt-get install python3.9-venv npm" on the host) and tested on Debian Bullseye and a current Arch Linux installation (I tested init and add, including YouTube/GitHub links). I have not included ffmpeg or chromium in the AppImage so far, because I believe that those are 1) optional dependencies and 2) better maintained outside of the AppImage by the distribution's regular package manager, as their attack surface makes regular, high-frequency updates more or less mandatory.
The AppImage was able to pick up and use the distribution binaries of these tools on both tested distributions.

The only bug that I've run into was https://github.com/openssl/openssl/issues/7481 for the OpenSSL employed by youtube-dl. As the location of the SSL_CERT_FILE differs a bit between distributions, it was necessary to run the AppImage as SSL_CERT_FILE=/etc/ssl/cert.pem ./archivebox-0.6.2.glibc2.29-x86_64.AppImage ... on Arch Linux. In theory, one could include a cert file in the AppImage. But once more, I believe it's best to use the host system's information there, as the user might have custom SSL trust levels configured.
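
One way to avoid hard-coding the path per distribution would be a small wrapper that probes a few well-known CA bundle locations on the host. The sketch below is hypothetical: the function names are made up, and the candidate list is an assumption that is certainly not exhaustive.

```shell
# Hypothetical helper: print the first existing regular file among the arguments.
first_existing_file() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Commonly used CA bundle locations; this list is an assumption, not exhaustive.
find_ca_bundle() {
    first_existing_file \
        /etc/ssl/certs/ca-certificates.crt \
        /etc/pki/tls/certs/ca-bundle.crt \
        /etc/ssl/cert.pem \
        /etc/ssl/ca-bundle.pem
}
```

A launcher could then run `SSL_CERT_FILE="${SSL_CERT_FILE:-$(find_ca_bundle)}" ./archivebox-0.6.2.glibc2.29-x86_64.AppImage ...`, which keeps a value the user has already set and only falls back to the probed host bundle otherwise.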

Thanks for reading this far and for developing and maintaining archivebox in general! So what do you think? Any interest? Please note that this is the first AppImage I've built, but I'll do my best to answer any questions or improve the AppImage if desired :)

Best regards
Simon

Author
Owner

@pirate commented on GitHub (Jan 24, 2022):

I've never used AppImage before, and probably don't have the time to maintain another release channel (we already have too many for me to handle comfortably). That being said, if someone contributes a GitHub Actions pipeline that automatically builds and releases an AppImage on new release tags, then it's a minimal additional burden for me and I'd be willing to merge it.
