[GH-ISSUE #1348] Announcement: Renaming init --setup to install + init --createsuperuser in upcoming release #2335

Open
opened 2026-03-01 17:58:16 +03:00 by kerem · 4 comments
Owner

Originally created by @pirate on GitHub (Feb 17, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1348

init --setup is confusing because the words mean similar things.

I think I want to do something similar to playwright's install + --with-deps.

I also want to install all dependencies into data/lib/... and symlink them into data/bin/.


archivebox init

  • --install | --install=system,local,extras,search,...: install package dependencies during init
  • --createsuperuser: run archivebox manage createsuperuser during init

archivebox install [pkgname|tagname]

If [pkgname|tagname] is not specified, it will install everything by default.
If a [pkgname|tagname] is specified, it will try to install that package or packages matching that tag using any/all the providers it supports.

  • --provider=apt|brew|pkg|pip|npm|cargo|... specify which dependency provider to use
  • [provider options] e.g. archivebox install wget --provider=usr --path=/hardcoded/path/to/wget
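
The name-or-tag lookup described above could be sketched roughly as follows. This is only an illustration: `Dependency`, `REGISTRY`, and `select` are hypothetical names, and the tags are the ones listed for each dependency in this proposal.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dependency:
    """Hypothetical record for one dependency (illustrative only)."""
    name: str
    tags: frozenset = field(default_factory=frozenset)


# Tags as listed in the Dependencies section of this proposal.
REGISTRY = [
    Dependency("git", frozenset({"system", "extractors"})),
    Dependency("yt-dlp", frozenset({"local", "extractors"})),
    Dependency("ripgrep", frozenset({"system", "extras", "search"})),
    Dependency("sonic", frozenset({"system", "extras", "daemons", "search"})),
]


def select(selector=None, registry=REGISTRY):
    """Return dependencies whose name or tag matches; everything if None."""
    if selector is None:
        return list(registry)
    return [d for d in registry if d.name == selector or selector in d.tags]
```

With this shape, `select("search")` would pick `ripgrep` and `sonic`, while `select()` with no argument returns every registered dependency, matching the "install everything by default" behavior.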

Dependencies

git

  • tags: system, extractors
  • config: {enabled: True, args: [], git_domains: 'github.com,gitlab.com,...'}
  • providers: {'*': {name: 'git'}, apt: {}, brew: {}, pkg: {}, usr: {}}
  • @bin_path(): $ which git
  • @version(): $ git --version
  • @is_valid(): self.bin_path and self.version > 2.1.1
  • @provider(): $ which git && echo 'external' || (which apt || which brew || which pkg) || echo None
  • .post_install(): link_dependency(): $ ln -s {self.bin_path} data/bin/git
  • .install(provider=self.get_provider): provider.install(**self.providers[self.provider])
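
The `@is_valid()` check above compares against `2.1.1`, which is not a valid Python literal as written. One possible way to implement it, assuming the version is parsed into a tuple of integers first, is sketched below. `parse_version` and `git_is_valid` are hypothetical helper names, not part of any existing API.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def git_is_valid(min_version=(2, 1, 1)):
    """True if git is on PATH and reports a version newer than min_version."""
    bin_path = shutil.which("git")  # equivalent of `which git`
    if bin_path is None:
        return False
    result = subprocess.run([bin_path, "--version"], capture_output=True, text=True)
    version = parse_version(result.stdout)
    return version is not None and version > min_version
```

Comparing tuples avoids the string-comparison trap where `"2.10.0"` would sort before `"2.9.0"`.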

yt-dlp

  • tags: local, extractors
  • config: {enabled, max_media_size, playlists, args: [--max-media-size=MAX_MEDIA_SIZE, --playlists=PLAYLISTS]}
  • providers: {'*': {name: 'yt-dlp'}, apt: {name: 'yt-dlp ffmpeg'}, brew: {}, pip: {}, usr: {}}
  • ...

ripgrep

  • tags: system, extras, search
  • config: {enabled, args}
  • providers: {'*': {name: 'ripgrep'}, apt: {}, brew: {}, pip: {}, usr: {}}
  • ...

sonic

  • tags: system, extras, daemons, search
  • config: {enabled, host, port, username, password, args: ['-c', 'data/etc/sonic.cfg']}
  • providers: {'*': {name: 'sonic'}, apt: {preinstall:
    apt-get install -y curl gpg
    echo "deb [signed-by=/usr/share/keyrings/valeriansaliou_sonic.gpg] https://packagecloud.io/valeriansaliou/sonic/debian/ bookworm main" > /etc/apt/sources.list.d/valeriansaliou_sonic.list
    apt-get update -qq
    }, brew: {}, cargo: {}, usr: {}}
  • .post_install(): if data/etc/sonic.cfg not present, download it from github
  • .start(): {self.bin_path} {args}
  • @pid(): {pid: 1234, host: HOSTNAME, port: PORT}
  • .stop(): $ kill {self.pid}; sleep 30; kill -9 {self.pid}
  • @is_up(): test_sonic_connection(*self.pid)
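
The `.stop()` step above (polite kill, wait, then force kill) might look something like this for a daemon started via `subprocess`. This is a sketch under assumptions: `stop` is a hypothetical helper, and unlike the shell one-liner it returns as soon as the process exits rather than always sleeping the full 30 seconds.

```python
import subprocess
import sys


def stop(proc, timeout=30.0):
    """Ask proc to exit, wait up to timeout seconds, then force-kill it."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a long-running daemon such as sonic.
    daemon = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    stop(daemon, timeout=5)
    print("stopped:", daemon.poll() is not None)
```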

...

...

sys install: git, curl, wget, ffmpeg, ripgrep, sonic, nodejs, pip, fonts, etc.
pip install: yt-dlp, gallery-dl, playwright+chromium
npm install: singlefile, readability, mercury


Providers

apt

  • sudo: True
  • @is_available(): $ (which apt)
  • .install(name: str, sources={}, install_recommends=True):
    for source in sources:
        # {
        #   name: valeriansaliou_sonic
        #   sourcefile: /etc/apt/sources.list.d/valeriansaliou_sonic.list
        #   keyfile: /usr/share/keyrings/{name}.gpg
        #   gpgurl: https://packagecloud.io/valeriansaliou/sonic/gpgkey
        #   deburl: https://packagecloud.io/valeriansaliou/sonic/debian/
        #   distro: bookworm
        #   channel: main
        # }
        echo "deb [signed-by={keyfile}] {deburl} {distro} {channel}" > {sourcefile}
        curl -fsSL "{gpgurl}" | gpg --dearmor -o {keyfile}
    apt-get update -qq
    apt-get install -y {'' if install_recommends else '--no-install-recommends'} {name}
    
  • @installed(names: List[str]):
    for pkg in names:
        binpath=$(dpkg -L {pkg} | grep bin | grep {pkg})
        if not binpath: return False
    return True
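
As a rough illustration of the `apt` provider's `.install()` pseudocode above, the two string-building steps could be written as plain functions. `apt_source_line` and `apt_install_argv` are hypothetical names; the sketch only builds the strings and does not run `apt-get` or write any files.

```python
def apt_source_line(keyfile, deburl, distro, channel):
    """Format one sources.list entry pinned to a signing key."""
    return f"deb [signed-by={keyfile}] {deburl} {distro} {channel}"


def apt_install_argv(name, install_recommends=True):
    """Build the apt-get argv; name may hold several space-separated packages."""
    argv = ["apt-get", "install", "-y"]
    if not install_recommends:
        argv.append("--no-install-recommends")
    return argv + name.split()
```

Splitting `name` on whitespace lets a provider entry such as `{name: 'yt-dlp ffmpeg'}` expand into two separate package arguments.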
    

pip

  • sudo: False
  • @is_available(): $ which pip && which python3 && which pip3
  • .install(name: str, pip_args='--upgrade'):
    python3 -m venv data/lib/pip/venv
    source data/lib/pip/venv/bin/activate
    pip install {pip_args} {name}
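
The `pip` provider steps above could also be driven from Python without sourcing the `activate` script, by calling the venv's own interpreter directly. This is a sketch only; `venv_python`, `pip_install_argv`, and `ensure_venv` are hypothetical names, and the default path is the one given in this proposal.

```python
import sys
import venv
from pathlib import Path


def venv_python(venv_dir):
    """Path of the interpreter inside a venv (layout differs on Windows)."""
    sub = "Scripts/python.exe" if sys.platform == "win32" else "bin/python"
    return Path(venv_dir) / sub


def pip_install_argv(name, venv_dir="data/lib/pip/venv", pip_args=("--upgrade",)):
    """Build the argv that installs name using the venv's own pip."""
    return [str(venv_python(venv_dir)), "-m", "pip", "install", *pip_args, name]


def ensure_venv(venv_dir="data/lib/pip/venv"):
    """Create the venv on first use; later calls do nothing."""
    if not venv_python(venv_dir).exists():
        venv.create(venv_dir, with_pip=True)
```

Running `python -m pip` through the venv's interpreter has the same effect as activating the venv first, and it avoids depending on shell state.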
    

...

...


State directories

  • data/lib/{apt,brew,pkg,usr}/bin/... (symlinks to user-provided or system-installed binaries)
  • data/lib/pip/venv/bin (python venv for local packages)
  • data/lib/npm/node_modules/.bin (npm node_modules for local packages)
  • data/lib/cargo/bin/ (cargo folder for local packages)
  • data/lib/usr/bin/... (dir of symlinks to user-provided binaries)
  • data/bin (all final binaries are symlinked here)
  • data/etc (etc / config files)
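
To make the `data/bin` layout concrete, here is one way the `link_dependency()` step mentioned under `git` might be written. It is a sketch, assuming a POSIX-style filesystem where symlinks are permitted; the function body is hypothetical and only the name and the `data/bin` destination come from this proposal.

```python
import os
from pathlib import Path


def link_dependency(bin_path, bin_dir="data/bin"):
    """Symlink bin_path into bin_dir, replacing any stale link of the same name."""
    source = Path(bin_path).absolute()
    link = Path(bin_dir) / source.name
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        link.unlink()  # drop the old link so re-running install is safe
    os.symlink(source, link)
    return link
```

Replacing an existing link rather than failing keeps repeated `archivebox install` runs idempotent, which seems desirable if a binary moves between providers.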
Author
Owner

@pirate commented on GitHub (Feb 22, 2024):

archivebox install should also respect environment variables and config so that it only installs what is enabled. This is hard, though: it would behave differently inside and outside a data folder, which might be too much complexity.

https://github.com/ArchiveBox/ArchiveBox/issues/1346

Author
Owner

@pirate commented on GitHub (Feb 22, 2024):

I'm planning to use ansible for the new install system, which will provide:

  • generic: ansible localhost -m ansible.builtin.package -a "name=yt-dlp state=latest" (apt, brew, pkg, etc.)
  • apt: ansible localhost -m ansible.builtin.apt -a "name=pkgname state=latest"
  • brew: ansible localhost -m community.general.homebrew -a "name=pkgname state=latest"
  • pip: ansible localhost -m ansible.builtin.pip -a "name=pkgname state=latest"
  • npm: ansible localhost -m community.general.npm -a "name=pkgname state=latest"
  • shell: ansible localhost -m ansible.builtin.shell -a "playwright install --with-deps chromium" --diff
  • zfs: ansible localhost -m community.general.zfs -a "name=rpool/myfs state=present"
  • docker: community.docker.docker_compose_v2
  • cloudflare: community.general.cloudflare_dns
  • inventory: https://docs.ansible.com/ansible/latest/inventory_guide/intro_inventory.html#variables-in-inventory
  • and lots more: https://docs.ansible.com/ansible/latest/collections/community/general/index.html

https://docs.ansible.com/ansible/latest/dev_guide/developing_api.html
https://ansible.readthedocs.io/projects/runner/en/latest/python_interface/
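
For illustration, the ad-hoc commands listed above could be generated from a small provider-to-module table. This is a minimal sketch: `PROVIDER_MODULES` and `ansible_argv` are hypothetical names, and the code only builds the argument list rather than invoking ansible.

```python
# Ansible module for each provider, following the list in this comment.
PROVIDER_MODULES = {
    "generic": "ansible.builtin.package",
    "apt": "ansible.builtin.apt",
    "brew": "community.general.homebrew",
    "pip": "ansible.builtin.pip",
    "npm": "community.general.npm",
}


def ansible_argv(provider, pkgname, state="latest"):
    """Build the ad-hoc ansible command that installs pkgname via provider."""
    module = PROVIDER_MODULES[provider]
    return ["ansible", "localhost", "-m", module, "-a", f"name={pkgname} state={state}"]
```

The resulting list can be passed to `subprocess.run` as-is, which sidesteps shell quoting of the `-a` argument.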

Author
Owner

@pirate commented on GitHub (May 18, 2024):

https://github.com/ArchiveBox/pydantic-pkgr

Author
Owner

@pirate commented on GitHub (Sep 24, 2024):

Ok this is almost done! Most of this is now built in the new v0.8 dev branch.
