[GH-ISSUE #1631] Bug: sqlite search backend engine does not work in v0.8.5rc51 docker image #2488

Open
opened 2026-03-01 17:59:23 +03:00 by kerem · 1 comment
Owner

Originally created by @bvc3at on GitHub (Jan 5, 2025).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1631

Originally assigned to: @pirate on GitHub.

Provide a screenshot and describe the bug

Using sqlite as the search backend results in an error while indexing:

docker exec --user=archivebox -it {container_id} /bin/bash -c "archivebox update --index-only"

[X] The search backend threw an exception=unrecognized option: "contentless_delete":

I assume this is because of an old SQLite version in the container:

 √  sqlite                2.6.0        sys_pip    /usr/local/lib/python3.11/site-packages/django/db/backends/sqlite3/base.py
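A note on reading that row: `2.6.0` appears to be the version of Python's `sqlite3` module rather than the version of the SQLite library it is linked against, so it may not say much about FTS5 support on its own. The FTS5 documentation lists `contentless_delete` as available from SQLite 3.43.0. The sketch below is one way to check the library version and probe for the option directly; the helper name `supports_contentless_delete` and the table name `probe` are made up for illustration and are not part of ArchiveBox.

```python
import sqlite3

# Minimum SQLite release documented as supporting FTS5 contentless_delete.
MIN_CONTENTLESS_DELETE = (3, 43, 0)


def supports_contentless_delete() -> bool:
    """Try to create a throwaway FTS5 table that uses contentless_delete=1."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE VIRTUAL TABLE probe USING fts5("
            "body, content='', contentless_delete=1)"
        )
        return True
    except sqlite3.OperationalError:
        # Raised for an unrecognized option, and also when FTS5 is not compiled in.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    print(
        "Meets documented minimum:",
        sqlite3.sqlite_version_info >= MIN_CONTENTLESS_DELETE,
    )
    print("contentless_delete usable:", supports_contentless_delete())
```

Running this with the container's Python interpreter should show whether the bundled SQLite library is new enough, independent of what the dependency table reports.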

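The sqlitefts plugin even has code (https://github.com/ArchiveBox/ArchiveBox/blob/55a347c32eba27915effb9529d40a76de1276370/archivebox/pkgs/abx-plugin-sqlitefts-search/abx_plugin_sqlitefts_search/searchbackend.py#L56) that raises a wrong-version error, but in my case that check wasn't triggered because of how the option is spelled (`contentless_delete` vs `contentlessdelete`).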
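To illustrate the spelling mismatch, here is a minimal sketch of a message check that accepts either spelling. It assumes the plugin decides whether to raise its version error by looking for a substring in the SQLite error message; that is a guess about the mechanism, not a copy of the plugin's code, and `mentions_contentless_delete` is a hypothetical helper name.

```python
def mentions_contentless_delete(message: str) -> bool:
    """Return True if an error message refers to the contentless-delete option.

    Underscores and hyphens are dropped before comparing, so both
    "contentless_delete" and "contentlessdelete" match.
    """
    normalized = message.lower().replace("_", "").replace("-", "")
    return "contentlessdelete" in normalized


if __name__ == "__main__":
    # The message reported in this issue.
    reported = 'unrecognized option: "contentless_delete"'
    print(mentions_contentless_delete(reported))
```

Normalizing the message before matching is just one option; comparing `sqlite3.sqlite_version_info` against a minimum version up front would avoid depending on the error text at all.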

Steps to reproduce

1. Started ArchiveBox by running `docker compose up -d`
2. Added any url
3. Ran `docker exec --user=archivebox -it {container_id} /bin/bash -c "archivebox update --index-only"`
4. Got an exception from search backend

Logs or errors

╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ [2025-01-05 11:40:15] ArchiveBox v0.8.5rc51: archivebox update --index-only                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[*] Finding matching Snapshots to update...
    - Filtering by  (exact) before=None after=None status='indexed'...
    - Checking 1 snapshot folders for existing data with status='indexed'...
    - Sorting by most unfinished -> least unfinished + date archived...
[green][*] Indexing url: https://github.com/ArchiveBox/ArchiveBox in the search index[/]


[X] The search backend threw an exception=unrecognized option: "contentless_delete":

ArchiveBox Version

0.8.5rc51
ArchiveBox v0.8.5rc51 COMMIT_HASH=63bf902 BUILD_TIME=2024-10-24 06:30:40 1729751440
IN_DOCKER=True IN_QEMU=False ARCH=aarch64 OS=Linux PLATFORM=Linux-6.6.62+rpt-rpi-2712-aarch64-with-glibc2.36 PYTHON=Cpython
EUID=1000:1000 UID=1000:1000 PUID=1000:1000 FS_UID=1000:1000 FS_PERMS=644 FS_ATOMIC=True FS_REMOTE=True
DEBUG=False IS_TTY=True SUDO=False ID=9f373648:b6dcfe91 SEARCH_BACKEND=sqlite LDAP=False

 Binary Dependencies:
 √  python                3.11.10      sys_pip    /usr/local/bin/python3.11
 √  django                5.1.2        sys_pip    /usr/local/lib/python3.11/site-packages/django/__init__.py
 √  sqlite                2.6.0        sys_pip    /usr/local/lib/python3.11/site-packages/django/db/backends/sqlite3/base.py
 √  pip                   24.0.0       sys_pip    /usr/local/bin/pip
 √  pipx                  1.1.0        sys_pip    /bin/pipx
 √  node                  22.10.0      apt        /usr/bin/node
 √  npm                   10.9.0       apt        /usr/bin/npm
 √  npx                   10.9.0       apt        /usr/bin/npx
 √  playwright            1.48.0       sys_pip    /usr/local/bin/playwright
 √  puppeteer             23.6.0       lib_npm    ~/.npm/bin/puppeteer
 √  ldap                  3.4.4        sys_pip    /usr/local/lib/python3.11/site-packages/ldap/__init__.py
 √  rg                    13.0.0       apt        /usr/bin/rg
 X  sonic                 None         not found  None of the configured providers (brew, env) were able to load binary: sonic ERRORS={}
 √  chrome                130.0.6723   env        /usr/bin/chromium-browser
 √  curl                  8.10.1       apt        /usr/bin/curl
 √  git                   2.39.5       apt        /usr/bin/git
 √  postlight-parser      2.2.3        sys_npm    ~/.npm/bin/postlight-parser
 √  readability-extractor 0.0.11       lib_npm    ~/.npm/bin/readability-extractor
 √  single-file           1.1.54       lib_npm    ~/.npm/bin/single-file
 √  wget                  1.21.3       apt        /usr/bin/wget
 √  yt-dlp                2024.10.22   sys_pip    /usr/local/bin/yt-dlp
 √  ffmpeg                5.1.6        env        /usr/bin/ffmpeg

 Package Managers:
 √  env         /usr/bin/which                                       UID=1000 PATH=~/.npm/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin…
 √  apt         /usr/bin/apt-get                                     UID=0    PATH=/usr/bin:/bin
 -  brew        not available                                        UID=1000 PATH=
 √  sys_pip     /usr/local/bin/pip                                   UID=1000 PATH=/bin:~/.local/bin:/usr/local/bin:/usr/bin
 -  venv_pip    not available                                        UID=1000 PATH=/tmp/NotInsideAVenv/lib/bin
 -  lib_pip     not available                                        UID=1000 PATH=./lib/aarch64-linux-docker/pip/venv/bin
 √  sys_npm     /usr/bin/npm                                         UID=1000 PATH=~/.npm/bin
 -  lib_npm     /usr/bin/npm                                         UID=1000 PATH=./lib/aarch64-linux-docker/npm/node_modules/.bin:./node_…
 √  playwright  /usr/local/bin/playwright                            UID=0    PATH=./lib/aarch64-linux-docker/bin:~/.npm/bin:/usr/local/bin…
 √  puppeteer   /usr/bin/npx                                         UID=1000 PATH=./lib/aarch64-linux-docker/bin

 Code locations:
 √  PACKAGE_DIR           39 files        valid     /app/archivebox
 √  TEMPLATES_DIR         4 files         valid     /app/archivebox/templates
 -  CUSTOM_TEMPLATES_DIR  missing         unused    ./user_templates
 -  USER_PLUGINS_DIR      missing         unused    ./user_plugins
 √  LIB_DIR               0 files         valid     /usr/share/archivebox/lib

 Data locations:
 √  DATA_DIR              21 files @      valid     /data
 √  CONFIG_FILE           139.0 Bytes     valid     ./ArchiveBox.conf
 √  SQL_INDEX             396.0 KB        valid     ./index.sqlite3
 √  QUEUE_DATABASE        92.0 KB         valid     ./queue.sqlite3
 √  ARCHIVE_DIR           1 files @       valid     ./archive
 √  SOURCES_DIR           1 files         valid     ./sources
 √  PERSONAS_DIR          1 files         valid     ./personas
 √  LOGS_DIR              5 files         valid     ./logs
 √  TMP_DIR               4 files         valid     /tmp/archivebox

How did you install the version of ArchiveBox you are using?

Docker (or other container system like podman/LXC/Kubernetes or TrueNAS/Cloudron/YunoHost/etc.)

What operating system are you running on?

Linux (Ubuntu/Debian/Arch/Alpine/etc.)

What type of drive are you using to store your ArchiveBox data?

  • [x] data/ is on a local SSD or NVMe drive
  • [x] data/ is on a spinning hard drive or external USB drive
  • [x] data/ is on a network mount (e.g. NFS/SMB/CIFS/etc.)
  • [ ] data/ is on a FUSE mount (e.g. SSHFS/RClone/S3/B2/OneDrive, etc.)

Docker Compose Configuration

services:
...
  archivebox:
    image: archivebox/archivebox:0.8.5rc51
    ...
    environment:
      - SEARCH_BACKEND_ENGINE=sqlite
      - FTS_SEPARATE_DATABASE=True
      - FTS_SQLITE_MAX_LENGTH=1000000000
...
  archivebox_scheduler:
    image: archivebox/archivebox:0.8.5rc51
    command: schedule --foreground --update --every=day
    environment:
      - SEARCH_BACKEND_ENGINE=sqlite
      - FTS_SEPARATE_DATABASE=True
      - FTS_SQLITE_MAX_LENGTH=1000000000
      ...

ArchiveBox Configuration


Author
Owner

@bvc3at commented on GitHub (Jan 5, 2025):

This does not seem relevant, but

What type of drive are you using to store your ArchiveBox data?

My data folder is on an external USB SSD, but data/archive is SMB-mounted storage:

    volumes:
      - ./archivebox-data:/data
      - archivebox-archive:/data/archive
 ...
volumes:
  archivebox-backup:
    driver: local
    driver_opts:
      type: cifs
      device: "{//ip/path}"
      o: "{credentials}"