[GH-ISSUE #1087] Question: Help setting up full text search #679

Closed
opened 2026-03-01 14:45:29 +03:00 by kerem · 11 comments

Originally created by @diego898 on GitHub (Jan 20, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1087

I was trying the instructions outlined here:

https://github.com/ArchiveBox/ArchiveBox/issues/956#issuecomment-1320587158

to set up full-text search on my archive of a single link. I split this out into its own issue so as not to derail the other:

@pirate - I had to download the sonic.cfg to the root directory, not the data folder.

Also, after down; down; up I tried docker-compose run archivebox update --index-only

and got:

~/archivebox
❯ docker-compose run archivebox update --index-only
[i] [2023-01-19 02:45:38] ArchiveBox v0.6.2: archivebox update --index-only
    > /data

[*] Indexing url: https://www.ecliptik.com/bookmarking-with-raindrop/ in the search index

[!] Sonic search backend threw an error while indexing: SonicServerError ERR invalid_meta_key(?["])

I've only indexed a single website so far

kerem closed this issue 2026-03-01 14:45:29 +03:00

@pirate commented on GitHub (Jan 21, 2023):

Interesting, I've never seen this error before. Can you post your sonic config from docker-compose.yml, and also the full output of archivebox --version?


@diego898 commented on GitHub (Jan 21, 2023):

from docker-compose.yml:

archivebox:
        # build: .                              # for developers working on archivebox
        image: ${DOCKER_IMAGE:-archivebox/archivebox:master}
        command: server --quick-init 0.0.0.0:8000
        ports:
            - 8000:8000
        environment:
            - ALLOWED_HOSTS=*                   # add any config options you want as env vars
            - MEDIA_MAX_SIZE=750m
            - SEARCH_BACKEND_ENGINE=sonic     # uncomment these if you enable sonic below
            - SEARCH_BACKEND_HOST_NAME=sonic
            - SEARCH_BACKEND_PASSWORD=SecretPassword
        volumes:
            - ./data:/data
            # - ./archivebox:/app/archivebox    # for developers working on archivebox

...

# To run the Sonic full-text search backend, first download the config file to sonic.cfg
    # curl -O https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/etc/sonic.cfg
    # after starting, backfill any existing Snapshots into the index: docker-compose run archivebox update --index-only
    sonic:
       image: valeriansaliou/sonic:v1.3.0
       expose:
           - 1491
       environment:
           - SEARCH_BACKEND_PASSWORD=SecretPassword
       volumes:
           - ./sonic.cfg:/etc/sonic.cfg:ro
           - ./data/sonic:/var/lib/sonic/store

and:

~/archivebox took 13s
❯ docker-compose run archivebox --version
ArchiveBox v0.6.2
Cpython Linux Linux-5.15.49-linuxkit-x86_64-with-glibc2.28 x86_64
IN_DOCKER=True DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=sonic

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.2          valid     /usr/local/bin/archivebox
 √  PYTHON_BINARY         v3.9.5          valid     /usr/local/bin/python3.9
 √  DJANGO_BINARY         v3.1.10         valid     /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py
 √  CURL_BINARY           v7.64.0         valid     /usr/bin/curl
 √  WGET_BINARY           v1.20.1         valid     /usr/bin/wget
 √  NODE_BINARY           v15.14.0        valid     /usr/bin/node
 √  SINGLEFILE_BINARY     v0.3.16         valid     /node/node_modules/single-file/cli/single-file
 √  READABILITY_BINARY    v0.0.2          valid     /node/node_modules/readability-extractor/readability-extractor
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js
 √  GIT_BINARY            v2.20.1         valid     /usr/bin/git
 √  YOUTUBEDL_BINARY      v2021.04.26     valid     /usr/local/bin/youtube-dl
 √  CHROME_BINARY         v90.0.4430.93   valid     /usr/bin/chromium
 √  RIPGREP_BINARY        v0.10.0         valid     /usr/bin/rg

[i] Source-code locations:
 √  PACKAGE_DIR           22 files        valid     /app/archivebox
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates
 -  CUSTOM_TEMPLATES_DIR  -               disabled

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled
 -  COOKIES_FILE          -               disabled

[i] Data locations:
 √  OUTPUT_DIR            11 files        valid     /data
 √  SOURCES_DIR           1 files         valid     ./sources
 √  LOGS_DIR              1 files         valid     ./logs
 √  ARCHIVE_DIR           1 files         valid     ./archive
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf
 √  SQL_INDEX             212.0 KB        valid     ./index.sqlite3


@diego898 commented on GitHub (Jan 23, 2023):

The error is different if I place it in the data folder. Also, occasionally I'll get a folder at root called sonic.cfg, which is very strange.


@pirate commented on GitHub (Jan 24, 2023):

The folder at root called sonic.cfg is caused by the file not being where you told Docker to look for it, so Docker creates an empty directory at that path and mounts it instead. Keep the file outside the data folder and mount it like you're doing in the docker-compose.yml you posted, and it should work. Can you post the contents of your sonic.cfg file? Maybe it got messed up somehow.
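
The stray-directory behaviour described above can be caught before starting the stack. The following is a minimal sketch, not part of ArchiveBox or Docker: `check_bind_sources` is a hypothetical helper that takes the host paths you expect to be regular files (for example `./sonic.cfg`) and reports any that are missing or have already been turned into a directory.

```python
import os
import tempfile


def check_bind_sources(sources):
    """Return a list of problems for host paths that should be regular files.

    Hypothetical helper for illustration; pass the host-side paths of
    file bind mounts from your own docker-compose.yml.
    """
    problems = []
    for path in sources:
        if not os.path.exists(path):
            problems.append(f"{path}: missing (Docker may create a directory here)")
        elif os.path.isdir(path):
            problems.append(f"{path}: is a directory, expected a file")
    return problems


if __name__ == "__main__":
    # Demonstrate in a throwaway directory rather than a real project folder.
    with tempfile.TemporaryDirectory() as root:
        good = os.path.join(root, "good.cfg")
        with open(good, "w") as f:
            f.write("[server]\n")
        stray = os.path.join(root, "sonic.cfg")
        os.mkdir(stray)  # simulate the stray directory left behind
        absent = os.path.join(root, "absent.cfg")
        for line in check_bind_sources([good, stray, absent]):
            print(line)
```

If the check reports a directory, remove it and put the real file in its place before bringing the containers back up.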


@diego898 commented on GitHub (Feb 6, 2023):

This is the file:

# Sonic
# Fast, lightweight and schema-less search backend
# Configuration file
# Example: https://github.com/valeriansaliou/sonic/blob/master/config.cfg


[server]

log_level = "warn"


[channel]

inet = "0.0.0.0:1491"
tcp_timeout = 300

auth_password = "${env.SEARCH_BACKEND_PASSWORD}"

[channel.search]

query_limit_default = 65535
query_limit_maximum = 65535
query_alternates_try = 10

suggest_limit_default = 5
suggest_limit_maximum = 20


[store]

[store.kv]

path = "/var/lib/sonic/store/kv/"

retain_word_objects = 100000

[store.kv.pool]

inactive_after = 1800

[store.kv.database]

flush_after = 900

compress = true
parallelism = 2
max_files = 100
max_compactions = 1
max_flushes = 1
write_buffer = 16384
write_ahead_log = true

[store.fst]

path = "/var/lib/sonic/store/fst/"

[store.fst.pool]

inactive_after = 300

[store.fst.graph]

consolidate_after = 180

max_size = 2048
max_words = 250000
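
The `auth_password` line in this file uses a `${env.SEARCH_BACKEND_PASSWORD}` placeholder, so the password is only defined if that variable is set in the Sonic container's environment (the compose file above sets it for both services). As a rough sketch, assuming the placeholder is filled from the environment as its syntax suggests, a hypothetical helper `unset_env_refs` can list any placeholders in a config text that have no value:

```python
import os
import re

# Matches ${env.NAME} placeholders like the one in the sonic.cfg above.
ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def unset_env_refs(config_text, environ=None):
    """Return sorted placeholder names that are missing or empty in environ.

    Hypothetical helper for illustration; environ defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    names = ENV_REF.findall(config_text)
    return sorted({name for name in names if not environ.get(name)})


if __name__ == "__main__":
    sample = 'auth_password = "${env.SEARCH_BACKEND_PASSWORD}"\n'
    print(unset_env_refs(sample, {}))
    print(unset_env_refs(sample, {"SEARCH_BACKEND_PASSWORD": "SecretPassword"}))
```

An empty result only means every referenced variable has a value; it does not confirm that the ArchiveBox and Sonic containers were given the same password.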


@diego898 commented on GitHub (Feb 6, 2023):

and re-running after re-downloading the file gives this error:

❯ docker-compose run archivebox update --index-only
[i] [2023-02-06 18:03:04] ArchiveBox v0.6.2: archivebox update --index-only
    > /data

[*] Indexing url: https://www.ecliptik.com/bookmarking-with-raindrop/ in the search index

[!] Sonic search backend threw an error while indexing: gaierror [Errno -2] Name or service not known

@pirate commented on GitHub (Feb 6, 2023):

Very strange, your setup is completely standard, but it's failing as if sonic is not running.

Let's try checking the sonic container logs. Can you post the output of docker-compose logs sonic?

You can also try pinging or telnetting to the sonic container from the ArchiveBox one to see if it's a network issue:

docker-compose exec archivebox bash
$ telnet sonic 1491
# or
$ ping sonic
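
If `telnet` and `ping` are not installed in the image, the same reachability check can be done with Python, which the ArchiveBox image ships (v3.9.5 in the version output above). This is a minimal sketch using only the standard library; `check_tcp` is a hypothetical helper name. It separates a name-resolution failure, which is what the `gaierror` above indicates, from a host that resolves but does not accept connections.

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Try to open a TCP connection and describe the outcome in one line."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return f"ok: connected to {host}:{port}"
    except socket.gaierror as err:
        # The hostname could not be resolved, e.g. no container with
        # that service name is currently running on the network.
        return f"dns failure for {host}: {err}"
    except OSError as err:
        # The name resolved, but the connection was refused,
        # unreachable, or timed out.
        return f"connect failure for {host}:{port}: {err}"
```

Run from inside the ArchiveBox container, `print(check_tcp("sonic", 1491))` uses the service name and port from the compose file above. A "dns failure" result points at the sonic container not being up, which is worth confirming with its logs.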

@diego898 commented on GitHub (Feb 6, 2023):

These are the outputs:

~/archivebox
❯ docker-compose logs sonic
archivebox-sonic-1  | thread 'main' panicked at 'cannot read config file: Os { code: 21, kind: Other, message: "Is a directory" }', src/config/reader.rs:24:14
archivebox-sonic-1  | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
archivebox-sonic-1  | thread 'main' panicked at 'cannot read config file: Os { code: 21, kind: Other, message: "Is a directory" }', src/config/reader.rs:24:14
archivebox-sonic-1  | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
archivebox-sonic-1  | thread 'main' panicked at 'cannot read config file: Os { code: 21, kind: Other, message: "Is a directory" }', src/config/reader.rs:24:14
archivebox-sonic-1  | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

and

~/archivebox took 4s
❯ docker-compose exec archivebox bash
root@19e3f981c5ec:/data# telnet sonic 1491
bash: telnet: command not found
root@19e3f981c5ec:/data# ping sonic
bash: ping: command not found
root@19e3f981c5ec:/data#

@diego898 commented on GitHub (Feb 6, 2023):

Note:

  • I deleted the sonic.cfg file from the data/ directory.
  • I deleted the sonic.cfg directory from root and replaced it with the correct file.
  • I ran docker-compose down; docker-compose down; docker-compose up.
  • I ran docker-compose logs sonic and it came back empty (this is good, right?).
  • Then I tried to update the index and got:
~/archivebox took 6s
❯ docker-compose run archivebox update --index-only
[i] [2023-02-06 22:57:15] ArchiveBox v0.6.2: archivebox update --index-only
    > /data

[*] Indexing url: https://www.ecliptik.com/bookmarking-with-raindrop/ in the search index

[!] Sonic search backend threw an error while indexing: SonicServerError ERR invalid_meta_key(?["])

[*] Indexing url: https://news.ycombinator.com/item?id=34667067 in the search index

[*] Indexing url: https://news.ycombinator.com/item?id=34665738 in the search index

Back on the up screen I get:

~/archivebox took 1m8s
❯ docker-compose down; docker-compose down; docker-compose up
[+] Running 3/0
 ⠿ Container archivebox-archivebox-1  Removed                             0.0s
 ⠿ Container archivebox-sonic-1       R...                                0.0s
 ⠿ Network archivebox_default         Rem...                              0.0s
[+] Running 3/3
 ⠿ Network archivebox_default         Cre...                              0.1s
 ⠿ Container archivebox-sonic-1       C...                                0.1s
 ⠿ Container archivebox-archivebox-1  Created                             0.1s
Attaching to archivebox-archivebox-1, archivebox-sonic-1
archivebox-archivebox-1  | [i] [2023-02-06 22:56:32] ArchiveBox v0.6.2: archivebox server --quick-init 0.0.0.0:8000
archivebox-archivebox-1  |     > /data
archivebox-archivebox-1  |
archivebox-archivebox-1  | [^] Verifying and updating existing ArchiveBox collection to v0.6.2...
archivebox-archivebox-1  | ----------------------------------------------------------------------
archivebox-archivebox-1  |
archivebox-archivebox-1  | [*] Verifying archive folder structure...
archivebox-archivebox-1  |     + ./archive, ./sources, ./logs...
archivebox-archivebox-1  |     + ./ArchiveBox.conf...
archivebox-archivebox-1  |
archivebox-archivebox-1  | [*] Verifying main SQL index and running any migrations needed...
archivebox-archivebox-1  |     Operations to perform:
archivebox-archivebox-1  |     Apply all migrations: admin, auth, contenttypes, core, sessions
archivebox-archivebox-1  |     Running migrations:
archivebox-archivebox-1  |     No migrations to apply.
archivebox-archivebox-1  |
archivebox-archivebox-1  |     √ ./index.sqlite3
archivebox-archivebox-1  |
archivebox-archivebox-1  | [*] Checking links from indexes and archive folders (safe to Ctrl+C)...
archivebox-archivebox-1  |     √ Loaded 3 links from existing main index.
archivebox-archivebox-1  |     > Skipping full snapshot directory check (quick mode)
archivebox-archivebox-1  |
archivebox-archivebox-1  | ----------------------------------------------------------------------
archivebox-archivebox-1  | [√] Done. Verified and updated the existing ArchiveBox collection.
archivebox-archivebox-1  |
archivebox-archivebox-1  |     Hint: To view your archive index, run:
archivebox-archivebox-1  |         archivebox server  # then visit http://127.0.0.1:8000
archivebox-archivebox-1  |
archivebox-archivebox-1  |     To add new links, you can run:
archivebox-archivebox-1  |         archivebox add ~/some/path/or/url/to/list_of_links.txt
archivebox-archivebox-1  |
archivebox-archivebox-1  |     For more usage and examples, run:
archivebox-archivebox-1  |         archivebox help
archivebox-archivebox-1  |
archivebox-archivebox-1  | [+] Starting ArchiveBox webserver...
archivebox-archivebox-1  |     > Logging errors to ./logs/errors.log
archivebox-archivebox-1  | Performing system checks...
archivebox-archivebox-1  |
archivebox-archivebox-1  | System check identified no issues (0 silenced).
archivebox-archivebox-1  | February 06, 2023 - 22:56:34
archivebox-archivebox-1  | Django version 3.1.10, using settings 'core.settings'
archivebox-archivebox-1  | Starting development server at http://0.0.0.0:8000/
archivebox-archivebox-1  | Quit the server with CONTROL-C.
archivebox-archivebox-1  | "GET /admin/login/ HTTP/1.1" 200 11144
archivebox-sonic-1       | (WARN) - took a lot of time: 226ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 90ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 107ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 77ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 65ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 75ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 79ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 78ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 81ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 100ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 62ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 72ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 89ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 64ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 87ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 93ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 91ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 60ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 70ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 59ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 82ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 70ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 54ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 52ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 57ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 70ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 55ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 55ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 68ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 58ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 52ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 53ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 57ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 118ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 59ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 68ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 125ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 86ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 81ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 89ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 101ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 69ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 59ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 81ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 71ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 76ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 94ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 63ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 71ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 63ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 71ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 70ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 65ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 64ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 66ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 54ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 64ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 53ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 62ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 50ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 60ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 53ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 57ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 51ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 56ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 54ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 60ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 56ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 52ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 55ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 59ms to process channel message
archivebox-sonic-1       | (WARN) - took a lot of time: 52ms to process channel message
archivebox-archivebox-1  | "GET /admin/login/ HTTP/1.1" 200 11144
archivebox-archivebox-1  | "GET /admin/login/ HTTP/1.1" 200 11144
archivebox-archivebox-1  | "GET /admin/login/ HTTP/1.1" 200 11144

Very strange!

archivebox-archivebox-1 | "GET /admin/login/ HTTP/1.1" 200 11144 archivebox-archivebox-1 | "GET /admin/login/ HTTP/1.1" 200 11144 archivebox-archivebox-1 | "GET /admin/login/ HTTP/1.1" 200 11144 ``` Very strange!

@pirate commented on GitHub (Feb 6, 2023):

Ah OK, that seems fine now; it looks like it's working. It's possible that text extraction for the Ecliptik article had an issue, so Sonic is getting empty text for that URL, but indexing is working for other URLs, as evidenced by the logs after that point.
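
To illustrate the "empty text" scenario described above, here is a minimal sketch of filtering out empty extractions before anything is handed to a search backend. This is not ArchiveBox's actual indexing code: the function name `indexable_chunks` and the `max_chars` limit of 2000 are assumptions made for the example.

```python
from typing import Iterable, Iterator


def indexable_chunks(texts: Iterable[str], max_chars: int = 2000) -> Iterator[str]:
    """Yield non-empty, whitespace-normalized chunks of extracted text.

    Hypothetical helper: the name and the max_chars default are
    illustrative assumptions, not part of ArchiveBox or Sonic.
    """
    for text in texts:
        # Collapse runs of whitespace; an empty or blank extraction becomes "".
        cleaned = " ".join(text.split())
        if not cleaned:
            # Skip empty extractions rather than sending them to the backend.
            continue
        # Split long texts into fixed-size pieces.
        for start in range(0, len(cleaned), max_chars):
            yield cleaned[start:start + max_chars]


if __name__ == "__main__":
    extracted = ["", "   \n", "Bookmarking with Raindrop"]
    print(list(indexable_chunks(extracted)))
```

If every extraction for a URL comes back blank, the generator yields nothing, so there is nothing to index for that snapshot while other URLs index normally.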


@diego898 commented on GitHub (Feb 7, 2023):

Wow, what are the odds that the very first test URL I tried had a URL-specific error, and it never occurred to me to try others! Thank you! Closing this for now!
