[GH-ISSUE #955] Bug: django.db.utils.DatabaseError: database disk image is malformed #2102

Closed
opened 2026-03-01 17:56:31 +03:00 by kerem · 2 comments

Originally created by @terxw on GitHub (Mar 24, 2022).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/955

Describe the bug

After running ArchiveBox from a Python installation, I cannot run ArchiveBox in Docker with the same config because of a database error.
A PRAGMA integrity check of index.sqlite3 reports no errors.
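
For reference, the integrity check mentioned above can be reproduced from Python roughly as follows. This is only a minimal sketch using the standard-library `sqlite3` module; the function name `check_sqlite_integrity` is a hypothetical helper for illustration and is not part of ArchiveBox. It opens the file read-only so the check cannot modify the index.

```python
import sqlite3
from pathlib import Path


def check_sqlite_integrity(db_path):
    """Return the messages reported by PRAGMA integrity_check for a SQLite file.

    Hypothetical helper for illustration; not part of ArchiveBox.
    """
    # Build a file: URI with mode=ro so the database is opened read-only.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    # Each row holds one message; a healthy database yields the single row "ok".
    return [row[0] for row in rows]
```

Calling `check_sqlite_integrity("index.sqlite3")` from the data directory returns a list of messages. Running the same check on the host and inside the container, and printing `sqlite3.sqlite_version` alongside it, would show whether both environments read the file the same way.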

Steps to reproduce

cd /storage/data/docs/archivebox
docker-compose up -d
docker-compose run archivebox init

Screenshots or log output

archivebox is up-to-date                                                                                                                                                                      
archivebox_sonic_1 is up-to-date                                                                                                                                                              
[i] [2022-03-24 20:37:27] ArchiveBox v0.6.2: archivebox init                                                                                                                                  
    > /data                                                                                                                                                                                   
                                                                                                                                                                                              
[^] Verifying and updating existing ArchiveBox collection to v0.6.2...                                                                                                                        
----------------------------------------------------------------------                                                                                                                        
                                                                                                                                                                                              
[*] Verifying archive folder structure...                                                                                                                                                     
    + ./archive, ./sources, ./logs...                                                                                                                                                         
    + ./ArchiveBox.conf...                                                                                                                                                                    
                                                                                                                                                                                              
[*] Verifying main SQL index and running any migrations needed...                                                                                                                             
Traceback (most recent call last):                                                                                                                                                            
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 82, in _execute                                                                                             
    return self.cursor.execute(sql)                                                                                                                                                           
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 411, in execute                                                                                      
    return Database.Cursor.execute(self, query)                                                                                                                                               
sqlite3.DatabaseError: database disk image is malformed                                                                                                                                       
                                                                                                                                                                                              
The above exception was the direct cause of the following exception:                                                                                                                          
                                                                                                                                                                                              
Traceback (most recent call last):                                                                                                                                                            
  File "/usr/local/bin/archivebox", line 33, in <module>                                                                                                                                      
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())                                                                                                               
  File "/app/archivebox/cli/__init__.py", line 140, in main                                                                                                                                   
    run_subcommand(  
  File "/app/archivebox/cli/__init__.py", line 80, in run_subcommand                                                                                                                          
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore                                                                                                                 
  File "/app/archivebox/cli/archivebox_init.py", line 43, in main                                                                                                                             
    init(                                                                                                                                                                                     
  File "/app/archivebox/util.py", line 114, in typechecked_function                                                                                                                           
    return func(*args, **kwargs)                                                                                                                                                              
  File "/app/archivebox/main.py", line 328, in init                                                                                                                                           
    for migration_line in apply_migrations(out_dir):                                                                                                                                          
  File "/app/archivebox/util.py", line 114, in typechecked_function                                                                                                                           
    return func(*args, **kwargs)                                                                                                                                                              
  File "/app/archivebox/index/sql.py", line 137, in apply_migrations                                                                                                                          
    call_command("makemigrations", interactive=False, stdout=null)                                                                                                                            
  File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command                                                                                 
    return command.execute(*args, **defaults)                                                                                                                                                 
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute                                                                                          
    output = self.handle(*args, **options)                                                                                                                                                    
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped                                                                                           
    res = handle_func(*args, **kwargs)                                                                                                                                                        
  File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/makemigrations.py", line 101, in handle                                                                        
    loader.check_consistent_history(connection)                                                                                                                                               
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/loader.py", line 290, in check_consistent_history                                                                         
    applied = recorder.applied_migrations()                                                                                                                                                   
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/recorder.py", line 77, in applied_migrations                                                                              
    if self.has_table():                                                                                                                                                                      
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/recorder.py", line 56, in has_table                                                                                       
    tables = self.connection.introspection.table_names(cursor)                                                                                                                                
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/base/introspection.py", line 48, in table_names                                                                             
    return get_names(cursor)                                                                                                                                                                  
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/base/introspection.py", line 43, in get_names                                                                               
    return sorted(ti.name for ti in self.get_table_list(cursor)                                                                                                                               
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/introspection.py", line 74, in get_table_list                                                                       
    cursor.execute("""                                                                                                                                                                        
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute                                                                                              
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)                                                                                                       
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers                                                                               
    return executor(sql, params, many, context)                                                                                                                                               
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute                                                                                             
    return self.cursor.execute(sql, params)                                                                                                                                                   
  File "/usr/local/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__                                                                                                      
    raise dj_exc_value.with_traceback(traceback) from exc_value                                                                                                                               
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 82, in _execute                                                                                             
    return self.cursor.execute(sql)                                                                                                                                                           
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 411, in execute                                                                                      
    return Database.Cursor.execute(self, query)                                                                                                                                               
django.db.utils.DatabaseError: database disk image is malformed 

ArchiveBox version

archivebox is up-to-date
archivebox_sonic_1 is up-to-date
ArchiveBox v0.6.2
Cpython Linux Linux-5.13.0-25-generic-x86_64-with-glibc2.28 x86_64
IN_DOCKER=True DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=sonic

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.2          valid     /usr/local/bin/archivebox                                                    
 √  PYTHON_BINARY         v3.9.5          valid     /usr/local/bin/python3.9                                                     
 √  DJANGO_BINARY         v3.1.10         valid     /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py           
 √  CURL_BINARY           v7.64.0         valid     /usr/bin/curl                                                                
 √  WGET_BINARY           v1.20.1         valid     /usr/bin/wget                                                                
 √  NODE_BINARY           v15.14.0        valid     /usr/bin/node                                                                
 √  SINGLEFILE_BINARY     v0.3.16         valid     /node/node_modules/single-file/cli/single-file                              
 √  READABILITY_BINARY    v0.0.2          valid     /node/node_modules/readability-extractor/readability-extractor              
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js                         
 √  GIT_BINARY            v2.20.1         valid     /usr/bin/git                                                                 
 -  YOUTUBEDL_BINARY      -               disabled  /usr/local/bin/youtube-dl                                                    
 √  CHROME_BINARY         v90.0.4430.93   valid     /usr/bin/chromium                                                            
 √  RIPGREP_BINARY        v0.10.0         valid     /usr/bin/rg                                                                                                                               
                                                                                                                                                                                              
[i] Source-code locations:                                                                                                                                                                    
 √  PACKAGE_DIR           22 files        valid     /app/archivebox                                                                                                                           
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates                                                                                                                 
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                                                                                            
                                                                                                                                                                                              
[i] Secrets locations:                                                                                                                                                                        
 √  CHROME_USER_DATA_DIR  49 files        valid     ./chrome_user_dir                                                                                                                         
 √  COOKIES_FILE          342.3 KB        valid     ./cookies.txt                                                                                                                             
                                                                                                                                                                                              
[i] Data locations:                                                                                                                                                                           
 √  OUTPUT_DIR            20 files        valid     /data                                                                                                                                     
 √  SOURCES_DIR           601 files       valid     ./sources                                                                                                                                 
 √  LOGS_DIR              1 files         valid     ./logs                                                                                                                                    
 √  ARCHIVE_DIR           9601 files      valid     ./archive                                                                                                                                 
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf                                                                                                                         
 √  SQL_INDEX             511.2 MB        valid     ./index.sqlite3 

archivebox venv config

[DEFAULT]
IS_TTY=False
USE_COLOR=False
SHOW_PROGRESS=False
IN_DOCKER=False
ONLY_NEW=True
TIMEOUT=60
MEDIA_TIMEOUT=3600
OUTPUT_PERMISSIONS=755
RESTRICT_FILE_NAMES=windows
URL_BLACKLIST=\.(css|js|otf|ttf|woff|woff2|gstatic\.com|googleapis\.com/css)(\?.*)?$
BIND_ADDR=127.0.0.1:8000
ALLOWED_HOSTS=*
DEBUG=False
SNAPSHOTS_PER_PAGE=40
CUSTOM_TEMPLATES_DIR=None
TIME_ZONE=UTC
SAVE_TITLE=True
SAVE_FAVICON=True
SAVE_WGET=True
SAVE_WGET_REQUISITES=True
SAVE_SINGLEFILE=True
SAVE_READABILITY=True
SAVE_MERCURY=True
SAVE_PDF=True
SAVE_SCREENSHOT=True
SAVE_DOM=True
SAVE_HEADERS=True
SAVE_WARC=True
SAVE_GIT=True
SAVE_MEDIA=False
SAVE_ARCHIVE_DOT_ORG=True
RESOLUTION=1440,2000
CHECK_SSL_VALIDITY=True
MEDIA_MAX_SIZE=750m
CURL_USER_AGENT=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.61 Safari/537.36 ArchiveBox/0.6.2 (+https://github.com/ArchiveBox/ArchiveBox/) curl/curl 7.68.0 (x86_64-pc-linux-gnu)
WGET_USER_AGENT=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.61 Safari/537.36 ArchiveBox/0.6.2 (+https://github.com/ArchiveBox/ArchiveBox/) wget/GNU Wget 1.20.3
CHROME_USER_AGENT=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.61 Safari/537.36 ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)
CHROME_HEADLESS=True
CHROME_SANDBOX=True
USE_INDEXING_BACKEND=True
USE_SEARCHING_BACKEND=True
SEARCH_BACKEND_ENGINE=ripgrep
SEARCH_BACKEND_HOST_NAME=localhost
SEARCH_BACKEND_PORT=1491
SEARCH_BACKEND_PASSWORD=xxxxxxxxxx
SONIC_COLLECTION=archivebox
SONIC_BUCKET=snapshots
SEARCH_BACKEND_TIMEOUT=90
FETCH_TITLE=True
#FETCH_FAVICON=True
FETCH_WGET=True
FETCH_WARC=True
FETCH_PDF=True
FETCH_SCREENSHOT=True
FETCH_DOM=True
FETCH_GIT=True
FETCH_MEDIA=false
SUBMIT_ARCHIVE_DOT_ORG=True
USE_SINGLEFILE=True
USE_CURL=True
USE_WGET=True
USE_READABILITY=True
USE_MERCURY=True
USE_GIT=True
USE_CHROME=True
USE_NODE=True
USE_YOUTUBEDL=True
USE_RIPGREP=True
CURL_BINARY=/usr/bin/curl
GIT_BINARY=/usr/bin/git
WGET_BINARY=/usr/bin/wget
YOUTUBEDL_BINARY=/usr/local/bin/youtube-dl
SECRET_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
SINGLEFILE_BINARY=/home/kangus/node_modules/single-file/cli/single-file
READABILITY_BINARY=/home/kangus/node_modules/readability-extractor/readability-extractor
MERCURY_BINARY=/home/kangus//node_modules/@postlight/mercury-parser/cli.js
NODE_BINARY=node
RIPGREP_BINARY=rg
CHROME_BINARY=/usr/bin/google-chrome-stable
COOKIES_FILE=/mnt/nfs/OMV_MERGEFS/data/docs/archivebox/cookies.txt
CHROME_USER_DATA_DIR=/mnt/nfs/OMV_MERGEFS/data/docs/archivebox/google-chrome-stable
POCKET_CONSUMER_KEY=None
USER=kangus
PACKAGE_DIR=/home/kangus/.local/lib/python3.8/site-packages/archivebox
TEMPLATES_DIR=/home/kangus/.local/lib/python3.8/site-packages/archivebox/templates
ARCHIVE_DIR=/mnt/nfs/OMV_MERGEFS/data/docs/archivebox/archive
SOURCES_DIR=/mnt/nfs/OMV_MERGEFS/data/docs/archivebox/sources
LOGS_DIR=/mnt/nfs/OMV_MERGEFS/data/docs/archivebox/logs
URL_BLACKLIST_PTN=re.compile('\\.(css|js|otf|ttf|woff|woff2|gstatic\\.com|googleapis\\.com/css)(\\?.*)?$', re.IGNORECASE|re.MULTILINE)
ARCHIVEBOX_BINARY=/home/kangus/.local/bin/archivebox
WGET_AUTO_COMPRESSION=True


archivebox docker config

# Usage:
#     docker-compose up -d
#     docker-compose run archivebox init
#     echo "https://example.com" | docker-compose run archivebox archivebox add
#     docker-compose run archivebox add --depth=1 https://example.com/some/feed.rss
#     docker-compose run archivebox config --set PUBLIC_INDEX=True
# Documentation:
#     https://github.com/ArchiveBox/ArchiveBox/wiki/Docker#docker-compose
version: "3.7"
services:
    archivebox:
        container_name: archivebox
        # build: .
        image: archivebox/archivebox:latest
        command: server 0.0.0.0:8000
        stdin_open: true
        tty: true
        ports:
            - 8000:8000
        environment:
            - PGID=${PGID}
            - PUID=${PUID}
            - USE_COLOR=True
            - SHOW_PROGRESS=False
            - ONLY_NEW=True
            - DEBUG=True
            - TIMEOUT=180
            - DOCKER_CLIENT_TIMEOUT=120
            - COMPOSE_HTTP_TIMEOUT=120
            - MEDIA_TIMEOUT=3600
            - FETCH_TITLE=True
            - FETCH_WGET=True
            - FETCH_WARC=True
            - FETCH_PDF=True
            - FETCH_SCREENSHOT=True
            - FETCH_DOM=True
            - FETCH_GIT=True
            - FETCH_MEDIA=false
            - SUBMIT_ARCHIVE_DOT_ORG=True
            - USE_SINGLEFILE=True
            - CHECK_SSL_VALIDITY=False
            - FETCH_WGET_REQUISITES=True
            - RESOLUTION=1920,1080
            - SAVE_READABILITY=True
            - WGET_ARGS="--no-verbose --adjust-extension --convert-links --force-directories --backup-converted --span-hosts --no-parent -e robots=off --inet4-only"
            - WGET_USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36
            - CHROME_USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36
            - CHROME_HEADLESS=True
            - CHROME_USER_DATA_DIR=/data/chrome_user_dir
            - SECRET_KEY="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx"
            #- SINGLEFILE_BINARY="/home/kangus/node_modules/single-file/cli/single-file"
            - MAX_URL_ATTEMPTS=5
            - SAVE_TITLE=True
            - SAVE_PDF=True
            - SAVE_WARC=True
            - SAVE_WGET=True
            - SAVE_SINGLEFILE=True
            - SEARCH_BACKEND_ENGINE=sonic
            - SEARCH_BACKEND_HOST_NAME=sonic
            - SEARCH_BACKEND_PASSWORD=XXXXXXXXXXXX
            - COOKIES_FILE=/data/cookies.txt
        volumes:
            - /etc/localtime:/etc/localtime:ro
            - /storage/data/docs/archivebox:/data
            - /storage/data/docs/archivebox/sonic.cfg:/etc/sonic.cfg:ro
            - /storage/data/docs/archivebox/data/sonic:/var/lib/sonic/store
    sonic:
       image: valeriansaliou/sonic:v1.3.0
       expose:
           - 1491
       environment:
#           - PGID=1000
#           - PUID=1000
           - PUID=${PUID}
           - PGID=${PGID}
           - SEARCH_BACKEND_PASSWORD=XXXXXXXXXXXX
       volumes:
           - /storage/data/docs/archivebox/sonic.cfg:/etc/sonic.cfg:ro
           - /storage/data/docs/archivebox/data/sonic:/var/lib/sonic/store
# docker network create -d bridge my-network
networks:
  my-network:
    external: true

run archivebox archivebox add # docker-compose run archivebox add --depth=1 https://example.com/some/feed.rss # docker-compose run archivebox config --set PUBLIC_INDEX=True # Documentation: # https://github.com/ArchiveBox/ArchiveBox/wiki/Docker#docker-compose version: "3.7" services: archivebox: container_name: archivebox # build: . image: archivebox/archivebox:latest command: server 0.0.0.0:8000 stdin_open: true tty: true ports: - 8000:8000 environment: - PGID=${PGID} - PUID=${PUID} - USE_COLOR=True - SHOW_PROGRESS=False - ONLY_NEW=True - DEBUG=True - TIMEOUT=180 - DOCKER_CLIENT_TIMEOUT=120 - COMPOSE_HTTP_TIMEOUT=120 - MEDIA_TIMEOUT=3600 - FETCH_TITLE=True - FETCH_WGET=True - FETCH_WARC=True - FETCH_PDF=True - FETCH_SCREENSHOT=True - FETCH_DOM=True - FETCH_GIT=True - FETCH_MEDIA=false - SUBMIT_ARCHIVE_DOT_ORG=True - USE_SINGLEFILE=True - CHECK_SSL_VALIDITY=False - FETCH_WGET_REQUISITES=True - RESOLUTION=1920,1080 - SAVE_READABILITY=True - WGET_ARGS="--no-verbose --adjust-extension --convert-links --force-directories --backup-converted --span-hosts --no-parent -e robots=off --inet4-only" - WGET_USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36 - CHROME_USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36 - CHROME_HEADLESS=True - CHROME_USER_DATA_DIR=/data/chrome_user_dir - SECRET_KEY="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx" #- SINGLEFILE_BINARY="/home/kangus/node_modules/single-file/cli/single-file" - MAX_URL_ATTEMPTS=5 - SAVE_TITLE=True - SAVE_PDF=True - SAVE_WARC=True - SAVE_WGET=True - SAVE_SINGLEFILE=True - SEARCH_BACKEND_ENGINE=sonic - SEARCH_BACKEND_HOST_NAME=sonic - SEARCH_BACKEND_PASSWORD=XXXXXXXXXXXX - COOKIES_FILE=/data/cookies.txt volumes: - /etc/localtime:/etc/localtime:ro - /storage/data/docs/archivebox:/data - /storage/data/docs/archivebox/sonic.cfg:/etc/sonic.cfg:ro - 
/storage/data/docs/archivebox/data/sonic:/var/lib/sonic/store sonic: image: valeriansaliou/sonic:v1.3.0 expose: - 1491 environment: # - PGID=1000 # - PUID=1000 - PUID=${PUID} - PGID=${PGID} - SEARCH_BACKEND_PASSWORD=XXXXXXXXXXXX volumes: - /storage/data/docs/archivebox/sonic.cfg:/etc/sonic.cfg:ro - /storage/data/docs/archivebox/data/sonic:/var/lib/sonic/store # docker network create -d bridge my-network networks: my-network: external: true ```
kerem closed this issue 2026-03-01 17:56:31 +03:00

@pirate commented on GitHub (Mar 24, 2022):

Thanks for posting all the relevant info! Finally someone who knows how to open a good issue lol.

Are you able to open the SQLite3 file manually with `sqlite3 index.sqlite3` or `archivebox manage dbshell`?

Unfortunately I've never seen this error before, and I don't know exactly how to fix it other than dumping the sqlite3 database and trying to rebuild it.
You can try these instructions to do that with the `index.sqlite3` file:
https://stackoverflow.com/questions/5274202/sqlite3-database-or-disk-is-full-the-database-disk-image-is-malformed
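
For reference, the dump-and-rebuild idea can also be sketched with Python's standard `sqlite3` module instead of the `sqlite3` shell. This is only a rough sketch under assumptions: `rebuild_sqlite` and the output file name are made up for illustration and are not part of ArchiveBox, the whole dump is held in memory (which may matter for a 500 MB index), and `iterdump()` may itself raise `sqlite3.DatabaseError` if the damaged pages cannot be read.

```python
import sqlite3
from pathlib import Path


def rebuild_sqlite(src: str, dst: str) -> int:
    """Replay a SQL dump of `src` into a brand-new database file at `dst`.

    Hypothetical helper for illustration only. Returns the number of
    dumped SQL statements that were replayed into the new file.
    """
    if not Path(src).is_file():
        raise FileNotFoundError(src)
    if Path(dst).exists():
        raise FileExistsError(f"refusing to overwrite existing file: {dst}")

    old = sqlite3.connect(src)
    new = sqlite3.connect(dst)
    try:
        # iterdump() yields the database as SQL text, one statement per item,
        # wrapped in BEGIN TRANSACTION; ... COMMIT;
        statements = list(old.iterdump())
        # executescript() runs the whole dump against the empty database.
        new.executescript("\n".join(statements))
        return len(statements)
    finally:
        old.close()
        new.close()
```

Calling something like `rebuild_sqlite("index.sqlite3", "index.rebuilt.sqlite3")` on a copy of the data directory, with ArchiveBox stopped, leaves the original file untouched until the rebuilt one has been checked.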


@terxw commented on GitHub (Mar 24, 2022):

Thank you!
Worked like a charm!
Oddly, `PRAGMA integrity_check;` came back ok in DB Browser for SQLite, and it was also ok with sqlite3 run from my PC against the path to index.sqlite3 on the network share.
But when I ran `sqlite3 index.sqlite3` locally on the machine where ArchiveBox is running, the integrity check was not ok...
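
Since the result depended on where the file was opened, it may help to run the same check from each machine and compare the output. Below is a minimal sketch using Python's standard `sqlite3` module; `integrity_report` is a hypothetical helper name, not an ArchiveBox command, and the thread does not establish why the two results differed.

```python
import sqlite3
from pathlib import Path


def integrity_report(db_path: str) -> list[str]:
    """Return the rows of PRAGMA integrity_check for `db_path`.

    A healthy database yields ["ok"]. If SQLite cannot read the file at
    all, the error message is returned as the single entry instead.
    """
    # Open read-only through a file: URI so the check cannot modify the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    except sqlite3.DatabaseError as exc:
        return [f"error: {exc}"]
    finally:
        conn.close()
```

Running `integrity_report("index.sqlite3")` once on the ArchiveBox host and once from a client of the network share would make any difference between the two views of the file easy to spot.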
