[GH-ISSUE #1341] Re-Snapshot button is failing with core.models.Snapshot.DoesNotExist: Snapshot matching query does not exist #2328

Closed
opened 2026-03-01 17:58:14 +03:00 by kerem · 4 comments

Originally created by @pirate on GitHub (Feb 3, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1341

Discussed in https://github.com/ArchiveBox/ArchiveBox/discussions/1340

Originally posted by gerroon January 31, 2024

0.7.3
ArchiveBox v0.7.3+editable COMMIT_HASH=a4bd441 BUILD_TIME=2024-01-31 10:01:56 1706695316
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-5.14.0-4-amd64-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=1000:1000 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

[i] Dependency versions:
 √  PYTHON_BINARY         v3.11.7         valid     /usr/local/bin/python3.11
 √  SQLITE_BINARY         v2.6.0          valid     /usr/local/lib/python3.11/sqlite3/dbapi2.py
 √  DJANGO_BINARY         v3.1.14         valid     /usr/local/lib/python3.11/site-packages/django/__init__.py
 √  ARCHIVEBOX_BINARY     v0.7.3          valid     /usr/local/bin/archivebox

 √  CURL_BINARY           v8.5.0          valid     /usr/bin/curl
 √  WGET_BINARY           v1.21.3         valid     /usr/bin/wget
 √  NODE_BINARY           v20.11.0        valid     /usr/bin/node
 √  SINGLEFILE_BINARY     v1.1.46         valid     /app/node_modules/single-file-cli/single-file
 √  READABILITY_BINARY    v0.0.11         valid     /app/node_modules/readability-extractor/readability-extractor
 √  MERCURY_BINARY        v1.0.0          valid     /app/node_modules/@postlight/parser/cli.js
 √  GIT_BINARY            v2.39.2         valid     /usr/bin/git
 √  YOUTUBEDL_BINARY      v2023.12.30     valid     /usr/local/bin/yt-dlp
 √  CHROME_BINARY         v121.0.6167.57  valid     /usr/bin/chromium-browser
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/bin/rg

[i] Source-code locations:
 √  PACKAGE_DIR           23 files        valid     /app/archivebox
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates
 -  CUSTOM_TEMPLATES_DIR  -               disabled  None

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled  None
 -  COOKIES_FILE          -               disabled  None

[i] Data locations:
 √  OUTPUT_DIR            11 files @      valid     /data
 √  SOURCES_DIR           532 files       valid     ./sources
 √  LOGS_DIR              1 files         valid     ./logs
 √  ARCHIVE_DIR           382 files       valid     ./archive
 √  CONFIG_FILE           554.0 Bytes     valid     ./ArchiveBox.conf
 √  SQL_INDEX             4.4 MB          valid     ./index.sqlite3

archivebox_1  | [*] [2024-01-31 22:46:18] Writing 1 links to main index...
archivebox_1  | Internal Server Error: /admin/core/snapshot/
archivebox_1  | Traceback (most recent call last):
archivebox_1  |   File "/app/archivebox/index/sql.py", line 48, in write_link_to_sql_index
archivebox_1  |     info["timestamp"] = Snapshot.objects.get(url=link.url).timestamp
archivebox_1  |                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/manager.py", line 85, in manager_method
archivebox_1  |     return getattr(self.get_queryset(), name)(*args, **kwargs)
archivebox_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 429, in get
archivebox_1  |     raise self.model.DoesNotExist(
archivebox_1  | core.models.Snapshot.DoesNotExist: Snapshot matching query does not exist.
archivebox_1  |
archivebox_1  | During handling of the above exception, another exception occurred:
archivebox_1  |
archivebox_1  | Traceback (most recent call last):
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 589, in update_or_create
archivebox_1  |     obj = self.select_for_update().get(**kwargs)
archivebox_1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 429, in get
archivebox_1  |     raise self.model.DoesNotExist(
archivebox_1  | core.models.Snapshot.DoesNotExist: Snapshot matching query does not exist.
archivebox_1  |
archivebox_1  | During handling of the above exception, another exception occurred:
archivebox_1  |
archivebox_1  | Traceback (most recent call last):
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
archivebox_1  |     return self.cursor.execute(sql, params)
archivebox_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
archivebox_1  |     return Database.Cursor.execute(self, query, params)
archivebox_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  | sqlite3.OperationalError: database is locked
archivebox_1  |
archivebox_1  | The above exception was the direct cause of the following exception:
archivebox_1  |
archivebox_1  | Traceback (most recent call last):
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 47, in inner
archivebox_1  |     response = get_response(request)
archivebox_1  |                ^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 181, in _get_response
archivebox_1  |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
archivebox_1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/contrib/admin/options.py", line 614, in wrapper
archivebox_1  |     return self.admin_site.admin_view(view)(*args, **kwargs)
archivebox_1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archivebox_1  |   File "/usr/local/lib/python3.11/site-packages/django/utils/decorators.py", line 130, in _wrapped_view

kerem closed this issue 2026-03-01 17:58:14 +03:00

@makew0rld commented on GitHub (Feb 17, 2025):

Seems like a duplicate of #1506


@pirate commented on GitHub (Feb 18, 2025):

Closing both as stale for now, as this is a generic error that just indicates the db is overloaded. Try running fewer URLs at a time, or move your db to a faster drive. The most common cause is storing the index.sqlite3 on a slow HDD or SMB mount. We make continuous performance improvements in every version, and locking contention should decline over time as we make ArchiveBox more efficient.
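
For readers hitting the same `database is locked` error, here is a rough sketch of the generic mitigation on the client side: give SQLite a longer busy timeout and retry a write with backoff. This is not ArchiveBox code. The `execute_with_retry` helper, the `snapshot` table layout, and the retry parameters are all made up for illustration; only the `timeout` argument of the standard library's `sqlite3.connect` is a real API.

```python
import sqlite3
import time


def execute_with_retry(conn, sql, params=(), retries=5, base_delay=0.05):
    """Run one write, retrying with exponential backoff while the db is locked.

    Hypothetical helper for illustration only, not part of ArchiveBox.
    """
    for attempt in range(retries):
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params)
        except sqlite3.OperationalError as err:
            # Re-raise anything that is not lock contention, or the last failure.
            if "locked" not in str(err) or attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)


# timeout= is how long sqlite3 waits on a locked database before raising.
conn = sqlite3.connect(":memory:", timeout=30)
conn.execute("CREATE TABLE snapshot (url TEXT PRIMARY KEY, timestamp TEXT)")
execute_with_retry(
    conn,
    "INSERT INTO snapshot (url, timestamp) VALUES (?, ?)",
    ("https://example.com", "1706695316"),
)
```

Retrying only hides contention; it does not remove the underlying cause, so the advice above about batch size and drive speed still applies.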


@makew0rld commented on GitHub (Feb 18, 2025):

Could URLs (or writes in general) not be queued to prevent this? I guess using something other than SQLite would work too.
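
To make the queueing idea concrete, a minimal sketch of a single-writer pattern might look like the following: producers push write jobs onto a queue, and one thread owns the only writing connection. This only illustrates the question as asked and says nothing about how ArchiveBox works or plans to work. The `run_writer` and `enqueue` names, the `snapshot` table, and the temporary database path are all hypothetical.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def run_writer(db_path, jobs):
    """Drain `jobs` on one thread so only one connection ever writes."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS snapshot (url TEXT PRIMARY KEY)")
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work
            break
        sql, params = job
        with conn:  # one transaction per queued write
            conn.execute(sql, params)
    conn.close()


def enqueue(jobs, url):
    """Producers never touch the database; they only queue a write."""
    jobs.put(("INSERT OR IGNORE INTO snapshot (url) VALUES (?)", (url,)))


db_path = os.path.join(tempfile.mkdtemp(), "index.sqlite3")
jobs = queue.Queue()
writer = threading.Thread(target=run_writer, args=(db_path, jobs))
writer.start()

producers = [
    threading.Thread(target=enqueue, args=(jobs, f"https://example.com/{i}"))
    for i in range(20)
]
for t in producers:
    t.start()
for t in producers:
    t.join()

jobs.put(None)  # tell the writer to stop once the queue is drained
writer.join()
```

Because every write goes through one connection, writers cannot contend with each other, at the cost of serializing all writes behind the queue.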


@pirate commented on GitHub (Feb 18, 2025):

Both of those options and more have already been discussed extensively in other issues. I don't want to repeat too much of it here, but basically SQLite is plenty fast. It's not an issue with SQLite; it's a design choice that ArchiveBox is single-threaded, because we're effectively rate-limited by remote domains anyway. There are big performance improvements in the new betas that should eliminate most locking contention.
