[GH-ISSUE #1509] Bug: Unable to "Archive again" (re-snapshot), object of type 'int' has no len() #3911

Open
opened 2026-03-15 00:57:25 +03:00 by kerem · 1 comment

Originally created by @jessienab on GitHub (Sep 6, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1509

Describe the bug

Attempting to "Re-Snapshot" (since renamed to "Archive again") a snapshot that was archived before 0.8.3-rc returns the following errors:

  1. Server Error (500)
  2. Error occurred while loading the page: object of type 'int' has no len() <QueryDict: {}> <QueryDict: {'csrfmiddlewaretoken': ['1234567890'], 'action': ['resnapshot_snapshot'], 'select_across': ['0'], 'index': ['0'], '_selected_action': ['0191b7ef-9242-029c-2289-8e00703ab333']}>

It also appears to launch Chromium or related browser processes, which then seem to get stuck in a loop, with each process using over 100% CPU.
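For readers unfamiliar with the message in error 2: `object of type 'int' has no len()` is the standard text of the `TypeError` Python raises when `len()` is called on an integer. The sketch below only demonstrates that generic behaviour. The helper name `describe_length` is hypothetical, and the report does not identify which ArchiveBox code path makes the call, so nothing here should be read as the actual cause.

```python
def describe_length(value):
    """Return len(value), or the TypeError message if value has no length.

    Hypothetical helper for illustration only; not ArchiveBox code.
    """
    try:
        return len(value)
    except TypeError as exc:
        # Python raises TypeError for objects that do not define __len__.
        return str(exc)


print(describe_length([1, 2, 3]))  # a list has a length
print(describe_length(3))          # an int does not, so the message is returned
```

An error like this typically means a value that was expected to be a string or a collection arrived as an integer instead, which may be worth checking in data migrated from older versions.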

Steps to reproduce

  1. Run ArchiveBox 0.8.3-rc
  2. Find a snapshot that was taken before the 0.8.3 migration
  3. Select "Archive again"
  4. Record output error messages.
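Step 3 submits an admin action form. As a rough sketch of what that request carries, the snippet below rebuilds the form fields exactly as they appear in the second `QueryDict` of the error message and URL-encodes them. The function name `build_resnapshot_form` is hypothetical, and the endpoint URL and session cookies are not given in the report, so they are left out and no request is sent.

```python
from urllib.parse import parse_qs, urlencode


def build_resnapshot_form(snapshot_id: str, csrf_token: str) -> dict:
    """Form fields copied from the QueryDict shown in the error message.

    Hypothetical helper; the field names come from the report, the
    function itself is only an illustration.
    """
    return {
        "csrfmiddlewaretoken": [csrf_token],
        "action": ["resnapshot_snapshot"],
        "select_across": ["0"],
        "index": ["0"],
        "_selected_action": [snapshot_id],
    }


# Values taken from the error output in the report.
form = build_resnapshot_form("0191b7ef-9242-029c-2289-8e00703ab333", "1234567890")

# doseq=True encodes each list element as its own key=value pair.
body = urlencode(form, doseq=True)
print(body)

# Decoding the body gives back the same fields the server logged.
print(parse_qs(body) == form)
```

Keeping the payload in one place like this could make it easier to compare what is submitted for a pre-0.8.3 snapshot against a newer one when narrowing down the failure.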

ArchiveBox version

0.8.3
ArchiveBox v0.8.3 COMMIT_HASH=31576e2 BUILD_TIME=2024-09-06 13:14:49 1725628489
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-6.6.47-1-lts-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=0:0 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=sonic LDAP=False

[i] Dependency versions:
 √  PYTHON_BINARY         v3.11.9         valid     /usr/local/bin/python3.11                                                   
 √  SQLITE_BINARY         v2.6.0          valid     /usr/local/lib/python3.11/sqlite3/dbapi2.py                                 
 √  DJANGO_BINARY         v5.1.1          valid     /usr/local/lib/python3.11/site-packages/django/__init__.py                  
 √  ARCHIVEBOX_BINARY     v0.8.3          valid     /usr/local/bin/archivebox                                                   

 √  CURL_BINARY           v8.9.1          valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21.3         valid     /usr/bin/wget                                                               
 √  NODE_BINARY           v20.17.0        valid     /usr/bin/node                                                               
 √  SINGLEFILE_BINARY     v1.1.54         valid     /app/node_modules/single-file-cli/single-file                               
 √  READABILITY_BINARY    v0.0.11         valid     /app/node_modules/readability-extractor/readability-extractor               
 √  MERCURY_BINARY        v1.0.0          valid     /app/node_modules/@postlight/parser/cli.js                                  
 √  GIT_BINARY            v2.39.2         valid     /usr/bin/git                                                                
 √  YOUTUBEDL_BINARY      v2024.8.6       valid     /usr/local/bin/yt-dlp                                                       
 √  CHROME_BINARY         v128.0.6613     valid     /usr/bin/chromium-browser                                                   
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/bin/rg                                                                 

[i] Source-code locations:
 √  PACKAGE_DIR           34 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         4 files         valid     /app/archivebox/templates                                                   

[i] Data locations:
 √  OUTPUT_DIR            9 files @       valid     /data                                                                       
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             168.7 MB        valid     ./index.sqlite3                                                             
 √  ARCHIVE_DIR           4995 files      valid     ./archive                                                                   
 √  SOURCES_DIR           1712 files      valid     ./sources                                                                   
 X  PERSONAS_DIR          missing         invalid   ./personas                                                                  
 √  LOGS_DIR              1 files         valid     ./logs                                                                      
 X  CACHE_DIR             missing         invalid   ./cache                                                                     
 X  CUSTOM_TEMPLATES_DIR  missing         invalid   ./templates


@pirate commented on GitHub (Sep 6, 2024):

Oh woah that's a major bug, not sure why I didn't hit it in tests. Thanks for helping beta test, I'll look into this immediately.

In the meantime I've removed the install instructions for the beta release until this is sorted out.
