[GH-ISSUE #651] Bugfix: archivebox update still leaves many items Pending #408

Closed
opened 2026-03-01 14:43:17 +03:00 by kerem · 1 comment
Owner

Originally created by @berezovskyi on GitHub (Feb 6, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/651

Describe the bug

When some items have not been successfully archived and I run archivebox update, some of them remain "pending", even though they can still be archived through the web UI.

Steps to reproduce

  1. Import a lot of URIs and hit https://github.com/ArchiveBox/ArchiveBox/issues/550.
  2. Restart ArchiveBox.
  3. Run archivebox update.
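
The exact loop used for step 3 is in the linked comment on issue #550 and is not reproduced here. As a rough sketch only, a bounded retry wrapper along these lines could re-run the command until it exits successfully. The function name `retry_cmd` and the attempt limit are hypothetical and are not part of ArchiveBox.

```shell
# Hypothetical helper (not part of ArchiveBox): re-run a command until it
# exits with status 0, or until the attempt limit is reached.
retry_cmd() {
  max_attempts="$1"   # first argument: maximum number of attempts
  shift               # remaining arguments: the command to run
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts attempts"
  return 1
}
```

Under these assumptions it would be invoked as `retry_cmd 5 archivebox update`. Note that a zero exit status only means the command finished; as this report describes, items can still remain pending afterwards.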

Screenshots or log output

After running archivebox update (in a loop, as described in https://github.com/ArchiveBox/ArchiveBox/issues/550#issuecomment-774500057):

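image: https://user-images.githubusercontent.com/64734/107123909-b1160b00-68a0-11eb-9e0f-94e8c83fc852.png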

After clicking "Archive" in the admin panel:

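same
image: https://user-images.githubusercontent.com/64734/107123951-db67c880-68a0-11eb-9095-db552109a4ff.png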

ArchiveBox version

ArchiveBox v0.5.4
Cpython Linux Linux-5.8.0-41-generic-x86_64-with-glibc2.28 x86_64 (in Docker)

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.5.4          valid     /usr/local/bin/archivebox
 √  PYTHON_BINARY         v3.9.1          valid     /usr/local/bin/python3.9
 √  DJANGO_BINARY         v3.1.3          valid     /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py
 √  CURL_BINARY           v7.64.0         valid     /usr/bin/curl
 √  WGET_BINARY           v1.20.1         valid     /usr/bin/wget
 √  NODE_BINARY           v15.7.0         valid     /usr/bin/node
 √  SINGLEFILE_BINARY     v0.1.14         valid     /node/node_modules/single-file/cli/single-file
 √  READABILITY_BINARY    v0.1.0          valid     /node/node_modules/readability-extractor/readability-extractor
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js
 √  GIT_BINARY            v2.20.1         valid     /usr/bin/git
 √  YOUTUBEDL_BINARY      v2021.01.24.1   valid     /usr/local/bin/youtube-dl
 √  CHROME_BINARY         v87.0.4280.141  valid     /usr/bin/chromium
 √  RIPGREP_BINARY        v0.10.0         valid     /usr/bin/rg

[i] Source-code locations:
 √  PACKAGE_DIR           22 files        valid     /app/archivebox
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled
 -  COOKIES_FILE          -               disabled

[i] Data locations:
 √  OUTPUT_DIR            8 files         valid     /data
 √  SOURCES_DIR           158 files       valid     ./sources
 √  LOGS_DIR              0 files         valid     ./logs
 √  ARCHIVE_DIR           1640 files      valid     ./archive
 √  CONFIG_FILE           1.1 KB          valid     ./ArchiveBox.conf
 √  SQL_INDEX             16.9 MB         valid     ./index.sqlite3
kerem 2026-03-01 14:43:17 +03:00
Author
Owner

@pirate commented on GitHub (Apr 6, 2021):

I believe this is already fixed in the latest version, v0.5.6. Comment back here if you're still seeing the issue and I'll reopen the ticket.
