[GH-ISSUE #1209] Bug: can't use Python API #743

Closed
opened 2026-03-01 14:46:00 +03:00 by kerem · 3 comments

Originally created by @mlsteele on GitHub (Aug 11, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1209

Describe the bug

Importing archivebox.main to use the Python API fails.

Similar to https://github.com/ArchiveBox/ArchiveBox/issues/473

Steps to reproduce

  1. pip install archivebox
  2. python -c "import archivebox.main"

Screenshots or log output

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox/main.py", line 14, in <module>
    from .cli import (
  File "/Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox/cli/__init__.py", line 83, in <module>
    SUBCOMMANDS = list_subcommands()
                  ^^^^^^^^^^^^^^^^^^
  File "/Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox/cli/__init__.py", line 43, in list_subcommands
    module = import_module('.archivebox_{}'.format(subcommand), __package__)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/miles/.pyenv/versions/3.11.1/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox/cli/archivebox_shell.py", line 11, in <module>
    from ..main import shell
ImportError: cannot import name 'shell' from partially initialized module 'archivebox.main' (most likely due to a circular import) (/Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox/main.py)
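
The import chain in the traceback above (main imports cli, cli imports a subcommand module, and that module imports a name back from main) can be reproduced in isolation. The sketch below is not ArchiveBox code: `demo_pkg`, `cmd_shell`, and `reproduce_circular_import` are hypothetical names invented for illustration, and the package is written to a temporary directory only to show how the same kind of ImportError arises.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package layout mirroring the import chain in the traceback:
#   demo_pkg/main.py           imports demo_pkg.cli at the top
#   demo_pkg/cli/__init__.py   imports its subcommand module
#   demo_pkg/cli/cmd_shell.py  imports `shell` back from demo_pkg.main
FILES = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/main.py": """
        from .cli import SUBCOMMANDS  # runs before `shell` is defined below

        def shell():
            return "shell started"
    """,
    "demo_pkg/cli/__init__.py": """
        from . import cmd_shell
        SUBCOMMANDS = ["shell"]
    """,
    "demo_pkg/cli/cmd_shell.py": """
        from ..main import shell  # demo_pkg.main is only half-initialized here
    """,
}


def reproduce_circular_import() -> str:
    """Write the demo package to a temp dir, import demo_pkg.main, return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        for rel_path, source in FILES.items():
            path = Path(tmp) / rel_path
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(textwrap.dedent(source))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_pkg.main")
        except ImportError as exc:
            return str(exc)
        finally:
            sys.path.remove(tmp)
            # Drop any partially imported demo modules so reruns start clean.
            for name in [n for n in sys.modules if n.startswith("demo_pkg")]:
                del sys.modules[name]
    return "import succeeded"


if __name__ == "__main__":
    print(reproduce_circular_import())
```

The failure depends on which module is imported first: the name being requested has not been defined yet when the inner import runs. Changing the entry point changes the order in which the modules execute, so the same code can fail or succeed depending on import order.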



ArchiveBox version

ArchiveBox v0.6.2
Cpython Darwin macOS-13.5-x86_64-i386-64bit x86_64
IN_DOCKER=False DEBUG=False IS_TTY=False TZ=UTC SEARCH_BACKEND_ENGINE=ripgrep

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.2          valid     /Users/miles/.pyenv/versions/3.11.1/bin/archivebox                          
 √  PYTHON_BINARY         v3.11.1         valid     /Users/miles/.pyenv/versions/3.11.1/bin/python3.11                          
 √  DJANGO_BINARY         v3.1.14         valid     /Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/django/bin/django-admin.py
 √  CURL_BINARY           v8.1.2          valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21.4         valid     /usr/local/bin/wget                                                         
 √  NODE_BINARY           v18.16.0        valid     /opt/nodejs/bin/node                                                        
 √  SINGLEFILE_BINARY     v1.0.47         valid     ./node_modules/single-file/cli/single-file                                  
 √  READABILITY_BINARY    v0.0.6          valid     ./node_modules/readability-extractor/readability-extractor                  
 √  MERCURY_BINARY        v1.0.0          valid     ./node_modules/@postlight/mercury-parser/cli.js                             
 √  GIT_BINARY            v2.41.0         valid     /usr/local/bin/git                                                          
 √  YOUTUBEDL_BINARY      v2021.12.17     valid     /Users/miles/.pyenv/versions/3.11.1/bin/youtube-dl                          
 √  CHROME_BINARY         v115.0.5790.75  valid     /Users/miles/Library/Caches/ms-playwright/chromium-1071/chrome-mac/Chromium.app/Contents/MacOS/Chromium
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/local/bin/rg                                                           

[i] Source-code locations:
 √  PACKAGE_DIR           23 files        valid     /Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox 
 √  TEMPLATES_DIR         3 files         valid     /Users/miles/.pyenv/versions/3.11.1/lib/python3.11/site-packages/archivebox/templates
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                              

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled                                                                              
 -  COOKIES_FILE          -               disabled                                                                              

[i] Data locations:
 √  OUTPUT_DIR            8 files         valid     /Users/miles/archivebox                                                     
 √  SOURCES_DIR           8 files         valid     ./sources                                                                   
 √  LOGS_DIR              1 files         valid     ./logs                                                                      
 √  ARCHIVE_DIR           40 files        valid     ./archive                                                                   
 √  CONFIG_FILE           222.0 Bytes     valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             588.0 KB        valid     ./index.sqlite3                                                             
kerem closed this issue 2026-03-01 14:46:01 +03:00

@pirate commented on GitHub (Aug 13, 2023):

Did you do these steps as mentioned in the other issue:

>>> from archivebox.config import setup_django
>>> setup_django()
...
>>> from main import init
>>> init()
...
>>> from core.models import Snapshot
>>> Snapshot.objects.all()
<QuerySet []>

A Django DB must be initialized before you can call ArchiveBox functions; otherwise ArchiveBox doesn't know which DB to apply them to. Do this before importing any other part of archivebox, because they all depend on a Django DB being present and valid.


@mlsteele commented on GitHub (Aug 14, 2023):

I hadn't. Using archivebox.config first as you say does make it work.

This example from the API docs (https://docs.archivebox.io/en/latest/Usage.html) crashes at the line that starts with from archivebox.main import:

import os
DATA_DIR = f"{os.environ['HOME']}/archivebox"
os.chdir(DATA_DIR)

from archivebox.main import check_data_folder, setup_django, add, remove, server

check_data_folder(DATA_DIR)
setup_django(DATA_DIR)

This modification, which calls setup_django from archivebox.config first, works:

# This modification, based on your mention of `archivebox.config`, works.
import os
DATA_DIR = f"{os.environ['HOME']}/archivebox"
os.chdir(DATA_DIR)

from archivebox.config import setup_django
setup_django()

from archivebox.main import check_data_folder, setup_django, add, remove, server

Consider changing those docs if that's the intended usage.


@pirate commented on GitHub (Aug 16, 2023):

Docs updated: https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#python-api-usage
