[GH-ISSUE #364] Bugfix: ... Changed timeout to 120 seconds in conf file - still timing out after 60 seconds using docker-compose #253

Closed
opened 2026-03-01 14:41:52 +03:00 by kerem · 6 comments

Originally created by @Taubin on GitHub (Jul 15, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/364

Describe the bug

I changed the timeout to 120 seconds in the ArchiveBox.conf file and pointed my docker-compose file to that file via env_file; however, it is still timing out after 60 seconds.

Conf file:

################################################################################
## General Settings
################################################################################

#OUTPUT_DIR="output"
OUTPUT_PERMISSIONS=755
#ONLY_NEW=False
TIMEOUT=120
MEDIA_TIMEOUT=3600
#TEMPLATES_DIR="archivebox/templates"
FOOTER_INFO="Content is hosted for personal archiving purposes only."

Docker-compose file:

version: '3'

services:
    archivebox:
        container_name: archivebox
        build: .                                   # replace this with nikisweeting/archivebox to use the docker-compose.yml file as a standalone file without having to clone the repo
        stdin_open: true                           # needed to be able to input URLs directly after `docker-compose up`
        tty: true                                  # needed to be able to pipe in URLs via stdin to `docker-compose exec ...`
        env_file: /home/taubin/ArchiveBox/archivebox.conf   # this feature is available starting >v0.4
        # environment:
        #     - SHOW_PROGRESS=False                  # make docker logs nicer by not writing lots of progress bar lines
        #     - MEDIA_TIMEOUT=60                     # Change media timeout
        #     - TIMEOUT=120                          # Change timeout to 2 minutes
        # volumes:
            # - ./data:/data
        command: bash -c 'echo "https://github.com/pirate/ArchiveBox" | /bin/archive; tail -f /dev/null'  # archive the Github repo homepage as a starting point so the index doesn't just show an empty list to new users
        restart: unless-stopped
        volumes:
            - ./data:/data
            - /home/taubin/ArchiveBox:/archive


    nginx:
        container_name: archivebox-nginx
        image: 'nginx'
        ports:
            - '8098:80'
        volumes:
            - ./etc/nginx/nginx.conf:/etc/nginx/nginx.conf
            - ./data:/var/www
        restart: unless-stopped
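
For readers following along, here is a minimal Python sketch of how a KEY=VALUE env file like the one above could be read and combined with inline `environment:` entries. It is not ArchiveBox's or Compose's actual loader: `parse_env_file` and `effective_env` are hypothetical names, the fallback of 60 is an assumption, and the quote stripping is a simplification that may differ from Compose's own parser. It does follow the documented Compose rule that inline `environment:` entries take precedence over `env_file` values.

```python
from typing import Dict, Iterable


def parse_env_file(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and '#' comments (simplified)."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Stripping surrounding quotes is a simplification of this sketch.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def effective_env(env_file: Dict[str, str], environment: Dict[str, str]) -> Dict[str, str]:
    """Merge both sources; inline `environment:` entries win over env_file."""
    merged = dict(env_file)
    merged.update(environment)
    return merged


# A subset of the conf file shown in this issue.
conf_lines = [
    '#OUTPUT_DIR="output"',
    "OUTPUT_PERMISSIONS=755",
    "#ONLY_NEW=False",
    "TIMEOUT=120",
    "MEDIA_TIMEOUT=3600",
]

env = effective_env(parse_env_file(conf_lines), environment={})
timeout = int(env.get("TIMEOUT", "60"))  # 60 is the fallback assumed here
print(timeout)
```

If the value printed by a check like this differs from what the running process reports, the file is probably not the source the process is actually reading.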

Output:

[*] [2020-07-15 21:46:42] "https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff"
    https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff
    √ output/archive/1594849026.0
      > title
        Failed: Unable to detect page title
        Run to see full output:
            cd /home/taubin/ArchiveBox/output/archive/1594849026.0;
            curl https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff | grep <title>
      > wget
      > pdf
        Failed:TimeoutExpired Command 'chromium-browser' timed out after 60 seconds
        Run to see full output:
            cd /home/taubin/ArchiveBox/output/archive/1594849026.0;
            chromium-browser --headless "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36" --window-size=1440,2000 --timeout=60000 --print-to-pdf https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff
      > screenshot
        Failed:TimeoutExpired Command 'chromium-browser' timed out after 60 seconds
        Run to see full output:
            cd /home/taubin/ArchiveBox/output/archive/1594849026.0;
            chromium-browser --headless "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36" --window-size=1440,2000 --timeout=60000 --screenshot https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff
      > dom
        Failed:TimeoutExpired Command 'chromium-browser' timed out after 60 seconds
        Run to see full output:
            cd /home/taubin/ArchiveBox/output/archive/1594849026.0;
            chromium-browser --headless "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36" --window-size=1440,2000 --timeout=60000 --dump-dom https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff
      > media
      > archive_org
        Failed: Failed to find "content-location" URL header in Archive.org response.
        Run to see full output:
            cd /home/taubin/ArchiveBox/output/archive/1594849026.0;
            curl --location --head --user-agent "ArchiveBox/6c4c6862e (+https://github.com/pirate/ArchiveBox/)" --max-time 60 https://web.archive.org/save/https://www.redditstatic.com/desktop2x/fonts/redesignIcon/redesignFont.49673a028235b94b800c5f37667963e5.woff
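
The `TimeoutExpired Command '...' timed out after 60 seconds` wording in this output has the same shape as the message of Python's `subprocess.TimeoutExpired`, and `--timeout=60000` appears to be the same 60 seconds expressed in milliseconds. The sketch below reproduces that pattern with a harmless sleeping child process standing in for the browser. `chrome_timeout_flag` and `run_with_timeout` are hypothetical helpers for illustration, not ArchiveBox code.

```python
import subprocess
import sys


def chrome_timeout_flag(timeout_seconds: int) -> str:
    """Build a millisecond --timeout flag like the one seen in the log."""
    return f"--timeout={timeout_seconds * 1000}"


def run_with_timeout(argv, timeout_seconds: float) -> str:
    """Run a command; return 'ok' or a 'Failed:...' line if it times out."""
    try:
        subprocess.run(argv, timeout=timeout_seconds, check=False)
        return "ok"
    except subprocess.TimeoutExpired as err:
        return f"Failed:{type(err).__name__} {err}"


# A child process that sleeps longer than allowed stands in for the browser.
sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
result = run_with_timeout(sleeper, timeout_seconds=0.5)

print(chrome_timeout_flag(60))
print(result)
```

Seeing 60 in both the flag and the exception suggests that a single effective TIMEOUT value of 60 was in use for these runs, rather than the 120 set in the conf file.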

Steps to reproduce

  1. Installed ArchiveBox using docker-compose per the instructions on the website.

  2. Copied the conf.default file to .conf

  3. Ran archivebox with the following command (for testing):

taubin@taubinserver:~$ archivebox https://www.reddit.com/r/AskUbuntu/comments/hhtoay/ubuntu_20_cpu_threads_hit_100_network_drops/

Screenshots or log output

The output still shows a 60-second timeout:


[+] [2020-07-15 22:01:08] "https://www.reddit.com/register/?dest=https%3A%2F%2Fwww.reddit.com%2Fr%2FAskUbuntu%2Fcomments%2Fhhtoay%2Fubuntu_20_cpu_threads_hit_100_network_drops%2F"
    https://www.reddit.com/register/?dest=https%3A%2F%2Fwww.reddit.com%2Fr%2FAskUbuntu%2Fcomments%2Fhhtoay%2Fubuntu_20_cpu_threads_hit_100_network_drops%2F
    > output/archive/1594850467
      > title
      > favicon
      > wget
      > pdf
      > screenshot
      > dom
      > media
      > archive_org
        Failed:TimeoutExpired Command 'curl' timed out after 60 seconds
        Run to see full output:
            cd /home/taubin/ArchiveBox/output/archive/1594850467;
            curl --location --head --user-agent "ArchiveBox/6c4c6862e (+https://github.com/pirate/ArchiveBox/)" --max-time 60 https://web.archive.org/save/https://www.reddit.com/register/?dest=https%3A%2F%2Fwww.reddit.com%2Fr%2FAskUbuntu%2Fcomments%2Fhhtoay%2Fubuntu_20_cpu_threads_hit_100_network_drops%2F

Software versions

  • OS: Ubuntu 20.04
  • ArchiveBox version: 10799e4
  • Python version: Python 3.8.2
  • Chrome version: Chromium 84.0.4147.89 snap
kerem closed this issue 2026-03-01 14:41:52 +03:00

@pirate commented on GitHub (Jul 16, 2020):

Can you check out the `django` branch, run `archivebox config --get TIMEOUT`, and post the output here?


@Taubin commented on GitHub (Jul 16, 2020):

Using the django branch errors on me entirely:

taubin@taubinserver:~/ArchiveBox/bin$ archivebox config --get TIMEOUT
Traceback (most recent call last):
  File "/home/taubin/ArchiveBox/bin/archivebox", line 7, in <module>
    from .cli import main
ModuleNotFoundError: No module named 'archivebox'

@pirate commented on GitHub (Jul 16, 2020):

git checkout django
git pull
pip install -e .
cd output
archivebox init
archivebox config --get TIMEOUT

@Taubin commented on GitHub (Jul 16, 2020):

Sorry about that.

taubin@taubinserver:~/ArchiveBox/output$ archivebox config --get TIMEOUT
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
[i] [2020-07-16 20:59:24] ArchiveBox v0.4.3: archivebox config --get TIMEOUT
    > /home/taubin/ArchiveBox/output

TIMEOUT=60
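
When scripting this check, the `KEY=VALUE` line has to be picked out from the warning and banner lines that precede it. A small sketch, assuming output shaped like the above; `parse_config_value` is a hypothetical helper and not part of ArchiveBox.

```python
from typing import Optional


def parse_config_value(output: str, key: str) -> Optional[str]:
    """Return the value from the last 'KEY=VALUE' line in CLI output, if any."""
    found = None
    prefix = f"{key}="
    for line in output.splitlines():
        stripped = line.strip()
        # Only lines that start with 'KEY=' count; banner lines that merely
        # mention the key are ignored.
        if stripped.startswith(prefix):
            found = stripped[len(prefix):]
    return found


sample_output = """\
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
[i] [2020-07-16 20:59:24] ArchiveBox v0.4.3: archivebox config --get TIMEOUT
    > /home/taubin/ArchiveBox/output

TIMEOUT=60
"""

print(parse_config_value(sample_output, "TIMEOUT"))
```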

@pirate commented on GitHub (Jul 16, 2020):

Ok good, and after this?

archivebox config --set TIMEOUT=3600
archivebox config --get TIMEOUT

@Taubin commented on GitHub (Jul 16, 2020):

That worked, thank you very much!
