[GH-ISSUE #258] Empty page or 404 with Docker Compose #1691

Closed
opened 2026-03-01 17:52:55 +03:00 by kerem · 2 comments

Originally created by @gerroon on GitHub (Aug 28, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/258

Describe the bug

I am getting an empty page at SERVER:4098. When I try /data/index.htm, I get a 404.

Steps to reproduce

  1. Ran ArchiveBox with the following config '...'
    Docker-compose up

  2. Saw this output during archiving '....'

echo "https://github.com/pirate/ArchiveBox/wiki/Docker" | docker-compose exec -T archivebox /bin/archive                                             
fatal: Not a git repository (or any of the parent directories): .git
[*] [2019-08-28 00:41:42] Parsing new links from output/sources/stdin-1566952902.txt...
    > Adding 0 new links to index (parsed import as Plain Text)
[*] [2019-08-28 00:41:42] Saving main index files...
    √ /data/index.json
    √ /data/index.html
[▶] [2019-08-28 00:41:42] Updating content for 2 pages in archive...

[*] [2019-08-28 00:41:42] "Docker · pirate/ArchiveBox Wiki · GitHub"
    https://github.com/pirate/ArchiveBox/wiki/Docker
    √ /data/archive/1566952707

[*] [2019-08-28 00:41:42] "GitHub - pirate/ArchiveBox: 🗃 The open source self-hosted web archive. Takes browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more..."
    https://github.com/pirate/ArchiveBox
    √ /data/archive/1566946654
[√] [2019-08-28 00:41:42] Update of 2 pages complete (0.02 sec)
    - 2 links skipped
    - 0 links updated
    - 0 links had errors
    To view your archive, open: /data/index.html
[*] [2019-08-28 00:41:42] Saving main index files...
    √ /data/index.json
    √ /data/index.html

  3. UI didn't show the thing I was expecting '....'
    I get an empty page like
Index of /

../

or 404

Screenshots or log output

Starting archivebox_docker_nginx_1        ... done
Recreating archivebox_docker_archivebox_1 ... done
Attaching to archivebox_docker_nginx_1, archivebox_docker_archivebox_1
archivebox_1  | fatal: Not a git repository (or any of the parent directories): .git
archivebox_1  | [*] [2019-08-28 00:41:03] Parsing new links from output/sources/stdin-1566952863.txt...                                                                                      
archivebox_1  |     > Adding 0 new links to index (parsed import as Plain Text)
archivebox_1  | [*] [2019-08-28 00:41:03] Saving main index files...
    √ /data/index.json/data/index.json
    √ /data/index.html/data/index.html
archivebox_1  | [▶] [2019-08-28 00:41:03] Updating content for 2 pages in archive...
archivebox_1  |
archivebox_1  | [*] [2019-08-28 00:41:03] "Docker · pirate/ArchiveBox Wiki · GitHub"
archivebox_1  |     https://github.com/pirate/ArchiveBox/wiki/Docker
archivebox_1  |     √ /data/archive/1566952707
archivebox_1  |
archivebox_1  | [*] [2019-08-28 00:41:03] "GitHub - pirate/ArchiveBox: 🗃 The open source self-hosted web archive. Takes browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs,
media, and more..."
archivebox_1  |     https://github.com/pirate/ArchiveBox
archivebox_1  |     √ /data/archive/1566946654
archivebox_1  | [√] [2019-08-28 00:41:03] Update of 2 pages complete (0.02 sec)
archivebox_1  |     - 2 links skipped
archivebox_1  |     - 0 links updated
archivebox_1  |     - 0 links had errors
archivebox_1  |     To view your archive, open: /data/index.html
archivebox_1  | [*] [2019-08-28 00:41:03] Saving main index files...
    √ /data/index.json/data/index.json
    √ /data/index.html/data/index.html
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:09 +0000] "GET / HTTP/1.1" 200 146 "http://10.10.1.10:4098/" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"            
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:10 +0000] "GET / HTTP/1.1" 200 146 "http://10.10.1.10:4098/" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"            
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:16 +0000] "GET /data HTTP/1.1" 404 146 "-" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"                             
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:21 +0000] "GET /data/index.html HTTP/1.1" 404 146 "-" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"                  
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:39 +0000] "GET / HTTP/1.1" 200 146 "-" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"                                 
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:44 +0000] "GET / HTTP/1.1" 200 146 "-" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"                                 
nginx_1       | 10.10.1.11 - - [28/Aug/2019:00:41:44 +0000] "GET / HTTP/1.1" 200 146 "-" "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"      

Software versions

I am using Docker Compose on Debian Testing. I followed the guide at https://github.com/pirate/ArchiveBox/wiki/Docker

As far as I can tell, the data folder I defined in the compose file is populated with a bunch of files, so permissions and access there should be fine.

Here is the docker-compose.yml


version: '3'

services:
    archivebox:
        build: .                                   # replace this with nikisweeting/archivebox to use the docker-compose.yml file as a standalone file without having to clone the repo
        stdin_open: true                           # needed to be able to input URLs directly after `docker-compose up`
        tty: true                                  # needed to be able to pipe in URLs via stdin to `docker-compose exec ...`
        # env_file: path/to/your/ArchiveBox.conf   # this feature is available starting >v0.4
        environment:
            - USE_COLOR=False                      # make docker logs nicer by not spamming lots of ANSI colors
            - SHOW_PROGRESS=False                  # make docker logs nicer by not writing lots of progress bar lines
        volumes:
            - /media/DRIVE/archivebox/data:/data
        command: bash -c 'echo "https://github.com/pirate/ArchiveBox" | /bin/archive; tail -f /dev/null'  # archive the Github repo homepage as a starting point so the index doesn't just show an empty list to new users

    nginx:
        image: 'nginx'
        ports:
            - '4098:80'
        volumes:
            - ./etc/nginx/nginx.conf:/etc/nginx/nginx.conf
            - ./data:/var/www


kerem closed this issue 2026-03-01 17:52:55 +03:00

@pirate commented on GitHub (Sep 25, 2019):

You visited http://127.0.0.1:4098 while docker was running and didn't see anything? What happens if you try opening the html file manually?
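A quick way to answer the "open the html file manually" question is to check whether `index.html` exists in each host directory named in the compose file. This is a rough sketch: `check_index` is a hypothetical helper, and the two paths are copied from the reporter's `docker-compose.yml`.

```shell
# Report whether a directory contains the generated index.html.
# (hypothetical helper, for illustration only)
check_index() {
    if [ -f "$1/index.html" ]; then
        echo "found: $1/index.html"
    else
        echo "missing: $1/index.html"
    fi
}

check_index /media/DRIVE/archivebox/data   # host side of the archivebox volume
check_index ./data                         # host side of the nginx volume
```

Run it from the directory containing `docker-compose.yml` so that `./data` refers to the same directory nginx mounts.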


@pirate commented on GitHub (Jul 24, 2020):

git checkout django
git pull
docker build . -t archivebox
docker run -v "$PWD/output:/data" archivebox init
docker run -v "$PWD/output:/data" -p 8000:8000 archivebox server

If this issue still happens on the latest version, comment back here and I'll reopen the issue.
