[GH-ISSUE #483] Bugfix: docker iframe adds /data/ to url incorrectly #3336

Closed
opened 2026-03-14 22:12:37 +03:00 by kerem · 15 comments
Owner

Originally created by @poblabs on GitHub (Sep 24, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/483

Describe the bug

Using the docker-compose.yml instructions, adding a URL works fine. When accessing that URL I get a 404 within the iframe.

Steps to reproduce

  1. Start docker-compose up
  2. Add URL
  3. Click on the URL. I get the top bar, but the iframe is a 404

Screenshots or log output

It looks like the iframe is trying to go to http://server/data/archive/1600957105.659589/site/index.html when the data folder doesn't exist as a web entity.

Browsing to http://server/archive/1600957105.659589/site/index.html works. (note: without the /data/ folder)

How can I get the iframe working correctly?

image

Software versions

Latest docker-compose.yml

kerem closed this issue 2026-03-14 22:12:42 +03:00

@cdvv7788 commented on GitHub (Sep 24, 2020):

I will give this a check. This may be related to some changes we made to calculate the path to wget.


@poblabs commented on GitHub (Sep 24, 2020):

Also worth noting: clicking this icon brings me to this very long, incorrect URL.

image

http://server/archive/1600958844.850607//data/archive/1600958844.850607/site/index.html/


@pirate commented on GitHub (Sep 24, 2020):

It's probably a str(Path(...)) that's expanding out to the full path now instead of the old relative path.
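
A minimal sketch of the failure mode described above, not ArchiveBox's actual code. The `/data` mount point and both helper names are assumptions for illustration: turning an absolute path into a string leaks the data directory into the URL, while making the path relative to the data directory first gives the URL that works.

```python
from pathlib import PurePosixPath

# Assumed for illustration: the container's data directory is mounted at /data.
OUTPUT_DIR = PurePosixPath("/data")


def url_from_absolute(snapshot_file: PurePosixPath) -> str:
    """Buggy variant (hypothetical): str() of an absolute path keeps the /data prefix."""
    return str(snapshot_file)


def url_from_relative(snapshot_file: PurePosixPath) -> str:
    """Fixed variant (hypothetical): make the path relative to the data directory first."""
    return "/" + str(snapshot_file.relative_to(OUTPUT_DIR))


snapshot = OUTPUT_DIR / "archive" / "1600957105.659589" / "site" / "index.html"
print(url_from_absolute(snapshot))  # /data/archive/1600957105.659589/site/index.html
print(url_from_relative(snapshot))  # /archive/1600957105.659589/site/index.html
```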


@cdvv7788 commented on GitHub (Sep 24, 2020):

Yes, that is the issue. I fixed the static index, but it seems this one persists. I will take care of this.


@cdvv7788 commented on GitHub (Sep 25, 2020):

@poblabs https://github.com/pirate/ArchiveBox/pull/486 This should fix the issue.


@poblabs commented on GitHub (Sep 25, 2020):

@cdvv7788 Will I be able to build a new docker image from your cdvv7788:wget-path branch?


@cdvv7788 commented on GitHub (Sep 25, 2020):

Yes, it should be possible. If you run into any issues, please let me know and I will check.


@poblabs commented on GitHub (Sep 25, 2020):

@cdvv7788 It failed.

docker build . -t archivebox --no-cache

Step 20/29 : WORKDIR "$CODE_DIR"
 ---> Running in e06219be686e
Removing intermediate container e06219be686e
 ---> 1442435435af
Step 21/29 : ADD . "$CODE_DIR"
 ---> 254a0d4fbc47
Step 22/29 : RUN pip install -e .
 ---> Running in 49333e96986c
Obtaining file:///app
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/app/setup.py'"'"'; __file__='"'"'/app/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-fstf2f7j
         cwd: /app/
    Complete output (11 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/app/setup.py", line 17, in <module>
        VERSION = json.loads((PYTHON_DIR / "package.json").read_text().strip())['version']
      File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
        return _default_decoder.decode(s)
      File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
The command '/bin/sh -c pip install -e .' returned a non-zero code: 1

@cdvv7788 commented on GitHub (Sep 25, 2020):

Weird. I just tried building it and it didn't fail. The automated tests were able to build it too. Can you check whether there are any modified files in your local copy (git status)? It looks like package.json has issues.
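
For reference, the `JSONDecodeError` in the build log is what Python's `json` module raises when the input is empty or does not start with a JSON value, which fits the guess that the `package.json` seen by `setup.py` had unexpected contents. The sketch below reproduces the message. `read_version` is a hypothetical helper and `0.0.0` is a placeholder version, neither taken from the project.

```python
import json


def read_version(package_json_text: str) -> str:
    """Hypothetical helper: parse the version, failing early with a clearer message."""
    text = package_json_text.strip()
    if not text:
        raise ValueError("package.json is empty")
    return json.loads(text)["version"]


# An empty string reproduces the error message from the build log.
try:
    json.loads("")
except json.JSONDecodeError as err:
    message = str(err)
    print(message)  # Expecting value: line 1 column 1 (char 0)

# Placeholder content standing in for a valid package.json.
print(read_version('{"version": "0.0.0"}'))  # 0.0.0
```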


@poblabs commented on GitHub (Sep 25, 2020):

Strange. I downloaded the zip file into a new directory, ran the docker build and it still failed. I'll try from master and merge your changes to see if I can get that to build.


@poblabs commented on GitHub (Sep 25, 2020):

That worked, and the fix worked. Thanks!


@poblabs commented on GitHub (Sep 25, 2020):

Actually, looks like there is still a minor problem.

Clicking this icon adds a trailing slash which messes up the CSS resources. Removing the trailing slash fixes it.

image


@cdvv7788 commented on GitHub (Sep 25, 2020):

Hmm, it seems this is unrelated. wget is using relative paths (src="../www.google.com/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png"), and they are not resolved correctly when the page URL has a trailing slash (the ../ then points to the current folder instead of moving up one level). This is definitely an issue. I will open a new GitHub issue to track it.
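
The effect of the trailing slash can be checked with the standard library's `urljoin`, which applies the same relative-reference resolution rules a browser uses. This is only an illustration: the `server` host comes from the URLs quoted earlier in the thread, and the asset path is a shortened stand-in for the wget-rewritten `src`.

```python
from urllib.parse import urljoin

# Page URL as quoted in this thread, minus the stray /data/ segment.
PAGE = "http://server/archive/1600958844.850607/site/index.html"
# Shortened stand-in for the relative src that wget writes into the page.
ASSET = "../www.google.com/images/logo.png"

# Without a trailing slash, index.html is a file, so ../ leaves the site/ folder.
without_slash = urljoin(PAGE, ASSET)
# With a trailing slash, index.html/ is treated as a folder, so ../ only climbs back to site/.
with_slash = urljoin(PAGE + "/", ASSET)

print(without_slash)  # http://server/archive/1600958844.850607/www.google.com/images/logo.png
print(with_slash)     # http://server/archive/1600958844.850607/site/www.google.com/images/logo.png
```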


@poblabs commented on GitHub (Sep 25, 2020):

@cdvv7788 I found it. Remove the trailing slash from this line: https://github.com/pirate/ArchiveBox/blob/master/archivebox/core/utils.py#L17

So it looks like: <a href="/{}/{}" ....

@cdvv7788 commented on GitHub (Sep 25, 2020):

Yes, it is added in there. However, a trailing slash should not (ideally) break anything at all. I will remove it from that line, but I will leave https://github.com/pirate/ArchiveBox/issues/487 open to find a more permanent solution. Thanks!
