[GH-ISSUE #500] Bugfix: Exception thrown from wget extractor #3346

Closed
opened 2026-03-14 22:18:26 +03:00 by kerem · 11 comments

Originally created by @jrruethe on GitHub (Oct 5, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/500

Describe the bug

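The Wget extractor is throwing the exception Failed to archive link: ValueError: '/data/index.html' does not start with '/data/archive/1601694245.432877', which appears to originate from this block of code: https://github.com/pirate/ArchiveBox/blob/3e26ab3ce3695452141d99671dd0e5865ae9a096/archivebox/extractors/wget.py#L182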
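For context, this is the error pathlib raises when relative_to() is given a path that does not live under the base directory. A minimal sketch that reproduces it outside ArchiveBox, using the two paths from the log; the wget output path under the link folder is a made-up example, and PurePosixPath is used so nothing touches the real filesystem:

```python
from pathlib import PurePosixPath

# Paths copied from the log output; PurePosixPath never touches the disk.
link_dir = PurePosixPath("/data/archive/1601694245.432877")
main_index = PurePosixPath("/data/index.html")

# Hypothetical wget output inside the link folder (illustration only).
inside = link_dir / "example.com" / "index.html"

# A file under link_dir resolves to a relative path as expected.
relative = str(inside.relative_to(link_dir))

# The main index sits above link_dir, so relative_to() raises ValueError.
try:
    main_index.relative_to(link_dir)
    raised = False
except ValueError:
    raised = True
```

The exact wording of the ValueError message differs between Python versions, so the sketch only checks that the exception is raised.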

Steps to reproduce

I am using archivebox add --update-all --depth 1 http://bookmarks?do=atom, where http://bookmarks?do=atom is my Shaarli instance. Essentially, I am archiving all my bookmarks.

The log output shows everything works fine for quite a while, then this exception occurs and the Docker container dies.

The folder referenced in the log output below is empty (/data/archive/1601694245.432877).
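One way to check whether other link folders are empty in the same way is a small script like the following. This is a rough sketch, not part of ArchiveBox; empty_snapshot_dirs is a hypothetical helper, and it assumes the archive layout seen in the log (one folder per link under /data/archive):

```python
from pathlib import Path
from typing import List


def empty_snapshot_dirs(archive_dir: Path) -> List[Path]:
    """Return the folders directly under archive_dir that have no entries."""
    empty = []
    for snapshot in sorted(archive_dir.iterdir()):
        # any() stops at the first entry, so large folders are cheap to check.
        if snapshot.is_dir() and not any(snapshot.iterdir()):
            empty.append(snapshot)
    return empty
```

Calling empty_snapshot_dirs(Path("/data/archive")) would then list every link folder that has nothing in it.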

I believe that the link archived just before is fine (https://blog.thefactual.com/what-are-the-best-nonpartisan-news-sources), and that the exception is occurring on the next link; however, the log output doesn't show me which link it is trying to archive. The /data/index.html file that the exception refers to appears to be the main index file, so I am not sure whether this is happening at the very end, after all links have been indexed, when it tries to rebuild the main index.html file as a last step.

This is with a recently published Docker image nikisweeting/archivebox@sha256:ee4c84369b8620c53f7d5772b70ad86aa22ca71d3ae3648ef19b75ba2c14efaf. I was previously using nikisweeting/archivebox:0.4.21, and when I switched to the newer image, I did not run any init or migration steps before calling add again.

Thank you for your time and help, let me know if there is additional information I can provide that would be useful.

BTW, I saw the comments in this issue (https://github.com/pirate/ArchiveBox/issues/490) describing how the main index is in the process of being removed. It could very well be that this issue is just a consequence of running code that is mid-refactor.

Screenshots or log output

2020-10-04T08:29:38.902862648Z [+] [2020-10-04 08:29:38] "blog.thefactual.com/what-are-the-best-nonpartisan-news-sources"
2020-10-04T08:29:38.902867437Z     https://blog.thefactual.com/what-are-the-best-nonpartisan-news-sources
2020-10-04T08:29:38.90287951Z     > ./archive/1601694245.427045
2020-10-04T08:29:38.903008686Z       > title
2020-10-04T08:29:39.469988358Z       > favicon
2020-10-04T08:29:40.605976312Z       > wget
2020-10-04T08:29:56.790239512Z       > pdf
2020-10-04T08:30:03.981854219Z       > screenshot
2020-10-04T08:30:09.238869897Z       > dom
2020-10-04T08:30:14.230718869Z       > readability
2020-10-04T08:30:15.530399373Z       > mercury
2020-10-04T08:30:18.555948774Z       > headers
2020-10-04T08:30:19.447750335Z     ! Failed to archive link: ValueError: '/data/index.html' does not start with '/data/archive/1601694245.432877'
2020-10-04T08:30:19.447779591Z 
2020-10-04T08:30:19.454165324Z Traceback (most recent call last):
2020-10-04T08:30:19.454182587Z   File "/usr/local/bin/archivebox", line 33, in <module>
2020-10-04T08:30:19.45570422Z     sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
2020-10-04T08:30:19.455721833Z   File "/app/archivebox/cli/__init__.py", line 123, in main
2020-10-04T08:30:19.456645155Z     run_subcommand(
2020-10-04T08:30:19.456662698Z   File "/app/archivebox/cli/__init__.py", line 63, in run_subcommand
2020-10-04T08:30:19.456722362Z     module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
2020-10-04T08:30:19.45673672Z   File "/app/archivebox/cli/archivebox_add.py", line 78, in main
2020-10-04T08:30:19.456934758Z     add(
2020-10-04T08:30:19.456943715Z   File "/app/archivebox/util.py", line 113, in typechecked_function
2020-10-04T08:30:19.457280297Z     return func(*args, **kwargs)
2020-10-04T08:30:19.457289795Z   File "/app/archivebox/main.py", line 559, in add
2020-10-04T08:30:19.457937511Z     archive_links(all_links, overwrite=overwrite, out_dir=out_dir)
2020-10-04T08:30:19.457947921Z   File "/app/archivebox/util.py", line 113, in typechecked_function
2020-10-04T08:30:19.458028775Z     return func(*args, **kwargs)
2020-10-04T08:30:19.458039155Z   File "/app/archivebox/extractors/__init__.py", line 157, in archive_links
2020-10-04T08:30:19.458703713Z     archive_link(to_archive, overwrite=overwrite, methods=methods, out_dir=Path(link.link_dir))
2020-10-04T08:30:19.458716236Z   File "/app/archivebox/util.py", line 113, in typechecked_function
2020-10-04T08:30:19.458788705Z     return func(*args, **kwargs)
2020-10-04T08:30:19.458797612Z   File "/app/archivebox/extractors/__init__.py", line 83, in archive_link
2020-10-04T08:30:19.458868156Z     write_link_details(link, out_dir=out_dir, skip_sql_index=skip_index)
2020-10-04T08:30:19.45887508Z   File "/app/archivebox/util.py", line 113, in typechecked_function
2020-10-04T08:30:19.458962987Z     return func(*args, **kwargs)
2020-10-04T08:30:19.458979369Z   File "/app/archivebox/index/__init__.py", line 350, in write_link_details
2020-10-04T08:30:19.459758926Z     write_json_link_details(link, out_dir=out_dir)
2020-10-04T08:30:19.459772251Z   File "/app/archivebox/util.py", line 113, in typechecked_function
2020-10-04T08:30:19.459850541Z     return func(*args, **kwargs)
2020-10-04T08:30:19.459865449Z   File "/app/archivebox/index/json.py", line 100, in write_json_link_details
2020-10-04T08:30:19.460144481Z     atomic_write(str(path), link._asdict(extended=True))
2020-10-04T08:30:19.460153118Z   File "/app/archivebox/index/schema.py", line 206, in _asdict
2020-10-04T08:30:19.460434244Z     'canonical': self.canonical_outputs(),
2020-10-04T08:30:19.460446568Z   File "/app/archivebox/index/schema.py", line 406, in canonical_outputs
2020-10-04T08:30:19.460610581Z     'wget_path': wget_output_path(self),
2020-10-04T08:30:19.460619568Z   File "/app/archivebox/util.py", line 113, in typechecked_function
2020-10-04T08:30:19.460691705Z     return func(*args, **kwargs)
2020-10-04T08:30:19.460699941Z   File "/app/archivebox/extractors/wget.py", line 182, in wget_output_path
2020-10-04T08:30:19.461027055Z     return str(html_files[0].relative_to(link.link_dir))
2020-10-04T08:30:19.461035802Z   File "/usr/local/lib/python3.8/pathlib.py", line 907, in relative_to
2020-10-04T08:30:19.46225102Z     raise ValueError("{!r} does not start with {!r}"
2020-10-04T08:30:19.462267782Z ValueError: '/data/index.html' does not start with '/data/archive/1601694245.432877'

Software versions

Docker image nikisweeting/archivebox@sha256:ee4c84369b8620c53f7d5772b70ad86aa22ca71d3ae3648ef19b75ba2c14efaf

kerem closed this issue 2026-03-14 22:18:33 +03:00

@cdvv7788 commented on GitHub (Oct 5, 2020):

I will check it. It was probably introduced in the last refactor. Thanks for the report.


@cdvv7788 commented on GitHub (Oct 5, 2020):

We are not able to reproduce the issue. Can you try running init before this command? (Back up your archive first.)


@jrruethe commented on GitHub (Oct 7, 2020):

Yes, I will try this and report back with the results. Thank you


@jrruethe commented on GitHub (Oct 13, 2020):

This issue can be closed. I tried a few things, I'm not sure exactly what fixed it, but here is what I did:

  1. Made a backup
  2. Deleted the offending file data/index.html
  3. Updated to image nikisweeting/archivebox@sha256:f3db6ca0ac5eb9405daf5110dcb934cf7ba20b0a362adc380ccbf2d086a679c3
  4. archivebox init
  5. archivebox add ...

This ran for a bit, then completed successfully.
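Steps 1 and 2 above could be scripted roughly as follows. This is only a sketch under assumptions: backup_and_remove_main_index is a hypothetical helper, not an ArchiveBox command, and it assumes the backup destination lives outside the data folder and has enough free space for a full copy:

```python
import shutil
from pathlib import Path


def backup_and_remove_main_index(data_dir: Path, backup_dir: Path) -> bool:
    """Copy data_dir to backup_dir, then delete data_dir/index.html if present.

    Returns True if a top-level index.html was removed.
    """
    # Step 1: take a full copy first. copytree raises FileExistsError if
    # backup_dir already exists, so an earlier backup is never overwritten.
    shutil.copytree(data_dir, backup_dir)

    # Step 2: remove only the top-level static index, nothing under archive/.
    main_index = data_dir / "index.html"
    if main_index.is_file():
        main_index.unlink()
        return True
    return False
```

The remaining steps (switching the image, archivebox init, archivebox add) are run as usual afterwards.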

Thank you!


@jrruethe commented on GitHub (Oct 17, 2020):

I have hit this exception again. I'm still investigating to see if this is something strange with my setup / configuration. The following occurred with Docker image nikisweeting/archivebox@sha256:5810591719d05f15cb3af20fce517fb9866f468dda23e321e8312a4baa455009; it appears to happen right at the end, when it should be writing out the /data/index.html file. The /data/index.json file was written properly; in my case it is 4.0GB, so I have a pretty large archive. Because of this, even a small update takes a while to process before the exception occurs, so it will take me some time to reproduce.

[√] [2020-10-16 15:19:24] Update of 53 pages complete (14.48 min)
    - 0 links skipped
    - 53 links updated
    - 53 links had errors

    Hint: To view your archive index, open:
        /data/index.html
    Or run the built-in webserver:
        archivebox server


      ███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 100.0% (120/120sec)
    √ /data/index.json                                                                                                                                                                                                               
Traceback (most recent call last):                                                                                                                                                                                                   
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 123, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 63, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_add.py", line 78, in main
    add(
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 567, in add
    write_static_index([link.as_link_with_details() for link in all_links], out_dir=out_dir)
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/__init__.py", line 254, in write_static_index
    write_html_main_index(links, out_dir=out_dir, finished=True)
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/html.py", line 60, in write_html_main_index
    rendered_html = main_index_template(links, finished=finished)
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/html.py", line 75, in main_index_template
    'rows': '\n'.join(
  File "/app/archivebox/index/html.py", line 76, in <genexpr>
    main_index_row_template(link)
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/html.py", line 90, in main_index_row_template
    **link._asdict(extended=True),
  File "/app/archivebox/index/schema.py", line 206, in _asdict
    'canonical': self.canonical_outputs(),
  File "/app/archivebox/index/schema.py", line 406, in canonical_outputs
    'wget_path': wget_output_path(self),
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/extractors/wget.py", line 182, in wget_output_path
    return str(html_files[0].relative_to(link.link_dir))
  File "/usr/local/lib/python3.8/pathlib.py", line 907, in relative_to
    raise ValueError("{!r} does not start with {!r}"
ValueError: '/data/index.html' does not start with '/data/archive/1602820812.730733'
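Both tracebacks end at the same relative_to() call in wget_output_path. A defensive variant of that call might look like the sketch below. It is an illustration only, not the change made in the ArchiveBox codebase; safe_relative_path is a hypothetical helper, and returning None for an out-of-tree file is an assumed convention:

```python
from pathlib import PurePath
from typing import Optional


def safe_relative_path(candidate: PurePath, link_dir: PurePath) -> Optional[str]:
    """Return candidate relative to link_dir, or None if it lies outside it."""
    try:
        return str(candidate.relative_to(link_dir))
    except ValueError:
        # candidate (for example /data/index.html) is not under link_dir,
        # so there is no valid relative path to report.
        return None
```

With a guard like this, a stray match such as /data/index.html would be skipped instead of aborting the whole index write.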

@cdvv7788 commented on GitHub (Oct 19, 2020):

@jrruethe can you try with this branch? https://github.com/pirate/ArchiveBox/pull/502
You will need to build the image yourself from a checkout of that branch:
docker build -t archivebox --no-cache .
The actual archiving should be faster in that branch, making testing easier.


@jrruethe commented on GitHub (Oct 19, 2020):

Sure, I will give that a try later this week and report back my results. Thank you.


@pirate commented on GitHub (Oct 22, 2020):

Wow, a 4GB index.json! I am sorry you had to suffer through that file being rewritten so many times in the old version. As @cdvv7788 said, the situation is much improved after #502. The new release with that improvement should be out within the next week or two.


@jrruethe commented on GitHub (Oct 22, 2020):

No suffering at all, I have it running on a cronjob in the background and I don't check on it too often, so I didn't notice for a while.

Honestly, ArchiveBox handles the large archive pretty well; I've been impressed.

I haven't had a chance to try out the #502 branch yet, I'll see if I can get to it soon.

Thanks!


@jrruethe commented on GitHub (Oct 22, 2020):

Ok, I tested out this branch, and it fixes the exception. This issue can be closed again.

Right now, I am using an nginx container to serve the data/index.html file, but I am going to work on switching to the archivebox server method.

Thanks!


@cdvv7788 commented on GitHub (Oct 22, 2020):

Glad it worked. Let us know if it reappears.
