[GH-ISSUE #1107] Bug: Internal Server Error: /admin/core/snapshot/KeyError: 'title_str' occurs when portions of URL are longer than your filesystem allows #694

Open
opened 2026-03-01 14:45:35 +03:00 by kerem · 2 comments
Owner

Originally created by @bitjson on GitHub (Feb 25, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1107

Describe the bug

It seems like I've run into the same issue as https://github.com/ArchiveBox/ArchiveBox/issues/617: /admin/core/snapshot/ renders a Server Error (500).

The error persists when I restart the server, but the error resolves if I move the offending files out of the archive (a .html and a .orig file).

Steps to reproduce

I added a URL with depth=1, and it also archived https://ide.bitauth.com/import-template/eJw1j29rgzAQxr-KhL20GtNajYjg2kFfjK3D7dU6JCYXdMw_mFgGxe--RNcXx9397rmHuxt6ULyGlqEE1VoPKvH9qtFs0rXH-9Zfh8q3ADrdcKabvttoaIcfpmFzxd4q8b5V3yEXCVB8bAarMpYGdKwFUx2KBb9MrfMOSpuBtdMNKJTcZhetW7axJdO8LgcmbPtvUKzUORt6lxucBiTKnDTInNdzmR-Pl87k4pSTcH_pUvwbh5hWcUC2GAdUECZohG3IiAZCVoLHnMlwJ0Me0p3kYksiDpJKAEHwngfZYvj09pE_o9neOQ1DP2owp32ix8OpJJjgEofoy0VXGNXyN57_ADMdb0Q=

Screenshots or log output

System check identified no issues (0 silenced).
February 25, 2023 - 15:29:18
Django version 3.1.8, using settings 'core.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
"GET / HTTP/1.1" 302 0
Internal Server Error: /admin/core/snapshot/
Traceback (most recent call last):
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/db/models/options.py", line 575, in get_field
    return self.fields_map[field_name]
KeyError: 'title_str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/utils.py", line 265, in lookup_field
    f = _get_non_gfk_field(opts, name)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/utils.py", line 296, in _get_non_gfk_field
    field = opts.get_field(name)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/db/models/options.py", line 577, in get_field
    raise FieldDoesNotExist("%s has no field named '%s'" % (self.object_name, field_name))
django.core.exceptions.FieldDoesNotExist: Snapshot has no field named 'title_str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/core/handlers/base.py", line 204, in _get_response
    response = response.render()
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/response.py", line 105, in render
    self.content = self.rendered_content
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/response.py", line 83, in rendered_content
    return template.render(context, self._request)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/backends/django.py", line 61, in render
    return self.template.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 170, in render
    return self._render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 162, in _render
    return self.nodelist.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 938, in render
    bit = node.render_annotated(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 905, in render_annotated
    return self.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/loader_tags.py", line 150, in render
    return compiled_parent._render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 162, in _render
    return self.nodelist.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 938, in render
    bit = node.render_annotated(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 905, in render_annotated
    return self.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/loader_tags.py", line 150, in render
    return compiled_parent._render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 162, in _render
    return self.nodelist.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 938, in render
    bit = node.render_annotated(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 905, in render_annotated
    return self.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/loader_tags.py", line 62, in render
    result = block.nodelist.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 938, in render
    bit = node.render_annotated(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 905, in render_annotated
    return self.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/loader_tags.py", line 62, in render
    result = block.nodelist.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 938, in render
    bit = node.render_annotated(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/base.py", line 905, in render_annotated
    return self.render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/templatetags/base.py", line 33, in render
    return super().render(context)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/template/library.py", line 214, in render
    _dict = self.func(*resolved_args, **resolved_kwargs)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/templatetags/admin_list.py", line 341, in result_list
    'results': list(results(cl)),
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/templatetags/admin_list.py", line 317, in results
    yield ResultList(None, items_for_result(cl, res, None))
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/templatetags/admin_list.py", line 308, in __init__
    super().__init__(*items)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/templatetags/admin_list.py", line 233, in items_for_result
    f, attr, value = lookup_field(field_name, result, cl.model_admin)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/contrib/admin/utils.py", line 274, in lookup_field
    value = attr(obj)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/archivebox/core/admin.py", line 164, in title_str
    canon = obj.as_link().canonical_outputs()
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/archivebox/index/schema.py", line 427, in canonical_outputs
    'wget_path': wget_output_path(self),
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/archivebox/util.py", line 114, in typechecked_function
    return func(*args, **kwargs)
  File "/usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/archivebox/extractors/wget.py", line 170, in wget_output_path
    if search_dir.exists():
  File "/usr/local/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pathlib.py", line 1424, in exists
    self.stat()
  File "/usr/local/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pathlib.py", line 1232, in stat
    return self._accessor.stat(self)
OSError: [Errno 63] File name too long: '/.../archivebox/archive/1677317139.02542/ide.bitauth.com/import-template/eJw1j29rgzAQxr-KhL20GtNajYjg2kFfjK3D7dU6JCYXdMw_mFgGxe--RNcXx9397rmHuxt6ULyGlqEE1VoPKvH9qtFs0rXH-9Zfh8q3ADrdcKabvttoaIcfpmFzxd4q8b5V3yEXCVB8bAarMpYGdKwFUx2KBb9MrfMOSpuBtdMNKJTcZhetW7axJdO8LgcmbPtvUKzUORt6lxucBiTKnDTInNdzmR-Pl87k4pSTcH_pUvwbh5hWcUC2GAdUECZohG3IiAZCVoLHnMlwJ0Me0p3kYksiDpJKAEHwngfZYvj09pE_o9neOQ1DP2owp32ix8OpJJjgEofoy0VXGNXyN57_ADMdb0Q='
"GET /admin/core/snapshot/ HTTP/1.1" 500 337243

ArchiveBox version

❯ archivebox version       
ArchiveBox v0.6.2
Cpython Darwin macOS-13.2.1-x86_64-i386-64bit x86_64
IN_DOCKER=False DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=ripgrep

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.2          valid     /usr/local/Cellar/archivebox/0.6.2-1/libexec/bin/archivebox                 
 √  PYTHON_BINARY         v3.9.16         valid     /usr/local/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/bin/python3.9
 √  DJANGO_BINARY         v3.1.8          valid     /usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/django/bin/django-admin.py
 √  CURL_BINARY           v7.86.0         valid     /usr/bin/curl                                                               
 √  WGET_BINARY           v1.21.3         valid     /usr/local/bin/wget                                                         
 √  NODE_BINARY           v18.14.2        valid     /Users/me/n/bin/node                                                     
 √  SINGLEFILE_BINARY     v0.3.17         valid     ./node_modules/single-file/cli/single-file                                  
 √  READABILITY_BINARY    v0.0.2          valid     ./node_modules/readability-extractor/readability-extractor                  
 √  MERCURY_BINARY        v1.0.0          valid     ./node_modules/@postlight/mercury-parser/cli.js                             
 √  GIT_BINARY            v2.39.0         valid     /usr/local/bin/git                                                          
 √  YOUTUBEDL_BINARY      v2021.12.17     valid     /usr/local/bin/youtube-dl                                                   
 √  CHROME_BINARY         v110.0.5481.100  valid     "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"              
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/local/bin/rg                                                           

[i] Source-code locations:
 √  PACKAGE_DIR           23 files        valid     /usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/archivebox
 √  TEMPLATES_DIR         3 files         valid     /usr/local/Cellar/archivebox/0.6.2-1/libexec/lib/python3.9/site-packages/archivebox/templates
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                              

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled                                                                              
 -  COOKIES_FILE          -               disabled                                                                              

[i] Data locations:
 √  OUTPUT_DIR            7 files         valid     /Users/me/archivebox                                             
 √  SOURCES_DIR           241 files       valid     ./sources                                                                   
 √  LOGS_DIR              1 files         valid     ./logs                                                                      
 √  ARCHIVE_DIR           1054 files      valid     ./archive                                                                   
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             9.3 MB          valid     ./index.sqlite3                                                             

@pirate commented on GitHub (Feb 25, 2023):

Sorry, no good solution for this at the moment. Filename lengths are enforced by your filesystem and are out of ArchiveBox's control. It's a fundamental limit of using the filesystem to store URLs for the wget downloads (this only affects the wget extractor though; all the other methods are fine).

There are filesystems that let you use longer names IIRC, but that's probably too much to ask for you to switch 😉
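To check ahead of time whether a URL would hit this limit, something like the following could be used. It is only a sketch: the helper names `overlong_segments` and `filesystem_name_max` are hypothetical (not part of ArchiveBox), the 255 default is an assumption based on the limit many common filesystems use, and whether the limit counts bytes or characters depends on the filesystem.

```python
import os
from urllib.parse import urlparse


def overlong_segments(url: str, name_max: int = 255) -> list:
    """Return the URL path segments whose UTF-8 byte length exceeds name_max.

    name_max defaults to 255, an assumed value; pass the real limit of
    your filesystem if you know it.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    return [s for s in segments if len(s.encode("utf-8")) > name_max]


def filesystem_name_max(directory: str = ".") -> int:
    """Ask the OS for the per-component name limit of the filesystem
    that holds `directory` (Unix only)."""
    return os.pathconf(directory, "PC_NAME_MAX")


# Hypothetical URL with one 300-character path segment.
example_url = "https://example.com/import-template/" + "a" * 300
too_long = overlong_segments(example_url)
```

On a Unix system, `overlong_segments(url, filesystem_name_max("./archive"))` would use the limit reported for the archive directory instead of the assumed default.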

We could add error handling for this in the backend just so it doesn't throw a 500, but we're always going to have issues with these files, and the nicer error doesn't really change the fact that your FS can't handle writing and reading path fragments beyond the max length limit.
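A minimal sketch of what that error handling might look like, assuming the failing call is the `search_dir.exists()` check shown in the traceback. The helper name `path_exists_safely` is hypothetical and is not ArchiveBox's actual code; it only illustrates treating "File name too long" as "not found" instead of letting it bubble up as a 500.

```python
import errno
from pathlib import Path


def path_exists_safely(path: Path) -> bool:
    """Like Path.exists(), but return False when the OS rejects the path
    because a component (or the whole path) is too long."""
    try:
        return path.exists()
    except OSError as err:
        # ENAMETOOLONG is errno 63 on macOS (as in the log above) and 36 on Linux.
        if err.errno == errno.ENAMETOOLONG:
            return False
        # Anything else is unexpected here, so let it propagate.
        raise


# A single path component far longer than typical filesystems accept.
overlong_path = Path("/tmp") / ("x" * 5000)
overlong_exists = path_exists_safely(overlong_path)
current_dir_exists = path_exists_safely(Path("."))
```

This only hides the crash in the admin list view; the wget output for such a URL would still be missing or unreadable on disk.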


@berezovskyi commented on GitHub (May 27, 2023):

For anyone wondering how to fix their system:

# the domain from the trace goes here
docker-compose run archivebox list -t domain link.foreignaffairs.com
# once you find the exact link for the timestamp (also in the trace, before the domain)
docker-compose run archivebox remove --filter-type exact --delete https://link.foreignaffairs.com/very/long/url/here

# for the future
docker-compose run archivebox config --set SAVE_WGET=False
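
To pull the timestamp and domain out of the trace without reading it by eye, a small helper along these lines may help. It is a sketch that assumes the path in the `OSError` message has the shape `.../archive/<timestamp>/<domain>/...` seen in the log above; `parse_snapshot_path` is a hypothetical name, and `LONG_SEGMENT` stands in for the overlong final path component.

```python
import re


def parse_snapshot_path(path: str):
    """Extract (timestamp, domain) from an archive path copied out of the trace.

    Returns None if the path does not look like .../archive/<timestamp>/<domain>/...
    """
    match = re.search(r"/archive/(\d+(?:\.\d+)?)/([^/]+)", path)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Path shape taken from the OSError above, with a placeholder final segment.
trace_path = "/.../archivebox/archive/1677317139.02542/ide.bitauth.com/import-template/LONG_SEGMENT"
parsed = parse_snapshot_path(trace_path)
print(parsed)  # ('1677317139.02542', 'ide.bitauth.com')
```

The domain can then be passed to the `list -t domain` command above to find the exact URL to remove.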