[GH-ISSUE #783] Bug: Sorting by file size throws "Server Error (500)" #2008

Open
opened 2026-03-01 17:55:46 +03:00 by kerem · 1 comment

Originally created by @AlexanderRitter02 on GitHub (Jul 6, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/783

Describe the bug

Trying to sort snapshots by file size results in "Server Error (500)" when clicking on the SIZE header.

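Clicking where shown in the picture:
[image: https://user-images.githubusercontent.com/42546705/124593825-643e5c80-de5f-11eb-90cb-eb48e2ade0c1.png]
Results in:
[image: https://user-images.githubusercontent.com/42546705/124594123-b8e1d780-de5f-11eb-96ca-2c9228b702fb.png]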

I already tried running archivebox init, since that was suggested on similar issues, but it did not help.
I'm happy to provide any further information.

Steps to reproduce

I am using the Docker-Compose install on Windows.

  1. Start the web server (docker-compose up)
  2. Go to http://127.0.0.1:8000/admin/core/snapshot/
  3. Click on the header SIZE in the file size column

This will lead to the page http://127.0.0.1:8000/admin/core/snapshot/?o=4.-1 which displays a big Server Error (500).
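
For readers unfamiliar with the `o=4.-1` query string: the Django admin changelist encodes the requested sort order in the `o` parameter as dot-separated column indexes, with a leading `-` marking descending order. The sketch below only illustrates that encoding; `parse_admin_ordering` is a hypothetical helper, not part of Django or ArchiveBox, and which column each index maps to depends on the admin's `list_display`.

```python
def parse_admin_ordering(o_param):
    """Split an admin `o` value into (column_index, descending) pairs."""
    ordering = []
    for part in o_param.split("."):
        # A leading "-" marks descending order; the rest is the column index.
        _, prefix, index = part.rpartition("-")
        ordering.append((int(index), prefix == "-"))
    return ordering


parsed = parse_admin_ordering("4.-1")
print(parsed)
```

Read this way, the failing URL asks for column 4 (the SIZE column here) ascending, followed by column 1 descending.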

Screenshots or log output

archivebox_1  | System check identified no issues (0 silenced).
archivebox_1  | July 06, 2021 - 11:32:22
archivebox_1  | Django version 3.1.10, using settings 'core.settings'
archivebox_1  | Starting development server at http://0.0.0.0:8000/
archivebox_1  | Quit the server with CONTROL-C.
archivebox_1  | "GET /add/ HTTP/1.1" 200 6758
archivebox_1  | "GET / HTTP/1.1" 302 0
archivebox_1  | "GET /admin/core/snapshot/ HTTP/1.1" 200 111453
archivebox_1  | "GET /admin/jsi18n/ HTTP/1.1" 200 3191
archivebox_1  | Internal Server Error: /admin/core/snapshot/
archivebox_1  | Traceback (most recent call last):
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
archivebox_1  |     response = get_response(request)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py", line 181, in _get_response
archivebox_1  |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/contrib/admin/options.py", line 614, in wrapper
archivebox_1  |     return self.admin_site.admin_view(view)(*args, **kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/utils/decorators.py", line 130, in _wrapped_view
archivebox_1  |     response = view_func(request, *args, **kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
archivebox_1  |     response = view_func(request, *args, **kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/contrib/admin/sites.py", line 233, in inner
archivebox_1  |     return view(request, *args, **kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/utils/decorators.py", line 43, in _wrapper
archivebox_1  |     return bound_method(*args, **kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/utils/decorators.py", line 130, in _wrapped_view
archivebox_1  |     response = view_func(request, *args, **kwargs)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/contrib/admin/options.py", line 1811, in changelist_view
archivebox_1  |     'selection_note': _('0 of %(cnt)s selected') % {'cnt': len(cl.result_list)},
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 269, in __len__
archivebox_1  |     self._fetch_all()
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 1308, in _fetch_all
archivebox_1  |     self._result_cache = list(self._iterable_class(self))
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 53, in __iter__
archivebox_1  |     results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1143, in execute_sql
archivebox_1  |     sql, params = self.as_sql()
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 498, in as_sql
archivebox_1  |     extra_select, order_by, group_by = self.pre_sql_setup()
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 56, in pre_sql_setup
archivebox_1  |     order_by = self.get_order_by()
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 346, in get_order_by
archivebox_1  |     order_by.extend(self.find_ordering_name(
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 747, in find_ordering_name
archivebox_1  |     return [(OrderBy(transform_function(t, alias), descending=descending), False) for t in targets]
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 747, in <listcomp>
archivebox_1  |     return [(OrderBy(transform_function(t, alias), descending=descending), False) for t in targets]
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/query.py", line 1584, in transform
archivebox_1  |     return self.try_transform(wrapped, name)
archivebox_1  |   File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/query.py", line 1198, in try_transform
archivebox_1  |     raise FieldError(
archivebox_1  | django.core.exceptions.FieldError: Unsupported lookup 'count' for AutoField or join on the field not permitted, perhaps you meant contains?
archivebox_1  | "GET /admin/core/snapshot/?o=4.-1 HTTP/1.1" 500 145

ArchiveBox version

Creating archivebox_archivebox_run ... done
ArchiveBox v0.6.2
Cpython Linux Linux-5.4.72-microsoft-standard-WSL2-x86_64-with-glibc2.28 x86_64
IN_DOCKER=True DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND_ENGINE=ripgrep

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.6.2          valid     /usr/local/bin/archivebox                                           
 √  PYTHON_BINARY         v3.9.5          valid     /usr/local/bin/python3.9                                            
 √  DJANGO_BINARY         v3.1.10         valid     /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py   
 √  CURL_BINARY           v7.64.0         valid     /usr/bin/curl                                                       
 √  WGET_BINARY           v1.20.1         valid     /usr/bin/wget                                                       
 √  NODE_BINARY           v15.14.0        valid     /usr/bin/node                                                       
 √  SINGLEFILE_BINARY     v0.3.16         valid     /node/node_modules/single-file/cli/single-file                      
 √  READABILITY_BINARY    v0.0.2          valid     /node/node_modules/readability-extractor/readability-extractor      
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js                 
 √  GIT_BINARY            v2.20.1         valid     /usr/bin/git                                                        
 -  YOUTUBEDL_BINARY      -               disabled  /usr/local/bin/youtube-dl                                           
 √  CHROME_BINARY         v90.0.4430.93   valid     /usr/bin/chromium                                                   
 √  RIPGREP_BINARY        v0.10.0         valid     /usr/bin/rg                                                         

[i] Source-code locations:
 √  PACKAGE_DIR           22 files        valid     /app/archivebox                                                     
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates                                           
 -  CUSTOM_TEMPLATES_DIR  -               disabled                                                                      

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled                                                                      
 √  COOKIES_FILE          25.8 KB         valid     ./cookies.txt                                                       

[i] Data locations:
 √  OUTPUT_DIR            7 files         valid     /data                                                               
 √  SOURCES_DIR           256 files       valid     ./sources                                                           
 √  LOGS_DIR              1 files         valid     ./logs                                                              
 √  ARCHIVE_DIR           1401 files      valid     ./archive                                                           
 √  CONFIG_FILE           179.0 Bytes     valid     ./ArchiveBox.conf                                                   
 √  SQL_INDEX             13.3 MB         valid     ./index.sqlite3

@pirate commented on GitHub (Jul 6, 2021):

Sorting by actual on-disk size is not really possible in the current version, as we don't save the size in the DB, so it's not a sortable column. It's lazily computed and cached on each page load. What I tried to do was sort by the number of successful ArchiveResults (which is only a weak proxy for archive size), but I didn't test it super thoroughly or spend a lot of time implementing that, so I'm not surprised it's broken.

Thanks for reporting.
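
To illustrate the proxy described in the comment above (ordering snapshots by their number of successful results), here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the column names, and the `'succeeded'` status value are simplified assumptions for illustration, not ArchiveBox's actual schema, and this is not the project's implementation. In a Django admin, the analogous approach would likely be to annotate the queryset with a `Count` and point the column's `admin_order_field` at that annotation rather than at a lookup such as `count` on a plain field.

```python
import sqlite3

# Hypothetical, simplified tables; NOT ArchiveBox's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE snapshot (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE archiveresult (
        id INTEGER PRIMARY KEY,
        snapshot_id INTEGER REFERENCES snapshot(id),
        status TEXT
    );
    INSERT INTO snapshot (id, url) VALUES
        (1, 'https://example.com/a'),
        (2, 'https://example.com/b'),
        (3, 'https://example.com/c');
    INSERT INTO archiveresult (snapshot_id, status) VALUES
        (1, 'succeeded'), (1, 'failed'),
        (2, 'succeeded'), (2, 'succeeded'), (2, 'succeeded'),
        (3, 'failed');
""")


def snapshots_by_success_count(conn):
    """Return (url, n_succeeded) rows, most successful results first."""
    return conn.execute("""
        SELECT s.url, COUNT(r.id) AS n_succeeded
        FROM snapshot AS s
        LEFT JOIN archiveresult AS r
            ON r.snapshot_id = s.id AND r.status = 'succeeded'
        GROUP BY s.id
        ORDER BY n_succeeded DESC, s.id ASC
    """).fetchall()


rows = snapshots_by_success_count(conn)
for url, n_succeeded in rows:
    print(url, n_succeeded)
```

The `LEFT JOIN` keeps snapshots with no successful results in the list (with a count of 0), so they sort last instead of disappearing.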
