[GH-ISSUE #508] Bugfix: Issue with tags on tests #328

Closed
opened 2026-03-01 14:42:33 +03:00 by kerem · 2 comments

Originally created by @cdvv7788 on GitHub (Oct 20, 2020).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/508

Describe the bug

The Docker tests are failing because the wrong table is referenced when retrieving tags: https://github.com/pirate/ArchiveBox/runs/1281187367?check_suite_focus=true

Steps to reproduce

Run the Docker tests in GitHub Actions.

Screenshots or log output

Run echo "http://www.test-nginx-2.local" | docker run -i --network host -v "$PWD"/data:/data archivebox add
[i] [2020-10-20 13:27:31] ArchiveBox v0.4.21: archivebox add
    > /data

[+] [2020-10-20 13:27:33] Adding 1 links to index (crawl depth=0)...
    > Saved verbatim input to sources/1603200453-import.txt
    > Parsed 1 URLs from input (Plain Text)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/sqlite3/base.py", line 396, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such column: core_snapshot.tags

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 123, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 63, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_add.py", line 78, in main
    add(
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 549, in add
    new_links = dedupe_links(all_links, imported_links)
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/__init__.py", line 327, in dedupe_links
    dedup_links = fix_duplicate_links_in_index(snapshots, new_links)
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/__init__.py", line 311, in fix_duplicate_links_in_index
    if index_link:
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 280, in __bool__
    self._fetch_all()
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1261, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 57, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1152, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/sqlite3/base.py", line 396, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such column: core_snapshot.tags
Error: Process completed with exit code 1.
kerem closed this issue 2026-03-01 14:42:33 +03:00

@cdvv7788 commented on GitHub (Oct 20, 2020):

It looks like the cause is the merge/dedupe logic, which still uses the old `tags` logic. I am working on a fix.
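
For readers following along, here is a minimal sketch of the failure mode described above, using only the standard-library `sqlite3` module rather than ArchiveBox or Django code. It assumes that tags were moved out of a column on `core_snapshot` into a separate table; the `core_tag` and `core_snapshot_tags` table names, their columns, and the helper function names are hypothetical and chosen for illustration only. The old-style query reproduces the `no such column: core_snapshot.tags` error from the log, while the join-based query succeeds.

```python
import sqlite3


def build_schema(conn):
    # Assumed schema: tags live in their own table and are linked to
    # snapshots through a join table. Only `core_snapshot` appears in the
    # issue's error message; the other table names are hypothetical.
    conn.executescript(
        """
        CREATE TABLE core_snapshot (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
        CREATE TABLE core_tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE core_snapshot_tags (
            snapshot_id INTEGER REFERENCES core_snapshot(id),
            tag_id INTEGER REFERENCES core_tag(id)
        );
        INSERT INTO core_snapshot (id, url)
            VALUES (1, 'http://www.test-nginx-2.local');
        INSERT INTO core_tag (id, name) VALUES (1, 'example');
        INSERT INTO core_snapshot_tags (snapshot_id, tag_id) VALUES (1, 1);
        """
    )


def old_style_lookup(conn):
    # Old logic: treats `tags` as a column on the snapshot table.
    # With the schema above, that column does not exist.
    query = "SELECT core_snapshot.url, core_snapshot.tags FROM core_snapshot"
    return conn.execute(query).fetchall()


def new_style_lookup(conn):
    # Updated logic: follows the join table to reach the tag names.
    query = """
        SELECT s.url, t.name
        FROM core_snapshot AS s
        JOIN core_snapshot_tags AS st ON st.snapshot_id = s.id
        JOIN core_tag AS t ON t.id = st.tag_id
    """
    return conn.execute(query).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    try:
        old_style_lookup(conn)
        old_error = None
    except sqlite3.OperationalError as exc:
        # Same exception class as the first traceback in the log.
        old_error = str(exc)
    rows = new_style_lookup(conn)
    conn.close()
    return old_error, rows


if __name__ == "__main__":
    error, rows = demo()
    print("old-style query error:", error)
    print("join-based query rows:", rows)
```

In the traceback the query is only executed at `if index_link:`, because Django evaluates querysets lazily, so the stale column reference surfaces in `fix_duplicate_links_in_index` rather than where the queryset was built.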


@cdvv7788 commented on GitHub (Oct 22, 2020):

Fixed with PR. Closing.
