[GH-ISSUE #596] Error: django.db.utils.IntegrityError: UNIQUE constraint failed: core_tag.slug #3389

Closed
opened 2026-03-14 22:35:46 +03:00 by kerem · 3 comments
Owner

Originally created by @terxw on GitHub (Jan 2, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/596

After running the newest Docker image (nikisweeting/archivebox:latest), I get the following bug.
Version 0.4.21 ran without this problem, although very slowly. I have 30000 links, which is why I am trying the newer version.

Steps to reproduce

docker-compose.yml

version: "3.7"
services:
    archivebox:
        container_name: archivebox
        # build: .
        image: nikisweeting/archivebox:latest
        command: server 0.0.0.0:8000
        stdin_open: true
        tty: true
        ports:
            - 8000:8000
        environment:
            - USE_COLOR=True
            - SHOW_PROGRESS=False
            - ONLY_NEW=False
            - TIMEOUT=120
            - MEDIA_TIMEOUT=3600
            - FETCH_TITLE=True
            - FETCH_WGET=True
            - FETCH_WARC=True
            - FETCH_PDF=True
            - FETCH_SCREENSHOT=True
            - FETCH_DOM=True
            - FETCH_GIT=True
            - FETCH_MEDIA=false
            - SUBMIT_ARCHIVE_DOT_ORG=True
            - USE_SINGLEFILE=True
            - CHECK_SSL_VALIDITY=False
            - FETCH_WGET_REQUISITES=True
            - RESOLUTION="1440,900"
            - WGET_USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36"
            - CHROME_USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36"
            - CHROME_HEADLESS=True
            - SECRET_KEY=""
        volumes:
            - /etc/localtime:/etc/localtime:ro
            - /storage/data/docs/archivebox:/data

Run it with:

docker-compose run archivebox init

log output

docker-compose run archivebox init
[i] [2021-01-02 10:15:56] ArchiveBox v0.5.0: archivebox init
    > /data
[!] This folder contains a JSON index. It is deprecated, and will no longer be kept up to date automatically.
    You can run `archivebox list --json --with-headers > index.json` to manually generate it.
[*] Updating existing ArchiveBox collection in this folder...
    /data
------------------------------------------------------------------
[*] Verifying archive folder structure...
    √ /data/sources
    √ /data/archive
    √ /data/logs
    √ /data/ArchiveBox.conf
[*] Verifying main SQL index and running migrations...
    √ /data/index.sqlite3
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 573, in get_or_create
    return self.get(**kwargs), False
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 429, in get
    raise self.model.DoesNotExist(
__fake__.DoesNotExist: Tag matching query does not exist.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: core_tag.slug
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 123, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 63, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 323, in init
    for migration_line in apply_migrations(out_dir):
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/sql.py", line 102, in apply_migrations
    call_command("migrate", interactive=False, stdout=out)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 243, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/app/archivebox/core/migrations/0006_auto_20201012_1520.py", line 20, in forwards_func
    to_add, _ = TagModel.objects.get_or_create(name=tag, slug=slugify(tag))
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 576, in get_or_create
    return self._create_object_from_params(kwargs, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 610, in _create_object_from_params
    obj = self.create(**params)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 447, in create
    obj.save(force_insert=True, using=self.db)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 753, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 790, in save_base
    updated = self._save_table(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 895, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 933, in _do_insert
    return manager._insert(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 1254, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1397, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: UNIQUE constraint failed: core_tag.slug    

Software versions

  • OS: ubuntu 20.04
  • ArchiveBox version: latest docker 0.5.0 nikisweeting/archivebox:latest
  • Python version: 3.9 in docker
  • Chrome version: from docker
kerem 2026-03-14 22:35:46 +03:00

@pirate commented on GitHub (Jan 2, 2021):

Thanks for reporting, we'll investigate. In the meantime, can you switch to using archivebox/archivebox:latest? We moved away from the old nikisweeting/archivebox repo under my personal account to a new, more "official" one.


@pirate commented on GitHub (Feb 1, 2021):

It's caused by both name=... and slug=... being used to create a unique Tag, but only one of those fields conflicts with an existing Tag you have.

archivebox/core/migrations/0006_auto_20201012_1520.py:

from django.utils.text import slugify

def forwards_func(apps, schema_editor):
    SnapshotModel = apps.get_model("core", "Snapshot")
    TagModel = apps.get_model("core", "Tag")

    db_alias = schema_editor.connection.alias
    snapshots = SnapshotModel.objects.all()
    for snapshot in snapshots:
        tags = snapshot.tags
        # Split the legacy comma-separated tags_old string into unique tag names.
        tag_set = (
            set(tag.strip() for tag in (snapshot.tags_old or '').split(','))
        )
        tag_set.discard("")

        for tag in tag_set:
            # Looks up by name AND slug, so a partial match falls through to an INSERT.
            to_add, _ = TagModel.objects.get_or_create(name=tag, slug=slugify(tag))
            snapshot.tags.add(to_add)
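
For anyone wanting to check their own data before migrating, here is a rough sketch of how the tag names in `tags_old` could be grouped by slug to spot names that would compete for the same slug. It is not part of ArchiveBox: `slugify` below is a simplified stdlib approximation of Django's function, `find_slug_collisions` is a hypothetical helper, and the sample strings are made up.

```python
import re
import unicodedata
from collections import defaultdict


def slugify(value):
    """Simplified approximation of django.utils.text.slugify (ASCII only)."""
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def find_slug_collisions(tags_old_values):
    """Group tag names by slug and return only slugs shared by 2+ distinct names."""
    by_slug = defaultdict(set)
    for raw in tags_old_values:
        # Same splitting rule as the migration: comma-separated, stripped, no blanks.
        for tag in (raw or "").split(","):
            tag = tag.strip()
            if tag:
                by_slug[slugify(tag)].add(tag)
    return {slug: sorted(names) for slug, names in by_slug.items() if len(names) > 1}


# Made-up tags_old values, one per snapshot (None mimics an empty column).
collisions = find_slug_collisions(["Python, news", "python,C++", None, "c"])
print(collisions)
```

Any slug listed in the result maps to more than one distinct name, which is one plausible way to end up with a UNIQUE failure on the slug column.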

This is a rare but annoying edge case. It was a simple one-line fix (aa84a7f), but I didn't manage to add it in time for the v0.5.4 release.

        for tag in tag_set:
-            to_add, _ = TagModel.objects.get_or_create(name=tag, slug=slugify(tag))
+            to_add, _ = TagModel.objects.get_or_create(name=tag, defaults={'slug': slugify(tag)})
            snapshot.tags.add(to_add)
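
To illustrate why moving `slug` into `defaults` changes the outcome, here is a minimal sketch using only the stdlib `sqlite3` module. The `get_or_create` helper is a hypothetical stand-in that mimics the lookup-then-insert behaviour, not Django's actual implementation, and the `tag` table and the pre-existing `news` / `news-1` row are assumptions made up for the example.

```python
import sqlite3


def get_or_create(conn, lookup, defaults=None):
    """Hypothetical stand-in: `lookup` filters the SELECT, `defaults` is only
    applied when no row matches and a new one has to be inserted."""
    # Column names are interpolated directly; acceptable for a sketch only.
    where = " AND ".join(f"{col} = ?" for col in lookup)
    row = conn.execute(
        f"SELECT id, name, slug FROM tag WHERE {where}", tuple(lookup.values())
    ).fetchone()
    if row is not None:
        return row, False
    params = {**lookup, **(defaults or {})}
    cols = ", ".join(params)
    marks = ", ".join("?" for _ in params)
    cur = conn.execute(
        f"INSERT INTO tag ({cols}) VALUES ({marks})", tuple(params.values())
    )
    return (cur.lastrowid, params["name"], params["slug"]), True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE, slug TEXT UNIQUE)"
)
# Assumed existing tag whose stored slug differs from the freshly computed one.
conn.execute("INSERT INTO tag (name, slug) VALUES ('news', 'news-1')")

# Old call shape: name AND slug must both match, so the lookup misses
# and the follow-up INSERT violates a UNIQUE constraint.
try:
    get_or_create(conn, {"name": "news", "slug": "news"})
    strict_error = None
except sqlite3.IntegrityError as exc:
    strict_error = str(exc)

# Fixed call shape: look up by name only; slug is just a default for new rows.
row, created = get_or_create(conn, {"name": "news"}, defaults={"slug": "news"})
print(strict_error)
print(row, created)
```

In this sketch the second call returns the existing row instead of inserting. It only covers a tag whose name already exists; two distinct names that map to the same slug would still collide on insert with either call shape.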

You're welcome to run dev, or wait for the next release:

docker build -t archivebox:dev https://github.com/ArchiveBox/ArchiveBox.git#dev
docker run -v $PWD:/data archivebox:dev ...

Let me know if you're still having issues after that and I can reopen the ticket.


@pirate commented on GitHub (Apr 12, 2022):

Note I've added a new DB/filesystem troubleshooting area to the wiki that may help people arriving here from Google: https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives#database-troubleshooting

Contributions/suggestions welcome there.
