[GH-ISSUE #440] Error during import from folder: 'FileSystemStorage' object has no attribute '_s3copy' #341

Open
opened 2026-02-25 21:31:44 +03:00 by kerem · 0 comments

Originally created by @scheibling on GitHub (Dec 23, 2021).
Original GitHub issue: https://github.com/ciur/papermerge/issues/440

Originally assigned to: @ciur on GitHub.

Description
Papermerge doesn't delete files from the import folder after a successful import, so the same file is imported repeatedly.
The function upload_txt_artifact_to_s3() in papermerge/wsignals/signals.py:190 appears to run even though I have the config option set to:

DEFAULT_FILE_STORAGE = "mglib.storage.FileSystemStorage"

The issue can be worked around temporarily by commenting out the _s3copy calls in the upload_txt_artifact_to_s3 and upload_hocr_artifact_to_s3 functions.
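
A less invasive variant of that workaround might be to guard the call instead of commenting it out. The following is only a minimal sketch, not Papermerge's actual code: the two storage classes are stand-ins defined here for illustration, the helper name `copy_artifact_if_s3` is hypothetical, and the `(src, dst)` arguments to `_s3copy` are an assumption, since the traceback does not show the real arguments.

```python
class FileSystemStorage:
    """Stand-in for a local storage backend; it has no _s3copy method."""


class S3Storage:
    """Stand-in for an S3-capable backend (hypothetical, for illustration)."""

    def __init__(self):
        self.copied = []

    def _s3copy(self, src, dst):
        # Assumed signature; the real arguments are not shown in the issue.
        self.copied.append((src, dst))


def copy_artifact_if_s3(storage, src, dst):
    """Call storage._s3copy only when the backend provides it.

    Returns True if a copy was attempted, False if the backend was skipped.
    """
    s3copy = getattr(storage, "_s3copy", None)
    if not callable(s3copy):
        # Local storage: nothing to upload, so skip instead of raising.
        return False
    s3copy(src, dst)
    return True


local_result = copy_artifact_if_s3(FileSystemStorage(), "page.txt", "page.txt")
s3_backend = S3Storage()
s3_result = copy_artifact_if_s3(s3_backend, "page.txt", "page.txt")
```

With a guard like this, a signal receiver would return early for FileSystemStorage rather than raising AttributeError, which is the error that aborts the import task in the log below.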

Expected
The file is imported once, then removed or moved.

Actual
The file is imported and OCR works, but the file isn't moved or removed; see the error message below.

Info:

  • OS: Debian 11
  • Browser: Firefox
  • Database: SQLite
  • Papermerge Version: 2.0.1

Logs:

Dec 23 16:23:39 papermerge-hostname python[17001]: [2021-12-23 17:23:39,437: ERROR/ForkPoolWorker-2] Task papermerge.core.management.commands.worker.import_from_local_folder[1ea28a0c-a402-4301-a7c8-c591d09f560e] raised unexpected: AttributeError("'FileSystemStorage' object has no attribute '_s3copy'")
Dec 23 16:23:39 papermerge-hostname python[17001]: Traceback (most recent call last):
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
Dec 23 16:23:39 papermerge-hostname python[17001]:     R = retval = fun(*args, **kwargs)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
Dec 23 16:23:39 papermerge-hostname python[17001]:     return self.run(*args, **kwargs)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/management/commands/worker.py", line 70, in import_from_local_folder
Dec 23 16:23:39 papermerge-hostname python[17001]:     import_documents(settings.PAPERMERGE_IMPORTER_DIR)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/importers/local.py", line 52, in import_documents
Dec 23 16:23:39 papermerge-hostname python[17001]:     doc = go_through_pipelines(init_kwargs, apply_kwargs)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/import_pipeline.py", line 371, in go_through_pipelines
Dec 23 16:23:39 papermerge-hostname python[17001]:     doc = importer.apply(**apply_kwargs)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/import_pipeline.py", line 323, in apply
Dec 23 16:23:39 papermerge-hostname python[17001]:     self.ocr_document(
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/import_pipeline.py", line 185, in ocr_document
Dec 23 16:23:39 papermerge-hostname python[17001]:     ocr_page(
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/celery/local.py", line 188, in __call__
Dec 23 16:23:39 papermerge-hostname python[17001]:     return self._get_current_object()(*a, **kw)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/celery/app/trace.py", line 735, in __protected_call__
Dec 23 16:23:39 papermerge-hostname python[17001]:     return orig(self, *args, **kwargs)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/celery/app/task.py", line 392, in __call__
Dec 23 16:23:39 papermerge-hostname python[17001]:     return self.run(*args, **kwargs)
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/tasks.py", line 50, in ocr_page
Dec 23 16:23:39 papermerge-hostname python[17001]:     main_ocr_page(
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/ocr/page.py", line 405, in ocr_page
Dec 23 16:23:39 papermerge-hostname python[17001]:     ocr_page_pdf(
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/ocr/page.py", line 261, in ocr_page_pdf
Dec 23 16:23:39 papermerge-hostname python[17001]:     notify_txt_ready(
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/papermerge/core/ocr/page.py", line 163, in notify_txt_ready
Dec 23 16:23:39 papermerge-hostname python[17001]:     signals.post_page_txt.send(
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/django/dispatch/dispatcher.py", line 180, in send
Dec 23 16:23:39 papermerge-hostname python[17001]:     return [
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/.venv/lib/python3.9/site-packages/django/dispatch/dispatcher.py", line 181, in <listcomp>
Dec 23 16:23:39 papermerge-hostname python[17001]:     (receiver, receiver(signal=self, sender=sender, **named))
Dec 23 16:23:39 papermerge-hostname python[17001]:   File "/opt/papermerge/papermerge/wsignals/signals.py", line 216, in upload_txt_artifact_to_s3
Dec 23 16:23:39 papermerge-hostname python[17001]:     default_storage._s3copy(
Dec 23 16:23:39 papermerge-hostname python[17001]: AttributeError: 'FileSystemStorage' object has no attribute '_s3copy'