[GH-ISSUE #382] Configure Workers to annotate on Import #303

Open
opened 2026-02-25 21:31:39 +03:00 by kerem · 5 comments
Owner

Originally created by @jwalzer on GitHub (May 31, 2021).
Original GitHub issue: https://github.com/ciur/papermerge/issues/382

Originally assigned to: @ciur on GitHub.

Is your feature request related to a problem? Please describe.
I would like to run multiple workers, each monitoring a different IMPORTER_DIR (probably on different hosts).
The imported documents should then be classified according to their input location.

Describe the solution you'd like
Ideally, a worker would automatically tag each document with some meta-information, such as:

  • imported: yes
  • import_hostname: worker01
  • import_path: /import/scanner1

optionally:

  • Configure a worker with hardcoded tags that are set only on that worker
  • Configure a worker to import into a dedicated user's .inbox
  • Optionally, let a single worker monitor multiple directories, each with its own configuration/options
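
A minimal sketch of how the three proposed tags could be derived for one imported file. This is not Papermerge code: the function name `import_metadata` and its parameters are hypothetical, and only the tag keys and example values come from the request above.

```python
import socket
from pathlib import PurePosixPath


def import_metadata(file_path, importer_dir, hostname=None):
    """Build the key/value tags proposed above for one imported file.

    Hypothetical helper: it only computes the metadata and does not
    talk to Papermerge in any way.
    """
    importer_dir = PurePosixPath(importer_dir)
    # Raises ValueError if the file is not located under importer_dir.
    PurePosixPath(file_path).relative_to(importer_dir)
    return {
        "imported": "yes",
        # Fall back to the local host name when none is configured.
        "import_hostname": hostname or socket.gethostname(),
        "import_path": str(importer_dir),
    }


meta = import_metadata(
    "/import/scanner1/invoice.pdf", "/import/scanner1", hostname="worker01"
)
print(meta)
```

Passing the host name explicitly keeps the helper testable; a real worker would more likely read it once from its own configuration.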

Describe alternatives you've considered
I thought about abandoning IMPORTER_DIR entirely and solving this with shell scripting, as in https://github.com/Ryther/papermerge-importer, but I see value in having this implemented in Papermerge itself.

Additional context
Importer metadata that can be queried in automates would allow for much more sophisticated workflows and use cases.


@ciur commented on GitHub (Jun 1, 2021):

Actually, you can already run multiple workers, each with a different importer directory.
Note that IMPORTER_DIR (https://papermerge.com/docs/Installation/settings.html#document-importer) is a worker-specific setting, i.e. it can differ from worker to worker.
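
As a rough illustration, assuming each worker reads its own Python-syntax settings file (such as `papermerge.conf.py`), two workers could differ only in this one setting. The host names and paths below are placeholders taken from the example in the request, not recommended values.

```python
# Settings file on worker01 (host name and path are illustrative).
IMPORTER_DIR = "/import/scanner1"

# The settings file on worker02 would instead contain, for example:
# IMPORTER_DIR = "/import/scanner2"
```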

What is missing, and what I consider a good idea, is that "imported Documents should then be classified depending on their Input-Location". At the very least, tagging documents differently depending on their origin would be nice.

Thank you for opening this feature request.


@jwalzer commented on GitHub (Jun 1, 2021):

Yes, my main request is the tagging. Because the workers communicate via the queue/Redis, there shouldn't be any synchronisation issues, and multiple workers allow for multiple ingestion points. Letting every worker be configured with its own dedicated tags (maybe even freeform) would also make it possible to delegate worker setup in a highly distributed environment.
Different people can set up their own workers with the tags they use.
The only thing missing would be a secure way to determine the user whose account the document is imported into.


@mutax commented on GitHub (Jun 2, 2021):

My scenario is a scanner with three quick-scan buttons that each put the document into a different directory on my Samba share.
With tagging based on the source directory I could pre-classify documents right at scan time. This would be very useful here!
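
A sketch of what such directory-based pre-classification might look like, assuming a plain mapping from source directory to tags. The directory names, tag names, and the `tags_for` helper are all made up for illustration; nothing here is an existing Papermerge option.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: one directory per quick-scan button.
DIR_TAGS = {
    "/srv/samba/scan/invoices": ["invoice"],
    "/srv/samba/scan/letters": ["letter"],
    "/srv/samba/scan/receipts": ["receipt"],
}


def tags_for(file_path, dir_tags=DIR_TAGS):
    """Return the tags of the deepest configured directory containing file_path."""
    path = PurePosixPath(file_path)
    best_key, best_depth = None, -1
    for key in dir_tags:
        directory = PurePosixPath(key)
        # Only ancestors of the file qualify; prefer the most specific one.
        if directory in path.parents and len(directory.parts) > best_depth:
            best_key, best_depth = key, len(directory.parts)
    return list(dir_tags[best_key]) if best_key is not None else []


print(tags_for("/srv/samba/scan/invoices/2021-06-02.pdf"))
print(tags_for("/tmp/unknown.pdf"))
```

Choosing the deepest matching directory means a nested folder can override the tags of its parent, while files outside every configured directory simply get no tags.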


@ciur commented on GitHub (Jun 21, 2021):

Hi @jwalzer, thank you for your kind donation.
The feature you are asking for will make its way into Papermerge 2.1. However, please keep in mind that Papermerge 2.1 is scheduled for December 2021. I intentionally decided to spend more time developing the next release in order to address all the accumulated technical debt. In any case, I assure you that it is worth the wait 😉


@jwalzer commented on GitHub (Jun 22, 2021):

No problem. The donation is for the work done so far ;) If you need a friendly tester for the new features, drop me a note.
