mirror of
https://github.com/ciur/papermerge.git
synced 2026-04-25 12:05:58 +03:00
[GH-ISSUE #169] Refactor of document importing #132
Originally created by @francescocarzaniga on GitHub (Oct 14, 2020).
Original GitHub issue: https://github.com/ciur/papermerge/issues/169
I am opening a few issues at the same time, but these are just the things I noticed in the last couple of days, and a reminder for me in case I want to do some PRs.
Document importing is a major cornerstone of Papermerge, but right now the code is scattered across a few places and does not support plugins at all. One solution would be to create an importer app and move the DocumentImporter class there. I then propose a split built around a new class, DocumentProcessor. As subclasses, DocumentValidator would check MIME types (using python-magic rather than the file extension) and count pages, while DocumentImporter would do the actual importing. They could also be merged into DocumentProcessor. Such a class could then be used both in the upload view and in other importers to do all the heavy lifting. This would make it possible to add custom pre- and post-processing for documents and would go a long way towards more pluggability.
This is just a proposal. The spirit is to make document importing more unified and modular, so a number of solutions are possible.
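A rough sketch of the split proposed above, to make the idea concrete. It is not Papermerge code: the method names (`validate`, `import_file`, `process`), the hook signatures, and the in-memory `imported` list are all hypothetical, and composition is used here instead of the subclassing mentioned in the proposal. To stay dependency-free, `sniff_mimetype` is a tiny stand-in that checks a few well-known file signatures; in a real implementation, `magic.from_file(path, mime=True)` from python-magic would take its place. Page counting is left out.

```python
from pathlib import Path
from typing import Callable, List, Optional, Sequence, Tuple

# Stand-in for python-magic: a few well-known leading byte signatures.
# (Assumption for this sketch; the proposal itself suggests python-magic.)
SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def sniff_mimetype(path: Path) -> Optional[str]:
    """Guess the MIME type from file content, ignoring the extension."""
    with open(path, "rb") as f:
        head = f.read(16)
    for signature, mime in SIGNATURES.items():
        if head.startswith(signature):
            return mime
    return None


class DocumentValidator:
    """Hypothetical validator: accepts only an allowed set of MIME types."""

    def __init__(self, allowed: Sequence[str] = ("application/pdf",)):
        self.allowed = set(allowed)

    def validate(self, path: Path) -> str:
        mime = sniff_mimetype(path)
        if mime not in self.allowed:
            raise ValueError(f"{path.name}: unsupported type {mime!r}")
        return mime


class DocumentImporter:
    """Hypothetical importer: only records what it would import."""

    def __init__(self) -> None:
        self.imported: List[Tuple[str, str]] = []

    def import_file(self, path: Path, mime: str) -> None:
        self.imported.append((path.name, mime))


class DocumentProcessor:
    """Runs pre-hooks, validation, import, then post-hooks, in that order."""

    def __init__(
        self,
        validator: DocumentValidator,
        importer: DocumentImporter,
        pre_hooks: Sequence[Callable[[Path], Path]] = (),
        post_hooks: Sequence[Callable[[Path], None]] = (),
    ):
        self.validator = validator
        self.importer = importer
        self.pre_hooks = list(pre_hooks)
        self.post_hooks = list(post_hooks)

    def process(self, path: Path) -> str:
        for hook in self.pre_hooks:   # a pre-hook may return a new path
            path = hook(path)
        mime = self.validator.validate(path)
        self.importer.import_file(path, mime)
        for hook in self.post_hooks:  # post-hooks only observe the result
            hook(path)
        return mime
```

With something like this, the upload view and any other importer would both call `DocumentProcessor(...).process(path)`, and plugins would contribute behaviour by registering pre- or post-hooks rather than by patching the import code.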
@ciur commented on GitHub (Oct 15, 2020):
@francescocarzaniga, the description is too generic. Maybe you can be more specific.
@francescocarzaniga commented on GitHub (Oct 17, 2020):
I guess this and #167 stem from the same problem: lack of plugin support. This issue is a suggestion to change the document importing system to be more modular and plugin-friendly.
While I think this may not be urgent, the sooner you standardise a plugin system and document it, the quicker people will get on board. If you truly want Papermerge to be plugin-centric, then more people working on plugins => more features => bigger userbase.
@ciur commented on GitHub (Oct 18, 2020):
@francescocarzaniga
I am working on it ... :). In the future, Papermerge will be app-oriented (app = plugin in Django parlance).
@ciur commented on GitHub (Feb 28, 2021):
@francescocarzaniga
Document Pipelines are now part of Papermerge.