mirror of
https://github.com/ciur/papermerge.git
synced 2026-04-25 03:55:58 +03:00
[GH-ISSUE #526] Maintain same filenames and folder structure on filesystem as in the database #406
Originally created by @rmatte on GitHub (Feb 17, 2023).
Original GitHub issue: https://github.com/ciur/papermerge/issues/526
Originally assigned to: @ciur on GitHub.
I tried using papermerge for a bit and I really like the interface. The one thing that was kind of off-putting is that the files on the filesystem don't get updated to match the changes I make to them in the UI. For instance, I can rename a file in the UI, but when I look at the copy of that file stored on the filesystem, the name stays the same as it was when I uploaded it. Also, when I create folders in papermerge and move documents into them, that is not reflected on the filesystem.
Why is this bad? Because if I were to lose the papermerge database due to corruption or something, even though I do still have the actual files on the filesystem, they aren't named and organized properly, so it would be a huge chore to then figure out what everything is and get going again.
It would be a significant improvement if renaming a file in the UI also renamed the file on disk, and if creating a directory in the UI created a directory on disk and moved the files into it. The files on disk should basically be a mirror image of what's in the UI in terms of filenames and folders/directories. Since the database does appear to have a link back to each individual file, you could also do this for any existing files which haven't been renamed or moved to a directory yet. If you need to keep multiple versions of files, you could simply add a version number to the end of each version's filename, or maybe create a sub-folder representing each file and store each version in there.
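To make the two versioning ideas above concrete, here is a minimal sketch of how each one would name a file. The function names, the `_v2` suffix style and the example folder names are all made up for illustration; none of this is taken from papermerge's code.

```python
from pathlib import PurePosixPath


def suffix_layout(folder_parts, title, version):
    """Option 1: append the version number to the end of the file name."""
    name = PurePosixPath(title)
    versioned = f"{name.stem}_v{version}{name.suffix}"
    return PurePosixPath(*folder_parts) / versioned


def subfolder_layout(folder_parts, title, version):
    """Option 2: one sub-folder per document, one file per version inside it."""
    name = PurePosixPath(title)
    return PurePosixPath(*folder_parts) / title / f"v{version}{name.suffix}"


# Hypothetical document "invoice.pdf" in UI folder Taxes/2022, second version.
print(suffix_layout(["Taxes", "2022"], "invoice.pdf", 2))
print(subfolder_layout(["Taxes", "2022"], "invoice.pdf", 2))
```

The first option keeps every version visible next to its siblings in the same folder; the second keeps the folder listing short at the cost of one extra directory level per document.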
If there's some sort of config option that can be enabled to allow for this let me know, but I read the docs and tried a lot of different stuff and I couldn't get it to behave this way.
I know that there's a command to create backups which include a copy of the database and the files in it, but that's still less than ideal. It would be very nice to be able to just back up the document directories and files somewhere as well, but for that to really be feasible they need to be renamed and organized in directories on disk in a way that makes sense.
I don't see any advantage at all to storing the documents in their original filenames all in one directory like the software is currently doing. It's a mess to work with outside of the application. This also seems like it would be fairly simple to code. Code a migration step which runs once to re-organize any existing documents on disk based on the information in the database, then have hooks in place in the code when documents are renamed and moved to mirror those actions on disk.
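Roughly what I have in mind, as a sketch only: `DocRecord` is a made-up stand-in for whatever the database row looks like, and `migrate` / `on_rename` are hypothetical names, not existing papermerge functions. It also ignores OCR output and file versions, which a real patch would have to handle.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class DocRecord:
    # Hypothetical stand-in for a database row; not papermerge's real model.
    stored_path: Path    # where the file currently lives on disk
    folder_parts: tuple  # folder chain shown in the UI, e.g. ("Taxes", "2022")
    title: str           # file name shown in the UI


def target_path(root: Path, rec: DocRecord) -> Path:
    """Location on disk that mirrors the record's UI folder and name."""
    return root.joinpath(*rec.folder_parts, rec.title)


def sync_to_disk(root: Path, rec: DocRecord) -> Path:
    """Move one file to its mirrored location; does nothing if already there."""
    dest = target_path(root, rec)
    if rec.stored_path != dest:
        if dest.exists():
            # Refuse to overwrite a different file that has the same name.
            raise FileExistsError(dest)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(rec.stored_path), str(dest))
        rec.stored_path = dest
    return dest


def migrate(root: Path, records) -> None:
    """One-time step: re-organize every existing file based on its record."""
    for rec in records:
        sync_to_disk(root, rec)


def on_rename(root: Path, rec: DocRecord, new_title: str) -> Path:
    """Hook for a rename in the UI: update the record, then mirror it on disk."""
    rec.title = new_title
    return sync_to_disk(root, rec)
```

A move hook would look the same as `on_rename` but would change `folder_parts` instead of `title`. Because `sync_to_disk` skips files that are already in place, the migration can be re-run safely if it gets interrupted.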
Thanks.
@qq7te commented on GitHub (Feb 17, 2023):
exactly! I thought I could come up with a patch to allow that, but unfortunately my free time is limited. I've tried, but didn't get far.
@rmatte do you have the time for a proof of concept?
@rmatte commented on GitHub (Feb 17, 2023):
I see that this is written in Python, which is the language I code in almost daily, so yeah, if I can find some time to really dig into this and come up with a patch, I will. It'll take a fair bit of time though, as I'll first need to read through the code and familiarize myself with it. This feels like something that would probably take the main developer a few hours to throw together, since they already know where everything is, versus probably 15 to 20 combined hours of effort for me to get familiar with the code and the database structure in depth before actually coding the changes. I'm sure there are some nuances with the OCR, the file versioning and so on which would need to be accounted for.
@rmatte commented on GitHub (Feb 20, 2023):
I've decided to just use my new Synology NAS for my document management and write some custom scripts to do automated OCR and file placement. My scanner software does OCR already as well. It's going to be way easier than trying to refactor this package. The NAS can already index the content of documents and search through them, so it'll be perfect for this. I really do hope that the developer implements this feature eventually though, as it doesn't make any sense not to. Having documents organized neatly into folders on disk, with filenames matching what's in papermerge, just makes a whole lot of sense.