mirror of
https://github.com/ciur/papermerge.git
synced 2026-04-25 12:05:58 +03:00
[GH-ISSUE #46] UTF-8 error while uploading file #34
Originally created by @jhf2442 on GitHub (Jul 22, 2020).
Original GitHub issue: https://github.com/ciur/papermerge/issues/46
The Docker image was downloaded and started 5 minutes ago, so it should be the latest version.
The error occurs while uploading a 180 kB, 20-page PDF document (which opens perfectly in Okular).
@ciur commented on GitHub (Jul 23, 2020):
Thank you for your feedback.
pdfinfo utility has an unexpected output :(
pdfinfo (part of poppler) is used internally to figure out the number of pages in the document.
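For readers following along, here is a minimal sketch of how a page count can be read from pdfinfo output. It is not Papermerge's actual code: the function names `parse_page_count` and `get_page_count` are hypothetical, and the sketch assumes only that pdfinfo prints a `Pages:` line and that metadata fields may contain bytes that are not valid UTF-8.

```python
import re
import subprocess


def parse_page_count(raw: bytes) -> int:
    """Extract the page count from raw pdfinfo output (hypothetical helper)."""
    # Decode leniently: metadata fields such as CreationDate may hold bytes
    # that are not valid UTF-8; errors="replace" turns them into U+FFFD
    # instead of raising UnicodeDecodeError.
    text = raw.decode("utf-8", errors="replace")
    match = re.search(r"^Pages:\s+(\d+)\s*$", text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Pages:' line found in pdfinfo output")
    return int(match.group(1))


def get_page_count(pdf_path: str) -> int:
    """Run pdfinfo (must be installed) and return the number of pages."""
    # Capture stdout as bytes so decoding stays under our control.
    result = subprocess.run(
        ["pdfinfo", pdf_path], capture_output=True, check=True
    )
    return parse_page_count(result.stdout)
```

Keeping the parsing separate from the subprocess call makes it possible to test the decoding behaviour on captured output without having the original PDF at hand.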
Can you, please, run pdfinfo utility on the pdf document 2020-07-17_AGB_A02092019.pdf again and paste here the output?
Example:
@jhf2442 commented on GitHub (Jul 24, 2020):
Here we go :
-> It's the creation date field that contains some strange data! (Yes, it's two diamonds.)
@jhf2442 commented on GitHub (Jul 24, 2020):
here the header of the PDF
@ciur commented on GitHub (Jul 24, 2020):
I think those two diamonds are to blame for the issue (de: sind schuldig), as they might be encoded in something other than UTF-8 (just guessing).
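As an illustration of that guess, the sketch below shows how bytes that are not valid UTF-8 make a strict decode fail, and one possible way to fall back to another encoding. The sample bytes are made up for the example (the real contents of the CreationDate field are not known here), and `decode_with_fallback` is a hypothetical helper, not the fix that was eventually applied.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "latin-1")) -> str:
    """Try each encoding in turn; return the first successful decode."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep what is decodable and mark the rest with U+FFFD.
    return raw.decode("utf-8", errors="replace")


# Hypothetical sample: two bytes that can never appear in valid UTF-8.
sample = b"CreationDate: \xfe\xff"

try:
    sample.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False  # the strict decode is what fails on such input

decoded = decode_with_fallback(sample)
```

Since latin-1 maps every byte to a character, the fallback never raises; whether the resulting characters are meaningful depends on the encoding the file actually uses.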
Does the document contain sensitive information?
If it is just a random AGB (i.e. no sensitive data), would you send me a copy of it (my email is at the very bottom of the README page)? Otherwise I have no other means of troubleshooting the issue.
@ciur commented on GitHub (Jul 25, 2020):
I received your document and fixed the encoding issue.
The fix will be available in 1.4.0 (in about 2 weeks).
Thank you again for providing useful feedback!