[GH-ISSUE #424] Option to display OCR'ed text #325

Closed
opened 2026-02-25 21:31:42 +03:00 by kerem · 2 comments

Originally created by @eric-saintetienne on GitHub (Oct 24, 2021).
Original GitHub issue: https://github.com/ciur/papermerge/issues/424

Originally assigned to: @ciur on GitHub.

As of 2.0, the OCR text is displayed on top of the image, as an overlay.
Displaying the OCR'ed text in a text area would be useful, at least to check that OCR worked as expected (not many recognition errors) but also to select/copy the OCR'ed text to the clipboard for saving.

The implementation is up to you. A textarea could be added under the metadata, on the right-hand side of a file, or it could be a separate page containing a textarea, reached via a new entry in the contextual (right-click) menu for a given document.

The alternative is to open the image viewer and press Ctrl+A, but there are problems:

  1. Ctrl+A selects the whole page, including a lot of unrelated text (whatever else is displayed on the page, such as the contents of other HTML divs and spans)
  2. Once selected with Ctrl+A, the overlaid OCR'ed text is still hard to read (it's not as readable as a textarea)
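
As a stopgap for problem 1, the overlay text could in principle be filtered out of a saved copy of the viewer page rather than selected by hand. The following is only a minimal sketch using Python's standard `html.parser`; it assumes the OCR words sit in `<span>` elements carrying a hypothetical class `ocr-word`, which may not match the markup Papermerge actually produces.

```python
from html.parser import HTMLParser


class OcrTextExtractor(HTMLParser):
    """Collect text found inside <span class="ocr-word"> elements only."""

    def __init__(self, css_class="ocr-word"):  # hypothetical class name
        super().__init__()
        self.css_class = css_class
        self.depth = 0   # greater than 0 while inside a matching span
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Count nested spans too, so the matching end tag is tracked correctly.
        if self.depth or self.css_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while inside an overlay span; skip pure whitespace.
        if self.depth and data.strip():
            self.words.append(data.strip())


def extract_ocr_text(html):
    parser = OcrTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.words)


# Made-up sample page: a toolbar plus two overlay words.
sample = (
    '<div class="toolbar">Rename Delete</div>'
    '<div class="page"><span class="ocr-word">Invoice</span>'
    '<span class="ocr-word">2021</span></div>'
)
print(extract_ocr_text(sample))
```

With the sample above, the toolbar text is dropped and only the two overlay words are kept. A built-in text area would still be the better solution, since it needs no saved page and no knowledge of the markup.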

Thanks!


@ajarzyn commented on GitHub (Nov 29, 2021):

Hello @eric-saintetienne,

You can already do this, but I must admit that the option is rather hidden, so a good suggestion might be to make it more accessible.

How to view OCRed text of the page:

  1. Open your document
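  2. Select this icon in the top-left corner (screenshot: https://user-images.githubusercontent.com/18590439/143952224-4689c9d7-d13f-47f7-80d5-990d8eeaca3a.png)
  3. Select the page whose OCR text you want to see (this is important: without a selection, the option won't work)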
  4. Press right mouse button on the page
  5. Select "View OCRed text"

I may make a pull request to add this description to the documentation.
@ciur would that be a good idea?


@ciur commented on GitHub (Aug 27, 2022):

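The feature is available in 2.1.0x and is more intuitive to use.
Here is a quick demo:

https://user-images.githubusercontent.com/24827601/187017685-ee64bcde-e72a-4d1b-bea6-8929014f6a4f.gif (selecting the first page is no longer necessary)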
