[GH-ISSUE #93] Is Spanish supported in OCR? #92

Closed
opened 2026-02-27 15:54:57 +03:00 by kerem · 3 comments

Originally created by @juanfcocontreras on GitHub (Nov 19, 2017).
Original GitHub issue: https://github.com/RD17/ambar/issues/93

The Content Extraction section of the README and the FAQ both say:

Supported languages: Eng, Rus, Ita, Deu, Fra,Spa, Pl, Nld. If you miss your language, please create a new issue and we'll add it ASAP.

But the Search section of the same document says:

Supported language analyzers: English ambar_en, Russian ambar_ru, German ambar_de, Italian ambar_it, Polish ambar_pl, Chinese ambar_cn, CJK ambar_cjk

Can I use a Spanish language analyzer? If so, what is its code: ambar_es?

kerem closed this issue 2026-02-27 15:54:57 +03:00

@sochix commented on GitHub (Nov 20, 2017):

Hi @juanfcocontreras !
OCR supports Spanish, but the language analyzer doesn't yet.
We'll add an ES language analyzer in the next release.


@juanfcocontreras commented on GitHub (Nov 20, 2017):

Could you explain the difference?

Thanks in advance!


@sochix commented on GitHub (Nov 21, 2017):

OCR (optical character recognition) extracts text from images.
The language analyzer is used inside Elasticsearch to tokenize the extracted text into words for search.
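
To make the distinction concrete, here is a minimal Python sketch of what a Spanish analyzer could look like on the Elasticsearch side. It is an illustration under assumptions, not Ambar's actual configuration: the analyzer name `ambar_es` is only the name guessed in the question, the field name `content` is hypothetical, and the mapping uses the typeless layout of Elasticsearch 7 and later. The `"type": "spanish"` analyzer is Elasticsearch's built-in Spanish language analyzer. `toy_analyze` is a rough stand-in that shows the idea of tokenization (lowercasing, splitting, dropping stopwords); a real analyzer also stems words.

```python
import json
import re

# Hypothetical analyzer name taken from the question; not a confirmed Ambar identifier.
ANALYZER_NAME = "ambar_es"


def spanish_index_body(field="content"):
    """Build an index-creation body that analyzes `field` with
    Elasticsearch's built-in Spanish analyzer (typeless mapping assumed)."""
    return {
        "settings": {
            "analysis": {"analyzer": {ANALYZER_NAME: {"type": "spanish"}}}
        },
        "mappings": {
            "properties": {field: {"type": "text", "analyzer": ANALYZER_NAME}}
        },
    }


# Tiny illustrative stopword list; the real analyzer ships a much larger one.
SPANISH_STOPWORDS = {"el", "la", "los", "las", "de", "y", "en"}


def toy_analyze(text):
    """Rough stand-in for an analyzer: lowercase, split into words,
    drop stopwords. No stemming is performed."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in SPANISH_STOPWORDS]


if __name__ == "__main__":
    print(json.dumps(spanish_index_body(), indent=2))
    print(toy_analyze("Los documentos escaneados"))
```

OCR happens before any of this: it turns a scanned image into the plain text that is then stored in the analyzed field. So OCR language support decides whether Spanish text can be read from an image at all, while the analyzer decides how well that text can be searched.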
