[GH-ISSUE #252] Support for Swedish [sv-SE] OCR #244

Closed
opened 2026-02-27 15:55:47 +03:00 by kerem · 3 comments

Originally created by @Yavari on GitHub (Aug 13, 2019).
Original GitHub issue: https://github.com/RD17/ambar/issues/252

Can you please add support for the Swedish language, or guide me on how to do it so that I can open a pull request?

kerem closed this issue and added the wontfix label (2026-02-27 15:55:47 +03:00).

@Yavari commented on GitHub (Aug 14, 2019):

Here is some code I am using in another project. Please let me know if you want me to create a pull request.

    "ambar_sv": {
      "tokenizer": "standard",
      "filter": [
        "lowercase",
        "icu_folding_se",
        "swedish_stop",
        "swedish_stemmer"
      ]
    },

    "swedish_stemmer": {
      "type": "stemmer",
      "language": "swedish"
    },

    "swedish_stop": {
      "type": "stop",
      "stopwords": "_swedish_"
    },

    "icu_folding_se": {
      "type": "icu_folding",
      "unicodeSetFilter": "[^åäöÅÄÖ]"
    }

The analysis-icu plugin needs to be installed for icu_folding to work.

    RUN bin/elasticsearch-plugin install analysis-icu
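
For anyone picking this up, here is a rough Python sketch of how the fragments above could fit together in an Elasticsearch index-settings body. In Elasticsearch, analyzers are declared under `analysis.analyzer` and custom token filters under `analysis.filter`; where exactly ambar keeps its own index settings is not shown in this thread, so the wrapper is an assumption. The helper names (`undefined_filters`, `build_index_settings`) and the `BUILTIN_FILTERS` set are hypothetical and only for illustration. The check catches a typo in a filter name before the settings are sent to a cluster.

```python
import json

# Fragments from the comment above, placed under the standard
# Elasticsearch "analysis" keys. How ambar wires this in is an assumption.
SWEDISH_ANALYSIS = {
    "analyzer": {
        "ambar_sv": {
            "tokenizer": "standard",
            "filter": [
                "lowercase",
                "icu_folding_se",
                "swedish_stop",
                "swedish_stemmer",
            ],
        }
    },
    "filter": {
        "swedish_stemmer": {"type": "stemmer", "language": "swedish"},
        "swedish_stop": {"type": "stop", "stopwords": "_swedish_"},
        # Fold diacritics, but leave the Swedish letters å, ä, ö untouched.
        "icu_folding_se": {
            "type": "icu_folding",
            "unicodeSetFilter": "[^åäöÅÄÖ]",
        },
    },
}

# Hypothetical allow-list: only the built-in filter used here, not the full set.
BUILTIN_FILTERS = {"lowercase"}


def undefined_filters(analysis):
    """Return filter names an analyzer uses that are neither built in nor defined."""
    defined = set(analysis.get("filter", {})) | BUILTIN_FILTERS
    missing = set()
    for analyzer in analysis.get("analyzer", {}).values():
        missing.update(n for n in analyzer.get("filter", []) if n not in defined)
    return sorted(missing)


def build_index_settings(analysis):
    """Wrap an analysis section in the body shape used when creating an index."""
    return {"settings": {"analysis": analysis}}


if __name__ == "__main__":
    assert undefined_filters(SWEDISH_ANALYSIS) == []
    print(json.dumps(build_index_settings(SWEDISH_ANALYSIS), ensure_ascii=False, indent=2))
```

The printed JSON could then be used as the body when creating an index, provided the analysis-icu plugin is installed on the node.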

@Yavari commented on GitHub (Aug 14, 2019):

I guess https://github.com/RD17/ambar/blob/master/Pipeline/Dockerfile also needs the following line:

    tesseract-ocr-swe \
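
A quick way to confirm the language pack actually landed in the image might look like the sketch below. It relies on the `tesseract --list-langs` command and assumes the `tesseract-ocr-swe` package registers the language under the code `swe`. The function names are hypothetical, and whether the ambar pipeline would pick the language up without further changes is not something this thread establishes.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    # The first line is a header ("List of available languages ..."); skip it.
    return [line for line in lines if line and not line.startswith("List of")]


def installed_langs():
    """Ask the local tesseract binary which language packs it can see."""
    if shutil.which("tesseract") is None:
        return []  # tesseract is not on PATH in this environment
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Read both streams in case the list is written to stderr.
    return parse_langs(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    print("Swedish available:", "swe" in installed_langs())
```

Running this inside the built Pipeline container should report whether `swe` is visible to tesseract.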

@stale[bot] commented on GitHub (Aug 29, 2019):

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Reference
starred/ambar#244