[PR #437] [MERGED] Refactor Hugging Face provider: fallbacks, retries, and improved error handling #742

Closed
opened 2026-03-13 21:06:46 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/437
Author: @AJaySi
Created: 3/12/2026
Status: Merged
Merged: 3/12/2026
Merged by: @AJaySi

Base: main ← Head: codex/fix-merge-conflicts-in-huggingface_provider.py


📝 Commits (1)

  • 9689008 Repair huggingface provider and restore explicit retry/fallback behavior

📊 Changes

1 file changed (+151 additions, -497 deletions)

📝 backend/services/llm_providers/huggingface_provider.py (+151 -497)

📄 Description

Motivation

  • Reduce brittle behavior and improve robustness when calling Hugging Face Router by adding explicit fallback model sequencing and retries.
  • Standardize error classification and logging to make failures more actionable and debuggable.
  • Support structured JSON outputs and resilient parsing with fallbacks for models that don't support response_format.
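One way to make parsing resilient when a model ignores `response_format` is to try a strict parse first and then fall back to extracting the first JSON object from the surrounding text. The following is a minimal sketch of that idea only. The function name `extract_json` and the regular expression are hypothetical and are not taken from the PR.

```python
import json
import re
from typing import Any, Optional

# Hypothetical pattern: greedy match from the first "{" to the last "}".
_JSON_OBJECT_RE = re.compile(r"\{.*\}", re.DOTALL)


def extract_json(text: str) -> Optional[Any]:
    """Parse model output as JSON, falling back to a regex search.

    Returns the parsed value, or None if nothing parseable is found.
    """
    # First attempt: the whole response is already valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Fallback: models often wrap JSON in prose, so pull out the
    # outermost brace-delimited span and try again.
    match = _JSON_OBJECT_RE.search(text)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

For example, `extract_json('Here is the result: {"a": [1, 2]} Hope that helps.')` would yield the dictionary with key `a`, while plain prose with no braces yields `None`.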

Description

  • Replace ad-hoc imports and logging with get_service_logger, make OpenAI / NotFoundError safe when the openai package is missing, and add @lru_cache for _get_hf_client creation.
  • Implement explicit model candidate and fallback sequencing via _candidate_model_variants and _fallback_model_sequence, and centralize non-retryable classification in _is_non_retryable_hf_error and _classify_hf_error.
  • Add robust API key validation in get_huggingface_api_key, use tenacity retries with _should_retry_hf_error, and wrap huggingface_text_response and huggingface_structured_json_response with retry logic.
  • Rework text and structured response flows to iterate fallback models, handle NotFoundError and other call errors per-model, enforce fallback/no-response_format paths for structured responses, and include regex-based JSON extraction as a parsing fallback.
  • Simplify get_available_models and validate_model, and replace many ad-hoc debug/log messages with structured error logging using _error_details.
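The fallback and retry flow described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the PR uses `tenacity` and the `openai` client's `NotFoundError`, whereas this sketch uses a plain loop and stand-in exception classes so that it stays dependency-free. All names here (`fallback_models`, `is_non_retryable`, `call_with_fallback`, `ModelNotFound`, `AuthError`) are hypothetical, and nesting retries inside the per-model loop is an assumption about ordering.

```python
import time
from typing import Callable, Iterable, List, Optional


class ModelNotFound(Exception):
    """Stand-in for a per-model 'not found' error (hypothetical)."""


class AuthError(Exception):
    """Stand-in for a non-retryable error such as a bad API key (hypothetical)."""


def fallback_models(primary: str, fallbacks: Iterable[str]) -> List[str]:
    """Return the primary model followed by fallbacks, without duplicates."""
    ordered: List[str] = []
    for name in [primary, *fallbacks]:
        if name and name not in ordered:
            ordered.append(name)
    return ordered


def is_non_retryable(exc: Exception) -> bool:
    """Classify errors that retrying or switching models cannot fix."""
    return isinstance(exc, AuthError)


def call_with_fallback(
    call: Callable[[str], str],
    models: Iterable[str],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> str:
    """Try each model in order, retrying transient failures with backoff."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(max_attempts):
            try:
                return call(model)
            except ModelNotFound as exc:
                # This model is unavailable; move on to the next candidate.
                last_error = exc
                break
            except Exception as exc:
                if is_non_retryable(exc):
                    raise  # Fail fast instead of burning retries.
                last_error = exc
                time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"all models failed: {last_error!r}")
```

The design choice worth noting is the split between three outcomes: a missing model skips straight to the next candidate, a non-retryable error propagates immediately, and anything else is retried with exponential backoff before the next model is tried.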

Testing

  • No automated tests were run for this change. Recommended follow-up: run the unit tests for backend/services/llm_providers and integration tests against the Hugging Face Router endpoint to validate fallback and parsing behavior.

Codex Task: https://chatgpt.com/codex/tasks/task_e_69b2996898a8832881568559c55ec9cb


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
