[PR #418] [MERGED] Add explicit HF fallback policy controls and route SIF to low-cost HF models #722

Closed

opened 2026-03-13 21:05:53 +03:00 by kerem · 0 comments

kerem (Owner) commented 2026-03-13 21:05:53 +03:00

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/418
Author: @AJaySi
Created: 3/12/2026
Status: ✅ Merged
Merged: 3/12/2026
Merged by: @AJaySi

Base: main ← Head: codex/refactor-huggingface-response-functions

📝 Commits (1)

bf19137 Refine HF fallback policy controls and SIF low-cost routing

📊 Changes

3 files changed (+85 additions, -20 deletions)

📝 backend/services/intelligence/sif_agents.py (+12 -1)
📝 backend/services/llm_providers/huggingface_provider.py (+41 -11)
📝 backend/services/llm_providers/main_text_generation.py (+32 -8)

📄 Description

Motivation

Make Hugging Face model fallback behavior policy-driven so callers can control exact fallback sequences and avoid implicit global fallbacks.
Allow an explicit empty fallback policy ([]) to mean “only try the requested model” while preserving optional variant-stripping behavior.
Ensure llm_text_gen can provide sensible, minimal premium fallbacks while allowing SIF/low-cost callers to route to cheaper HF models.
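
The difference between "no policy given" and "explicitly empty policy" is easy to lose in Python, because an empty list is falsy. The following is a minimal sketch of that pitfall, not the PR's code: DEFAULT_FALLBACKS and both function names are hypothetical and exist only for illustration.

```python
from typing import List, Optional

# Hypothetical stand-in for a global default list; the real contents are not shown in this PR.
DEFAULT_FALLBACKS: List[str] = ["example-org/default-model"]


def resolve_with_or(fallback_models: Optional[List[str]]) -> List[str]:
    # Pitfall: `[] or DEFAULT_FALLBACKS` evaluates to DEFAULT_FALLBACKS,
    # so an explicit empty policy is silently replaced by the global default.
    return fallback_models or DEFAULT_FALLBACKS


def resolve_with_none_check(fallback_models: Optional[List[str]]) -> List[str]:
    # Only a missing policy (None) selects the default; [] is kept as-is
    # and means "try the requested model only".
    return DEFAULT_FALLBACKS if fallback_models is None else fallback_models


print(resolve_with_or([]))            # the default list (empty policy lost)
print(resolve_with_none_check([]))    # [] (empty policy preserved)
print(resolve_with_none_check(None))  # the default list
```

An identity check against None is what lets `[]` carry its own meaning.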

Description

Add two parameters, fallback_models: Optional[List[str]] = None and allow_model_variant_fallback: bool = True, to huggingface_text_response and huggingface_structured_json_response, and thread them into every model-attempt loop (including the structured retry path that omits response_format).
Change _candidate_model_variants to respect allow_model_variant_fallback. Update _fallback_model_sequence to use the caller-provided fallback_models exactly as given whenever it is provided (including an empty list), and to fall back to the global HF_FALLBACK_MODELS as a safe default otherwise.
Update llm_text_gen to accept preferred_hf_models as a caller-owned policy (first entry = requested model, remainder = fallback sequence). Introduce PREMIUM_HF_MINIMAL_FALLBACK_MODELS as the premium default, pass fallback_models and allow_model_variant_fallback into the HF calls, and ensure the HF fallback branch also uses the minimal premium policy.
Route SIF/shared usage to cheaper models: add REMOTE_LOW_COST_HF_MODELS as an explicit low-cost HF sequence, and update SharedLLMWrapper to call llm_text_gen(..., preferred_hf_models=REMOTE_LOW_COST_HF_MODELS).
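
The changes above can be sketched as a small, self-contained model of the attempt order. This is an illustration under stated assumptions rather than the merged implementation: the helper names (split_policy, candidate_model_variants, attempt_order) and the model IDs are hypothetical, the contents of HF_FALLBACK_MODELS are placeholders, and a "variant" is assumed to be a ":suffix" on the model ID, which the PR text does not spell out.

```python
from typing import List, Optional, Sequence, Tuple

# Placeholder contents; the real global list is not shown in this PR.
HF_FALLBACK_MODELS: List[str] = ["example-org/default-a", "example-org/default-b"]


def split_policy(
    preferred_hf_models: Optional[Sequence[str]],
) -> Tuple[Optional[str], Optional[List[str]]]:
    """First entry = requested model, remainder = fallback sequence."""
    if not preferred_hf_models:
        return None, None  # no caller policy supplied
    return preferred_hf_models[0], list(preferred_hf_models[1:])


def candidate_model_variants(model: str, allow_model_variant_fallback: bool = True) -> List[str]:
    """Return the model ID, plus its base ID when variant fallback is allowed.

    Assumption: the variant is whatever follows the first ':' in the ID.
    """
    variants = [model]
    if allow_model_variant_fallback and ":" in model:
        base = model.split(":", 1)[0]
        if base not in variants:
            variants.append(base)
    return variants


def attempt_order(
    requested: str,
    fallback_models: Optional[List[str]] = None,
    allow_model_variant_fallback: bool = True,
) -> List[str]:
    """Build the de-duplicated list of model IDs to try, in order."""
    # None selects the global default; any provided list (even []) is used exactly.
    fallbacks = HF_FALLBACK_MODELS if fallback_models is None else fallback_models
    order: List[str] = []
    for model in [requested, *fallbacks]:
        for candidate in candidate_model_variants(model, allow_model_variant_fallback):
            if candidate not in order:
                order.append(candidate)
    return order


# Hypothetical low-cost policy, in the spirit of REMOTE_LOW_COST_HF_MODELS.
requested, fallbacks = split_policy(["cheap-org/small-a", "cheap-org/small-b:fast"])
print(attempt_order(requested, fallbacks))
```

In this sketch, a single-entry policy yields an empty fallback list, so only the requested model (and, if allowed, its base ID) is tried; passing allow_model_variant_fallback=False removes the base-ID attempt as well.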

Testing

Ran syntax/compile checks with python -m py_compile backend/services/llm_providers/huggingface_provider.py backend/services/llm_providers/main_text_generation.py backend/services/intelligence/sif_agents.py, which completed successfully.
Verified the modified call sites in llm_text_gen and SharedLLMWrapper are passing the new HF fallback parameters and that the new constants (PREMIUM_HF_MINIMAL_FALLBACK_MODELS, REMOTE_LOW_COST_HF_MODELS) are present.
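
For reference, the same compile check can be driven from Python with the standard-library py_compile module. This is a sketch of an equivalent check, not something the PR ships; the compile_check helper is hypothetical, and the paths are the three files listed in this PR.

```python
import py_compile
from typing import Dict, List, Optional

CHANGED_FILES: List[str] = [
    "backend/services/llm_providers/huggingface_provider.py",
    "backend/services/llm_providers/main_text_generation.py",
    "backend/services/intelligence/sif_agents.py",
]


def compile_check(paths: List[str]) -> Dict[str, Optional[str]]:
    """Byte-compile each file; map path -> None on success or an error message."""
    results: Dict[str, Optional[str]] = {}
    for path in paths:
        try:
            # doraise=True raises PyCompileError instead of printing to stderr.
            py_compile.compile(path, doraise=True)
            results[path] = None
        except (py_compile.PyCompileError, OSError) as exc:
            # OSError covers missing or unreadable files.
            results[path] = str(exc)
    return results


if __name__ == "__main__":
    for path, error in compile_check(CHANGED_FILES).items():
        print(f"{'OK  ' if error is None else 'FAIL'} {path}" + (f": {error}" if error else ""))
```

A compile check only catches syntax errors; it does not exercise the fallback logic itself.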

Codex Task: https://chatgpt.com/codex/tasks/task_e_69b0db58825483288364cbda15fdb6fb

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

kerem closed this issue and added the pull-request label 2026-03-13 21:05:53 +03:00

Reference

starred/ALwrity#722
