[PR #405] [MERGED] Enforce fail-fast SIF behavior and low-cost remote fallback #708

Closed
opened 2026-03-13 21:05:04 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/405
Author: @AJaySi
Created: 3/9/2026
Status: Merged
Merged: 3/11/2026
Merged by: @AJaySi

Base: main ← Head: codex/fix-oserror-when-loading-models


📝 Commits (1)

  • 8b0547c Make SIF fail fast and add low-cost remote LLM fallback

📊 Changes

7 files changed (+219 additions, -67 deletions)

View changed files

📝 backend/services/intelligence/agents/agent_orchestrator.py (+1 -1)
📝 backend/services/intelligence/agents/core_agent_framework.py (+51 -17)
📝 backend/services/intelligence/agents/specialized/base.py (+1 -1)
📝 backend/services/intelligence/sif_agents.py (+55 -15)
📝 backend/services/intelligence/txtai_service.py (+11 -2)
📝 backend/services/llm_providers/huggingface_provider.py (+81 -22)
📝 backend/services/llm_providers/main_text_generation.py (+19 -9)

📄 Description

Summary

This follow-up addresses review feedback around silent failures, model fallback behavior, and cost control for agent-heavy SIF flows.

What changed

  • Fail fast when local agent/runtime is unavailable (see the first sketch after this list)

    • BaseALwrityAgent._generate_llm_response now raises when no local LLM is present (instead of returning [LLM Unavailable]).
    • Agent run() now raises if txtai_agent is missing (instead of returning "Agent not initialized").
    • _execute_fallback no longer returns simulated/mock success strings; it now raises a hard error with explicit context.
  • Remote fallback path now exists and is explicit (also shown in the first sketch below)

    • On local LLM generation failure, _generate_llm_response now attempts remote fallback via llm_text_gen.
    • Fallback is logged clearly with success/failure messages for observability.
  • Cost-aware remote fallback model selection (see the second sketch below)

    • Added preferred_hf_models support to llm_text_gen.
    • Agent remote fallback passes the same small-model set used for local fallback:
      • Qwen/Qwen2.5-1.5B-Instruct
      • Qwen/Qwen2.5-0.5B-Instruct
      • TinyLlama/TinyLlama-1.1B-Chat-v1.0
    • Updated default HF model IDs in llm_text_gen provider selection to provider-qualified form (:groq) where applicable.
  • Fail-fast indexing/search behavior in txtai service (see the third sketch below)

    • TxtaiIntelligenceService now supports SIF_FAIL_FAST (defaults to true).
    • If service initialization failed, index_content and search raise RuntimeError (instead of silently returning).
    • Search exceptions are re-raised when fail-fast is enabled.
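
A minimal sketch of the control flow behind the first two items above: fail fast when no local LLM exists, otherwise try locally and fall back to an explicit, logged remote call through llm_text_gen with the small-model preference list. The local_llm attribute, its generate method, and the llm_text_gen stub are illustrative assumptions; only llm_text_gen, preferred_hf_models, and the model IDs appear in the PR.

```python
import logging

logger = logging.getLogger(__name__)

# Same small-model set used for local fallback, reused for remote fallback.
PREFERRED_HF_MODELS = [
    "Qwen/Qwen2.5-1.5B-Instruct",
    "Qwen/Qwen2.5-0.5B-Instruct",
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
]

def llm_text_gen(prompt, preferred_hf_models=None):
    """Stand-in for the real helper in backend/services/llm_providers/main_text_generation.py."""
    raise NotImplementedError("remote generation elided in this sketch")

class BaseALwrityAgent:
    def __init__(self, local_llm=None):
        self.local_llm = local_llm  # attribute name is an assumption for this sketch

    def _generate_llm_response(self, prompt: str) -> str:
        # Fail fast: missing local LLM now raises instead of returning "[LLM Unavailable]".
        if self.local_llm is None:
            raise RuntimeError(f"{type(self).__name__}: no local LLM available")
        try:
            return self.local_llm.generate(prompt)  # hypothetical local API
        except Exception as exc:
            # Explicit, observable fallback instead of a silent swallow.
            logger.warning("Local LLM generation failed (%s); trying remote fallback", exc)
            try:
                result = llm_text_gen(prompt, preferred_hf_models=PREFERRED_HF_MODELS)
                logger.info("Remote LLM fallback succeeded")
                return result
            except Exception as remote_exc:
                logger.error("Remote LLM fallback failed: %s", remote_exc)
                raise
```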
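
On the provider side, the cost-aware selection can be pictured as an ordered walk over candidate models. This is a hypothetical shape rather than the PR's code: _call_provider is a placeholder, and the default shown is only illustrative of the provider-qualified :groq form (the actual defaults live in main_text_generation.py).

```python
# Illustrative default; the real provider-qualified IDs are defined in the PR itself.
DEFAULT_HF_MODELS = ["Qwen/Qwen2.5-1.5B-Instruct:groq"]

def _call_provider(model_id: str, prompt: str) -> str:
    raise NotImplementedError("provider dispatch elided in this sketch")

def llm_text_gen(prompt: str, preferred_hf_models=None) -> str:
    # Callers such as the agent fallback can pin a cheap, ordered candidate list.
    candidates = preferred_hf_models or DEFAULT_HF_MODELS
    last_exc = None
    for model_id in candidates:
        try:
            return _call_provider(model_id, prompt)
        except Exception as exc:
            last_exc = exc  # fall through to the next candidate
    raise RuntimeError(f"All candidate models failed: {candidates}") from last_exc
```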
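
Finally, a sketch of the fail-fast wiring in TxtaiIntelligenceService, assuming SIF_FAIL_FAST is read from the environment at initialization; the truthy parsing and the helper stubs are assumptions, and all txtai details are elided.

```python
import os

class TxtaiIntelligenceService:
    def __init__(self):
        # SIF_FAIL_FAST defaults to true; the exact truthy parsing is an assumption.
        self.fail_fast = os.getenv("SIF_FAIL_FAST", "true").lower() in ("1", "true", "yes")
        self.initialized = self._try_init()

    def _try_init(self) -> bool:
        return False  # placeholder for the real txtai embeddings/agent setup

    def index_content(self, docs) -> None:
        if not self.initialized:
            # Previously a silent return; now a hard error surfaces the failure.
            raise RuntimeError("SIF service not initialized; cannot index content")

    def search(self, query: str):
        if not self.initialized:
            raise RuntimeError("SIF service not initialized; cannot search")
        try:
            return self._run_search(query)
        except Exception:
            if self.fail_fast:
                raise  # re-raise search errors when fail-fast is enabled
            return []

    def _run_search(self, query: str):
        raise NotImplementedError("txtai search elided in this sketch")
```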

Why

  • Prevent hidden degradation for non-technical users.
  • Preserve low operating cost for chatty agents by preferring small models even when switching to remote.
  • Surface operational failures quickly so SIF and agent issues are debugged/fixed instead of masked.

Validation

  • python -m py_compile backend/services/intelligence/agents/core_agent_framework.py backend/services/intelligence/txtai_service.py backend/services/llm_providers/main_text_generation.py

Notes

  • This intentionally removes mock/simulated fallback outputs in core agent fallback execution paths.
  • Behavior can be tuned via SIF_FAIL_FAST if needed for local troubleshooting.
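
For example, a hypothetical way to relax the flag during local troubleshooting, assuming the variable is read when the service initializes:

```python
import os

# Disable fail-fast locally; the PR's default is "true".
os.environ["SIF_FAIL_FAST"] = "false"
```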

Codex Task: https://chatgpt.com/codex/tasks/task_e_69ae82908b48832890c551123f984e14

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
