[PR #405] [MERGED] Enforce fail-fast SIF behavior and low-cost remote fallback #708

Closed
opened 2026-03-13 21:05:04 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/405
Author: @AJaySi
Created: 3/9/2026
Status: Merged
Merged: 3/11/2026
Merged by: @AJaySi

Base: main ← Head: codex/fix-oserror-when-loading-models


📝 Commits (1)

  • 8b0547c Make SIF fail fast and add low-cost remote LLM fallback

📊 Changes

7 files changed (+219 additions, -67 deletions)

View changed files

📝 backend/services/intelligence/agents/agent_orchestrator.py (+1 -1)
📝 backend/services/intelligence/agents/core_agent_framework.py (+51 -17)
📝 backend/services/intelligence/agents/specialized/base.py (+1 -1)
📝 backend/services/intelligence/sif_agents.py (+55 -15)
📝 backend/services/intelligence/txtai_service.py (+11 -2)
📝 backend/services/llm_providers/huggingface_provider.py (+81 -22)
📝 backend/services/llm_providers/main_text_generation.py (+19 -9)

📄 Description

Summary

This follow-up addresses review feedback around silent failures, model fallback behavior, and cost control for agent-heavy SIF flows.

What changed

  • Fail fast when local agent/runtime is unavailable (see the first sketch after this list)

    • BaseALwrityAgent._generate_llm_response now raises when no local LLM is present (instead of returning [LLM Unavailable]).
    • Agent run() now raises if txtai_agent is missing (instead of returning "Agent not initialized").
    • _execute_fallback no longer returns simulated/mock success strings; it now raises a hard error with explicit context.
  • Remote fallback path now exists and is explicit (also shown in the first sketch below)

    • On local LLM generation failure, _generate_llm_response now attempts remote fallback via llm_text_gen.
    • Fallback is logged clearly with success/failure messages for observability.
  • Cost-aware remote fallback model selection (see the second sketch below)

    • Added preferred_hf_models support to llm_text_gen.
    • Agent remote fallback passes the same small-model set used for local fallback:
      • Qwen/Qwen2.5-1.5B-Instruct
      • Qwen/Qwen2.5-0.5B-Instruct
      • TinyLlama/TinyLlama-1.1B-Chat-v1.0
    • Updated default HF model IDs in llm_text_gen provider selection to provider-qualified form (:groq) where applicable.
  • Fail-fast indexing/search behavior in txtai service (see the third sketch below)

    • TxtaiIntelligenceService now supports SIF_FAIL_FAST (defaults to true).
    • If service initialization failed, index_content and search raise RuntimeError (instead of silently returning).
    • Search exceptions are re-raised when fail-fast is enabled.
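
A minimal sketch of the control flow behind the first two items above: fail fast when no local LLM exists, otherwise try locally and fall back to an explicit, logged remote call through llm_text_gen with the small-model preference list. The local_llm attribute, its generate method, and the llm_text_gen stub are illustrative assumptions; only llm_text_gen, preferred_hf_models, and the model IDs appear in the PR.

```python
import logging

logger = logging.getLogger(__name__)

# Same small-model set used for local fallback, reused for remote fallback.
PREFERRED_HF_MODELS = [
    "Qwen/Qwen2.5-1.5B-Instruct",
    "Qwen/Qwen2.5-0.5B-Instruct",
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
]

def llm_text_gen(prompt, preferred_hf_models=None):
    """Stand-in for the real helper in backend/services/llm_providers/main_text_generation.py."""
    raise NotImplementedError("remote generation elided in this sketch")

class BaseALwrityAgent:
    def __init__(self, local_llm=None):
        self.local_llm = local_llm  # attribute name is an assumption for this sketch

    def _generate_llm_response(self, prompt: str) -> str:
        # Fail fast: missing local LLM now raises instead of returning "[LLM Unavailable]".
        if self.local_llm is None:
            raise RuntimeError(f"{type(self).__name__}: no local LLM available")
        try:
            return self.local_llm.generate(prompt)  # hypothetical local API
        except Exception as exc:
            # Explicit, observable fallback instead of a silent swallow.
            logger.warning("Local LLM generation failed (%s); trying remote fallback", exc)
            try:
                result = llm_text_gen(prompt, preferred_hf_models=PREFERRED_HF_MODELS)
                logger.info("Remote LLM fallback succeeded")
                return result
            except Exception as remote_exc:
                logger.error("Remote LLM fallback failed: %s", remote_exc)
                raise
```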
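
On the provider side, the cost-aware selection can be pictured as an ordered walk over candidate models. This is a hypothetical shape rather than the PR's code: _call_provider is a placeholder, and the default shown is only illustrative of the provider-qualified :groq form (the actual defaults live in main_text_generation.py).

```python
# Illustrative default; the real provider-qualified IDs are defined in the PR itself.
DEFAULT_HF_MODELS = ["Qwen/Qwen2.5-1.5B-Instruct:groq"]

def _call_provider(model_id: str, prompt: str) -> str:
    raise NotImplementedError("provider dispatch elided in this sketch")

def llm_text_gen(prompt: str, preferred_hf_models=None) -> str:
    # Callers such as the agent fallback can pin a cheap, ordered candidate list.
    candidates = preferred_hf_models or DEFAULT_HF_MODELS
    last_exc = None
    for model_id in candidates:
        try:
            return _call_provider(model_id, prompt)
        except Exception as exc:
            last_exc = exc  # fall through to the next candidate
    raise RuntimeError(f"All candidate models failed: {candidates}") from last_exc
```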
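
Finally, a sketch of the fail-fast wiring in TxtaiIntelligenceService, assuming SIF_FAIL_FAST is read from the environment at initialization; the truthy parsing and the helper stubs are assumptions, and all txtai details are elided.

```python
import os

class TxtaiIntelligenceService:
    def __init__(self):
        # SIF_FAIL_FAST defaults to true; the exact truthy parsing is an assumption.
        self.fail_fast = os.getenv("SIF_FAIL_FAST", "true").lower() in ("1", "true", "yes")
        self.initialized = self._try_init()

    def _try_init(self) -> bool:
        return False  # placeholder for the real txtai embeddings/agent setup

    def index_content(self, docs) -> None:
        if not self.initialized:
            # Previously a silent return; now a hard error surfaces the failure.
            raise RuntimeError("SIF service not initialized; cannot index content")

    def search(self, query: str):
        if not self.initialized:
            raise RuntimeError("SIF service not initialized; cannot search")
        try:
            return self._run_search(query)
        except Exception:
            if self.fail_fast:
                raise  # re-raise search errors when fail-fast is enabled
            return []

    def _run_search(self, query: str):
        raise NotImplementedError("txtai search elided in this sketch")
```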

Why

  • Prevent hidden degradation for non-technical users.
  • Preserve low operating cost for chatty agents by preferring small models even when switching to remote.
  • Surface operational failures quickly so SIF and agent issues are debugged/fixed instead of masked.

Validation

  • python -m py_compile backend/services/intelligence/agents/core_agent_framework.py backend/services/intelligence/txtai_service.py backend/services/llm_providers/main_text_generation.py

Notes

  • This intentionally removes mock/simulated fallback outputs in core agent fallback execution paths.
  • Behavior can be tuned via SIF_FAIL_FAST if needed for local troubleshooting.
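
For example, a hypothetical way to relax the flag during local troubleshooting, assuming the variable is read when the service initializes:

```python
import os

# Disable fail-fast locally; the PR's default is "true".
os.environ["SIF_FAIL_FAST"] = "false"
```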

Codex Task: https://chatgpt.com/codex/tasks/task_e_69ae82908b48832890c551123f984e14

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
