[PR #403] [MERGED] Fix txtai IndexIDMap 'nprobe' fallback loop by skipping incompatible FAISS index load #707

Closed
opened 2026-03-13 21:05:04 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/403
Author: @AJaySi
Created: 3/9/2026
Status: Merged
Merged: 3/9/2026
Merged by: @AJaySi

Base: main ← Head: codex/fix-nprobe-attribute-error-in-search


📝 Commits (2)

  • cd8582e Fix txtai nprobe fallback to avoid reloading incompatible faiss index
  • 952824a Stabilize txtai nprobe handling without dropping loaded index state

📊 Changes

1 file changed (+70 additions, -47 deletions)

📝 backend/services/intelligence/txtai_service.py (+70 -47)

📄 Description

Motivation

  • Backend logs showed repeated 'IndexIDMap' object has no attribute 'nprobe' errors: the service attempted to fall back from FAISS to numpy but then reloaded a persisted FAISS ANN index, so the same failure recurred.
  • The goal is to make TxtaiIntelligenceService more resilient by ensuring backend switches do not reintroduce incompatible persisted ANN state and by providing safe retry/fallback code paths for search, similarity, and clustering.

Description

  • Changed _initialize_embeddings to accept load_existing_index: bool and added logic to skip loading a persisted index when instructed (prevents reloading incompatible FAISS state).
  • Added _is_nprobe_incompatibility to centralize detection of the FAISS IndexIDMap/nprobe error pattern and _switch_to_numpy_backend to encapsulate backend switching and reinitialization behavior.
  • Hardened search to retry against a numpy backend with load_existing_index=False and, if a retry still raises the same error, force a non-ANN retrieval path by calling search(..., index=False).
  • Hardened get_similarity to switch to numpy on the incompatibility and, if similarity() keeps failing, fall back to a cosine similarity computed from vectors obtained via transform.
  • Updated cluster to use the centralized incompatibility detection and backend switch so clustering falls back cleanly to the non-graph routine when ANN/graph is unavailable.
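
Two of the pieces described above are small enough to illustrate in isolation: matching the error pattern and computing cosine similarity from raw vectors. The following is a minimal sketch, not the code from this PR. The function names, the string-matching rule, and the zero-norm handling are assumptions made for illustration, and the vectors are plain Python sequences rather than the output of transform.

```python
import math
from typing import Sequence


def is_nprobe_incompatibility(exc: BaseException) -> bool:
    """Hypothetical detector: True when the message matches the IndexIDMap/nprobe pattern."""
    message = str(exc)
    return "IndexIDMap" in message and "nprobe" in message


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 when either vector has zero norm (an assumption of this sketch).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Keeping the detection in one predicate means search, similarity, and clustering can all decide on the same condition before switching backends.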

Testing

  • Compiled the modified module with python -m py_compile backend/services/intelligence/txtai_service.py, and compilation succeeded.
  • No additional automated tests accompany this change; runtime behavior should be validated in an environment with txtai/faiss indexes to confirm that the fallback prevents the previously repeated errors.
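
Until such an environment is available, the retry-once behavior can be approximated with stubs. The sketch below uses stand-in classes only: FakeEmbeddings and FakeService are hypothetical, they do not call txtai or FAISS, and the real service's method signatures may differ. It models a backend switch that builds a fresh backend instead of reloading persisted state, so the error should not recur on later calls.

```python
class FakeEmbeddings:
    """Stand-in for an embeddings backend; NOT the txtai API."""

    def __init__(self, broken: bool):
        self.broken = broken

    def search(self, query, limit):
        if self.broken:
            # Same message as the error reported in the backend logs.
            raise AttributeError("'IndexIDMap' object has no attribute 'nprobe'")
        return [(0, 1.0)][:limit]


class FakeService:
    """Hypothetical service that retries once after switching backends."""

    def __init__(self):
        self.embeddings = FakeEmbeddings(broken=True)
        self.switches = 0

    def _switch_to_numpy_backend(self):
        # Build a fresh backend rather than reloading the persisted index.
        self.switches += 1
        self.embeddings = FakeEmbeddings(broken=False)

    def search(self, query, limit=1):
        try:
            return self.embeddings.search(query, limit)
        except AttributeError as exc:
            if "nprobe" not in str(exc):
                raise
            self._switch_to_numpy_backend()
            return self.embeddings.search(query, limit)


def run_fallback_check():
    """Run two searches and report results plus the number of backend switches."""
    service = FakeService()
    first = service.search("hello")
    second = service.search("hello")
    return first, second, service.switches
```

A single switch across both searches is the property to look for: a second switch would mean the incompatible state was reintroduced.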

Codex Task: https://chatgpt.com/codex/tasks/task_e_69ae6a6c16c883289c58abe3ef05712b


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
