[PR #375] [MERGED] Harden usage-limit middleware failure handling #677

Closed
opened 2026-03-13 21:03:27 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/AJaySi/ALwrity/pull/375
Author: @AJaySi
Created: 3/4/2026
Status: Merged
Merged: 3/5/2026
Merged by: @AJaySi

Base: main ← Head: codex/update-error-handling-in-monitoring-middleware


📝 Commits (1)

  • d82569a Harden usage limit enforcement failure handling

📊 Changes

1 file changed (+117 additions, -7 deletions)


📝 backend/services/subscription/monitoring_middleware.py (+117 -7)

📄 Description

Motivation

  • Prevent protected routes from silently bypassing usage enforcement when the enforcement infrastructure (DB/tables/service) fails, and make failures observable to operations.
  • Provide a controlled emergency fail-open gate for short incidents so teams can opt into temporary bypass with clear audit trails.
  • Surface structured logs/metrics for enforcement errors so incidents can be detected and triaged quickly.

Description

  • Added an emergency gate controlled by the environment variable USAGE_LIMITS_EMERGENCY_FAIL_OPEN with helper _is_usage_limits_emergency_fail_open_enabled() to allow explicit temporary fail-open behavior.
  • Introduced structured enforcement-error telemetry via _record_usage_limit_enforcement_error() that binds event=usage_limit_enforcement_error and maintains lightweight in-process counters in USAGE_LIMIT_ENFORCEMENT_ERROR_METRICS.
  • Added _build_usage_enforcement_unavailable_response(), which returns a standard 503 payload (USAGE_LIMIT_ENFORCEMENT_UNAVAILABLE), and updated check_usage_limits_middleware so that protected (provider-detected) routes return 503 when DB, session, table, operational, or unexpected errors prevent enforcement, unless the emergency fail-open flag is enabled. Explicit limit violations continue to return 429.
  • Small imports/cleanup: imported os and initialized the new constants and helpers in backend/services/subscription/monitoring_middleware.py.
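
The fail-closed flow described in the bullets above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code merged in the PR: the helper and constant names come from the description, but their bodies are assumptions. That includes the set of values treated as "enabled", the use of stdlib logging with `extra` in place of whatever logger the module binds fields on, the shape of the payloads, and the 429 error code. `check_usage_limits_sketch` and its `enforce` callable are hypothetical stand-ins for the real middleware and its limit check.

```python
import logging
import os
from collections import Counter
from typing import Any, Callable, Dict, Optional, Tuple

logger = logging.getLogger("usage_limits_sketch")

# Lightweight in-process counters; they reset whenever the process restarts.
USAGE_LIMIT_ENFORCEMENT_ERROR_METRICS: Counter = Counter()

# Assumption: which env values count as "enabled" is illustrative only.
_TRUTHY = {"1", "true", "yes", "on"}

Response = Tuple[int, Dict[str, Any]]


def _is_usage_limits_emergency_fail_open_enabled() -> bool:
    """True only when an operator has explicitly opted into fail-open."""
    raw = os.getenv("USAGE_LIMITS_EMERGENCY_FAIL_OPEN", "")
    return raw.strip().lower() in _TRUTHY


def _record_usage_limit_enforcement_error(error_type: str, route: str, fail_open: bool) -> None:
    """Count the failure and emit one structured log record for it."""
    USAGE_LIMIT_ENFORCEMENT_ERROR_METRICS[error_type] += 1
    USAGE_LIMIT_ENFORCEMENT_ERROR_METRICS["total"] += 1
    logger.error(
        "usage limit enforcement failed",
        extra={
            "event": "usage_limit_enforcement_error",
            "error_type": error_type,
            "route": route,
            "fail_open": fail_open,
        },
    )


def _build_usage_enforcement_unavailable_response() -> Response:
    """503 payload; the field layout here is an assumption."""
    return 503, {"error": "USAGE_LIMIT_ENFORCEMENT_UNAVAILABLE"}


def check_usage_limits_sketch(
    route: str, is_protected: bool, enforce: Callable[[], bool]
) -> Optional[Response]:
    """Return None to let the request through, or (status, payload) to reject it.

    `enforce` is hypothetical: it returns True when the caller is within
    limits, False when a limit is exceeded, and raises when the enforcement
    infrastructure itself is unavailable.
    """
    if not is_protected:
        return None
    try:
        within_limits = enforce()
    except Exception as exc:
        fail_open = _is_usage_limits_emergency_fail_open_enabled()
        _record_usage_limit_enforcement_error(type(exc).__name__, route, fail_open)
        if fail_open:
            return None  # explicit, audited bypass
        return _build_usage_enforcement_unavailable_response()
    if not within_limits:
        # Hypothetical error code for an explicit limit violation.
        return 429, {"error": "USAGE_LIMIT_EXCEEDED"}
    return None
```

In this sketch the failure is recorded in both branches, so a fail-open bypass still leaves a log record and a counter increment behind, which is one way to get the audit trail the motivation asks for.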

Testing

  • Ran python -m py_compile backend/services/subscription/monitoring_middleware.py to validate the module compiles successfully (succeeded).
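
For reference, the same compile check can be run from Python through the standard-library py_compile module. The helper name `compiles_cleanly` below is hypothetical and not part of the PR. A check like this only confirms that the file parses and byte-compiles; it does not exercise the 503, 429, or fail-open paths.

```python
import py_compile
import sys


def compiles_cleanly(path: str) -> bool:
    """Return True if `path` byte-compiles without a syntax error."""
    try:
        # doraise=True turns compile failures into PyCompileError
        # instead of printing them to stderr.
        py_compile.compile(path, doraise=True)
    except py_compile.PyCompileError as exc:
        print(exc.msg, file=sys.stderr)
        return False
    return True
```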

Codex Task: https://chatgpt.com/codex/tasks/task_e_69a8468e2b888328bdd4b8f88f1cfaf6


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
