[PR #550] Fix flaky prompt_picker_test race conditions in parallel test execution #557

New issue

Closed

opened 2026-03-02 04:13:58 +03:00 by kerem · 0 comments

kerem commented

2026-03-02 04:13:58 +03:00

Owner

Original Pull Request: https://github.com/git-ai-project/git-ai/pull/550

State: closed
Merged: Yes

Fix flaky prompt_picker_test race conditions in parallel execution

Summary

Six tests in prompt_picker_test.rs were failing intermittently in CI due to race conditions when Rust runs tests in parallel. All database tests share a single InternalDatabase singleton (via OnceLock<Mutex<...>>), causing two problems:

Primary key collisions: Tests used hardcoded IDs like "prompt1", "prompt2". The upsert_prompt ON CONFLICT DO UPDATE clause would silently overwrite one test's data with another's (e.g., changing the workdir), causing assertions to fail.
Unscoped searches: search_prompts calls without a workdir filter could return results from concurrent tests. For example, searching "authentication" could match another test's "AUTHENTICATION" prompt, and the case-sensitive Rust .contains() assertion would fail.
PoisonError cascade: When a test panicked while holding the MutexGuard, the mutex was poisoned, causing all subsequent tests to fail with PoisonError.

Fixes applied:

Atomic counter (TEST_ID_COUNTER) generates unique prompt IDs per test invocation
search_prompts calls now scoped by each test's unique workdir (from TestRepo temp directory)
Mutex locks use unwrap_or_else(|e| e.into_inner()) to recover from poisoned state

Verified with 10/10 consecutive local passes (previously failing ~1-2/10). All 15 CI checks now passing.

Review & Testing Checklist for Human

The poisoned mutex recovery (unwrap_or_else(|poisoned| poisoned.into_inner())) intentionally swallows prior panics — verify this doesn't mask real database corruption across tests. If a test panics mid-write, the next test proceeds with whatever partial state was left behind.
test_database_list_prompts_no_filter and test_database_list_prompts_pagination still query with None workdir — their loose assertions (>= 2, <= 2) could still be affected by cross-test data accumulation in edge cases
test_database_search_prompts_no_matches now scopes by workdir, changing semantics from "no global matches" to "no matches in this workdir" — confirm this is acceptable
Recommended test plan: run cargo test in a loop locally (e.g. for i in $(seq 1 20); do cargo test || break; done) to gain confidence in the fix under contention

Notes

Link to Devin run: https://app.devin.ai/sessions/5c559e2b2e454d9887acaa3c5a2da1f9
Requested by: @svarlamov
Reference failing CI run: https://github.com/git-ai-project/git-ai/actions/runs/22126553530/job/63957675865

**Original Pull Request:** https://github.com/git-ai-project/git-ai/pull/550 **State:** closed **Merged:** Yes --- # Fix flaky prompt_picker_test race conditions in parallel execution ## Summary Six tests in `prompt_picker_test.rs` were failing intermittently in CI due to race conditions when Rust runs tests in parallel. All database tests share a single `InternalDatabase` singleton (via `OnceLock<Mutex<...>>`), causing two problems: 1. **Primary key collisions**: Tests used hardcoded IDs like `"prompt1"`, `"prompt2"`. The `upsert_prompt` `ON CONFLICT DO UPDATE` clause would silently overwrite one test's data with another's (e.g., changing the `workdir`), causing assertions to fail. 2. **Unscoped searches**: `search_prompts` calls without a workdir filter could return results from concurrent tests. For example, searching `"authentication"` could match another test's `"AUTHENTICATION"` prompt, and the case-sensitive Rust `.contains()` assertion would fail. 3. **PoisonError cascade**: When a test panicked while holding the `MutexGuard`, the mutex was poisoned, causing all subsequent tests to fail with `PoisonError`. **Fixes applied:** - Atomic counter (`TEST_ID_COUNTER`) generates unique prompt IDs per test invocation - `search_prompts` calls now scoped by each test's unique workdir (from `TestRepo` temp directory) - Mutex locks use `unwrap_or_else(|e| e.into_inner())` to recover from poisoned state Verified with 10/10 consecutive local passes (previously failing ~1-2/10). All 15 CI checks now passing. ## Review & Testing Checklist for Human - [ ] The poisoned mutex recovery (`unwrap_or_else(|poisoned| poisoned.into_inner())`) intentionally swallows prior panics — verify this doesn't mask real database corruption across tests. If a test panics mid-write, the next test proceeds with whatever partial state was left behind. - [ ] `test_database_list_prompts_no_filter` and `test_database_list_prompts_pagination` still query with `None` workdir — their loose assertions (`>= 2`, `<= 2`) could still be affected by cross-test data accumulation in edge cases - [ ] `test_database_search_prompts_no_matches` now scopes by workdir, changing semantics from "no global matches" to "no matches in this workdir" — confirm this is acceptable - [ ] Recommended test plan: run `cargo test` in a loop locally (e.g. `for i in $(seq 1 20); do cargo test || break; done`) to gain confidence in the fix under contention ### Notes - Link to Devin run: https://app.devin.ai/sessions/5c559e2b2e454d9887acaa3c5a2da1f9 - Requested by: @svarlamov - Reference failing CI run: https://github.com/git-ai-project/git-ai/actions/runs/22126553530/job/63957675865