[PR #516] [MERGED] Unify stats ignore behavior with shared defaults + linguist-generated support #536

Closed
opened 2026-03-02 04:13:54 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/git-ai-project/git-ai/pull/516
Author: @svarlamov
Created: 2/13/2026
Status: Merged
Merged: 2/13/2026
Merged by: @svarlamov

Base: main ← Head: codex/unify-ignore-source-of-truth


📝 Commits (3)

  • 3d24554 Unify stats ignores with shared defaults and matcher
  • 7f1f3be Narrow default snapshot directory ignore pattern
  • 2aa8b85 Add regression test for nested named lockfile ignores

📊 Changes

10 files changed (+564 additions, -36 deletions)

📝 src/authorship/diff_ai_accepted.rs (+3 -2)
➕ src/authorship/ignore.rs (+346 -0)
📝 src/authorship/internal_db.rs (+8 -3)
📝 src/authorship/mod.rs (+1 -0)
📝 src/authorship/post_commit.rs (+11 -3)
📝 src/authorship/range_authorship.rs (+7 -20)
📝 src/authorship/stats.rs (+6 -4)
📝 src/commands/git_ai_handlers.rs (+10 -2)
📝 tests/commit_post_stats_benchmark.rs (+6 -2)
📝 tests/stats.rs (+166 -0)

📄 Description

Summary

  • add shared ignore module at src/authorship/ignore.rs as the source of truth for stats filtering
  • include upstream default patterns (lockfiles, generated/minified/vendor/node_modules, snapshots)
  • load root .gitattributes patterns marked linguist-generated and include them in effective ignores
  • make CLI stats, range stats, and post-commit stats all use the same effective ignore matcher
  • preserve compatibility helper in range_authorship while delegating to shared matcher

Correctness + edge cases

  • .gitattributes parsing now handles escaped whitespace in patterns
  • [attr]... macro lines are ignored during linguist pattern extraction
  • patterns that fail to compile as globs fall back to exact filename or exact full-path matching
  • post-commit skip estimation now uses ignore-aware filtering
  • internal_db env-sensitive test updated to account for GIT_AI_TEST_DB_PATH
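
The `.gitattributes` handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/authorship/ignore.rs`: the function names `split_tokens` and `linguist_generated_patterns` are hypothetical, the unescape rule (a backslash followed by whitespace becomes that whitespace character) is an assumption, and negations such as `-linguist-generated` and last-match-wins semantics are deliberately left out.

```rust
/// Split one `.gitattributes` line into whitespace-separated tokens.
/// A backslash followed by whitespace keeps that whitespace inside the token
/// (assumed escape rule); any other backslash sequence is preserved as-is.
fn split_tokens(line: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut chars = line.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some(next) if next.is_whitespace() => current.push(next),
                Some(next) => {
                    current.push('\\');
                    current.push(next);
                }
                None => current.push('\\'),
            }
        } else if c.is_whitespace() {
            // Unescaped whitespace ends the current token.
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
        } else {
            current.push(c);
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// Collect patterns whose attribute list marks them `linguist-generated`.
/// Comments, blank lines, and `[attr]` macro definitions are skipped.
fn linguist_generated_patterns(contents: &str) -> Vec<String> {
    let mut patterns = Vec::new();
    for raw in contents.lines() {
        let line = raw.trim_start();
        if line.is_empty() || line.starts_with('#') || line.starts_with("[attr]") {
            continue;
        }
        let tokens = split_tokens(line);
        let Some((pattern, attrs)) = tokens.split_first() else {
            continue;
        };
        let generated = attrs
            .iter()
            .any(|a| a == "linguist-generated" || a == "linguist-generated=true");
        if generated {
            patterns.push(pattern.clone());
        }
    }
    patterns
}
```

The returned patterns would then be appended to the default patterns to form the effective ignore list.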

Performance

  • avoid recompiling globs per path by introducing IgnoreMatcher and reusing it on hot paths
  • benchmark comparison vs pre-change baseline (same benchmark suite):
    • thousands-files stats fast path: ~4.3% faster (469.97ms -> 449.91ms avg)
    • post-commit hunk-density benchmark, contiguous case: ~14.3% faster (331.5ms -> 284.25ms avg)
    • post-commit hunk-density benchmark, scattered case: ~2.8% faster (207.25ms -> 201.5ms avg)
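
The compile-once idea behind `IgnoreMatcher` can be illustrated with a standard-library-only sketch. Only the type name comes from the PR; the `new` and `is_ignored` methods, the `Rule` enum, and the toy wildcard syntax (a single `*` operator that matches any run of characters, including `/`) are assumptions made for illustration. An unbalanced `[` stands in for "a real glob compiler rejected this pattern" and triggers the literal fallback.

```rust
/// One precompiled ignore rule.
enum Rule {
    /// Literal pieces of a pattern split on `*`.
    Wildcard(Vec<String>),
    /// Fallback for patterns the toy compiler rejects: compared literally.
    Literal(String),
}

/// Holds rules compiled once, so checking a path does no pattern parsing.
pub struct IgnoreMatcher {
    rules: Vec<Rule>,
}

impl IgnoreMatcher {
    pub fn new(patterns: &[&str]) -> Self {
        let rules = patterns.iter().map(|p| compile(p)).collect();
        IgnoreMatcher { rules }
    }

    /// A path is ignored if any rule matches the full path or the file name.
    pub fn is_ignored(&self, path: &str) -> bool {
        let file_name = path.rsplit('/').next().unwrap_or(path);
        self.rules.iter().any(|rule| match rule {
            Rule::Wildcard(parts) => {
                wildcard_match(parts, path) || wildcard_match(parts, file_name)
            }
            Rule::Literal(lit) => lit.as_str() == path || lit.as_str() == file_name,
        })
    }
}

fn compile(pattern: &str) -> Rule {
    // Stand-in for glob compilation failing: an unbalanced bracket is "invalid".
    if pattern.matches('[').count() != pattern.matches(']').count() {
        Rule::Literal(pattern.to_string())
    } else {
        Rule::Wildcard(pattern.split('*').map(str::to_string).collect())
    }
}

fn wildcard_match(parts: &[String], text: &str) -> bool {
    // No `*` in the pattern: require an exact match.
    if parts.len() == 1 {
        return parts[0].as_str() == text;
    }
    let first = parts[0].as_str();
    let last = parts[parts.len() - 1].as_str();
    if text.len() < first.len() + last.len()
        || !text.starts_with(first)
        || !text.ends_with(last)
    {
        return false;
    }
    // Middle pieces must appear in order between the prefix and the suffix.
    let mut rest = &text[first.len()..text.len() - last.len()];
    for mid in &parts[1..parts.len() - 1] {
        match rest.find(mid.as_str()) {
            Some(i) => rest = &rest[i + mid.len()..],
            None => return false,
        }
    }
    true
}
```

A caller would build the matcher once, for example `IgnoreMatcher::new(&["*.lock", "*.min.js"])`, and then call `is_ignored` for every path in the diff, which is the reuse on hot paths that the list above refers to.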

Validation

  • cargo test
  • cargo test --test commit_post_stats_benchmark -- --ignored --nocapture

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
