[GH-ISSUE #175] Commit stats should ignore attestations without associated prompts for calculations #68

Closed
opened 2026-03-02 04:11:37 +03:00 by kerem · 2 comments
Owner

Originally created by @svarlamov on GitHub (Nov 1, 2025).
Original GitHub issue: https://github.com/git-ai-project/git-ai/issues/175

Since the DMP (diff match patch) updates, `git-ai` now retains AI attributions for formatting changes and similar edits made in a subsequent commit. This is great! However, commit stats should be updated to ignore these lines, since they were already tracked as AI lines in their respective commits. This might involve some changes to the output of commit stats, because these lines probably shouldn't be counted as human additions either.

Below is a good example from the `git-ai` repo.

`git notes --ref=ai show 9d630d4daefde20a9148e708d98b3adc0e5aa4ff`

```
src/ci/github.rs
  b414720 7
src/commands/ci_handlers.rs
  b414720 2,59-60,62,64,193-195,205-207
---
{
  "schema_version": "authorship/3.0.0",
  "base_commit_sha": "cb0198084f7da364f550b082129bafac96953b3d",
  "prompts": {}
}
```
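To make the proposal concrete, here is a minimal Python sketch of the filtering idea, not git-ai's actual implementation. It assumes the note layout is exactly what the example shows: file paths, indented `<key> <line ranges>` entries, a `---` separator, then JSON metadata. It also assumes an entry's key (`b414720` here) is looked up in the `prompts` map. The function names `parse_ranges`, `parse_note`, and `count_ai_lines` are hypothetical.

```python
import json

# The note from the example above, copied verbatim.
EXAMPLE_NOTE = """src/ci/github.rs
  b414720 7
src/commands/ci_handlers.rs
  b414720 2,59-60,62,64,193-195,205-207
---
{
  "schema_version": "authorship/3.0.0",
  "base_commit_sha": "cb0198084f7da364f550b082129bafac96953b3d",
  "prompts": {}
}
"""


def parse_ranges(spec):
    """Expand a spec such as '2,59-60,62' into a list of line numbers."""
    lines = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            lines.extend(range(int(start), int(end) + 1))
        else:
            lines.append(int(part))
    return lines


def parse_note(text):
    """Split a note into (file, key, lines) attestations and JSON metadata."""
    attest_part, _, meta_part = text.partition("\n---\n")
    metadata = json.loads(meta_part)
    attestations = []
    current_file = None
    for raw in attest_part.splitlines():
        if not raw.strip():
            continue
        if raw[0].isspace():
            # Indented entry: "<key> <line ranges>" for the current file.
            key, spec = raw.split()
            attestations.append((current_file, key, parse_ranges(spec)))
        else:
            current_file = raw.strip()
    return attestations, metadata


def count_ai_lines(text):
    """Return (counted, skipped): AI lines with and without a matching prompt."""
    attestations, metadata = parse_note(text)
    prompts = metadata.get("prompts", {})  # assumed to be keyed like the entries
    counted = skipped = 0
    for _file, key, lines in attestations:
        if key in prompts:
            counted += len(lines)
        else:
            skipped += len(lines)
    return counted, skipped
```

For the note above, `count_ai_lines(EXAMPLE_NOTE)` returns `(0, 12)`: all 12 attributed lines are skipped because `prompts` is empty. Keeping the skipped lines in their own bucket, rather than folding them into human additions, matches the concern that they should count toward neither total.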

@acunniffe commented on GitHub (Nov 1, 2025):

Do we need to think about keeping the notes tree up to date before we can calculate the final authorship log?

If someone's notes are a few commits behind, couldn't the same series of checkpoints and prompts hypothetically lead to different authorship logs being computed? Especially in this move/reformat case.


@svarlamov commented on GitHub (Nov 2, 2025):

We seed the initial attribution ranges for diff match patch when the first checkpoint is created (on a per-file basis). So if you have missing notes and then you do some reformatting/moving based on those line ranges, you could end up with an incorrect authorship note. This is only an issue if you're missing notes for a commit that you do have. It's not an issue if you're just behind on everything.

I think this kind of issue is most likely right after the very first clone, since after that you should always be getting the corresponding authorship log for every new commit. That said, I don't think it causes any major integrity errors down the road; it just means those lines end up losing their AI attribution.
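One way to spot the risky case described above (a commit you have locally whose note is missing) is to compare the commit list against the notes list. The sketch below is an illustration, not something git-ai is known to do. It assumes you pass in the text output of `git rev-list HEAD` and `git notes --ref=ai list`, where the latter prints one `<note-object> <annotated-object>` pair per line. The helper name `commits_missing_notes` is hypothetical.

```python
def commits_missing_notes(rev_list_output, notes_list_output):
    """Return local commit SHAs that have no authorship note, in rev-list order."""
    local_commits = rev_list_output.split()
    # Each notes line is "<note-object> <annotated-object>"; keep the annotated SHA.
    annotated = {
        line.split()[1]
        for line in notes_list_output.splitlines()
        if line.strip()
    }
    return [sha for sha in local_commits if sha not in annotated]
```

A commit without a note is not necessarily a problem, since commits with no AI-attributed lines may simply have none. The result is a list of candidates to check, not a list of errors.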
