[PR #16] [MERGED] Add Custom Redaction Rules (Block Words, Dates, Regex Patterns) #18

Closed
opened 2026-03-02 11:45:02 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/karant-dev/AutoRedact/pull/16
Author: @Copilot
Created: 12/12/2025
Status: Merged
Merged: 12/12/2025
Merged by: @karant-dev

Base: main ← Head: copilot/add-custom-redaction-rules


📝 Commits (5)

  • 36238db Initial plan
  • 137924e Add custom redaction rules (block words, dates, regex patterns)
  • bed88b5 Address code review feedback: add regex lastIndex reset and improve date validation
  • f7d639e Fix over-redaction: only redact words that contain or match the sensitive text
  • dca4867 Refactor: extract hasValidOverlap helper to reduce code duplication
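
As background for the "regex lastIndex reset" in commit bed88b5: a JavaScript `RegExp` created with the `g` (or `y`) flag remembers where its last match ended, so reusing one instance across strings can silently skip matches. The sketch below only illustrates that behavior; the helper name `containsMatch` is hypothetical and is not taken from the PR.

```typescript
// Hypothetical helper: reset lastIndex before reusing a global RegExp.
function containsMatch(pattern: RegExp, text: string): boolean {
  pattern.lastIndex = 0; // start from the beginning regardless of earlier calls
  return pattern.test(text);
}

const invoice = /INV-\d{4}/g;

// Reusing the same global regex without a reset:
const first = invoice.test("INV-1234");  // match found, lastIndex moves to 8
const second = invoice.test("INV-1234"); // search starts at index 8, so no match

// With the reset, both calls see the match.
const safeFirst = containsMatch(invoice, "INV-1234");
const safeSecond = containsMatch(invoice, "INV-1234");

console.log({ first, second, safeFirst, safeSecond });
```

An alternative with the same effect is to build a fresh `RegExp` for every scan, at the cost of recompiling the pattern.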

📊 Changes

8 files changed (+806 additions, -88 deletions)

📝 src/App.tsx (+25 -1)
📝 src/components/Header.tsx (+30 -0)
📝 src/components/SettingsDropdown.tsx (+317 -67)
📝 src/hooks/useDetectionSettings.ts (+103 -1)
📝 src/hooks/useOCR.ts (+15 -10)
📝 src/types/index.ts (+10 -0)
➕ src/utils/datePatterns.ts (+197 -0)
📝 src/utils/ocr.ts (+109 -9)

📄 Description

  • Update types: Add blockWords (string[]), customRegex (CustomRegexRule[]), and customDates (string[]) to DetectionSettings
  • Add helper utils: Create date parsing utility (datePatterns.ts) to generate multiple date format regex patterns from a date string
  • Update useDetectionSettings hook: Add functions to manage block words, custom regex patterns, and custom dates with validation
  • Update detection logic: Modify detection flow in ocr.ts and useOCR.ts to include block words, dates, and custom regex
  • Update Settings UI: Add "Advanced" section to SettingsDropdown.tsx with Block Words, Custom Dates, and Custom Regex fields with validation
  • Build and lint pass
  • Run code review and address feedback
  • Run CodeQL security check (no issues found)
  • Fix over-redaction issue: Updated overlap detection to require text-based validation (word must contain/match the sensitive text)
  • Refactor: Extract hasValidOverlap helper function to reduce code duplication
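
The date utility described in the list above turns one date into several format-specific regex patterns. A minimal sketch of that idea follows. It is not the contents of `datePatterns.ts`: the function names (`parseIsoDate`, `buildDatePatterns`, `matchesAnyDate`), the `YYYY-MM-DD` input format, and the particular set of output formats are all assumptions made for illustration.

```typescript
// Hypothetical sketch; names, input format and covered formats are assumptions.
interface ParsedDate {
  year: number;
  month: number; // 1-12
  day: number;   // 1-31
}

const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Parse a strict YYYY-MM-DD string and reject impossible dates such as 2025-02-30.
function parseIsoDate(input: string): ParsedDate | null {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(input.trim());
  if (!m) return null;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  const d = new Date(Date.UTC(year, month - 1, day));
  const valid =
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day;
  return valid ? { year, month, day } : null;
}

// Build one regex per supported layout of the same calendar date.
function buildDatePatterns(input: string): RegExp[] {
  const date = parseIsoDate(input);
  if (!date) return [];
  const yyyy = String(date.year);
  const year = `(?:${yyyy}|${yyyy.slice(-2)})`; // 2025 or 25
  const month = `0?${date.month}`;              // 3 or 03
  const day = `0?${date.day}`;
  const sep = "[-/.]";
  const name = MONTH_NAMES[date.month - 1];
  const monthName = `${name.slice(0, 3)}(?:${name.slice(3)})?\\.?`; // Mar, Mar., March
  const ordinal = "(?:st|nd|rd|th)?";
  const sources = [
    `${yyyy}${sep}${month}${sep}${day}`,               // year-month-day
    `${day}${sep}${month}${sep}${year}`,               // day-month-year
    `${month}${sep}${day}${sep}${year}`,               // month-day-year
    `${day}${ordinal}\\s+${monthName},?\\s+${yyyy}`,   // 5 March 1990
    `${monthName}\\s+${day}${ordinal},?\\s+${yyyy}`,   // March 5, 1990
  ];
  // Digit lookarounds keep "5" from matching inside "15" or "2025" inside "20251".
  return sources.map((s) => new RegExp(`(?<!\\d)(?:${s})(?!\\d)`, "i"));
}

function matchesAnyDate(text: string, patterns: RegExp[]): boolean {
  return patterns.some((p) => p.test(text));
}

const patterns = buildDatePatterns("2025-12-12");
console.log(matchesAnyDate("DATE:12-12-25*TIME:06:22*", patterns));
```

The patterns are built without the `g` flag, so `test` carries no state between calls. Treating day-month and month-day orders as equally valid is a deliberate over-match in this sketch: for a redaction tool, a false positive is usually cheaper than a missed date.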

Bug Fix

Fixed an issue where custom rules (block words, dates, regex) redacted entire lines instead of just the matching text. The fix adds text-based validation on top of positional overlap detection: a word is now redacted only if it contains the matched text, or the matched text contains it.

Before: DATE:12-12-25*TIME:06:22* would be entirely redacted when matching date 12-12-25
After: Only the specific date portion is redacted
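
The rule described in this bug fix can be sketched as follows. The PR names a `hasValidOverlap` helper, but its signature is not shown here, so the shapes `Box`, `OcrWord` and `Match`, the case-insensitive comparison, and the assumption that a match is located only by the bounding box of its line are all hypothetical.

```typescript
// Hypothetical shapes; the real types in the PR may differ.
interface Box { x0: number; y0: number; x1: number; y1: number; }
interface OcrWord { text: string; bbox: Box; }
interface Match { text: string; bbox: Box; }

// Purely positional check: do the two rectangles intersect?
function boxesOverlap(a: Box, b: Box): boolean {
  return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

// Positional overlap plus text validation: the word must contain the
// matched text, or the matched text must contain the word.
function hasValidOverlap(word: OcrWord, match: Match): boolean {
  if (!boxesOverlap(word.bbox, match.bbox)) return false;
  const w = word.text.trim().toLowerCase();
  const m = match.text.trim().toLowerCase();
  if (w.length === 0 || m.length === 0) return false;
  return w.includes(m) || m.includes(w);
}

// Assumed scenario: the match is positioned by its whole line's box.
const lineBox: Box = { x0: 0, y0: 0, x1: 300, y1: 20 };
const words: OcrWord[] = [
  { text: "Invoice", bbox: { x0: 0, y0: 0, x1: 80, y1: 20 } },
  { text: "dated", bbox: { x0: 90, y0: 0, x1: 140, y1: 20 } },
  { text: "12-12-25", bbox: { x0: 150, y0: 0, x1: 230, y1: 20 } },
];
const match: Match = { text: "12-12-25", bbox: lineBox };

const positionalOnly = words.filter((w) => boxesOverlap(w.bbox, match.bbox));
const validated = words.filter((w) => hasValidOverlap(w, match));
console.log(positionalOnly.length, validated.map((w) => w.text));
```

In this sketch the positional check alone selects every word on the line, while the validated check keeps only the date. One caveat of the "or the match contains the word" branch is that very short words, such as a single letter, can still pass if they happen to occur inside the matched text.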

Screenshots

Settings dropdown

Original prompt

This section details the original issue you should resolve

<issue_title>[Feat]: Custom Redaction Rules (Regex & Blacklist)</issue_title>
<issue_description>Is your feature request related to a problem?
The default detectors don't catch my proprietary Project Names (e.g. 'Project Titan') or specific Invoice IDs.

Describe the solution you'd like

Add an 'Advanced' section to Settings with two fields:

  1. Block Words: List of words to always redact (e.g. names, birth dates, etc).
  2. Custom Regex: User-defined regex patterns (e.g. 'INV-\d{4}').

Describe alternatives you've considered

Drawing manual boxes over them.

Additional context

Important: Validate regex to prevent crashes. Handle case-sensitivity options.</issue_description>
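
The "validate regex to prevent crashes" and case-sensitivity requirements above could be handled along these lines. This is a sketch under assumptions: the PR defines a `CustomRegexRule` type, but the `pattern` and `caseSensitive` fields, the `CompileResult` type and the `compileRule` function shown here are hypothetical.

```typescript
// Field names are assumptions; only the type name appears in the PR description.
interface CustomRegexRule {
  pattern: string;
  caseSensitive: boolean;
}

type CompileResult =
  | { ok: true; regex: RegExp }
  | { ok: false; error: string };

// Compile a user-supplied pattern without letting a syntax error propagate.
function compileRule(rule: CustomRegexRule): CompileResult {
  if (rule.pattern.trim() === "") {
    return { ok: false, error: "Pattern is empty" };
  }
  try {
    const flags = rule.caseSensitive ? "g" : "gi";
    return { ok: true, regex: new RegExp(rule.pattern, flags) };
  } catch (e) {
    // new RegExp throws a SyntaxError for malformed patterns such as "(".
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}

const good = compileRule({ pattern: "INV-\\d{4}", caseSensitive: false });
const bad = compileRule({ pattern: "(", caseSensitive: true });

if (good.ok) {
  console.log("inv-1234".match(good.regex));
}
if (!bad.ok) {
  console.log("Rejected:", bad.error);
}
```

Catching the constructor error covers malformed syntax only. It does not protect against patterns that are valid but slow on adversarial input, which would need a separate safeguard such as a length limit or a timeout.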

<agent_instructions>Be smart about this, instead of just using plain string, for something like birth dates, parse the date, and redact that date in any format dates can show up in</agent_instructions>

Comments on the Issue (you are @copilot in this section)


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Fixes karant-dev/AutoRedact#12
Reference
starred/AutoRedact#18