mirror of https://github.com/karant-dev/AutoRedact.git synced 2026-04-25 15:55:50 +03:00

🛡️ Client-side, privacy-first image redaction tool. Automatically detects and blurs PII (Emails, IPs, Keys) using local OCR. No server, no data leaks.

Karan Thakkar b91664f376 fix: security patch for release v2.1.2 (#46 ) * fix: patch CVE-2026-23745 and upgrade OS packages * fix: update npm to latest and patch tar to fix all vulnerabilities * fix: aggressively clean npm cache to prevent false positives * ci: improve trivy logging by printing table to console		2026-01-18 09:32:44 -08:00
.github	fix: security patch for release v2.1.2 (#46 )	2026-01-18 09:32:44 -08:00
.husky	Community Readiness: Docs, Docker, Husky	2025-12-11 13:27:19 +00:00
docs	feat(api): add configuration support (#34 )	2025-12-14 07:16:37 -08:00
public	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
src	chore: release v2.1.2 (#39 )	2026-01-03 10:52:31 -08:00
test	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
.dockerignore	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
.gitignore	docs: finalize guidelines and artifacts (#7 )	2025-12-11 06:13:38 -08:00
.trivyignore	fix(ci): add missing .trivyignore file (#31 )	2025-12-13 14:14:56 -08:00
CHANGELOG.md	chore(release): prepare v1.1.0 with changelog (#18 )	2025-12-12 08:33:11 -08:00
CONTRIBUTING.md	docs: finalize guidelines and artifacts (#7 )	2025-12-11 06:13:38 -08:00
docker-compose.yml	feat: add docker API (v2.1) (#25 )	2025-12-13 07:37:52 -08:00
Dockerfile	fix: security patch for release v2.1.2 (#46 )	2026-01-18 09:32:44 -08:00
eng.traineddata	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
eslint.config.js	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
GEMINI.md	fix(cli): add PDF support via pdftoppm (fixes #23 ) (#38 )	2026-01-03 10:42:48 -08:00
index.html	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
LICENSE	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
nginx.conf	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
package-lock.json	chore(deps): bump the dependencies group across 1 directory with 7 updates (#43 )	2026-01-18 08:24:30 -08:00
package.json	chore(deps): bump the dependencies group across 1 directory with 7 updates (#43 )	2026-01-18 08:24:30 -08:00
README.md	feat(api): add configuration support (#34 )	2025-12-14 07:16:37 -08:00
SECURITY.md	feat: add security policy and issue templates (#9 )	2025-12-11 14:09:53 -08:00
tsconfig.app.json	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
tsconfig.json	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
tsconfig.node.json	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00
vite.config.ts	Initial commit (GPLv3)	2025-12-11 13:08:14 +00:00

🛡️ AutoRedact

Secure, client-side image redaction powered by OCR.

All processing happens 100% in your browser. Your images never touch a server.

✨ Features

🔍 Automatic Detection - Finds emails, IP addresses, credit cards, and API keys
🎯 Precise Redaction - Uses OCR word-level bounding boxes for accurate redaction
🔒 Privacy First - Everything runs locally via Tesseract.js
📦 Batch Processing - Process unlimited images at once
⚡ ZIP Download - Download all redacted files in one click
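
To make "word-level bounding boxes" concrete, here is a minimal sketch of how OCR words could be mapped to redaction rectangles. The `Box` and `OcrWord` shapes and the `boxesToRedact` helper are hypothetical names chosen for illustration; the project's actual types and logic may differ.

```typescript
// Hypothetical shapes for illustration; the real project types may differ.
interface Box { x0: number; y0: number; x1: number; y1: number; }
interface OcrWord { text: string; box: Box; }

// Return the bounding box of every OCR word whose text matches the pattern.
function boxesToRedact(words: OcrWord[], pattern: RegExp): Box[] {
  // Drop the g/y flags so RegExp.test() keeps no state between words.
  const re = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  return words.filter((w) => re.test(w.text)).map((w) => w.box);
}

// Two words from one OCR line; only the second looks like an email address.
const sampleWords: OcrWord[] = [
  { text: "Contact:", box: { x0: 0, y0: 0, x1: 80, y1: 20 } },
  { text: "user@example.com", box: { x0: 90, y0: 0, x1: 250, y1: 20 } },
];
const emailBoxes = boxesToRedact(sampleWords, /[^\s@]+@[^\s@]+\.[^\s@]+/g);
console.log(emailBoxes);
```

Working per word keeps the blurred area tight around the sensitive token instead of covering the whole line.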

🚀 Quick Start

# Option 1: NPM (Local Dev)
npm install
npm run dev

# Option 2: Docker (Easiest)
docker run -p 8080:8080 karantdev/autoredact:latest

# Option 3: Docker Compose
docker compose up -d

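Open http://localhost:5173 (the Vite dev server from Option 1) and drop your images. If you started the container with the Docker command in Option 2, use the published host port instead: http://localhost:8080.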

Command Line Interface (CLI)

AutoRedact also provides a fully offline CLI mode built on the same engine. (JPG and PNG only for now; PDF support is planned.)

# Process a single image
npm run cli -- input.jpg

# Disable specific redactors
npm run cli -- input.jpg --no-emails --no-ips

# Use custom rules
npm run cli -- input.jpg --block-words "Confidential" --custom-regex "Project-\d+"

🎯 What Gets Redacted

Type	Pattern
📧 Emails	`user@example.com`
🌐 IPs	`192.168.1.1`
💳 Credit Cards	`4242-4242-4242-4242`
🔑 API Keys	Stripe, GitHub, AWS
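
The table above can be read as a set of regular expressions run over the OCR text. The sketch below is illustrative only: the patterns, the `Finding` type, and `findPii` are assumptions rather than the project's actual detectors, and API-key formats are left out because the README does not spell them out.

```typescript
type Kind = "email" | "ip" | "card";
interface Finding { kind: Kind; value: string; index: number; }

// Simplified patterns (assumptions); real detectors are usually stricter.
const PATTERNS: Record<Kind, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ip: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g,
  card: /\b(?:\d{4}[- ]?){3}\d{4}\b/g,
};

// Scan the text with every pattern and return the hits in reading order.
function findPii(text: string): Finding[] {
  const findings: Finding[] = [];
  for (const kind of Object.keys(PATTERNS) as Kind[]) {
    for (const m of text.matchAll(PATTERNS[kind])) {
      findings.push({ kind, value: m[0], index: m.index ?? 0 });
    }
  }
  return findings.sort((a, b) => a.index - b.index);
}

const sampleText = "Mail user@example.com from 192.168.1.1, card 4242-4242-4242-4242";
console.log(findPii(sampleText).map((f) => `${f.kind}: ${f.value}`));
```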

🛠️ Tech Stack

React + Vite + TypeScript
Tesseract.js v6 (OCR)
JSZip (batch exports)
Tailwind CSS

📁 Structure

src/
├── adapters/     # Interface implementations (Browser/Node)
├── components/   # UI Components
├── core/         # Pure Logic (Regex, Math, Image Proc)
├── hooks/        # Custom Hooks
├── utils/        # Helpers
├── types/        # TS Interfaces
├── cli.ts        # CLI Entry Point
└── App.tsx       # Main Entry

📄 License

GNU General Public License v3.0

📖 Real-World Recipes

🛠️ CLI Power Usage

1. Batch Process a Directory

The CLI processes one file at a time. Use a shell loop to process entire folders:

# Process all JPGs in 'input' dir and save to 'output' dir
mkdir -p output
for f in input/*.jpg; do
  npm run cli -- "$f" -o "output/$(basename "$f")"
done

2. Strict Redaction for Finance/Invoices

Combine block words with a custom regex to catch account references in sensitive documents:

# JavaScript regexes do not accept a leading (?i) flag, so both capitalizations are spelled out
npm run cli -- invoice.jpg \
  --block-words "Confidential,SSN,Account" \
  --custom-regex "[Aa]ccount\s*#?\s*\d+" \
  --no-ips # Disable IP scanner if irrelevant to boost speed
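
As a rough sketch of how a comma-separated block-word list and a custom pattern could be turned into matchers, consider the helper below. `buildRules` and `escapeRegExp` are hypothetical names, and the case-insensitive, whole-word matching is an assumption, not documented CLI behavior. In JavaScript, case-insensitivity comes from the `i` flag passed to the `RegExp` constructor.

```typescript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one regex per block word, plus one for the optional custom pattern.
// Assumption: matching is case-insensitive and block words match whole words.
function buildRules(blockWords: string, customRegex?: string): RegExp[] {
  const rules = blockWords
    .split(",")
    .map((w) => w.trim())
    .filter((w) => w.length > 0)
    .map((w) => new RegExp(`\\b${escapeRegExp(w)}\\b`, "gi"));
  if (customRegex) rules.push(new RegExp(customRegex, "gi"));
  return rules;
}

const invoiceRules = buildRules("Confidential,SSN,Account", "account\\s*#?\\s*\\d+");
console.log(invoiceRules.map((r) => r.source));
```

Escaping the block words means a term such as `A.C.M.E` is matched literally instead of being interpreted as a pattern.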

3. Allowlist for Internal Docs

Prevent redaction of known internal terms or headers:

npm run cli -- internal-doc.jpg \
  --allowlist "CorpCorp,192.168.1.1,ProjectX"
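
Conceptually, an allowlist is a filter applied to detected matches before anything is blurred. The sketch below assumes exact, case-insensitive comparison; `applyAllowlist` is a hypothetical helper and the CLI's real matching rules may differ.

```typescript
// Drop every detected match that appears in the comma-separated allowlist.
// Assumption: comparison is exact and case-insensitive.
function applyAllowlist(matches: string[], allowlist: string): string[] {
  const allowed = new Set(
    allowlist
      .split(",")
      .map((s) => s.trim().toLowerCase())
      .filter((s) => s.length > 0)
  );
  return matches.filter((m) => !allowed.has(m.trim().toLowerCase()));
}

const detected = ["192.168.1.1", "10.0.0.7", "ProjectX"];
console.log(applyAllowlist(detected, "CorpCorp,192.168.1.1,ProjectX"));
```

In this example only `10.0.0.7` would still be redacted.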

The Docker API listens on port 3000 by default. Unless overridden, it applies the standard detection settings (Emails, IPs, Keys, PII); these can be changed through the settings parameter.

👉 View Full API Documentation for detailed usage, schema, and Python/Node.js examples.

Quick Test (Curl)

curl -X POST http://localhost:3000/redact \
  -F "image=@/path/to/doc.jpg" \
  -o redacted.png
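
The same request can be sketched in TypeScript for Node 18+ (which ships global `fetch`, `FormData`, and `Blob`). The endpoint and the `image` field come from the curl command above; `buildRedactRequest` and `redact` are hypothetical helper names, and the sketch assumes the response body is the redacted image. The settings parameter is left out because its format is described in the full API documentation, not here.

```typescript
interface RedactRequest {
  url: string;
  init: { method: "POST"; body: FormData };
}

// Build the multipart request without sending it, which keeps it easy to inspect.
function buildRedactRequest(
  image: Blob,
  filename: string,
  baseUrl = "http://localhost:3000" // default port stated in this README
): RedactRequest {
  const body = new FormData();
  body.append("image", image, filename); // same field name as in the curl example
  return { url: `${baseUrl}/redact`, init: { method: "POST", body } };
}

// Send the image and return the response bytes (assumed to be the redacted image).
async function redact(image: Blob, filename: string): Promise<Uint8Array> {
  const { url, init } = buildRedactRequest(image, filename);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Redaction failed: HTTP ${res.status}`);
  return new Uint8Array(await res.arrayBuffer());
}
```

A caller would read the input file into a `Blob`, pass it to `redact`, and write the returned bytes to disk.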