[PR #1731] [MERGED] Persona system and CHROME_USER_DATA_DIR #4497

Closed
opened 2026-03-15 01:47:41 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/1731
Author: @pirate
Created: 12/31/2025
Status: Merged
Merged: 12/31/2025
Merged by: @pirate

Base: dev ← Head: claude/review-chrome-plugin-userdata-vJUYW


📝 Commits (7)

  • 877b5f9 Derive CHROME_USER_DATA_DIR from ACTIVE_PERSONA in config system
  • 1a86789 Move Chrome default args to config.json CHROME_ARGS
  • 503a2f7 Add Persona class with cleanup_chrome() method
  • b1e31c3 Simplify Persona class: remove convenience functions, fix get_active()
  • b8a66c4 Convert Persona to Django ModelWithConfig, add to get_config()
  • f7b186d Apply suggestion from @cubic-dev-ai[bot]
  • 4285a05 Fix getEnvArray to parse JSON when '[' present, CSV otherwise
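
The parsing rule named in the last commit (JSON when '[' is present, CSV otherwise) can be sketched as follows. This is a Python rendering for illustration only: the actual getEnvArray lives in chrome_utils.js, and the name get_env_array, the default handling, and the fallback to CSV on malformed JSON are assumptions rather than the PR's confirmed behavior.

```python
import json
from typing import List, Optional


def get_env_array(raw: Optional[str], default: Optional[List[str]] = None) -> List[str]:
    """Parse a list-valued env var: JSON if it contains '[', CSV otherwise.

    Hypothetical helper; not the PR's actual implementation.
    """
    if raw is None or not raw.strip():
        return list(default or [])

    text = raw.strip()
    if "[" in text:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None  # assumption: malformed JSON falls back to CSV parsing
        if isinstance(parsed, list):
            return [str(item) for item in parsed]

    # CSV form: split on commas and drop empty entries
    return [part.strip() for part in text.split(",") if part.strip()]
```

Under this rule, a value that itself contains a comma (for example --window-size=1440,2000) has to be written in the JSON form, since the CSV form would split it in two.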

📊 Changes

6 files changed (+378 additions, -95 deletions)


📝 archivebox/config/configset.py (+11 -4)
📝 archivebox/misc/util.py (+30 -3)
📝 archivebox/personas/models.py (+155 -59)
📝 archivebox/plugins/chrome/chrome_utils.js (+110 -22)
📝 archivebox/plugins/chrome/config.json (+62 -4)
📝 archivebox/plugins/chrome/on_Crawl__20_chrome_launch.bg.js (+10 -3)

📄 Description

Summary

Related issues

Changes these areas

  • [ ] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [ ] Internal architecture
  • [ ] Snapshot data layout on disk

Summary by cubic

Make Chrome launches persona-aware and more reliable by auto-deriving CHROME_USER_DATA_DIR/CHROME_EXTENSIONS_DIR from ACTIVE_PERSONA and adding a Persona class to manage profile data. Also make Chrome args configurable and clean up stale locks to prevent failed launches.
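
A minimal sketch of the derivation described above, assuming a flat config dict and one directory per persona. The function name derive_persona_paths, the subdirectory and file names (chrome_user_data, chrome_extensions, cookies.txt), and the rule that explicitly set values take precedence are all assumptions for illustration; the PR's actual logic is in get_config() and the Persona model.

```python
from pathlib import Path


def derive_persona_paths(config: dict, personas_dir: Path) -> dict:
    """Fill in persona-scoped paths that were not set explicitly.

    Hypothetical helper; directory and file names are illustrative.
    """
    # Fall back to the "Default" persona when none is active
    persona = config.get("ACTIVE_PERSONA") or "Default"
    persona_dir = Path(personas_dir) / persona

    derived = {
        "CHROME_USER_DATA_DIR": str(persona_dir / "chrome_user_data"),
        "CHROME_EXTENSIONS_DIR": str(persona_dir / "chrome_extensions"),
        "COOKIES_FILE": str(persona_dir / "cookies.txt"),
    }

    merged = dict(config)
    for key, value in derived.items():
        # Assumption: an explicitly configured value wins over the derived one
        if not merged.get(key):
            merged[key] = value
    return merged
```

Because the derived values end up in the merged config, a hook only needs to read CHROME_USER_DATA_DIR from its environment and never has to know which persona is active.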

  • New Features

    • Auto-derive CHROME_USER_DATA_DIR, CHROME_EXTENSIONS_DIR, and COOKIES_FILE from ACTIVE_PERSONA (fallback to “Default”).
    • New Persona class with per-persona directories, config helpers, and chrome cleanup.
    • Chrome launch supports userDataDir and passes --user-data-dir; args now configurable via CHROME_ARGS, CHROME_ARGS_EXTRA, and CHROME_SANDBOX.
  • Bug Fixes

    • Remove stale SingletonLock/SingletonSocket across all persona profiles in chrome_cleanup and killZombieChrome to prevent startup failures.
    • Hooks now read CHROME_USER_DATA_DIR and CHROME_EXTENSIONS_DIR from env (derived by get_config), avoiding persona-specific logic in plugins.
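
The stale-lock cleanup in the first bug fix could look roughly like the sketch below. It assumes each persona directory holds its Chrome profile in a chrome_user_data subdirectory (a hypothetical layout) and that the caller has already confirmed no Chrome process is using those profiles; the function name remove_stale_chrome_locks is likewise illustrative and not taken from the PR.

```python
from pathlib import Path
from typing import List

# Lock artifacts named in the PR summary
STALE_LOCK_NAMES = ("SingletonLock", "SingletonSocket")


def remove_stale_chrome_locks(
    personas_dir: Path,
    user_data_dirname: str = "chrome_user_data",  # assumed profile subdirectory name
) -> List[Path]:
    """Delete leftover Chrome lock files in every persona profile.

    Only safe when no Chrome process is currently using these profiles.
    Returns the paths that were removed.
    """
    removed: List[Path] = []
    personas_dir = Path(personas_dir)
    if not personas_dir.is_dir():
        return removed

    for persona_dir in sorted(personas_dir.iterdir()):
        user_data_dir = persona_dir / user_data_dirname
        if not user_data_dir.is_dir():
            continue
        for name in STALE_LOCK_NAMES:
            lock_path = user_data_dir / name
            # The lock can be a dangling symlink, which exists() reports as
            # missing, so check is_symlink() as well
            if lock_path.is_symlink() or lock_path.exists():
                lock_path.unlink()
                removed.append(lock_path)
    return removed
```

Scanning every persona rather than only the active one matches the summary's wording ("across all persona profiles"), so a crash under one persona does not block a later launch under another.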

Written for commit 4285a05d19. Summary will update on new commits.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
