mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[PR #1741] [MERGED] Delete pid_utils.py and migrate to Process model #4504
📋 Pull Request Information
Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/1741
Author: @pirate
Created: 12/31/2025
Status: ✅ Merged
Merged: 12/31/2025
Merged by: @pirate
Base: dev ← Head: claude/refactor-process-management-WcQyZ
📝 Commits (5)
2d3a2fe Add terminate, kill_tree, and query methods to Process model
b822352 Delete pid_utils.py and migrate to Process model
ee201a0 Fix code review issues in process management refactor
5121b0e Merge branch 'dev' into claude/refactor-process-management-WcQyZ
b2132d1 Fix cubic review issues: process_type detection, cmd storage, PID cleanup, and migration
📊 Changes
26 files changed (+3889 additions, -1181 deletions)
📝 TODO_process_tracking.md (+239 -5)
➕ archivebox/cli/archivebox_extract.py (+265 -0)
➕ archivebox/cli/archivebox_orchestrator.py (+67 -0)
➕ archivebox/cli/archivebox_remove.py (+98 -0)
➕ archivebox/cli/archivebox_search.py (+131 -0)
📝 archivebox/core/models.py (+73 -321)
📝 archivebox/crawls/models.py (+13 -168)
📝 archivebox/hooks.py (+46 -256)
➕ archivebox/machine/migrations/0002_process_parent_and_type.py (+101 -0)
📝 archivebox/machine/models.py (+895 -195)
📝 archivebox/misc/process_utils.py (+45 -6)
➕ archivebox/plugins/captcha2/config.json (+21 -0)
➕ archivebox/plugins/captcha2/on_Crawl__01_captcha2.js (+121 -0)
➕ archivebox/plugins/captcha2/on_Crawl__11_captcha2_config.js (+279 -0)
➕ archivebox/plugins/captcha2/templates/icon.html (+0 -0)
➕ archivebox/plugins/captcha2/tests/test_captcha2.py (+184 -0)
➕ archivebox/plugins/chrome/on_Crawl__00_chrome_install.py (+184 -0)
➕ archivebox/plugins/chrome/on_Crawl__10_chrome_validate_config.py (+172 -0)
➕ archivebox/plugins/chrome/on_Crawl__20_chrome_launch.bg.js (+245 -0)
➕ archivebox/plugins/istilldontcareaboutcookies/on_Crawl__02_istilldontcareaboutcookies.js (+115 -0)
...and 6 more files
📄 Description
Summary by cubic
Replaced PID-file based process tracking with the Process model as the single source of truth, adding hierarchy, is_alive checks, and safe kill with SIGTERM→SIGKILL escalation. This removes workers/pid_utils.py and simplifies worker/orchestrator/crawl/hook code while improving safety against PID reuse.
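The SIGTERM→SIGKILL escalation mentioned in the summary can be illustrated with a minimal sketch. This is not the Process model's actual code: the function name `terminate_with_escalation` and the `grace_seconds` parameter are hypothetical, and the sketch works on a plain `subprocess.Popen` handle rather than a database-backed record, so it does not cover hierarchy tracking or PID-reuse checks.

```python
import subprocess
import sys


def terminate_with_escalation(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Stop a child process, escalating from SIGTERM to SIGKILL if needed.

    Hypothetical helper for illustration only.
    Returns "already_exited", "terminated", or "killed".
    """
    # Nothing to do if the child has already finished.
    if proc.poll() is not None:
        return "already_exited"

    # Ask politely first (SIGTERM on POSIX).
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # The child ignored SIGTERM within the grace period: force it (SIGKILL on POSIX).
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # Usage: start a long-running child, then stop it.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(terminate_with_escalation(child, grace_seconds=5.0))
```

The grace period gives a well-behaved child time to clean up before it is forced to exit; the final `wait()` after `kill()` ensures the exit status is collected.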
Written for commit b2132d1f14. Summary will update on new commits.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.