[PR #1219] [MERGED] After a timeout, chrome will leave behind a SingletonLock, which prev… #4369

Closed
opened 2026-03-15 01:40:56 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/1219
Author: @spresse1
Created: 8/28/2023
Status: Merged
Merged: 8/29/2023
Merged by: @pirate

Base: dev ← Head: chrome-cleanup


📝 Commits (1)

  • 603ce7e After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file.

📊 Changes

4 files changed (+18 additions, -0 deletions)

View changed files

📝 archivebox/extractors/dom.py (+2 -0)
📝 archivebox/extractors/pdf.py (+2 -0)
📝 archivebox/extractors/screenshot.py (+2 -0)
📝 archivebox/util.py (+12 -0)

📄 Description

Summary

After a timeout in the pdf, screenshot, or dom extractor inside a Docker container, Chrome leaves behind a lock file at ~/.config/chromium/SingletonLock. This file prevents new Chrome instances from starting, so all three extractors stop working until the Docker container is completely torn down and recreated.

This code removes that file after a timeout, but only when running in a Docker container. There is no behavior change outside a Docker container. This was a deliberate choice on my part, as I don't want to interfere with a user's running Chrome sessions when not running in a Docker container.
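The cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names `remove_stale_singleton_lock` and `run_with_lock_cleanup`, and the `lock_path` and `in_docker` parameters, are hypothetical and chosen only for this example. It assumes the caller already knows whether it is running inside Docker and where the lock file lives.

```python
import os
import subprocess
from pathlib import Path
from typing import Sequence


def remove_stale_singleton_lock(lock_path: Path, in_docker: bool) -> bool:
    """Delete lock_path, but only inside Docker. Returns True if something was removed."""
    if not in_docker:
        # Outside Docker the lock may belong to the user's own running browser.
        return False
    # lexists() is also true for a dangling symlink, which exists() would report as missing.
    if not os.path.lexists(lock_path):
        return False
    os.remove(lock_path)
    return True


def run_with_lock_cleanup(
    cmd: Sequence[str], timeout: float, lock_path: Path, in_docker: bool
) -> bool:
    """Run cmd with a timeout; on timeout, clean up the lock. Returns True if cmd finished in time."""
    try:
        subprocess.run(cmd, timeout=timeout, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child; only the leftover lock remains.
        remove_stale_singleton_lock(lock_path, in_docker)
        return False
```

Passing `in_docker` explicitly keeps the "only in Docker" decision visible at the call site, which mirrors the deliberate choice described above not to touch a user's own Chrome sessions.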

Related issues

#1181

Changes these areas

  • [x] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [ ] Internal architecture
  • [ ] Snapshot data layout on disk

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
