[PR #438] [CLOSED] feat: Check page responsiveness before trying to archive it #1167

Closed
opened 2026-03-01 14:48:42 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/438
Author: @cdvv7788
Created: 8/12/2020
Status: Closed

Base: master ← Head: hotfix/#209


📝 Commits (1)

  • da14f1c feat: Check page responsiveness before trying to archive it

📊 Changes

3 files changed (+37 additions, -18 deletions)

View changed files

📝 archivebox/extractors/__init__.py (+24 -18)
📝 archivebox/util.py (+7 -0)
📝 tests/test_extractors.py (+6 -0)

📄 Description

Summary

Run a HEAD request against each page before trying to archive it. If the status code is > 400 or the request raises an exception, skip archiving that page and increment the failed count in the stats by 1.
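The check described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: the names `head_status`, `is_responsive`, and `archive_all`, the stats keys, and the 10-second timeout are all assumptions made for the example. The threshold mirrors the description as written (status > 400 counts as a failure).

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code (hypothetical helper)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code


def is_responsive(url: str, fetch_status: Callable[[str], int] = head_status) -> bool:
    """Return False when the status code is > 400 or the request raises."""
    try:
        return fetch_status(url) <= 400
    except Exception:
        # Covers DNS failures, timeouts, refused connections, malformed URLs, etc.
        return False


def archive_all(
    urls: Iterable[str],
    archive_one: Callable[[str], None],
    fetch_status: Callable[[str], int] = head_status,
) -> Dict[str, int]:
    """Archive each responsive URL; count unresponsive ones as failed."""
    stats = {"succeeded": 0, "failed": 0}  # assumed stats layout, for illustration
    for url in urls:
        if not is_responsive(url, fetch_status):
            stats["failed"] += 1  # skip archiving, add failed + 1
            continue
        archive_one(url)
        stats["succeeded"] += 1
    return stats
```

Passing `fetch_status` as a parameter is a choice made for this sketch so the check can be exercised without network access, for example by supplying a function that returns canned status codes.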

Related issues: #209

Changes these areas

  • [ ] Bugfixes
  • [X] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [ ] Internal architecture
  • [ ] Archived data layout on disk

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.