[PR #608] [MERGED] Extractors bugs #1245

Closed
opened 2026-03-01 14:49:00 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/608
Author: @cdvv7788
Created: 1/7/2021
Status: Merged
Merged: 1/7/2021
Merged by: @pirate

Base: dev ← Head: extractor-bugs


📝 Commits (2)

  • e9e4adf fix: wget_output_path failing on some extractors. Add a new condition
  • 6031ffa fix: Mercury extractor error was incorrectly initialized

📊 Changes

2 files changed (+5 additions, -1 deletions)

View changed files

📝 archivebox/extractors/mercury.py (+1 -1)
📝 archivebox/extractors/wget.py (+4 -0)

📄 Description

Summary

Some extractors raise errors under certain circumstances (e.g. RSS feeds). This PR adds a couple of conditions to handle those cases better.
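The kind of guard this summary describes can be sketched roughly as follows. This is not the code from the PR: `find_html_output`, `extract_article`, and `ExtractorError` are hypothetical names invented for illustration, and the assumption is simply that an input such as an RSS feed may leave behind no HTML file for a later step to find.

```python
from pathlib import Path
from typing import List, Optional


class ExtractorError(Exception):
    """Hypothetical error type carrying a message plus optional hints."""

    def __init__(self, message: str, hints: Optional[List[str]] = None):
        super().__init__(message)
        self.hints = hints or []


def find_html_output(snapshot_dir: Path) -> Optional[str]:
    """Return the relative path of the first HTML file found, or None."""
    # Guard: some inputs (an RSS feed, for instance) may leave no directory at all.
    if not snapshot_dir.is_dir():
        return None
    for candidate in sorted(snapshot_dir.rglob("*")):
        if candidate.is_file() and candidate.suffix.lower() in {".html", ".htm"}:
            return str(candidate.relative_to(snapshot_dir))
    # Guard: the directory exists but holds no HTML (only a feed.xml, say).
    return None


def extract_article(raw_text: str) -> str:
    """Return the stripped text, raising a fully constructed error if it is empty."""
    if not raw_text.strip():
        raise ExtractorError("no article text found", hints=["input may be a feed"])
    return raw_text.strip()
```

Returning `None` instead of raising lets callers treat "no HTML output" as an ordinary outcome, while the error path builds the exception with all of its arguments up front so that handlers can rely on its attributes.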

Related issues

Changes these areas

  • [x] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [ ] Internal architecture
  • [ ] Snapshot data layout on disk

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.