[PR #1671] abx-plugin-title: use CURL_USER_AGENT when downloading page #2971

Closed
opened 2026-03-01 18:01:17 +03:00 by kerem · 0 comments

Original Pull Request: https://github.com/ArchiveBox/ArchiveBox/pull/1671

State: closed
Merged: No


abx-plugin-title fetches the title with whichever method works first:

  • reuse the page already downloaded by dom/singlepage/wget
  • download the page with python-requests

The plugin documents and returns a curl command line, but that command is never actually run. See https://github.com/ArchiveBox/ArchiveBox/issues/1670

At the very least, we could mimic curl's behaviour when downloading the page with python-requests by sending the CURL_USER_AGENT setting as the User-Agent header.
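
To illustrate the idea, here is a minimal sketch of a title fetch that passes a curl-style user agent to python-requests. It is not the plugin's actual code: the helper names (`build_headers`, `extract_title`, `fetch_title`), the regex-based title extraction, and the default user-agent string are all assumptions made for the example. In ArchiveBox the value would come from the CURL_USER_AGENT setting.

```python
import re
from html import unescape
from typing import Dict, Optional

# Hypothetical placeholder; the real value would be read from the
# CURL_USER_AGENT configuration setting.
CURL_USER_AGENT = "curl/8.5.0"

# Simple, non-greedy match for the first <title> element (assumption:
# good enough for a sketch, not a full HTML parser).
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def build_headers(user_agent: str) -> Dict[str, str]:
    """Return request headers that identify the client with the given user agent."""
    return {"User-Agent": user_agent}


def extract_title(html: str) -> Optional[str]:
    """Return the page title with entities decoded and whitespace collapsed."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    title = " ".join(unescape(match.group(1)).split())
    return title or None


def fetch_title(url: str, user_agent: str = CURL_USER_AGENT,
                timeout: float = 10.0) -> Optional[str]:
    """Download the page with python-requests, sending the curl user agent."""
    # Imported lazily so the pure helpers above work without the dependency.
    import requests

    response = requests.get(url, headers=build_headers(user_agent), timeout=timeout)
    response.raise_for_status()
    return extract_title(response.text)
```

With this shape, a site that serves different content depending on the user agent would see the same identity from the title fetch as from the documented curl command, which is the consistency the PR is after.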

Summary

Related issues

Changes these areas

  • [x] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [ ] Internal architecture
  • [ ] Snapshot data layout on disk
Reference
starred/ArchiveBox#2971