[PR #465] [MERGED] First part migrating to Pathlib #4191

Closed
opened 2026-03-15 01:31:13 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/465
Author: @apkallum
Created: 9/3/2020
Status: Merged
Merged: 9/17/2020
Merged by: @cdvv7788

Base: master ← Head: pathlib1


📝 Commits (9)

  • d44ed3d first attempt to migrate to Pathlib
  • 81dabd0 pathlib with / syntax for config, index
  • 247e872 no home() in Paths
  • 37b5ea6 add support for Paths in json encoder
  • 3fc973b update stubs file
  • 65a9f7d fix oneshot command type signature
  • 7f8649b fix github action folder listing
  • 867fc0a fix test type casting for folder['path']
  • a23ae28 test: Fix tests post-rebase
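
Commit 37b5ea6 mentions adding support for Paths in the JSON encoder. The diff itself is not reproduced on this page, so the following is only a minimal sketch of the general technique, assuming a custom `json.JSONEncoder` subclass; the names `PathAwareEncoder` and `to_json` are hypothetical and not taken from the ArchiveBox code base.

```python
import json
from pathlib import Path


class PathAwareEncoder(json.JSONEncoder):
    """Hypothetical encoder: serializes pathlib.Path objects as plain strings."""

    def default(self, obj):
        if isinstance(obj, Path):
            return str(obj)
        # Defer to the base class, which raises TypeError for unsupported types.
        return super().default(obj)


def to_json(data) -> str:
    """Hypothetical helper: dump data to JSON, tolerating Path values."""
    return json.dumps(data, cls=PathAwareEncoder, sort_keys=True)
```

Without such a hook, `json.dumps` raises `TypeError` when it meets a `Path` value, which is why a migration away from string paths usually needs a change of this kind.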

📊 Changes

22 files changed (+246 additions, -243 deletions)

View changed files

📝 archivebox/cli/archivebox_oneshot.py (+1 -1)
📝 archivebox/config/__init__.py (+53 -53)
📝 archivebox/config/stubs.py (+6 -4)
📝 archivebox/extractors/__init__.py (+5 -4)
📝 archivebox/extractors/archive_org.py (+10 -10)
📝 archivebox/extractors/dom.py (+10 -11)
📝 archivebox/extractors/favicon.py (+5 -4)
📝 archivebox/extractors/git.py (+10 -10)
📝 archivebox/extractors/media.py (+10 -11)
📝 archivebox/extractors/pdf.py (+9 -10)
📝 archivebox/extractors/screenshot.py (+9 -10)
📝 archivebox/extractors/singlefile.py (+8 -8)
📝 archivebox/extractors/title.py (+3 -2)
📝 archivebox/extractors/wget.py (+20 -25)
📝 archivebox/index/__init__.py (+31 -31)
📝 archivebox/index/html.py (+3 -2)
📝 archivebox/index/json.py (+5 -5)
📝 archivebox/index/sql.py (+8 -7)
📝 archivebox/logging_util.py (+5 -4)
📝 archivebox/main.py (+27 -28)

...and 2 more files

📄 Description

Summary

This PR migrates most uses of os.path to pathlib.Path and fixes the corresponding type checks.
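
As a rough illustration of the kind of change described in the summary, here is a sketch comparing string-based `os.path` joining with the `pathlib` `/` syntax mentioned in commit 81dabd0. The directory and file names are placeholders chosen for the example, not values from the actual diff.

```python
import os
from pathlib import Path

# Hypothetical directory and file names, chosen only for illustration.
output_dir = "archive"
filename = "index.json"

# Before: string-based joining with os.path.
old_style = os.path.join(output_dir, filename)

# After: the same path built with pathlib's / operator.
new_style = Path(output_dir) / filename

# Both spellings name the same location.
assert Path(old_style) == new_style

# Callers that still expect a plain string can convert explicitly.
as_text = str(new_style)
```

Because a `Path` is not a `str`, call sites and type annotations that assumed strings need updating, which is presumably what the "type checks" part of the summary refers to.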

Changes these areas

  • [ ] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [x] Internal architecture
  • [ ] Archived data layout on disk

Roadmap Goals

0.6


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
Reference
starred/ArchiveBox#4191