[PR #518] [MERGED] tests: Add tests for several different ways to extract the title #1203

Closed
opened 2026-03-01 14:48:50 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ArchiveBox/ArchiveBox/pull/518
Author: @cdvv7788
Created: 10/30/2020
Status: Merged
Merged: 10/30/2020
Merged by: @pirate

Base: master ← Head: title_test


📝 Commits (1)

  • e7e33ea tests: Add tests for several different ways to extract the title

📊 Changes

4 files changed (+761 additions, -4 deletions)

📝 archivebox/extractors/title.py (+8 -2)
➕ tests/mock_server/templates/malformed.html (+8 -0)
➕ tests/mock_server/templates/title_og_with_html.com.html (+698 -0)
📝 tests/test_title.py (+47 -2)

📄 Description

Summary

Added tests for three cases of title extraction:

  • <title> tag
  • Open Graph (og) meta tag
  • <title> tag in malformed HTML
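
The three cases above can be illustrated with a minimal sketch. It is not the code from archivebox/extractors/title.py. `TitleSketchParser` and `extract_title` are hypothetical names, and matching on `property="og:title"` and preferring the <title> tag over the og value are assumptions made for illustration only.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TitleSketchParser(HTMLParser):
    """Hypothetical helper: collects <title> text and an og:title meta value."""

    def __init__(self) -> None:
        super().__init__()
        self.title_parts: List[str] = []
        self.og_title: Optional[str] = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Only the first non-empty <title> is collected.
        if tag == "title" and not self.title_parts:
            self._in_title = True
        elif tag == "meta" and self.og_title is None:
            attr = dict(attrs)
            # Assumption: the og title is exposed as property="og:title".
            if attr.get("property") == "og:title":
                self.og_title = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html: str) -> Optional[str]:
    """Return the <title> text, else the og:title value, else None."""
    parser = TitleSketchParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    title = "".join(parser.title_parts).strip()
    og_title = (parser.og_title or "").strip()
    return title or og_title or None


if __name__ == "__main__":
    # Case 1: plain <title> tag
    print(extract_title("<html><head><title>Plain title</title></head></html>"))
    # Case 2: only an og meta tag
    print(extract_title('<head><meta property="og:title" content="OG title"></head>'))
    # Case 3: <title> tag in a document whose other tags are never closed
    print(extract_title("<html><head><title>Broken page</title><body><p>never closed"))
```

A tolerant, event-based parser such as `html.parser.HTMLParser` does not require a well-formed document, which is why the third case can still yield the title in this sketch.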

Related issues #493

Changes these areas

  • [x] Bugfixes
  • [ ] Feature behavior
  • [ ] Command line interface
  • [ ] Configuration options
  • [ ] Internal architecture
  • [ ] Snapshot data layout on disk

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
