[GH-ISSUE #1514] New Extractor Idea: extruct for Open Graph Protocol web content metadata extraction #3916

Open
opened 2026-03-15 00:58:19 +03:00 by kerem · 0 comments

Originally created by @pirate on GitHub (Sep 11, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1514

https://github.com/scrapinghub/extruct

![image](https://github.com/user-attachments/assets/86387cc6-4c15-4c13-9eca-8ce18e22fcab)