mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 17:16:00 +03:00
[GH-ISSUE #1619] Feature Request: Integration of ReplayWeb.Page for previewing WARC/WACZ files #971
Originally created by @nopper on GitHub (Dec 12, 2024).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1619
Originally assigned to: @pirate on GitHub.
What type of suggestion are you making?
New extractor / type of content to save
What is the problem that your feature request solves?
It would be great if ArchiveBox provided a simple HTML page for viewing the WARC/WACZ files it generates.
What is your proposed solution?
One suggestion would be to incorporate replayweb.page as a potential web component for rendering these files. This would be particularly useful for sites that are not adequately captured by single-file formats due to the presence of dynamic elements or scripts.
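As a rough illustration of the proposal, here is a minimal Python sketch that renders an HTML page embedding the `<replay-web-page>` web component. It is not ArchiveBox code: the function name `render_replay_page`, the CDN URL in `REPLAY_UI_JS`, and the example archive path are assumptions made for this sketch. ReplayWeb.page embeds also expect a small service worker script to be served alongside the page (by default under `replay/sw.js`, as far as its embedding docs describe), which is not shown here.

```python
from html import escape

# Assumed CDN location of the ReplayWeb.page UI bundle (not taken from the issue).
REPLAY_UI_JS = "https://cdn.jsdelivr.net/npm/replaywebpage/ui.js"


def render_replay_page(archive_path: str, page_url: str) -> str:
    """Return a minimal HTML page that embeds a <replay-web-page> viewer.

    archive_path: URL or relative path of the WARC/WACZ file to load.
    page_url: the original URL inside the archive to display first.
    """
    # Escape attribute values so quotes or ampersands cannot break the markup.
    return f"""<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <script src="{escape(REPLAY_UI_JS, quote=True)}"></script>
  </head>
  <body>
    <replay-web-page
      source="{escape(archive_path, quote=True)}"
      url="{escape(page_url, quote=True)}">
    </replay-web-page>
  </body>
</html>
"""


if __name__ == "__main__":
    # Hypothetical snapshot path, used only for illustration.
    print(render_replay_page("warc/example.warc.gz", "https://example.com/"))
```

Generating the page from a template function like this would let each snapshot get its own viewer page pointing at its own archive file, without changing how the archive itself is produced.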
What hacks or alternative solutions have you tried to solve the problem?
Share the entire output of the archivebox version command for the current version you are using.
How badly do you want this new feature?
Mini Survey
@pirate commented on GitHub (Dec 12, 2024):
Already have this on my roadmap 😁
have even implemented it a couple times: https://github.com/ArchiveBox/ArchiveBox/pull/1327/files#diff-08041ea7039132ec35c8a6d986cb5f3808decf38291f48f199a8d76af3d1cba5
the blocker is that the WARCs produced by plain wget are not standards-compliant and look terrible, so we need to switch to wget-lua or browsertrix to actually capture them, and that's a much bigger ordeal