[GH-ISSUE #166] Archive Interactive Site #1625

Closed
opened 2026-03-01 17:52:17 +03:00 by kerem · 4 comments
Owner

Originally created by @diego898 on GitHub (Mar 9, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/166

Describe the bug

I am trying to save a full working copy of the interactive site https://distill.pub/2019/activation-atlas/, but when I run `echo "https://distill.pub/2019/activation-atlas/" | ./archive`, the resulting archive only preserves the text. According to the documentation, the default should save everything.

kerem closed this issue 2026-03-01 17:52:17 +03:00

@pirate commented on GitHub (Mar 11, 2019):

It's quite a complex site to save, but it does show up properly in the PDF/Screenshot output in my test:

[Screenshot: https://user-images.githubusercontent.com/511499/54109121-270f5000-43b4-11e9-8a02-064446264fdf.png]

As this is not specifically related to this site, but is a general problem with archiving complex, interactive content, I'm closing it in favor of more specific issues with individual archive methods.

You can track our progress improving a few pieces of interactive site archiving here:

- https://github.com/pirate/ArchiveBox/issues/154
- https://github.com/pirate/ArchiveBox/issues/130

@diego898 commented on GitHub (Mar 11, 2019):

@pirate thanks for the response - is it a goal/within-scope for this project to try and save a site as complex as that?


@pirate commented on GitHub (Mar 11, 2019):

Yes @diego898, it's 100% in-scope. We want to be able to save every website you can view online with perfect fidelity. It's a goal shared by many of the other tools as well, from Archive.org to Webrecorder.io; we all care about getting it right ;)

This is a good set of tests if you're interested in seeing which tools are able to archive interactive sites: http://acid.matkelly.com


@diego898 commented on GitHub (Mar 20, 2019):

hey @pirate - sorry I don't know much about this. How exactly do I "use" or "test" using those sets of tests? thanks!
