[GH-ISSUE #1261] DOM extractor output contains JS that can be executed upon viewing, and is subject to same security risks as viewing WGET output #774

Closed
opened 2026-03-01 14:46:12 +03:00 by kerem · 1 comment

Originally created by @p0n1 on GitHub (Nov 3, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1261

I noticed the following statements about executing archived JS.

Note: Only the wget extractor method executes archived JS when viewing snapshots, all other archive methods produce static output that does not execute JS on viewing. If you are worried about these issues ^ you should disable the wget extractor method using archivebox config --set SAVE_WGET=False.

Source: https://github.com/ArchiveBox/ArchiveBox#security-risks-of-viewing-archived-js

Workarounds
Disable the wget extractor by setting archivebox config --set SAVE_WGET=False, ensure you are always logged out, or serve only a static HTML version of your archive.

Source: https://github.com/ArchiveBox/ArchiveBox/security/advisories/GHSA-cr45-98w9-gwqx

I think the SAVE_DOM archive method could also lead to a similar issue. When viewing Chrome > HTML ./output.html, any remote JavaScript will be loaded and executed.

Is that right? If so, we should also document this and remind users to disable this option if they are worried about the XSS/CSRF issue.
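
One way to check whether a saved DOM output still carries executable JS is to scan it for `<script>` tags and inline event handlers. The following is only a minimal sketch using Python's standard `html.parser`; the names `ScriptFinder` and `find_scripts` are hypothetical helpers made up for illustration, not part of ArchiveBox, and the `on*` attribute match is a rough heuristic rather than a complete XSS audit.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Collect <script> tags and inline on* event-handler attributes."""

    def __init__(self):
        super().__init__()
        self.remote_scripts = []   # src URLs of external scripts
        self.inline_scripts = 0    # number of <script> tags without a src
        self.event_handlers = []   # (tag, attribute) pairs, e.g. ("body", "onload")

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        attr_map = dict(attrs)
        if tag == "script":
            src = attr_map.get("src")
            if src:
                self.remote_scripts.append(src)
            else:
                self.inline_scripts += 1
        # Heuristic: treat any attribute starting with "on" as an event handler.
        for name in attr_map:
            if name.startswith("on"):
                self.event_handlers.append((tag, name))


def find_scripts(html_text):
    """Parse html_text and return a ScriptFinder holding what was found."""
    finder = ScriptFinder()
    finder.feed(html_text)
    finder.close()
    return finder


if __name__ == "__main__":
    # Hypothetical sample standing in for the contents of an output.html file.
    sample = (
        '<html><head><script src="https://example.com/a.js"></script>'
        "<script>alert(1)</script></head>"
        '<body onload="x()"><p>hi</p></body></html>'
    )
    result = find_scripts(sample)
    print("remote scripts:", result.remote_scripts)
    print("inline scripts:", result.inline_scripts)
    print("event handlers:", result.event_handlers)
```

If any of the three collections is non-empty, the file is not purely static and should be treated with the same caution as the wget output.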

kerem closed this issue 2026-03-01 14:46:13 +03:00

@pirate commented on GitHub (Nov 4, 2023):

You're right, it used to be stripped before we expanded it to be the full outerHTML with <head>, but I didn't realize it became included when we changed that. Good catch, thanks!

I updated the security advisory GHSA-cr45-98w9-gwqx (CVE-2023-45815), README.md, and the Security Overview wiki page.
