[GH-ISSUE #1227] Bug: inconsistent file naming #3774

Closed
opened 2026-03-15 00:24:49 +03:00 by kerem · 1 comment
Owner

Originally created by @sasasqt on GitHub (Sep 3, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1227

Describe the bug

github.com/ArchiveBox/ArchiveBox@73a5f74d38/archivebox/extractors/dom.py (L40)
github.com/ArchiveBox/ArchiveBox@73a5f74d38/archivebox/extractors/readability.py (L102)
dom.html should be renamed to output.html, or the other way around.

Steps to reproduce

Screenshots or log output

ArchiveBox version

replace this line with the *full*, unshortened output of running `archivebox version`
kerem closed this issue 2026-03-15 00:24:55 +03:00

@pirate commented on GitHub (Sep 4, 2023):

These are different extractor outputs, so they must have separate filenames. The specific names output.html / output.pdf etc. come from older versions, when there was only one output. Now that there are multiple outputs, most of them go into a folder named after the extractor, e.g. media/, git/. In the future every output will be in a folder named after the extractor, and these legacy names will stop being used entirely.
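
To illustrate the per-extractor folder convention described above, here is a minimal sketch. It is not ArchiveBox's actual code: the helper `extractor_output_path`, the snapshot directory `archive/1693700000`, and the reuse of `output.html` inside each folder are all hypothetical, chosen only to show why a folder per extractor removes the filename clash.

```python
from pathlib import PurePosixPath


def extractor_output_path(snapshot_dir: str, extractor: str, filename: str) -> PurePosixPath:
    """Hypothetical helper: place an output under a folder named after its extractor."""
    return PurePosixPath(snapshot_dir) / extractor / filename


# Hypothetical snapshot directory; two extractors may now reuse the same filename
# without overwriting each other, because each writes into its own folder.
dom_path = extractor_output_path("archive/1693700000", "dom", "output.html")
readability_path = extractor_output_path("archive/1693700000", "readability", "output.html")

print(dom_path)
print(readability_path)
```

With a layout like this, the name of the file inside each folder no longer has to be unique across extractors, which is why the legacy top-level names can eventually be retired.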
