[GH-ISSUE #266] Allow saving screenshots as JPG #3209

Closed
opened 2026-03-14 21:37:05 +03:00 by kerem · 5 comments
Owner

Originally created by @dzek69 on GitHub (Sep 12, 2019).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/266

Type

  • Request modification of existing behavior or design

What is the problem that your feature request solves

I would like to save page screenshots as JPG to save space. PNGs of websites that are mostly images use more disk space than I'd like them to. Allowing screenshots to be saved as JPG would solve that.

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of

Ideally I should be able to select jpeg quality as well. Having default

How badly do you want this new feature?

  • It's important to add it in the near-mid term future, mostly because it probably shouldn't be hard to implement with quality hardcoded/choose

  • I'd contribute to development if I'd feel safe in Python, but I do not
  • I like ArchiveBox so far / would recommend it to a friend
kerem closed this issue 2026-03-14 21:37:10 +03:00

@pirate commented on GitHub (Sep 19, 2019):

Unfortunately this is not supported by Chrome, so it would have to be a separate program that runs after the screenshot process to convert the PNGs to JPGs. I'll leave this open to see if other people want this too, but I can't promise this feature will be added anytime soon.

As a quick fix in the meantime, might I suggest storing the entire archive dir on a compressed filesystem like ZFS?

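apt install zfsutils-linux    # or brew cask install openzfs
# -O sets filesystem properties on the pool's root dataset (-o is for pool properties)
# ~/archivebox.zfs is a file-backed vdev, so the file must already exist
zpool create -f \
    -O mountpoint=/mnt/archivebox \
    -O compression=lz4 \
    -O atime=off \
    -O sync=standard \
    -O aclinherit=passthrough \
    -O utf8only=on \
    -O normalization=formD \
    -O casesensitivity=sensitive \
    archivebox ~/archivebox.zfs
zfs mount archivebox    # only needed if the dataset is not already mounted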

@LaserWires commented on GitHub (Sep 19, 2019):

It would be better to base64-encode the images and embed them directly, without populating the archive directory.


@dzek69 commented on GitHub (Sep 19, 2019):

@LaserWires why would it be more ideal?

I disagree, because it's easier to do something with a screenshot if it's stored as a file instead of embedded into something else. Additionally, encoding something into base64 makes it around 33% bigger (about 37% with MIME line wrapping). Huge waste of space. I see no advantages.
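
The size overhead is easy to check from a shell. Base64 emits 4 output characters for every 3 input bytes, so without line breaks the growth is about 33%; the MIME variant, which wraps lines, comes out closer to 37%. The sketch below is only an illustration: the function name `base64_overhead` is made up for this example, and it assumes `head -c` and a `base64` command are available.

```bash
# Sketch only: base64_overhead is a hypothetical helper, not part of ArchiveBox.
# Prints the raw size, the base64-encoded size, and the growth in percent
# for N random bytes.
base64_overhead() (
    n=$1
    # tr strips the line breaks that some base64 implementations insert,
    # so only the encoded characters themselves are counted.
    encoded=$(head -c "$n" /dev/urandom | base64 | tr -d '\n' | wc -c | tr -d ' ')
    echo "raw=$n encoded=$encoded overhead=$(( (encoded - n) * 100 / n ))%"
)

base64_overhead 300000   # prints: raw=300000 encoded=400000 overhead=33%
```

The same ratio applies to a screenshot embedded as a `data:` URI, since those are not line-wrapped.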


@dzek69 commented on GitHub (Sep 19, 2019):

@pirate thank you for your reply. I think I'm gonna do some post-processing then, to replace the PNGs with JPGs.

If you feel like implementing it with external tools someday - please leave this ticket open, if not - feel free to close.
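
A post-processing pass of this kind could look roughly like the sketch below. It is not something ArchiveBox ships. It assumes ImageMagick's `convert` is installed and that screenshots are stored as `screenshot.png` somewhere under the archive directory; the function name `convert_screenshots`, the `DRY_RUN` switch, and the default quality of 85 are all made up for this example.

```bash
# Sketch only: convert_screenshots and DRY_RUN are made-up names, and the
# screenshot.png filename is an assumption about the archive layout.
# Usage: convert_screenshots DIR [QUALITY]
convert_screenshots() (
    dir=$1
    quality=${2:-85}    # assumed default JPEG quality
    # Paths containing newlines are not handled by this simple loop.
    find "$dir" -type f -name 'screenshot.png' | while IFS= read -r png; do
        jpg="${png%.png}.jpg"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Report the planned conversion without touching any files.
            printf '%s -> %s (quality %s)\n' "$png" "$jpg" "$quality"
        else
            # Delete the PNG only if ImageMagick wrote the JPG successfully.
            convert "$png" -quality "$quality" "$jpg" && rm -- "$png"
        fi
    done
)
```

Running it first as `DRY_RUN=1 convert_screenshots /path/to/archive 80` lists what would be converted. Note that any index pages that still link to `screenshot.png` would presumably need updating after the PNGs are removed.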


@pirate commented on GitHub (Sep 20, 2019):

I think I'm going to close this for now, and keep compressed filesystems as the recommended solution for disk space saving. Images are only a small potential source of bloat, and LZ4 compression on the whole folder will do a better overall job than any individual fix.
