[GH-ISSUE #829] Feature Request: Cookies.txt use for singlefile #2026

Open
opened 2026-03-01 17:55:54 +03:00 by kerem · 3 comments

Originally created by @TheAnachronism on GitHub (Aug 16, 2021).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/829

Type

  • [x] General question or discussion
  • [x] Propose a brand new feature
  • [ ] Request modification of existing behavior or design

What is the problem that your feature request solves

The singlefile CLI supports a cookies.txt file too:
https://github.com/gildas-lormeau/SingleFile/issues/574
Would it be possible to use it here as well?
If it's available for wget, it should be possible for singlefile too.

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

Restricted websites could then be archived with more than just wget or a configured Chrome binary.

What hacks or alternative solutions have you tried to solve the problem?

As far as singlefile goes, the only workaround is manually saving the page with the web extension and then uploading the result to the correct directory.

How badly do you want this new feature?

  • [ ] It's an urgent deal-breaker, I can't live without it
  • [x] It's important to add it in the near-mid term future
  • [x] It would be nice to have eventually

  • [x] I'm willing to contribute dev time / money to fix this issue
  • [x] I like ArchiveBox so far / would recommend it to a friend
  • [ ] I've had a lot of difficulty getting ArchiveBox set up

@TheAnachronism commented on GitHub (Aug 16, 2021):

I'm not experienced in Python at all, but I have quite a bit of coding experience, so to start I'll just go through the code and propose the changes here.


@TheAnachronism commented on GitHub (Aug 16, 2021):

Alright, so I guess this shouldn't be that hard.
Looking at https://github.com/ArchiveBox/ArchiveBox/blob/dev/archivebox/extractors/wget.py, it seems only COOKIES_FILE has to be added to the config imports, and a CLI argument has to be added to the cmd object on line 48 in https://github.com/ArchiveBox/ArchiveBox/blob/dev/archivebox/extractors/singlefile.py.
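
A rough sketch of what that change could look like, written as a standalone function rather than as a patch to singlefile.py. The helper name `build_singlefile_cmd` and its parameters are hypothetical, and the `--browser-cookies-file` flag name is an assumption based on the linked SingleFile issue, so it should be checked against `single-file --help` for the installed version:

```python
from typing import List, Optional


def build_singlefile_cmd(
    singlefile_binary: str,
    url: str,
    output_path: str,
    cookies_file: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for one single-file CLI run.

    Hypothetical helper for illustration; it is not ArchiveBox's actual code.
    """
    cmd = [singlefile_binary]
    if cookies_file:
        # Assumed flag name; verify it with `single-file --help`.
        cmd.append("--browser-cookies-file={}".format(cookies_file))
    # The URL and the output file are appended as positional arguments.
    cmd += [url, output_path]
    return cmd


if __name__ == "__main__":
    print(build_singlefile_cmd(
        "single-file",
        "https://example.com",
        "singlefile.html",
        cookies_file="/data/cookies.txt",
    ))
```

The flag is only appended when a cookies file is configured, so setups without COOKIES_FILE would keep running the same command as before.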

Because this doesn't seem too big, how fast could it get implemented? (Not really that important, I was just wondering.)


@pirate commented on GitHub (Aug 19, 2021):

It may have to wait till after I get back from vacation in October / Dec, and even then it's lower priority than some of our pending refactors.

If you PR it then as fast as you want ;) I don't think it would take more than a day to implement, test, and document it.
