mirror of
https://github.com/ArchiveBox/ArchiveBox.git
synced 2026-04-25 09:06:02 +03:00
[GH-ISSUE #1237] Support: Docker v0.7.1 unable to use /tmp directory for crontab mkfstemp #760
Originally created by @cutterkom on GitHub (Oct 6, 2023).
Original GitHub issue: https://github.com/ArchiveBox/ArchiveBox/issues/1237
I want to schedule archiving a list of URLs:
urls.txt consists of a list of URLs:
This works, no parsing error as described here: https://github.com/ArchiveBox/ArchiveBox/issues/968
I restart the scheduler and run it:
But it's not working, in the logs there is a "Failed to parse":

What am I missing?
Also, using cron style does not work; day, month... is working fine within schedule:
In general, what is your workflow advice to archive a list of URLs regularly?
@pirate commented on GitHub (Oct 9, 2023):
Have you mounted urls.txt in a docker volume or is it in the root of the data folder? Please post your docker-compose.yml volume config to show where urls.txt is located.
Also when using * characters in CLI args you have to use single quotes, not double quotes, as otherwise it'll expand to the list of all files in the current directory as you see in your last screenshot.
@cutterkom commented on GitHub (Oct 15, 2023):
Oh, right. I didn't mount it!
The docker-compose.yml is here, I mount the urls list now into the scheduler section: github.com/forummuenchen/forum-archivebox@43bd662c13/docker-compose.yml (L119)
Before that, I download the list into the archivebox directory with:
When I want to add the urls list:
I get:
@pirate commented on GitHub (Oct 16, 2023):
So < is actually piping outside of docker (because < is greedily parsed by the first shell that sees it), not inside docker. If you want to load the file inside docker, do:
@cutterkom commented on GitHub (Nov 8, 2023):
Sorry for the long delay, I was on vacation....
I tried your proposed statement (without the -T), but it's not doing the trick. data/logs/schedule.log says it failed to parse:
Do you have any other recommendation, or can you point me to projects that use the scheduler in production?
@pirate commented on GitHub (Nov 8, 2023):
Can you try the latest 0.7 image: archivebox/archivebox:latest and set --depth=1
@cutterkom commented on GitHub (Nov 9, 2023):
Okay, I updated to the latest image and ran:
Unfortunately, it does not do the trick (yet):
@pirate commented on GitHub (Jan 19, 2024):
The latest version has a bunch of fixes and improvements that might help with this. Can you try on 0.7.2 or 0.7.3?
Other things to check: make sure your Docker VM has enough disk space available, make sure /data is readable and writable.
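The disk space and permission checks could be scripted roughly like this. It is only a minimal sketch, assuming it is run inside the container where the data folder is mounted at /data. The check_dir helper and the 1 GiB free-space threshold are hypothetical choices for illustration, not part of ArchiveBox.

```python
import os
import shutil
import tempfile


def check_dir(path, min_free_bytes=1024**3):
    """Return a list of problems found with `path` (an empty list means OK)."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]

    problems = []
    if not os.access(path, os.R_OK | os.X_OK):
        problems.append(f"{path} is not readable")

    # Creating a real temp file is the most direct writability test;
    # a failing temp-file creation in /tmp is what the issue title reports.
    try:
        fd, tmp_name = tempfile.mkstemp(dir=path)
        os.close(fd)
        os.remove(tmp_name)
    except OSError as err:
        problems.append(f"cannot create a temp file in {path}: {err}")

    # Free space on the filesystem that holds `path`.
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free on the filesystem holding {path}")
    return problems


if __name__ == "__main__":
    # /data and /tmp are assumed locations; adjust to your volume config.
    for directory in ("/data", "/tmp"):
        for line in check_dir(directory) or [f"{directory} looks OK"]:
            print(line)
```

Running it as the same user the scheduler runs as matters, since permissions are checked for the current user only.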