mirror of
https://github.com/hibiken/asynq.git
synced 2026-04-25 23:15:51 +03:00
[GH-ISSUE #334] [FEATURE REQUEST] Recover tasks immediately when server is crashed #1161
Originally created by @hawksheng on GitHub (Oct 20, 2021).
Original GitHub issue: https://github.com/hibiken/asynq/issues/334
Originally assigned to: @hibiken on GitHub.
Is your feature request related to a problem? Please describe.
Some tasks in my project are planned to run for several hours. However, if the asynq-server crashes or hangs, they can't be retried immediately; they have to wait until a timeout is reached.
Describe the solution you'd like
Describe alternatives you've considered
N/A
Additional context
N/A
btw, really a great job you have made 👍
@hibiken commented on GitHub (Oct 20, 2021):
@Chosokabeho thank you for creating an issue! This is great feedback.
I believe RabbitMQ does something similar.
Quote from their getting started guide:
We have the second part (Timeout) for Asynq, but as you pointed out, the first part is not implemented in Asynq. I will look into the behavior of other systems (RabbitMQ, etc) and come up with a sensible solution here.
I want to look at other systems because I'm not sure whether we want to account for temporary network partitions and deal with them gracefully. For example, if asynq-server processes could not talk to Redis for some time due to a network partition, do we want to requeue those tasks immediately? Maybe we need some tolerance around network partitions (e.g. allow a server to be offline for up to a certain duration).
I'll look into this more and find a solution here.
Let me know if anyone has experience around this topic!
@svedova commented on GitHub (Feb 4, 2022):
Our use case is as follows:
Here's some mock code:
Now this is mock code, but basically we retrieve the task IDs and reprocess them. The handler receives the reprocessed task but does not update the state to completed afterwards. I tried to manually cancel the task by using inspector.CancelProcessing, but it doesn't work.
Any idea how I can resume a task when the worker server restarts?
@hibiken commented on GitHub (Feb 4, 2022):
@svedova thanks for commenting, I moved the discussion to #396 👍