mirror of https://github.com/hibiken/asynq.git synced 2026-04-25 23:15:51 +03:00

[GH-ISSUE #1015] [BUG] Worker failure appears to increment retry counter #2510

Open

opened 2026-03-15 20:44:51 +03:00 by kerem · 3 comments

kerem commented

2026-03-15 20:44:51 +03:00

Owner

Originally created by @kmcgovern-apixio on GitHub (Jan 29, 2025).
Original GitHub issue: https://github.com/hibiken/asynq/issues/1015

Originally assigned to: @hibiken, @kamikazechaser on GitHub.

Describe the bug
When a worker fails, a task moves to the archived state without being retried if asynq.MaxRetry(0) is set on the task. Setting it to any positive value X allows up to X worker failures. For example, with asynq.MaxRetry(3) the task can be picked up again up to 3 times when worker failures are the cause.

Environment (please complete the following information):

OS: linux
asynq package version: 0.25.1
Redis version: 7.4.2

To Reproduce
Steps to reproduce the behavior (Code snippets if applicable):

1. Create a task with asynq.MaxRetry(0).
2. Run the task.
3. Kill the worker before the task completes (e.g. Ctrl+C).
4. Check the task state in Redis: hgetall "asynq:{default}:t:8abc3be7-2f6f-4ac4-a610-1c0a23188d96"
5. Start the worker back up.
6. After the lease expires, the task moves to the archived state (can be checked with hgetall).

Expected behavior
Tasks are picked back up after a worker failure. The retry count is incremented only when a task returns an error or panics.

Screenshots
Output from redis-cli (payloads and job name redacted):

127.0.0.1:6379> hgetall "asynq:{default}:t:8abc3be7-2f6f-4ac4-a610-1c0a23188d96"
1) "state"
2) "active"
3) "msg"
4) "\n\x1finternal:myjobname\x12\xc2\x02taskpayloadhere\x1a$8abc3be7-2f6f-4ac4-a610-1c0a23188d96\"\adefault@\x88\x0e`\x80\xa3\x05"
127.0.0.1:6379> hgetall "asynq:{default}:t:8abc3be7-2f6f-4ac4-a610-1c0a23188d96"
1) "state"
2) "archived"
3) "msg"
4) "\n\x1finternal:myjobname\x12\xc2\x02taskpayloadhere\x1a$8abc3be7-2f6f-4ac4-a610-1c0a23188d96\"\adefault:\x19asynq: task lease expired@\x88\x0eX\xa6\xe7\xe6\xbc\x06`\x80\xa3\x05"

Additional context
For now I am bumping the max retry value to handle this, but I expected worker failures not to affect retries.

kerem added the bug label 2026-03-15 20:44:51 +03:00

kerem commented

2026-03-15 20:45:02 +03:00

Author

Owner

@kamikazechaser commented on GitHub (May 15, 2025):

https://github.com/hibiken/asynq/blob/master/recoverer.go#L99

Possibly related to the above.

> asynq.MaxRetry(0)

Looks like an edge case. We could handle this.
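
The edge case can be illustrated with a simplified model of the lease-expiry decision. This is only a sketch, assuming the recoverer archives a task once its retried count has reached its maximum retry count and retries it otherwise. The names `taskMessage`, `onLeaseExpired`, `survivedCrashes`, and the `action` constants are hypothetical and are not asynq's actual types.

```go
package main

import "fmt"

// taskMessage is a hypothetical stand-in for the task metadata
// that the recoverer inspects when a lease expires.
type taskMessage struct {
	Retry   int // maximum number of retries allowed for the task
	Retried int // number of retries consumed so far
}

type action string

const (
	actionRetry   action = "retry"
	actionArchive action = "archive"
)

// onLeaseExpired models the ASSUMED decision for a task whose lease expired:
// archive when the retry budget is exhausted, otherwise retry.
func onLeaseExpired(m taskMessage) action {
	if m.Retried >= m.Retry {
		return actionArchive
	}
	return actionRetry
}

// survivedCrashes counts how many lease expiries a task survives
// before the model archives it, charging one retry per expiry.
func survivedCrashes(maxRetry int) int {
	m := taskMessage{Retry: maxRetry}
	n := 0
	for onLeaseExpired(m) == actionRetry {
		m.Retried++
		n++
	}
	return n
}

func main() {
	for _, maxRetry := range []int{0, 3} {
		fmt.Printf("max retry %d: survives %d worker failure(s) before archiving\n",
			maxRetry, survivedCrashes(maxRetry))
	}
}
```

Under this model a task with a retry budget of 0 is archived on its first lease expiry, and a budget of 3 tolerates three worker failures, which matches the behavior described in the report.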

kerem commented

2026-03-15 20:45:07 +03:00

Author

Owner

@kamikazechaser commented on GitHub (May 15, 2025):

@kmcgovern-apixio Try using the sohail/recoverer-fix branch and see if it fixes your issue.

kerem commented

2026-03-15 20:45:12 +03:00

Author

Owner

@dmitrii-doronin commented on GitHub (Dec 29, 2025):

@kamikazechaser, hi. Thank you for taking a look at the issue. Appreciate it.

It seems that the proposed solution would cause worker failures to always be retried. What's your opinion on introducing isRetryableFunc or something similar to allow more granular control here? It might not even be a breaking change if the library provides a default retry function through the options.

If there's a more appropriate way to handle the case in this thread, I would be really glad if you could point it out.
