[GH-ISSUE #472] [FEATURE REQUEST] Self-Managed Retries #214

Closed
opened 2026-03-02 05:19:41 +03:00 by kerem · 7 comments

Originally created by @AlexCuse on GitHub (May 17, 2022).
Original GitHub issue: https://github.com/hibiken/asynq/issues/472

Originally assigned to: @hibiken on GitHub.

Is your feature request related to a problem? Please describe.
I would like to be able to leverage "processor managed" retries for specific tasks running through an asynq server.

Describe the solution you'd like
I will have a PR shortly for a SelfManagedRetry option that will override Retry/Retried logic to retry a task indefinitely (from asynq's perspective), allowing the processor to terminate by either returning success or SkipRetry.

Describe alternatives you've considered
Looked at the options around "consuming retry count" and considered setting a high number of retries (sketched below), since we're looking at this within a fairly short expiration right now, but came around to favoring something a little more direct.

There is a risk of tasks retrying indefinitely if not self-managed correctly, but this should be difficult to fall into without using the option.
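
For context, the "high number of retries" alternative mentioned above can be approximated with asynq's existing options. A minimal sketch, assuming a handler that decides for itself when to stop; isDone and shouldGiveUp are hypothetical stand-ins for real business logic:

package main

import (
    "context"
    "errors"
    "fmt"
    "math"

    "github.com/hibiken/asynq"
)

// Hypothetical stand-ins for real business logic.
func isDone(t *asynq.Task) bool       { return false }
func shouldGiveUp(t *asynq.Task) bool { return false }

// handleSync ends the retry cycle itself: nil on success,
// asynq.SkipRetry to give up, any other error to keep retrying.
func handleSync(ctx context.Context, t *asynq.Task) error {
    if isDone(t) {
        return nil
    }
    if shouldGiveUp(t) {
        return fmt.Errorf("giving up: %w", asynq.SkipRetry)
    }
    return errors.New("not finished; retry")
}

func main() {
    client := asynq.NewClient(asynq.RedisClientOpt{Addr: "localhost:6379"})
    defer client.Close()

    // A very large retry budget means asynq effectively never gives up
    // on its own; the handler above decides when to stop.
    task := asynq.NewTask("sync:critical", []byte(`{}`))
    if _, err := client.Enqueue(task, asynq.MaxRetry(math.MaxInt32)); err != nil {
        panic(err)
    }
}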


@hibiken commented on GitHub (May 18, 2022):

@AlexCuse Thank you for creating an issue.

Have you taken a look at Config.IsFailure?
You can provide this function to determine whether the error returned from the Handler should be considered a failure. A non-failure error won't consume the retry count; you can leverage this to retry a task indefinitely if you so choose.

See this wiki for more info (https://github.com/hibiken/asynq/wiki/Task-Retry#non-failure-error)
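
For reference, wiring this into the server config looks roughly like the following. A minimal sketch, assuming a sentinel ErrTransient (the name is ours, not asynq's) that handlers return when an error should not consume the retry count:

package main

import (
    "errors"
    "log"

    "github.com/hibiken/asynq"
)

// ErrTransient marks errors that should trigger a retry without
// consuming the task's retry count. (Hypothetical sentinel.)
var ErrTransient = errors.New("transient failure")

func main() {
    srv := asynq.NewServer(
        asynq.RedisClientOpt{Addr: "localhost:6379"},
        asynq.Config{
            // Errors wrapping ErrTransient are treated as non-failures:
            // the task is retried but its retry count is untouched.
            IsFailure: func(err error) bool {
                return !errors.Is(err, ErrTransient)
            },
        },
    )

    mux := asynq.NewServeMux()
    // ... register handlers on mux ...
    if err := srv.Run(mux); err != nil {
        log.Fatal(err)
    }
}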


@AlexCuse commented on GitHub (May 18, 2022):

I have looked at it @hibiken - the problem with most of what is there is that it's not task-type specific. We have a subset of critical tasks that are "chained" so to speak, and we really only want to change the behavior for those tasks. The idea is to retry up to an expiration that is encoded in the task payloads, and once the expiration is hit, move the task into a cancellation queue where processors will clean up any progress associated with the attempt (sketched below).

Maybe another option could be to make the existing methods task-type aware?
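
A sketch of that retry-until-expiration pattern, under some assumptions: the expiration lives in a JSON payload field (here expires_at), the cleanup work goes to a queue named "cancellations", and the "chain:cancel" task type is hypothetical:

package tasks

import (
    "context"
    "encoding/json"
    "fmt"
    "time"

    "github.com/hibiken/asynq"
)

type chainPayload struct {
    ExpiresAt time.Time `json:"expires_at"` // hypothetical field name
}

// handleChained retries until the payload's expiration, then hands the
// task off to a cancellation queue and stops retrying via SkipRetry.
func handleChained(client *asynq.Client) asynq.HandlerFunc {
    return func(ctx context.Context, t *asynq.Task) error {
        var p chainPayload
        if err := json.Unmarshal(t.Payload(), &p); err != nil {
            return fmt.Errorf("bad payload: %v: %w", err, asynq.SkipRetry)
        }
        if time.Now().After(p.ExpiresAt) {
            // Expired: enqueue cleanup work, then archive this task.
            cancel := asynq.NewTask("chain:cancel", t.Payload())
            if _, err := client.Enqueue(cancel, asynq.Queue("cancellations")); err != nil {
                return err // retry the hand-off itself
            }
            return fmt.Errorf("expired at %s: %w", p.ExpiresAt, asynq.SkipRetry)
        }
        // ... attempt the actual work; any returned error retries ...
        return nil
    }
}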


@hibiken commented on GitHub (May 18, 2022):

Can you define your own error type and include task type?

e.g.

type TaskError struct {
    TaskType string
    // ... other metadata
}

func (e *TaskError) Error() string {
    return "task failed: " + e.TaskType
}

func IsFailure(err error) bool {
    taskErr, ok := err.(*TaskError)
    if !ok {
        return true // other errors are always considered failures
    }
    switch taskErr.TaskType {
    case "foo":
        // ... handle each task type differently, e.g. treat "foo"
        // errors as non-failures so retries don't consume the count
        return false
    default:
        return true
    }
}
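
A handler would then wrap its failures in that type so the callback can branch on the task type; for example, continuing the sketch above (doWork is a hypothetical stand-in):

func handleTask(ctx context.Context, t *asynq.Task) error {
    if err := doWork(ctx, t); err != nil {
        return &TaskError{TaskType: t.Type()}
    }
    return nil
}

The IsFailure func is then passed via asynq.Config when constructing the server.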

@AlexCuse commented on GitHub (May 18, 2022):

We could, though I'm not sure how attractive it is as the number of registered processors in the system grows. Will think a bit about how far this could get us.


@AlexCuse commented on GitHub (May 18, 2022):

@hibiken it occurs to me, thinking about this, that a custom IsFailure func registered with the processor could be preferable to the additional option in some (most?) ways. I will look into ways this could be achieved, but is that something that might be more attractive than a task-level option?


@hibiken commented on GitHub (May 18, 2022):

I believe the options we offer now (MaxRetry option and IsFailure callback) are generic enough to accommodate most use cases. I don't see us adding a new option to further customize retry logic.


@AlexCuse commented on GitHub (May 19, 2022):

Thanks for the quick response @hibiken. I disagree, but we will be able to muddle through with what's there for a while. I have already changed the code and will probably open the PR anyway once I've tested it in a running system, in case you change your mind; seems like a pretty simple change.
