[GH-ISSUE #535] [FEATURE REQUEST] Support BatchEnqueue for client #2276

Open
opened 2026-03-15 19:55:01 +03:00 by kerem · 11 comments

Originally created by @Percivalll on GitHub (Sep 2, 2022).
Original GitHub issue: https://github.com/hibiken/asynq/issues/535

Originally assigned to: @hibiken on GitHub.

When a large number of tasks need to be enqueued, the current method is slow because every Redis operation requires at least one network round trip (RTT).
For example, if I want to enqueue 1,000,000 tasks and each client.Enqueue call takes 13ms in my environment, then running them sequentially takes 1,000,000 × 13ms = 13,000,000ms, almost 3.6 hours. I can certainly use many goroutines to shorten the time, but that consumes a lot of CPU and opens many Redis connections.

I think we should provide a BatchEnqueue method that lets users enqueue many tasks at once. For the Redis broker, we can use pipelining to reduce the network and CPU overhead.
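
As a minimal illustration of the round-trip saving (a sketch using plain go-redis LPUSH commands against a placeholder key, not asynq's actual task encoding or enqueue scripts):

```go
package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	defer rdb.Close()

	// Send many commands in a single round trip instead of paying one RTT per command.
	cmds, err := rdb.Pipelined(ctx, func(pipe redis.Pipeliner) error {
		for i := 0; i < 1000; i++ {
			pipe.LPush(ctx, "example:pending", fmt.Sprintf("task-%d", i))
		}
		return nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("executed %d commands in one round trip\n", len(cmds))
}
```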

@Percivalll commented on GitHub (Sep 2, 2022):

Related discussions:
https://github.com/hibiken/asynq/issues/339#issuecomment-985507125
https://github.com/hibiken/asynq/issues/352

@KillianH commented on GitHub (Sep 6, 2022):

I need that too :) I have several million tasks to enqueue in my workflow. Overall I really love the lib: I can handle 7 million tasks in 44 minutes (with some computation and database requests).

@hibiken commented on GitHub (Sep 10, 2022):

Thank you @Serinalice for creating this feature request!

This feature makes a lot of sense and the package should support this use case.
We should probably discuss the API first (What should it look like? How should we handle partial errors?)

@Percivalll commented on GitHub (Sep 11, 2022):

No problem, I'll describe my preliminary ideas!

@Percivalll commented on GitHub (Sep 13, 2022):

How about this:

```go
func (c *Client) EnqueueBatch(tasks []*Task, opts ...Option) ([]*TaskInfo, error)
func (c *Client) EnqueueContext(ctx context.Context, tasks []*Task, opts ...Option) ([]*TaskInfo, error)
```

If the error is nil, all tasks were enqueued successfully. If it is not, the returned slice contains the TaskInfo for the tasks that were enqueued successfully.
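
A hypothetical usage sketch of that proposal (EnqueueBatch is not part of the released asynq client; the signature follows the one above, and the task type and queue names are placeholders):

```go
package main

import (
	"fmt"
	"log"

	"github.com/hibiken/asynq"
)

func main() {
	client := asynq.NewClient(asynq.RedisClientOpt{Addr: "localhost:6379"})
	defer client.Close()

	tasks := make([]*asynq.Task, 0, 1000)
	for i := 0; i < 1000; i++ {
		tasks = append(tasks, asynq.NewTask("email:deliver", []byte(fmt.Sprintf(`{"id":%d}`, i))))
	}

	// EnqueueBatch is the proposed API; it does not exist in asynq today.
	infos, err := client.EnqueueBatch(tasks, asynq.Queue("default"))
	if err != nil {
		// On partial failure, infos would hold only the tasks that were enqueued.
		log.Fatalf("enqueued %d of %d tasks: %v", len(infos), len(tasks), err)
	}
	log.Printf("enqueued all %d tasks", len(infos))
}
```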

@xuyang2 commented on GitHub (Sep 13, 2022):

How would the batch size be configured, and what would the default be?

https://redis.io/docs/manual/pipelining/

> IMPORTANT NOTE: While the client sends commands using pipelining, the server will be forced to queue the replies, using memory. So if you need to send a lot of commands with pipelining, it is better to send them as batches each containing a reasonable number, for instance 10k commands, read the replies, and then send another 10k commands again, and so forth. The speed will be nearly the same, but the additional memory used will be at most the amount needed to queue the replies for these 10k commands.

@Percivalll commented on GitHub (Sep 13, 2022):

> How would the batch size be configured, and what would the default be?
>
> https://redis.io/docs/manual/pipelining/
>
> > IMPORTANT NOTE: While the client sends commands using pipelining, the server will be forced to queue the replies, using memory. So if you need to send a lot of commands with pipelining, it is better to send them as batches each containing a reasonable number, for instance 10k commands, read the replies, and then send another 10k commands again, and so forth. The speed will be nearly the same, but the additional memory used will be at most the amount needed to queue the replies for these 10k commands.

The batch size would simply be the length of the tasks slice passed in.
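
If a cap on pipeline size is wanted on top of that, the caller (or the batch API internally) could split the slice into fixed-size chunks along these lines; this is only a sketch, chunk is a hypothetical helper, and the 10k figure is the example from the Redis docs quoted above rather than an asynq default:

```go
import "github.com/hibiken/asynq"

// chunk splits tasks into batches of at most n, so each pipelined round trip
// stays at a reasonable size and replies do not pile up in Redis memory.
func chunk(tasks []*asynq.Task, n int) [][]*asynq.Task {
	var batches [][]*asynq.Task
	for start := 0; start < len(tasks); start += n {
		end := start + n
		if end > len(tasks) {
			end = len(tasks)
		}
		batches = append(batches, tasks[start:end])
	}
	return batches
}

// Usage sketch with the proposed (not yet existing) EnqueueBatch:
// for _, batch := range chunk(tasks, 10000) {
//     if _, err := client.EnqueueBatch(batch); err != nil {
//         // handle partial failure
//     }
// }
```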

@yousifh commented on GitHub (Sep 14, 2022):

Would this new batch API pipeline the existing `EVALSHA` enqueue scripts, or would it use a new Lua script that takes the batch of tasks and enqueues them all at once?
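
For reference, the first option (pipelining the existing per-task scripts) could look roughly like this with go-redis; the script SHA, key, and argument layout here are placeholders, not asynq's real enqueue script:

```go
import (
	"context"

	"github.com/redis/go-redis/v9"
)

// enqueuePipelined sends one EVALSHA per task, but all of them in a single
// round trip. sha and the key/args layout are placeholders for this sketch.
func enqueuePipelined(ctx context.Context, rdb *redis.Client, sha string, taskArgs [][]interface{}) error {
	_, err := rdb.Pipelined(ctx, func(pipe redis.Pipeliner) error {
		for _, args := range taskArgs {
			pipe.EvalSha(ctx, sha, []string{"example:queue:pending"}, args...)
		}
		return nil
	})
	return err
}
```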

@5idu commented on GitHub (Nov 8, 2022):

Are there any new developments on this issue?

@developersam1995 commented on GitHub (Mar 5, 2023):

Should this API support a combination of initial task states between "aggregating", "pending", and "scheduled"?

@thanhps42 commented on GitHub (May 25, 2024):

any update?
