mirror of
https://github.com/nektos/act.git
synced 2026-04-26 01:15:51 +03:00
[GH-ISSUE #158] Possible improvements on parallelism #101
Originally created by @aidansteele on GitHub (Mar 16, 2020).
Original GitHub issue: https://github.com/nektos/act/issues/158
Background
I'll start with an example from one of my projects. Here's the graph rendered by `act -l`. In this case, `deployfrontend` needs `frontend`, and `deploy` needs `backend`.

My understanding of the current state

Right now, it appears that `act` forms a dependency graph of workflows, jobs, steps, matrices, etc. These are then transformed into `model.Plan`, `model.Stage` and `model.Run` structs. A `model.Run` is effectively an instance of a job -- it isn't a 1:1 mapping, because matrices mean there could be anywhere from 0 to n instances of the same job, each with different variables. All runs within a stage are run in parallel, and a stage is only complete when all of its runs have completed. At that point, execution can proceed to the next stage.

For stage N (where N is the zero-based order in which stages run), every run in that stage has a longest chain of transitive dependencies N jobs long. So runs with no dependencies are all in stage 0, runs that depend only on stage-0 runs are in stage 1, and so on.

Is this correct?
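The stage assignment described above could be sketched as follows. This is an illustrative reconstruction of the behaviour as I understand it, not act's actual code; the function and job names are made up:

```go
package main

import "fmt"

// stageOf returns the zero-based stage of a job: the length of its
// longest chain of transitive dependencies. Jobs with no dependencies
// land in stage 0, jobs depending only on stage-0 jobs land in stage 1, etc.
func stageOf(job string, needs map[string][]string, memo map[string]int) int {
	if s, ok := memo[job]; ok {
		return s
	}
	stage := 0
	for _, dep := range needs[job] {
		if d := stageOf(dep, needs, memo) + 1; d > stage {
			stage = d
		}
	}
	memo[job] = stage
	return stage
}

func main() {
	// The example graph from above.
	needs := map[string][]string{
		"frontend":       {},
		"backend":        {},
		"certs":          {},
		"deployfrontend": {"frontend"},
		"deploy":         {"backend"},
	}
	memo := map[string]int{}
	for _, job := range []string{"frontend", "backend", "certs", "deployfrontend", "deploy"} {
		fmt.Printf("%s -> stage %d\n", job, stageOf(job, needs, memo))
	}
}
```

Under this grouping, `deployfrontend` lands in stage 1, so it cannot start until all of stage 0 -- including `backend` and `certs`, which it doesn't actually need -- has finished.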
Proposal / discussion
Right now `deployfrontend` won't run until `frontend`, `backend` and `certs` have all completed. In my case, `frontend` completes much faster than `backend`. I would love for `deployfrontend` to start execution as soon as `frontend` has completed, even if `backend` is still executing.

I think this would mean eliminating the concept of stages entirely. Can I ask why they were introduced? Without diving into it in real depth, it feels like the existing abstractions around `common.Executor` would be sufficient to model the DAG needed to represent execution, so I'm not sure why `model.Stage` is there. I feel like I can't confidently suggest eliminating stages without understanding what problem they are currently solving.

One other random point: maybe it's not an issue in practice, but I feel it might also be useful to limit the number of actively executing runs to the number of CPUs present. Thoughts on that?

I'd be happy to submit a PR, or at least ideas for a PR proposal, if you are happy with the idea of refactoring to allow runs to start independently of each other, but I feel I should get some feedback first.
@aidansteele commented on GitHub (Mar 16, 2020):
Sorry for the verbosity 🤯
@cplee commented on GitHub (Apr 16, 2020):
@aidansteele - sorry for the delay, love the verbosity.
So, the idea of stages was actually a carryover from the original design of GitHub Actions, before it was rewritten on Azure DevOps. I'm totally onboard with a refactor to improve performance, and I'm also interested in the idea of a `parallelism` flag to limit how many containers are running at once.
@github-actions[bot] commented on GitHub (Jun 16, 2020):
Issue is stale and will be closed in 7 days unless there is new activity
@hashhar commented on GitHub (Jan 20, 2021):
Hi, thanks for creating this awesome tool.
I ran into this issue for different reasons. For projects which have a large number of independent jobs (or maybe a large matrix), `act` tries to run all of them at the same time. This is not feasible for a workflow with a sufficiently large number of jobs. It'd be great if there were some mechanism to limit the number of parallel "actions" running at once.

We can limit it somewhat by running each job serially, but that isn't possible in the case of large matrices.
@catthehacker commented on GitHub (Jan 20, 2021):
I have an idea of how to make this work without a major refactor, by using waitgroups, but I haven't yet been able to test it or make it happen due to lack of time.
Unless someone with more time and Go knowledge chimes in, I'd recommend scripting around `act` and running each job after the previous one finishes.
@hashhar commented on GitHub (Jan 21, 2021):
Thanks @CatTheHacker. I too thought of a similar approach but haven't seen the code yet.
I'll try to see if I can find some time over the weekends to contribute a fix.
@catthehacker commented on GitHub (Jan 21, 2021):
@cplee Hi, could you re-open this issue, please? I think it's something worth tracking and eventually working on.
@AndrewSav commented on GitHub (Sep 23, 2021):
If this is not in the scope of this issue, I can open a separate one:
Currently, when running `act`, it attempts to run as many actions in parallel as it can. Given that mine is a dev machine, this creates unbearable resource congestion; I simply do not have the memory to run a dozen actions at the same time. Even GitHub has a queueing facility when there are too many. Yet I cannot find a control in `act` to limit this parallelism. I would like to be able to say not to run more than X actions at the same time, because I know that my machine will not have the memory to fit more.

This seems to fit into "Possible improvements on parallelism", but I'm happy to copy and paste this into a new issue. Also, if this is solved and I overlooked a feature, please let me know.
@catthehacker commented on GitHub (Sep 23, 2021):
Currently it's not possible to limit how many parallel jobs/workflows run.
@catthehacker commented on GitHub (Sep 24, 2021):
#823 adds basic support for limiting concurrency. It does not behave like GitHub, but it should not spawn hellish amounts of containers/jobs :)