mirror of
https://github.com/nektos/act.git
synced 2026-04-26 01:15:51 +03:00
[GH-ISSUE #158] Possible improvements on parallelism #101
Originally created by @aidansteele on GitHub (Mar 16, 2020).
Original GitHub issue: https://github.com/nektos/act/issues/158
Background
I'll start with an example from one of my projects. Here's the graph rendered by `act -l`. In this case, `deployfrontend` needs `frontend`, and `deploy` needs `backend`.

My understanding of the current state

Right now, it appears that `act` forms a dependency graph of workflows, jobs, steps, matrices, etc. These are then transformed into `model.Plan`, `model.Stage` and `model.Run` structs. A `model.Run` is effectively an instance of a job -- it isn't a 1:1 mapping, because matrices mean there could be anywhere from 0 to n instances of the same job, each with different variables. All runs within a stage are run in parallel, and a stage is only complete when all of its runs have completed. At that point, execution can proceed to the next stage.

For stage N (where N is the zero-based order in which stages run), every run in that stage has a longest chain of transitive dependencies N jobs long. So runs with no dependencies are all in stage 0, runs that depend only on stage-0 runs are in stage 1, and so on.

Is this correct?
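The stage assignment described above could be sketched as follows. This is an illustrative reconstruction of the behaviour as I understand it, not act's actual code; the function and job names are made up:

```go
package main

import "fmt"

// stageOf returns the zero-based stage of a job: the length of its
// longest chain of transitive dependencies. Jobs with no dependencies
// land in stage 0, jobs depending only on stage-0 jobs land in stage 1, etc.
func stageOf(job string, needs map[string][]string, memo map[string]int) int {
	if s, ok := memo[job]; ok {
		return s
	}
	stage := 0
	for _, dep := range needs[job] {
		if d := stageOf(dep, needs, memo) + 1; d > stage {
			stage = d
		}
	}
	memo[job] = stage
	return stage
}

func main() {
	// The example graph from above.
	needs := map[string][]string{
		"frontend":       {},
		"backend":        {},
		"certs":          {},
		"deployfrontend": {"frontend"},
		"deploy":         {"backend"},
	}
	memo := map[string]int{}
	for _, job := range []string{"frontend", "backend", "certs", "deployfrontend", "deploy"} {
		fmt.Printf("%s -> stage %d\n", job, stageOf(job, needs, memo))
	}
}
```

Under this grouping, `deployfrontend` lands in stage 1, so it cannot start until all of stage 0 -- including `backend` and `certs`, which it doesn't actually need -- has finished.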
Proposal / discussion
Right now `deployfrontend` won't run until `frontend`, `backend` and `certs` have all completed. In my case, `frontend` completes much faster than `backend`. I would love for `deployfrontend` to start execution as soon as `frontend` has completed, even if `backend` is still executing.

I think this would mean eliminating the concept of stages entirely. Can I ask why they were introduced? Without diving into it in real depth, it feels like the existing abstractions around `common.Executor` would be sufficient to model the DAG needed to represent execution, so I'm not sure why `model.Stage` is there. I feel like I can't confidently suggest eliminating stages without understanding what problem they are currently solving.

One other random point: maybe it's not an issue in practice, but I feel it might also be useful to limit the number of actively executing runs to the number of CPUs present. Thoughts on that?

I'd be happy to submit a PR, or at least ideas for a PR proposal, if you are happy with the idea of refactoring to allow runs to start independently of each other, but I feel I should get some feedback first.
@aidansteele commented on GitHub (Mar 16, 2020):
Sorry for the verbosity 🤯
@cplee commented on GitHub (Apr 16, 2020):
@aidansteele - sorry for the delay, love the verbosity.
So, the idea of stages was actually a carryover from the original design of GitHub Actions, before it was rewritten on Azure DevOps. I'm totally onboard with a refactor to improve performance, and I'm also interested in the idea of a `parallelism` flag to limit how many containers are running at once.
@github-actions[bot] commented on GitHub (Jun 16, 2020):
Issue is stale and will be closed in 7 days unless there is new activity
@hashhar commented on GitHub (Jan 20, 2021):
Hi, thanks for creating this awesome tool.
I ran into this issue for different reasons. For projects which have a large number of independent jobs (or maybe a large matrix), `act` tries to run all of them at the same time. This is not feasible for a workflow with a sufficiently large number of jobs. It'd be great if there were some mechanism to limit the number of parallel "actions" running at once.

We can limit it somewhat by running each job serially, but that isn't possible in the case of large matrices.
@catthehacker commented on GitHub (Jan 20, 2021):
I have an idea of how to make this work without a major refactor, by using waitgroups, but I haven't yet been able to test it or make it happen due to lack of time.
Unless someone with more time and Go knowledge chimes in, I'd recommend scripting around `act` and running each job after the previous one finishes.
@hashhar commented on GitHub (Jan 21, 2021):
Thanks @CatTheHacker. I too thought of a similar approach but haven't seen the code yet.
I'll try to see if I can find some time over the weekends to contribute a fix.
@catthehacker commented on GitHub (Jan 21, 2021):
@cplee Hi, could you re-open this issue, please? I think it's something worth tracking and eventually working on.
@AndrewSav commented on GitHub (Sep 23, 2021):
If this is not in the scope of this issue, I can open a separate one:
Currently, when running `act`, it attempts to run as many actions in parallel as it can. Given that mine is a dev machine, this creates unbearable resource congestion; I simply do not have the memory to run a dozen actions at the same time. Even GitHub has a queueing facility when there are too many. Yet I cannot find a control in `act` to limit this parallelism. I would like to be able to say not to run more than X actions at the same time, because I know that my machine will not have the memory to fit more.

This seems to fit into "Possible improvements on parallelism", but I'm happy to copy and paste this into a new issue. Also, if this is solved and I overlooked a feature, please let me know.
@catthehacker commented on GitHub (Sep 23, 2021):
Currently it's not possible to limit how many parallel jobs/workflows run.
@catthehacker commented on GitHub (Sep 24, 2021):
#823 adds basic support for limiting concurrency. It does not behave like GitHub, but it should not spawn hellish amounts of containers/jobs :)