Mirror of https://github.com/amidaware/tacticalrmm.git (synced 2026-04-26 23:15:57 +03:00)
[GH-ISSUE #231] Instability (requiring a daily reboot) since 0.2.20 update (now on 0.2.21) #2088
Originally created by @rtwright68 on GitHub (Jan 5, 2021).
Original GitHub issue: https://github.com/amidaware/tacticalrmm/issues/231
Running a VMware VM (2 CPUs, 8GB RAM, 250GB storage) on Ubuntu.
475 agents total. Having issues taking control, and then agents lose contact.
A reboot fixes it for about another day (we rebooted around the same time yesterday).
CPU and RAM usage look normal; vdisk read/write latency looks good.
df shows:
Filesystem 1K-blocks Used Available Use% Mounted on
udev 4032736 0 4032736 0% /dev
tmpfs 815336 1248 814088 1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 127974628 19797204 101633656 17% /
tmpfs 4076660 276 4076384 1% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 4076660 0 4076660 0% /sys/fs/cgroup
/dev/sda2 999320 202092 728416 22% /boot
/dev/loop0 56704 56704 0 100% /snap/core18/1932
/dev/loop1 73088 73088 0 100% /snap/lxd/16099
/dev/loop2 56832 56832 0 100% /snap/core18/1944
/dev/loop4 31872 31872 0 100% /snap/snapd/10707
/dev/loop3 31872 31872 0 100% /snap/snapd/10492
/dev/loop5 69376 69376 0 100% /snap/lxd/18150
tmpfs 815332 0 815332 0% /run/user/1000
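Aside from the snap loop devices, which are read-only squashfs images and always report 100% by design, nothing above is near full. One way to sanity-check that quickly is a small awk pass over `df -P` that skips the loop mounts; a minimal sketch (run here against a few captured lines from the output above so it is self-contained; on the server you would pipe live `df -P` into the same awk program):

```shell
# Flag real filesystems above 80% usage, ignoring /dev/loop snap mounts.
# The sample below reuses lines from the df output in this issue.
sample='Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/ubuntu--vg-ubuntu--lv 127974628 19797204 101633656 17% /
/dev/loop0 56704 56704 0 100% /snap/core18/1932
tmpfs 4076660 276 4076384 1% /dev/shm'
nearly_full=$(printf '%s\n' "$sample" | awk 'NR > 1 && $1 !~ /^\/dev\/loop/ {
  sub(/%/, "", $5)                 # strip the percent sign from Use%
  if ($5 + 0 > 80) print $6        # report the mount point
}')
echo "over 80%: ${nearly_full:-none}"
```

With this data set nothing trips the threshold, which matches the symptom report: disk space is not the problem.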
Not sure what else to check. It was rock solid on 0.2.18.
@azulskyknight commented on GitHub (Jan 5, 2021):
Is the only issue with take control? If so, perhaps you should stop rebooting and try the big red recover connection button at the top?
Mesh was updated in 0.2.20, which causes all the mesh agents to update, and sometimes they go a bit nuts and Tactical has to track the "new" mesh agents down again. If you log into Mesh itself, you'll see duplicated agents when this happens.
But again that's just mesh issues. If the RMM agents themselves are dropping that's a different issue.
@rtwright68 commented on GitHub (Jan 5, 2021):
Definitely tried the recover agent, and that has helped in some cases. The odd thing that triggers the need to reboot is that communication between the agents and the server starts dropping; rebooting fixes that.
@bbrendon commented on GitHub (Jan 5, 2021):
How many total agents do you have? Have you looked around in /var/log/* for problems?
@rtwright68 commented on GitHub (Jan 6, 2021):
We now have 478 agents. Will look through /var/log/ to see if anything obvious turns up.
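For a "server goes unresponsive until reboot" pattern, one thing worth grepping those logs for is the kernel OOM killer. A hedged sketch of that sweep (the commands run here against a synthetic sample log so the example is self-contained; on the server you would point them at /var/log/syslog or /var/log/kern.log instead, which are the default Ubuntu locations):

```shell
# Count lines that look like OOM-killer activity in a log file.
# Sample log is synthetic, for demonstration only.
log=$(mktemp)
cat > "$log" <<'EOF'
Jan  5 03:12:01 rmm kernel: Out of memory: Kill process 1234 (uwsgi) score 901
Jan  5 03:12:01 rmm kernel: uwsgi invoked oom-killer: gfp_mask=0x100cca
Jan  5 09:00:00 rmm systemd[1]: Starting Daily apt upgrade and clean activities...
EOF
oom_count=$(grep -ciE 'out of memory|oom-killer' "$log")
echo "possible OOM lines: $oom_count"
rm -f "$log"
```

A nonzero count would point at memory pressure rather than a network or Mesh problem, which would also fit the "reboot fixes it for a day" symptom.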
@wh1te909 commented on GitHub (Jan 7, 2021):
What exactly is dropping? Just mesh communication, or are agents showing offline in the Tactical UI?
Also, what model is your CPU? 2 CPUs seems very low for 478 agents.
@rtwright68 commented on GitHub (Jan 7, 2021):
We have seen both. It's a VMware VM, so I will boost the CPU count. The CPUs on the ESXi hosts are: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
@rtwright68 commented on GitHub (Jan 11, 2021):
TRMM worked great over the weekend, but it still appears to be suffering from the memory leak issue.
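To confirm a leak and identify which service is leaking, one could periodically snapshot the top memory consumers and compare RSS between samples taken hours apart; a process whose RSS only ever grows is the suspect. A minimal sketch (no service names assumed, just whatever ps reports; `--sort` is a procps/Linux option):

```shell
# Snapshot the five largest processes by resident memory (RSS, in KiB).
# Save the output with a timestamp and diff successive snapshots to spot a leak.
top_mem=$(ps -eo pid,rss,comm --sort=-rss | head -n 6)
printf '%s\n' "$top_mem"
```

Running this from cron every hour and keeping the output would show, after a day, whether one process (e.g. a Django worker or Mesh) accounts for the growth.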
@wh1te909 commented on GitHub (Jan 17, 2021):
Please upgrade to the latest version and check the 0.3.0 release notes for the migration guide, then let me know if there are still issues.