mirror of
https://github.com/amidaware/rmmagent.git
synced 2026-04-26 06:45:48 +03:00
[GH-ISSUE #1] Linux Agent goes offline #3
Originally created by @ryszard-suchocki on GitHub (Mar 21, 2022).
Original GitHub issue: https://github.com/amidaware/rmmagent/issues/1
Originally assigned to: @wh1te909 on GitHub.
Hi,
I'm testing the community beta Linux Agent for TRMM. I want to report that after a while the Linux agent goes offline (its status changes to offline), although the checks work fine. It is also possible to invoke remote commands, etc., so there is communication between the agent and the server. Could you verify on your side?
• Ubuntu 20.04 x86_64 5.4.0-104-generic • Agent v2.0.0
Temporarily, I'm running the agent by invoking
./rmmagent -m svc
Best regards
@dinger1986 commented on GitHub (Mar 21, 2022):
Where are your agents hosted?
@ryszard-suchocki commented on GitHub (Mar 21, 2022):
Could you clarify in simpler words? The whole setup works in a simple environment, in a LAN. The Linux Agent runs on a physical machine with "direct" access to TRMM. Other agents (Win) communicate fine (local and remote).
@dinger1986 commented on GitHub (Mar 21, 2022):
ok, I am having issues with amazon agents but fine for all others
@ryszard-suchocki commented on GitHub (Mar 21, 2022):
Could you elaborate on how you register agents? My approach was:
@wh1te909 commented on GitHub (Mar 21, 2022):
https://github.com/amidaware/tacticalrmm/blob/develop/api/tacticalrmm/core/agent_linux.sh this should help
@wh1te909 commented on GitHub (Mar 21, 2022):
you need to keep it running via systemd or something similar on your distro
@georgebarnick commented on GitHub (Mar 21, 2022):
Installed using the above script with code-signed agents. Working fine on an Ubuntu 20.04 test VM I made on my local VMware Workstation with no issues. Then deployed it on some AWS and Azure VMs I have (a mix of Ubuntu 20.04 and CentOS 7), and I'm having the issue described in the OP where they go offline a few minutes after running their first checks. The agents are running in systemd as suggested, and
systemctl restart tacticalagent.service
will bring them back to "online" status in the dashboard, but they slowly go back to offline again. Curious what to try next.
Edit: Further information about some examples of agents below
Agent that's working fine: Ubuntu 20.04 x86_64 5.4.0-105-generic • Agent v2.0.0
AWS Ubuntu agent that's going offline: Ubuntu 20.04 x86_64 5.13.0-1017-aws • Agent v2.0.0
Azure Ubuntu agent that's going offline: Ubuntu 20.04 x86_64 5.13.0-1017-azure • Agent v2.0.0
Azure CentOS agent that's going offline: Centos 7.9.2009 x86_64 3.10.0-1160.53.1.el7.x86_64 • Agent v2.0.0
Happy to provide any other troubleshooting information as-needed.
@wh1te909 commented on GitHub (Mar 21, 2022):
@georgebarnick please enable debug logging so we can see where it's getting stuck
modify
/etc/systemd/system/tacticalagent.service
and change
to
(add the
-log debug)
then
systemctl daemon-reload && systemctl restart tacticalagent
wait for agent to go offline then let's see what's in
/var/log/tacticalagent.log
@georgebarnick commented on GitHub (Mar 21, 2022):
@wh1te909 So far the only things in the log after the agent service restarts and goes through its checks and everything the first time is:
every few minutes
and
every second.
I installed with the --nomesh flag on most if not all of these VMs that are going offline. Not sure if that's going to be related to the agent going offline or a separate issue, but maybe @ryszard-suchocki can chime in if he has the Mesh Agent with his affected install or not. The reason I did --nomesh was that the install seemed to get stuck on the "Getting mesh node id" step on one of them, so I just decided to omit it from all of them. I could try to reinstall with the mesh agent if you need and have an idea on why it might have gotten stuck there. I'm no expert with MeshCentral yet so haven't troubleshot that myself.
@ryszard-suchocki commented on GitHub (Mar 21, 2022):
In my case, Mesh Agent had been installed before, separately from TRMM. I did not use the -nomesh parameter when "installing" TRMM. So I decided to remove my agent and "install" it by passing the -nomesh and -log debug parameters. Despite the -nomesh parameter, the log file got filled by:
so I decided to manually copy the meshagent executable to the specified folder (which did not exist and had to be created manually). Now the log looks like below and the agent status is correct; the last response time is updated correctly
@wh1te909 commented on GitHub (Mar 22, 2022):
thanks I will do some testing without mesh. The agent should still check in without mesh so that is probably a bug
@wh1te909 commented on GitHub (Mar 22, 2022):
so from my initial testing with --nomesh (been about 12 hours now on a few vms) I get that error in the logs about not finding the executable, which obviously is expected, but the agent continues to check in and doesn't freeze, which also is expected, so I'm still not sure why your agents are going offline. I have not tested on AWS or Azure though, I will do that today
@dinger1986 commented on GitHub (Mar 22, 2022):
I have found that some of the ones that were dying stay online after installing mesh, but some aren't staying online long enough to install mesh, or get stuck on Getting Mesh node ID..... It doesn't seem to be just AWS; it seems to be random machines, across CentOS and Ubuntu
@ryszard-suchocki commented on GitHub (Mar 22, 2022):
A few moments ago I removed the "mesh agent" executable from "/opt/tacticalmesh" and the issue occurred again. Would someone like to try my "installation" steps? I can share my builds and generated config for analysis. It is worth noting that in my case "mesh agent" still works in the background, as it was installed separately.
Edit: I have tried to run RMM Agent on my NAS (Asustor). The same behavior. Without "mesh agent" status changed to offline; when executable placed in "/opt/tacticalmesh" everything works fine.
@wh1te909 commented on GitHub (Mar 25, 2022):
@ryszard-suchocki yes please share your installation steps
I am still unable to reproduce, I have been testing for a few days now, with mesh, without mesh. On azure, AWS, hetzner etc. Not able to reproduce at all
@ryszard-suchocki commented on GitHub (Mar 26, 2022):
My env: Proxmox 6.X, agent build in Ubuntu 20.04 container (ubuntu-20.04-standard_20.04-1_amd64.tar.gz; Rel. 2021-04-05 13:09:49):
https://go.dev/dl/go1.17.8.linux-amd64.tar.gz && tar -C /usr/local/ -xzf go1.17.8.linux-amd64.tar.gz
wget https://github.com/amidaware/rmmagent/archive/refs/tags/v2.0.0.zip && apt install unzip
env CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags "-s -w"
rmmagent executable to destination host
Install:
-m install --api https://trmm.tld/ --client-id X --site-id X --agent-type server --auth a2c4e...XXXXXXXX)
./rmmagent -m install --api https://trmm.tld/ --client-id X --site-id X --agent-type server --auth a2c4e...XXXXXXXX -nomesh
@wh1te909 commented on GitHub (Mar 26, 2022):
@ryszard-suchocki please use the installation script that I linked to in a previous comment and see how that installs it and uses systemd to keep it running
@dinger1986 commented on GitHub (Mar 26, 2022):
also can you try send command and send df -h and see if it works?
Mine goes offline but can still send commands
@wh1te909 commented on GitHub (Mar 26, 2022):
ok all nevermind I found the bug, I forgot to spawn the function that attempts to sync the meshnodeid into its own goroutine so it basically hangs forever when mesh is not installed LOL. will push a fix shortly
@ryszard-suchocki commented on GitHub (Mar 27, 2022):
Fixed!