[GH-ISSUE #2356] After running for several days, the error "Transport endpoint is not connected" frequently occurs. #1161

Open
opened 2026-03-04 01:51:49 +03:00 by kerem · 6 comments

Originally created by @jestiny0 on GitHub (Oct 22, 2023).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2356

Additional Information

Version of s3fs being used (s3fs --version)

s3fs --version

Amazon Simple Storage Service File System V1.85(commit:unknown) with OpenSSL
Copyright (C) 2010 Randy Rizun <rrizun@gmail.com>
License GPL2: GNU GPL version 2 <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse or dpkg -s fuse)

rpm -qi fuse

Name        : fuse
Version     : 2.9.2
Release     : 11.amzn2
Architecture: aarch64
Install Date: Fri Oct 20 07:01:34 2023
Group       : System Environment/Base
Size        : 370377
License     : GPL+
Signature   : RSA/SHA256, Thu Dec  6 19:31:45 2018, Key ID 11cf1f95c87f5b1a
Source RPM  : fuse-2.9.2-11.amzn2.src.rpm
Build Date  : Fri Nov 16 20:36:10 2018
Build Host  : build.amazon.com
Relocations : (not relocatable)
Packager    : Amazon Linux
Vendor      : Amazon Linux
URL         : https://github.com/libfuse/libfuse
Summary     : File System in Userspace (FUSE) utilities
Description :
With FUSE it is possible to implement a fully functional filesystem in a
userspace program. This package contains the FUSE userspace tools to
mount a FUSE filesystem.

Kernel information (uname -r)

5.10.196-185.743.amzn2.aarch64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

bash-4.2# cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"

How to run s3fs, if applicable

mkdir /mnt-s3 && echo mypassword > /passwd-s3fs && chmod 600 /passwd-s3fs && \
    s3fs mys3bucket /mnt-s3 -o passwd_file=/passwd-s3fs -o stat_cache_expire=30 -o nosscache -o nodnscache

Details about issue

My online service mounts an AWS S3 bucket using the s3fs command and runs stably for a period of time. However, after a few days, a large number of "Transport endpoint is not connected" exceptions occur, and the only fix is to restart the service. I reviewed previous issues and found that someone resolved this problem by adding the -o nodnscache option. This option did solve the problem for a while, but recently the issue has recurred. Is there a better solution available?
Note: I have several other online services that use the same command and are currently running stably.
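
On a FUSE mount, "Transport endpoint is not connected" typically means the userspace daemon behind the mount point (here, the s3fs process) is no longer running. A minimal sketch of a probe for that state follows. The function name `mount_is_healthy` and the 10-second limit are assumptions for illustration, not part of s3fs, and the probe cannot tell a dead mount from a path that simply does not exist.

```shell
#!/bin/bash
# mount_is_healthy: hypothetical helper, not part of s3fs.
# Succeeds if the given path can be stat'ed, fails otherwise.
# On a dead FUSE mount, stat fails with ENOTCONN
# ("Transport endpoint is not connected").
mount_is_healthy() {
    if command -v timeout >/dev/null 2>&1; then
        # Bound the probe so a hung mount cannot block the caller forever.
        timeout 10 stat "$1" >/dev/null 2>&1
    else
        stat "$1" >/dev/null 2>&1
    fi
}
```

Calling `mount_is_healthy /mnt-s3 || echo "mount looks dead"` from a monitoring job would flag the condition without touching any file in the bucket.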


@jestiny0 commented on GitHub (Nov 1, 2023):

Can someone please help take a look at this problem? It keeps happening every few days lately. Thank you so much in advance


@nguyenminhdungpg commented on GitHub (Nov 3, 2023):

@jestiny0 Hi, I also get the error "Transport endpoint is not connected" quite often, but I cannot figure out exactly when it happens. So I apply a workaround that may also help you.
I upload a dummy text file to the bucket, e.g. named "s3fs_connection_status.txt", with simple content like "s3fs connection status". Then, in the VM, I create a cronjob that runs every 5 minutes, cats the file in the mounted folder, and checks whether the content is "s3fs connection status". If it is not, I remount the bucket by unmounting and mounting it again.

The script that creates and uploads the marker file looks like:

echo 's3fs connection status' > ./s3fs_connection_status.txt
aws s3 cp ./s3fs_connection_status.txt s3://my-bucket/  --profile s3-profile

The cronjob setup looks like:
*/5 * * * * root run-one ./check_s3fs_connection_and_fix.sh >> /var/log/s3fs-mountpoint-status.log

The cronjob script file check_s3fs_connection_and_fix.sh looks like:

#!/bin/bash
# Read the marker file through the mount; on a dead mount cat fails and this stays empty.
sync_content=$(cat ./my-mount-point/s3fs_connection_status.txt 2>/dev/null)

if [[ "$sync_content" == "s3fs connection status" ]]
then
    echo "status: OK"
else
    echo "status: DISCONNECTED"
    echo "Starting to remount"
    # Unmount the stale mount point, then mount the bucket again.
    umount ./my-mount-point
    s3fs my-bucket ./my-mount-point -o passwd_file=${HOME}/.passwd-s3fs -o ......
    echo "Remount Finished"
fi

In this script, you can add a timestamp to each echo message and set up an observability stack to trace the s3fs log and the cronjob log. Based on the logs, you can add alerts to make disconnections noticeable.
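
The timestamp suggestion above could be sketched as follows. The helper names `log` and `check_marker` are hypothetical and chosen for illustration; the remount step is deliberately left out, so the caller decides what to do when the check fails.

```shell
#!/bin/bash
# log: hypothetical helper that prefixes a message with a UTC timestamp.
log() {
    printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*"
}

# check_marker: compares the content of a marker file with the expected text.
# $1 = path to the marker file inside the mount, $2 = expected content.
# Returns 0 and logs "status: OK" on a match, otherwise logs
# "status: DISCONNECTED" and returns 1.
check_marker() {
    local marker_file="$1" expected="$2" content
    # An unreadable file (for example on a dead mount) leaves content empty.
    content=$(cat "$marker_file" 2>/dev/null)
    if [ "$content" = "$expected" ]; then
        log "status: OK"
    else
        log "status: DISCONNECTED"
        return 1
    fi
}
```

With this in place, the cron script could call `check_marker ./my-mount-point/s3fs_connection_status.txt "s3fs connection status"` and run the remount only when it returns non-zero, so every line in the log file carries the time of the check.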


@jestiny0 commented on GitHub (Nov 7, 2023):

@nguyenminhdungpg
Thank you for your suggestion. I plan to adopt an approach similar to yours, periodically checking and remounting. However, I still hope that the project can provide a better, official fix.


@nguyenminhdungpg commented on GitHub (Nov 7, 2023):

@jestiny0 I hope so too, but for now the workaround does its job quite well. Sometimes I get a Slack notification that the bucket has just been remounted: maybe twice in one random night, maybe not at all for 3 weeks...


@alphaxvzf commented on GitHub (Apr 4, 2024):

I had the same problem: after a while, the s3fs-mounted directory got disconnected. In my case, the cause was that the s3fs mount used all the free space on the device for its cache. Cleaning the cache or rebooting the device (in my case, the cache was in /tmp, which is cleaned on reboot) solves the problem for a while, so it is better to either avoid using the s3fs cache or use a larger disk for it.
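
To see whether a full disk is involved, one option is to watch the free space on the filesystem that holds the cache. The sketch below assumes the cache lives under /tmp, as in the comment above. The helper names `free_kb` and `cache_has_room` and the threshold are hypothetical.

```shell
#!/bin/bash
# free_kb: prints the free space, in 1 KiB blocks, of the filesystem
# that holds the given directory (column 4 of POSIX-format df output).
free_kb() {
    df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# cache_has_room: hypothetical helper.
# $1 = cache directory, $2 = minimum free space in KiB.
# Succeeds only when at least that much space is free.
cache_has_room() {
    local cache_dir="$1" min_kb="$2" avail
    avail=$(free_kb "$cache_dir")
    # An empty result means df could not inspect the directory.
    [ -n "$avail" ] && [ "$avail" -ge "$min_kb" ]
}
```

A cron job could run `cache_has_room /tmp 1048576 || echo "less than 1 GiB free for the s3fs cache"` and feed the message into the same log as the remount check. The s3fs man page also lists an `ensure_diskfree` option for reserving free space; whether it prevents the disconnect described here has not been verified in this thread.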


@Allan-Nava commented on GitHub (Jan 21, 2025):

Any good solutions?
