mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #506] Repeatedly dropping mount: Transport endpoint is not connected. #281
Originally created by @ruffle-b on GitHub (Nov 21, 2016).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/506
I have four s3 buckets mounted and one keeps getting 'stuck'. ls shows the mountpoint as follows:
and any attempt to access it returns a "Transport endpoint is not connected" error. AFAICS it's only one bucket that's doing this, but there is regular access to the content of that bucket, so it could be usage-related. I can still access the content of my other s3fs-mounted buckets.
Running with debug in the foreground I get:
The last few lines of the output log are:
This happened four times yesterday. Unmounting and remounting makes it work for a while.
s3fs version 1.80, Ubuntu 16.04.1 LTS
I'm very happy to test, capture debug etc.
@substa commented on GitHub (Dec 15, 2016):
Same problem here.
As a temporary workaround I'm using a script to check whether the mount directory is writable, and to remount it if necessary.
I'm also logging the failures; in the last 3 days this error occurred 6 times (at seemingly random intervals).
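A writability-check-and-remount script of the kind described above could look like the following sketch. The mountpoint path, bucket name, and s3fs options here are placeholders, not details from the thread; adjust them to your setup before use.

```shell
#!/bin/sh
# Watchdog sketch: probe the mountpoint for writability and flag it for
# remounting when the probe fails.

is_writable() {
    # On a stale s3fs mount even touch fails with
    # "Transport endpoint is not connected".
    probe="$1/.s3fs_probe.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

MOUNTPOINT="${MOUNTPOINT:-/mnt/s3bucket}"
if is_writable "$MOUNTPOINT"; then
    echo "ok: $MOUNTPOINT is writable"
else
    echo "not writable, remount needed: $MOUNTPOINT"
    # Uncomment for real use:
    # fusermount -u "$MOUNTPOINT"
    # s3fs mybucket "$MOUNTPOINT" -o allow_other
fi
```

Run from cron every minute or so; logging each failure (as described above) also gives a record of how often the mount drops.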
@rkroboth commented on GitHub (Jan 3, 2017):
I did some testing on how long it takes to write 300 10kb files:
It makes some sense to me: under the hood s3fs is just hitting the S3 HTTP API, so it should be much slower than a local filesystem. I am just wondering if this slowness opens up opportunities for file system resources to max out (file handles, connections, and so on), causing these hangs some people see?
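A rough version of the write test described above can be scripted as follows. TARGET defaults to a local temp directory so the script runs anywhere; point it at an s3fs mountpoint to reproduce the comparison (the path is an assumption, not from the thread).

```shell
#!/bin/sh
# Time how long it takes to write 300 files of 10 KB each.
TARGET="${TARGET:-$(mktemp -d)}"
COUNT=300

start=$(date +%s)
i=0
while [ "$i" -lt "$COUNT" ]; do
    # one 10 KB file per iteration
    dd if=/dev/zero of="$TARGET/file_$i" bs=1024 count=10 2>/dev/null
    i=$((i + 1))
done
end=$(date +%s)
echo "wrote $COUNT files in $((end - start))s to $TARGET"
```

Running it once against local disk and once with TARGET set to the s3fs mountpoint makes the per-file HTTP overhead visible.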
@rkroboth commented on GitHub (Jan 4, 2017):
Actually did some further testing, just doing a bunch of "reads" from a single bucket with about 1000 10kb files, via a single s3fs mount point. The reads were done using a PHP script, which forks off 15 children that then proceed to randomly read files in the bucket. I did this on an m4.xlarge Amazon Linux instance, and achieved a rate of about 300 files read per second. However, invariably, after between 1 and 10 minutes, the mount point suddenly goes away and I get the error you mention, "Transport endpoint is not connected." So there definitely seems to be an issue with the mount point intermittently disappearing.
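A shell sketch of this kind of concurrent-read stress test, using xargs -P in place of PHP forking. DIR defaults to a local temp directory seeded with small stand-in files so the script runs anywhere; pointing it at an s3fs mountpoint (an assumption about your setup) reproduces the test described above.

```shell
#!/bin/sh
# Concurrent-read stress test: read every file in DIR, 15 at a time.
DIR="${DIR:-$(mktemp -d)}"

# Seed stand-in files if the directory is empty.
if [ -z "$(ls "$DIR")" ]; then
    for i in $(seq 1 50); do
        head -c 10240 /dev/zero > "$DIR/f$i"
    done
fi

# xargs -P runs up to 15 cat processes in parallel; data is discarded.
ls "$DIR" | xargs -P 15 -I{} cat "$DIR/{}" > /dev/null
echo "read $(ls "$DIR" | wc -l) files with up to 15 parallel readers"
```

Looping this and timestamping each pass gives a measure of how long the mount survives under sustained parallel reads.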
@gaul commented on GitHub (Jan 5, 2017):
@rkroboth Could you re-run your test with the logging flags -d -d -f -o f2 -o curldbg and see if you can reproduce your symptoms? This will help us diagnose the issue.
@ggtakec commented on GitHub (Jan 15, 2017):
@andrewgaul Thanks for your help.
@rkroboth @substa @ruffle-b I'm sorry for my late reply.
If you can, please try running s3fs with the readwrite_timeout (and connect_timeout) options.
If this problem depends on a timeout, the result will change with these options.
Also, please try the latest code in the master branch, which fixes a bug related to multi-process access.
Thanks in advance for your assistance.
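For reference, a mount invocation with the timeout options suggested above might look like this. The bucket name, mountpoint, and the specific timeout values are illustrative placeholders, not recommendations from the thread.

```shell
# Mount with explicit read/write and connect timeouts (values in seconds).
s3fs mybucket /mnt/s3bucket \
    -o readwrite_timeout=120 \
    -o connect_timeout=300
```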
@ggtakec commented on GitHub (Jan 15, 2017):
#152 is the same issue
@gkrinc commented on GitHub (May 23, 2017):
This started happening recently for us as well. We have an ansible task to get and make s3fs-fuse. Based on our deployment history I think this may be related to v1.81/1.82. The problem started happening this weekend shortly after provisioning new servers and therefore a new version of s3fs-fuse. Before that we would have been running v1.80.
Others have been experiencing this issue long before v1.81/1.82 though...
We haven't made any changes in our AWS environment related to permissions, bucket policies, etc.
I think we're going to try rolling back to v1.80 to see if it corrects the problem.
@b0ku1 commented on GitHub (Aug 11, 2017):
@gkrinc Did rolling back fix the problem? Please let me know. Thanks in advance!
@baregawi commented on GitHub (Jan 15, 2019):
If this is only happening intermittently, try increasing the number of retries: -o retries=4 or higher.
@ggtakec commented on GitHub (Mar 30, 2019):
We have kept this issue open for a long time.
Is this problem still occurring?
We are launching a new version, 1.86, which fixes some bugs.
Please use the latest version.
I will close this, but if the problem persists, please reopen it or post a new issue.
If you encounter problems with s3fs as well, try using the dbglevel, -d, curldbg or similar options to print out the log. It contains information for the solution.
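The debug options mentioned at several points in this thread can be combined in a single foreground invocation, with output captured to a file. Bucket and mountpoint names below are placeholders.

```shell
# Run s3fs in the foreground (-f) with maximum debug output (-d -d),
# informational message level, and libcurl debugging, logging everything.
s3fs mybucket /mnt/s3bucket \
    -f -d -d \
    -o dbglevel=info -o curldbg \
    > /tmp/s3fs-debug.log 2>&1
```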
@Findus1 commented on GitHub (Mar 30, 2019):
Please do not close this issue until it has been confirmed fixed.
This issue has been open for a long time because the issue has been present for a long time - for several years at least, I have had to regularly restart my S3FS mounts. I have tried every fix in this thread, and note that there are many other people with this issue.
V1.86 has not yet been released, but I will upgrade to 1.85 and let you know in a few weeks if the problem persists. Please don't close this issue without testing.
@ggtakec commented on GitHub (Mar 30, 2019):
@Findus1 I'm sorry, and thanks for the ping.
I have reopened this issue, and we hope to hear your results.
Thanks in advance for your help.
@Findus1 commented on GitHub (Mar 30, 2019):
Thanks Takeshi,
Sorry if my reply sounded a bit grumpy - the endpoint dropping has been a
pain for years. Thanks for reopening and for your work on it. I'll post
some testing results here in a few weeks.
@gaul commented on GitHub (Jul 9, 2019):
@Findus1 Transport endpoint not connected means that s3fs has exited which could be for any number of reasons. It would be better if you opened a new issue with as much context as possible, including the log flags I suggested above. Please test against the latest version 1.85 or master.
@gaul commented on GitHub (Feb 3, 2020):
Closing due to inactivity. Please reopen if symptoms persist.