mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 05:16:00 +03:00
[GH-ISSUE #2422] s3fs fuse hangs EC2 node very frequently #1193
Originally created by @uareddy on GitHub (Feb 25, 2024).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2422
Hi,
I am using s3fs as a mount on an AWS EC2 instance, which we use as an SFTP server.
We receive 400+ files daily; the maximum number of concurrent files is around 100.
The file sizes range from 10GB to 300GB, all in binary format.
I am using s3fs version 1.93.
The mount command we are using:
sudo s3fs $sftp_bucket_name:/incoming/ /var/sftp/incoming/ \
  -o allow_other -o curldbg -o dbglevel=info \
  -o max_background=1000 \
  -o max_stat_cache_size=100000 -o stat_cache_expire=900 -o stat_cache_interval_expire=600 \
  -o multipart_size=512 -o parallel_count=30 -o multireq_max=30 \
  -o complement_stat -o compat_dir \
  -o readwrite_timeout=900 -o connect_timeout=900 \
  -o ensure_diskfree=512 -o nonempty \
  -o endpoint=eu-west-2 -o use_sse=kmsid:$kms_key -o iam_role=$sftp_iam_role \
  -o umask=0007,uid=$user_id
We increased the /tmp size to 100GB.
The same configuration works perfectly for some days, and then suddenly one day it stops and the EC2 node also hangs.
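Since s3fs stages temporary files under /tmp when no cache directory is configured, one way to check whether /tmp filling up coincides with the hangs is a small free-space watchdog like the sketch below. The path and threshold are illustrative assumptions, not s3fs options:

```shell
#!/bin/sh
# Hypothetical watchdog: report free space in the directory s3fs uses for
# temporary files. A full /tmp is one plausible way transfers could stall.
# TMP_PATH and MIN_FREE_MB are illustrative values, not s3fs settings.
TMP_PATH="${TMP_PATH:-/tmp}"
MIN_FREE_MB="${MIN_FREE_MB:-1024}"

# df -Pm prints sizes in 1MB blocks; column 4 is the available space.
free_mb=$(df -Pm "$TMP_PATH" | awk 'NR==2 {print $4}')

if [ "$free_mb" -lt "$MIN_FREE_MB" ]; then
    echo "WARNING: only ${free_mb}MB free in $TMP_PATH" >&2
    exit 1
fi
echo "OK: ${free_mb}MB free in $TMP_PATH"
```

Running this from cron every minute and logging the result would at least show whether disk exhaustion precedes each hang.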
The log we observed at that time is:
Feb 24 05:37:46 ip-100-96-131-182 kernel: Call Trace:
Feb 24 05:37:46 ip-100-96-131-182 kernel: __schedule+0x28e/0x890
Feb 24 05:37:46 ip-100-96-131-182 kernel: schedule+0x28/0x80
Feb 24 05:37:46 ip-100-96-131-182 kernel: request_wait_answer+0x125/0x1f0 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: ? finish_wait+0x80/0x80
Feb 24 05:37:46 ip-100-96-131-182 kernel: __fuse_request_send+0x7f/0x90 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: fuse_simple_request+0xbd/0x190 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: fuse_do_getattr+0x106/0x310 [fuse]
Feb 24 05:37:46 ip-100-96-131-182 kernel: vfs_statx+0x89/0xe0
Feb 24 05:37:46 ip-100-96-131-182 kernel: SYSC_newlstat+0x39/0x70
Feb 24 05:37:46 ip-100-96-131-182 kernel: do_syscall_64+0x67/0x110
Feb 24 05:37:46 ip-100-96-131-182 kernel: entry_SYSCALL_64_after_hwframe+0x5e/0xc3
Feb 24 05:37:46 ip-100-96-131-182 kernel: RIP: 0033:0x7f4ae0dcee05
Feb 24 05:37:46 ip-100-96-131-182 kernel: RSP: 002b:00007f4ab27f80f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
Feb 24 05:37:46 ip-100-96-131-182 kernel: RAX: ffffffffffffffda RBX: 00007f4ab27f9270 RCX: 00007f4ae0dcee05
Feb 24 05:37:46 ip-100-96-131-182 kernel: RDX: 00007f4ab27f8130 RSI: 00007f4ab27f8130 RDI: 00007f4ab27f8270
Feb 24 05:37:46 ip-100-96-131-182 kernel: RBP: 00007f4ab27f8200 R08: 00007f4a98096690 R09: 00007f4ab27f7f80
Feb 24 05:37:46 ip-100-96-131-182 kernel: R10: 00007f4ab27f80b0 R11: 0000000000000246 R12: 00007f4ab27f8270
Feb 24 05:37:46 ip-100-96-131-182 kernel: R13: 00007f4a9809669a R14: 00007f4a980966a2 R15: 00007f4ab27f8282
Feb 24 05:39:46 ip-100-96-131-182 kernel: INFO: task oneagentos:5895 blocked for more than 120 seconds.
Feb 24 05:39:46 ip-100-96-131-182 kernel: Not tainted 4.14.336-256.559.amzn2.x86_64 #1
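The "blocked for more than 120 seconds" message above comes from the kernel's hung-task watchdog. When it fires again, commands along these lines may help identify which processes are stuck waiting on the FUSE mount (these are standard Linux interfaces, not s3fs-specific; the sysrq command requires root):

```shell
#!/bin/sh
# Diagnostics sketch for hung-task reports like the one above.

# The watchdog threshold (the "120 seconds") is configurable:
if [ -r /proc/sys/kernel/hung_task_timeout_secs ]; then
    echo "hung_task timeout: $(cat /proc/sys/kernel/hung_task_timeout_secs)s"
fi

# List tasks in uninterruptible sleep (D state), typically the ones
# blocked inside FUSE requests:
ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'

# As root, dump kernel stack traces of all blocked tasks to dmesg:
#   echo w > /proc/sysrq-trigger
```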
We have another, empty mount at /var/sftp/home where we did not place any objects.
Is there anything you would suggest changing in the mount parameters?
Please let me know if you need any further info.
@ggtakec commented on GitHub (Apr 14, 2024):
@uareddy I'm sorry for my late reply.
Is this problem still occurring?
If it hangs, you may not be able to get the s3fs log at that time.
(If we can see it, it will be helpful for solving this problem.)
It is difficult to determine the cause, but is it possible that the disk holding the cache directory (use_cache option) is full?
(If you can try it, you may be able to avoid this by periodically deleting cache files in a separate process.)
Also, s3fs v1.94 has been released. Could you try using it?
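A periodic cleanup in a separate process, as suggested, could look roughly like the sketch below. The path and age are purely illustrative: CACHE_DIR would have to match whatever use_cache points at (note the mount command in this issue does not actually set use_cache):

```shell
#!/bin/sh
# Hypothetical periodic cleanup for an s3fs cache directory, e.g. run from
# cron. CACHE_DIR and MAX_AGE_MIN are example values, not s3fs defaults.
CACHE_DIR="${CACHE_DIR:-/var/cache/s3fs}"
MAX_AGE_MIN="${MAX_AGE_MIN:-60}"

# Remove cache files that have not been accessed for MAX_AGE_MIN minutes.
if [ -d "$CACHE_DIR" ]; then
    find "$CACHE_DIR" -type f -amin +"$MAX_AGE_MIN" -delete
fi
```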