[GH-ISSUE #1037] s3fs crash with segfault due to invalid access null path. gdb backtrace attached. #570

Closed
opened 2026-03-04 01:46:49 +03:00 by kerem · 6 comments
Owner

Originally created by @jimhuaang on GitHub (Jun 13, 2019).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1037

Version of s3fs being used (s3fs --version)

V1.85

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

$ pkg-config --modversion fuse
2.9.2

$ dpkg -s fuse
Package: fuse
Status: install ok installed
Priority: optional
Section: utils
Installed-Size: 152
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: amd64
Version: 2.9.2-4ubuntu4.14.04.1
Depends: libc6 (>= 2.14), libfuse2 (= 2.9.2-4ubuntu4.14.04.1), adduser, mount (>= 2.19.1), sed (>= 4), udev | makedev
Conffiles:
/etc/fuse.conf 298587592c8444196833f317def414f2

Kernel information (uname -r)

3.19.0-26-generic

GNU/Linux Distribution, if applicable (cat /etc/os-release)

NAME="Ubuntu"
VERSION="14.04.3 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.3 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

s3fs command line used, if applicable

s3fs <bucket> <mount_point> -o parallel_count=20 -o hard_remove

/etc/fstab entry, if applicable

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

If you execute s3fs with the dbglevel and curldbg options, you can get detailed debug messages.

[test 1]
Jun 13 01:32:42 zwr04n06 s3fs[34689]: curl.cpp:RequestPerform(2266): HTTP response code 500, returning EIO. Body Text: <?xml version="1.0" encoding="UTF-8"?>#012<Error><Code>InternalError</Code>#012<Message>We encountered an internal error. Please try again.</Message><RequestId>eab4f8c66bec2ea7</RequestId><Resource /></Error>
Jun 13 01:32:42 zwr04n06 s3fs[34689]: curl.cpp:MultiPerform(4119): thread failed - rc(-5)
Jun 13 01:32:42 zwr04n06 s3fs[34689]: curl.cpp:RequestPerform(2266): HTTP response code 500, returning EIO. Body Text: <?xml version="1.0" encoding="UTF-8"?>#012<Error><Code>InternalError</Code>#012<Message>We encountered an internal error. Please try again.</Message><RequestId>3c31e41f597808e8</RequestId><Resource /></Error>
Jun 13 01:32:42 zwr04n06 s3fs[34689]: curl.cpp:MultiPerform(4119): thread failed - rc(-5)
Jun 13 01:32:56 zwr04n06 s3fs[34689]: curl.cpp:MultiRead(4190): failed a request(500: http://vbr-qsfs1.s3.alpha.stor.qingstor.me/xc/ss-xcuipa2q.img.lz40001?partNumber=3&uploadId=46a7d45240000ab)
Jun 13 01:32:56 zwr04n06 s3fs[34689]: curl.cpp:MultiRead(4190): failed a request(500: http://vbr-qsfs1.s3.alpha.stor.qingstor.me/xc/ss-xcuipa2q.img.lz40001?partNumber=7&uploadId=46a7d45240000ab)
Jun 13 01:36:28 zwr04n06 kernel: [138027.564567] s3fs[15488]: segfault at 0 ip 00007fc19a9b3216 sp 00007fc17effca88 error 4 in libc-2.19.so[7fc19a873000+1bb000]

[test 2]
Jun 13 01:33:56 zwr01n09 s3fs[14884]: curl.cpp:RequestPerform(2266): HTTP response code 500, returning EIO. Body Text: <?xml version="1.0" encoding="UTF-8"?>
Jun 13 01:33:56 zwr01n09 s3fs[14884]: curl.cpp:MultiPerform(4119): thread failed - rc(-5)
Jun 13 01:34:12 zwr01n09 s3fs[14884]: curl.cpp:MultiRead(4190): failed a request(500: http://vbr-qsfs1.s3.alpha.stor.qingstor.me/d7/ss-d7ceegp6.img.lz40001?partNumber=20&uploadId=46a7d894c8000b2)
Jun 13 01:34:20 zwr01n09 s3fs[14884]: s3fs.cpp:s3fs_release(2347): could not find fd(file=-)
Jun 13 01:40:54 zwr01n09 kernel: s3fs[18355]: segfault at 0 ip 00007fd165a14216 sp 00007fd159ffaa88 error 4 in libc-2.19.so[7fd1658d4000+1bb000]

Details about issue

[test 1]
gdb backtrace:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000041382c in s3fs_truncate (_path=<error reading variable: Cannot access memory at address 0xfffffffffffffeb8>, _path@entry=<error reading variable: Cannot access memory at address 0x8>,
size=<error reading variable: Cannot access memory at address 0xfffffffffffffeb0>, size@entry=<error reading variable: Cannot access memory at address 0x8>) at s3fs.cpp:2095
2095 if(NULL == (ent = FdManager::get()->Open(path, &meta, static_cast<ssize_t>(size), -1, false, true))){
(gdb) bt
#0 0x000000000041382c in s3fs_truncate (_path=<error reading variable: Cannot access memory at address 0xfffffffffffffeb8>, _path@entry=<error reading variable: Cannot access memory at address 0x8>,
size=<error reading variable: Cannot access memory at address 0xfffffffffffffeb0>, size@entry=<error reading variable: Cannot access memory at address 0x8>) at s3fs.cpp:2095
Cannot access memory at address 0x8

[test 2]
(gdb) bt
#0 0x00007fd165a14216 in __strcmp_ssse3 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1064
#1 0x00000000004096a6 in check_parent_object_access (path=0x0, mask=1) at s3fs.cpp:742
#2 0x000000000041529e in s3fs_flush (_path=0x0, fi=0x7fd159ffacd0) at s3fs.cpp:2276
#3 0x00007fd166d86fc7 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2

kerem 2026-03-04 01:46:49 +03:00
Author
Owner

@gaul commented on GitHub (Jun 13, 2019):

What kind of S3 implementation do you use? Your server returns 500 for an UploadPart operation. You also have a relatively high parallel_count; do these errors persist with a lower value?

Also it looks like subsequent FUSE operations have an invalid path which doesn't make much sense. What series of operations did you issue?

Author
Owner

@jimhuaang commented on GitHub (Jun 13, 2019):

Thanks for the quick response.

We use an in-house object store with an Amazon S3-compatible API.
We ran a group of tests backing up 10 VM images in parallel to a directory:

  • s3fs is used to mount an S3 bucket on the directory
  • each image has the same size, varied from 30 to 70 GB between test runs
  • each image is split into smaller 1 GB parts, and the split parts are written to the directory
  • each image and each split part has a unique name
  • two nodes running s3fs mount the same S3 bucket on the same directory path
  • each of the two nodes has its /tmp mounted on a 50 GB SSD drive

The tests worked when the image size was below 60 GB, and failed with server 500 errors and the s3fs crash when the image size was increased to 70 GB.
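
The workload above can be sketched as follows, scaled down so it runs locally: the real tests split 30 to 70 GB images into 1 GB parts inside the s3fs mount, while this sketch splits a small dummy file into 1 MiB parts in a plain directory standing in for the mount point. All paths and names here are illustrative, not taken from the report:

```shell
set -eu

workdir=$(mktemp -d)
img="$workdir/vm-image.img"
mount="$workdir/mount"          # stands in for the s3fs-mounted directory
mkdir -p "$mount"

# 4 MiB dummy image (the reported tests used 30-70 GB real images).
dd if=/dev/zero of="$img" bs=1M count=4 status=none

# Split into fixed-size, uniquely named parts, as the test setup describes
# (the real tests used --bytes=1G).
split --bytes=1M --numeric-suffixes "$img" "$mount/$(basename "$img").part"

ls "$mount"
```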

lscpu output on each node is as follows:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Stepping:              4
CPU MHz:               2600.000
BogoMIPS:              5208.90
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23

We will try decreasing parallel_count to 10 and will let you know the result as soon as possible.

Author
Owner

@gaul commented on GitHub (Jun 13, 2019):

I cannot reproduce your symptoms when inducing 500 errors during multipart upload. I did verify that s3fs retries on 500 errors which should work around your server issue.

As for the segfault, please try running s3fs under Valgrind to see if it reports errors.
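
For reference, a Valgrind invocation along these lines could be used; the bucket and mount-point names are placeholders, -f keeps s3fs in the foreground so Valgrind stays attached, and --track-origins helps trace where the invalid pointer came from. The sketch only assembles and prints the command so it is safe to run anywhere:

```shell
# Hypothetical command line, not from the report: adjust options to match
# your own mount (parallel_count=10 reflects the value tried below).
cmd='valgrind --track-origins=yes --log-file=/tmp/s3fs-valgrind.log \
  s3fs <bucket> <mount_point> -f -o parallel_count=10 -o hard_remove'
printf '%s\n' "$cmd"
```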

Author
Owner

@jimhuaang commented on GitHub (Jun 14, 2019):

After decreasing parallel_count to 10, the server reports no errors, but s3fs_flush still segfaults due to the NULL path from the FUSE operation. I have attached the gdb backtrace with fuse-dbg installed:

(gdb) bt
#0  0x00007fe02c339216 in __strcmp_ssse3 () at ../sysdeps/x86_64/multiarch/../strcmp.S:1064
#1  0x00000000004096a6 in check_parent_object_access (path=0x0, mask=1) at s3fs.cpp:742
#2  0x000000000041529e in s3fs_flush (_path=0x0, fi=0x7fe025b43cd0) at s3fs.cpp:2276
#3  0x00007fe02d6abfc7 in fuse_flush_common (f=f@entry=0xe9ced0, req=req@entry=0x7fe0100008c0, ino=ino@entry=35, path=0x0, fi=fi@entry=0x7fe025b43cd0) at fuse.c:3839
#4  0x00007fe02d6ac241 in fuse_lib_flush (req=0x7fe0100008c0, ino=35, fi=0x7fe025b43cd0) at fuse.c:3889
#5  0x00007fe02d6b2506 in do_flush (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>) at fuse_lowlevel.c:1321
#6  0x00007fe02d6b322b in fuse_ll_process_buf (data=0xe9d1c0, buf=0x7fe025b43e80, ch=<optimized out>) at fuse_lowlevel.c:2441
#7  0x00007fe02d6afe49 in fuse_do_work (data=0x7fe01c0011a0) at fuse_loop_mt.c:117
#8  0x00007fe02c5c6182 in start_thread (arg=0x7fe025b44700) at pthread_create.c:312
#9  0x00007fe02c2f347d in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x0000000000000000 in ?? ()
Author
Owner

@gaul commented on GitHub (Jul 26, 2020):

Could you retest with 1.85 or preferably the latest master?

Author
Owner

@gaul commented on GitHub (Oct 10, 2020):

Please reopen if symptoms persist.
