[GH-ISSUE #1772] s3fs aborts in GetXmlNsUrl during parallel execution #915

Closed
opened 2026-03-04 01:49:52 +03:00 by kerem · 6 comments

Originally created by @CarstenGrohmann on GitHub (Oct 7, 2021).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1772

s3fs has terminated itself twice without further notice on my system. After enabling core dumps I got one, but unfortunately I had deleted the binary, so I recompiled it. The analysis with gdb doesn't show any junk, so I assume the core dump and the binary match well.

Core dump analysis

The dump shows an abort in GetXmlNsUrl during a copy assignment of strNs (type std::string):

(gdb) bt
#0  0x00007fd07d969207 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007fd07d96a8f8 in __GI_abort () at abort.c:90
#2  0x00007fd07e494765 in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:50
#3  0x00007fd07e492746 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#4  0x00007fd07e492773 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#5  0x00007fd07e492993 in __cxxabiv1::__cxa_throw (obj=0x7fcf7406c000, tinfo=0x7fd07e71db00 <typeinfo for std::bad_alloc>, dest=0x7fd07e490ce0 <std::bad_alloc::~bad_alloc()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:87
#6  0x00007fd07e492f2d in operator new (sz=140533008707552) at ../../../../libstdc++-v3/libsupc++/new_op.cc:56
#7  0x00007fd07e4f1a19 in allocate (this=<optimized out>, __n=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/ext/new_allocator.h:104
#8  std::string::_Rep::_S_create (__capacity=140533008707527, __old_capacity=<optimized out>, __alloc=...)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:607
#9  0x00007fd07e4f262b in std::string::_Rep::_M_clone (this=0x7fd0640b3d70, __alloc=..., __res=__res@entry=0)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:629
#10 0x00007fd07e4f2d9c in _M_grab (__alloc2=..., __alloc1=..., this=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:229
#11 std::string::assign (this=this@entry=0x7fd003ffe2d0,
    __str="http://s3.amazonaws.com/doc/2006-03-01/\000P\000\000\000\000\000\000\000E\000\000\000\000\000\000\000\001\000\000\000\320\177\000\000P[\006d\320\177\000\000\020\340\000d\320\177\000\000\260\220\bd\320\177\000\000hk\000d\320\177\000\000Hs\000d\320\177\000\000@\000\000\000\000\000\000\000\065", '\000' <repeats 23 times>, "\300\200\363~\320\177\000\000\000\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\065\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\005", '\000' <repeats 11 times>, "GCMS65534\000\071\061"...<Address 0x7fd064214000 out of bounds>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:249
#12 0x000000000042f525 in operator= (
    __str="http://s3.amazonaws.com/doc/2006-03-01/\000P\000\000\000\000\000\000\000E\000\000\000\000\000\000\000\001\000\000\000\320\177\000\000P[\006d\320\177\000\000\020\340\000d\320\177\000\000\260\220\bd\320\177\000\000hk\000d\320\177\000\000Hs\000d\320\177\000\000@\000\000\000\000\000\000\000\065", '\000' <repeats 23 times>, "\300\200\363~\320\177\000\000\000\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\065\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\005", '\000' <repeats 11 times>, "GCMS65534\000\071\061"...<Address 0x7fd064214000 out of bounds>, this=0x7fd003ffe2d0) at /usr/include/c++/4.8.2/bits/basic_string.h:547
#13 GetXmlNsUrl (doc=doc@entry=0x7fcf7404c260, nsurl="") at s3fs_xml.cpp:62
#14 0x000000000042f70d in get_base_exp (doc=doc@entry=0x7fcf7404c260, exp=exp@entry=0x4a08bf "IsTruncated") at s3fs_xml.cpp:79
#15 0x0000000000430860 in is_truncated (doc=doc@entry=0x7fcf7404c260) at s3fs_xml.cpp:310
#16 0x000000000040d26f in list_bucket (path=path@entry=0x7fcf7406d3c0 "/Dir1/File1.7z", head=...,
    delimiter=delimiter@entry=0x493964 "/", check_content_only=check_content_only@entry=true) at s3fs.cpp:2756
#17 0x00000000004121d3 in directory_empty (path=path@entry=0x7fcf7406d3c0 "/Dir1/File1.7z") at s3fs.cpp:1099
#18 0x0000000000424db4 in s3fs_rename (_from=<optimized out>, _to=<optimized out>) at s3fs.cpp:1583
#19 0x00007fd07f17c347 in fuse_lib_rename (req=0x7fcf74065860, olddir=448853, oldname=0x7fcfe40554f0 ".File1.7z.lXsbr9", newdir=448853,
    newname=0x7fcfe405554a "File1.7z") at fuse.c:3038
#20 0x00007fd07f186b6b in fuse_ll_process_buf (data=0x1d180d0, buf=0x7fd003ffed80, ch=<optimized out>) at fuse_lowlevel.c:2441
#21 0x00007fd07f183401 in fuse_do_work (data=0x7fcfe4022d80) at fuse_loop_mt.c:117
#22 0x00007fd07dd07dd5 in start_thread (arg=0x7fd003fff700) at pthread_create.c:307
#23 0x00007fd07da30ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The threads 18-14, 10-9, 7, 5, 2-1 execute the same code at the same time:

(gdb) thread apply all bt

[...]

Thread 18 (Thread 0x7fd022bfe700 (LWP 31385)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fd07dd09de6 in _L_lock_941 () from /lib64/libpthread.so.0
#2  0x00007fd07dd09cdf in __GI___pthread_mutex_lock (mutex=0x7fd07f5d0930 <_rtld_local+2352>) at ../nptl/pthread_mutex_lock.c:113
#3  0x00007fd07da6f36f in __GI___dl_iterate_phdr (callback=callback@entry=0x7fd07df2d280 <_Unwind_IteratePhdrCallback>, data=data@entry=0x7fd022bfc9e0) at dl-iteratephdr.c:41
#4  0x00007fd07df2dbbf in _Unwind_Find_FDE (pc=0x7fd07e4929db <__cxxabiv1::__cxa_rethrow()+59>, bases=bases@entry=0x7fd022bfcc68) at ../../../libgcc/unwind-dw2-fde-dip.c:461
#5  0x00007fd07df2ad2c in uw_frame_state_for (context=context@entry=0x7fd022bfcbc0, fs=fs@entry=0x7fd022bfccb0) at ../../../libgcc/unwind-dw2.c:1245
#6  0x00007fd07df2bbd3 in _Unwind_RaiseException (exc=0x7fcfe4003930) at ../../../libgcc/unwind.inc:99
#7  0x00007fd07df2bebd in _Unwind_Resume_or_Rethrow (exc=0x7fcfe4003930) at ../../../libgcc/unwind.inc:252
#8  0x00007fd07e4929dc in __cxxabiv1::__cxa_rethrow () at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:118
#9  0x00007fd07e49472f in __gnu_cxx::__verbose_terminate_handler () at ../../../../libstdc++-v3/libsupc++/vterminate.cc:80
#10 0x00007fd07e492746 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
#11 0x00007fd07e492773 in std::terminate () at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
#12 0x00007fd07e492993 in __cxxabiv1::__cxa_throw (obj=0x7fcfe4003950, tinfo=0x7fd07e71db00 <typeinfo for std::bad_alloc>, dest=0x7fd07e490ce0 <std::bad_alloc::~bad_alloc()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:87
#13 0x00007fd07e492f2d in operator new (sz=140533008707552) at ../../../../libstdc++-v3/libsupc++/new_op.cc:56
#14 0x00007fd07e4f1a19 in allocate (this=<optimized out>, __n=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/ext/new_allocator.h:104
#15 std::string::_Rep::_S_create (__capacity=140533008707527, __old_capacity=<optimized out>, __alloc=...)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:607
#16 0x00007fd07e4f262b in std::string::_Rep::_M_clone (this=0x7fd0640b3d70, __alloc=..., __res=__res@entry=0)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:629
#17 0x00007fd07e4f2d9c in _M_grab (__alloc2=..., __alloc1=..., this=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:229
#18 std::string::assign (this=this@entry=0x7fd022bfd200,
    __str="http://s3.amazonaws.com/doc/2006-03-01/\000P\000\000\000\000\000\000\000E\000\000\000\000\000\000\000\001\000\000\000\320\177\000\000P[\006d\320\177\000\000\020\340\000d\320\177\000\000\260\220\bd\320\177\000\000hk\000d\320\177\000\000Hs\000d\320\177\000\000@\000\000\000\000\000\000\000\065", '\000' <repeats 23 times>, "\300\200\363~\320\177\000\000\000\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\065\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\005", '\000' <repeats 11 times>, "GCMS65534\000\071\061"...<Address 0x7fd064214000 out of bounds>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:249
#19 0x000000000042f525 in operator= (
    __str="http://s3.amazonaws.com/doc/2006-03-01/\000P\000\000\000\000\000\000\000E\000\000\000\000\000\000\000\001\000\000\000\320\177\000\000P[\006d\320\177\000\000\020\340\000d\320\177\000\000\260\220\bd\320\177\000\000hk\000d\320\177\000\000Hs\000d\320\177\000\000@\000\000\000\000\000\000\000\065", '\000' <repeats 23 times>, "\300\200\363~\320\177\000\000\000\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\065\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\005", '\000' <repeats 11 times>, "GCMS65534\000\071\061"...<Address 0x7fd064214000 out of bounds>, this=0x7fd022bfd200) at /usr/include/c++/4.8.2/bits/basic_string.h:547
#20 GetXmlNsUrl (doc=doc@entry=0x7fcfe4028540, nsurl="") at s3fs_xml.cpp:62
#21 0x000000000042f70d in get_base_exp (doc=doc@entry=0x7fcfe4028540, exp=exp@entry=0x4a08cb "Prefix") at s3fs_xml.cpp:79
#22 0x0000000000431bd5 in get_prefix (doc=0x7fcfe4028540) at s3fs_xml.cpp:109
#23 append_objects_from_xml (path=path@entry=0x7fcfe4002a80 "/Dir2/File2.7z", doc=doc@entry=0x7fcfe4028540,
    head=...) at s3fs_xml.cpp:414
#24 0x000000000040d25c in list_bucket (path=path@entry=0x7fcfe4002a80 "/Dir2/File2.7z", head=..., delimiter=delimiter@entry=0x493964 "/", check_content_only=check_content_only@entry=true) at s3fs.cpp:2751
#25 0x00000000004121d3 in directory_empty (path=path@entry=0x7fcfe4002a80 "/Dir2/File2.7z") at s3fs.cpp:1099
#26 0x0000000000424db4 in s3fs_rename (_from=<optimized out>, _to=<optimized out>) at s3fs.cpp:1583
#27 0x00007fd07f17c347 in fuse_lib_rename (req=<optimized out>, olddir=<optimized out>, oldname=<optimized out>, newdir=<optimized out>, newname=<optimized out>) at fuse.c:3038
#28 0x00007fd07f186b6b in fuse_ll_process_buf (data=0x1d180d0, buf=0x7fd022bfdd80, ch=<optimized out>) at fuse_lowlevel.c:2441
#29 0x00007fd07f183401 in fuse_do_work (data=0x7fcfb801cb90) at fuse_loop_mt.c:117
#30 0x00007fd07dd07dd5 in start_thread (arg=0x7fd022bfe700) at pthread_create.c:307
#31 0x00007fd07da30ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

[...]

Thread 7 (Thread 0x7fd0217fd700 (LWP 31396)):
#0  0x00007fd07df2a946 in uw_update_context_1 (context=context@entry=0x7fd0217fbe50, fs=fs@entry=0x7fd0217fbf40) at ../../../libgcc/unwind-dw2.c:1436
#1  0x00007fd07df2ac01 in uw_update_context (context=context@entry=0x7fd0217fbe50, fs=fs@entry=0x7fd0217fbf40) at ../../../libgcc/unwind-dw2.c:1506
#2  0x00007fd07df2bbc8 in _Unwind_RaiseException (exc=0x7fcf600d2370) at ../../../libgcc/unwind.inc:122
#3  0x00007fd07e492986 in __cxxabiv1::__cxa_throw (obj=0x7fcf600d2390, tinfo=0x7fd07e71db00 <typeinfo for std::bad_alloc>, dest=0x7fd07e490ce0 <std::bad_alloc::~bad_alloc()>) at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:82
#4  0x00007fd07e492f2d in operator new (sz=140533008707552) at ../../../../libstdc++-v3/libsupc++/new_op.cc:56
#5  0x00007fd07e4f1a19 in allocate (this=<optimized out>, __n=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/ext/new_allocator.h:104
#6  std::string::_Rep::_S_create (__capacity=140533008707527, __old_capacity=<optimized out>, __alloc=...)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:607
#7  0x00007fd07e4f262b in std::string::_Rep::_M_clone (this=0x7fd0640b3d70, __alloc=..., __res=__res@entry=0)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:629
#8  0x00007fd07e4f2d9c in _M_grab (__alloc2=..., __alloc1=..., this=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:229
#9  std::string::assign (this=this@entry=0x7fd0217fc200,
    __str="http://s3.amazonaws.com/doc/2006-03-01/\000P\000\000\000\000\000\000\000E\000\000\000\000\000\000\000\001\000\000\000\320\177\000\000P[\006d\320\177\000\000\020\340\000d\320\177\000\000\260\220\bd\320\177\000\000hk\000d\320\177\000\000Hs\000d\320\177\000\000@\000\000\000\000\000\000\000\065", '\000' <repeats 23 times>, "\300\200\363~\320\177\000\000\000\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\065\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\005", '\000' <repeats 11 times>, "GCMS65534\000\071\061"...<Address 0x7fd064214000 out of bounds>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:249
#10 0x000000000042f525 in operator= (
    __str="http://s3.amazonaws.com/doc/2006-03-01/\000P\000\000\000\000\000\000\000E\000\000\000\000\000\000\000\001\000\000\000\320\177\000\000P[\006d\320\177\000\000\020\340\000d\320\177\000\000\260\220\bd\320\177\000\000hk\000d\320\177\000\000Hs\000d\320\177\000\000@\000\000\000\000\000\000\000\065", '\000' <repeats 23 times>, "\300\200\363~\320\177\000\000\000\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\065\000\000\000\000\000\000\000\005\000\000\000\000\000\000\000\005", '\000' <repeats 11 times>, "GCMS65534\000\071\061"...<Address 0x7fd064214000 out of bounds>, this=0x7fd0217fc200) at /usr/include/c++/4.8.2/bits/basic_string.h:547
#11 GetXmlNsUrl (doc=doc@entry=0x7fcf601293b0, nsurl="") at s3fs_xml.cpp:62
#12 0x000000000042f70d in get_base_exp (doc=doc@entry=0x7fcf601293b0, exp=exp@entry=0x4a08cb "Prefix") at s3fs_xml.cpp:79
#13 0x0000000000431bd5 in get_prefix (doc=0x7fcf601293b0) at s3fs_xml.cpp:109
#14 append_objects_from_xml (path=path@entry=0x7fcf600d7ae0 "/Dir3/File3.7z",
    doc=doc@entry=0x7fcf601293b0, head=...) at s3fs_xml.cpp:414
#15 0x000000000040d25c in list_bucket (path=path@entry=0x7fcf600d7ae0 "/Dir3/File3.7z", head=..., delimiter=delimiter@entry=0x493964 "/", check_content_only=check_content_only@entry=true) at s3fs.cpp:2751
#16 0x00000000004121d3 in directory_empty (path=path@entry=0x7fcf600d7ae0 "/Dir3/File3.7z") at s3fs.cpp:1099
#17 0x0000000000424db4 in s3fs_rename (_from=<optimized out>, _to=<optimized out>) at s3fs.cpp:1583
#18 0x00007fd07f17c347 in fuse_lib_rename (req=<optimized out>, olddir=<optimized out>, oldname=<optimized out>, newdir=<optimized out>, newname=<optimized out>) at fuse.c:3038
#19 0x00007fd07f186b6b in fuse_ll_process_buf (data=0x1d180d0, buf=0x7fd0217fcd80, ch=<optimized out>) at fuse_lowlevel.c:2441
#20 0x00007fd07f183401 in fuse_do_work (data=0x7fcf54027040) at fuse_loop_mt.c:117
#21 0x00007fd07dd07dd5 in start_thread (arg=0x7fd0217fd700) at pthread_create.c:307
#22 0x00007fd07da30ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

It looks like the parallel execution of GetXmlNsUrl triggers a C++ exception in line 62:

https://github.com/s3fs-fuse/s3fs-fuse/blob/b4edad86d6a60a985a310fac2225cdf7be40bd6d/src/s3fs_xml.cpp#L37-L66

Reproducibility

The issue occurs randomly after a runtime of several days. I can't reproduce this issue manually.

Additional observation

The content of __str in std::string::assign is unexpectedly long. I would expect a shorter string terminating after the first null byte: __str="http://s3.amazonaws.com/doc/2006-03-01/\000.
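One way to read the sizes in the backtrace is to print them in hexadecimal. The sketch below only does arithmetic on the sz and __capacity values copied from frames #6 and #8; the helper names to_hex and overhead_bytes are made up for illustration and are not part of s3fs.

```cpp
#include <sstream>
#include <string>

// Values copied from frames #6 and #8 of the backtrace above.
const unsigned long long kRequestedBytes = 140533008707552ULL; // operator new (sz=...)
const unsigned long long kCapacity       = 140533008707527ULL; // _S_create (__capacity=...)

// Hypothetical helper: format a value as 0x-prefixed lowercase hex.
std::string to_hex(unsigned long long value) {
    std::ostringstream out;
    out << "0x" << std::hex << value;
    return out.str();
}

// Hypothetical helper: bytes requested on top of the string capacity.
unsigned long long overhead_bytes() {
    return kRequestedBytes - kCapacity;
}
```

to_hex(kRequestedBytes) gives 0x7fd064102fe0, which has the same 0x7fd064... prefix as the heap addresses in the trace (for example this=0x7fd0640b3d70). That would be consistent with the length information of the shared string having been overwritten with pointer-like data, although the dump alone does not prove it. The request is 25 bytes larger than __capacity.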

kerem closed this issue 2026-03-04 01:49:52 +03:00

@CarstenGrohmann commented on GitHub (Oct 26, 2021):

Adding an AutoLock would be a simple workaround, but it prevents parallel execution of this function:

diff --git a/src/s3fs_xml.cpp b/src/s3fs_xml.cpp
index d020954..b7368a4 100644
--- a/src/s3fs_xml.cpp
+++ b/src/s3fs_xml.cpp
@@ -21,6 +21,7 @@
 #include <cstdio>
 #include <cstdlib>
 
+#include "autolock.h"
 #include "common.h"
 #include "s3fs.h"
 #include "s3fs_xml.h"
@@ -39,10 +40,14 @@ static bool GetXmlNsUrl(xmlDocPtr doc, std::string& nsurl)
     static time_t tmLast = 0;  // cache for 60 sec.
     static std::string strNs;
     bool result = false;
+    static pthread_mutex_t lock;
 
     if(!doc){
         return false;
     }
+
+    AutoLock auto_lock(&lock);
+
     if((tmLast + 60) < time(NULL)){
         // refresh
         tmLast = time(NULL);

I've tested the patch functionally, but the long-running test isn't finished yet. I'll share the results once s3fs has run for more than two weeks without aborting.
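
The idea behind the workaround, serializing every access to the function-local static cache, can be sketched with the standard library alone. This is a hedged illustration and not the s3fs code: `std::mutex` and `std::lock_guard` stand in for the `pthread_mutex_t` and `AutoLock` used in the patch, and `cached_ns_url` and `lookup_ns_url` are hypothetical names. A `std::mutex` with static storage duration needs no explicit initialization, whereas a `pthread_mutex_t` is normally initialized with `PTHREAD_MUTEX_INITIALIZER` or `pthread_mutex_init`.

```cpp
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for the XML namespace lookup.
static std::string lookup_ns_url()
{
    return "http://s3.amazonaws.com/doc/2006-03-01/";
}

// Returns the cached namespace URL, refreshing it at most every 60 seconds.
// All reads and writes of the shared statics happen under the mutex.
bool cached_ns_url(std::string& nsurl)
{
    static std::mutex  cache_lock;        // constant-initialized
    static std::time_t last_refresh = 0;  // cache for 60 sec.
    static std::string cached;

    std::lock_guard<std::mutex> guard(cache_lock);

    const std::time_t now = std::time(nullptr);
    if(last_refresh + 60 < now){
        // refresh
        last_refresh = now;
        cached       = lookup_ns_url();
    }
    if(cached.empty()){
        return false;
    }
    nsurl = cached;   // the copy is made while the lock is still held
    return true;
}
```

The copy into `nsurl` is deliberately inside the locked region: releasing the lock before reading `cached` would reintroduce the concurrent read and write on the shared string.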


@ggtakec commented on GitHub (Oct 26, 2021):

@CarstenGrohmann Thank you for reporting the problem, and sorry for the late reply.
As you pointed out, the size in the following backtrace frame is abnormal.

#6  0x00007fd07e492f2d in operator new (sz=140533008707552) at ../../../../libstdc++-v3/libsupc++/new_op.cc:56

I created PR #1789 to check the length (xmlStrlen) of the xmlChar pointer.
If possible, it would be helpful if you could test it.
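
The abnormal size can be cross-checked against the other frames of the same backtrace. The sketch below uses only values printed by gdb, plus one assumption: that the reference-counted `std::string` of GCC 4.8 on x86_64 stores a 24-byte `_Rep` header followed by the characters and one terminating null byte. The constant and function names are illustrative.

```cpp
#include <cstdint>

// Values copied from frames #6, #8 and #9 of the backtrace.
constexpr std::uint64_t kNewSize    = 140533008707552ULL; // operator new (sz=...)
constexpr std::uint64_t kCapacity   = 140533008707527ULL; // _S_create (__capacity=...)
constexpr std::uint64_t kRepPointer = 0x7fd0640b3d70ULL;  // _M_clone (this=...)

// Assumed layout: 24-byte _Rep header + capacity characters + 1 null byte.
constexpr std::uint64_t kRepHeader = 24;

// True when the allocation size is exactly what the capacity implies.
constexpr bool size_matches_capacity()
{
    return kNewSize == kCapacity + 1 + kRepHeader;
}

// True when the requested size shares its upper 32 bits with the heap
// pointer from frame #9, i.e. it has the magnitude of a heap address.
constexpr bool size_looks_like_heap_address()
{
    return (kNewSize >> 32) == (kRepPointer >> 32);
}

// Requested size in whole TiB (2^40 bytes).
constexpr std::uint64_t size_in_tib()
{
    return kNewSize >> 40;
}
```

Both checks hold, and the request amounts to about 127 TiB. That is consistent with a string length field that was overwritten with address-like data, although this reading is an inference and not something the backtrace proves.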


@ggtakec commented on GitHub (Oct 26, 2021):

If this bug is caused by multiple threads writing to the same memory at the same time, the fix in the above PR may not help.
(I will check it just in case.)


@CarstenGrohmann commented on GitHub (Oct 26, 2021):

@ggtakec I'll test the PR. Since I can't reproduce the issue, it may take two weeks or so to see whether s3fs still aborts or runs fine.


@ggtakec commented on GitHub (Oct 26, 2021):

Thanks for your kindness.
The PR has just been changed to use AutoLock, so I think you probably won't hit the same problem.
We may merge this PR without waiting for the reproduction test.
If you have time, please try it.


@CarstenGrohmann commented on GitHub (Nov 16, 2021):

I ran the current development version of s3fs for several days and transferred more than 2M files, and the error did not occur.

#1789 solves this issue.

Thank you!
