[GH-ISSUE #1453] Corruption following rename() #762

Closed
opened 2026-03-04 01:48:33 +03:00 by kerem · 16 comments
Owner

Originally created by @init-js on GitHub (Oct 15, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1453

Version of s3fs being used (s3fs --version)

Amazon Simple Storage Service File System V1.87 (commit:194262c) with OpenSSL
Copyright (C) 2010 Randy Rizun <rrizun@gmail.com>
License GPL2: GNU GPL version 2 <https://gnu.org/licenses/gpl.html>

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

2.9.9

Kernel information (uname -r)

4.14.177-139.254.amzn2.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

s3fs command line used, if applicable

s3fs my_bucket /mnt -d -d -f -o use_cache=/tmp/s3fs-cache -o enable_noobj_cache -o endpoint=us-west-2 -o nomultipart -o use_path_request_style -o iam_role=auto -o del_cache

/etc/fstab entry, if applicable

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

if you execute s3fs with dbglevel, curldbg option, you can get detail debug messages

Details about issue

It feels like a caching bug is causing s3fs to upload a file filled with zeroes if you trigger a chmod/chown just after a rename.
At the end of the following program, "B" will have the correct length, but its contents will be nothing but nil bytes \0.
(This is assuming /mnt is the fuse mountpoint.)

#!/bin/bash
# s3fs.fuse mounted there
mntpoint=/mnt/

tmp="$mntpoint/hello_$RANDOM$RANDOM"
target="$mntpoint/my_target"

# the following two lines are not strictly necessary, but it's cleanup.
rm -f "$target"
touch "$target"

# write 11 bytes to A
echo -n "HELLO WORLD" > "$tmp"
hexdump -C "$tmp"  # <- this shows "HELLO WORLD", as expected

# rename(A, to=B)
mv "$tmp" "$target"

# CACHEBUG: uncommenting the following line causes the corruption to disappear (any read operation over the file works)
# hexdump -C "$target"

# this causes a PUT-COPY(B, to=B) to occur, to update metadata.
chmod 0644 "$target"

echo "final contents:"
hexdump -C "$target"  # <- this shows 11 nil bytes. length is good. contents are corrupted.

Note: the target file is not only corrupted when accessed via fuse. If we access the target file out-of-band, via the AWS console, it reveals that the target file is also corrupted in the bucket, at rest. S3fs makes an explicit PUT with nil bytes.
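For a single-part (non-multipart) upload, the S3 ETag equals the object's MD5, which gives a quick way to double-check the at-rest corruption without the console. The sketch below only computes the two candidate digests locally; the bucket and key names in the trailing comment are placeholders from this issue, not a tested command line:

```shell
# Compare the MD5 of the expected content with that of an equal-length run of
# NUL bytes. A healthy single-part object's ETag should match the former;
# matching the latter confirms corruption at rest.
expected=$(printf 'HELLO WORLD' | md5sum | cut -d' ' -f1)
corrupt=$(head -c 11 /dev/zero | md5sum | cut -d' ' -f1)
echo "expected md5: $expected"
echo "corrupt  md5: $corrupt"
# Against the real bucket (placeholder names):
#   aws s3api head-object --bucket my_bucket --key linux_src/my_target --query ETag
```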

The chmod 0644 call triggers:

  1. s3fs.cpp:s3fs_getattr(828): [path=/linux_src/my_target] (HEAD my_target)
  2. s3fs.cpp:s3fs_chmod(1603): [path=/linux_src/my_target][mode=100644] (PUT+COPY my_target -> my_target)

Then, reading the file at the end of the script (i.e. the last line) triggers:

  1. s3fs.cpp:s3fs_getattr(828): [path=/linux_src/my_target]. (HEAD my_target)
  2. s3fs.cpp:s3fs_open(2132): [path=/linux_src/my_target][flags=0x8000]
  3. s3fs.cpp:s3fs_flush(2264): [path=/linux_src/my_target][fd=0] (HEAD my_target)
  4. s3fs.cpp:s3fs_flush(2264): [path=/linux_src/my_target][fd=12]
    curl.cpp:PutRequest(3433): uploading... [path=/linux_src/my_target][fd=12][size=11]
    curl.cpp:GetObjectRequest(3517): downloading... [path=/linux_src/my_target][fd=12]

Not sure why that last upload needs to happen, but that's where the file gets corrupted. What gets uploaded is a body of 11 nul bytes. The broken upload does not happen if the file has been read between the rename and the chmod.
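Given that observation, one hypothetical stopgap is to force any read between the rename and the chmod so s3fs repopulates its cache from S3 first. `safe_chmod` below is an illustrative name, not part of s3fs, and this is untested against the full range of triggers:

```shell
# Hypothetical workaround sketch: read one byte of the file before chmod so
# s3fs re-downloads the object into its cache instead of flushing zeroes.
safe_chmod() {
    local mode="$1" path="$2"
    head -c 1 "$path" > /dev/null   # any read repopulates the s3fs cache
    chmod "$mode" "$path"
}
```

e.g. `safe_chmod 0644 "$target"` in place of the plain chmod in the script above.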

kerem 2026-03-04 01:48:33 +03:00
  • closed this issue
  • added the dataloss label

@gaul commented on GitHub (Oct 16, 2020):

I can reproduce these symptoms although am unsure about the fix:

--- a/src/fdcache_entity.cpp
+++ b/src/fdcache_entity.cpp
@@ -437,7 +437,7 @@ int FdEntity::Open(headers_t* pmeta, off_t size, time_t time, bool no_fd_lock_wa
                 // mtime is older than it, the cache file is removed and
                 // the processing comes here.
                 //
-                pagelist.Resize(size, false, (0 <= time ? false : true));
+                pagelist.Resize(size, false, false);
 
                 is_truncate = true;
             }

@ggtakec do you think this is correct?


@init-js commented on GitHub (Oct 16, 2020):

Attaching two traces that show the differing sequences of events. The corrupt trace is from the script in the original issue post; the "ok" trace is the one where we read the file between the rename and the chmod.

s3fs-ok.trace.log
s3fs-corrupt.trace.log

Aside from the obvious extra read() operation added between rename and chmod, the main difference is s3fs's behaviour during the very last read of the script (i.e. after final contents:).

# in the corrupt trace:

[2020-10-15 11:12:53] [INF] s3fs.cpp:s3fs_chmod(1603): [path=/linux_src/my_target][mode=100644]
[2020-10-15 11:12:53] [INF]     s3fs.cpp:put_headers(780): [path=/linux_src/my_target]
[2020-10-15 11:12:53] [INF]       curl.cpp:PutHeadRequest(3208): [tpath=/linux_src/my_target]
[2020-10-15 11:12:53] [INF]       curl.cpp:prepare_url(4831): URL is https://s3.amazonaws.com/my_bucket/linux_src/my_target
[2020-10-15 11:12:53] [INF]       curl.cpp:prepare_url(4864): URL changed is https://s3.amazonaws.com/my_bucket/linux_src/my_target
[2020-10-15 11:12:53] [INF]       curl.cpp:PutHeadRequest(3297): copying... [path=/linux_src/my_target]
[2020-10-15 11:12:53] [INF]       curl.cpp:insertV4Headers(2863): computing signature [PUT] [/linux_src/my_target] [] []
...

#
# THIS BLOCK -- uploads a blob of the right size, but filled with zeroes
#
[2020-10-15 11:12:53] [INF]       curl.cpp:PutRequest(3433): uploading... [path=/linux_src/my_target][fd=12][size=11]
[2020-10-15 11:12:53] [INF]       curl.cpp:insertV4Headers(2863): computing signature [PUT] [/linux_src/my_target] [] [71b6c1d53832f789a7f2435a7c629245fa3761ad8487775ebf4957330213a706]
[2020-10-15 11:12:53] [INF]       curl.cpp:url_to_host(99): url is https://s3.amazonaws.com
[2020-10-15 11:12:53] [INF]       curl.cpp:RequestPerform(2520): HTTP response code 200

[2020-10-15 11:12:53] [INF]       curl.cpp:GetObjectRequest(3498): [tpath=/linux_src/my_target][start=0][size=11]
[2020-10-15 11:12:53] [INF]       curl.cpp:PreGetObjectRequest(3446): [tpath=/linux_src/my_target][start=0][size=11]
[2020-10-15 11:12:53] [INF]       curl.cpp:prepare_url(4831): URL is https://s3.amazonaws.com/my_bucket/linux_src/my_target
[2020-10-15 11:12:53] [INF]       curl.cpp:prepare_url(4864): URL changed is https://s3.amazonaws.com/my_bucket/linux_src/my_target
[2020-10-15 11:12:53] [INF]       curl.cpp:GetObjectRequest(3517): downloading... [path=/linux_src/my_target][fd=12]
[2020-10-15 11:12:53] [INF]       curl.cpp:insertV4Headers(2863): computing signature [GET] [/linux_src/my_target] [] []
[2020-10-15 11:12:53] [INF]       curl.cpp:url_to_host(99): url is https://s3.amazonaws.com
[2020-10-15 11:12:53] [INF]       curl.cpp:RequestPerform(2520): HTTP response code 206
[2020-10-15 11:12:53] [INF] s3fs.cpp:s3fs_getattr(828): [path=/linux_src/my_target]

In the OK trace, the last read simply hits the cache after chmod() completes; there is no download and no upload:

[2020-10-15 12:02:20] [INF] s3fs.cpp:s3fs_chmod(1603): [path=/linux_src/my_target][mode=100644]
[2020-10-15 12:02:20] [INF]     s3fs.cpp:put_headers(780): [path=/linux_src/my_target]
[2020-10-15 12:02:20] [INF]       curl.cpp:PutHeadRequest(3208): [tpath=/linux_src/my_target]
[2020-10-15 12:02:20] [INF]       curl.cpp:prepare_url(4831): URL is https://s3.amazonaws.com/my_bucket/linux_src/my_target
[2020-10-15 12:02:20] [INF]       curl.cpp:prepare_url(4864): URL changed is https://s3.amazonaws.com/my_bucket/linux_src/my_target
[2020-10-15 12:02:20] [INF]       curl.cpp:PutHeadRequest(3297): copying... [path=/linux_src/my_target]
[2020-10-15 12:02:20] [INF]       curl.cpp:insertV4Headers(2863): computing signature [PUT] [/linux_src/my_target] [] []
[2020-10-15 12:02:20] [INF]       curl.cpp:url_to_host(99): url is https://s3.amazonaws.com
[2020-10-15 12:02:20] [INF]       curl.cpp:RequestPerform(2520): HTTP response code 200
[2020-10-15 12:02:20] [INF]       fdcache.cpp:SetMtime(1821): [path=/linux_src/my_target][fd=12][time=160278854getattr /linux_src/my_target
[2020-10-15 12:02:20] unique: 112, success, outsize: 120
[2020-10-15 12:02:20] unique: 113, opcode: OPEN (14), nodeid: 12, insize: 48, pid: 113
[2020-10-15 12:02:20] open flags: 0x8000 /linux_src/my_target
[2020-10-15 12:02:20] open[12] flags: 0x8000 /linux_src/my_target
[2020-10-15 12:02:20] unique: 113, success, outsize: 32
[2020-10-15 12:02:20] unique: 114, opcode: FLUSH (25), nodeid: 12, insize: 64, pid: 113
[2020-10-15 12:02:20] flush[12]
[2020-10-15 12:02:20] unique: 114, success, outsize: 16
[2020-10-15 12:02:20] unique: 115, opcode: READ (15), nodeid: 12, insize: 80, pid: 113
[2020-10-15 12:02:20] read[12] 4096 bytes from 0 flags: 0x8000
[2020-10-15 12:02:20] read[12] 11 bytes from 0
[2020-10-15 12:02:20] unique: 115, success, outsize: 27
[2020-10-15 12:02:20] unique: 116, opcode: GETATTR (3), nodeid: 12, insize: 56, pid: 113
[2020-10-15 12:02:20] getattr /linux_src/my_target
[2020-10-15 12:02:20] unique: 116, success, outsize: 120
[2020-10-15 12:02:20] unique: 117, opcode: FLUSH (25), nodeid: 12, insize: 64, pid: 113
[2020-10-15 12:02:20] flush[12]
[2020-10-15 12:02:20] unique: 117, success, outsize: 16
[2020-10-15 12:02:20] unique: 118, opcode: RELEASE (18), nodeid: 12, insize: 64, pid: 0
[2020-10-15 12:02:20] release[12] flags: 0x8000
[2020-10-15 12:02:20] unique: 118, success, outsize: 16

# End of trace

@init-js commented on GitHub (Nov 5, 2020):

@gaul , thanks for looking into it.

Can you justify your reasoning a bit more for patching that block? It seems strange that the state is incorrect by the time the final read takes place. Ignoring time seems like a patch for this scenario, but I'm not convinced it's a comprehensive fix.

https://github.com/s3fs-fuse/s3fs-fuse/blob/38e1eaa8a39d3622bafb087c7685571064540284/src/fdcache_entity.cpp#L432


@bitprophet commented on GitHub (Feb 10, 2021):

I can confirm this bug:

  • RHEL 7.7
  • Kernel 3.10.0
  • fuse 2.9.2
  • s3fs-fuse 1.87 and 1.88 (tested both, confirmed bug exists on both)
  • s3fs configured with use_cache

Specifically, I can confirm the proximate trigger of the bug, including that reading before the chmod avoids the issue; I haven't tested the patch yet (though I'm intending to).

My 100% reproducible real-world trigger is using rpm-sign (e.g. rpm --delsign /path/to/s3fs/some.rpm), though I've seen what may be the same issue much less frequently during runs of createrepo across large RPM repositories.

To wit:

  • Copy known good RPM file from local disk into s3fs mount
  • Confirm cached copy matches OK
  • Confirm s3 AWS console has correct etag checksum and data
  • rpmsign (which strace shows is performing an unlink+rename and a chmod)
  • Confirm cached copy is now all zeroes
  • Confirm S3 copy is still good (per init-js's note that the subsequent accesses are what corrupt s3)
  • Read the file via s3fs (rpm -qip, md5sum, etc)
  • Confirm file appeared to that operation as all zeroes
  • Confirm s3 now contains the all-zeroes data and etag checksum
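Per strace, rpm-sign's file handling boils down to write-temp, rename-over-target, chmod. A local stand-in for that sequence (paths are placeholders; on a local filesystem this is harmless, while on an affected s3fs mount the same sequence zeroes the target):

```shell
# Minimal stand-in for rpm-sign's syscall pattern: write a temp file, rename
# it over the target (unlink+rename), then chmod the target (metadata-only).
dir=$(mktemp -d)
printf 'payload' > "$dir/tmp.rpm"
mv "$dir/tmp.rpm" "$dir/some.rpm"   # the unlink+rename step
chmod 0644 "$dir/some.rpm"          # the metadata-only chmod
out=$(cat "$dir/some.rpm")
echo "$out"                         # "payload" locally; all zeroes on a buggy s3fs mount
rm -rf "$dir"
```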

@bitprophet commented on GitHub (Feb 10, 2021):

Actually, I'm not sure the prior comments covered something I found above: the cache is being filled with the corrupt data by the interplay between the rename and the chmod, and the open() is just flushing that already-bogus data upstream.

  • Nuke files for reset: cache empty
  • Create new file A: cache contains (good copy of) A
  • Rename A to B: cache is empty again (due to unlink of A; unclear if we should instead expect B to exist in cache after this, but I suspect yes?)
  • If you open B at this point, s3fs correctly re-fetches the data from S3 and fills the cache. No problemo.
  • If instead you chmod B: it creates a zeroed out cache entry for B.
  • Open: uploads that zeroed out cache entry for B, to S3, and also returns it to the user of the fuse mount.

(Note: the stat cache mirrors the cache in all cases - when the cache is empty the stat cache is also. So at least a disparity there is unlikely to be at fault.)

The more I (admittedly not a C++ or filesystem dev) look at this the more I'm wondering why the rename isn't resulting in a rename in the cache; that appears to be the root of the issue.

The chmod (setattr/getattr underneath, iirc) is probably a second-order bug that might also need fixing (refresh from S3 when cache is empty, instead of spitting out a bunch of zeroed pagelist blocks or whatever it's doing now) but would likely not be triggered if the other is patched.


@gaul commented on GitHub (Feb 11, 2021):

Thanks for digging into this! I will look into this again but it would help greatly if you can send a PR with a minimized and failing test case to test/integration-test-main.sh


@gaul commented on GitHub (Feb 11, 2021):

Also if you are working on a fix, consider the simplest approach of flush or invalidate instead of a more complicated one preserving the cache contents. While performance is important, we can work on that in a subsequent commit.


@gaul commented on GitHub (Feb 11, 2021):

Here is a minimal test case, transliterated from the description from @bitprophet:

function test_issue1453 {
    describe "issue 1453"
    echo foo > A
    mv A B
    chmod +x B
    cmp B <(echo foo)
}

@gaul commented on GitHub (Feb 11, 2021):

We can further minimize this by creating the object with the AWS CLI:

function test_issue1453 {
    describe "issue 1453"
    OBJECT_NAME="$(basename $PWD)/B"
    echo "foo" | aws_cli s3 cp - "s3://${TEST_BUCKET_1}/${OBJECT_NAME}"
    chmod +x B
    cmp B <(echo foo)
}

s3fs seems to do an unnecessary flush when it knows nothing about the remote object and a user modifies its metadata.


@gaul commented on GitHub (Feb 11, 2021):

I can verify that https://github.com/s3fs-fuse/s3fs-fuse/issues/1453#issuecomment-710085421 addresses the symptom with the above test case. I believe that the logic here is incomplete -- it considers that the mtime is not -1 because a file write has occurred but instead it is due to the chmod. @ggtakec what do you think?


@ggtakec commented on GitHub (Feb 17, 2021):

I'm sorry I was late in noticing this issue.
@gaul's change may work, but it needs a little more time to check.

But when I was checking the log file, I found something wrong.

2021-02-17T14:38:09.272Z [INF]   s3fs.cpp:rename_object(1241): [from=/yyy][to=/zzz]
2021-02-17T14:38:09.273Z [INF]     s3fs.cpp:put_headers(746): [path=/zzz]
    ...                             ...
    ...                             ...
2021-02-17T14:38:09.475Z [INF] s3fs.cpp:s3fs_unlink(1085): [path=/yyy]
2021-02-17T14:38:09.475Z [INF]       curl.cpp:DeleteRequest(2713): [tpath=/yyy]
2021-02-17T14:38:09.475Z [INF]       curl_util.cpp:prepare_url(250): URL is https://s3.amazonaws.com/mytestbucket/yyy
2021-02-17T14:38:09.475Z [INF]       curl_util.cpp:prepare_url(283): URL changed is https://mytestbucket.s3.amazonaws.com/yyy
2021-02-17T14:38:09.475Z [INF]       curl.cpp:insertV4Headers(2639): computing signature [DELETE] [/yyy] [] []
2021-02-17T14:38:09.475Z [INF]       curl_util.cpp:url_to_host(327): url is https://s3.amazonaws.com
2021-02-17T14:38:09.690Z [INF]       curl.cpp:RequestPerform(2286): HTTP response code 204
2021-02-17T14:38:09.691Z [INF]       fdcache.cpp:DeleteCacheFile(149): [path=/yyy]
2021-02-17T14:38:09.691Z [INF]       cache.cpp:DelStat(578): delete stat cache entry[path=/yyy]
2021-02-17T14:38:09.691Z [INF]       cache.cpp:DelSymlink(737): delete symbolic link cache entry[path=/yyy]
2021-02-17T14:38:09.691Z [INF]       cache.cpp:DelStat(578): delete stat cache entry[path=/zzz]
2021-02-17T14:38:09.691Z [INF]       fdcache.cpp:DeleteCacheFile(149): [path=/zzz]

In the process of s3fs.cpp:rename_object(), the file cache is also moved.
Then it deletes the original file and its cache.
However, as you can see in the last line above, it even deleted the file cache of the new file (zzz).
Clearing the stat cache for the new file is the correct behavior, and will be needed if the ctime changes (although more conditions should be put on the stat cache clear).

At this point, I think there was an inconsistency (the cache acted as if a cache file existed even though there was none), and the file was filled with 0x00 bytes.
After applying and trying the following patch, this issue seems to be resolved. (I will still test.)

diff --git a/src/s3fs.cpp b/src/s3fs.cpp
index 2a98eb8..05c588b 100644
--- a/src/s3fs.cpp
+++ b/src/s3fs.cpp
@@ -1270,7 +1270,6 @@ static int rename_object(const char* from, const char* to, bool update_ctime)
     result = s3fs_unlink(from);

     StatCache::getStatCacheData()->DelStat(to);
-    FdManager::DeleteCacheFile(to);

     return result;
 }
@@ -1323,7 +1322,6 @@ static int rename_object_nocopy(const char* from, const char* to, bool update_ct

     // Stats
     StatCache::getStatCacheData()->DelStat(to);
-    FdManager::DeleteCacheFile(to);

     return result;
 }

I'm sorry, but it will take a little more time to completely check the logic and finish the tests.
I would be grateful if you could try the above changes.
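For reference, the rename-then-chmod sequence from the original report can be checked with a small script like the following. This is a minimal sketch, not part of the fix: `mntpoint` would normally be the s3fs mountpoint, but a temporary directory is substituted here so the check itself is self-contained. On an unpatched s3fs mount the final comparison would fail, because the renamed file kept its length but contained only \0 bytes.

```shell
#!/bin/bash
# Sketch of the corruption check from the original report.
# mntpoint would normally be the s3fs mountpoint; a temporary
# directory is used here so the script is self-contained.
mntpoint=$(mktemp -d)

tmp="$mntpoint/hello_$RANDOM$RANDOM"
target="$mntpoint/my_target"

echo "some non-zero payload" > "$tmp"
sum_before=$(md5sum "$tmp" | awk '{print $1}')

mv "$tmp" "$target"    # exercises the rename_object() path on s3fs
chmod 644 "$target"    # metadata update immediately after the rename

sum_after=$(md5sum "$target" | awk '{print $1}')

# With the bug, $target had the right length but contained only \0
# bytes, so the checksums differed; on a correct filesystem they match.
if [ "$sum_before" = "$sum_after" ]; then
    result=OK
else
    result=CORRUPTED
fi
echo "$result"    # prints OK on a correct filesystem

rm -rf "$mntpoint"
```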

@ggtakec commented on GitHub (Feb 17, 2021):

After applying the above patch, other tests failed.
I need to investigate a little more.


@ggtakec commented on GitHub (Feb 19, 2021):

@gaul (and all) I found the cause of this problem.
(The cache deletion shown above was one contributing factor, separate from the multipart HEAD request.)

The rename logic internally copies the file and then deletes the original file.
The cause of the bug was that s3fs updated the mtime immediately after this copy; at that moment a new cache file was created, and the contents of the file were null bytes.

I will publish the fix immediately.
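To illustrate why the reads returned null bytes, here is a stand-alone sketch (hypothetical, not s3fs code): a file extended to the correct size without ever being written reads back as all zeroes, which matches the symptom in the original report (correct length, contents all \0).

```shell
# Hypothetical illustration, not s3fs code: a cache file created at the
# correct size but never populated behaves like a sparse file, and every
# read of it returns null bytes.
f=$(mktemp)
truncate -s 16 "$f"    # 16-byte file, but no data ever written
hexdump_out=$(od -An -tx1 "$f" | tr -d ' \n')
echo "$hexdump_out"    # 32 zero hex digits: sixteen 0x00 bytes
rm -f "$f"
```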


@gaul commented on GitHub (Feb 20, 2021):

@init-js Could you test with the latest master?


@bitprophet commented on GitHub (Feb 24, 2021):

1.89 is already cut, but I just want to note that the patch which made it into the release DOES fix my reproducible issue. Thanks so much!

I'm hoping it also fixes my less-reproducible issue (which may be purely metadata modification, with no renaming; in that case I'd guess the metadata-modified files were not in the cache to begin with), but time will tell.


@gaul commented on GitHub (Feb 25, 2021):

@bitprophet Thanks for working with us on this issue! Good bug reports make a big difference in finding a fix. If there are other symptoms let's open a new issue and address that separately.
