[GH-ISSUE #1499] Local cached file's MD5 is bad. #787

Closed
opened 2026-03-04 01:48:48 +03:00 by kerem · 4 comments

Originally created by @fly3366 on GitHub (Dec 18, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1499

Additional Information

The following information is very important in order to help us to help you. Omission of the following details may delay your support request or receive no attention at all.
Keep in mind that the commands we provide to retrieve information are oriented to GNU/Linux Distributions, so you could need to use others if you use s3fs on macOS or BSD

Version of s3fs being used (s3fs --version)

1.87

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

_example: 2.9.3_15

Kernel information (uname -r)

4.9

GNU/Linux Distribution, if applicable (cat /etc/os-release)

debian jessie(8)

s3fs command line used, if applicable

```
use_cache /tmp
stat_cache_expire 86400
enable_content_md5
big_writes
multipart 128
multi_req 4
```

#### /etc/fstab entry, if applicable

Running in a container; fstab is not used.

#### s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)
_if you execute s3fs with dbglevel, curldbg option, you can get detail debug messages_

For example:
[ERR] curl.cpp:RequestPerform(2316): ### CURLE_WRITE_ERROR
[ERR] curl.cpp:RequestPerform(2448): ### giving up
[WAN] curl_multi.cpp:MultiPerform(171): thread failed - rc(-5)

### Details about issue
1. With no local cache: the MD5 is wrong.
2. After step 1, with the local cache present: the MD5 is still wrong, and running md5sum directly on the cached file also gives a wrong result.
3. After deleting the old cache and trying again: the MD5 is correct.

The MD5 is computed with Python's hashlib.

I diffed the cached file against the correct file. They are the same size, but their contents differ.

Q: Does the log indicate a network error?
Q: If I don't have the metadata (which includes the hash), how can I get a correct file?
Q: Does s3fs use the ETag?
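
For anyone trying to reproduce the check described above, here is a minimal sketch of computing a file's MD5 with Python's hashlib. The function name `file_md5` and the chunk size are illustrative assumptions, not the reporter's actual script.

```python
import hashlib


def file_md5(path, chunk_size=8 * 1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks so that large objects
    do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running it once on the file under the s3fs mount point and once on the file inside the cache directory, then comparing both results against a digest known to be correct, separates a bad cache from a bad read path.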
kerem closed this issue 2026-03-04 01:48:49 +03:00

@fly3366 commented on GitHub (Dec 18, 2020):

After looking at the log, I see that curl's request timed out. Might s3fs have hit an I/O error and left the file incomplete?


@fly3366 commented on GitHub (Dec 18, 2020):

Could curl be the cause? The log has: `The S3FS_CURLOPT_KEEP_SENDING_ON_ERROR option could not be set. For maximize performance you need to enable this option and you should use libcurl 7.51.0 or later.`


@fly3366 commented on GitHub (Dec 18, 2020):

I diffed the correct file against the cached file. The ranges 0-128M and 256M-end are correct, but 128M-256M is wrong. Is some checksum method needed?
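
A comparison like the one above can be sketched as follows. This is only an illustration: the function name `differing_chunks` is hypothetical, and the 128 MiB default is an assumption chosen to match the `multipart 128` setting in the report. It assumes `block_size` evenly divides `chunk_size`, so a read never straddles a chunk boundary.

```python
def differing_chunks(path_a, path_b,
                     chunk_size=128 * 1024 * 1024,
                     block_size=1024 * 1024):
    """Return the sorted indices of chunks whose bytes differ.

    Chunk i covers bytes [i * chunk_size, (i + 1) * chunk_size).
    Both files are read in small blocks to keep memory use low.
    """
    bad = set()
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break  # both files are exhausted
            if a != b:
                bad.add(offset // chunk_size)
            offset += max(len(a), len(b))
    return sorted(bad)
```

With the defaults, a result of `[1]` would correspond to the 128M-256M range being the only one that differs.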


@gaul commented on GitHub (Dec 22, 2020):

If you can find a way to reproduce the symptoms please reopen the issue. s3fs does not use ETags to check file integrity since it allows reading only parts of files and does not require downloading the entire object. Generally HTTPS should suffice to ensure data keeps its integrity during transfer but it is possible that a logic error within s3fs could misuse its own cache.
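
Since s3fs does not verify ETags itself, a user who wants an end-to-end check has to do it outside s3fs. The sketch below is based on the commonly observed (but not guaranteed) convention that the ETag of an object uploaded via multipart upload is the MD5 of the concatenated binary MD5 digests of its parts, followed by `-` and the part count. That convention is an assumption here; it does not hold for all objects (for example, some server-side encryption modes), and the part size must match the one used at upload time. The function name `multipart_etag` is hypothetical.

```python
import hashlib


def multipart_etag(path, part_size, block_size=1024 * 1024):
    """Compute a multipart-style ETag for a local file.

    ASSUMPTION: ETag == md5(concat(md5(part_i).digest())) + "-" + part count.
    """
    part_digests = []
    with open(path, "rb") as f:
        while True:
            part = hashlib.md5()
            remaining = part_size
            # Stream one part in small blocks instead of loading it whole.
            while remaining > 0:
                data = f.read(min(block_size, remaining))
                if not data:
                    break
                part.update(data)
                remaining -= len(data)
            if remaining == part_size:
                break  # no bytes were read: end of file
            part_digests.append(part.digest())
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

If the value computed from a locally cached file matches the ETag reported by the object store (with surrounding quotes stripped), the cached copy very likely matches the stored object; a mismatch is only meaningful when the part size and the ETag convention are known to apply.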
