Mirror of https://github.com/s3fs-fuse/s3fs-fuse.git (synced 2026-04-25 21:35:58 +03:00)
[GH-ISSUE #1528] Suggestions for improvement #803
Originally created by @CarstenGrohmann on GitHub (Jan 21, 2021).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1528
Hello,
over the last few weeks I have been troubleshooting an eventual consistency issue with a third-party S3 system. During this time, I noticed three ways to improve s3fs-fuse.
1. Add the current time to the statistics headline enabled with "-o set_check_cache_sigusr1"
Please add the current date and time to the statistics output.
Current (fdcache.cpp:53):
Suggestion:
2. Add an error message explaining why the attribute update is failing in put_header(), and log this reason to syslog or at least to the debug output.
3. Include the current time in the output of the debug log.
This would simplify correlating log entries with external events in long-running jobs. In addition, it would make it possible to calculate the time span between two different file system operations.
What do you think about this?
Many greetings
Carsten
@gaul commented on GitHub (Jan 21, 2021):
This seems fine; would you like to submit a pull request? It also might be good to add a timestamp to the s3fs logs.
As for additional debugging info, anything at INFO level that helps users is fine since this is not logged by default.
I am curious about the eventual consistency you observed -- can you share more details? Mutating objects will always have this problem, although s3fs creates a zero-byte object for new files, which introduces another inconsistency opportunity: https://github.com/s3fs-fuse/s3fs-fuse/issues/1013#issuecomment-484076448.
@gaul commented on GitHub (Jan 22, 2021):
Also if you are trying to track down EC behavior, S3Proxy has a middleware that might help you simulate these issues.
@CarstenGrohmann commented on GitHub (Jan 23, 2021):
Yes, I'll submit a PR. It may take a few days.
Regarding the eventual consistency we observed: we use rsync to copy a set of files to an S3 bucket provided by local NetApp StorageGrid appliances, over a 10 Gbit connection from the server to the grid as well as between the grid nodes.
Observing only the s3fs connection to the S3 bucket, we see a few HEAD requests, followed by a series of PUT requests to create and write the file. All of these requests complete without error.
After the file is written, rsync executes a chown() call. In rare cases (fewer than 1 in 5000), the resulting PUT request is answered by the grid with 404 NoSuchKey.
According to NetApp, this behaviour is standards-compliant and is due to the asynchronous processing of the ILM policies.
The problem can be solved with two settings:
That was our consistency issue.