mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #1892] Add "Performance considerations" to manpage #962
Originally created by @CarstenGrohmann on GitHub (Feb 16, 2022).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1892
Dear S3FS Developer,
what do you think about adding the section below to the manpage?
PERFORMANCE CONSIDERATIONS
This section discusses settings to improve s3fs performance.
In most cases, backend performance cannot be controlled and is therefore not part of this discussion.
Details of local storage usage are discussed in "LOCAL STORAGE CONSUMPTION".
CPU and Memory Consumption
s3fs is a multi-threaded application. Depending on the workload it may use multiple CPUs and a certain amount of memory. You can monitor the CPU and memory consumption with the "top" utility.
Performance of S3 requests
s3fs provides several options (e.g. "-o multipart_size", "-o parallel_count") to control its behaviour and thus, indirectly, its performance. The possible combinations of these options with the various S3 backends are so varied that no general recommendation can be given beyond the default values. Better settings for a specific setup can only be found by testing and measuring.
Two options, "Enable no object cache" ("-o enable_noobj_cache") and "Disable support of alternative directory names" ("-o notsup_compat_dir"), can reduce the number of requests. Whether they are safe to use depends on how the bucket is shared with other applications:
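As a starting point for such testing, the following minimal sketch enumerates one mount command per combination of the two options named above. The bucket name, the mount point and the tested values are hypothetical placeholders, not recommendations; the sketch only builds the command lines and leaves mounting and timing to the reader.

```python
from itertools import product

# Hypothetical bucket and mount point; replace with your own.
BUCKET = "example-bucket"
MOUNTPOINT = "/mnt/example"


def s3fs_command(multipart_size_mb, parallel_count):
    """Return the argv list for one s3fs mount with the given settings."""
    return [
        "s3fs", BUCKET, MOUNTPOINT,
        "-o", f"multipart_size={multipart_size_mb}",
        "-o", f"parallel_count={parallel_count}",
    ]


def candidate_commands(sizes_mb, counts):
    """Enumerate one mount command per (multipart_size, parallel_count) pair."""
    return [s3fs_command(size, count) for size, count in product(sizes_mb, counts)]


if __name__ == "__main__":
    # The values below are arbitrary test points, not recommendations.
    for argv in candidate_commands([10, 50], [5, 20]):
        print(" ".join(argv))
```

Each printed line can be used to mount the bucket once, run the same workload, and record the elapsed time before unmounting and trying the next combination.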
Enable no object cache ("-o enable_noobj_cache")
If a bucket is used exclusively by an s3fs instance, you can enable the cache for non-existent files and directories with "-o enable_noobj_cache". This eliminates repeated requests to check the existence of an object, saving time and possibly money.
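The effect can be illustrated with a toy model. This is not s3fs code: the class and its names are hypothetical, and a counter stands in for requests sent to the backend. It only shows how remembering "does not exist" answers reduces repeated existence checks.

```python
class ToyBucketClient:
    """Toy model of existence checks against a bucket; not s3fs code."""

    def __init__(self, existing_keys, cache_missing=False):
        self.existing_keys = set(existing_keys)
        self.cache_missing = cache_missing
        self.known_missing = set()   # keys already reported as non-existent
        self.requests_sent = 0       # stands in for requests to the backend

    def exists(self, key):
        if self.cache_missing and key in self.known_missing:
            return False             # answered locally, no request sent
        self.requests_sent += 1
        found = key in self.existing_keys
        if self.cache_missing and not found:
            self.known_missing.add(key)
        return found


if __name__ == "__main__":
    for cache_missing in (False, True):
        client = ToyBucketClient({"data/file.txt"}, cache_missing=cache_missing)
        for _ in range(3):
            client.exists("data/missing.txt")
        print(cache_missing, client.requests_sent)
```

The model also hints at why exclusive use matters: if another application created "data/missing.txt" after the first check, the cached answer would be stale.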
Disable support of alternative directory names ("-o notsup_compat_dir")
s3fs supports "dir/", "dir" and "dir_$folder$" to map directory names to S3 objects and vice versa.
Some applications use a different naming scheme to associate directory names with S3 objects. For example, Apache Hadoop uses the "dir_$folder$" scheme to create S3 objects for directories.
The option "-o notsup_compat_dir" can be set if all tools accessing the bucket use the "dir/" naming scheme for directory objects and the bucket does not contain any objects with a different naming scheme. In this case, access to directory objects saves time and possibly money because the alternative schemes are not checked.
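The saving can be sketched by listing the object names that may have to be checked for one directory. The function below is a hypothetical illustration based only on the three schemes named above; the order in which s3fs actually checks them is an assumption.

```python
def directory_object_candidates(path, compat_dir=True):
    """Return the object names that might represent the directory `path`.

    Hypothetical helper for illustration; the order of the candidates
    is an assumption, not taken from the s3fs source.
    """
    name = path.rstrip("/")
    if not compat_dir:
        # Only the "dir/" scheme is considered.
        return [name + "/"]
    # All three schemes named in the text: "dir/", "dir", "dir_$folder$".
    return [name + "/", name, name + "_$folder$"]


if __name__ == "__main__":
    print(directory_object_candidates("data/logs"))
    print(directory_object_candidates("data/logs", compat_dir=False))
```

With alternative names disabled, one candidate remains instead of three, so fewer lookups are needed per directory.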
@ggtakec commented on GitHub (Feb 19, 2022):
@CarstenGrohmann Thanks for your kindness.
I agree with the addition to the man page.
(I think it should perhaps also be added to a new (or existing) page in the wiki.)
As @gaul also points out in #1866, I feel that the man page (and the wiki) need an overall review.
@gaul Please give us your opinion.