[GH-ISSUE #1892] Add "Performance considerations" to manpage #962

Closed
opened 2026-03-04 01:50:15 +03:00 by kerem · 1 comment

Originally created by @CarstenGrohmann on GitHub (Feb 16, 2022).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1892

Dear S3FS Developer,

What do you think about adding the section below to the manpage?

PERFORMANCE CONSIDERATIONS

This section discusses settings to improve s3fs performance.

In most cases, backend performance cannot be controlled and is therefore not part of this discussion.

Details of local storage usage are discussed in "LOCAL STORAGE CONSUMPTION".

CPU and Memory Consumption

s3fs is a multi-threaded application. Depending on the workload, it may use multiple CPUs and a significant amount of memory. You can monitor its CPU and memory consumption with the "top" utility.
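
As a rough sketch of the monitoring mentioned above, the following shell function takes a one-shot snapshot with "top" in batch mode. It assumes the procps versions of `pgrep` and `top` (as found on most Linux systems) and that the process is named `s3fs`. The function name `s3fs_snapshot` is made up for this example.

```shell
# Hypothetical helper: print one batch-mode "top" snapshot for all s3fs processes.
# Assumes procps "pgrep" and "top"; returns 1 if no s3fs process is found.
s3fs_snapshot() {
    pids=$(pgrep -d, -x s3fs) || {
        echo "no s3fs process found" >&2
        return 1
    }
    # -b: batch mode, -n 1: a single iteration, -p: restrict to the given PIDs
    top -b -n 1 -p "$pids"
}
```

Calling `s3fs_snapshot` while a workload is running shows the CPU and memory columns for the s3fs processes only. Adding `-H` to the `top` call should list the individual threads as well.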

Performance of S3 requests

s3fs provides several options (e.g. "-o multipart_size", "-o parallel_count") to control its behaviour and thus, indirectly, its performance. The possible combinations of these options with the various S3 backends are so varied that no general recommendation can be given beyond the default values. Better settings for a particular setup can only be found by testing and measuring.
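
One way to approach such testing, sketched here as a dry run, is to list the combinations you want to compare and then mount and measure each one with your real workload. The bucket name, mount point, and value lists below are placeholders chosen for illustration, not recommendations. As far as the man page describes them, "multipart_size" is given in MB and "parallel_count" is a number of parallel requests.

```shell
#!/bin/sh
# Dry-run sketch: print one s3fs mount command per option combination.
# BUCKET and MOUNTPOINT are placeholders; the value lists are arbitrary examples.
BUCKET="mybucket"
MOUNTPOINT="/mnt/s3fs-test"

print_candidates() {
    for multipart_size in 10 32 64; do        # part size (assumed to be in MB)
        for parallel_count in 5 10 20; do     # number of parallel requests
            echo "s3fs $BUCKET $MOUNTPOINT -o multipart_size=$multipart_size -o parallel_count=$parallel_count"
        done
    done
}

print_candidates
```

Each printed command can then be run in turn, followed by the same read or write test and an unmount, so that the timings are comparable.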

The two options "Enable no object cache" ("-o enable_noobj_cache") and "Disable support of alternative directory names" ("-o notsup_compat_dir") can reduce the number of requests. Whether they can be used depends on how the bucket is shared with other applications:

  • Enable no object cache ("-o enable_noobj_cache")

    If a bucket is used exclusively by an s3fs instance, you can enable the cache for non-existent files and directories with "-o enable_noobj_cache". This eliminates repeated requests to check the existence of an object, saving time and possibly money.

  • Disable support of alternative directory names ("-o notsup_compat_dir")

    s3fs supports "dir/", "dir" and "dir_$folder$" to map directory names to S3 objects and vice versa.

    Some applications use a different naming scheme for associating directory names with S3 objects. For example, Apache Hadoop uses the "dir_$folder$" scheme to create S3 objects for directories.

    The option "-o notsup_compat_dir" can be set if all accessing tools use the "dir/" naming scheme for directory objects and the bucket does not contain any objects that follow a different naming scheme. In this case, access to directory objects is faster and possibly cheaper because the alternative schemes are not checked.
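
As a minimal sketch, a mount that combines both options could be put together like this. It assumes the bucket is used only by this one s3fs instance and contains only "dir/" style directory objects. The function name, bucket name, and mount point are placeholders made up for this example.

```shell
# Hypothetical helper: build an s3fs command line for a bucket that only this
# s3fs instance uses and that contains only "dir/" style directory objects.
exclusive_mount_cmd() {
    bucket=$1
    mountpoint=$2
    echo "s3fs $bucket $mountpoint -o enable_noobj_cache -o notsup_compat_dir"
}

# Print the command instead of running it, so it can be reviewed first.
exclusive_mount_cmd mybucket /mnt/s3
```

If other tools later start writing to the same bucket, both options should be reconsidered, since the conditions described above would no longer hold.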

kerem closed this issue 2026-03-04 01:50:16 +03:00

@ggtakec commented on GitHub (Feb 19, 2022):

@CarstenGrohmann Thanks for your kindness.
I agree with the addition to the man page.
(I think it should perhaps also be added to a new or existing page on the wiki.)

As @gaul also points out in #1866, I feel that the man page (and the wiki) needs an overall review.
@gaul Please give us your opinion.
