[GH-ISSUE #303] du -skh <directory> takes too much time #158

Closed
opened 2026-03-04 01:42:43 +03:00 by kerem · 4 comments

Originally created by @dejlek on GitHub (Nov 26, 2015).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/303

Today I compiled and installed s3fs from the master branch. One of the things I noticed is that `du -skh <directory>` takes too long (20+ minutes) for a directory that contains only a few subdirectories, which together hold ~200 files (none of them big).

Is this an expected behaviour?

kerem closed this issue 2026-03-04 01:42:43 +03:00

@gaul commented on GitHub (Nov 28, 2015):

This is inherent to the design of s3fs, which strives for close POSIX compatibility. Every call to `getdents` requires an S3 HEAD object call per file to fill the `stat` struct. [goofys](https://github.com/kahing/goofys) has looser POSIX compatibility, and its calls to `getdents` require only one S3 list objects call instead of one per file. This yields vastly improved performance for `du`.
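
The request-count difference described above can be sketched with a small model (illustrative only, not s3fs or goofys code; the 1000-keys-per-page figure is the standard S3 LIST page limit):

```python
import math

def s3fs_style_requests(n_files: int) -> int:
    """One LIST to enumerate the directory, plus one HEAD per file
    to fill each stat struct."""
    return 1 + n_files

def goofys_style_requests(n_files: int) -> int:
    """LIST responses already carry size and mtime, so only paginated
    LIST calls are needed (at most 1000 keys per page)."""
    return max(1, math.ceil(n_files / 1000))

print(s3fs_style_requests(200))    # 201 round-trips for the reporter's ~200 files
print(goofys_style_requests(200))  # 1 round-trip
```

With every HEAD being a serial network round-trip, 201 requests versus 1 explains why `du` over this directory takes minutes instead of seconds.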


@ggtakec commented on GitHub (Dec 13, 2015):

Hi, @dejlek
s3fs issues extra HEAD requests for compatibility with other S3 clients. We could drop this compatibility, but it has been kept so far, and it affects the speed of operations like this one.

To alleviate this, s3fs can cache the stat information for each object. You can tune this with the `max_stat_cache_size`, `stat_cache_expire`, and `enable_noobj_cache` options. To reduce unnecessary requests, I recommend using the `enable_noobj_cache` option.

If you can, please try setting these options.
Thanks in advance for your help.
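
A minimal mount command using the three options named above (bucket name, mount point, and the specific values are placeholders; only the option names come from the comment):

```shell
# Hypothetical mount; tune the values for your workload.
# max_stat_cache_size: number of stat entries to cache in memory
# stat_cache_expire:   seconds before a cached entry expires
# enable_noobj_cache:  also cache "object does not exist" lookups
s3fs mybucket /mnt/s3 \
  -o max_stat_cache_size=100000 \
  -o stat_cache_expire=900 \
  -o enable_noobj_cache
```

With a warm cache, a second `du` over the same tree can be served from memory instead of repeating one HEAD request per file.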


@gaul commented on GitHub (Jan 24, 2019):

@dejlek master includes a readdir optimization that should reduce your run-time, and setting `-o multireq_max` greater than 20 will also improve speed. However, s3fs does not do any kind of read-ahead optimization, which would greatly accelerate this operation. I have been experimenting with parallel Unix tools for this situation but am not ready to share my progress.
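
One way to approximate such parallelism today (a hypothetical workaround sketch, not the tool mentioned above; the mount path is a placeholder and `stat --format` assumes GNU coreutils) is to warm s3fs's stat cache by stat()ing files in parallel before running `du`, so the per-file HEAD requests overlap instead of running serially:

```shell
# Stat up to 8 batches of 16 files concurrently to pre-fill the
# stat cache, then run du against the now-warm cache.
find /mnt/s3/mydir -type f -print0 \
  | xargs -0 -n 16 -P 8 stat --format='%n' > /dev/null
du -skh /mnt/s3/mydir
```

This only helps if the cache (see `max_stat_cache_size`) is large enough to hold all the entries until `du` reads them.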


@ggtakec commented on GitHub (Mar 29, 2019):

We have released new version 1.86, which tunes several performance issues.
Please try it, or the master branch code.
I will close this issue, but if the problem persists, please reopen it or post a new issue.
Thanks in advance for your assistance.
