[GH-ISSUE #2193] s3fs is order of magnitude slower in scanning directory tree than direct s3 access #1117
Originally created by @kgabor on GitHub (Jun 23, 2023).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2193
A similar problem was reported in #1465.
We have several zarr datasets stored in S3 buckets that consist of hundreds of thousands of 10-40 MB chunk objects arranged in an index-tree-like directory structure. A public example is s3://aind-open-data/exaSPIM_653431_2023-05-06_10-23-15/exaSPIM.zarr/tile_x_0000_y_0000_z_0000_ch_488.zarr/, which has 159,911 objects and a total size of 1.1 TB.
Traversing (listing) these directory structures (or stat-ing a pre-existing list of these objects) is an order of magnitude slower via s3fs than using the S3 API directly. I could not get any notable performance improvement from the multireq_max or parallel_count parameters; setting multireq_max to high values like 1024 seems to make performance even worse. The use case is that the processing application uses s3fs and checks for the existence of each chunk at opening, so the overall input data rate remains very limited (at most ~1.2 GB/min), irrespective of the number of reader threads and s3fs mount parameters. Why?
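For a concrete sense of the gap, the comparison is roughly between the following two operations (exact paths are illustrative):
    # direct S3 listing of the prefix (fast)
    time aws s3 ls --recursive s3://aind-open-data/exaSPIM_653431_2023-05-06_10-23-15/exaSPIM.zarr/ > /dev/null
    # traversal of the same tree through the s3fs mount (an order of magnitude slower)
    time find ./aind-open-data/exaSPIM_653431_2023-05-06_10-23-15/exaSPIM.zarr/ -type f > /dev/null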
Additional Information
Version of s3fs being used (s3fs --version): V1.90
Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse or dpkg -s fuse): 3.10.5-1build1
Kernel information (uname -r): 5.19.0-1027-aws
GNU/Linux Distribution, if applicable (cat /etc/os-release): Ubuntu 22.04.2 LTS
How to run s3fs, if applicable:
[ ] command line
[ ] /etc/fstab
sudo s3fs aind-scratch-data ./aind-scratch-data -o rw,allow_other,umask=0002,uid=$(id -u),gid=$(id -g),use_cache=/home/ubuntu/s3cache,ensure_diskfree=200000,parallel_count=16,nomultipart,multireq_max=32
@gaul commented on GitHub (Jun 24, 2023):
s3fs 1.91 reduces the number of HEAD requests, but something is wrong if we don't get more speedup with more parallelism. See #1482 for background on how to make readdir much faster at the cost of POSIX compatibility.
@ggtakec commented on GitHub (Jun 25, 2023):
@kgabor
I think that if the s3fs operations triggered by the find command involve recursive checks on directories, etc., they may slow the operation down. To solve this, it may be effective to increase the size of the file stat cache with max_stat_cache_size. This cache holds stat information for files that have already been read once, so in your case set it higher than 159,911.
Hopefully this will improve performance.
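For example, applying this to the mount command quoted above might look like the following (a sketch; the exact value only needs to exceed the number of objects in the tree you traverse):
    sudo s3fs aind-scratch-data ./aind-scratch-data -o rw,allow_other,umask=0002,uid=$(id -u),gid=$(id -g),use_cache=/home/ubuntu/s3cache,ensure_diskfree=200000,parallel_count=16,nomultipart,multireq_max=32,max_stat_cache_size=200000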
@kgabor commented on GitHub (Jun 27, 2023):
@gaul I'm experimenting with max_stat_cache_size=5000000,stat_cache_expire=1300000. Would the idea of pre-filling the stat cache by running a find command work? Is this a memory-only cache? (I only see entries in the .aind-open-data.stat cache dir for files that were actually opened.)
A first experiment with starting the processing job along with find in parallel did not give any performance improvement. (I expected a nice speedup once find fills up the chunk file stat cache, but nothing...)
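(For reference, the pre-warming attempt was roughly the following, with the path being illustrative:)
    # run in the background while the processing job is active, to pre-fill the stat cache
    find ./aind-open-data/exaSPIM_653431_2023-05-06_10-23-15/exaSPIM.zarr/ -type f > /dev/null &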
@kgabor commented on GitHub (Jun 28, 2023):
Latest caching experiment: cache dir empty, fresh s3fs mount, then a find over the whole tree.
This proceeds at 100-200 files/s and finishes in a few hours (959,467 objects). Nothing appears in ~/s3cache, and the s3fs process has 3-4 GB memory usage.
Now, if find is repeated, it's much faster, several thousand files/s, so the cache is working.
Now, if I start the data processing, which reads these very same dirs on 32 threads, re-running find in the meantime gets very slow, <100 files/s! If I stop the data processing, find gets fast again.
I suspect there must be a locking bottleneck in cache access or something similar in the concurrency handling. This is also supported by the observation that I/O throughput (in data processing) is mostly independent of the number of data processing threads, and CPU usage remains well below 100% (i.e. it is limited by data rate). Also, the s3fs parallel_count and multireq_max options make little difference.
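If it helps, a crude way to check whether metadata throughput scales with concurrency could be something like the following (paths, sample size, and thread counts are illustrative; the first pass warms the stat cache, so repeated runs are the ones worth comparing):
    # collect a sample of object paths through the mount
    find ./aind-open-data/exaSPIM_653431_2023-05-06_10-23-15/exaSPIM.zarr/ -type f | head -n 20000 > /tmp/files.txt
    # stat them with increasing parallelism and compare wall-clock times
    for p in 1 4 16 32; do
      echo "parallelism=$p"
      time xargs -a /tmp/files.txt -d '\n' -P "$p" -n 100 stat -c %s > /dev/null
    done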