mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 05:16:00 +03:00
[GH-ISSUE #2709] first line(cache file stat) is different: "49425:0" != "49425:26214400" #1280
Originally created by @gaul on GitHub (Aug 22, 2025).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2709
CI shows this symptom sometimes and I reproduced it locally in Valgrind:
@ggtakec commented on GitHub (Aug 23, 2025):
@gaul Thanks, I was able to reproduce it, so I will investigate.
@ggtakec commented on GitHub (Aug 24, 2025):
Although I was able to reproduce this issue, it appears to be a bug with very low reproducibility (it occurred less than 0.1% of the time).
The likely cause is that the file that manages the stats information for the FileCache file is read after an update has started but before the changes are reflected.
I was unable to find any other explanation for such a low occurrence rate.
In fact, no sync was performed after updating the stats information for the FileCache file, which makes this cause highly likely.
I have submitted PR #2710 to fix this, so please check it out.
@gaul commented on GitHub (Aug 24, 2025):
Example from CI: https://github.com/s3fs-fuse/s3fs-fuse/actions/runs/16602135084/job/47222510366?pr=1867
@ggtakec commented on GitHub (Aug 26, 2025):
I posted a new PR, #2714, for this issue and closed #2710.
Please check it.
@gaul commented on GitHub (Aug 27, 2025):
I see a different symptom now:
From: https://productionresultssa7.blob.core.windows.net/actions-results/0e186977-0912-4e07-94d3-b6d76abf919b/workflow-job-run-e4bb639a-e3e5-5382-9d4a-c186d04f0676/logs/job/job-logs.txt?rsct=text%2Fplain&se=2025-08-27T02%3A47%3A26Z&sig=YPA%2B5%2Bfwyg785bCqZWYDnbm4js5J0mw%2F5mJQDU9eApA%3D&ske=2025-08-27T11%3A07%3A45Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2025-08-26T23%3A07%3A45Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2025-05-05&sp=r&spr=https&sr=b&st=2025-08-27T02%3A37%3A21Z&sv=2025-05-05
@ggtakec commented on GitHub (Aug 27, 2025):
Thanks for reporting it, I'll look into it.
@ggtakec commented on GitHub (Aug 28, 2025):
It was difficult to reproduce the "could not get first or end line from cache file stat:" error locally, but I reviewed the code again.
The possibility of the above error occurring with test_cache_file_stat is the same as last time: it lies in File upload -> flush -> release.
PageList::Serialize() is called during flush and release, and it seems that the file was read during the release process.
The problem is that in PageList::Serialize(), the procedure for updating the FileCacheStat file contents is ftruncate(0 byte) -> pwrite. If the file is read in between these steps, the file will be 0 bytes.
Since this possibility is not zero, one of the following solutions is required:
(1) rename (guarantees atomic operation) or (2) lock the file.
Since rename would make the processing too complicated, I originally planned to modify the code to use method (2), flock.
On gaul's advice, I changed my mind and decided to avoid this by using rename instead of flock.
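For readers unfamiliar with the pattern, here is a minimal sketch of option (1), assuming a POSIX system. It is not the code from the PR: write_file_atomically and read_whole_file are hypothetical helper names, and the temporary-file naming is an assumption made for illustration. The new contents go to a temporary file in the same directory, which is then renamed over the target, so a reader that opens the path sees either the complete old file or the complete new one and never the 0-byte state between ftruncate and pwrite.

```cpp
#include <cerrno>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper (not s3fs code): replace the contents of `path`
// by writing a temporary file next to it and renaming it over the target.
// Note: mkstemp() creates the file with mode 0600, so the permissions of
// an existing target file are not preserved by this sketch.
bool write_file_atomically(const std::string& path, const std::string& contents)
{
    std::string tmp = path + ".XXXXXX";   // mkstemp() fills in the X's
    int fd = mkstemp(&tmp[0]);
    if (fd < 0) {
        return false;
    }

    bool ok = true;
    size_t done = 0;
    while (ok && done < contents.size()) {
        ssize_t n = pwrite(fd, contents.data() + done,
                           contents.size() - done, static_cast<off_t>(done));
        if (n < 0) {
            if (errno == EINTR) {
                continue;                  // interrupted, retry the write
            }
            ok = false;
        } else {
            done += static_cast<size_t>(n);
        }
    }
    if (ok && fsync(fd) != 0) {            // flush data before publishing it
        ok = false;
    }
    if (close(fd) != 0) {
        ok = false;
    }
    // rename() replaces the target name in a single step, so there is no
    // window in which `path` refers to a truncated file.
    if (ok && std::rename(tmp.c_str(), path.c_str()) != 0) {
        ok = false;
    }
    if (!ok) {
        unlink(tmp.c_str());               // clean up the temporary file
    }
    return ok;
}

// Hypothetical helper: read the whole file into a string (empty on failure).
std::string read_whole_file(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}
```

A caller would use write_file_atomically(stat_path, new_contents) in place of the truncate-then-write sequence. The trade-off is that a process holding an already-open descriptor keeps seeing the old file, so readers need to reopen the path to observe the update.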
@ggtakec commented on GitHub (Aug 30, 2025):
I think I've fixed this issue.
I'll close it and reopen it if the problem persists.