[GH-ISSUE #2438] Deadlock for FdManager::fd_manager_lock && FdEntity::fdent_lock when write data concurrently #1199

Closed
opened 2026-03-04 01:52:09 +03:00 by kerem · 7 comments

Originally created by @Roay on GitHub (Mar 27, 2024).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2438

s3fs version : master

Two call paths under s3fs_write take the same two locks in opposite order:

Path 1:
s3fs_write -> 'autoent.GetExistFdEntity' -> 'FdManager::get()->GetExistFdEntity'
first lock: FdManager::fd_manager_lock [line: 659 in fdcache.cpp]
second lock: FdEntity::fdent_lock [line: 361 in fdcache_entity.cpp]

Path 2:
s3fs_write -> 'ent->Write()' -> 'WriteMixMultipart()' -> 'NoCacheLoadAndPost' -> 'ChangeEntityToTempPath'
first lock: FdEntity::fdent_lock [line: 2088 in fdcache_entity.cpp]
second lock: FdManager::fd_manager_lock [line: 780 in fdcache.cpp]
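
For readers unfamiliar with this pattern, the following is a minimal sketch of the lock-order inversion described above and one generic way to avoid it. It is not s3fs code and is not necessarily what the eventual fix does: `manager_lock`, `entity_lock`, `lookup_path`, `write_path`, and `run_concurrent_paths` are hypothetical stand-ins, and the sketch assumes both locks can be taken at a single point, which is not true of the nested calls in the real code paths. If each thread locked its first mutex and then blocked on the second, the two threads could wait on each other forever; `std::lock` acquires both with a deadlock-avoidance algorithm instead.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for FdManager::fd_manager_lock and FdEntity::fdent_lock.
static std::mutex manager_lock;
static std::mutex entity_lock;
static int shared_counter = 0;

// Mirrors the first path: it names the manager lock first, then the entity lock.
static void lookup_path(int iterations)
{
    for(int i = 0; i < iterations; ++i){
        std::unique_lock<std::mutex> mgr(manager_lock, std::defer_lock);
        std::unique_lock<std::mutex> ent(entity_lock, std::defer_lock);
        // std::lock takes both mutexes without deadlocking, regardless of
        // the order in which other threads request them.
        std::lock(mgr, ent);
        ++shared_counter;
    }
}

// Mirrors the second path: it names the entity lock first, then the manager lock.
static void write_path(int iterations)
{
    for(int i = 0; i < iterations; ++i){
        std::unique_lock<std::mutex> ent(entity_lock, std::defer_lock);
        std::unique_lock<std::mutex> mgr(manager_lock, std::defer_lock);
        std::lock(ent, mgr);
        ++shared_counter;
    }
}

// Runs both paths concurrently and returns the number of completed critical sections.
int run_concurrent_paths(int iterations)
{
    assert(iterations >= 0);
    shared_counter = 0;
    std::thread t1(lookup_path, iterations);
    std::thread t2(write_path, iterations);
    t1.join();
    t2.join();
    return shared_counter;
}
```

Calling `run_concurrent_paths(n)` from a test or a `main` function should return `2 * n` once both threads finish. When the two acquisitions happen in different functions, as in the paths listed above, the usual alternatives are to agree on a single global lock order or to release the first lock before taking the second.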

kerem closed this issue 2026-03-04 01:52:09 +03:00

@Roay commented on GitHub (Mar 27, 2024):

#Triggered (https://github.com/s3fs-fuse/s3fs-fuse/pull/1151) fix deadlock in clean up cache

@amarjayr commented on GitHub (Jun 10, 2024):

@gaul apologies for the direct ping but any updates here? I've included a python script to reproduce the deadlock in #2463 (which I closed as duplicate).

I'm happy to submit a fix, but could use some guidance on the recommended solution.


@ggtakec commented on GitHub (Jun 23, 2024):

@Roay
Thank you for your detailed explanation.
@amarjayr
And also for providing the code for testing, thanks!

I checked the calling sequence you pointed out and confirmed that a deadlock is possible.
Fixing it seems to require a lot of checking because there is an issue with the logic, so please wait a little while.


@ggtakec commented on GitHub (Jun 23, 2024):

@Roay @amarjayr
I have posted PR #2478 as a solution to this problem.
If you can test the problem with the source code, please try it.
Thanks in advance for your assistance.


@amarjayr commented on GitHub (Jun 24, 2024):

@ggtakec this fixes the repro provided in #2463.

Thanks for the quick turnaround!


@ggtakec commented on GitHub (Jun 25, 2024):

@amarjayr Thank you for your cooperation, letting us know that the issue has been resolved.
We expect the PR will be merged after review, please wait a little while until then.


@ggtakec commented on GitHub (Jul 1, 2024):

#2478 was merged, I closed this.
Thanks.
