[PR #1801] [MERGED] ir: Add heuristic based LDS barrier pass #2314

Closed
opened 2026-02-27 21:16:01 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/shadps4-emu/shadPS4/pull/1801
Author: @raphaelthegreat
Created: 12/16/2024
Status: Merged
Merged: 12/19/2024
Merged by: @georgemoralis

Base: main ← Head: lds-barr


📝 Commits (2)

  • c5f065b ir: Add heuristic based LDS barrier pass
  • 875428e lds_barriers: Limit to nvidia

📊 Changes

6 files changed (+55 additions, -0 deletions)

View changed files

📝 CMakeLists.txt (+1 -0)
📝 src/shader_recompiler/ir/passes/ir_passes.h (+5 -0)
➕ src/shader_recompiler/ir/passes/shared_memory_barrier_pass.cpp (+46 -0)
📝 src/shader_recompiler/profile.h (+1 -0)
📝 src/shader_recompiler/recompiler.cpp (+1 -0)
📝 src/video_core/renderer_vulkan/vk_pipeline_cache.cpp (+1 -0)

📄 Description

Shaders sometimes use shared memory (LDS) without the appropriate barriers at the ISA level. This is probably because the original compiler optimized them away after finding them unnecessary (a local workgroup size of 64, for example, fits in a single 64-wide wavefront on the original hardware, so barriers are not needed there). This PR adds a new IR pass that attempts to insert such barriers to avoid device loss and graphics artifacts on NVIDIA.

Inserting barriers right after data share write instructions is not possible, because the barrier must sit outside any non-uniform conditional block. The process is somewhat naive at the moment: the pass walks the generated AST and, for shaders that use shared memory, inserts a barrier after every zero-depth divergent conditional block.
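The walk described above can be sketched roughly as follows. This is a minimal illustration on a toy AST, not the code from the PR: `Node`, `NodeType`, the `divergent` flag, and `InsertSharedMemoryBarriers` are hypothetical names, and the real pass operates on the recompiler's own IR types.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical, simplified stand-in for the recompiler's structured AST.
enum class NodeType { Block, If, EndIf, Barrier };

struct Node {
    NodeType type;
    bool divergent = false; // only meaningful for If nodes (assumed precomputed)
};

// Inserts a barrier after each top-level (zero-depth) divergent conditional.
// Returns the number of barriers inserted.
std::size_t InsertSharedMemoryBarriers(std::list<Node>& ast, bool uses_shared_memory) {
    if (!uses_shared_memory) {
        return 0; // shaders that never touch shared memory are left alone
    }
    std::size_t inserted = 0;
    int depth = 0;                   // number of conditionals currently open
    bool outer_is_divergent = false; // divergence of the enclosing zero-depth If
    for (auto it = ast.begin(); it != ast.end(); ++it) {
        if (it->type == NodeType::If) {
            if (depth == 0) {
                outer_is_divergent = it->divergent;
            }
            ++depth;
        } else if (it->type == NodeType::EndIf) {
            assert(depth > 0 && "unbalanced EndIf");
            --depth;
            // Only emit once the outermost conditional has closed, so the
            // barrier is reached by every thread in the workgroup.
            if (depth == 0 && outer_is_divergent) {
                it = ast.insert(std::next(it), Node{NodeType::Barrier});
                ++inserted;
            }
        }
    }
    return inserted;
}
```

In this sketch a nested divergent conditional never receives its own barrier; only the point after the outermost conditional does, which is where all threads have reconverged.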

The zero-depth clause prevents barriers from being inserted inside non-uniform blocks, which can cause issues because not all threads execute such a block. A block is deemed non-uniform if its condition contains gl_LocalInvocationId, which is simple and effective for now but far from exhaustive.
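The non-uniformity heuristic can be illustrated with a small search over a condition's expression tree. The sketch below is an assumption-laden toy: `Expr`, `Op`, `Make`, and `IsNonUniform` are hypothetical names invented for illustration, and the breadth-first traversal is just one plausible way to check whether a condition "contains" a local invocation ID read.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical expression opcodes; the real IR has its own instruction set.
enum class Op { Constant, UniformLoad, LocalInvocationId, Add, Compare };

struct Expr {
    Op op;
    std::vector<std::shared_ptr<Expr>> args;
};

using ExprPtr = std::shared_ptr<Expr>;

// Small helper to build expression nodes.
ExprPtr Make(Op op, std::vector<ExprPtr> args = {}) {
    return std::make_shared<Expr>(Expr{op, std::move(args)});
}

// Returns true if any node reachable from the condition reads the local
// invocation ID. Anything else is treated as uniform, which is the
// "not exhaustive" part of the heuristic.
bool IsNonUniform(const ExprPtr& cond) {
    assert(cond && "condition must not be null");
    std::queue<const Expr*> pending;
    pending.push(cond.get());
    while (!pending.empty()) {
        const Expr* expr = pending.front();
        pending.pop();
        if (expr->op == Op::LocalInvocationId) {
            return true;
        }
        for (const ExprPtr& arg : expr->args) {
            if (arg) {
                pending.push(arg.get());
            }
        }
    }
    return false;
}
```

A condition such as `(local_id + c) < c` is flagged, while a condition that depends on per-thread data loaded from memory would be classified as uniform by this sketch, which matches the caveat that the check is not exhaustive.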


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
