[PR #2767] [CLOSED] shader_recompiler: (WIP) Implement more accurate ReadConst support (including dynamic indexing) #2966

Closed
opened 2026-02-27 22:01:56 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/shadps4-emu/shadPS4/pull/2767
Author: @LNDF
Created: 4/10/2025
Status: Closed

Base: main ← Head: read-const


📊 Changes

72 files changed (+9312 additions, -158 deletions)


📝 CMakeLists.txt (+56 -1)
src/common/cartesian_invoke.h (+43 -0)
📝 src/common/func_traits.h (+1 -0)
src/shader_recompiler/backend/asm_x64/emit_x64.cpp (+268 -0)
src/shader_recompiler/backend/asm_x64/emit_x64.h (+15 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_atomic.cpp (+138 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_barrier.cpp (+20 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_bitwise_conversion.cpp (+204 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_composite.cpp (+350 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_context_get_set.cpp (+221 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_convert.cpp (+279 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_floating_point.cpp (+766 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_image.cpp (+62 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_instructions.h (+482 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_integer.cpp (+624 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_logical.cpp (+42 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_select.cpp (+71 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_shared_memory.cpp (+39 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_special.cpp (+55 -0)
src/shader_recompiler/backend/asm_x64/emit_x64_undefined.cpp (+28 -0)

...and 52 more files

📄 Description

The goal of this pull request is to support ReadConst (LOAD_DWORD) in shaders more accurately.
The current implementation doesn't support dynamic offsets and doesn't correctly handle cases where the base address is modified between the source ReadConst/UserData and the ReadConst that uses it. Example:

%20 = ReadConst %19 #0
%21 = IAdd32  %20 #8
%22 = CompositeConstructU32x2 %21 #0
%23 = ReadConst %22 #0      <--- Here we don't take the addition in %21 into account
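
To make the failure mode concrete, here is a minimal sketch that replays the four instructions above against a toy memory. `ToyMemory` and `ChainedRead` are hypothetical names made up for this illustration, not part of the recompiler, and the sketch assumes that CompositeConstructU32x2 places its first operand in the low dword of the address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy guest memory (hypothetical): a flat array of dwords addressed in bytes.
struct ToyMemory {
    std::vector<std::uint32_t> dwords;

    std::uint32_t ReadDword(std::uint64_t byte_addr) const {
        assert(byte_addr % 4 == 0); // dword reads are assumed to be aligned
        return dwords.at(byte_addr / 4);
    }
};

// Replays the IR example. `track_add` selects whether the IAdd32 in %21 is
// honoured when the second address is formed.
std::uint32_t ChainedRead(const ToyMemory& mem, std::uint64_t table_addr, bool track_add) {
    const std::uint32_t v20 = mem.ReadDword(table_addr);  // %20 = ReadConst %19 #0
    const std::uint32_t v21 = track_add ? v20 + 8 : v20;  // %21 = IAdd32 %20 #8
    const std::uint64_t v22 = std::uint64_t{v21};         // %22: low = %21, high = 0
    return mem.ReadDword(v22);                            // %23 = ReadConst %22 #0
}
```

With `track_add` set to false the second read lands 8 bytes before the intended location, which corresponds to the mismatch described above.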

This PR aims to solve this issue as follows:

  1. First, all the ReadConsts in the shader being compiled are collected in a list.
  2. A sub IR program is created from the shader IR program, containing every instruction needed to compute the ReadConsts. This includes the arguments of each ReadConst, recursively their own arguments, and the necessary conditionals, loops, etc.
  3. The subprogram is analyzed to determine how many times each ReadConst can be executed (if it is inside a loop, it may be executed more than once and read different values). That way, offsets can be decided for the ReadConst data in the flatbuf (extended user data).
  4. The subprogram is modified to insert instructions that save the result of each ReadConst into the flatbuf. The ReadConsts in the original shader IR are also modified to point to the correct location in the flatbuf.
  5. The subprogram is compiled to x64 assembly and run every time the flatbuf is refreshed (before drawing).
    That way, the flatbuf should contain all the data the shader needs.
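
Steps 1 and 2 amount to a backward dependency slice. The following is only a rough sketch over a toy IR, not the code in this PR: `ToyInst`, `CollectReadConsts` and `SliceForReadConsts` are hypothetical names, and the sketch ignores control flow (conditionals and loops), which the real subprogram also has to carry over.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR: each instruction has an opcode name and the indices of
// the instructions that produce its arguments (immediates are left out).
struct ToyInst {
    std::string opcode;
    std::vector<std::size_t> args;
};

// Step 1: collect the index of every ReadConst in the program.
std::vector<std::size_t> CollectReadConsts(const std::vector<ToyInst>& program) {
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < program.size(); ++i) {
        if (program[i].opcode == "ReadConst") {
            result.push_back(i);
        }
    }
    return result;
}

// Step 2: mark every instruction a ReadConst depends on, transitively.
// The marked instructions, kept in program order, form the subprogram.
std::vector<bool> SliceForReadConsts(const std::vector<ToyInst>& program) {
    std::vector<bool> needed(program.size(), false);
    std::vector<std::size_t> worklist = CollectReadConsts(program);
    while (!worklist.empty()) {
        const std::size_t index = worklist.back();
        worklist.pop_back();
        assert(index < program.size()); // argument indices must be valid
        if (needed[index]) {
            continue; // already visited through another user
        }
        needed[index] = true;
        for (const std::size_t arg : program[index].args) {
            worklist.push_back(arg);
        }
    }
    return needed;
}
```

Instructions that no ReadConst depends on stay unmarked, so they never reach the x64 backend.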

Current issues and status

This is a draft because it currently regresses or breaks games.
I was able to test it in two games.

  • Gran Turismo Sport: After completing the initial brightness configuration screen, I'm able to continue further and see a washed-out version of the initial credits; pressing circle then hits an "Attempted to track non-GPU memory address" error. On current main, I cannot make out the initial configuration screen (new save) and I see a black screen during the initial credits, and I hit the same memory error if I press circle.
  • Bloodborne: Currently, there is an issue with a specific shader that I have not been able to resolve.

Bloodborne issue

A crash occurs when executing the x64 code generated for the 0x8bc6ea32 shader. The following is the IR of the subprogram:

Block $0
         Prologue
%8     = GetUserData SGPR6 (uses: 4)
%9     = GetUserData SGPR7 (uses: 4)
%10    = CompositeConstructU32x2 %8, %9 (uses: 4)
%11    = ReadConst (flags=0)  %10, #0 (uses: 2)
         SetUserData #16, %11
%12    = ReadConst (flags=0)  %10, #1 (uses: 2)
         SetUserData #17, %12
%13    = ReadConst (flags=0)  %10, #2 (uses: 2)
         SetUserData #18, %13
%14    = ReadConst (flags=0)  %10, #3 (uses: 2)
         SetUserData #19, %14
%15    = CompositeConstructU32x2 %8, %9 (uses: 4)
%16    = ReadConst (flags=0)  %15, #16 (uses: 2)
         SetUserData #20, %16
%17    = ReadConst (flags=0)  %15, #17 (uses: 2)
         SetUserData #21, %17
%18    = ReadConst (flags=0)  %15, #18 (uses: 2)
         SetUserData #22, %18
%19    = ReadConst (flags=0)  %15, #19 (uses: 2)
         SetUserData #23, %19
%20    = CompositeConstructU32x4 %11, %12, %13, %14 (uses: 1)
%21    = ReadConstBuffer %20, #32 (uses: 1)

Block $1
%22    = CompositeConstructU32x2 %8, %9 (uses: 8)
%23    = ReadConst (flags=0)  %22, #4 (uses: 1)
         SetUserData #24, %23
%24    = ReadConst (flags=0)  %22, #5 (uses: 1)
         SetUserData #25, %24
%25    = ReadConst (flags=0)  %22, #6 (uses: 1)
         SetUserData #26, %25
%26    = ReadConst (flags=0)  %22, #7 (uses: 1)
         SetUserData #27, %26
%27    = ReadConst (flags=0)  %22, #8 (uses: 1)
         SetUserData #28, %27
%28    = ReadConst (flags=0)  %22, #9 (uses: 1)
         SetUserData #29, %28
%29    = ReadConst (flags=0)  %22, #10 (uses: 1)
         SetUserData #30, %29
%30    = ReadConst (flags=0)  %22, #11 (uses: 1)
         SetUserData #31, %30

Block $2
%31    = IEqual32 %21, #0 (uses: 1)
%32    = LogicalNot %31 (uses: 1)
%33    = ConditionRef %32 (uses: 0) (goes to block $5 if false)

Block $3
%34    = CompositeConstructU32x4 %16, %17, %18, %19 (uses: 2)
%35    = ReadConstBuffer %34, #18 (uses: 1)
%36    = ReadConstBuffer %34, #19 (uses: 1)

Block $4
%37    = Phi [ %35, {Block $3} ], [ %8, {Block $2} ] (uses: 1)
%38    = Phi [ %36, {Block $3} ], [ %9, {Block $2} ] (uses: 1)

Block $5
%39    = CompositeConstructU32x2 %37, %38 (uses: 4)
%40    = ReadConst (flags=0)  %39, #12 (uses: 1)
         SetUserData #32, %40
%41    = ReadConst (flags=0)  %39, #13 (uses: 1)
         SetUserData #33, %41
%42    = ReadConst (flags=0)  %39, #14 (uses: 1)
         SetUserData #34, %42
%43    = ReadConst (flags=0)  %39, #15 (uses: 1)
         SetUserData #35, %43

Block $6
         Epilogue
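
The SetUserData indices in this dump run from #16 to #35, one per ReadConst in program order, which is consistent with flatbuf slots being handed out sequentially. A minimal sketch of such an assignment follows. `ReadConstInfo` and `AssignSlots` are hypothetical names, and the starting slot of 16 is read off the dump rather than taken from the PR's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-ReadConst record: the maximum number of times it may
// execute (1 outside loops, a larger bound inside a loop).
struct ReadConstInfo {
    std::uint32_t max_executions;
};

// Gives each ReadConst a contiguous run of dword slots in the flatbuf,
// starting at `first_slot`, and returns the first slot of each run.
std::vector<std::uint32_t> AssignSlots(const std::vector<ReadConstInfo>& reads,
                                       std::uint32_t first_slot) {
    std::vector<std::uint32_t> slots;
    slots.reserve(reads.size());
    std::uint32_t next = first_slot;
    for (const ReadConstInfo& read : reads) {
        assert(read.max_executions > 0); // every ReadConst needs at least one slot
        slots.push_back(next);
        next += read.max_executions;
    }
    return slots;
}
```

For the 20 ReadConsts above, each executed at most once, this yields slots 16 through 35.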

If condition %33 is true, the shader gets an address (%35 and %36) that is later used to read from (%40-%43).
The address obtained from %35 and %36 is not a valid address (0xC28C270A00000030). The buffer resource that points to the buffer being read seems correct (base address, stride, etc.). I think the ReadConstBuffer implementation in the x64 shader backend is correct (apart from clamping), and I have not been able to find the cause of this issue.
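
For reference, the reported value splits into a low dword of 0x00000030 and a high dword of 0xC28C270A, and it is not a canonical address under 48-bit x86-64 virtual addressing, so dereferencing it from host code faults. The sketch below shows the check. `MakeAddress` and `IsCanonical48` are hypothetical helpers, and the low/high ordering of CompositeConstructU32x2 is an assumption.

```cpp
#include <cstdint>

// Combines two dwords the way CompositeConstructU32x2 is assumed to:
// first element in the low half, second element in the high half.
constexpr std::uint64_t MakeAddress(std::uint32_t lo, std::uint32_t hi) {
    return (std::uint64_t{hi} << 32) | lo;
}

// True if bits 63..47 are all equal, i.e. the address is canonical under
// 48-bit virtual addressing on x86-64.
constexpr bool IsCanonical48(std::uint64_t addr) {
    return (addr >> 47) == 0 || (addr >> 47) == 0x1FFFF;
}
```

A check like this in the generated code, or in a debug build of the subprogram runner, could turn the crash into a logged error while the root cause is investigated.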
It would be great if someone with more knowledge of how buffers work could take a look at this.
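
As a starting point for the missing clamping, a bounds-checked dword read could look roughly like the sketch below. This is an illustration only: `ToyBuffer` is a hypothetical, simplified descriptor and not the real buffer resource layout, and returning zero for out-of-range reads is an assumed convention.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified buffer descriptor. This is NOT the real V# bit
// layout; it only carries what the sketch needs.
struct ToyBuffer {
    const std::uint8_t* base;  // host pointer to the buffer contents
    std::uint32_t size_bytes;  // total readable size in bytes
};

// Reads the dword at `dword_index`. Out-of-range reads return 0 instead of
// touching memory past the end of the buffer.
std::uint32_t ReadConstBufferClamped(const ToyBuffer& buffer, std::uint32_t dword_index) {
    assert(buffer.base != nullptr);
    const std::uint64_t byte_offset = std::uint64_t{dword_index} * 4;
    if (byte_offset + 4 > buffer.size_bytes) {
        return 0;
    }
    std::uint32_t value;
    std::memcpy(&value, buffer.base + byte_offset, sizeof(value));
    return value;
}
```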

Additional notes

Now, if shader dumping is enabled, the ASL will also be dumped. Additionally, the IR, ASL and assembly code will be dumped for the subprogram.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

📝 Commits (10+)

  • d88294a (https://github.com/shadps4-emu/shadPS4/commit/d88294a34ce6da3bb2a6a30c2664c96fd55b4330) Dump IR program
  • 38957ba (https://github.com/shadps4-emu/shadPS4/commit/38957ba4702f50faef59a67cc2d1107f01d9abb6) Handle non inmediate offset on S_LOAD_DWORD
  • 6465310 (https://github.com/shadps4-emu/shadPS4/commit/6465310942e866e298b717612e4cc59ba5ebaad3) ASL dumping
  • 60be1e4 (https://github.com/shadps4-emu/shadPS4/commit/60be1e43769e2b1bdf7e17ca8097fbf1f62b6958) Fix unreacheable ASL dump
  • 4bf9bbf (https://github.com/shadps4-emu/shadPS4/commit/4bf9bbf86ffe8888c3105acbb78a486834ce5d07) Add conditional tree
  • 6fff3a0 (https://github.com/shadps4-emu/shadPS4/commit/6fff3a03db5d2349383f9dea2b72d8d2109a9d09) Usefulness of conditional tree
  • e15bf43 (https://github.com/shadps4-emu/shadPS4/commit/e15bf43e26b786dba1cee21b9c0b836723d98cb2) Subprogram creation
  • 03d4471 (https://github.com/shadps4-emu/shadPS4/commit/03d4471c3b0dbe5c097e2375a5749ab34deeecd1) clang-format
  • 5843352 (https://github.com/shadps4-emu/shadPS4/commit/5843352c2819295a67f88ff42d9b57b26bff8cee) Fix subprogram generation
  • 7a93230 (https://github.com/shadps4-emu/shadPS4/commit/7a93230f2ebe8e9775c62da60f44db7ba6b5d2be) ImmValue