Will resolve a future issue in the multisampling PR, where the
GPU_USE_FRAMEBUFFER_FETCH flag changes at runtime if you switch between no AA and MSAA.
Just figured I'd get it in separately for safety.
This reduces the number of vertex shaders and thus pipelines by quite a
bit more in a few games, like Tekken and GoW, continuing the fight
against shader compile stutter.
The perf impact should be minimal if not positive due to less pipeline
changes.
GLES fixes
Make the vertex input declarations match (always declare fog input). Fixes D3D11 validation
Tess fix
In #16104, we drastically reduced the number of shader variants for
games that use flexible lighting setups. I looked at a few games and it
seems that a lot of games have the same shaders with fog on/off, while
fog is super cheap to compute. So let's just always do it, reducing
vertex shader variants further (though the amount of pipelines will probably
remain the same, since we still specialize the fragment shader).
Might also be worth adding a dynamic bool for the fragment shader, but
if so, doing it separately.
This drastically reduces the shader compile stutter that happens when a lot of new
light setups are created, like on the first punch in Tekken 6.
There's more stuff that might benefit from being made dynamic like this.
These branches are very cheap on modern GPUs since they're branching on
a uniform variable, so no divergence.
Only tested on Vulkan. I think we'll need to keep the old path too for
gpus like Mali-450...
This is used by several games to render to the alpha channel of RGBA4444
images, which cannot normally be done directly on the PSP.
Can be used as a far more efficient replacement for
ReinterpretFramebuffers/ShaderColorBitmask
It took writing and debugging #15500 for me to understand what the issue with the old path was..
Much simpler alternative to #15500, or we could merge both but disable Split/Second
for this one. Needs some benchmarks I guess...
In many cases, games use lighting just for diffuse or something, this
helps skip what's not needed too. Good improvement in a scene from a
Naruto game.
Tested for all values of A * B + 0 * (255 - B), as well as A * 127 + B *
(255 - 127), and matches accurately. Spot checked other values, but not
exhaustively.