Commit graph

77 commits

Author SHA1 Message Date
Unknown W. Brackets
d6fa301ab1 softgpu: Track CLUTs as states for binning.
This way we can have multiple CLUTs in process at once, which helps.
2022-01-16 08:14:09 -08:00
Unknown W. Brackets
edb79d968f softgpu: Cache CLUT params in sampler state.
And now there's no more gstate for pixel drawing or sampling.  Just a
little left in rasterization.
2022-01-15 18:09:09 -08:00
Unknown W. Brackets
c0e85e6170 softgpu: Move texenv color into sampler state. 2022-01-15 17:52:40 -08:00
Unknown W. Brackets
ad3635c82a softgpu: Move tex size to cached state. 2022-01-15 17:22:43 -08:00
Unknown W. Brackets
b915a82c41 softgpu: Correct decal doubling without alpha. 2022-01-09 12:23:55 -08:00
Unknown W. Brackets
72aa4be879 samplerjit: Skip processing alpha if unused. 2022-01-09 12:23:55 -08:00
Unknown W. Brackets
fe0b3dbd01 samplerjit: Fix alpha for 565 in linear lookup. 2022-01-09 11:08:46 -08:00
Henrik Rydgård
f82f24a9bb
Merge pull request #15280 from unknownbrackets/samplerjit-dxt
Correct some recent regressions in samplerjit
2022-01-05 09:42:30 +01:00
Unknown W. Brackets
741a9b0a4d samplerjit: Fix DXT compilation. 2022-01-05 00:00:03 -08:00
Unknown W. Brackets
19998976c7 samplerjit: Correct linear compile failure.
It was resetting to nullptr, because `nearest` was nullptr.
2022-01-04 23:58:07 -08:00
Unknown W. Brackets
2aa57679fa softjit: Keep mip S/T calc in SIMD.
This is only a tiny bit faster, though.
2022-01-03 06:45:10 -08:00
Unknown W. Brackets
26e7768a67 samplerjit: Remove old linear nearest paths.
We only use it for DXT now, so let's not keep the dead code around.
2022-01-02 17:28:52 -08:00
Unknown W. Brackets
5e3bef7e14 samplerjit: Avoid gather if overread could crash.
This should be rare, but a game could easily shove a CLUT4 texture at the
end of VRAM, and then accessing the last index would segfault.
2022-01-02 17:28:52 -08:00
Unknown W. Brackets
7806dfddea samplerjit: Use VPGATHERDD for all types. 2022-01-02 17:19:18 -08:00
Unknown W. Brackets
ce6ea8da11 samplerjit: Apply gather lookup to all CLUT4. 2022-01-02 17:19:18 -08:00
Unknown W. Brackets
22f770c828 samplerjit: Use VPGATHERDD for simple CLUT4 loads.
Planning to expand this to more paths.
2022-01-02 17:19:17 -08:00
Unknown W. Brackets
65c84d5dd5 samplerjit: Avoid a couple more copies in AVX.
From looking at assembly, just trying to keep it small.
2022-01-02 17:01:14 -08:00
Henrik Rydgård
c7062d7063
Merge pull request #15271 from unknownbrackets/samplerjit-color16
samplerjit: Decode colors in parallel
2022-01-02 17:55:46 +01:00
Henrik Rydgård
6fb5d82fe0
Merge pull request #15264 from unknownbrackets/samplerjit-vec
A couple more smaller samplerjit optimizations
2022-01-02 17:32:54 +01:00
Unknown W. Brackets
0eec4e7e4d samplerjit: Decode colors in parallel.
Not used in a ton of games, but a decent improvement where it is used.
2022-01-02 08:27:55 -08:00
Unknown W. Brackets
7060035303 samplerjit: Implement nearest in jit.
This uses the tex func and similar within jit.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
91c9343e87 samplerjit: Refactor and reuse constant pool.
It's just here to be rip accessible, the fixed values can be output just
once.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
40240be91c samplerjit: Update nearest args, temp disable jit.
This temporarily disables jit for nearest, but refactors to use the new
arg structure.  It now matches linear.
2022-01-01 16:58:05 -08:00
Unknown W. Brackets
06e954fe2a samplerjit: Create a separate fetch func.
This allows nearest to become more similar to linear, where it applies the
texture function.
2022-01-01 16:58:04 -08:00
Unknown W. Brackets
6aec68aa5c samplerjit: Correct wrong bufw at mip levels.
Oops, was always using the base bufw.
2022-01-01 16:40:02 -08:00
Unknown W. Brackets
dbb015f427 samplerjit: Oops, fix Linux mipmap handling. 2022-01-01 16:40:02 -08:00
Unknown W. Brackets
8ea67b571b samplerjit: Tiny dependency optimizations.
This had a small but measurable impact (~0.3%).
2021-12-31 08:11:57 -08:00
Unknown W. Brackets
fc3688d273 samplerjit: Small AVX optimization to modulate.
Only gives about 0.5% but it's still something.
2021-12-31 08:10:04 -08:00
Henrik Rydgård
244b0a86f6
Merge pull request #15262 from unknownbrackets/samplerjit-vec
samplerjit: Use SSSE3/SSE4 in linear filtering
2021-12-31 09:29:59 +01:00
Unknown W. Brackets
33e9841a4a softgpu: Skip zero size triangles.
These were drawing before, incorrectly, which caused artifacts.
Noticeable in Blade Dancer.
2021-12-31 00:20:12 -08:00
Unknown W. Brackets
1addf84e90 samplerjit: Use SSSE3/SSE4 in linear filtering. 2021-12-30 23:22:56 -08:00
Unknown W. Brackets
147b81d6f7 x64jit: Add AVX/AVX2 encodings.
Also fix the FMA double ones, which were passing W wrongly.
2021-12-29 19:46:26 -08:00
Unknown W. Brackets
28cfbe0e5a samplerjit: Add an alternate profiling method.
This is more useful to group common operations together for profiling.
2021-12-29 07:11:39 -08:00
Unknown W. Brackets
3aedea89eb samplerjit: Correct level lookup offset. 2021-12-29 07:09:36 -08:00
Unknown W. Brackets
bf06342f9d samplerjit: Minor SSE4 optimizations.
These seem to be a bit faster.
2021-12-29 07:07:35 -08:00
Unknown W. Brackets
631706a8ba samplerjit: Set stackArgPos_ early.
Unfortunately, this has to match the value set lower...
2021-12-28 20:21:21 -08:00
Unknown W. Brackets
74eb450e76 samplerjit: Move texture function into jit.
Could also do this for nearest; might end up with a third set of functions
there for a direct sample lookup (for debug funcs).
2021-12-28 17:52:17 -08:00
Unknown W. Brackets
940e6bb1d7 samplerjit: Lookup both mip tex values. 2021-12-28 16:22:54 -08:00
Unknown W. Brackets
6b55d328e5 samplerjit: Use regcache for linear filtering.
This makes it easier to reuse for mipmap filtering.
2021-12-28 15:37:25 -08:00
Unknown W. Brackets
cdf14c8579 samplerjit: Calculate mip level U/V/offsets.
Not actually doing the sampling for the second mip level in the single jit
pass yet, but close.
2021-12-28 14:12:58 -08:00
Unknown W. Brackets
a4558a5736 samplerjit: Take texptr/bufw as arrays.
Prep for moving mip map sampling into linear.
2021-12-28 12:04:16 -08:00
Unknown W. Brackets
4864850b3b samplerjit: Handle mipmap width/height in S/T calc. 2021-12-28 11:29:29 -08:00
Unknown W. Brackets
a84accf713 samplerjit: Move S/T calculation into jit.
Gives a pretty decent 5-10% improvement in many places.
2021-12-28 09:58:23 -08:00
Unknown W. Brackets
476dfdf731 samplerjit: Add more bits for S/T, skip multiply.
For now, we're not using those other bits yet.
2021-12-27 18:24:37 -08:00
Unknown W. Brackets
75f105f84b softgpu: Make linear filtering more accurate.
This matches tests for various u/v offsets and x/y subpixel offsets.
Mipmaps are probably still wrong.
2021-12-27 11:37:32 -08:00
Unknown W. Brackets
3cd19b02ac samplerjit: Handle unswizzled offsets too. 2021-12-27 11:37:32 -08:00
Unknown W. Brackets
820361f34b samplerjit: Calculate texel byte offset as vector. 2021-12-27 11:37:32 -08:00
Unknown W. Brackets
4d6a2f3919 samplerjit: Blend linear using integers. 2021-12-27 11:37:32 -08:00
Unknown W. Brackets
6f4e735757 samplerjit: Accumulate results in an XMM. 2021-12-27 11:37:32 -08:00
Unknown W. Brackets
b00a66e34c samplerjit: Pass u/v coords as vector. 2021-12-27 11:37:32 -08:00