262 commits

Author SHA1 Message Date
Unknown W. Brackets
e7639f666c Ensure stuff run on every prim is inlined.
These both get a ton of calls and show up on profiles.
2014-03-30 00:42:26 -07:00
Unknown W. Brackets
466a5418d2 Cut time in EstimatePerVertexCost().
From profiling, this cuts 4% time while lighting is enabled, and 14% while
disabled (on ARM).  But maybe should just inline this...
2014-03-30 00:42:26 -07:00
Unknown W. Brackets
a4327702f1 Reduce some includes under GPU/. 2014-03-29 16:51:38 -07:00
Unknown W. Brackets
69c2500dc6 Make sure vai->flags are set while hashing.
Since we decode the verts in this case we have a fresh flag.  Might
be #5718?
2014-03-25 08:19:38 -07:00
Henrik Rydgård
cf3d75975d Merge pull request #5720 from unknownbrackets/texcache
Optimize xxHash for ARM/NEON devices
2014-03-25 09:32:15 +01:00
Unknown W. Brackets
b800762ceb Add a NEON-optimized version of XXH32.
This takes at least 40% less time to hash on NEON/ARM devices.
2014-03-25 00:34:55 -07:00
Henrik Rydgard
dc07d3410a More checks for alpha test elimination 2014-03-24 17:33:20 +01:00
Henrik Rydgard
2c76e6d023 Correctly keep track of "full alpha" in vertices (x86 jit only). 2014-03-24 11:19:44 +01:00
Henrik Rydgard
f2f0355d94 Split up ApplyShader into ApplyVertexShader and ApplyFragmentShader.
Will allow the alphatest avoiding optimization later.
2014-03-24 10:55:07 +01:00
Henrik Rydgard
b174996c1c Add a conservative check that prevents alpha testing in a few cases.
This will become really powerful if we add some code to the vertex decoder
to check for non-full alpha in the vertices, and set gstate_c.vertexFullAlpha if none
is found (probably want to do the reverse, set it to true and clear if any non-255 alpha is found).

Alpha testing is a performance killer on many mobile GPUs so big efforts to
avoid it can be worth it.
2014-03-23 16:32:38 +01:00
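The approach suggested in the b174996c1c message (start with the flag set to true and clear it as soon as a non-255 alpha is seen) might look roughly like the sketch below. Only the name vertexFullAlpha comes from the commit message. The struct, both function names, the RGBA8888 layout with alpha in the top byte, and the idea of combining the flag with a texture check are assumptions made for illustration, not the project's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the cached GPU state. Only the flag name
// vertexFullAlpha is taken from the commit message.
struct CachedStateSketch {
    bool vertexFullAlpha = true;
};

// Scans vertex colors and records whether every alpha value is 255.
// Assumption: colors are packed 32-bit values with alpha in the top byte.
void TrackFullAlpha(const uint32_t *colors, size_t count, CachedStateSketch &state) {
    assert(colors != nullptr || count == 0);
    state.vertexFullAlpha = true;          // optimistic default
    for (size_t i = 0; i < count; ++i) {
        if ((colors[i] >> 24) != 0xFF) {   // any non-255 alpha clears the flag
            state.vertexFullAlpha = false;
            break;
        }
    }
}

// Hypothetical conservative decision: skip the alpha test only when both the
// vertices and the texture are known to be fully opaque, and the configured
// test is one that fully opaque fragments would pass anyway.
bool CanSkipAlphaTest(const CachedStateSketch &state, bool textureFullAlpha,
                      bool testPassesForOpaque) {
    return state.vertexFullAlpha && textureFullAlpha && testPassesForOpaque;
}
```

The check is deliberately conservative: whenever any input is uncertain, the alpha test stays enabled, so the worst case is a missed optimization rather than a rendering error.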
Henrik Rydgard
20d480a374 Minor GPU code cleanups 2014-03-23 16:32:38 +01:00
Henrik Rydgard
8b92dcea47 Transform: Compute the "DCID" (draw call ID) incrementally instead of an extra pass. 2014-03-23 01:51:51 +01:00
Henrik Rydgard
b4d99b1981 Revert "Avoid caching when HW T&L with morph enabled."
This reverts commit 557eae7ca9.
2014-03-15 10:46:04 +01:00
raven02
557eae7ca9 Avoid caching when HW T&L with morph enabled. 2014-03-14 21:04:32 +08:00
Henrik Rydgard
9e6d7abf4e Minor cleanup 2014-02-12 11:10:44 +01:00
Unknown W. Brackets
7380c5b664 Stop showing z = 1.0 for non through in debugger.
Oops.
2014-02-09 00:33:15 -08:00
Unknown W. Brackets
e7eca477b0 Add a tab to show vertex values to the GE debugger.
Should be pretty useful, especially for depth issues.
2014-02-08 22:03:29 -08:00
Unknown W. Brackets
d2108a962e Switch from USING_GLES2 to MOBILE_DEVICE.
Still using USING_GLES2 for, well, GLES2.  But for things that are really
about mobile, we need a new define.  Devices are coming that don't use
GLES2.
2014-02-08 16:37:58 -08:00
Unknown W. Brackets
79864a5ee0 Fix some initialization order warnings. 2014-01-10 22:21:24 -08:00
Unknown W. Brackets
473fb866e6 softgpu: Implement vertex preview.
And move ConvertMatrix4x3To4x4() into a common place since there were
differing implementations, which was only confusing.
2013-12-29 13:45:10 -08:00
Henrik Rydgård
9e42086e21 Logspam reduction 2013-12-09 13:45:17 +01:00
Sacha
756f651eb2 Buildfix for older compilers, such as GCC 4.6.3 2013-11-22 16:22:41 +10:00
Unknown W. Brackets
70b6a4a796 Small optimization to vertex preview. 2013-11-20 22:36:48 -08:00
Henrik Rydgard
e3d471f590 Fix issues with texturing from render targets with prescaled UV (texcoord speed hack) 2013-11-19 23:38:36 +01:00
Unknown W. Brackets
ed16e42ca8 Cleanup, don't even need z here. 2013-11-17 14:49:51 -08:00
Unknown W. Brackets
018d95180a Fix 3d vertex preview. 2013-11-17 14:31:21 -08:00
Unknown W. Brackets
a3bd2f1365 Fix Vec3ByMatrix44() and use it for matrix math. 2013-11-17 14:10:57 -08:00
Unknown W. Brackets
b541c81ba3 Clean up Mat3x3 etc. constness. 2013-11-17 13:27:51 -08:00
Unknown W. Brackets
fcc77f525f Implement some basic vertex previews on prim.
3D doesn't work correctly (sometimes it does...)  2D should be working.
2013-11-17 13:27:50 -08:00
Henrik Rydgard
b0ccf5981c Don't bother with glDrawRangeElements, seems to not improve perf. 2013-11-14 17:33:43 +01:00
Henrik Rydgard
3b63ef7005 Remove the SubmitPrim param forceIndexType, optimize BBOX more. 2013-11-14 14:03:03 +01:00
Henrik Rydgard
35ae239eb9 Optimize bbox some more 2013-11-14 12:25:53 +01:00
Henrik Rydgard
4f93654a88 Oops, accidentally enabled some bbox debugging code 2013-11-14 11:49:06 +01:00
Henrik Rydgard
8a69543ec4 BBOX: Transform the planes by the matrix so we don't need to transform the box 2013-11-14 11:44:13 +01:00
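The idea named in 8a69543ec4 rests on the identity dot(p, M*v) = dot(transpose(M)*p, v): a plane p given in the transformed space can be carried back once through the transpose of the matrix, after which the untransformed box vertices can be tested directly. The following is a minimal sketch of that identity only. The type aliases, function names, and the row-major, column-vector matrix convention are assumptions and do not reflect the project's actual code.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;
// Assumed convention: column vectors, row-major storage, m[row * 4 + col].
using Mat4 = std::array<float, 16>;

// Returns transpose(m) * p, i.e. the plane p expressed in the space of the
// untransformed vertices.
Vec4 TransformPlaneToObjectSpace(const Mat4 &m, const Vec4 &p) {
    Vec4 out{};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            out[col] += m[row * 4 + col] * p[row];
        }
    }
    return out;
}

// True when every point lies on the negative side of the plane, which means
// the whole box can be rejected against that plane.
bool AllPointsOutside(const Vec4 &plane, const float points[][3], size_t count) {
    for (size_t i = 0; i < count; ++i) {
        float d = plane[0] * points[i][0] + plane[1] * points[i][1] +
                  plane[2] * points[i][2] + plane[3];
        if (d >= 0.0f) {
            return false;
        }
    }
    return true;
}
```

Transforming a handful of planes once per test replaces one matrix multiply per box vertex, which is where the saving would come from.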
Henrik Rydgard
179934ec9f Decode step by step when sw skinning 2013-11-13 18:10:57 +01:00
Henrik Rydgard
46313ced55 Prepare transform pipeline for step by step decoding 2013-11-13 18:10:57 +01:00
Henrik Rydgard
7e67476b00 Simple unoptimized software skinning.
Does not take advantage of the possible reduction in state changes yet.
2013-11-13 18:10:57 +01:00
Henrik Rydgård
ab3fe9ba86 Extract the software transform code into its own file. 2013-11-13 14:56:34 +01:00
Henrik Rydgard
cf15ec8a53 Add BBOX support (very conservative test) 2013-11-12 17:06:03 +01:00
Henrik Rydgard
f4ad7c64e5 Fix issue with texcoord speed hack (bPrescaleUV) in software transform
(and also thus rectangles of course even when hw transform is enabled)
2013-11-10 11:18:26 +01:00
Unknown W. Brackets
a1fa65f631 Stupid typos, broke 4444 and 565. 2013-11-03 18:43:24 -08:00
Henrik Rydgard
f0fd7679ce Preliminary ARM vertex decoder JIT. Has a weird issue in PosS16.
Other minor changes and fixes.
2013-11-03 20:15:42 +01:00
Henrik Rydgard
810b1a061f Vertex decoder JIT for x86 and x64. Handles the most common vertex formats. 2013-11-03 15:27:12 +01:00
Henrik Rydgård
07a868910e Add a temporary hack option that may help debugging the wipeout glow.
It reduces the glow problem by a lot but is obviously incorrect.
2013-10-30 22:47:36 +01:00
Henrik Rydgard
78aa64500e Fix bug when combining draw calls for hashing 2013-10-29 16:18:15 +01:00
Henrik Rydgård
60b1cb0008 Combine draw calls when hashing when possible, should help the same games as #4312 2013-10-29 12:14:09 +01:00
Unknown W. Brackets
1e65a691f4 Cap the number of vertexes per flush.
Might not be realistic, but we crash if we go over.  Pretty unlikely to
happen in real games, but I suppose not impossible.  Happens in the vertex
speed demo (#3106.)
2013-10-27 14:43:58 -07:00
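One way to cap the number of vertices per flush, as 1e65a691f4 describes, is to flush the pending batch before it would exceed a fixed limit and to clamp any single oversized submission. The sketch below is illustrative only: the struct, its members, and the 65536 limit are hypothetical and are not taken from the project.

```cpp
#include <cassert>

// Hypothetical batching helper. The limit is an assumed value.
struct DrawBatchSketch {
    static constexpr int kMaxVerts = 65536;
    int numVerts = 0;   // vertices currently queued
    int flushes = 0;    // number of non-empty flushes so far

    // Stand-in for submitting the queued vertices to the GPU.
    void Flush() {
        if (numVerts > 0) {
            ++flushes;
        }
        numVerts = 0;
    }

    // Queues up to `count` vertices and returns how many were accepted.
    int Submit(int count) {
        assert(count >= 0);
        if (count > kMaxVerts) {
            count = kMaxVerts;              // clamp a single oversized draw
        }
        if (numVerts + count > kMaxVerts) {
            Flush();                        // make room instead of overflowing
        }
        numVerts += count;
        return count;
    }
};
```

Clamping drops the excess vertices of an oversized draw, which is lossy but avoids writing past the buffer. Splitting the draw across several flushes would be the alternative if correctness mattered more than simplicity.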
Sacha
ecfe43c149 CityHash is not used anymore, so we won't compile it. 2013-10-28 03:26:00 +10:00
Henrik Rydgard
b832508c4b Let's put the stencil parameters in the right order.. 2013-10-10 21:41:00 +02:00
Henrik Rydgard
5c8a74d911 Stencil rectangle clears: Take the value from the second vertex. 2013-10-10 21:36:32 +02:00