Compare commits


97 commits

Author SHA1 Message Date
Flyinghead
5ec4f8951d Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit 2018-10-02 15:56:01 +02:00
Flyinghead
2798770879 GLSL compile error with mesa driver: need explicit smooth qualifier 2018-10-02 09:31:00 -04:00
Flyinghead
d7311b57a3 Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit 2018-09-12 18:11:09 +02:00
Flyinghead
9300112583 Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit 2018-09-06 12:20:30 +02:00
Flyinghead
eef68bab0d Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit 2018-09-03 15:11:13 +02:00
Flyinghead
0a8eadb0b4 Fog color clamping must be done after shadowing and only when fog is on 2018-09-02 22:50:19 +02:00
Flyinghead
2a8f3d3427 Direct framebuffer write fix 2018-09-02 20:31:28 +02:00
Flyinghead
3607ba73ca Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit (untested) 2018-09-01 13:55:15 +02:00
Flyinghead
2ed88970ed Fix compile error with last merge 2018-08-26 18:09:57 +02:00
Flyinghead
175706eda2 Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit 2018-08-26 17:47:42 +02:00
Flyinghead
b395861ddb Merge branch 'fh/mymaster' into fh/oit 2018-08-13 21:11:42 +02:00
Flyinghead
a70f5958d7 Merge branch 'fh/mymaster' into fh/oit 2018-08-13 20:56:44 +02:00
Flyinghead
beff2646e2 Merge remote-tracking branch 'origin/fh/mymaster' into fh/oit 2018-08-02 17:44:49 +02:00
Flyinghead
2157fd7cda Add abuffer.cpp to windows build 2018-08-01 21:20:37 +02:00
Flyinghead
c2b053b2e3 Merge branch 'fh/mymaster' into fh/oit 2018-08-01 21:09:32 +02:00
Flyinghead
4bd23f2c8e tentative fix for constant overflow GLSL error on Intel HD 2018-07-24 10:09:22 +02:00
Flyinghead
185413b505 Merge upstream and master into fh/deferred-shading 2018-07-19 12:39:15 +02:00
Flyinghead
1c5bfe7869 Create OpenGL 4.3 core profile (was 3.1) 2018-07-16 14:26:56 +02:00
Flyinghead
521eb405d6 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-14 10:53:08 +02:00
Flyinghead
3c5cc05996 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-11 20:24:59 +02:00
Flyinghead
5f6597fb29 depth_scale is no longer used. More clean-up 2018-07-11 18:10:22 +02:00
Flyinghead
b7f0b8a944 Another potential Mesa driver fix. Got rid of deprecated OpenGL #defs 2018-07-11 17:53:37 +02:00
Flyinghead
03a74ccdb7 Tentative fix for Mesa 18.2 driver error "opaque variables cannot be
operands of the ?: operator"
2018-07-11 12:03:07 +02:00
Flyinghead
21eac7d6b0 Final fragment shader performance improvement
Use an index list of pixels instead of copying all data into a local
array. Uses less memory, which makes it faster. Also, build the index
and insert-sort at the same time. Could benefit from sorting polygons
back-to-front before rendering.
Use #defs for modifier volume bitwise ops.
2018-07-11 11:45:57 +02:00
Flyinghead
cb73894325 Fix dumpTexture 2018-07-11 11:37:34 +02:00
Flyinghead
9d766a39a5 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-11 11:06:45 +02:00
Flyinghead
14803e72e8 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-10 09:05:32 +02:00
Flyinghead
f29e3f80c8 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-09 15:13:19 +02:00
Flyinghead
cfe163b2fd gitignore 2018-07-09 15:08:16 +02:00
Flyinghead
2ad2c98429 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-06 20:29:22 +02:00
Flyinghead
ad1963262f Fix translucent shadows (Xtreme Sports) and remove unneeded code 2018-07-05 22:33:48 +02:00
Flyinghead
e132804fb5 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-05 20:13:13 +02:00
Flyinghead
4e5006fbfe Get rid of GLES #defines 2018-07-03 21:05:11 +02:00
Flyinghead
c535e98099 Fix previous merge 2018-07-03 20:59:31 +02:00
Flyinghead
04630dcd7c Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-03 20:48:26 +02:00
Flyinghead
292ff84e22 Improve performance of translucent modifier volumes.
Same optimization as opaque modvols: use triangles instead of screen
quad.
Makes NBA 2K2 and NFL 2K2 playable.
2018-07-03 17:07:48 +02:00
Flyinghead
9b60a97b44 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-03 15:26:16 +02:00
Flyinghead
373fa178c3 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-03 14:24:43 +02:00
Flyinghead
f9aba6402b Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-02 16:08:16 +02:00
Flyinghead
5fa230c26e Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-07-02 10:29:30 +02:00
Flyinghead
9fdfd70458 Might help non-NVidia GPU. 2018-07-01 22:51:58 +02:00
Flyinghead
5d71987193 Use correct data types for a-buffer pointers texture. Might help
non-NVidia drivers.
2018-07-01 22:03:00 +02:00
Flyinghead
eea20d9942 Remove unneeded shader uniforms and params 2018-07-01 21:18:21 +02:00
Flyinghead
c4cde2c69c Flat shading support
Previously all polys were using Gouraud shading. Fixes wrong colors in Evolution battle.
2018-07-01 19:30:23 +02:00
Flyinghead
f08ea64012 Added back const qualifiers. Workaround for GLSL compile error with
NVidia drivers 396.
2018-07-01 10:07:41 +02:00
Flyinghead
8e309b01aa Delete const and in qualifiers as they cause GL compile errors 2018-06-30 16:33:46 +02:00
Flyinghead
e1c0946cd2 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-30 14:14:14 +02:00
Flyinghead
fac54519c0 Fix autosort mode per render pass. Remove hacky Always depth on
autosorted TR polys.
2018-06-30 13:38:40 +02:00
Flyinghead
d100a3d1cf Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-30 12:55:30 +02:00
Flyinghead
6436587c73 Don't add vertices at end of strip if not merging strips together. 2018-06-30 12:48:30 +02:00
Flyinghead
0a4cdfb973 Support for dst/src select on TR polys
Copy TR poly params to shader memory.
Support src/dst select on TR polys. Fixes white areas in Evil Dead -
Hail to the King.
Limited support for two-volumes TR polys.
2018-06-29 22:30:56 +02:00
Flyinghead
ccfa5b9495 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-29 19:04:53 +02:00
Flyinghead
e570a72c3d Flat shading (non-gouraud) support 2018-06-29 17:39:38 +02:00
Flyinghead
c120f21c0b Dump texture utility for debugging 2018-06-29 17:34:04 +02:00
Flyinghead
9d4ed0cc19 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-29 17:17:40 +02:00
Flyinghead
7941458a15 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-29 17:11:20 +02:00
Flyinghead
2786ab5e31 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-29 12:04:08 +02:00
Flyinghead
4bc4391d97 Set depth to 0 for translucent polys that use always depth func
It seems that autosort translucent polys support the always depth func. In that case their depth should be
ignored for sorting and they should be drawn last (sorted by poly number if there is more than one).
This fixes graphic glitches in Psyvariar. Also observed in V-Rally 2.
2018-06-28 12:25:25 +02:00
Flyinghead
37f5e55d15 Merge remote-tracking branch 'origin/master' into fh/deferred-shading 2018-06-26 15:57:07 +02:00
Flyinghead
b38dea86ee Improve modifier volume parsing and drawing logic.
Parse modifier volumes similarly to other polys (first, count, params).
Draw all triangles in one shot and use quad to sum up instead of
redrawing entire strip. Use OR operation for open volumes/quads
(Soulcalibur).
Support for open transparent modifier volumes (OR).
2018-06-26 14:24:45 +02:00
Flyinghead
2ee45d8d6b Optimize RTT to VRAM. Add US version of THPS2 to per-game settings. 2018-06-25 16:48:57 +02:00
Flyinghead
13725ccdc1 Workaround for Virtua Tennis ball color problem.
There's a texture corruption of the tennis ball and other textures,
notably the players' bags in the first intro sequence. The corruption is
due to render to texture squashing existing textures. Not sure what's
going on but this avoids the texture corruption. The original problem
remains.
2018-06-25 15:48:57 +02:00
Flyinghead
9b2762e1b4 Fix non-autosort translucent polys drawing.
Depth func must be handled manually in non-autosort translucent mode. So
it's now done in the final a-buffer shader. Fixes Namco logo in
Soulcalibur.
2018-06-25 11:26:32 +02:00
Flyinghead
0e091e2e84 Reset a-buffer pointers at init 2018-06-09 12:22:47 +02:00
Flyinghead
609d5bcd19 Push more silence on audio underrun to catch up 2018-06-08 19:35:03 +02:00
Flyinghead
43109bfc45 Use float constants in GLSL 2018-06-07 18:23:26 +02:00
Flyinghead
718c341aa4 Increase TA context size (verts, idx) 2018-06-07 17:36:43 +02:00
Flyinghead
56ed4f7033 Added TA_YUV_TEX_CTRL struct 2018-06-07 17:33:40 +02:00
Flyinghead
9eded0cf37 Fix SW1-JPB random texture corruption problem and video choppiness. 2018-06-07 17:13:42 +02:00
Flyinghead
63b90a5c8e fog "Z" value must be clamped between 1 and 255.9 2018-06-05 12:06:51 +02:00
Flyinghead
64e5d585b6 Disable overflow detection until I can figure out why it slows things
down like that. Buffer set to 512MB. Also z=1 value for quads looks
logical.
2018-06-04 19:30:48 +02:00
Flyinghead
11caeb9d02 Fix OSD location when scale_x > 1. Reset atomic abuffer counter and
check overflow at beginning of cycle, hopefully for better perfs
2018-06-04 12:46:30 +02:00
Flyinghead
5cff417da1 Texture-based fog table 2018-06-03 11:39:32 +02:00
Flyinghead
ef954dfe26 Got rid of viewport_width/height. Another quad fix. 2018-06-02 23:12:01 +02:00
Flyinghead
6fdc9fb0aa Fixed depth problems due to reusing the same depth buffer twice. Fixed
widescreen issue with quads: left margin wasn't drawn or cleared. Was
also interfering with RTT. Fixed problem with TR modifier volumes atomic
operations.
2018-06-02 19:00:46 +02:00
Flyinghead
33b378cbf5 Dynamic pixel buffer resizing 2018-05-31 21:35:18 +02:00
Flyinghead
e2201c7dd0 Skip empty render passes 2018-05-31 21:31:20 +02:00
Flyinghead
bb6e33e0d5 Multipass rendering support 2018-05-31 20:06:58 +02:00
Flyinghead
0bd53de8e6 Increased global_param_tr list (Rez). Add log to identify which list is
overrun.
2018-05-30 22:41:22 +02:00
Flyinghead
8d1bfa7683 Translucent modifier volumes 2018-05-30 20:59:57 +02:00
Flyinghead
ef7cf6c0e5 WIP two-volume mode support 2018-05-30 16:30:07 +02:00
Flyinghead
901940634f Create a PolyParam for each TR polygon strip so that we have a correct
seq_num -> needed for sorting.
2018-05-30 16:29:32 +02:00
Flyinghead
f3b8311955 read_frame fix (again) 2018-05-30 15:17:24 +02:00
Flyinghead
74a28f08fc read_frame fix 2018-05-29 19:52:53 +02:00
Flyinghead
4bca94cf86 Removed pass 2: OP and TR pixels are not added to a-buffers. Discard TR
pixels as much as possible in pass 3 based on blending mode and
color/alpha values. Removed depth and stencil buffer from RTT FBO as
they are no longer needed.
2018-05-29 18:42:21 +02:00
Flyinghead
0bb28b2e64 Fully parse two-volume mode polygons. 2018-05-29 18:33:59 +02:00
Flyinghead
c197570286 A-buffers: linked list implementation 2018-05-28 23:39:47 +02:00
Flyinghead
aa996566fe Parse translucent modifier volumes. Fix for overrun. Still need to be
checked...
2018-05-28 23:38:26 +02:00
Flyinghead
81b96f2ede a-buffers RTT fix 2018-05-28 12:34:14 +02:00
Flyinghead
0d32618203 Cosmetic changes 2018-05-28 12:07:52 +02:00
Flyinghead
00fbc3f6f0 A-buffers: Handle manual sort TR. Sort on depth then on poly number. 2018-05-27 22:51:12 +02:00
Flyinghead
d23ce4b24a Removed previous methods. Better a-buffer impl. 2018-05-27 21:34:18 +02:00
Flyinghead
c005515f21 WIP A-buffers 2018-05-27 10:48:52 +02:00
Flyinghead
1c24ae2c31 WIP experimental Average Colors and Depth Peeling renderers 2018-05-26 10:37:36 +02:00
Flyinghead
a858eb6a11 No need to redraw modvols 2018-05-23 15:16:26 +02:00
Flyinghead
912d138804 Merge branch 'master' into fh/deferred-shading 2018-05-23 15:13:54 +02:00
Flyinghead
9c0489dd5e WIP 2018-05-23 11:39:11 +02:00
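
Note on commit 21eac7d6b0 ("Use an index list of pixels instead of copying all data into a local array ... build the index and insert-sort at the same time"): the idea can be sketched on the CPU as below. This is only an illustration under assumptions, not the shader code. `Fragment` and `buildSortedIndexList` are hypothetical names; the ordering (larger depth first, ties broken by smaller poly number) follows the comparison visible in the `DEPTH_SORTED` branch of `fillAndSortFragmentArray`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the shader's Pixel record.
struct Fragment {
    float depth;
    uint32_t polyNumber;
};

// Builds a sorted list of indices into `frags` without copying the fragments.
// Each new index is inserted into its sorted position as it is appended
// (insertion sort): larger depth first, equal depth ordered by poly number.
std::vector<uint32_t> buildSortedIndexList(const std::vector<Fragment>& frags)
{
    std::vector<uint32_t> order;
    order.reserve(frags.size());
    for (uint32_t i = 0; i < frags.size(); i++) {
        const Fragment& p = frags[i];
        order.push_back(i);
        std::size_t j = order.size() - 1;
        while (j > 0) {
            const Fragment& jp = frags[order[j - 1]];
            const bool shift = jp.depth < p.depth
                || (jp.depth == p.depth && jp.polyNumber > p.polyNumber);
            if (!shift)
                break;
            order[j] = order[j - 1]; // move the earlier entry one slot back
            j--;
        }
        order[j] = i;
    }
    return order;
}
```

Only the small index array is moved during sorting; the fragment data stays where it is, which is the memory saving the commit message describes.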
19 changed files with 2027 additions and 1581 deletions

1
.gitignore vendored
View file

@@ -48,6 +48,7 @@ reicast-ios.xccheckout
 shell/linux/.map
 shell/linux/nosym-reicast.elf
 shell/linux/reicast.elf
+shell/linux/reicast_naomi.elf
 # Visual Studio
 generated

View file

@@ -103,7 +103,7 @@ void dump_frame(const char* file, TA_context* ctx, u8* vram, u8* vram_ref = NULL
 u32 bytes = ctx->tad.End() - ctx->tad.thd_root;
-fwrite("TAFRAME3", 1, 8, fw);
+fwrite("TAFRAME4", 1, 8, fw);
 fwrite(&ctx->rend.isRTT, 1, sizeof(ctx->rend.isRTT), fw);
 u32 zero = 0;
@@ -168,10 +168,17 @@ TA_context* read_frame(const char* file, u8* vram_ref = NULL) {
 fread(id0, 1, 8, fw);
-if (memcmp(id0, "TAFRAME3", 8) != 0) {
+if (memcmp(id0, "TAFRAME", 7) != 0 || (id0[7] != '3' && id0[7] != '4')) {
 fclose(fw);
 return 0;
 }
+int sizeofPolyParam = sizeof(PolyParam);
+int sizeofVertex = sizeof(Vertex);
+if (id0[7] == '3')
+{
+sizeofPolyParam -= 12;
+sizeofVertex -= 16;
+}
 TA_context* ctx = tactx_Alloc();
@@ -184,8 +191,10 @@ TA_context* read_frame(const char* file, u8* vram_ref = NULL) {
 fread(&ctx->rend.fb_X_CLIP.full, 1, sizeof(ctx->rend.fb_X_CLIP.full), fw);
 fread(&ctx->rend.fb_Y_CLIP.full, 1, sizeof(ctx->rend.fb_Y_CLIP.full), fw);
-fread(ctx->rend.global_param_op.Append(), 1, sizeof(PolyParam), fw);
+fread(ctx->rend.global_param_op.Append(), 1, sizeofPolyParam, fw);
-fread(ctx->rend.verts.Append(4), 1, 4 * sizeof(Vertex), fw);
+Vertex *vtx = ctx->rend.verts.Append(4);
+for (int i = 0; i < 4; i++)
+fread(vtx + i, 1, sizeofVertex, fw);
 fread(&t, 1, sizeof(t), fw);
 verify(t == VRAM_SIZE);
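
Note: the version handling in `read_frame` above could be isolated roughly as follows. This is a sketch under assumptions, not code from the branch. `FrameRecordSizes` and `frameRecordSizes` are hypothetical names; the 12 and 16 byte differences are taken from the diff (the new `tsp1`/`tcw1`/`texid1` and `col1`/`spc1`/`u1`/`v1` fields).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: on-disk record sizes for a given dump version.
struct FrameRecordSizes {
    bool valid;            // false if the magic is not TAFRAME3 or TAFRAME4
    std::size_t polyParam; // bytes to read per PolyParam
    std::size_t vertex;    // bytes to read per Vertex
};

// id0 points at the 8-byte magic read from the file.
// fullPolyParam / fullVertex are the current (TAFRAME4) struct sizes.
FrameRecordSizes frameRecordSizes(const char* id0,
                                  std::size_t fullPolyParam,
                                  std::size_t fullVertex)
{
    if (std::memcmp(id0, "TAFRAME", 7) != 0 || (id0[7] != '3' && id0[7] != '4'))
        return {false, 0, 0};
    if (id0[7] == '3') // older dumps lack the two-volume fields
        return {true, fullPolyParam - 12, fullVertex - 16};
    return {true, fullPolyParam, fullVertex};
}
```

Reading each vertex separately with the shorter size, as the diff does, leaves the trailing two-volume fields of every in-memory `Vertex` untouched when loading an old dump.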

View file

@@ -8,6 +8,7 @@ struct List
 int size;
 bool* overrun;
+const char *list_name;
 __forceinline int used() const { return size-avail; }
 __forceinline int bytes() const { return used()* sizeof(T); }
@@ -17,6 +18,8 @@ struct List
 {
 *overrun |= true;
 Clear();
+if (list_name != NULL)
+printf("List overrun for list %s\n", list_name);
 return daty;
 }
@@ -45,7 +48,7 @@ struct List
 T* head() const { return daty-used(); }
-void InitBytes(int maxbytes,bool* ovrn)
+void InitBytes(int maxbytes,bool* ovrn, const char *name)
 {
 maxbytes-=maxbytes%sizeof(T);
@@ -58,11 +61,12 @@ struct List
 overrun=ovrn;
 Clear();
+list_name = name;
 }
-void Init(int maxsize,bool* ovrn)
+void Init(int maxsize,bool* ovrn, const char *name)
 {
-InitBytes(maxsize*sizeof(T),ovrn);
+InitBytes(maxsize*sizeof(T),ovrn, name);
 }
 void Clear()
@@ -76,4 +80,4 @@ struct List
 Clear();
 free(daty);
 }
-};
+};
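
Note: the effect of the new `list_name` member can be illustrated with a reduced container. This is a simplified sketch, not the `List<T>` from the branch. `NamedList` is a hypothetical name, it uses `std::vector` for storage, and its overrun behaviour (restart from slot 0) is an assumption made for the example.

```cpp
#include <cstdio>
#include <vector>

// Reduced stand-in for List<T>: fixed capacity, shared overrun flag, and a
// name that is printed when the capacity is exceeded.
template <typename T>
struct NamedList {
    std::vector<T> storage;
    int used = 0;
    bool* overrun = nullptr;
    const char* name = nullptr;

    // maxsize is assumed to be greater than zero.
    void Init(int maxsize, bool* ovrn, const char* list_name) {
        storage.assign(maxsize, T());
        used = 0;
        overrun = ovrn;
        name = list_name;
    }

    // Returns the next free slot. On overrun, sets the flag, reports which
    // list overflowed, and starts again from the first slot.
    T* Append() {
        if (used >= (int)storage.size()) {
            *overrun = true;
            used = 0;
            if (name != nullptr)
                std::printf("List overrun for list %s\n", name);
        }
        return &storage[used++];
    }
};
```

Because every list shares one `Overrun` flag in `rend_context`, the name is the only way to tell from the log which list needs a larger capacity.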

View file

@@ -37,9 +37,9 @@ void YUV_init()
 YUV_dest=TA_YUV_TEX_BASE&VRAM_MASK;//TODO : add the masking needed
 TA_YUV_TEX_CNT=0;
-YUV_blockcount=(((TA_YUV_TEX_CTRL>>0)&0x3F)+1)*(((TA_YUV_TEX_CTRL>>8)&0x3F)+1);
+YUV_blockcount = (TA_YUV_TEX_CTRL.yuv_u_size + 1) * (TA_YUV_TEX_CTRL.yuv_v_size + 1);
-if ((TA_YUV_TEX_CTRL>>16 )&1)
+if (TA_YUV_TEX_CTRL.yuv_tex != 0)
 {
 die ("YUV: Not supported configuration\n");
 YUV_x_size=16;
@@ -47,8 +47,8 @@ void YUV_init()
 }
 else // yesh!!!
 {
-YUV_x_size=(((TA_YUV_TEX_CTRL>>0)&0x3F)+1)*16;
+YUV_x_size = (TA_YUV_TEX_CTRL.yuv_u_size + 1) * 16;
-YUV_y_size=(((TA_YUV_TEX_CTRL>>8)&0x3F)+1)*16;
+YUV_y_size = (TA_YUV_TEX_CTRL.yuv_v_size + 1) * 16;
 }
 }
@@ -164,7 +164,7 @@ void YUV_data(u32* data , u32 count)
 YUV_init();
 }
-u32 block_size=(TA_YUV_TEX_CTRL & (1<<24))==0?384:512;
+u32 block_size = TA_YUV_TEX_CTRL.yuv_form == 0 ? 384 : 512;
 verify(block_size==384); //no support for 512

View file

@@ -370,7 +370,22 @@ union TA_GLOB_TILE_CLIP_type
 };
 u32 full;
 };
+union TA_YUV_TEX_CTRL_type
+{
+struct
+{
+u32 yuv_u_size : 6;
+u32 reserved1 : 2;
+u32 yuv_v_size : 6;
+u32 reserved2 : 2;
+u32 yuv_tex : 1;
+u32 reserved3 : 7;
+u32 yuv_form : 1;
+u32 reserved4 : 7;
+};
+u32 full;
+};
 // TA REGS
 #define TA_OL_BASE_addr 0x00000124 // RW Object list write start address
@@ -483,7 +498,7 @@ union TA_GLOB_TILE_CLIP_type
 #define TA_ALLOC_CTRL PvrReg(TA_ALLOC_CTRL_addr,u32) // RW Object list control
 #define TA_LIST_INIT PvrReg(TA_LIST_INIT_addr,u32) // RW TA initialization
 #define TA_YUV_TEX_BASE PvrReg(TA_YUV_TEX_BASE_addr,u32) // RW YUV422 texture write start address
-#define TA_YUV_TEX_CTRL PvrReg(TA_YUV_TEX_CTRL_addr,u32) // RW YUV converter control
+#define TA_YUV_TEX_CTRL PvrReg(TA_YUV_TEX_CTRL_addr, TA_YUV_TEX_CTRL_type) // RW YUV converter control
 #define TA_YUV_TEX_CNT PvrReg(TA_YUV_TEX_CNT_addr,u32) // R YUV converter macro block counter value
 #define TA_LIST_CONT PvrReg(TA_LIST_CONT_addr,u32) // RW TA continuation processing
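
Note: the bit-field layout of `TA_YUV_TEX_CTRL_type` corresponds to the shifts and masks it replaces in `YUV_init`. A portable way to check that correspondence is sketched below. `YuvTexCtrl`, `decodeYuvTexCtrl` and `yuvBlockCount` are hypothetical names; the code uses explicit shifts rather than bit-fields, since bit-field layout is implementation-defined.

```cpp
#include <cstdint>

// Hypothetical plain struct mirroring the named fields of TA_YUV_TEX_CTRL_type.
struct YuvTexCtrl {
    uint32_t u_size; // bits 0-5
    uint32_t v_size; // bits 8-13
    uint32_t tex;    // bit 16
    uint32_t form;   // bit 24
};

// Decodes the raw register value with the same masks the old code used.
YuvTexCtrl decodeYuvTexCtrl(uint32_t raw)
{
    YuvTexCtrl c;
    c.u_size = (raw >> 0) & 0x3F;
    c.v_size = (raw >> 8) & 0x3F;
    c.tex = (raw >> 16) & 1;
    c.form = (raw >> 24) & 1;
    return c;
}

// Number of 16x16 macro blocks, as computed for YUV_blockcount.
uint32_t yuvBlockCount(const YuvTexCtrl& c)
{
    return (c.u_size + 1) * (c.v_size + 1);
}
```

For example, a register value with `u_size` 39 and `v_size` 29 describes a 640x480 texture made of 40 x 30 macro blocks.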

View file

@@ -23,10 +23,10 @@ bool ta_parse_vdrc(TA_context* ctx);
 #define STRIPS_AS_PPARAMS 1
-#define TRIG_SORT 1
+#define TRIG_SORT 0
 #if TRIG_SORT
 #undef STRIPS_AS_PPARAMS
 #define STRIPS_AS_PPARAMS 1
 #endif

View file

@@ -17,6 +17,12 @@ struct Vertex
 u8 spc[4];
 float u,v;
+// Two volumes format
+u8 col1[4];
+u8 spc1[4];
+float u1,v1;
 };
 struct PolyParam
@@ -35,6 +41,9 @@ struct PolyParam
 float zvZ;
 u32 tileclip;
 //float zMin,zMax;
+TSP tsp1;
+TCW tcw1;
+u32 texid1;
 };
 struct ModifierVolumeParam
@@ -98,6 +107,7 @@ struct RenderPass {
 u32 mvo_count;
 u32 pt_count;
 u32 tr_count;
+u32 mvo_tr_count;
 };
 struct rend_context
@@ -124,6 +134,7 @@ struct rend_context
 List<u16> idx;
 List<ModTriangle> modtrig;
 List<ModifierVolumeParam> global_param_mvo;
+List<ModifierVolumeParam> global_param_mvo_tr;
 List<PolyParam> global_param_op;
 List<PolyParam> global_param_pt;
@@ -139,6 +150,7 @@ struct rend_context
 global_param_tr.Clear();
 modtrig.Clear();
 global_param_mvo.Clear();
+global_param_mvo_tr.Clear();
 render_passes.Clear();
 Overrun=false;
@@ -189,16 +201,23 @@ struct TA_context
 {
 tad.Reset((u8*)OS_aligned_malloc(32, 8*1024*1024));
-rend.verts.InitBytes(2*1024*1024,&rend.Overrun); //up to 2 MB of vtx data/frame = ~ 75k vtx/frame
+rend.verts.InitBytes(4 * 1024 * 1024, &rend.Overrun, "verts"); //up to 4 mb of vtx data/frame = ~ 96k vtx/frame
-rend.idx.Init(120*1024,&rend.Overrun); //up to 120K indexes ( idx have stripification overhead )
+rend.idx.Init(120 * 1024, &rend.Overrun, "idx"); //up to 120K indexes ( idx have stripification overhead )
-rend.global_param_op.Init(4096,&rend.Overrun);
+rend.global_param_op.Init(4096, &rend.Overrun, "global_param_op");
-rend.global_param_pt.Init(4096,&rend.Overrun);
+rend.global_param_pt.Init(4096, &rend.Overrun, "global_param_pt");
-rend.global_param_mvo.Init(4096,&rend.Overrun);
+rend.global_param_mvo.Init(4096, &rend.Overrun, "global_param_mvo");
-rend.global_param_tr.Init(8192,&rend.Overrun);
+#if STRIPS_AS_PPARAMS
+// That makes a lot of polyparams but this is required for proper sorting...
+// Rez uses more than 8192 translucent polygons sometimes
+rend.global_param_tr.Init(10240, &rend.Overrun, "global_param_tr");
+#else
+rend.global_param_tr.Init(4096, &rend.Overrun, "global_param_tr");
+#endif
+rend.global_param_mvo_tr.Init(4096, &rend.Overrun, "global_param_mvo_tr");
-rend.modtrig.Init(8192,&rend.Overrun);
+rend.modtrig.Init(16384, &rend.Overrun, "modtrig");
-rend.render_passes.Init(sizeof(RenderPass) * 10, &rend.Overrun); // 10 render passes
+rend.render_passes.Init(sizeof(RenderPass) * 10, &rend.Overrun, "render_passes"); // 10 render passes
 Reset();
 }
@@ -222,6 +241,7 @@ struct TA_context
 rend.global_param_tr.Free();
 rend.modtrig.Free();
 rend.global_param_mvo.Free();
+rend.global_param_mvo_tr.Free();
 rend.render_passes.Free();
 }
 };

View file

@@ -85,6 +85,8 @@ List<PolyParam>* CurrentPPlist;
 //TA state vars
 DECL_ALIGN(4) u8 FaceBaseColor[4];
 DECL_ALIGN(4) u8 FaceOffsColor[4];
+DECL_ALIGN(4) u8 FaceBaseColor1[4];
+DECL_ALIGN(4) u8 FaceOffsColor1[4];
 DECL_ALIGN(4) u32 SFaceBaseColor;
 DECL_ALIGN(4) u32 SFaceOffsColor;
@@ -769,7 +771,7 @@ public:
 CurrentPP=&nullPP;
 CurrentPPlist=0;
-if (ListType == ListType_Opaque_Modifier_Volume)
+if (ListType == ListType_Opaque_Modifier_Volume || ListType == ListType_Translucent_Modifier_Volume)
 EndModVol();
 }
@@ -808,6 +810,9 @@ public:
 if (d_pp->pcw.Texture) {
 d_pp->texid = renderer->GetTexture(d_pp->tsp,d_pp->tcw);
 }
+d_pp->tsp1.full = -1;
+d_pp->tcw1.full = -1;
+d_pp->texid1 = -1;
 }
 }
@@ -860,6 +865,11 @@ public:
 TA_PolyParam3* pp=(TA_PolyParam3*)vpp;
 glob_param_bdc(pp);
+CurrentPP->tsp1.full = pp->tsp1.full;
+CurrentPP->tcw1.full = pp->tcw1.full;
+if (pp->pcw.Texture)
+CurrentPP->texid1 = renderer->GetTexture(pp->tsp1, pp->tcw1);
 }
 __forceinline
 static void TACALL AppendPolyParam4A(void* vpp)
@@ -867,13 +877,19 @@ public:
 TA_PolyParam4A* pp=(TA_PolyParam4A*)vpp;
 glob_param_bdc(pp);
+CurrentPP->tsp1.full = pp->tsp1.full;
+CurrentPP->tcw1.full = pp->tcw1.full;
+if (pp->pcw.Texture)
+CurrentPP->texid1 = renderer->GetTexture(pp->tsp1, pp->tcw1);
 }
 __forceinline
 static void TACALL AppendPolyParam4B(void* vpp)
 {
 TA_PolyParam4B* pp=(TA_PolyParam4B*)vpp;
-poly_float_color(FaceBaseColor,FaceColor0);
+poly_float_color(FaceBaseColor, FaceColor0);
+poly_float_color(FaceBaseColor1, FaceColor1);
 }
 //Poly Strip handling
@@ -884,13 +900,6 @@ public:
 {
 CurrentPP->count=vdrc.idx.used() - CurrentPP->first;
-int vbase=vdrc.verts.used();
-*vdrc.idx.Append()=vbase-1;
-*vdrc.idx.Append()=vbase;
-if (CurrentPP->count&1)
-*vdrc.idx.Append()=vbase;
 #if STRIPS_AS_PPARAMS
 if (CurrentPPlist==&vdrc.global_param_tr)
 {
@@ -900,7 +909,20 @@ public:
 d_pp->first=vdrc.idx.used();
 d_pp->count=0;
 }
+else
+{
 #endif
+int vbase=vdrc.verts.used();
+*vdrc.idx.Append()=vbase-1;
+*vdrc.idx.Append()=vbase;
+if (CurrentPP->count&1)
+*vdrc.idx.Append()=vbase;
+#if STRIPS_AS_PPARAMS
+}
+#endif
 }
@@ -941,6 +963,14 @@ public:
 cv->u = f16(vtx->u_name);\
 cv->v = f16(vtx->v_name);
+#define vert_uv1_32(u_name,v_name) \
+cv->u1 = (vtx->u_name);\
+cv->v1 = (vtx->v_name);
+#define vert_uv1_16(u_name,v_name) \
+cv->u1 = f16(vtx->u_name);\
+cv->v1 = f16(vtx->v_name);
 //Color conversions
 #define vert_packed_color_(to,src) \
 { \
@@ -984,6 +1014,20 @@ public:
 cv->spc[2] = FaceOffsColor[2]*satint/256; \
 cv->spc[3] = FaceOffsColor[3]; }
+#define vert_face_base_color1(baseint) \
+{ u32 satint=float_to_satu8(vtx->baseint); \
+cv->col1[0] = FaceBaseColor1[0]*satint/256; \
+cv->col1[1] = FaceBaseColor1[1]*satint/256; \
+cv->col1[2] = FaceBaseColor1[2]*satint/256; \
+cv->col1[3] = FaceBaseColor1[3]; }
+#define vert_face_offs_color1(offsint) \
+{ u32 satint=float_to_satu8(vtx->offsint); \
+cv->spc1[0] = FaceOffsColor1[0]*satint/256; \
+cv->spc1[1] = FaceOffsColor1[1]*satint/256; \
+cv->spc1[2] = FaceOffsColor1[2]*satint/256; \
+cv->spc1[3] = FaceOffsColor1[3]; }
 //vert_float_color_(cv->spc,FaceOffsColor[3],FaceOffsColor[0]*satint/256,FaceOffsColor[1]*satint/256,FaceOffsColor[2]*satint/256); }
@@ -1109,6 +1153,7 @@ public:
 vert_cvt_base;
 vert_packed_color(col,BaseCol0);
+vert_packed_color(col1, BaseCol1);
 }
 //(Non-Textured, Intensity, with Two Volumes)
@@ -1118,6 +1163,7 @@ public:
 vert_cvt_base;
 vert_face_base_color(BaseInt0);
+vert_face_base_color1(BaseInt1);
 }
 //(Textured, Packed Color, with Two Volumes)
@@ -1136,6 +1182,10 @@ public:
 {
 vert_res_base;
+vert_packed_color(col1, BaseCol1);
+vert_packed_color(spc1, OffsCol1);
+vert_uv1_32(u1, v1);
 }
 //(Textured, Packed Color, 16bit UV, with Two Volumes)
@@ -1154,6 +1204,10 @@ public:
 {
 vert_res_base;
+vert_packed_color(col1, BaseCol1);
+vert_packed_color(spc1, OffsCol1);
+vert_uv1_16(u1, v1);
 }
 //(Textured, Intensity, with Two Volumes)
@@ -1172,6 +1226,10 @@ public:
 {
 vert_res_base;
+vert_face_base_color1(BaseInt1);
+vert_face_offs_color1(OffsInt1);
+vert_uv1_32(u1,v1);
 }
 //(Textured, Intensity, 16bit UV, with Two Volumes)
@@ -1190,6 +1248,10 @@ public:
 {
 vert_res_base;
+vert_face_base_color1(BaseInt1);
+vert_face_offs_color1(OffsInt1);
+vert_uv1_16(u1, v1);
 }
 //Sprites
@@ -1217,6 +1279,9 @@ public:
 if (d_pp->pcw.Texture) {
 d_pp->texid = renderer->GetTexture(d_pp->tsp,d_pp->tcw);
 }
+d_pp->tcw1.full = -1;
+d_pp->tsp1.full = -1;
+d_pp->texid1 = -1;
 SFaceBaseColor=spr->BaseCol;
 SFaceOffsColor=spr->OffsCol;
@@ -1374,6 +1439,8 @@ public:
 List<ModifierVolumeParam> *list = NULL;
 if (CurrentList == ListType_Opaque_Modifier_Volume)
 list = &vdrc.global_param_mvo;
+else if (CurrentList == ListType_Translucent_Modifier_Volume)
+list = &vdrc.global_param_mvo_tr;
 else
 return;
 if (list->used() > 0)
@@ -1390,6 +1457,8 @@ public:
 ModifierVolumeParam *p = NULL;
 if (CurrentList == ListType_Opaque_Modifier_Volume)
 p = vdrc.global_param_mvo.Append();
+else if (CurrentList == ListType_Translucent_Modifier_Volume)
+p = vdrc.global_param_mvo_tr.Append();
 else
 return;
 p->isp.full = param->isp.full;
@@ -1399,7 +1468,7 @@ public:
 __forceinline
 static void AppendModVolVertexA(TA_ModVolA* mvv)
 {
-if (CurrentList!=ListType_Opaque_Modifier_Volume)
+if (CurrentList != ListType_Opaque_Modifier_Volume && CurrentList != ListType_Translucent_Modifier_Volume)
 return;
 lmr=vdrc.modtrig.Append();
@@ -1419,7 +1488,7 @@ public:
 __forceinline
 static void AppendModVolVertexB(TA_ModVolB* mvv)
 {
-if (CurrentList!=ListType_Opaque_Modifier_Volume)
+if (CurrentList != ListType_Opaque_Modifier_Volume && CurrentList != ListType_Translucent_Modifier_Volume)
 return;
 lmr->y2=mvv->y2;
 lmr->z2=mvv->z2;
@@ -1486,6 +1555,7 @@ bool ta_parse_vdrc(TA_context* ctx)
 render_pass->mvo_count = vd_rc.global_param_mvo.used();
 render_pass->pt_count = vd_rc.global_param_pt.used();
 render_pass->tr_count = vd_rc.global_param_tr.used();
+render_pass->mvo_tr_count = vd_rc.global_param_mvo_tr.used();
 render_pass->autosort = UsingAutoSort(pass);
 render_pass->z_clear = ClearZBeforePass(pass);
 }
@@ -1637,6 +1707,9 @@ void FillBGP(TA_context* ctx)
 bgpp->isp.full=vri(strip_base);
 bgpp->tsp.full=vri(strip_base+4);
 bgpp->tcw.full=vri(strip_base+8);
+bgpp->tcw1.full = -1;
+bgpp->tsp1.full = -1;
+bgpp->texid1 = -1;
 bgpp->count=4;
 bgpp->first=0;
 bgpp->tileclip=0;//disabled ! HA ~

View file

@@ -704,9 +704,11 @@ void x11_window_create()
 verify(glXCreateContextAttribsARB != 0);
 int context_attribs[] =
 {
-GLX_CONTEXT_MAJOR_VERSION_ARB, 3,
+GLX_CONTEXT_MAJOR_VERSION_ARB, 4,
-GLX_CONTEXT_MINOR_VERSION_ARB, 1,
+GLX_CONTEXT_MINOR_VERSION_ARB, 3,
+#ifndef RELEASE
 GLX_CONTEXT_FLAGS_ARB, GLX_CONTEXT_DEBUG_BIT_ARB,
+#endif
 GLX_CONTEXT_PROFILE_MASK_ARB, GLX_CONTEXT_CORE_PROFILE_BIT_ARB,
 None
 };
@@ -716,7 +718,7 @@ void x11_window_create()
 if (!x11_glc)
 {
-die("Failed to create GL3.1 context\n");
+die("Failed to create OpenGL 4.3 context\n");
 }
 #endif

View file

@ -7,6 +7,8 @@ static bool pcm_blocking = true;
static snd_pcm_uframes_t buffer_size; static snd_pcm_uframes_t buffer_size;
static snd_pcm_uframes_t period_size; static snd_pcm_uframes_t period_size;
#define MAX_LATENCY 100
// We're making these functions static - there's no need to pollute the global namespace // We're making these functions static - there's no need to pollute the global namespace
static void alsa_init() static void alsa_init()
{ {
@ -89,7 +91,7 @@ static void alsa_init()
} }
else else
printf("ALSA: period size set to %ld\n", period_size); printf("ALSA: period size set to %ld\n", period_size);
buffer_size = (44100 * 100 /* settings.omx.Audio_Latency */ / 1000 / period_size + 1) * period_size; buffer_size = (44100 * MAX_LATENCY / 1000 / period_size + 1) * period_size;
rc=snd_pcm_hw_params_set_buffer_size_near(handle, params, &buffer_size); rc=snd_pcm_hw_params_set_buffer_size_near(handle, params, &buffer_size);
if (rc < 0) if (rc < 0)
{ {
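The buffer-size expression in this hunk rounds the requested latency up to a whole number of periods. A minimal sketch of that arithmetic, assuming integer division throughout; `alsaBufferFrames` and its parameter names are hypothetical and not part of the patch:

```cpp
#include <cstdint>

// Hypothetical helper mirroring the expression used in alsa_init():
//   (rate * latency_ms / 1000 / period + 1) * period
// Every division is an integer division, so the result is a whole number
// of periods and always exceeds the requested latency expressed in frames.
// periodFrames must be non-zero.
constexpr std::uint64_t alsaBufferFrames(std::uint64_t sampleRate,
                                         std::uint64_t latencyMs,
                                         std::uint64_t periodFrames)
{
    return (sampleRate * latencyMs / 1000 / periodFrames + 1) * periodFrames;
}
```

With a 44100 Hz rate, MAX_LATENCY of 100 ms and a 1024-frame period, this gives 5 periods, i.e. 5120 frames.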


@ -161,11 +161,7 @@ struct pp_8888
{ {
__forceinline static u32 packRGB(u8 R,u8 G,u8 B) __forceinline static u32 packRGB(u8 R,u8 G,u8 B)
{ {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ && defined(GLES)
return (R << 0) | (G << 8) | (B << 16) | 0xFF000000;
#else
return (R << 24) | (G << 16) | (B << 8) | 0xFF; return (R << 24) | (G << 16) | (B << 8) | 0xFF;
#endif
} }
}; };
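With the GLES branch removed, packRGB always produces a 0xRRGGBBFF word. The following is a sketch only, using a hypothetical `packRGB8888` name and standard fixed-width types instead of the project's `u8`/`u32`; it casts each channel before shifting so the shift never operates on a promoted signed int:

```cpp
#include <cstdint>

// Sketch of the remaining packRGB branch: red in the top byte, then green,
// then blue, with a fully opaque alpha in the low byte (0xRRGGBBFF).
constexpr std::uint32_t packRGB8888(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (static_cast<std::uint32_t>(r) << 24)
         | (static_cast<std::uint32_t>(g) << 16)
         | (static_cast<std::uint32_t>(b) << 8)
         | 0xFFu;
}
```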

core/rend/gles/abuffer.cpp (new file, 545 lines)

@ -0,0 +1,545 @@
/*
* abuffer.cpp
*
* Created on: May 26, 2018
* Author: raph
*/
#include "glcache.h"
GLuint pixels_buffer;
GLuint pixels_pointers;
GLuint atomic_buffer;
PipelineShader g_abuffer_final_shader;
PipelineShader g_abuffer_final_nosort_shader;
PipelineShader g_abuffer_clear_shader;
PipelineShader g_abuffer_tr_modvol_shaders[ModeCount];
static GLuint g_quadBuffer = 0;
static GLuint g_quadVertexArray = 0;
static int g_imageWidth = 0;
static int g_imageHeight = 0;
GLuint pixel_buffer_size = 512 * 1024 * 1024; // Initial size 512 MB
#define MAX_PIXELS_PER_FRAGMENT "32"
static const char *final_shader_source = SHADER_HEADER "\
#define DEPTH_SORTED %d \n\
#define MAX_PIXELS_PER_FRAGMENT " MAX_PIXELS_PER_FRAGMENT " \n\
\n\
layout(binding = 0) uniform sampler2D tex; \n\
uniform highp float shade_scale_factor; \n\
\n\
out vec4 FragColor; \n\
\n\
uint pixel_list[MAX_PIXELS_PER_FRAGMENT]; \n\
\n\
\n\
int fillAndSortFragmentArray(ivec2 coords) \n\
{ \n\
// Load fragments into a local memory array for sorting \n\
uint idx = imageLoad(abufferPointerImg, coords).x; \n\
int count = 0; \n\
for (; idx != EOL && count < MAX_PIXELS_PER_FRAGMENT; count++) \n\
{ \n\
const Pixel p = pixels[idx]; \n\
int j = count - 1; \n\
Pixel jp = pixels[pixel_list[j]]; \n\
#if DEPTH_SORTED == 1 \n\
while (j >= 0 \n\
&& (jp.depth < p.depth \n\
|| (jp.depth == p.depth && getPolyNumber(jp) > getPolyNumber(p)))) \n\
#else \n\
while (j >= 0 && getPolyNumber(jp) > getPolyNumber(p)) \n\
#endif \n\
{ \n\
pixel_list[j + 1] = pixel_list[j]; \n\
j--; \n\
jp = pixels[pixel_list[j]]; \n\
} \n\
pixel_list[j + 1] = idx; \n\
idx = p.next; \n\
} \n\
return count; \n\
} \n\
\n\
// Blend fragments back-to-front \n\
vec4 resolveAlphaBlend(ivec2 coords) { \n\
\n\
// Copy and sort fragments into a local array \n\
int num_frag = fillAndSortFragmentArray(coords); \n\
\n\
vec4 finalColor = texture(tex, gl_FragCoord.xy / textureSize(tex, 0)); \n\
vec4 secondaryBuffer = vec4(0.0); // Secondary accumulation buffer \n\
float depth = 1.0; \n\
\n\
for (int i = 0; i < num_frag; i++) \n\
{ \n\
const Pixel pixel = pixels[pixel_list[i]]; \n\
const PolyParam pp = tr_poly_params[getPolyNumber(pixel)]; \n\
#if DEPTH_SORTED != 1 \n\
const float frag_depth = pixel.depth; \n\
switch (getDepthFunc(pp)) \n\
{ \n\
case 0: // Never \n\
continue; \n\
case 1: // Greater \n\
if (frag_depth <= depth) \n\
continue; \n\
break; \n\
case 2: // Equal \n\
if (frag_depth != depth) \n\
continue; \n\
break; \n\
case 3: // Greater or equal \n\
if (frag_depth < depth) \n\
continue; \n\
break; \n\
case 4: // Less \n\
if (frag_depth >= depth) \n\
continue; \n\
break; \n\
case 5: // Not equal \n\
if (frag_depth == depth) \n\
continue; \n\
break; \n\
case 6: // Less or equal \n\
if (frag_depth > depth) \n\
continue; \n\
break; \n\
case 7: // Always \n\
break; \n\
} \n\
\n\
if (getDepthMask(pp)) \n\
depth = frag_depth; \n\
#endif \n\
bool area1 = false; \n\
bool shadowed = false; \n\
if (isShadowed(pixel)) \n\
{ \n\
if (isTwoVolumes(pp)) \n\
area1 = true; \n\
else \n\
shadowed = true; \n\
} \n\
vec4 srcColor; \n\
if (getSrcSelect(pp, area1)) \n\
srcColor = secondaryBuffer; \n\
else \n\
{ \n\
srcColor = pixel.color; \n\
if (shadowed) \n\
srcColor.rgb *= shade_scale_factor; \n\
} \n\
vec4 dstColor = getDstSelect(pp, area1) ? secondaryBuffer : finalColor; \n\
vec4 srcCoef; \n\
vec4 dstCoef; \n\
\n\
int srcBlend = getSrcBlendFunc(pp, area1); \n\
switch (srcBlend) \n\
{ \n\
case ZERO: \n\
srcCoef = vec4(0.0); \n\
break; \n\
case ONE: \n\
srcCoef = vec4(1.0); \n\
break; \n\
case OTHER_COLOR: \n\
srcCoef = finalColor; \n\
break; \n\
case INVERSE_OTHER_COLOR: \n\
srcCoef = vec4(1.0) - dstColor; \n\
break; \n\
case SRC_ALPHA: \n\
srcCoef = vec4(srcColor.a); \n\
break; \n\
case INVERSE_SRC_ALPHA: \n\
srcCoef = vec4(1.0 - srcColor.a); \n\
break; \n\
case DST_ALPHA: \n\
srcCoef = vec4(dstColor.a); \n\
break; \n\
case INVERSE_DST_ALPHA: \n\
srcCoef = vec4(1.0 - dstColor.a); \n\
break; \n\
} \n\
int dstBlend = getDstBlendFunc(pp, area1); \n\
switch (dstBlend) \n\
{ \n\
case ZERO: \n\
dstCoef = vec4(0.0); \n\
break; \n\
case ONE: \n\
dstCoef = vec4(1.0); \n\
break; \n\
case OTHER_COLOR: \n\
dstCoef = srcColor; \n\
break; \n\
case INVERSE_OTHER_COLOR: \n\
dstCoef = vec4(1.0) - srcColor; \n\
break; \n\
case SRC_ALPHA: \n\
dstCoef = vec4(srcColor.a); \n\
break; \n\
case INVERSE_SRC_ALPHA: \n\
dstCoef = vec4(1.0 - srcColor.a); \n\
break; \n\
case DST_ALPHA: \n\
dstCoef = vec4(dstColor.a); \n\
break; \n\
case INVERSE_DST_ALPHA: \n\
dstCoef = vec4(1.0 - dstColor.a); \n\
break; \n\
} \n\
const vec4 result = clamp(dstColor * dstCoef + srcColor * srcCoef, 0.0, 1.0); \n\
if (getDstSelect(pp, area1)) \n\
secondaryBuffer = result; \n\
else \n\
finalColor = result; \n\
} \n\
\n\
return finalColor; \n\
\n\
} \n\
\n\
void main(void) \n\
{ \n\
ivec2 coords = ivec2(gl_FragCoord.xy); \n\
// Compute and output final color for the frame buffer \n\
// Visualize the number of layers in use \n\
//FragColor = vec4(float(fillAndSortFragmentArray(coords)) / MAX_PIXELS_PER_FRAGMENT, 0, 0, 1); \n\
FragColor = resolveAlphaBlend(coords); \n\
} \n\
";
static const char *clear_shader_source = SHADER_HEADER "\
\n\
void main(void) \n\
{ \n\
ivec2 coords = ivec2(gl_FragCoord.xy); \n\
\n\
// Reset pointers \n\
imageStore(abufferPointerImg, coords, uvec4(EOL)); \n\
\n\
// Discard fragment so nothing is written to the framebuffer \n\
discard; \n\
} \n\
";
static const char *tr_modvol_shader_source = SHADER_HEADER "\
#define MV_MODE %d \n\
#define MAX_PIXELS_PER_FRAGMENT " MAX_PIXELS_PER_FRAGMENT " \n\
\n\
// Must match ModifierVolumeMode enum values \n\
#define MV_XOR 0 \n\
#define MV_OR 1 \n\
#define MV_INCLUSION 2 \n\
#define MV_EXCLUSION 3 \n\
\n\
void main(void) \n\
{ \n\
#if MV_MODE == MV_XOR || MV_MODE == MV_OR \n\
setFragDepth(); \n\
#endif \n\
ivec2 coords = ivec2(gl_FragCoord.xy); \n\
\n\
uint idx = imageLoad(abufferPointerImg, coords).x; \n\
int list_len = 0; \n\
while (idx != EOL && list_len < MAX_PIXELS_PER_FRAGMENT) \n\
{ \n\
const Pixel pixel = pixels[idx]; \n\
const PolyParam pp = tr_poly_params[getPolyNumber(pixel)]; \n\
if (getShadowEnable(pp)) \n\
{ \n\
#if MV_MODE == MV_XOR \n\
if (gl_FragDepth <= pixel.depth) \n\
atomicXor(pixels[idx].seq_num, SHADOW_STENCIL); \n\
#elif MV_MODE == MV_OR \n\
if (gl_FragDepth <= pixel.depth) \n\
atomicOr(pixels[idx].seq_num, SHADOW_STENCIL); \n\
#elif MV_MODE == MV_INCLUSION \n\
uint prev_val = atomicAnd(pixels[idx].seq_num, ~(SHADOW_STENCIL)); \n\
if ((prev_val & (SHADOW_STENCIL|SHADOW_ACC)) == SHADOW_STENCIL) \n\
pixels[idx].seq_num = bitfieldInsert(pixel.seq_num, 1u, 31, 1); \n\
#elif MV_MODE == MV_EXCLUSION \n\
uint prev_val = atomicAnd(pixels[idx].seq_num, ~(SHADOW_STENCIL|SHADOW_ACC)); \n\
if ((prev_val & (SHADOW_STENCIL|SHADOW_ACC)) == SHADOW_ACC) \n\
pixels[idx].seq_num = bitfieldInsert(pixel.seq_num, 1u, 31, 1); \n\
#endif \n\
} \n\
idx = pixel.next; \n\
list_len++; \n\
} \n\
\n\
discard; \n\
} \n\
";
void DrawQuad();
void initABuffer()
{
g_imageWidth = screen_width;
g_imageHeight = screen_height;
if (g_imageWidth > 0 && g_imageHeight > 0)
{
if (pixels_pointers == 0)
pixels_pointers = glcache.GenTexture();
glActiveTexture(GL_TEXTURE4);
glBindTexture(GL_TEXTURE_2D, pixels_pointers);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_R32UI, g_imageWidth, g_imageHeight, 0, GL_RED_INTEGER, GL_UNSIGNED_INT, 0);
glBindImageTexture(4, pixels_pointers, 0, false, 0, GL_READ_WRITE, GL_R32UI);
glCheck();
}
if (pixels_buffer == 0 )
{
// Create the buffer
glGenBuffers(1, &pixels_buffer);
// Bind it
glBindBuffer(GL_SHADER_STORAGE_BUFFER, pixels_buffer);
// Declare storage
glBufferData(GL_SHADER_STORAGE_BUFFER, pixel_buffer_size, NULL, GL_DYNAMIC_COPY);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, pixels_buffer);
glCheck();
}
if (atomic_buffer == 0 )
{
// Create the buffer
glGenBuffers(1, &atomic_buffer);
// Bind it
glBindBuffer(GL_ATOMIC_COUNTER_BUFFER, atomic_buffer);
// Declare storage
glBufferData(GL_ATOMIC_COUNTER_BUFFER, 4, NULL, GL_DYNAMIC_COPY);
glBindBufferBase(GL_ATOMIC_COUNTER_BUFFER, 0, atomic_buffer);
GLint zero = 0;
glBufferSubData(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLint), &zero);
glCheck();
}
if (g_abuffer_final_shader.program == 0)
{
char source[16384];
sprintf(source, final_shader_source, 1);
CompilePipelineShader(&g_abuffer_final_shader, source);
}
if (g_abuffer_final_nosort_shader.program == 0)
{
char source[16384];
sprintf(source, final_shader_source, 0);
CompilePipelineShader(&g_abuffer_final_nosort_shader, source);
}
if (g_abuffer_clear_shader.program == 0)
CompilePipelineShader(&g_abuffer_clear_shader, clear_shader_source);
if (g_abuffer_tr_modvol_shaders[0].program == 0)
{
char source[16384];
for (int mode = 0; mode < ModeCount; mode++)
{
sprintf(source, tr_modvol_shader_source, mode);
CompilePipelineShader(&g_abuffer_tr_modvol_shaders[mode], source);
}
}
if (g_quadVertexArray == 0)
glGenVertexArrays(1, &g_quadVertexArray);
if (g_quadBuffer == 0)
glGenBuffers(1, &g_quadBuffer);
glCheck();
// Clear A-buffer pointers
glcache.UseProgram(g_abuffer_clear_shader.program);
ShaderUniforms.Set(&g_abuffer_clear_shader);
DrawQuad();
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
glCheck();
}
void reshapeABuffer(int w, int h)
{
if (w != g_imageWidth || h != g_imageHeight) {
if (pixels_pointers != 0)
{
glcache.DeleteTextures(1, &pixels_pointers);
pixels_pointers = 0;
}
initABuffer();
}
}
void DrawQuad()
{
glBindVertexArray(g_quadVertexArray);
float xmin = (ShaderUniforms.scale_coefs[2] - 1) / ShaderUniforms.scale_coefs[0];
float xmax = (ShaderUniforms.scale_coefs[2] + 1) / ShaderUniforms.scale_coefs[0];
float ymin = (ShaderUniforms.scale_coefs[3] - 1) / ShaderUniforms.scale_coefs[1];
float ymax = (ShaderUniforms.scale_coefs[3] + 1) / ShaderUniforms.scale_coefs[1];
if (ymin > ymax)
{
float t = ymin;
ymin = ymax;
ymax = t;
}
struct Vertex vertices[] = {
{ xmin, ymax, 1, { 255, 255, 255, 255 }, { 0, 0, 0, 0 }, 0, 1 },
{ xmin, ymin, 1, { 255, 255, 255, 255 }, { 0, 0, 0, 0 }, 0, 0 },
{ xmax, ymax, 1, { 255, 255, 255, 255 }, { 0, 0, 0, 0 }, 1, 1 },
{ xmax, ymin, 1, { 255, 255, 255, 255 }, { 0, 0, 0, 0 }, 1, 0 },
};
GLushort indices[] = { 0, 1, 2, 1, 3 };
glBindBuffer(GL_ARRAY_BUFFER, g_quadBuffer); glCheck();
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STREAM_DRAW); glCheck();
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0); glCheck();
glEnableVertexAttribArray(VERTEX_POS_ARRAY); glCheck();
glVertexAttribPointer(VERTEX_POS_ARRAY, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*)offsetof(Vertex,x)); glCheck();
glEnableVertexAttribArray(VERTEX_COL_BASE_ARRAY); glCheck();
glVertexAttribPointer(VERTEX_COL_BASE_ARRAY, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(Vertex), (void*)offsetof(Vertex,col)); glCheck();
glEnableVertexAttribArray(VERTEX_COL_OFFS_ARRAY); glCheck();
glVertexAttribPointer(VERTEX_COL_OFFS_ARRAY, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(Vertex), (void*)offsetof(Vertex,spc)); glCheck();
glEnableVertexAttribArray(VERTEX_UV_ARRAY); glCheck();
glVertexAttribPointer(VERTEX_UV_ARRAY, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*)offsetof(Vertex,u)); glCheck();
glDisableVertexAttribArray(VERTEX_UV1_ARRAY);
glDisableVertexAttribArray(VERTEX_COL_OFFS1_ARRAY);
glDisableVertexAttribArray(VERTEX_COL_BASE1_ARRAY);
glDrawElements(GL_TRIANGLE_STRIP, 5, GL_UNSIGNED_SHORT, indices); glCheck();
}
void DrawTranslucentModVols(int first, int count)
{
if (count == 0 || pvrrc.modtrig.used() == 0)
return;
SetupModvolVBO();
glActiveTexture(GL_TEXTURE2);
glBindTexture(GL_TEXTURE_2D, 0);
glActiveTexture(GL_TEXTURE3);
glBindTexture(GL_TEXTURE_2D, 0);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, 0);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, 0);
glcache.Disable(GL_DEPTH_TEST);
glcache.Disable(GL_STENCIL_TEST);
glCheck();
ModifierVolumeParam* params = &pvrrc.global_param_mvo_tr.head()[first];
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT | GL_BUFFER_UPDATE_BARRIER_BIT);
int mod_base = -1;
for (u32 cmv = 0; cmv < count; cmv++)
{
ModifierVolumeParam& param = params[cmv];
if (param.count == 0)
continue;
u32 mv_mode = param.isp.DepthMode;
verify(param.first >= 0 && param.first + param.count <= pvrrc.modtrig.used());
if (mod_base == -1)
mod_base = param.first;
PipelineShader *shader;
if (!param.isp.VolumeLast && mv_mode > 0)
shader = &g_abuffer_tr_modvol_shaders[Or]; // OR'ing (open volume or quad)
else
shader = &g_abuffer_tr_modvol_shaders[Xor]; // XOR'ing (closed volume)
glcache.UseProgram(shader->program);
ShaderUniforms.Set(shader);
SetCull(param.isp.CullMode); glCheck();
glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);
glDrawArrays(GL_TRIANGLES, param.first * 3, param.count * 3); glCheck();
if (mv_mode == 1 || mv_mode == 2)
{
//Sum the area
shader = &g_abuffer_tr_modvol_shaders[mv_mode == 1 ? Inclusion : Exclusion];
glcache.UseProgram(shader->program);
ShaderUniforms.Set(shader);
glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);
glDrawArrays(GL_TRIANGLES, mod_base * 3, (param.first + param.count - mod_base) * 3); glCheck();
mod_base = -1;
}
}
}
void checkOverflowAndReset()
{
// Using atomic counter
GLuint max_pixel_index = 0;
// glGetBufferSubData(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), &max_pixel_index);
//// printf("ABUFFER %d pixels used\n", max_pixel_index);
// if ((max_pixel_index + 1) * 32 - 1 >= pixel_buffer_size)
// {
// GLint64 size;
// glGetInteger64v(GL_MAX_SHADER_STORAGE_BLOCK_SIZE, &size);
// if (pixel_buffer_size == size)
// printf("A-buffer overflow: %d pixels. Buffer size already maxed out\n", max_pixel_index);
// else
// {
// pixel_buffer_size = (GLuint)min(2 * (GLint64)pixel_buffer_size, size);
//
// printf("A-buffer overflow: %d pixels. Resizing buffer to %d MB\n", max_pixel_index, pixel_buffer_size / 1024 / 1024);
//
// glBindBuffer(GL_SHADER_STORAGE_BUFFER, pixels_buffer);
// glBufferData(GL_SHADER_STORAGE_BUFFER, pixel_buffer_size, NULL, GL_DYNAMIC_COPY);
// glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, pixels_buffer);
// glCheck();
// }
// }
// Reset counter
max_pixel_index = 0;
glBufferSubData(GL_ATOMIC_COUNTER_BUFFER, 0 , sizeof(GLuint), &max_pixel_index);
}
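The commented-out block in checkOverflowAndReset describes a growth policy: detect overflow from the atomic counter, then double the buffer up to the driver limit. A minimal CPU-side sketch of that policy, assuming a 32-byte `Pixel` under std430 layout (a vec4 plus three 4-byte members, padded to 16-byte alignment); both function names are hypothetical:

```cpp
#include <algorithm>
#include <cstdint>

// True when the highest pixel index handed out no longer fits in the buffer.
// pixelBytes defaults to 32, the assumed std430 size of the Pixel struct.
inline bool aBufferOverflowed(std::uint64_t maxPixelIndex,
                              std::uint64_t bufferSize,
                              std::uint64_t pixelBytes = 32)
{
    return (maxPixelIndex + 1) * pixelBytes - 1 >= bufferSize;
}

// Next buffer size: double the current one, capped at the driver maximum
// (GL_MAX_SHADER_STORAGE_BLOCK_SIZE in the commented-out code).
// 64-bit arithmetic keeps the doubling from wrapping a 32-bit size.
inline std::uint64_t nextABufferSize(std::uint64_t currentSize, std::uint64_t maxSize)
{
    return std::min<std::uint64_t>(2 * currentSize, maxSize);
}
```

When `nextABufferSize` returns the current size, the buffer is already at the cap, which corresponds to the "already maxed out" message in the commented-out code.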
void renderABuffer(bool sortFragments)
{
// Render to output FBO
glcache.UseProgram(sortFragments ? g_abuffer_final_shader.program : g_abuffer_final_nosort_shader.program);
ShaderUniforms.Set(sortFragments ? &g_abuffer_final_shader : &g_abuffer_final_nosort_shader);
glcache.Disable(GL_DEPTH_TEST);
glcache.Disable(GL_CULL_FACE);
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT | GL_BUFFER_UPDATE_BARRIER_BIT);
DrawQuad();
glCheck();
// Clear A-buffer pointers
glcache.UseProgram(g_abuffer_clear_shader.program);
ShaderUniforms.Set(&g_abuffer_clear_shader);
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
DrawQuad();
glActiveTexture(GL_TEXTURE0);
glCheck();
}
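The per-pixel sort in fillAndSortFragmentArray (DEPTH_SORTED == 1) walks a linked list of fragments and insertion-sorts their indices far-to-near, breaking depth ties by polygon number. Below is a CPU-side sketch of the same ordering, not the shader itself. The trimmed `Pixel` struct, `polyNumber` and `sortFragments` are illustrative names, and the loop checks `pos > 0` before reading the previous entry, so it never indexes element -1 of the list.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Trimmed-down stand-in for the shader's Pixel struct (color omitted).
struct Pixel {
    float depth;
    std::uint32_t seqNum;  // low 30 bits hold the polygon number
    std::uint32_t next;    // index of the next fragment, or EOL
};

constexpr std::uint32_t EOL = 0xFFFFFFFFu;

inline std::uint32_t polyNumber(const Pixel &p) { return p.seqNum & 0x3FFFFFFFu; }

// Walks the list starting at head and returns fragment indices ordered by
// descending depth, with equal depths ordered by ascending polygon number.
// At most maxFragments entries are collected.
std::vector<std::uint32_t> sortFragments(const std::vector<Pixel> &pixels,
                                         std::uint32_t head,
                                         std::size_t maxFragments)
{
    std::vector<std::uint32_t> list;
    for (std::uint32_t idx = head;
         idx != EOL && list.size() < maxFragments;
         idx = pixels[idx].next)
    {
        const Pixel &p = pixels[idx];
        std::size_t pos = list.size();
        list.push_back(idx);
        // Shift earlier entries right while they belong after p.
        while (pos > 0)
        {
            const Pixel &jp = pixels[list[pos - 1]];
            const bool after = jp.depth < p.depth
                || (jp.depth == p.depth && polyNumber(jp) > polyNumber(p));
            if (!after)
                break;
            list[pos] = list[pos - 1];
            --pos;
        }
        list[pos] = idx;
    }
    return list;
}
```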


@ -9,7 +9,7 @@ public:
GLCache() { Reset(); } GLCache() { Reset(); }
void BindTexture(GLenum target, GLuint texture) { void BindTexture(GLenum target, GLuint texture) {
if (target == GL_TEXTURE_2D && texture != _texture) { if ((target == GL_TEXTURE_2D && texture != _texture && !_disable_cache) || _disable_cache) {
glBindTexture(target, texture); glBindTexture(target, texture);
_texture = texture; _texture = texture;
} }
@ -18,7 +18,7 @@ public:
} }
void BlendFunc(GLenum sfactor, GLenum dfactor) { void BlendFunc(GLenum sfactor, GLenum dfactor) {
if (sfactor != _src_blend_factor || dfactor != _dst_blend_factor) { if (sfactor != _src_blend_factor || dfactor != _dst_blend_factor || _disable_cache) {
_src_blend_factor = sfactor; _src_blend_factor = sfactor;
_dst_blend_factor = dfactor; _dst_blend_factor = dfactor;
glBlendFunc(sfactor, dfactor); glBlendFunc(sfactor, dfactor);
@ -26,7 +26,7 @@ public:
} }
void ClearColor(GLclampf red, GLclampf green, GLclampf blue, GLclampf alpha) { void ClearColor(GLclampf red, GLclampf green, GLclampf blue, GLclampf alpha) {
if (red != _clear_r || green != _clear_g || blue != _clear_b || alpha != _clear_a) { if (red != _clear_r || green != _clear_g || blue != _clear_b || alpha != _clear_a || _disable_cache) {
_clear_r = red; _clear_r = red;
_clear_g = green; _clear_g = green;
_clear_b = blue; _clear_b = blue;
@ -36,7 +36,7 @@ public:
} }
void CullFace(GLenum mode) { void CullFace(GLenum mode) {
if (mode != _cull_face) { if (mode != _cull_face || _disable_cache) {
_cull_face = mode; _cull_face = mode;
glCullFace(mode); glCullFace(mode);
} }
@ -52,14 +52,14 @@ public:
} }
void DepthFunc(GLenum func) { void DepthFunc(GLenum func) {
if (func != _depth_func) { if (func != _depth_func || _disable_cache) {
_depth_func = func; _depth_func = func;
glDepthFunc(func); glDepthFunc(func);
} }
} }
void DepthMask(GLboolean flag) { void DepthMask(GLboolean flag) {
if (flag != _depth_mask) { if (flag != _depth_mask || _disable_cache) {
_depth_mask = flag; _depth_mask = flag;
glDepthMask(flag); glDepthMask(flag);
} }
@ -74,14 +74,14 @@ public:
} }
void UseProgram(GLuint program) { void UseProgram(GLuint program) {
if (program != _program) { if (program != _program || _disable_cache) {
_program = program; _program = program;
glUseProgram(program); glUseProgram(program);
} }
} }
void StencilFunc(GLenum func, GLint ref, GLuint mask) { void StencilFunc(GLenum func, GLint ref, GLuint mask) {
if (_stencil_func != func || _stencil_ref != ref || _stencil_fmask != mask) { if (_stencil_func != func || _stencil_ref != ref || _stencil_fmask != mask || _disable_cache) {
_stencil_func = func; _stencil_func = func;
_stencil_ref = ref; _stencil_ref = ref;
_stencil_fmask = mask; _stencil_fmask = mask;
@ -90,7 +90,7 @@ public:
} }
void StencilOp(GLenum sfail, GLenum dpfail, GLenum dppass) { void StencilOp(GLenum sfail, GLenum dpfail, GLenum dppass) {
if (_stencil_sfail != sfail ||_stencil_dpfail != dpfail || _stencil_dppass != dppass) { if (_stencil_sfail != sfail ||_stencil_dpfail != dpfail || _stencil_dppass != dppass || _disable_cache) {
_stencil_sfail = sfail; _stencil_sfail = sfail;
_stencil_dpfail = dpfail; _stencil_dpfail = dpfail;
_stencil_dppass = dppass; _stencil_dppass = dppass;
@ -99,14 +99,14 @@ public:
} }
void StencilMask(GLuint mask) { void StencilMask(GLuint mask) {
if (_stencil_mask != mask) { if (_stencil_mask != mask || _disable_cache) {
_stencil_mask = mask; _stencil_mask = mask;
glStencilMask(mask); glStencilMask(mask);
} }
} }
void TexParameteri(GLenum target, GLenum pname, GLint param) { void TexParameteri(GLenum target, GLenum pname, GLint param) {
if (target == GL_TEXTURE_2D) if (target == GL_TEXTURE_2D && !_disable_cache)
{ {
TextureParameters &cur_params = _texture_params[_texture]; TextureParameters &cur_params = _texture_params[_texture];
switch (pname) { switch (pname) {
@ -201,7 +201,7 @@ private:
break; break;
} }
if (pCap != NULL) { if (pCap != NULL) {
if (*pCap == value) if (*pCap == value && !_disable_cache)
return; return;
*pCap = value; *pCap = value;
} }
@ -237,6 +237,7 @@ private:
GLuint _texture_ids[TEXTURE_ID_CACHE_SIZE]; GLuint _texture_ids[TEXTURE_ID_CACHE_SIZE];
GLuint _texture_cache_size; GLuint _texture_cache_size;
std::map<GLuint, TextureParameters> _texture_params; std::map<GLuint, TextureParameters> _texture_params;
bool _disable_cache = true;
}; };
extern GLCache glcache; extern GLCache glcache;
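The pattern this hunk applies to each GLCache setter is "skip the driver call when the value is unchanged, unless caching is disabled". A self-contained sketch of that pattern with a counter standing in for the GL call; `CachedState` is a hypothetical class, not part of glcache.h:

```cpp
// Hypothetical single-value state cache with a bypass flag, modelled on
// the "value != _cached || _disable_cache" checks added in this diff.
class CachedState {
public:
    explicit CachedState(bool disableCache) : disableCache_(disableCache) {}

    // Returns true when the (stand-in) driver call was issued.
    bool set(unsigned value)
    {
        if (value != value_ || disableCache_) {
            value_ = value;
            ++driverCalls_;  // stands in for a call such as glCullFace(value)
            return true;
        }
        return false;
    }

    int driverCalls() const { return driverCalls_; }

private:
    unsigned value_ = 0;
    int driverCalls_ = 0;
    bool disableCache_;
};
```

With the flag set, every call reaches the driver, which is how the patch behaves while `_disable_cache` is initialized to true.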

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -1,39 +1,12 @@
#pragma once #pragma once
#include "rend/rend.h" #include "rend/rend.h"
#include <map>
#ifdef GLES
#if defined(TARGET_IPHONE) //apple-specific ogles2 headers
//#include <APPLE/egl.h>
#include <OpenGLES/ES2/gl.h>
#include <OpenGLES/ES2/glext.h>
#else
#if !defined(TARGET_NACL32)
#include <EGL/egl.h>
#endif
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#endif
#ifndef GL_NV_draw_path
//IMGTEC GLES emulation
#pragma comment(lib,"libEGL.lib")
#pragma comment(lib,"libGLESv2.lib")
#else /* NV gles emulation*/
#pragma comment(lib,"libGLES20.lib")
#endif
#else
#if HOST_OS == OS_DARWIN #if HOST_OS == OS_DARWIN
#include <OpenGL/gl3.h> #include <OpenGL/gl3.h>
#else #else
#include <GL3/gl3w.h> #include <GL3/gl3w.h>
#endif #endif
#endif
#ifndef GL_UNSIGNED_INT_8_8_8_8
#define GL_UNSIGNED_INT_8_8_8_8 0x8035
#endif
#define glCheck() do { if (unlikely(settings.validate.OpenGlChecks)) { verify(glGetError()==GL_NO_ERROR); } } while(0) #define glCheck() do { if (unlikely(settings.validate.OpenGlChecks)) { verify(glGetError()==GL_NO_ERROR); } } while(0)
#define eglCheck() false #define eglCheck() false
@ -42,6 +15,9 @@
#define VERTEX_COL_BASE_ARRAY 1 #define VERTEX_COL_BASE_ARRAY 1
#define VERTEX_COL_OFFS_ARRAY 2 #define VERTEX_COL_OFFS_ARRAY 2
#define VERTEX_UV_ARRAY 3 #define VERTEX_UV_ARRAY 3
#define VERTEX_COL_BASE1_ARRAY 4
#define VERTEX_COL_OFFS1_ARRAY 5
#define VERTEX_UV1_ARRAY 6
#ifndef GL_UNSIGNED_INT_8_8_8_8 #ifndef GL_UNSIGNED_INT_8_8_8_8
#define GL_UNSIGNED_INT_8_8_8_8 0x8035 #define GL_UNSIGNED_INT_8_8_8_8 0x8035
@ -52,71 +28,71 @@ extern u32 gcflip;
extern float scale_x, scale_y; extern float scale_x, scale_y;
void DrawStrips(); void DrawStrips(GLuint output_fbo);
struct PipelineShader struct PipelineShader
{ {
GLuint program; GLuint program;
GLuint scale,depth_scale; GLuint scale;
GLuint extra_depth_scale; GLuint extra_depth_scale;
GLuint pp_ClipTest,cp_AlphaTestValue; GLuint pp_ClipTest,cp_AlphaTestValue;
GLuint sp_FOG_COL_RAM,sp_FOG_COL_VERT,sp_FOG_DENSITY; GLuint sp_FOG_COL_RAM,sp_FOG_COL_VERT,sp_FOG_DENSITY;
GLuint shade_scale_factor;
GLuint pp_Number;
GLuint blend_mode;
GLuint use_alpha;
GLuint ignore_tex_alpha;
GLuint shading_instr;
GLuint fog_control;
GLuint trilinear_alpha; GLuint trilinear_alpha;
GLuint fog_clamp_min, fog_clamp_max; GLuint fog_clamp_min, fog_clamp_max;
// //
u32 cp_AlphaTest; s32 pp_ClipTestMode; u32 cp_AlphaTest; s32 pp_ClipTestMode;
u32 pp_Texture, pp_UseAlpha, pp_IgnoreTexA, pp_ShadInstr, pp_Offset, pp_FogCtrl; u32 pp_Texture, pp_UseAlpha, pp_IgnoreTexA, pp_ShadInstr, pp_Offset, pp_FogCtrl;
bool pp_Gouraud, pp_BumpMap; u32 pp_DepthFunc;
int pass;
bool pp_TwoVolumes;
bool pp_Gouraud;
bool pp_BumpMap;
bool fog_clamping; bool fog_clamping;
}; };
struct gl_ctx struct gl_ctx
{ {
#if defined(GLES) && HOST_OS != OS_DARWIN && !defined(TARGET_NACL32)
struct
{
EGLNativeWindowType native_wind;
EGLNativeDisplayType native_disp;
EGLDisplay display;
EGLSurface surface;
EGLContext context;
} setup;
#endif
struct struct
{ {
GLuint program; GLuint program;
GLuint scale,depth_scale; GLuint scale;
GLuint extra_depth_scale; GLuint extra_depth_scale;
GLuint sp_ShaderColor;
} modvol_shader; } modvol_shader;
PipelineShader pogram_table[12288]; std::map<int, PipelineShader *> shaders;
struct struct
{ {
GLuint program,scale,depth_scale; GLuint program,scale;
GLuint extra_depth_scale; GLuint extra_depth_scale;
} OSD_SHADER; } OSD_SHADER;
struct struct
{ {
GLuint geometry,modvols,idxs,idxs2; GLuint geometry,modvols,idxs,idxs2;
#ifndef GLES
GLuint vao; GLuint vao;
#endif GLuint tr_poly_params;
} vbo; } vbo;
const char *gl_version; PipelineShader *getShader(int programId) {
const char *glsl_version_header; PipelineShader *shader = shaders[programId];
int gl_major; if (shader == NULL) {
bool is_gles; shader = new PipelineShader();
GLuint fog_image_format; shaders[programId] = shader;
//GLuint matrix; shader->program = -1;
}
return shader;
}
}; };
extern gl_ctx gl; extern gl_ctx gl;
@ -135,34 +111,197 @@ void CollectCleanup();
void DoCleanup(); void DoCleanup();
void SortPParams(int first, int count); void SortPParams(int first, int count);
void BindRTT(u32 addy, u32 fbw, u32 fbh, u32 channels, u32 fmt); extern int screen_width;
extern int screen_height;
GLuint BindRTT(u32 addy, u32 fbw, u32 fbh, u32 channels, u32 fmt);
void ReadRTTBuffer(); void ReadRTTBuffer();
void RenderFramebuffer(); void RenderFramebuffer();
void DrawFramebuffer(float w, float h); void DrawFramebuffer(float w, float h);
int GetProgramID(u32 cp_AlphaTest, u32 pp_ClipTestMode, int GetProgramID(u32 cp_AlphaTest, u32 pp_ClipTestMode,
u32 pp_Texture, u32 pp_UseAlpha, u32 pp_IgnoreTexA, u32 pp_ShadInstr, u32 pp_Offset, u32 pp_Texture, u32 pp_UseAlpha, u32 pp_IgnoreTexA, u32 pp_ShadInstr, u32 pp_Offset,
u32 pp_FogCtrl, bool pp_Gouraud, bool pp_BumpMap, bool fog_clamping); u32 pp_FogCtrl, bool two_volumes, u32 pp_DepthFunc, bool pp_Gouraud, bool pp_BumpMap, bool fog_clamping, int pass);
void SetCull(u32 CullMode);
bool CompilePipelineShader(PipelineShader* s); extern const char *PixelPipelineShader;
bool CompilePipelineShader(PipelineShader* s, const char *source = PixelPipelineShader);
#define TEXTURE_LOAD_ERROR 0 #define TEXTURE_LOAD_ERROR 0
GLuint loadPNG(const string& subpath, int &width, int &height); GLuint loadPNG(const string& subpath, int &width, int &height);
extern GLuint stencilTexId;
extern GLuint depthTexId;
extern GLuint opaqueTexId;
extern GLuint depthSaveTexId;
#define SHADER_HEADER "#version 430 \n\
\n\
layout(r32ui, binding = 4) uniform coherent restrict uimage2D abufferPointerImg; \n\
struct Pixel { \n\
highp vec4 color; \n\
highp float depth; \n\
uint seq_num; \n\
uint next; \n\
}; \n\
#define EOL 0xFFFFFFFFu \n\
layout (binding = 0, std430) coherent restrict buffer PixelBuffer { \n\
Pixel pixels[]; \n\
}; \n\
layout(binding = 0, offset = 0) uniform atomic_uint buffer_index; \n\
\n\
#define ZERO 0 \n\
#define ONE 1 \n\
#define OTHER_COLOR 2 \n\
#define INVERSE_OTHER_COLOR 3 \n\
#define SRC_ALPHA 4 \n\
#define INVERSE_SRC_ALPHA 5 \n\
#define DST_ALPHA 6 \n\
#define INVERSE_DST_ALPHA 7 \n\
\n\
uint getNextPixelIndex() \n\
{ \n\
uint index = atomicCounterIncrement(buffer_index); \n\
if (index >= pixels.length()) \n\
// Buffer overflow \n\
discard; \n\
\n\
return index; \n\
} \n\
\n\
void setFragDepth(void) \n\
{ \n\
highp float w = 100000.0 * gl_FragCoord.w; \n\
gl_FragDepth = 1.0 - log2(1.0 + w) / 34.0; \n\
} \n\
struct PolyParam { \n\
int first; \n\
int count; \n\
int texid; \n\
int tsp; \n\
int tcw; \n\
int pcw; \n\
int isp; \n\
float zvZ; \n\
int tileclip; \n\
int tsp1; \n\
int tcw1; \n\
int texid1; \n\
}; \n\
layout (binding = 1, std430) readonly buffer TrPolyParamBuffer { \n\
PolyParam tr_poly_params[]; \n\
}; \n\
\n\
#define GET_TSP_FOR_AREA int tsp; if (area1) tsp = pp.tsp1; else tsp = pp.tsp; \n\
\n\
int getSrcBlendFunc(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return (tsp >> 29) & 7; \n\
} \n\
\n\
int getDstBlendFunc(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return (tsp >> 26) & 7; \n\
} \n\
\n\
bool getSrcSelect(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return ((tsp >> 25) & 1) != 0; \n\
} \n\
\n\
bool getDstSelect(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return ((tsp >> 24) & 1) != 0; \n\
} \n\
\n\
int getFogControl(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return (tsp >> 22) & 3; \n\
} \n\
\n\
bool getUseAlpha(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return ((tsp >> 20) & 1) != 0; \n\
} \n\
\n\
bool getIgnoreTexAlpha(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return ((tsp >> 19) & 1) != 0; \n\
} \n\
\n\
int getShadingInstruction(const PolyParam pp, bool area1) \n\
{ \n\
GET_TSP_FOR_AREA \n\
return (tsp >> 6) & 3; \n\
} \n\
\n\
int getDepthFunc(const PolyParam pp) \n\
{ \n\
return (pp.isp >> 29) & 7; \n\
} \n\
\n\
bool getDepthMask(const PolyParam pp) \n\
{ \n\
return ((pp.isp >> 26) & 1) != 1; \n\
} \n\
\n\
bool getShadowEnable(const PolyParam pp) \n\
{ \n\
return ((pp.pcw >> 7) & 1) != 0; \n\
} \n\
\n\
uint getPolyNumber(const Pixel pixel) \n\
{ \n\
return pixel.seq_num & 0x3FFFFFFFu; \n\
} \n\
\n\
#define SHADOW_STENCIL 0x40000000u \n\
#define SHADOW_ACC 0x80000000u \n\
\n\
bool isShadowed(const Pixel pixel) \n\
{ \n\
return (pixel.seq_num & SHADOW_ACC) == SHADOW_ACC; \n\
} \n\
\n\
bool isTwoVolumes(const PolyParam pp) \n\
{ \n\
return pp.tsp1 != -1 || pp.tcw1 != -1; \n\
} \n\
\n\
"
void SetupModvolVBO();
enum ModifierVolumeMode { Xor, Or, Inclusion, Exclusion, ModeCount }; enum ModifierVolumeMode { Xor, Or, Inclusion, Exclusion, ModeCount };
extern struct ShaderUniforms_t extern struct ShaderUniforms_t
{ {
float PT_ALPHA; float PT_ALPHA;
float scale_coefs[4]; float scale_coefs[4];
float depth_coefs[4];
float extra_depth_scale; float extra_depth_scale;
float fog_den_float; float fog_den_float;
float ps_FOG_COL_RAM[3]; float ps_FOG_COL_RAM[3];
float ps_FOG_COL_VERT[3]; float ps_FOG_COL_VERT[3];
int poly_number;
float trilinear_alpha; float trilinear_alpha;
TSP tsp0;
TSP tsp1;
TCW tcw0;
TCW tcw1;
float fog_clamp_min[4]; float fog_clamp_min[4];
float fog_clamp_max[4]; float fog_clamp_max[4];
void setUniformArray(GLuint location, int v0, int v1)
{
int array[] = { v0, v1 };
glUniform1iv(location, 2, array);
}
void Set(PipelineShader* s) void Set(PipelineShader* s)
{ {
if (s->cp_AlphaTestValue!=-1) if (s->cp_AlphaTestValue!=-1)
@ -171,9 +310,6 @@ extern struct ShaderUniforms_t
if (s->scale!=-1) if (s->scale!=-1)
glUniform4fv( s->scale, 1, scale_coefs); glUniform4fv( s->scale, 1, scale_coefs);
if (s->depth_scale!=-1)
glUniform4fv( s->depth_scale, 1, depth_coefs);
if (s->extra_depth_scale != -1) if (s->extra_depth_scale != -1)
glUniform1f(s->extra_depth_scale, extra_depth_scale); glUniform1f(s->extra_depth_scale, extra_depth_scale);
@ -186,6 +322,29 @@ extern struct ShaderUniforms_t
if (s->sp_FOG_COL_VERT!=-1) if (s->sp_FOG_COL_VERT!=-1)
glUniform3fv( s->sp_FOG_COL_VERT, 1, ps_FOG_COL_VERT); glUniform3fv( s->sp_FOG_COL_VERT, 1, ps_FOG_COL_VERT);
if (s->shade_scale_factor != -1)
glUniform1f(s->shade_scale_factor, FPU_SHAD_SCALE.scale_factor / 256.f);
if (s->blend_mode != -1) {
u32 blend_mode[] = { tsp0.SrcInstr, tsp0.DstInstr, tsp1.SrcInstr, tsp1.DstInstr };
glUniform2iv(s->blend_mode, 2, (GLint *)blend_mode);
}
if (s->use_alpha != -1)
setUniformArray(s->use_alpha, tsp0.UseAlpha, tsp1.UseAlpha);
if (s->ignore_tex_alpha != -1)
setUniformArray(s->ignore_tex_alpha, tsp0.IgnoreTexA, tsp1.IgnoreTexA);
if (s->shading_instr != -1)
setUniformArray(s->shading_instr, tsp0.ShadInstr, tsp1.ShadInstr);
if (s->fog_control != -1)
setUniformArray(s->fog_control, tsp0.FogCtrl, tsp1.FogCtrl);
if (s->pp_Number != -1)
glUniform1i(s->pp_Number, poly_number);
if (s->trilinear_alpha != -1) if (s->trilinear_alpha != -1)
glUniform1f(s->trilinear_alpha, trilinear_alpha); glUniform1f(s->trilinear_alpha, trilinear_alpha);
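The GLSL helpers in SHADER_HEADER (getSrcBlendFunc, getDstBlendFunc, getSrcSelect, getDstSelect) read fixed bit positions out of the packed TSP word. A C++ sketch of the same decoding, using the shifts and masks from the header; `BlendFields` and `decodeBlend` are hypothetical names, and an unsigned word is assumed so the shifts never sign-extend:

```cpp
#include <cstdint>

// Blend-related fields carried in the top byte of a TSP word.
struct BlendFields {
    unsigned srcBlend;   // bits 31..29
    unsigned dstBlend;   // bits 28..26
    bool     srcSelect;  // bit 25
    bool     dstSelect;  // bit 24
};

// Same shifts and masks as the GLSL accessors in SHADER_HEADER.
constexpr BlendFields decodeBlend(std::uint32_t tsp)
{
    return BlendFields{
        (tsp >> 29) & 7u,
        (tsp >> 26) & 7u,
        ((tsp >> 25) & 1u) != 0,
        ((tsp >> 24) & 1u) != 0
    };
}
```

The 3-bit blend values map onto the ZERO through INVERSE_DST_ALPHA constants defined in the header, so a source field of 4 selects SRC_ALPHA and a destination field of 5 selects INVERSE_SRC_ALPHA.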


@ -206,7 +206,7 @@ static void dumpTexture(int texID, int w, int h, GLuint textype, void *temp_tex_
fclose(fp); fclose(fp);
for (int y = 0; y < h; y++) for (int y = 0; y < h; y++)
free(rows[y]); free(rows[y]);
free(rows); free(rows);
} }
@@ -576,29 +576,26 @@ TextureCacheData *getTextureCacheData(TSP tsp, TCW tcw);
 struct FBT
 {
 u32 TexAddr;
-GLuint depthb,stencilb;
 GLuint tex;
 GLuint fbo;
 };
 FBT fb_rtt;
-void BindRTT(u32 addy, u32 fbw, u32 fbh, u32 channels, u32 fmt)
+GLuint BindRTT(u32 addy, u32 fbw, u32 fbh, u32 channels, u32 fmt)
 {
 FBT& rv=fb_rtt;
 if (rv.fbo) glDeleteFramebuffers(1,&rv.fbo);
 if (rv.tex) glcache.DeleteTextures(1,&rv.tex);
-if (rv.depthb) glDeleteRenderbuffers(1,&rv.depthb);
-if (rv.stencilb) glDeleteRenderbuffers(1,&rv.stencilb);
 rv.TexAddr=addy>>3;
-// Find the largest square power of two texture that fits into the viewport
-int fbh2 = 2;
+// Find the smallest power of two texture that fits the viewport
+int fbh2 = 8;
 while (fbh2 < fbh)
 fbh2 *= 2;
-int fbw2 = 2;
+int fbw2 = 8;
 while (fbw2 < fbw)
 fbw2 *= 2;
@@ -612,26 +609,6 @@ void BindRTT(u32 addy, u32 fbw, u32 fbh, u32 channels, u32 fmt)
 // Get the currently bound frame buffer object. On most platforms this just gives 0.
 //glGetIntegerv(GL_FRAMEBUFFER_BINDING, &m_i32OriginalFbo);
-// Generate and bind a render buffer which will become a depth buffer shared between our two FBOs
-glGenRenderbuffers(1, &rv.depthb);
-glBindRenderbuffer(GL_RENDERBUFFER, rv.depthb);
-/*
-Currently it is unknown to GL that we want our new render buffer to be a depth buffer.
-glRenderbufferStorage will fix this and in this case will allocate a depth buffer
-m_i32TexSize by m_i32TexSize.
-*/
-#ifdef GLES
-glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24_OES, fbw2, fbh2);
-#else
-glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, fbw2, fbh2);
-#endif
-glGenRenderbuffers(1, &rv.stencilb);
-glBindRenderbuffer(GL_RENDERBUFFER, rv.stencilb);
-glRenderbufferStorage(GL_RENDERBUFFER, GL_STENCIL_INDEX8, fbw2, fbh2);
 // Create a texture for rendering to
 rv.tex = glcache.GenTexture();
 glcache.BindTexture(GL_TEXTURE_2D, rv.tex);
@@ -645,15 +622,14 @@ void BindRTT(u32 addy, u32 fbw, u32 fbh, u32 channels, u32 fmt)
 // Attach the texture to the FBO
 glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, rv.tex, 0);
-// Attach the depth buffer we created earlier to our FBO.
-glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, rv.depthb);
 // Check that our FBO creation was successful
 GLuint uStatus = glCheckFramebufferStatus(GL_FRAMEBUFFER);
 verify(uStatus == GL_FRAMEBUFFER_COMPLETE);
 glViewport(0, 0, fbw, fbh); // TODO CLIP_X/Y min?
+return rv.fbo;
 }
void ReadRTTBuffer() { void ReadRTTBuffer() {
@@ -798,10 +774,11 @@ void ReadRTTBuffer() {
 }
 fb_rtt.tex = 0;
-if (fb_rtt.fbo) { glDeleteFramebuffers(1,&fb_rtt.fbo); fb_rtt.fbo = 0; }
-if (fb_rtt.depthb) { glDeleteRenderbuffers(1,&fb_rtt.depthb); fb_rtt.depthb = 0; }
-if (fb_rtt.stencilb) { glDeleteRenderbuffers(1,&fb_rtt.stencilb); fb_rtt.stencilb = 0; }
+if (fb_rtt.fbo)
+{
+glDeleteFramebuffers(1,&fb_rtt.fbo);
+fb_rtt.fbo = 0;
+}
 }
 static int TexCacheLookups;
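
The `BindRTT` change above rounds each render-target dimension up to a power of two, now starting from 8 rather than 2. A minimal standalone sketch of that rounding follows. The helper name `roundUpPow2Min8` is hypothetical and does not appear in the diff, and the upper-bound check is an added assumption that the original loop does not have.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the diff: returns the smallest power of
// two that is >= n, never going below 8 (mirrors `int fbw2 = 8; while ...`).
// The `p < 0x80000000u` check is an addition so the loop cannot wrap to 0.
static uint32_t roundUpPow2Min8(uint32_t n)
{
    uint32_t p = 8;
    while (p < n && p < 0x80000000u)
        p *= 2;
    return p;
}
```

With this sketch a 640x480 viewport maps to a 1024x512 texture, and any dimension of 8 or less maps to 8.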


@@ -20,6 +20,7 @@
 #include <smmintrin.h>
 #include <cmath>
+#include <algorithm>
 #include "rend/gles/gles.h"
@@ -53,6 +54,54 @@ union m128i {
 uint32_t m128i_u32[4];
 };
+bool operator<(const PolyParam &left, const PolyParam &right)
+{
+/* put any condition you want to sort on here */
+return left.zvZ<right.zvZ;
+//return left.zMin<right.zMax;
+}
+//Sort based on min-z of each strip
+void SortPParams(int first, int count)
+{
+if (pvrrc.verts.used() == 0 || count <= 1)
+return;
+Vertex* vtx_base=pvrrc.verts.head();
+u16* idx_base=pvrrc.idx.head();
+PolyParam* pp = &pvrrc.global_param_tr.head()[first];
+PolyParam* pp_end = pp + count;
+while(pp!=pp_end)
+{
+if (pp->count<2)
+{
+pp->zvZ=0;
+}
+else
+{
+u16* idx=idx_base+pp->first;
+Vertex* vtx=vtx_base+idx[0];
+Vertex* vtx_end=vtx_base + idx[pp->count-1]+1;
+u32 zv=0xFFFFFFFF;
+while(vtx!=vtx_end)
+{
+zv=min(zv,(u32&)vtx->z);
+vtx++;
+}
+pp->zvZ=(f32&)zv;
+}
+pp++;
+}
+std::stable_sort(pvrrc.global_param_tr.head() + first, pvrrc.global_param_tr.head() + first + count);
+}
 static __m128 _mm_load_scaled_float(float v, float s)
 {
 return _mm_setr_ps(v, v + s, v + s + s, v + s + s + s);
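
The added `SortPParams` finds each strip's minimum depth by comparing the raw bit patterns of `z` as unsigned integers, then orders the strips with `std::stable_sort`. The sketch below shows that idea in isolation, under several assumptions. `Strip`, `floatBits`, `bitsToFloat` and `sortByMinZ` are hypothetical stand-ins, not names from the diff. Depths are taken to be non-negative, which is the range where IEEE-754 bit order matches numeric order. `std::memcpy` replaces the reference cast, and the loop walks a per-strip depth list instead of the vertex range the original scans.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for PolyParam: a strip with its vertex depths.
struct Strip {
    std::vector<float> z;  // per-vertex depth, assumed non-negative
    float minZ = 0.f;      // plays the role of zvZ in the diff
    int id = 0;            // submission order, only used to observe the result
};

// Reinterpret a float as its 32-bit pattern without aliasing problems.
static uint32_t floatBits(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

static float bitsToFloat(uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Compute each strip's minimum depth via unsigned bit comparison,
// then sort by it while keeping equal-depth strips in their original order.
static void sortByMinZ(std::vector<Strip>& strips)
{
    for (Strip& s : strips) {
        if (s.z.size() < 2) {   // same degenerate-strip rule as the diff
            s.minZ = 0.f;
            continue;
        }
        uint32_t zv = 0xFFFFFFFFu;
        for (float z : s.z)
            zv = std::min(zv, floatBits(z));
        s.minZ = bitsToFloat(zv);
    }
    std::stable_sort(strips.begin(), strips.end(),
                     [](const Strip& a, const Strip& b) { return a.minZ < b.minZ; });
}
```

A stable sort matters here because strips with the same minimum depth keep their submission order instead of being reordered arbitrarily.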


@@ -1,4 +1,4 @@
 <?xml version="1.0" encoding="utf-8"?>
 <Project DefaultTargets="Build" ToolsVersion="14.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
 <ItemGroup Label="ProjectConfigurations">
 <ProjectConfiguration Include="Fast|Win32">
@@ -191,6 +191,7 @@
 <ClCompile Include="..\core\reios\reios.cpp" />
 <ClCompile Include="..\core\reios\reios_elf.cpp" />
 <ClCompile Include="..\core\rend\d3d11\d3d11.cpp" />
+<ClCompile Include="..\core\rend\gles\abuffer.cpp" />
 <ClCompile Include="..\core\rend\gles\gldraw.cpp" />
 <ClCompile Include="..\core\rend\gles\gles.cpp" />
 <ClCompile Include="..\core\rend\gles\gltex.cpp" />