The AltiVec blitters don't compile, because the compiler only understands
"vector" and friends (i.e. AltiVec) when __VEC__ is enabled.
But you don't want to turn AltiVec on globally, since the code would then
run only on a G4 (there are already runtime tests before the AltiVec
variants are used).
The solution here is to enable AltiVec locally, for the actual AltiVec code only.
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%401754
SDL_blit_A.mmx-speed.patch.txt --
Speed improvements and a bugfix for the current GCC inline mmx
asm code:
- Changed some ops and removed some resulting useless ones.
- Added some instruction parallelism (some gain)
The resulting speed on my Xeon improved by up to 35%, depending on
the function (measured in fps).
- Fixed a bug where BlitRGBtoRGBSurfaceAlphaMMX() was
setting the alpha component on the destination surfaces (to
opaque-alpha) even when the surface had none.
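The alpha fix above can be illustrated in scalar C. This is only a minimal sketch, not the MMX code from the patch: the name blend_half and its dst_amask parameter are hypothetical, and it assumes a 32-bit format with three 8-bit colour channels in the low 24 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 50% per-surface blend of two 32-bit pixels.
 * dst_amask is the destination format's alpha mask (0 if it has none). */
static uint32_t blend_half(uint32_t s, uint32_t d, uint32_t dst_amask)
{
    uint32_t rgb;

    /* In this sketch only the top byte may be an alpha channel. */
    assert(dst_amask == 0 || dst_amask == 0xff000000u);

    /* Average each 8-bit channel. The low bit of every channel is handled
     * separately so that no carry leaks into the neighbouring channel. */
    rgb = (((s & 0x00fefefeu) + (d & 0x00fefefeu)) >> 1)
        + (s & d & 0x00010101u);

    /* Set opaque alpha only where the destination really has alpha bits,
     * rather than OR-ing in a hard-coded 0xff000000. */
    return rgb | dst_amask;
}
```

With dst_amask set to 0 the top byte of the result stays clear, which is the behaviour the bugfix is after for surfaces without an alpha channel.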
SDL_blit_A.mmx-msvc.patch.txt --
MSVC mmx intrinsics version of the same GCC asm code.
The MSVC compiler tries to parallelize the code and to avoid
register stalls, but does not always do a very good job.
The per-surface blending MSVC functions run quite a bit faster
than their pure-asm counterparts (up to 55% faster for the 16-bit
ones), but the per-pixel blending runs somewhat slower than asm.
- BlitRGBtoRGBSurfaceAlphaMMX and BlitRGBtoRGBPixelAlphaMMX (and all
variants) can now also handle formats other than (A)RGB8888. Formats
like RGBA8888 and some quite exotic ones are allowed -- like
RAGB8888, or actually anything having channels aligned on 8-bit
boundaries and a full 8-bit alpha (for per-pixel alpha blending).
The performance cost of this change is virtually 0 for per-surface
alpha blending (no extra ops inside the loop) and a single non-MMX
op inside the loop for per-pixel blending. In testing, the per-pixel
alpha blending takes a ~2% performance hit, but it still runs much
faster than the current code in CVS. If necessary, a separate function
with this functionality can be made.
This code requires Processor Pack for VC6.
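To illustrate how per-pixel blending can cope with any layout that has byte-aligned channels, here is a scalar sketch. It is not the MMX intrinsics code: channel_shift and blend_pixel are hypothetical names, and leaving the destination's alpha byte unchanged is an assumption of this sketch. The alpha shift is derived once from the format's alpha mask, so in a real loop only the extraction of alpha from the source pixel would depend on it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit position of a byte-aligned 8-bit channel mask,
 * e.g. 0xff000000 -> 24, 0x0000ff00 -> 8. */
static unsigned channel_shift(uint32_t mask)
{
    unsigned shift = 0;

    assert(mask != 0);
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++shift;
    }
    assert(mask == 0xffu && shift % 8 == 0);
    return shift;
}

/* Blend source pixel s over destination pixel d using the source's own
 * alpha byte, wherever amask says that byte lives. */
static uint32_t blend_pixel(uint32_t s, uint32_t d, uint32_t amask)
{
    unsigned ashift = channel_shift(amask);  /* hoistable out of a pixel loop */
    uint32_t alpha = (s >> ashift) & 0xffu;
    uint32_t out = 0;
    unsigned shift;

    for (shift = 0; shift < 32; shift += 8) {
        uint32_t sc = (s >> shift) & 0xffu;
        uint32_t dc = (d >> shift) & 0xffu;
        uint32_t c;

        if (shift == ashift) {
            c = dc;  /* assumption: destination alpha is left untouched */
        } else {
            c = (sc * alpha + dc * (255u - alpha)) / 255u;
        }
        out |= c << shift;
    }
    return out;
}
```

For example, blend_pixel(s, d, 0x000000ff) treats the pixels as RGBA8888, while passing 0x00ff0000 handles the RAGB8888 layout mentioned above.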
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%401546
Make sure every source file includes SDL_config.h, so the proper system
headers are chosen.
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%401406
I batch edited these files, so please let me know if I've accidentally removed anybody's
credit here.
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%401315
There are a couple of issues with the selection of AltiVec alpha-blitting
routines in CalculateAlphaBlit() in src/video/SDL_blit_A.c.
1) There's no check for the presence of AltiVec when checking if the
Blit32to565PixelAlphaAltivec() routine can be selected.
2) AltiVec cannot be used in video memory, and there's no check if the
destination surface is a hardware surface. (Alpha-blitting to a hardware
surface with GPU support is a bad idea, but somebody's bound to do it anyway.)
Patch to fix these attached.
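The two checks can be sketched as a small selection function. This is a hedged illustration rather than the attached patch: the function type, the stand-in blitters and choose_32to565_blit are hypothetical, and the two flags are assumed to come from a runtime CPU test and from the destination surface's hardware-surface flag.

```c
/* Hypothetical blit function type and stand-in routines; the real blitters
 * take a blit-info argument and live in SDL_blit_A.c. */
typedef void (*alpha_blit_fn)(void);

static void blit_32to565_pixel_alpha_c(void) {}
static void blit_32to565_pixel_alpha_altivec(void) {}

/* Pick the AltiVec routine only when (1) the CPU actually has AltiVec and
 * (2) the destination is not a hardware surface in video memory. */
static alpha_blit_fn choose_32to565_blit(int cpu_has_altivec,
                                         int dst_is_hw_surface)
{
    if (cpu_has_altivec && !dst_is_hw_surface) {
        return blit_32to565_pixel_alpha_altivec;
    }
    return blit_32to565_pixel_alpha_c;  /* portable C fallback */
}
```

Either failed check falls through to the C routine, so a missing vector unit or a hardware destination never reaches the AltiVec code path.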
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%401243
From: Tyler Montbriand
Subject: [SDL] Opteron MMX patches for SDL_blit.c and SDL_blit_A.c
The inline MMX assembly in SDL_blit.c and SDL_blit_A.c compiles and runs fine
unmodified on AMD Opteron. The inline assembly in SDL_yuv_mmx.c and
SDL_blit_N.c unfortunately isn't directly compatible.
I've included diffs for SDL_blit.c and SDL_blit_A.c that allow the MMX
assembly to be compiled when USE_ASMBLIT, __x86_64__, and __GNUC__ are all
defined. All I had to modify was the typedefs; the inline assembly itself wasn't
touched.
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%40881
Date: Sun, 07 Sep 2003 02:51:58 +0200
From: Stephane Marchesin
Subject: [SDL] Two little patches
Compiling SDL with a recent gcc (gcc 3.3.1; 3.3 doesn't have this
behaviour) gives some nasty warnings:
SDL_blit_A.c: In function `BlitRGBtoRGBSurfaceAlpha128MMX':
SDL_blit_A.c:223: warning: integer constant is too large for "long" type
SDL_blit_A.c:225: warning: integer constant is too large for "long" type
SDL_blit_A.c:227: warning: integer constant is too large for "long" type
[...]
The first attached patch (longlongfix.patch) tells gcc to really treat
those constants as unsigned long long and not long.
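As a sketch of the kind of change involved (the constant names and the helper below are illustrative, not copied from longlongfix.patch), a 64-bit mask gets an explicit ULL suffix so the compiler does not try to fit it into "long":

```c
#include <stdint.h>

/* 64-bit masks covering two 32-bit pixels at once. The ULL suffix makes
 * each literal unsigned long long regardless of how wide "long" is. */
static const uint64_t HMASK = 0x00fefefe00fefefeULL; /* upper 7 bits per channel */
static const uint64_t LMASK = 0x0001010100010101ULL; /* lowest bit per channel */

/* Hypothetical helper: 50% blend of two pixel pairs packed into 64 bits. */
static uint64_t average_two_pixels(uint64_t s, uint64_t d)
{
    return (((s & HMASK) + (d & HMASK)) >> 1) + (s & d & LMASK);
}
```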
The second patch (nasinclude.patch) fixes an include problem I had while
compiling nas audio: when the <audio/audiolib.h> file lies in
/usr/X11R6/include, a -I/usr/X11R6/include option is needed or the file
isn't found.
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%40721
From: Stephane Marchesin
Subject: Re: [SDL] [patch] MMX alpha blit patches with MMX detection
I think everything is correct now. I've done as much testing as I could,
but some real-world testing wouldn't hurt, I think.
The patch is here: http://icps.u-strasbg.fr/~marchesin/sdl_mmxblit.patch
If you do a byte-by-byte comparison of the output between the C and MMX
functions, you'll notice that the results for 555 and 565 RGB alpha
blits aren't exactly the same. This is because the MMX functions for 555 and
565 RGB have a higher accuracy. If you want the exact same behaviour,
that's possible by masking the three lower alpha bits in the MMX
functions. Just ask!
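The effect of masking the three lower alpha bits can be shown with a scalar sketch. Everything here is an assumption for illustration: mix and blend565 are hypothetical names, and the rounding is not claimed to match either the C or the MMX blitters bit for bit. It only shows that clearing the low alpha bits reduces the blend to 32 distinct alpha levels.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one channel with an 8-bit alpha; alpha 0 returns d unchanged. */
static unsigned mix(unsigned s, unsigned d, unsigned alpha)
{
    return (s * alpha + d * (256u - alpha)) >> 8;
}

/* Hypothetical helper: blend two RGB565 pixels. When mask_low_alpha_bits is
 * non-zero the three lowest alpha bits are cleared first, trading accuracy
 * for results that only depend on the top five alpha bits. */
static uint16_t blend565(uint16_t s, uint16_t d, unsigned alpha,
                         int mask_low_alpha_bits)
{
    unsigned r, g, b;

    assert(alpha <= 255u);
    if (mask_low_alpha_bits) {
        alpha &= ~7u;
    }
    r = mix((unsigned)s >> 11, (unsigned)d >> 11, alpha);
    g = mix(((unsigned)s >> 5) & 0x3fu, ((unsigned)d >> 5) & 0x3fu, alpha);
    b = mix((unsigned)s & 0x1fu, (unsigned)d & 0x1fu, alpha);
    return (uint16_t)((r << 11) | (g << 5) | b);
}
```

With masking enabled, alpha values 0x80 through 0x87 all produce the same pixel, whereas the full 8-bit alpha can tell them apart.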
I removed one MMX function because after I fixed it to match its C
equivalent, it turned out to be slower than the C version on a PIII
(although a bit faster on an Athlon XP).
I've also added MMX and PIII replacements for SDL_memcpy. Those provide
some speedup in testvidinfo -benchmark (at least for me, under Linux and
X11).
--HG--
extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%40690