SDL-mirror

Author	SHA1	Message	Date
Sam Lantinga	95dcfa4c28	Happy New Year!	2013-02-15 08:47:44 -08:00
Sam Lantinga	f380ecb137	Removed executable bit from source files	2012-09-27 14:35:28 -07:00
Sam Lantinga	028e5dcdbd	Happy New Year!	2011-12-31 09:28:07 -05:00
Ryan C. Gordon	cd6cd13137	SDL_memcpyMMX(): Fixed handling of overflow bytes. Thanks to Mason Wheeler for the fix!	2011-10-29 01:11:47 -04:00
Ryan C. Gordon	e66bcd9efe	SDL_memcpyMMX(): Make sure srcskip and dstskip are 8-byte aligned. Thanks to Patrick Baggett for the fix!	2011-10-29 01:03:50 -04:00
Ryan C. Gordon	c6f4eaaf06	Removed tabs and DOS endlines from SDL_blit_copy.c ...	2011-10-29 00:57:45 -04:00
Ryan C. Gordon	ad214ecb4b	Some MMX fixes from Patrick Baggett. Original email... Date: Sat, 10 Sep 2011 13:01:20 -0500 From: Patrick Baggett To: SDL Development List <sdl@lists.libsdl.org> Subject: Re: [SDL] SDL_memcpyMMX uses SSE instructions In SDL_blit_copy.c, the function SDL_memcpyMMX() actually use SSE instructions. It is called in this context: #ifdef __MMX__ if (SDL_HasMMX() && !((uintptr_t) src & 7) && !(srcskip & 7) && !((uintptr_t) dst & 7) && !(dstskip & 7)) { while (h--) { SDL_memcpyMMX(dst, src, w); src += srcskip; dst += dstskip; } _mm_empty(); return; } #endif This implies that the minimum CPU features are just MMX. There is a separate SDL_memcpySSE() function. The SDL_memcpyMMX() function does: #ifdef __SSE__ _mm_prefetch(src, _MM_HINT_NTA); #endif ...which tests at compile time if SSE intrinsics are available, not at run time. It generates the PREFETCHNTA instruction. It also uses _mm_stream_pi() intrinsic, which generates the MOVNTQ instruction. If you replace the "MMX" code with: __m64* d64 = (__m64*)dst; __m64* s64 = (__m64*)src; for(i= len / 64; i--;) { d64[0] = s64[0]; d64[1] = s64[1]; d64[2] = s64[2]; d64[3] = s64[3]; d64[4] = s64[4]; d64[5] = s64[5]; d64[6] = s64[6]; d64[7] = s64[7]; d64 += 8; s64 += 8; } Then MSVC generates the correct movq instructions. GCC (4.5.0) seems to think that using 2x movl is still better, but then again, GCC isn't actually that good at optimizing intrinsics as I've found. At least the code won't crash on my P2 though. :) Also, there is no requirement for MMX to be aligned to the 8th byte. I think the author assumed that SSE's 16 byte alignment requirement must retroactively mean that MMX requires 8 byte alignment. Attached is the full patch. Patrick	2011-09-11 01:54:54 -04:00
Sam Lantinga	b0660ba5ff	SDL 1.3 is now under the zlib license.	2011-04-08 13:03:26 -07:00
Sam Lantinga	d2b922f555	Fixed bug #1090 (SDL_BlitCopyOverlap() assumes memcpy() operates in order) Even if we're blitting between two different surfaces their pixels might still overlap, because of SDL_CreateRGBSurfaceFrom(), so always use SDL_BlitCopy() and check for overlap in that function. When handling overlapping surfaces, don't assume that memcpy() iterates forward, instead use memmove() correctly, and provide a fallback implementation of SDL_memmove() that handles the different cases. Fixed a bug with SDL_memset() not completely filling lengths that aren't a multiple of 4. Optimized SDL_memcpy() a bit using the same technique as SDL_memset().	2011-02-16 15:25:10 -08:00
Sam Lantinga	e5803d148c	Happy 2011! :)	2011-02-11 22:37:15 -08:00
Sam Lantinga	4d3df8b3e3	Fixed bug #926 Updated copyright to LGPL version 2.1 and year 2010 --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%404453	2010-01-24 21:10:53 +00:00
Sam Lantinga	83fbb6981f	Fixed bug #736 Don't use the SSE cache instruction in MMX code if SSE isn't available. :) --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%404326	2009-12-16 03:02:31 +00:00
Sam Lantinga	0c30a927ed	Updated copyright date --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%403321	2008-12-08 00:27:32 +00:00
Sam Lantinga	a547129012	Fixed crash on 64-bit systems --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%403287	2008-12-03 06:32:04 +00:00
Sam Lantinga	de5363f399	Disable spurious warning --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%403241	2008-11-24 21:43:02 +00:00
Sam Lantinga	cf548d0a6b	Okay, still some bugs, but everything builds again... --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402629	2007-08-18 05:39:09 +00:00
Sam Lantinga	422ec364ff	Work in progress: merging new texture features into SDL blit system --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402624	2007-08-17 06:21:58 +00:00
Sam Lantinga	884eddbaa6	Oops, didn't want to rename those files... --HG-- rename : src/video/SDL_copy.c => src/video/SDL_blit_copy.c rename : src/video/SDL_copy.h => src/video/SDL_blit_copy.h extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402616	2007-08-16 21:54:26 +00:00
Sam Lantinga	5ac6d00012	Added notes on the next steps for SDL 1.3 Moved fill and copy routines to their own files. --HG-- rename : src/video/SDL_blit_copy.c => src/video/SDL_copy.c rename : src/video/SDL_blit_copy.h => src/video/SDL_copy.h extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402615	2007-08-16 21:43:19 +00:00
Sam Lantinga	beba16ed23	Removed unnecessary header (SDL_blit.h has SDL_cpuinfo.h) --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402614	2007-08-16 06:40:34 +00:00
Sam Lantinga	ae461fc306	Fixed a few compiler warnings. Added SDL_blit_copy.c to the Visual C++ project The SSE and MMX intrinsics don't compile on Visual Studio yet... --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402612	2007-08-16 06:20:51 +00:00
Sam Lantinga	5d8720fd18	Added SSE and MMX optimization for SDL_FillRect() --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402611	2007-08-16 05:56:24 +00:00
Sam Lantinga	d7134d38fd	Okay, I figured out the intrinsics for SIMD memcpy --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402610	2007-08-16 02:14:13 +00:00
Sam Lantinga	37fe3a93e6	Removed hermes since it's LGPL and not compatible with a commercial license. Prepping for using MMX and SSE intrinsics instead of inline assembly. .. except for memcpy equivalents which only get faster if they can exploit the parallelism of loading into multiple SIMD registers. :) --HG-- extra : convert_revision : svn%3Ac70aab31-4412-0410-b14c-859654838e24/trunk%402609	2007-08-15 08:21:10 +00:00

24 commits
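
The fix for bug #1090 (commit d2b922f555) says a fallback memmove must not assume a forward copy when source and destination overlap. The sketch below illustrates that idea only. It is not SDL's SDL_memmove(); the name sketch_memmove is hypothetical, and the byte-by-byte loops are a simplifying assumption rather than the optimized code in the repository.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration of an overlap-safe copy; NOT SDL's actual code. */
static void *sketch_memmove(void *dst, const void *src, size_t len)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    if (d == s || len == 0) {
        return dst;                 /* nothing to do */
    }

    /* Compare addresses as integers to pick a safe copy direction. */
    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination starts before source: copy forward. */
        while (len--) {
            *d++ = *s++;
        }
    } else {
        /* Destination starts after source: copy backward so bytes that
           have not been read yet are never overwritten. */
        d += len;
        s += len;
        while (len--) {
            *--d = *--s;
        }
    }
    return dst;
}
```

Copying backward when the destination lies above the source is what keeps an overlapping blit (for example, two surfaces created over the same pixel buffer) from reading bytes it has already overwritten.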