author    Evan Cheng <evan.cheng@apple.com>    2012-12-10 23:21:26 +0000
committer Evan Cheng <evan.cheng@apple.com>    2012-12-10 23:21:26 +0000
commit    376642ed620ecae05b68c7bc81f79aeb2065abe0 (patch)
tree      9757b2568050b3ab58af15c234df3bc9f66202b0 /test/CodeGen/X86/memcpy-2.ll
parent    2b475922e6169098606006a69d765160caa77848 (diff)
Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned loads / stores to copy / set the
trailing bytes. e.g. on x86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When copying from a constant string, do *not* replace the load with a constant
if it's not possible to materialize the integer immediate with a single
instruction (this required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned loads / stores more aggressively if the target hooks indicate they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
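Point 1 above can be sketched in C. This is an illustrative model, not LLVM's actual expansion code, and the function name `copy_17_to_31` is hypothetical: for a copy of n bytes with 17 <= n <= 31, the expansion emits two unaligned 16-byte block moves, where the second move starts at n - 16 and deliberately overlaps the first, so the trailing bytes need no scalar cleanup loop.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the overlapping trailing-bytes trick (not LLVM's code):
 * copy n bytes, 17 <= n <= 31, with two unaligned 16-byte block moves.
 * On x86 each memcpy(..., 16) pair corresponds to one movups load/store. */
static void copy_17_to_31(uint8_t *dst, const uint8_t *src, size_t n) {
    uint8_t lo[16], hi[16];
    memcpy(lo, src, 16);           /* first 16 bytes (unaligned load)       */
    memcpy(hi, src + n - 16, 16);  /* last 16 bytes, overlapping the first  */
    memcpy(dst, lo, 16);           /* store first block                     */
    memcpy(dst + n - 16, hi, 16);  /* store second block over the tail      */
}
```

Both loads are issued before either store, so the scheme is also safe if the source and destination ranges coincide exactly.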
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'test/CodeGen/X86/memcpy-2.ll')
-rw-r--r-- | test/CodeGen/X86/memcpy-2.ll | 12 |
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/test/CodeGen/X86/memcpy-2.ll b/test/CodeGen/X86/memcpy-2.ll
index 7a2bbc4ef0..dcc8f0d268 100644
--- a/test/CodeGen/X86/memcpy-2.ll
+++ b/test/CodeGen/X86/memcpy-2.ll
@@ -10,18 +10,18 @@
 define void @t1(i32 %argc, i8** %argv) nounwind {
 entry:
 ; SSE2: t1:
+; SSE2: movsd _.str+16, %xmm0
+; SSE2: movsd %xmm0, 16(%esp)
 ; SSE2: movaps _.str, %xmm0
 ; SSE2: movaps %xmm0
-; SSE2: movb $0
-; SSE2: movl $0
-; SSE2: movl $0
+; SSE2: movb $0, 24(%esp)
 
 ; SSE1: t1:
+; SSE1: fldl _.str+16
+; SSE1: fstpl 16(%esp)
 ; SSE1: movaps _.str, %xmm0
 ; SSE1: movaps %xmm0
-; SSE1: movb $0
-; SSE1: movl $0
-; SSE1: movl $0
+; SSE1: movb $0, 24(%esp)
 
 ; NOSSE: t1:
 ; NOSSE: movb $0