Some enhancements for memcpy / memset inline expansion.

1. Teach it to use overlapping unaligned loads / stores to copy / set the
   trailing bytes. E.g. on x86, use two pairs of movups / movaps for 17 - 31
   byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is,
   e.g. x86 and ARM.
3. When memcpy'ing from a constant string, do *not* replace the load with a
   constant if it's not possible to materialize the integer immediate with a
   single instruction (this required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned loads / stores more aggressively if target hooks indicate
   they are "fast".
5. Update ARM target hooks to use unaligned loads / stores, e.g. vld1.8 /
   vst1.8. Also increase the threshold to something reasonable (8 for memset,
   4 pairs for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
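
To illustrate item 1, here is a minimal standalone sketch of the overlapping-access idea. It is not the code from this commit: `copyWithOverlap` is a hypothetical helper, and fixed-size `std::memcpy` calls stand in for the 16-byte unaligned loads and stores that a backend would emit. The second 16-byte block is anchored to the end of the buffer, so the two blocks overlap and together cover every byte for any length from 16 to 32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of LLVM): copy n bytes, 16 <= n <= 32, with
// two 16-byte block moves instead of one block plus a byte-by-byte tail.
// The buffers must not overlap, as with memcpy.
void copyWithOverlap(unsigned char *dst, const unsigned char *src,
                     std::size_t n) {
  assert(n >= 16 && n <= 32);
  unsigned char head[16];
  unsigned char tail[16];
  std::memcpy(head, src, 16);           // first "unaligned load"
  std::memcpy(tail, src + n - 16, 16);  // second load, ends at src + n
  std::memcpy(dst, head, 16);           // first "unaligned store"
  std::memcpy(dst + n - 16, tail, 16);  // second store overlaps the first
                                        // by (32 - n) bytes
}
```

The overlapped bytes are written twice with the same values, which is harmless and usually cheaper than branching on the exact tail length.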
author: Evan Cheng <evan.cheng@apple.com> 2012-12-10 23:21:26 +0000
committer: Evan Cheng <evan.cheng@apple.com> 2012-12-10 23:21:26 +0000
commit: 376642ed620ecae05b68c7bc81f79aeb2065abe0 (patch)
tree: 9757b2568050b3ab58af15c234df3bc9f66202b0 /include
parent: 2b475922e6169098606006a69d765160caa77848 (diff)
download: llvm-376642ed620ecae05b68c7bc81f79aeb2065abe0.tar.gz
llvm-376642ed620ecae05b68c7bc81f79aeb2065abe0.tar.bz2
llvm-376642ed620ecae05b68c7bc81f79aeb2065abe0.tar.xz
1 files changed, 17 insertions, 5 deletions
diff --git a/include/llvm/Target/TargetLowering.h b/include/llvm/Target/TargetLowering.h
index 48083eee57..d2e20107ef 100644
--- a/include/llvm/Target/TargetLowering.h
+++ b/include/llvm/Target/TargetLowering.h
@@ -371,6 +371,16 @@ public:
     return false;
   }
 
+  /// isIntImmLegal - Returns true if the target can instruction select the
+  /// specified integer immediate natively (that is, it's materialized with one
+  /// instruction). The current *assumption* in isel is all of integer
+  /// immediates are "legal" and only the memcpy / memset expansion code is
+  /// making use of this. The rest of isel doesn't have proper cost model for
+  /// immediate materialization.
+  virtual bool isIntImmLegal(const APInt &/*Imm*/, EVT /*VT*/) const {
+    return true;
+  }
+
   /// isShuffleMaskLegal - Targets can use this to indicate that they only
   /// support *some* VECTOR_SHUFFLE operations, those with specific masks.
   /// By default, if a target supports the VECTOR_SHUFFLE node, all mask values
@@ -678,12 +688,14 @@ public:
   }
 
   /// This function returns true if the target allows unaligned memory accesses.
-  /// of the specified type. This is used, for example, in situations where an
-  /// array copy/move/set is  converted to a sequence of store operations. It's
-  /// use helps to ensure that such replacements don't generate code that causes
-  /// an alignment error  (trap) on the target machine.
+  /// of the specified type. If true, it also returns whether the unaligned
+  /// memory access is "fast" in the second argument by reference. This is used,
+  /// for example, in situations where an array copy/move/set is  converted to a
+  /// sequence of store operations. It's use helps to ensure that such
+  /// replacements don't generate code that causes an alignment error  (trap) on
+  /// the target machine.
   /// @brief Determine if the target supports unaligned memory accesses.
-  virtual bool allowsUnalignedMemoryAccesses(EVT) const {
+  virtual bool allowsUnalignedMemoryAccesses(EVT, bool *Fast = 0) const {
     return false;
   }
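
As a rough illustration of how a target might use the two hooks touched above, the following standalone sketch mirrors their shape without depending on LLVM headers. Everything here is an assumption made for the example: `MockTargetLowering` and `MockARMLikeLowering` are hypothetical classes, plain integers replace `APInt` and `EVT`, the immediate rule (an 8-bit value rotated by an even amount, as in ARM-mode data-processing immediates) is only one plausible definition of "materialized with one instruction", and the unaligned-access policy is arbitrary.

```cpp
#include <cstdint>

// Hypothetical stand-in for the base class: same defaults as in the diff.
struct MockTargetLowering {
  virtual ~MockTargetLowering() = default;
  virtual bool isIntImmLegal(std::uint64_t /*Imm*/, unsigned /*Bits*/) const {
    return true;  // default assumption: every integer immediate is "legal"
  }
  virtual bool allowsUnalignedMemoryAccesses(unsigned /*Bits*/,
                                             bool * /*Fast*/ = nullptr) const {
    return false;  // default: no unaligned accesses
  }
};

// Hypothetical ARM-like target used only for illustration.
struct MockARMLikeLowering : MockTargetLowering {
  // Legal if the value fits in 8 bits after rotating by an even amount.
  bool isIntImmLegal(std::uint64_t Imm, unsigned Bits) const override {
    if (Bits > 32)
      return false;
    std::uint32_t V = static_cast<std::uint32_t>(Imm);
    for (unsigned Rot = 0; Rot < 32; Rot += 2) {
      std::uint32_t R = Rot == 0 ? V : (V << Rot) | (V >> (32 - Rot));
      if (R <= 0xFF)
        return true;
    }
    return false;
  }

  // Arbitrary policy for the sketch: accesses up to 64 bits may be unaligned
  // and are reported as fast. Fast is written only when the answer is true.
  bool allowsUnalignedMemoryAccesses(unsigned Bits,
                                     bool *Fast = nullptr) const override {
    bool Allowed = Bits <= 64;
    if (Allowed && Fast)
      *Fast = true;
    return Allowed;
  }
};
```

A caller in the spirit of the memcpy expansion would query `isIntImmLegal` before turning a load from a constant string into a stored immediate, and would consult the `Fast` out-parameter before choosing wider unaligned operations.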