llvm - Unofficial llvm GIT mirror used in EmbToolkit

	Commit message (Collapse)	Author	Age
*	Merging r200028:	Tom Stellard	2014-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r200028 \| benny.kra \| 2014-01-24 14:02:37 -0500 (Fri, 24 Jan 2014) \| 4 lines InstCombine: Don't try to use aggregate elements of ConstantExprs. PR18600. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@206054 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r199570:	Tom Stellard	2014-04-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r199570 \| aschwaighofer \| 2014-01-18 22:18:31 -0500 (Sat, 18 Jan 2014) \| 11 lines LoopVectorizer: A reduction that has multiple uses of the reduction value is not a reduction. Really. Under certain circumstances (the use list of an instruction has to be set up right - hence the extra pass in the test case) we would not recognize when a value in a potential reduction cycle was used multiple times by the reduction cycle. Fixes PR18526. radar://15851149 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@205818 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r198425:	Tom Stellard	2014-04-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r198425 \| dpeixott \| 2014-01-03 12:20:01 -0500 (Fri, 03 Jan 2014) \| 33 lines Fix loop rerolling pass failure with non-consant loop lower bound The loop rerolling pass was failing with an assertion failure from a failed cast on loops like this: void foo(int A, int B, int m, int n) { for (int i = m; i < n; i+=4) { A[i+0] = B[i+0] * 4; A[i+1] = B[i+1] * 4; A[i+2] = B[i+2] * 4; A[i+3] = B[i+3] * 4; } } The code was casting the SCEV-expanded code for the new induction variable to a phi-node. When the loop had a non-constant lower bound, the SCEV expander would end the code expansion with an add insted of a phi node and the cast would fail. It looks like the cast to a phi node was only needed to get the induction variable value coming from the backedge to compute the end of loop condition. This patch changes the loop reroller to compare the induction variable to the number of times the backedge is taken instead of the iteration count of the loop. In other words, we stop the loop when the current value of the induction variable == IterationCount-1. Previously, the comparison was comparing the induction variable value from the next iteration == IterationCount. This problem only seems to occur on 32-bit targets. For some reason, the loop is not rerolled on 64-bit targets. PR18290 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@205817 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r202273:	Tom Stellard	2014-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r202273 \| atrick \| 2014-02-26 11:31:56 -0500 (Wed, 26 Feb 2014) \| 4 lines Fix PR18165: LSR must avoid scaling factors that exceed the limit on truncated use. Patch by Michael Zolotukhin! ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@205796 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r201104:	Tom Stellard	2014-04-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r201104 \| chandlerc \| 2014-02-10 14:39:35 -0500 (Mon, 10 Feb 2014) \| 26 lines [LPM] A terribly simple fix to a terribly complex bug: PR18773. The crux of the issue is that LCSSA doesn't preserve stateful alias analyses. Before r200067, LICM didn't cause LCSSA to run in the LTO pass manager, where LICM runs essentially without any of the other loop passes. As a consequence the globalmodref-aa pass run before that loop pass manager was able to survive the loop pass manager and be used by DSE to eliminate stores in the function called from the loop body in Adobe-C++/loop_unroll (and similar patterns in other benchmarks). When LICM was taught to preserve LCSSA it had to require it as well. This caused it to be run in the loop pass manager and because it did not preserve AA, the stateful AA was lost. Most of LLVM's AA isn't stateful and so this didn't manifest in most cases. Also, in most cases LCSSA was already running, and so there was no interesting change. The real kicker is that LCSSA by its definition (injecting PHI nodes only) trivially preserves AA! All we need to do is mark it, and then everything goes back to working as intended. It probably was blocking some other weird cases of stateful AA but the only one I have is a 1000-line IR test case from loop_unroll, so I don't really have a good test case here. Hopefully this fixes the regressions on performance that have been seen since that revision. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@205795 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r197449:	Bill Wendling	2013-12-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r197449 \| arnolds \| 2013-12-16 17:11:01 -0800 (Mon, 16 Dec 2013) \| 7 lines LoopVectorizer: Don't if-convert constant expressions that can trap A phi node operand or an instruction operand could be a constant expression that can trap (division). Check that we don't vectorize such cases. PR16729 radar://15653590 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197453 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r197178:	Bill Wendling	2013-12-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r197178 \| hfinkel \| 2013-12-12 12:45:24 -0800 (Thu, 12 Dec 2013) \| 9 lines Fix a use-after-free error in GlobalOpt CleanupConstantGlobalUsers GlobalOpt's CleanupConstantGlobalUsers function uses a worklist array to manage constant users to be visited. The pointers in this array need to be weak handles because when we delete a constant array, we may also be holding a pointer to one of its elements (or an element of one of its elements if we're dealing with an array of arrays) in the worklist. Fixes PR17347. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197322 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195505:	Manman Ren	2013-12-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195505 \| mren \| 2013-11-22 14:06:31 -0800 (Fri, 22 Nov 2013) \| 8 lines Debug Info: move StripDebugInfo from StripSymbols.cpp to DebugInfo.cpp. We can share the implementation between StripSymbols and dropping debug info for metadata versions that do not match. Also update the comments to match the implementation. A follow-on patch will drop the "Debug Info Version" module flag in StripDebugInfo. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196816 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r196638:	Bill Wendling	2013-12-08
\| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r196638 \| arsenm \| 2013-12-06 18:58:45 -0800 (Fri, 06 Dec 2013) \| 1 line Fix assert with copy from global through addrspacecast ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196709 91177308-0d34-0410-b5e6-96231b3b80d8
*	--- Reverse-merging r196668 into '.':	Bill Wendling	2013-12-08
\| \| \| \| \| \| \| \| \|	U lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp U test/Transforms/InstCombine/addrspacecast.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196705 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r196638:	Bill Wendling	2013-12-07
\| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r196638 \| arsenm \| 2013-12-06 18:58:45 -0800 (Fri, 06 Dec 2013) \| 1 line Fix assert with copy from global through addrspacecast ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196668 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r196611:	Bill Wendling	2013-12-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r196611 \| dexonsmith \| 2013-12-06 13:48:36 -0800 (Fri, 06 Dec 2013) \| 5 lines Don't use isNullValue to evaluate ConstantExpr ConstantExpr can evaluate to false even when isNullValue gives false. Fixes PR18143. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196614 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r196508:	Bill Wendling	2013-12-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r196508 \| arnolds \| 2013-12-05 07:14:40 -0800 (Thu, 05 Dec 2013) \| 12 lines SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use We were creating external uses for scalar values in MustGather entries that also had a ScalarToTreeEntry (they also are present in a vectorized tuple). This meant we would keep a value 'alive' as a scalar and vectorized causing havoc. This is not necessary because when we create a MustGather vector we explicitly create external uses entries for the insertelement instructions of the MustGather vector elements. Fixes PR18129. radar://15582184 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196571 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r196129:	Bill Wendling	2013-12-02
\| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r196129 \| kkhoo \| 2013-12-02 10:43:59 -0800 (Mon, 02 Dec 2013) \| 1 line Conservative fix for PR17827 - don't optimize a shift + and + compare sequence where the shift is logical unless the comparison is unsigned ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196132 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r196004:	Bill Wendling	2013-12-01
\| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r196004 \| void \| 2013-11-30 19:36:07 -0800 (Sat, 30 Nov 2013) \| 3 lines Use 'unsigned char' to get this past gcc error message: error: invalid conversion from 'unsigned char' to '{anonymous}::Sequence' ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196005 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195787:	Bill Wendling	2013-12-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195787 \| arnolds \| 2013-11-26 14:11:23 -0800 (Tue, 26 Nov 2013) \| 8 lines LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary In signed arithmetic we could end up with an i64 trip count for an i32 phi. Because it is signed arithmetic we know that this is only defined if the i32 does not wrap. It is therefore safe to truncate the i64 trip count to a i32 value. Fixes PR18049. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195991 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195791:	Bill Wendling	2013-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195791 \| nadav \| 2013-11-26 14:24:25 -0800 (Tue, 26 Nov 2013) \| 4 lines PR1860 - We can't save a list of ExtractElement instructions to CSE because some of these instructions may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195818 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195773:	Bill Wendling	2013-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195773 \| nadav \| 2013-11-26 09:29:19 -0800 (Tue, 26 Nov 2013) \| 6 lines PR18060 - When we RAUW values with ExtractElement instructions in some cases we generate PHI nodes with multiple entries from the same basic block but with different values. Enabling CSE on ExtractElement instructions make sure that all of the RAUWed instructions are the same. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195817 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195769:	Bill Wendling	2013-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195769 \| dyatkovskiy \| 2013-11-26 08:11:03 -0800 (Tue, 26 Nov 2013) \| 27 lines PR17925 bugfix. Short description. This issue is about case of treating pointers as integers. We treat pointers as different if they references different address space. At the same time, we treat pointers equal to integers (with machine address width). It was a point of false-positive. Consider next case on 32bit machine: void foo0(i32 addrespace(1)* %p) void foo1(i32 addrespace(2)* %p) void foo2(i32 %p) foo0 != foo1, while foo1 == foo2 and foo0 == foo2. As you can see it breaks transitivity. That means that result depends on order of how functions are presented in module. Next order causes merging of foo0 and foo1: foo2, foo0, foo1 First foo0 will be merged with foo2, foo0 will be erased. Second foo1 will be merged with foo2. Depending on order, things could be merged we don't expect to. The fix: Forbid to treat any pointer as integer, except for those, who belong to address space 0. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195810 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195528:	Bill Wendling	2013-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195528 \| chandlerc \| 2013-11-22 16:48:34 -0800 (Fri, 22 Nov 2013) \| 7 lines Migrate metadata information from scalar to vector instructions during SLP vectorization. Based on the code in BBVectorizer. Fixes PR17741. Patch by Raul Silvera, reviewed by Hal and Nadav. Reformatted by my driving of clang-format. =] ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195608 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195493:	Bill Wendling	2013-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195493 \| arsenm \| 2013-11-22 11:24:39 -0800 (Fri, 22 Nov 2013) \| 6 lines StructurizeCFG: Fix verification failure with some loops. If the beginning of the loop was also the entry block of the function, branches were inserted to the entry block which isn't allowed. If this occurs, create a new dummy function entry block that branches to the start of the loop. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195606 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195492:	Bill Wendling	2013-11-25
\| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195492 \| arsenm \| 2013-11-22 11:24:37 -0800 (Fri, 22 Nov 2013) \| 1 line StructurizeCFG: Fix inverting a branch on an argument ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195605 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195477:	Bill Wendling	2013-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195477 \| rafael \| 2013-11-22 09:58:12 -0800 (Fri, 22 Nov 2013) \| 13 lines Add a fixed version of r195470 back. The fix is simply to use CurI instead of I when handling aliases to avoid accessing a invalid iterator. original message: Convert linkonce* to weak* instead of strong. Also refactor the logic into a helper function. This is an important improve on mingw where the linker complains about mixed weak and strong symbols. Converting to weak ensures that the symbol is not dropped, but keeps in a comdat, making the linker happy. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195603 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195162:	Bill Wendling	2013-11-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195162 \| arnolds \| 2013-11-19 14:20:20 -0800 (Tue, 19 Nov 2013) \| 12 lines SLPVectorizer: Fix stale for Value pointer array We are slicing an array of Value pointers and process those slices in a loop. The problem is that we might invalidate a later slice by vectorizing a former slice. Use a WeakVH to track the pointer. If the pointer is deleted or RAUW'ed we can tell. The test case will only fail when running with libgmalloc. radar://15498655 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195222 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195161:	Bill Wendling	2013-11-20
\| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195161 \| arnolds \| 2013-11-19 14:20:18 -0800 (Tue, 19 Nov 2013) \| 1 line SLPVectorizer: Fix whitespace errors ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195221 91177308-0d34-0410-b5e6-96231b3b80d8
*	Merging r195118:	Bill Wendling	2013-11-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	------------------------------------------------------------------------ r195118 \| chandlerc \| 2013-11-19 01:03:18 -0800 (Tue, 19 Nov 2013) \| 22 lines Fix an issue where SROA computed different results based on the relative order of slices of the alloca which have exactly the same size and other properties. This was found by a perniciously unstable sort implementation used to flush out buggy uses of the algorithm. The fundamental idea is that findCommonType should return the best common type it can find across all of the slices in the range. There were two bugs here previously: 1) We would accept an integer type smaller than a byte-width multiple, and if there were different bit-width integer types, we would accept the first one. This caused an actual failure in the testcase updated here when the sort order changed. 2) If we found a bad combination of types or a non-load, non-store use before an integer typed load or store we would bail, but if we found the integere typed load or store, we would use it. The correct behavior is to always use an integer typed operation which covers the partition if one exists. While a clever debugging sort algorithm found problem #1 in our existing test cases, I have no useful test case ideas for #2. I spotted in by inspection when looking at this code. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195217 91177308-0d34-0410-b5e6-96231b3b80d8
*	Debug info: Let LowerDbgDeclare perfom the dbg.declare -> dbg.value	Adrian Prantl	2013-11-18
\| \| \| \| \| \| \| \| \| \|	lowering only for load/stores to scalar allocas. The resulting values confuse the backend and don't add anything because we can describe array-allocas with a dbg.declare intrinsic just fine. rdar://problem/15464571 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195052 91177308-0d34-0410-b5e6-96231b3b80d8
*	[ASan] Fix PR17867 - make sure ASan doesn't crash if use-after-scope and ↵	Alexey Samsonov	2013-11-18
\| \| \| \| \| \|	use-after-return are combined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195014 91177308-0d34-0410-b5e6-96231b3b80d8
*	LoopVectorizer: Extend the induction variable to a larger type	Arnold Schwaighofer	2013-11-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some case the loop exit count computation can overflow. Extend the type to prevent most of those cases. The problem is loops like: int main () { int a = 1; char b = 0; lbl: a &= 4; b--; if (b) goto lbl; return a; } The backedge count is 255. The induction variable type is i8. If we add one to 255 to get the exit count we overflow to zero. To work around this issue we extend the type of the induction variable to i32 in the case of i8 and i16. PR17532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195008 91177308-0d34-0410-b5e6-96231b3b80d8
*	Utils/LoopUnroll.cpp: Tweak (StringRef)OldName to be valid until it is used, ↵	NAKAMURA Takumi	2013-11-17
\| \| \| \| \| \| \| \|	since r194601. eraseFromParent() invalidates OldName. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194970 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add a loop rerolling flag to the PassManagerBuilder	Hal Finkel	2013-11-17
\| \| \| \| \| \| \| \| \|	This adds a boolean member variable to the PassManagerBuilder to control loop rerolling (just like we have for unrolling and the various vectorization options). This is necessary for control by the frontend. Loop rerolling remains disabled by default at all optimization levels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194966 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add the cold attribute to error-reporting call sites	Hal Finkel	2013-11-17
\| \| \| \| \| \| \| \| \| \| \|	Generally speaking, control flow paths with error reporting calls are cold. So far, error reporting calls are calls to perror and calls to fprintf, fwrite, etc. with stderr as the stream. This can be extended in the future. The primary motivation is to improve block placement (the cold attribute affects the static branch prediction heuristics). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194943 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix ndebug-build unused variable in loop rerolling	Hal Finkel	2013-11-17
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194941 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add a loop rerolling pass	Hal Finkel	2013-11-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this: for (int i = 0; i < 3200; i += 5) { a[i] += alpha * b[i]; a[i + 1] += alpha * b[i + 1]; a[i + 2] += alpha * b[i + 2]; a[i + 3] += alpha * b[i + 3]; a[i + 4] += alpha * b[i + 4]; } and turn them into this: for (int i = 0; i < 3200; ++i) { a[i] += alpha * b[i]; } and loops like this: for (int i = 0; i < 500; ++i) { x[3i] = foo(0); x[3i+1] = foo(0); x[3*i+2] = foo(0); } and turn them into this: for (int i = 0; i < 1500; ++i) { x[i] = foo(0); } There are two motivations for this transformation: 1. Code-size reduction (especially relevant, obviously, when compiling for code size). 2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target. The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases. This pass is not currently enabled by default at any optimization level. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194939 91177308-0d34-0410-b5e6-96231b3b80d8
*	Apply the InstCombine fptrunc sqrt optimization to llvm.sqrt	Hal Finkel	2013-11-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls: (fptrunc (sqrt (fpext x))) -> (sqrtf x) but does not apply the same optimization to llvm.sqrt. This is a problem because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in fast-math mode, and because this optimization is being applied to sqrt and not applied to llvm.sqrt, sometimes the fast-math code is slower. This change makes InstCombine apply this optimization to llvm.sqrt as well. This fixes the specific problem in PR17758, although the same underlying issue (optimizations applied to libcalls are not applied to intrinsics) exists for other optimizations in SimplifyLibCalls. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194935 91177308-0d34-0410-b5e6-96231b3b80d8
*	InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs.	Benjamin Kramer	2013-11-16
\| \| \| \| \| \|	This is common in bitfield code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194925 91177308-0d34-0410-b5e6-96231b3b80d8
*	LoopVectorizer: Use abi alignment for accesses with no alignment	Arnold Schwaighofer	2013-11-15
\| \| \| \| \| \| \| \| \| \| \|	When we vectorize a scalar access with no alignment specified, we have to set the target's abi alignment of the scalar access on the vectorized access. Using the same alignment of zero would be wrong because most targets will have a bigger abi alignment for vector types. This probably fixes PR17878. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194876 91177308-0d34-0410-b5e6-96231b3b80d8
*	ArgumentPromotion: correctly transfer TBAA tags and alignments.	Manman Ren	2013-11-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to use std::map<IndicesVector, LoadInst> for OriginalLoads, and when we try to promote two arguments, they will both write to OriginalLoads causing created loads for the two arguments to have the same original load. And the same tbaa tag and alignment will be put to the created loads for the two arguments. The fix is to use std::map<std::pair<Argument, IndicesVector>, LoadInst*> for OriginalLoads, so each Argument will write to different parts of the map. PR17906 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194846 91177308-0d34-0410-b5e6-96231b3b80d8
*	[asan] use GlobalValue::PrivateLinkage for coverage guard to save quite a ↵	Kostya Serebryany	2013-11-15
\| \| \| \| \| \|	bit of code size git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194800 91177308-0d34-0410-b5e6-96231b3b80d8
*	Reapply "[asan] Poor man's coverage that works with ASan"	Bob Wilson	2013-11-15
\| \| \| \| \| \| \| \|	I was able to successfully run a bootstrapped LTO build of clang with r194701, so this change does not seem to be the cause of our failing buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194789 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add instcombine visitor for addrspacecast	Matt Arsenault	2013-11-15
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194786 91177308-0d34-0410-b5e6-96231b3b80d8
*	Revert "[asan] Poor man's coverage that works with ASan"	Bob Wilson	2013-11-15
\| \| \| \| \| \| \| \| \|	This reverts commit 194701. Apple's bootstrapped LTO builds have been failing, and this change (along with compiler-rt 194702-194704) is the only thing on the blamelist. I will either reappy these changes or help debug the problem, depending on whether this fixes the buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194780 91177308-0d34-0410-b5e6-96231b3b80d8
*	[asan] Poor man's coverage that works with ASan	Kostya Serebryany	2013-11-14
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194701 91177308-0d34-0410-b5e6-96231b3b80d8
*	[msan] Fast path optimization for wrap-indirect-calls feature of ↵	Evgeniy Stepanov	2013-11-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MemorySanitizer. Indirect call wrapping helps MSanDR (dynamic instrumentation companion tool for MSan) to catch all cases where execution leaves a compiler-instrumented module by allowing the tool to rewrite targets of indirect calls. This change is an optimization that skips wrapping for calls when target is inside the current module. This relies on the linker providing symbols at the begin and end of the module code (or code + data, does not really matter). Gold linker provides such symbols by default. GNU (BFD) linker needs a link flag: -Wl,--defsym=__executable_start=0. More info: https://code.google.com/p/memory-sanitizer/wiki/MSanDR#Native_exec git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194697 91177308-0d34-0410-b5e6-96231b3b80d8
*	Use StringRef instead of std::string	Jakub Staszak	2013-11-13
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194601 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix -Wdelete-non-virtual-dtor warnings by making SampleProfile methods ↵	Alexey Samsonov	2013-11-13
\| \| \| \| \| \|	non-virtual git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194568 91177308-0d34-0410-b5e6-96231b3b80d8
*	SampleProfileLoader pass. Initial setup.	Diego Novillo	2013-11-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a new scalar pass that reads a file with samples generated by 'perf' during runtime. The samples read from the profile are incorporated and emmited as IR metadata reflecting that profile. The profile file is assumed to have been generated by an external profile source. The profile information is converted into IR metadata, which is later used by the analysis routines to estimate block frequencies, edge weights and other related data. External profile information files have no fixed format, each profiler is free to define its own. This includes both the on-disk representation of the profile and the kind of profile information stored in the file. A common kind of profile is based on sampling (e.g., perf), which essentially counts how many times each line of the program has been executed during the run. The SampleProfileLoader pass is organized as a scalar transformation. On startup, it reads the file given in -sample-profile-file to determine what kind of profile it contains. This file is assumed to contain profile information for the whole application. The profile data in the file is read and incorporated into the internal state of the corresponding profiler. To facilitate testing, I've organized the profilers to support two file formats: text and native. The native format is whatever on-disk representation the profiler wants to support, I think this will mostly be bitcode files, but it could be anything the profiler wants to support. To do this, every profiler must implement the SampleProfile::loadNative() function. The text format is mostly meant for debugging. Records are separated by newlines, but each profiler is free to interpret records as it sees fit. Profilers must implement the SampleProfile::loadText() function. Finally, the pass will call SampleProfile::emitAnnotations() for each function in the current translation unit. This function needs to translate the loaded profile into IR metadata, which the analyzer will later be able to use. This patch implements the first steps towards the above design. I've implemented a sample-based flat profiler. The format of the profile is fairly simplistic. Each sampled function contains a list of relative line locations (from the start of the function) together with a count representing how many samples were collected at that line during execution. I generate this profile using perf and a separate converter tool. Currently, I have only implemented a text format for these profiles. I am interested in initial feedback to the whole approach before I send the other parts of the implementation for review. This patch implements: - The SampleProfileLoader pass. - The base ExternalProfile class with the core interface. - A SampleProfile sub-class using the above interface. The profiler generates branch weight metadata on every branch instructions that matches the profiles. - A text loader class to assist the implementation of SampleProfile::loadText(). - Basic unit tests for the pass. Additionally, the patch uses profile information to compute branch weights based on instruction samples. This patch converts instruction samples into branch weights. It does a fairly simplistic conversion: Given a multi-way branch instruction, it calculates the weight of each branch based on the maximum sample count gathered from each target basic block. Note that this assignment of branch weights is somewhat lossy and can be misleading. If a basic block has more than one incoming branch, all the incoming branches will get the same weight. In reality, it may be that only one of them is the most heavily taken branch. I will adjust this assignment in subsequent patches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194566 91177308-0d34-0410-b5e6-96231b3b80d8
*	Update the docs to match the function name.	Nadav Rotem	2013-11-13
\| \| \| \|	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194537 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fold (iszero(A&K1) \| iszero(A&K2)) -> (A&(K1\|K2)) != (K1\|K2) if we know ↵	Nadav Rotem	2013-11-12
\| \| \| \| \| \|	that K1 and K2 are 'one-hot' (only one bit is on). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194525 91177308-0d34-0410-b5e6-96231b3b80d8
*	FoldBranchToCommonDest merges branches into a single branch with or/and of ↵	Nadav Rotem	2013-11-12
\| \| \| \| \| \|	the condition. It has a heuristics for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristics that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194524 91177308-0d34-0410-b5e6-96231b3b80d8