summaryrefslogtreecommitdiff
path: root/lib/Transforms/Vectorize
Commit message (Collapse)AuthorAge
* Merging r199570:Tom Stellard2014-04-09
| | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r199570 | aschwaighofer | 2014-01-18 22:18:31 -0500 (Sat, 18 Jan 2014) | 11 lines LoopVectorizer: A reduction that has multiple uses of the reduction value is not a reduction. Really. Under certain circumstances (the use list of an instruction has to be set up right - hence the extra pass in the test case) we would not recognize when a value in a potential reduction cycle was used multiple times by the reduction cycle. Fixes PR18526. radar://15851149 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@205818 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r197449:Bill Wendling2013-12-17
| | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r197449 | arnolds | 2013-12-16 17:11:01 -0800 (Mon, 16 Dec 2013) | 7 lines LoopVectorizer: Don't if-convert constant expressions that can trap A phi node operand or an instruction operand could be a constant expression that can trap (division). Check that we don't vectorize such cases. PR16729 radar://15653590 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@197453 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r196508:Bill Wendling2013-12-06
| | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r196508 | arnolds | 2013-12-05 07:14:40 -0800 (Thu, 05 Dec 2013) | 12 lines SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use We were creating external uses for scalar values in MustGather entries that also had a ScalarToTreeEntry (they also are present in a vectorized tuple). This meant we would keep a value 'alive' as a scalar and vectorized causing havoc. This is not necessary because when we create a MustGather vector we explicitly create external uses entries for the insertelement instructions of the MustGather vector elements. Fixes PR18129. radar://15582184 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@196571 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r195787:Bill Wendling2013-12-01
| | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195787 | arnolds | 2013-11-26 14:11:23 -0800 (Tue, 26 Nov 2013) | 8 lines LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary In signed arithmetic we could end up with an i64 trip count for an i32 phi. Because it is signed arithmetic we know that this is only defined if the i32 does not wrap. It is therefore safe to truncate the i64 trip count to a i32 value. Fixes PR18049. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195991 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r195791:Bill Wendling2013-11-27
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r195791 | nadav | 2013-11-26 14:24:25 -0800 (Tue, 26 Nov 2013) | 4 lines PR1860 - We can't save a list of ExtractElement instructions to CSE because some of these instructions may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195818 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r195773:Bill Wendling2013-11-27
| | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195773 | nadav | 2013-11-26 09:29:19 -0800 (Tue, 26 Nov 2013) | 6 lines PR18060 - When we RAUW values with ExtractElement instructions in some cases we generate PHI nodes with multiple entries from the same basic block but with different values. Enabling CSE on ExtractElement instructions make sure that all of the RAUWed instructions are the same. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195817 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r195528:Bill Wendling2013-11-25
| | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195528 | chandlerc | 2013-11-22 16:48:34 -0800 (Fri, 22 Nov 2013) | 7 lines Migrate metadata information from scalar to vector instructions during SLP vectorization. Based on the code in BBVectorizer. Fixes PR17741. Patch by Raul Silvera, reviewed by Hal and Nadav. Reformatted by my driving of clang-format. =] ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195608 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r195162:Bill Wendling2013-11-20
| | | | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r195162 | arnolds | 2013-11-19 14:20:20 -0800 (Tue, 19 Nov 2013) | 12 lines SLPVectorizer: Fix stale for Value pointer array We are slicing an array of Value pointers and process those slices in a loop. The problem is that we might invalidate a later slice by vectorizing a former slice. Use a WeakVH to track the pointer. If the pointer is deleted or RAUW'ed we can tell. The test case will only fail when running with libgmalloc. radar://15498655 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195222 91177308-0d34-0410-b5e6-96231b3b80d8
* Merging r195161:Bill Wendling2013-11-20
| | | | | | | | | | | ------------------------------------------------------------------------ r195161 | arnolds | 2013-11-19 14:20:18 -0800 (Tue, 19 Nov 2013) | 1 line SLPVectorizer: Fix whitespace errors ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_34@195221 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Extend the induction variable to a larger typeArnold Schwaighofer2013-11-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | In some case the loop exit count computation can overflow. Extend the type to prevent most of those cases. The problem is loops like: int main () { int a = 1; char b = 0; lbl: a &= 4; b--; if (b) goto lbl; return a; } The backedge count is 255. The induction variable type is i8. If we add one to 255 to get the exit count we overflow to zero. To work around this issue we extend the type of the induction variable to i32 in the case of i8 and i16. PR17532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195008 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Use abi alignment for accesses with no alignmentArnold Schwaighofer2013-11-15
| | | | | | | | | | | When we vectorize a scalar access with no alignment specified, we have to set the target's abi alignment of the scalar access on the vectorized access. Using the same alignment of zero would be wrong because most targets will have a bigger abi alignment for vector types. This probably fixes PR17878. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194876 91177308-0d34-0410-b5e6-96231b3b80d8
* Move debug message in vectorizerRenato Golin2013-11-11
| | | | | | No functional change, just better reporting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194388 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a ↵Benjamin Kramer2013-11-04
| | | | | | | | strict weak ordering. STL debug mode checks this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194015 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Add a missing pair of parens. No functionality change.Benjamin Kramer2013-11-03
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193958 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: When CSEing generated gathers only scan blocks containing them.Benjamin Kramer2013-11-03
| | | | | | | | | | | Instead of doing a RPO traversal of the whole function remember the blocks containing gathers (typically <= 2) and scan them in dominator-first order. The actual CSE is still quadratic, but I'm not confident that adding a scoped hash table here is worth it as we're only looking at the generated instructions and not arbitrary code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193956 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Remove duplicated function.Benjamin Kramer2013-11-02
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193927 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorize: Remove quadratic behavior the local CSE.Benjamin Kramer2013-11-02
| | | | | | | | Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193926 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Move cse code into its own functionArnold Schwaighofer2013-11-01
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193895 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Perform redundancy elimination on induction variablesArnold Schwaighofer2013-11-01
| | | | | | | | | | | | | | | | | | | | When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193891 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorize: Look for consecutive acces in GEPs with trailing zero indicesBenjamin Kramer2013-11-01
| | | | | | | If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193860 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: If dependency checks fail try runtime checksArnold Schwaighofer2013-11-01
| | | | | | | | | | | | When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193853 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Clear all member data structures in RuntimeCheck.reset()Arnold Schwaighofer2013-11-01
| | | | | | | | Clear all data structures when resetting the RuntimeCheck data structure. No test case. This was exposed by an upcomming change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193852 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM cost model: Account for zero cost scalar SROA instructionsArnold Schwaighofer2013-10-29
| | | | | | | | | By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193573 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Use vector type for vectorized memory operationsArnold Schwaighofer2013-10-29
| | | | | | | | | No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193572 91177308-0d34-0410-b5e6-96231b3b80d8
* Quick look-up for block in loop.Wan Xiaofei2013-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements quick look-up for block in loop by maintaining a hash set for blocks. It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng). Below are the compilation time for our benchmark in llc before & after the patch. Benchmark llc - trunk llc - patched 401.bzip2 0.339081 100.00% 0.329657 102.86% 403.gcc 19.853966 100.00% 19.605466 101.27% 429.mcf 0.049823 100.00% 0.048451 102.83% 433.milc 0.514898 100.00% 0.510217 100.92% 444.namd 1.109328 100.00% 1.103481 100.53% 445.gobmk 4.988028 100.00% 4.929114 101.20% 456.hmmer 0.843871 100.00% 0.825865 102.18% 458.sjeng 0.754238 100.00% 0.714095 105.62% 464.h264ref 2.9668 100.00% 2.90612 102.09% 471.omnetpp 4.556533 100.00% 4.511886 100.99% bitmnp01 0.038168 100.00% 0.0357 106.91% idctrn01 0.037745 100.00% 0.037332 101.11% libquake2 3.78689 100.00% 3.76209 100.66% libquake_ 2.251525 100.00% 2.234104 100.78% linpack 0.033159 100.00% 0.032788 101.13% matrix01 0.045319 100.00% 0.043497 104.19% nbench 0.333161 100.00% 0.329799 101.02% tblook01 0.017863 100.00% 0.017666 101.12% ttsprk01 0.054337 100.00% 0.053057 102.41% Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov> Approver : Andrew Trick <atrick@apple.com> Test : Pass make check-all & llvm test-suite git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193460 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorizer: Don't attempt to vectorize extractelement instructionsHal Finkel2013-10-25
| | | | | | | | | | | | | | | The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this: %58 = extractelement <2 x i32> %15, i32 0 %59 = extractelement i32 %58, i32 0 where the second extractelement is illegal because its first operand is not a vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193434 91177308-0d34-0410-b5e6-96231b3b80d8
* Mark vector loops as already vectorizedRenato Golin2013-10-24
| | | | | | | | Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193349 91177308-0d34-0410-b5e6-96231b3b80d8
* Use more type helper functionsMatt Arsenault2013-10-21
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193109 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Don't vectorize volatile memory operationsArnold Schwaighofer2013-10-16
| | | | | | | | | | radar://15231682 Reapply r192799, http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226 showed that the bot is still broken even with this out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192820 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "SLPVectorizer: Don't vectorize volatile memory operations"Arnold Schwaighofer2013-10-16
| | | | | | This speculatively reverts commit 192799. It might have broken a linux buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192816 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Don't vectorize volatile memory operationsArnold Schwaighofer2013-10-16
| | | | | | radar://15231682 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192799 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorize: Properly reflect PODness in comments.Benjamin Kramer2013-10-15
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192717 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Sort PHINodes based on their opcodeArnold Schwaighofer2013-10-12
| | | | | | | | | | | | | | Before this patch we relied on the order of phi nodes when we looked for phi nodes of the same type. This could prevent vectorization of cases where there was a phi node of a second type in between phi nodes of some type. This is important for vectorization of an internal graphics kernel. On the test suite + external on x86_64 (and on a run on armv7s) it showed no impact on either performance or compile time. radar://15024459 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192537 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorize: Add missing INITIALIZE_PASS_DEPENDENCY macrosTobias Grosser2013-10-12
| | | | | | Contributed-by: Peter Zotov <whitequark@whitequark.org> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192536 91177308-0d34-0410-b5e6-96231b3b80d8
* Better info when debugging vectorizerRenato Golin2013-10-11
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192460 91177308-0d34-0410-b5e6-96231b3b80d8
* LoopVectorize: External uses must use the last value in a reduction cycleArnold Schwaighofer2013-10-07
| | | | | | | | | Otherwise, we don't perform operations that would have been performed on the scalar version. Fixes PR17498. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192133 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Sort inputs to commutative binary operationsArnold Schwaighofer2013-10-04
| | | | | | | | | | | | | | | | | | | | | | | Sort the operands of the other entries in the current vectorization root according to the first entry's operands opcodes. %conv0 = uitofp ... %load0 = load float ... = fmul %conv0, %load0 = fmul %load0, %conv1 = fmul %load0, %conv2 Make sure that we recursively vectorize <%conv0, %conv1, %conv2> and <%load0, %load0, %load0>. This makes it more likely to obtain vectorizable trees. We have to be careful when we sort that we don't destroy 'good' existing ordering implied by source order. radar://15080067 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191977 91177308-0d34-0410-b5e6-96231b3b80d8
* Don't use runtime bounds check between address spaces.Matt Arsenault2013-10-02
| | | | | | | | | Don't vectorize with a runtime check if it requires a comparison between pointers with different address spaces. The values can't be assumed to be directly comparable. Previously it would create an illegal bitcast. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191862 91177308-0d34-0410-b5e6-96231b3b80d8
* Apply slp vectorization on fully-vectorizable tree of height 2Yi Jiang2013-10-02
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191852 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix debug printing spacing.Matt Arsenault2013-10-02
| | | | | | Fix missing newlines, missing and extra spaces in printed messages. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191851 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix comment grammar and capitalization.Matt Arsenault2013-10-02
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191850 91177308-0d34-0410-b5e6-96231b3b80d8
* SLPVectorizer: Make store chain finding more aggressive with ↵Benjamin Kramer2013-10-02
| | | | | | | | | GetUnderlyingObject. This recursively strips all GEPs like the existing code. It also handles bitcasts and other operations that do not change the pointer value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191847 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove several unused variables.Rafael Espindola2013-10-01
| | | | | | Patch by Alp Toker. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191757 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix code duplicationMatt Arsenault2013-10-01
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191716 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert manual insert point restores to the new RAII object.Benjamin Kramer2013-09-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191675 91177308-0d34-0410-b5e6-96231b3b80d8
* IRBuilder: Add RAII objects to reset insertion points or fast math flags.Benjamin Kramer2013-09-30
| | | | | | | | Inspired by the object from the SLPVectorizer. This found a minor bug in the debug loc restoration in the vectorizer where the location of a following instruction was attached instead of the location from the original instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191673 91177308-0d34-0410-b5e6-96231b3b80d8
* Even more spelling fixes for "instruction".Robert Wilhelm2013-09-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191611 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix spelling intruction -> instruction.Robert Wilhelm2013-09-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191610 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix SLPVectorizer using wrong address space for load/storeMatt Arsenault2013-09-27
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191564 91177308-0d34-0410-b5e6-96231b3b80d8
* Transforms: Use getFirstNonPHI to set the insertion point for PHIsJustin Bogner2013-09-27
| | | | | | | | | | We were previously using getFirstInsertionPt to insert PHI instructions when vectorizing, but getFirstInsertionPt also skips past landingpads, causing this to generate invalid IR. We can avoid this issue by using getFirstNonPHI instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191526 91177308-0d34-0410-b5e6-96231b3b80d8