llvm - Unofficial llvm GIT mirror used in EmbToolkit

	Commit message	Author	Age
*	[NVPTX] Error out if initializer is given for variable in an address space that does not support initialization	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211943 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add support for .managed variables for UVM	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211942 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211941 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Fix handling of ldg/ldu intrinsics.	Justin Holewinski	2014-06-27
	The address space of the pointer must be global (1) for these intrinsics. There must also be alignment metadata attached to the intrinsic calls, e.g.
	  %val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0
	  !0 = metadata !{i32 4}
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211939 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211938 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add support for [SHL,SRA,SRL]_PARTS	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211936 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns	Justin Holewinski	2014-06-27
	This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211935 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add support for efficient rotate instructions on SM 3.2+	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211934 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add missing isel patterns for 64-bit atomics	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211933 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add isel patterns for bit-field extract (bfe)	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211932 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add support for isspacep instruction	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211931 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Add support for envreg reads	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211930 91177308-0d34-0410-b5e6-96231b3b80d8
*	[NVPTX] Emit .weak when linkage is not external, internal, or private	Justin Holewinski	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211926 91177308-0d34-0410-b5e6-96231b3b80d8
*	[x86] Fix a miscompile in the new shuffle lowering uncovered by a bootstrap.	Chandler Carruth	2014-06-27
	I managed to mis-remember how PACKUS worked on x86, and was using undef for the high bytes instead of zero. The fix is fairly obvious.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211922 91177308-0d34-0410-b5e6-96231b3b80d8
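	To illustrate why undef high bytes break a PACKUS-based truncation: PACKUSWB reads each 16-bit word as signed and saturates it to an unsigned byte, so the low byte only passes through unchanged when the high byte is zero. The following is a minimal single-lane model, not LLVM code; the helper name `packus_word` is hypothetical.

```python
def packus_word(word: int) -> int:
    """Model one lane of PACKUSWB: signed 16-bit word -> unsigned-saturated byte."""
    word &= 0xFFFF
    # Reinterpret the 16-bit pattern as a signed value.
    signed = word - 0x10000 if word & 0x8000 else word
    # Saturate to the unsigned 8-bit range.
    return max(0, min(255, signed))


low_byte = 0xAB
# High byte zero: the low byte survives, so the pack acts as a truncation.
print(hex(packus_word(0x0000 | low_byte)))
# High byte left as arbitrary garbage: the result saturates instead.
print(hex(packus_word(0xFF00 | low_byte)))
print(hex(packus_word(0x1200 | low_byte)))
```

	In this model a garbage high byte yields 0 or 255 rather than the intended low byte, which is the kind of miscompile that zeroing the high bytes avoids.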
*	IR: Add COMDATs to the IR	David Majnemer	2014-06-27
	This new IR facility allows us to represent the object-file semantic of a COMDAT group. COMDATs allow us to tie together sections and make the inclusion of one dependent on another. This is required to implement features like MS ABI VFTables and optimizing away certain kinds of initialization in C++. This functionality is only representable in COFF and ELF; Mach-O has no similar mechanism.
	Differential Revision: http://reviews.llvm.org/D4178
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211920 91177308-0d34-0410-b5e6-96231b3b80d8
*	Fix test so it doesn't try to write out temporary files into the test tree.	David Blaikie	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211916 91177308-0d34-0410-b5e6-96231b3b80d8
*	MC: Fix associative sections on COFF	David Majnemer	2014-06-27
	COFF sections in MC were represented by a tuple of section-name and COMDAT-name. This is not sufficient to represent a .text section associated with another .text section; we need a way to distinguish between the key section and the one marked associative.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211913 91177308-0d34-0410-b5e6-96231b3b80d8
*	R600: Don't crash on unhandled instruction in promote alloca	Matt Arsenault	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211906 91177308-0d34-0410-b5e6-96231b3b80d8
*	[PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex	Ulrich Weigand	2014-06-27
	I've run into a bug where current LLVM at -O0 (with fast-isel) generated invalid code like:
	  ld 0, 20936(1) # 8-byte Folded Reload
	  stw 12, 10348(0)
	  stw 12, 10344(0)
	The underlying vreg had been introduced as base register by the Local Stack Slot Allocation pass. That register was constrained to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match the ADDI instruction used to set it, but it was not constrained to G8RC_NOX0 to fit the use of the register in an address. That should have happened in PPCRegisterInfo::resolveFrameIndex. This patch adds an appropriate constrainRegClass call.
	Reviewed by Hal Finkel.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211897 91177308-0d34-0410-b5e6-96231b3b80d8
*	[x86] Teach the target combine step to aggressively fold pshufd instructions.	Chandler Carruth	2014-06-27
	Summary: This allows it to fold pshufd instructions across intervening half-shuffles and other noise. This pattern actually shows up in the generic lowering tests, but I've also added direct tests using intrinsics to make sure that the specific desired functionality is working even if the lowering stuff changes in the future.
	Differential Revision: http://reviews.llvm.org/D4292
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211892 91177308-0d34-0410-b5e6-96231b3b80d8
*	[ELF][Mips] Fix recognition of MIPS 64-bit arch in the ELFObjectFile::getArch() method.	Simon Atanasyan	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211891 91177308-0d34-0410-b5e6-96231b3b80d8
*	[x86] Teach the target-specific combining how to aggressively fold half-shuffles, even looking through intervening instructions in a chain.	Chandler Carruth	2014-06-27
	Summary: This doesn't happen to show up with any test cases I've found for the current shuffle lowering, but previous attempts would benefit from this and it seems generally useful. I've tested it directly using intrinsics, which also shows that it will work with hand vectorized code as well. Note that even though pshufd isn't directly used in these tests, it gets exercised because we combine some of the half shuffles into a pshufd first, and then merge them.
	Differential Revision: http://reviews.llvm.org/D4291
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211890 91177308-0d34-0410-b5e6-96231b3b80d8
*	[x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are trivially redundant.	Chandler Carruth	2014-06-27
	This fixes several cases in the new vector shuffle lowering algorithm which would generate redundant shuffle instructions for the sake of simplicity. I'm also deleting a testcase which was somewhat ridiculous. It was checking for a bug in 2007 about incorrectly transforming shuffles by looking for the string "-86" in the output of a pretty substantial function. This test case doesn't seem to have any value at this point.
	Differential Revision: http://reviews.llvm.org/D4240
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211889 91177308-0d34-0410-b5e6-96231b3b80d8
*	[x86] Begin a significant overhaul of how vector lowering is done in the x86 backend.	Chandler Carruth	2014-06-27
	This sketches out a new code path for vector lowering, hidden behind an off-by-default flag while it is under development. The fundamental idea behind the new code path is to aggressively break down the problem space in ways that ease selecting the odd set of instructions available on x86, and carefully avoid scalarizing code even when forced to use older ISAs. Notably, this starts off restricting itself to SSE2 and implements the complete vector shuffle and blend space for 128-bit vectors in SSE2 without scalarizing. The plan is to layer on top of this ISA extensions where we can bail out of the complex SSE2 lowering and opt for a cheaper, specialized instruction (or set of instructions). It also needs to be generalized to AVX and AVX512 vector widths.
	Currently, this does a decent but not perfect job for SSE2. There are some specific shortcomings that I plan to address:
	- We need a peephole combine to fold together shuffles where possible. There are cases where a previous shuffle could be modified slightly to arrange for elements to be in the correct position and a later shuffle eliminated. Doing this eagerly added quite a bit of complexity, and so my plan is to combine away these redundancies afterward.
	- There are a lot more clever ways to use unpck and pack that need to be added. This is essential for real world shuffles as it turns out...
	Once SSE2 is polished a bit I should be able to get interesting numbers on performance improvements on benchmarks conducive to vectorization. All of this will be off by default until it is functionally equivalent of course.
	Differential Revision: http://reviews.llvm.org/D4225
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211888 91177308-0d34-0410-b5e6-96231b3b80d8
*	Added instruction combine to transform few more negative values addition to subtraction (Part 3)	Dinesh Dwivedi	2014-06-27
	This patch enables transforms for (x + (~(y | c) + 1)) --> x - (y | c) if c is odd
	Differential Revision: http://reviews.llvm.org/D4210
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211881 91177308-0d34-0410-b5e6-96231b3b80d8
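	The rewrite rests on two's-complement negation: ~v + 1 equals -v in wrapping arithmetic, so adding it is the same as subtracting v. A minimal sketch that checks this on 32-bit values follows; the function names and the 32-bit width are assumptions for illustration, and the sketch does not model the pattern-matching conditions (such as the parity of c) that the patch uses.

```python
MASK = 0xFFFFFFFF  # model 32-bit wrapping integer arithmetic


def add_negated(x: int, y: int, c: int) -> int:
    """Evaluate x + (~(y | c) + 1) with 32-bit wraparound."""
    v = (y | c) & MASK
    return (x + ((~v & MASK) + 1)) & MASK


def sub_form(x: int, y: int, c: int) -> int:
    """Evaluate x - (y | c) with 32-bit wraparound."""
    return (x - ((y | c) & MASK)) & MASK


# Spot-check the equivalence on a few operands, including edge values.
for x in (0, 1, 12345, MASK):
    for y in (0, 7, 0x80000000, MASK):
        for c in (1, 2, 3, 0x7FFFFFFF):
            assert add_negated(x, y, c) == sub_form(x, y, c)
print("add-of-negation and subtraction agree on all sampled inputs")
```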
*	GlobalOpt: Fix constantfold-initializers.ll test	David Majnemer	2014-06-27
	The test added in r211762 was sloppy: the correct initializer wasn't added to @llvm.global_ctors. Spotted by Pasi Parviainen!
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211879 91177308-0d34-0410-b5e6-96231b3b80d8
*	Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."""	David Blaikie	2014-06-27
	Reverting this again, didn't mean to commit it - while r211872 fixes one of the issues here, there are still others to figure out and address. This reverts commit r211871.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211873 91177308-0d34-0410-b5e6-96231b3b80d8
*	ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted.	David Blaikie	2014-06-27
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211872 91177308-0d34-0410-b5e6-96231b3b80d8
*	Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""	David Blaikie	2014-06-27
	This reverts commit r211724.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211871 91177308-0d34-0410-b5e6-96231b3b80d8
*	MachineScheduler: add some book-keeping to fix an assert.	Andrew Trick	2014-06-27
	Fix for Bug 20057 - Assertion failed in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"'
	Thanks to Chad for the test case.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211865 91177308-0d34-0410-b5e6-96231b3b80d8
*	R600: Add some testcases for promote alloca pass.	Matt Arsenault	2014-06-27
	More complicated GEPs are skipped. Add some tests to actually stress this skipping.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211859 91177308-0d34-0410-b5e6-96231b3b80d8
*	[X86] AVX512: Add vbroadcasti*	Adam Nemet	2014-06-27
	For now I used a separate template for these sub-vector/tuple broadcasts rather than sharing the mem variants with avx512_int_broadcast_rm. <rdar://problem/17402869>
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211828 91177308-0d34-0410-b5e6-96231b3b80d8
*	[StackMaps] Enable patchpoint liveness analysis per default.	Juergen Ributzka	2014-06-26
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211817 91177308-0d34-0410-b5e6-96231b3b80d8
*	[Stackmaps] Remove the liveness calculation for stackmap intrinsics.	Juergen Ributzka	2014-06-26
	There is no need to calculate the liveness information for stackmaps. The liveness information is still available for the patchpoint intrinsic and that is also the intended usage model. Related to <rdar://problem/17473725>
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211816 91177308-0d34-0410-b5e6-96231b3b80d8
*	GVN: Preserve invariant.load metadata	Arnold Schwaighofer	2014-06-26
	If both instructions to be replaced are marked invariant the resulting instruction is invariant. rdar://13358910 Fix by Erik Eckstein!
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211801 91177308-0d34-0410-b5e6-96231b3b80d8
*	R600/SI: Add FP mode bits to binary.	Matt Arsenault	2014-06-26
	The default rounding mode to initialize the mode register needs to be reported to the runtime. Fill in other bits a kernel may be interested in setting for future use.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211791 91177308-0d34-0410-b5e6-96231b3b80d8
*	Added parsing co-processor names starting with "cr"	Renato Golin	2014-06-26
	Additional GAS-compliant names for coprocessor registers are enabled for all instructions with parameter MCK_CoprocReg: LDC,LDC2,STC,STC2,CDP,CDP2,MCR,MCR2,MCRR,MCRR2,MRC,MRC2,MRRC,MRRC2
	Patch by Andrey Kuharev.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211776 91177308-0d34-0410-b5e6-96231b3b80d8
*	[X86] Improve the selection of SSE3/AVX addsub instructions.	Andrea Di Biagio	2014-06-26
	This patch teaches the backend how to canonicalize shuffle vectors according to the rule:
	- (shuffle (FADD A, B), (FSUB A, B), Mask) -> (shuffle (FSUB A, -B), (FADD A, -B), Mask)
	Where 'Mask' is:
	  <0,5,2,7> ;; for v4f32 and v4f64 shuffles.
	  <0,3> ;; for v2f64 shuffles.
	  <0,9,2,11,4,13,6,15> ;; for v8f32 shuffles.
	In general, ISel only knows how to pattern-match a canonical 'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction. This new rule allows converting a non-canonical dag sequence into a canonical one that will be matched by a single ADDSUB at ISel stage. The idea of converting a non-canonical ADDSUB into a canonical one by swapping the first two operands of the shuffle, and then negating the second operand of the FADD and FSUB, was originally proposed by Hal Finkel.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211771 91177308-0d34-0410-b5e6-96231b3b80d8
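	The canonicalization rule can be checked on plain lists. The sketch below models the v4f32 case with mask <0,5,2,7>; all helper names are hypothetical, and `addsub` only models the documented lane behaviour of ADDSUBPS (subtract in even lanes, add in odd lanes), not LLVM's DAG nodes.

```python
def fadd(a, b):
    return [x + y for x, y in zip(a, b)]


def fsub(a, b):
    return [x - y for x, y in zip(a, b)]


def fneg(a):
    return [-x for x in a]


def shuffle(v1, v2, mask):
    """Two-input shuffle: indices < len(v1) pick from v1, the rest from v2."""
    both = v1 + v2
    return [both[i] for i in mask]


def addsub(a, b):
    """Lane model of ADDSUB: subtract in even lanes, add in odd lanes."""
    return [x - y if i % 2 == 0 else x + y
            for i, (x, y) in enumerate(zip(a, b))]


A = [1.0, 2.0, 3.0, 4.0]
B = [10.0, 20.0, 30.0, 40.0]
MASK = [0, 5, 2, 7]

non_canonical = shuffle(fadd(A, B), fsub(A, B), MASK)
canonical = shuffle(fsub(A, fneg(B)), fadd(A, fneg(B)), MASK)
single_instruction = addsub(A, fneg(B))

print(non_canonical, canonical, single_instruction)
```

	All three lists are equal here, which is the property the rewrite relies on: after swapping the operands and negating B, the sequence has the shape that a single ADDSUB computes.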
*	This patch removed duplicate code for matching patterns which are now handled in SimplifyUsingDistributiveLaws() (after r211261)	Dinesh Dwivedi	2014-06-26
	Differential Revision: http://reviews.llvm.org/D4253
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211768 91177308-0d34-0410-b5e6-96231b3b80d8
*	Added instruction combine to transform few more negative values addition to subtraction (Part 2)	Dinesh Dwivedi	2014-06-26
	This patch enables transforms for (x + (~(y | c) + 1)) --> x - (y | c) if c is even
	Differential Revision: http://reviews.llvm.org/D4209
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211765 91177308-0d34-0410-b5e6-96231b3b80d8
*	GlobalOpt: Don't optimize thread_local for initializers	David Majnemer	2014-06-26
	Folding a reference to a thread_local variable into another global variable's initializer is very problematic: there is no relocation that exists to represent such an access.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211762 91177308-0d34-0410-b5e6-96231b3b80d8
*	R600: Fix vector FMA	Matt Arsenault	2014-06-26
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211757 91177308-0d34-0410-b5e6-96231b3b80d8
*	Don't build switch tables for dllimport and TLS variables in GEPs	Hans Wennborg	2014-06-26
	This is a follow-up to r211331, which failed to notice that we were returning early from ValidLookupTableConstant for GEPs.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211753 91177308-0d34-0410-b5e6-96231b3b80d8
*	[X86] AVX512: Fix asm syntax for packed vcmp	Adam Nemet	2014-06-26
	The *_alt defs for vcmp are used by the InstParser (the asm string in the main def is used by the InstPrinter). The former was accepting vector registers as destination rather than mask registers.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211750 91177308-0d34-0410-b5e6-96231b3b80d8
*	[FastISel][X86] Only fold the cmp into the select when both instructions are in the same basic block.	Juergen Ributzka	2014-06-25
	If the cmp is in a different basic block, then it is possible that not all operands of that compare have defined registers. This can happen when one of the operands to the cmp is a load and the load gets folded into the cmp. In this case FastISel will skip the load instruction and the vreg is never defined.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211730 91177308-0d34-0410-b5e6-96231b3b80d8
*	Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."	David Blaikie	2014-06-25
	This reverts commit r211723. Breaks the ASan/compiler-rt build... guess I didn't test very far at all :/.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211724 91177308-0d34-0410-b5e6-96231b3b80d8
*	PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.	David Blaikie	2014-06-25
	This situation does bad things when inlined, so I've fixed Clang not to produce inlinable call sites without locations when the caller has debug info (in the one case where I could find that this occurred). This updates the PR20038 test case to be what clang now produces, and readds the assertion that had to be removed due to this bug. I've also beefed up the debug info verifier to help diagnose these issues in the future, and I hope to add checks to the inliner to just assert-fail if it encounters this situation. If, in the future, we decide we have to cope with this situation, the right thing to do is probably to just remove all the DebugLocs from the inlined instructions.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211723 91177308-0d34-0410-b5e6-96231b3b80d8
*	Add Rpass-missed and Rpass-analysis reports to the loop vectorizer.	Tyler Nowicki	2014-06-25
	The remarks give the vector width of vectorized loops and a brief analysis of loops that fail to be vectorized. For example, an analysis will be generated for loops containing control flow that cannot be simplified to a select. The optimization remarks also give the debug location of expressions that cannot be vectorized, for example the location of an unvectorizable call.
	Reviewed by: Arnold Schwaighofer
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211721 91177308-0d34-0410-b5e6-96231b3b80d8
*	[X86] Always prefer to lower a VECTOR_SHUFFLE into a BLENDI instead of SHUFP (or VPERM2X128).	Andrea Di Biagio	2014-06-25
	This patch teaches method 'LowerVECTOR_SHUFFLE' to give higher precedence to the check for 'isBlendMask'; the idea is that, when possible, we should first check whether a shuffle performs a blend and, if so, try to lower it into a BLENDI instead of selecting a SHUFP or (worse) a VPERM2X128.
	In general:
	- AVX VBLENDPS/D always have better latency and throughput than VPERM2F128;
	- BLENDPS/D instructions tend to always have better 'reciprocal throughput' than the equivalent SHUFPS/D;
	- Both BLENDPS/D and SHUFPS/D are often decoded into the same number of m-ops; however, a m-op obtained from a BLENDPS/D can be scheduled to more than one execution port.
	This patch:
	- Moves the check for 'isBlendMask' immediately before the check for 'isSHUFPMask' within method 'LowerVECTOR_SHUFFLE';
	- Updates existing tests for sse/avx shuffle/blend instructions to verify that we select (v)blendps/d when possible (instead of (v)shufps/d or vperm2f128).
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211720 91177308-0d34-0410-b5e6-96231b3b80d8
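	As a rough illustration of what "a shuffle performs a blend" means: every result lane i takes lane i of either the first or the second input, never a different lane. The sketch below is an assumption-laden model, not LLVM's 'isBlendMask'; the function names are hypothetical, and -1 is used here to stand for an undefined mask element.

```python
def is_blend_mask(mask):
    """True if lane i selects element i of input 1 (index i) or input 2 (index i + n)."""
    n = len(mask)
    return all(m < 0 or m == i or m == i + n for i, m in enumerate(mask))


def blend_immediate(mask):
    """Build a blend immediate: bit i is set when lane i comes from the second input."""
    n = len(mask)
    return sum(1 << i for i, m in enumerate(mask) if m == i + n)


# <0,5,2,7> keeps every element in its own lane, so it is a blend.
print(is_blend_mask([0, 5, 2, 7]), bin(blend_immediate([0, 5, 2, 7])))
# <0,1,4,5> moves elements across lanes, so it needs a real shuffle.
print(is_blend_mask([0, 1, 4, 5]))
```

	Checking this property before the SHUFP check is what lets a mask that qualifies for both forms be lowered as the cheaper blend.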
*	Add some test files for r211710.	Eli Bendersky	2014-06-25
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211711 91177308-0d34-0410-b5e6-96231b3b80d8