summaryrefslogtreecommitdiff
path: root/lib/Target/X86
Commit message (Collapse)AuthorAge
...
* X86: Add a description for AMD bdver3 aka Steamroller.Benjamin Kramer2013-11-04
| | | | | | This is just bdver2 + FSGSBase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193984 91177308-0d34-0410-b5e6-96231b3b80d8
* AVX-512: added VPCONFLICT instruction and intrinsics,Elena Demikhovsky2013-11-03
| | | | | | | added EVEX_KZ to tablegen git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193959 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix PR17764Michael Liao2013-11-02
| | | | | | | | | - When selecting BLEND from vselect, the operands need swapping as due to the difference between vselect and SSE/AVX's BLEND insn git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193900 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix unused variable warnings.Dan Gohman2013-10-31
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193823 91177308-0d34-0410-b5e6-96231b3b80d8
* Add new calling convention for WebKit Java Script.Andrew Trick2013-10-31
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193812 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for stack map generation in the X86 backend.Andrew Trick2013-10-31
| | | | | | Originally implemented by Lang Hames. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193811 91177308-0d34-0410-b5e6-96231b3b80d8
* whitespaceAndrew Trick2013-10-31
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193765 91177308-0d34-0410-b5e6-96231b3b80d8
* Add AVX512 unmasked integer broadcast intrinsics and support.Cameron McInally2013-10-31
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193748 91177308-0d34-0410-b5e6-96231b3b80d8
* AVX-512: Implemented CMOV for 512-bit vectorsElena Demikhovsky2013-10-31
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193747 91177308-0d34-0410-b5e6-96231b3b80d8
* This commit adds some (but not all) of the x86-64 relocations that are notTom Roeder2013-10-30
| | | | | | | currently supported in the ELF object writer, along with a simple test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193709 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs ↵Juergen Ributzka2013-10-30
| | | | | | | | splitting too." Now Hexagon and SystemZ are not happy with it :-( git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193677 91177308-0d34-0410-b5e6-96231b3b80d8
* SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.Juergen Ributzka2013-10-30
| | | | | | | | | | | | | | | | | | | | The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask has usually the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193676 91177308-0d34-0410-b5e6-96231b3b80d8
* Move getSymbol to TargetLoweringObjectFile.Rafael Espindola2013-10-29
| | | | | | This allows constructing a Mangler with just a TargetMachine. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193630 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a helper getSymbol to AsmPrinter.Rafael Espindola2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193627 91177308-0d34-0410-b5e6-96231b3b80d8
* The asm printer has a mangler. Don't keep a second pointer to it.Rafael Espindola2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193616 91177308-0d34-0410-b5e6-96231b3b80d8
* AVX-512: PMIN/PMAX intrinsics and patternsElena Demikhovsky2013-10-27
| | | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193497 91177308-0d34-0410-b5e6-96231b3b80d8
* [X86][AVX512] Add patterns that match the AVX512 floating point register ↵Quentin Colombet2013-10-25
| | | | | | | | | vbroadcast intrinsics. Patch by Cameron McInally <cameron.mcinally@nyu.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193422 91177308-0d34-0410-b5e6-96231b3b80d8
* [X86][AVX512] Add patterns that match the AVX512 floating point vbroadcast ↵Quentin Colombet2013-10-25
| | | | | | | | | intrinsics. Patch by Cameron McInally <cameron.mcinally@nyu.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193421 91177308-0d34-0410-b5e6-96231b3b80d8
* Optimize concat_vectors(X, undef) -> scalar_to_vector(X).Nadav Rotem2013-10-25
| | | | | | | | | This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193393 91177308-0d34-0410-b5e6-96231b3b80d8
* AVX-512: added VCVTPH2PS, VCVTPS2PH with intrinsicsElena Demikhovsky2013-10-24
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193312 91177308-0d34-0410-b5e6-96231b3b80d8
* (this is a corrected patch)Yaron Keren2013-10-23
| | | | | | | | | | | | | | | | Calling _chkstk is required on ELF as well as COFF on Windows. Without _chkstk, functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows OS (both Windows target and MingW target) but not Mach-O object format: Looks like macho environment was used to build some EFI code. Credits to Andrew MacPherson. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193289 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "Calling _chkstk is required on ELF as well as COFF on Windows. ↵Rafael Espindola2013-10-23
| | | | | | | | | | Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows." This reverts commit r193263. It is causing CodeGen/X86/mingw-alloca.ll to fail. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193275 91177308-0d34-0410-b5e6-96231b3b80d8
* X86: Custom lower sext v16i8 to v16i16, and the corresponding truncate.Benjamin Kramer2013-10-23
| | | | | | Also update the cost model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193270 91177308-0d34-0410-b5e6-96231b3b80d8
* Calling _chkstk is required on ELF as well as COFF on Windows. Yaron Keren2013-10-23
| | | | | | | | | | | | Without _chkstk functions requiring large stack crash in initialization code. Previous code tested for COFF format but not Mach-O and this patch modifies the code to test for Windows. Credits to Andrew MacPherson. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193263 91177308-0d34-0410-b5e6-96231b3b80d8
* X86: Custom lower zext v16i8 to v16i16.Benjamin Kramer2013-10-23
| | | | | | | | | | | | | | | | | On sandy bridge (PR17654) we now get vpxor %xmm1, %xmm1, %xmm1 vpunpckhbw %xmm1, %xmm0, %xmm2 vpunpcklbw %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm2, %ymm0, %ymm0 On haswell it's a simple vpmovzxbw %xmm0, %ymm0 There is a maze of duplicated and dead transforms and patterns in this area. Remove the dead custom lowering of zext v8i16 to v8i32, that's already handled by LowerAVXExtend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193262 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix PR17631Michael Liao2013-10-23
| | | | | | | | | | | - Skip instructions added in prolog. For specific targets, prolog may insert helper function calls (e.g. _chkstk will be called when there're more than 4K bytes allocated on stack). However, these helpers don't use/def YMM/XMM registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193261 91177308-0d34-0410-b5e6-96231b3b80d8
* X86: Make concat_vectors combine a bit more conservative.Jim Grosbach2013-10-23
| | | | | | Per Nadav's review comments for r192866. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193252 91177308-0d34-0410-b5e6-96231b3b80d8
* [X86][FastISel] Add a comment to help understanding changes made in r192636.Quentin Colombet2013-10-22
| | | | | | | <rdar://problem/15192473> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193199 91177308-0d34-0410-b5e6-96231b3b80d8
* AVX-512: aligned / unaligned load and store for 512-bit integer vectors.Elena Demikhovsky2013-10-22
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193156 91177308-0d34-0410-b5e6-96231b3b80d8
* Replace (V)MOVZDI2PDIrr/rm instructions with patterns that select ↵Craig Topper2013-10-22
| | | | | | (V)MOVDI2PDIrr/rm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193146 91177308-0d34-0410-b5e6-96231b3b80d8
* X86 vector element shift-by-immediate instructions take i8 immediates. MakeLang Hames2013-10-21
| | | | | | | | | | | | | | | | the instruction defenitions and ISEL reflect this. Prior to this patch these instructions took an i32i8imm, and the high bits were dropped during encoding. This led to incorrect behavior for shifts by immediates higher than 255. This patch fixes that issue by detecting large immediate shifts and returning constant zero (for logical shifts) or capping the shift amount at an encodable value (for arithmetic shifts). Fixes <rdar://problem/14968098> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193096 91177308-0d34-0410-b5e6-96231b3b80d8
* AVX-512: MUL operation lowering for v8i64Elena Demikhovsky2013-10-21
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193083 91177308-0d34-0410-b5e6-96231b3b80d8
* Mark some command line flags as hiddenNadav Rotem2013-10-18
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193013 91177308-0d34-0410-b5e6-96231b3b80d8
* MC asm parser: allow ?'s in symbol names, and handle @'s in names in MS asmHans Wennborg2013-10-18
| | | | | | | | | | | | | | | | | | | | This is another (final?) stab at making us able to parse our own asm output on Windows. Symbols on Windows often contain @'s and ?'s in their names. Our asm parser didn't like this. ?'s were not allowed, and @'s were intepreted as trying to reference PLT/GOT/etc. We can't just add quotes around the bad names, since e.g. for MinGW, we use gas to assemble, and it doesn't like quotes in some places (notably in .def directives). This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS assembly. Differential Revision: http://llvm-reviews.chandlerc.com/D1978 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193000 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "Re-commit r192758 - MC: quote tricky symbol names in asm output"Hans Wennborg2013-10-18
| | | | | | | | | | | | | | | | | This caused the clang-native-mingw32-win7 buildbot to break. The assembler was complaining about the following lines that were showing up in the asm for CrashRecoveryContext.cpp: movl $"__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4", 4(%eax) calll "_AddVectoredExceptionHandler@8" .def "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4"; "__ZL16ExceptionHandlerP19_EXCEPTION_POINTERS@4": calll "_RemoveVectoredExceptionHandler@4" Reverting for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192940 91177308-0d34-0410-b5e6-96231b3b80d8
* x86: Move bitcasts outside concat_vector.Jim Grosbach2013-10-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the following: typedef unsigned short ushort4U __attribute__((ext_vector_type(4), aligned(2))); typedef unsigned short ushort4 __attribute__((ext_vector_type(4))); typedef unsigned short ushort8 __attribute__((ext_vector_type(8))); typedef int int4 __attribute__((ext_vector_type(4))); int4 __bbase_cvt_int(ushort4 v) { ushort8 a; a.lo = v; return _mm_cvtepu16_epi32(a); } This generates the, not unreasonable, IR: define <4 x i32> @foo0(double %v.coerce) nounwind ssp { %tmp = bitcast double %v.coerce to <4 x i16> %tmp1 = shufflevector <4 x i16> %tmp, <4 x i16> undef, <8 x i32> <i32 %0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef> %tmp2 = tail call <4 x i32> @llvm.x86.sse41.pmovzxwd(<8 x i16> %tmp1) ret <4 x i32> %tmp2 } The problem is when type legalization gets hold of the v4i16. It legalizes that by spilling to the stack, then doing a zero-extending load. Things go even more silly from there, ending up with something like: _foo0: movsd %xmm0, -8(%rsp) <== Spill to the stack. movq -8(%rsp), %xmm0 <== Reload it right back out. pmovzxwd %xmm0, %xmm1 <== Here's what we actually asked for. pblendw $1, %xmm1, %xmm0 <== We don't need this at all pmovzxwd %xmm0, %xmm0 <== We already did this ret The v8i8 to v8i16 zext intrinsic gives even worse results, with two table lookups via pshufb instructions(!!). To avoid all that, we can move the bitcasting until after we've formed the wider (legal) vector type. Then our normal codegen flows along nicely and we get the expected: _foo0: pmovzxwd %xmm0, %xmm0 ret rdar://15245794 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192866 91177308-0d34-0410-b5e6-96231b3b80d8
* Re-commit r192758 - MC: quote tricky symbol names in asm outputHans Wennborg2013-10-17
| | | | | | | | | | | | | | | | | | | | | | | | | | The reason this got reverted was that the @feat.00 symbol which was emitted for every TU became quoted, and on cygwin/mingw we use the gas assembler which couldn't handle the quotes. This commit fixes the problem by only emitting @feat.00 for win32, where we use clang -cc1as to assemble. gas would just drop this symbol anyway, so there is no loss there. With @feat.00 gone, there shouldn't be quoted symbols showing up on cygwin since it uses the Itanium ABI, which doesn't put these funny characters in symbols. > Because of win32 mangling, we produce symbol and section names with > funny characters in them, most notably @ characters. > > MC would choke on trying to parse its own assembly output. This patch addresses > that by: > > - Making @ trigger quoting of symbol names > - Also quote section names in the same way > - Just parse section names like other identifiers (to allow for quotes) > - Don't assume @ signifies a symbol variant if it is in a string. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192859 91177308-0d34-0410-b5e6-96231b3b80d8
* Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar,Yunzhong Gao2013-10-16
| | | | | | | | | | | bulldozer and piledriver. Support for the instruction itself seems to have already been added in r178040. Differential Revision: http://llvm-reviews.chandlerc.com/D1933 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192828 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a MCAsmInfoELF class and factor some code into it.Rafael Espindola2013-10-16
| | | | | | We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192760 91177308-0d34-0410-b5e6-96231b3b80d8
* Move .ident handling to MCStreamer.Rafael Espindola2013-10-16
| | | | | | | | No functionality change, but exposes the API so that codegen can use it too. Patch by Katya Romanova. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192757 91177308-0d34-0410-b5e6-96231b3b80d8
* Enable MI Sched for x86.Andrew Trick2013-10-15
| | | | | | | | | | | | | | | | | | | | | | | | | | This changes the SelectionDAG scheduling preference to source order. Soon, the SelectionDAG scheduler can be bypassed saving a nice chunk of compile time. Performance differences that result from this change are often a consequence of register coalescing. The register coalescer is far from perfect. Bugs can be filed for deficiencies. On x86 SandyBridge/Haswell, the source order schedule is often preserved, particularly for small blocks. Register pressure is generally improved over the SD scheduler's ILP mode. However, we are still able to handle large blocks that require latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also attempts to discover the critical path in single-block loops and adjust heuristics accordingly. The MI scheduler relies on the new machine model. This is currently unimplemented for AVX, so we may not be generating the best code yet. Unit tests are updated so they don't depend on SD scheduling heuristics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192750 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix PR17546Michael Liao2013-10-15
| | | | | | | | | | | | - Type of index used in extract_vector_elt or insert_vector_elt supposes to be TLI.getVectorIdxTy() which is pointer type on most targets. It'd better to truncate (or zero-extend in case it's changed later) it to mask element type to guarantee they are matching instead of asserting that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192722 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix PR16807Michael Liao2013-10-15
| | | | | | | | | | | | - Lower signed division by constant powers-of-2 to target-independent DAG operators instead of target-dependent ones to support them better on targets where vector types are legal but shift operators on that types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16> though <16 x i16> is a legal type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192721 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from ↵Craig Topper2013-10-15
| | | | | | x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192672 91177308-0d34-0410-b5e6-96231b3b80d8
* [X86][FastISel] During X86 fastisel, the address of indirect call was resolvedQuentin Colombet2013-10-14
| | | | | | | | | | | | | | | | through bitcast, ptrtoint, and inttoptr instructions. This is valid only if the related instructions are in that same basic block, otherwise we may reference variables that were not live accross basic blocks resulting in undefined virtual registers. The bug was exposed when both SDISel and FastISel were used within the same function, i.e., one basic block is issued with FastISel and another with SDISel, as demonstrated with the testcase. <rdar://problem/15192473> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192636 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix the ExecutionDepsFix pass to handle AVX instructions.Andrew Trick2013-10-14
| | | | | | | | | | | | | | | | | | | | | | | | This pass is needed to break false dependencies. Without it, unlucky register assignment can result in wild (5x) swings in performance. This pass was trying to handle AVX but not getting it right. AVX doesn't have partial register defs, it has unused register reads in which the high bits of a source operand are copied into the unused bits of the dest. Fixing this requires conservative liveness analysis. This is awkard because the pass already has its own pseudo-liveness. However, proper liveness is expensive, and we would like to use a generic utility to compute it. The fix only invokes liveness on-demand. It is rare to detect a case that needs undef-read dependence breaking, but when it happens, it can be needed many times within a very large block. I think the existing heuristic which uses a register window of 16 is too conservative for loop-carried false dependencies. If the loop is a reduction. The out-of-order engine may be able to execute several loop iterations in parallel. However, I'll leave this tuning exercise for next time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192635 91177308-0d34-0410-b5e6-96231b3b80d8
* whitespaceAndrew Trick2013-10-14
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192633 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert part of a fix from 2010, changes since then:Eric Christopher2013-10-14
| | | | | | | | | | | | a) x86-64 TLS has been documented b) the code path should use movq for the correct relocation to be generated. I've also added a fixme for the test case that we should improve the code generated, it should look something like is documented in the tls abi document. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192631 91177308-0d34-0410-b5e6-96231b3b80d8
* Reformat this routine slightly.Eric Christopher2013-10-14
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192630 91177308-0d34-0410-b5e6-96231b3b80d8
* Remove some extraneous whitespace.Eric Christopher2013-10-14
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192629 91177308-0d34-0410-b5e6-96231b3b80d8