summaryrefslogtreecommitdiff
path: root/test/CodeGen/X86/fast-isel-x86-64.ll
Commit message (Collapse)AuthorAge
* Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier ↵Stephen Lin2013-07-13
| | | | | | | | | | | debugging. No functionality change and all tests pass after conversion. This was done with the following sed invocation to catch label lines demarking function boundaries: sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186258 91177308-0d34-0410-b5e6-96231b3b80d8
* Pad Short Functions for Intel AtomPreston Gurd2013-01-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171879 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert revision 171524. Original message:Nadav Rotem2013-01-05
| | | | | | | | | | | | | | | | | | | | | | | URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171603 91177308-0d34-0410-b5e6-96231b3b80d8
* The current Intel Atom microarchitecture has a feature whereby when a functionPreston Gurd2013-01-04
| | | | | | | | | | | | | | | | | | | returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171524 91177308-0d34-0410-b5e6-96231b3b80d8
* Make sure to put our sret argument into %rax on x86-64. Fixes PR13563!Nick Lewycky2012-10-02
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165063 91177308-0d34-0410-b5e6-96231b3b80d8
* Introduce 'UseSSEx' to force SSE legacy encodingMichael Liao2012-08-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is enabled. As the penalty of inter-mixing SSE and AVX instructions, we need prevent SSE legacy insn from being generated except explicitly specified through some intrinsics. For patterns supported by both SSE and AVX, so far, we force AVX insn will be tried first relying on AddedComplexity or position in td file. It's error-prone and introduces bugs accidentally. 'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only. For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories: * SSE insns with MMX operands * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, and etc.) * SSE4A insns. * MMX insns. * x87 insns added by SSE. 2 test cases are modified: - test/CodeGen/X86/fast-isel-x86-64.ll AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to complicated pattern and fast-isel fallback to materialize it from constant pool. - test/CodeGen/X86/widen_load-1.ll AVX code generation is different from SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162919 91177308-0d34-0410-b5e6-96231b3b80d8
* Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.Jakob Stoklund Olesen2011-11-29
| | | | | | | | | Like V_SET0, these instructions are expanded by ExpandPostRA to xorps / vxorps so they can participate in execution domain swizzling. This also makes the AVX variants redundant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145440 91177308-0d34-0410-b5e6-96231b3b80d8
* Disable expensive two-address optimizations at -O0. rdar://10453055Evan Cheng2011-11-16
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144806 91177308-0d34-0410-b5e6-96231b3b80d8
* FastISel: avoid function calls between the materialization of the constant ↵Ivan Krasin2011-08-18
| | | | | | and its use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137993 91177308-0d34-0410-b5e6-96231b3b80d8
* fast-isel sret calls, try 2. We actually do need to do something on x86-32. ↵Eli Friedman2011-04-28
| | | | | | rdar://problem/9303592 . git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130429 91177308-0d34-0410-b5e6-96231b3b80d8
* Actually revert r130348 correctly.Eli Friedman2011-04-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130418 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert r130348; causing buildbot issues on x86-32.Eli Friedman2011-04-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130412 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix a silly mistake in r130338.Eli Friedman2011-04-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130360 91177308-0d34-0410-b5e6-96231b3b80d8
* fast-isel sret. We actually don't need to do anything special on x86. :) ↵Eli Friedman2011-04-27
| | | | | | rdar://problem/9303592 . git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130348 91177308-0d34-0410-b5e6-96231b3b80d8
* Make the fast-isel code for literal 0.0 a bit shorter/faster, since 0.0 is ↵Eli Friedman2011-04-27
| | | | | | common. rdar://problem/9303592 . git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130338 91177308-0d34-0410-b5e6-96231b3b80d8
* Recommit the fix for rdar://9289512 with a couple tweaks toChris Lattner2011-04-22
| | | | | | | | | | | | | fix bugs exposed by the gcc dejagnu testsuite: 1. The load may actually be used by a dead instruction, which would cause an assert. 2. The load may not be used by the current chain of instructions, and we could move it past a side-effecting instruction. Change how we process uses to define the problem away. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130018 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert r1296656, "Fix rdar://9289512 - not folding load into compare at -O0...",Daniel Dunbar2011-04-21
| | | | | | which broke a couple GCC test suite tests at -O0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129914 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for FastISel'ing varargs calls.Eli Friedman2011-04-19
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129765 91177308-0d34-0410-b5e6-96231b3b80d8
* Implement support for x86 fastisel of small fixed-sized memcpys, which are ↵Chris Lattner2011-04-19
| | | | | | | | | | generated en-mass for C++ PODs. On my c++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129755 91177308-0d34-0410-b5e6-96231b3b80d8
* Implement support for fast isel of calls of i1 arguments, even though they ↵Chris Lattner2011-04-19
| | | | | | | | | | | | | are illegal, when they are a truncate from something else. This eliminates fully half of all the fastisel rejections on a test c++ file I'm working with, which should make a substantial improvement for -O0 compile of c++ code. This fixed rdar://9297003 - fast isel bails out on all functions taking bools git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129752 91177308-0d34-0410-b5e6-96231b3b80d8
* Handle i1/i8/i16 constant integer arguments to calls by prepromoting them.Chris Lattner2011-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Before we would bail out on i1 arguments all together, now we just bail on non-constant ones. Also, we used to emit extraneous code. e.g. test12 was: movb $0, %al movzbl %al, %edi callq _test12 and test13 was: movb $0, %al xorl %edi, %edi movb %al, 7(%rsp) callq _test13f Now we get: movl $0, %edi callq _test12 and: movl $0, %edi callq _test13f git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129751 91177308-0d34-0410-b5e6-96231b3b80d8
* be layout aware, to produce:Chris Lattner2011-04-19
| | | | | | | | | | | | | | | | | | | | testb $1, %al je LBB0_2 ## BB#1: ## %if.then movb $0, %al instead of: testb $1, %al jne LBB0_1 jmp LBB0_2 LBB0_1: ## %if.then movb $0, %al how 'bout that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129749 91177308-0d34-0410-b5e6-96231b3b80d8
* fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry,Chris Lattner2011-04-19
| | | | | | | a common cause of fast isel rejects on c++ code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129748 91177308-0d34-0410-b5e6-96231b3b80d8
* while we're at it, handle 'sdiv exact' of a power of 2 also,Chris Lattner2011-04-18
| | | | | | | this fixes a few rejects on c++ iterator loops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129694 91177308-0d34-0410-b5e6-96231b3b80d8
* fix rdar://9297011 - udiv by power of two causing fast-isel rejectsChris Lattner2011-04-18
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129693 91177308-0d34-0410-b5e6-96231b3b80d8
* Implement major new fastisel functionality: the matcher can now handle ↵Chris Lattner2011-04-18
| | | | | | | | | | | | | | | | | | | | | | | | | immediates with value constraints on them (when defined as ImmLeaf's). This is particularly important for X86-64, where almost all reg/imm instructions take a i64immSExt32 immediate operand, which has a value constraint. Before this patch we ended up iseling the examples into such amazing code as: movabsq $7, %rax imulq %rax, %rdi movq %rdi, %rax ret now we produce: imulq $7, %rdi, %rax ret This dramatically shrinks the generated code at -O0 on x86-64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129691 91177308-0d34-0410-b5e6-96231b3b80d8
* 1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.llChris Lattner2011-04-17
| | | | | | | | | | | | 2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts 3. teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend. 4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129666 91177308-0d34-0410-b5e6-96231b3b80d8
* fix an x86 fast isel issue where we'd completely give up on folding an addressChris Lattner2011-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when we have a global variable base an an index. Instead, just give up on folding the global variable. Before we'd geenrate: _test: ## @test ## BB#0: movq _rtx_length@GOTPCREL(%rip), %rax leaq (%rax), %rax addq %rdi, %rax movzbl (%rax), %eax ret now we generate: _test: ## @test ## BB#0: movq _rtx_length@GOTPCREL(%rip), %rax movzbl (%rax,%rdi), %eax ret The difference is even more significant when there is a scale involved. This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129664 91177308-0d34-0410-b5e6-96231b3b80d8
* fix an oversight which caused us to compile the testcase (and otherChris Lattner2011-04-17
| | | | | | | | | | | | | | | | | | | | | less trivial things) into a dummy lea. Before we generated: _test: ## @test movq _G@GOTPCREL(%rip), %rax leaq (%rax), %rax ret now we produce: _test: ## @test movq _G@GOTPCREL(%rip), %rax ret This is part of rdar://9289558 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129662 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix rdar://9289512 - not folding load into compare at -O0Chris Lattner2011-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like: cmpb $0, 52(%rax) je LBB4_2 instead of: movb 52(%rax), %cl cmpb $0, %cl je LBB4_2 This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129656 91177308-0d34-0410-b5e6-96231b3b80d8
* fix rdar://9289583 - fast isel should handle non-canonical commutative binopsChris Lattner2011-04-17
allowing us to fold the immediate into the 'and' in this case: int test1(int i) { return 8&i; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129653 91177308-0d34-0410-b5e6-96231b3b80d8