summaryrefslogtreecommitdiff
path: root/test
Commit message (Collapse)AuthorAge
* Legalize: Improve legalization of long vector extends.Jim Grosbach2013-10-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and the continue legalizing the rest of the operation normally. The result is that this operates as a new, more effecient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193727 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix CodeGen for unaligned loads with address spacesMatt Arsenault2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193721 91177308-0d34-0410-b5e6-96231b3b80d8
* Teach scalarrepl about address spacesMatt Arsenault2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193720 91177308-0d34-0410-b5e6-96231b3b80d8
* Produce .weak_def_can_be_hidden for some linkonce_odr valuesRafael Espindola2013-10-30
| | | | | | | | | | | | | | With this patch llvm produces a weak_def_can_be_hidden for linkonce_odr if they are also unnamed_addr or don't have their address taken. There is not a lot of documentation about .weak_def_can_be_hidden, but from the old discussion about linkonce_odr_auto_hide and the name of the directive this looks correct: these symbols can be hidden. Testing this with the ld64 in Xcode 5 linking clang reduces the number of exported symbols from 21053 to 19049. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193718 91177308-0d34-0410-b5e6-96231b3b80d8
* Add DebugInfo testcase for high_pc encoded as constant, fixed in r193555.Will Dietz2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193711 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix GVN creating bitcast between address spacesMatt Arsenault2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193710 91177308-0d34-0410-b5e6-96231b3b80d8
* This commit adds some (but not all) of the x86-64 relocations that are notTom Roeder2013-10-30
| | | | | | | currently supported in the ELF object writer, along with a simple test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193709 91177308-0d34-0410-b5e6-96231b3b80d8
* [ARM] NEON instructions were erroneously decoded from certain invalid encodingsArtyom Skrobov2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193705 91177308-0d34-0410-b5e6-96231b3b80d8
* R600: Custom lower f32 = uint_to_fp i64Tom Stellard2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193701 91177308-0d34-0410-b5e6-96231b3b80d8
* [mips][msa] Correct definition of bins[lr] and CHECK-DAG-ize related testsDaniel Sanders2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193695 91177308-0d34-0410-b5e6-96231b3b80d8
* [mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from ↵Daniel Sanders2013-10-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | normal IR (i.e. not intrinsics) Also corrected the definition of the intrinsics for these instructions (the result register is also the first operand), and added intrinsics for bsel and bseli to clang (they already existed in the backend). These four operations are mostly equivalent to bsel, and bseli (the difference is which operand is tied to the result). As a result some of the tests changed as described below. bitwise.ll: - bsel.v test adapted so that the mask is unknown at compile-time. This stops it emitting bmnzi.b instead of the intended bsel.v. - The bseli.b test now tests the right thing. Namely the case when one of the values is an uimm8, rather than when the condition is a uimm8 (which is covered by bmnzi.b) compare.ll: - bsel.v tests now (correctly) emits bmnz.v instead of bsel.v because this is the same operation (see MSA.txt). i8.ll - CHECK-DAG-ized test. - bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands because this is the same operation (see MSA.txt). - bseli.b still emits bseli.b though because the immediate makes it distinguishable from bmnzi.b. vec.ll: - CHECK-DAG-ized test. - bmz.v tests now (correctly) emits bmnz.v with swapped operands (see MSA.txt). - bsel.v tests now (correctly) emits bmnz.v with swapped operands (see MSA.txt). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193693 91177308-0d34-0410-b5e6-96231b3b80d8
* [AArch64] Add support for NEON scalar floating-point compare instructions.Chad Rosier2013-10-30
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193691 91177308-0d34-0410-b5e6-96231b3b80d8
* [mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. ↵Daniel Sanders2013-10-30
| | | | | | | | | | | | | | | | | | | not intrinsics) This required correcting the definition of the bins[lr]i intrinsics because the result is also the first operand. It also required removing the (arbitrary) check for 32-bit immediates in MipsSEDAGToDAGISel::selectVSplat(). Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d because the constant is legalized into a ConstantPool. Similar things can happen with binsri.d with more than 10 bits set in the mask. The resulting code when this happens is correct but not optimal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193687 91177308-0d34-0410-b5e6-96231b3b80d8
* [mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECTDaniel Sanders2013-10-30
| | | | | | | | | | | | | | | | | (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b). where $mask is a constant splat. This allows bitwise operations to make use of bsel. It's also a stepping stone towards matching bins[lr], and bins[lr]i from normal IR. Two sets of similar tests have been added in this commit. The bsel_* functions test the case where binsri cannot be used. The binsr_*_i functions will start to use the binsri instruction in the next commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193682 91177308-0d34-0410-b5e6-96231b3b80d8
* [mips][msa] Added support for matching splat.[bhw] from normal IR (i.e. not ↵Daniel Sanders2013-10-30
| | | | | | | | | | | intrinsics) splat.d is implemented but this subtest is currently disabled. This is because it is difficult to match the appropriate IR on MIPS32. There is a patch under review that should help with this so I hope to enable the subtest soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193680 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs ↵Juergen Ributzka2013-10-30
| | | | | | | | splitting too." Now Hexagon and SystemZ are not happy with it :-( git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193677 91177308-0d34-0410-b5e6-96231b3b80d8
* SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.Juergen Ributzka2013-10-30
| | | | | | | | | | | | | | | | | | | | The Type Legalizer recognizes that VSELECT needs to be split, because the type is to wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask has usually the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193676 91177308-0d34-0410-b5e6-96231b3b80d8
* Debug Info: instead of calling addToContextOwner which constructs the contextManman Ren2013-10-29
| | | | | | | | | | | | | | | | after the DIE creation, we construct the context first. Ensure that we create the context before we create a type so that we can add the newly created type to the parent. Remove last use of addToContextOwner now that it's not needed. We use createAndAddDIE to wrap around "new DIE(". Now all shareable DIEs should be added to their parents right after the creation. Reviewed off-list by Eric, Thanks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193657 91177308-0d34-0410-b5e6-96231b3b80d8
* [mips] Align the stack to 16-bytes for mfp64.Akira Hatanaka2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193641 91177308-0d34-0410-b5e6-96231b3b80d8
* Debug Info: clean up testing case.Manman Ren2013-10-29
| | | | | | | | | Add a tag before the name attribute for readability. Use CHECK-NEXT instead of CHECK-NOT followed by a CHECK. Add new lines to separate checking of different DIEs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193629 91177308-0d34-0410-b5e6-96231b3b80d8
* add test cases for frameaddr and returnaddr for aarch64Weiming Zhao2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193626 91177308-0d34-0410-b5e6-96231b3b80d8
* Support for microMIPS jump instructionsZoran Jovanovic2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193623 91177308-0d34-0410-b5e6-96231b3b80d8
* R600/SI: Add compute support for CI v2Tom Stellard2013-10-29
| | | | | | | | | v2: - Fix LDS size calculation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193621 91177308-0d34-0410-b5e6-96231b3b80d8
* R600: Expand vector FSQRT opsTom Stellard2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193620 91177308-0d34-0410-b5e6-96231b3b80d8
* Test cleanup for v8 instructionsBernard Ogden2013-10-29
| | | | | | | Add some missing tests, factor out a test not specific to v8 into its own file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193611 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM: Add subtarget feature for CRCBernard Ogden2013-10-29
| | | | | | | | Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend. Differential Revision: http://llvm-reviews.chandlerc.com/D2036 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193599 91177308-0d34-0410-b5e6-96231b3b80d8
* AArch64: add 'a' inline asm operand modifierTim Northover2013-10-29
| | | | | | | This is used in the Linux kernel, and effectively just means "print an address". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193593 91177308-0d34-0410-b5e6-96231b3b80d8
* Debug Info: instead of calling addToContextOwner which constructs the contextManman Ren2013-10-29
| | | | | | | | | | | | | after the DIE creation, we construct the context first. This touches creation of namespaces and global variables. The purpose is to handle all DIE creations similarly: constructs the context first, then creates the DIE and immediately adds the DIE to its parent. We use createAndAddDIE to wrap around "new DIE(". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193589 91177308-0d34-0410-b5e6-96231b3b80d8
* Add llvm/test/Transforms/SLPVectorizer/ARM/lit.local.cfg. Tests there ↵NAKAMURA Takumi2013-10-29
| | | | | | require ARM in targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193580 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix "existant" typosAlp Toker2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193579 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM cost model: Unaligned vectorized double stores are expensiveArnold Schwaighofer2013-10-29
| | | | | | | | | Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193574 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM cost model: Account for zero cost scalar SROA instructionsArnold Schwaighofer2013-10-29
| | | | | | | | | By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193573 91177308-0d34-0410-b5e6-96231b3b80d8
* Adding a workaround for __main linking with remote lli and Cygwin/MinGWAndrew Kaylor2013-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193570 91177308-0d34-0410-b5e6-96231b3b80d8
* Move the STT_FILE symbols out of the normal symbol table processing forJoerg Sonnenberger2013-10-29
| | | | | | | | ELF. They can overlap with the other symbols, e.g. if a source file "foo.c" contains a function "foo" with a static variable "c". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193569 91177308-0d34-0410-b5e6-96231b3b80d8
* Debug Info: use createAndAddDIE for newly-created Subprogram DIEs.Manman Ren2013-10-29
| | | | | | | | | | | | More patches will be submitted to convert "new DIE(" to use createAddAndDIE in DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where we have to decide between ref4 and ref_addr, because DIEs that can be shared across CU will be added to a CU already. Reviewed off-list by Eric. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193567 91177308-0d34-0410-b5e6-96231b3b80d8
* Renaming MCJIT .ir files to .ll and moving them to InputsAndrew Kaylor2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193562 91177308-0d34-0410-b5e6-96231b3b80d8
* lit: add missing substitutions for recently added toolsAlp Toker2013-10-28
| | | | | | | llvm-mcmarkup, obj2yaml and yaml2obj were missing from the substitutions list, causing the test suite to fail in a sandboxed environment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193559 91177308-0d34-0410-b5e6-96231b3b80d8
* Quote potential shell expansions found in testsAlp Toker2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193558 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193548 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193547 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193546 91177308-0d34-0410-b5e6-96231b3b80d8
* Standardizing lli's extra module command line optionAndrew Kaylor2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193544 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193539 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193538 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193537 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert another llc -filetype=obj test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193536 91177308-0d34-0410-b5e6-96231b3b80d8
* Return early from getUnconditionalBranchTargetOpValue if the branch target isLang Hames2013-10-28
| | | | | | | | | | | | | | | | | | | an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193535 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert a llc -filetype=obj test into a llvm-mc test.Rafael Espindola2013-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193534 91177308-0d34-0410-b5e6-96231b3b80d8
* [arm] Implement eabi_attribute, cpu, and fpu directives.Logan Chien2013-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit allows the ARM integrated assembler to parse and assemble the code with .eabi_attribute, .cpu, and .fpu directives. To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added. Besides, this commit also change the Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of .fpu directive. This commit changes the test cases: * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already cover the functionality. * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has be changed from 1 to 2, which is more precise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193524 91177308-0d34-0410-b5e6-96231b3b80d8
* [SystemZ] Set usaAA to trueRichard Sandiford2013-10-28
| | | | | | | | | | | | | | | | | useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here. The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that that optimization works top down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store). I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193521 91177308-0d34-0410-b5e6-96231b3b80d8