author | Adam Nemet <anemet@apple.com> | 2014-04-18 19:44:16 +0000 |
---|---|---|
committer | Adam Nemet <anemet@apple.com> | 2014-04-18 19:44:16 +0000 |
commit | d290fa608fe090903f306c10d27a0e181fe6fb3b (patch) | |
tree | 0d6e0bb55db79ef042542587916798448223facf /test | |
parent | 842c27189a7a6c698d10da84d483627da1da0c1d (diff) | |
[X86] Improve buildFromShuffleMostly for AVX
For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors,
both the BUILD_VECTOR and its operands may need to be legalized in multiple
steps. Consider:
  (v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0), Constant<1>),
                       (extract_vector_elt %vreg0, Constant<2>),
                       (extract_vector_elt %vreg0, Constant<3>),
                       (extract_vector_elt %vreg0, Constant<4>),
                       (extract_vector_elt %vreg0, Constant<5>),
                       (extract_vector_elt %vreg0, Constant<6>),
                       (extract_vector_elt %vreg0, Constant<7>),
                       %vreg1))
a. We can't build a 256-bit vector efficiently, so we need to split it into
two 128-bit vectors and combine them with VINSERTX128.
b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) need to be
split into a VEXTRACTX128 and a further extract_vector_elt from the
resulting 128-bit vector.
c. The extract_vector_elt from b. is lowered into a shuffle to the first
element and a movss.
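
Steps a. and b. come down to index arithmetic: an element index of the 256-bit v8f32 splits into a 128-bit half and a lane inside that half. The following is a minimal sketch of that mapping only. `HalfLane` and `splitIndex` are hypothetical names chosen for illustration and are not part of LLVM.

```cpp
#include <cassert>

// Hypothetical helper, not LLVM code: models how an element index of a
// 256-bit v8f32 maps onto a 128-bit half plus a lane inside that half.
struct HalfLane {
  unsigned Half; // which 128-bit half holds the element (0 = low, 1 = high)
  unsigned Lane; // element position inside that v4f32 half
};

constexpr unsigned NumElts256 = 8; // elements in a v8f32
constexpr unsigned NumElts128 = 4; // elements in a v4f32

inline HalfLane splitIndex(unsigned Idx) {
  assert(Idx < NumElts256 && "index out of range for v8f32");
  return {Idx / NumElts128, Idx % NumElts128};
}
```

For the operand in step b., `splitIndex(7)` selects the high half and lane 3 within it, which corresponds to the 128-bit extract followed by a further extract_vector_elt.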
Depending on the order in which we legalize the BUILD_VECTOR and its
operands[1], buildFromShuffleMostly may be faced with:
  (v4f32 (BUILD_VECTOR (extract_vector_elt
                         (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                         Constant<0>),
                       (extract_vector_elt
                         (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                         Constant<0>),
                       (extract_vector_elt
                         (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                         Constant<0>),
                       %vreg1))
In order to figure out the underlying vectors and their identity, we need to
see through the shuffles.
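
What "seeing through the shuffles" means can be sketched with a toy model: follow the shuffle mask and the extract_subvector offset back to an element of the original register. This is an illustration under simplifying assumptions, not the code in buildFromShuffleMostly. The `Node`, `Origin`, and `resolve` names are hypothetical, and shuffles are assumed to have an undef second operand, as in the example above.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for SelectionDAG nodes (hypothetical, not LLVM types).
enum class Kind { Reg, ExtractSubvector, Shuffle };

struct Node {
  Kind K = Kind::Reg;
  int RegId = -1;            // Reg: virtual register number
  const Node *Src = nullptr; // ExtractSubvector / Shuffle: input vector
  unsigned Offset = 0;       // ExtractSubvector: first element taken
  std::vector<int> Mask;     // Shuffle: source lane per result lane, -1 = undef
};

struct Origin {
  int RegId; // underlying register, or -1 if the lane is undef
  int Elt;   // element index within that register, or -1 if undef
};

// Resolve (extract_vector_elt V, Idx) to the register element it reads.
inline Origin resolve(const Node *V, unsigned Idx) {
  while (V->K != Kind::Reg) {
    if (V->K == Kind::Shuffle) {
      assert(Idx < V->Mask.size() && "lane out of range");
      int M = V->Mask[Idx];
      // Undef lane, or a lane taken from the (undef) second operand.
      if (M < 0 || M >= static_cast<int>(V->Mask.size()))
        return {-1, -1};
      Idx = static_cast<unsigned>(M);
    } else {
      // ExtractSubvector: shift the index back into the wider source.
      Idx += V->Offset;
    }
    V = V->Src;
  }
  return {V->RegId, static_cast<int>(Idx)};
}
```

With a register node for %vreg0, an extract_subvector at offset 4, and a shuffle with mask <3,u,u,u>, resolving lane 0 yields element 7 of %vreg0, which identifies the third operand of the v4f32 BUILD_VECTOR above.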
[1] Note that the order in which operations and their operands are legalized is
only guaranteed in the first iteration of LegalizeDAG.
Fixes <rdar://problem/16296956>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206634 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'test')
-rw-r--r-- | test/CodeGen/X86/vec_shuffle-41.ll | 21 |
1 files changed, 21 insertions, 0 deletions
diff --git a/test/CodeGen/X86/vec_shuffle-41.ll b/test/CodeGen/X86/vec_shuffle-41.ll
new file mode 100644
index 0000000000..28fdd2f5ce
--- /dev/null
+++ b/test/CodeGen/X86/vec_shuffle-41.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx | FileCheck %s
+
+; Use buildFromShuffleMostly which allows this to be generated as two 128-bit
+; shuffles and an insert.
+
+; This is the (somewhat questionable) LLVM IR that is generated for:
+;    x8.s0123456 = x8.s1234567;  // x8 is a <8 x float> type
+;    x8.s7 = f;                  // f is float
+
+
+define <8 x float> @test1(<8 x float> %a, float %b) {
+; CHECK-LABEL: test1:
+; CHECK: vinsertps
+; CHECK-NOT: vinsertps
+entry:
+  %shift = shufflevector <8 x float> %a, <8 x float> undef, <7 x i32> <i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+  %extend = shufflevector <7 x float> %shift, <7 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 undef>
+  %insert = insertelement <8 x float> %extend, float %b, i32 7
+
+  ret <8 x float> %insert
+}
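
As a scalar reference for what @test1 computes, the two shufflevector instructions and the insertelement can be modelled as follows. This is a sketch for reading the test, not LLVM code. `Float8` and `shiftDownAndInsert` are hypothetical names.

```cpp
#include <array>
#include <cassert>

using Float8 = std::array<float, 8>;

// Scalar model of @test1: lanes 0..6 of the result take lanes 1..7 of A
// (the two shufflevectors), and lane 7 takes B (the insertelement).
inline Float8 shiftDownAndInsert(const Float8 &A, float B) {
  Float8 R{};
  for (unsigned I = 0; I + 1 < R.size(); ++I)
    R[I] = A[I + 1];
  R[7] = B;
  return R;
}
```

The low four result lanes read elements 1 to 4 of the input, so they straddle both 128-bit halves of %a, while the high four lanes read elements 5 to 7 plus the scalar %b.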