author: Tim Northover <tnorthover@apple.com> 2014-01-13 10:47:01 +0000
committer: Tim Northover <tnorthover@apple.com> 2014-01-13 10:47:01 +0000
commit: 54d3aa15376e74ed9e16b376dfd8bd63520a002d
tree: c1b3697e3fb29a109cad51e24b7bb247b036b82b /docs
parent: 4addc6dd1f0f9eb6471bb924d0b6be9ed2aa8fd5
ReMat: fix overly cavalier attitude to sub-register indices
There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed.

First, we see if the original register would have worked, but this is
inadequate. Consider:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
    ...
    uses of v2

In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:

    v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
    ...
    uses of v2:Q1_Q2

For the second optimisation, if the correct remat is "v2:idx = SOMETHING"
then we can't necessarily expect v2 itself to be valid for SOMETHING, but we
do try to hunt for a class between v1 and v2 that works. Unfortunately, this
is also wrong:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
    ...
    uses of v2 as a QQQ

The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However
current logic would decide that v2 could be a QQ (no interest is taken in
later uses).

This patch, therefore, always accepts the widened register class without
trying to be clever. Generally there is no penalty to this (e.g. in the
common GR32 < GR64 case, expanding the width doesn't matter because it's not
like you were going to do anything else with the high bits of a GR32
register). It can increase register pressure in cases like the ARM VFP regs
though (multiple non-overlapping but equivalent subregisters). Hopefully this
situation is rare enough that it won't matter.

Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199091 91177308-0d34-0410-b5e6-96231b3b80d8
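
The first flaw described above can be illustrated with a toy model. This is only a sketch, not LLVM code: registers are modelled as vectors of Q-sized lanes holding string labels, and the names `Lanes`, `something`, `defineAt`, `originalV2`, `naiveRematV2` and `widenedRematV2` are hypothetical helpers invented for the illustration. It assumes SOMETHING defines two consecutive lanes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model (assumption): a virtual register is a tuple of Q-sized lanes,
// each lane holding a label instead of real bits.
using Lanes = std::vector<std::string>;

// Hypothetical stand-in for SOMETHING: it defines two consecutive lanes.
Lanes something() { return {"a", "b"}; }

// Models a def of reg:Qn_Qm by writing `value` starting at lane `firstLane`.
void defineAt(Lanes &reg, std::size_t firstLane, const Lanes &value) {
  for (std::size_t i = 0; i < value.size(); ++i)
    reg.at(firstLane + i) = value[i];
}

// Original code: v1 = SOMETHING; v2:Q0 = COPY v1:Q1 (both QQ).
Lanes originalV2() {
  Lanes v1 = something();
  Lanes v2 = {"undef", "undef"};
  v2[0] = v1[1];
  return v2;
}

// Flawed remat: reuse v2 directly as the output of SOMETHING.
// The class fits, but lane Q0 now holds v1:Q0 rather than v1:Q1.
Lanes naiveRematV2() {
  Lanes v2 = {"undef", "undef"};
  defineAt(v2, 0, something());
  return v2;
}

// Widened remat: v2 becomes QQQ, v2:Q0_Q1 = SOMETHING, and every former
// use of v2 reads v2:Q1_Q2 instead.
Lanes widenedRematV2() {
  Lanes v2 = {"undef", "undef", "undef"};
  defineAt(v2, 0, something());
  return Lanes(v2.begin() + 1, v2.end());
}
```

In this model, `widenedRematV2()` yields the same lanes as `originalV2()`, whereas `naiveRematV2()` puts `a` where the uses expect `b`. The cost is the extra lane, which corresponds to the register-pressure concern mentioned for the ARM VFP registers.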
Diffstat (limited to 'docs')
0 files changed, 0 insertions, 0 deletions