author Tim Northover <tnorthover@apple.com> 2014-01-13 10:47:01 +0000
committer Tim Northover <tnorthover@apple.com> 2014-01-13 10:47:01 +0000
commit 54d3aa15376e74ed9e16b376dfd8bd63520a002d
tree c1b3697e3fb29a109cad51e24b7bb247b036b82b
parent 4addc6dd1f0f9eb6471bb924d0b6be9ed2aa8fd5
ReMat: fix overly cavalier attitude to sub-register indices

There are two attempted optimisations in reMaterializeTrivialDef, trying to
avoid promoting the size of a register too much when rematerializing.
Unfortunately, both appear to be flawed.

First, we see if the original register would have worked, but this is
inadequate. Consider:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0 = COPY v1:Q1 (v1, v2 are QQ)
    ...
    uses of v2

In this case even though v2 *could* be used directly as the output of
SOMETHING, this would set the wrong bits of the QQ register involved. The
correct rematerialization must be:

    v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ)
    ...
    uses of v2:Q1_Q2

For the second optimisation, if the correct remat is "v2:idx = SOMETHING"
then we can't necessarily expect v2 itself to be valid for SOMETHING, but we
do try to hunt for a class between v1 and v2 that works. Unfortunately, this
is also wrong:

    v1 = SOMETHING (v1 is QQ)
    v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ)
    ...
    uses of v2 as a QQQ

The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However
current logic would decide that v2 could be a QQ (no interest is taken in
later uses).

This patch, therefore, always accepts the widened register class without
trying to be clever. Generally there is no penalty to this (e.g. in the
common GR32 < GR64 case, expanding the width doesn't matter because it's not
like you were going to do anything else with the high bits of a GR32
register). It can increase register pressure in cases like the ARM VFP regs
though (multiple non-overlapping but equivalent subregisters). Hopefully this
situation is rare enough that it won't matter.

Unfortunately, no in-tree targets actually expose this as far as I can tell
(there are so few isAsCheapAsAMove instructions for it to trigger on) so I've
been unable to produce a test. It was exposed in our ARM64 SPEC tests though,
and I will be adding a test there that we should be able to contribute
soon(TM).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199091 91177308-0d34-0410-b5e6-96231b3b80d8
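
To make the first flaw concrete, here is a minimal toy model in plain C++ (no LLVM headers). It assumes a QQ value can be pictured as two independent lanes and that SOMETHING writes distinct values to each lane; `Lanes2`, `something`, `lane` and both `...V2Lane0` helpers are hypothetical names invented for this sketch, not part of the coalescer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model: a QQ value is two Q lanes, each lane pictured as one int.
using Lanes2 = std::array<int, 2>;

// Hypothetical stand-in for SOMETHING: defines both lanes of a QQ value.
Lanes2 something() { return {100, 101}; }

// Bounds-checked lane read, so a bad index fails loudly in the toy model.
int lane(const Lanes2 &V, std::size_t I) {
  assert(I < V.size() && "lane index out of range");
  return V[I];
}

// Original program:
//   v1 = SOMETHING
//   v2:Q0 = COPY v1:Q1
// Returns what a later use of v2:Q0 observes.
int originalV2Lane0() {
  Lanes2 V1 = something();
  Lanes2 V2 = {0, 0};
  V2[0] = lane(V1, 1); // lane Q1 of v1 lands in lane Q0 of v2
  return lane(V2, 0);
}

// Flawed rematerialization: because v2 is already a QQ, the sub-register
// index is dropped and SOMETHING defines v2 directly:
//   v2 = SOMETHING
int flawedV2Lane0() {
  Lanes2 V2 = something();
  return lane(V2, 0); // now observes lane Q0 of SOMETHING instead of Q1
}
```

In this model `originalV2Lane0()` and `flawedV2Lane0()` read different lanes of the same result, which is the "wrong bits" problem described above: the register class fits, but the lane mapping established by the COPY is lost.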
Diffstat (limited to 'lib')
-rw-r--r-- lib/CodeGen/RegisterCoalescer.cpp 28
1 files changed, 4 insertions, 24 deletions
diff --git a/lib/CodeGen/RegisterCoalescer.cpp b/lib/CodeGen/RegisterCoalescer.cpp
index dd86c1f010..f0a4b28306 100644
--- a/lib/CodeGen/RegisterCoalescer.cpp
+++ b/lib/CodeGen/RegisterCoalescer.cpp
@@ -816,31 +816,11 @@ bool RegisterCoalescer::reMaterializeTrivialDef(CoalescerPair &CP,
   }
 
   if (TargetRegisterInfo::isVirtualRegister(DstReg)) {
+    MRI->setRegClass(DstReg, CP.getNewRC());
+
     unsigned NewIdx = NewMI->getOperand(0).getSubReg();
-    const TargetRegisterClass *RCForInst;
-    if (NewIdx)
-      RCForInst = TRI->getMatchingSuperRegClass(MRI->getRegClass(DstReg), DefRC,
-                                                NewIdx);
-
-    if (MRI->constrainRegClass(DstReg, DefRC)) {
-      // The materialized instruction is quite capable of setting DstReg
-      // directly, but it may still have a now-trivial subregister index which
-      // we should clear.
-      NewMI->getOperand(0).setSubReg(0);
-    } else if (NewIdx && RCForInst) {
-      // The subreg index on NewMI is essential; we still have to make sure
-      // DstReg:idx is in a class that NewMI can use.
-      MRI->constrainRegClass(DstReg, RCForInst);
-    } else {
-      // DstReg is actually incompatible with NewMI, we have to move to a
-      // super-reg's class. This could come from a sequence like:
-      //     GR32 = MOV32r0
-      //     GR8 = COPY GR32:sub_8
-      MRI->setRegClass(DstReg, CP.getNewRC());
-      updateRegDefsUses(DstReg, DstReg, DstIdx);
-      NewMI->getOperand(0).setSubReg(
-        TRI->composeSubRegIndices(SrcIdx, DefMI->getOperand(0).getSubReg()));
-    }
+    updateRegDefsUses(DstReg, DstReg, DstIdx);
+    NewMI->getOperand(0).setSubReg(NewIdx);
   } else if (NewMI->getOperand(0).getReg() != CopyDstReg) {
     // The New instruction may be defining a sub-register of what's actually
     // been asked for. If so it must implicitly define the whole thing.
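
The shape of the patched branch can be sketched with a small stand-alone model, using the GR32/GR8 sequence from the removed comment. This is only an illustration under stated assumptions: `Operand`, `ToyFunction`, `widenAfterRemat` and `gr8Example` are hypothetical names, register classes and sub-register indices are plain strings, and operands of DstReg are assumed to carry no index beforehand, so "composing" DstIdx onto them reduces to assigning it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins; none of these types are LLVM's.
struct Operand {
  unsigned Reg;       // virtual register number
  std::string SubIdx; // empty string means "no sub-register index"
  bool IsDef;         // true for the operand defined by the remat'd instr
};

struct ToyFunction {
  std::vector<std::string> RegClass; // RegClass[Reg] is the class name
  std::vector<Operand> Operands;     // all operands, in program order
};

// Mirrors the order of operations in the patched branch:
//   1. always move DstReg to the widened class,
//   2. remember the index already on the rematerialized def,
//   3. rewrite every operand of DstReg to DstReg:DstIdx,
//   4. restore the def's own index.
void widenAfterRemat(ToyFunction &F, unsigned DstReg, const std::string &NewRC,
                     const std::string &DstIdx, std::size_t NewDefPos) {
  assert(DstReg < F.RegClass.size() && "unknown register");
  assert(NewDefPos < F.Operands.size() && F.Operands[NewDefPos].IsDef);

  F.RegClass[DstReg] = NewRC;
  const std::string NewIdx = F.Operands[NewDefPos].SubIdx;
  for (Operand &MO : F.Operands)
    if (MO.Reg == DstReg)
      MO.SubIdx = DstIdx; // simplification: no prior index to compose with
  F.Operands[NewDefPos].SubIdx = NewIdx;
}

// State right after rematerializing "GR8 = COPY GR32:sub_8" as MOV32r0:
//   %0 = MOV32r0   (def, no index; %0 is still GR8)
//   use %0
ToyFunction gr8Example() {
  ToyFunction F;
  F.RegClass.push_back("GR8");
  F.Operands.push_back(Operand{0, "", true});
  F.Operands.push_back(Operand{0, "", false});
  return F;
}
```

Calling `widenAfterRemat(F, 0, "GR32", "sub_8", 0)` on `gr8Example()` leaves register 0 in the wider class, keeps the def without an index, and turns the later use into a `sub_8` read. There is deliberately no attempt to keep the narrower class, which matches the "always accepts the widened register class" decision in the commit message.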