author | David Sehr <sehr@google.com> | 2013-03-05 00:02:23 +0000 |
---|---|---|
committer | David Sehr <sehr@google.com> | 2013-03-05 00:02:23 +0000 |
commit | 6c4265a541c9e431961113c1a5d92fb4628bfe13 (patch) | |
tree | b98b6adf1f9527b4ca89e194005765c3cc0b0ccf /test/MC/X86/AlignedBundling | |
parent | 880e8c0ad41345f353b819c51092baa8f05e1950 (diff) | |
The current X86 NOP padding uses one long NOP followed by the remainder in
one-byte NOPs. If the processor actually executes those NOPs, as it sometimes
does with aligned bundling, this can have a performance impact. In my
micro-benchmarks, run on a single machine, a 15-byte NOP followed by twelve
one-byte NOPs is about 20% worse than a 15-byte NOP followed by a 12-byte NOP.
This patch changes NOP emission to emit as many 15-byte NOPs (the maximum
length) as possible, followed by at most one shorter NOP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176464 91177308-0d34-0410-b5e6-96231b3b80d8
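The padding policy described in the message can be sketched as follows. This is a minimal illustration in Python, not the code from the patch: the function name `split_nop_padding` and the constant `MAX_NOP_LEN` are hypothetical, and the sketch assumes only what the message states, namely that 15 bytes is the longest NOP and that at most one shorter NOP follows the 15-byte ones.

```python
MAX_NOP_LEN = 15  # longest single NOP, per the commit message


def split_nop_padding(count, max_len=MAX_NOP_LEN):
    """Return the NOP lengths used to fill `count` bytes of padding.

    Hypothetical helper: as many `max_len`-byte NOPs as fit,
    followed by at most one shorter NOP for the remainder.
    """
    if count < 0:
        raise ValueError("padding size must be non-negative")
    full, rest = divmod(count, max_len)
    lengths = [max_len] * full
    if rest:
        lengths.append(rest)
    return lengths
```

Under these assumptions, 27 bytes of padding split into a 15-byte and a 12-byte NOP, and 31 bytes split into two 15-byte NOPs and a 1-byte NOP, which are the two cases the new test exercises.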
Diffstat (limited to 'test/MC/X86/AlignedBundling')
-rw-r--r-- | test/MC/X86/AlignedBundling/long-nop-pad.s | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/test/MC/X86/AlignedBundling/long-nop-pad.s b/test/MC/X86/AlignedBundling/long-nop-pad.s
new file mode 100644
index 0000000000..ea33e2889b
--- /dev/null
+++ b/test/MC/X86/AlignedBundling/long-nop-pad.s
@@ -0,0 +1,27 @@
+# RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o - \
+# RUN: | llvm-objdump -disassemble -no-show-raw-insn - | FileCheck %s
+
+# Test that long nops are generated for padding where possible.
+
+ .text
+foo:
+ .bundle_align_mode 5
+
+# This callq instruction is 5 bytes long
+ .bundle_lock align_to_end
+ callq bar
+ .bundle_unlock
+# To align this group to a bundle end, we need a 15-byte NOP and a 12-byte NOP.
+# CHECK: 0: nop
+# CHECK-NEXT: f: nop
+# CHECK-NEXT: 1b: callq
+
+# This push instruction is 1 byte long
+ .bundle_lock align_to_end
+ push %rax
+ .bundle_unlock
+# To align this group to a bundle end, we need two 15-byte NOPs, and a 1-byte.
+# CHECK: 20: nop
+# CHECK-NEXT: 2f: nop
+# CHECK-NEXT: 3e: nop
+# CHECK-NEXT: 3f: pushq
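The offsets in the CHECK lines can be reproduced with a little arithmetic. The sketch below is an illustration in Python rather than anything taken from the patch; `align_to_end_layout` is a hypothetical helper. It assumes that `.bundle_align_mode 5` gives 2^5 = 32-byte bundles, that the locked group is no longer than one bundle, and that padding is split into 15-byte NOPs plus at most one shorter NOP.

```python
BUNDLE_SIZE = 1 << 5  # .bundle_align_mode 5 -> 32-byte bundles (assumed)
MAX_NOP_LEN = 15      # longest single NOP, per the commit message


def align_to_end_layout(start, insn_len):
    """Return (offset, size) pairs: the NOPs, then the instruction.

    Hypothetical helper: pads so that the instruction ends exactly
    on a bundle boundary, as `.bundle_lock align_to_end` requests.
    """
    pad = -(start + insn_len) % BUNDLE_SIZE
    layout = []
    offset = start
    while pad > 0:
        size = min(pad, MAX_NOP_LEN)
        layout.append((offset, size))
        offset += size
        pad -= size
    layout.append((offset, insn_len))
    return layout


# Offsets for the two groups in the test, printed in hex like llvm-objdump.
for start, insn_len in ((0x00, 5), (0x20, 1)):
    for offset, size in align_to_end_layout(start, insn_len):
        print(f"{offset:x}: {size} bytes")
```

For the 5-byte `callq` starting at offset 0, the padding is 27 bytes, so the NOPs land at `0` and `f` and the call at `1b`. For the 1-byte `push` starting at `20`, the padding is 31 bytes, so the NOPs land at `20`, `2f`, and `3e`, and the push at `3f`.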