summaryrefslogtreecommitdiff
path: root/lib/Target/ARM/ARMScheduleA9.td
Commit message (Collapse)AuthorAge
* New machine model for cortex-a9. Schedule for resources and latency.Andrew Trick2013-12-28
| | | | | | | | | Schedule more conservatively to account for stalls on floating point resources and latency. Use the AGU resource to model latency stalls since it's shared between FP and LD/ST instructions. This might not be completely accurate but should work well in practice. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198125 91177308-0d34-0410-b5e6-96231b3b80d8
* The Cortex-A9 machine model is incomplete. Mark it as such.Andrew Trick2013-12-28
| | | | | | | | | | Many vector operations never had itineraries. Since the new machine model was a mapping from existing itinerary classes, we don't have a model for these. We still want to migrate A9 even though no one has invested in a complete model, so mark it incomplete to avoid the scheduler asserting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198123 91177308-0d34-0410-b5e6-96231b3b80d8
* MI-Sched: handle latency of in-order operations with the new machine model.Andrew Trick2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order). We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default. MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass. I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7: -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes) Speedups: | Benchmarks/BenchmarkGame/recursive | 52.39% | | Benchmarks/VersaBench/beamformer | 20.80% | | Benchmarks/Misc/pi | 19.97% | | Benchmarks/Misc/mandel-2 | 19.95% | | SPEC/CFP2000/188.ammp | 18.72% | | Benchmarks/McCat/08-main/main | 18.58% | | Benchmarks/Misc-C++/Large/sphereflake | 18.46% | | Benchmarks/Olden/power | 17.11% | | Benchmarks/Misc-C++/mandel-text | 16.47% | | Benchmarks/Misc/oourafft | 15.94% | | Benchmarks/Misc/flops-7 | 14.99% | | Benchmarks/FreeBench/distray | 14.26% | | SPEC/CFP2006/470.lbm | 14.00% | | mediabench/mpeg2/mpeg2dec/mpeg2decode | 12.28% | | Benchmarks/SmallPT/smallpt | 10.36% | | Benchmarks/Misc-C++/Large/ray | 8.97% | | Benchmarks/Misc/fp-convert | 8.75% | | Benchmarks/Olden/perimeter | 7.10% | | Benchmarks/Bullet/bullet | 7.03% | | Benchmarks/Misc/mandel | 6.75% | | Benchmarks/Olden/voronoi | 6.26% | | Benchmarks/Misc/flops-8 | 5.77% | | Benchmarks/Misc/matmul_f64_4x4 | 5.19% | | Benchmarks/MiBench/security-rijndael | 5.15% | | Benchmarks/Misc/flops-6 | 5.10% | | Benchmarks/Olden/tsp | 4.46% | | Benchmarks/MiBench/consumer-lame | 4.28% | | Benchmarks/Misc/flops-5 | 4.27% | | Benchmarks/mafft/pairlocalalign | 4.19% | | Benchmarks/Misc/himenobmtxpa | 4.07% | | Benchmarks/Misc/lowercase | 4.06% | | SPEC/CFP2006/433.milc | 3.99% | | Benchmarks/tramp3d-v4 | 3.79% | | Benchmarks/FreeBench/pifft | 3.66% | | Benchmarks/Ptrdist/ks | 3.21% | | Benchmarks/Adobe-C++/loop_unroll | 3.12% | | SPEC/CINT2000/175.vpr | 3.12% | | Benchmarks/nbench | 2.98% | | SPEC/CFP2000/183.equake | 2.91% | | Benchmarks/Misc/perlin | 2.85% | | Benchmarks/Misc/flops-1 | 2.82% | | Benchmarks/Misc-C++-EH/spirit | 2.80% | | Benchmarks/Misc/flops-2 | 2.77% | | Benchmarks/NPB-serial/is | 2.42% | | Benchmarks/ASC_Sequoia/CrystalMk | 2.33% | | Benchmarks/BenchmarkGame/n-body | 2.28% | | Benchmarks/SciMark2-C/scimark2 | 2.27% | | Benchmarks/Olden/bh | 2.03% | | skidmarks10/skidmarks | 1.81% | | Benchmarks/Misc/flops | 1.72% | Slowdowns: | Benchmarks/llubenchmark/llu | -14.14% | | Benchmarks/Polybench/stencils/seidel-2d | -5.67% | | Benchmarks/Adobe-C++/functionobjects | -5.25% | | Benchmarks/Misc-C++/oopack_v1p8 | -5.00% | | Benchmarks/Shootout/hash | -2.35% | | Benchmarks/Prolangs-C++/ocean | -2.01% | | Benchmarks/Polybench/medley/floyd-warshall | -1.98% | | Polybench/linear-algebra/kernels/3mm | -1.95% | | Benchmarks/McCat/09-vor/vor | -1.68% | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196516 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix the A9 machine model. VTRN writes two registers.Andrew Trick2013-12-05
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196514 91177308-0d34-0410-b5e6-96231b3b80d8
* Correct word hyphenationsAlp Toker2013-12-05
| | | | | | | This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix scheduling for vldm/vstm instructions that load/store more than 32 bytes ↵Silviu Baranga2013-09-04
| | | | | | on Cortex-A9. This also makes the existing code more compact. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189958 91177308-0d34-0410-b5e6-96231b3b80d8
* Update machine models. Specify buffer sizes for OOO processors.Andrew Trick2013-06-15
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184033 91177308-0d34-0410-b5e6-96231b3b80d8
* Machine Model: Add MicroOpBufferSize and resource BufferSize.Andrew Trick2013-06-15
| | | | | | | | | | | | | Replace the ill-defined MinLatency and ILPWindow properties with with straightforward buffer sizes: MCSchedMode::MicroOpBufferSize MCProcResourceDesc::BufferSize These can be used to more precisely model instruction execution if desired. Disabled some misched tests temporarily. They'll be reenabled in a few commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184032 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM sched model: Add integer VFP/SIMD instructions on SwiftArnold Schwaighofer2013-06-06
| | | | | | Reapply 183269. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183441 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM sched model: Cortex A9 - More InstRW sched resourcesArnold Schwaighofer2013-06-06
| | | | | | | | Add more InstRW mappings. Reapply 183266. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183435 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM sched model: Add divsion, loads, branches, vfp cvtArnold Schwaighofer2013-06-05
| | | | | | | | Add some generic SchedWrites and assign resources for Swift and Cortex A9. Reapply of r183257. (Removed empty InstRW for division on swift) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183319 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert series of sched model patches until I figure out what is going on.Arnold Schwaighofer2013-06-04
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183273 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM sched model: Cortex A9 - More InstRW sched resourcesArnold Schwaighofer2013-06-04
| | | | | | Add more InstRW mappings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183266 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM sched model: Add divsion, loads, branches, vfp cvtArnold Schwaighofer2013-06-04
| | | | | | Add some generic SchedWrites and assign resources for Swift and Cortex A9. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183257 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM scheduler model: Add scheduler info to more instructions and resourceArnold Schwaighofer2013-04-05
| | | | | | descriptions for compares git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178844 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM Scheduler Model: Add resources instructions, map resources in subtargetsArnold Schwaighofer2013-04-01
| | | | | | | | | | | | | Reapply r177968: After commit 178074 we can now have undefined scheduler variants. Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define resource mappings under the CortexA9 SchedModel. Define resources and mappings for the SwiftModel. Incooperate Andrew's feedback. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178460 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert ARM Scheduler Model: Add resources instructions, map resourcesArnold Schwaighofer2013-03-26
| | | | | | | | | | | | | This reverts commit r177968. It is causing failures in a local build bot. "fatal error: error in backend: Expected a variant SchedClass" Original commit message: Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define resource mappings under the CortexA9 SchedModel. Define resources and mappings for the SwiftModel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178028 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM Scheduler Model: Add resources instructions, map resources in subtargetsArnold Schwaighofer2013-03-26
| | | | | | | | Move the CortexA9 resources into the CortexA9 SchedModel namespace. Define resource mappings under the CortexA9 SchedModel. Define resources and mappings for the SwiftModel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177968 91177308-0d34-0410-b5e6-96231b3b80d8
* MIsched: add an ILP window property to machine model.Andrew Trick2013-01-09
| | | | | | | | | | This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171946 91177308-0d34-0410-b5e6-96231b3b80d8
* Cortex-A9 latency fixes (w/ -schedmodel only).Andrew Trick2012-09-21
| | | | | | Quick review against the manual revealed a few obvious mistakes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164361 91177308-0d34-0410-b5e6-96231b3b80d8
* Cortex-A9 instruction-level scheduling machine model.Andrew Trick2012-09-14
| | | | | | | | | | | | | | | | | | | | This models the A9 processor at the level of instruction operands, as opposed to the itinerary, which models each operation at the level of pipeline stages. The two primary motivations are: 1) Allow MachineScheduler to model A9 as an out-of-order processor. It can now distinguish between hazards that force interlocking vs. buffered resources. 2) Reduce long-term maintenance by allowing the itinerary and target hooks to eventually be removed. Note that almost all of the complexity in the new model exists to model instruction variants, which the itinerary cannot handle. Instead the scheduler previously relied on processor-specific target hooks which are incomplete and buggy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163921 91177308-0d34-0410-b5e6-96231b3b80d8
* Added MispredictPenalty to SchedMachineModel.Andrew Trick2012-08-08
| | | | | | | This replaces an existing subtarget hook on ARM and allows standard CodeGen passes to potentially use the property. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161471 91177308-0d34-0410-b5e6-96231b3b80d8
* I'm introducing a new machine model to simultaneously allow simpleAndrew Trick2012-07-07
| | | | | | | | | | | | | | | | | | | | | | | subtarget CPU descriptions and support new features of MachineScheduler. MachineModel has three categories of data: 1) Basic properties for coarse grained instruction cost model. 2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD). 3) Instruction itineraties for detailed per-cycle reservation tables. These will all live side-by-side. Any subtarget can use any combination of them. Instruction itineraries will not change in the near term. In the long run, I expect them to only be relevant for in-order VLIW machines that have complex contraints and require a precise scheduling/bundling model. Once itineraries are only actively used by VLIW-ish targets, they could be replaced by something more appropriate for those targets. This tablegen backend rewrite sets things up for introducing MachineModel type #2: per opcode/operand cost model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159891 91177308-0d34-0410-b5e6-96231b3b80d8
* Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."Andrew Trick2012-07-02
| | | | | | Reapplies r159406 with minor cleanup. The regressions appear to have been spurious. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159541 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."Andrew Trick2012-06-29
| | | | | | This reverts commit r159406. I noticed a performance regression so I'll back out for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159411 91177308-0d34-0410-b5e6-96231b3b80d8
* Make NumMicroOps a variable in the subtarget's instruction itinerary.Andrew Trick2012-06-29
| | | | | | | | | | | | | | The TargetInstrInfo::getNumMicroOps API does not change, but soon it will be used by MachineScheduler. Now each subtarget can specify the number of micro-ops per itinerary class. For ARM, this is currently always dynamic (-1), because it is used for load/store multiple which depends on the number of register operands. Zero is now a valid number of micro-ops. This can be used for nop pseudo-instructions or instructions that the hardware can squash during dispatch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159406 91177308-0d34-0410-b5e6-96231b3b80d8
* ARM itinerary properties.Andrew Trick2012-06-05
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157980 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix a number of problems with ARM fused multiply add/subtract instructions.Evan Cheng2012-04-11
| | | | | | | | | | | 1. The new instruction itinerary entries are not properly described. 2. The asm parser can't handle vfms and vfnms. 3. There were no assembler, disassembler test cases. 4. HasNEON2 has the wrong assembler predicate. rdar://10139676 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154456 91177308-0d34-0410-b5e6-96231b3b80d8
* Improvements for the Cortex-A9 scheduling itineraries.Bob Wilson2011-04-19
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129770 91177308-0d34-0410-b5e6-96231b3b80d8
* Change A9 scheduling itineraries VLD* / VST* entries default to "aligned". ThatEvan Cheng2011-04-19
| | | | | | | | | | is, it assumes addresses are 64-bit aligned (which should be the more common case). If the alignment is found not to be aligned, then getOperandLatency() would adjust the operand latency computation by one to compensate for it. rdar://9294833 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129742 91177308-0d34-0410-b5e6-96231b3b80d8
* Sorry, several patches in one.Evan Cheng2011-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevent CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look pass some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improve code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperform the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix the ARM IIC_iCMPsi itinerary and add an important assert.Andrew Trick2011-01-04
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122794 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix an obvious cut-n-paste error.Evan Cheng2010-12-08
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121307 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for NEON VLD3-dup instructions.Bob Wilson2010-11-30
| | | | | | The encoding for alignment in VLD4-dup instructions is still a work in progress. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120356 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for NEON VLD3-dup instructions.Bob Wilson2010-11-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120312 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix copy-and-paste errors in VLD2-dup scheduling itineraries.Bob Wilson2010-11-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120311 91177308-0d34-0410-b5e6-96231b3b80d8
* Add support for NEON VLD2-dup instructions.Bob Wilson2010-11-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120236 91177308-0d34-0410-b5e6-96231b3b80d8
* Add NEON VLD1-dup instructions (load 1 element to all lanes).Bob Wilson2010-11-27
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120194 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix incorrect scheduling itineraries for NEON vld1/vst1 instructions.Bob Wilson2010-11-27
| | | | | | | | I added these instructions recently but I have no idea where these "1" values in the NextCycles field came from. As far as I can tell now, these instruction stages are clearly intended to overlap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120193 91177308-0d34-0410-b5e6-96231b3b80d8
* Conditional moves are slightly more expensive than moves.Evan Cheng2010-11-13
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118985 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix preload instruction isel. Only v7 supports pli, and only v7 with mp ↵Evan Cheng2010-11-03
| | | | | | extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118160 91177308-0d34-0410-b5e6-96231b3b80d8
* Modify scheduling itineraries to correct instruction latencies (not operandEvan Cheng2010-11-03
| | | | | | | latencies) of loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118134 91177308-0d34-0410-b5e6-96231b3b80d8
* Add NEON VST1-lane instructions. Partial fix for Radar 8599955.Bob Wilson2010-11-02
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118069 91177308-0d34-0410-b5e6-96231b3b80d8
* Add NEON VLD1-lane instructions. Partial fix for Radar 8599955.Bob Wilson2010-11-01
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117964 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix fpscr <-> GPR latency info.Evan Cheng2010-10-29
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117737 91177308-0d34-0410-b5e6-96231b3b80d8
* Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.Evan Cheng2010-10-28
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117531 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert 117518 and 117519 for now. They changed scheduling and cause MC tests ↵Evan Cheng2010-10-28
| | | | | | to fail. Ugh. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117520 91177308-0d34-0410-b5e6-96231b3b80d8
* - Assign load / store with shifter op address modes the right itinerary classes.Evan Cheng2010-10-28
| | | | | | | | | | | - For now, loads of [r, r] addressing mode is the same as the [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should identify the former case and reduce the output latency by 1. - Also identify [r, r << 2] case. This special form of shifter addressing mode is "free". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117519 91177308-0d34-0410-b5e6-96231b3b80d8
* putback r116983 and fix simple-fp-encoding.ll testsAndrew Trick2010-10-21
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116992 91177308-0d34-0410-b5e6-96231b3b80d8
* Revert r116983, which is breaking all the buildbots.Owen Anderson2010-10-21
| | | | git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116987 91177308-0d34-0410-b5e6-96231b3b80d8